Feature Engineering for Machine Learning

What you will learn

  • Pre-process variables that contain missing data

  • Capture information from the missing values in your data

  • Work successfully with categorical variables

  • Convert labels of categorical variables into numbers that capture insight

  • Manipulate and transform numerical variables to extract the most predictive power

  • Transform date variables into insightful features

  • Apply different techniques of variable transformation to make features more predictive

  • Confidently clean and transform data sets for successful machine learning model building


Section 1: Introduction

Section 2: Variable Types

Section 3: Engineering missing values (NA) in numerical variables

Section 4: Engineering missing values (NA) in categorical variables

Section 5: Engineering outliers in numerical variables

Section 6: Engineering rare values in categorical variables

Section 7: Engineering labels of categorical variables

Section 8: Engineering mixed variables

Section 9: Engineering dates

Section 10: Gaussian Transformation

Course Description

From beginner to advanced


Requirements

  • A Python installation
  • Jupyter notebook installation
  • Python coding skills
  • Some experience with Numpy and Pandas
  • Familiarity with Machine Learning algorithms
  • Familiarity with Scikit-Learn


Learn how to engineer features and build more powerful machine learning models.

This is the most comprehensive, yet easy to follow, course for feature engineering available online. Throughout this course you will learn a variety of techniques used worldwide for data cleaning and feature transformation, gathered from data competition websites, white papers, scientific articles, and the instructor’s experience as a Data Scientist.

You will have at your fingertips, all together in one place, a variety of techniques that you can apply to capture as much insight as possible from the features of your data set.

The course starts by describing the simplest and most widely used methods for feature engineering, and then moves on to more advanced and innovative techniques that automatically capture insight from your variables. For each technique, it includes an explanation of the method, the rationale for using it, its advantages and limitations, and the assumptions it makes about the data. It also includes full code that you can take and apply to your own data sets.

This course is suitable for complete beginners in data science looking to take their first steps in data pre-processing, as well as for intermediate and advanced data scientists seeking to level up their skills.

With more than 50 lectures and 10 hours of video, this comprehensive course covers every aspect of variable transformation. The course includes several techniques for missing data imputation, categorical variable encoding, numerical variable transformation and discretisation, as well as how to extract useful features from date and time variables. Throughout the course we use Python as our main language, along with open source packages for feature engineering, including the package "Feature Engine", which was specifically designed for this course.
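To give a flavour of the kind of transformations listed above, here is a minimal sketch using plain Pandas and Numpy. It is not taken from the course material and does not use the "Feature Engine" package; the toy data set, its column names and the choice of median imputation, one-hot encoding and two equal-frequency bins are all assumptions made purely for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data set; column names and values are made up for illustration.
df = pd.DataFrame({
    "age": [25.0, np.nan, 40.0, 31.0],
    "city": ["London", "Paris", None, "London"],
    "signup": pd.to_datetime(
        ["2020-01-15", "2020-06-30", "2021-03-01", "2021-12-25"]
    ),
})

# Missing data imputation: first record where values were missing,
# then fill the gaps with the median of the observed values.
df["age_na"] = df["age"].isna().astype(int)
df["age"] = df["age"].fillna(df["age"].median())

# Categorical encoding: treat missing as its own label, then one-hot encode.
df["city"] = df["city"].fillna("Missing")
df = pd.get_dummies(df, columns=["city"])

# Date features: extract the month and the day of the week (Monday = 0).
df["signup_month"] = df["signup"].dt.month
df["signup_dayofweek"] = df["signup"].dt.dayofweek

# Discretisation: split the numerical variable into two equal-frequency bins.
df["age_bin"] = pd.qcut(df["age"], q=2, labels=False)

print(df)
```

Each step here is only one of several possible options. For example, the mean or an arbitrary value could replace the median, and the right choice depends on the data and on the model you plan to train.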

This course comes with a 30-day money-back guarantee. In the unlikely event you don't find this course useful, you'll get your money back.

So what are you waiting for? Enrol today, embrace the power of feature engineering and build better machine learning models.

Who this course is for:

  • Beginner Data Scientists who want to get started in pre-processing datasets to build machine learning models
  • Intermediate Data Scientists who want to level up their experience in feature engineering for machine learning
  • Advanced Data Scientists who want to discover new and innovative techniques for feature engineering
  • Software engineers, mathematicians and academics switching careers into data science
  • Software engineers, mathematicians and academics stepping into data science
  • Data Scientists who want to try different feature engineering techniques on data competitions