
According to the Microsoft Azure AI Fundamentals (AI-900) official study guide and the Microsoft Learn module “Prepare data for machine learning”, feature engineering refers to the process of transforming raw data into meaningful features that can be effectively used by machine learning algorithms. This includes steps such as scaling, normalization, encoding categorical variables, handling missing values, and creating new features derived from existing ones.
The question states:
“Ensuring that the numeric variables in training data are on a similar scale.”
This directly describes a data normalization or standardization step, which is a core component of feature engineering. The purpose of scaling numeric variables is to ensure that all features contribute comparably to the model's learning process. Without normalization, features with large numeric ranges (such as "income in dollars") would dominate smaller-scale features (such as "age in years") in distance calculations and gradient updates, skewing the model's results.
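As a concrete illustration of the standardization mentioned above, here is a minimal sketch of z-score standardization (rescaling a feature to mean 0 and standard deviation 1) in plain Python. The feature values are made up for illustration; Azure ML performs this kind of transformation for you during data preparation.

```python
import statistics

def standardize(values):
    """Rescale values to zero mean and unit standard deviation (z-scores)."""
    mean = statistics.mean(values)
    std = statistics.pstdev(values)  # population standard deviation
    return [(v - mean) / std for v in values]

# Illustrative large-range feature ("income in dollars")
incomes = [30000, 85000, 120000]
print(standardize(incomes))
```

After this transformation the feature has mean 0 and standard deviation 1, so its raw dollar magnitude no longer overwhelms small-range features like age.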
In Azure Machine Learning, this is typically done using the Normalize Data module or transformations in the data preparation stage. Microsoft Learn explains that normalization and feature scaling are applied before model training to ensure that gradient-based algorithms (such as regression or neural networks) converge more efficiently and produce more accurate results.
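The Normalize Data step described above commonly applies min-max scaling, which maps each numeric feature onto the [0, 1] range. A minimal sketch in plain Python (the column values "age" and "income" are illustrative, not from the question):

```python
def min_max_scale(values):
    """Rescale a list of numbers linearly onto the [0, 1] range."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

ages = [25, 40, 60]               # small-range feature
incomes = [30000, 85000, 120000]  # large-range feature

# After scaling, both features span the same [0, 1] range, so neither
# dominates distance- or gradient-based learning.
print(min_max_scale(ages))     # [0.0, ~0.43, 1.0]
print(min_max_scale(incomes))  # [0.0, ~0.61, 1.0]
```

Because both scaled features now occupy the same range, gradient-based learners such as regression and neural networks converge more evenly across features, which is the behavior the Microsoft Learn material describes.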
The other options are incorrect:
Data ingestion refers to collecting and importing data into a system.
Feature selection involves choosing the most relevant features, not scaling them.
Model training is the phase where the algorithm learns patterns from the processed data, which occurs after feature engineering.
Therefore, ensuring that numeric variables are on a similar scale is a step in Feature Engineering.