Creating New Features

🔍 Introduction

Creating new features is one of the most impactful steps in feature engineering. While algorithms learn patterns, features tell the model what patterns to look for. By transforming raw data into meaningful representations, we help machine-learning models uncover relationships that are not immediately obvious.

Feature creation goes far beyond simple preprocessing — it uses domain knowledge, mathematical transformations, and behavioural insights. In this episode, we explore methods like polynomial features, interaction terms, binning, datetime extraction, and rolling statistics, with practical examples from finance, e-commerce, and healthcare.

1. Polynomial Features

Polynomial features introduce power transformations that help models capture nonlinear relationships.

✔ What it does

Adds squared, cubic, or higher-degree versions of features

Adds interactions between features

Helps simple models (e.g., linear regression) learn complex curves

✔ Example

If you have a feature “age”, you can create: age², age³ — capturing nonlinear growth trends.

✔ Use cases

Finance: modelling compound growth effects

Healthcare: capturing nonlinear relationships between age and disease risk

Engineering: modelling stress vs. strain curves

2. Interaction Features

Interaction terms represent how two or more features influence each other.

✔ What it does

Multiplies or combines two features

Highlights relationships not visible individually

✔ Example

price × number_of_items: the total spend per order, which shows how spending behaves at different price points.
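
A short pandas sketch of this interaction follows. The orders table and the total_spend column name are hypothetical, chosen only to illustrate the multiplication.

```python
import pandas as pd

# Hypothetical order data (illustrative values only)
orders = pd.DataFrame({
    "price": [10.0, 25.0, 4.5],
    "number_of_items": [3, 1, 10],
})

# Interaction term: element-wise product of the two features
orders["total_spend"] = orders["price"] * orders["number_of_items"]
print(orders)
```

The product carries information neither column has alone: a cheap item bought ten times can outspend an expensive item bought once.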

✔ Use cases

E-commerce: modelling promotion × customer segment

Healthcare: medication dosage × weight

Finance: interest rate × loan amount

3. Binning (Discretization)

Binning converts continuous variables into grouped categories.

✔ Why it’s useful

Reduces noise

Highlights thresholds

Makes patterns more interpretable

✔ Example

Age → 0–18, 19–35, 36–60, 61+
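
One way to sketch these age bands is with pandas.cut. The ages are invented, and the choice of right-closed bin edges is an assumption made so that an age of exactly 60 lands in the 36–60 band and only older ages reach the last bin.

```python
import numpy as np
import pandas as pd

# Hypothetical ages, including values that sit on the bin edges
ages = pd.Series([5, 18, 19, 35, 60, 61, 82], name="age")

# Right-closed intervals: [0, 18], (18, 35], (35, 60], (60, inf)
bins = [0, 18, 35, 60, np.inf]
labels = ["0-18", "19-35", "36-60", "61+"]

# include_lowest=True keeps age 0 inside the first bin
age_group = pd.cut(ages, bins=bins, labels=labels, include_lowest=True)
print(pd.concat([ages, age_group.rename("age_group")], axis=1))
```

Checking the edge values explicitly is worthwhile, because an off-by-one at a boundary silently moves records between groups.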

✔ Use cases

Credit risk: income brackets

Marketing: customer age groups

Education: score bands

4. Datetime Feature Extraction

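Datetime columns encode calendar and cyclical patterns (time of day, day of week, season) that most models cannot read from a raw timestamp. Extracting them as explicit features often improves model performance.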

✔ Extractable elements

Hour

Day

Day of week

Month

Quarter

Weekend/weekday

Season

Time since last event
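
The elements listed above can be sketched with the pandas .dt accessor. The timestamps are made up, the season mapping assumes northern-hemisphere meteorological seasons, and the column names are illustrative rather than any standard.

```python
import pandas as pd

# Hypothetical event log, already sorted by time
events = pd.DataFrame({
    "timestamp": pd.to_datetime([
        "2024-01-06 09:30", "2024-01-08 18:45", "2024-07-15 23:10",
    ])
})
ts = events["timestamp"]

events["hour"] = ts.dt.hour
events["day"] = ts.dt.day
events["day_of_week"] = ts.dt.dayofweek        # Monday=0 ... Sunday=6
events["month"] = ts.dt.month
events["quarter"] = ts.dt.quarter
events["is_weekend"] = ts.dt.dayofweek >= 5    # Saturday or Sunday

# Assumption: northern-hemisphere meteorological seasons
season_by_month = {
    12: "winter", 1: "winter", 2: "winter",
    3: "spring", 4: "spring", 5: "spring",
    6: "summer", 7: "summer", 8: "summer",
    9: "autumn", 10: "autumn", 11: "autumn",
}
events["season"] = ts.dt.month.map(season_by_month)

# Time since the previous event, in hours (NaN for the first row)
events["hours_since_last_event"] = ts.diff().dt.total_seconds() / 3600

print(events)
```

The time-since-last-event column only makes sense if the rows are sorted by timestamp, and per entity if the log mixes several customers or patients.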

✔ Use cases

Finance: identifying seasonality or high-volatility months

E-commerce: peak shopping hours, holiday spikes

Healthcare: hourly patient inflow patterns, flu season peaks

5. Rolling & Aggregation Features

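Rolling and aggregation features summarise recent history over a window of past observations. They are used heavily in time-series and behavioural modelling.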

✔ What it does

Generates:

Rolling mean

Rolling sum

Rolling count

Exponential moving averages

Lag features (previous day/week/month values)
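
A minimal pandas sketch of these features follows, using an invented daily sales series. The 3-day window and the column names are arbitrary choices. The series is shifted by one day first so that each row is built only from earlier days, which is one common way to keep the current value out of its own features.

```python
import pandas as pd

# Hypothetical daily sales (illustrative values only)
sales = pd.DataFrame(
    {"units": [10, 12, 9, 15, 11, 14]},
    index=pd.date_range("2024-03-01", periods=6, freq="D"),
)

# Shift by one day so every feature uses strictly past values
past = sales["units"].shift(1)

sales["lag_1"] = past                                    # previous day's value
sales["roll_mean_3"] = past.rolling(window=3).mean()     # mean of previous 3 days
sales["roll_sum_3"] = past.rolling(window=3).sum()       # sum of previous 3 days
sales["roll_count_3"] = past.rolling(window=3, min_periods=1).count()
sales["ewm_3"] = past.ewm(span=3, adjust=False).mean()   # exponential moving average

print(sales)
```

The first rows are NaN because there is not yet enough history to fill the window. Whether to drop or impute them depends on the model.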

✔ Use cases

Finance: moving averages for stock price trends

E-commerce: previous 7-day purchase patterns

Healthcare: patient vital sign trends over time

6. Domain-Specific Feature Examples

Finance

Volatility over last 30 days

Transaction frequency

Ratio of credit used to credit limit

Time since last default

E-Commerce

Session duration

Number of items viewed

Discount percentage

Cart abandonment indicator

Click-through behaviour patterns

Healthcare

BMI (weight/height²)

Risk scores combining multiple vitals

Medication adherence ratio

Time since last appointment

Change in vital signs over time
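
A few of these domain features reduce to one-line formulas. The sketch below computes BMI, days since the last appointment, and the credit utilisation ratio on invented tables. The column names, units (kilograms, metres) and the as_of reference date are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical patient records (illustrative values only)
patients = pd.DataFrame({
    "weight_kg": [70.0, 90.0],
    "height_m": [1.75, 1.80],
    "last_appointment": pd.to_datetime(["2024-01-10", "2024-03-01"]),
})
as_of = pd.Timestamp("2024-04-01")  # assumed reference date for "time since"

# BMI = weight / height^2
patients["bmi"] = patients["weight_kg"] / patients["height_m"] ** 2
patients["days_since_last_appointment"] = (as_of - patients["last_appointment"]).dt.days

# Hypothetical credit accounts
accounts = pd.DataFrame({
    "credit_used": [500.0, 4000.0],
    "credit_limit": [2000.0, 5000.0],
})
# Ratio of credit used to credit limit
accounts["credit_utilisation"] = accounts["credit_used"] / accounts["credit_limit"]

print(patients)
print(accounts)
```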

7. When to Avoid Creating Too Many Features

Too many features may cause overfitting

Polynomial features can explode dimensionality

Automated feature creation without domain understanding may add noise

Highly correlated new features may reduce model stability
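
To make the dimensionality point concrete: with n input features and maximum degree d, a full polynomial expansion (all powers and interactions, no constant term) has C(n + d, d) − 1 columns. The helper below is a hypothetical function written for this illustration and simply evaluates that count.

```python
from math import comb


def n_polynomial_terms(n_features: int, degree: int) -> int:
    """Count monomials of total degree 1..degree in n_features variables."""
    return comb(n_features + degree, degree) - 1


for n in (5, 20, 100):
    counts = [n_polynomial_terms(n, d) for d in (1, 2, 3)]
    print(f"{n} features -> degree 1, 2, 3: {counts}")
```

For 100 input features, degree 2 already produces 5,150 columns and degree 3 produces 176,850.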

8. Best Practices for Feature Creation

Start simple — do not create hundreds of features at once

Use domain knowledge wherever possible

Validate new features with cross-validation

Keep track of transformations in pipelines

Remove features that do not improve performance

Avoid data leakage: rolling and lag features must use only values that are available before prediction time
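
Two of these practices, keeping transformations in a pipeline and validating a new feature with cross-validation, can be sketched together with scikit-learn. The data here is synthetic, with a deliberately quadratic target, so the result only illustrates the workflow and says nothing about real datasets.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Synthetic data: the target depends on x squared, plus a little noise
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = X[:, 0] ** 2 + rng.normal(scale=0.1, size=200)

# Baseline: raw feature only
baseline = make_pipeline(LinearRegression())

# Candidate: same model, with the squared feature created inside the pipeline
with_poly = make_pipeline(
    PolynomialFeatures(degree=2, include_bias=False),
    LinearRegression(),
)

# 5-fold cross-validated R^2 for both variants
baseline_r2 = cross_val_score(baseline, X, y, cv=5, scoring="r2").mean()
poly_r2 = cross_val_score(with_poly, X, y, cv=5, scoring="r2").mean()

print(f"baseline R^2: {baseline_r2:.3f}, with squared feature: {poly_r2:.3f}")
```

Because the transformer sits inside the pipeline, it is refitted on each training fold, which keeps information from the validation fold out of the feature-creation step.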

#feature-creation #featureengineering #ml-features #polynomialfeatures #datetimefeatures #domain-knowledge #interactionterms #data-transformation #timeseries-features #datascience
