270 changes: 255 additions & 15 deletions docs/advanced/numerical-embeddings.md
@@ -171,6 +171,63 @@ The `NumericalEmbedding` layer processes each numerical feature through two para
- Adaptively weights continuous vs. discrete representations
- Learns optimal combination per feature and dimension

### Periodic Embeddings (`PeriodicEmbedding`)

The `PeriodicEmbedding` layer uses trigonometric functions to capture cyclical patterns:

1. **Frequency Learning**:
- Learns optimal frequencies for each feature
- Supports multiple initialization strategies (uniform, log-uniform, constant)
- Frequencies are constrained to be positive

2. **Periodic Transformation**:
- Applies sin/cos transformations: `sin(freq * x)` and `cos(freq * x)`
   - Captures cyclical patterns while producing smooth, differentiable representations (see the sketch after this list)
- Particularly effective for features with natural periodicity

3. **Post-Processing**:
- Optional MLP for further feature transformation
- Residual connections for stability
- Batch normalization and dropout for regularization

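To make the mechanics concrete, here is a minimal NumPy sketch of the idea described above: each scalar value is multiplied by a bank of positive frequencies and passed through sin and cos. The function name, shapes, and initialization are illustrative assumptions, not the actual `PeriodicEmbedding` internals.

```python
import numpy as np

# Illustrative sketch only -- not KDP's actual implementation.
def periodic_embedding(x, frequencies):
    """Map a batch of scalar feature values to sin/cos features.

    x:           shape (batch,)            raw feature values
    frequencies: shape (num_frequencies,)  learnable, kept positive
    returns:     shape (batch, 2 * num_frequencies)
    """
    angles = x[:, None] * frequencies[None, :]   # (batch, num_frequencies)
    return np.concatenate([np.sin(angles), np.cos(angles)], axis=-1)

# Log-uniform initialization between min_frequency and max_frequency
rng = np.random.default_rng(seed=0)
frequencies = np.exp(rng.uniform(np.log(1e-4), np.log(1e2), size=4))

x = np.array([0.1, 0.5, 2.0])
print(periodic_embedding(x, frequencies).shape)  # (3, 8)
```
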
### PLE Embeddings (`PLEEmbedding`)

The `PLEEmbedding` layer (Parameterized Linear Expansion) provides learnable piecewise linear transformations:

1. **Segment Learning**:
- Learns optimal segment boundaries for each feature
- Supports uniform and quantile-based initialization
- Each segment has learnable slope and intercept

2. **Piecewise Linear Transformation**:
- Applies different linear transformations to different input ranges
   - Captures complex non-linear patterns through piecewise approximation (see the sketch after this list)
- Supports various activation functions (ReLU, Sigmoid, Tanh)

3. **Flexible Architecture**:
- Configurable number of segments for precision vs. efficiency trade-off
- Optional MLP and residual connections
- Batch normalization and dropout for regularization

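A rough NumPy sketch of the piecewise idea described above, using uniform boundaries and one ReLU hinge per segment. The exact boundary handling and per-segment parameterization of `PLEEmbedding` may differ, so treat this purely as an illustration.

```python
import numpy as np

# Illustrative sketch only -- not KDP's actual implementation.
def ple_features(x, boundaries, slopes, intercepts):
    """Piecewise linear features for a batch of scalar values.

    x:                  shape (batch,)
    boundaries:         shape (num_segments + 1,)  sorted segment edges
    slopes, intercepts: shape (num_segments,)      learnable per-segment parameters
    returns:            shape (batch, num_segments)
    """
    lower = boundaries[:-1][None, :]                # start of each segment
    depth = np.clip(x[:, None] - lower, 0.0, None)  # how far x reaches into each segment
    return np.maximum(slopes[None, :] * depth + intercepts[None, :], 0.0)  # ReLU

# Uniform boundaries over an expected feature range of [0, 1]
num_segments = 8
boundaries = np.linspace(0.0, 1.0, num_segments + 1)
slopes = np.ones(num_segments)
intercepts = np.zeros(num_segments)

x = np.array([0.05, 0.4, 0.93])
print(ple_features(x, boundaries, slopes, intercepts).shape)  # (3, 8)
```
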
### Advanced Combined Embeddings (`AdvancedNumericalEmbedding`)

The `AdvancedNumericalEmbedding` layer combines multiple embedding approaches:

1. **Multi-Modal Processing**:
- Supports any combination of periodic, PLE, and dual-branch embeddings
   - Learnable gates to combine the different embedding types (see the sketch after this list)
- Adaptive weighting per feature and dimension

2. **Flexible Configuration**:
- Choose from `['periodic', 'ple', 'dual_branch']` embedding types
- Configure each embedding type independently
- Enable/disable gating mechanism

3. **Optimal Performance**:
- Empirically closes the gap between MLPs/Transformers and tree-based baselines
- Particularly effective on tabular tasks
- Maintains interpretability while improving performance

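The gating described above can be pictured as a learnable blend of branch outputs. The toy sketch below gates two branches with a per-dimension sigmoid; the real `AdvancedNumericalEmbedding` layer may gate per feature and handle more than two branches, so the names and shapes here are assumptions.

```python
import numpy as np

# Illustrative sketch only -- not KDP's actual implementation.
def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gated_combine(periodic_out, ple_out, gate_logits):
    """Blend two embedding branches with a learnable per-dimension gate.

    periodic_out, ple_out: shape (batch, embedding_dim)
    gate_logits:           shape (embedding_dim,)  learnable parameters
    """
    gate = sigmoid(gate_logits)                         # values in (0, 1)
    return gate * periodic_out + (1.0 - gate) * ple_out

rng = np.random.default_rng(seed=0)
batch, dim = 4, 8
combined = gated_combine(
    rng.normal(size=(batch, dim)),  # stand-in periodic branch output
    rng.normal(size=(batch, dim)),  # stand-in PLE branch output
    np.zeros(dim),                  # zero logits -> equal 0.5/0.5 blend
)
print(combined.shape)  # (4, 8)
```
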
```
Input value
┌────────┐ ┌────────┐
@@ -220,6 +277,35 @@ This approach is ideal for:
| `numerical_dropout_rate` | float | 0.1 | Dropout rate for regularization |
| `numerical_use_batch_norm` | bool | True | Apply batch normalization |

### Periodic Embeddings

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `use_periodic_embedding` | bool | False | Enable periodic embeddings |
| `num_frequencies` | int | 4 | Number of frequency components |
| `frequency_init` | str | "log_uniform" | Frequency initialization method |
| `min_frequency` | float | 1e-4 | Minimum frequency for initialization |
| `max_frequency` | float | 1e2 | Maximum frequency for initialization |
| `use_residual` | bool | True | Use residual connections |

### PLE Embeddings

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `use_ple_embedding` | bool | False | Enable PLE embeddings |
| `num_segments` | int | 8 | Number of linear segments |
| `segment_init` | str | "uniform" | Segment initialization method |
| `ple_activation` | str | "relu" | Activation function for PLE |
| `use_residual` | bool | True | Use residual connections |

### Advanced Combined Embeddings

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `use_advanced_combined_embedding` | bool | False | Enable combined embeddings |
| `embedding_types` | list | ["dual_branch"] | List of embedding types to use |
| `use_gating` | bool | True | Use learnable gates to combine embeddings |

### Global Embeddings

| Parameter | Type | Default | Description |
Expand Down Expand Up @@ -267,24 +353,23 @@ features_specs = {
name="income",
feature_type=FeatureType.FLOAT_RESCALED,
use_embedding=True,
embedding_type="periodic", # Use periodic embedding for income
embedding_dim=8,
num_frequencies=4
),
"debt_ratio": NumericalFeature(
name="debt_ratio",
feature_type=FeatureType.FLOAT_NORMALIZED,
use_embedding=True,
embedding_type="ple", # Use PLE for debt ratio
embedding_dim=4,
num_segments=8
),
"credit_score": NumericalFeature(
name="credit_score",
feature_type=FeatureType.FLOAT_NORMALIZED,
use_embedding=True,
embedding_type="dual_branch", # Traditional dual-branch
embedding_dim=6,
num_bins=10,
init_min=300,
@@ -294,21 +379,160 @@ features_specs = {
name="payment_history",
feature_type=FeatureType.FLOAT_NORMALIZED,
use_embedding=True,
embedding_type="combined", # Combined approach
embedding_dim=8,
num_frequencies=4,
num_segments=8
)
}

# Create preprocessing model with advanced embeddings
preprocessor = PreprocessingModel(
path_data="data/financial_data.csv",
features_specs=features_specs,
use_advanced_numerical_embedding=True,
use_periodic_embedding=True,
use_ple_embedding=True,
use_advanced_combined_embedding=True,
embedding_dim=8,
num_frequencies=4,
num_segments=8,
dropout_rate=0.2, # Higher dropout for financial data
use_batch_norm=True
)
```

### Healthcare Patient Analysis with Periodic Embeddings

```python
from kdp import PreprocessingModel
from kdp.features import NumericalFeature
from kdp.enums import FeatureType

# Define patient features with periodic embeddings for cyclical patterns
features_specs = {
"age": NumericalFeature(
name="age",
feature_type=FeatureType.FLOAT_NORMALIZED,
use_embedding=True,
embedding_type="periodic",
embedding_dim=8,
num_frequencies=6, # More frequencies for age patterns
kwargs={
"frequency_init": "constant",
"min_frequency": 1e-3,
"max_frequency": 1e2
}
),
"bmi": NumericalFeature(
name="bmi",
feature_type=FeatureType.FLOAT_NORMALIZED,
use_embedding=True,
embedding_type="ple",
embedding_dim=6,
num_segments=12, # More segments for BMI precision
kwargs={
"segment_init": "uniform",
"ple_activation": "relu"
}
),
"blood_pressure": NumericalFeature(
name="blood_pressure",
feature_type=FeatureType.FLOAT_NORMALIZED,
use_embedding=True,
embedding_type="combined",
embedding_dim=10,
num_frequencies=4,
num_segments=8,
kwargs={
"embedding_types": ["periodic", "ple"],
"use_gating": True
}
)
}

# Create preprocessing model
preprocessor = PreprocessingModel(
path_data="data/patient_data.csv",
features_specs=features_specs,
use_advanced_numerical_embedding=True,
use_periodic_embedding=True,
use_ple_embedding=True,
use_advanced_combined_embedding=True,
embedding_dim=8,
num_frequencies=6,
num_segments=12,
frequency_init="constant",
segment_init="uniform",
ple_activation="relu",
use_gating=True
)
```

### Time Series Forecasting with PLE Embeddings

```python
from kdp import PreprocessingModel
from kdp.features import NumericalFeature
from kdp.enums import FeatureType

# Define time series features with PLE embeddings for trend capture
features_specs = {
"temperature": NumericalFeature(
name="temperature",
feature_type=FeatureType.FLOAT_NORMALIZED,
use_embedding=True,
embedding_type="periodic", # Periodic for seasonal patterns
embedding_dim=12,
num_frequencies=8,
kwargs={
"frequency_init": "log_uniform",
"min_frequency": 1e-4,
"max_frequency": 1e3
}
),
"humidity": NumericalFeature(
name="humidity",
feature_type=FeatureType.FLOAT_NORMALIZED,
use_embedding=True,
embedding_type="ple", # PLE for humidity trends
embedding_dim=8,
num_segments=16,
kwargs={
"segment_init": "quantile",
"ple_activation": "sigmoid"
}
),
"pressure": NumericalFeature(
name="pressure",
feature_type=FeatureType.FLOAT_NORMALIZED,
use_embedding=True,
embedding_type="combined", # Combined for complex patterns
embedding_dim=10,
num_frequencies=6,
num_segments=12,
kwargs={
"embedding_types": ["periodic", "ple", "dual_branch"],
"use_gating": True
}
)
}

# Create preprocessing model
preprocessor = PreprocessingModel(
path_data="data/weather_data.csv",
features_specs=features_specs,
use_advanced_numerical_embedding=True,
use_periodic_embedding=True,
use_ple_embedding=True,
use_advanced_combined_embedding=True,
embedding_dim=10,
num_frequencies=8,
num_segments=16,
frequency_init="log_uniform",
segment_init="quantile",
ple_activation="sigmoid",
use_gating=True
)
```

@@ -349,27 +573,43 @@
1. **Choose the Right Embedding Type**
- Use individual embeddings for interpretability and precise control
- Use global embeddings for efficiency with many numerical features
- Use periodic embeddings for features with cyclical patterns (time, angles, seasons)
- Use PLE embeddings for features with complex non-linear relationships
- Use combined embeddings for maximum performance on challenging datasets

2. **Distribution-Aware Initialization**
- Set `init_min` and `init_max` based on your data's actual distribution
- Use domain knowledge to set meaningful boundary points
   - Initialize close to the anticipated feature range for faster convergence
- For periodic embeddings, use log-uniform initialization for better frequency distribution
   - For PLE embeddings, use quantile-based initialization for data-driven segment boundaries (see the sketch at the end of this list)

3. **Dimensionality Guidelines**
- Start with `embedding_dim` = 4-8 for simple features
- Use 8-16 for complex features with non-linear patterns
- For global embeddings, scale with the number of features (16-64)
- For periodic embeddings, use 4-8 frequencies for most features
- For PLE embeddings, use 8-16 segments for smooth approximations

4. **Performance Tuning**
- Increase `num_bins` for more granular discrete representations
- Adjust `mlp_hidden_units` to 2-4x the embedding dimension
- Use batch normalization for faster, more stable training
- Adjust dropout based on dataset size (higher for small datasets)
- For periodic embeddings, experiment with different frequency ranges
- For PLE embeddings, try different activation functions (relu, sigmoid, tanh)

5. **Advanced Embedding Strategies**
- **Periodic Embeddings**: Best for time-based features, angles, cyclical patterns
- **PLE Embeddings**: Best for features with piecewise linear relationships
- **Combined Embeddings**: Best for maximum performance, especially on tabular tasks
- **Mixed Strategies**: Use different embedding types for different features based on their characteristics

6. **Combine with Other KDP Features**
- Pair with distribution-aware encoding for optimal numerical handling
- Use with tabular attention to learn cross-feature interactions
- Combine with feature selection for automatic dimensionality reduction
- Use with transformer blocks for advanced feature interactions

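As a small illustration of the quantile-based initialization mentioned in point 2, the sketch below compares quantile and uniform segment edges on a skewed synthetic column. The column name, distribution, and segment count are made up for the example.

```python
import numpy as np
import pandas as pd

# Illustrative sketch only: synthetic, heavily skewed "income" column
rng = np.random.default_rng(seed=0)
df = pd.DataFrame({"income": rng.lognormal(mean=10.0, sigma=1.0, size=10_000)})

num_segments = 8
# Quantile-based edges: roughly equal numbers of samples per segment
quantile_edges = np.quantile(df["income"], np.linspace(0.0, 1.0, num_segments + 1))
# Uniform edges: equal-width segments over the observed range
uniform_edges = np.linspace(df["income"].min(), df["income"].max(), num_segments + 1)

print(np.round(quantile_edges, 1))  # dense where the data is dense
print(np.round(uniform_edges, 1))   # most segments fall in the sparse right tail
```
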
## 🔗 Related Topics
