Partitioned Models
Partitioned models train a separate transparent model on each data segment and group those models behind a single interface that routes every prediction to the model for its segment. This can improve accuracy and reveal segment-specific insights. They suit datasets with natural groupings or heterogeneous patterns.
Overview
Partitioned models handle datasets where different segments exhibit distinct patterns. Instead of training one model on all data, they train a specialized model for each segment and route each prediction to the model that matches its segment.
Xplainable provides built-in PartitionedClassifier and PartitionedRegressor classes that handle routing, prediction, and fallback automatically.
Key Benefits
- Specialized models: Each segment gets a model optimized for its specific patterns
- Improved accuracy: Often outperforms single models by capturing segment-specific relationships
- Deeper insights: Understand how different segments behave and what drives their outcomes
- Robust fallback: Automatic fallback to a '__dataset__' model for unknown segments
How Partitioned Models Work
Partitioned models are not ensemble models. Instead of combining predictions from multiple models, they:
- Route data to the appropriate model based on the partition_on column value
- Train specialized models on homogeneous data subsets
- Maintain transparency -- each prediction comes from a single, explainable model
- Provide fallback -- unknown partition values automatically use the '__dataset__' model
Partitioning is a natural fit for:
- Geographic segmentation -- Different regions have different patterns
- Customer segments -- B2B vs B2C, different industries, etc.
- Time-based segments -- Seasonal models, weekday vs weekend
- Product categories -- Different products have different drivers
- Heterogeneous data -- Mixed populations with distinct characteristics
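The routing and fallback behavior described above can be sketched in plain Python. This is a conceptual illustration only, not the xplainable implementation: MeanModel, PartitionRouter, and the add_partition method name are hypothetical stand-ins, and the "region" data is made up. Only the '__dataset__' fallback key and the partition_on idea come from the text.

```python
from collections import defaultdict
from statistics import mean

FALLBACK_KEY = "__dataset__"  # fallback partition name described in the text


class MeanModel:
    """Hypothetical stand-in for a transparent model: predicts the training mean."""

    def fit(self, targets):
        self.value = mean(targets)
        return self

    def predict(self, row):
        return self.value


class PartitionRouter:
    """Illustrative router (hypothetical class, not the library's API)."""

    def __init__(self, partition_on):
        self.partition_on = partition_on
        self.partitions = {}

    def add_partition(self, model, key):
        self.partitions[key] = model

    def predict(self, rows):
        fallback = self.partitions[FALLBACK_KEY]  # KeyError if no fallback was added
        preds = []
        for row in rows:
            key = row.get(self.partition_on)
            # Exactly one model answers each row: no averaging across models.
            model = self.partitions.get(key, fallback)
            preds.append(model.predict(row))
        return preds


train = [
    {"region": "north", "y": 10.0}, {"region": "north", "y": 12.0},
    {"region": "south", "y": 30.0}, {"region": "south", "y": 34.0},
]

# Group targets by partition value.
grouped = defaultdict(list)
for row in train:
    grouped[row["region"]].append(row["y"])

router = PartitionRouter(partition_on="region")
# The fallback model is trained on the full dataset.
router.add_partition(MeanModel().fit([r["y"] for r in train]), FALLBACK_KEY)
for key, targets in grouped.items():
    router.add_partition(MeanModel().fit(targets), key)

# "east" was never seen in training, so it is served by the fallback model.
predictions = router.predict(
    [{"region": "north"}, {"region": "south"}, {"region": "east"}]
)
print(predictions)
```

Because each row is answered by a single model, any explanation of a prediction only needs to refer to that one model.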
The PartitionedClassifier Class
Constructor
Parameters:
partition_on (str, optional): The column name used to route predictions to the correct sub-model.
Key Methods
Key Attributes
Classification Example
The PartitionedRegressor Class
Constructor
Parameters:
partition_on (str, optional): The column name used to route predictions.
Key Methods
Regression Example
Explaining Partitioned Models
Each partition can be explained independently:
Rapid Refitting with Partitioned Models
You can apply update_feature_params() to individual partition models for fine-tuning:
Dynamic Partitioning
Adapt model complexity based on segment characteristics:
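One way to read this is that small segments get simpler models and large segments are allowed more complexity. The sketch below assumes that interpretation. The parameter names (max_depth, min_leaf_size), the size thresholds, and the segment names are all hypothetical and chosen for illustration; the settings would be passed to whatever model is trained for each partition.

```python
from collections import Counter


def choose_params(n_samples):
    """Pick illustrative model settings from segment size (thresholds are assumptions)."""
    if n_samples < 500:
        # Few samples: keep the model shallow to limit overfitting.
        return {"max_depth": 3, "min_leaf_size": 0.05}
    if n_samples < 5000:
        return {"max_depth": 5, "min_leaf_size": 0.02}
    # Plenty of data: allow a more detailed model.
    return {"max_depth": 8, "min_leaf_size": 0.005}


# Hypothetical partition column values, one entry per training row.
segments = ["retail"] * 200 + ["wholesale"] * 1200 + ["online"] * 9000

sizes = Counter(segments)
params_by_segment = {seg: choose_params(n) for seg, n in sizes.items()}

for seg, params in params_by_segment.items():
    print(seg, sizes[seg], params)
```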
Performance Comparison
Partitioned vs Single Model
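A toy comparison shows why partitioning can help when segments follow different relationships. The data below is synthetic and deliberately constructed so that two hypothetical segments ("b2b" and "b2c") have opposite trends, so it illustrates the mechanism rather than benchmarking any real model. A hand-written least-squares line stands in for the model.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def squared_errors(line, xs, ys):
    slope, intercept = line
    return [(slope * x + intercept - y) ** 2 for x, y in zip(xs, ys)]


xs = [float(i) for i in range(10)]
data = {
    "b2b": (xs, [2.0 * x + 1.0 for x in xs]),    # upward trend
    "b2c": (xs, [-1.0 * x + 20.0 for x in xs]),  # downward trend
}

# Single model: one line fitted to all rows.
all_x = [x for seg_x, _ in data.values() for x in seg_x]
all_y = [y for _, seg_y in data.values() for y in seg_y]
single_mse = mean(squared_errors(fit_line(all_x, all_y), all_x, all_y))

# Partitioned: one line per segment, errors pooled over all rows.
pooled = []
for seg_x, seg_y in data.values():
    pooled.extend(squared_errors(fit_line(seg_x, seg_y), seg_x, seg_y))
partitioned_mse = mean(pooled)

print(f"single model MSE: {single_mse:.3f}")
print(f"partitioned MSE:  {partitioned_mse:.3f}")
```

On real data the gap is rarely this stark, and both approaches should be compared on held-out data rather than on the training rows.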
Best Practices
- Business logic: Partitions should make business sense
- Sufficient data: Each partition needs enough samples (typically 100+)
- Distinct patterns: Segments should have genuinely different relationships with the target
- Stability: Partition values should be consistent over time
- Interpretability: Partitions should be explainable to stakeholders
- Always include a '__dataset__' partition as a fallback for unknown segment values
- The partition_on column must be present in the prediction data for routing to work
- Target maps must match across all partition models (enforced automatically)
- Use explain() per partition to understand segment-specific behavior
- Combine with XEvolutionaryNetwork for regression partition models to maximize performance
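Several of these practices can be checked before any training starts. The sketch below is a hypothetical pre-flight helper, not part of the library: plan_partitions and the "industry" column are assumptions, and the 100-sample threshold is the rule of thumb quoted above. It confirms the partition column is present and separates segments that are large enough for a dedicated model from those better left to the '__dataset__' fallback.

```python
from collections import Counter

MIN_SAMPLES = 100             # rule-of-thumb threshold from the list above
FALLBACK_KEY = "__dataset__"  # fallback partition name


def plan_partitions(rows, partition_on, min_samples=MIN_SAMPLES):
    """Decide which partition values get a dedicated model (hypothetical helper)."""
    missing = [i for i, row in enumerate(rows) if partition_on not in row]
    if missing:
        # Routing cannot work without the partition column.
        raise KeyError(f"{len(missing)} rows lack the '{partition_on}' column")

    counts = Counter(row[partition_on] for row in rows)
    return {
        "dedicated": sorted(k for k, n in counts.items() if n >= min_samples),
        "fallback_only": sorted(k for k, n in counts.items() if n < min_samples),
        "fallback_key": FALLBACK_KEY,
    }


rows = [{"industry": "retail"}] * 150 + [{"industry": "mining"}] * 12
plan = plan_partitions(rows, "industry")
print(plan)
```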
Next Steps
- Explore rapid refitting for real-time partition optimization
- Learn about XEvolutionaryNetwork for advanced weight optimization
- Check out custom transformers for partition-specific preprocessing