Version: v1.4.1

XRegressor

Transparent Regression

XRegressor provides transparent regression modeling with real-time explainability. Unlike black-box models, you get instant insights into how predictions are made for continuous target variables.

Overview

The XRegressor is xplainable's transparent regression model that uses the same feature-wise ensemble approach as XClassifier, but optimized for continuous target variables. It provides complete transparency while maintaining competitive performance with traditional regression models.

Important note on performance: XRegressor alone can be a weak predictor. To get strong predictive power, use the optimise_tail_sensitivity() method and/or fit an XEvolutionaryNetwork with Tighten and Evolve layers on top of the model.

Key Features

Real-time explainability

Inspect any prediction with one call. No surrogate fits, no Shapley sampling.

Rapid refitting

Update a single feature's tree in milliseconds, with no full retrain required.

Feature-wise ensemble

Each feature contributes through its own decision tree, providing granular insights.

Prediction bounds

Built-in prediction range constraints for realistic and bounded outputs.

Quick Start

from xplainable.core.models import XRegressor
from sklearn.model_selection import train_test_split
import pandas as pd

# Load and prepare data
data = pd.read_csv('data.csv')
X, y = data.drop('target', axis=1), data['target']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

# Train model
model = XRegressor()
model.fit(X_train, y_train)

# Improve performance
model.optimise_tail_sensitivity(X_train, y_train)

# Make predictions
y_pred = model.predict(X_test)

# Get explanations
model.explain()

Constructor Parameters

import numpy as np
from xplainable.core.models import XRegressor

# All arguments shown with their default values
model = XRegressor(
    max_depth=8,
    min_info_gain=0.0001,
    min_leaf_size=0.0001,
    ignore_nan=False,
    weight=1,
    power_degree=1,
    sigmoid_exponent=0,
    tail_sensitivity=1.0,
    prediction_range=(-np.inf, np.inf)
)

  • max_depth (int, default: 8): Maximum depth of each feature's decision tree
  • min_info_gain (float, default: 0.0001): Minimum information gain required to make a split
  • min_leaf_size (float, default: 0.0001): Minimum proportion of samples required to make a split
  • ignore_nan (bool, default: False): Whether to ignore NaN/missing values during training
  • weight (float, default: 1): Activation function weight parameter
  • power_degree (int, default: 1): Power degree for activation function
  • sigmoid_exponent (int, default: 0): Sigmoid exponent for activation function
  • tail_sensitivity (float, default: 1.0): Weight applied to divisive leaf nodes
  • prediction_range (tuple, default: (-np.inf, np.inf)): Lower and upper bounds for predictions

Methods

fit()

Fits the model to the training data. Internally normalises leaf weights using Ridge regression.

model.fit(x, y, id_columns=[], column_names=None, target_name='target', alpha=0.1)

  • x (DataFrame or ndarray, required): Feature matrix
  • y (Series or ndarray, required): Target values
  • id_columns (list, default: []): Columns to exclude from training (e.g. IDs)
  • column_names (list, default: None): Column names when passing a numpy array
  • target_name (str, default: 'target'): Name for the target column when passing a numpy array
  • alpha (float, default: 0.1): Controls the number of possible splits relative to unique values

Returns the fitted XRegressor instance.

predict()

Predicts the target value for each row. Predictions are clipped to prediction_range.

y_pred = model.predict(X_test)

Returns a numpy array of predicted values.
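
The clipping step can be pictured as a simple clamp. The following is a minimal sketch, assuming each raw prediction is clamped element-wise to the (lower, upper) tuple; clip_predictions is a hypothetical helper written for illustration and is not part of xplainable.

```python
import math


def clip_predictions(raw_predictions, prediction_range=(-math.inf, math.inf)):
    """Clamp each raw prediction to the closed interval [lower, upper]."""
    lower, upper = prediction_range
    return [min(max(value, lower), upper) for value in raw_predictions]


# Made-up raw scores: one below the range, one inside it, one above it.
clipped = clip_predictions([-12.5, 340.0, 1250.0], prediction_range=(0, 1000))
print(clipped)
```

With the default range of (-inf, inf) the clamp leaves every value unchanged, which matches the constructor default for prediction_range.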

evaluate()

Returns a dictionary of regression metrics.

metrics = model.evaluate(X_test, y_test)

The returned dictionary contains:

  • Explained Variance
  • MAE (Mean Absolute Error)
  • MAPE (Mean Absolute Percentage Error)
  • MSE (Mean Squared Error)
  • RMSE (Root Mean Squared Error)
  • RMSLE (Root Mean Squared Log Error, NaN if inputs are negative)
  • R2 Score
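
For readers who want to see what these metrics measure, here is a rough sketch that computes them from their standard textbook definitions in plain Python. It is not the library's implementation. The keys 'MAE', 'RMSE' and 'R2 Score' appear elsewhere on this page; the remaining key names are assumed to match the list above. The sketch also assumes no true value is zero (for MAPE) and that the targets are not all identical (for R2).

```python
import math


def regression_metrics(y_true, y_pred):
    """Compute common regression metrics for two equal-length sequences."""
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    mean_true = sum(y_true) / n
    mean_error = sum(errors) / n

    mse = sum(e * e for e in errors) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    var_error = sum((e - mean_error) ** 2 for e in errors) / n

    # RMSLE needs log(1 + value), so report NaN when any value is negative.
    if min(min(y_true), min(y_pred)) < 0:
        rmsle = math.nan
    else:
        squared_log_errors = [
            (math.log1p(p) - math.log1p(t)) ** 2 for t, p in zip(y_true, y_pred)
        ]
        rmsle = math.sqrt(sum(squared_log_errors) / n)

    return {
        "Explained Variance": 1 - var_error / (ss_tot / n),
        "MAE": sum(abs(e) for e in errors) / n,
        "MAPE": sum(abs(e) / abs(t) for e, t in zip(errors, y_true)) / n,
        "MSE": mse,
        "RMSE": math.sqrt(mse),
        "RMSLE": rmsle,
        "R2 Score": 1 - (mse * n) / ss_tot,
    }


# Made-up values: only the last prediction is off, by exactly 1.
metrics = regression_metrics([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 5.0])
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Explained Variance and R2 Score differ only when the errors have a non-zero mean, which is why the two values diverge in this example.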

explain()

Renders an interactive Altair chart showing feature importances and per-feature contribution profiles. Takes no data input -- it visualises the fitted model's internal profile.

model.explain()

# Optional: control numeric label rounding
model.explain(label_rounding=3)

Note: Requires the altair package. Install it with pip install xplainable[plotting].

optimise_tail_sensitivity()

Automatically optimises the tail_sensitivity parameter at a global level to minimise MAE.

model.optimise_tail_sensitivity(X_train, y_train)

Returns the optimised XRegressor instance.
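
The idea of tuning one global scalar to minimise MAE can be illustrated with a simple candidate search. This page does not describe how tail_sensitivity enters the model or which search strategy the library uses, so everything below is a hypothetical toy: search_scalar, the scaling rule in predict_with, and the data are all made up for illustration.

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between targets and predictions."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def search_scalar(predict_with, y_true, candidates):
    """Return the candidate whose predictions give the lowest MAE."""
    best_value, best_mae = None, float("inf")
    for value in candidates:
        mae = mean_absolute_error(y_true, predict_with(value))
        if mae < best_mae:
            best_value, best_mae = value, mae
    return best_value, best_mae


# Hypothetical toy model: the scalar rescales each row's deviation
# from a base value. The targets were generated with a scale of 1.5.
base = 10.0
deviations = [-4.0, -1.0, 2.0, 6.0]
y_true = [4.0, 8.5, 13.0, 19.0]


def predict_with(scale):
    return [base + scale * d for d in deviations]


candidates = [0.5 + 0.25 * i for i in range(9)]  # 0.5, 0.75, ..., 2.5
best_scale, best_mae = search_scalar(predict_with, y_true, candidates)
print(best_scale, best_mae)
```

Because only one number is being tuned, a search like this is cheap compared with refitting the model, which is one reason to run it before heavier optimisation.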

update_feature_params()

Updates model parameters for a subset of features without retraining from scratch.

model.update_feature_params(
    features=['feature1', 'feature2'],
    max_depth=5,
    min_info_gain=0.01,
    min_leaf_size=0.01,
    ignore_nan=True,
    weight=0.8,
    power_degree=2,
    sigmoid_exponent=1,
    tail_sensitivity=0.5
)

All parameter arguments are optional -- only the ones you pass will be updated.

Returns the updated XRegressor instance.

Warning: If you have already optimised the model with an XEvolutionaryNetwork, calling update_feature_params() will overwrite the optimised weights and the network will need to be re-run.

feature_importances (property)

Returns a dictionary mapping feature names to their normalised importance scores.

importances = model.feature_importances  # property, so no parentheses
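
Since the property returns a plain dictionary, ranking features is a one-line sort. The sketch below uses a made-up dictionary in place of a real model.feature_importances result; the feature names and scores are illustrative only.

```python
# Hypothetical stand-in for model.feature_importances (feature name -> score).
importances = {"age": 0.12, "income": 0.55, "tenure": 0.33}

# Sort by score, highest first.
ranked = sorted(importances.items(), key=lambda item: item[1], reverse=True)

for name, score in ranked:
    print(f"{name:<10} {score:.2f}")
```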

profile (property)

Returns the full model profile as a dictionary with keys 'base_value', 'numeric', and 'categorical'.

prof = model.profile
print(prof['base_value'])

_transform()

Transforms input data into per-feature contribution scores.

contributions = model._transform(X_test)  # numpy array, shape (n_samples, n_features)
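
If the model is additive in the way the feature-wise ensemble description suggests, a prediction would be the profile's base value plus the sum of that row's contribution scores, clipped to prediction_range. That relationship is an assumption about the internals and is not confirmed on this page. The sketch below uses made-up numbers and a hypothetical helper, reconstruct_predictions, to show the arithmetic.

```python
def reconstruct_predictions(base_value, contributions, prediction_range):
    """Add each row's contributions to the base value, then clamp the result."""
    lower, upper = prediction_range
    predictions = []
    for row in contributions:
        raw = base_value + sum(row)
        predictions.append(min(max(raw, lower), upper))
    return predictions


# Made-up values standing in for model.profile['base_value'] and
# the rows returned by model._transform(X_test).
base_value = 200.0
contributions = [
    [15.0, -40.0, 5.0],     # row 1: net effect of -20
    [-150.0, -90.0, 10.0],  # row 2: net effect of -230, falls below the lower bound
]

predictions = reconstruct_predictions(base_value, contributions, (0, 1000))
print(predictions)
```

Reading a row this way shows which features pushed a prediction up or down and by how much, which is the main use of per-feature contribution scores.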

Advanced Optimisation with XEvolutionaryNetwork

For stronger predictive performance, use the XEvolutionaryNetwork with Tighten and/or Evolve layers:

from xplainable.core.optimisation.genetic import XEvolutionaryNetwork
from xplainable.core.optimisation.layers import Tighten, Evolve

# Train base model
model = XRegressor()
model.fit(X_train, y_train)
model.optimise_tail_sensitivity(X_train, y_train)

# Create evolutionary network
xnetwork = XEvolutionaryNetwork(model, apply_range=True)

# Add optimisation layers
xnetwork.add_layer(Tighten(
    iterations=100,
    learning_rate=0.03,
    early_stopping=20
))
xnetwork.add_layer(Evolve(
    mutations=100,
    generations=50,
    max_generation_depth=10,
    max_severity=0.5,
    max_leaves=20,
    early_stopping=10
))

# Fit network to data
xnetwork.fit(X_train, y_train)

# Run optimisation
xnetwork.optimise()

# The model is now optimised in-place
y_pred = model.predict(X_test)

Tighten Layer

A leaf boosting algorithm that iteratively identifies the leaf node with the greatest potential impact and incrementally adjusts its weight.

Tighten(
    iterations=100,       # number of boosting iterations
    learning_rate=0.03,   # step size (0.001 to 1)
    early_stopping=None   # stop if no improvement after n iterations
)
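
To make the leaf boosting idea concrete, here is a toy sketch written from the one-sentence description above. It is not the library's implementation. It assumes an additive model in which each row lands in one leaf per feature, measures a leaf's "potential impact" by the absolute sum of residuals over its rows, and nudges that leaf's weight toward its mean residual. The functions predict, mae and tighten and the data are all hypothetical.

```python
def predict(base, weights, memberships):
    """Additive prediction: base value plus the weights of each row's leaves."""
    return [base + sum(weights[leaf] for leaf in row) for row in memberships]


def mae(y_true, y_pred):
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def tighten(base, weights, memberships, y, iterations=100,
            learning_rate=0.03, early_stopping=None):
    """Repeatedly adjust the single leaf with the largest summed residual."""
    weights = list(weights)
    best_weights = list(weights)
    best_loss = mae(y, predict(base, weights, memberships))
    stale = 0
    for _ in range(iterations):
        preds = predict(base, weights, memberships)
        # Sum the residuals of the rows that fall in each leaf.
        residual_sum = [0.0] * len(weights)
        for row, target, pred in zip(memberships, y, preds):
            for leaf in row:
                residual_sum[leaf] += target - pred
        leaf = max(range(len(weights)), key=lambda i: abs(residual_sum[i]))
        if residual_sum[leaf] == 0:
            break  # no leaf has any residual left to correct
        count = sum(leaf in row for row in memberships)
        # Take a small step toward the leaf's mean residual.
        weights[leaf] += learning_rate * residual_sum[leaf] / count
        loss = mae(y, predict(base, weights, memberships))
        if loss < best_loss:
            best_weights, best_loss, stale = list(weights), loss, 0
        else:
            stale += 1
            if early_stopping is not None and stale >= early_stopping:
                break
    return best_weights, best_loss


# Toy data: two features with two leaves each (leaf ids 0-1 and 2-3).
memberships = [[0, 2], [0, 3], [1, 2], [1, 3]]
y = [7.0, 9.0, 11.0, 13.0]
base = sum(y) / len(y)
start = [0.0, 0.0, 0.0, 0.0]

initial_loss = mae(y, predict(base, start, memberships))
tightened, tightened_loss = tighten(base, start, memberships, y,
                                    iterations=200, early_stopping=20)
print(initial_loss, tightened_loss)
```

A small learning_rate makes each step conservative, so more iterations are needed, while early_stopping bounds the time spent once the loss stops improving.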

Evolve Layer

A genetic algorithm that mutates leaf weights and selects the best mutations through natural selection and reproduction.

Evolve(
    mutations=100,            # mutations per generation
    generations=50,           # total generations
    max_generation_depth=10,  # max depth within each generation
    max_severity=0.5,         # max mutation severity
    max_leaves=20,            # max leaves to mutate per step
    early_stopping=None       # stop if no improvement after n generations
)
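
The mutate-and-select loop can be sketched in a few lines. This is a simplified toy, not the library's algorithm: it keeps a single parent, creates mutated children by adding bounded random noise to a random subset of leaf weights, and keeps the best child only if it beats the parent. It omits crossover ("reproduction") and ignores max_generation_depth, whose exact meaning is not described on this page. All function names and data are hypothetical.

```python
import random


def predict(base, weights, memberships):
    """Additive prediction: base value plus the weights of each row's leaves."""
    return [base + sum(weights[leaf] for leaf in row) for row in memberships]


def mae(y_true, y_pred):
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def evolve(base, weights, memberships, y, mutations=100, generations=50,
           max_severity=0.5, max_leaves=20, early_stopping=None, seed=0):
    """Keep the best mutated child per generation if it beats the parent."""
    rng = random.Random(seed)  # seeded so the toy run is repeatable
    parent = list(weights)
    parent_loss = mae(y, predict(base, parent, memberships))
    stale = 0
    for _ in range(generations):
        children = []
        for _ in range(mutations):
            child = list(parent)
            n_mutated = rng.randint(1, min(max_leaves, len(child)))
            for leaf in rng.sample(range(len(child)), n_mutated):
                child[leaf] += rng.uniform(-max_severity, max_severity)
            children.append((mae(y, predict(base, child, memberships)), child))
        child_loss, child = min(children, key=lambda pair: pair[0])
        if child_loss < parent_loss:
            parent, parent_loss, stale = child, child_loss, 0
        else:
            stale += 1
            if early_stopping is not None and stale >= early_stopping:
                break
    return parent, parent_loss


# Toy data: two features with two leaves each (leaf ids 0-1 and 2-3).
memberships = [[0, 2], [0, 3], [1, 2], [1, 3]]
y = [7.0, 9.0, 11.0, 13.0]
base = sum(y) / len(y)
start = [0.0, 0.0, 0.0, 0.0]

initial_loss = mae(y, predict(base, start, memberships))
evolved, evolved_loss = evolve(base, start, memberships, y,
                               generations=30, early_stopping=10)
print(initial_loss, evolved_loss)
```

Because a child replaces the parent only when it lowers the loss, the result on the data being optimised can never be worse than the starting weights, although that says nothing about performance on unseen data.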

Partitioned Regression

For datasets with natural segments:

from xplainable.core.models import PartitionedRegressor, XRegressor

# `train` is assumed to be a DataFrame holding the features,
# 'segment_column' and 'target'

# Create partitioned model
partitioned_model = PartitionedRegressor(partition_on='segment_column')

# Train separate models for each segment
for segment in train['segment_column'].unique():
    segment_data = train[train['segment_column'] == segment]
    X_seg, y_seg = segment_data.drop('target', axis=1), segment_data['target']

    segment_model = XRegressor(prediction_range=(0, 1000))
    segment_model.fit(X_seg, y_seg)
    segment_model.optimise_tail_sensitivity(X_seg, y_seg)

    partitioned_model.add_partition(segment_model, segment)

# Predict with automatic segment routing
predictions = partitioned_model.predict(X_test)

# Explain a specific partition
partitioned_model.explain(partition='some_segment')
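
Segment routing amounts to looking up the model registered for each row's segment value. The sketch below illustrates that dispatch with stand-in models so it runs without xplainable. ConstantModel and predict_by_segment are hypothetical, and how PartitionedRegressor handles segments it has not seen is not covered here.

```python
class ConstantModel:
    """Hypothetical stand-in for a fitted per-segment model."""

    def __init__(self, value):
        self.value = value

    def predict(self, rows):
        return [self.value for _ in rows]


def predict_by_segment(models, rows, partition_on):
    """Send each row to the model registered for its segment value."""
    predictions = []
    for row in rows:
        segment = row[partition_on]
        predictions.append(models[segment].predict([row])[0])
    return predictions


# One model per segment, keyed by the segment value.
models = {"retail": ConstantModel(120.0), "wholesale": ConstantModel(950.0)}

rows = [
    {"segment_column": "wholesale", "units": 40},
    {"segment_column": "retail", "units": 3},
    {"segment_column": "wholesale", "units": 55},
]

routed = predict_by_segment(models, rows, partition_on="segment_column")
print(routed)
```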

Hyperparameter Optimization

Note: XParamOptimiser is designed for classification models. It uses StratifiedKFold and classification metrics internally. For regression, tune parameters manually or use optimise_tail_sensitivity() combined with XEvolutionaryNetwork.
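
Manual tuning can be as simple as a loop over a parameter grid scored on a held-out validation set. The sketch below is one possible approach, not a feature of xplainable: grid_search is a hypothetical helper, and ShrunkMean is a stand-in model used only so the example runs on its own. Any object with fit() and predict() methods can be passed as make_model.

```python
from itertools import product


def grid_search(make_model, param_grid, X_train, y_train, X_val, y_val):
    """Fit one model per parameter combination; keep the lowest validation MAE."""
    names = list(param_grid)
    best_params, best_mae = None, float("inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        model = make_model(**params)
        model.fit(X_train, y_train)
        preds = model.predict(X_val)
        mae = sum(abs(t - p) for t, p in zip(y_val, preds)) / len(y_val)
        if mae < best_mae:
            best_params, best_mae = params, mae
    return best_params, best_mae


class ShrunkMean:
    """Hypothetical stand-in model: predicts a scaled mean of the targets."""

    def __init__(self, shrink=1.0):
        self.shrink = shrink

    def fit(self, X, y):
        self.value_ = self.shrink * sum(y) / len(y)
        return self

    def predict(self, X):
        return [self.value_ for _ in X]


best_params, best_mae = grid_search(
    ShrunkMean,
    {"shrink": [0.5, 1.0, 1.5]},
    X_train=[[0], [1], [2]], y_train=[10.0, 12.0, 14.0],
    X_val=[[3], [4]], y_val=[11.0, 13.0],
)
print(best_params, best_mae)
```

With xplainable, one might pass XRegressor as make_model with a grid such as {"max_depth": [4, 6, 8], "min_leaf_size": [0.001, 0.01]}, using the constructor parameters documented above. Scoring on data the model was not fitted on also helps reveal overfitting.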

Evaluation with sklearn

import numpy as np
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score

y_pred = model.predict(X_test)

# The squared=False argument has been deprecated and removed in newer
# scikit-learn releases, so take the square root of the MSE explicitly
rmse = np.sqrt(mean_squared_error(y_test, y_pred))
mae = mean_absolute_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)

print(f"RMSE: {rmse:.3f}")
print(f"MAE: {mae:.3f}")
print(f"R2 Score: {r2:.3f}")

Or use the built-in evaluate() method:

metrics = model.evaluate(X_test, y_test)
print(f"MAE: {metrics['MAE']}")
print(f"R2 Score: {metrics['R2 Score']}")
print(f"RMSE: {metrics['RMSE']}")

Cloud Deployment

For deploying models to the Xplainable Cloud platform, see the REST API documentation.

Best Practices

Data Preparation

Regression Data Quality
  • Handle outliers carefully (they significantly impact regression)
  • Set realistic prediction bounds using prediction_range
  • Check for multicollinearity between features
  • Remove ID columns by passing them via id_columns parameter in fit()

from xplainable.core.models import XRegressor
from xplainable.core.optimisation.genetic import XEvolutionaryNetwork
from xplainable.core.optimisation.layers import Tighten, Evolve

# 1. Train base model
model = XRegressor(
    max_depth=8,
    prediction_range=(0, 500000)  # set realistic bounds
)
model.fit(X_train, y_train)

# 2. Optimise tail sensitivity
model.optimise_tail_sensitivity(X_train, y_train)

# 3. Apply evolutionary optimisation
xnetwork = XEvolutionaryNetwork(model, apply_range=True)
xnetwork.add_layer(Tighten(iterations=200, learning_rate=0.03, early_stopping=30))
xnetwork.add_layer(Evolve(mutations=100, generations=50, early_stopping=15))
xnetwork.fit(X_train, y_train)
xnetwork.optimise()

# 4. Evaluate
metrics = model.evaluate(X_test, y_test)
print(metrics)

# 5. Explain
model.explain()

Troubleshooting

Poor prediction accuracy

Possible causes:

  • Base model alone is weak (this is expected)
  • Insufficient optimisation layers

Solutions:

  • Always run optimise_tail_sensitivity() after fitting
  • Add Tighten and Evolve layers via XEvolutionaryNetwork
  • Increase max_depth or decrease min_info_gain

Predictions outside expected range

Solutions:

  • Set prediction_range parameter in the constructor
  • Set apply_range=True when creating XEvolutionaryNetwork

Model overfitting

Solutions:

  • Increase min_leaf_size parameter
  • Decrease max_depth
  • Use a separate validation set to monitor XEvolutionaryNetwork optimisation
  • Add regularization by increasing min_info_gain
