Version: v1.4.1

Telco Churn

Package Imports

!pip install xplainable
!pip install xplainable-client

import pandas as pd
import xplainable as xp
from xplainable.core.models import XClassifier
from xplainable.core.optimisation.bayesian import XParamOptimiser
from xplainable_preprocessing import PipelineSpec, StepSpec, compile_spec
from sklearn.model_selection import train_test_split
import requests

from xplainable_client.client.client import XplainableClient
from xplainable_client.client.base import XplainableAPIError
import json

xp.__version__
Out:

'1.3.0'

Instantiate Xplainable Cloud

Initialise the Xplainable Cloud client using an API key from: https://platform.xplainable.io/

This allows you to save and collaborate on models, create deployments, and create shareable reports.

# Initialize Xplainable Cloud client
client = XplainableClient(
    # api_key="", # Create an API key in Xplainable Cloud - https://platform.xplainable.io/
    # hostname="https://platform.xplainable.io"
    api_key="<YOUR_API_KEY>",  # never commit a real key to a notebook
    hostname="http://localhost:8000"
)
Out:

Connected to Xplainable Cloud

User: jtuppa

Hostname: http://localhost:8000

API Key Expires: 2025-11-30T11:19:25.328905

Python Version: 3.10.18

Xplainable Version: 1.3.0

Email: [email protected]

Organization: Xplainable Data Science

Team: Data Science
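
Hardcoding an API key in a notebook makes it easy to leak when the notebook is shared. A minimal sketch of reading the key from an environment variable instead is shown here. The variable name XPLAINABLE_API_KEY and the helper load_api_key are assumptions made for illustration; they are not part of the xplainable client.

```python
import os


def load_api_key(env_var: str = "XPLAINABLE_API_KEY") -> str:
    """Return the API key stored in `env_var`, failing loudly if it is unset."""
    api_key = os.environ.get(env_var, "").strip()
    if not api_key:
        raise RuntimeError(f"Set the {env_var} environment variable before connecting.")
    return api_key


# Demonstration only: a dummy value stands in for a real key when none is set.
os.environ.setdefault("XPLAINABLE_API_KEY", "dummy-key-for-illustration")
api_key = load_api_key()
```

The returned value can then be passed as the api_key argument when constructing XplainableClient.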

Read IBM Telco Churn Dataset

df = pd.read_csv('https://xplainable-public-storage.syd1.digitaloceanspaces.com/example_data/telco_customer_churn.csv')

Sample of the IBM Telco Churn Dataset

df.head()

| | CustomerID | Count | Country | State | City | Zip Code | Lat Long | Latitude | Longitude | Gender | ... | Contract | Paperless Billing | Payment Method | Monthly Charges | Total Charges | Churn Label | Churn Value | Churn Score | CLTV | Churn Reason |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 3668-QPYBK | 1 | United States | California | Los Angeles | 90003 | 33.964131, -118.272783 | 33.9641 | -118.273 | Male | ... | Month-to-month | Yes | Mailed check | 53.85 | 108.15 | Yes | 1 | 86 | 3239 | Competitor made better offer |
| 1 | 9237-HQITU | 1 | United States | California | Los Angeles | 90005 | 34.059281, -118.30742 | 34.0593 | -118.307 | Female | ... | Month-to-month | Yes | Electronic check | 70.7 | 151.65 | Yes | 1 | 67 | 2701 | Moved |
| 2 | 9305-CDSKC | 1 | United States | California | Los Angeles | 90006 | 34.048013, -118.293953 | 34.048 | -118.294 | Female | ... | Month-to-month | Yes | Electronic check | 99.65 | 820.5 | Yes | 1 | 86 | 5372 | Moved |
| 3 | 7892-POOKP | 1 | United States | California | Los Angeles | 90010 | 34.062125, -118.315709 | 34.0621 | -118.316 | Female | ... | Month-to-month | Yes | Electronic check | 104.8 | 3046.05 | Yes | 1 | 84 | 5003 | Moved |
| 4 | 0280-XJGEX | 1 | United States | California | Los Angeles | 90015 | 34.039224, -118.266293 | 34.0392 | -118.266 | Male | ... | Month-to-month | Yes | Bank transfer (automatic) | 103.7 | 5036.3 | Yes | 1 | 89 | 5340 | Competitor had better devices |

1. Data Preprocessing

Turn Label into Binary input

df["Churn Label"] = df["Churn Label"].map({"Yes":1,"No":0})

# Define the preprocessing spec
preprocessing_spec = PipelineSpec(steps=[
    StepSpec(
        id="lowercase_text",
        type="TextCleanTransformer",
        columns=['City', 'Gender', 'Senior Citizen', 'Partner', 'Dependents',
                 'Phone Service', 'Multiple Lines', 'Internet Service',
                 'Online Security', 'Online Backup', 'Device Protection', 'Tech Support',
                 'Streaming TV', 'Streaming Movies', 'Contract', 'Paperless Billing',
                 'Payment Method'],
        params={"operations": ["lowercase"]},
        description="Convert text columns to lowercase"
    ),
    StepSpec(
        id="condense_city",
        type="CategoryCondenseTransformer",
        columns=["City"],
        params={"min_frequency": 0.25},
        description="Condense long tail city values into 'Other'"
    ),
    StepSpec(
        id="cast_monthly_charges",
        type="TypeCastTransformer",
        params={"dtypes": {"Monthly Charges": "float"}},
        description="Cast Monthly Charges to float"
    ),
    StepSpec(
        id="drop_columns",
        type="DropColumnsTransformer",
        params={"columns": [
            'CustomerID', # Highly Cardinal
            "Total Charges", # Reduce Multicollinearity between Tenure and Monthly Costs
            'Count', # Only one value
            "Country", # Only one value
            "State", # Only one value
            "Zip Code", # Highly Cardinal and Data Leakage if you keep City
            "Lat Long", # Highly Cardinal
            "Latitude", # Highly Cardinal
            "Longitude", # Highly Cardinal
            "Churn Value", # Data Leakage
            "Churn Score", # Data Leakage
            "CLTV", # Data Leakage
            "Churn Reason", # Data Leakage
        ]},
        description="Drop high cardinality, single-value, and data leakage columns"
    ),
])

# Compile and apply the pipeline
pipeline = compile_spec(preprocessing_spec)

Preprocessed data

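df_transformed = pipeline.fit_transform(df)
df_transformed.head()

| | City | Gender | Senior Citizen | Partner | Dependents | Tenure Months | Phone Service | Multiple Lines | Internet Service | Online Security | Online Backup | Device Protection | Tech Support | Streaming TV | Streaming Movies | Contract | Paperless Billing | Payment Method | Monthly Charges | Churn Label |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | los angeles | male | no | no | no | 2 | yes | no | dsl | yes | yes | no | no | no | no | month-to-month | yes | mailed check | 53.85 | 1 |
| 1 | los angeles | female | no | no | yes | 2 | yes | no | fiber optic | no | no | no | no | no | no | month-to-month | yes | electronic check | 70.7 | 1 |
| 2 | los angeles | female | no | no | yes | 8 | yes | yes | fiber optic | no | no | yes | no | yes | yes | month-to-month | yes | electronic check | 99.65 | 1 |
| 3 | los angeles | female | no | yes | yes | 28 | yes | yes | fiber optic | no | no | yes | yes | yes | yes | month-to-month | yes | electronic check | 104.8 | 1 |
| 4 | los angeles | male | no | no | yes | 49 | yes | yes | fiber optic | no | yes | yes | no | yes | yes | month-to-month | yes | bank transfer (automatic) | 103.7 | 1 |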
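
The condense_city step groups rarely seen cities under a single label. How CategoryCondenseTransformer interprets min_frequency is not documented on this page, so the following is only a sketch of the general idea, assuming the threshold is a share of rows. The function condense_categories and the toy city list are hypothetical and do not reproduce the library's implementation.

```python
from collections import Counter


def condense_categories(values, min_frequency, other_label="Other"):
    """Replace values whose share of rows is below `min_frequency` with `other_label`."""
    counts = Counter(values)
    total = len(values)
    # Keep only the categories that appear often enough.
    keep = {value for value, count in counts.items() if count / total >= min_frequency}
    return [value if value in keep else other_label for value in values]


# Toy data: 60% los angeles, 30% san diego, 10% acampo.
cities = ["los angeles"] * 6 + ["san diego"] * 3 + ["acampo"]
condensed = condense_categories(cities, min_frequency=0.25)
```

With a threshold of 0.25, only the city that covers a tenth of the toy rows is replaced; the two more frequent cities are kept unchanged.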

Create Preprocessor to Persist to Xplainable Cloud

try:
    preprocessor_id, preprocessor_version_id = client.preprocessing.create_preprocessor(
        name="Telco Churn Preprocessing - v1.3.1",
        description="Handling all preprocessing steps in the IBM Telco Churn Dataset",
        spec=preprocessing_spec.model_dump(),
        sample_df=df,
    )
except XplainableAPIError as e:
    print(f"Error creating preprocessor: {e}")
    preprocessor_id, preprocessor_version_id = None, None

Loading the Preprocessor steps

Use the API to load existing preprocessor steps from Xplainable Cloud and apply them to the data. The transform call returns a new DataFrame rather than modifying the original.

pp_cloud = client.preprocessing.load_pipeline(preprocessor_version_id)

pp_cloud.stages

if pp_cloud:
    df_transformed_cloud = pp_cloud.transform(df)
else:
    print("Using local preprocessing pipeline")
    df_transformed_cloud = df_transformed.copy()

Create Train/Test split for model training validation

# Take features and target from the same transformed frame so the rows stay aligned
X, y = df_transformed.drop(columns=['Churn Label']), df_transformed['Churn Label']

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.33, random_state=42)
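
A quick sanity check after splitting is to compare the churn rate in the training and test labels, since a large gap makes validation scores harder to interpret. The sketch below uses toy label lists; the helpers positive_rate and rates_are_close and the 0.05 tolerance are hypothetical choices for illustration.

```python
def positive_rate(labels):
    """Share of rows labelled 1 (churned); 0.0 for an empty input."""
    labels = list(labels)
    return sum(labels) / len(labels) if labels else 0.0


def rates_are_close(train_labels, test_labels, tolerance=0.05):
    """True when the churn rates of the two splits differ by at most `tolerance`."""
    return abs(positive_rate(train_labels) - positive_rate(test_labels)) <= tolerance


# Toy labels standing in for y_train and y_test.
toy_train = [1] * 27 + [0] * 73
toy_test = [1] * 13 + [0] * 37
balanced = rates_are_close(toy_train, toy_test)
```

The same helpers accept y_train and y_test directly. If the rates diverge noticeably, train_test_split accepts a stratify argument (for example stratify=y) that preserves the class proportions in both splits.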

2. Model Optimisation

The XParamOptimiser is utilised to fine-tune the hyperparameters of our model. This process searches for the optimal parameters that will yield the best model performance, balancing accuracy and computational efficiency.

opt = XParamOptimiser()
params = opt.optimise(X_train, y_train)
Out:

100%|██████████| 30/30 [00:02<00:00, 11.79trial/s, best loss: -0.8268936134101414]
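
To make the idea of a parameter search concrete, here is a minimal sketch of a search loop that tries a fixed number of candidate parameter sets and keeps the one with the lowest loss. It uses plain random search as a stand-in, not the Bayesian procedure that XParamOptimiser runs, and the parameter names, search space, and toy objective are all hypothetical. The negative "best loss" in the output above is consistent with minimising the negative of a score, which is the convention the toy objective follows.

```python
import random


def random_search(loss_fn, space, n_trials=30, seed=0):
    """Sample `n_trials` parameter sets from `space` and keep the lowest-loss one."""
    rng = random.Random(seed)
    best_params, best_loss = None, float("inf")
    for _ in range(n_trials):
        # Draw one option per parameter, independently of earlier trials.
        params = {name: rng.choice(options) for name, options in space.items()}
        loss = loss_fn(params)
        if loss < best_loss:
            best_params, best_loss = params, loss
    return best_params, best_loss


def toy_loss(params):
    """Negative of a made-up score that peaks at depth 6 and gain 0.01."""
    score = 1.0 - abs(params["depth"] - 6) / 10 - abs(params["gain"] - 0.01)
    return -score


space = {"depth": [2, 4, 6, 8, 10], "gain": [0.001, 0.01, 0.1]}
best_params, best_loss = random_search(toy_loss, space)
```

A Bayesian optimiser differs in that each new candidate is chosen using the results of earlier trials rather than drawn independently.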

3. Model Training

With the optimised parameters obtained, the XClassifier is trained on the dataset. This classifier undergoes a fitting process with the training data, ensuring that it learns the underlying patterns and can make accurate predictions.

model = XClassifier(**params)
model.fit(X_train, y_train)
Out:

<xplainable.core.ml.classification.XClassifier at 0x15d314b20>

4. Model Interpretability and Explainability

Following training, the model.explain() method is called to generate insights into the model's decision-making process. This step is crucial for understanding the factors that influence the model's predictions and ensuring that the model's behaviour is transparent and explainable.

model.explain()

The image displays two graphs related to a churn prediction model.

On the left is the 'Feature Importances' bar chart, which ranks the features by their ability to predict customer churn. 'Tenure Months' has the highest importance, confirming that the length of customer engagement is the most significant indicator of churn likelihood. 'Monthly Charges' and 'Contract' follow, suggesting that financial and contractual commitments are also influential in churn prediction.

The right graph is a 'Contributions' histogram, which quantifies the impact of a specific feature's values on the prediction outcome. The red bars indicate that higher values within the selected feature correspond to a decrease in the likelihood of churn, whereas the green bars show that lower values increase this likelihood.

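'Gender' sits at the bottom of the 'Feature Importances' chart, which indicates that the model assigns it little direct weight when predicting churn. Low importance alone does not establish that the model is impartial with respect to gender, since other features can act as proxies, but it does show that gender itself contributes little to the predictions.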
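
The relationship between per-feature contributions and a prediction can be sketched as follows, assuming an additive model in which each feature value adds a fixed contribution to a base value. The base value, feature values, and contribution numbers are made up for illustration and are not read from the trained model; score_row is a hypothetical helper, not an xplainable function.

```python
def score_row(row, base_value, contributions):
    """Base value plus the contribution of each feature value found in `row`."""
    total = base_value
    for feature, value in row.items():
        # Unknown features or values contribute nothing in this sketch.
        total += contributions.get(feature, {}).get(value, 0.0)
    return total


# Illustrative numbers only.
base_value = 0.27
contributions = {
    "Contract": {"month-to-month": 0.12, "two year": -0.15},
    "Gender": {"male": 0.0, "female": 0.0},
}

customer = {"Contract": "month-to-month", "Gender": "male"}
score = score_row(customer, base_value, contributions)
```

In this toy setup, switching the customer's gender leaves the score unchanged because both values contribute zero, which is what a feature at the bottom of an importance ranking looks like in an additive model.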

5. Model Persisting

In this step, we persist the churn prediction model to Xplainable Cloud with client.models.create_model. The call returns two identifiers. The model_id identifies the model, which predicts the likelihood of customers leaving within the next month. The version_id identifies this particular version, created from the training data passed as x and y, and allows us to track and manage different versions systematically. If the call fails, both identifiers are set to None so that later steps can be skipped.

# Create a model
try:
    model_id, version_id = client.models.create_model(
        model=model,
        model_name="Telco Churn Model - v1.3.1",
        model_description="Predicting customers who are likely to leave the business within the next month.",
        x=X_train,
        y=y_train
    )
except XplainableAPIError as e:
    print(f"Error creating model: {e}")
    model_id, version_id = None, None
Out:

0%| | 0/19 [00:00<?, ?it/s]

SaaS Models View

Model Image

SaaS Explainer View

Model Image

6. Model Deployment

The code block below deploys the churn prediction model with client.deployments.deploy, passing the version_id obtained in the previous step as model_version_id. This creates the model's endpoint; the deployment must still be activated, as described in the testing steps, before it accepts prediction requests. The deployment response includes a deployment_id, which the following steps use.

if model_id and version_id:
    try:
        deployment_response = client.deployments.deploy(
            model_version_id=version_id  # <- Use version id produced above
        )
        deployment_id = deployment_response.deployment_id
    except XplainableAPIError as e:
        print(f"Error deploying model: {e}")
        deployment_id = None
else:
    deployment_id = None

SaaS Deployment View

Model Image

Testing the Deployment programmatically

This section demonstrates the steps taken to programmatically test a deployed model. These steps are essential for validating that the model's deployment is functional and ready to process incoming prediction requests.

  1. Activating the Deployment: The model deployment is activated using client.deployments.activate_deployment, which changes the deployment status to active, allowing it to accept prediction requests.

if deployment_id:
    try:
        client.deployments.activate_deployment(deployment_id=deployment_id)
    except XplainableAPIError as e:
        print(f"Error activating deployment: {e}")
else:
    print("Deployment ID not available")

  2. Creating a Deployment Key: A deployment key is generated with client.deployments.generate_deploy_key. This key is required to authenticate and make secure requests to the deployed model.

if deployment_id:
    try:
        deploy_key = client.deployments.generate_deploy_key(
            deployment_id=deployment_id,
            description='API key for Telco Churn',
            days_until_expiry=1
        )
        print(f"Deploy key created: {str(deploy_key)}")
    except XplainableAPIError as e:
        print(f"Error generating deploy key: {e}")
        deploy_key = None
else:
    deploy_key = None

  3. Generating Example Payload: An example payload for a deployment request is generated by client.deployments.generate_example_deployment_payload. This payload mimics the input data structure the model expects when making predictions. Alternatively, a payload can be built from a sampled row of the preprocessed data.

# Set the option to highlight multiple ways of creating data
option = 2

if option == 1 and version_id:
    try:
        body = client.deployments.generate_example_deployment_payload(
            model_version_id=version_id
        )
    except XplainableAPIError as e:
        print(f"Error generating example payload: {e}")
        body = []
else:
    body = json.loads(df_transformed.drop(columns=["Churn Label"]).sample(1).to_json(orient="records"))
    body[0]["Gender"] = None  # <- Won't require this line in the next release of xplainable

body

  4. Making a Prediction Request: A POST request is made to the model's prediction endpoint with the example payload. The model processes the input data and returns a prediction response, which includes the predicted class (e.g., 'No' for no churn) and the prediction probabilities for each class.

if deploy_key and body:
    response = requests.post(
        url="https://inference.xplainable.io/v1/predict",
        headers={'api_key': str(deploy_key)},
        json=body
    )

    value = response.json()
    print("Prediction result:", value)
else:
    print("Deploy key or body not available for prediction")
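
When a payload is assembled by hand rather than through to_json, missing numeric values can end up as NaN, which strict JSON does not allow. The following is a minimal sketch of cleaning a record before sending it; sanitise_record and the toy record are hypothetical and not part of the xplainable client.

```python
import json
import math


def sanitise_record(record):
    """Return a copy of `record` with NaN and infinite floats replaced by None."""
    return {
        key: None if isinstance(value, float) and not math.isfinite(value) else value
        for key, value in record.items()
    }


# Toy record with a missing numeric value.
raw = {"Tenure Months": 2, "Monthly Charges": float("nan"), "Contract": "month-to-month"}
payload = [sanitise_record(raw)]

# allow_nan=False makes json.dumps raise if a NaN slipped through.
encoded = json.dumps(payload, allow_nan=False)
```

Encoding with allow_nan=False acts as a guard: it fails locally instead of sending a body the endpoint may not be able to parse.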

SaaS Deployment Info

The SaaS application interface displayed above mirrors the operations performed programmatically in the earlier steps. It displays a dashboard for managing the 'Telco Customer Churn' model, facilitating a range of actions from deployment to testing, all within a user-friendly web interface. This makes it accessible even to non-technical users who prefer to manage model deployments and monitor performance through a graphical interface rather than code. Features like the deployment checklist, example payload, and prediction response are all integrated into the application, ensuring that users have full control and visibility over the deployment lifecycle and model interactions.

Model Image

Rendering AI-Generated Reports in Markdown

When working with AI-generated reports, readability is key. Instead of printing raw text, we can render the output directly as Markdown inside a Jupyter Notebook.

This allows us to use headings, lists, and other formatting styles, making the report much easier to consume and present.

report = client.gpt.generate_report(
    model_id=model_id,
    version_id=version_id,
    target_description="Customer churn likelihood (1 = will churn, 0 = will stay)",
    project_objective="Identify customers at risk of leaving to improve retention strategies",
    max_features=10,
    temperature=0.7
)

print("AI report generated!")
print(f"Report length: {len(report.report):,} characters")

from IPython.display import Markdown, display

display(Markdown(report.body))