After you gather data, it must undergo cleaning, pre-processing, and wrangling. Next, you feed it into a trained model, which produces predictions, often as probabilities. Comparing those predictions against the actual labels is what produces the confusion matrix.

In this guide, you’ll find the answer to the question “what is a confusion matrix?” You’ll also learn what a confusion matrix tells you and when you should use one.

- What is a Confusion Matrix?
- How Do You Read the Confusion Matrix?
- Key Metrics Derived from the Confusion Matrix
- Why Do We Need a Confusion Matrix?
- What Does a Confusion Matrix Tell You?
- When Should You Use a Confusion Matrix?
- How to Plot a Confusion Matrix Using a Sankey Chart in Power BI?
- Wrap Up

First…

A confusion matrix summarizes the performance of a machine learning model on a set of test data. That is, it displays the number of accurate and inaccurate instances based on the model’s predictions.

The matrix breaks the model’s predictions on the test data into four categories:

- True positives (TP): the model correctly predicts the positive class for a data point.
- True negatives (TN): the model correctly predicts the negative class for a data point.
- False positives (FP): the model predicts the positive class, but the actual class is negative.
- False negatives (FN): the model predicts the negative class, but the actual class is positive.
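The four counts above can be tallied directly by comparing predictions with ground-truth labels. A minimal Python sketch (the labels and data here are illustrative):

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count TP, TN, FP, FN by comparing each prediction to its actual label."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p != positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    return tp, tn, fp, fn

# Example: 1 = positive class, 0 = negative class
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
print(confusion_counts(y_true, y_pred))  # (3, 3, 1, 1)
```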

Here are easy ways of reading and interpreting a confusion matrix.

The confusion matrix for a binary classification problem (two classes, denoted as Positive and Negative) looks like this:

- True Positive (TP): predicted Positive, and the actual class is Positive.
- False Positive (FP): predicted Positive, but the actual class is Negative.
- False Negative (FN): predicted Negative, but the actual class is Positive.
- True Negative (TN): predicted Negative, and the actual class is Negative.

Here’s how the performance metrics are calculated.

- **Accuracy:** The overall correctness of the model, calculated as (TP + TN) / (TP + TN + FP + FN).
- **Precision:** The proportion of true positive predictions among all positive predictions, calculated as TP / (TP + FP).
- **Recall (Sensitivity):** The proportion of actual positive instances that were correctly predicted, calculated as TP / (TP + FN).
- **Specificity:** The proportion of actual negative instances that were correctly predicted, calculated as TN / (TN + FP).
- **F1-score:** The harmonic mean of precision and recall, calculated as 2 × (Precision × Recall) / (Precision + Recall); it provides a single metric that balances both.
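These formulas translate directly into a few lines of Python. A small sketch (the counts passed in at the end are illustrative):

```python
def metrics(tp, tn, fp, fn):
    """Compute the standard confusion-matrix metrics from the four counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)          # sensitivity
    specificity = tn / (tn + fp)
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, specificity, f1

# With TP=3, TN=3, FP=1, FN=1 every metric works out to 0.75
print(metrics(3, 3, 1, 1))  # (0.75, 0.75, 0.75, 0.75, 0.75)
```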

Here are the major reasons why a confusion matrix is essential for evaluating the performance of a classification model.

- **Performance Evaluation:** It offers a thorough breakdown of how well a model performs in terms of correctly and incorrectly classified instances across classes. This helps stakeholders grasp the model’s strengths and weaknesses.
- **Metrics Calculation:** You can calculate various performance metrics from the confusion matrix, such as precision, specificity, recall (sensitivity), accuracy, and F1-score. These metrics offer quantitative measures of the model’s effectiveness and suitability for the intended application.
- **Business Impact:** A good grasp of the confusion matrix helps in assessing the business impact of model predictions. For instance, in medical diagnostics, correctly identifying disease patients (true positives) and minimizing missed diagnoses (false negatives) is critical for patient care and outcomes.
- **Error Analysis:** It helps in pointing out the types of errors a model is making, whether false positives or false negatives. This makes it easier to figure out where the model needs improvement or where adjustments in data preprocessing or thresholds are necessary.
- **Model Selection:** Stakeholders can choose the best-performing model by comparing confusion matrices from different models. This supports data-driven decisions on which model to deploy based on its ability to correctly classify instances.

A confusion matrix offers insights into the performance of a classification model:

- **Accuracy:** Shows the overall correctness of predictions.
- **Precision:** Shows the proportion of true positives among positive predictions.
- **Recall (Sensitivity):** Shows the proportion of actual positives correctly predicted.
- **Specificity:** Shows the proportion of actual negatives correctly predicted.
- **False Positives and Negatives:** Instances where the predictions do not match actual outcomes.
- **True Positives and Negatives:** Instances where predictions match actual outcomes.
- **Performance Across Classes:** Helps identify the classes that are well-predicted and the ones that need improvement.
- **Threshold Optimization:** Guides adjustments to decision thresholds based on desired trade-offs between different types of errors.

A confusion matrix is useful in these scenarios:

- **Model Evaluation:** Helps in assessing the performance of a classification model across multiple classes.
- **Error Analysis:** Helps to figure out where the model makes mistakes, such as false negatives and false positives.
- **Comparative Analysis:** Compares multiple models to determine which one performs best.
- **Metric Calculation:** Computes performance metrics like accuracy, recall, precision, specificity, and F1-score.
- **Business Impact Assessment:** Evaluates the practical implications of model predictions, especially in scenarios where different types of errors have varying consequences.
- **Threshold Adjustment:** Optimizes decision thresholds based on the desired balance between precision and recall.
- **Continuous Improvement:** Provides insights for refining the model through iterative adjustments in data preprocessing, algorithm selection, and feature engineering.

Here are steps to help you create a confusion matrix.

- **Obtain Predictions and Actual Labels:** You need a set of predictions made by your classification model and the corresponding actual labels (ground truth).
- **Count True Positives (TP), True Negatives (TN), False Positives (FP), and False Negatives (FN):** Compare each prediction with the actual label:
  - TP: Predicted positive, actually positive.
  - TN: Predicted negative, actually negative.
  - FP: Predicted positive, actually negative.
  - FN: Predicted negative, actually positive.

- Create a 2×2 matrix where:
- Rows represent the actual classes (e.g., Positive and Negative).
- Columns represent the predicted classes (e.g., Predicted Positive and Predicted Negative).
- Populate the matrix with the counts of TP, TN, FP, and FN in their respective cells.

- From your confusion matrix, compute metrics like accuracy, precision, recall (sensitivity), specificity, and F1-score using the formulas derived from TP, TN, FP, and FN counts.

- You need to visualize the confusion matrix to interpret the model’s performance visually. You’ll also need to analyze the metrics to understand how well the model is classifying instances and where improvements may be needed.
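If you work in Python, you don’t have to count by hand: scikit-learn (assuming it is installed) provides `confusion_matrix`, which returns rows for actual classes and columns for predicted classes, with labels sorted in ascending order. The data below is illustrative:

```python
from sklearn.metrics import confusion_matrix

y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

# For binary labels [0, 1] the layout is [[TN, FP], [FN, TP]]
cm = confusion_matrix(y_true, y_pred)
tn, fp, fn, tp = cm.ravel()
print(tn, fp, fn, tp)  # 3 1 1 3
```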

**Stage 1: Logging in to Power BI**

- Log in to Power BI.
- Enter your email address and click the “**Submit**” button.

- You are redirected to your Microsoft account.
- Enter your password and click “**Sign in**”.

- You can choose whether to stay signed in.

- Once done, the Power BI home screen will open.

- Go to the left-side menu and click the “**Create**” button.
- Select “**Paste or manually enter data**”.

- We’ll use the sample data below for this example.

| Source | Target | Count |
|---|---|---|
| Class-1 instances correctly classified as class-1 | Predicted Class-1 | 10 |
| Class-1 instances misclassified as class-2 | Predicted Class-2 | 6 |
| Class-2 instances misclassified as class-1 | Predicted Class-1 | 2 |
| Class-2 instances correctly classified as class-2 | Predicted Class-2 | 12 |

- Paste the data table into the “**Power Query**” window. Next, choose the “**Create a semantic model only**” option.

- Navigate to the left-side menu and click the “**Data Hub**” option. Power BI populates the data set list. If no data set has been created, you’ll see an error message. Next, click “**Create report**.”

- Click on “**Get more visuals**” and search for ChartExpo. After that, select “**Sankey Chart**.”

- Click on “**Add**.”

- You’ll see the Sankey Chart in the visuals list.

- In the visual, click on “License Settings,” and add the key. After adding the key, the Sankey Chart will be displayed on your screen.

- You’ll have to add the header text on top of the chart.

- You can disable the percentage values as shown below:

- You can set the color of each node.

- The final look of the Sankey Chart is shown below.

From the data, you’ll see the classification model’s performance: 10 Class-1 instances are correctly identified, while 6 are misclassified as Class-2. For Class-2, 12 instances are correctly classified, but 2 are misclassified as Class-1.
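You can verify this reading with quick arithmetic, treating Class-1 as the positive class:

```python
# Counts taken from the sample data above, with Class-1 as the positive class
tp, fn = 10, 6   # Class-1 correctly classified / misclassified as Class-2
fp, tn = 2, 12   # Class-2 misclassified as Class-1 / correctly classified

accuracy = (tp + tn) / (tp + tn + fp + fn)   # 22 / 30
precision = tp / (tp + fp)                   # 10 / 12
recall = tp / (tp + fn)                      # 10 / 16

print(round(accuracy, 3), round(precision, 3), round(recall, 3))  # 0.733 0.833 0.625
```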

The four values in a confusion matrix are:

- True Positive (TP)
- True Negative (TN)
- False Positive (FP)
- False Negative (FN).

Type 1 error (False Positive): Predicted positive but negative.

Type 2 error (False Negative): Predicted negative but positive.

A good confusion matrix shows high values on the diagonal (True Positives and True Negatives) and low values off the diagonal (False Positives and False Negatives). This pattern indicates accurate predictions across classes.

A confusion matrix is designed to show model predictions versus the actual outcomes in a classification task. It helps in evaluating model performance and understanding errors (like false negatives/positives). It also helps in calculating metrics like recall, precision, and accuracy.

With a confusion matrix, you can evaluate decision thresholds for classification outputs. Stakeholders can adjust these thresholds based on the trade-offs between different types of errors.
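A minimal sketch of threshold adjustment, assuming the model outputs positive-class probabilities (the values here are illustrative):

```python
def classify(probs, threshold=0.5):
    """Turn predicted positive-class probabilities into 0/1 labels."""
    return [1 if p >= threshold else 0 for p in probs]

probs = [0.20, 0.45, 0.55, 0.90]      # hypothetical model outputs
print(classify(probs, 0.5))           # [0, 0, 1, 1]
print(classify(probs, 0.4))           # [0, 1, 1, 1]
```

Lowering the threshold flags more instances as positive, which tends to raise recall at the cost of more false positives; raising it does the reverse. The confusion matrix at each candidate threshold shows which trade-off is acceptable.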

To analyze the confusion matrix, you’ll have to use good visuals — and that’s where tools like ChartExpo come into play.