Confusion Matrix and Cyber Security

Arnab Saha
8 min read · Aug 11, 2021

Hello All …

In this article, we will clear up the confusion around the Confusion Matrix and see how it is used in fighting cyber crime.

So let’s start with the basics… :)

What is a Confusion Matrix?

A Confusion matrix is an N x N matrix used for evaluating the performance of a classification model, where N is the number of target classes. The matrix compares the actual target values with those predicted by the machine learning model. This gives us a holistic view of how well our classification model is performing and what kinds of errors it is making.

A confusion matrix is a table that is often used to describe the performance of a classification model (or “classifier”) on a set of test data for which the true values are known. The confusion matrix itself is relatively simple to understand, but the related terminology can be confusing.

A confusion matrix is a reliable tool for classification problems. It shows how well or how badly the model performs on each class, which matters when errors on different classes have different impact. For example, if the model needs to catch instances of one particular class more than the other, we can build a measure for that from the confusion matrix. Let’s understand this with an example of two classes, 0 and 1. Four scenarios are possible for each prediction:

Class is 1 and our model predicted 1 — That’s correct!

Class is 1 and our model predicted 0 — Not good.

Class is 0 and our model predicted 1 — Again not good.

Class is 0 and our model predicted 0 — Correct!

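We can arrange all these scenarios in a matrix like this:

                  Predicted 1            Predicted 0
    Actual 1      True Positive (TP)     False Negative (FN)
    Actual 0      False Positive (FP)    True Negative (TN)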

Let’s understand TP, FP, FN and TN (true positive, false positive, false negative and true negative) using a pregnancy analogy.

◼ True Positive:

Interpretation: You predicted positive and it’s true.

You predicted that a woman is pregnant and she actually is.

◼ True Negative:

Interpretation: You predicted negative and it’s true.

You predicted that a man is not pregnant and he actually is not.

◼ False Positive: (Type 1 Error)

Interpretation: You predicted positive and it’s false.

You predicted that a man is pregnant but he actually is not.

◼ False Negative: (Type 2 Error)

Interpretation: You predicted negative and it’s false.

You predicted that a woman is not pregnant but she actually is.

Just remember: Positive and Negative describe the predicted value, while True and False describe whether that prediction matches the actual value.
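The four outcomes can be tallied directly from paired lists of actual and predicted labels. The snippet below is a minimal sketch in plain Python; the function name `confusion_counts` and the sample labels are hypothetical, picked only so the counts are easy to check by hand.

```python
from collections import Counter


def confusion_counts(actual, predicted):
    """Tally TP, FN, FP and TN for binary labels (1 = positive, 0 = negative)."""
    pairs = Counter(zip(actual, predicted))
    return {
        "TP": pairs[(1, 1)],  # class is 1, model predicted 1
        "FN": pairs[(1, 0)],  # class is 1, model predicted 0
        "FP": pairs[(0, 1)],  # class is 0, model predicted 1
        "TN": pairs[(0, 0)],  # class is 0, model predicted 0
    }


# Hypothetical labels, chosen only to illustrate the tally.
actual = [1, 1, 1, 1, 1, 0, 0, 0, 0]
predicted = [1, 1, 1, 1, 0, 1, 0, 0, 0]

counts = confusion_counts(actual, predicted)

# Rows are actual classes (1, then 0); columns are predicted classes (1, then 0).
matrix = [
    [counts["TP"], counts["FN"]],
    [counts["FP"], counts["TN"]],
]
print(matrix)  # [[4, 1], [1, 3]]
```

The diagonal holds the correct predictions; everything off the diagonal is an error of one kind or the other.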

Several metrics can be computed from the confusion matrix. The worked examples below use TP = 4, TN = 3, FP = 1 and FN = 1 (9 predictions in total).

1. ACCURACY:

Accuracy is the fraction of predictions that are correct (TP and TN) out of all predictions.

Accuracy = (TP + TN) / (TP + TN + FP + FN)

= (4 + 3) / 9 ≈ 0.78

Accuracy should be used when TP and TN matter most and the dataset is balanced, because in that case the score is not biased by the class distribution. In real-life classification problems, however, the class distribution is often imbalanced.
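As a quick check of the formula, and of why imbalance matters, here is a small sketch. The `accuracy` helper is hypothetical, and the imbalanced counts in the second call are made up purely for illustration.

```python
def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that were correct."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    return (tp + tn) / total


# Counts from the article's example (FP = 1 and FN = 1 follow from the
# precision and recall figures, 4/5 each).
article_accuracy = accuracy(tp=4, tn=3, fp=1, fn=1)
print(round(article_accuracy, 2))  # 0.78

# Hypothetical imbalanced data: 990 negatives, 10 positives, and a model
# that simply predicts "negative" for everything.
lazy_accuracy = accuracy(tp=0, tn=990, fp=0, fn=10)
print(lazy_accuracy)  # 0.99, yet not a single positive was caught
```

The second result is why accuracy alone can be misleading on imbalanced data.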

2. PRECISION:

Precision is the ratio of correctly classified positive samples to all samples predicted as positive. In other words: out of all the predicted positive values, how many were actually positive? Precision should be high.

Precision = TP / (TP + FP) = 4/5 = 0.8

3. RECALL:

Out of the total actual positive values, how many were correctly predicted as positive?

Recall = TP / (TP + FN) = 4/5 = 0.8

Which one to favor depends on the problem statement: when false positives are more costly, optimize for Precision; when false negatives are more costly, optimize for Recall.
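Both ratios are one-liners once the counts are known. A minimal sketch follows, with hypothetical helper names; returning 0.0 when the denominator is zero is a convention assumed here, not something the article specifies.

```python
def precision(tp, fp):
    """Of everything predicted positive, the share that really was positive."""
    predicted_positive = tp + fp
    return tp / predicted_positive if predicted_positive else 0.0


def recall(tp, fn):
    """Of everything actually positive, the share the model found."""
    actual_positive = tp + fn
    return tp / actual_positive if actual_positive else 0.0


# Counts from the article's example: TP = 4, FP = 1, FN = 1.
p = precision(tp=4, fp=1)
r = recall(tp=4, fn=1)
print(p, r)  # 0.8 0.8
```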

4. F-BETA SCORE:

In some use cases, both precision and recall are important. Even when one of the two matters more than the other, it is often better to combine them into a single score than to look at either one alone. The F-beta score does this, with beta controlling how much weight recall gets relative to precision.
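For reference, the usual definition is F_beta = (1 + beta²) · P · R / (beta² · P + R), where P is precision and R is recall; beta = 1 gives the familiar F1 score. The sketch below uses a hypothetical helper name, and the second set of inputs is invented only to show how beta shifts the result.

```python
def f_beta(precision, recall, beta=1.0):
    """Weighted harmonic mean of precision and recall.

    beta > 1 weights recall more heavily; beta < 1 weights precision more.
    """
    beta_sq = beta ** 2
    denominator = beta_sq * precision + recall
    if denominator == 0:
        return 0.0
    return (1 + beta_sq) * precision * recall / denominator


# The article's example: precision = recall = 0.8, so F1 is 0.8 as well.
f1_article = f_beta(0.8, 0.8)

# Hypothetical model with low precision but perfect recall.
f1 = f_beta(0.5, 1.0, beta=1.0)    # about 0.667
f2 = f_beta(0.5, 1.0, beta=2.0)    # about 0.833, rewards the high recall
f_half = f_beta(0.5, 1.0, beta=0.5)  # about 0.556, penalizes the low precision
print(round(f1_article, 3), round(f1, 3), round(f2, 3), round(f_half, 3))
```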

WHAT IS CYBERSECURITY?

Cybersecurity is the protection of internet-connected systems such as hardware, software and data from cyberthreats. The practice is used by individuals and enterprises to protect against unauthorized access to data centers and other computerized systems.

A strong cybersecurity strategy can provide a good security posture against malicious attacks designed to access, alter, delete, destroy or extort an organization’s or user’s systems and sensitive data. Cybersecurity is also instrumental in preventing attacks that aim to disable or disrupt a system’s or device’s operations.

Common types of cyber attacks

  • Malware
  • Phishing
  • Man-in-the-middle attack
  • Denial-of-service attack
  • SQL injection
  • Zero-day exploit
  • DNS Tunneling

WHY IS CYBERSECURITY IMPORTANT?

With an increasing number of users, devices and programs in the modern enterprise, combined with the increased deluge of data, much of which is sensitive or confidential, the importance of cybersecurity continues to grow. The growing volume and sophistication of cyber attackers and attack techniques compound the problem even further.

HOW CAN WE RELATE THE CONFUSION MATRIX TO CYBERSECURITY?

Suppose a cyberattack happens in an organization, but the Machine Learning model does not raise an alert. This can cause a huge loss to the organization in terms of data security, data privacy and trust. It happens because the model failed to predict the threat, which is a False Negative case.

In the opposite situation, no cyberattack has happened but the model predicts that one has. This wastes precious time on investigating a threat that does not exist. This is a False Positive case.
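Both failure modes can be expressed as rates taken from the confusion matrix: the share of real attacks that were missed, and the share of harmless events that raised an alarm. The sketch below reuses the article's earlier counts (TP = 4, TN = 3, FP = 1, FN = 1) as if they came from an attack detector, which is an assumption made only for illustration; the function name is hypothetical too.

```python
def detector_error_rates(tp, tn, fp, fn):
    """Return (miss_rate, false_alarm_rate) for an attack detector.

    miss_rate        = FN / (FN + TP): real attacks the model did not flag
    false_alarm_rate = FP / (FP + TN): harmless events flagged as attacks
    """
    attacks = tp + fn  # events that really were attacks
    benign = tn + fp   # events that really were harmless
    miss_rate = fn / attacks if attacks else 0.0
    false_alarm_rate = fp / benign if benign else 0.0
    return miss_rate, false_alarm_rate


miss_rate, false_alarm_rate = detector_error_rates(tp=4, tn=3, fp=1, fn=1)
print(f"missed attacks: {miss_rate:.0%}, false alarms: {false_alarm_rate:.0%}")
```

In a security setting, the miss rate is usually the number to push down first, since a missed attack tends to cost far more than a wasted investigation.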

What is the risk?

Risk is defined as the potential for loss or damage when a threat exploits a vulnerability. Examples of risk include:

  • Financial losses
  • Loss of privacy
  • Damage to your reputation
  • Legal implications
  • Even loss of life

Risk can also be defined as:

Risk = Threat x Vulnerability

So for monitoring cyber attacks, the confusion matrix plays a vital role in judging a classifier. One attack classification approach is described below:

We proposed a new method for cyber attack classification based on a parallel support vector machine (SVM) and a distant feature set of attack attributes. All of the features are ranked by their Kullback-Leibler (K-L) distances, which is an alternative way to measure the importance of a feature in discriminating between two classes. Features are also discriminated using the Euclidean distance formula, which measures the similarity of features by attack category. After calculating the discrimination, we apply a parallel support vector machine. SVM, which was developed by Vapnik, is one of the methods receiving increasing attention thanks to its remarkable results. SVM implements the principle of Structural Risk Minimization by constructing an optimal separating hyperplane in the hidden feature space, using quadratic programming to find a unique solution. Originally, SVM was developed for pattern recognition problems.
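To make the K-L ranking step concrete, here is a rough sketch in plain Python. The source does not say how the distributions are estimated, so several choices below are assumptions: features are binned into equal-width histograms, a small smoothing constant avoids division by zero, and the symmetrized form KL(p‖q) + KL(q‖p) is used as the "distance". All function names and the toy data are hypothetical.

```python
import math


def histogram(values, lo, hi, bins, eps=1e-6):
    """Smoothed, normalized equal-width histogram of values over [lo, hi]."""
    counts = [eps] * bins
    width = (hi - lo) / bins or 1.0  # guard against a constant feature
    for v in values:
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    total = sum(counts)
    return [c / total for c in counts]


def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q) for discrete distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


def rank_features(rows, labels, bins=4):
    """Return feature indices, most discriminative first."""
    scores = []
    for j in range(len(rows[0])):
        column = [row[j] for row in rows]
        lo, hi = min(column), max(column)
        attack = [row[j] for row, y in zip(rows, labels) if y == 1]
        normal = [row[j] for row, y in zip(rows, labels) if y == 0]
        p = histogram(attack, lo, hi, bins)
        q = histogram(normal, lo, hi, bins)
        # Symmetrized K-L distance (an assumption, see the note above).
        scores.append((kl_divergence(p, q) + kl_divergence(q, p), j))
    return [j for _, j in sorted(scores, reverse=True)]


# Toy data: feature 0 separates the classes, feature 1 does not.
rows = [[0.10, 5.0], [0.20, 1.0], [0.15, 3.0],
        [0.90, 5.0], [0.80, 1.0], [0.95, 3.0]]
labels = [0, 0, 0, 1, 1, 1]  # 1 = attack, 0 = normal

ranking = rank_features(rows, labels)
print(ranking)  # [0, 1]
```

A feature whose values look the same for attack and normal traffic scores near zero and lands at the bottom of the ranking.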

Recently, a regression version of SVM has emerged as an alternative and powerful technique for solving regression problems by introducing an alternative loss function. Although SVM has been successfully applied in many fields, a conspicuous problem appears in its practical application. In the parallel SVM, we first reduce the non-classified feature data using a distance matrix of the binary pattern. From this concept, the cascade structure is developed: the problem is initialized with a number of independent smaller optimizations, and the partial results are combined in later stages in a hierarchical way, as shown in the image below, supposing the training data subsets are independent of each other.

This figure shows the cascaded support vector machine. In this machine, the data passes through five stages of feature discernment, and the results are all passed to an optimized support vector machine for classification.

Steps for Data Preprocessing

  • Transform data to the format of an SVM
  • Conduct scaling on the data
  • Consider the RBF kernel K(x, y) = exp(-γ‖x - y‖²)
  • Use cross-validation to find the best parameters C and γ
  • Use the best parameters C and γ to train on the whole training set
  • Generate formatted data.
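Two of the steps above, scaling and the RBF kernel, are easy to sketch without any library. The snippet below is illustrative only: min-max scaling to [0, 1] is one common choice (the source does not say which scaling is used), the candidate values for C and gamma are arbitrary, and the function names are hypothetical. In practice an SVM library would handle the training and cross-validation steps.

```python
import math


def min_max_scale(rows):
    """Scale every feature column to the range [0, 1]."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    spans = [(max(col) - min(col)) or 1.0 for col in columns]  # avoid /0
    return [
        [(value - lo) / span for value, lo, span in zip(row, lows, spans)]
        for row in rows
    ]


def rbf_kernel(x, y, gamma):
    """RBF kernel: exp(-gamma * squared Euclidean distance between x and y)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


def parameter_grid(c_values, gamma_values):
    """All (C, gamma) pairs that cross-validation would have to try."""
    return [(c, gamma) for c in c_values for gamma in gamma_values]


# Hypothetical raw features on very different scales.
raw = [[0.0, 100.0], [5.0, 300.0], [10.0, 200.0]]
scaled = min_max_scale(raw)
print(scaled)  # [[0.0, 0.0], [0.5, 1.0], [1.0, 0.5]]

# Similarity is 1.0 for identical points and decays with distance.
print(rbf_kernel(scaled[0], scaled[0], gamma=1.0))  # 1.0
print(rbf_kernel(scaled[0], scaled[2], gamma=1.0))

# Arbitrary candidate values, for illustration only.
grid = parameter_grid(c_values=[0.1, 1, 10], gamma_values=[0.01, 0.1])
print(len(grid))  # 6
```

Scaling comes first because the RBF kernel depends on distances: without it, the feature with the largest numeric range would dominate the kernel value.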

Steps of Cyber Data Classification

  • Read the preprocessed data
  • All the classes are represented

Conclusion

This blog presents a new cyber attack detection and classification system. In it, we worked on the performance of an intrusion detection system (IDS) using a parallel support vector machine (PSVM) for distributed cyber-attack detection and classification. The new PSVM is shown to be more efficient at detecting and classifying different types of cyber-attacks compared to SDF.

Extra

Four Major Examples of Cyber Crime

In 2013–2016, Yahoo experienced a data breach which resulted in the theft of 3 billion user accounts. For some of these accounts, the attackers got hold of private information and passwords, which could be used to access user accounts in other online services. Much of this data is available today, either free or for a price, on the dark web.

In 2014, US retailer Home Depot’s point of sale systems were breached. Attackers stole 50 million personal credit cards, and for some time any credit card swiped at Home Depot stores was captured and its details compromised by the attackers.

In 2016, the largest ever distributed denial of service (DDoS) attack took place, which used over 1 million connected devices in the Internet of Things, which were compromised by the attackers due to software vulnerabilities. The attack caused outages in the global domain name system (DNS) and popular services including Twitter, Netflix and PayPal.

In 2017, the WannaCry attack, allegedly launched by North Korea, unleashed a type of ransomware which not only locks down content on user devices, but also rapidly spreads itself. WannaCry infected 300,000 computers around the world, and users were asked to pay hundreds of dollars to decrypt and restore their data.

If you liked this article, please give it a clap. 😁

Connect with me on LinkedIn | GitHub
