Data poisoning attacks

By Matteo Bregonzio January 25, 2023


Data poisoning is an increasingly important security concern for Machine Learning (ML) systems. As machine learning models become more prevalent in our lives, they also become more exposed to malicious attacks. Data poisoning attacks are among the most insidious and difficult-to-detect threats to ML models.

Data poisoning is a type of adversarial attack in which a cybercriminal injects malicious data into the training data of a machine learning model. These attacks can be used to manipulate the results of a machine learning system, or to divert the system’s behavior away from its intended purpose.

This threat is particularly dangerous because it is difficult to detect. The malicious data often looks just like normal data, so it is hard for the machine learning system to distinguish malicious samples from legitimate ones. This makes data poisoning a powerful and dangerous tool for attackers.

These attacks exploit the way machine learning systems learn from data and can target both supervised and unsupervised algorithms: the attacker manipulates the training dataset in order to control the prediction behavior of the trained model.

The malicious data is designed to make the system learn inaccurate or unreliable behavior. For example, an attacker may inject malicious data that causes the system to incorrectly classify an image as a dog when it is actually a cat. The attacker may also inject malicious data that makes the system unreliable, for example, data that causes the system to always predict the same result regardless of the input.
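To make this concrete, here is a minimal sketch of one of the simplest poisoning strategies, label flipping, using a synthetic scikit-learn dataset. The dataset, model choice and poisoning fraction are illustrative assumptions, not a description of any specific real-world attack; the point is only to show how corrupting a small share of training labels can degrade a trained model.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Synthetic binary classification task standing in for the real training data.
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

# The attacker flips the labels of a small fraction of the training set
# (e.g. "cat" samples relabeled as "dog").
rng = np.random.default_rng(0)
poison_fraction = 0.15  # illustrative value
n_poison = int(poison_fraction * len(y_train))
poison_idx = rng.choice(len(y_train), size=n_poison, replace=False)
y_poisoned = y_train.copy()
y_poisoned[poison_idx] = 1 - y_poisoned[poison_idx]  # flip 0 <-> 1

# Train one model on clean labels and one on poisoned labels, then compare.
clean_model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
poisoned_model = LogisticRegression(max_iter=1000).fit(X_train, y_poisoned)

print("accuracy with clean labels:   ", accuracy_score(y_test, clean_model.predict(X_test)))
print("accuracy with poisoned labels:", accuracy_score(y_test, poisoned_model.predict(X_test)))
```

The poisoned features look identical to clean ones; only the labels are wrong, which is precisely why this kind of attack is hard to spot by inspecting the data.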

It is therefore essential for organizations to take steps to detect and mitigate the risk of data poisoning attacks. By monitoring the data used by the machine learning system and applying data validation and sanitization techniques, organizations can protect their machine learning systems from data poisoning attacks.
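As an illustration of what such sanitization might look like, the sketch below drops training points flagged as anomalous by an Isolation Forest before a model is fitted. The function name, contamination rate and detector choice are assumptions made for the example; real pipelines would combine this with schema and range validation, provenance checks and per-class analysis rather than relying on a single outlier detector.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

def sanitize_training_data(X, y, contamination=0.05):
    """Flag suspected anomalous training points and return the cleaned set.

    `contamination` is the assumed share of suspicious samples; returned
    indices should be reviewed by a human before the data is discarded.
    """
    detector = IsolationForest(contamination=contamination, random_state=0)
    inlier_mask = detector.fit_predict(X) == 1  # -1 marks suspected outliers
    suspicious_idx = np.where(~inlier_mask)[0]
    return X[inlier_mask], y[inlier_mask], suspicious_idx

# Usage (hypothetical): inspect the flagged indices, then train on the cleaned data.
# X_clean, y_clean, suspicious_idx = sanitize_training_data(X_train, y_train)
```

Note that outlier filtering alone will not catch carefully crafted poisons that stay close to the legitimate data distribution, which is why monitoring and validation at every stage remain necessary.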

AIA Guard has developed an advanced data poisoning module that concretely evaluates each step of AI model development, including training, validation and testing. As a result, organizations gain broader awareness of poisoning risk and of the trustworthiness of their AI.