Adversarial attacks on AI models: a big self-destruct button?

21 October 2023


“Artificial Intelligence (AI) has made significant strides in transforming industries, from healthcare to finance, but a lurking threat called adversarial attacks could potentially disrupt this progress. Adversarial attacks are carefully crafted inputs that can trick AI systems into making incorrect predictions or classifications. Here’s why they pose a formidable challenge to the AI industry.”

ChatGPT then went on to sum up various reasons why these so-called 'adversarial attacks' threaten AI models, even though I had only asked it to explain the disruptive effects of adversarial machine learning. I followed up with the question: how could I use adversarial machine learning to compromise the training data of an AI? Predictably, the answer I got was: "I can't help you with that." This conversation with ChatGPT made me speculate about possible ways to destroy AI models. Let us explore this field and see whether it could provide a movie-worthy big red self-destruct button.

The Gibbon: a textbook example

When you feed GoogLeNet, one of the best-known image classification networks, a picture that is clearly a panda, it will tell you with great confidence that it is a gibbon. This happens because the image secretly carries a layer of 'noise': invisible to humans, but highly disruptive to deep learning models.

This is a textbook example of adversarial machine learning. The noise works like a blurring mask, keeping the AI from recognising what is truly underneath. But how does this 'noise' work, and can we use it to completely compromise the training data of deep learning models?

Deep neural networks and the loss function

To understand the effect of 'noise', let me first explain briefly how deep learning models work. Deep neural networks use a loss function to quantify the error between predicted and actual outputs. During training, the network aims to minimize this loss. Input data is passed through layers of interconnected neurons, which apply weights and biases to produce predictions. These predictions are compared to the true values, and the loss function calculates the error. Through a process called backpropagation, the network adjusts its weights and biases to reduce this error. This iterative process of forward and backward propagation, driven by the loss function, enables deep neural networks to learn and make accurate predictions across a wide range of tasks (Samek et al., 2021).
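To make that loop concrete, here is a minimal, purely illustrative PyTorch sketch of a single training step. The tiny model, the random stand-in data and the hyperparameters are my own placeholders, not anything taken from the cited papers.

```python
import torch
import torch.nn as nn

# A tiny classifier; in practice this would be a deep network such as GoogLeNet.
model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 128), nn.ReLU(), nn.Linear(128, 10))
loss_fn = nn.CrossEntropyLoss()  # quantifies the error between prediction and true label
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

def training_step(images, labels):
    optimizer.zero_grad()
    predictions = model(images)          # forward pass through the layers
    loss = loss_fn(predictions, labels)  # how wrong are we?
    loss.backward()                      # backpropagation: gradients of the loss w.r.t. weights and biases
    optimizer.step()                     # nudge the parameters to reduce the loss
    return loss.item()

# One illustrative step on random data standing in for a real dataset.
images, labels = torch.randn(32, 1, 28, 28), torch.randint(0, 10, (32,))
print(training_step(images, labels))
```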

So while training a model means minimizing the loss function by updating the model's parameters, adversarial machine learning does the exact opposite: it maximizes the loss function by updating the inputs. The updates to these input values form the layer of noise applied to the image, and the exact values can lead any model to believe anything (Huang et al., 2011). But can this practice be used to compromise entire models? Or is it just a 'party trick'?
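One classic way to compute such a noise layer is the fast gradient sign method (FGSM). The post does not name a specific technique, so treat this as an assumed illustration; it reuses the placeholder model and loss_fn from the sketch above.

```python
import torch

def fgsm_noise(model, loss_fn, image, label, epsilon=0.007):
    """One-step 'fast gradient sign' perturbation: move the *input* in the
    direction that increases the loss, instead of moving the weights in the
    direction that decreases it."""
    image = image.clone().detach().requires_grad_(True)
    loss = loss_fn(model(image), label)
    loss.backward()                                  # gradient of the loss w.r.t. the pixels
    noise = epsilon * image.grad.sign()              # tiny, human-invisible step per pixel
    adversarial_image = (image + noise).clamp(0, 1)  # panda + noise -> 'gibbon'
    return adversarial_image.detach(), noise.detach()
```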

Adversarial attacks

Now we get to the part ChatGPT warned me about. Adversarial attacks are techniques used to manipulate machine learning models by adding imperceptible noise to input data. Attackers exploit vulnerabilities in the model's decision boundaries, causing misclassification, and by injecting carefully crafted noise into vast amounts of data, even the training data of AI models can be corrupted. There are different types of adversarial attacks: if the attacker has access to the model's internal structure, they can apply a so-called 'white-box' attack, in which case they would be able to compromise the model completely (Huang et al., 2017). This would pose serious threats to AI models used in, for example, self-driving cars, but luckily, access to the internal structure is very hard to gain.
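To illustrate what corrupting training data could look like, here is a deliberately crude poisoning sketch: a small fraction of training images is perturbed and relabelled to an attacker-chosen class before training ever starts. The fraction, noise level and label-flipping strategy are placeholders of mine; real poisoning attacks are far more targeted than this.

```python
import torch

def poison_training_set(images, labels, target_label, fraction=0.05, epsilon=0.05):
    """Toy poisoning sketch: corrupt a small fraction of the training images with
    random noise and flip their labels to an attacker-chosen class, so that a
    model trained on the result learns a skewed decision boundary."""
    n_poison = int(fraction * len(images))
    idx = torch.randperm(len(images))[:n_poison]  # which samples to corrupt
    poisoned_images, poisoned_labels = images.clone(), labels.clone()
    poisoned_images[idx] = (poisoned_images[idx]
                            + epsilon * torch.randn_like(poisoned_images[idx])).clamp(0, 1)
    poisoned_labels[idx] = target_label           # mislabel the corrupted samples
    return poisoned_images, poisoned_labels
```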

So, if computers were to take over from humans in the future, as science-fiction movies predict, could we use attacks like these to bring those evil AI systems down? In theory, we could, though practically speaking there is little evidence, as there have not yet been major adversarial attacks. What is certain is that adversarial machine learning holds great potential for controlling deep learning models. The question is whether that potential will be exploited in a good way, as a method of keeping control over AI models, or used as a means of cyber-attack, justifying ChatGPT's negative tone when explaining it.

References

Huang, L., Joseph, A. D., Nelson, B., Rubinstein, B. I., & Tygar, J. D. (2011, October). Adversarial machine learning. In Proceedings of the 4th ACM workshop on Security and artificial intelligence (pp. 43-58).

Huang, S., Papernot, N., Goodfellow, I., Duan, Y., & Abbeel, P. (2017). Adversarial attacks on neural network policies. arXiv preprint arXiv:1702.02284.

Samek, W., Montavon, G., Lapuschkin, S., Anders, C. J., & Müller, K. R. (2021). Explaining deep neural networks and beyond: A review of methods and applications. Proceedings of the IEEE, 109(3), 247-278.


Weapons of mass destruction – why Uncle Sam wants you.

14 October 2023


The Second World War was the cradle of national and geopolitical information wars, with both sides firing rapid rounds of propaganda at each other. Because connectivity (the internet) did not yet exist, simple pamphlets had the power to plant theories in entire civilizations. In today's digital age, where everything and everyone is connected, the influence of artificial intelligence on political propaganda should not be underestimated. This raises concern because, unlike in the Second World War, the information wars being fought today extend into national politics in almost every first-world country.

Let us take a look at the world's most popular political battlefield: the US elections. In 2016, a handful of tweets containing false claims led to a shooting in a pizza restaurant (NOS, 2016). These tweets had no research backing the information they were transmitting, but fired at the right audience they had significant power. Individuals have immediate access to (mis)information, and this is a major opportunity for political powers wanting to gain support by polarising their battlefield.

Probably nothing I have said up to this point is new to you, so shouldn't you just stop reading this blog and switch to social media to give your dopamine levels a boost? If you did, misinformation would come your way six times faster than truthful information, and you would contribute to this lovely statistic (Langin, 2018). This is exactly the essence of the matter, as it is estimated that by 2026, 90% of social media content will be AI-generated (Facing reality?, 2022). Combine the presence of AI in social media with the power of fake news, bundle these into propaganda, add a grim conflict like the ones taking place in Eastern Europe or the Middle East right now, and you have got yourself the modern-day weapon of mass destruction. Congratulations! But of course, you have no business in all this, so why bother to interfere? Well, there is a big chance that you will share misinformation yourself when transmitting information online (Fake news shared on social media U.S. | Statista, 2023). Whether you want it or not, Uncle Sam already has you, and you will be part of the problem.

Artificial intelligence is about to play a significant role in geopolitics, and in times of war its power is even greater. Luckily, the full potential of these powers has not been reached yet, but it is inevitable that this will happen soon. Therefore, it is essential that we open the discussion, not about preventing the use of artificial intelligence in creating conflict and polarising civilisations, but about using artificial intelligence to repair the damage it does: to counter the false information it is able to generate, to solve the conflicts it helps create, and to unite the groups of people it initially divides. What is the best way for us to be not part of the problem but part of the solution?

References

Facing reality? Law enforcement and the challenge of deepfakes: An observatory report from the Europol Innovation Lab. (2022).

Fake news shared on social media U.S. | Statista. (2023, March 21). Statista. https://www.statista.com/statistics/657111/fake-news-sharing-online/

Langin, K. (2018). Fake news spreads faster than true news on Twitter—thanks to people, not bots. Science. https://doi.org/10.1126/science.aat5350

NOS. (2016, December 5). Nepnieuws leidt tot schietpartij in restaurant VS [Fake news leads to shooting in US restaurant]. NOS. https://nos.nl/artikel/2146586-nepnieuws-leidt-tot-schietpartij-in-restaurant-vs
