Adverse training AI models: a big self-destruct button?

21 October 2023


“Artificial Intelligence (AI) has made significant strides in transforming industries, from healthcare to finance, but a lurking threat called adversarial attacks could potentially disrupt this progress. Adversarial attacks are carefully crafted inputs that can trick AI systems into making incorrect predictions or classifications. Here’s why they pose a formidable challenge to the AI industry.”

ChatGPT then went on to sum up various reasons why these so-called ‘adversarial attacks’ threaten AI models. Interestingly, I had only asked ChatGPT to explain the disruptive effects of adversarial machine learning. I followed up with the question: how could I use adversarial machine learning to compromise the training data of AI? Predictably, the answer I got was: “I can’t help you with that”. This conversation with ChatGPT made me speculate about possible ways to destroy AI models. Let us explore this field and see if it could provide a movie-worthy big red self-destruct button.

The Gibbon: a textbook example

When you feed GoogLeNet, one of the best-known image classification systems, a picture that is clearly a panda, it can tell you with great confidence that it is a gibbon. This is because the image secretly carries a layer of ‘noise’, imperceptible to humans but a great hindrance to deep learning models.

This is a textbook example of adversarial machine learning. The noise works like a blurring mask, keeping the AI from recognising what is truly underneath. But how does this ‘noise’ work, and can we use it to completely compromise the training data of deep learning models?

Deep neural networks and the loss function

To understand the effect of ‘noise’, let me first briefly explain how deep learning models work. Deep neural networks use a loss function to quantify the error between predicted and actual outputs. During training, the network aims to minimize this loss. Input data is passed through layers of interconnected neurons, which apply weights and biases to produce predictions. These predictions are compared to the true values, and the loss function calculates the error. Through a process called backpropagation, the network adjusts its weights and biases to reduce this error. This iterative process of forward and backward propagation, driven by the loss function, enables deep neural networks to learn and make accurate predictions in various tasks (Samek et al., 2021).
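To make the training loop a bit more concrete, here is a minimal sketch in plain Python. It is an illustration under strong assumptions: a single neuron stands in for a deep network, and the data set, learning rate and function names are all made up for this example. The idea is the same, though: a forward pass, a loss, and repeated weight updates that push the loss down.

```python
import math

def predict(weights, bias, x):
    """Forward pass: weighted sum of the inputs, squashed to a probability."""
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1.0 / (1.0 + math.exp(-z))

def mean_loss(weights, bias, data):
    """Binary cross-entropy between predictions and true labels."""
    total = 0.0
    for x, y in data:
        p = predict(weights, bias, x)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(data)

def train(data, steps=200, learning_rate=0.5):
    """Gradient descent: nudge weights and bias to reduce the loss."""
    n_features = len(data[0][0])
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(steps):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for x, y in data:
            # For sigmoid + cross-entropy, dLoss/dz is simply (prediction - label).
            error = predict(weights, bias, x) - y
            for i, xi in enumerate(x):
                grad_w[i] += error * xi / len(data)
            grad_b += error / len(data)
        # Step against the gradient, i.e. downhill on the loss surface.
        weights = [w - learning_rate * g for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b
    return weights, bias

# Hypothetical toy data: two input features, label 0 or 1.
TOY_DATA = [([0.0, 0.2], 0), ([0.3, 0.1], 0), ([0.9, 1.0], 1), ([1.0, 0.7], 1)]

loss_before = mean_loss([0.0, 0.0], 0.0, TOY_DATA)
trained_weights, trained_bias = train(TOY_DATA)
loss_after = mean_loss(trained_weights, trained_bias, TOY_DATA)
print(loss_before, loss_after)
```

In a real deep network, backpropagation computes the same kind of gradient for every layer at once; here there is only one layer, so the gradient can be written out by hand.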

So training a model involves minimizing the loss function by updating the model parameters. Adversarial machine learning does the exact opposite: it maximizes the loss function by updating the inputs. The updates to these input values form the layer of noise applied to the image, and with the right values a model can be led to believe almost anything (Huang et al., 2011). But can this practice be used to compromise entire models? Or is it just a ‘party trick’?
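The ‘opposite of training’ idea can be sketched in a few lines as well. The example below is a simplified illustration, not the exact method used in the panda experiment: it takes a single step in the style of the fast gradient sign method on a hypothetical one-neuron model. The weights, the input and the value of EPSILON are assumptions chosen so that the effect is easy to see.

```python
import math

WEIGHTS = [2.0, -3.0, 1.0]   # hypothetical, already "trained" parameters
BIAS = 0.0
EPSILON = 0.2                # maximum change allowed per input feature

def predict(x):
    """Probability that the input belongs to class 1."""
    z = sum(w * xi for w, xi in zip(WEIGHTS, x)) + BIAS
    return 1.0 / (1.0 + math.exp(-z))

def loss(x, y):
    """Binary cross-entropy for a single example."""
    p = predict(x)
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

def input_gradient(x, y):
    """Gradient of the loss with respect to the input, not the weights."""
    error = predict(x) - y
    return [error * w for w in WEIGHTS]

def sign(v):
    return (v > 0) - (v < 0)

def perturb(x, y):
    """One step uphill on the loss: the 'noise' is EPSILON * sign(gradient)."""
    return [xi + EPSILON * sign(g) for xi, g in zip(x, input_gradient(x, y))]

x_clean, y_true = [0.5, 0.1, 0.2], 1
x_adv = perturb(x_clean, y_true)

print(predict(x_clean))  # above 0.5: classified as class 1
print(predict(x_adv))    # below 0.5: the small nudges flip the decision
```

The weights never change here; only the input does. With three features the nudges are not exactly invisible, but an image has many thousands of pixels, so a tiny change per pixel can add up to a large change in the model’s output.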

Adversarial attacks

Now we get to the part ChatGPT told me about. Adversarial attacks are techniques for manipulating machine learning models by adding imperceptible noise to their input data. Attackers exploit weaknesses in the model’s decision boundaries, causing misclassification. When carefully crafted noise is injected in vast amounts, the training data of AI models itself can be modified. There are different types of adversarial attacks. If the attacker has access to the model’s internal structure, they can mount a so-called ‘white-box’ attack, in which case they may be able to compromise the model completely (Huang et al., 2017). This would pose a serious threat to AI models used in, for example, self-driving cars. Luckily, access to a model’s internal structure is very hard to gain.
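What modifying the training data can do to a model is easiest to see on a toy example. The sketch below makes several assumptions: instead of a neural network it uses a very simple classifier that stores one average value per class, the ‘images’ are single numbers, and the attack injects mislabelled outliers rather than subtle noise. It is only meant to show the principle that a model learns whatever its training data tells it.

```python
def centroid(values):
    return sum(values) / len(values)

def train(samples):
    """'Training' here just stores one average value per class."""
    by_class = {}
    for value, label in samples:
        by_class.setdefault(label, []).append(value)
    return {label: centroid(values) for label, values in by_class.items()}

def classify(model, value):
    """Pick the class whose stored average is closest to the input."""
    return min(model, key=lambda label: abs(model[label] - value))

def accuracy(model, samples):
    return sum(classify(model, v) == y for v, y in samples) / len(samples)

# Hypothetical one-feature data: pandas score low, gibbons score high.
CLEAN_TRAIN = [(0.0, "panda"), (1.0, "panda"), (2.0, "panda"),
               (9.0, "gibbon"), (10.0, "gibbon"), (11.0, "gibbon")]
# Crafted points injected by the attacker: extreme values with the wrong label.
POISON = [(25.0, "panda")] * 6
TEST = [(1.0, "panda"), (2.0, "panda"), (9.0, "gibbon"), (10.0, "gibbon")]

clean_model = train(CLEAN_TRAIN)
poisoned_model = train(CLEAN_TRAIN + POISON)

print(accuracy(clean_model, TEST))     # 1.0
print(accuracy(poisoned_model, TEST))  # 0.5
```

The poisoned points drag the stored ‘panda’ average from 1 to 17, so every test input ends up closer to ‘gibbon’. Note that this kind of attack requires write access to the training data, which in practice is a hurdle of its own.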

So suppose computers were to take over from humans in the future, as science fiction movies predict. Could we use attacks like these to bring those evil AI computers down? Well, in theory we could, though in practice there is little evidence, as there haven’t been any major adversarial attacks yet. What is certain is that adversarial machine learning holds great potential for controlling deep learning models. The question is: will that potential be exploited in a good way, as a method of keeping control over AI models, or will it be used as a means of cyber-attack, justifying ChatGPT’s negative tone when explaining it?

References

Huang, L., Joseph, A. D., Nelson, B., Rubinstein, B. I., & Tygar, J. D. (2011, October). Adversarial machine learning. In Proceedings of the 4th ACM workshop on Security and artificial intelligence (pp. 43-58).

Huang, S., Papernot, N., Goodfellow, I., Duan, Y., & Abbeel, P. (2017). Adversarial attacks on neural network policies. arXiv preprint arXiv:1702.02284.

Samek, W., Montavon, G., Lapuschkin, S., Anders, C. J., & Müller, K. R. (2021). Explaining deep neural networks and beyond: A review of methods and applications. Proceedings of the IEEE, 109(3), 247-278.


Time to defend ourselves in the furious cyber war!

22 October 2016

A major hack attack snarled Web traffic on the East Coast on Friday. Many of the most famous websites, such as Twitter, Spotify, Reddit, SoundCloud, PayPal, eBay, Amazon and even Netflix, were inaccessible to users for hours. The FBI and Homeland Security are still investigating the case and trying to find out who is responsible for the attack (BBC News, 2016).

The company that was attacked is an internet services company called Dyn. The company stated that the incidents were due to Distributed Denial of Service (DDoS) attacks, which attempt to take websites offline by overloading them with internet traffic from many different sources. For more information on this type of cyber attack, have a look at the video (BBC News, 2016):

https://www.youtube.com/watch?v=qr5BZAj7kLs

Such news reminds us once again of the importance of cyber security, as the Internet of Things is growing faster than ever. I believe the term ‘cyber attack’ is no longer unfamiliar to anyone. In recent years we have seen many news reports of companies being hacked and of user accounts or information being leaked, as happened earlier this year with LinkedIn and Yahoo. It is now a major concern that any company active online has to take into account in its daily operations. In my opinion, there are some things companies could do to improve their cyber security. Firstly, more investment in IT services is necessary to build up the castle wall; a firewall alone no longer seems to be enough protection. Secondly, companies should know more about the data they store, so that more specific solutions can be provided to protect it or, in the worst-case scenario, so that they know exactly what information was hacked or leaked and how to make up for it.

What do you think of this issue concerning cyber security? What could be possible ways to improve it, in your view?

Reference:

BBC News. (2016). Cyber attacks briefly knock out top sites – BBC News. [online] Available at: http://www.bbc.com/news/technology-37728015 [Accessed 22 Oct. 2016].

 
