The thread between Thermodynamics, Chaos Theory and Data Science

5 October 2021


Such a title certainly sounds somewhat pompous and buzzword-laden, but bear with me.

In this article, I plan to take you on a multidisciplinary journey across physics, mathematics and data science, showing you that diverse, multilayered curiosity may very well lead you into uncharted territories ripe with new, creative, elegant ways of seeing the world and academia.

Information, while not too special a word, is pivotal to understanding virtually everything. As per Claude Shannon, information is a measure of order, a reduction in uncertainty. As such, information describes how the specific arrangement of the constituting elements of something makes it inherently different from something else. In physics, information describes how arranging a certain collection of atoms one way makes a squirrel, and another way a banana. The particles making up these objects are the same; their order, their information, is not. Likewise, in a perhaps more familiar field, information in analytics defines how arranging bits in certain ways yields insightful takeaways on how to adjust prices, decide budgets and allocate resources.

Our universe, the system in which we evolve, has a tendency to balance things out; therefore, if there is one way to define order, another will describe disorder, an increase in uncertainty, which birthed the term “entropy”. While it may sound daunting at first, entropy is a rather simple term to understand yet difficult to fully grasp. Entropy, while one of the most critical concepts in thermodynamics, is not confined to physics and chemistry alone, but also stands at the heart of information theory. The entropy of a variable describes the uncertainty, the surprise, inherent to that variable. In essence, it describes the hidden information of that variable, the different states or values that variable may take, each telling a different story. A coin toss has an entropy of 1 bit: heads or tails, 0 or 1, each with a 50% probability. The higher the probability of an outcome, the lower its surprise, and the lower the entropy of the variable.
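Shannon's measure can be made concrete in a few lines. The sketch below (the function name and the example probabilities are my own illustration) computes the entropy H = −Σ p·log2(p) of a discrete distribution, confirming the 1-bit fair coin:

```python
import math

def shannon_entropy(probs):
    """Entropy in bits of a discrete distribution: H = -sum(p * log2(p))."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

print(shannon_entropy([0.5, 0.5]))  # fair coin: 1.0 bit, maximum surprise
print(shannon_entropy([0.9, 0.1]))  # biased coin: ~0.47 bits, less surprise
print(shannon_entropy([1.0]))       # certain outcome: 0.0 bits, no surprise
```

Note how the biased coin carries less entropy than the fair one: as one outcome becomes more probable, the variable holds less hidden information.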

Every closed system evolves from a state of low entropy to a state of high entropy. Still underlining the elegant similitude between thermodynamics and information systems (that is, digital systems processing and generating data), ordered physical systems move towards disorder: an artificially heated room moves toward a homogeneous state, a state of increased entropy where temperatures have evened out. In a similar fashion, we consume energy to maintain our biological functions, to maintain the order of our biological system. Left unattended, our bodies would gradually evolve towards increasing entropy (a nicer way of saying that not sustaining yourself leads to death and decomposition; again, elegant, isn't it?). In the same vein, information systems that produce data will inevitably end up generating meaningless noise, a homogeneous whole where nothing is differentiated, a state of increased entropy. As with a body that needs upkeep, data cleaning and filtering is crucial to maintaining viable business processes.

The last stop of our journey is chaos theory. I will save you (and me) the mind-bending aspects of it by sticking to a definition which may be made analogous to information management. Chaos theory is the study of hidden patterns in seemingly disordered systems. In other words, it is about finding information in high-entropy environments, which, in the world of data science, translates to finding valuable information in noise.
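A toy illustration of hidden pattern in apparent disorder is the logistic map, a standard textbook example from chaos theory (my own choice of illustration, not something from the field of data science itself). A one-line deterministic rule produces a sequence that looks like noise, and two nearly identical starting points rapidly diverge:

```python
def logistic_map(x0, r=3.9, n=50):
    """Iterate x_{t+1} = r * x_t * (1 - x_t), chaotic for r near 3.9."""
    xs = [x0]
    for _ in range(n):
        xs.append(r * xs[-1] * (1 - xs[-1]))
    return xs

# Two almost identical seeds: the trajectories soon differ wildly
# (sensitive dependence on initial conditions), yet every single value
# follows the same simple hidden rule.
a = logistic_map(0.200000)
b = logistic_map(0.200001)
```

To an observer who only sees the output, the sequence is indistinguishable from noise; to one who knows the rule, it is perfectly ordered. That gap is exactly where "finding information in high-entropy environments" lives.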

For those three enthusiasts who read through this slight deviation from my usual posts, I hope you found some new ways to approach problems and view academia. I hold the strong conviction that we can find many original and compelling ways to solve problems by linking different fields and surfing across different topics through sheer curiosity alone.


The AI alignment problem, misinformation and other externalities.

2 October 2021


Artificial intelligence may never truly do anything more than what we want, but in a world where we more often than not lack comprehensiveness in spelling out what we wish for, these algorithmic minds may be led astray and deliver us products and services misaligned with what we truly desire (Quanta, 2020).

Now common across many media platforms, YouTube's content recommendation system, one aimed at maximizing time spent on the platform, uses AI programs to deliver content fitting our viewing history and profile. One issue that was not foreseen, though, is that said recommendations become increasingly "extreme" with respect to the first video watched. Within a few clicks, one would end up watching videos promoting one-sided views on any given subject. Watching one video on climate change may thus lead you to climate change denial; one video on adopting a vegetarian diet would lead to veganism-promoting content. Such an approach, which consists of steadily raising the "intensity" of videos, may thus lead to increasing polarisation on all manner of subjects and undermine efforts towards healthy and constructive dialogue.

This issue, ever so conspicuous amid the pandemic driven by the novel coronavirus and its inherent heated debates, stands as but one instance of the many externalities that arise from the broader, technical problem of alignment. While it may appear easy to point out the flaws of an AI program, developers and engineers do not possess the luxury of after-the-fact evaluation and must therefore account for each and every ramification of the targeted wants. The strain of designing AI with absolute certainty, given little to no definitive information and no comprehensive definition of the targeted objectives, is pivotal in the deployment of large-scale, complex AI solutions such as self-driving cars.

One critical challenge of developing AIs is implementing nuance in achieving the often already oversimplified goals we assign them. Building on the idea of self-driving cars, one must first consider what we want a self-driving car to do. While trivial at first, this question rapidly evolves into an extremely complex set of requirements. First, we will say we want to be driven from point A to point B. Then we add that we want this done safely, after which we need to define "safely". Then we realise that being driven safely may lead to the car stopping for even the slightest of reasons, such as a plastic bag floating in the wind, creating congestion and uncomfortable journeys. We now realise we want to be driven rapidly, which too needs definition. So on and so forth, designers and engineers soon face a maze-like set of requirements, rules and constraints that, if overlooked or oversimplified, lead to misaligned AI-driven services.

While AIs shine in performing narrow, repetitive tasks at record-breaking speeds and volumes, they are still severely inconsistent and limited in performing even mundane tasks with no inherent predictability. That is because, since AIs are computational aggregate constructs, the reward function they seek to optimise is one that deals in absolutes, making the perfect weighting of an ever more complex set of goals impossible, especially as these goals progressively entail less exactitude and more subjectivity.
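To make the weighting problem concrete, here is a deliberately toy, hypothetical reward function for a self-driving car (every name, parameter and weight below is invented for illustration, not taken from any real system). Collapsing safety, comfort and progress into one scalar means very different rides can score identically, hiding exactly the trade-offs we care about:

```python
def ride_reward(progress, hard_brakes, near_misses,
                w_progress=1.0, w_comfort=0.3, w_safety=5.0):
    """Collapse competing goals (speed, comfort, safety) into one number.
    The weights are value judgements in disguise: there is no 'correct'
    exchange rate between a near miss and a few minutes saved."""
    return (w_progress * progress
            - w_comfort * hard_brakes
            - w_safety * near_misses)

# Two very different rides, one identical score (both ~8.0): the scalar
# reward cannot distinguish a calm trip from a fast but alarming one.
cautious = ride_reward(progress=8.0, hard_brakes=0, near_misses=0)
reckless = ride_reward(progress=10.0, hard_brakes=5, near_misses=0.1)
```

However the weights are tuned, some pair of qualitatively different behaviours will tie, which is one way of seeing why "dealing in absolutes" makes a perfect weighting of nuanced goals impossible.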

Quite an oversimplification, but a compelling argument nonetheless: these limitations find resonance in the word "creativity", one for which we still lack a proper neurological explanation, let alone an algorithmic one.

This post is but a mere introduction to the complete spelling out of the alignment problem, one which lies at the core of modern AI developments. If you wish to know more about the technical complexity and go beyond the buzzword-laden nature of new technology, I can only recommend you give Quanta Magazine a read, as they delve deeper into these questions.

Reference:

Quanta Magazine (2020). Artificial Intelligence Will Do What We Ask. That's a Problem. https://www.quantamagazine.org/artificial-intelligence-will-do-what-we-ask-thats-a-problem-20200130/
