The thread between Thermodynamics, Chaos Theory and Data Science

5 October 2021


Such a title certainly sounds somewhat pompous and buzzword-heavy, but bear with me.

In this article, I plan to take you on a multidisciplinary journey across physics, mathematics and data science, showing you that diverse, multilayered curiosity may very well lead you into uncharted territories ripe with new creative, elegant ways of seeing the world and academia.

Information, while not too special a word, is pivotal to understanding, well, virtually everything. As per Claude Shannon, information is a measure of order, a reduction in uncertainty. As such, information describes how the specific arrangement of the constituting elements of something makes it inherently different from something else. In physics, information describes how arranging a certain collection of atoms one way makes a squirrel, and another way a banana. The particles making up these objects are the same; their order, their information, is not. Likewise, in a perhaps more familiar field, information in analytics defines how arranging bits in certain ways yields insightful takeaways on how to adjust prices, decide budgets and allocate resources.

Our universe, the system in which we evolve, has a tendency to balance things out. Therefore, if there is one way to define order, another will describe disorder, an increase in uncertainty, which birthed the term “entropy”. While it may sound daunting at first, entropy is a rather simple term to understand yet difficult to fully grasp. Entropy, while one of the most critical concepts in thermodynamics, is not confined to physics and chemistry alone, but also stands as a central object of study in information theory. The entropy of a variable describes the uncertainty, the surprise, inherent to that variable. In essence, it describes the hidden information of that variable, the different states or values that variable may take, each telling a different story. A fair coin toss has an entropy of 1 bit: heads or tails, 0 or 1, each with a 50% probability. The higher the probability of an outcome, the lower its surprise, and the lower its entropy.
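To make the coin-toss example concrete, here is a minimal sketch of Shannon's entropy formula, H = -sum(p * log2(p)), in Python. The function name and the sample probabilities are my own choices, purely for illustration:

```python
import math

def shannon_entropy(probabilities):
    """Shannon entropy in bits: H = -sum(p * log2(p))."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

# A fair coin toss: two outcomes at 50/50 -> exactly 1 bit of surprise.
print(shannon_entropy([0.5, 0.5]))  # 1.0

# A heavily biased coin is far less surprising, hence lower entropy.
print(shannon_entropy([0.9, 0.1]))  # ~0.47
```

The second call shows the paragraph's closing point in action: the more predictable an outcome, the less information observing it actually carries.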

Every closed system evolves from a state of low entropy to a state of high entropy. Here again lies the elegant similitude between thermodynamics and information systems (that is, digital systems processing and generating data): ordered physical systems move towards disorder. An artificially heated room moves toward a homogeneous state, a state of increased entropy where temperatures have evened out. In a similar fashion, we consume energy to maintain our biological functions, to maintain the order of our biological system. Left unattended, our bodies would gradually evolve towards increasing entropy (a nicer way of saying that not sustaining yourself leads to death and decomposition; again, elegant, isn’t it?). In the same vein, information systems that produce data will inevitably end up generating meaningless noise, a homogeneous whole where nothing is differentiated, a state of increased entropy. Just as a body needs upkeep, data cleaning and filtering are crucial to maintaining viable business processes.
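As a toy illustration of this drift towards homogeneity (a deliberately simplified sketch, not a physics simulation; the diffusion rule and the cell count are arbitrary choices of mine), we can watch the entropy of a heat distribution rise as it evens out:

```python
import math

def entropy_bits(dist):
    """Shannon entropy (bits) of a distribution, normalized to sum to 1."""
    total = sum(dist)
    return -sum((x / total) * math.log2(x / total) for x in dist if x > 0)

# A toy 1-D "room" of 10 cells, with all the heat packed into one cell.
heat = [100.0] + [0.0] * 9
print(f"before: {entropy_bits(heat):.2f} bits")  # 0.00 -> perfectly ordered

# Crude diffusion: each cell repeatedly averages with its neighbours.
for _ in range(50):
    heat = [
        (heat[max(i - 1, 0)] + heat[i] + heat[min(i + 1, len(heat) - 1)]) / 3
        for i in range(len(heat))
    ]

# The heat has spread out; entropy approaches its maximum, log2(10).
print(f"after:  {entropy_bits(heat):.2f} bits (max {math.log2(len(heat)):.2f})")
```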

The last stop on our journey is chaos theory. I will save you (and me) the mind-bending aspects of it by sticking to a definition which can be made analogous to information management. Chaos theory is the study of hidden patterns in seemingly disordered systems. In other words, it is about finding information in high-entropy environments, which, in the world of data science, translates to finding valuable information in noise.
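A classic example of such hidden order (my choice of illustration, a staple of chaos theory rather than anything specific to data science) is the logistic map: a sequence that looks like pure noise yet obeys a trivially simple deterministic rule:

```python
# The logistic map: x_{n+1} = r * x_n * (1 - x_n). At r = 4 the sequence
# looks statistically like random noise, yet it is fully deterministic.
r, x = 4.0, 0.2
sequence = []
for _ in range(1000):
    x = r * x * (1 - x)
    sequence.append(x)

print(sequence[:5])  # apparently patternless values in (0, 1)

# The hidden pattern: plot x_{n+1} against x_n and every point falls
# exactly on the parabola 4x(1 - x).
for xn, xn1 in list(zip(sequence, sequence[1:]))[:3]:
    print(f"x_n = {xn:.4f} -> x_n+1 = {xn1:.4f} (4*x_n*(1-x_n) = {4 * xn * (1 - xn):.4f})")
```

Viewed as a raw series it would defeat naive statistics; viewed in the right coordinates, the order is plain, which is exactly the spirit of hunting for structure in noisy data.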

For the three enthusiasts who read through this slight deviation from my usual posts, I hope you found some new ways to approach problems and view academia. I have the strong conviction that we can find many original and compelling ways to solve problems by linking different fields and surfing across different topics through sheer curiosity alone.
