Twitch Data Leak – Are Platforms Doing Enough To Secure Our Data?

9 October 2021

Data Security: Recent Twitch data leak shows how confidential information can be accessible for anyone through data breaches. Are platforms doing enough to prevent this?


Three days ago, another platform and its users became the victims of a data leak. This time it was Twitch, a highly popular (game-)streaming platform owned by Amazon with approximately 8.07 million active streamers last month alone (Clement, 2021). The top streamers on the platform gather millions of viewers around the world and are subsequently paid by Twitch for providing their users with entertainment through streams. Last Wednesday, for the first time in Twitch’s history, confidential company information and streamers’ earnings were leaked, revealing how much the top streamers earn in revenue. And it was not a small leak either: the BBC has reported that over 100 GB of data was exposed in the breach (Tidy & Molloy, 2021).

2021: Record-breaking amount of data leaks?

Unfortunately, this data leak at a widely used platform is not the first and certainly not the last. According to the Identity Theft Resource Center, the number of (publicly announced) data breaches so far this year has already surpassed the 2020 total by 17%, with nearly 281.5 million people affected by breaches in 2021. There have been 1,291 breaches so far, compared to 1,108 breaches last year. The report also states that we could be headed towards a record-breaking year in terms of the total number of data breaches, with the current all-time high of 1,529 breaches set in 2017 (Morris, 2021).

More data = more data security?

Whether or not this year sets a record for data breaches, it illustrates that data security is becoming increasingly important in preventing them. As the data produced and collected by almost every business or organisation grows, the likelihood of that (increasingly valuable) data being leaked or systems being breached naturally increases. To put the growth of data into perspective: in 2010, the world created about 2 ZB (zettabytes) of digital information. Last year, this had increased to a whopping 44 ZB in that year alone (Saha, 2020).

Needless to say, more data requires better data security. Especially considering the increase in breaches and leaks this year, companies should look to invest more in protecting their (users’) data. According to a cybersecurity market report, the global cybersecurity market size is projected to grow from 217.9 billion USD in 2021 to 345.4 billion USD by 2026 (MarketsAndMarkets, 2021). Although the cybersecurity market is growing, will it be enough to significantly reduce data leaks and breaches?
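As a quick back-of-the-envelope check (my own calculation, not part of the MarketsAndMarkets report), that projection implies a compound annual growth rate of roughly 9-10%:

```python
# Back-of-the-envelope CAGR implied by the MarketsAndMarkets projection.
# The two market-size figures come from the report cited above; the
# calculation itself is only an illustration.

start_value = 217.9   # global cybersecurity market size in 2021 (billion USD)
end_value = 345.4     # projected market size in 2026 (billion USD)
years = 2026 - 2021

cagr = (end_value / start_value) ** (1 / years) - 1
print(f"Implied compound annual growth rate: {cagr:.1%}")   # roughly 9.7% per year
```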

Data equals money

Not only does a data leak hurt a platform’s reputation and its users’ privacy, it can also cost the affected organization a lot of money. According to the annual Cost of a Data Breach Report, 2021 had the highest average cost in 17 years, as data breach costs rose from 3.86 million USD to 4.24 million USD: “the highest average total cost in the 17-year history of this report” (IBM, n.d.). In Twitch’s case, source code was leaked alongside the revenue information of top streamers. Its competitors (e.g. YouTube Gaming) therefore now have access to their rival’s source code and to revenue information about the platform’s most valuable asset: its content providers. Combined with the privacy aspect of the leak, this may result in a significant loss of competitive advantage, and thus a loss of revenue, for Twitch.

Discussion: is it enough?

Now that you know how much is invested in cybersecurity and how much an average data leak actually costs, do you think companies should invest even more? In addition, do you think 2021 will go into the history books as the “least safe” year for online platforms so far? And do you think this particular breach will mark the end of Twitch’s dominant competitive position in its industry?

Let me know your thoughts and perspective.

References

Tidy, J. & Molloy, D. (2021). Twitch confirms massive data breach. Available at: https://www.bbc.com/news/technology-58817658

Clement, J. (2021). Active streamers on Twitch worldwide 2021. Available at: https://www.statista.com/statistics/746173/monthly-active-streamers-on-twitch/

Morris, C. (2021). The number of data breaches in 2021 has already surpassed last year’s total. Available at: https://fortune.com/2021/10/06/data-breach-2021-2020-total-hacks/

Saha, D. (2020). How The World Became Data-Driven, And What’s Next. Available at: https://www.forbes.com/sites/googlecloud/2020/05/20/how-the-world-became-data-driven-and-whats-next/?sh=2161cb1d57fc

MarketsAndMarkets. (2021). Cybersecurity Market with Covid-19 Impact Analysis by Component (Software, Hardware, and Services), Software (IAM, Encryption, APT, Firewall), Security Type, Deployment Mode, Organization Size, Vertical, and Region – Global Forecast to 2026. Available at: https://www.marketsandmarkets.com/Market-Reports/cyber-security-market-505.html#:%7E:text=global%20Cybersecurity%20market%3F-,In%20the%20post%2DCOVID%2D19%20scenario%2C%20the%20global%20cybersecurity,9.7%25%20from%202020%20to%202026.

IBM. (n.d.). How much does a data breach cost? Available at: https://www.ibm.com/nl-en/security/data-breach


Author: Roël van der Valk

MSc Business Information Management student at RSM Erasmus University - Student number: 483426 TA BM01BIM Information Strategy 2022

Should we start teaching coding since the first years of education?

6 October 2021


We live in a society where data and technology are everywhere and are becoming an essential part of our lives. Companies are using big data to acquire competitive advantage and to improve their processes and products. Some of the observed benefits that data-driven companies enjoy compared to those that are not are:

  1. a 23 times higher likelihood of acquiring new customers thanks to data-driven marketing campaigns
  2. a profit increase of 8%
  3. a cost reduction of 10% (Keboola, 2019)

Employer demand for coding and data analytics skills is skyrocketing. Glassdoor reported that eight of the top 25 jobs in the US are tech-based and require some level of coding proficiency. Similarly, a 2016 Burning Glass report found that demand for roles such as data analyst is growing 12% faster than the market average (Nord Anglia, 2020).

Reading these figures, one question immediately came to my mind. Since schools have a responsibility to provide students with all the tools and skills they need to succeed, should the educational system embrace the digital revolution and start teaching coding and data analytics from the first years of instruction? I personally believe that our educational system should emphasize the importance of these skills from a much earlier age.

When I started my master in Lisbon, I had one mandatory course on econometrics and data analytics. So the first time I learned to code was at 22 years old, and I wish I could have started much earlier. During my bachelor in management in Padova, I did not have the option to choose any elective courses on these topics, and many Italian business universities still do not offer a dedicated bachelor or master in business and data analytics. I believe every student in management, finance or economics nowadays should have at least a basic knowledge of coding and data analytics in order to be competitive in the future job market. As with every other subject, if we start approaching it while we are young, it is easier to learn and master.

Let me know what you think and how the situation is in your country.

Keboola (2019). Available at: https://www.keboola.com/blog/5-stats-that-show-how-data-driven-organizations-outperform-their-competition

Nord Anglia (2020). Available at: https://www.nordangliaeducation.com/news/2020/08/18/the-benefits-of-coding-in-school-and-how-to-teach-it


Google’s DeepMind facing data privacy lawsuit

5 October 2021


From data to app to lawsuit

2015: Alphabet Inc.’s British artificial intelligence subsidiary DeepMind obtains private health records of 1.6 million patients from the Royal Free London NHS Foundation Trust. 

This data was to be used to develop the ‘Streams’ app, which aims to detect, diagnose, and alert doctors to acute kidney injury. The app was already being used by the Royal Free, to great praise.

From DeepMind’s point of view, it is making use of valuable data to advance healthcare and save lives. From the Royal Free’s point of view, it is enabling this by sharing the data and then using the resulting app to treat patients. For some citizens, however, this seems like a breach of data privacy.

The British law firm Mishcon de Reya has filed a class-action lawsuit against DeepMind to represent Andrew Prismall and the other 1.6 million patients whose data was shared. 

Who is at fault?

Something I find quite interesting about this case is that DeepMind is accused of being at fault rather than the Royal Free, which shared the data in the first place. Although the Streams app was developed by DeepMind, it was a collaboration between DeepMind and the Royal Free and could not have succeeded without both parties’ input.

I believe that both players are to blame in this situation and that DeepMind cannot be put at fault alone. Who do you believe is at fault in this situation?

How can we prevent this in the future?

For such situations, a healthcare system with strong regulations regarding data privacy, and healthcare providers who abide by such regulations, would largely diminish the threat of major tech firms such as Alphabet. However, too many regulations can inhibit innovation in some situations. Finding a balance between innovation and safety is a challenge that many industries and regulators struggle with worldwide.

I believe that it is no easy task to find such a balance. There is a growing number of factors influencing a push for both regulation and free innovation as digital information becomes one of the most important assets for innovative development. Experts on data privacy and innovation must come together to form regulations that can foster safe innovation.

What do you think should be done to foster safe innovation in the information era?

References:

https://www.bbc.com/news/technology-40483202

https://www.bbc.com/news/technology-58761324

https://www.cnbc.com/2021/10/01/google-deepmind-face-lawsuit-over-data-deal-with-britains-nhs.html

https://deepmind.com/


Security consequences of AI

5 October 2021


Companies store more and more data, because data is valuable. However, with the increase in data also comes the responsibility to keep it private. Companies need to secure their data to make sure it does not fall into the hands of malicious third parties. The old protection systems, based on software designed and operated by humans, might not be enough anymore, as can be seen from some high-profile data leaks. Examples include the 500 million LinkedIn profiles scraped from the company’s database in April of this year and the cyberattack on EasyJet, in which over 9 million customers were affected [1].

New technologies, however, may change this for better or worse. AI in particular is promising when applied both to securing and to hacking a company’s data.

Hacking with AI

Getting hacked by AI systems may seem far off in the future, but there are already real-world examples of AI software that would be capable of hacking. There are hacking events in which AI systems compete against each other. In 2016, DARPA ran a hacking event where the AI systems of one hundred teams had to try to hack software that had not been analysed or tested before [2]. In short, the winning AI system was still worse than the human hackers it competed against, but the gap between AI and humans is getting smaller. Most worrying is that AI systems seem particularly good at finding vulnerabilities, which is one of the most common ways to get at companies’ data [2].

AI in security

There is a silver lining, however: AI systems can also be incorporated into companies’ security systems. Companies already utilize a multitude of AI systems, including user behaviour modelling, antivirus products, automated network and system analysis, and email scanning [3]. These AI systems will also be improved over time and might be able to hold off the AI systems built for hacking.
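To give a concrete sense of what ‘modelling user behaviour’ can look like, here is a minimal sketch of an unsupervised anomaly detector flagging logins that deviate from a user’s normal pattern. The features and data are invented for illustration; real security products use far richer signals.

```python
# Minimal sketch of behaviour-based anomaly detection for security monitoring.
# Features and data are made up; real products use many more signals
# (devices, geolocation, access patterns, etc.).
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(42)

# Simulated "normal" logins: [hour of day, MB downloaded, failed attempts]
normal_logins = np.column_stack([
    rng.normal(10, 2, 500),    # mostly office hours
    rng.normal(50, 15, 500),   # typical download volume
    rng.poisson(0.2, 500),     # rarely any failed attempts
])

# Learn what "normal" looks like from historical behaviour
detector = IsolationForest(contamination=0.01, random_state=0)
detector.fit(normal_logins)

# A suspicious event: 3 a.m. login, huge download, many failed attempts
suspicious = np.array([[3, 900, 8]])
print(detector.predict(suspicious))         # -1 means flagged as an anomaly
print(detector.predict(normal_logins[:3]))  # 1 means considered normal
```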

[1]https://www.cnet.com/tech/services-and-software/14-of-the-worst-data-leaks-breaches-scrapes-and-security-snafus-in-the-last-decade/

[2] https://www.schneier.com/academic/archives/2021/04/the-coming-ai-hackers.html

[3] https://cisomag.eccouncil.org/hackers-using-ai/


What is all this data doing in my protest?

8 October 2020


Over the last few years there has been a plethora of protests throughout the world: from the strikes in Bolivia, to the ongoing Hong Kong extradition protests, to the Black Lives Matter protests held around the globe. Although the people in all these protests are not directly connected in terms of their mission, there is one thing all of them are subject to, and that is data collection. To anyone who has been following data collection practices over the past decade, it should come as no surprise that protest data, too, is being used, monitored, evaluated, and profited from by a variety of parties. But who exactly are these parties, and what do they gain from analyzing the data gathered during protests?

Probably the most obvious parties that track data during protests are news and research firms, which use data collection to provide insights into such events. For example, the company Mobilewalla, which usually does not publish its data collection results, has provided an in-depth demographic (and more) overview of Black Lives Matter protesters in multiple U.S. cities (Doffman, 2020). By tracking individuals’ phones, it was able to provide insights not only into demographic factors (like race and gender), but also into whether protesters came from inside or outside the cities in which they protested.

Another group able to profit from data collection practices during protests are political movements. One example is Vestige Strategies, whose aim is to promote the election of African-Americans to government positions. They used geofencing during the George Floyd protests to target specific audiences with voter registration promotions (Mann, 2020).

The aforementioned two parties do not necessarily influence protesters negatively (apart from possible privacy concerns). However, what happens when the party opposing the protesters makes use of their data? The CCP, for example, has been tracking Hong Kong protesters using a variety of data- and AI-related practices. Using face recognition, protest leaders are targeted, leading to a growing number of protesters trying to conceal their faces (Mozur, 2019). In the US, law enforcement has also been implementing data practices to track protests. The company Dataminr has been providing local law enforcement with Twitter (meta)data under the guise of “delivering breaking news alerts” (Biddle, 2020). This allows them, for example, to track protest locations.

It might now seem that protesters themselves have nothing to gain from the data that becomes available during protests. However, protesters, too, have started implementing data-driven practices. The crowdsourcing app HKmap.live is used by Hong Kong protesters to track police activity (He, 2019). Thus, even protesters are able to use data to their advantage.

This blog post was not written to criticize the usage of data collected during protests. Its aim is merely to shed light on how data has become important even in areas where one might not expect it. Also, with the large number of protests happening these days, those who get involved should be aware of how their data might be used.

Sources:

Biddle, S. (2020). Police Surveilled George Floyd Protests With Help From Twitter-Affiliated Startup Dataminr. The Intercept_. Retrieved from: https://theintercept.com/2020/07/09/twitter-dataminr-police-spy-surveillance-black-lives-matter-protests/

Doffman, Z. (2020). Black Lives Matter: U.S. Protesters Tracked By Secretive Phone Location Technology. Forbes. Retrieved from: https://www.forbes.com/sites/zakdoffman/2020/06/26/secretive-phone-tracking-company-publishes-location-data-on-black-lives-matter-protesters/#1b9ab67c4a1e

He, L. (2019). Apple removes app used by Hong Kong protesters to track police movements. CNN Business. Retrieved from: https://edition.cnn.com/2019/10/10/tech/apple-china-hkmap-app/index.html

Mann, S. (2020). Political groups use the cellphone data of protestors to better reach their target audiences. Just the News. Retrieved from: https://justthenews.com/politics-policy/privacy/political-groups-use-cellphone-data-protestors-better-reach-their-target

Mozur, P. (2019). In Hong Kong Protests, Faces Become Weapons. The New York Times. Retrieved from: https://www-nytimes-com.eur.idm.oclc.org/2019/07/26/technology/hong-kong-protests-facial-recognition-surveillance.html


BIM, Meet Gertrude!

6 October 2020

Gertrude enjoying a well deserved drink during her performance. 

In August 2020, famous tech entrepreneur Elon Musk revealed his latest technological project: a pig called Gertrude. At first sight, Gertrude looks like an ordinary pig. She seems healthy, curious, and eager to taste some delicious snacks. Looking at her, it is hard to imagine how she managed to get one of the world’s most radical and well-known tech entrepreneurs so excited. Gertrude just seems normal.

This is exactly the point!

Elon Musk: “Gotcha”

Gertrude is no ordinary pig. She has been surgically implanted with a brain-monitoring chip, Link V0.9, created by one of Elon Musk’s latest start-ups named Neuralink.

Neuralink was founded in 2016 by Elon Musk and several neuroscientists. The short-term goal of the company is to create devices to treat serious brain diseases and overcome damaged nervous systems. Our brain is made up of 86 billion neurons: nerve cells which send and receive information through electrical signals. According to Neuralink, your brain is like electrical wiring, and rather than having neurons send these electrical signals, they could be sent and received by a wireless Neuralink chip.

To simplify: Link is a Fitbit in your skull with tiny wires

The presentation in August was intended to demonstrate that the current version of the Link chip works and has no visible side effects for its user. The user, in this case Gertrude, behaves and acts as she would without it. The chip is designed to be implanted directly into the brain by a surgical robot; getting a Link would be a same-day surgery that could take less than an hour. This creates the opportunity for Neuralink to move to the next stage: the first human implantation. Elon Musk expressed that the company is preparing for this step, which will take place after further safety testing and receiving the required approvals.

The long-term goal of Neuralink is even more ambitious: human enhancement through merging the human brain with AI. The system could help people store memories or download their mind into robotic bodies. An almost science-fictional idea, fuelled by Elon Musk’s fear of Artificial Intelligence (AI). Already in 2014, Musk called AI “the biggest existential threat to humanity”. He fears that, at the current rate of development, AI will soon reach the singularity: the point where AI has reached intelligence levels substantially greater than that of the human brain and technological growth has become uncontrollable and irreversible, causing unforeseeable effects to human civilization. Hollywood has given us examples of this with The Matrix and Terminator. With the strategy of “if you cannot beat them, join them”, Elon Musk sees the innovation done by Neuralink as an answer to this (hypothetical) catastrophic point in time. By allowing human brains to merge with AI, Elon Musk wants to vastly increase the capabilities of humankind and prevent human extinction.

Man versus Machine

So, will we all soon have Link like chips in our brains while we await the AI-apocalypse?

Probably not. Currently, the Link V0.9 only covers data collected from a small number of neurons in a coin-sized part of the cortex. For Gertrude, Neuralink’s pig whom we met earlier in this article, this means being able to wirelessly monitor her brain activity in a part of the brain linked to the nerves in her snout. When Gertrude’s snout is touched, the Neuralink system can register the neural spikes produced by the neurons firing electrical signals. In contrast, major human functions typically involve millions of neurons from different parts of the brain. To make the device capable of helping patients with brain diseases or damaged nervous systems, it will need to become capable of collecting larger quantities of data from multiple areas of the brain.
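To give a rough idea of what ‘registering a neural spike’ means computationally, below is a deliberately oversimplified sketch of threshold-based spike detection on a simulated voltage trace. This is a textbook-style illustration, not Neuralink’s actual signal-processing pipeline.

```python
# Oversimplified threshold-based spike detection on a simulated voltage trace.
# Illustrative only; not Neuralink's actual method.
import numpy as np

rng = np.random.default_rng(0)

fs = 20_000                          # sampling rate in Hz
t = np.arange(0, 1, 1 / fs)          # one second of signal
signal = rng.normal(0, 5, t.size)    # background noise (microvolts)

# Inject a few artificial "spikes" at known positions
for idx in (2_000, 7_500, 14_000):
    signal[idx:idx + 20] -= 60       # brief negative deflection

# Classic approach: flag samples that cross a multiple of the noise level
threshold = -4 * np.std(signal)
crossings = np.where(signal < threshold)[0]

# Collapse consecutive samples into distinct spike events
events = crossings[np.insert(np.diff(crossings) > 1, 0, True)]
print(f"Detected {events.size} spikes at sample indices {events.tolist()}")
```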

On top of that, brain research has not yet achieved a complete understanding of the human brain. There are many functions and connections that are not yet understood. It appears that the ambitions of both Elon Musk and Neuralink are ahead of current scientific understanding.

So, what next?

Neuralink has received a Breakthrough Device Designation from the US Food and Drug Administration (FDA), the agency that regulates the quality of medical products. This means Neuralink can interact with the FDA’s experts during the premarket development phase, and it opens the door towards human testing. The first clinical trials will be done on a small group of patients with severe spinal cord injuries, to see if they can regain motor functions through thought alone. For now, it is a medical goal with potentially life-changing outcomes, while we wait for science to catch up with Elon Musk’s ambitions.


Thank you for reading. Did this article spark your interest?
For more information, I recommend you to check out Neuralink’s website https://neuralink.com/

Curious how Gertrude is doing?
Neuralink often posts updates on their Instagram page https://www.instagram.com/neura.link/?hl=en

Want to read more BIM-articles like this?
Check out related articles created by other BIM students in 2020:

Sources used for this article:



Differential privacy – A sustainable way of anonymizing data?

5 October 2020

Since a lot of blog contributions mention the increase in data collection, data analytics, and the potential threat to privacy, I thought it would make sense to introduce the technique of differential privacy, which is currently on the rise in the US. Apart from the US Census Bureau, Apple and Facebook are at the forefront of exploring the capabilities and potential of this technique.

 

What does differential privacy mean?
Differential privacy is a technique for protecting, and quantifying, the privacy of individuals in a data set: it limits how much the inclusion or exclusion of any single person’s record can change the published results, typically by injecting carefully calibrated statistical noise.
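For the mathematically inclined, the standard formal definition from the academic literature (not spelled out in the sources below) says that a randomized mechanism M is ε-differentially private if, for any two data sets D and D′ that differ in one individual’s record, and for any set of outputs S:

```latex
\Pr[M(D) \in S] \;\le\; e^{\varepsilon} \cdot \Pr[M(D') \in S]
```

The smaller ε is, the less any single person’s data can influence what is published, and thus the stronger the privacy guarantee.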

 

Differential privacy in action
In 2020, the US government faces a big challenge: it needs to collect data on all of the country’s 330 million residents while keeping their identities private. By law, the government must ensure that the data collected cannot be traced back to any individual within the data set. The data the US government collects is released in statistical tables for academics and policymakers to analyze when conducting research or writing legislation.

To address this need for privacy, the US Census Bureau presented a technique to alter the data collected, making it impossible to trace records back to individuals without changing the overall information provided by the data set. It is a mathematical technique that injects inaccuracies, or ‘noise’, into the data. That way, some of the individuals within the data might get younger or older, or change ethnicity or religious beliefs, while the total number of individuals in each group (i.e. age/sex/ethnicity) stays the same. The more noise injected into the data set, the harder it becomes to de-anonymize the individuals.

This mathematical technique is also used by Apple and Facebook, to collect aggregated data without identifying particular users of products and services.
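A minimal sketch of the core idea, adding noise to a simple count query, is shown below. This is a generic Laplace-mechanism illustration, not the Census Bureau’s (far more elaborate) production system.

```python
# Minimal Laplace-mechanism sketch: privatizing a single count query.
# Generic illustration only; real deployments are far more elaborate.
import numpy as np

rng = np.random.default_rng(7)

def private_count(true_count: int, epsilon: float) -> float:
    """Return a differentially private version of a count.

    A count changes by at most 1 when one person is added or removed
    (sensitivity = 1), so Laplace noise with scale 1/epsilon yields
    epsilon-differential privacy.
    """
    return true_count + rng.laplace(loc=0.0, scale=1.0 / epsilon)

true_residents_aged_30 = 1_204   # hypothetical cell in a census table

for eps in (0.1, 1.0, 10.0):
    print(eps, round(private_count(true_residents_aged_30, eps), 1))
# Smaller epsilon -> more noise -> stronger privacy, but less accuracy.
```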

However, this technique also poses some challenges. Injecting too many inaccuracies can render the data useless: a study of the differentially private data set of the 2010 Census showed households that supposedly had 90 people, which cannot be true. Since the owner of a data set can decide how much ‘noise’ should be injected, that challenge shouldn’t pose too much of a problem. However, the more noise is included, the harder it gets to see correlations between data attributes and specific characteristics of individuals.

If further analysis of differentially private data sets proves that the technique ensures the required privacy, especially for governmentally created data sets, it is likely that other federal agencies and countries will adopt the methodology as well.

 

 

From my point of view, differential privacy as used for governmentally created data sets seems to be a big step towards getting a clearer view of the status quo of a country, thanks to increased privacy and therefore increased trust among residents, as well as probably increased participation in the data collection process.

However, given the complexity of the technique, it seems unlikely to me that differential privacy will be widely used within companies (for the moment). Losing the ability to analyze data in detail, due to increased privacy for the user and therefore lost correlations within data sets, is a trade-off I do not think many companies are willing to make. Especially since a lot of smaller companies are only just starting to analyze the data they are collecting.
Right now, research shows that only big multinationals with high R&D budgets are able to sustainably increase privacy through differential privacy without losing too many insights derived from the data collected.

 

What do you think?
Can differential privacy be a step in the right direction? Or should governments limit companies in the collection, aggregation, and analysis of data to increase privacy for the customers?

 

Sources:
https://aircloak.com/de/wie-funktioniert-differential-privacy/
https://hci.iwr.uni-heidelberg.de/system/files/private/downloads/182992120/boehme_differential-privacy-report.pdf
https://www.technologyreview.com/10-breakthrough-technologies/2020/#differential-privacy
https://towardsdatascience.com/understanding-differential-privacy-85ce191e198a?gi=9d3ad94ea2e4


AI-enabled China’s Social Credit System: in-depth analysis

5 October 2020


Automation has transformed every aspect of modern individuals’ lives. Trivial tasks that used to take a person hours to complete can now be performed within a matter of seconds thanks to technological advancements. Artificial Intelligence (AI) is one such advancement that is paving the way for the prevalence of automation in every industry. The ability of AI to perform tasks autonomously comes primarily from its ability to process large amounts of data and to infer patterns and conclusions from that data, thus effectively learning tasks by itself. However, the procedures used by the AI to analyze the data are initially provided by an administrator in the form of algorithms and statistical models. An algorithm is essentially a set of rules and the process to be followed by the machine/computer to perform a calculation or action. Modern automation, stripped to its core, is a collection of algorithms and related statistical models programmed by an administrator. Due to the increased adoption of the internet, algorithms have become integrated into every aspect of our lives.

The financial credit system used in many western countries can be seen as an example of how algorithms govern our lives. The system involves gathering financial data relevant to an individual from multiple sources, followed by an algorithm that analyses the likelihood of that individual defaulting on a loan. The data gathered primarily consists of previous debts, missed payments, and other forms of credit taken up by the individual in the past. After careful analysis of this data, the algorithm calculates a score for the individual: the credit score. This score is then used by banks, insurance companies, and other financial institutions to determine the creditworthiness of the individual when he/she requests their services (Petrasic & Saul, 2017). In China, such a system exists not only to determine a citizen’s financial credit score; it extends to all aspects of a citizen’s life by judging citizens’ behavior and trustworthiness, and is known as the Social Credit System, introduced in 2014. The Social Credit System is intended to have a complete database on all Chinese citizens by 2020, collected from a variety of sources. This scale of data collection is possible in China because Baidu, Alibaba and Tencent are the major providers of internet infrastructure in the country, and they work closely with the Chinese Communist Party (Kobie, 2019). The majority of the digital footprint left by Chinese citizens is on infrastructure established by these companies, thereby making it easy for the Chinese Communist Party to access its citizens’ data. This sharing of data between private companies and the government is not commonly heard of in China’s western counterparts and shows the importance of the data protection laws enforced in those countries. The implementation of the Social Credit System has numerous effects on the country and its citizens at economic and social levels.
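To make the credit-scoring mechanism described above concrete, here is a minimal sketch of how a default-probability model can be fitted to repayment history. The data and features are invented for illustration; real scoring models are proprietary and far more complex.

```python
# Toy sketch of a credit-default model; data and features are invented.
# Real credit-scoring models are proprietary and use many more variables.
import numpy as np
from sklearn.linear_model import LogisticRegression

# Features per applicant: [past loans, missed payments, credit utilization %]
X = np.array([
    [2, 0, 30], [5, 1, 45], [1, 0, 10], [4, 3, 90],
    [3, 0, 25], [6, 4, 95], [2, 2, 80], [7, 0, 40],
])
y = np.array([0, 0, 0, 1, 0, 1, 1, 0])   # 1 = defaulted on a loan

model = LogisticRegression(max_iter=1000).fit(X, y)

new_applicant = np.array([[3, 2, 85]])
p_default = model.predict_proba(new_applicant)[0, 1]
print(f"Estimated probability of default: {p_default:.2f}")
# A lender would then map this probability onto a credit score and a decision.
```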

On an economic level, the algorithms that facilitate the Social Credit System help bridge a major institutional gap: the underdeveloped financial credit system in China. As mentioned earlier, the financial credit system utilizes algorithms to calculate a credit score that determines the creditworthiness of individuals. Such credit checks can make it more difficult for individuals to access credit, or even deny them access altogether. Often, these credit checks focus only on certain aspects, such as whether we pay our debts on time (Petrasic & Saul, 2017). This is simply not enough to determine the creditworthiness of individuals, as there are other factors at play behind why individuals pay their debts the way they do. Commercial credit systems such as Sesame Credit (developed by Ant Financial Services Group) can therefore be seen as more valuable in determining the creditworthiness of individuals. The Sesame credit score is arguably a better predictor of trustworthiness, as it takes a broad range of important factors into account. This is very beneficial for financial institutions, as they have a higher level of assurance that the credit extended will be in safe hands. At the same time, though, a citizen with a low rating will not be eligible for large loans and will be asked to pay a very high interest rate, effectively positioning the algorithm behind the Social Credit System as the decisive entity on whether a citizen is eligible for a loan or not. The argument for allowing an algorithm to govern citizens’ credit eligibility is that the restrictions placed on citizens with lower scores would motivate them to be better citizens and thus achieve a better score.

However, citizens whose social credit score falls below a certain threshold may be subject to further restrictions. For example, citizens with low social credit scores are denied access to certain services, such as (quality) education or (quality) transportation. On a social level, the Social Credit System may give rise to social segregation, where citizens with low social credit are excluded from social activities, and to reduced interaction between citizens with higher and lower social credit. Moreover, in the workplace, people with low social credit scores may fail to get a promotion because of their scores. The combined effect of restricted access to education, social segregation and limited career prospects can leave the next generation of low-credit citizens with unfair chances to improve their social credit and, as a result, their quality of life. Questions arise as to whether algorithms help bridge the social inequality gap or whether they in fact widen it (Ebadi, 2018).

References

Ebadi, B. (2018). Artificial Intelligence Could Magnify Social Inequality. Centre for International Governance Innovation. Retrieved from https://www.cigionline.org/articles/artificial-intelligence-could-magnify-social-inequality

Kobie, N. (2019). The complicated truth about China’s social credit system. Wired. Retrieved from https://www.wired.co.uk/article/china-social-credit-system-explained

Petrasic, K., & Saul, B. (2017). Algorithms and bias: What lenders need to know. White & Case. Retrieved from https://www.whitecase.com/publications/insight/algorithms-and-bias-what-lenders-need-know


How Algorithms Discriminate Against Women: The Hidden Gender Bias

9 September 2020

In past decades, worries about AI have moved from whether it will take over the world to whether it will take our jobs. Today we have a new, and justifiably serious, concern: AIs might be perpetuating or accentuating societal biases and making racist, sexist or otherwise prejudiced decisions.

 

Machine learning technology is inherently biased

Many believe that software and algorithms that rely on data are objective. But machine learning technology is inherently biased, because bias is fundamental to how it works: it weights certain input data in order to map it to other output data points. Of course, the data that is fed in can also be modified directly, through techniques like data augmentation, to make it less biased. But there is a problem: humans consciously know not to apply certain kinds of bias, yet subconsciously they end up applying biases that cannot be controlled.

 

Tech-hiring platform Gild

This being the case, it is not surprising to find hidden biases all around us in the world today. For example, consider the secretive algorithms that have become increasingly involved in hiring processes. American scientist Cathy O’Neil explains how the online tech-hiring platform Gild enables employers to go well beyond a job applicant’s CV by combing through the trace they leave behind online. This data is used to rank candidates by ‘social capital’, which is measured by how much time they spend sharing and developing code on development platforms like GitHub or Stack Overflow.

 

This all sounds very promising, but the data Gild sifts through also reveals other patterns. For instance, according to Gild’s data, frequenting a particular Japanese manga site is a ‘solid predictor for strong coding’. Programmers who visit this site therefore receive higher scores. As O’Neil points out, awarding marks for this is a big problem for diversity. She suggests that ‘if, like most of techdom, that manga site is dominated by males and has a sexist tone, a good number of the women in the industry will probably avoid it’.
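A toy simulation (entirely invented numbers, not Gild’s actual data or model) of how rewarding such a proxy feature produces a gender gap in scores even when underlying skill is identical:

```python
# Toy simulation of proxy discrimination: equally skilled candidates receive
# different scores because a "neutral" feature correlates with gender.
# All numbers are invented for illustration.
import numpy as np

rng = np.random.default_rng(1)
n = 10_000

gender = rng.choice(["woman", "man"], size=n)
skill = rng.normal(70, 10, size=n)          # identical skill distribution

# The proxy feature: visits to a male-dominated manga/coding forum
visits = np.where(gender == "man",
                  rng.poisson(6, size=n),
                  rng.poisson(1, size=n))

# A scoring rule that "innocently" rewards the proxy
score = skill + 2.0 * visits

for g in ("woman", "man"):
    mask = gender == g
    print(f"{g:5s}  mean skill: {skill[mask].mean():5.1f}   "
          f"mean score: {score[mask].mean():5.1f}")
# Same mean skill, but men end up with systematically higher scores.
```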

 

‘Gild undoubtedly did not intend to create an algorithm that discriminated against women. They were intending to remove human biases’

 

In the book “Invisible Women”, Caroline Criado Perez notes that ‘Gild undoubtedly did not intend to create an algorithm that discriminated against women. They were intending to remove human biases’. However, if managers are not aware of how those biases operate, if they are not collecting data, and if they take little time to produce evidence-based processes, an organisation will continue to blindly perpetuate old injustices. Indeed, by not considering how women’s lives differ from men’s, Gild’s coders accidentally created an algorithm with a hidden data bias against women.

 

But that is not even the worst part. The worst part is that we have no idea how bad the problem really is. Most algorithms of this kind are kept secret and protected as proprietary code. This means that we do not know how decisions are being made and what biases they are hiding. As Perez points out, ‘The only reason we know about this potential bias in Gild’s algorithm is because one of its creators happened to tell us’. This, therefore, is a double data gap: first, in the knowledge of the coders designing the algorithm, and second, in the knowledge of society at large about just how discriminatory these AIs can be (Perez, 2020).

 

‘The only reason we know about this potential bias in Gild’s algorithm is because one of its creators happened to tell us’

 

We need more diversity in tech to reduce the hidden gender bias

Many argue that one way to combat the hidden gender bias is to increase diversity of thought by increasing the number of women in tech. According to the World Economic Forum, only 22% of AI professionals globally are currently female, compared to 78% who are male. Additionally, at Facebook and Google, less than 2% of technical roles are filled by black employees. To remove hidden bias in algorithms, tech companies should step up their recruiting practices and increase diversity in technical roles.

 

Do you have any other suggestions for managers to reduce hidden bias? Or have you come across a type of hidden bias? Feel free to leave a comment.

 

References:

The Guardian (2016). How algorithms rule our working lives. Retrieved from https://www.theguardian.com/science/2016/sep/01/how-algorithms-rule-our-working-lives

Perez, C. C. (2020). Invisible Women: Data Bias in a World Designed for Men. New York: Abrams Press.

Forbes (2020). AI Bias Could Put Women’s Lives At Risk – A Challenge For Regulators. Retrieved from  https://www.forbes.com/sites/carmenniethammer/2020/03/02/ai-bias-could-put-womens-lives-at-riska-challenge-for-regulators/#2201ee44534f

World Economic Forum (2020). Assessing Gender Gaps in Artificial Intelligence. Retrieved from  http://reports.weforum.org/global-gender-gap-report-2018/assessing-gender-gaps-in-artificial-intelligence/

Dogtown Media (2019). Can AI’s Racial & Gender Bias Problem Be Solved? Retrieved from https://www.dogtownmedia.com/can-ais-racial-gender-bias-problem-be-solved/


Being Human in the Age of Black Box Algorithms and Subjective Truths

17 October 2019


Photo by Esther Jiao on Unsplash

Algorithms are everywhere and play an important role in our daily lives. They decide what we see on our social media feeds, which ads are used to target us and what route we should take to get places.

The problem is that many algorithms are black boxes. They are complex systems that shape our world, whose internal workings are hidden or not easily understood (Oxford English Dictionary Online, 2011). With these algorithms, which often have a complex design, it is unclear how the output or conclusions were reached. With historically little oversight or accountability regarding their design, this problem has a profound effect on society as our day-to-day lives and our personal decisions are increasingly controlled by algorithms (Carey, 2018; Illing, 2018). Most of us have no idea what algorithms are or how exactly we are being influenced by them. And how could we if we cannot look inside, ‘under the hood’? And if we could, if sometimes even the coders that built an algorithm do not know how the system reached its conclusion (Carey, 2018), how should we?

Does this mean that we can no longer trust algorithms? Hannah Fry, an Associate Professor in Mathematics at University College London and author of the book “Hello World: Being Human in the Age of Algorithms”, explains in an interview with Sean Illing that our attitude towards algorithms tends to swing between extremes (Illing, 2018). On the one hand, we have very high expectations of algorithms and trust them blindly. On the other hand, as soon as we see that an algorithm or its outcomes are somewhat inaccurate, we no longer trust them and disregard them altogether. Fry thinks the right attitude is somewhere in the middle: “we should not blindly trust algorithms, but we also should not dismiss them altogether” (Illing, 2018).

Subjective Truths
A larger concern with algorithms is that they often contain the biases of the people who create them and that they reinforce biases and stereotypes we may inherently have but might not be aware of (Li, 2019). As Bill and Melinda Gates (2019) describe, this can even be the result of non-existent or sexist data. This is especially dangerous with black-box algorithms, which do not explain their results to their programmers, let alone to the end users.

And what if information is deliberately misrepresented or differs depending on who you are or where you are from? Take, for example, Google Maps. Google claims to be objective in marking disputed regions in various parts of the world (Boorstin, 2009). Yet depending on the country from which you access Google Maps, you will see Crimea portrayed as part of Ukraine or as part of Russia (Usborne, 2016). If you consider that at least 124 countries are involved in a territorial dispute, there is a lot of potential for subjective truths (Galka, n.d.; Metrocosm, 2015). Another example is Apple: if you are in Hong Kong or Macau, from iOS 13.1.1 onwards you will no longer find the Taiwanese flag on the emoji keyboard (Peters & Statt, 2019). Generally, as a user, you are not made aware of these intentional differences, but they do shape our perception of reality.

Conclusion
When it comes to algorithms, the people behind them, or really anything in life, you should not blindly trust the information presented to you. Besides, as Fry argues, we should not think of algorithms themselves as either good or bad, but should rather focus on the people behind the scenes who create them (Illing, 2018). Although algorithms may not be perfect and are often biased, they are still extremely effective and have made our lives easier.

While technological progress itself seems inevitable, its direction is not. We have to ensure that technological progress remains aligned with humanity’s best interests. There might be unintended or undesired consequences, but as the French philosopher Paul Virilio said:

“When you invent the ship, you also invent the shipwreck; when you invent the plane you also invent the plane crash; and when you invent electricity, you invent electrocution…Every technology carries its own negativity, which is invented at the same time as technical progress.” (Virilio, Petit & Lotringer, 1999).

 

References:
Black box. (2011). In Oxford English Dictionary Online. Retrieved 12 October 2019, from https://www-oed-com.eur.idm.oclc.org/view/Entry/282116
Boorstin, B. (2009, December 4). When sources disagree: borders and place names in Google Earth and Maps. Retrieved from https://publicpolicy.googleblog.com/2009/12/when-sources-disagree-borders-and-place.html
Carey, S. (2018). How IBM is leading the fight against black box algorithms. Retrieved 16 October 2019, from https://www.computerworld.com/article/3427845/how-ibm-is-leading-the-fight-against-black-box-algorithms.html
Gates, B. & Gates, M. (2019, February 12). Our 2019 Annual Letter. Retrieved from https://www.gatesnotes.com/2019-Annual-Letter#ALChapter4
Galka, M. (n.d.). Every Disputed Territory in the World [Interactive Map]. Retrieved 16 October 2019, from http://metrocosm.com/disputed-territories-map.html
Illing, S. (2018, October 1). How algorithms are controlling your life. Retrieved from https://www.vox.com/technology/2018/10/1/17882340/how-algorithms-control-your-life-hannah-fry
Li, M. (2019, May 13). Addressing the Biases Plaguing Algorithms. Retrieved from https://hbr.org/2019/05/addressing-the-biases-plaguing-algorithms
Metrocosm. (2015, November 20). Mapping Every Disputed Territory in the World. Retrieved from http://metrocosm.com/mapping-every-disputed-territory-in-the-world/
Peters, J., & Statt, N. (2019, October 7). Apple is hiding Taiwan’s flag emoji if you’re in Hong Kong or Macau. Retrieved from https://www.theverge.com/2019/10/7/20903613/apple-hiding-taiwan-flag-emoji-hong-kong-macau-china
Usborne, S. (2016, August 10). Disputed territories: where Google Maps draws the line. Retrieved from https://www.theguardian.com/technology/shortcuts/2016/aug/10/google-maps-disputed-territories-palestineishere
Virilio, P., Petit, P., & Lotringer, S. (1999). Politics of the very worst. New York: Semiotext(e).
