Machine Learning in Python: Main developments and technology trends

May 8, 2021 · 5 minute read

Here, I briefly discuss the paper by Raschka, Patterson, and Nolet (2020), “Machine Learning in Python: Main developments and technology trends in data science, machine learning, and artificial intelligence.”

If you are interested in Python and machine learning, the paper is a good read. It gives a historical account of why Python became the go-to language for ML and why it is here to stay.


TL;DR

  1. Python as the go-to language for machine learning.
  2. Python leverages the wisdom of the crowd through community efforts.
  3. The main motivation behind software: automate the tedious, boring and even humanly unattainable stuff.

Python is here to stay

Without going into details of this insightful paper, I want to highlight three takeaways from its introduction section and make some informal comments.

  1. Python as the go-to language for machine learning.

“Python continues to be the most preferred language for scientific computing, data science, and machine learning, boosting both performance and productivity by enabling the use of low-level libraries and clean high-level APIs.”


Some statistics could substantiate this claim, but they would be outdated the moment I write them down. Instead, I will move on to one of the main driving factors behind Python’s success.

Crowd’s wisdom

  2. Python leverages the wisdom of the crowd.

“Aside from the benefits of the language itself, the community around the available tools and libraries make Python particularly attractive for workloads in data science, machine learning, and scientific computing.”


I cannot count how often I have googled python + keywords, functions, libraries, workarounds, errors, APIs, … . The list goes on forever. Encountering a problem in a data science project written in Python is like standing at a workbench, crafting something, with a crowd of several million people around you. When you face a problem, you just grab the mic (Google), describe the problem (error) and wait for the first one to answer (Stack Overflow). In my opinion, this is the most powerful part of Python. Facing a problem, my first thought is “someone else has already faced the same situation,” and I am almost never disappointed after asking the crowd.

However, if you encounter a problem, google it and still do not find sufficient answers, you know that you have somehow left the path of what is actually done in Python, or you just phrased it oddly. If nobody else has encountered the problem before, you know that the issue comes from between your headphones. The last resort is to raise an issue on GitHub or write a post on Stack Overflow, which will either get you into the hearts of others who were close to the edge of what is possible in Python or leave you with a dry comment: “duplicate post, check out XY”.

Software is key

  3. Recall the main motivation behind software: it automates the tedious, boring and even humanly unattainable stuff.

“One of the core ideas and motivations behind the multifaceted and fascinating field of computer programming is the automation and augmentation of tedious tasks.”


In today’s world, tasks can be not only tedious but outright infeasible. Just imagine you had to go through thousands of tweets, TikTok and Instagram posts per day to find out what is currently trending in a niche topic you are interested in and want to develop a product for. Doing this by hand would be tedious, ineffective and, foremost, infeasible. When machine learning can be deployed within hours, it equips you with the wisdom of the crowd, saves time and ultimately feeds into informed product development.
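
To make this concrete, here is a minimal sketch of what such a quick deployment could look like, using scikit-learn. The posts, labels and niche topic are made up purely for illustration; nothing here is taken from the paper.

```python
# Toy illustration: instead of reading thousands of posts yourself,
# train a quick classifier on a small labeled sample and let it flag
# posts about your niche topic. All data below is invented.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

posts = [
    "new vegan protein powder review",    # about our (hypothetical) niche
    "best plant-based snacks this week",  # about our niche
    "my cat knocked over the tree",       # unrelated
    "top 10 travel hacks for 2021",       # unrelated
]
labels = [1, 1, 0, 0]  # 1 = relevant to the niche, 0 = not

model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(posts, labels)

# Score a stream of new posts and keep only the likely hits.
new_posts = ["vegan protein bars are trending", "funny dog video"]
print(model.predict(new_posts))  # e.g., [1 0]
```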

However, automation can hardly be achieved when developers need to define every single decision rule of a program by hand. This is where machine learning jumps in, enabling not only scalability but also “automate[d] complex decision-making”.
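
The contrast is easy to sketch in code: a hand-written rule next to a decision tree that derives comparable rules from labeled examples. The features (n_likes, n_shares), thresholds and data are hypothetical, chosen only to illustrate the point.

```python
from sklearn.tree import DecisionTreeClassifier, export_text

def is_trending_by_hand(n_likes, n_shares):
    # The manual approach: every threshold picked and maintained by a human.
    return n_likes > 100 and n_shares > 20

# The learned approach: hypothetical engagement data, label 1 = trending.
X = [[10, 1], [500, 80], [30, 2], [800, 120], [5, 0], [400, 90]]
y = [0, 1, 0, 1, 0, 1]

tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)
# Inspect the decision rules the model extracted on its own.
print(export_text(tree, feature_names=["n_likes", "n_shares"]))
```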

Another factor to consider is the megatrend of an ageing population: low fertility and longer lives shrink the workforce to an extent that labor-intensive tasks need to be outsourced to a digital worker. One statistic to gauge this is the old-age dependency ratio, the number of people aged 65 and over relative to the working-age population (15–64 year olds, the economically active population). Within the 20 years from 1997 to 2017, the old-age dependency ratio increased by a third, from 22.5% to 30%, among EU27 members (Eurostat, 2018). Looking 80 years ahead to 2100, projections show an almost doubled dependency ratio, jumping from 31% in 2019 to close to 60%. Put differently, today there are about 3.3 potential workers for each elderly person, with the trend going towards 1.7 by the end of this century (Eurostat, 2020). Fortunately, advancements in machine learning have the potential to increase productivity faster than the population can age.
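
As a quick sanity check on these numbers: the workers-per-elderly figure is simply the reciprocal of the dependency ratio (small deviations from the 3.3 quoted above come from rounding the ratios).

```python
# Workers per elderly person is the reciprocal of the old-age dependency ratio.
for year, ratio in [(2019, 0.31), (2100, 0.59)]:  # rounded Eurostat figures
    print(f"{year}: {1 / ratio:.1f} potential workers per person aged 65+")
# 2019: 3.2 potential workers per person aged 65+
# 2100: 1.7 potential workers per person aged 65+
```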

Concluding my thoughts on the three statements from Raschka et al. (2020), I have an optimistic view of AI’s future. It is exciting to see how much progress the scientific, professional and open-source community achieves each year. Such a high pace of technological advancement might seem daunting, as it requires data scientists to be at the top of their game. But the wisdom of the crowd is just a Google search away, and Python is as quick to pick up as ever. This leaves a positive end note for every aspiring Python developer, user and enthusiast.

References

Eurostat (2018). Record high old-age dependency ratio in the EU. Accessed 7th August 2020, https://bit.ly/2Dt2LYk .

Eurostat (2020). Old-age dependency ratio increasing in the EU. Accessed 7th August 2020, https://bit.ly/31vmSgt .

Raschka, S., Patterson, J., & Nolet, C. (2020). Machine Learning in Python: Main developments and technology trends in data science, machine learning, and artificial intelligence. Information, 11(4), 193.