One of the biggest sources of confusion and misinformation for people wanting to learn Python is which version they should learn.
Should I learn Python 2.x or Python 3.x?
Indeed, this is one of the questions we are asked most often at Dataquest, where we teach Python as part of our Data Science curriculum.
This post gives some context behind the question, explains my perspective, and tells you which version you should learn.
Let’s start by taking a brief look at the history.
Python 3.0 was released in 2008 (not a typo – 9 years ago!)
On December 3rd, 2008, Python released version 3.0. What was special about this was that it was a backwards-incompatible release (if you want to read more about why, I recommend this excellent post by Brett Cannon).
As a result, for anyone who was using Python 2.x at that time, migrating their project to 3.x required large changes. This included not only individual projects and applications, but also all the libraries that form part of the Python ecosystem.
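To give a feel for what "backwards incompatible" meant in practice, here is a minimal sketch of three well-known behaviour changes, written for Python 3 with the old Python 2 behaviour noted in comments. The variable names are only illustrative, and this is far from a complete list of the differences.

```python
# Python 3 behaviour for three well-known breaking changes.

# 1. print is a function; in Python 2 it was a statement: print "hello"
print("hello")

# 2. The / operator is true division; in Python 2, 7 / 2 between ints gave 3.
true_div = 7 / 2    # 3.5 on Python 3
floor_div = 7 // 2  # 3 on both versions

# 3. Text and bytes are separate types; str holds Unicode text.
text = "caf\u00e9"                 # four characters of text
encoded = text.encode("utf-8")     # five bytes, because the accented e needs two
decoded = encoded.decode("utf-8")  # back to the original text
text_is_bytes = isinstance(text, bytes)  # False on Python 3
```

Code that relied on the old behaviour of any of these had to be touched by hand or by a conversion tool, which is why porting a large library was a real project rather than a quick fix.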
The change was seen as extremely controversial, and many projects resisted the pain of moving over, especially in the Scientific Python community. It took two years for the main numeric library, NumPy, to ship its first 3.x-compatible release, after which other projects started to release 3.x-compatible versions in the years that followed.
By 2012, a lot of libraries had support for 3.x, but most were still being written in 2.x. Over time, tools were released that made porting code across easier, but there was still great resistance to moving.
A great read on the topic is Jake VanderPlas’ post from 2013: Will Scientists Ever Move to Python 3?
In the few years that followed, several tools were released to help the transition of older codebases from Python 2 to Python 3.
Originally, the ‘end of life’ date for Python 2.x was scheduled for 2015, but in 2014 the Python core developers announced that they would extend this by 5 years to 2020, in part to relieve worries for users who could not yet migrate to Python 3.
Fast-forward to today
Today, there are very few libraries that do not support Python 3. Python 3 Readiness shows that 344 of the 360 top packages for Python support 3.x.
In addition, many packages are announcing the end of support for 2.x. Python 3 Statement is a project where many of the main (scientific) libraries are committing to stop supporting 2.x in 2020 or sooner.
Recently, the popular web framework Django announced that its new 2.0 version would not support Python 2.x.
So why is this still a question?
There are a lot of older, free resources online for learning Python that are based on Python 2, including most MOOCs at places like Coursera, Udemy and edX.
Added to this, Zed Shaw’s extremely popular ‘Learn Python the Hard Way’ was written in Python 2.x and has not been updated. For a long time, I thought this was just because Zed was too lazy to update his course, but he recently published a controversial article: The Case against Python 3.
In short, the people who agree with Zed’s rant are in the extreme minority.
So which should I learn?
Why you should learn Python 3
From one perspective, I was lucky enough to enter the world of Python much more recently, in early 2016. I started learning Python as part of the data science curriculum at Dataquest. Dataquest only teaches 3.x, and for quite some time I didn’t know about this whole 2 vs 3 controversy.
I’ve used Python 3.x exclusively and rarely run into compatibility issues.
Very occasionally (maybe once every 3–4 months), I’ll find I’m trying to run something that requires Python 2, and the virtualenv tool allows me to quickly create a 2.x environment on my machine to run that piece of legacy software.
Python 3.x is the future, and with Python 2.x support dwindling, you should put your time into learning the version that will serve you in the years ahead.
Why you should learn Python 2
You shouldn’t. Very soon there will be no future security or bug fixes for Python 2.x, and your time is better spent learning 3.x.
In the unlikely event that you end up working with a legacy Python 2 code base, tools like python-future will make it easy for you to work with it having only learned Python 3.
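As a rough illustration of the idea behind compatibility tools, here is a minimal sketch of a helper that behaves the same on both major versions. It uses only the standard library and is not python-future’s actual API; the names PY2, text_type and to_text are hypothetical and chosen just for this example.

```python
import sys

# True when the code is running under a Python 2 interpreter.
PY2 = sys.version_info[0] == 2

if PY2:
    text_type = unicode  # noqa: F821 (this name only exists on Python 2)
else:
    text_type = str


def to_text(value, encoding="utf-8"):
    """Return value as Unicode text on either major version."""
    # Byte strings are decoded; anything else is converted with the text type.
    if isinstance(value, bytes) and not isinstance(value, text_type):
        return value.decode(encoding)
    return text_type(value)
```

Calling to_text(b"abc") and to_text("abc") gives the same text on either version. Libraries such as python-future package up many shims of this kind so you do not have to write them yourself.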
I hope this has helped you understand this controversial topic and make your decision to learn Python 3.
Dataquest is the best online platform for learning to be a Data Scientist using Python (3.x, of course!).
We have graduates working at SpaceX, Amazon and more. If that interests you, you can sign up and complete our first course for free at Dataquest.io.
This post is based on this Quora answer
Data Scientist at Dataquest.io. Loves Data and Aussie Rules Football. Australian living in Texas.