We’ve just started 2019, but hey, the clock is ticking, and we only have 11 months to port all our Python2.X code to version 3. Are you ready? In this blog post, we will share a few tips that we have learned while migrating our PYBOSSA code to python 3.
Creating a virtual environment
We always run our Python libraries within a virtual env. However, you don’t create them in the same way. In Python 2, you do it like this:
While for Python 3 you should do the following:
python3 -m venv env
Then you can activate it as usual:
Using 2to3 to save some time
Every project varies in size and number of lines of code, but you, the developer want to be fast on this, right? Or at least not too painful.
Python has a beneficial tool that will help you to migrate your old code to the new version by only running a command line: 2to3.
This command is pretty simple. If you want to see the changes that it will do to your code, you just run it like this:
If you want to write the changes directly into your example.py file, pass a -w. Be sure to use git (or something similar so that you can check the differences):
2to3 -w example.py
And if you want, you can do something crazy like changing all files at once:
2to3 -w *.py
Then, check the damage ;-)
Strings, Bytes, UTF-8, aka your nightmare
While migrating PYBOSSA to Python3 most of the issues on our side were related to UTF-8 and how we had the strings in Python 2.
The general advice is to check for the type of the strings, and based on that do whatever you need.
Bytes to Strings
if type(foo) == bytes: bytes.decode('utf-8')
Strings to Bytes
You will have several places where your code will fail because now Python 3 expects Bytes instead of strings. For those cases, you can do the following:
if type(foo) == str: bytes.encode('utf-8')
StringIO and BytesIO
You have probably used StringIO in your code. Well, it will not work probably in your migration. You will need to adapt it. Usually, based on the type of the text, you will have to use StringIO or BytesIO. Both will solve it for you. A typical case is like this:
if type(foo) == bytes: data = BytesIO(foo) else: data = StringIO(foo)
CSV and UTF-8
We used the old Python 2 CSV UTF-8 reader example given in the documentation. For Python 2 it works great, but when you try it in Python 3, everything goes to hell :-)
The solution? Use Pandas. Yes, Pandas. It will make your life so much easier.
Pandas will handle everything for you. All you have to do is replace your csv_reader with this:
import pandas as pd df = pd.read_csv(yourfile)
If you then need to test this code, and you don’t have a file, use the previous StringIO or BytesIO to do the magic. Then, load it, and you are done!
def do_something(csvfile): return pandas.read_csv(csvfile) def test_01(): fakefile = StringIO('foo,bar\n,1,2') df = do_something(fakefile) # Check whatever you want
And that’s it. Nothing else ;-) Well, yes, put all your prints with () :D