After years of working as a Python developer in various teams, I have learned that most Python backend developers do not bother with Python packaging for backend applications. This makes me sad, so I decided to write a blog post to explain why I do otherwise.
I think I started packaging all the Python software I produce ages ago (at the beginning of 2013), right after I posted this article (in Russian) to be vetted as an admission fee to habr (to get writing and commenting rights).
Many developers have asked me why I bother with Python packaging when everything works fine without it. "It is a higher level of maturity" was usually not accepted as an answer.
Bulletproof reason (apparently not)
UPDATE: Workaround with PYTHONPATH:
$ PYTHONPATH=$PWD:$PYTHONPATH python ./tmp/test1.py
So the 100% bulletproof reason for Python packaging is: without packaging, the source code cannot be reused in certain cases. You simply cannot import it, because it is outside the scope visible to the running Python interpreter.
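The "visible scope" here is the interpreter's module search path, sys.path. As a minimal sketch (the temporary directory merely stands in for a project root, and the helper name child_sys_path is made up for this illustration), the following shows that a directory listed in PYTHONPATH ends up on sys.path of a freshly started interpreter, which is why the workaround from the update has an effect:

```python
import json
import os
import subprocess
import sys
import tempfile


def child_sys_path(extra_dir):
    """Return sys.path as seen by a fresh interpreter started with PYTHONPATH=extra_dir."""
    # Unlike the shell command in the update, this replaces PYTHONPATH instead of prepending to it.
    env = dict(os.environ, PYTHONPATH=extra_dir)
    output = subprocess.check_output(
        [sys.executable, "-c", "import json, sys; print(json.dumps(sys.path))"],
        env=env,
        universal_newlines=True,
    )
    return json.loads(output)


with tempfile.TemporaryDirectory() as project_root:
    search_path = child_sys_path(project_root)
    # Compare resolved paths so that symlinked temp directories still match.
    on_path = any(
        entry and os.path.realpath(entry) == os.path.realpath(project_root)
        for entry in search_path
    )
    print("project root on sys.path:", on_path)
```

The workaround has to be repeated for every invocation (or exported in every shell), whereas an installed package is found without any extra environment setup.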
Imagine you have created a skeleton Django project (I am using Django just to make the example more practical) and a Django application, as described in the official Django tutorial:
$ python --version
Python 3.7.2
$ python -m django --version
2.1.7
$ django-admin startproject mysite
$ cd mysite
$ python manage.py startapp polls
Now you need to do some quick research to test your code, so you create a directory that should not be part of the project and a file with your snippet:
$ cd ..
$ mkdir tmp
$ vim tmp/test1.py
tmp/test1.py content:
from mysite.polls import models
This would look like:
(django) $ tree
.
├── mysite
│   ├── manage.py
│   ├── mysite
│   │   ├── __init__.py
│   │   ├── settings.py
│   │   ├── urls.py
│   │   └── wsgi.py
│   └── polls
│       ├── admin.py
│       ├── apps.py
│       ├── __init__.py
│       ├── migrations
│       │   └── __init__.py
│       ├── models.py
│       ├── tests.py
│       └── views.py
└── tmp
    └── test1.py
When you run the tmp/test1.py script you get:
$ python ./tmp/test1.py
Traceback (most recent call last):
  File "./tmp/test1.py", line 1, in <module>
    from mysite.polls import models
ModuleNotFoundError: No module named 'mysite'
The module is not imported because it is outside the scope visible to the script: the interpreter puts the script's own directory (tmp/) on its module search path, not the directory that contains mysite.
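To see this without a Django installation, here is a small sketch that rebuilds a stripped-down stand-in for the layout above in a temporary directory (empty polls/models.py, no real Django code; the helper run_script is made up for this illustration) and runs tmp/test1.py with a fresh interpreter. The script first prints sys.path[0] and then attempts the import:

```python
import os
import subprocess
import sys
import tempfile
from pathlib import Path

SCRIPT = (
    "import sys\n"
    "print(sys.path[0])\n"
    "from mysite.polls import models\n"
)


def run_script(script_path, cwd):
    """Run script_path with a fresh interpreter; return (exit code, stdout, stderr)."""
    # Drop PYTHONPATH so the result does not depend on the caller's environment.
    env = {k: v for k, v in os.environ.items() if k != "PYTHONPATH"}
    proc = subprocess.run(
        [sys.executable, str(script_path)],
        cwd=str(cwd),
        env=env,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    return proc.returncode, proc.stdout, proc.stderr


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    # Stand-in layout: mysite/polls/models.py next to tmp/test1.py.
    (root / "mysite" / "polls").mkdir(parents=True)
    (root / "mysite" / "polls" / "__init__.py").write_text("")
    (root / "mysite" / "polls" / "models.py").write_text("")
    (root / "tmp").mkdir()
    (root / "tmp" / "test1.py").write_text(SCRIPT)

    code, out, err = run_script(root / "tmp" / "test1.py", cwd=root)
    script_dir = out.strip()
    # sys.path[0] is the script's directory (tmp/), not the project root.
    script_dir_is_tmp = os.path.realpath(script_dir) == os.path.realpath(str(root / "tmp"))
    last_error_line = err.strip().splitlines()[-1]

print("sys.path[0] is tmp/:", script_dir_is_tmp)
print(last_error_line)
```

The last line printed is the same ModuleNotFoundError as in the transcript above: nothing on the search path contains a mysite directory.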
A relative import does not work either:
from ..mysite.polls import models
$ python ./tmp/test1.py
Traceback (most recent call last):
  File "./tmp/test1.py", line 1, in <module>
    from ..mysite.polls import models
ValueError: attempted relative import beyond top-level package
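Relative imports are resolved against the package a module belongs to, and a file started directly as a script belongs to none. A short sketch that makes this visible (the temporary script is purely illustrative):

```python
import subprocess
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    script = Path(tmp) / "test1.py"
    # The script only reports how the interpreter sees it.
    script.write_text("print(repr(__name__), repr(__package__))\n")
    report = subprocess.check_output(
        [sys.executable, str(script)],
        universal_newlines=True,
    ).strip()

print(report)  # prints: '__main__' None
```

With no parent package, the leading dots in a relative import have nothing to climb from, regardless of what the directory tree looks like on disk.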
Let's bring tmp/test1.py into the mysite directory:
$ tree
.
└── mysite
    ├── manage.py
    ├── mysite
    │   ├── __init__.py
    │   ├── settings.py
    │   ├── urls.py
    │   └── wsgi.py
    ├── polls
    │   ├── admin.py
    │   ├── apps.py
    │   ├── __init__.py
    │   ├── migrations
    │   │   └── __init__.py
    │   ├── models.py
    │   ├── tests.py
    │   └── views.py
    └── tmp
        └── test1.py
tmp/test1.py content:
from ..polls import models
Nope:
$ python ./mysite/tmp/test1.py
Traceback (most recent call last):
  File "./mysite/tmp/test1.py", line 1, in <module>
    from ..polls import models
ValueError: attempted relative import beyond top-level package
You can import your code only if you move your test1.py to the root level like this:
$ tree
.
├── mysite
│   ├── manage.py
│   ├── mysite
│   │   ├── __init__.py
│   │   ├── settings.py
│   │   ├── urls.py
│   │   └── wsgi.py
│   ├── polls
│   │   ├── admin.py
│   │   ├── apps.py
│   │   ├── __init__.py
│   │   ├── migrations
│   │   │   └── __init__.py
│   │   ├── models.py
│   │   ├── tests.py
│   │   └── views.py
│   └── test1.py (content: from polls import models)
└── test1.py (content: from mysite.polls import models)
This pollutes the source root with a bunch of testN.py files, which contradicts the Zen of Python statement "Namespaces are one honking great idea -- let's do more of those!" and makes the code structure messy and harder to navigate.
Creating a package that can be installed into a virtualenv in editable mode (pip install -e .) lets you import any submodule from any script or other submodule, no matter where they are located relative to the source code, as long as the virtualenv is activated. The content of tmp/test1.py would be the same no matter where you place it:
from mysite.polls import models
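An editable install does not copy the code; it records where the project lives so that the environment's interpreter can find it. How exactly this is done depends on the pip and setuptools versions, but one mechanism Python itself offers for it is a .pth file in site-packages. The sketch below imitates that mechanism only: the directory named site-packages, the file mysite-editable.pth and the LOADED marker are all made up for this illustration, and no real package is installed.

```python
import importlib
import site
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    project_root = base / "project"
    fake_site_packages = base / "site-packages"

    # Stand-in project: project/mysite/polls/models.py
    (project_root / "mysite" / "polls").mkdir(parents=True)
    (project_root / "mysite" / "__init__.py").write_text("")
    (project_root / "mysite" / "polls" / "__init__.py").write_text("")
    (project_root / "mysite" / "polls" / "models.py").write_text("LOADED = True\n")

    # A .pth file lists extra directories; each existing one is appended to sys.path.
    fake_site_packages.mkdir()
    (fake_site_packages / "mysite-editable.pth").write_text(str(project_root) + "\n")

    # Process the directory the way site-packages is processed at interpreter startup.
    site.addsitedir(str(fake_site_packages))

    # The import no longer depends on where the importing code is located.
    models = importlib.import_module("mysite.polls.models")
    print(models.LOADED)
```

Because the lookup goes through the environment rather than through the script's own directory, the import statement stays the same wherever the script is placed.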
Other reasons
There are more reasons to do Python packaging (although most of them fall into the "higher maturity" category):
- Code can be uploaded to a private (local) PyPI repository (e.g. Gemfury)
- Forces you to have properly versioned software (independent of any particular version control system) with comparable version numbers
- Forces you to namespace the code, which is well aligned with the Zen of Python statement "Namespaces are one honking great idea -- let's do more of those!"
- Distributable independently of any particular version control system
- No need to clone the entire repository when you need only the latest version
- Allows you not to give access to the entire source history to someone who is authorized only for deployment
- No need to install a version control system client (like git) to deploy
- Installable/uninstallable/upgradeable with pip, pipenv and other similar tools
- Allows you to distribute only the code required at run time (tests and other auxiliary stuff may be excluded from the package)
- Package may include C-extensions or Cython code which are automatically compiled during installation
- Package can be precompiled as Python wheel
- Package can be compressed
P.S. Advanced tutorial on Python packaging for Django users