Feb 21, 2019

Python packaging

After years of working as a Python developer in various teams, I have learned that most Python backend developers do not respect Python packaging for backend applications. This makes me sad, so I decided to write a blog post explaining why I do otherwise.

I think I started packaging all the Python software I produce ages ago (at the beginning of 2013), right after I posted this article (in Russian) to be vetted as an admission fee to habr (to gain posting and commenting rights).

Many developers have asked me why I bother with Python packaging when everything works just fine without it. "It is a higher level of maturity" was usually not accepted as an answer.

Bulletproof reason (apparently not)

UPDATE: there is a workaround, which is to put the project root on PYTHONPATH:
$ PYTHONPATH=$PWD:$PYTHONPATH python ./tmp/test1.py

So the 100% bulletproof reason for Python packaging is this: without packaging, the source code cannot be reused in certain cases. You simply cannot import it, because it lies outside the module search path (sys.path) of the running Python interpreter.

Imagine you have created a skeleton Django project (I am using Django only to make the example more practical) and a Django application, as described in the official Django tutorial:

$ python --version
Python 3.7.2
$ python -m django --version
2.1.7
$ django-admin startproject mysite
$ cd mysite
$ python manage.py startapp polls

Now you need to do some quick research to test your code, so you create a directory that should not be part of the project and a file with your snippet:

$ cd ..
$ mkdir tmp
$ vim tmp/test1.py

tmp/test1.py content:
from mysite.polls import models

This would look like:
(django) $ tree
.
├── mysite
│   ├── manage.py
│   ├── mysite
│   │   ├── __init__.py
│   │   ├── settings.py
│   │   ├── urls.py
│   │   └── wsgi.py
│   └── polls
│       ├── admin.py
│       ├── apps.py
│       ├── __init__.py
│       ├── migrations
│       │   └── __init__.py
│       ├── models.py
│       ├── tests.py
│       └── views.py
└── tmp
    └── test1.py

When you run the tmp/test1.py script you get:
$ python ./tmp/test1.py 
Traceback (most recent call last):
  File "./tmp/test1.py", line 1, in <module>
    from mysite.polls import models
ModuleNotFoundError: No module named 'mysite'

The module is not imported because it lies outside the module search path of the Python interpreter running the script.
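
The cause can be made visible with a small experiment. When a file is run as a script, Python puts the script's own directory (here tmp) at the front of sys.path, not the current working directory, so the sibling mysite directory is never searched. The sketch below rebuilds a stripped-down stand-in for the layout above in a temporary directory and runs the script with and without the project root on PYTHONPATH. The models.py content and the helper names build_layout and run_script are made up for illustration, and Django is not involved.

```python
import os
import subprocess
import sys
import tempfile
from pathlib import Path


def build_layout(root: Path) -> Path:
    """Create <root>/mysite/polls/models.py and <root>/tmp/test1.py."""
    polls = root / "mysite" / "polls"
    polls.mkdir(parents=True)
    (polls / "__init__.py").write_text("")
    # Stand-in for the real Django models module (assumption for this demo).
    (polls / "models.py").write_text("NAME = 'polls.models'\n")
    script = root / "tmp" / "test1.py"
    script.parent.mkdir()
    script.write_text("from mysite.polls import models\nprint(models.NAME)\n")
    return script


def run_script(script: Path, cwd: Path, extra_path: str = "") -> subprocess.CompletedProcess:
    """Run the script in a child interpreter, optionally extending PYTHONPATH."""
    env = dict(os.environ)
    if extra_path:
        old = env.get("PYTHONPATH", "")
        env["PYTHONPATH"] = extra_path + (os.pathsep + old if old else "")
    return subprocess.run(
        [sys.executable, str(script)],
        cwd=str(cwd), env=env, capture_output=True, text=True,
    )


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        root = Path(d)
        script = build_layout(root)
        # Without help, only <root>/tmp is searched, so the import fails.
        failed = run_script(script, root)
        print(failed.returncode, failed.stderr.strip().splitlines()[-1])
        # With the project root on PYTHONPATH the same script works.
        fixed = run_script(script, root, extra_path=str(root))
        print(fixed.returncode, fixed.stdout.strip())
```

As in the layout above, the outer mysite directory has no __init__.py, so Python 3 treats it as a namespace package once its parent directory is on the search path.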

A relative import does not work either. tmp/test1.py content:
from ..mysite.polls import models

$ python ./tmp/test1.py
Traceback (most recent call last):
  File "./tmp/test1.py", line 1, in <module>
    from ..mysite.polls import models
ValueError: attempted relative import beyond top-level package

Let's move tmp/test1.py into the mysite directory:
$ tree
.
└── mysite
    ├── manage.py
    ├── mysite
    │   ├── __init__.py
    │   ├── settings.py
    │   ├── urls.py
    │   └── wsgi.py
    ├── polls
    │   ├── admin.py
    │   ├── apps.py
    │   ├── __init__.py
    │   ├── migrations
    │   │   └── __init__.py
    │   ├── models.py
    │   ├── tests.py
    │   └── views.py
    └── tmp
        └── test1.py

mysite/tmp/test1.py content:
from ..polls import models

Nope:
$ python ./mysite/tmp/test1.py 
Traceback (most recent call last):
  File "./mysite/tmp/test1.py", line 1, in <module>
    from ..polls import models
ValueError: attempted relative import beyond top-level package
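
Both relative imports fail for the same underlying reason: a file run directly as a script becomes the module "__main__" and has no parent package, so the leading dots have nothing to be relative to. The following sketch prints __name__ and __package__ for the same file, run once by path and once with python -m for contrast. The names mysite_demo and show.py are hypothetical and exist only for this demonstration.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# The demo file only reports how the interpreter sees it.
SCRIPT = "print(repr(__name__), repr(__package__))\n"


def run(args, cwd: Path) -> str:
    """Run a child interpreter with the given arguments and return its stdout."""
    result = subprocess.run(
        [sys.executable, *args],
        cwd=str(cwd), capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


def demo():
    with tempfile.TemporaryDirectory() as d:
        root = Path(d)
        pkg_dir = root / "mysite_demo" / "tmp"
        pkg_dir.mkdir(parents=True)
        (pkg_dir / "show.py").write_text(SCRIPT)
        # Run by path: the file is a top-level script without a package.
        as_script = run(["mysite_demo/tmp/show.py"], root)
        # Run as a module: the file knows its place in the package tree.
        as_module = run(["-m", "mysite_demo.tmp.show"], root)
        return as_script, as_module


if __name__ == "__main__":
    for line in demo():
        print(line)
```

Run by path, the file reports no package at all; run with -m from the project root, it reports mysite_demo.tmp as its package, which is what a relative import needs in order to resolve.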

You can import your code only if you move your test1.py to the root level like this:
$ tree
.
├── mysite
│   ├── manage.py
│   ├── mysite
│   │   ├── __init__.py
│   │   ├── settings.py
│   │   ├── urls.py
│   │   └── wsgi.py
│   ├── polls
│   │   ├── admin.py
│   │   ├── apps.py
│   │   ├── __init__.py
│   │   ├── migrations
│   │   │   └── __init__.py
│   │   ├── models.py
│   │   ├── tests.py
│   │   └── views.py
│   └── test1.py (content: from polls import models)
└── test1.py (content: from mysite.polls import models)

This pollutes the source root with a bunch of testN.py files, which contradicts the Zen of Python statement "Namespaces are one honking great idea -- let's do more of those!" and makes the code structure messy and harder to navigate.

Creating a package that can be installed into a virtualenv in editable mode (pip install -e .) lets you import any submodule from any script or module, no matter where it is located relative to the source code, as long as the virtualenv is activated.

The content of tmp/test1.py would be the same no matter where you place it:
from mysite.polls import models
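
To see why the location stops mattering, it helps to look at one common mechanism behind editable installs: a path configuration (.pth) file in site-packages that adds the project root to sys.path at interpreter startup. The sketch below only imitates that mechanism with the standard site module on a throwaway directory; it does not call pip, and the exact files that pip and setuptools create differ between versions. The names mysite_demo, make_project and fake_editable_install are hypothetical.

```python
import importlib
import site
import tempfile
from pathlib import Path


def make_project(root: Path) -> None:
    """Create a stand-in project: <root>/mysite_demo/polls/models.py."""
    polls = root / "mysite_demo" / "polls"
    polls.mkdir(parents=True)
    (polls / "__init__.py").write_text("")
    (polls / "models.py").write_text("NAME = 'polls.models'\n")


def fake_editable_install(project_root: Path, site_dir: Path) -> Path:
    """Write a .pth file pointing at the project and let `site` process it."""
    site_dir.mkdir(parents=True, exist_ok=True)
    pth = site_dir / "mysite_demo.pth"
    pth.write_text(str(project_root) + "\n")
    # addsitedir reads every *.pth file in site_dir and appends the
    # existing directories listed there to sys.path.
    site.addsitedir(str(site_dir))
    return pth


def demo() -> str:
    with tempfile.TemporaryDirectory() as d:
        project = Path(d) / "project"
        make_project(project)
        fake_editable_install(project, Path(d) / "site-packages")
        importlib.invalidate_caches()
        # The import no longer depends on where the calling code lives.
        models = importlib.import_module("mysite_demo.polls.models")
        return models.NAME


if __name__ == "__main__":
    print(demo())
```

Because the project root itself is on the search path, edits to the source take effect without reinstalling, which is the point of the editable mode.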

Other reasons

There are more reasons to do Python packaging (although most of them fall into the "higher maturity" category):

  • Code can be uploaded to a private (local) PyPI repository (e.g. Gemfury)
  • Forces you to have properly versioned software (independent of any particular version control system) with comparable version numbers
  • Forces you to namespace the code, which is well aligned with the Zen of Python statement "Namespaces are one honking great idea -- let's do more of those!".
  • Distributable independently of any particular version control system
    • No need to clone the entire repository when you need only the latest version
    • Allows you not to give access to the entire source history to someone who is authorized only for deployment
    • No need to install a version control client (like git) to deploy
  • Installable/uninstallable/upgradeable with pip, pipenv and other similar tools
  • Allows you to distribute only the code required at run time (tests and other auxiliary stuff may be excluded from the package)
  • Package may include C extensions or Cython code, which are automatically compiled during installation
  • Package can be precompiled as a Python wheel
  • Package can be compressed
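
As a small illustration of what "comparable version numbers" buys you, here is a deliberately simplified sketch. It handles only purely numeric versions such as 1.10.0; real tools follow PEP 440, and the packaging library's Version class also covers pre-releases and other forms. The helper names release_tuple and is_newer are made up for this example.

```python
def release_tuple(version: str) -> tuple:
    """Turn a purely numeric version like '1.10.0' into (1, 10, 0)."""
    return tuple(int(part) for part in version.split("."))


def is_newer(candidate: str, installed: str) -> bool:
    """Compare versions segment by segment as numbers, not as text."""
    return release_tuple(candidate) > release_tuple(installed)


if __name__ == "__main__":
    # Plain string comparison gets this wrong: "1" sorts before "9".
    print("1.10.0" > "1.9.0")           # False
    # Numeric comparison recognizes 1.10.0 as the later release.
    print(is_newer("1.10.0", "1.9.0"))  # True
```

A commit hash offers no such ordering, which is one reason a package version is useful independently of the version control system.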

Feb 13, 2019

My public source code

Prospective clients and employers often ask me to show some source code so that they can assess my coding skills. Unfortunately, or fortunately, most of the code I write, I write for money (that is, professionally), so it is covered by an NDA, either explicitly or implicitly. Here is the list of mostly leisure projects whose code I can show:

Stats on repositories that cannot be open sourced.