
Ashley Whetter's Blog

[programming, linux] ++ [stuff]


Using imports in __init__.py to define your public API

The Problem

If you come from a language other than Python, putting everything in packages may look normal. Packages can also look like a great way to organise an API. Where am I going to find the Tiger class? In the tiger module of course! But overusing packages can make an API tiresome to use. The import that users have to write could look like:

import animals.tiger
rajah = animals.tiger.Tiger()
By overusing packages, we force users to type "tiger" twice just to use our class! Furthermore, the import could get quite long: the more levels of packages we add, the more users have to type to use our API. It also leaves them less of the 79 characters per line for their own code. Fortunately, __init__.py can help us solve these problems while still keeping our code organised.

An Almost Perfect Solution

Currently our code looks like this:

|- animals
|    |- __init__.py
|    |- base.py
|    |- tiger.py
|    |- utils.py

# animals/base.py
class Animal(object):
    pass

# animals/tiger.py
import animals.base
import animals.utils

class Tiger(animals.base.Animal):
    TOOTH_SHAPE = animals.utils.ToothShape.SHARP

# animals/utils.py
import enum

class ToothShape(enum.Enum):
    SHARP = 0

By adding an import of Tiger into animals/__init__.py, our users would no longer have to type "tiger" twice and we've shortened our import.

# animals/__init__.py
from animals.base import Animal
from animals.tiger import Tiger

Now users can use Tiger with:

import animals
rajah = animals.Tiger()

We can do this with submodules as well. If we import utils in animals/__init__.py, then users only need import animals to be able to use animals.utils (similar to how you can access os.path after importing only os).

# animals/__init__.py
from animals import utils

And users use utils with:

import animals
sharp = animals.utils.ToothShape.SHARP
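
To try this out without creating files by hand, the following is a minimal sketch that writes the package from this article into a temporary directory and then imports it. The FILES table and the build_package helper are scaffolding invented for this example; only the file contents come from the article.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Package sources from the article, keyed by path relative to the project root.
FILES = {
    "animals/__init__.py": (
        "from animals.base import Animal\n"
        "from animals.tiger import Tiger\n"
        "from animals import utils\n"
    ),
    "animals/base.py": "class Animal(object):\n    pass\n",
    "animals/tiger.py": (
        "import animals.base\n"
        "import animals.utils\n"
        "\n"
        "class Tiger(animals.base.Animal):\n"
        "    TOOTH_SHAPE = animals.utils.ToothShape.SHARP\n"
    ),
    "animals/utils.py": (
        "import enum\n"
        "\n"
        "class ToothShape(enum.Enum):\n"
        "    SHARP = 0\n"
    ),
}


def build_package(root):
    """Write every file in FILES below the given root directory."""
    for relative_path, source in FILES.items():
        target = Path(root) / relative_path
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(source)


root = tempfile.mkdtemp()
build_package(root)

# Forget any previously imported copy of the package, then make ours importable.
for name in [n for n in sys.modules if n == "animals" or n.startswith("animals.")]:
    del sys.modules[name]
sys.path.insert(0, root)
importlib.invalidate_caches()

import animals

rajah = animals.Tiger()                       # short, public spelling
print(isinstance(rajah, animals.Animal))      # Animal is re-exported too
print(animals.Tiger is animals.tiger.Tiger)   # same class, two paths
print(animals.utils.ToothShape.SHARP)         # submodule reachable via the package
```

Both spellings refer to the same class object; the import in __init__.py only adds a shorter name for it.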

A further benefit of using __init__.py this way is that you are free to rearrange your codebase without breaking users' code. Let's say that animals/utils.py starts to get really big and we want to split it into multiple files. Because we define our public API ourselves, we have the flexibility to turn utils.py into a package and import all of its public names from the new split files into its __init__.py. Users will never see a change, as they still use import animals.utils as before.
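
As a rough illustration of such a split, the sketch below turns utils.py into a utils/ package whose __init__.py re-exports ToothShape from a new submodule. The submodule name teeth is made up for this example, and the temporary-directory scaffolding exists only to make the sketch runnable.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical layout after the split: utils.py has become a utils/ package.
# The module name "teeth" is invented for this example.
FILES = {
    "animals/__init__.py": "from animals import utils\n",
    "animals/utils/__init__.py": "from animals.utils.teeth import ToothShape\n",
    "animals/utils/teeth.py": (
        "import enum\n"
        "\n"
        "class ToothShape(enum.Enum):\n"
        "    SHARP = 0\n"
    ),
}

root = Path(tempfile.mkdtemp())
for relative_path, source in FILES.items():
    target = root / relative_path
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(source)

# Forget any previously imported copy of the package, then make ours importable.
for name in [n for n in sys.modules if n == "animals" or n.startswith("animals.")]:
    del sys.modules[name]
sys.path.insert(0, str(root))
importlib.invalidate_caches()

# User code is unchanged from before the split.
import animals

sharp = animals.utils.ToothShape.SHARP
print(sharp)
print(animals.utils.ToothShape.__module__)  # where the class really lives now
```

The class now lives in a different module, but the path that users type has not moved.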

Some Traps

You should not use your public API inside the code of your library. If you do, your code becomes sensitive to the ordering of imports inside __init__.py files. For example, the following will fail:

# animals/__init__.py
from animals.tiger import Tiger
from animals.base import Animal

# animals/tiger.py
import animals
import animals.utils

class Tiger(animals.Animal):
    TOOTH_SHAPE = animals.utils.ToothShape.SHARP

with the exception:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "animals/__init__.py", line 1, in <module>
    from animals.tiger import Tiger
  File "animals/tiger.py", line 4, in <module>
    class Tiger(animals.Animal):
AttributeError: module 'animals' has no attribute 'Animal'

Can you spot why this fails?

animals/__init__.py is trying to import tiger, and tiger uses animals.Animal, but the import of animals has not yet reached its import of Animal. Therefore Animal does not yet exist on animals.
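
The ordering dependence can be checked with a small experiment. This sketch builds two throwaway copies of the package that differ only in the order of the two imports in __init__.py and reports whether each one can be imported. The try_import helper is invented for this example, and the exact wording of the error message depends on the Python version.

```python
import importlib
import sys
import tempfile
from pathlib import Path

BASE = "class Animal(object):\n    pass\n"
UTILS = "import enum\n\nclass ToothShape(enum.Enum):\n    SHARP = 0\n"
# tiger.py reaches Animal through the public API, as in the failing example.
TIGER = (
    "import animals\n"
    "import animals.utils\n"
    "\n"
    "class Tiger(animals.Animal):\n"
    "    TOOTH_SHAPE = animals.utils.ToothShape.SHARP\n"
)


def try_import(init_source):
    """Build a fresh copy of the package; return the AttributeError or None."""
    root = Path(tempfile.mkdtemp())
    package = root / "animals"
    package.mkdir()
    (package / "__init__.py").write_text(init_source)
    (package / "base.py").write_text(BASE)
    (package / "tiger.py").write_text(TIGER)
    (package / "utils.py").write_text(UTILS)

    # Forget any copy imported earlier so that this one is loaded from scratch.
    for name in [n for n in sys.modules if n == "animals" or n.startswith("animals.")]:
        del sys.modules[name]
    importlib.invalidate_caches()

    sys.path.insert(0, str(root))
    try:
        importlib.import_module("animals")
    except AttributeError as error:
        return error
    finally:
        sys.path.remove(str(root))
    return None


tiger_first = try_import(
    "from animals.tiger import Tiger\nfrom animals.base import Animal\n"
)
animal_first = try_import(
    "from animals.base import Animal\nfrom animals.tiger import Tiger\n"
)

print("Tiger imported first: ", tiger_first)
print("Animal imported first:", animal_first)
```

Only the first ordering fails, which shows how fragile it is for tiger.py to depend on the state of animals/__init__.py.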

One way to avoid this problem is to use your "private" API inside your library's code (just as users could unwisely choose to). But this brings back the problem of long imports. You would also lose the ability to (easily) change your code's layout, as your own code now relies on those fixed package positions in its imports.

Instead, you should use relative imports inside your own library so that you get shorter imports and layout flexibility.

The Final Solution

# animals/__init__.py
from .base import Animal
from .tiger import Tiger

from . import utils

# animals/base.py
class Animal(object):
    pass

# animals/tiger.py
from . import base
from . import utils

class Tiger(base.Animal):
    TOOTH_SHAPE = utils.ToothShape.SHARP

# animals/utils.py
import enum

class ToothShape(enum.Enum):
    SHARP = 0

Further Reading

The requests library's __init__.py is a great example of how to define a public API.

The standard library itself is also a great example of package layout. It abides by the rule that packages should be used only to resolve namespace conflicts, and for splitting up large modules. It therefore has a very flat hierarchy and only a few packages.

As the size of your code base increases, putting everything under a single module can make imports slow. The werkzeug library's __init__.py uses a custom lazy loading system to put its large codebase under a single module that is still fast to import. Note that the authors did this because splitting it into a package would not have made organisational sense; otherwise this would have been a legitimate use case for a package.
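
To give a feel for the general idea of lazy loading, here is a minimal sketch that uses a module-level __getattr__ function (available since Python 3.7) in __init__.py. This is not werkzeug's implementation; it is a different, simpler technique, and the _LAZY table and temporary-directory scaffolding are invented for this example.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Source of a hypothetical lazy animals/__init__.py.
LAZY_INIT = '''\
import importlib

# Public name -> submodule that defines it.
_LAZY = {"Animal": "animals.base", "Tiger": "animals.tiger"}

def __getattr__(name):
    # Only called when normal attribute lookup on the module fails.
    if name in _LAZY:
        value = getattr(importlib.import_module(_LAZY[name]), name)
        globals()[name] = value  # cache so later lookups skip this hook
        return value
    raise AttributeError(f"module {__name__!r} has no attribute {name!r}")
'''

FILES = {
    "animals/__init__.py": LAZY_INIT,
    "animals/base.py": "class Animal(object):\n    pass\n",
    "animals/tiger.py": (
        "from . import base\n"
        "\n"
        "class Tiger(base.Animal):\n"
        "    pass\n"
    ),
}

root = Path(tempfile.mkdtemp())
for relative_path, source in FILES.items():
    target = root / relative_path
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(source)

# Forget any previously imported copy of the package, then make ours importable.
for name in [n for n in sys.modules if n == "animals" or n.startswith("animals.")]:
    del sys.modules[name]
sys.path.insert(0, str(root))
importlib.invalidate_caches()

import animals

loaded_before = "animals.tiger" in sys.modules  # nothing imported yet
tiger_class = animals.Tiger                     # first access triggers the import
loaded_after = "animals.tiger" in sys.modules

print(loaded_before, loaded_after, tiger_class.__name__)
```

Importing the package stays cheap because each submodule is loaded only the first time one of its public names is accessed.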


Categories: Programming Python