Structuring our Python packages

In the previous posts, we learned how to use python scripts and modules to avoid repeating ourselves when we write programs. This is important. If you subsequently discover an error in your code (i.e. a bug), you have to fix it in only one place. Problems arise when you have to find and fix the same bug in multiple places. We also saw how to use the __name__
method to run python modules (a single Python file with useful functions, classes, and variables) as a stand-alone program.
In this post we will learn one option to organise several modules into folders, and provide a nice way for the user to call our various Python classes and functions when they import our package.
Package structure
Let’s pretend that we created a Python package called scipoly. The package allows us to model scientific institutes.
The top-most folder is called package_demo
. In contains 4 folders that contain various types of files.
docs
: Documentation for our package.examples
: Contains examples of how to use our package.tests
: Contains unit tests; tests that verify our package and all its parts are working correctly and as expected.scipoly
: Contains files to our scipoly package.
package_demo/
├── docs
├── examples
├── tests
└── scipoly
├── __init__.py
├── actions
│ ├── confuse.py
│ ├── help.py
│ ├── hinder.py
│ ├── __init__.py
├── people
│ ├── __init__.py
│ ├── person.py
│ ├── scientist.py
│ └── student.py
└── places
├── __init__.py
├── lab.py
├── office.py
└── room.py
The files related to this package are available from GitHub here. Note that only the scipoly package is provided in the GitHub repo; the docs
, examples
and tests
are not included.
__init__.py files
You will have noticed that scipoly
and the three folders it contains (actions
, people
, and places
) all have an __init__.py
file. At his point, these files are empty. But having them in these folders tells the Python interpreter that the files in each of these folders are modules; this means we will be able to import them into our python programs.
As explain in a previous post, you will have to point the Python interpreter to scipoly
so that it knows these files and folders exists. This can be done by adding the scipoly
folder to your PYTHONPATH
system variable, or by adding it at runtime with:
import sys
sys.path.insert(0,'/path/to/folder')
Using our package
Here is a simple example of our package in action:
>>> import scipoly.actions.hinder
>>> scipoly.actions.hinder.days_to_completion(56)
156
>>> from scipoly.actions import confuse
>>> confuse.addition(2, 2)
5
>>> import scipoly.actions.help as h
>>> h.dishes()
I will do the dishes.
As you can see, we were able to import modules from our package (remember that modules are simply individual Python files that contain useful code such as classes and functions) and call some of the functions. We used various import statements for each example. While the shortcut h
in the last example results in much less typing in our program, it will be much less clear to someone else (or our future self) reading our code where h
came from, especially if the import statement occurred a few hundred lines above.
To ensure we understand what is going on, the folder actions
contains a module called confuse.py
. This module contains a function called addition
:
def addition (a, b):
return a + b + 1
That is why were able to write the following code:
>>> from scipoly.actions import confuse
>>> confuse.addition(2, 2)
5
Populating one of our __init__.py files
Thus far, with empty __init__.py
files, we were able to import modules via their full path. That is, by specifying scipoly
, then the subfolder, for example actions
and then the module name, for example confuse
. While completely transparent, we might like to shorten some of these import statements, but keep them highly informative and transparent.
Lets add the following text the __init__.py
file in scipoly/people/
:
from .person import Person
from .scientist import Scientist
from .student import Student
What are these lines of code doing? They are importing classes we have coded (Person
, Scientist
, Student
) in the various modules located in the people
folder: person.py
, scientist.py
, and student.py
. Note that the .
before person
, scientist
and student
tells the Python interpreter to look for these modules in the current folder.
With this code now located in our scipoly/people/__init__.py
file, we can use our package as follows:
>>> import scipoly.people as people
>>> w = people.Student(age=22, name="Willson", student_id=23434298)
Why is this better? It will now be very clear where Student
comes from (from people
). This also means that we will not run into issues with namespaces; that is, even if another package we use contains a class called Student
, it will not conflict with our Student
class because we will always be calling it using people.Student
. This also means that if you create a variable called Student
, it will not shadow (i.e. hide) our Student
class.
Going all the way
What if we really wanted to provide all our functions and classes from a single scipoly
import? We use various modules and folders to organise our code so that it is more logical to work with. But people using our package don’t necessarily need to know or worry about how we organised our package. Moreover, by having everything accessible from a single scipoly
import, it means that we can restructure our package as much as we like, as long as we keep the user interface the same (that is, the user will continue to use scipoly.Student()
to create new students, but the actual code describing the Student
class might be located in an entirely different module, or in a module of itself).
How can we achieve this? First, let’s delete the code we previously added to scipoly/people/__init__.py
. Next, let’s add the following code to scipoly/__init__.py
:
from .actions.confuse import addition
from .actions.confuse import all_caps
from .actions.confuse import picker
from .actions.help import dishes
from .actions.help import speedup
from .actions.hinder import days_to_completion
from .actions.hinder import guidance
from .people.person import Person
from .people.scientist import Scientist
from .people.student import Student
from .places.room import Room
from .places.lab import Lab
from .places.office import Office
With this in place, we can now use our package as follows:
>>> import scipoly
>>> scipoly.days_to_completion(56)
156
>>> scipoly.addition(2, 2)
5
>>> scipoly.dishes()
I will do the dishes.
>>> w = scipoly.Student(age=22, name="Willson", student_id=23434298)
>>> muscle = scipoly.Lab(number=132, capacity=32, equipment=['stimulator', 'pressure sensor'])
>>> muscle.accident()
Unfortunately, your graduate student just broke your stimulator.
>>> doug = scipoly.Scientist(name='Dr. Peters', age=62, discipline='Phrenology')
>>> room121 = scipoly.Office(number=121, capacity=1, person=doug)
>>> room121.clean()
True
>>> room121.person.name
'Dr. Peters'
>>> room121.person.interact('Julie McNeil')
Hi Julie McNeil, my name is Dr. Peters.
This also means that smart
independent development environments (IDEs) will give you hints of what is available from your package when it is imported. Below is a screenshot of on interactive session in Pycharm.
Summary
Wow! That was a lot of code and a lot of new concepts. This level of complexity is not always required, but it is something to consider if you have created a large package that will be reused by yourself and others.
If my software uses other packages such as numpy, where should the import statements for these other packages go? Should it go in the main program, or an init.py file at the project root? I’m working on cleaning up some code I have and my import statements are scatter across all the module files and main program. A mess. 😦
BTW: You site is great! I find your python blogs really helpful.
LikeLike
Hi Samuel,
Glad to hear you enjoy blogs.
You probably don’t want your various imports in an
__init__.py
file. You want to important the various packages you need in each of your.py
files. However, when you group functions/classes that are related into the same module (i.e..py
file), you often reduce how often you import certain package.It is often a good idea to look at other projects to see how they are structured. It might help you figure out if your current structure is on pare with that of others.
Marty
LikeLike
Thank you for the feedback. I’ll take a look at some other projects on GitHub.
LikeLike