Structuring our Python packages
In the previous posts, we learned how to use Python scripts and modules to avoid repeating ourselves when we write programs. This is important: if you subsequently discover an error in your code (i.e. a bug), you have to fix it in only one place. Problems arise when you have to find and fix the same bug in multiple places. We also saw how to use the
__name__ variable to run Python modules (single Python files with useful functions, classes, and variables) as stand-alone programs.
In this post we will learn one option to organise several modules into folders, and provide a nice way for the user to call our various Python classes and functions when they import our package.
Let’s pretend that we created a Python package called scipoly. The package allows us to model scientific institutes.
The top-most folder is called
package_demo. It contains four folders that hold various types of files.
docs: Documentation for our package.
examples: Contains examples of how to use our package.
tests: Contains unit tests; tests that verify our package and all its parts are working correctly and as expected.
scipoly: Contains the files that make up our scipoly package.
package_demo/
├── docs
├── examples
├── tests
└── scipoly
    ├── __init__.py
    ├── actions
    │   ├── confuse.py
    │   ├── help.py
    │   ├── hinder.py
    │   └── __init__.py
    ├── people
    │   ├── __init__.py
    │   ├── person.py
    │   ├── scientist.py
    │   └── student.py
    └── places
        ├── __init__.py
        ├── lab.py
        ├── office.py
        └── room.py
The files related to this package are available from GitHub here. Note that only the scipoly package is provided in the GitHub repo; the docs, examples and
tests are not included.
You will have noticed that
scipoly and the three folders it contains (actions, people and
places) all have an
__init__.py file. At this point, these files are empty. But having them in these folders tells the Python interpreter that each folder is a package and that the files inside it are modules; this means we will be able to import them into our Python programs.
As explained in a previous post, you will have to point the Python interpreter to
scipoly so that it knows these files and folders exist. This can be done by adding the folder that contains the
scipoly folder (here, package_demo) to your
PYTHONPATH environment variable, or by adding it at runtime with:
import sys
sys.path.insert(0, '/path/to/folder')
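The two steps above (marking folders with __init__.py files and telling the interpreter where to look) can be tried out without touching the real scipoly files. The following is only a minimal sketch: it builds a throwaway package in a temporary folder, and the names demo_pkg and root are made up for the illustration. The function body mirrors the addition function from confuse.py described later in this post.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway package in a temporary folder (all names are hypothetical).
root = Path(tempfile.mkdtemp())          # plays the role of package_demo
pkg = root / "demo_pkg" / "actions"      # plays the role of scipoly/actions
pkg.mkdir(parents=True)

# Empty __init__.py files mark each folder as an importable package.
(root / "demo_pkg" / "__init__.py").touch()
(pkg / "__init__.py").touch()

# A tiny module with one function, mirroring confuse.addition from the post.
(pkg / "confuse.py").write_text("def addition(a, b):\n    return a + b + 1\n")

# Add the folder that CONTAINS the package, not the package folder itself.
sys.path.insert(0, str(root))
importlib.invalidate_caches()

from demo_pkg.actions import confuse

print(confuse.addition(2, 2))  # prints 5
```

Note that the path added to sys.path is the parent folder; the import statement then starts from the package name.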
Using our package
Here is a simple example of our package in action:
>>> import scipoly.actions.hinder
>>> scipoly.actions.hinder.days_to_completion(56)
156
>>> from scipoly.actions import confuse
>>> confuse.addition(2, 2)
5
>>> import scipoly.actions.help as h
>>> h.dishes()
I will do the dishes.
As you can see, we were able to import modules from our package (remember that modules are simply individual Python files that contain useful code such as classes and functions) and call some of their functions. We used a different import statement for each example. While the shortcut
h in the last example results in much less typing in our program, it will be much less clear to someone else (or our future self) reading our code where
h came from, especially if the import statement occurred a few hundred lines above.
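To see that the three import styles differ only in the name they bind, here is a small sketch that uses the standard library's os.path module as a stand-in for scipoly.actions.help (so it runs without our package installed). The alias p is chosen only for the illustration.

```python
import os.path
from os import path
import os.path as p

# All three names are bound to the very same module object.
print(os.path is path is p)  # prints True

# The call is identical; only the amount of context at the call site differs.
print(os.path.basename("/tmp/report.txt"))  # prints report.txt
print(path.basename("/tmp/report.txt"))     # prints report.txt
print(p.basename("/tmp/report.txt"))        # prints report.txt
```

The last call is the shortest to type, but it is also the one that tells a reader the least about where basename comes from.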
To make sure we understand what is going on: the folder
actions contains a module called
confuse.py. This module contains a function called addition:
def addition(a, b):
    return a + b + 1
That is why we were able to write the following code:
>>> from scipoly.actions import confuse
>>> confuse.addition(2, 2)
5
Populating one of our __init__.py files
Thus far, with empty
__init__.py files, we were able to import modules via their full path. That is, by specifying
scipoly, then the subfolder, for example
actions, and then the module name, for example
confuse. This is completely transparent, but we might like to shorten some of these import statements while keeping them informative.
Let's add the following lines to the
__init__.py file in the people folder:
from .person import Person
from .scientist import Scientist
from .student import Student
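What are these lines of code doing? They are importing the classes we have coded (Person, Scientist and
Student) in the various modules located in the people folder: person.py, scientist.py and
student.py. Note that the dot (.) in front of person, scientist and
student tells the Python interpreter to look for these modules in the current package, that is, the folder that contains this __init__.py file.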
With this code now located in our
scipoly/people/__init__.py file, we can use our package as follows:
>>> import scipoly.people as people
>>> w = people.Student(age=22, name="Willson", student_id=23434298)
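The same pattern can be reproduced in isolation. The sketch below is an assumption-laden stand-in, not the real scipoly code: it writes a throwaway package called demo_people (a made-up name) with a simplified Student class, and re-exports that class from the subpackage's __init__.py using a relative import.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Throwaway package layout (hypothetical names): demo_people/people/student.py
root = Path(tempfile.mkdtemp())
people = root / "demo_people" / "people"
people.mkdir(parents=True)
(root / "demo_people" / "__init__.py").touch()

# The module that actually defines the class (a simplified stand-in).
(people / "student.py").write_text(
    "class Student:\n"
    "    def __init__(self, name, age):\n"
    "        self.name = name\n"
    "        self.age = age\n"
)

# The leading dot means "look in the package this __init__.py belongs to".
(people / "__init__.py").write_text("from .student import Student\n")

sys.path.insert(0, str(root))
importlib.invalidate_caches()

import demo_people.people as people_pkg

w = people_pkg.Student(name="Willson", age=22)
print(w.name)                         # prints Willson
print(people_pkg.Student.__module__)  # prints demo_people.people.student
```

The __module__ attribute shows that the class still lives in student.py; the __init__.py file only makes it reachable under a shorter name.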
Why is this better? It will now be very clear where
Student comes from (from
people). This also means that we will not run into issues with namespaces; that is, even if another package we use contains a class called
Student, it will not conflict with our
Student class because we will always be calling it using
people.Student. This also means that if you create a variable called
Student, it will not shadow (i.e. hide) our Student class.
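The point about shadowing can be checked with any package. As a small sketch, the standard library's collections module stands in for people here, and its OrderedDict class stands in for Student; both substitutions are made only so the example runs without scipoly.

```python
import collections

# A local name that happens to match a class in the imported package.
OrderedDict = "just a string in my own script"

# The qualified name still reaches the real class, untouched by the local name.
d = collections.OrderedDict(a=1)
print(type(d).__name__)  # prints OrderedDict
print(OrderedDict)       # prints just a string in my own script
```

Had we written from collections import OrderedDict instead, the assignment on the next line would have replaced the class in our namespace.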
Going all the way
What if we really wanted to provide all our functions and classes from a single
scipoly import? We use various modules and folders to organise our code so that it is more logical to work with. But people using our package don’t necessarily need to know or worry about how we organised our package. Moreover, by having everything accessible from a single
scipoly import, it means that we can restructure our package as much as we like, as long as we keep the user interface the same (that is, the user will continue to use
scipoly.Student() to create new students, but the actual code describing the
Student class might be located in an entirely different module, or in a module of its own).
How can we achieve this? First, let’s delete the code we previously added to
scipoly/people/__init__.py. Next, let's add the following code to scipoly/__init__.py:
from .actions.confuse import addition
from .actions.confuse import all_caps
from .actions.confuse import picker
from .actions.help import dishes
from .actions.help import speedup
from .actions.hinder import days_to_completion
from .actions.hinder import guidance
from .people.person import Person
from .people.scientist import Scientist
from .people.student import Student
from .places.room import Room
from .places.lab import Lab
from .places.office import Office
With this in place, we can now use our package as follows:
>>> import scipoly
>>> scipoly.days_to_completion(56)
156
>>> scipoly.addition(2, 2)
5
>>> scipoly.dishes()
I will do the dishes.
>>> w = scipoly.Student(age=22, name="Willson", student_id=23434298)
>>> muscle = scipoly.Lab(number=132, capacity=32, equipment=['stimulator', 'pressure sensor'])
>>> muscle.accident()
Unfortunately, your graduate student just broke your stimulator.
>>> doug = scipoly.Scientist(name='Dr. Peters', age=62, discipline='Phrenology')
>>> room121 = scipoly.Office(number=121, capacity=1, person=doug)
>>> room121.clean()
True
>>> room121.person.name
'Dr. Peters'
>>> room121.person.interact('Julie McNeil')
Hi Julie McNeil, my name is Dr. Peters.
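For readers who want to experiment with a top-level __init__.py without the real scipoly files, here is a minimal sketch. It assumes a throwaway package called demo_top (a made-up name) with a heavily simplified Lab class, and shows that the user only ever touches the top-level name while the class itself lives two levels down.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Throwaway package (hypothetical name) with one nested subpackage.
root = Path(tempfile.mkdtemp())
top = root / "demo_top"
(top / "places").mkdir(parents=True)
(top / "places" / "__init__.py").touch()
(top / "places" / "lab.py").write_text(
    "class Lab:\n"
    "    def __init__(self, number):\n"
    "        self.number = number\n"
)

# Top-level __init__.py re-exports the class, hiding the folder layout.
(top / "__init__.py").write_text("from .places.lab import Lab\n")

sys.path.insert(0, str(root))
importlib.invalidate_caches()

import demo_top

muscle = demo_top.Lab(number=132)
print(muscle.number)            # prints 132
print(demo_top.Lab.__module__)  # prints demo_top.places.lab
print("Lab" in dir(demo_top))   # prints True
```

If lab.py were later moved or renamed, only the import line in the top-level __init__.py would need to change; code that calls demo_top.Lab would keep working.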
This also means that
smart integrated development environments (IDEs) will give you hints about what is available from your package once it is imported. Below is a screenshot of an interactive session in PyCharm.
Wow! That was a lot of code and a lot of new concepts. This level of complexity is not always required, but it is something to consider if you have created a large package that will be reused by yourself and others.