Project Structure
The provided template is intended to support the development of simple Python packages. The proposed structure can be freely changed to better fit each project's specifics: consider it a (tested) starting point that simplifies the initial setup.
Here’s the minimal structure (at the folder level, plus a few configuration files):
```
├── .github
│   ├── ISSUE_TEMPLATES
│   └── workflows
├── conda
├── docs
├── package_name
│   ├── algorithms
│   ├── classes
│   ├── readwrite
│   └── test
├── setup.py
├── README.md
├── requirements*.txt
├── environment.yaml
├── LICENSE
├── MANIFEST.in
├── .coveragerc
├── .readthedocs.yaml
└── .gitignore
```
In detail:
- `.github/workflows`: contains the GitHub Actions configuration files. An action is a set of instructions triggered when a specific precondition is met (e.g., packaging when a new release is created, running tests on push or on demand);
- `.github/ISSUE_TEMPLATES`: templates for opening issues, feature requests, etc.;
- `conda/`: configuration files needed to generate the Anaconda package from the project (dependencies, pre/post actions, ...);
- `docs/`: Python configuration (and Markdown files) for generating the Sphinx documentation starting from docstrings;
- `package_name/*`: source folders organized to cover the different functionalities exposed by the project;
- `package_name/test/`: unit tests for the package functionalities;
- `setup.py`: setuptools entry point describing how to generate the PyPI package (installable using pip);
- `requirements*.txt` | `environment.yaml`: packaging dependencies (for PyPI | Conda);
- `README.md` | `LICENSE` | `MANIFEST.in`: textual files describing, respectively, the project's aim, its license, and the list of files to include in the distribution;
- `.coveragerc`: selection of the folders to check for unit test code coverage;
- `.readthedocs.yaml`: configuration file needed by Read the Docs to build and host the project documentation on their servers.
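To make the role of `setup.py` concrete, here is a minimal sketch of what such an entry point could look like; the name, version, and metadata below are placeholders, not taken from the actual template:

```python
# Minimal setup.py sketch; all metadata values are placeholders.
from setuptools import find_packages, setup

setup(
    name="package_name",          # placeholder: the PyPI distribution name
    version="0.1.0",
    description="A simple Python package built from the template",
    long_description=open("README.md").read(),
    long_description_content_type="text/markdown",
    license="MIT",                # keep in sync with the LICENSE file
    # Exclude test sub-modules from the distributed package.
    packages=find_packages(exclude=["*.test", "*.test.*"]),
    # Reuse requirements.txt so pip and the requirements file stay aligned.
    install_requires=open("requirements.txt").read().splitlines(),
    python_requires=">=3.8",
)
```

Keeping `install_requires` in sync with `requirements*.txt` (here by reading the file directly) avoids declaring dependencies in two places.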
A rational organization of a package into coherent sub-modules has a twofold benefit:
- it allows the maintainer to keep the package logic clean;
- it allows users/contributors to better understand the design choices of the package author.
Admittedly, this topic is tied to each project's needs and each developer's sensibility. However, my experience (heavily shaped by the imperative and OOP programming paradigms) suggests that the module structure exemplified in this template works well for small- to medium-sized projects.
In particular, the current project identifies four sub-modules:
- `classes`: collects the objects that abstract the data used by the library's functions/methods (e.g., classes and data classes);
- `algorithms`: specifies the package functionalities (to be further structured into semantically coherent files collecting related functions);
- `readwrite`: dedicated to input/output functions and data transformations (e.g., how to save/load package objects to/from files);
- `test`: unit test scripts that check the behavior of the package functionalities and enable automatic regression analysis when function/method implementations are updated.
Most importantly, structuring a package in sub-modules allows for a fine-grained definition of import rules for different package functionalities through nested __init__.py.
When importing a package, the developer's visibility strategies impact how different functionalities can be accessed.
The primary purpose of __init__.py is to set such visibility rules hierarchically.
Each sub-module has to specify a __init__.py reporting the local imports of the functions/objects defined in the *.py files that describe its functionalities.
As an example, consider the `__init__.py` within the `algorithms` sub-module:

```python
from .sorting import *
```

In this case, all methods specified within `algorithms/sorting.py` become available at the sub-module level.
In other words, all sorting functions can be used by importing them with

```python
from package_name.algorithms import *
```

rather than with

```python
from package_name.algorithms.sorting import *
```

Note: the functions made available are only the ones listed in the `__all__` variable of the imported Python files (if specified).
Note 2: if (some of) the imports need to be exposed at the main package level (i.e., using only `from package_name import *` instead of specifying `package_name.algorithms`), propagate them until reaching the outermost `__init__.py` (for a working example, consider the objects defined in the `package_name.classes` sub-module).
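The propagation mechanism can be verified end to end with a small self-contained script. The sketch below builds a throwaway package (the name `toypkg` and the function `bubble_sort` are hypothetical, not part of the template) in a temporary directory, then shows that a name declared in `__all__` climbs up through two nested `__init__.py` files while a private helper stays hidden:

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway package on disk ("toypkg" is a hypothetical name).
root = Path(tempfile.mkdtemp())
algos = root / "toypkg" / "algorithms"
algos.mkdir(parents=True)

# algorithms/sorting.py defines the functions; __all__ limits what "*" exports.
(algos / "sorting.py").write_text(
    "__all__ = ['bubble_sort']\n"
    "def bubble_sort(seq):\n"
    "    return sorted(seq)\n"
    "def _helper():\n"
    "    pass\n"
)
# algorithms/__init__.py re-exports the names sorting.py declares public.
(algos / "__init__.py").write_text("from .sorting import *\n")
# toypkg/__init__.py propagates those names one level further up.
(root / "toypkg" / "__init__.py").write_text("from .algorithms import *\n")

sys.path.insert(0, str(root))
toypkg = importlib.import_module("toypkg")

print(toypkg.bubble_sort([3, 1, 2]))  # bubble_sort is visible at the top level
print(hasattr(toypkg, "_helper"))     # _helper is not (excluded by __all__)
```

Running it prints `[1, 2, 3]` and `False`: the public function is reachable directly from the package root, while the helper excluded from `__all__` never propagates.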
Ok, your code is ready to be packaged and distributed! But are you sure your functions are doing what you designed them to do?
One of the best practices to ensure that the expected behavior of your code is satisfied is to write unit tests (possibly as soon as new functionality is added to the project).
A unit test is a small test that checks that a single component operates correctly. It helps you isolate what is broken in your application and fix it faster.
To get started, check the official documentation and have a look at the (minimal) working examples in `package_name/test/`.
As a toy example, consider this piece of code from package_name/test/test_io.py that checks the to_json() and from_json() functions implemented in package_name/readwrite/io.py.
```python
import unittest

from package_name import Profiles, Profile
from package_name.readwrite import *


class TestAlgorithms(unittest.TestCase):

    def test_io_profiles(self):
        pls = Profiles()
        pls.add_profile(Profile("John", 20, "M"))
        pls.add_profile(Profile("Jane", 25, "F"))
        pls.add_profile(Profile("Jack", 22, "M"))
        pls.add_profile(Profile("Jill", 21, "F"))
        pls.add_profile(Profile("Jenny", 23, "F"))
        pls.add_profile(Profile("Jared", 24, "M"))

        json_str = to_json(pls)
        pls2 = from_json(json_str)
        self.assertEqual(pls, pls2)
```

The unit test imports all the needed classes and methods from your package (and from other packages, if required), reproduces a minimal flow that invokes the functionality you want to test, and verifies some conditions (in our case, that the original object is equal to the one reconstructed from its JSON representation).
Note: unit test file names must follow the naming convention `test_*.py`.
Note 2: unit test functions must follow the naming convention `def test_*` and be defined as methods of a `Test*(unittest.TestCase)` class.
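With those conventions in place, the whole suite can be run from the project root with unittest's built-in discovery; the paths below assume the template's default layout:

```shell
# Discover and run every test_*.py file under package_name/test/
python -m unittest discover -s package_name/test -p "test_*.py" -v
```

The `-v` flag is optional and simply prints one line per test method, which makes it easier to spot which check failed.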