├── .gitignore
├── README.md
├── my_package
│   ├── __init__.py
│   ├── my_submodule
│   │   ├── __init__.py
│   │   ├── submodule_functions.py
│   │   └── test_submodule.py
│   └── utils.py
└── setup.py

/.gitignore:
--------------------------------------------------------------------------------
/my_package.egg-info/*
*.pyc

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# An example of a Python package

## Why use packages?
It is good practice not to write the same function twice, and to reuse common code from one Python script to another.

To import the function `some_function` defined in `some_file.py`, you can do:
```python
from some_file import some_function
```
but this only works if you run your code in the same folder as `some_file.py`.
If you want to make `some_function` accessible from anywhere on your computer, *you should not use*
```python
import sys
sys.path.append("the/path/to")  # the folder containing some_file.py
```
because the hard-coded path rarely exists on other people's computers, so this breaks most of the time when you share your code.

Instead, you should **create a Python package** containing the code you need.
The following shows how to do it.

## Package structure
This repository contains a basic Python package, named `my_package`.
Its structure is as follows (ignore the `my_submodule` folder for now):
```
example_package        (the main directory/folder)
├── my_package         (code folder, must have the name of the module)
│   ├── __init__.py    (a special Python file, executed whenever you import my_package)
│   └── utils.py       (a regular Python file in which you define functions, variables, classes, etc)
├── setup.py           (this special file is used by pip to install the package)
└── README.md          (contains what you are currently reading)
```

The mandatory files, which must have exactly these names, are `setup.py` and `my_package/__init__.py`. On the other hand, note that:
- `utils.py` could have any other name
- `README.md` is not necessary for a package, and is used here to give information to the people browsing the GitHub repository

## Package installation
Once you have the files defined above, open a terminal, move to the folder containing `setup.py` (using the `cd` command), then execute
```
pip install -e .
```

After that, **from any location on your computer** you can open an IPython terminal and run:
```python
import my_package
from my_package import my_function
my_function()
# etc, just like when you do: from pandas import read_csv
```

## How does it work?
Running `pip install -e .` tells Python to remember where it should look when you refer to `my_package` in some code.
Whenever you run `import my_package`, Python goes to this location and runs `__init__.py`.
Inside `__init__.py`, you have imported or defined some variables (functions, classes, constants, etc), which are now usable in your main script.

The `-e` stands for `--editable`: if you use this flag, changes made to the source code after installation are picked up the next time the package is imported.
If you don't use it and run `pip install .` instead, pip copies the source code in its current state to a different location, so later changes to the source code are not taken into account (you would need to reinstall the package for that).

# More advanced

## Submodules
When you do:
```python
from sklearn.linear_model import Lasso
```
you are using the submodule `linear_model` of `sklearn`.
When your codebase grows, splitting it into submodules helps keep your code organized (for example, all code related to linear models goes into the `linear_model` submodule; preprocessing goes into `sklearn.preprocessing`, etc).

In simple terms, a submodule is a package defined inside a package (meaning it also has its own `__init__.py`), using this folder structure:
```
example_package
└── my_package
    ├── __init__.py
    └── my_submodule
        └── __init__.py
```
Usually, the submodule's `__init__.py` file imports variables defined in other files inside the `my_submodule/` folder (not shown here for simplicity).
Here we just defined one function, `square`, in an auxiliary file inside `my_submodule/`, and we import it inside the submodule's `__init__.py`, thus making it accessible with:
```python
from my_package.my_submodule import square
# similar to: from numpy.linalg import norm
```

## Unit tests
You want to make sure that the code you wrote behaves as you expect, and that when you change other parts of the code, you don't break existing parts.
To automate these checks, you should write unit tests: functions that run your code on some data and check that the output is equal to the value you expect in that case. Usually, this check is done through an assertion.

In `my_package/my_submodule/test_submodule.py`, we give an example of such a test: we call our function `square(2)` and check that it returns 4.
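
As a minimal sketch of what such a test can look like (this is not the repository's own test file): the `square` function is copied inline so that the snippet runs on its own, and the test name `test_square_small_inputs` is only an illustrative choice. It uses the built-in `assert` statement, which `pytest` also understands.

```python
def square(x):
    """Stand-in copy of my_package.my_submodule.square, so the sketch is self-contained."""
    return x ** 2


def test_square_small_inputs():
    # Hypothetical test name; each assertion compares the output of `square`
    # with a value worked out by hand.
    assert square(2) == 4
    assert square(3) == 9
    assert square(0) == 0
```

Saved in a file whose name starts with `test_`, a function like this would be collected and run by `pytest` in the same way as the repository's test.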

You should run your tests regularly (with *Continuous Integration*, or CI, you can do this every time you push to your repo, for example); we recommend using `pytest` (installable with pip/conda).

At the root of the repo, run `pytest` and check the output: in all files starting with `test_`, all functions starting with `test_` are run, and if an assertion inside fails, the test fails.

**Exercise**: modify the content of `test_square` so that the test passes.

--------------------------------------------------------------------------------
/my_package/__init__.py:
--------------------------------------------------------------------------------
print("my_package is being imported")

from .utils import my_function  # noqa: E402
# we will be able to do: from my_package import my_function, or
# my_package.my_function
--------------------------------------------------------------------------------
/my_package/my_submodule/__init__.py:
--------------------------------------------------------------------------------
# the function square, imported here, will be accessible as:
# my_package.my_submodule.square

from .submodule_functions import square
--------------------------------------------------------------------------------
/my_package/my_submodule/submodule_functions.py:
--------------------------------------------------------------------------------
# This is a regular python file, where we define a function that will
# be imported in the submodule's __init__.py, and thus made accessible
# as: my_package.my_submodule.square


def square(x):
    """Return x squared."""
    return x ** 2
--------------------------------------------------------------------------------
/my_package/my_submodule/test_submodule.py:
--------------------------------------------------------------------------------
# this is a unit test file, designed to check that the code works as expected
# test files should have a
# name starting with `test_` and inside,
# test functions should also have a name starting with `test_`

import numpy as np
from my_package.my_submodule import square


def test_square():
    """We test that, on some simple cases, our `square` function behaves
    as expected."""

    np.testing.assert_equal(4, square(2))
    # this is wrong on purpose, fix it:
    np.testing.assert_equal(10, square(-1))

    # np.testing functions are nice because if the assertion is wrong,
    # they raise a detailed, precise error message:
    # compare: `np.testing.assert_equal(1, 2)`
    # vs the python built-in `assert 1 == 2`
--------------------------------------------------------------------------------
/my_package/utils.py:
--------------------------------------------------------------------------------
def my_function():
    """A dummy function, that we can call after importing my_package."""
    return 1

--------------------------------------------------------------------------------
/setup.py:
--------------------------------------------------------------------------------
from setuptools import setup, find_packages

setup(name="my_package",
      packages=find_packages(),
      )

--------------------------------------------------------------------------------