43.1. Creating and Using Python Modules
To have truly reusable code, we need to access functions, variables, and objects that have already been written. Thus we need to have a way to share our code. This is where modules and packages are useful. In this lesson, we demonstrate how to create our first Python module and access its contents from a different Python program.
What Is a Module?
Working with Python, it’s very easy to define new functions and assign values to variables that we would like to use multiple times. It would be great if we could write these useful pieces of code once and then use them whenever we need them. Thankfully, we can do just that because of modules. In Python, a module is just a Python file. This means that we can divide our code into logical groupings by putting related code into separate modules and then pulling those modules into our scripts or applications when we need them.
Creating Our First Module
To demonstrate how to create and use modules, let’s create a new directory called using_modules. Within it, we’ll define our first module by creating the using_modules/helpers.py file.
$ mkdir ~/using_modules
$ cd ~/using_modules
$ touch helpers.py
Within helpers.py, we’re placing some functions that we think will be generally useful and likely to be used in other files. Let’s write a few functions that can manipulate strings.
~/using_modules/helpers.py
def extract_upper(phrase):
    return list(filter(str.isupper, phrase))

def extract_lower(phrase):
    return list(filter(str.islower, phrase))
Now we have two functions defined and we’d like to use them in other scripts and modules.
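To see what filter(str.isupper, phrase) is doing, here is a small sketch that compares it with an equivalent list comprehension. The extract_upper_comprehension name is ours and is only there for illustration.

```python
def extract_upper(phrase):
    # filter() keeps each character for which str.isupper(character) is true
    return list(filter(str.isupper, phrase))


def extract_upper_comprehension(phrase):
    # Hypothetical alternative spelling of the same logic
    return [char for char in phrase if char.isupper()]


sample = "Keith Thompson"
print(extract_upper(sample))                # ['K', 'T']
print(extract_upper_comprehension(sample))  # ['K', 'T']
```

Both versions walk the string one character at a time, so spaces and lowercase letters are simply skipped.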
Using Our Module from Another Script
For this section of the course, we’re going to be putting our example code into a script called main.py. Let’s create that script now and look at what we can do to pull in these functions so that we can use them.
The key to working with modules is the import statement. We’re going to dig deeper into all that we can do while importing modules in the next lesson. But for now, we’re going to leverage the fact that we can import modules in the same directory as our script by referencing them by their file name minus the extension. In our case, this will be helpers.
~/using_modules/main.py
import helpers
Before we use our functions, let’s make sure that this file is valid by running it.
$ python3.7 main.py
$
No output is a good sign. To utilize the functions defined in our module, we’ll add a period to the end of our module name (i.e. the file name without the .py extension) and then type the name of our function to call it as we otherwise would.
~/using_modules/main.py
import helpers
name = "Keith Thompson"
print(f"Lowercase letters: {helpers.extract_lower(name)}")
print(f"Uppercase letters: {helpers.extract_upper(name)}")
Let’s run this and verify it works as expected.
$ python3.7 main.py
Lowercase letters: ['e', 'i', 't', 'h', 'h', 'o', 'm', 'p', 's', 'o', 'n']
Uppercase letters: ['K', 'T']
Perfect! Now we know the simplest way to define and use modules. In the next lesson, we’ll dig deeper into the various ways and places that we can import modules.
43.2. Importing Modules
Python provides a few different ways to import modules and packages. In this lesson, we’ll take a look at how importing works and the various ways we can import definitions from a module.
The Standard import Statement
When we learned how to create a module, we also learned how to import the module as a singular entity into other Python files. To reiterate this, we use the following format to import an entire module under its namespace.
import my_module_name
By doing this, we’re able to access anything exposed by the module by chaining off of the module’s name. Occasionally, we might have a naming conflict when importing a module. In those cases, we can also use the keyword as in the import statement to change the identifier that we use to represent the module. Let’s change our using_modules/main.py so that the helpers module is accessed using the h name.
~/using_modules/main.py
import helpers as h
name = "Keith Thompson"
print(f"Lowercase letters: {h.extract_lower(name)}")
print(f"Uppercase letters: {h.extract_upper(name)}")
The name h isn’t great, but it does demonstrate that we can change the name of modules when we import them. If we run this script, we will see there’s no difference in the output.
$ python3.7 main.py
Lowercase letters: ['e', 'i', 't', 'h', 'h', 'o', 'm', 'p', 's', 'o', 'n']
Uppercase letters: ['K', 'T']
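As a quick sketch of what as does, we can check that an alias is just a second name for the same module object. This uses the standard library math module rather than our helpers module so that it runs anywhere.

```python
import math
import math as m
import sys

# Both names are bound to the very same module object
print(m is math)                 # True
print(sys.modules["math"] is m)  # True

# The module itself still knows its real name
print(m.__name__)                # math
print(m.sqrt(16))                # 4.0
```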
Importing from
More often than not, we don’t need to use everything provided by a module. In these cases, we can leverage the from version of an import statement to import only the definitions we need from the module, and then we can access them directly. To demonstrate how to do this for multiple functions, let’s directly import the functions from our helpers module. The from statement works like this:
from <MODULE_NAME> import <definition>, <definition>, <etc.>
Here’s what it looks like in main.py:
~/using_modules/main.py
from helpers import extract_lower, extract_upper
name = "Keith Thompson"
print(f"Lowercase letters: {extract_lower(name)}")
print(f"Uppercase letters: {extract_upper(name)}")
It’s worth noting that now we don’t have access to the helpers name in our code at all. If we change our extract_upper line to be chained off of the helpers name, it will raise a NameError.
$ python3.7 main.py
Lowercase letters: ['e', 'i', 't', 'h', 'h', 'o', 'm', 'p', 's', 'o', 'n']
Traceback (most recent call last):
File "using_modules/main.py", line 5, in <module>
print(f"Uppercase letters: {helpers.extract_upper(name)}")
NameError: name 'helpers' is not defined
Lastly, we can also combine the as keyword with each definition that we’re importing to explicitly rename that definition.
~/using_modules/main.py
from helpers import extract_lower as e_low, extract_upper
name = "Keith Thompson"
print(f"Lowercase letters: {e_low(name)}")
print(f"Uppercase letters: {extract_upper(name)}")
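The same behavior can be sketched with the standard library so that it runs without our helpers module. The square_root alias is our own choice of name. Only the names listed after import are bound, so the module name itself stays undefined.

```python
from math import sqrt as square_root, pi

print(square_root(16))  # 4.0
print(round(pi, 2))     # 3.14

# The module name was never bound, so using it raises a NameError
try:
    math.sqrt(16)
    module_name_bound = True
except NameError:
    module_name_bound = False

print(module_name_bound)  # False
```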
Importing Everything from a Module
The final way we can import definitions from a module is to import all of them at once by using *. This is generally not the recommended way of importing things, but sometimes a module provides a lot of functions that we’ll be using, and we don’t want to explicitly import them one at a time.
Let’s utilize the * to import our two functions from the helpers module without explicitly naming them.
~/using_modules/main.py
from helpers import *
name = "Keith Thompson"
print(f"Lowercase letters: {extract_lower(name)}")
print(f"Uppercase letters: {extract_upper(name)}")
Once again, if we run this, it will work just as it did before.
$ python3.7 main.py
Lowercase letters: ['e', 'i', 't', 'h', 'h', 'o', 'm', 'p', 's', 'o', 'n']
Uppercase letters: ['K', 'T']
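One reason * imports are discouraged is that a later import can silently replace a name from an earlier one. The sketch below illustrates this with the standard library math and cmath modules, which both define sqrt. We run each import through exec() with an explicit dictionary only so that we can inspect the resulting names; in a normal script the import lines would simply sit at the top of the file.

```python
namespace = {}

# exec() runs each statement as module-level code, using `namespace` as its globals
exec("from math import *", namespace)
print(namespace["sqrt"](16))  # 4.0

exec("from cmath import *", namespace)
print(namespace["sqrt"](16))  # (4+0j)
```

After the second import, sqrt refers to the complex-number version from cmath, and nothing in the code warned us about the change.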
43.3. Executing Modules as Scripts
Python modules are just files, but sometimes we want them to behave slightly differently if they’re being run directly. In this lesson, we’ll learn about how modules are interpreted when imported and also how to only run code when a module is run directly by using the __name__ variable.
Expressions in a Module
Since modules are just Python files, they can contain expressions and the file will be interpreted from top to bottom. So a few good questions to ask ourselves are:
- When is a module interpreted?
- Can a module be interpreted twice?
To test this, let’s create another module that imports our helpers module and also import that new module into our main.py. We’ll call this module extras.py.
~/using_modules/extras.py
print("Importing 'helpers' in 'extras'")
import helpers
name = "Keith Thompson"
In main.py, let’s import extras.
~/using_modules/main.py
print("We're importing 'helpers' from 'main'")
from helpers import *
print("We're importing 'extras' from 'main'")
import extras
print(f"Lowercase letters: {extract_lower(extras.name)}")
print(f"Uppercase letters: {extract_upper(extras.name)}")
Finally, in helpers.py we’ll add a print call, so that we can see when it is run and how many times it is run.
~/using_modules/helpers.py
def extract_upper(phrase):
    return list(filter(str.isupper, phrase))

def extract_lower(phrase):
    return list(filter(str.islower, phrase))

print("HELLO FROM HELPERS")
We now have enough print lines to help us really see how main.py is processed and when our modules are interpreted. When we run it, this is what we see:
$ python3.7 main.py
We're importing 'helpers' from 'main'
HELLO FROM HELPERS
We're importing 'extras' from 'main'
Importing 'helpers' in 'extras'
Lowercase letters: ['e', 'i', 't', 'h', 'h', 'o', 'm', 'p', 's', 'o', 'n']
Uppercase letters: ['K', 'T']
As we can see, the code within the helpers module was only interpreted the first time that it was imported. So even though it was imported into two different modules, it was only ever run one time.
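We can reproduce this behavior in a single self-contained script. The sketch below writes a throwaway module to a temporary directory (the noisy_demo_module name is hypothetical), imports it twice, and counts how often its body ran. It also shows that importlib.reload is the explicit way to run a module’s body again.

```python
import contextlib
import importlib
import io
import sys
import tempfile
from pathlib import Path

MODULE_NAME = "noisy_demo_module"  # hypothetical name, chosen to avoid clashes

with tempfile.TemporaryDirectory() as tmp_dir:
    # A module whose only job is to announce that its body ran
    Path(tmp_dir, f"{MODULE_NAME}.py").write_text('print("BODY RAN")\n')
    sys.path.insert(0, tmp_dir)
    importlib.invalidate_caches()
    captured = io.StringIO()
    try:
        with contextlib.redirect_stdout(captured):
            first = importlib.import_module(MODULE_NAME)   # body runs
            second = importlib.import_module(MODULE_NAME)  # served from sys.modules
            runs_after_two_imports = captured.getvalue().count("BODY RAN")
            importlib.reload(first)                        # body runs again
            runs_after_reload = captured.getvalue().count("BODY RAN")
    finally:
        sys.path.remove(tmp_dir)
        sys.modules.pop(MODULE_NAME, None)

print(first is second)         # True
print(runs_after_two_imports)  # 1
print(runs_after_reload)       # 2
```

The second import finds the module already cached in sys.modules, which is why the body does not run again.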
Running a Module Directly
Ideally, we don’t want to run this print line when our module is imported, but sometimes we do want a module to execute something if it is run directly. To handle this, we can access the __name__ variable. The __name__ variable is set in each module and can be used to determine if the module is being run directly as opposed to being imported. Let’s change the various print lines from the previous section to help us understand the values set to __name__ in each of our scripts.
~/using_modules/main.py
from helpers import *
import extras
print(f"__name__ in main.py: {__name__}")
print(f"Lowercase letters: {extract_lower(extras.name)}")
print(f"Uppercase letters: {extract_upper(extras.name)}")
~/using_modules/helpers.py
def extract_upper(phrase):
    return list(filter(str.isupper, phrase))

def extract_lower(phrase):
    return list(filter(str.islower, phrase))

print(f"__name__ in helpers.py: {__name__}")
print("HELLO FROM HELPERS")
~/using_modules/extras.py
import helpers
print(f"__name__ in extras.py: {__name__}")
name = "Keith Thompson"
Here's what we see when we run main.py:
$ python3.7 main.py
__name__ in helpers.py: helpers
HELLO FROM HELPERS
__name__ in extras.py: extras
__name__ in main.py: __main__
Lowercase letters: ['e', 'i', 't', 'h', 'h', 'o', 'm', 'p', 's', 'o', 'n']
Uppercase letters: ['K', 'T']
All of the modules that we imported have __name__ set to the actual module name, but in main.py it is set to __main__ because it is running in the main context. A common pattern is to add a condition like this if we want to add functionality to a module only if it is running in the main context:
if __name__ == "__main__":
    print("Something only when running in main scope")
To demonstrate this, let’s remove all of these debugging lines, but move “HELLO FROM HELPERS” into this conditional in helpers.py.
(We’re only showing the change to helpers.py, but we removed all of the '__name__ in …' output)
~/using_modules/helpers.py
def extract_upper(phrase):
    return list(filter(str.isupper, phrase))

def extract_lower(phrase):
    return list(filter(str.islower, phrase))

if __name__ == "__main__":
    print("HELLO FROM HELPERS")
If we now run main.py we should see the following:
$ python3.7 main.py
Lowercase letters: ['e', 'i', 't', 'h', 'h', 'o', 'm', 'p', 's', 'o', 'n']
Uppercase letters: ['K', 'T']
If we run helpers.py directly, we should see the print line being run.
$ python3.7 helpers.py
HELLO FROM HELPERS
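Both situations can be sketched from one script. Here we write a throwaway module (the name_demo_module name is hypothetical), import it, and then execute the same file with the standard library runpy module, which runs it much like python3.7 name_demo_module.py would when we pass run_name="__main__".

```python
import importlib
import runpy
import sys
import tempfile
from pathlib import Path

MODULE_NAME = "name_demo_module"  # hypothetical name for this sketch

with tempfile.TemporaryDirectory() as tmp_dir:
    module_path = Path(tmp_dir, f"{MODULE_NAME}.py")
    module_path.write_text(
        "seen_name = __name__\n"
        "ran_main_block = False\n"
        'if __name__ == "__main__":\n'
        "    ran_main_block = True\n"
    )
    sys.path.insert(0, tmp_dir)
    importlib.invalidate_caches()
    try:
        # Imported: __name__ is the module's own name
        imported = importlib.import_module(MODULE_NAME)
        # Run as a script: __name__ is "__main__"
        as_script = runpy.run_path(str(module_path), run_name="__main__")
    finally:
        sys.path.remove(tmp_dir)
        sys.modules.pop(MODULE_NAME, None)

print(imported.seen_name, imported.ran_main_block)          # name_demo_module False
print(as_script["seen_name"], as_script["ran_main_block"])  # __main__ True
```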
43.4. Hiding Module Entities
Now that we know how to import our modules, we might want to restrict what is exposed. In this lesson, we’ll look at how we can hide some of our module’s contents from being imported by other modules and scripts.
What Are Module Entities?
When we see the term module entities, we should think of variables, functions, and classes (we’ll cover classes in the next section). A module entity is anything we provide with a name in our module. As we’ve seen, these things are importable by name when we use from <module> import <name>.
Using __all__
If we want to prevent someone from importing an entity from our module, there aren’t very many options. There are only two reasonable things we can do to restrict what is imported if someone uses from <module> import *. The first is by setting the __all__ variable in our module. Let’s test this out by setting __all__ to a list including only extract_upper to see what happens in main.py.
~/using_modules/helpers.py
__all__ = ["extract_upper"]
def extract_upper(phrase):
    return list(filter(str.isupper, phrase))

def extract_lower(phrase):
    return list(filter(str.islower, phrase))

if __name__ == "__main__":
    print("HELLO FROM HELPERS")
In main.py, we had been using both of these functions after loading them with from helpers import *. Here’s another look at what main.py currently looks like.
~/using_modules/main.py
from helpers import *
import extras
print(f"Lowercase letters: {extract_lower(extras.name)}")
print(f"Uppercase letters: {extract_upper(extras.name)}")
With __all__ set in helpers, let’s run main.py to see what happens.
$ python3.7 main.py
Traceback (most recent call last):
File "main.py", line 4, in <module>
print(f"Lowercase letters: {extract_lower(extras.name)}")
NameError: name 'extract_lower' is not defined
Although extract_lower exists within helpers.py, it is not available in other modules via from helpers import *. This does not mean that we can’t explicitly import extract_lower though. Let’s modify main.py to import extract_lower by name.
~/using_modules/main.py
from helpers import *
from helpers import extract_lower
import extras
print(f"Lowercase letters: {extract_lower(extras.name)}")
print(f"Uppercase letters: {extract_upper(extras.name)}")
Let’s run this one more time.
$ python3.7 main.py
Lowercase letters: ['e', 'i', 't', 'h', 'h', 'o', 'm', 'p', 's', 'o', 'n']
Uppercase letters: ['K', 'T']
While it doesn’t allow us to prevent an entity from ever being imported, using __all__ does provide a way of sometimes restricting what is imported by modules and scripts consuming our modules and packages.
Using Underscored Entities
The other way we can prevent an entity from being exported automatically when someone uses from <module> import * is by making the first character an underscore (_). If we removed __all__ from helpers.py and created a variable called _hidden_var = "test", we would not have access to _hidden_var after running from helpers import *.
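Here is a self-contained sketch of the underscore rule. It writes a throwaway module (underscore_demo_module and visible_var are hypothetical names) and runs the two styles of import through exec() so that we can inspect exactly which names each one binds.

```python
import importlib
import sys
import tempfile
from pathlib import Path

MODULE_NAME = "underscore_demo_module"  # hypothetical name for this sketch

with tempfile.TemporaryDirectory() as tmp_dir:
    Path(tmp_dir, f"{MODULE_NAME}.py").write_text(
        '_hidden_var = "test"\n'
        'visible_var = "shown"\n'
    )
    sys.path.insert(0, tmp_dir)
    importlib.invalidate_caches()
    try:
        # Names bound by a star import
        star_names = {}
        exec(f"from {MODULE_NAME} import *", star_names)
        # Names bound by an explicit import of the underscored name
        explicit_names = {}
        exec(f"from {MODULE_NAME} import _hidden_var", explicit_names)
    finally:
        sys.path.remove(tmp_dir)
        sys.modules.pop(MODULE_NAME, None)

print("visible_var" in star_names)    # True
print("_hidden_var" in star_names)    # False
print(explicit_names["_hidden_var"])  # test
```

As with __all__, the underscore only affects * imports; the name can still be imported explicitly.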
43.5. The Module Search Path
We’ve seen how to create our modules, and we’ve been able to import them from scripts adjacent to them in the file system, but where else can we import modules from?
Where Do Modules Come From?
Python is a language with a large and powerful standard library of modules. To use these modules, we need to import them the same way that we’ve been importing our local modules, but how does Python know where to find the code for these modules? To understand this we need to look at the module search path. When Python goes looking for a module it has a path that works very much like the PATH variable used by our shell to find executables. A few different things are combined to make this path:
- The directory containing the running script is automatically the first item in the search path. When running the REPL this will be the current directory.
- The values set in the PYTHONPATH environment variable (if it is set) will be next in the list.
- Finally, a list of directories configured when Python was installed. This list contains paths to directories that have the standard library modules and other packages we’ve installed.
If we want to see the module search path, we can import the sys module and view the path variable. Let’s do this from a REPL.
$ python3.7
Python 3.7.6 (default, Jan 29 2020, 21:20:26)
[GCC 4.8.5 20150623 (Red Hat 4.8.5-39)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> sys.path
['', '/home/cloud_user/.pyenv/versions/3.7.6/lib/python37.zip', '/home/cloud_user/.pyenv/versions/3.7.6/lib/python3.7', '/home/cloud_user/.pyenv/versions/3.7.6/lib/python3.7/lib-dynload', '/home/cloud_user/.pyenv/versions/3.7.6/lib/python3.7/site-packages']
>>> exit()
Our Python install is in ~/.pyenv/versions/3.7.6, and the directories within contain the standard library. The site-packages directory contains third-party packages that we might install.
Just to show that we can change this, let’s set the PYTHONPATH environment variable when starting the REPL.
$ PYTHONPATH=/home/cloud_user python3.7
Python 3.7.6 (default, Jan 29 2020, 21:20:26)
[GCC 4.8.5 20150623 (Red Hat 4.8.5-39)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> sys.path
['', '/home/cloud_user', '/home/cloud_user/.pyenv/versions/3.7.6/lib/python37.zip', '/home/cloud_user/.pyenv/versions/3.7.6/lib/python3.7', '/home/cloud_user/.pyenv/versions/3.7.6/lib/python3.7/lib-dynload', '/home/cloud_user/.pyenv/versions/3.7.6/lib/python3.7/site-packages']
>>> exit()
Now we can see that /home/cloud_user is the second item in the list. If we don’t have a matching module or package in our current directory (the '' in the list), then Python will check items passed in via PYTHONPATH before looking at items provided by our Python installation.
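Note: Python will search for a built-in module by name before searching the paths in sys.path. This means that a local file can’t shadow a module that is compiled into the interpreter (such as sys). Most standard library modules are ordinary files found through sys.path, though, so a local file that shares a name with one of them (for example, random.py) will be imported instead, because the script’s directory comes first in the search path.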
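To check where a particular module comes from, we can ask the import system for its spec. This sketch uses only standard library calls; json is just an arbitrary example of a standard library module that lives on disk.

```python
import importlib.util
import sys

# Built-in modules are compiled into the interpreter and have no file on disk
print("sys" in sys.builtin_module_names)       # True
print(importlib.util.find_spec("sys").origin)  # built-in

# Most standard library modules are ordinary files located through sys.path
json_spec = importlib.util.find_spec("json")
print(json_spec.origin)  # the location of json's __init__.py in the standard library
```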
43.6. Creating and Using Python Packages
Python modules are simply Python files, but they are not the only way we can bundle up our code for reuse. Modules are not that easy to share. The primary way we share code is by wrapping our modules into packages. In this lesson, we’ll learn what it takes to create a Python package.
What Is a Package in Python?
A package is a namespace that allows us to group modules together. We create a package in Python by creating a directory to hold our modules and adding a special file named __init__.py. To show how a package can allow us to organize our code even more, let’s create a helpers directory within using_modules. Let’s create an empty __init__.py file within that directory.
$ mkdir ~/using_modules/helpers
$ touch ~/using_modules/helpers/__init__.py
The __init__.py doesn’t need to have anything in it, though we can and will use it later. Next, let’s move our helpers.py file into the helpers directory and change its name to strings.py since this file holds helper functions completely focused on working with strings. Our extras.py module actually doesn’t do anything besides defining variables, so let’s move it into helpers as helpers/variables.py.
$ cd ~/using_modules
$ mv helpers.py helpers/strings.py
$ mv extras.py helpers/variables.py
We now have a package that contains two modules, but we also broke main.py. Let’s change main.py to use our package, instead of the modules that we had before.
~/using_modules/main.py
from helpers.strings import extract_lower, extract_upper
from helpers import variables
import helpers
print(f"Lowercase letters: {extract_lower(variables.name)}")
print(f"Uppercase letters: {extract_upper(variables.name)}")
print(f"From helpers: {helpers.strings.extract_lower(variables.name)}")
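The things to note here are that we can import a module from within our package directly, like we did with variables, and that we can chain a module off of the package name to import entities directly from that child module. Just like we can with a module, we’re able to import the package itself. Note that helpers.strings is reachable in the last line only because the first import already loaded the strings submodule; with an empty __init__.py, import helpers on its own does not load it.
Running main.py again we should see: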
$ python3.7 main.py
Lowercase letters: ['e', 'i', 't', 'h', 'h', 'o', 'm', 'p', 's', 'o', 'n']
Uppercase letters: ['K', 'T']
From helpers: ['e', 'i', 't', 'h', 'h', 'o', 'm', 'p', 's', 'o', 'n']
What Does __init__.py Do?
The mysterious __init__.py file is used to set up the initialization code for a package, but what does this mean? This means that when the package, or any subpackage or module within it, is first imported, the code within __init__.py gets executed. The other main thing we can do with our __init__.py is define the __all__ value for when we use from <package> import *. This doesn’t immediately make sense because our __init__.py doesn’t define anything right now, but we can import parts from our submodules and then make those immediately available if someone imports our package. Let’s modify helpers/__init__.py to do just that.
~/using_modules/helpers/__init__.py
__all__ = ['extract_upper']
from .strings import *
The syntax of .strings allows us to specify that we want to load the strings module within our package, regardless of what our package is named. This is just a way to be a little more explicit. Let’s change our main.py to use this.
~/using_modules/main.py
from helpers.strings import extract_lower
from helpers import variables
from helpers import *
import helpers
print(f"Lowercase letters (from strings): {extract_lower(variables.name)}")
print(f"Uppercase letters (from package): {extract_upper(variables.name)}")
print(f"Off of helpers: {helpers.strings.extract_lower(variables.name)}")
Once again, let’s run our script to see that this code works.
$ python3.7 main.py
Lowercase letters (from strings): ['e', 'i', 't', 'h', 'h', 'o', 'm', 'p', 's', 'o', 'n']
Uppercase letters (from package): ['K', 'T']
Off of helpers: ['e', 'i', 't', 'h', 'h', 'o', 'm', 'p', 's', 'o', 'n']
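The same idea can be tried without touching our using_modules directory. This sketch builds a throwaway package in a temporary directory (init_demo_package is a hypothetical name) whose __init__.py uses a relative import to re-export a function from its strings submodule.

```python
import importlib
import sys
import tempfile
from pathlib import Path

PACKAGE_NAME = "init_demo_package"  # hypothetical name for this sketch

with tempfile.TemporaryDirectory() as tmp_dir:
    package_dir = Path(tmp_dir, PACKAGE_NAME)
    package_dir.mkdir()
    # __init__.py re-exports a function from the submodule via a relative import
    (package_dir / "__init__.py").write_text(
        "from .strings import extract_upper\n"
    )
    (package_dir / "strings.py").write_text(
        "def extract_upper(phrase):\n"
        "    return list(filter(str.isupper, phrase))\n"
    )
    sys.path.insert(0, tmp_dir)
    importlib.invalidate_caches()
    try:
        package = importlib.import_module(PACKAGE_NAME)
    finally:
        sys.path.remove(tmp_dir)

# __init__.py ran on import, so the function is available on the package itself
print(package.extract_upper("Keith Thompson"))                 # ['K', 'T']
print(package.strings.extract_upper is package.extract_upper)  # True
print(f"{PACKAGE_NAME}.strings" in sys.modules)                # True
```

Because __init__.py imported the submodule, importing the package alone was enough to load strings as well.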
Implicit Namespace Packages
While the PCAP syllabus doesn’t actually mention implicit namespace packages, it is worth noting that they exist. As of Python 3.3, if we’re creating a package that doesn’t need to do anything with the __init__.py, then we can skip creating the __init__.py entirely and our package will work just fine.
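As a minimal sketch, the following creates a directory that contains a module but no __init__.py (namespace_demo_package is a hypothetical name) and imports from it anyway.

```python
import importlib
import sys
import tempfile
from pathlib import Path

PACKAGE_NAME = "namespace_demo_package"  # hypothetical name for this sketch

with tempfile.TemporaryDirectory() as tmp_dir:
    package_dir = Path(tmp_dir, PACKAGE_NAME)
    package_dir.mkdir()
    # Note that no __init__.py is created here
    (package_dir / "variables.py").write_text('name = "Keith Thompson"\n')
    sys.path.insert(0, tmp_dir)
    importlib.invalidate_caches()
    try:
        package = importlib.import_module(PACKAGE_NAME)
        variables = importlib.import_module(f"{PACKAGE_NAME}.variables")
    finally:
        sys.path.remove(tmp_dir)

print(variables.name)                      # Keith Thompson
# A namespace package has a __path__ but no file of its own
print(getattr(package, "__file__", None))  # None
print(hasattr(package, "__path__"))        # True
```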
43.7. Distributing and Installing Packages
Packages are invaluable when working in Python because the community has published a plethora of useful packages that can prevent us from needing to write that code ourselves. Additionally, we can share our own code with others by setting up our packages for distribution.
Installing Packages
Before we look at how we can go about making our own packages installable, let’s cover installing a package from someone else. The primary place we’ll be installing packages from is the “Python Package Index” or “PyPI” for short.
To install packages, we’ll use pip. Let’s install one of the most popular Python packages, the requests package.
$ pip3.7 install requests
Collecting requests
Downloading https://files.pythonhosted.org/packages/51/bd/23c926cd341ea6b7dd0b2a00aba99ae0f828be89d72b2190f27c11d4b7fb/requests-2.22.0-py2.py3-none-any.whl (57kB)
|████████████████████████████████| 61kB 2.4MB/s
Collecting certifi>=2017.4.17 (from requests)
Downloading https://files.pythonhosted.org/packages/b9/63/df50cac98ea0d5b006c55a399c3bf1db9da7b5a24de7890bc9cfd5dd9e99/certifi-2019.11.28-py2.py3-none-any.whl (156kB)
|████████████████████████████████| 163kB 8.0MB/s
Collecting idna<2.9,>=2.5 (from requests)
Downloading https://files.pythonhosted.org/packages/14/2c/cd551d81dbe15200be1cf41cd03869a46fe7226e7450af7a6545bfc474c9/idna-2.8-py2.py3-none-any.whl (58kB)
|████████████████████████████████| 61kB 10.8MB/s
Collecting urllib3!=1.25.0,!=1.25.1,<1.26,>=1.21.1 (from requests)
Downloading https://files.pythonhosted.org/packages/e8/74/6e4f91745020f967d09332bb2b8b9b10090957334692eb88ea4afe91b77f/urllib3-1.25.8-py2.py3-none-any.whl (125kB)
|████████████████████████████████| 133kB 10.8MB/s
Collecting chardet<3.1.0,>=3.0.2 (from requests)
Downloading https://files.pythonhosted.org/packages/bc/a9/01ffebfb562e4274b6487b4bb1ddec7ca55ec7510b22e4c51f14098443b8/chardet-3.0.4-py2.py3-none-any.whl (133kB)
|████████████████████████████████| 143kB 12.8MB/s
Installing collected packages: certifi, idna, urllib3, chardet, requests
Successfully installed certifi-2019.11.28 chardet-3.0.4 idna-2.8 requests-2.22.0 urllib3-1.25.8
$
The requests package has some dependencies on other packages so pip will go ahead and download those dependencies. For the purposes of the PCAP exam, we just need to know how to install packages, but it is definitely worth viewing the other commands provided by pip by running pip --help.
Making a Package Installable
To make a package installable, it needs to have a file in the root of the package called setup.py. The structure of installable packages can vary, but the presence of a setup.py is constant. Let’s make our helpers package installable by adding a setup.py and configuring it using the setup function. The “Python Packaging Authority” is the working group that maintains the core projects used for Python packaging, and they provide an example project. We’re going to take the setup.py from that project as a starting point and modify it for our purposes. To begin, we do need to change our helpers directory to be the container for our installable package (which is different from a “Python package”). Let’s move things around before creating our setup.py.
$ cd ~/using_modules
$ mkdir -p helpers/src/helpers
$ mv helpers/*.py helpers/src/helpers/
Using tree on our directory structure for helpers will provide us a better way to view our directories. Note that you may have to install tree using sudo yum install tree.
$ tree helpers
helpers/
└── src
    └── helpers
        ├── __init__.py
        ├── strings.py
        └── variables.py
2 directories, 3 files
The outer helpers directory is there just to hold onto our code and isn’t actually a Python package. The inner helpers will provide the package that can be imported after the distribution of this code is installed. For our code to be installable, we still need a setup.py file, which will go in the outer helpers directory. Feel free to download it directly using the curl command or copy and paste the contents below.
$ cd helpers/
$ curl -O https://raw.githubusercontent.com/pypa/sampleproject/master/setup.py
Here’s what it will look like:
~/using_modules/helpers/setup.py
from setuptools import setup, find_packages
from os import path
here = path.abspath(path.dirname(__file__))
# Get the long description from the README file
with open(path.join(here, 'README.md'), encoding='utf-8') as f:
    long_description = f.read()
setup(
name='helpers', # Required
version='1.0.0', # Required
description='Our custom collection of helper functions and variables.', # Optional
# long_description=long_description, # Optional
# long_description_content_type='text/markdown', # Optional (the README is markdown so we want to set this)
# url='https://github.com/pypa/sampleproject', # Optional
author='Keith Thompson', # Optional
author_email='keith@linuxacademy.com', # Optional
# Classifiers help users find your project by categorizing it.
#
# For a list of valid classifiers, see https://pypi.org/classifiers/
classifiers=[ # Optional
# How mature is this project? Common values are
# 3 - Alpha
# 4 - Beta
# 5 - Production/Stable
'Development Status :: 3 - Alpha',
# Indicate who your project is intended for
'Intended Audience :: Developers',
'Topic :: Software Development :: Build Tools',
# Pick your license as you wish
'License :: OSI Approved :: MIT License',
# Specify the Python versions you support here. In particular, ensure
# that you indicate whether you support Python 2, Python 3 or both.
# These classifiers are *not* checked by 'pip install'. See instead
# 'python_requires' below.
'Programming Language :: Python :: 3',
'Programming Language :: Python :: 3.5',
'Programming Language :: Python :: 3.6',
'Programming Language :: Python :: 3.7',
'Programming Language :: Python :: 3.8',
],
keywords='helpers', # Optional
# When your source code is in a subdirectory under the project root, e.g.
# `src/`, it is necessary to specify the `package_dir` argument.
package_dir={'': 'src'}, # Optional
# You can just specify package directories manually here if your project is
# simple. Or you can use find_packages().
#
# Alternatively, if you just want to distribute a single Python file, use
# the `py_modules` argument instead as follows, which will expect a file
# called `my_module.py` to exist:
#
# py_modules=["my_module"],
#
packages=find_packages(where='src'), # Required
# Specify which Python versions you support. In contrast to the
# 'Programming Language' classifiers above, 'pip install' will check this
# and refuse to install the project if the version does not match. If you
# do not support Python 2, you can simplify this to '>=3.5' or similar, see
# https://packaging.python.org/guides/distributing-packages-using-setuptools/#python-requires
python_requires='!=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*, !=3.4.*, <4',
# This field lists other packages that your project depends on to run.
# Any package you put here will be installed by pip when your project is
# installed, so they must be valid existing projects.
#
# For an analysis of "install_requires" vs pip's requirements files see:
# https://packaging.python.org/en/latest/requirements.html
# install_requires=['peppercorn'], # Optional
# List additional groups of dependencies here (e.g. development
# dependencies). Users will be able to install these using the "extras"
# syntax, for example:
#
# $ pip install sampleproject[dev]
#
# Similar to `install_requires` above, these must be valid existing
# projects.
# extras_require={ # Optional
# 'dev': ['check-manifest'],
# 'test': ['coverage'],
# },
# If there are data files included in your packages that need to be
# installed, specify them here.
#
# If using Python 2.6 or earlier, then these have to be included in
# MANIFEST.in as well.
# package_data={ # Optional
# 'sample': ['package_data.dat'],
# },
# Although 'package_data' is the preferred approach, in some case you may
# need to place data files outside of your packages. See:
# http://docs.python.org/3.4/distutils/setupscript.html#installing-additional-files
#
# In this case, 'data_file' will be installed into '<sys.prefix>/my_data'
# data_files=[('my_data', ['data/data_file'])], # Optional
# To provide executable scripts, use entry points in preference to the
# "scripts" keyword. Entry points provide cross-platform support and allow
# `pip` to create the appropriate form of executable for the target
# platform.
#
# For example, the following would provide a command called `sample` which
# executes the function `main` from this package when invoked:
# entry_points={ # Optional
# 'console_scripts': [
# 'sample=sample:main',
# ],
# },
# List additional URLs that are relevant to your project as a dict.
#
# This field corresponds to the "Project-URL" metadata fields:
# https://packaging.python.org/specifications/core-metadata/#project-url-multiple-use
#
# Examples listed include a pattern for specifying where the package tracks
# issues, where the source is hosted, where to say thanks to the package
# maintainers, and where to support the project financially. The key is
# what's used to render the link text on PyPI.
# project_urls={ # Optional
# 'Bug Reports': 'https://github.com/pypa/sampleproject/issues',
# 'Funding': 'https://donate.pypi.org',
# 'Say Thanks!': 'http://saythanks.io/to/example',
# 'Source': 'https://github.com/pypa/sampleproject/',
# },
)
We left a lot of comments in there because they are good to read and understand, but they’re for optional fields. Some of the important and potentially confusing lines to look at are the package_dir and packages arguments. We’ve put our code into the src directory, so we’ve set these two arguments and used the find_packages function from setuptools to automatically find the packages that we’re providing when someone installs this.
Building a Distribution
Making code installable in Python means that we need to create a distribution. There are two primary types of built distributions: eggs and wheels. Wheels are the modern format, and each wheel is a single file that can be installed by pip. When we install a wheel, pip will also install any declared dependencies and unpack the code into the site-packages directory for our Python installation. For us to build a wheel distribution, we need to install the wheel package and run a command using Python and our setup.py file. Let’s install wheel first.
$ pip3.7 install --upgrade wheel
...
Setuptools provides us with multiple subcommands when we run our setup.py through the Python interpreter. Let’s take a look at those commands.
$ python3.7 setup.py --help
Traceback (most recent call last):
File "setup.py", line 7, in <module>
with open(path.join(here, 'README.md'), encoding='utf-8') as f:
FileNotFoundError: [Errno 2] No such file or directory: '/home/cloud_user/using_modules/helpers/README.md'
Our setup.py specifies that we’ll provide documentation in a README.md file, but that file doesn’t exist, so we can’t read it. We’ll cover file I/O later in the course, but for now, we just need to make sure that the file exists.
$ touch README.md
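Creating the file is the simplest fix. As an alternative, a setup.py could tolerate a missing README instead of crashing. The sketch below shows one way that might look; read_long_description is a hypothetical helper, not something from the setup.py in this lesson.

```python
import tempfile
from os import path


def read_long_description(directory, filename="README.md"):
    """Return the README text, or an empty string if the file is missing."""
    readme = path.join(directory, filename)
    try:
        with open(readme, encoding="utf-8") as f:
            return f.read()
    except FileNotFoundError:
        return ""


# Demonstrate both cases in a throwaway directory.
with tempfile.TemporaryDirectory() as here:
    missing = read_long_description(here)  # no README yet
    with open(path.join(here, "README.md"), "w", encoding="utf-8") as f:
        f.write("# helpers\n")
    present = read_long_description(here)  # README now exists

print(repr(missing), repr(present))
```

The trade-off is that a silently empty description can hide a packaging mistake, so failing loudly, as our setup.py does, is also a reasonable choice.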
Now, let’s try again.
$ python3.7 setup.py --help
Common commands: (see '--help-commands' for more)
setup.py build will build the package underneath 'build/'
setup.py install will install the package
Global options:
--verbose (-v) run verbosely (default)
--quiet (-q) run quietly (turns verbosity off)
--dry-run (-n) don't actually do anything
--help (-h) show detailed help message
--no-user-cfg ignore pydistutils.cfg in your home directory
--command-packages list of packages that provide distutils commands
Information display options (just display information, ignore any commands)
--help-commands list all available commands
--name print package name
--version (-V) print package version
--fullname print <package name>-<version>
--author print the author's name
--author-email print the author's email address
--maintainer print the maintainer's name
--maintainer-email print the maintainer's email address
--contact print the maintainer's name if known, else the author's
--contact-email print the maintainer's email address if known, else the
author's
--url print the URL for this package
--license print the license of the package
--licence alias for --license
--description print the package description
--long-description print the long package description
--platforms print the list of platforms
--classifiers print the list of classifiers
--keywords print the list of keywords
--provides print the list of packages/modules provided
--requires print the list of packages/modules required
--obsoletes print the list of packages/modules made obsolete
usage: setup.py [global_opts] cmd1 [cmd1_opts] [cmd2 [cmd2_opts] ...]
or: setup.py --help [cmd1 cmd2 ...]
or: setup.py --help-commands
or: setup.py cmd --help
This gives us a lot of output, but only the common commands are listed. Reading the first line of the output, we can see that the rest of the commands can be shown by using --help-commands instead of --help. Let’s do that.
$ python3.7 setup.py --help-commands
Standard commands:
build build everything needed to install
build_py "build" pure Python modules (copy to build directory)
build_ext build C/C++ extensions (compile/link to build directory)
build_clib build C/C++ libraries used by Python extensions
build_scripts "build" scripts (copy and fixup #! line)
clean clean up temporary files from 'build' command
install install everything from build directory
install_lib install all Python modules (extensions and pure Python)
install_headers install C/C++ header files
install_scripts install scripts (Python or otherwise)
install_data install data files
sdist create a source distribution (tarball, zip file, etc.)
register register the distribution with the Python package index
bdist create a built (binary) distribution
bdist_dumb create a "dumb" built distribution
bdist_rpm create an RPM distribution
bdist_wininst create an executable installer for MS Windows
check perform some checks on the package
upload upload binary package to PyPI
Extra commands:
bdist_wheel create a wheel distribution
alias define a shortcut to invoke one or more commands
bdist_egg create an "egg" distribution
develop install package in 'development mode'
dist_info create a .dist-info directory
easy_install Find/get/install Python packages
egg_info create a distribution's .egg-info directory
install_egg_info Install an .egg-info directory for the package
rotate delete older distributions, keeping N newest files
saveopts save supplied options to setup.cfg or other config file
setopt set an option in setup.cfg or another config file
test run unit tests after in-place build
upload_docs Upload documentation to PyPI
usage: setup.py [global_opts] cmd1 [cmd1_opts] [cmd2 [cmd2_opts] ...]
or: setup.py --help [cmd1 cmd2 ...]
or: setup.py --help-commands
or: setup.py cmd --help
There are plenty of commands in here to play with, but the one that we care about is the extra command bdist_wheel. This will build a wheel distribution that pip can install. Let’s run that now.
$ python3.7 setup.py bdist_wheel
running bdist_wheel
running build
running build_py
creating build
creating build/lib
creating build/lib/helpers
copying src/helpers/__init__.py -> build/lib/helpers
copying src/helpers/strings.py -> build/lib/helpers
copying src/helpers/variables.py -> build/lib/helpers
installing to build/bdist.linux-x86_64/wheel
running install
running install_lib
creating build/bdist.linux-x86_64
creating build/bdist.linux-x86_64/wheel
creating build/bdist.linux-x86_64/wheel/helpers
copying build/lib/helpers/__init__.py -> build/bdist.linux-x86_64/wheel/helpers
copying build/lib/helpers/strings.py -> build/bdist.linux-x86_64/wheel/helpers
copying build/lib/helpers/variables.py -> build/bdist.linux-x86_64/wheel/helpers
running install_egg_info
running egg_info
writing src/helpers.egg-info/PKG-INFO
writing dependency_links to src/helpers.egg-info/dependency_links.txt
writing top-level names to src/helpers.egg-info/top_level.txt
reading manifest file 'src/helpers.egg-info/SOURCES.txt'
writing manifest file 'src/helpers.egg-info/SOURCES.txt'
Copying src/helpers.egg-info to build/bdist.linux-x86_64/wheel/helpers-1.0.0-py3.7.egg-info
running install_scripts
creating build/bdist.linux-x86_64/wheel/helpers-1.0.0.dist-info/WHEEL
creating 'dist/helpers-1.0.0-py3-none-any.whl' and adding 'build/bdist.linux-x86_64/wheel' to it
adding 'helpers/__init__.py'
adding 'helpers/strings.py'
adding 'helpers/variables.py'
adding 'helpers-1.0.0.dist-info/METADATA'
adding 'helpers-1.0.0.dist-info/WHEEL'
adding 'helpers-1.0.0.dist-info/top_level.txt'
adding 'helpers-1.0.0.dist-info/RECORD'
removing build/bdist.linux-x86_64/wheel
We now have build and dist directories inside of the upper helpers directory. The artifact that we created is within the dist directory and ends with a .whl extension.
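A wheel is a ZIP-format archive, so we can peek inside one with the standard library. The sketch below builds a small stand-in archive so that it runs anywhere; the example names are made up, and for the real thing we would pass the path of the .whl file in our dist directory to list_wheel_contents (a hypothetical helper).

```python
import os
import tempfile
import zipfile


def list_wheel_contents(wheel_path):
    """Return the sorted names of the files stored inside a wheel."""
    with zipfile.ZipFile(wheel_path) as wheel:
        return sorted(wheel.namelist())


# Stand-in archive with made-up contents, only so the sketch is self-contained.
with tempfile.TemporaryDirectory() as tmp:
    fake_wheel = os.path.join(tmp, "example-0.0.0-py3-none-any.whl")
    with zipfile.ZipFile(fake_wheel, "w") as archive:
        archive.writestr("example/__init__.py", "")
        archive.writestr("example-0.0.0.dist-info/METADATA", "Name: example\n")
    contents = list_wheel_contents(fake_wheel)

print(contents)
```

Run against our real wheel, the listing should match the "adding ..." lines at the end of the bdist_wheel output.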
Going back to ~/using_modules, we’ll actually run into issues if we try to run main.py right now because there is no longer a helpers package local to the script. Here’s what we’ll see when we run that script:
$ cd ~/using_modules
$ python3.7 main.py
Traceback (most recent call last):
File "main.py", line 1, in <module>
from helpers.strings import extract_lower
ModuleNotFoundError: No module named 'helpers.strings'
To get around this, we’ll install our package using pip
and the wheel we built.
$ pip3.7 install helpers/dist/helpers-1.0.0-py3-none-any.whl
Processing ./helpers/dist/helpers-1.0.0-py3-none-any.whl
Installing collected packages: helpers
Successfully installed helpers-1.0.0
When we run a script or load the REPL, we can load the helpers
package and its internal modules.
$ python3.7 main.py
Lowercase letters (from strings): ['e', 'i', 't', 'h', 'h', 'o', 'm', 'p', 's', 'o', 'n']
Uppercase letters (from package): ['K', 'T']
Off of helpers: ['e', 'i', 't', 'h', 'h', 'o', 'm', 'p', 's', 'o', 'n']
Our package is installed and our script runs again without using a module local to the script. We’re not going to cover publishing a package to PyPI in this course, but the PyPA documentation also details how to do that.
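If we want to confirm where Python would load a package from, we can ask the import system directly. This is a minimal sketch using importlib.util.find_spec; locate_module is a hypothetical helper name, and json is used only because it is always available.

```python
import importlib.util


def locate_module(name):
    """Return the file a top-level module or package would be imported from, or None."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None  # nothing importable under that name
    return spec.origin


print(locate_module("json"))     # a path inside the standard library
print(locate_module("helpers"))  # a path under site-packages if the wheel is installed, otherwise None
```

After installing the wheel, the second line should point into the site-packages directory rather than into ~/using_modules.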
43.8. Docstrings, Doctests, and Shebangs
Now that we’ve created both modules and packages, we should help the potential users of our code by adding some documentation. Additionally, it’s a little cumbersome to continually pass our main.py script to the Python executable to run it, so we’re going to turn that script into an executable to make using it a little easier.
Documentation
Documenting Python Code Using Docstrings
In many languages, when we write documentation for our code, it exists in the source code only as a comment. Python is a little different because the documentation is part of the code itself and remains available at runtime. This official type of documentation is written by adding docstrings at the top of a module file, or as the first statement within functions, methods, and classes. Docstrings are conventionally triple-quoted strings (starting with """ or ''') used to write multi-line, structured documentation. To add documentation to a package, we can add a docstring to the top of the package’s __init__.py file. Let’s add some documentation to the helpers package.
~/using_modules/helpers/src/helpers/__init__.py
"""
Helpers is a package that provides easy to use helper functions
and variables.
"""
__all__ = ["extract_upper"]
from .strings import *
One of the most common misconceptions in Python is that we just created a “block comment”. That’s incorrect. We created a multi-line string, and the interpreter has to do some work to read that content. An actual comment starts with an octothorpe/hash/pound sign (#), and the interpreter completely ignores it. In the very specific case of a docstring, this string is assigned to a special attribute on the package, module, function, or class: the __doc__ attribute. To demonstrate this, we’re going to change how we installed our package so that it will pick up code changes as we write them. First, let’s uninstall the existing helpers package.
Note: Each pip executable is tied to a specific Python version. If you are not using pip3.7, you can run the pip -V command to see which Python version your pip belongs to.
$ pip3.7 uninstall -y helpers
Found existing installation: helpers 1.0.0
Uninstalling helpers-1.0.0:
Successfully uninstalled helpers-1.0.0
We can install the package’s source so that the changes we make will be available without a reinstall. This is handy in development, but not something we would have other users do.
$ cd ~/using_modules/helpers
$ pip3.7 install --editable .
Obtaining file:///home/cloud_user/using_modules/helpers
Installing collected packages: helpers
Running setup.py develop for helpers
Successfully installed helpers
To see that our documentation is accessible in code, let’s start the REPL, import our package, and access the __doc__
variable:
$ python3.7
Python 3.7.6 (default, Jan 30 2020, 15:46:02)
[GCC 4.8.5 20150623 (Red Hat 4.8.5-39)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import helpers
>>> helpers.__doc__
'\nHelpers is a package that provides easy to use helper functions\nand variables.\n'
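The difference between a docstring and a comment is easy to see on functions as well. In this small sketch, documented and commented are hypothetical functions; inspect.getdoc is the standard library helper that returns a docstring with its indentation cleaned up.

```python
import inspect


def documented():
    """
    Return the answer.

    This text is stored on the function object.
    """
    return 42


def commented():
    # This is a real comment; the interpreter discards it.
    return 42


print(inspect.getdoc(documented))  # the docstring, with indentation removed
print(commented.__doc__)           # None: a comment is not stored anywhere
```

The built-in help() function reads the same attribute, which is why docstrings show up in the REPL’s help output and comments never do.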
Since modules are just Python files, we can do this same thing to document any module we write. To document a function we will create a triple-quoted string at the top of the function body. Let’s write some documentation for extract_upper
now.
~/using_modules/helpers/src/helpers/strings.py
def extract_upper(phrase):
    """
    extract_upper takes a string and returns a list containing
    only the uppercase characters from the string

    >>> extract_upper("Hello There, BOB")
    ['H', 'T', 'B', 'O', '']
    """
    return list(filter(str.isupper, phrase))


def extract_lower(phrase):
    return list(filter(str.islower, phrase))


if __name__ == "__main__":
    print("HELLO FROM HELPERS")
We’ve now created a docstring for a function. One of the downsides of documenting code is that it is pretty easy for the documentation and the code to get out of sync with one another, and bad documentation helps no one. Thankfully, docstrings can be used by another standard library module called doctest, which allows us to add what look like Python REPL lines to our docstrings; those lines are then evaluated to verify that they produce the expected results. Let’s use the doctest module on our file to see if our documentation is accurate.
$ python3.7 -m doctest src/helpers/strings.py
**********************************************************************
File "src/helpers/strings.py", line 6, in strings.extract_upper
Failed example:
extract_upper("Hello There, BOB")
Expected:
['H', 'T', 'B', 'O', '']
Got:
['H', 'T', 'B', 'O', 'B']
**********************************************************************
1 items had failures:
1 of 1 in strings.extract_upper
***Test Failed*** 1 failures.
Our documentation is acting as an automated test and can now help us find regressions in our code and our documentation. In this case, the code works as intended, but there’s a typo in the documentation that demonstrates how the code would be used. Let’s fix that.
~/using_modules/helpers/src/helpers/strings.py
def extract_upper(phrase):
    """
    extract_upper takes a string and returns a list containing
    only the uppercase characters from the string

    >>> extract_upper("Hello There, BOB")
    ['H', 'T', 'B', 'O', 'B']
    """
    return list(filter(str.isupper, phrase))


def extract_lower(phrase):
    return list(filter(str.islower, phrase))


if __name__ == "__main__":
    print("HELLO FROM HELPERS")
If we run doctest again, we should see no output because the results match the expected outcome.
$ python3.7 -m doctest src/helpers/strings.py
$
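The same check can be run from Python code instead of the command line. The sketch below uses doctest’s parser and runner classes on a single function; run_examples and broken_example are hypothetical names, and extract_upper is repeated here only so the example is self-contained.

```python
import doctest


def extract_upper(phrase):
    """
    Return a list of the uppercase characters in phrase.

    >>> extract_upper("Hello There, BOB")
    ['H', 'T', 'B', 'O', 'B']
    """
    return list(filter(str.isupper, phrase))


def broken_example():
    """
    The expected value below is deliberately wrong.

    >>> 1 + 1
    3
    """


def run_examples(func):
    """Run the doctest examples in func's docstring; return (failed, attempted)."""
    parser = doctest.DocTestParser()
    test = parser.get_doctest(
        func.__doc__, {func.__name__: func}, func.__name__, "<sketch>", 0
    )
    runner = doctest.DocTestRunner(verbose=False)
    # Discard the failure report; we only want the counts here.
    results = runner.run(test, out=lambda text: None)
    return results.failed, results.attempted


print(run_examples(extract_upper))   # (0, 1)
print(run_examples(broken_example))  # (1, 1)
```

For everyday use, python -m doctest as shown above is simpler; the programmatic form is mostly handy when doctests need to be wired into another tool.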
Setting a Shebang for a Script
The last thing we want to do is adjust main.py so that we can run it directly. To do this, we need to do two things:
- Explicitly make it executable using chmod.
- Add a shebang to the top of the script so that the proper program will run the script.
Shebangs are useful because they allow us to write scripts in languages other than our shell’s language (bash, sh, zsh, etc.). For this to work, we need to add a reference to the executable to use at the top of the file in a special comment called a shebang. From the perspective of Python, a shebang is just a comment: it starts with # like any other comment, but is immediately followed by an exclamation point and the path of the program that should run the script. Let’s set our script to use the default python executable that is currently active in our environment.
~/using_modules/main.py
#!/usr/bin/env python
from helpers.strings import extract_lower
from helpers import variables
from helpers import *
import helpers
print(f"Lowercase letters (from strings): {extract_lower(variables.name)}")
print(f"Uppercase letters (from package): {extract_upper(variables.name)}")
print(f"Off of helpers: {helpers.strings.extract_lower(variables.name)}")
If we make the script executable and run it, we should see the usual output without needing to pass it to the Python executable.
$ chmod +x ~/using_modules/main.py
$ ~/using_modules/main.py
Lowercase letters (from strings): ['e', 'i', 't', 'h', 'h', 'o', 'm', 'p', 's', 'o', 'n']
Uppercase letters (from package): ['K', 'T']
Off of helpers: ['e', 'i', 't', 'h', 'h', 'o', 'm', 'p', 's', 'o', 'n']
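Both requirements, the execute permission and the shebang line, can also be inspected from Python. This is a minimal sketch that works on a throwaway file; make_executable and read_shebang are hypothetical helpers, and make_executable only roughly mirrors what chmod +x does.

```python
import os
import stat
import tempfile


def make_executable(script_path):
    """Add the execute bits for user, group, and others (roughly `chmod +x`)."""
    mode = os.stat(script_path).st_mode
    os.chmod(script_path, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)


def read_shebang(script_path):
    """Return the interpreter portion of the shebang line, or None if absent."""
    with open(script_path, encoding="utf-8") as f:
        first_line = f.readline().rstrip("\n")
    if first_line.startswith("#!"):
        return first_line[2:].strip()
    return None


# Try both helpers on a temporary script.
with tempfile.TemporaryDirectory() as tmp:
    script = os.path.join(tmp, "main.py")
    with open(script, "w", encoding="utf-8") as f:
        f.write("#!/usr/bin/env python\nprint('hello')\n")
    shebang = read_shebang(script)
    make_executable(script)
    user_can_execute = bool(os.stat(script).st_mode & stat.S_IXUSR)

print(shebang)           # /usr/bin/env python
print(user_can_execute)  # True on POSIX systems
```

The operating system, not Python, reads the shebang when we run the file directly, so it has to be the very first line of the script.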
Using the env command followed by the executable we’d normally use is a good approach to setting a shebang for Python, because env looks that executable up on our PATH. If we want to be explicit about the version of Python to use, then we can use the absolute path to that interpreter. Using our pyenv-installed Python 3.7.6, we would use this path:
~/using_modules/main.py
#!/home/cloud_user/.pyenv/versions/3.7.6/bin/python
from helpers.strings import extract_lower
from helpers import variables
from helpers import *
import helpers
print(f"Lowercase letters (from strings): {extract_lower(variables.name)}")
print(f"Uppercase letters (from package): {extract_upper(variables.name)}")
print(f"Off of helpers: {helpers.strings.extract_lower(variables.name)}")
If we switch our Python back to the system Python and run main.py, the script will still have access to the helpers package, which is only installed for version 3.7.6, because the shebang points directly at the 3.7.6 interpreter.
$ pyenv shell system
$ python -V
Python 2.7.5
$ ~/using_modules/main.py
Lowercase letters (from strings): ['e', 'i', 't', 'h', 'h', 'o', 'm', 'p', 's', 'o', 'n']
Uppercase letters (from package): ['K', 'T']
Off of helpers: ['e', 'i', 't', 'h', 'h', 'o', 'm', 'p', 's', 'o', 'n']
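When it is unclear which interpreter a shebang actually selected, the script can report it. This is a small sketch that could be dropped temporarily into a script such as main.py; describe_interpreter is a hypothetical helper.

```python
import sys


def describe_interpreter():
    """Return the path and version of the interpreter executing this code."""
    version = ".".join(str(part) for part in sys.version_info[:3])
    return {"executable": sys.executable, "version": version}


info = describe_interpreter()
print(f"Running under {info['executable']} (Python {info['version']})")
```

With the absolute-path shebang above, this should report the pyenv 3.7.6 interpreter even while python -V in the shell reports the system version.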