Another way to reuse code is to gather a collection of class, function, and data definitions into Python source files called modules.
The idea is to use or define units with well-defined purpose and scope, and reuse them instead of copying source lines of similar code.
A module file usually contains definitions of classes, functions, global variables, and data containers, plus some statements to initialize the module.
--
Herein we will see how to transform scripts into modules and prepare a package.
-
A script is a unit of Python code that is executed from a single file. It need not use functions.
-
A module is a unit of Python code that can be imported into other Python code files. It need not be directly executable as a script.
-
A package is a collection of Python modules that can be installed as a whole using a package manager and imported under a common name. A package need not provide executable scripts, but should provide some testing capabilities.
--
- Use a text editor to open the script
./src/syene_script.py
- Add function definitions, one for each code block marked by a comment line.
- Add a
main()
function that calls the others and reproduces the behavior of syene_script.py
- Save as
syene_module.py
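The refactoring steps above can be sketched as follows. The contents of syene_script.py are not shown here, so the function names, arguments, and numbers below are purely hypothetical; only the overall shape (one function per commented code block, plus a main() that calls them) follows the exercise.

```python
"""Hypothetical sketch of a script refactored into a module."""
import math


def shadow_angle_deg(stick_height, shadow_length):
    """Return the angle of the sun from vertical, in degrees."""
    return math.degrees(math.atan2(shadow_length, stick_height))


def circumference(distance_km, angle_deg):
    """Scale a measured arc length up to a full 360-degree circle."""
    return distance_km * 360.0 / angle_deg


def main():
    """Call the other functions in the order the script used to run them."""
    # Illustrative input values only (assumptions, not taken from the script)
    angle = shadow_angle_deg(stick_height=1.0, shadow_length=0.126)
    result = circumference(distance_km=800.0, angle_deg=angle)
    print("Estimated circumference: %.0f km" % result)
    return result


if __name__ == "__main__":
    main()
```

Importing such a file only defines the functions; running it as a script also calls main().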
--
- Modules
- Examining a module file
- Script or module?
- Creating a module
- Consensual privacy
- Dunder names
- Recommending visibility
- Re-importing a module
--
Modular programming improves readability and maintainability of your programs. In practice, programs have fewer bugs and are easier to extend and debug/troubleshoot.
The key idea is to emphasize separating the functionality of programs into independent, interchangeable modules, such that each contains everything necessary to execute only one aspect of the desired functionality.
--
Most of the functionality in Python is provided by modules
The Python Standard Library is in fact a large collection of modules:
- cross-platform implementations of common facilities
- e.g. access to the operating system, file I/O, string management, network communication.
--
Python supports modular programming at many levels.
- Containers and Functions, low-level
- Classes, low-level
- Python modules, higher-level
- modules group related data, functions, and classes together in a file such as
my_module.py
- The module file contents may be added to local scope with
import my_module
- when imported, the module
file name
is used as a namespace providing access to all contents of the file.
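A short sketch of the namespace behavior described above. Because my_module.py is only a placeholder name, the standard math module stands in for it here.

```python
import math              # names are reached through the "math" namespace
import math as m         # the same module object under a shorter alias
from math import sqrt    # copies a single name into the current scope

print(math.sqrt(16.0))       # 4.0
print(m is math)             # True: both names refer to one module object
print(sqrt is math.sqrt)     # True: the function itself is not duplicated
```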
--
- As an example, we will work with the module
random
- the standard
random
module contains mathematical functions for pseudo-random numbers
- Use a local copy
myrandom.py
of the file
random.py
, just in case.
- The file is about 750 lines long.
- examine the source file in an editor/IDE that provides syntax highlighting.
--
Find where the random
module file is installed and make a copy of it to the src
directory in the tutorial file directory.
import random
file_path = random.__file__
print(file_path)
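The copy step can also be done from Python. This is a minimal sketch that assumes a temporary directory as the destination, standing in for the tutorial's src directory; substitute the real path when following along.

```python
import os
import random
import shutil
import tempfile

source_path = random.__file__

# Assumption: a temporary directory stands in for the tutorial's src directory
target_dir = tempfile.mkdtemp()
target_path = os.path.join(target_dir, "myrandom.py")

# Copy the installed module file under the new name myrandom.py
shutil.copy(source_path, target_path)
print("Copied %s to %s" % (source_path, target_path))
```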
--
-
A module is an object whose attributes and functions are defined in a module file.
-
For instance, looking at the
myrandom
module's __file__
attribute provides the path to the module file that contains all its code.
print('The myrandom module is in the file %s' % \
myrandom.__file__)
--
To import the module myrandom
:
import src.myrandom as myrandom
Once the module myrandom
is imported from the local file myrandom.py
, we can list the "symbols" it provides using the built-in dir()
function:
for name in dir(myrandom):
    print(name)
print(len(dir(myrandom)))
--
- The module file
myrandom.py
begins with a docstring at the very top of the file (lines 1 through 37). This is the text that appears in an interactive IPython session on typing ?myrandom
.
"""Random variable generators.
integers
--------
uniform within range
# Lines deleted...
General notes on the underlying Mersenne Twister core generator:
* The period is 2**19937-1.
* It is one of the most extensively tested generators in existence.
* The random() method is implemented in C, executes in a single Python step,
and is, therefore, threadsafe.
"""
- Typing
help(myrandom)
prints out the module docstring followed by the class docstring and the function docstring for every class and function in the module.
--
- After the module docstring, the module file
myrandom.py
imports a few modules (lines 39 through 45).
- That is, modules can import other modules. It is customary to put imports of other modules near the top of a file for clarity.
from warnings import warn as _warn
from types import MethodType as _MethodType, BuiltinMethodType as _BuiltinMethodType
# ... Lines deleted
from hashlib import sha512 as _sha512
--
- After the
import
s, a few constants are defined (lines 47 through 59) and a (private, hidden) module called _random
is imported at line 66.
- Once a module is imported, its variables are accessible to the Python interpreter.
__all__ = ["Random","seed","random","uniform","randint","choice","sample",
# ... Lines deleted
"SystemRandom"]
NV_MAGICCONST = 4 * _exp(-0.5)/_sqrt(2.0)
TWOPI = 2.0*_pi
LOG4 = _log(4.0)
# ... Lines deleted
import _random
--
- The classes
myrandom.Random
and myrandom.SystemRandom
are defined in lines 68 through 635 and lines 639 through 668, respectively.
- The class
myrandom.Random
provides the important methods in this module.
class Random(_random.Random):
    """Random number generator base class used by bound module functions.
    # Lines deleted
    """
    VERSION = 3 # used by getstate/setstate
    # MANY lines deleted
## --------------- Operating System Random Source ------------------
class SystemRandom(Random):
    """Alternate random number generator using sources provided
    by the operating system (such as /dev/urandom on Unix or
    CryptGenRandom on Windows).
    Not available on all systems (see os.urandom() for details).
    """
--
- Between lines 672 and 710, there are two test functions
_test_generator
and _test
.
- A single instance
_inst
of the class random.Random is constructed
at line 718.
_inst = Random()
- Between lines 719 and 739, various functions are assigned as aliases so that, for instance, the function
myrandom.uniform
is actually the method myrandom._inst.uniform
of the object myrandom._inst
seed = _inst.seed
random = _inst.random
uniform = _inst.uniform
# ... Lines deleted
getrandbits = _inst.getrandbits
--
Newcomers to Python can be confused by the terms "script file" and "module file".
It is natural to ask, given a file containing Python code like myrandom.py
, is it a script or a module? The answer is that it is both.
There are two ways to get the functions and classes stored in myrandom.py
into a Python session.
The first method, which we have already seen, is to start a Python session (e.g., in a Jupyter/IPython notebook or a plain Python shell) and import the module into the workspace.
We have done this already and we can examine all the internal objects created in the file.
import src.myrandom as myrandom # If already imported, no change is made.
print('myrandom.LOG4 is %f.' % myrandom.LOG4)
print(type(myrandom._inst)) # Remember, myrandom._inst is an instance of class myrandom.Random
# All the functions in this module are in fact methods of this class instance
print(myrandom.uniform == myrandom._inst.uniform)
print(__name__)
--
All code in a module is executed on import
- def statements only create functions; they do not call them
- if you want a module to act as both a library and a script, write a main() function and call it inside a test of
__name__
if __name__ == "__main__":
    print("What's in a name?")
Python searches the paths contained in sys.path
to find anything you try to import
import sys
sys.path
Installing a python module can be as simple as copying the file into a path like
'/Users/jvestuto/anaconda/lib/python3.5/site-packages'
or using an install tool that does this for you.
Note: Most installs are NOT that simple! Dear Team Conda, we love you!
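To see where an import would be resolved without actually importing anything, the standard importlib.util.find_spec function can be used. This is a small sketch; the second module name is deliberately made up and assumed not to exist on your system.

```python
import importlib.util
import sys

# The first few directories Python searches, in order
for entry in sys.path[:3]:
    print(repr(entry))

# Ask the import system where "random" would be loaded from
spec = importlib.util.find_spec("random")
print(spec.origin)

# A name that cannot be found on sys.path yields None instead of an error
missing = importlib.util.find_spec("no_such_module_hopefully")
print(missing)
```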
--
The second way to get all the objects described in the file myrandom.py
into a Python session is to execute the file as a script. That is, from the command prompt of a shell, type
% python src/myrandom.py
just as we would do for running a Python script (assuming the file myrandom.py
is in the src subdirectory of the working directory).
As the interpreter parses the file, all the imports, constant definitions, class definitions, and function definitions are executed as if entered at the Python command prompt.
These lines are executed also when the module myrandom
is imported into a Python session.
--
The difference is in the very last two lines of the file myrandom.py
:
if __name__ == '__main__':
    _test()
When a Python file is imported (as a module) or executed (as a script), the Python interpreter sets a special identifier __name__
that associates any objects created with a certain namespace.
-
If the file
myrandom.py
is executed (as a script) using "python src/myrandom.py
", the variable __name__
is set to '__main__'
.
-
If the file
myrandom.py
is imported (as a module) using "import src.myrandom
", the variable __name__
is set to the module's import name, 'src.myrandom'
(it would be 'myrandom' if the file were imported directly with "import myrandom").
--
Thus, the last two lines of the file myrandom.py
execute only when the file is run as a script, in which case they run the test code in the function _test()
.
When imported as a module, the test "__name__=='__main__'
" fails at the top of the if
block and the _test()
function is not executed.
# This shell command executes myrandom.py and hence runs the _test() function.
!python src/myrandom.py
More generally, importing a module from a file module.py
assigns __name__=
'module'
. The idiom
if __name__ == "__main__":
    # Block of code to execute
    # when module runs as script
    pass
is widely used in module files to provide tests of the module's functions and classes.
With this if
block in place, the Python interpreter ignores the block of, say, test code when the module file is imported as a module, but runs the tests when the module file is executed as a script.
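Both behaviors can be observed side by side. The sketch below writes a tiny throwaway module (the name whoami_demo is hypothetical) to a temporary directory, then imports it and also runs it as a script in a child interpreter.

```python
import importlib
import os
import subprocess
import sys
import tempfile

# Write a small, hypothetical module to a temporary directory
module_dir = tempfile.mkdtemp()
module_path = os.path.join(module_dir, "whoami_demo.py")
with open(module_path, "w") as handle:
    handle.write(
        "def describe():\n"
        "    return '__name__ is %s' % __name__\n"
        "\n"
        "if __name__ == '__main__':\n"
        "    print(describe())\n"
    )

# Case 1: import it as a module; the if-block is skipped
sys.path.insert(0, module_dir)
whoami_demo = importlib.import_module("whoami_demo")
imported_view = whoami_demo.describe()
print(imported_view)    # __name__ is whoami_demo

# Case 2: run the same file as a script; the if-block executes
completed = subprocess.run(
    [sys.executable, module_path],
    stdout=subprocess.PIPE,
    universal_newlines=True,
    check=True,
)
script_view = completed.stdout.strip()
print(script_view)      # __name__ is __main__
```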
--
Having examined an actual module from the Python library, let's create our own module as an example.
- Use a text editor to create a file
BankAccount.py
and add to it the following code block:
'''BankAccount: This is the module docstring.'''
class BankAccount:
    def __init__(self, account_ID, first_name, last_name, initial_balance):
        self._account_ID = account_ID
        self._first_name = first_name
        self._last_name = last_name
        self._balance = initial_balance
    def deposit(self, amount):
        '''BankAccount.deposit(amount) increases balance by amount'''
        try:
            if amount <= 0:
                raise ValueError('Expect positive amount!')
            self._balance += amount
        except Exception as e:
            # Report the problem and leave the balance unchanged
            print(repr(e))
    def withdraw(self, amount):
        '''BankAccount.withdraw(amount) decreases balance by amount'''
        try:
            if amount <= 0:
                raise ValueError('Expect positive amount!')
            self._balance -= amount
        except Exception as e:
            # Report the problem and leave the balance unchanged
            print(repr(e))
    def account_status(self):
        out_string = "%s %s\tID: %s\tBalance: $%.2f" % \
            (self._first_name, self._last_name, self._account_ID, self._balance)
        print(out_string)
- Execute the file as if it were a script
- Import the file as a module and call one of the methods of the class
--
Before importing this module, let's append a test function and an if
block as follows:
if __name__=='__main__':
    print('BankAccount.py: executed as a script, running tests')
    _test_account()
    print('BankAccount.py: executed as a script, all tests passed')
else:
    print('BankAccount.py: imported as a module')
- Now add the test definition and place two
print
statements immediately before the if
block, so those statements will always execute (i.e., regardless of whether the file is executed or imported).
def _test_account():
    # Construct an account
    sophie_account = BankAccount('987654321', 'Sophie', 'Germaine', 1000.00)
    sophie_account.withdraw(150.00)
    # An assert statement passes silently if true, or raises AssertionError
    assert sophie_account._balance == 850.00, 'Error in withdrawal function'
    sophie_account.deposit(375.00)
    assert sophie_account._balance == 1225.00, 'Error in deposit function'
print('All classes and functions defined in module BankAccount')
print('__name__ == %s' % __name__)
if __name__=='__main__':
    print('BankAccount.py: executed as a script, running tests')
    _test_account()
    print('BankAccount.py: executed as a script, all tests passed')
else:
    print('BankAccount.py: imported as a module')
Finally, test the module by importing it and by running it as a script:
import src.BankAccount
$ python src/BankAccount.py
--
There is a common saying in the Python community: "We're all adults here."
This means that Python enforces very few actual restrictions on how other users use your code;
instead, Python has conventions about the intended use of code based on the names given to objects.
Many of these conventions are described in PEP 8, which is generally an excellent document to study and internalize.
Some naming conventions concern the type of object being named.
For example, we typically use names following this pattern:
CONSTANT_VAL = 7.5
class CamelCase(object): ...
def lower_with_under(args): ...
class JustAintRightError(ValueError): ...
--
A special purpose is indicated by names that have leading (and possibly trailing) underscores.
Names that have both two leading and two trailing underscores are "magic" in the sense that a number of them are used internally by the interpreter to enable syntactic sugar or special behaviors.
These are often called by the nickname "dunder" names (or dunder methods).
Some of these magic names operate at module scope, but most are methods of classes.
For example:
__all__ = ['names', 'to', 'provide', 'externally']
if __name__ == '__main__':
    "Code to run when used as script"
class MyThing(object):
    def __init__(self, more, args):
        "Things to do when creating an instance"
    def __getitem__(self, key):
        "How to respond to square brackets: MyThing()[something]"
        return "A value"
--
Generally you will not create your own new dunder names, unless you are designing a framework or a low-level package.
However, you should take good advantage of names that lead with a single or double underscore.
Names that begin with a single underscore implicitly make no promise of a consistent API (or continued existence) across different versions of the code.
I.e., you should try not to rely on the functionality provided by these names, but only on names that start with a letter.
The use of a leading double underscore states this non-promise even more strongly, and in the case of classes makes the name slightly harder to access at all.
--
Let us look at a few examples, first a very simple module. Notice that this module does not use the special list __all__
to override default import behavior.
% cat simple.py
public = 5
_private = 6
__secret = 7
When we import this module:
>>> from simple import * # Only import "public" names
>>> dir()
['__builtins__', '__doc__', '__loader__', '__name__', '__package__', '__spec__', 'public']
>>> from simple import _private, __secret
>>> _private, __secret # We're all adults here
(6, 7)
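For contrast, here is a minimal sketch of what changes when a module does define __all__. The module name simple_all_demo and its contents are hypothetical; the file is written to a temporary directory so the example is self-contained.

```python
import os
import sys
import tempfile

# Write a hypothetical module that lists its star-import names explicitly
module_dir = tempfile.mkdtemp()
with open(os.path.join(module_dir, "simple_all_demo.py"), "w") as handle:
    handle.write(
        "__all__ = ['public', '_private']\n"
        "public = 5\n"
        "_private = 6\n"
        "other = 8\n"
    )
sys.path.insert(0, module_dir)

# Run the star import in a fresh namespace so the result is easy to inspect
namespace = {}
exec("from simple_all_demo import *", namespace)
imported = sorted(name for name in namespace if name != "__builtins__")
print(imported)    # ['_private', 'public']
```

With __all__ present, the star import provides exactly the listed names: the underscore-prefixed _private is included, and the unlisted name other is left out.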
Classes do slightly more in terms of "enforcing" privacy
class Visibility(object):
    public = 5
    _private = 6
    __secret = 7
visibility = Visibility()
# The official API of the class
visibility.public
# Part of the "private" implementation of the class
visibility._private
# A name-mangled attribute that is slightly harder to access
visibility._Visibility__secret
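A short sketch of what "slightly harder to access" means in practice. The reveal() method is an addition for illustration and is not part of the class above.

```python
class Visibility(object):
    public = 5
    _private = 6
    __secret = 7

    def reveal(self):
        # Inside the class body, __secret is rewritten to _Visibility__secret
        return self.__secret


visibility = Visibility()

# Outside the class, the unmangled name does not exist on the object
try:
    visibility.__secret
except AttributeError:
    outside_access = "AttributeError"
else:
    outside_access = "ok"

print(outside_access)                    # AttributeError
print(visibility.reveal())               # 7
print(visibility._Visibility__secret)    # 7: still reachable, we're all adults
```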
One thing to be aware of while developing modules: for efficiency reasons, each module is only imported once per interpreter session.
Thus, typing import module
imports the objects defined in module.py
only the first time.
If the file module.py
is modified, to import the modified module, we must either restart the Python interpreter (thereby losing all data in the current session) or use the reload
function from the importlib
module (the older imp
module is deprecated and was removed in Python 3.12), i.e.,
import importlib
importlib.reload(module)
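The import-once behavior can be demonstrated with a throwaway module. This sketch uses importlib.reload from the standard library; the module name reload_demo is hypothetical, and bytecode caching is switched off only so that a quick rewrite of the file is always picked up.

```python
import importlib
import os
import sys
import tempfile

# Assumption for this demo: skip .pyc files so rapid edits are never masked
sys.dont_write_bytecode = True

module_dir = tempfile.mkdtemp()
module_path = os.path.join(module_dir, "reload_demo.py")


def write_module(value):
    """(Re)write the hypothetical module with a new constant."""
    with open(module_path, "w") as handle:
        handle.write("ANSWER = %r\n" % value)


write_module(1)
sys.path.insert(0, module_dir)
importlib.invalidate_caches()

import reload_demo
first = reload_demo.ANSWER

# Edit the file, then import again: the cached module object is reused
write_module(4200)
import reload_demo
stale = reload_demo.ANSWER

# An explicit reload re-executes the modified file
importlib.reload(reload_demo)
fresh = reload_demo.ANSWER

print(first, stale, fresh)    # 1 1 4200
```

Note that reload updates the existing module object in place; names copied earlier with "from module import name" keep their old values.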
- There are more details to constructing packages (with modules nested within modules).
- The rules are all available in the Python Official Documentation on modules.
- There is a description of how to create (sub-)package namespaces at https://docs.python.org/3.4/tutorial/modules.html#packages.