Skip to content

Commit

Permalink
re-order executable types
Browse files Browse the repository at this point in the history
  • Loading branch information
ucodery authored and lwasser committed Nov 21, 2024
1 parent bd7aa96 commit 82e0789
Show file tree
Hide file tree
Showing 2 changed files with 172 additions and 47 deletions.
117 changes: 75 additions & 42 deletions python-packaging/execute-package.md
Original file line number Diff line number Diff line change
Expand Up @@ -13,66 +13,99 @@ kernelspec:

# Execute a python package

In [Code Workflow Logic][Code Workflow Logic] you learned of the two primary ways to execute a stand-alone Python script.
In [Execute a Python script][#Execute_a_Python_script] you learned of the two primary ways to execute a stand-alone Python script.
There are two other ways to execute Python as commands, both of which work for code that has been formatted as a package.

### Entrypoints
## Executable modules

There is a special `entrypoint` a package can specify in its configuration which will direct installers to create an
executable command. Entrypoints are a general purpose plug-in system for Python packages, but the
[`console_scripts`](https://packaging.python.org/en/latest/specifications/entry-points/#use-for-scripts)
entry is specifically targeted at creating executable commands on systems that install the package.
We have seen how The `python` command can be passed a file for execution, but it can alternatively be passed
the name of a module, exactly as would be used after an `import`. In this case, Python will look up the module
referenced in its installed packages, and when it finds the module, will execute it as a script.

The target of a `scripts` definition should be one function within your package, which will be directly executed
when the command is invoked in the shell. A `scripts` definition in your `pyproject.toml` looks like:
This execution mode is performed with the `-m` flag, as in `python -m site`. It can be used in place of a file
path, but cannot be used in combination with a path, as there can only be one executing module.

```toml
[project.scripts]
COMMAND = "my_package.my_module:my_function"
:::{tip}
These commands both do the same thing, but one is much more portable, and easier to remember

```bash
python ./.venv/lib/python3.12/site-packages/pip/__main__.py
```

```bash
python -m pip
```
:::

where `COMMAND` is the name of the command that will be made available after installation, `my_package` is the name of
your top-level package import, `my_module` is the name of any sub-modules in your package (optional, or may be
repeated as necessary to access the correct sub-module), and `my_function` is the function that will be called
(without parameters) when the command is invoked.
On your own or in small groups:

Scripts defined in project configuration, such as `pyproject.toml`, do not need to exist as independent files in
the package repository, but will be created by installation tools, such as `pip`, at the time the package is
installed, in a manner customized to the current operating system.
Install the `my_program.py` module from the last lesson, and then try to get the same greeting as before using `-m`.

### Executable modules

The final way to make Python code executable directly from the command line is to include a
[`__main__` module](https://docs.python.org/3/library/__main__.html#module-__main__) in your package. Any package that
contains a `__main__` module and is installed in the current Python environment can be execute as a module
directly from the `python` command, without reference to any specific files.
### Executable packages

The `-m` flag as described above only works for Python modules (files), but does not work for Python (sub-)packages (directories). This means that we cannot execute a command using only the name of our package when it is structured to use directories

Once our package grows, the top-level name `my_program` turns into a directory
```
python -m my_package
project/
└── src/
└── my_program/
├── __init__.py
└── greeting.py
```

Which can't be executed
```bash
python -m my_program
python: No module named my_program.__main__; 'my_program' is a package and cannot be directly executed
```

Try to create a `__main__.py` module in your package that will execute with the above command. (don't forget to
Initially Python seems to be telling us that names of directories, including out top-level package name,
cannot be directly executes. But actually there is another lead in the error message that gives us the hint to make it work.

Earlier you learned that the `if __name__ == "__main__":` can protect parts of your script from executing
when it is imported, making that conditional only change the file's behavior as a script. There is a very
similar concept that can be used on whole packages.

Any package that contains a [`__main__.py` module](https://docs.python.org/3/library/__main__.html#module-__main__)
can be executed directly from the `python` command, without reference to any specific module files.

:::{note}
The `__main__.py` file typically doesn't have an `if __name__ == "__main__":` conditional in it, as its execution
is already separated out from the rest of the package.
:::

Try to create a `__main__.py` module in your package that will execute with the `python -m my_program`. (don't forget to
(re)install your package after creating this file!)

#### Further exploration
## Entrypoints

On your own or in small groups:
The final way to make Python code executable directly from the command line is to include a special entrypoint
into the package metadata. Entrypoints are a general purpose plug-in system for Python packages, but the
[`console_scripts`](https://packaging.python.org/en/latest/specifications/entry-points/#use-for-scripts)
entry is specifically targeted at creating executable commands on systems that install the package.

- What might be the advantages of making a packaged executable over providing script entrypoints?
- What are some disadvantages?
- Review the Pros section from [Executing Scripts][Executing Scrips]
- Any similarities between executable packages and executable scripts?
In `pyproject.toml` this specific entrypoint is configured as such

```toml
[project.scripts]
shiny = "my_program.greetings:shiny_hello"
```

#### More about main
In the above example `shiny` is the name of the command that will be made available after installation, `my_program` is the name of
your top-level package import, `greetings` is the name of the sub-package (optional, or may be
repeated as necessary to access the correct sub-package), and `shiny_hello` is the function that will be called.

You just learned that the `__main__` module allows a package to be executed directly from the command line with
`python -m`, but there is another purpose to the `__main__` name in Python. Any Python script that is executed
directly, by any of the methods you have learned to run Python code from the shell, will be given the name `__main__`
which identifies it as the first Python module loaded. This leads to the convention `if __name__ == "__main__":`, which
you may have seen used previously.
The target of each `scripts` definition should always be one function within your package, which will be directly executed (without parameters)
when the command is invoked in the shell. The target function can live anywhere; it does not have to be in a `__main__.py` or under a `if __name__ == "__main__":`.

This conditional is often used at the bottom of modules, especially modules that
are expected to be executed directly, to separate code that is intended to execute as part of a command from code that
is intended to execute as part of an import.
## Further exploration

Try to create a single Python script that contains a `if __name__ == "__main__":` which makes the file print different
messages when it is executed from when it is imported from other Python code.
On your own or in small groups:

- What might be the advantages of making a package executable over providing a script entrypoint?
- What are some disadvantages?
- Review the Pros section from [Executable _comparisons][Executable_comparisons]
- Any similarities between executable packages and executable scripts?
- Any similarities between scripts and executable scripts?
102 changes: 97 additions & 5 deletions python-packaging/execute-script.md
Original file line number Diff line number Diff line change
Expand Up @@ -15,7 +15,7 @@ kernelspec:

There are two primary ways to execute a Python script.

You are may already be familiar with the `python` command, and that it can take the name of a Python file and execute it
You may already be familiar with the `python` command, and that it can take the name of a Python file and execute it

```bash
python my_program.py
Expand All @@ -25,9 +25,24 @@ When Python reads a file in this way, it executes all of the "top-level" command
This is similar, but not identical, to the behavior of copying this file and pasting it line-by-line into an interactive
Python shell (or notebook cell).

The other way a Python script may be executed is to associate the file with a launch command.
```python
def report_error():
print("An error has occured")

print("\N{Sparkles} Hello from Python \N{Sparkles}")
```

Note that only one line is printed when this script is run

```bash
my_program.py
# ✨ Hello from Python ✨
```

### Non-Windows executables
The other way a Python script may be executed is to associate the file with a launch command. The way in which
this association is done depends on what operating system you are running.

## Non-Windows executables

On Linux or Mac systems, the Python file can itself be turned into a command. By adding a [shebang](https://en.wikipedia.org/wiki/Shebang_(Unix))
as the first line in any Python file, and by giving the file [executable permissions](https://docs.python.org/3/using/unix.html#miscellaneous) the
Expand Down Expand Up @@ -57,7 +72,7 @@ Windows is not natively POSIX compliant. However, some "modes" inside of Windows
(Windows Subsystem for Linux), gitbash, or some VSCode terminals.
:::

### Windows executables
## Windows executables

If your Windows machine has Python registered as the default application associated with `.py` files, then any Python
scripts can be run as commands. However, only one Python can be registered at a time, so all Python scripts run this
Expand All @@ -77,7 +92,7 @@ py my_program.py
While all Python files should end in a `.py`, this naming is necessary for Windows to associate a script with Python, as opposed
to Linux where `.py` is a convention and the shebang associates the file with Python.

While there is no in-source format that can tell Windows what to do with a Python code file, executing a
Also, although there is no in-source format that can tell Windows what to do with a Python file, executing a
Python file with a shebang on Windows also does not cause any issues. Python just sees the whole line as
a comment and ignores it!

Expand All @@ -96,3 +111,80 @@ Because of these differences it is best practice to use both a shebang and `.py`
- don't have to remember which
- don't have to use the `python` command
- don't have to even remember it is a Python script


## Separating script from import behavior

Sometimes a Python file that is useful to execute is also useful to import. You may want to use `shiny_hello`
in another Python file. But right now, the `my_program.py` does all its script behavior even when it is imported. Consider

```python
import my_program

def guess_my_number():
my_program.shiny_hello()
print("Was your number 42?")

guess_my_number()
# ✨ Hello from Python ✨
# ✨ Hello from Python ✨
# Was your number 42?
```

You may not have expected it to print the hello twice, but it did. This is because `my_program` is set to
_always_ call `shiny_hello`, and now `guess_my_number` also calls it. That's two times. How can we make
`my_program` only call `shiny_hello` when it is used as a script?

You may have already seen the answer, without realizing what it was doing. `my_program` needs a conditional that checks if is is in "script mode" or "import mode" and that conditional is `if __name__ == "__main__":`.

This conditional is often used at the bottom of modules, especially modules that are expected to be executed
directly, to separate code that is intended to execute as part of a command from code that is intended to
execute as part of an import.

```python
#!/usr/bin/env python
# The above line is a shebang, and can take the place of typing python on the command line
# This comment is below, because shebangs must be the first line!

def shiny_hello():
print("\N{Sparkles} Hello from Python \N{Sparkles}")


if __name__ == "__main__":
shiny_hello()
```

```bash
my_program.py
# ✨ Hello from Python ✨
```

```python
import my_program

def guess_my_number():
my_program.shiny_hello()
print("Was your number 42?")

guess_my_number()
# ✨ Hello from Python ✨
# Was your number 42?
```

:::{note}Why did that work?

All Python modules (individual files) have a `__name__` attribute, which is usually the same as the name used to import the module.

```python
import os
print(os.__name__)
# 'os'
```

This attribute is available within a module by using a global `__name__`. So in the `os.py` module, `__name__`
also gives the value `'os'`.

Importantly, this name is changed for the *first user-module* executed by Python. When you pass a file to
`python`, that is the first user-module executed. For this module, and only when it is the first, the `__name__`
is changed to the string `'__main__'`. This answers the question for every module used in a Python program, "am I the main module?".
:::

0 comments on commit 82e0789

Please sign in to comment.