re-order executable types

lwasser · Nov 21, 2024 · 82e0789
1 parent bd7aa96
commit 82e0789

Showing 2 changed files with 172 additions and 47 deletions.
diff --git a/python-packaging/execute-package.md b/python-packaging/execute-package.md
@@ -13,66 +13,99 @@ kernelspec:
 
 # Execute a python package
 
-In [Code Workflow Logic][Code Workflow Logic] you learned of the two primary ways to execute a stand-alone Python script.
+In [Execute a Python script][#Execute_a_Python_script] you learned of the two primary ways to execute a stand-alone Python script.
 There are two other ways to execute Python as commands, both of which work for code that has been formatted as a package.
 
-### Entrypoints
+## Executable modules
 
-There is a special `entrypoint` a package can specify in its configuration which will direct installers to create an
-executable command. Entrypoints are a general purpose plug-in system for Python packages, but the
-[`console_scripts`](https://packaging.python.org/en/latest/specifications/entry-points/#use-for-scripts)
-entry is specifically targeted at creating executable commands on systems that install the package.
+We have seen how the `python` command can be passed a file for execution, but it can alternatively be passed
+the name of a module, exactly as would be used after an `import`. In this case, Python will look up the module
+referenced in its installed packages, and when it finds the module, will execute it as a script.
 
-The target of a `scripts` definition should be one function within your package, which will be directly executed
-when the command is invoked in the shell. A `scripts` definition in your `pyproject.toml` looks like:
+This execution mode is performed with the `-m` flag, as in `python -m site`. It can be used in place of a file
+path, but cannot be used in combination with a path, as there can only be one executing module.
 
-```toml
-[project.scripts]
-COMMAND = "my_package.my_module:my_function"
+:::{tip}
+These commands both do the same thing, but the second is much more portable and easier to remember:
+
+```bash
+python ./.venv/lib/python3.12/site-packages/pip/__main__.py
+```
+
+```bash
+python -m pip
 ```
+:::
 
-where `COMMAND` is the name of the command that will be made available after installation, `my_package` is the name of
-your top-level package import, `my_module` is the name of any sub-modules in your package (optional, or may be
-repeated as necessary to access the correct sub-module), and `my_function` is the function that will be called
-(without parameters) when the command is invoked.
+On your own or in small groups:
 
-Scripts defined in project configuration, such as `pyproject.toml`, do not need to exist as independent files in
-the package repository, but will be created by installation tools, such as `pip`, at the time the package is
-installed, in a manner customized to the current operating system.
+Install the `my_program.py` module from the last lesson, and then try to get the same greeting as before using `-m`.
 
-### Executable modules
 
-The final way to make Python code executable directly from the command line is to include a
-[`__main__` module](https://docs.python.org/3/library/__main__.html#module-__main__) in your package. Any package that
-contains a `__main__` module and is installed in the current Python environment can be execute as a module
-directly from the `python` command, without reference to any specific files.
+### Executable packages
+
+The `-m` flag as described above works for Python modules (files), but on its own it does not work for Python (sub-)packages (directories). This means that we cannot execute a command using only the name of our package once it is structured as a directory.
+
+Once our package grows, the top-level name `my_program` turns into a directory:
 ```
-python -m my_package
+project/
+└── src/
+    └── my_program/
+        ├── __init__.py
+        └── greeting.py
+```
+
+This package can't be executed directly:
+```bash
+python -m my_program
+python: No module named my_program.__main__; 'my_program' is a package and cannot be directly executed
 ```
 
-Try to create a `__main__.py` module in your package that will execute with the above command. (don't forget to
+Initially Python seems to be telling us that names of directories, including our top-level package name,
+cannot be directly executed. But the error message contains another lead that hints at how to make it work: Python looked for `my_program.__main__`.
+
+Earlier you learned that the `if __name__ == "__main__":` conditional can protect parts of your script from executing
+when it is imported, so that the guarded code only affects the file's behavior as a script. There is a very
+similar concept that can be used on whole packages.
+
+Any package that contains a [`__main__.py` module](https://docs.python.org/3/library/__main__.html#module-__main__)
+can be executed directly from the `python` command, without reference to any specific module files.
+
+:::{note}
+The `__main__.py` file typically doesn't have an `if __name__ == "__main__":` conditional in it, as its execution
+is already separated out from the rest of the package.
+:::
+
+Try to create a `__main__.py` module in your package that will execute with `python -m my_program`. (Don't forget to
 (re)install your package after creating this file!)
 
-#### Further exploration
+## Entrypoints
 
-On your own or in small groups:
+The final way to make Python code executable directly from the command line is to include a special entrypoint
+in the package metadata. Entrypoints are a general purpose plug-in system for Python packages, but the
+[`console_scripts`](https://packaging.python.org/en/latest/specifications/entry-points/#use-for-scripts)
+entry is specifically targeted at creating executable commands on systems that install the package.
 
-- What might be the advantages of making a packaged executable over providing script entrypoints?
-- What are some disadvantages?
-- Review the Pros section from [Executing Scripts][Executing Scrips]
-  - Any similarities between executable packages and executable scripts?
+In `pyproject.toml` this specific entrypoint is configured as follows:
+
+```toml
+[project.scripts]
+shiny = "my_program.greetings:shiny_hello"
+```
 
-#### More about main
+In the above example `shiny` is the name of the command that will be made available after installation, `my_program` is the name of
+your top-level package import, `greetings` is the name of the sub-module (optional, or may be
+repeated as necessary to access the correct sub-module), and `shiny_hello` is the function that will be called.
 
-You just learned that the `__main__` module allows a package to be executed directly from the command line with
-`python -m`, but there is another purpose to the `__main__` name in Python. Any Python script that is executed
-directly, by any of the methods you have learned to run Python code from the shell, will be given the name `__main__`
-which identifies it as the first Python module loaded. This leads to the convention `if __name__ == "__main__":`, which 
-you may have seen used previously.
+The target of each `scripts` definition should always be one function within your package, which will be directly executed (without parameters)
+when the command is invoked in the shell. The target function can live anywhere in the package; it does not have to be in a `__main__.py` or under an `if __name__ == "__main__":` conditional.
 
-This conditional is often used at the bottom of modules, especially modules that
-are expected to be executed directly, to separate code that is intended to execute as part of a command from code that
-is intended to execute as part of an import.
+## Further exploration
 
-Try to create a single Python script that contains a `if __name__ == "__main__":` which makes the file print different
-messages when it is executed from when it is imported from other Python code.
+On your own or in small groups:
+
+- What might be the advantages of making a package executable over providing a script entrypoint?
+- What are some disadvantages?
+- Review the Pros section from [Executable comparisons][Executable_comparisons]
+  - Any similarities between executable packages and executable scripts?
+  - Any similarities between scripts and executable scripts?
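
Note on the `__main__.py` exercise in `execute-package.md`: the behavior can be checked without installing anything. The following is a minimal sketch, not the lesson's own solution. It builds a throwaway `my_program` package in a temporary directory and runs `python -m my_program` with and without a `__main__.py`. The file contents and the helper name `run_package` are assumptions made for illustration; only the directory layout and the names `my_program`, `greeting.py`, and `shiny_hello` come from the lesson.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# Hypothetical file contents; only the layout and names come from the lesson.
GREETING = 'def shiny_hello():\n    print("Hello from my_program")\n'
MAIN = "from my_program.greeting import shiny_hello\n\nshiny_hello()\n"


def run_package(with_main: bool) -> subprocess.CompletedProcess:
    """Build a throwaway my_program package and run `python -m my_program`."""
    with tempfile.TemporaryDirectory() as tmp:
        pkg = Path(tmp) / "my_program"
        pkg.mkdir()
        (pkg / "__init__.py").write_text("")
        (pkg / "greeting.py").write_text(GREETING)
        if with_main:
            (pkg / "__main__.py").write_text(MAIN)
        # Running from tmp lets `-m` find the package without installing it.
        return subprocess.run(
            [sys.executable, "-m", "my_program"],
            cwd=tmp,
            capture_output=True,
            text=True,
        )


without_main = run_package(with_main=False)
with_main = run_package(with_main=True)

print("without __main__.py, exit code:", without_main.returncode)
print("with __main__.py, output:", with_main.stdout.strip())
```

Without `__main__.py` the run fails with the "cannot be directly executed" error quoted in the lesson; with it, the greeting is printed. Note that the sketch's `__main__.py` has no `if __name__ == "__main__":` guard, in line with the note in the lesson.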
diff --git a/python-packaging/execute-script.md b/python-packaging/execute-script.md
@@ -15,7 +15,7 @@ kernelspec:
 
 There are two primary ways to execute a Python script.
 
-You are may already be familiar with the `python` command, and that it can take the name of a Python file and execute it
+You may already be familiar with the `python` command, and that it can take the name of a Python file and execute it
 
 ```bash
 python my_program.py
@@ -25,9 +25,24 @@ When Python reads a file in this way, it executes all of the "top-level" command
 This is similar, but not identical, to the behavior of copying this file and pasting it line-by-line into an interactive
 Python shell (or notebook cell).
 
-The other way a Python script may be executed is to associate the file with a launch command.
+```python
+def report_error():
+    print("An error has occurred")
+
+print("\N{Sparkles} Hello from Python \N{Sparkles}")
+```
+
+Note that only one line is printed when this script is run, because `report_error` is defined but never called:
+
+```bash
+python my_program.py
+# ✨ Hello from Python ✨
+```
 
-### Non-Windows executables
+The other way a Python script may be executed is to associate the file with a launch command. The way in which
+this association is done depends on what operating system you are running.
+
+## Non-Windows executables
 
 On Linux or Mac systems, the Python file can itself be turned into a command. By adding a [shebang](https://en.wikipedia.org/wiki/Shebang_(Unix))
 as the first line in any Python file, and by giving the file [executable permissions](https://docs.python.org/3/using/unix.html#miscellaneous) the
@@ -57,7 +72,7 @@ Windows is not natively POSIX compliant. However, some "modes" inside of Windows
 (Windows Subsystem for Linux), gitbash, or some VSCode terminals.
 :::
 
-### Windows executables
+## Windows executables
 
 If your Windows machine has Python registered as the default application associated with `.py` files, then any Python
 scripts can be run as commands. However, only one Python can be registered at a time, so all Python scripts run this
@@ -77,7 +92,7 @@ py my_program.py
 While all Python files should end in a `.py`, this naming is necessary for Windows to associate a script with Python, as opposed
 to Linux where `.py` is a convention and the shebang associates the file with Python.
 
-While there is no in-source format that can tell Windows what to do with a Python code file, executing a
+Also, although there is no in-source format that can tell Windows what to do with a Python file, executing a
 Python file with a shebang on Windows also does not cause any issues. Python just sees the whole line as
 a comment and ignores it!
 
@@ -96,3 +111,80 @@ Because of these differences it is best practice to use both a shebang and `.py`
   - don't have to remember which
 - don't have to use the `python` command
   - don't have to even remember it is a Python script
+
+
+## Separating script from import behavior
+
+Sometimes a Python file that is useful to execute is also useful to import. You may want to use `shiny_hello`
+in another Python file. But right now, `my_program.py` performs all of its script behavior even when it is imported. Consider:
+
+```python
+import my_program
+
+def guess_my_number():
+    my_program.shiny_hello()
+    print("Was your number 42?")
+
+guess_my_number()
+# ✨ Hello from Python ✨
+# ✨ Hello from Python ✨
+# Was your number 42?
+```
+
+You may not have expected it to print the hello twice, but it did. This is because `my_program` is set to
+_always_ call `shiny_hello`, and now `guess_my_number` also calls it. That's two times. How can we make
+`my_program` only call `shiny_hello` when it is used as a script?
+
+You may have already seen the answer, without realizing what it was doing. `my_program` needs a conditional that checks if it is in "script mode" or "import mode", and that conditional is `if __name__ == "__main__":`.
+
+This conditional is often used at the bottom of modules, especially modules that are expected to be executed
+directly, to separate code that is intended to execute as part of a command from code that is intended to
+execute as part of an import.
+
+```python
+#!/usr/bin/env python
+# The above line is a shebang, and can take the place of typing python on the command line
+# This comment is below, because shebangs must be the first line!
+
+def shiny_hello():
+    print("\N{Sparkles} Hello from Python \N{Sparkles}")
+
+
+if __name__ == "__main__":
+    shiny_hello()
+```
+
+```bash
+my_program.py
+# ✨ Hello from Python ✨
+```
+
+```python
+import my_program
+
+def guess_my_number():
+    my_program.shiny_hello()
+    print("Was your number 42?")
+
+guess_my_number()
+# ✨ Hello from Python ✨
+# Was your number 42?
+```
+
+:::{note}Why did that work?
+
+All Python modules (individual files) have a `__name__` attribute, which is usually the same as the name used to import the module.
+
+```python
+import os
+print(os.__name__)
+# os
+```
+
+This attribute is available within a module by using a global `__name__`. So in the `os.py` module, `__name__`
+also gives the value `'os'`.
+
+Importantly, this name is changed for the *first user-module* executed by Python. When you pass a file to
+`python`, that is the first user-module executed. For this module, and only when it is the first, the `__name__`
+is changed to the string `'__main__'`. This answers the question for every module used in a Python program, "am I the main module?".
+:::
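
Note on the `__name__` explanation in `execute-script.md`: the two values can be observed side by side. This is a minimal sketch under stated assumptions. It writes a hypothetical one-file `my_program.py` (its contents are invented for illustration) into a temporary directory, then runs it once as a script and once through an `import`, capturing what each run prints.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# Hypothetical module body: it reports the value of __name__ that it sees.
MODULE = (
    'print("name is", __name__)\n'
    'if __name__ == "__main__":\n'
    '    print("running as a script")\n'
)

with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "my_program.py").write_text(MODULE)

    # Script mode: the file is the first user-module executed.
    as_script = subprocess.run(
        [sys.executable, "my_program.py"],
        cwd=tmp, capture_output=True, text=True,
    ).stdout.splitlines()

    # Import mode: another program (here a -c one-liner) imports the file.
    as_import = subprocess.run(
        [sys.executable, "-c", "import my_program"],
        cwd=tmp, capture_output=True, text=True,
    ).stdout.splitlines()

print(as_script)
# ['name is __main__', 'running as a script']
print(as_import)
# ['name is my_program']
```

The guarded line only appears in the script run, which is the separation of script behavior from import behavior that the section describes.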