Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sys._base_executable resolves symlinks too eagerly, resulting in undesirable venv from venv behaviour #128670

Open
pelson opened this issue Jan 9, 2025 · 1 comment
Labels
type-bug An unexpected behavior, bug, or error

Comments

@pelson
Copy link
Contributor

pelson commented Jan 9, 2025

Bug report

Bug description:

In #106045, the use case of putting a symlink in /usr/local/bin to a Python binary from another prefix was highlighted. For homebrew, this use case is used with the idea that the symlink is the public interface/location, and the prefix where it is actually installed is an implementation detail (and can change over time) (more details in astral-sh/uv#1640).

#106045 seems to have now been resolved by #127974 (I wrote a test for that). The originating issue proposed a solution which would fix the problem with the broken virtual environment in that case, but would actually break the aforementioned homebrew example by eagerly resolving the symlinked executable and using the resolved path to determine the home value in pyvenv.cfg - thereby exposing the "internal" prefix into pyvenv.cfg.

The problem I am reporting here is that

  • because sys._base_executable eagerly resolves symlinks in getpath.py
  • and venv, virtualenv and uv all use sys._base_executable to determine the home location
  • then when you take a virtual environment from another virtual environment in the aforementioned setup, the second virtual environment's home location will be resolved, whereas the first will not be. Thereby bleeding the implementation detail once more.

In uv, the behaviour was solidified in astral-sh/uv#8433 to avoid exposing the internal prefix in the venv. Note that virtualenv is (accidentally?) now exposing the internal prefix even on the first virtual environment pypa/virtualenv#2770.

To reproduce this:

$ mkdir -p /tmp/public/bin/
$ ln -s /usr/local/bin/python3.14 /tmp/public/bin/

$ /tmp/public/bin/python3.14 -m venv /tmp/venv1
$ /tmp/venv1/bin/python -m venv /tmp/venv2

$ cat /tmp/venv1/pyvenv.cfg 
home = /tmp/public/bin
include-system-site-packages = false
version = 3.14.0
executable = /usr/local/bin/python3.14
command = /tmp/public/bin/python3.14 -m venv /tmp/venv1

$ cat /tmp/venv2/pyvenv.cfg 
home = /usr/local/bin
include-system-site-packages = false
version = 3.14.0
executable = /usr/local/bin/python3.14
command = /tmp/venv1/bin/python -m venv /tmp/venv2

And observe that the second venv's home is not /tmp/public/bin but the "internal detail" one.

A quick test for getpath, and a complete test for venv are included below (both failing):

diff --git a/Lib/test/test_getpath.py b/Lib/test/test_getpath.py
index f86df9d0d03..1c2e2a0b3fc 100644
--- a/Lib/test/test_getpath.py
+++ b/Lib/test/test_getpath.py
@@ -864,6 +864,55 @@ def test_PYTHONHOME_in_venv(self):
         actual = getpath(ns, expected)
         self.assertEqual(expected, actual)
 
+    def test_venv_w_symlinked_base_executable(self):
+        """
+        If we symlink the base executable, and point to it via home in pyvenv.cfg,
+        we should have it as sys.executable (and sys.prefix should be the resolved location)
+        """
+        ns = MockPosixNamespace(
+            argv0="/venv/bin/python3",
+            PREFIX="/some/_internal/prefix",
+        )
+        # Setup venv
+        ns.add_known_xfile("/venv/bin/python3")
+        ns.add_known_xfile("/usr/local/bin/python3")
+        ns.add_known_xfile("/some/_internal/prefix/bin/python3")
+
+        ns.add_known_file("/venv/pyvenv.cfg", [
+            # The published based executable location is /usr/local/bin - we don't want to
+            # expose /some/internal/directory (this location can change under our feet)
+            r"home = /usr/local/bin"
+        ])
+        ns.add_known_link("/venv/bin/python3", "/usr/local/bin/python3")
+        ns.add_known_link("/usr/local/bin/python3", "/some/_internal/prefix/bin/python3")
+
+        ns.add_known_file("/some/_internal/prefix/lib/python9.8/os.py")
+        ns.add_known_dir("/some/_internal/prefix/lib/python9.8/lib-dynload")
+
+        # Put a file completely outside of /usr/local to validate that the issue
+        # in https://github.com/python/cpython/issues/106045 is resolved.
+        ns.add_known_dir("/usr/lib/python9.8/lib-dynload")
+
+        expected = dict(
+            executable="/venv/bin/python3",
+            prefix="/venv",
+            exec_prefix="/venv",
+            base_prefix="/some/_internal/prefix",
+            base_exec_prefix="/some/_internal/prefix",
+            # It is important to maintain the link to the original executable, as this
+            # is used when creating a new virtual environment (which should also have home
+            # set to /usr/local/bin to avoid bleeding the internal path to the venv)
+            base_executable="/usr/bin/python3",
+            module_search_paths_set=1,
+            module_search_paths=[
+                "/some/_internal/prefix/lib/python98.zip",
+                "/some/_internal/prefix/lib/python9.8",
+                "/some/_internal/prefix/lib/python9.8/lib-dynload",
+            ],
+        )
+        actual = getpath(ns, expected)
+        self.assertEqual(expected, actual)
+
 # ******************************************************************************
 
 DEFAULT_NAMESPACE = dict(
diff --git a/Lib/test/test_venv.py b/Lib/test/test_venv.py
index 0b09010c69d..bd1b19a9c15 100644
--- a/Lib/test/test_venv.py
+++ b/Lib/test/test_venv.py
@@ -756,6 +756,53 @@ def test_zippath_from_non_installed_posix(self):
         out, err = check_output(cmd)
         self.assertTrue(zip_landmark.encode() in out)
 
+    @unittest.skipIf(os.name == 'nt', 'not relevant on Windows')
+    @requireVenvCreate
+    def test_venv_from_venv_with_symlink(self):
+        """
+        Test that we can create a venv from a venv using a base Python that is
+        a symlink, without exposing the location of the symlink in pyvenv.cfg.
+        """
+        rmtree(self.env_dir)
+        public_prefix = os.path.realpath(tempfile.mkdtemp())
+        self.addCleanup(rmtree, public_prefix)
+        public_bin_dir = os.path.join(public_prefix, 'bin')
+        os.mkdir(public_bin_dir)
+        public_exe = os.path.join(public_bin_dir, self.exe)
+        os.symlink(sys.executable, public_exe)
+        cmd = [public_exe,
+               "-m",
+               "venv",
+               "--without-pip",
+               "--without-scm-ignore-files",
+               self.env_dir]
+
+        subprocess.check_call(cmd)
+
+        # Verify that we don't expose the internal prefix to the first venv config:
+        contents = (pathlib.Path(self.env_dir) / 'pyvenv.cfg').read_text()
+        self.assertIn(f'home = {public_bin_dir}\n', contents)
+
+        # Now use the venv to make another, and assert that the internal env is
+        # also not exposed there.
+        second_venv = os.path.realpath(tempfile.mkdtemp())
+        self.addCleanup(rmtree, second_venv)
+
+        cmd = [os.path.join(self.env_dir, 'bin', 'python3'),
+               "-m",
+               "venv",
+               "--without-pip",
+               "--without-scm-ignore-files",
+               second_venv]
+
+        subprocess.check_call(cmd)
+
+        contents = (pathlib.Path(second_venv) / 'pyvenv.cfg').read_text()
+        self.assertIn(f'home = {public_bin_dir}\n', contents)
+
     @requireVenvCreate
     def test_activate_shell_script_has_no_dos_newlines(self):
         """

cc @FFY00 following on from our conversation in #127974 (comment) (I would love to get a commit for the tests, if not the fix 😜).

CPython versions tested on:

CPython main branch

Operating systems tested on:

Linux

@pelson pelson added the type-bug An unexpected behavior, bug, or error label Jan 9, 2025
@pelson
Copy link
Contributor Author

pelson commented Jan 9, 2025

There is also some useful context on cases that are covered by uv in astral-sh/uv#9723. The approach taken there allows for the additional possibility of having a symlink to Python which is not named python, and still being able to create a venv from it. The issue I've raised here is only intended to get the second and third cases right - IMO the first case should be handled by a change to the pyvenv.cfg.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type-bug An unexpected behavior, bug, or error
Projects
None yet
Development

No branches or pull requests

1 participant