[0 of 6] New feature: allow 'git::' forced filepaths, both absolute and relative

This series of changesets introduces a feature that allows the 'git::'
forcing token to be used on local file system paths to reference Git
repositories. Both absolute paths and relative paths are supported. For
example:
    git::./some/relative/path/to/a/git-repo//some-subdir?ref=v1.2.3
or:
    git::../../some/relative/path/to/a/git-repo//some-subdir?ref=v1.2.3
or:
    git::/some/absolute/path/to/a/git-repo//some-subdir?ref=v4.5.6

Only filepaths that are prefixed with the 'git::' forcing token are
considered for processing.

Internally, go-getter transforms the provided string into a 'file://'
URI with an absolute filepath, with query string params and subdirectory
retained.

The rationale for using a 'file://' URI internally is that the Git clone
operation can already work with 'file://' URIs, and using them for this
feature allows us to leverage the existing go-getter URI-handling
machinery. That gets us support for query params (to clone a specific
git ref (tag, commit hash, ...)) "for free".

The rationale for using an absolute filepath (even when the provided
string is a relative filepath) is that (per RFC 1738 and RFC 8089) only
absolute filepaths are legitimate in 'file://' URIs. But more
importantly here, the Git clone operation only supports 'file://' URIs
with absolute paths.
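The transformation described above can be sketched roughly as follows. This is
an illustration only, not the code from this changeset series: the function
name gitFileURL and its baseDir parameter are hypothetical, and the sketch
assumes a Unix-like filesystem and an absolute baseDir.

```go
package main

import (
	"errors"
	"fmt"
	"path/filepath"
	"strings"
)

// gitFileURL is a hypothetical helper (not part of go-getter). It turns a
// 'git::'-forced file path into a 'file://' URI with an absolute path,
// keeping any '//subdir' component and '?query' string unchanged.
func gitFileURL(src, baseDir string) (string, error) {
	const force = "git::"
	if !strings.HasPrefix(src, force) {
		// Only paths carrying the forcing token are considered.
		return "", errors.New("source is not prefixed with 'git::'")
	}
	if !filepath.IsAbs(baseDir) {
		return "", errors.New("baseDir must be an absolute path")
	}
	rest := strings.TrimPrefix(src, force)

	// Split off the query string (for example "?ref=v1.2.3"), if present.
	query := ""
	if i := strings.Index(rest, "?"); i >= 0 {
		rest, query = rest[:i], rest[i:]
	}

	// Split off the double-slash subdirectory component, if present.
	subdir := ""
	if i := strings.Index(rest, "//"); i >= 0 {
		rest, subdir = rest[:i], rest[i:]
	}
	if rest == "" {
		return "", errors.New("empty repository path")
	}

	// Relative paths are resolved against baseDir, because 'file://'
	// URIs are only meaningful with absolute paths.
	if !filepath.IsAbs(rest) {
		rest = filepath.Join(baseDir, rest)
	}
	rest = filepath.Clean(rest)

	return "file://" + filepath.ToSlash(rest) + subdir + query, nil
}

func main() {
	u, err := gitFileURL("git::../mods/repo//sub?ref=v1.2.3", "/home/user/proj")
	fmt.Println(u, err)
}
```

Which directory a relative path should be resolved against is a separate
question; the Design Notes below explain why the process working directory is
not always the right answer.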

Q: Why support this functionality at all?

   Why not just require that a source location use an absolute path in a
   'file://' URI explicitly if that's what is needed?

A: The primary reason is to allow support for relative filepaths to Git
   repos.

   There are use cases in which the absolute path cannot be known in
   advance, but a relative path to a Git repo is known.

   For example, when a Terraform project (or any Git-based project) uses
   Git submodules, it will know the relative location of the Git
   submodule repos, but cannot know the absolute path in advance because
   it will vary based on where the "superproject" repo is
   cloned. Nevertheless, those relative paths should be usable as
   clonable Git repos, and this mechanism would allow for that.

   Support for filepaths that are already absolute is provided mainly
   for symmetry. It would be surprising for the feature to work with
   relative filepaths, but not with absolute filepaths.

For projects using Terraform, in particular, this feature (along with a
small change in the Terraform code to leverage it) enables the
non-fragile use of relative paths in a module "call" block, when
combined with Git submodules:

    module "my_module" {
        source = "git::../git-submodules/tf-modules/some-tf-module?ref=v0.1.0"
        // ...
    }

In the above example, the "superproject" Git repo (the one "calling" the
Terraform module) knows the relative path to its own Git submodules
because they are embedded in a subdirectory beneath the top level of the
"superproject" repo.

Two downstream Terraform issues that would require go-getter support for
this feature (or something like it) are at [0] and [1].

This first changeset in the series updates the README.md documentation
to note the new feature and provide examples.

[0] "Unable to use relative path to local Git module"
    hashicorp/terraform#25488

[1] "In 0.12, modules can no longer be installed from local git repositories at relative paths"
    hashicorp/terraform#21107

Design Notes
------------
In order for this feature to work, the Git detector needs more
contextual information than can be provided using the existing
Detector API.

Internally, the Detector's Detect method does not pass along to the
Detector implementations all of the contextual information that it has
available. In particular, the forcing token and go-getter subdir
component are stripped out of the source string before invoking the
implementation's Detect method. In the particular case of the Git
detector, that means it cannot know that a 'git::' forcing token was
provided on an input string that otherwise looks like a file system
path. And /that/ means that it is not correct or safe for it to identify
any filepath string value as a Git repository.
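To make the problem concrete, here is a minimal sketch of forcing-token
stripping. The helper name splitForced and the exact regular expression are
assumptions for illustration and are not taken from the go-getter source. Once
the token has been split off, a 'git::'-forced path and a plain path look the
same to whatever receives only the remainder.

```go
package main

import (
	"fmt"
	"regexp"
)

// forcedRe matches a leading "<token>::" forcing prefix. The pattern is an
// assumption made for this sketch.
var forcedRe = regexp.MustCompile(`^([A-Za-z0-9]+)::(.+)$`)

// splitForced is a hypothetical helper. It returns the forcing token (empty
// if there is none) and the remainder of the source string.
func splitForced(src string) (token, rest string) {
	if m := forcedRe.FindStringSubmatch(src); m != nil {
		return m[1], m[2]
	}
	return "", src
}

func main() {
	for _, s := range []string{"git::./some/repo", "./some/repo"} {
		token, rest := splitForced(s)
		// Both inputs yield the same remainder; only the token differs,
		// and a detector that never sees the token cannot tell them apart.
		fmt.Printf("%q -> token=%q rest=%q\n", s, token, rest)
	}
}
```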

Externally, callers (such as Terraform) already provide a value for the
'pwd' parameter of Detect, but it is not (necessarily) the location from
which a relative path in a 'git::' string should be resolved. A
Terraform module may live in an arbitrary subdirectory relative to the
process's current working directory, and module "source" references that
contain relative paths must be interpreted relative to the location of
the module source file. Terraform has that information available, but in
the existing Detect API there is no way to convey it to go-getter.

Constraints
-----------
Additional Detector methods cannot be added without burdening all
existing detectors (both internal and in the wild) with the need to
support them.

Additional Detect method params cannot be added without breaking all
existing Detector implementations (internal, wild).

Additional parameters cannot be added to the Detect dispatching function
without affecting all callers.

Approach
--------
The goal is to provide the feature in a way that is as minimally
invasive as possible. But above all else it needs to avoid breaking
backward compatibility in any way.

Given that, the approach taken by this changeset series is to introduce
the concept of a "Contextual Detector". It is structured in the same way
as the current Detector framework, but works through a new CtxDetector
interface that is not constrained by the existing API.

The only callers affected by this change would be those that wish to take
advantage of the additional capabilities. And for those, the migration path
is straightforward because the new API is structured like the existing one.

In particular, this changeset series introduces four new elements:

    1. CtxDetector interface

    2. CtxDetect dispatching function

    3. CtxDetect method on the CtxDetector interface

    4. Full suite of CtxDetector implementations that are analogues of
       the existing detectors (most of which (currently) just delegate
       to the existing Detector implementations).

There is also a global 'ContextualDetectors' list that serves a function
analogous to the existing 'Detectors' list.
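
The shape of these elements might look something like the sketch below. The
names CtxDetector, CtxDetect and ContextualDetectors come from the description
above, but the parameter list of CtxDetect (forceToken, subDir, resolveFrom) is
an assumption made for illustration. The adapter, the toy detectors, and the
lowercase ctxDetect and contextualDetectors stand-ins are hypothetical too. A
Unix-like filesystem is assumed.

```go
package main

import (
	"errors"
	"fmt"
	"path/filepath"
	"strings"
)

// Detector has the same shape as the existing go-getter interface.
type Detector interface {
	Detect(src, pwd string) (string, bool, error)
}

// CtxDetector is a sketch of the contextual analogue. The extra
// parameters are assumptions, not the signature from this series.
type CtxDetector interface {
	CtxDetect(src, pwd, forceToken, subDir, resolveFrom string) (string, bool, error)
}

// legacyCtx adapts an existing Detector by ignoring the extra context.
type legacyCtx struct{ d Detector }

func (l legacyCtx) CtxDetect(src, pwd, _, _, _ string) (string, bool, error) {
	return l.d.Detect(src, pwd)
}

// gitHubDetector is a toy stand-in for an existing detector.
type gitHubDetector struct{}

func (gitHubDetector) Detect(src, _ string) (string, bool, error) {
	if strings.HasPrefix(src, "github.com/") {
		return "git::https://" + src, true, nil
	}
	return "", false, nil
}

// gitFileCtx claims a file path only when 'git::' was explicitly forced.
type gitFileCtx struct{}

func (gitFileCtx) CtxDetect(src, pwd, forceToken, subDir, resolveFrom string) (string, bool, error) {
	if forceToken != "git" {
		return "", false, nil
	}
	if !filepath.IsAbs(src) && !strings.HasPrefix(src, "./") && !strings.HasPrefix(src, "../") {
		return "", false, nil
	}
	if !filepath.IsAbs(src) {
		base := resolveFrom // caller-supplied location of the referencing file
		if base == "" {
			base = pwd // fall back to the traditional 'pwd' value
		}
		if !filepath.IsAbs(base) {
			return "", false, errors.New("base directory must be absolute")
		}
		src = filepath.Join(base, src)
	}
	url := "file://" + filepath.ToSlash(filepath.Clean(src))
	if subDir != "" {
		url += "//" + subDir
	}
	return "git::" + url, true, nil
}

// contextualDetectors plays the role of the global list described above.
var contextualDetectors = []CtxDetector{gitFileCtx{}, legacyCtx{gitHubDetector{}}}

// ctxDetect tries each detector in order and returns the first match.
func ctxDetect(src, pwd, forceToken, subDir, resolveFrom string, ds []CtxDetector) (string, error) {
	for _, d := range ds {
		result, ok, err := d.CtxDetect(src, pwd, forceToken, subDir, resolveFrom)
		if err != nil {
			return "", err
		}
		if ok {
			return result, nil
		}
	}
	return "", fmt.Errorf("no detector matched %q", src)
}

func main() {
	fmt.Println(ctxDetect("../mods/repo", "/cwd", "git", "sub", "/proj/envs/prod", contextualDetectors))
	fmt.Println(ctxDetect("github.com/foo/bar", "/cwd", "", "", "", contextualDetectors))
}
```

The adapter illustrates why most analogues can simply delegate: an existing
detector needs none of the extra context, so wrapping it leaves its behavior
unchanged.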

Signed-off-by: Alan D. Salewski <[email protected]>
salewski committed Jun 5, 2023
1 parent c12e42f commit 1ab60bc
Showing 1 changed file with 13 additions and 0 deletions.
13 changes: 13 additions & 0 deletions README.md
@@ -304,6 +304,19 @@ The "scp-style" addresses _cannot_ be used in conjunction with the `ssh://`
scheme prefix, because in that case the colon is used to mark an optional
port number to connect on, rather than to delimit the path from the host.

Git repositories that reside on the local filesystem can be accessed by
prefixing the `git::` forcing token to the file path. Both absolute and
relative paths are accepted, and may contain query parameters and/or a
double-slash `//` subdirectory component. Some examples:

#### Git File Path Examples

- `git::/path/to/some/git/repo`
- `git::/path/to/some/git/repo//some/subdir`
- `git::/path/to/some/git/repo//some/subdir?ref=v1.2.3`
- `git::./path/to/some/git/repo//some/subdir?ref=v1.2.3`
- `git::../../path/to/some/git/repo//some/subdir?ref=v1.2.3`

### Mercurial (`hg`)

* `rev` - The Mercurial revision to checkout.