guide: make new dvc.yaml guide skeleton (proposal)
jorgeorpinel committed Jan 16, 2021
1 parent fedddd3 commit d23a42e
Showing 3 changed files with 52 additions and 8 deletions.
43 changes: 42 additions & 1 deletion content/docs/user-guide/creating-pipelines.md
@@ -1,3 +1,44 @@
# Creating Pipelines

`dvc.yaml` guide
You construct pipelines by defining individual
[stages](/doc/command-reference/run) in one or more `dvc.yaml` files. Stages
that connect to each other (the <abbr>outputs</abbr> of one stage become the
<abbr>dependencies</abbr> of another one, and so on) become a pipeline. See
[Data Pipelines](/doc/start/data-pipelines) for an intro.
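
As a minimal sketch of how stages connect, a `dvc.yaml` with two stages might look like the following. The stage names, scripts, and file paths are hypothetical and not taken from this commit; only the `stages`, `cmd`, `deps`, and `outs` keys are part of the `dvc.yaml` format. The output of `prepare` is listed as a dependency of `train`, which is what links the two stages into one pipeline.

```yaml
stages:
  prepare:                       # hypothetical stage name
    cmd: python prepare.py data/raw.csv data/prepared.csv
    deps:
      - prepare.py
      - data/raw.csv
    outs:
      - data/prepared.csv        # output of this stage...
  train:                         # hypothetical stage name
    cmd: python train.py data/prepared.csv model.pkl
    deps:
      - train.py
      - data/prepared.csv        # ...is a dependency of this one
    outs:
      - model.pkl
```

With a file like this in the <abbr>workspace</abbr>, `dvc repro` would be expected to run `prepare` before `train`, since the second stage depends on the first one's output.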

💡 Keep in mind that one `dvc.yaml` file does not necessarily equal one pipeline
(although that is typical). DVC evaluates all the `dvc.yaml` files in the
<abbr>workspace</abbr> to rebuild and validate all of your pipelines (see
`dvc repro` and `dvc status`).

To record the state of your pipeline(s) and track their outputs, DVC also
maintains `dvc.lock` file(s) matching `dvc.yaml`.

> Note `dvc.yaml` and `dvc.lock` files are meant to be versioned with Git (if
> enabled in the <abbr>repository</abbr>).
## DVC YAML files

`dvc.yaml` files (or _pipelines files_) specify stages that form the pipeline(s)
of a project, and how they connect (_dependency graph_ or
[DAG](/doc/command-reference/dag)).

They use the [YAML 1.2](https://yaml.org/) file format, and a human-friendly
schema described below. We encourage you to get familiar with it so you may
modify, write, or generate stages and pipelines on your own. Here's an example:

...

> See [How to Merge Conflicts](/doc/user-guide/how-to/merge-conflicts) for tips
> on managing DVC files.
## dvc.yaml specification

...

## dvc.lock file

... normally have a matching `dvc.lock` file to record the pipeline state and
track its <abbr>outputs</abbr>.

> ⚠️ Avoid editing these files by hand; DVC will create and update them for you.
13 changes: 8 additions & 5 deletions content/docs/user-guide/tracking-existing-data.md
@@ -9,12 +9,12 @@ Alternatively, `dvc import` and `dvc import-url` let you bring data from
external locations to your project, and start tracking it at the same time. See
also [Data Access](/doc/start/data-access).

In either case, one or more files ending with the `.dvc` extension ("dot DVC
file") are created in the project, containing the information to track the
target data over time.
In any case, one or more files ending with the `.dvc` extension ("dot DVC file")
are created in the project, containing the information to track the target data
over time.

> Note `.dvc` files are meant to be versioned with Git (in Git-enabled
> <abbr>repositories</abbr>).
> Note `.dvc` files are meant to be versioned with Git (if enabled in the
> <abbr>repository</abbr>).
## Dot DVC files

@@ -38,6 +38,9 @@ meta:
email: [email protected]
```
> See [How to Merge Conflicts](/doc/user-guide/how-to/merge-conflicts) for tips
> on managing DVC files.
## .dvc YAML specification
| Field | Description |
4 changes: 2 additions & 2 deletions content/linked-terms.js
@@ -5,10 +5,10 @@ module.exports = [
},
{
matches: 'dvc.yaml',
url: '/doc/user-guide/creating-pipelines'
url: '/doc/user-guide/creating-pipelines#dvc-yaml-files'
},
{
matches: 'dvc.lock',
url: '/doc/user-guide/creating-pipelines#dvclock'
url: '/doc/user-guide/creating-pipelines#dvclock-file'
}
]
