Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rework workflow implementation #110

Closed
2 of 3 tasks
rob-p opened this issue Oct 17, 2023 · 2 comments
Closed
2 of 3 tasks

Rework workflow implementation #110

rob-p opened this issue Oct 17, 2023 · 2 comments
Assignees
Labels
enhancement New feature or request

Comments

@rob-p
Copy link
Contributor

rob-p commented Oct 17, 2023

The goal of this enhancement is to re-work the workflow implementation to be cleaner, simpler, and ultimately more flexible and easier to use. It consists of two main sub-goals:

Replace Jsonnet with Nickel configuration

Note: We have decided not to do this, as our initial evaluation undersold the capabilities of Jsonnet, and it actually seems more than adequate for our needs.

While Jsonnet is very powerful, and supports the features we need, there are a few downsides that seem to be a thorn in our side. First, Jsonnet has very little in the way of typing information to help validate/verify correct configurations, so this must be done manually. Second, its capability with respect to merging records is lacking. Finally, while the existing jrsonnet library is quite nice, it's is in the middle of a major version transition, and is not particularly well-documented.

On the other hand, Nickel provides advanced merging capabilities, (optional) static typing of variables, enforcement of runtime constraints through contracts and schemas, and first-class programmatic support in rust (the official implementation is in rust). Most of the functionality that we had to implement "by hand" in Jsonnet (in the form of the utility library), can be directly expressed in the Nickel configuration language directly.

So, the first goal is to replace Jsonnet with Nickel. This requires the following changes:

  • Replace code for parsing and evaluating Jsonnet template with code for parsing and evaluating Nickel template.
  • Change expected structure of template and fields to match those in the Nickel template (e.g. the workflow steps are in a nested workflow record rather than at the top level, some field names have changed Step becomes step etc.).
  • Change the code for constructing external commands and command line invocations to allow for records that are not strings. To take advantage of Nickel's typing ability, we can't have the entire template be stringly-typed. Booleans should be bools, numbers should be Numbers etc. However, when we construct actual parameter vectors for parsing from the records, everything need to be properly turned into a string, so we should make sure we handle this properly (note, this happens when parsing the JSON manifest, not the Nickel template, I believe).
  • Add testing for the nickel system (akin to what currently exists in the unit tests for jsonnet evalutation).

Add patching functionality to the configuration

Pursuant to our internal discussions and the discussion here, we wish to have a simple way to "patch" an instantiated workflow to allow it to be reused across many samples where only a small number of fields are changing. This is our "patch" system. The idea is that the user can provide an instantiated or partially instantiated template, and a patch file. The patch file is a CSV format file with column headers being the fully-qualified path name to some key in the template, and row values being the values to be "patched-in" to that key. The patching system then creates a copy of the instantiated template where all of the patched values are replaced by those in the current row of the patch file, and this template is then written out to another file to be executed. One such patched template is created for each row in the CSV file. These can then be independently executed. It is also possible to think about allowing patching directly from the command line. That is, if the number of fields to be patched is small, it would be nice to let the user specify the record to be patched and the patch value from the command line, so that patches can then be applied programmatically without having to create an intermediate patch file.

The tasks for this are:

  • Add the patching sub-sub-command to the simpleaf workflow command.
  • This is the patch sub-command.
  • Implement the patching, and work out how/where, exactly, it should be applied. For example if it is applied to a manifest, then only the directly specified records change, but if it is applied to a template which is then evaluated, then derived values can be changed as well. What is the right place to apply a patch? Should we allow both, or just one? If one, which?
  • Patching is applied to templates.
  • Implement direct patching from the command line in addition to from a CSV.
  • The syntax for this seems like it would be quite unruly, so let's put this on the backburner for now.
  • Add testing for the patch system (unit tests to ensure that patching works as expected, especially important if we allow patching of templates and not just fully-instantiated manifests).
@rob-p rob-p added the enhancement New feature or request label Oct 17, 2023
@DongzeHE
Copy link
Collaborator

DongzeHE commented Oct 17, 2023

  • Add the argument --manifest to allow the execution of a workflow from its manifest JSON file directly.

Done!

@rob-p
Copy link
Contributor Author

rob-p commented Oct 24, 2023

Done in 0.15.0

@rob-p rob-p closed this as completed Oct 24, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

2 participants