effective processes for monorepos #3203
Replies: 9 comments 3 replies
-
This is significantly more difficult than it may sound at first. When a Promotion is created, the target Stage's promotionTemplate is copied to the Promotion so that it has a complete and immutable "plan" for how to proceed. The engine that executes the steps, the internal bookkeeping, and the UI features that give visibility into what steps have done or will do are all predicated on the sequence of steps being discrete and finite. This isn't something that could easily be changed. Were we able to support something like a loop, we'd have to do something similar to the approach we're currently whiteboarding for referencing a reusable set of parameterized steps. A task reference or loop would be a "pseudo-step" that's expanded up-front so that we're still just feeding the engine a discrete and finite list of steps. In the end, I think this would feel more like the GitHub Actions matrix feature than an actual loop, i.e. no "do while," just "repeat for x, y, z."
Technical constraints and implementation details aside, I do wonder if this isn't an edge case. The scale is neither surprising nor problematic in and of itself, but the notion of a single Stage being so vast in scope that it's left you feeling you need a loop feature is unusual. Such sprawling Stages strongly imply that your Promotions are coupling together the deployment of artifacts that could have been or should have been capable of moving through the stages of their own lifecycles more independently of one another. So, not to derail the loop/matrix conversation, but some more info about your specific use case may be enlightening.
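To make the "expanded up-front" idea concrete, here is a minimal Python sketch of how a pseudo-step could be flattened before the engine ever sees it. Everything about the syntax is an assumption for illustration only: the `repeat` key, its `for` and `steps` fields, and the `${{ item }}` placeholder are hypothetical and are not Kargo features. The step names are borrowed from the examples in this thread.

```python
from copy import deepcopy

PLACEHOLDER = "${{ item }}"  # hypothetical placeholder, not real Kargo syntax


def substitute(value, replacement):
    """Recursively replace the placeholder in strings nested in dicts/lists."""
    if isinstance(value, str):
        return value.replace(PLACEHOLDER, replacement)
    if isinstance(value, dict):
        return {k: substitute(v, replacement) for k, v in value.items()}
    if isinstance(value, list):
        return [substitute(v, replacement) for v in value]
    return value


def expand_steps(steps):
    """Flatten hypothetical 'repeat' pseudo-steps into a finite list of steps."""
    expanded = []
    for step in steps:
        if "repeat" not in step:
            expanded.append(deepcopy(step))
            continue
        # "repeat for x, y, z": one copy of the nested steps per item,
        # resolved before execution starts, so the plan stays immutable.
        for item in step["repeat"]["for"]:
            for inner in step["repeat"]["steps"]:
                expanded.append(substitute(inner, item))
    return expanded


template = [
    {"uses": "git-clone", "config": {"path": "./src"}},
    {
        "repeat": {
            "for": ["x", "y", "z"],
            "steps": [
                {
                    "uses": "helm-template",
                    "config": {
                        "path": "./src/" + PLACEHOLDER,
                        "outPath": "./out/" + PLACEHOLDER,
                    },
                }
            ],
        }
    },
    {"uses": "git-commit", "config": {"path": "./out"}},
]

plan = expand_steps(template)
for step in plan:
    print(step["uses"], step["config"])
```

The resulting plan contains no pseudo-steps at all, only ordinary steps, which is why this behaves like a matrix rather than a loop: the number of iterations has to be known when the Promotion is created.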
-
@krancour Using the example shown in the link, right now Kargo can't generate the
Assuming the system gets more and more complex and more microservices join the mix over time, this leads to very long and difficult-to-manage Stage files.
-
I think the desire for a loop is a symptom of a poorly conceived process. If I have some large number of microservices, is it really my desire to move image versions or config changes for all of them through a single pipeline as a unit? Most likely not, because if it were, I'd be missing out on some of the biggest benefits of a microservice architecture. I'd be taking a large number of things that could have been deployed and managed individually and bundling them together. A more sensible approach is a separate pipeline for each microservice (or one for each relatively small set of microservices that must, for whatever reason, always be deployed as a unit). "Watch this image repo and these particular directories in my monorepo and move that small set of artifacts through the pipeline as a unit" makes more sense -- in which case you have no need to loop through n directories, applying the same process to each. Where you might have had one Project before with one Warehouse and one pipeline comprised of a few Stages, you would now have many Projects, probably split along organizational lines with different people responsible for each. Each will likely contain multiple Warehouses and multiple pipelines comprised of a few Stages each. But in a sense, the original problem still remains. Instead of having one promotion process with n repetitive steps, you have n Stages and each one has some slight variation of the necessary steps... but this is exactly what v1.2's major feature is meant to address -- reuse of parameterized task definitions. Bottom line is it would make far more sense to define the promotion process for any one microservice and then reuse that process (with different arguments) from each Stage. It DRYs things up nicely, keeps microservices individually deployable/manageable, and requires no loop. 1.2 will be out shortly.
In the meantime, another option I have seen several users apply is a custom Helm chart to create Projects that all conform to a common recipe.
-
But if you have multiple pipelines, how do you get a branch per stage when you effectively have 30-50 pipelines writing to a single branch? For me, this concept falls apart rather quickly once we move out of Kargo demo territory. If you have dozens of services with some coupling (which, let's face it, the majority of environments will have), it gets messy. My original scenario is that I run a product with multitenancy by environment. So effectively I have 30-50 envs at any given moment per stage (dev/preview/prod), each having 30+ microservices. Assuming that each chart is ~10 manifests, I will have branches with 15k manifests. If I must store them in a single folder and run them as one task, this will be a nightmare. I do not want to have a stage per env, as running 150+ branches will be even less efficient. And some of my environments are short-lived, which makes things even messier. For me, Kargo is not ready for that scale. Of course, I can handle that in Argo Workflows and use http to run and observe workflows. But what is the added benefit of Kargo if I must write to my repo using a workflow anyway? Loops and conditions are a nice first step toward having some basic pipeline capabilities without the need to hack REST controls for Argo Workflows or Jenkins. I get that my scenario might be a bit specific, but this capability will also solve other problems mentioned in the comments.
-
Can you reword this or provide more detail? I want to try to understand the exact process you're trying to achieve. I think something that many users are failing to appreciate is that Kargo is a tool for implementing your own process. If your process has its own inherent flaws, Kargo won't magically save you. It's important to take a step back and examine how you really want things to work.
Well... what you're doing might be a little nuts. idk what you mean by "environments." It's a term we shy away from due to it being overloaded with different meanings depending on one's perspective. But I can infer the scale you're talking about. 30-50 times another 30+. So we're talking 900 - 1500+ microservices per stage. You really want to deploy that many microservices at once as an inseparable unit? I have my doubts that this is what you really want and that you've gotten stuck there because it isn't clear to you how to do whatever it is you really do want. So... I'd really like to try and get to the bottom of how you want things to work, without Kargo entering into the conversation. Let's find the process that fits your needs and afterwards worry about how Kargo can implement that (or cannot and therefore has a gap that needs to be filled).
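As a quick sanity check of that scale, here is the arithmetic as a small Python sketch. The inputs are the figures quoted in this thread (30-50 environments per stage, 30+ microservices each, roughly 10 manifests per chart). Treating "30+" as exactly 30 is an assumption, so the results are lower bounds.

```python
# Figures quoted in the thread; "30+" is treated as exactly 30 (an assumption).
envs_per_stage = (30, 50)        # low and high end of the quoted range
services_per_env = 30
manifests_per_chart = 10

# Microservices deployed per stage, at the low and high end of the range.
services = tuple(envs * services_per_env for envs in envs_per_stage)

# Rendered manifests per stage branch, assuming one chart per microservice.
manifests = tuple(count * manifests_per_chart for count in services)

print(services)   # (900, 1500)
print(manifests)  # (9000, 15000)
```

The high end matches the 15k manifests per branch mentioned earlier in the thread.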
-
I get what you're saying. Not even going to argue when it comes to my microservices. What I'm first trying to achieve is a proper process that takes care of my clusters' addons/infra manifests. Let's assume I'm willing to pay that price (which might be daunting at first, but I can live with it if I get a stable process out of it). How would you suggest I best manage things from a branching perspective? Should each branch represent a combo of the stage and the addon? Say,
-
Ok... I'm glad we're having this conversation, because I see a lot of users trying to do what you were doing, and it's becoming clearer that it's not at all because they want to deploy 900 things at once. It's because they don't know how to approach that more reasonably.
Totally understand this. Monorepos shouldn't represent any kind of difficulty for Kargo. They just require some careful planning.
So let's start from there. Let's say the add-ons you want to deal with are cert-manager and kube-prometheus-stack. (Clearly, there's no good reason to move upgrades to those two things through a single pipeline as a unit.) And let's say that your Stages (for this particular case) map to clusters like "lab," "non-prod," and "prod." The first question is one of how the monorepo will be laid out. I think something along these lines is a pretty good starting point, but you can tweak it:

.
├── cert-manager/
│   ├── base/
│   └── stages/
│       ├── lab/
│       ├── non-prod/
│       └── prod/
└── kube-prometheus-stack/
    ├── base/
    └── stages/
        ├── lab/
        ├── non-prod/
        └── prod/

The major features of this layout are that:
In order to neither make assumptions nor complicate the conversation, I've left out the contents of all the
Now you can make a Warehouse for each of these, and importantly you need to do some path filtering:

apiVersion: kargo.akuity.io/v1alpha1
kind: Warehouse
metadata:
  name: cert-manager
  namespace: addons
spec:
  subscriptions:
  - git:
      repoURL: https://github.com/example/monorepo.git
      includePaths:
      - cert-manager
---
apiVersion: kargo.akuity.io/v1alpha1
kind: Warehouse
metadata:
  name: kube-prometheus-stack
  namespace: addons
spec:
  subscriptions:
  - git:
      repoURL: https://github.com/example/monorepo.git
      includePaths:
      - kube-prometheus-stack

Again, the path filtering is important here.
The second question is what your promotion process looks like, which can be further decomposed into two smaller questions:
Now onto the last component of this: "The YAML explosion." If you've got a lot of add-ons and a lot of Stages for each, you're going to have a lot of separate Stages with promotionTemplates that look (possibly) like this:

apiVersion: kargo.akuity.io/v1alpha1
kind: Stage
metadata:
  name: cert-manager-lab
  namespace: addons
spec:
  requestedFreight:
  - origin:
      kind: Warehouse
      name: cert-manager
    sources:
      direct: true
    # OR for an env downstream from cert-manager-lab:
    # sources:
    #   stages:
    #   - cert-manager-lab
  promotionTemplate:
    spec:
      vars:
      - name: addon
        value: cert-manager
      - name: simpleStageName
        value: lab
      - name: gitRepo
        value: https://github.com/example/monorepo.git
      - name: targetBranch
        value: stage/${{ ctx.stage }} # Evaluates to `stage/cert-manager-lab`
      steps:
      - uses: git-clone
        config:
          repoURL: ${{ vars.gitRepo }}
          checkout:
          - commit: ${{ commitFrom(vars.gitRepo).ID }}
            path: ./src
          - branch: ${{ vars.targetBranch }}
            create: true
            path: ./out
      - uses: git-clear
        config:
          path: ./out
      - uses: helm-template
        config:
          path: ./src/${{ vars.addon }}/base
          releaseName: ${{ vars.addon }} # Not anticipating name collisions since each stage is a separate cluster, but if that were not true, you could adjust accordingly
          valuesFiles:
          - ./src/${{ vars.addon }}/stages/${{ vars.simpleStageName }}
          outPath: ./out
      - uses: git-commit
        as: commit
        config:
          path: ./out
      - uses: git-push
        config:
          path: ./out
          targetBranch: ${{ vars.targetBranch }}
      - uses: argocd-update
        config:
          apps:
          - name: ${{ vars.addon }} # Not anticipating name collisions since each stage is a separate cluster, but if that were not true, you could adjust accordingly
            sources:
            - repoURL: ${{ vars.gitRepo }}
              desiredRevision: ${{ outputs.commit.commit }}

Note that everything from
I hope this helps and please do keep the conversation going if you've got questions or concerns. As I said, I think this conversation is proving to be enormously important and productive because I've seen many users falling into the same trap that I think you were. I'm even going to point our developer advocates at this issue because of just how important I think this is. cc @arober39
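To show how little actually varies between those Stages, here is a small Python sketch that enumerates the add-on and Stage combinations from the example layout and derives the values that differ. It is only an illustration, not a generator for real manifests. The linear promotion order lab, then non-prod, then prod is an assumption, and `stage_params` is a hypothetical helper, not part of Kargo.

```python
from itertools import product

ADDONS = ["cert-manager", "kube-prometheus-stack"]
# Assumed linear promotion order; the first Stage takes Freight directly
# from the Warehouse, later ones from the Stage immediately upstream.
STAGES = ["lab", "non-prod", "prod"]


def stage_params(addon, simple_stage):
    """Derive the per-Stage values that differ in the example Stage manifest."""
    name = f"{addon}-{simple_stage}"
    index = STAGES.index(simple_stage)
    upstream = None if index == 0 else f"{addon}-{STAGES[index - 1]}"
    return {
        "name": name,                          # metadata.name
        "warehouse": addon,                    # requestedFreight origin
        "upstream": upstream,                  # None means sources.direct: true
        "vars": {"addon": addon, "simpleStageName": simple_stage},
        "targetBranch": f"stage/{name}",       # what stage/${{ ctx.stage }} evaluates to
        "valuesPath": f"./src/{addon}/stages/{simple_stage}",
    }


all_stages = [stage_params(a, s) for a, s in product(ADDONS, STAGES)]
for params in all_stages:
    source = params["upstream"] or f"warehouse/{params['warehouse']}"
    print(f"{params['name']:32} <- {source:32} -> {params['targetBranch']}")
```

Two add-ons across three Stages already yield six Stage resources whose steps are identical and whose differences come down to two variables and the Freight source.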
-
I was considering using Kargo to manage so-called "addons". But many of these "addons" contain a lot of infrastructure-specific variables (like the AWS account ID), and right now I'm using ArgoCD cluster metadata to pass these values to applications using ApplicationSets. Maybe I could use the Kargo http step to extract these values from ArgoCD cluster metadata and substitute them into values, but it doesn't look straightforward.
-
@Brightside56 this looks like it's a separate topic? Start a new thread please.
-
Checklist
kargo version, if applicable.
Proposed Feature
Add a loop step to promotion steps to allow running steps against a list of objects, e.g. folders defining instances of an application.
Motivation
Currently, steps like kustomize build or helm template render a flat structure of files. This is not a big problem until you run dozens of instances of an app across multiple namespaces or even clusters but treat them as a single stage. To make the output branch easily understandable by humans (a file tree is more naturally traversable than the prefixes offered by kustomize), you must build folders one by one with different subpaths as output. A similar scenario applies if you have multiple repositories to check out or commit branches against.
Suggested Implementation
A loop step that takes a list of objects to determine the iterator (e.g. all folders or files in a specific directory, or an array of values) and a list of steps to run against each value of the iterator. This works on the premise that the current value of the iterator is accessible via the expression language inside the loop.