Dev: Workflows

This page describes some important workflows in Duffy from a high-level perspective, i.e. leaving out unnecessary details.

Session Lifecycle

Requesting a Session

  1. A tenant requests a new session by POSTing to /api/v1/session with a list of specs describing how many nodes they want from which pools (see the sketch below).
  2. The web app checks whether the tenant’s number of active nodes plus the newly requested ones would exceed their quota; if so, an error is returned.
  3. The web app checks whether enough nodes are available in the desired pools; if not, an error is returned.
  4. A session object is created, the nodes are allocated to it, and they are marked as contextualizing.
  5. The allocated nodes are contextualized in parallel, i.e. the tenant’s SSH key is added to the root user’s .ssh/authorized_keys on each node.
  6. If contextualizing any nodes fails, those nodes are marked as failed, the other changes are rescinded, and an error is returned.
  7. If contextualizing the nodes succeeds, the nodes are marked as deployed and information about the session and its nodes is returned to the tenant.

During this workflow, if any node was taken out of circulation (by being deployed or due to failure), the fill_pools() backend task is kicked off for the affected pools so new nodes can be provisioned into them.
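To illustrate step 1, a session request might look roughly like the following Python sketch. The payload field names (nodes_specs, pool, quantity), the response structure, the pool names, and the Basic-auth credentials are assumptions for illustration and may differ from the actual API schema.

```python
import requests

DUFFY_URL = "https://duffy.example.org"  # hypothetical deployment URL

# Ask for two nodes from one pool and one node from another (field names assumed).
payload = {
    "nodes_specs": [
        {"pool": "example-pool-small", "quantity": 2},
        {"pool": "example-pool-large", "quantity": 1},
    ]
}

response = requests.post(
    f"{DUFFY_URL}/api/v1/session",
    json=payload,
    auth=("my-tenant", "my-api-key"),  # tenant name and API key, assumed auth scheme
    timeout=30,
)
response.raise_for_status()

# On success, the returned data describes the session and its deployed nodes
# (structure assumed for this sketch).
session = response.json()["session"]
print(session["id"], [node["hostname"] for node in session["nodes"]])
```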

Retiring a Session

  1. The session is retired in one of these ways:
  • A tenant retires a session manually by PUTting to /api/v1/session/<id> and setting active to false (see the sketch after this list).
  • The expire_sessions() backend task, which runs at configurable, regular intervals, finds sessions past their configured lifetimes.
  2. The deprovision_nodes() backend task is kicked off for the nodes in the session.
  3. It sorts the nodes in the session by their pools and kicks off a deprovision_pool_nodes() task for each pool and its nodes.
  4. deprovision_pool_nodes() first decontextualizes the nodes, i.e. attempts to remove the tenant’s SSH key from them.
  5. Then, it deprovisions the nodes using the configured backend mechanism. Right now, this means running the configured deprovisioning playbook through Ansible.
  6. If the nodes are reusable, they’re marked as unused and can be provisioned anew later. If not, they are marked as done and retired in the database.
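Manually retiring a session (step 1, first bullet) could look roughly like this; the URL, credentials, and payload shape are assumptions based on the description above.

```python
import requests

DUFFY_URL = "https://duffy.example.org"  # hypothetical deployment URL
SESSION_ID = 123  # id returned when the session was requested

# Mark the session as no longer active; Duffy then deprovisions its nodes.
response = requests.put(
    f"{DUFFY_URL}/api/v1/session/{SESSION_ID}",
    json={"active": False},  # payload shape assumed from the description above
    auth=("my-tenant", "my-api-key"),  # assumed auth scheme
    timeout=30,
)
response.raise_for_status()
```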

Filling up Pools

  1. At configurable, regular intervals, the fill_pools() backend task is executed.
  2. It iterates over the configured pools and kicks off a fill_single_pool() task for each of them.
  3. This task checks whether enough nodes are available in the pool or being provisioned into it (i.e. in states ready or provisioning). The latter is important to check so the pool is not overfilled, because provisioning can take longer than the intervals at which pool fill levels are checked (see the sketch after this list).
  4. If not enough nodes are available, either new nodes are allocated or existing reusable ones are marked as provisioning in the database.
  5. Depending on configuration, the provision_nodes_into_pool() backend task is kicked off for either all nodes at once or each node individually.
  6. This provisions the nodes using the configured backend mechanism. Right now, this means running the configured provisioning playbook through Ansible.
  7. If any nodes failed to be provisioned, they’re either deleted from the database or (if reusable) marked as unused again.
  8. Successfully provisioned nodes are marked as ready, i.e. available to be handed out to tenants.
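To make the fill-level check in step 3 concrete, here is a minimal sketch of the decision logic. The function name and parameters are hypothetical; this is not Duffy’s actual implementation, just the counting rule described above.

```python
def nodes_needed(fill_level: int, ready: int, provisioning: int) -> int:
    """How many nodes still need to be provisioned into a pool.

    Nodes in state "provisioning" are counted alongside "ready" ones so the
    pool is not overfilled when provisioning takes longer than the interval
    at which fill levels are checked.
    """
    return max(fill_level - (ready + provisioning), 0)


# Example: target fill level 5, 2 nodes ready, 2 still provisioning -> 1 new node.
assert nodes_needed(fill_level=5, ready=2, provisioning=2) == 1
```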