Figure 1 (below) illustrates how Magic Castle is structured to provide a unified interface between multiple cloud providers. Each blue block is a file or a module, while white blocks are variables or resources. Arrows indicate variables or resources that contribute to the definition of the linked variables or resources. The figure can be read as a flow-chart from top to bottom. Some resources and variables have been left out of the chart to avoid cluttering it further.
Figure 1. Magic Castle Terraform Project Structure
- `main.tf`: the user provides the instances and volumes structure they want as _map_s.

instances = {
  mgmt  = { type = "p4-7.5gb",  tags = ["puppet", "mgmt", "nfs"] }
  login = { type = "p2-3.75gb", tags = ["login", "public", "proxy"] }
  node  = { type = "p2-3.75gb", tags = ["node"], count = 2 }
}
volumes = {
  nfs = {
    home    = { size = 100 }
    project = { size = 500 }
    scratch = { size = 500 }
  }
}

- `common/design`:
  - the `instances` map is expanded to form a new map where each entry represents a single host.

instances = {
  mgmt1 = {
    type = "p4-7.5gb"
    tags = ["puppet", "mgmt", "nfs"]
  }
  login1 = {
    type = "p2-3.75gb"
    tags = ["login", "public", "proxy"]
  }
  node1 = {
    type = "p2-3.75gb"
    tags = ["node"]
  }
  node2 = {
    type = "p2-3.75gb"
    tags = ["node"]
  }
}

  - the `volumes` map is expanded to form a new map where each entry represents a single volume.

volumes = {
  mgmt1-nfs-home    = { size = 100 }
  mgmt1-nfs-project = { size = 500 }
  mgmt1-nfs-scratch = { size = 500 }
}

- `network.tf`: the `instances` map from `common/design` is used to generate a network interface (nic) for each host, and a public ip address for each host with the `public` tag. The local ip address retrieved from the nic of the instance tagged `puppet` is output as `puppetserver_ip`.

resource "provider_network_interface" "nic" {
  for_each = module.design.instances
  ...
}

- `common/instance_config`: for each host in `instances`, a cloud-init YAML config that includes `puppetserver_ip` is generated. These configs are output as a `user_data` map where the keys are the hostnames.

user_data = {
  for key, values in var.instances :
  key => templatefile("${path.module}/puppet.yaml", { ... })
}

- `infrastructure.tf`: for each host in `instances`, an instance resource as defined by the selected cloud provider is generated. Each instance is initially configured by its `user_data` cloud-init YAML config.

resource "provider_instance" "instances" {
  for_each  = module.design.instances
  user_data = module.instance_config.user_data[each.key]
  ...
}

- `infrastructure.tf`: for each volume in `volumes`, a block device as defined by the selected cloud provider is generated and attached to its matching instance using an `attachment` resource.

resource "provider_volume" "volumes" {
  for_each = module.design.volumes
  size     = each.value.size
  ...
}
resource "provider_attachment" "attachments" {
  for_each    = module.design.volumes
  instance_id = provider_instance.instances[each.value.instance].id
  volume_id   = provider_volume.volumes[each.key].id
  ...
}

- `infrastructure.tf`: the created instances' information is consolidated in a map output as `all_instances`.

all_instances = {
  mgmt1 = {
    public_ip = ""
    local_ip  = "10.0.0.1"
    id        = "abc1213-123-1231"
    hostkey   = "ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAAB"
    tags      = ["mgmt", "puppet", "nfs"]
  }
  ...
}

- `common/cluster_config`: the information from the created instances consolidated in `all_instances` is written to a YAML file that is uploaded to the Puppet server as part of the hieradata.

resource "null_resource" "deploy_hieradata" {
  ...
  provisioner "file" {
    content     = local.hieradata
    destination = "terraform_data.yaml"
  }
  ...
}

- `outputs.tf`: the information of all instances that have a public address is output as a map named `public_instances`. A possible shape for this output is sketched below.
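A minimal sketch of what this output could look like, assuming `all_instances` is filtered on the `public` tag (the exact filtering condition is an assumption and may differ between providers):

output "public_instances" {
  value = {
    for host, values in local.all_instances :
    host => values
    if contains(values["tags"], "public")
  }
}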
In the previous section, we used generic resource names when writing the HCL code that defines these resources. The following table indicates which resource is used for each provider based on its role in the cluster.
Magic Castle currently supports five cloud providers, but its design makes it easy to add new ones. This section presents a step-by-step guide to adding support for a new cloud provider to Magic Castle.
- Identify the resources. Using the Resource per provider table, read the cloud provider's Terraform documentation and identify the name of each resource in the table.
- Check minimum requirements. Once all resources have been identified, you should be able to determine whether the cloud provider can be used to deploy Magic Castle. If you found a name for each resource listed in the table, the cloud provider can be supported. If some resources are missing, you will need to read the provider's documentation to determine whether the absence of the resource can be compensated for somehow.
- Initialize the provider folder. Create a folder named after the provider. In this folder, create two symlinks, one pointing to `common/variables.tf` and the other to `common/outputs.tf`. These files define the interface common to all providers supported by Magic Castle.
- Define cloud provider specific variables. Create a file named after your provider, `provider_name.tf`, and define the variables that are required by the provider but not common to all providers, for example the availability zone or the region.
- Initialize the infrastructure. Create a file named `infrastructure.tf`. In this file, define the provider if it requires input parameters (for example the region), and include the `common/design` module.

provider "provider_name" {
  region = var.region
}

module "design" {
  source       = "../common/design"
  cluster_name = var.cluster_name
  domain       = var.domain
  instances    = var.instances
  volumes      = var.volumes
}
- Create the networking infrastructure. Create a file named `network.tf` and define the network, subnet, router, nat, firewall, nic and public ip resources using the `module.design.instances` map.
- Create the instance configurations. In `infrastructure.tf`, include the `common/instance_config` module and provide the required input parameters.
module "instance_config" {
source = "../common/instance_config"
...
}
- Create the instances. In `infrastructure.tf`, define the `instances` resource using `module.design.instances` for the instance attributes and `module.instance_config.user_data` for the initial configuration.
- Create the volumes. In `infrastructure.tf`, define the `volumes` resource using `module.design.volumes`.
- Attach the volumes. In `infrastructure.tf`, define the `attachments` resource using `module.design.volumes` and refer to the attribute `each.value.instance` to retrieve the id of the instance to which each volume needs to be attached.
- Consolidate the instances' information. In `infrastructure.tf`, define a local variable named `all_instances` that will be a map containing the following keys for each created instance: `id`, `public_ip`, `local_ip`, `tags`, and `hostkeys`, where `hostkeys` is also a map with a key named `rsa` that corresponds to the instance's RSA host key.
- Consolidate the volume device information. In `infrastructure.tf`, define a local variable named `volume_devices` implementing the following logic in HCL. Replace the line starting with `/dev/disk/by-id` with the proper logic that matches the volume resource to its device path from within the instance to which it is attached.
locals {
  volume_devices = {
    for ki, vi in var.volumes :
    ki => {
      for kj, vj in vi :
      kj => [for key, volume in module.design.volumes :
        "/dev/disk/by-id/*${substr(provider_volume.volumes["${volume["instance"]}-${ki}-${kj}"].id, 0, 20)}"
        if key == "${volume["instance"]}-${ki}-${kj}"
      ]
    }
  }
}
- Create the cluster configuration and upload. In `infrastructure.tf`, include the `common/cluster_config` module and provide the required input parameters.
The following illustrates these steps by evaluating Digital Ocean, Oracle Cloud and Alibaba Cloud, and then adding support for Alibaba Cloud.

- Identify the resources. For Digital Ocean, Oracle Cloud and Alibaba Cloud, we get the following resource mapping:

| Resource   | Digital Ocean                  | Oracle Cloud               | Alibaba Cloud              |
| ---------- | ------------------------------ | -------------------------- | -------------------------- |
| network    | digitalocean_vpc               | oci_core_vcn               | alicloud_vpc               |
| subnet     | built in vpc                   | oci_core_subnet            | alicloud_vswitch           |
| router     | n/a                            | oci_core_route_table       | built in vpc               |
| nat        | n/a                            | oci_core_internet_gateway  | alicloud_nat_gateway       |
| firewall   | digitalocean_firewall          | oci_core_security_list     | alicloud_security_group    |
| nic        | n/a                            | built in instance          | alicloud_network_interface |
| public ip  | digitalocean_floating_ip       | built in instance          | alicloud_eip               |
| instance   | digitalocean_droplet           | oci_core_instance          | alicloud_instance          |
| volume     | digitalocean_volume            | oci_core_volume            | alicloud_disk              |
| attachment | digitalocean_volume_attachment | oci_core_volume_attachment | alicloud_disk_attachment   |
- Check minimum requirements. In the preceding table, we can see that Digital Ocean does not have the ability to define a network interface. The documentation also leads us to conclude that it is not possible to define the private ip address of an instance before creating it. Because the Puppet server ip address is required before generating the cloud-init YAML config of all instances, including the Puppet server itself, this means it is impossible to use Digital Ocean to spawn a Magic Castle cluster.

Oracle Cloud presents the same issue; however, after reading the instance documentation, we find that it is possible to define a static ip address as a string in the instance attributes. It would therefore be possible to create a data structure in Terraform that associates each instance hostname with an ip address in the subnet CIDR, as sketched below.

Alibaba Cloud has an answer for each resource, so we will use this provider in the following steps.
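To illustrate the Oracle Cloud workaround mentioned above, here is a hypothetical sketch that maps each hostname to a deterministic address with Terraform's `cidrhost` function; the subnet CIDR value and the host number offset are arbitrary assumptions:

locals {
  # Hypothetical values: the subnet CIDR and the offset of 10 are assumptions.
  subnet_cidr = "10.0.0.0/24"
  host_ips = {
    for index, hostname in sort(keys(module.design.instances)) :
    hostname => cidrhost(local.subnet_cidr, index + 10)
  }
}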
- Initialize the provider folder. In a terminal:
git clone https://github.com/ComputeCanada/magic_castle.git
cd magic_castle
mkdir alicloud
cd alicloud
ln -s ../common/{variables,outputs}.tf .
- Define cloud provider specific variables. Add the following to a new file `alicloud.tf`:
variable "region" { }
locals {
cloud_provider = "alicloud"
cloud_region = var.region
}
- Initialize the infrastructure. Add the following to a new file `infrastructure.tf`:
provider "alicloud" {
region = var.region
}
module "design" {
source = "../common/design"
cluster_name = var.cluster_name
domain = var.domain
instances = var.instances
volumes = var.volumes
}
- Create the networking infrastructure. `network.tf` base template:
resource "alicloud_vpc" "network" { }
resource "alicloud_vswitch" "subnet" { }
resource "alicloud_nat_gateway" "nat" { }
resource "alicloud_security_group" "firewall" { }
resource "alicloud_security_group_rule" "allow_in_services" { }
resource "alicloud_security_group" "allow_any_inside_vpc" { }
resource "alicloud_security_group_rule" "allow_ingress_inside_vpc" { }
resource "alicloud_security_group_rule" "allow_egress_inside_vpc" { }
resource "alicloud_network_interface" "nic" { }
resource "alicloud_eip" "public_ip" { }
resource "alicloud_eip_association" "eip_asso" { }
locals {
puppetserver_ip = [
for x, values in module.design.instances : alicloud_network_interface.nic[x].private_ip
if contains(values.tags, "puppet")
]
}
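As a rough starting point, the first two resources of this template could be filled in as follows. This is only a sketch: the CIDR values are arbitrary, `var.availability_zone` is a hypothetical variable, and the argument names should be verified against the Alibaba Cloud provider documentation.

resource "alicloud_vpc" "network" {
  # Arbitrary private address space for the cluster network (assumption).
  cidr_block = "10.0.0.0/16"
}
resource "alicloud_vswitch" "subnet" {
  vpc_id     = alicloud_vpc.network.id
  cidr_block = "10.0.0.0/24"
  # Hypothetical variable; the argument is named zone_id in recent provider
  # versions and availability_zone in older ones.
  zone_id    = var.availability_zone
}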
- Create the instance configuration. Add the following to `infrastructure.tf`:
module "instance_config" {
source = "../common/instance_config"
instances = module.design.instances
config_git_url = var.config_git_url
config_version = var.config_version
puppetserver_ip = local.puppetserver_ip
sudoer_username = var.sudoer_username
public_keys = var.public_keys
generate_ssh_key = var.generate_ssh_key
}
- Create the instances. Add and complete the following snippet to `infrastructure.tf`:
resource "alicloud_instance" "instances" {
for_each = module.design.instances
}
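A plausible completion of this snippet is sketched below; the argument names (`image_id`, `instance_type`, `security_groups`, `vswitch_id`, `user_data`) and the possible need to base64-encode `user_data` are assumptions to verify against the Alibaba Cloud provider documentation.

resource "alicloud_instance" "instances" {
  for_each        = module.design.instances
  instance_name   = format("%s-%s", var.cluster_name, each.key)
  image_id        = var.image
  instance_type   = each.value.type
  security_groups = [alicloud_security_group.firewall.id]
  vswitch_id      = alicloud_vswitch.subnet.id
  # Assumption: the provider may expect user_data to be base64-encoded.
  user_data       = base64encode(module.instance_config.user_data[each.key])
}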
- Create the volumes. Add and complete the following snippet to `infrastructure.tf`:
resource "alicloud_disk" "volumes" {
for_each = module.design.volumes
}
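A plausible completion, assuming each disk only needs a size and a zone; `var.availability_zone` is the same hypothetical variable as above, and the zone argument may be named `availability_zone` in older provider versions.

resource "alicloud_disk" "volumes" {
  for_each = module.design.volumes
  size     = each.value.size
  # Hypothetical variable; must match the zone of the instances.
  zone_id  = var.availability_zone
}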
- Attach the volumes. Add and complete the following snippet to `infrastructure.tf`:
resource "alicloud_disk_attachment" "attachments" {
for_each = module.design.volumes
}
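A plausible completion, following the same pattern as the generic `attachment` resource shown earlier, assuming `instance_id` and `disk_id` are the attachment arguments:

resource "alicloud_disk_attachment" "attachments" {
  for_each    = module.design.volumes
  # each.value.instance identifies the host to which this volume belongs.
  instance_id = alicloud_instance.instances[each.value.instance].id
  disk_id     = alicloud_disk.volumes[each.key].id
}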
- Consolidate the instances' information. Add the following snippet to `infrastructure.tf`:
locals {
all_instances = { for x, values in module.design.instances :
x => {
public_ip = contains(values["tags"], "public") ? alicloud_eip.public_ip[x].public_ip : ""
local_ip = alicloud_network_interface.nic[x].private_ip
tags = values["tags"]
id = alicloud_instance.instances[x].id
hostkeys = {
rsa = module.instance_config.rsa_hostkeys[x]
}
}
}
}
- Consolidate the volume devices' information. Add the following snippet to `infrastructure.tf`:
locals {
  volume_devices = {
    for ki, vi in var.volumes :
    ki => {
      for kj, vj in vi :
      kj => [for key, volume in module.design.volumes :
        "/dev/disk/by-id/virtio-${replace(alicloud_disk.volumes["${volume["instance"]}-${ki}-${kj}"].id, "d-", "")}"
        if key == "${volume["instance"]}-${ki}-${kj}"
      ]
    }
  }
}
- Create the cluster configuration and upload. Add the following snippet to `infrastructure.tf`:
module "cluster_config" {
source = "../common/cluster_config"
instances = local.all_instances
nb_users = var.nb_users
hieradata = var.hieradata
software_stack = var.software_stack
cloud_provider = local.cloud_provider
cloud_region = local.cloud_region
sudoer_username = var.sudoer_username
guest_passwd = var.guest_passwd
domain_name = module.design.domain_name
cluster_name = var.cluster_name
volume_devices = local.volume_devices
private_ssh_key = module.instance_config.private_key
}
Once your new provider module is written, you can write an example that uses it to spawn a Magic Castle cluster with that provider.
module "alicloud" {
source = "./alicloud"
config_git_url = "https://github.com/ComputeCanada/puppet-magic_castle.git"
config_version = "main"
cluster_name = "new"
domain = "my.cloud"
image = "centos_7_9_x64_20G_alibase_20210318.vhd"
nb_users = 10
instances = {
mgmt = { type = "ecs.g6.large", tags = ["puppet", "mgmt", "nfs"] }
login = { type = "ecs.g6.large", tags = ["login", "public", "proxy"] }
node = { type = "ecs.g6.large", tags = ["node"], count = 1 }
}
volumes = {
nfs = {
home = { size = 10 }
project = { size = 50 }
scratch = { size = 50 }
}
}
public_keys = [file("~/.ssh/id_rsa.pub")]
# Alicloud specifics
region = "us-west-1"
}
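Once the cluster is deployed, the example can also expose the provider module's `public_instances` output, for example:

output "public_instances" {
  value = module.alicloud.public_instances
}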