OpenShift on Azure with Terraform!
OpenShift Reference Architecture implementation on Azure using Terraform.
Make sure you have Terraform (v0.11.x) in your PATH.
Log in to Azure using the Azure CLI:
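A quick way to confirm the version before running anything is to look at the first line printed by terraform version. The sketch below assumes that line has the form "Terraform v0.11.x"; the helper name is_terraform_0_11 is illustrative and not part of this repo.

```shell
# Hypothetical helper (not part of this repo): succeeds only for 0.11.x.
# Accepts a string such as "Terraform v0.11.14", "v0.11.14" or "0.11.14".
is_terraform_0_11() {
  case "$1" in
    "Terraform v0.11."*|v0.11.*|0.11.*) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage (requires terraform in PATH):
#   is_terraform_0_11 "$(terraform version | head -n 1)" \
#     || echo "Terraform 0.11.x is required" >&2
```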
az login
To generate a certificate for the OpenShift cluster, use the certs module. Configure the cert.tfvars file as needed.
To generate the certificate using ACME, run:
cd certs
terraform apply -var-file=cert.tfvars
To get the certificate values, run:
terraform output public_certificate_pem
terraform output public_certificate_key
terraform output public_certificate_intermediate_pem
Once the certificates are generated, you can use them in either the terraform-ocp.tfvars or the terraform-okd.tfvars file, as needed.
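To avoid copying the three values by hand, the output commands can be wrapped in a small helper that writes each value to its own file. This is only a sketch: the function name, the .txt file names, and the TERRAFORM_BIN override are assumptions made for illustration, not part of this repo.

```shell
# Hypothetical helper (not part of this repo): writes each certificate output
# to its own file in the target directory. Run it from the certs directory.
# TERRAFORM_BIN is an assumed override that defaults to "terraform".
save_cert_outputs() {
  out_dir="${1:-.}"
  mkdir -p "$out_dir" || return 1
  for name in public_certificate_pem public_certificate_key \
              public_certificate_intermediate_pem; do
    "${TERRAFORM_BIN:-terraform}" output "$name" > "$out_dir/$name.txt" || return 1
  done
  # The key output may hold private key material, so restrict file access.
  chmod 600 "$out_dir/public_certificate_key.txt"
}

# Usage:
#   cd certs && save_cert_outputs ./generated
```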
Create a service principal that allows Terraform to create resources on your behalf in Azure:
az ad sp create-for-rbac -n {PRINCIPAL_NAME} --password {PASSWORD} --role contributor --scopes /subscriptions/{subscription-id}
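The {subscription-id} part of the scope can be looked up with the Azure CLI instead of being typed by hand. The sketch below builds the scope string from a subscription ID; the helper name subscription_scope is hypothetical, and the usage line assumes you are already logged in with az login.

```shell
# Hypothetical helper (not part of this repo): prints the --scopes value
# for a given subscription ID.
subscription_scope() {
  [ -n "$1" ] || { echo "usage: subscription_scope SUBSCRIPTION_ID" >&2; return 2; }
  printf '/subscriptions/%s\n' "$1"
}

# Usage (requires a logged-in Azure CLI):
#   subscription_scope "$(az account show --query id --output tsv)"
```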
You can tweak the OpenShift inventory file. It is rendered, copied, and executed on the server using inventory.tf.
To configure OKD, modify the variables in openshift/terraform-okd.tfvars: leave the empty variables as they are, replace the values written in capital letters, and apply:
cd openshift
terraform apply -var-file=terraform-okd.tfvars
To configure OCP, modify the variables in openshift/terraform-ocp.tfvars, replace the values written in capital letters, and apply:
cd openshift
terraform apply -var-file=terraform-ocp.tfvars
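Before applying, it can help to check that no capital-letter placeholder is left in the tfvars file. The following is a minimal sketch, assuming placeholders are double-quoted values made up only of A-Z, 0-9 and underscores (for example "CLIENT_ID"); the helper name find_placeholders is hypothetical, and a legitimately all-caps value would also be flagged.

```shell
# Hypothetical pre-flight check (not part of this repo): lists the lines of a
# tfvars file whose value still looks like an all-caps placeholder.
# Exits non-zero when no such line is found (standard grep behaviour).
find_placeholders() {
  grep -nE '=[[:space:]]*"[A-Z][A-Z0-9_]*"[[:space:]]*$' "$1"
}

# Usage, from the openshift directory:
#   if find_placeholders terraform-okd.tfvars; then
#     echo "Replace the values listed above before applying" >&2
#   fi
```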
When finished, you will get the public IPs for the Bastion host and for both the External Load Balancer and the Router Load Balancer.
To SSH into the Bastion host, use the key in the keys folder:
ssh -i keys/bastion.key cloud-user@BASTION_IP
The oc command is already configured on the Bastion host.
You can also reach the other servers from the Bastion host. For example, to access the master1 server, run the following on the Bastion host:
ssh -i openshift.key [email protected]
To scale up the OpenShift stack, set the scale_up variable to true and add the configuration of the new nodes under OSEv3.children.new_nodes.hosts in openshift/provision/template-inventory.yaml, e.g.:
OSEv3:
  children:
    new_nodes:
      hosts:
        infra2.openshift.local:
          openshift_node_group_name: node-config-infra
Then run terraform apply.
If you change a config file in the openshift/provision folder and need to re-apply the config on the stack, it will most likely not be triggered automatically. This is by design, to avoid automatic deployment, e.g. in case of a scale-up. To re-apply the config on the server, use the terraform taint command. For example, if you have changed the inventory file and want to re-run the deploy cluster script, first run:
terraform taint null_resource.main
and then run terraform apply.
Terraform currently has an issue with resources that depend on entire modules. The workaround is to re-apply the resource that failed (manually tainting resources if needed).
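The taint-then-apply sequence can be wrapped in one function so that the two steps always run together. This is a sketch only: the function name reapply_resource and the TERRAFORM_BIN override are assumptions for illustration, not part of this repo.

```shell
# Hypothetical wrapper (not part of this repo): taints one resource and then
# applies, so the provisioners attached to that resource run again.
# TERRAFORM_BIN is an assumed override that defaults to "terraform".
reapply_resource() {
  [ -n "$1" ] || { echo "usage: reapply_resource RESOURCE [apply args...]" >&2; return 2; }
  resource="$1"
  shift
  # Apply only if the taint succeeded; extra arguments go to "apply".
  "${TERRAFORM_BIN:-terraform}" taint "$resource" &&
    "${TERRAFORM_BIN:-terraform}" apply "$@"
}

# Usage, from the openshift directory:
#   reapply_resource null_resource.main -var-file=terraform-okd.tfvars
```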
On certain VM images, the OpenShift SDN pods may not start, which prevents the nodes from becoming ready. This happens when the network interface is not allowed to be managed by NetworkManager. To confirm, read the file /etc/sysconfig/network-scripts/ifcfg-eth0 and make sure that NM_CONTROLLED is set to yes. To automate this, you can add the following task to your standard Ansible node config:
- name: Allow network to be controlled by Network Manager
  lineinfile:
    dest: /etc/sysconfig/network-scripts/ifcfg-eth0
    regexp: '^NM_CONTROLLED=no$'
    line: 'NM_CONTROLLED=yes'
    backrefs: yes
For this repo, it is already added to openshift/provision/node-config-playbook.yaml.
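To check a node by hand, the NM_CONTROLLED setting can be tested with a one-line grep. The sketch below assumes the value appears as NM_CONTROLLED=yes, with or without double quotes; the helper name nm_controlled_ok is hypothetical.

```shell
# Hypothetical check (not part of this repo): succeeds when the interface
# config file explicitly sets NM_CONTROLLED to yes.
# Defaults to the eth0 config path mentioned above.
nm_controlled_ok() {
  grep -qE '^NM_CONTROLLED="?yes"?$' "${1:-/etc/sysconfig/network-scripts/ifcfg-eth0}"
}

# Usage on a node:
#   nm_controlled_ok || echo "eth0 is not managed by NetworkManager" >&2
```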
null_resource.bastion_config (remote-exec): Connected!
null_resource.bastion_config (remote-exec): Loaded plugins: langpacks, product-id,
null_resource.bastion_config (remote-exec): : search-disabled-repos,
null_resource.bastion_config (remote-exec): : subscription-manager
null_resource.bastion_config (remote-exec): This system is registered with an entitlement server, but is not receiving updates. You can use subscription-manager to assign subscriptions.
null_resource.bastion_config (remote-exec): No package ansible available.
null_resource.bastion_config (remote-exec): Error: Nothing to do
null_resource.bastion_config (remote-exec): /home/cloud-user/bastion-config.sh: line 4: ansible-playbook: command not found
Error: Error applying plan:
1 error(s) occurred:
* null_resource.bastion_config: error executing "/tmp/terraform_1606393123.sh": Process exited with status 127
This issue was caused by existing subscriptions that prevented some packages from being installed. It was resolved by removing the existing subscriptions.
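If you save the Terraform output to a file, the symptom can be detected with a simple search. The helper below is a sketch with a hypothetical name. The subscription-manager commands in the trailing comments are one possible way to inspect and remove subscriptions on the Bastion host; they are an assumption, since the exact commands used for the fix are not recorded here.

```shell
# Hypothetical helper (not part of this repo): succeeds when a saved
# Terraform log contains the "No package ansible available" symptom.
log_shows_missing_ansible() {
  grep -q 'No package ansible available' "$1"
}

# Possible follow-up on the Bastion host (an assumption, see above):
#   sudo subscription-manager list --consumed
#   sudo subscription-manager remove --all
```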