June 3, 2021
The Windows Machine Config Operator (WMCO) builds and configures Windows machines to act as nodes in an OpenShift cluster, enabling cross-platform workloads. This post will demonstrate the addition of a Windows node to an existing cluster and explore the integration of Windows and Kubernetes.
📝 This is part 2 of a 3-part series on OpenShift support for Windows containers. Parts: 1, 2, 3
Enabling Windows Workloads
Enabling support for Windows containers on OpenShift is a “day 2” operation. Before we can proceed we must ensure we’ve met the “day 1” networking prerequisites covered in part 1 of this series. Be sure to start there.
With that out of the way we can begin understanding Windows containers support in OpenShift. Unlike the Linux nodes, which use the CRI-O runtime, Windows nodes continue to use the Docker runtime (until containerd is adopted). It is also important to note that unlike the automated over-the-air updates for CoreOS nodes, the Windows operating system is not automatically patched or upgraded.
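If you want to see which runtime each node reports, the wide output of oc get nodes includes a CONTAINER-RUNTIME column, showing cri-o for the existing Linux nodes and docker for the Windows node once it joins. A quick check:
# the CONTAINER-RUNTIME column distinguishes cri-o:// from docker://
$ oc get nodes -o wide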
Understanding Windows Container Images
Containers on Windows are less portable than on Linux. It is critical that the same OS version is used on the node and in the container image. This presents a challenge when containerizing applications and for patching nodes.
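Because the node and container image OS versions must line up, it helps to confirm the Windows build a node is running before choosing a base image tag. The kubelet exposes this through the node.kubernetes.io/windows-build label, so once a Windows node has joined (later in this post) a quick check might look like:
# shows the Windows build (e.g. 10.0.17763 for Server 2019) reported by each Windows node
$ oc get nodes -l kubernetes.io/os=windows -L node.kubernetes.io/windows-build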
While the WMCO will configure a Windows node with the kubelet, kube-proxy, and container runtime plumbing enabling it to join the Kubernetes cluster, the upgrading or patching of the node is not automated. The process of building and testing the machine images to address Windows patches will remain your responsibility.
This playbook for automating the creation of a Windows image offers a starting point for constructing a pipeline to build and test updated Windows images for use on nodes.
Reviewing the Day 1 Cluster
After provisioning a cluster prepared for Windows support, let’s examine it. From this starting point we have only Linux-based nodes and the cluster is deployed to Azure. Additionally, all the cluster operators are healthy and no extra operators have been installed yet.
$ oc get nodes -L kubernetes.io/os
NAME STATUS ROLES AGE VERSION OS
win-tmk9g-master-0 Ready master 60m v1.20.0+7d0a2b2 linux
win-tmk9g-master-1 Ready master 60m v1.20.0+7d0a2b2 linux
win-tmk9g-master-2 Ready master 60m v1.20.0+7d0a2b2 linux
win-tmk9g-worker-westus-wq4hg Ready worker 49m v1.20.0+7d0a2b2 linux
win-tmk9g-worker-westus-x4gc6 Ready worker 49m v1.20.0+7d0a2b2 linux
$ oc describe infrastructure | grep -A 12 '^Status'
Status:
API Server Internal URI: https://api-int.win.az.tofu.org:6443
API Server URL: https://api.win.az.tofu.org:6443
Etcd Discovery Domain:
Infrastructure Name: win-tmk9g
Platform: Azure
Platform Status:
Azure:
Cloud Name: AzurePublicCloud
Network Resource Group Name: win-tmk9g-rg
Resource Group Name: win-tmk9g-rg
Type: Azure
Events: <none>
$ oc get clusteroperators
NAME VERSION AVAILABLE PROGRESSING DEGRADED SINCE
authentication 4.7.8 True False False 37m
baremetal 4.7.8 True False False 58m
cloud-credential 4.7.8 True False False 62m
cluster-autoscaler 4.7.8 True False False 57m
config-operator 4.7.8 True False False 58m
console 4.7.8 True False False 43m
csi-snapshot-controller 4.7.8 True False False 57m
dns 4.7.8 True False False 56m
etcd 4.7.8 True False False 56m
image-registry 4.7.8 True False False 48m
ingress 4.7.8 True False False 48m
insights 4.7.8 True False False 50m
kube-apiserver 4.7.8 True False False 54m
kube-controller-manager 4.7.8 True False False 55m
kube-scheduler 4.7.8 True False False 55m
kube-storage-version-migrator 4.7.8 True False False 47m
machine-api 4.7.8 True False False 46m
machine-approver 4.7.8 True False False 57m
machine-config 4.7.8 True False False 57m
marketplace 4.7.8 True False False 56m
monitoring 4.7.8 True False False 46m
network 4.7.8 True False False 58m
node-tuning 4.7.8 True False False 57m
openshift-apiserver 4.7.8 True False False 50m
openshift-controller-manager 4.7.8 True False False 55m
openshift-samples 4.7.8 True False False 49m
operator-lifecycle-manager 4.7.8 True False False 57m
operator-lifecycle-manager-catalog 4.7.8 True False False 57m
operator-lifecycle-manager-packageserver 4.7.8 True False False 50m
service-ca 4.7.8 True False False 58m
storage 4.7.8 True False False 58m
$ oc get operators --all-namespaces
No resources found
Installing the Windows Machine Config Operator
OpenShift uses operators to create and manage the nodes in a cluster, along with the management of cluster services. Check out my post on Understanding Over the Air Updates for some background.
The most relevant of these cluster operators are the Machine API and Machine Config operators, which facilitate the creation of machines using the cloud provider API and the operating system configuration needed to turn those machines into cluster nodes. However, enabling Windows containers requires installation of an additional Windows Machine Config Operator.
Installing an operator on the CLI typically requires creation of a Namespace, an OperatorGroup, and a Subscription resource.
- Create the Namespace - clusters/az-win/day2/base/namespace.yaml
$ cat <<EOF | oc create -f -
apiVersion: v1
kind: Namespace
metadata:
name: openshift-windows-machine-config-operator
labels:
openshift.io/cluster-monitoring: "true"
EOF
- Create the OperatorGroup - clusters/az-win/day2/base/operatorgroup.yaml
$ cat <<EOF | oc create -f -
apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
name: windows-machine-config-operator
namespace: openshift-windows-machine-config-operator
spec:
targetNamespaces:
- openshift-windows-machine-config-operator
EOF
- Create the Subscription - clusters/az-win/day2/base/subscription.yaml
$ cat <<EOF | oc create -f -
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
name: windows-machine-config-operator
namespace: openshift-windows-machine-config-operator
spec:
channel: "stable"
installPlanApproval: "Automatic"
name: "windows-machine-config-operator"
source: "redhat-operators"
sourceNamespace: "openshift-marketplace"
EOF
After a few seconds, check that the ClusterServiceVersion has been created. This is used to tell the Operator Lifecycle Manager how to install the operator.
$ oc get csv -n openshift-windows-machine-config-operator
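To confirm the install actually completed, rather than just that a CSV exists, you can also check the CSV phase; for example:
# the PHASE column should eventually report Succeeded
$ oc get csv -n openshift-windows-machine-config-operator \
    -o custom-columns=NAME:.metadata.name,PHASE:.status.phase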
📝 Do you GitOps?
Assuming the machineset.yaml has been configured as described, and assuming a layout like this, we can use a more GitOps-compatible flow for the WMCO installation. A Kustomize template can be applied with the oc CLI's -k flag, as sketched below.
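As a minimal sketch (assuming the file layout used in this post, with all the manifests under clusters/az-win/day2/base), a kustomization.yaml referencing the resources above can be applied in one shot:
# hypothetical clusters/az-win/day2/base/kustomization.yaml
$ cat <<EOF > clusters/az-win/day2/base/kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - namespace.yaml
  - operatorgroup.yaml
  - subscription.yaml
  - machineset.yaml
EOF
# apply the whole directory with the -k flag
$ oc apply -k clusters/az-win/day2/base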
Adding a Windows Node
Now that the WMCO factory is in place we can provide it the raw materials to build a Windows node.
Creating a Windows MachineSet
The Machine API operator uses a MachineSet resource to understand exactly how to build a machine and how many to build. There is already a MachineSet for the Linux workers, so we will create an Azure Windows MachineSet to enable our Windows machines to be built.
📝 Start with 0 replicas
It is important to create the MachineSet with 0 replicas for now as we fulfill some further prerequisites.
Here is an example with placeholder values. Further details on the values to replace in this example MachineSet are discussed in the WMCO GitHub repo.
clusters/az-win/day2/base/machineset.yaml
# example MachineSet before replacing "<values>"
apiVersion: machine.openshift.io/v1beta1
kind: MachineSet
metadata:
labels:
machine.openshift.io/cluster-api-cluster: <infrastructureID>
name: winworker
namespace: openshift-machine-api
spec:
replicas: 0
selector:
matchLabels:
machine.openshift.io/cluster-api-cluster: <infrastructureID>
machine.openshift.io/cluster-api-machineset: winworker
template:
metadata:
labels:
machine.openshift.io/cluster-api-cluster: <infrastructureID>
machine.openshift.io/cluster-api-machine-role: worker
machine.openshift.io/cluster-api-machine-type: worker
machine.openshift.io/cluster-api-machineset: winworker
machine.openshift.io/os-id: Windows
spec:
metadata:
labels:
node-role.kubernetes.io/worker: ""
providerSpec:
value:
apiVersion: azureproviderconfig.openshift.io/v1beta1
credentialsSecret:
name: azure-cloud-credentials
namespace: openshift-machine-api
image:
offer: WindowsServer
publisher: MicrosoftWindowsServer
resourceID: ""
sku: 2019-Datacenter-with-Containers
version: latest
kind: AzureMachineProviderSpec
location: <location>
managedIdentity: <infrastructureID>-identity
networkResourceGroup: <infrastructureID>-rg
osDisk:
diskSizeGB: 128
managedDisk:
storageAccountType: Premium_LRS
osType: Windows
publicIP: false
resourceGroup: <infrastructureID>-rg
subnet: <infrastructureID>-worker-subnet
userDataSecret:
name: windows-user-data
namespace: openshift-machine-api
vmSize: Standard_D2s_v3
vnet: <infrastructureID>-vnet
zone: "<zone>"
Every cluster has an infrastructure name that is a combination of the cluster name and a unique string. We will need to use this value in the Windows MachineSet.
- Capture the infrastructure ID
$ export CLUSTER_ID=$(oc get -o jsonpath='{.status.infrastructureName}{"\n"}' infrastructure cluster)
$ echo $CLUSTER_ID
win-tmk9g
- Create the above MachineSet after updating the placeholder values (there are no availability zones in the westus region)
$ sed \
-e "s/<infrastructureID>/$CLUSTER_ID/" \
-e "s/<location>/westus/" \
-e "s/<zone>//" \
-i.bak \
$CLUSTER_DIR/day2/base/machineset.yaml
$ oc apply -n openshift-machine-api -f $CLUSTER_DIR/day2/base/machineset.yaml
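Before providing the SSH key, it can be reassuring to verify that the MachineSet was created and is still sitting at zero replicas:
# DESIRED and CURRENT should both report 0 at this point
$ oc get machineset winworker -n openshift-machine-api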
Providing an SSH Private Key
In part 1 of this series, we generated an SSH key for installing OpenShift. Now we will give this same key to the WMCO for use in configuring our Windows node. From this private key, the operator will mint a new public key and install it on the node via user data.
- Create a secret containing the private key that will be used to access the Windows VMs
$ oc create secret generic cloud-private-key \
--from-file=private-key.pem=${HOME}/.ssh/az-win \
-n openshift-windows-machine-config-operator
After the secret is created, the WMCO will generate a public key and create a windows-user-data secret for use by the openshift-machine-api when provisioning the machine.
$ oc logs -n openshift-windows-machine-config-operator \
deployment/windows-machine-config-operator | tail -1
2021-05-24T18:26:13.851Z INFO secret_controller secret not found, creating the secret {"namespace": "openshift-windows-machine-config-operator", "name": "cloud-private-key", "name": "windows-user-data"}
You can extract the secret to view the contents. Notice the authorized key file modification.
$ oc extract secret/windows-user-data -n openshift-machine-api --to=-
# userData
<powershell>
Add-WindowsCapability -Online -Name OpenSSH.Server~~~~0.0.1.0
$firewallRuleName = "ContainerLogsPort"
$containerLogsPort = "10250"
New-NetFirewallRule -DisplayName $firewallRuleName -Direction Inbound -Action Allow -Protocol TCP -LocalPort $containerLogsPort -EdgeTraversalPolicy Allow
Set-Service -Name sshd -StartupType 'Automatic'
Start-Service sshd
$pubKeyConf = (Get-Content -path C:\ProgramData\ssh\sshd_config) -replace '#PubkeyAuthentication yes','PubkeyAuthentication yes'
$pubKeyConf | Set-Content -Path C:\ProgramData\ssh\sshd_config
$passwordConf = (Get-Content -path C:\ProgramData\ssh\sshd_config) -replace '#PasswordAuthentication yes','PasswordAuthentication yes'
$passwordConf | Set-Content -Path C:\ProgramData\ssh\sshd_config
$authorizedKeyFilePath = "$env:ProgramData\ssh\administrators_authorized_keys"
New-Item -Force $authorizedKeyFilePath
echo "ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIJhagYxgTRdyNVUU8w+WwISvm2Syq6Wu+GV0nz/XoP99
"| Out-File $authorizedKeyFilePath -Encoding ascii
$acl = Get-Acl C:\ProgramData\ssh\administrators_authorized_keys
$acl.SetAccessRuleProtection($true, $false)
$administratorsRule = New-Object system.security.accesscontrol.filesystemaccessrule("Administrators","FullControl","Allow")
$systemRule = New-Object system.security.accesscontrol.filesystemaccessrule("SYSTEM","FullControl","Allow")
$acl.SetAccessRule($administratorsRule)
$acl.SetAccessRule($systemRule)
$acl | Set-Acl
Restart-Service sshd
</powershell>
<persist>true</persist>
📝 Beware of failures creating the windows-user-data secret
If you see an error like: “failed to create vm winworker-29pjk: failed to get custom script data: error getting user data secret windows-user-data in namespace openshift-machine-api: Secret "windows-user-data" not found”
Check the operator logs. Hint: Did your SSH key have a passphrase?
$ oc logs -f -n openshift-windows-machine-config-operator \
    deployment/windows-machine-config-operator
$ oc get -n openshift-machine-api secret/windows-user-data
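If the key does have a passphrase, one possible remedy is to generate a passphrase-less key and recreate the secret with it; the az-win-nopass filename below is just a placeholder:
# generate a new key with an empty passphrase (placeholder filename)
$ ssh-keygen -t ed25519 -N '' -f ${HOME}/.ssh/az-win-nopass
# replace the cloud-private-key secret with the new key
$ oc delete secret cloud-private-key -n openshift-windows-machine-config-operator
$ oc create secret generic cloud-private-key \
    --from-file=private-key.pem=${HOME}/.ssh/az-win-nopass \
    -n openshift-windows-machine-config-operator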
Scaling up the Windows MachineSet
We created the MachineSet with zero replicas precisely because the windows-user-data secret did not yet exist. Now that it does, we can scale up and create our Windows machine.
- Scale up the MachineSet
$ oc scale machineset winworker -n openshift-machine-api --replicas=1
Eventually there will be a Windows machine, and it will become a node. We can use labels to identify the operating systems.
$ oc get machines -n openshift-machine-api -L machine.openshift.io/os-id
NAME PHASE TYPE REGION ZONE AGE OS-ID
win-77226-master-0 Running Standard_D8s_v3 westus 3h6m
win-77226-master-1 Running Standard_D8s_v3 westus 3h6m
win-77226-master-2 Running Standard_D8s_v3 westus 3h6m
win-77226-worker-westus-g55j4 Running Standard_D2s_v3 westus 3h
win-77226-worker-westus-n2kwj Running Standard_D2s_v3 westus 3h
win-77226-worker-westus-wv8ql Running Standard_D2s_v3 westus 3h
winworker-74qw4 Running Standard_D2s_v3 westus 76m Windows
$ oc get nodes -L kubernetes.io/os
NAME STATUS ROLES AGE VERSION OS
win-77226-master-0 Ready master 3h3m v1.20.0+bafe72f linux
win-77226-master-1 Ready master 3h3m v1.20.0+bafe72f linux
win-77226-master-2 Ready master 3h3m v1.20.0+bafe72f linux
win-77226-worker-westus-g55j4 Ready worker 172m v1.20.0+bafe72f linux
win-77226-worker-westus-n2kwj Ready worker 172m v1.20.0+bafe72f linux
win-77226-worker-westus-wv8ql Ready worker 172m v1.20.0+bafe72f linux
winworker-74qw4 Ready worker 61m v1.20.0-1030+cac2421340a449 windows
📺 Watch Demo: Installing WMCO with Kustomize and Deploying a Windows Node
Accessing the Windows Node via SSH
Our Windows node does not necessarily have a graphical interface, so how do we connect to it? SSH, of course, but that requires a bastion that can reach it. We will use a pod for this.
Deploying a Bastion Pod as an SSH Client
- Create a Deployment to launch the bastion pod, courtesy of Christian Hernandez’s Windows Containers Quickstart Workshop - clusters/az-win/day2/base/deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
creationTimestamp: null
labels:
app: winc-ssh
name: winc-ssh
namespace: openshift-windows-machine-config-operator
spec:
replicas: 1
selector:
matchLabels:
app: winc-ssh
strategy: {}
template:
metadata:
creationTimestamp: null
labels:
app: winc-ssh
spec:
containers:
- command:
- /bin/bash
- -c
- |
sleep infinity
image: quay.io/redhatworkshops/winc-ssh:latest
name: winc-ssh-container
resources: {}
volumeMounts:
- mountPath: /tmp/ssh
name: sshkey
readOnly: true
volumes:
- name: sshkey
secret:
defaultMode: 256
secretName: cloud-private-key
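Assuming the manifest is saved at the path above, it can be applied like the other day-2 resources (or added to the kustomization from earlier):
$ oc apply -f clusters/az-win/day2/base/deployment.yaml
$ oc get pods -n openshift-windows-machine-config-operator -l app=winc-ssh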
📝 Pending Pod
If you notice that the winc-ssh pod is stuck in the 'Pending' state, it is likely because the pod needs to mount a secret that has not been created yet.
After the pod has launched we can oc rsh to the pod and then ssh to the IP address of the Windows node. It is worth mentioning that the Azure Windows image will expect you to log in as user capi rather than administrator.
$ oc get nodes -l kubernetes.io/os=windows -o yaml | yq e '.items[].status | .addresses' -
- address: winworker-g7tsk
type: Hostname
- address: 10.0.32.6
type: InternalIP
$ oc rsh -n openshift-windows-machine-config-operator deployment/winc-ssh
sh-4.4$ export WIN_NODE=10.0.32.6
sh-4.4$ ssh -i /tmp/ssh/private-key.pem capi@$WIN_NODE
📺 Watch Demo: SSH to Windows node
Once logged in, you can interact with containers using the legacy Docker runtime.
PS C:\Users\capi> docker network ls
NETWORK ID NAME DRIVER SCOPE
16b1dbbcbc00 nat nat local
f9ad45f65cb7 none null local
PS C:\Users\capi> docker network inspect nat
[ {
"Name": "nat",
"Id": "16b1dbbcbc009975abee4bea88378cf8f3ab4062c70f31c8fcf49d1057448cd8",
"Created": "2021-05-14T23:51:33.3446158Z",
"Scope": "local",
"Driver": "nat",
"EnableIPv6": false,
"IPAM": {
"Driver": "windows",
"Options": null,
"Config": [ {
"Subnet": "172.17.64.0/20",
"Gateway": "172.17.64.1"
} ]
},
"Internal": false,
"Attachable": false,
"Ingress": false,
"ConfigFrom": {
"Network": ""
},
"ConfigOnly": false,
"Containers": {},
"Options": {
"com.docker.network.windowsshim.hnsid": "FA662A2F-C423-41AF-90EC-26E71FB35871",
"com.docker.network.windowsshim.networkname": "nat"
},
"Labels": {}
} ]
Now that we have docked, all we need is a containerized Windows application!
Summary
OpenShift enables cloud-native workflows for diverse workloads. It improves automation, resilience, and scalability while boosting developer productivity, even for legacy applications. With support for Windows nodes, everyone is invited to dock with the cluster!
After deploying OpenShift to Azure and adding a Windows node using the WindowsMachineConfigOperator, we are ready to deploy a cross-platform application. Stay tuned for part 3!
References
- Understanding Windows Containers Support on OpenShift
- Enabling Windows container workloads on OpenShift
- Windows Machine Config Operator Project
- Creating a Windows MachineSet for Azure
- Azure MachineSet docs from WMCO Project
- Windows container networking
- Playbook for Creating Windows Container Images
- Video Demo: Installing a Windows Node on OpenShift
- Windows Containers Quickstart Workshop
- Machine-API Operator
- Machine-Config Operator