As of OpenShift Container Platform 3.10, etcd is expected to run in static pods on the master nodes in the control plane. You may have deployed an HA cluster with dedicated etcd nodes managed with systemd. How do you migrate to this new architecture?
- You are running OCP 3.9
- You have multiple Master nodes
- You have dedicated Etcd nodes
- You are running RHEL, not Atomic nodes
- Backup etcd
- Scale up Etcd cluster to include Master nodes
- Configure OpenShift Masters to ignore the old Etcd nodes
- Scale down etcd cluster to remove old Etcd nodes
Follow along in this document: https://docs.openshift.com/container-platform/3.9/admin_guide/assembly_replace-etcd-member.html. You may find some etcd aliases handy before proceeding.
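As a rough sketch of what such aliases could look like, the wrappers below add peer authentication to the v2 and v3 `etcdctl` clients. The certificate paths are assumed defaults for an RPM-based etcd host, and the `ETCDCTL` override exists only so the wrappers can be dry-run; adjust both to your environment.

```shell
# Assumed certificate locations on an RPM-installed etcd host.
ETCD_CA=${ETCD_CA:-/etc/etcd/ca.crt}
ETCD_CERT=${ETCD_CERT:-/etc/etcd/peer.crt}
ETCD_KEY=${ETCD_KEY:-/etc/etcd/peer.key}
# Client binary; overridable so the wrappers can be dry-run with `echo`.
ETCDCTL=${ETCDCTL:-etcdctl}

# v2 API wrapper: $1 is the endpoint URL, the remaining arguments are the subcommand.
etcdctl2() {
    endpoint=$1; shift
    ETCDCTL_API=2 "$ETCDCTL" --ca-file "$ETCD_CA" --cert-file "$ETCD_CERT" \
        --key-file "$ETCD_KEY" --endpoints "$endpoint" "$@"
}

# v3 API wrapper with the same calling convention.
etcdctl3() {
    endpoint=$1; shift
    ETCDCTL_API=3 "$ETCDCTL" --cacert "$ETCD_CA" --cert "$ETCD_CERT" \
        --key "$ETCD_KEY" --endpoints "$endpoint" "$@"
}
```

For example, `etcdctl2 https://<etcd-host>:2379 cluster-health` or `etcdctl3 https://<etcd-host>:2379 member list`, where `<etcd-host>` is a placeholder for one of your members.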
Create a new_etcd ansible group in your inventory file.
Add the first Master node to this new_etcd group for testing.
Add the new_etcd group as a child to the
Confirm your cluster health on the first etcd server.
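One way to turn this health check into a pass/fail test is to look for the summary line that the v2 `cluster-health` command prints. The helper name below is hypothetical, and the certificate paths in the usage comment are assumed defaults.

```shell
# Reads `etcdctl cluster-health` (v2 API) output on stdin and succeeds only
# if the summary line reports a healthy cluster.
cluster_is_healthy() {
    grep -qx 'cluster is healthy'
}

# Example (placeholder endpoint, assumed certificate paths):
#   etcdctl --ca-file /etc/etcd/ca.crt --cert-file /etc/etcd/peer.crt \
#       --key-file /etc/etcd/peer.key --endpoints https://<first-etcd-host>:2379 \
#       cluster-health | cluster_is_healthy && echo "safe to proceed"
```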
- Create a backup of your etcd data and configuration.
- Run the etcd scaleup playbook
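The two steps above might be scripted roughly as follows. This is a sketch, not the exact commands used here: the configuration directory, the playbook location under `/usr/share/ansible/openshift-ansible`, and the function names are assumptions based on a default RPM install of openshift-ansible 3.9.

```shell
ETCDCTL=${ETCDCTL:-etcdctl}
ANSIBLE_PLAYBOOK=${ANSIBLE_PLAYBOOK:-ansible-playbook}
# Assumed defaults for an RPM-based OCP 3.9 install.
ETCD_CONF_DIR=${ETCD_CONF_DIR:-/etc/etcd}
PLAYBOOK_DIR=${PLAYBOOK_DIR:-/usr/share/ansible/openshift-ansible/playbooks}

# Copy the etcd configuration and take a v3 snapshot.
# $1: endpoint URL of the member to snapshot, $2: backup directory to create.
etcd_backup() {
    mkdir -p "$2" &&
    cp -a "$ETCD_CONF_DIR" "$2/etcd-config" &&
    ETCDCTL_API=3 "$ETCDCTL" --cacert "$ETCD_CONF_DIR/ca.crt" \
        --cert "$ETCD_CONF_DIR/peer.crt" --key "$ETCD_CONF_DIR/peer.key" \
        --endpoints "$1" snapshot save "$2/snapshot.db"
}

# Run the scaleup playbook against the inventory that defines new_etcd.
# $1: path to the inventory file.
etcd_scaleup() {
    "$ANSIBLE_PLAYBOOK" -i "$1" "$PLAYBOOK_DIR/openshift-etcd/scaleup.yml"
}
```

A run could then look like `etcd_backup https://<etcd-host>:2379 /backup/etcd-$(date +%Y%m%d)` followed by `etcd_scaleup /etc/ansible/hosts`, with both paths being placeholders.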
In my case I found etcd had been accidentally started by hand with a default config file which listened on localhost. The config file was modified by the etcd role and the restart etcd handler was notified, but it was skipped. This caused the etcd cluster status check task to timeout, and subsequent steps in the playbook to fail.
After restarting etcd at 18:43 the cluster reports as healthy, and I re-ran the playbook successfully.
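A quick way to spot this condition before running the playbook is to check which addresses the running etcd process is actually bound to, since the config file on disk may already look correct. The helper below is a hypothetical sketch that reads `ss -tln` output and assumes the default etcd client port 2379.

```shell
# Reads `ss -tln` output on stdin. Succeeds when port 2379 is being listened
# on, but only on a loopback address (the stale default-config symptom).
client_port_loopback_only() {
    awk '
        $1 == "LISTEN" && $4 ~ /:2379$/ {
            seen = 1
            if ($4 !~ /^(127\.0\.0\.1|\[::1\]):2379$/) other = 1
        }
        END { exit !(seen && !other) }
    '
}

# Example: restart etcd so it picks up the config written by the etcd role.
#   ss -tln | client_port_loopback_only && systemctl restart etcd
```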
After the playbook has run successfully, you can see that the master node has been added as an etcd endpoint in /etc/origin/master/master-config.yaml on every master node.
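To check this on each master without reading the whole file, a small helper can print the `urls` entries under `etcdClientInfo`. This is a sketch with a hypothetical function name; it assumes the usual block-style YAML layout of master-config.yaml.

```shell
# Print the etcd endpoint URLs listed under etcdClientInfo.urls.
# $1: path to master-config.yaml
list_etcd_urls() {
    awk '
        /^etcdClientInfo:/               { in_block = 1; next }
        in_block && /^[^ ]/              { in_block = 0 }
        in_block && /^ +urls:/           { in_urls = 1; next }
        in_block && in_urls && /^ *- /   { sub(/^ *- */, ""); print; next }
        in_urls                          { in_urls = 0 }
    ' "$1"
}

# Example:
#   list_etcd_urls /etc/origin/master/master-config.yaml
```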
This master is done. Move this first master from the new_etcd group to the etcd ansible group. Leave it in any other groups it is already a member of, of course.
Remove the ose-test-etcd-03 node from the etcd ansible group.
Update master-config.yaml to include only the hosts remaining in the etcd ansible group and restart the API service.
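If you prefer to make this change by hand rather than through Ansible, the edit amounts to deleting one `urls` entry and restarting the API. The helper below is a hypothetical sketch using `sed`, not the method used in this post; it assumes entries of the form `- https://<host>:2379` and, in the usage comment, the `atomic-openshift-master-api` service name.

```shell
# Delete the urls entry for one etcd host from master-config.yaml, keeping a
# .bak copy of the original file.
# $1: hostname to drop, $2: path to master-config.yaml
drop_etcd_url() {
    sed -i.bak '\#^ *- https://'"$1"':2379$#d' "$2"
}

# Example, to be repeated on every master:
#   drop_etcd_url ose-test-etcd-03.example.com /etc/origin/master/master-config.yaml
#   systemctl restart atomic-openshift-master-api
```

The domain in the example is a placeholder. Dots in the hostname are treated as regex wildcards, which is harmless for an anchored match like this one.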
Verify OpenShift operation
Remove the ose-test-etcd-03 node from the etcd cluster.
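Removing a member requires its ID, which can be looked up from `member list`. The helper below is a sketch that parses the comma-separated v3 `member list` output; it assumes the member name matches the node's hostname, and the function name is hypothetical.

```shell
# Reads `etcdctl member list` (v3 API) output on stdin and prints the ID of
# the member whose name equals $1.
member_id_for() {
    awk -F', ' -v name="$1" '$3 == name { print $1 }'
}

# Example (placeholder endpoint, assumed certificate paths):
#   auth="--cacert /etc/etcd/ca.crt --cert /etc/etcd/peer.crt --key /etc/etcd/peer.key"
#   id=$(ETCDCTL_API=3 etcdctl $auth --endpoints https://<etcd-host>:2379 member list \
#        | member_id_for ose-test-etcd-03)
#   ETCDCTL_API=3 etcdctl $auth --endpoints https://<etcd-host>:2379 member remove "$id"
```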
- Repeat for Masters 2 and 3 and etcd nodes 2 and 1.
You are now one step closer to OpenShift 3.10.
At this point etcd should be running only on the 3 Master nodes and not on the old Etcd nodes. All the masters should know this, and you are one step closer to being able to upgrade to OpenShift 3.10.
- As I mentioned I had accidentally started etcd with a default config and the scaleup playbook did not expect this condition.
- I scaled up 2 masters as etcd nodes, which got etcd 3.3.11 installed. When I went to scale up the 3rd master soon after, the newest etcd RPM was suddenly 3.2.22, which is incompatible. In fact, OpenShift is not certified to work with etcd 3.3. Etcd 3.3 should be excluded in yum.conf, but it is not (BZ 1672518). This KB points out that a 3.2 etcd container image also got a 3.3 etcd binary in it: “ETCD hosts were upgraded to version 3.3.11.” Here is what I did.
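Independently of that fix, a pre-flight check along these lines could catch the version mismatch before the scaleup playbook runs. It is a sketch and not part of the original procedure; it assumes that only the 3.2.x series is acceptable, as the text above implies, and the function name is hypothetical.

```shell
# Succeeds only for etcd versions in the 3.2.x series.
# $1: version string, e.g. the output of rpm's %{VERSION} query
etcd_version_supported() {
    case "$1" in
        3.2.*) return 0 ;;
        *)     return 1 ;;
    esac
}

# Example:
#   etcd_version_supported "$(rpm -q --queryformat '%{VERSION}' etcd)" \
#       || echo "unexpected etcd version installed"
```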
- Backup and Restore in OpenShift Container Platform 3
- Replacing a failed etcd member
- Known Issues when upgrading to OpenShift 3.10
- Role for configuring master config
- ETCD hosts were upgraded to version 3.3.11.
- Etcd 3.3 should be excluded but it is not BZ 1672518