OpenShift 3.6 Upgrade Metrics Fails Missing heapster-certs Secret

October 13, 2017

After your upgrade to OpenShift v3.6, did the deployment of cluster metrics wind up with empty graphs? If so, check whether the heapster pod failed to start because of a missing secret called heapster-certs in the openshift-infra namespace.


Heapster pod is failing to start

$ oc get pods
NAME                         READY     STATUS              RESTARTS   AGE
hawkular-cassandra-1-l1f3s   1/1       Running             0          9m
hawkular-metrics-rdl07       1/1       Running             0          9m
heapster-cfpcj               0/1       ContainerCreating   0          3m

Check which volumes it is attempting to mount

$ oc volume rc/heapster
  secret/heapster-secrets as heapster-secrets
    mounted at /secrets
  secret/hawkular-metrics-account as hawkular-metrics-account
    mounted at /hawkular-account
  secret/hawkular-metrics-certs as hawkular-metrics-certs
    mounted at /hawkular-metrics-certs
  secret/heapster-certs as heapster-certs
    mounted at /heapster-certs

Check for the existence of the heapster-certs secret

$ oc get secrets heapster-certs
Error from server (NotFound): secrets "heapster-certs" not found
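
The two checks above can be combined into one pass over every secret the replication controller mounts. This is a minimal sketch, not a supported tool: the function names list_mounted_secrets and find_missing_secrets are hypothetical, and the parsing assumes `oc volume` prints lines of the form "secret/<name> as <volume>" exactly as shown in the output above.

```shell
# Print the secret names found in `oc volume` output read from stdin.
# Assumes lines shaped like "  secret/<name> as <volume-name>".
list_mounted_secrets() {
    sed -n 's|^[[:space:]]*secret/\([^[:space:]]*\) as .*|\1|p'
}

# Report each mounted secret that does not exist in the current project.
# Hypothetical helper; needs a logged-in `oc` client and a resource
# argument such as rc/heapster.
find_missing_secrets() {
    oc volume "$1" | list_mounted_secrets | while read -r name; do
        oc get secret "$name" >/dev/null 2>&1 || echo "missing: $name"
    done
}
```

Run it from the openshift-infra project as `find_missing_secrets rc/heapster`. On a cluster in the state described here, it would be expected to flag heapster-certs.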


Maybe you, like me, overlooked a v3.3 tech preview feature called service serving certificates. It is easy to miss that this became mandatory in v3.6, because it is not yet in the release notes. See also this bug.

However, even if you have /etc/origin/master/service-signer.crt, it may not be picked up. In my case it was not visible because a commit to the v3.3 upgrade playbook had a typo, placing servicesServingCert instead of serviceServingCert in /etc/origin/master/master-config.yaml. The stanza should use the singular spelling, e.g.

controllerConfig:
  serviceServingCert:
    signer:
      certFile: service-signer.crt
      keyFile: service-signer.key
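
To see quickly whether a master is affected by the misspelled key, something like the following can be run on each master host. It is a sketch under stated assumptions: the function name check_serving_cert_key is made up for illustration, the default path is the one mentioned above, and the check is a plain text match rather than a real YAML parse.

```shell
# Look for the misspelled key (servicesServingCert) in a master config.
# $1 is the config path; it defaults to the path named in this post.
# Returns 0 if the correct key is present, 1 if the typo is present,
# 2 if neither spelling is found.
check_serving_cert_key() {
    config="${1:-/etc/origin/master/master-config.yaml}"
    if grep -q '^[[:space:]]*servicesServingCert:' "$config"; then
        echo "typo: servicesServingCert found in $config"
        return 1
    elif grep -q '^[[:space:]]*serviceServingCert:' "$config"; then
        echo "ok: serviceServingCert found in $config"
        return 0
    else
        echo "absent: no serviceServingCert stanza in $config"
        return 2
    fi
}
```

Calling `check_serving_cert_key` with no argument inspects the default path. A "typo" result means the key needs to be renamed to serviceServingCert by hand.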

The typo has since been fixed in PR 5765.

