Refresh Datadog certificate

The following issue may have been the actual cause of Datadog not working on some instances: https://gitlab.com/gitlab-com/gl-infra/infrastructure/-/issues/10435


Datadog email:

https://docs.datadoghq.com/agent/faq/certificate_verify_failed-error/

You are receiving this email as an administrator of your Datadog account. You are currently among a subset of customers impacted by a connectivity issue with Datadog and your action is required to resolve it.

What happened?

On Saturday May 30th, 2020, at 10:48 UTC, an SSL root certificate belonging to a certificate authority and used to sign some of the Datadog certificates expired, and caused some of your agents to lose connectivity with Datadog endpoints. Because this root certificate is embedded in agents from version 3.6 to version 5.32.6, you will need to take action to restore connectivity.

What versions of the agent are affected?

Agent versions spanning 3.6.x to 5.32.6 embed the expired root certificate and are affected. Agent versions 6.x and 7.x are unaffected; if you are using these agents, no action is required on your part.
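
To decide which hosts need attention, the affected range above can be turned into a small check. The following is a minimal sketch and not part of the Datadog email: the function names version_lte and is_affected are hypothetical, and it assumes a sort that supports -V (version sort, as in GNU coreutils). The version string is expected to come from the package manager, as shown in the comments.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical; requires `sort -V` (GNU coreutils).

# True (exit 0) when version $1 <= version $2.
version_lte() {
    [ "$1" = "$2" ] && return 0
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# True (exit 0) when the given Agent version lies in 3.6.x through 5.32.6.
is_affected() {
    v=$1
    v=${v#*:}     # drop a Debian epoch, e.g. "1:5.32.6-1" -> "5.32.6-1"
    v=${v%%-*}    # drop the package revision, e.g. "5.32.6-1" -> "5.32.6"
    version_lte 3.6 "$v" && version_lte "$v" 5.32.6
}

# Possible ways to obtain the installed version:
#   Debian/Ubuntu:  dpkg-query -W -f='${Version}' datadog-agent
#   CentOS/Red Hat: rpm -q --qf '%{VERSION}' datadog-agent
```

For example, `is_affected "$(rpm -q --qf '%{VERSION}' datadog-agent)" && echo "needs fix"` would flag a host running Agent 5.32.6 and stay silent on 5.32.7, 6.x, and 7.x.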

How can I fix this by updating to the latest version of Agent 5?

You can resolve this issue by upgrading all your instances of Agent 5.x to the latest version, 5.32.7, as described below:

To manually update the Datadog Agent core between two minor versions on a given host, run the corresponding install command for your platform.

CentOS/Red Hat:

sudo yum check-update ; sudo yum install datadog-agent

Debian/Ubuntu:

sudo apt-get update && sudo apt-get install datadog-agent

Windows: download the Datadog Agent installer (https://s3.amazonaws.com/ddagent-windows-stable/ddagent-cli-latest.msi), then run:

start /wait msiexec /qn /i ddagent-cli-latest.msi

How can I fix this without updating the agent?

If you do not want to update your agent or are not sure which version you are running, you can address this issue immediately by deleting the certificate file bundled with the agent as follows:

On Linux:

sudo rm /opt/datadog-agent/agent/datadog-cert.pem && sudo service datadog-agent restart

On Windows using PowerShell:

rm "C:\Program Files (x86)\Datadog\Datadog Agent\files\datadog-cert.pem"
net stop /y datadogagent ; net start /y datadogagent

On Windows through the GUI:

Delete "datadog-cert.pem" from C:\Program Files (x86)\Datadog\Datadog Agent\files\
Once it is removed, restart the Datadog service from the Windows Service Manager.

Can I run this fix everywhere regardless of agent version?

Yes. You may see messages saying that the file doesn't exist, as agent 6.x and 7.x don't include it, but it won't impact these agents.
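
Because the fix is safe on every agent version, it can be wrapped in a small script for fleet-wide use that skips hosts where the file is absent and restarts the agent only when something changed. This is a minimal sketch, not taken from the Datadog email: the function name remove_bundled_cert is hypothetical, and the default path is the Linux path quoted above.

```shell
#!/bin/sh
# Sketch only: remove_bundled_cert is a hypothetical helper name.

# Delete the bundled certificate if it exists.
# Prints "removed" when the file was deleted, "absent" when there was nothing to do.
remove_bundled_cert() {
    cert=${1:-/opt/datadog-agent/agent/datadog-cert.pem}
    if [ -e "$cert" ]; then
        rm -- "$cert" && echo "removed"
    else
        echo "absent"
    fi
}

# Example (run as root): restart the agent only if the certificate was removed.
#   [ "$(remove_bundled_cert)" = "removed" ] && service datadog-agent restart
```

Running it a second time on the same host prints "absent" and leaves the agent untouched, which avoids the "file doesn't exist" messages on Agent 6.x and 7.x hosts.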

Should I update my agent even if I deleted the certificate?

We recommend keeping up to date and updating to the latest version of the agent. Deployments set to auto-update will do so with 5.32.7.

Am I still encrypting traffic with SSL even if I delete the certificate?

Yes. The certificate is just a preset for the client to use and is not necessary to connect via SSL. Datadog agent endpoints only accept SSL traffic.