Troubleshooting Oracle Cloud Agent
When using Oracle Cloud Agent, you might encounter the following problems:
- On the Oracle Cloud Agent tab of the Instance Details page, the status for all plugins is Invalid.
- In the Metrics section of the Console dashboard, you can't see any CPU, memory, network, or disk metrics for the instance.
If you encounter any of these problems, Oracle Cloud Agent might not be installed or running, or it might not be able to communicate with Oracle services. To diagnose the specific issue, follow these troubleshooting steps.
In this topic, the instructions for Oracle Linux also apply to CentOS images.
If you can't connect to your instance, see:
Step 1: Verify that Oracle Cloud Agent is Installed
Follow these steps to confirm that Oracle Cloud Agent is installed on your instance.
- Connect to the instance and run one of the following commands, depending on your operating system.
Oracle Linux
rpm -q oracle-cloud-agent && echo "OCA Installed" || echo "OCA not Installed"
If Oracle Cloud Agent is installed, a message similar to the following displays:
oracle-cloud-agent-<version>.x86_64
OCA Installed
Ubuntu
snap list oracle-cloud-agent &>/dev/null && echo "OCA Installed" || echo "OCA not Installed"
If Oracle Cloud Agent is installed, the following message displays:
OCA Installed
Windows Server
Run the command in Windows PowerShell as an administrator.
Get-WmiObject -Class Win32_Product |where name -eq "Oracle Cloud Agent"
If Oracle Cloud Agent is installed, a message similar to the following displays:
IdentifyingNumber : {<unique_ID>}
Name              : Oracle Cloud Agent
Vendor            : Oracle Corporation
Version           : <version>
Caption           : Oracle Cloud Agent
- If the message indicating that Oracle Cloud Agent is installed does not display after you run the command, install Oracle Cloud Agent. If Oracle Cloud Agent is installed, proceed to the next step to verify that it is running.
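For scripted checks across a mixed fleet, the two Linux commands above can be combined into one function. The following is a minimal sketch, not part of Oracle Cloud Agent: the function name oca_install_status is hypothetical, and it assumes that rpm (Oracle Linux) or snap (Ubuntu) is on the PATH when that package format is in use.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with Oracle Cloud Agent): report whether
# the agent is installed, trying the rpm check and then the snap check.
oca_install_status() {
    if command -v rpm >/dev/null 2>&1 &&
        rpm -q oracle-cloud-agent >/dev/null 2>&1; then
        echo "OCA Installed"
    elif command -v snap >/dev/null 2>&1 &&
        snap list oracle-cloud-agent >/dev/null 2>&1; then
        echo "OCA Installed"
    else
        echo "OCA not Installed"
    fi
}

oca_install_status
```

The function prints the same two messages as the individual commands, so existing scripts that match on those strings should not need changes.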
Step 2: Verify that Oracle Cloud Agent is Running
After you confirm that Oracle Cloud Agent is installed, follow these steps to confirm that it is running.
- Connect to the instance and run one of the following commands to check whether Oracle Cloud Agent is running.
Oracle Linux 7.x and later versions
systemctl is-enabled oracle-cloud-agent &>/dev/null && echo "OCA is enabled" || echo "OCA is disabled" \
&& systemctl is-active oracle-cloud-agent &>/dev/null && echo "OCA is running" || echo "OCA is not running"
Expected response if Oracle Cloud Agent is running:
OCA is enabled
OCA is running
Ubuntu
snap services oracle-cloud-agent
Expected response if Oracle Cloud Agent is running:
Service                                Startup  Current  Notes
oracle-cloud-agent.oracle-cloud-agent  enabled  active   -
Windows Server
Run the command in Windows PowerShell as an administrator.
sc.exe query "OCA"|findstr "RUNNING"
Expected response if Oracle Cloud Agent is running:
STATE : 4 RUNNING
- If the message indicating that Oracle Cloud Agent is running does not display after you run the command, run the diagnostic tool and then file a support ticket with the file that contains debugging information and logs for the plugins. If Oracle Cloud Agent is running, proceed to the next step to verify that it can connect to Oracle services.
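On Ubuntu, the snap services table is easy to read but awkward to test in a script. The following sketch extracts the Startup and Current columns for the agent service. The function name oca_snap_state is hypothetical, and the parsing assumes the column layout shown in the expected response above.

```shell
#!/bin/sh
# Hypothetical helper: read `snap services` output on stdin and print the
# Startup and Current columns for the Oracle Cloud Agent service.
oca_snap_state() {
    awk '$1 == "oracle-cloud-agent.oracle-cloud-agent" { print $2, $3 }'
}

# Example using the expected response from this step.
printf '%s\n' \
    'Service                                Startup  Current  Notes' \
    'oracle-cloud-agent.oracle-cloud-agent  enabled  active   -' |
    oca_snap_state
# prints: enabled active
```

On an instance, the equivalent call would be snap services oracle-cloud-agent | oca_snap_state. Any output other than "enabled active" suggests that the agent is disabled or stopped.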
Step 3: Verify that Oracle Cloud Agent Can Connect to Oracle Services
If you confirm that Oracle Cloud Agent is installed and running but the status for all plugins on the Instance Details page is Invalid or you cannot see any metrics in the Metrics section of the Console dashboard, Oracle Cloud Agent might not be able to connect to Oracle services. The following sections explore possible reasons that Oracle Cloud Agent is unable to connect to Oracle services. To diagnose the issue, follow these steps in order.
- Verify that the instance can access the Instance Metadata Service endpoint.
- Check for clock skew errors.
- Verify that gateways are configured correctly.
- Change your proxy server settings.
Verify that the Instance Can Access the Instance Metadata Service Endpoint
These steps verify whether the instance can access the Instance Metadata Service endpoint.
- Connect to the instance and run one of the following commands, depending on your operating system.
Oracle Linux and Ubuntu
curl -v -H 'Authorization: Bearer Oracle' http://169.254.169.254/opc/v2/instance/
If the instance can access the endpoint, a message similar to the following displays:
* About to connect() to 169.254.169.254 port 80 (#0)
*   Trying 169.254.169.254...
* Connected to 169.254.169.254 (169.254.169.254) port 80 (#0)
> GET /opc/v2/instance/ HTTP/1.1
> User-Agent: curl/7.29.0
> Host: 169.254.169.254
> Accept: */*
> Authorization: Bearer Oracle
>
< HTTP/1.1 200 OK
< Server: server
< Date: Wed, 24 Mar 2021 20:52:38 GMT
< Content-Type: application/json
< Content-Length: 1800
< Last-Modified: Wed, 03 Mar 2021 01:43:50 GMT
< Connection: keep-alive
< ETag: "603ee9d6-708"
< Accept-Ranges: bytes
<
{
  "availabilityDomain" : "uybn:<availability_domain>",
  "faultDomain" : "<fault_domain>",
  "compartmentId" : "ocid1.compartment.oc1..<unique_ID>",
  "displayName" : "<instance_name>",
  "hostname" : "<host_name>",
  "id" : "<unique_ID>",
  "image" : "ocid1.image.oc1.iad.<unique_ID>",
  "metadata" : {
    "ssh_authorized_keys" : ""
  },
  "region" : "<region_key>",
  "canonicalRegionName" : "<region_name>",
  "ociAdName" : "<availability_domain>",
  "regionInfo" : {
    "realmKey" : "<realm>",
    "realmDomainComponent" : "oraclecloud.com",
    "regionKey" : "<region_key>",
    "regionIdentifier" : "<region>"
  },
  "shape" : "<shape>",
  "state" : "Running",
  "timeCreated" : 1614637343723,
  "agentConfig" : {
    "monitoringDisabled" : false,
    "managementDisabled" : false,
    "allPluginsDisabled" : false,
    "pluginsConfig" : [ {
      "name" : "OS Management Service Agent",
      "desiredState" : "ENABLED"
    }, {
      "name" : "Custom Logs Monitoring",
      "desiredState" : "ENABLED"
    }, {
      "name" : "Compute Instance Run Command",
      "desiredState" : "ENABLED"
    }, {
      "name" : "Compute Instance Monitoring",
      "desiredState" : "ENABLED"
    } ]
  },
  "freeformTags" : {
    "keep" : "keep"
  }
}
* Connection #0 to host 169.254.169.254 left intact
Windows Server
Run the command in Windows PowerShell as an administrator.
Invoke-WebRequest -Headers @{'Authorization'='Bearer Oracle'} http://169.254.169.254/opc/v2/instance/
If the instance can access the endpoint, a message similar to the following displays:
StatusCode        : 200
StatusDescription : OK
Content           : {
                      "availabilityDomain" : "<availability_domain>",
                      "faultDomain" : "<fault_domain>",
                      "compartmentId" : "ocid1.tenancy.region1..<unique_ID>",
                      "displayNam...
RawContent        : HTTP/1.1 200 OK
                    Connection: keep-alive
                    Accept-Ranges: bytes
                    Content-Length: 1197
                    Content-Type: application/json
                    Date: Wed, 24 Mar 2021 21:07:42 GMT
                    ETag: "<unique_ID>"
                    Last-Modified: Wed, 24 M...
Forms             : {}
Headers           : {[Connection, keep-alive], [Accept-Ranges, bytes], [Content-Length, 1197], [Content-Type, application/json]...}
Images            : {}
InputFields       : {}
Links             : {}
ParsedHtml        : mshtml.HTMLDocumentClass
RawContentLength  : 1197
- If you get a successful response without proxy errors, check for clock skew errors. If proxy server errors occur, check your proxy server settings.
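When only the result matters, the verbose output can be reduced to the HTTP status code. The following sketch is an illustration under stated assumptions, not an Oracle-provided tool: the function names imds_status and interpret_imds_status are hypothetical, the 5-second timeout is an arbitrary choice, and the suggested next steps are a loose mapping of the guidance in this section.

```shell
#!/bin/sh
# Hypothetical helpers for scripting the metadata endpoint check.

# Print only the HTTP status code returned by the endpoint.
# curl prints 000 when it receives no HTTP response at all.
imds_status() {
    curl -s -o /dev/null -w '%{http_code}' --max-time 5 \
        -H 'Authorization: Bearer Oracle' \
        http://169.254.169.254/opc/v2/instance/
}

# Map a status code to a suggested next step (assumed mapping).
interpret_imds_status() {
    case "$1" in
        200) echo "endpoint reachable: check for clock skew errors" ;;
        000) echo "no HTTP response: the endpoint was not reached" ;;
        *)   echo "unexpected HTTP status $1: check proxy server settings" ;;
    esac
}

interpret_imds_status 200
```

On an instance, the two functions would be combined as interpret_imds_status "$(imds_status)". Keeping the network call separate from the interpretation makes the mapping easy to adjust for a particular proxy setup.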
Check for Clock Skew Errors
Sometimes, the clock on an instance is not synchronized with the NTP service. Clock skew can cause TLS negotiations to fail, preventing the instance from connecting to Oracle services. Follow these steps to check for clock skew errors.
- Connect to the instance and run one of the following commands to display the most recent entries in the monitoring.log file.
Linux
sudo tail -n 15 /var/log/oracle-cloud-agent/plugins/gomon/monitoring.log
Windows Server 2019, Windows Server 2022
Run the command in Windows PowerShell as an administrator.
Get-Content -tail 15 C:\Windows\ServiceProfiles\OCA\AppData\Local\OracleCloudAgent\plugins\gomon\monitoring.log
Windows Server versions earlier than 2019
Run the command in Windows PowerShell as an administrator.
Get-Content -tail 15 C:\Users\OCA\AppData\Local\OracleCloudAgent\plugins\gomon\monitoring.log
If there is a clock skew error, a message similar to the following displays:
failed to call: Service error:NotAuthenticated. Date 'Tue, 09 Mar 2021 06:39:35 UTC' is not within allowed clock skew. Current 'Tue, 09 Mar 2021 06:45:45 UTC', valid datetime range: ['Tue, 09 Mar 2021 06:40:45 UTC', 'Tue, 09 Mar 2021 06:50:46 UTC']. http status code: 401. Opc request id: <unique_id>
- If a clock skew error occurs, configure the Oracle Cloud Infrastructure NTP service for your instance. If no clock skew error occurs, verify that gateways are configured correctly.
- If you configured the NTP service in the previous step, after you complete the configuration, run one of the following commands to restart Oracle Cloud Agent:
Oracle Linux 7.x and later versions
sudo systemctl restart oracle-cloud-agent
Ubuntu
sudo snap restart oracle-cloud-agent
Windows Server
Run the command in Windows PowerShell as an administrator.
net stop OCA
net start OCA
- Display the monitoring.log entries again. If Oracle Cloud Agent is running correctly, requests return a 200 OK status. In monitoring.log, look for messages similar to the following:
2021/03/18 03:12:44.391381 t2.go:139: Sent metrics status: 200; took: 387ms; with opc-request-id:<unique_ID>;
2021/03/18 03:13:44.006391 instancemetadata_client.go:64: fetched metadata from http://169.254.169.254/opc/v2/instance/ , status 200 OK
2021/03/18 03:13:44.730102 t2.go:139: Sent metrics status: 200; took: 723ms; with opc-request-id:<unique_ID>;
2021/03/18 03:14:44.324046 t2.go:139: Sent metrics status: 200; took: 320ms; with opc-request-id:<unique_ID>;
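Scanning the log by eye works for 15 lines but not for longer excerpts. The following sketch counts clock skew errors and successful metric submissions in whatever log lines it receives on standard input. The function name summarize_monitoring_log is hypothetical, and the two patterns are taken from the sample messages in this section, so they are assumptions about the log format rather than a documented contract.

```shell
#!/bin/sh
# Hypothetical helper: read monitoring.log lines on stdin and count
# clock skew errors and successful metric submissions.
summarize_monitoring_log() {
    awk '
        /is not within allowed clock skew/ { skew++ }
        /Sent metrics status: 200/         { ok++ }
        END { printf "clock_skew_errors=%d sent_ok=%d\n", skew, ok }
    '
}

# Example with one error line and one success line from this section.
printf '%s\n' \
    "failed to call: Service error:NotAuthenticated. Date 'Tue, 09 Mar 2021 06:39:35 UTC' is not within allowed clock skew." \
    '2021/03/18 03:12:44.391381 t2.go:139: Sent metrics status: 200; took: 387ms; with opc-request-id:<unique_ID>;' |
    summarize_monitoring_log
# prints: clock_skew_errors=1 sent_ok=1
```

On a Linux instance, the input would come from the log itself, for example sudo tail -n 15 /var/log/oracle-cloud-agent/plugins/gomon/monitoring.log | summarize_monitoring_log. After the NTP fix and agent restart, the expectation is zero clock skew errors and a growing count of successful submissions.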
Verify Permissions for Windows Domain Joined Instances
If you have a Windows instance that is joined to a domain, verify that the virtual account is granted the Log on as a service user right in the local Group Policy. To set permissions, follow the steps for enabling service log on through a local group policy in Microsoft's Enable Service Logon guide. For Log on as a service, add the user NT SERVICE\ALL SERVICES or the specific user.
Verify that Gateways are Configured Correctly
For Oracle Cloud Agent to communicate with Oracle services, gateways in subnets must be configured correctly. Follow these steps to verify and correct your configuration.
- Configure the internet gateway, NAT gateway, or service gateway for the subnet in your VCN.
- After you follow the configuration steps, restart Oracle Cloud Agent using the restart commands in the Check for Clock Skew Errors section. After the restart, check the monitoring.log file for successful requests to Oracle services.
Change Proxy Server Settings
Sometimes, local proxy servers prevent Oracle Cloud Agent from communicating with any services. Each proxy server is different.
Often, setting the http_proxy, https_proxy, and no_proxy environment variables for the oracle-cloud-agent and oracle-cloud-agent-updater services on the proxy client instances resolves proxy issues. After you set these environment variables, in the proxy server access.log file (or equivalent, depending on your system), verify that you see requests from the proxy client to services that Oracle Cloud Agent accesses.
Oracle Linux
- Run the following command.
sudo EDITOR=vi systemctl edit oracle-cloud-agent
- In the editor window, add the following entries, and then save the file.
[Service]
Environment="http_proxy=<proxy_url>:<proxy_port>"
Environment="https_proxy=<proxy_url>:<proxy_port>"
Environment="no_proxy=localhost,127.0.0.1,169.254.169.254"
  - <proxy_url> is the proxy URL.
  - <proxy_port> is the proxy port.
- Repeat the previous two steps for the oracle-cloud-agent-updater service.
- Run the following commands to reload the systemd configuration and restart the services.
sudo systemctl daemon-reload
sudo systemctl restart oracle-cloud-agent oracle-cloud-agent-updater
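When the same proxy settings must be applied to many instances, typing the entries into an editor on each one is error-prone. The following sketch generates the [Service] entries from a proxy address and port. The function name render_proxy_dropin and the example proxy address are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the [Service] entries for a given proxy
# address (for example http://proxy.example.com) and port.
render_proxy_dropin() {
    proxy="$1:$2"
    printf '[Service]\n'
    printf 'Environment="http_proxy=%s"\n' "$proxy"
    printf 'Environment="https_proxy=%s"\n' "$proxy"
    printf 'Environment="no_proxy=localhost,127.0.0.1,169.254.169.254"\n'
}

# Example with a placeholder proxy address and port.
render_proxy_dropin http://proxy.example.com 3128
```

The output can be pasted into the editor that systemctl edit opens. As a non-interactive alternative, systemctl edit normally stores its changes in an override.conf file under /etc/systemd/system/<unit>.service.d/, so the output could be written there directly before running daemon-reload. Treat that as an assumption to verify on your image.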
Ubuntu
- Run the following command.
sudo EDITOR=vi systemctl edit snap.oracle-cloud-agent.oracle-cloud-agent
- In the editor window, add the following entries, and then save the file.
[Service]
Environment="http_proxy=<proxy_url>:<proxy_port>"
Environment="https_proxy=<proxy_url>:<proxy_port>"
Environment="no_proxy=localhost,127.0.0.1,169.254.169.254"
  - <proxy_url> is the proxy URL.
  - <proxy_port> is the proxy port.
- Repeat the previous two steps for the snap.oracle-cloud-agent.oracle-cloud-agent-updater service.
- Run the following commands to reload the systemd configuration and restart the services.
sudo systemctl daemon-reload
sudo systemctl restart snap.oracle-cloud-agent.oracle-cloud-agent snap.oracle-cloud-agent.oracle-cloud-agent-updater
Windows Server
- Run the following commands in Windows PowerShell as an administrator. Do not change the casing of the environment variables.
# Set System environment variables for HTTP_PROXY, HTTPS_PROXY and NO_PROXY
[System.Environment]::SetEnvironmentVariable("HTTP_PROXY", "<proxy_url>:<proxy_port>", [System.EnvironmentVariableTarget]::Machine)
[System.Environment]::SetEnvironmentVariable("HTTPS_PROXY", "<proxy_url>:<proxy_port>", [System.EnvironmentVariableTarget]::Machine)
[System.Environment]::SetEnvironmentVariable("NO_PROXY", "localhost,127.0.0.1,169.254.169.254", [System.EnvironmentVariableTarget]::Machine)
  - <proxy_url> is the proxy URL.
  - <proxy_port> is the proxy port.
- Restart the oracle-cloud-agent and oracle-cloud-agent-updater services.
net stop OCA
net start OCA
net stop OCAU
net start OCAU
- To verify that the Custom Logs Monitoring plugin is able to send metrics, tail the monitoring.log file.
Windows Server 2019, Windows Server 2022
Get-Content C:\Windows\ServiceProfiles\OCA\AppData\Local\OracleCloudAgent\plugins\gomon\monitoring.log -Wait
Windows Server versions earlier than 2019
Get-Content C:\Users\OCA\AppData\Local\OracleCloudAgent\plugins\gomon\monitoring.log -Wait
Step 4: Generate a Diagnostic File for Oracle Cloud Agent
To make it easier for Oracle support to help you troubleshoot issues with the Oracle Cloud Agent software, you can run the Oracle Cloud Agent diagnostic tool on your compute instances. The diagnostic tool generates a file that contains debugging information and logs for the plugins that Oracle Cloud Agent manages.
The diagnostic tool is installed with Oracle Cloud Agent version 1.14.0 and later. To update Oracle Cloud Agent, see Updating the Oracle Cloud Agent Software.
After you complete the previous troubleshooting steps, run the diagnostic tool and then file a support ticket with the file that contains debugging information and logs for the plugins.
Linux
- Connect to the instance.
- Change directories to the folder where the diagnostic tool is saved:
cd /usr/libexec/oracle-cloud-agent/ocatools
- Run the diagnostic tool:
sudo ./diagnostic
The tool generates a gzip-compressed TAR file with a name in the format oca-diag-<date>.<identifier>.tar.gz. Provide the file when you open the support request.
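If the tool has been run more than once, several archives can accumulate. The following sketch picks the most recent one by modification time. The function name latest_diag_file is hypothetical, and because this topic does not state where the Linux tool saves the archive, the directory is passed as an argument and defaults to the current directory.

```shell
#!/bin/sh
# Hypothetical helper: print the newest oca-diag-*.tar.gz file in a
# directory (default: the current directory), or nothing if none exists.
latest_diag_file() {
    dir="${1:-.}"
    # ls -t sorts by modification time, newest first.
    ls -t "$dir"/oca-diag-*.tar.gz 2>/dev/null | head -n 1
}
```

For example, tar -tzf "$(latest_diag_file)" lists the contents of the newest archive, which can be a useful check before you attach it to the support request.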
Windows Server
- Connect to the instance.
- Open PowerShell as an administrator.
- Change directories to the folder where the diagnostic tool is saved:
cd "C:\Program Files\Oracle Cloud Agent\ocatools"
- Run the diagnostic tool:
.\diagnostic.ps1
The tool generates a ZIP file and saves it to C:\Users\opc\Desktop\. Provide the file when you open the support request.