Description
Component(s)
Oracle Cloud Resource Detection Processor
Describe the issue you're reporting
1. Background
In reviewing the initial implementation of the Oracle Cloud resource detector, maintainers provided the following feedback:
This approach is OK, but you may want to try and differentiate between the case where someone is not on oracle cloud (where `return pcommon.NewResource(), "", nil` is correct), and the case where someone is on oracle cloud, but the request failed (e.g. a timeout?). Currently, it is possible for a user to end up with missing resource attributes in the latter case.
and furthermore:
Feedback above is non-blocking, but I would recommend filing an issue to track fixing it if you choose not to address it in this PR.
I'm collecting what I've learned so far, so that we can address this issue in a subsequent PR after the initial processor logic is merged.
2. Current Detector Behavior
- The `Detect()` function attempts to fetch Oracle Cloud instance metadata.
- If fetching metadata fails (for any reason), the detector:
  - Logs the error at debug level.
  - Returns an empty Resource and no error.
- There is no distinction made between:
  - (a) not running on Oracle Cloud, and
  - (b) running on Oracle Cloud with a temporary/partial metadata fetch failure.
3. Desired Behavior
Maintainers suggest that the resource detector should:
- Distinguish between "not Oracle Cloud" and "Oracle Cloud, but metadata fetch failed."
- Signal a problem (e.g., return a non-nil error) on a genuine fetch failure while on Oracle Cloud, rather than silently falling back to an empty Resource.
4. Strategies from Other Cloud Providers
GCP Detector
- Pattern: Performs a fast "platform probe" using `metadata.OnGCE()`, which attempts a low-cost HTTP request to the well-known GCP instance metadata service (`http://169.254.169.254`). This detects if the service is present, with a very short timeout, without immediately fetching all metadata.
- Logic structure:
  - If "not on GCP" → return empty resource, no error.
  - If "on GCP" but attribute fetch fails → logs and returns partial/empty resource, AND returns error.
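The logic structure above can be sketched roughly as follows. This is a minimal, stdlib-only illustration and not the actual GCP detector code: `detectGCPStyle`, `onPlatform`, `fetchers`, and `errTimeout` are hypothetical names, and a plain `map[string]string` stands in for `pcommon.Resource`.

```go
package main

import (
	"errors"
	"fmt"
)

// errTimeout is a hypothetical error used to simulate a failed attribute fetch.
var errTimeout = errors.New("metadata request timed out")

// detectGCPStyle is a hypothetical sketch of the probe-then-fetch pattern.
// onPlatform stands in for a cheap platform probe such as metadata.OnGCE();
// each fetcher stands in for one metadata attribute lookup.
func detectGCPStyle(onPlatform func() bool, fetchers map[string]func() (string, error)) (map[string]string, error) {
	attrs := map[string]string{}
	if !onPlatform() {
		// Not on the platform: empty resource, no error.
		return attrs, nil
	}
	var errs []error
	for key, fetch := range fetchers {
		val, err := fetch()
		if err != nil {
			// Keep going so the caller still receives a partial resource.
			errs = append(errs, fmt.Errorf("fetching %s: %w", key, err))
			continue
		}
		attrs[key] = val
	}
	// errors.Join returns nil when errs is empty (requires Go 1.20+).
	return attrs, errors.Join(errs...)
}

func main() {
	fetchers := map[string]func() (string, error){
		"cloud.region": func() (string, error) { return "example-region", nil },
		"host.id":      func() (string, error) { return "", errTimeout },
	}
	attrs, err := detectGCPStyle(func() bool { return true }, fetchers)
	fmt.Println("attributes:", attrs)
	fmt.Println("error:", err)
}
```

The point of the sketch is that the caller gets both the attributes that could be fetched and a non-nil error, so a transient failure is visible instead of looking like "not on this platform".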
AWS EKS Detector
- Pattern: Uses clues specific to EKS before making metadata API calls.
  - Checks for the `KUBERNETES_SERVICE_HOST` environment variable.
  - Probes the Kubernetes API to verify environment (e.g., checks if cluster version string contains `-eks-`).
- Performs network metadata requests only if these clues are positive.
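A clue-based gate like the one described above might look like the sketch below. It is not the EKS detector's actual code: `onKubernetes`, `looksLikeEKS`, `shouldQueryMetadata`, and the injected `fetchVersion` function (a stand-in for a Kubernetes API version lookup) are hypothetical names, and the version string in `main` is only illustrative.

```go
package main

import (
	"fmt"
	"strings"
)

// onKubernetes reports whether the KUBERNETES_SERVICE_HOST clue is present.
// getenv is injected (os.Getenv in real use) so the check is easy to test.
func onKubernetes(getenv func(string) string) bool {
	return getenv("KUBERNETES_SERVICE_HOST") != ""
}

// looksLikeEKS reports whether a cluster version string carries the "-eks-" marker.
func looksLikeEKS(version string) bool {
	return strings.Contains(version, "-eks-")
}

// shouldQueryMetadata runs the cheap environment check first and only then the
// version lookup. Returning an error when the version lookup fails is a choice
// made for this sketch, not something the issue specifies.
func shouldQueryMetadata(getenv func(string) string, fetchVersion func() (string, error)) (bool, error) {
	if !onKubernetes(getenv) {
		return false, nil
	}
	version, err := fetchVersion()
	if err != nil {
		return false, fmt.Errorf("checking cluster version: %w", err)
	}
	return looksLikeEKS(version), nil
}

func main() {
	// Fake environment for the demo; a real caller would pass os.Getenv.
	env := map[string]string{"KUBERNETES_SERVICE_HOST": "10.0.0.1"}
	getenv := func(key string) string { return env[key] }
	fetchVersion := func() (string, error) { return "v1.29.0-eks-abc123", nil }

	ok, err := shouldQueryMetadata(getenv, fetchVersion)
	fmt.Println(ok, err)
}
```

Ordering the checks from cheapest to most expensive means a collector running outside Kubernetes never issues a network request at all.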
5. Possible Approaches for Oracle Cloud
- Environment Clues: Investigate existence of Oracle-specific environment variables, files, or config sections that can be cheaply checked to determine cloud platform.
  - Examples: agent marker files, OCI config, distinctive DNS, special startup scripts.
- DNS/Hostname: Check for Oracle Cloud-specific DNS search domains or hostname patterns.
- Network Probe: (Fallback) Use a very short-timeout HTTP probe to the metadata endpoint (as done in GCP/AWS detectors).
- Decision Logic:
  - If a clue or probe indicates "not Oracle Cloud" → return empty resource, no error.
  - If a clue or probe indicates "Oracle Cloud" but metadata fetch fails → return error, avoid silent fallback.
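One way the probe and decision logic could fit together is sketched below, using only the standard library. Everything here is an assumption for discussion: `probe`, `detect`, and `errMetadataUnavailable` are hypothetical names, the `clue` function is a placeholder for whichever environment check turns out to be reliable, a `map[string]string` stands in for `pcommon.Resource`, and the `Authorization: Bearer Oracle` header is what the Oracle Cloud instance metadata service (v2) expects as far as I know, not something taken from the current detector.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

// errMetadataUnavailable is a hypothetical sentinel for "on Oracle Cloud, but
// the metadata fetch failed".
var errMetadataUnavailable = errors.New("on Oracle Cloud but metadata fetch failed")

// probe issues one short-timeout GET. Any HTTP response counts as "metadata
// service present"; a connection error or timeout counts as "absent".
func probe(url string, timeout time.Duration) bool {
	client := &http.Client{Timeout: timeout}
	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		return false
	}
	// Assumed header for the Oracle Cloud metadata service (v2).
	req.Header.Set("Authorization", "Bearer Oracle")
	resp, err := client.Do(req)
	if err != nil {
		return false
	}
	resp.Body.Close()
	return true
}

// detect applies the decision logic: a cheap clue first, the network probe as
// a fallback, and an error (not a silent empty resource) on fetch failure.
func detect(clue, probeFn func() bool, fetch func() (map[string]string, error)) (map[string]string, error) {
	if !clue() && !probeFn() {
		// Not Oracle Cloud: empty resource, no error.
		return map[string]string{}, nil
	}
	attrs, err := fetch()
	if err != nil {
		return map[string]string{}, fmt.Errorf("%w: %v", errMetadataUnavailable, err)
	}
	return attrs, nil
}

func main() {
	// A local test server stands in for the metadata endpoint.
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK)
	}))
	defer srv.Close()

	noClue := func() bool { return false }
	reachable := func() bool { return probe(srv.URL, 200*time.Millisecond) }
	failing := func() (map[string]string, error) { return nil, errors.New("timeout") }

	_, err := detect(noClue, reachable, failing)
	fmt.Println(errors.Is(err, errMetadataUnavailable))
}
```

Note that a probe timeout by itself still cannot tell "not on Oracle Cloud" apart from "on Oracle Cloud with the metadata service unreachable", which is why the sketch checks a local clue before falling back to the network.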
cc @tonychoe @dashpole @atoulme