Logs not enriched by the k8sattributes processor for a short period after startup #43623

@pbptaker

Description

Component(s)

No response

What happened?

We are using the latest version. We are not sure whether this should be treated as a feature request.
We currently use the filelog receiver to scrape logs in our k8s environments for a PoC. Processors and pipelines send the logs to our Splunk servers using so-called dynamic index routing. To make this work, we use the k8sattributes processor to read the index from a label on each k8s namespace; the value is then written to the field com.splunk.index.
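To illustrate the label lookup described above, here is a minimal sketch of a namespace carrying the index label. The label key `foo.demo.ch/splunk-index` is taken from the configuration in this issue; the namespace name and the index value are placeholders invented for the example.

```yaml
# Hypothetical Namespace manifest; name and index value are placeholders.
apiVersion: v1
kind: Namespace
metadata:
  name: team-a
  labels:
    # Extracted by k8sattributes (extract.labels, from: namespace) into the
    # resource attribute splunk-index-from-ssp-and-k8s, which the
    # resource/add-index-from-ssp-k8s processor copies to com.splunk.index.
    foo.demo.ch/splunk-index: team_a_index
```

With such a label in place, logs from pods in that namespace are expected to arrive at Splunk with `com.splunk.index` set to the label value.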

Everything works: logs are forwarded to our Splunk servers, and the Splunk servers move them to the desired index according to the com.splunk.index resource attribute.

Only when we restart the otelcol DaemonSets are logs briefly sent without the com.splunk.index attribute. As a result, a batch of logs lands in the default indexes of our Splunk servers' HEC tokens instead of the correct indexes.

This is a problem for us because we must send the logs strictly to separate indexes for security reasons.

We think that the filelog receiver does not wait for the k8sattributes processor to be ready, which results in those logs ending up in the wrong indexes.
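For reference, the k8sattributes processor documents two settings that relate to this startup window, `wait_for_metadata` and `wait_for_metadata_timeout`. The sketch below shows how they would be added to the processor block. Whether they close the gap in this setup is an assumption and untested here, and the 30s timeout is an arbitrary example value.

```yaml
processors:
  k8sattributes:
    auth_type: "serviceAccount"
    passthrough: false
    # Assumption: hold back processor startup until the Kubernetes metadata
    # has been synced, so early log records are not passed through unenriched.
    wait_for_metadata: true
    # Example value only; adjust to how long the initial sync takes.
    wait_for_metadata_timeout: 30s
```

The remaining `filter`, `extract`, and `pod_association` settings from the configuration in this issue would stay unchanged.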

Thank you,

Expected Result

Logs should always be enriched by the k8sattributes processor before they are sent to exporters.
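As a possible safeguard, the routing connector could forward only records that already carry the extracted label and divert everything else. This is a sketch under assumptions: `logs/quarantine` is a hypothetical pipeline that does not exist in the configuration in this issue, and the approach only keeps unenriched logs out of the default indexes rather than enriching them.

```yaml
connectors:
  routing:
    # Records that match no table entry go here instead of to Splunk's default index.
    default_pipelines:
      - logs/quarantine   # hypothetical pipeline, would need to be defined under service.pipelines
    table:
      - context: log
        # Forward only when k8sattributes has added the namespace label value.
        condition: 'resource.attributes["splunk-index-from-ssp-and-k8s"] != nil'
        pipelines:
          - logs/default
```

What the diverted pipeline should do (for example, export to a dedicated restricted index) depends on the security requirements and is left open here.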

Actual Result:

After a restart, logs are briefly not enriched by the k8sattributes processor (for a short period after the otelcol pod starts).

Collector version

latest, 0.137.0

Environment information

Environment

Some Broadcom TKGI Environment

OpenTelemetry Collector configuration

    receivers:
      filelog:
        encoding: utf-8
        fingerprint_size: 1kb
        force_flush_period: "2s"
        include:
          - /var/log/pods/*/*/*.log
        include_file_name: false
        include_file_path: true
        max_concurrent_files: 1024
        max_log_size: 10MiB
        poll_interval: 200ms
        retry_on_failure:
          enabled: true
        start_at: beginning
        storage: file_storage
        operators:
          - type: add
            field: attributes.otelcol_uidextracted
            value: false
          - id: parse_pod_uid_for_k8sattributes
            type: regex_parser
            parse_from: attributes["log.file.path"]
            parse_to: attributes
            error_mode: continue
            regex: '^/var/log/pods/(?P<namespace>[^_]+)_(?P<pod_name>[^_]+)_(?P<uid>[^/]+)/(?P<container_name>[^._]+)/(?P<restart_count>\d+)\.log$'
          - type: add
            if: attributes.uid != ""
            field: attributes.otelcol_uidextracted
            value: true
          - type: move
            if: attributes.otelcol_uidextracted == true
            from: attributes.uid
            to: resource["k8s.pod.uid"]
          - type: move
            from: attributes.restart_count
            to: resource["k8s.container.restart_count"]
          - type: add
            field: attributes.otelcol_prefixremover
            value: false
          - id: strip_containerd_prefix
            type: regex_parser
            parse_from: body
            error_mode: continue
            parse_to: attributes
            regex: ^(?P<time>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(?:\.\d{1,9})?Z) (?P<stream>stdout|stderr) (?P<logtag>[^ ]*) ?(?P<log>.*)$
            timestamp:
              layout: "2006-01-02T15:04:05.999999999Z"
              layout_type: gotime
              parse_from: attributes.time
          - type: add
            if: body matches "^(?P<time>\\d{4}-\\d{2}-\\d{2}T\\d{2}:\\d{2}:\\d{2}(?:\\.\\d{1,9})?Z) (?P<stream>stdout|stderr) (?P<logtag>[^ ]*) ?(?P<log>.*)$"
            field: attributes.otelcol_prefixremover
            value: true
          - type: add
            field: attributes.otelcol_multiliner
            value: false
          - id: detect_row_and_recombine_multiline
            type: recombine
            combine_field: attributes.log
            error_mode: continue
            combine_with: "\n"
            is_first_entry: attributes.log matches "^(?:ts=)?\\d{4}-\\d{2}-\\d{2}|^{|^[IWE]\\d{4}"
            max_log_size: 10MiB
          - type: add
            if: attributes.log matches "^(?:ts=)?\\d{4}-\\d{2}-\\d{2}|^{|^[IWE]\\d{4}"
            field: attributes.otelcol_multiliner
            value: true
          - type: add
            field: attributes.otelcol_jsonparser
            value: false
          - id: parse_json_log
            type: json_parser
            parse_from: attributes.log
            parse_to: attributes
            error_mode: continue
            if: attributes.log matches "^\\{"
            timestamp:
              parse_from: attributes["@timestamp"]
              layout_type: gotime
              layout: "2006-01-02T15:04:05.999999999Z07:00"
          - type: add
            if: attributes.log matches "^{"
            field: attributes.otelcol_jsonparser
            value: true
          - id: move_log_to_body
            type: move
            from: attributes.log
            to: body
    connectors:
      routing:
        table:
          - condition: 'true'
            context: log
            pipelines:
              - logs/default
    processors:
      batch: {}
      memory_limiter:
        check_interval: 2s
        limit_mib: 500
      filter/drop_logs_having_k8s_label_splunk.com_exlude:
        logs:
          exclude:
            match_type: strict
            resource_attributes:
              - key: splunk.com/exclude
                value: "true"
      resource/add-index-from-ssp-k8s:
        attributes:
          - action: upsert
            from_attribute: splunk-index-from-ssp-and-k8s
            key: com.splunk.index
      resource/add-podname-as-host:
        attributes:
          - action: upsert
            key: host.name
            value: ${OTEL_HOSTNAME}
      resource/add-splunk-client:
        attributes:
          - action: upsert
            key: splunk-client
            value: otelcol_on_paas
      k8sattributes:
        auth_type: "serviceAccount"
        passthrough: false
        filter:
          node_from_env_var: K8S_NODE_NAME
        extract:
          metadata:
            - k8s.pod.name
            - k8s.deployment.name
            - k8s.namespace.name
            - k8s.node.name
            - k8s.pod.start_time
            - container.id
            - container.image.name
            - container.image.tag
          labels:
            - tag_name: splunk-index-from-ssp-and-k8s
              key: foo.demo.ch/splunk-index
              from: namespace
            - tag_name: splunk.com/exclude
              key: splunk.com/exclude
              from: namespace
          otel_annotations: true
        pod_association:
          - sources:
              - from: resource_attribute
                name: k8s.pod.uid
      attributes/remove_doubled_fields_from_filelog:
        actions:
          - action: delete
            key: namespace
          - action: delete
            key: pod_name
          - action: delete
            key: log.file.path
      resource/cleanup:
        attributes:
          - action: delete
            key: prometheus.io/port
          - action: delete
            key: prometheus.io/path
          - action: delete
            key: prometheus.io/scheme
          - action: delete
            key: prometheus.io/scrape
    service:
      extensions:
        - file_storage
        - health_check
      pipelines:
        logs:
          receivers:
            - filelog
          processors:
            - memory_limiter
            - k8sattributes
            - filter/drop_logs_having_k8s_label_splunk.com_exlude
            - batch
          exporters:
            - routing
        logs/default:
          receivers:
            - routing
          processors:
            - resource/cleanup
            - resource/add-index-from-ssp-k8s
            - resource/add-cluster-name
            - resource/add-podname-as-host
            - resource/add-splunk-client
            - attributes/remove_doubled_fields_from_filelog
          exporters:
          # - debug
            - splunk_hec/first_endpoint
    extensions:
      file_storage:
        directory: /var/lib/otelcol/storage
      health_check:
        endpoint: 0.0.0.0:13133
    exporters:
      debug:
        verbosity: basic
        # verbosity: detailed
      splunk_hec/first_endpoint:
        endpoint: https://some-splunk-server.demo.com:8088/services/collector
        token: ${SPLUNK_TOKEN}
        tls:
          insecure_skip_verify: true
        disable_compression: true
        idle_conn_timeout: 10s
        max_idle_conns: 200
        max_idle_conns_per_host: 200
        profiling_data_enabled: false
        retry_on_failure:
          enabled: true
          initial_interval: 5s
          max_elapsed_time: 300s
          max_interval: 30s
        sending_queue:
          enabled: true
          num_consumers: 10
          queue_size: 10000
        source: http:paas_test
        sourcetype: paas_default
        timeout: 30s

Log output

Additional context

No response
