@yanivagman
Collaborator

Detectors draft

@codecov

codecov bot commented Oct 30, 2025

Codecov Report

❌ Patch coverage is 59.75309% with 163 lines in your changes missing coverage. Please review.
✅ Project coverage is 34.05%. Comparing base (034cbab) to head (1444cab).
⚠️ Report is 68 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| pkg/datastores/ipreputation/store.go | 77.60% | 34 Missing and 9 partials ⚠️ |
| pkg/datastores/container/store.go | 63.21% | 31 Missing and 1 partial ⚠️ |
| pkg/cmd/cobra/cobra.go | 0.00% | 23 Missing ⚠️ |
| pkg/datastores/ipreputation/types.go | 0.00% | 20 Missing ⚠️ |
| pkg/datastores/dns/store.go | 67.85% | 17 Missing and 1 partial ⚠️ |
| cmd/tracee/cmd/root.go | 0.00% | 9 Missing ⚠️ |
| pkg/cmd/list.go | 0.00% | 8 Missing ⚠️ |
| cmd/tracee/cmd/list.go | 0.00% | 7 Missing ⚠️ |
| pkg/datastores/container/containers.go | 0.00% | 3 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #5003      +/-   ##
==========================================
+ Coverage   29.72%   34.05%   +4.32%     
==========================================
  Files         234      260      +26     
  Lines       26186    30593    +4407     
==========================================
+ Hits         7785    10418    +2633     
- Misses      17864    19423    +1559     
- Partials      537      752     +215     
| Flag | Coverage Δ | |
|---|---|---|
| unit | 34.05% <59.75%> (+4.32%) | ⬆️ |

| Files with missing lines | Coverage Δ | |
|---|---|---|
| pkg/config/config.go | 0.00% <ø> (ø) | |
| pkg/datastores/dns/dnscache.go | 85.71% <ø> (ø) | |
| pkg/datastores/null_stores.go | 6.25% <ø> (ø) | |
| pkg/datastores/process/proctree.go | 29.59% <ø> (-2.25%) | ⬇️ |
| pkg/datastores/process/store.go | 73.77% <ø> (ø) | |
| pkg/datastores/registry.go | 75.94% <ø> (ø) | |
| pkg/datastores/symbol/adapter.go | 0.00% <ø> (ø) | |
| pkg/datastores/symbol/kernel.go | 82.89% <ø> (+3.94%) | ⬆️ |
| pkg/datastores/symbol/store.go | 44.00% <ø> (ø) | |
| pkg/datastores/syscall/store.go | 100.00% <ø> (ø) | |
... and 42 more

... and 40 files with indirect coverage changes


Add ./detectors/ to TRACEE_SRC_DIRS and add go.mod/go.sum dependencies
(both root and detectors module) for the tracee binary target. This
ensures that changes to detector source files or module dependencies
trigger a rebuild when running 'make tracee'.

Add GetAncestry() method to ProcessStore interface to retrieve process
ancestry chains up to a specified depth. Implementation walks the parent
chain and returns a slice where [0]=process itself, [1]=parent,
[2]=grandparent, etc.

The method stops if:
- maxDepth is reached
- Parent not found in the tree
- Circular reference detected
- maxDepth <= 0 (returns empty slice)

This enables detectors to analyze process lineage for threat detection
patterns without complex manual queries through the process tree.
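
The stop conditions above can be sketched roughly as follows. This is a minimal illustration, not the PR's implementation: `ProcessInfo`, `tree`, and `getAncestry` are simplified, hypothetical stand-ins, and the sketch assumes `maxDepth` counts the process itself as one entry.

```go
package main

import "fmt"

// ProcessInfo is a hypothetical, simplified stand-in for the store's process record.
type ProcessInfo struct {
	EntityID       uint32
	ParentEntityID uint32
	Name           string
}

// tree is a hypothetical in-memory process tree keyed by entity ID.
type tree map[uint32]ProcessInfo

// getAncestry returns [0]=process itself, [1]=parent, [2]=grandparent, etc.
// Assumption: maxDepth limits the total number of returned entries.
func getAncestry(t tree, id uint32, maxDepth int) []ProcessInfo {
	if maxDepth <= 0 {
		return []ProcessInfo{} // nothing requested
	}
	chain := make([]ProcessInfo, 0, maxDepth)
	seen := make(map[uint32]bool)
	for len(chain) < maxDepth {
		if seen[id] {
			break // circular reference detected
		}
		p, ok := t[id]
		if !ok {
			break // parent not found in the tree
		}
		seen[id] = true
		chain = append(chain, p)
		id = p.ParentEntityID
	}
	return chain
}

func main() {
	t := tree{
		1: {EntityID: 1, ParentEntityID: 0, Name: "init"},
		2: {EntityID: 2, ParentEntityID: 1, Name: "bash"},
		3: {EntityID: 3, ParentEntityID: 2, Name: "python"},
	}
	for i, p := range getAncestry(t, 3, 5) {
		fmt.Println(i, p.Name)
	}
}
```

The `seen` set is what guards against a corrupted tree in which two entries point at each other as parents.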
Implement ProcessAncestry auto-population in the detector dispatcher.
When a detector declares AutoPopulate.ProcessAncestry = true, the engine
automatically queries the process store and populates up to 5 ancestor
levels in the output event's Workload.Process.Ancestors field.

The implementation:
- Queries ancestry using detector's DataStores reference
- Skips the first entry (process itself, already in output)
- Converts datastores.ProcessInfo to v1beta1.Process format
- Only populates if Ancestors field is empty (respects manual overrides)

Add comprehensive integration test that creates a 3-level process tree
(init -> bash -> python) and verifies ancestry is correctly populated
with proper entity IDs and PIDs for each ancestor.
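
A rough sketch of the population rule described above (skip the process itself, convert each ancestor, respect manual overrides). The `ProcessInfo` and `Process` types here are hypothetical simplifications of `datastores.ProcessInfo` and `v1beta1.Process`, and `populateAncestors` is an illustrative name, not the dispatcher's actual function.

```go
package main

import "fmt"

// ProcessInfo is a hypothetical stand-in for datastores.ProcessInfo.
type ProcessInfo struct {
	EntityID uint32
	PID      uint32
}

// Process is a hypothetical stand-in for the v1beta1.Process output type.
type Process struct {
	EntityID  uint32
	PID       uint32
	Ancestors []Process
}

// populateAncestors fills out.Ancestors from a chain where chain[0] is the
// process itself. Ancestors set manually by the detector are left untouched.
func populateAncestors(out *Process, chain []ProcessInfo) {
	if len(out.Ancestors) > 0 || len(chain) < 2 {
		return
	}
	for _, info := range chain[1:] {
		out.Ancestors = append(out.Ancestors, Process{EntityID: info.EntityID, PID: info.PID})
	}
}

func main() {
	chain := []ProcessInfo{
		{EntityID: 30, PID: 300}, // python (the process itself)
		{EntityID: 20, PID: 200}, // bash
		{EntityID: 10, PID: 1},   // init
	}
	out := Process{EntityID: 30, PID: 300}
	populateAncestors(&out, chain)
	fmt.Println(len(out.Ancestors), out.Ancestors[0].PID, out.Ancestors[1].PID)
}
```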
- Extract buildEventFromOutput() to properly construct output events
- Deep clone Workload from input event to prevent race conditions
- Clone Policies from input event to preserve policy matching context
- Clear process ancestors array (will be populated by auto-population if requested)

This fixes potential data races where multiple detectors could modify
shared event structures, and ensures each detector output has its own
isolated copy of the workload and policy context.

… Threat

- Rename autoPopulateFields() to autoPopulateFieldsFromOutput()
- Support output-level AutoPopulate overrides (output.AutoPopulate > definition)
- Support output-level Threat overrides (output.Threat > definition.ThreatMetadata)
- Allow detectors to customize threat severity/description per detection
- Allow detectors to selectively disable auto-population for specific outputs

This enables detectors to dynamically adjust threat levels based on
detection confidence or context, and to control field population on
a per-detection basis rather than only at the definition level.
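
The precedence rule (output-level values win over definition-level values) could look roughly like this. The struct shapes below are assumptions made for illustration: only `AutoPopulate.ProcessAncestry`, `output.Threat`, and `definition.ThreatMetadata` are named in this PR, and the `Severity`, `Description`, and `AutoPopulate.Threat` fields are hypothetical.

```go
package main

import "fmt"

// Threat, AutoPopulate, Definition and Output are hypothetical, simplified shapes.
type Threat struct {
	Severity    string
	Description string
}

type AutoPopulate struct {
	Threat          bool
	ProcessAncestry bool
}

type Definition struct {
	ThreatMetadata *Threat
	AutoPopulate   AutoPopulate
}

type Output struct {
	Threat       *Threat       // nil means "use the definition's threat"
	AutoPopulate *AutoPopulate // nil means "use the definition's settings"
}

// effectiveThreat applies output.Threat > definition.ThreatMetadata.
func effectiveThreat(def Definition, out Output) *Threat {
	if out.Threat != nil {
		return out.Threat
	}
	return def.ThreatMetadata
}

// effectiveAutoPopulate applies output.AutoPopulate > definition.AutoPopulate.
func effectiveAutoPopulate(def Definition, out Output) AutoPopulate {
	if out.AutoPopulate != nil {
		return *out.AutoPopulate
	}
	return def.AutoPopulate
}

func main() {
	def := Definition{
		ThreatMetadata: &Threat{Severity: "low", Description: "default"},
		AutoPopulate:   AutoPopulate{Threat: true, ProcessAncestry: true},
	}
	out := Output{Threat: &Threat{Severity: "high", Description: "high-confidence match"}}
	fmt.Println(effectiveThreat(def, out).Severity, effectiveAutoPopulate(def, out).ProcessAncestry)
}
```

Using pointers for the output-level fields is one way to distinguish "not set" from an explicit value such as "all auto-population disabled".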
Add ExampleDetector demonstrating the detector API:
- Subscribes to execve events
- Produces example_detection event with 3 fields
- Shows field definitions, threat metadata, and auto-population
- Implements full detector lifecycle (Init/OnEvent/Close)

This detector serves as:
- Reference implementation for new detectors
- Test case for detector infrastructure
- Live documentation of API patterns

…ation

Detector data filters were skipped when DataFilter enabled kernel
filtering mode for pathname fields. Detectors lack kernel-side
filtering, so userspace filtering must always run.

Add skipKernelFilter flag to DataFilter and NewDetectorDataFilter()
constructor. Detector registry now uses NewDetectorDataFilter() to
ensure filters are always applied in userspace.

- Change all datastore query methods to return (value, error) pattern
- Return datastores.ErrNotFound instead of (nil, false)
- Document that Registry accessors never return nil stores
- Update all datastore implementations and tests

Update example detectors and tests to use error returns instead of
bool. Check errors.Is(err, datastores.ErrNotFound) for not-found cases.
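
For illustration, the (value, error) pattern with a sentinel error might be used like this. The `store` type and its `GetName` method are hypothetical, and `ErrNotFound` is declared locally here as a stand-in for `datastores.ErrNotFound`.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrNotFound is a local stand-in for datastores.ErrNotFound.
var ErrNotFound = errors.New("not found")

// store is a hypothetical datastore with a single query method.
type store struct {
	names map[uint32]string
}

// GetName returns (value, error) instead of the older (value, bool) shape.
func (s store) GetName(id uint32) (string, error) {
	name, ok := s.names[id]
	if !ok {
		// Wrapping keeps errors.Is working while adding context.
		return "", fmt.Errorf("process %d: %w", id, ErrNotFound)
	}
	return name, nil
}

// isNotFound reports whether err is, or wraps, ErrNotFound.
func isNotFound(err error) bool {
	return errors.Is(err, ErrNotFound)
}

func main() {
	s := store{names: map[uint32]string{1: "init"}}
	if _, err := s.GetName(42); isNotFound(err) {
		fmt.Println("not found, skipping enrichment")
	}
}
```

Checking with `errors.Is` rather than `==` keeps callers working if an implementation wraps the sentinel with extra context.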
…ontrol

Add optional AncestryDepth field to DetectorOutput enabling fine-grained
control over process ancestry fetching depth on a per-detection basis.

Changes:
- Add AncestryDepth *uint32 to DetectorOutput struct
- Implement priority: output.AncestryDepth > ProcessAncestry bool (default 5)
- Update dispatch logic to use explicit depth when provided
- Document new field and usage patterns in developer guide

The ProcessAncestry boolean remains as a simple fallback (default 5 levels).
Detectors can now override the depth per detection: nil (use the boolean),
0 (disable), or N (fetch N levels).
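
A minimal sketch of the depth resolution described above. `resolveAncestryDepth` and `defaultAncestryDepth` are illustrative names, not identifiers confirmed by this PR.

```go
package main

import "fmt"

// defaultAncestryDepth mirrors the "default 5" mentioned above.
const defaultAncestryDepth uint32 = 5

// resolveAncestryDepth applies output.AncestryDepth > ProcessAncestry bool.
// A nil outputDepth defers to the boolean; 0 disables; N fetches N levels.
func resolveAncestryDepth(outputDepth *uint32, processAncestry bool) uint32 {
	if outputDepth != nil {
		return *outputDepth
	}
	if processAncestry {
		return defaultAncestryDepth
	}
	return 0
}

func main() {
	three := uint32(3)
	fmt.Println(resolveAncestryDepth(nil, true))     // falls back to the boolean
	fmt.Println(resolveAncestryDepth(&three, false)) // explicit depth wins
}
```

A pointer is what makes the three states distinguishable: a plain `uint32` could not tell "unset" apart from an explicit 0.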
Add comprehensive documentation for EventDetector API and DataStore system:

- docs/docs/detectors/developer-guide.md - Complete detector development guide
  with quick start, API reference, testing, migration from signatures, and examples
- docs/docs/detectors/datastore-api.md - DataStore API reference for all stores
  (Process, Container, System, Syscall, Symbol, DNS) with usage patterns
- docs/docs/detectors/index.md - Landing page with overview and navigation
- docs/docs/events/custom/overview.md - Updated to highlight new detector system
  and mark signatures as legacy

Covers auto-registration, filters, version validation, enrichment requirements,
metrics, process ancestry, and more.

Add WritableStore interface for detectors/extensions to store custom data.

- Add WritableStore interface (WriteValue, WriteBatchValues, Delete, ClearSource, ListSources)
- Add RegisterWritableStore() to Registry
- Define protobuf schemas for DataEntry and gRPC DataSourceService
- Source isolation for data provenance tracking
- Update Makefile for subdirectory protobuf generation
- Ownership model: registrant controls store behavior
- Fix test mocks to return valid health/metrics objects

Add reference implementation of WritableStore for IP reputation tracking.

- Implement IPReputationStore with type-safe and generic methods
- Support three conflict resolution policies (LastWriteWins, MaxSeverity, PriorityBased)
- Source partitioning for multi-feed aggregation
- Thread-safe operations with proper locking
- Comprehensive test suite (15 tests, 78% coverage)
- Type-safe methods: GetReputation, IsBlacklisted, WriteReputation, WriteBatch
- Generic methods: WriteValue, WriteBatchValues, Delete, ClearSource, ListSources
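
One way the three conflict resolution policies could behave is sketched below. Only the policy names come from this PR; the `Entry` fields, the `resolve` function, and the exact tie-breaking rules are assumptions for illustration.

```go
package main

import "fmt"

// Policy names follow the commit message; their exact semantics are assumed.
type Policy int

const (
	LastWriteWins Policy = iota
	MaxSeverity
	PriorityBased
)

// Entry is a hypothetical reputation record written by one source (feed).
type Entry struct {
	Source   string
	Severity int // higher means more severe (assumption)
	Priority int // higher means more trusted source (assumption)
}

// resolve decides which entry survives when two sources report the same IP.
func resolve(p Policy, existing, incoming Entry) Entry {
	switch p {
	case MaxSeverity:
		if incoming.Severity > existing.Severity {
			return incoming
		}
		return existing
	case PriorityBased:
		if incoming.Priority >= existing.Priority {
			return incoming
		}
		return existing
	default: // LastWriteWins
		return incoming
	}
}

func main() {
	old := Entry{Source: "feedA", Severity: 8, Priority: 1}
	incoming := Entry{Source: "feedB", Severity: 3, Priority: 2}
	fmt.Println(resolve(LastWriteWins, old, incoming).Source)
	fmt.Println(resolve(MaxSeverity, old, incoming).Source)
	fmt.Println(resolve(PriorityBased, old, incoming).Source)
}
```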
Add comprehensive documentation for writable datastores.

- Document WritableStore interface and usage patterns
- Registration, writing, and reading examples
- Source isolation and batch operations
- Complete detector example using IP reputation store
- Focus on generic API contract, reference implementation-specific details
- Fix event data access examples (use GetData instead of non-existent GetSourceIP)

Ensure Registry accessor methods never return nil for safety and consistency.

- Add null object implementations for all store types
- Initialize Registry with null objects by default
- Null stores return ErrStoreUnhealthy for all operations
- Update tests to verify nil-safety guarantee
- Document nil-safety guarantee in API docs

Benefits:
- Eliminates need for nil checks in detector code
- Enables safe method chaining
- Clear error signals (ErrStoreUnhealthy vs ErrNotFound)
- Graceful degradation when stores unavailable
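
The null object approach can be sketched as follows. The `DNSStore` interface, its `GetResponse` signature, and the `Registry` shape are hypothetical simplifications; `ErrStoreUnhealthy` is declared locally as a stand-in for the error named above.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrStoreUnhealthy is a local stand-in for the error returned by null stores.
var ErrStoreUnhealthy = errors.New("store unhealthy")

// DNSStore is a hypothetical, single-method store interface.
type DNSStore interface {
	GetResponse(query string) ([]string, error)
}

// nullDNSStore is the null object: every operation fails with ErrStoreUnhealthy.
type nullDNSStore struct{}

func (nullDNSStore) GetResponse(string) ([]string, error) {
	return nil, ErrStoreUnhealthy
}

// Registry starts with null objects so accessors never return nil.
type Registry struct {
	dns DNSStore
}

func NewRegistry() *Registry {
	return &Registry{dns: nullDNSStore{}}
}

func (r *Registry) DNS() DNSStore { return r.dns }

// isUnhealthy reports whether err is, or wraps, ErrStoreUnhealthy.
func isUnhealthy(err error) bool {
	return errors.Is(err, ErrStoreUnhealthy)
}

func main() {
	reg := NewRegistry()
	// Safe to chain without a nil check, even though no real store is registered.
	if _, err := reg.DNS().GetResponse("example.com"); isUnhealthy(err) {
		fmt.Println("DNS store unavailable, degrading gracefully")
	}
}
```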
…datastores

Add a comprehensive test validating the core use case in which detectors
share data through writable datastores. It exercises event-based
communication in which one detector writes threat intelligence and another
reads it during event processing.

Test scenarios:
- Threat intelligence sharing between detectors
- Ownership model (single writer, multiple readers)
- Graceful degradation when store unavailable
- Batch operations (1000+ entries)

All communication happens through OnEvent() methods, demonstrating realistic
detector interaction patterns.

Enable declarative threat detection via YAML files with full Go detector parity.
Supports event filtering, runtime data extraction, and auto-population.

- YAML schema with validation
- Field extraction using v1beta1 helpers
- CLI: --detectors yaml-dir=/path/to/dir
- Config: detectors.yaml-dir
- Default paths: ./detectors, /etc/tracee/detectors
- Test coverage: 48.1%

Add example YAML detector and update documentation:
- suspicious_binary_execution.yaml: detects execution of networking tools
- Updated detectors/README.md with YAML detector documentation
- Added detectors/yaml/README.md with usage guide

Add complete user guide for YAML detectors:
- Quick start with working examples
- Full schema reference with all fields
- Event filtering and data extraction guide
- Complete list of supported extraction paths
- Deployment options and best practices
- Troubleshooting guide
- Updated detector index with YAML guide link

Add 5 integration tests covering:
- Basic YAML detector loading and registration
- Event generation with field extraction (data.*, workload.*)
- 2-level detector chaining/composition
- Data filter application (matching/non-matching)
- Error handling (invalid YAML, missing fields)

Tests follow existing integration test patterns with proper cleanup,
goleak checking, and clear logging for debugging.

Add comprehensive testing infrastructure for unit testing detectors:
- TestHarness for engine setup and detector registration
- Event builders for constructing test inputs
- Assertion helpers for validating outputs
- SimpleDetectorTest for lightweight unit testing
- Support for both Go and YAML detectors
- Examples demonstrating stateful and stateless detector testing

Add parent field to DetectedFrom for complete detection provenance through
multi-level chains. Engine automatically preserves chain when dispatching
between detectors.

- Add DetectedFrom.parent field (protobuf)
- Auto-chain parent in dispatch logic with cloning
- Add helpers: GetDetectionChain, GetRootDetection, GetChainDepth
- Update docs and add chain tests (2-level, 3-level)
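
The chain helpers could work roughly like this over a parent-linked structure. `DetectedFrom` here is a hypothetical plain struct rather than the generated protobuf type, and the ordering (most recent detection first, root last) is an assumption.

```go
package main

import "fmt"

// DetectedFrom is a hypothetical simplification of the protobuf message.
type DetectedFrom struct {
	Name   string
	Parent *DetectedFrom
}

// getDetectionChain returns the chain starting at d and ending at the root.
func getDetectionChain(d *DetectedFrom) []*DetectedFrom {
	var chain []*DetectedFrom
	for cur := d; cur != nil; cur = cur.Parent {
		chain = append(chain, cur)
	}
	return chain
}

// getRootDetection returns the original event at the bottom of the chain.
func getRootDetection(d *DetectedFrom) *DetectedFrom {
	if d == nil {
		return nil
	}
	for d.Parent != nil {
		d = d.Parent
	}
	return d
}

// getChainDepth returns the number of links in the chain.
func getChainDepth(d *DetectedFrom) int {
	return len(getDetectionChain(d))
}

func main() {
	root := &DetectedFrom{Name: "execve"}
	mid := &DetectedFrom{Name: "suspicious_exec", Parent: root}
	top := &DetectedFrom{Name: "attack_chain", Parent: mid}
	fmt.Println(getChainDepth(top), getRootDetection(top).Name)
}
```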
- Add Common Expression Language (CEL) for dynamic conditions and field extraction
- Support short form (field names) and long form (CEL expressions) for output fields
- Add helper functions: getData(), getDataInt(), getDataUInt(), hasData()
- Rename extract_fields -> fields, source -> expression for clarity
- Update all existing YAML detectors to use CEL syntax
- Add comprehensive documentation for CEL features
- All unit tests pass (38.3% coverage)
- YAML detectors load successfully with CEL compilation

Breaking changes:
- YAML schema: extract_fields -> fields, source -> expression
- Field extraction now uses CEL expressions instead of JSONPath-like paths

Implements CEL global list variables to share common lists (e.g., shell
binaries, sensitive paths) across YAML detectors. Lists are defined in
{detector-dir}/lists/ subdirectory and exposed as CEL variables.

- Add list schema and loader with validation (uppercase snake_case names)
- Register lists as CEL variables (list<string>) in environment
- Pass lists to CEL evaluation context at runtime
- Add comprehensive unit and integration tests (16 new tests)
- Add example shell_binaries list and detector
- Update documentation with usage examples

Lists are compiled into CEL at load time for zero runtime overhead and
compile-time type safety.
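
The uppercase snake_case rule for list names might be checked with a pattern like the one below. The exact regular expression and the `validListName` helper are assumptions; the PR only states that names must be uppercase snake_case.

```go
package main

import (
	"fmt"
	"regexp"
)

// listNameRe is an assumed pattern: uppercase letters and digits, separated
// by single underscores, starting with a letter.
var listNameRe = regexp.MustCompile(`^[A-Z][A-Z0-9]*(_[A-Z0-9]+)*$`)

// validListName reports whether name could be exposed as a CEL list variable.
func validListName(name string) bool {
	return listNameRe.MatchString(name)
}

func main() {
	for _, name := range []string{"SHELL_BINARIES", "shell_binaries", "SHELL__BIN"} {
		fmt.Println(name, validListName(name))
	}
}
```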
Expose all core datastores (Process, Container, System, Kernel, DNS, Syscall)
as namespaced CEL functions for use in YAML detector conditions and outputs.

Key changes:
- Add datastores.Registry field to YAMLDetector, rebuild CEL env in Init()
- Implement 12 datastore functions: process.get/getAncestry/getChildren,
  container.get/getByName, system.info, kernel.resolveSymbol/getSymbolAddress,
  dns.getResponse, syscall.getName/getId
- Add comprehensive test coverage with mock datastores
- Update documentation with examples and usage patterns

Functions return null for not-found entities, handle time.Time conversion,
and integrate seamlessly with existing CEL expressions.

Add 8 string utility functions for YAML detectors:
- split(str, delimiter) - Split string into list
- join(list, delimiter) - Join list into string
- trim(str) - Remove leading/trailing whitespace
- replace(str, old, new) - Replace all occurrences
- upper(str) - Convert to uppercase
- lower(str) - Convert to lowercase
- basename(path) - Get filename from path
- dirname(path) - Get directory from path

Functions are available in both conditions and output expressions.
All functions handle CEL's various list representations ([]string, []interface{}, []ref.Val).

Includes comprehensive unit tests and documentation updates.
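
As a rough Go-side illustration of a few of these helpers, the sketch below normalizes two of the list representations before joining. It is an assumption-laden simplification: the `[]ref.Val` case is omitted because it needs the cel-go module, and `toStringSlice`, `join`, `basename`, and `dirname` are illustrative names built on the standard `path` and `strings` packages.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// toStringSlice normalizes list inputs to []string.
// Only []string and []interface{} are handled in this sketch.
func toStringSlice(v interface{}) ([]string, bool) {
	switch l := v.(type) {
	case []string:
		return l, true
	case []interface{}:
		out := make([]string, 0, len(l))
		for _, e := range l {
			s, ok := e.(string)
			if !ok {
				return nil, false // non-string element
			}
			out = append(out, s)
		}
		return out, true
	}
	return nil, false
}

// join concatenates list elements with delim; ok is false for unsupported input.
func join(v interface{}, delim string) (string, bool) {
	items, ok := toStringSlice(v)
	if !ok {
		return "", false
	}
	return strings.Join(items, delim), true
}

// basename and dirname delegate to the standard library's path package.
func basename(p string) string { return path.Base(p) }
func dirname(p string) string  { return path.Dir(p) }

func main() {
	joined, _ := join([]interface{}{"usr", "bin", "nc"}, "/")
	fmt.Println(joined, basename("/usr/bin/nc"), dirname("/usr/bin/nc"))
}
```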
Support proto-native PipelineEvents to avoid redundant conversions.
Detector outputs now flow as protobuf without trace.Event conversion.

- Make PipelineEvent.ProtoEvent public
- Update ToProto() to handle proto-native events (Event == nil)
- Add matchPoliciesProto() for proto-native filtering
- Unify sink stage to handle both event types
- Pool all PipelineEvents uniformly

Eliminates proto→trace→proto conversion overhead for detector outputs.
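
A simplified sketch of the proto-native path: when `Event` is nil, the already-built proto is returned as-is. `TraceEvent` and `ProtoEvent` are hypothetical stand-ins for `trace.Event` and the protobuf event type, and the conversion shown is a placeholder.

```go
package main

import "fmt"

// TraceEvent and ProtoEvent are hypothetical stand-ins for the real types.
type TraceEvent struct {
	EventName string
}

type ProtoEvent struct {
	Name string
}

// PipelineEvent carries either a legacy trace event or a proto-native event.
type PipelineEvent struct {
	Event      *TraceEvent
	ProtoEvent *ProtoEvent
}

// ToProto returns the proto form, converting only when a trace event is present.
func (p *PipelineEvent) ToProto() *ProtoEvent {
	if p.Event == nil {
		return p.ProtoEvent // proto-native: nothing to convert
	}
	return &ProtoEvent{Name: p.Event.EventName} // placeholder conversion
}

func main() {
	native := &PipelineEvent{ProtoEvent: &ProtoEvent{Name: "example_detection"}}
	legacy := &PipelineEvent{Event: &TraceEvent{EventName: "execve"}}
	fmt.Println(native.ToProto().Name, legacy.ToProto().Name)
}
```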