Detectors #5003

yanivagman · 2025-10-30T16:43:14Z

Detectors draft

codecov · 2025-10-30T16:54:08Z

Codecov Report

❌ Patch coverage is 59.75309% with 163 lines in your changes missing coverage. Please review.
✅ Project coverage is 34.05%. Comparing base (034cbab) to head (1444cab).
⚠️ Report is 68 commits behind head on main.

Files with missing lines	Patch %	Lines
pkg/datastores/ipreputation/store.go	77.60%	34 Missing and 9 partials ⚠️
pkg/datastores/container/store.go	63.21%	31 Missing and 1 partial ⚠️
pkg/cmd/cobra/cobra.go	0.00%	23 Missing ⚠️
pkg/datastores/ipreputation/types.go	0.00%	20 Missing ⚠️
pkg/datastores/dns/store.go	67.85%	17 Missing and 1 partial ⚠️
cmd/tracee/cmd/root.go	0.00%	9 Missing ⚠️
pkg/cmd/list.go	0.00%	8 Missing ⚠️
cmd/tracee/cmd/list.go	0.00%	7 Missing ⚠️
pkg/datastores/container/containers.go	0.00%	3 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #5003      +/-   ##
==========================================
+ Coverage   29.72%   34.05%   +4.32%     
==========================================
  Files         234      260      +26     
  Lines       26186    30593    +4407     
==========================================
+ Hits         7785    10418    +2633     
- Misses      17864    19423    +1559     
- Partials      537      752     +215

Flag	Coverage Δ
unit	`34.05% <59.75%> (+4.32%)`	⬆️

Files with missing lines	Coverage Δ
pkg/config/config.go	`0.00% <ø> (ø)`
pkg/datastores/dns/dnscache.go	`85.71% <ø> (ø)`
pkg/datastores/null_stores.go	`6.25% <ø> (ø)`
pkg/datastores/process/proctree.go	`29.59% <ø> (-2.25%)`	⬇️
pkg/datastores/process/store.go	`73.77% <ø> (ø)`
pkg/datastores/registry.go	`75.94% <ø> (ø)`
pkg/datastores/symbol/adapter.go	`0.00% <ø> (ø)`
pkg/datastores/symbol/kernel.go	`82.89% <ø> (+3.94%)`	⬆️
pkg/datastores/symbol/store.go	`44.00% <ø> (ø)`
pkg/datastores/syscall/store.go	`100.00% <ø> (ø)`
... and 42 more

... and 40 files with indirect coverage changes

Add ./detectors/ to TRACEE_SRC_DIRS and add go.mod/go.sum dependencies (both root and detectors module) for the tracee binary target. This ensures that changes to detector source files or module dependencies trigger a rebuild when running 'make tracee'.

Add GetAncestry() method to ProcessStore interface to retrieve process ancestry chains up to a specified depth. Implementation walks the parent chain and returns a slice where [0]=process itself, [1]=parent, [2]=grandparent, etc.

The method stops if:
- maxDepth is reached
- Parent not found in the tree
- Circular reference detected
- maxDepth <= 0 (returns empty slice)

This enables detectors to analyze process lineage for threat detection patterns without complex manual queries through the process tree.
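
The walk described above can be sketched as follows. This is a minimal illustration, not the code from this PR: `ProcessInfo` here is a simplified, hypothetical stand-in for `datastores.ProcessInfo`, the map-based tree and the `getAncestry` signature are assumptions, and the real method may also return an error.

```go
package main

import "fmt"

// ProcessInfo is a simplified, hypothetical stand-in for datastores.ProcessInfo.
type ProcessInfo struct {
	EntityID       uint32
	ParentEntityID uint32
	Name           string
}

// getAncestry walks parent links starting at entityID and returns at most
// maxDepth entries: [0] is the process itself, [1] its parent, and so on.
func getAncestry(tree map[uint32]ProcessInfo, entityID uint32, maxDepth int) []ProcessInfo {
	if maxDepth <= 0 {
		return []ProcessInfo{}
	}
	chain := make([]ProcessInfo, 0, maxDepth)
	visited := make(map[uint32]bool)
	for cur := entityID; len(chain) < maxDepth; {
		proc, found := tree[cur]
		if !found || visited[cur] {
			break // parent missing from the tree, or a cycle was detected
		}
		visited[cur] = true
		chain = append(chain, proc)
		cur = proc.ParentEntityID
	}
	return chain
}

func main() {
	// Assumed sample tree: init -> bash -> python.
	tree := map[uint32]ProcessInfo{
		1: {EntityID: 1, ParentEntityID: 0, Name: "init"},
		2: {EntityID: 2, ParentEntityID: 1, Name: "bash"},
		3: {EntityID: 3, ParentEntityID: 2, Name: "python"},
	}
	for i, p := range getAncestry(tree, 3, 5) {
		fmt.Printf("[%d] %s\n", i, p.Name)
	}
}
```

The visited set is what bounds the walk when the tree contains a circular parent reference; without it, only maxDepth would stop the loop.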

Implement ProcessAncestry auto-population in the detector dispatcher. When a detector declares AutoPopulate.ProcessAncestry = true, the engine automatically queries the process store and populates up to 5 ancestor levels in the output event's Workload.Process.Ancestors field.

The implementation:
- Queries ancestry using detector's DataStores reference
- Skips the first entry (process itself, already in output)
- Converts datastores.ProcessInfo to v1beta1.Process format
- Only populates if Ancestors field is empty (respects manual overrides)

Add comprehensive integration test that creates a 3-level process tree (init -> bash -> python) and verifies ancestry is correctly populated with proper entity IDs and PIDs for each ancestor.

- Extract buildEventFromOutput() to properly construct output events
- Deep clone Workload from input event to prevent race conditions
- Clone Policies from input event to preserve policy matching context
- Clear process ancestors array (will be populated by auto-population if requested)

This fixes potential data races where multiple detectors could modify shared event structures, and ensures each detector output has its own isolated copy of the workload and policy context.

… Threat

- Rename autoPopulateFields() to autoPopulateFieldsFromOutput()
- Support output-level AutoPopulate overrides (output.AutoPopulate > definition)
- Support output-level Threat overrides (output.Threat > definition.ThreatMetadata)
- Allow detectors to customize threat severity/description per detection
- Allow detectors to selectively disable auto-population for specific outputs

This enables detectors to dynamically adjust threat levels based on detection confidence or context, and to control field population on a per-detection basis rather than only at the definition level.

Add ExampleDetector demonstrating the detector API:
- Subscribes to execve events
- Produces example_detection event with 3 fields
- Shows field definitions, threat metadata, and auto-population
- Implements full detector lifecycle (Init/OnEvent/Close)

This detector serves as:
- Reference implementation for new detectors
- Test case for detector infrastructure
- Live documentation of API patterns

…ation

Detector data filters were skipped when DataFilter enabled kernel filtering mode for pathname fields. Detectors lack kernel-side filtering, so userspace filtering must always run.

Add skipKernelFilter flag to DataFilter and NewDetectorDataFilter() constructor. Detector registry now uses NewDetectorDataFilter() to ensure filters are always applied in userspace.

- Change all datastore query methods to return (value, error) pattern
- Return datastores.ErrNotFound instead of (nil, false)
- Document that Registry accessors never return nil stores
- Update all datastore implementations and tests

Update example detectors and tests to use error returns instead of bool. Check errors.Is(err, datastores.ErrNotFound) for not-found cases.
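
A caller-side sketch of the (value, error) pattern might look like the following. The `containerStore` type, its `GetName` method, and the `describe` helper are hypothetical, and `ErrNotFound` is a local stand-in for `datastores.ErrNotFound`; only the `errors.Is` check itself is taken from the commit message.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrNotFound is a local stand-in for datastores.ErrNotFound.
var ErrNotFound = errors.New("not found")

// containerStore is a hypothetical store used only for illustration.
type containerStore struct {
	names map[string]string
}

// GetName returns (value, error) instead of the older (value, bool) shape.
func (s *containerStore) GetName(id string) (string, error) {
	name, ok := s.names[id]
	if !ok {
		// Wrapping with %w keeps errors.Is(err, ErrNotFound) working.
		return "", fmt.Errorf("container %q: %w", id, ErrNotFound)
	}
	return name, nil
}

// describe shows how a detector could separate "not found" from other failures.
func describe(s *containerStore, id string) string {
	name, err := s.GetName(id)
	switch {
	case errors.Is(err, ErrNotFound):
		return "unknown container"
	case err != nil:
		return "lookup failed: " + err.Error()
	default:
		return name
	}
}

func main() {
	s := &containerStore{names: map[string]string{"abc": "web"}}
	fmt.Println(describe(s, "abc"))
	fmt.Println(describe(s, "zzz"))
}
```

Using `errors.Is` rather than `==` means the check keeps working if a store wraps the sentinel with extra context.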

…ontrol

Add optional AncestryDepth field to DetectorOutput enabling fine-grained control over process ancestry fetching depth on a per-detection basis.

Changes:
- Add AncestryDepth *uint32 to DetectorOutput struct
- Implement priority: output.AncestryDepth > ProcessAncestry bool (default 5)
- Update dispatch logic to use explicit depth when provided
- Document new field and usage patterns in developer guide

The ProcessAncestry boolean remains as simple fallback (default 5 levels). Detectors can now override depth per detection: nil (use boolean), 0 (disable), or N (fetch N levels).
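
The priority rule can be sketched as a small resolver. The function name `resolveAncestryDepth` and the constant are hypothetical; only the nil / 0 / N semantics and the default of 5 come from the commit message.

```go
package main

import "fmt"

// defaultAncestryDepth mirrors the "default 5 levels" stated above.
const defaultAncestryDepth uint32 = 5

// resolveAncestryDepth is a hypothetical helper: an explicit per-output depth
// wins; otherwise the ProcessAncestry boolean selects the default or nothing.
func resolveAncestryDepth(outputDepth *uint32, processAncestry bool) uint32 {
	if outputDepth != nil {
		return *outputDepth // 0 disables, N fetches N levels
	}
	if processAncestry {
		return defaultAncestryDepth
	}
	return 0
}

func main() {
	two := uint32(2)
	fmt.Println(resolveAncestryDepth(nil, true))   // boolean fallback
	fmt.Println(resolveAncestryDepth(&two, false)) // explicit depth overrides the boolean
}
```

A pointer is what lets the field distinguish "not set" (nil) from "explicitly disabled" (0), which a plain uint32 could not express.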

Add comprehensive documentation for EventDetector API and DataStore system:
- docs/docs/detectors/developer-guide.md: Complete detector development guide with quick start, API reference, testing, migration from signatures, and examples
- docs/docs/detectors/datastore-api.md: DataStore API reference for all stores (Process, Container, System, Syscall, Symbol, DNS) with usage patterns
- docs/docs/detectors/index.md: Landing page with overview and navigation
- docs/docs/events/custom/overview.md: Updated to highlight new detector system and mark signatures as legacy

Covers auto-registration, filters, version validation, enrichment requirements, metrics, process ancestry, and more.

Add WritableStore interface for detectors/extensions to store custom data.

- Add WritableStore interface (WriteValue, WriteBatchValues, Delete, ClearSource, ListSources)
- Add RegisterWritableStore() to Registry
- Define protobuf schemas for DataEntry and gRPC DataSourceService
- Source isolation for data provenance tracking
- Update Makefile for subdirectory protobuf generation
- Ownership model: registrant controls store behavior
- Fix test mocks to return valid health/metrics objects

Add reference implementation of WritableStore for IP reputation tracking.

- Implement IPReputationStore with type-safe and generic methods
- Support three conflict resolution policies (LastWriteWins, MaxSeverity, PriorityBased)
- Source partitioning for multi-feed aggregation
- Thread-safe operations with proper locking
- Comprehensive test suite (15 tests, 78% coverage)
- Type-safe methods: GetReputation, IsBlacklisted, WriteReputation, WriteBatch
- Generic methods: WriteValue, WriteBatchValues, Delete, ClearSource, ListSources
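
One way the three policies could be told apart is sketched below. The policy names come from the commit message, but the `Entry` fields, the `resolve` function, and the exact tie-breaking are assumptions made for illustration; the store in this PR may resolve conflicts differently.

```go
package main

import "fmt"

// Policy names follow the commit message; their semantics here are assumed.
type Policy int

const (
	LastWriteWins Policy = iota
	MaxSeverity
	PriorityBased
)

// Entry is a hypothetical reputation record from one feed.
type Entry struct {
	Source   string
	Severity int // higher means more severe (assumption)
	Priority int // higher means a more trusted source (assumption)
}

// resolve picks the entry to keep when two sources report the same IP.
// On ties, the incoming entry wins in every policy.
func resolve(p Policy, existing, incoming Entry) Entry {
	switch p {
	case MaxSeverity:
		if existing.Severity > incoming.Severity {
			return existing
		}
	case PriorityBased:
		if existing.Priority > incoming.Priority {
			return existing
		}
	}
	return incoming // LastWriteWins, or the incoming entry ranked at least as high
}

func main() {
	old := Entry{Source: "feed-a", Severity: 9, Priority: 1}
	upd := Entry{Source: "feed-b", Severity: 3, Priority: 5}
	fmt.Println(resolve(LastWriteWins, old, upd).Source)
	fmt.Println(resolve(MaxSeverity, old, upd).Source)
	fmt.Println(resolve(PriorityBased, old, upd).Source)
}
```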

Add comprehensive documentation for writable datastores.

- Document WritableStore interface and usage patterns
- Registration, writing, and reading examples
- Source isolation and batch operations
- Complete detector example using IP reputation store
- Focus on generic API contract, reference implementation-specific details
- Fix event data access examples (use GetData instead of non-existent GetSourceIP)

Ensure Registry accessor methods never return nil for safety and consistency.

- Add null object implementations for all store types
- Initialize Registry with null objects by default
- Null stores return ErrStoreUnhealthy for all operations
- Update tests to verify nil-safety guarantee
- Document nil-safety guarantee in API docs

Benefits:
- Eliminates need for nil checks in detector code
- Enables safe method chaining
- Clear error signals (ErrStoreUnhealthy vs ErrNotFound)
- Graceful degradation when stores unavailable
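
The null object approach can be illustrated with a single store type. Everything below is a simplified sketch: the `DNSStore` interface, its `GetResponse` method, `NewRegistry`, and the `DNS()` accessor are hypothetical shapes, and `ErrStoreUnhealthy` is a local stand-in for the error named in the commit message.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrStoreUnhealthy is a local stand-in for the error named above.
var ErrStoreUnhealthy = errors.New("store unhealthy")

// DNSStore is a hypothetical, single-method store interface.
type DNSStore interface {
	GetResponse(query string) ([]string, error)
}

// nullDNSStore is the null object: always safe to call, never returns data.
type nullDNSStore struct{}

func (nullDNSStore) GetResponse(string) ([]string, error) {
	return nil, ErrStoreUnhealthy
}

// Registry starts with null objects, so accessors never return nil.
type Registry struct {
	dns DNSStore
}

func NewRegistry() *Registry {
	return &Registry{dns: nullDNSStore{}}
}

// DNS returns the registered store, or the null store if none was registered.
func (r *Registry) DNS() DNSStore {
	return r.dns
}

func main() {
	reg := NewRegistry()
	// No nil check is needed before chaining the call.
	_, err := reg.DNS().GetResponse("example.com")
	fmt.Println(errors.Is(err, ErrStoreUnhealthy))
}
```

Callers can then tell an unavailable store (`ErrStoreUnhealthy`) apart from a healthy store that simply has no entry (`ErrNotFound`).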

…datastores

Add comprehensive test validating core use case where detectors share data through writable datastores. Tests proper event-based communication where one detector writes threat intelligence and another reads it during event processing.

Test scenarios:
- Threat intelligence sharing between detectors
- Ownership model (single writer, multiple readers)
- Graceful degradation when store unavailable
- Batch operations (1000+ entries)

All communication happens through OnEvent() methods, demonstrating realistic detector interaction patterns.

Enable declarative threat detection via YAML files with full Go detector parity. Supports event filtering, runtime data extraction, and auto-population.

- YAML schema with validation
- Field extraction using v1beta1 helpers
- CLI: --detectors yaml-dir=/path/to/dir
- Config: detectors.yaml-dir
- Default paths: ./detectors, /etc/tracee/detectors
- Test coverage: 48.1%

Add example YAML detector and update documentation:
- suspicious_binary_execution.yaml: detects execution of networking tools
- Updated detectors/README.md with YAML detector documentation
- Added detectors/yaml/README.md with usage guide

Add complete user guide for YAML detectors:
- Quick start with working examples
- Full schema reference with all fields
- Event filtering and data extraction guide
- Complete list of supported extraction paths
- Deployment options and best practices
- Troubleshooting guide
- Updated detector index with YAML guide link

Add 5 integration tests covering:
- Basic YAML detector loading and registration
- Event generation with field extraction (data.*, workload.*)
- 2-level detector chaining/composition
- Data filter application (matching/non-matching)
- Error handling (invalid YAML, missing fields)

Tests follow existing integration test patterns with proper cleanup, goleak checking, and clear logging for debugging.

Add comprehensive testing infrastructure for unit testing detectors:
- TestHarness for engine setup and detector registration
- Event builders for constructing test inputs
- Assertion helpers for validating outputs
- SimpleDetectorTest for lightweight unit testing
- Support for both Go and YAML detectors
- Examples demonstrating stateful and stateless detector testing

Add parent field to DetectedFrom for complete detection provenance through multi-level chains. Engine automatically preserves chain when dispatching between detectors.

- Add DetectedFrom.parent field (protobuf)
- Auto-chain parent in dispatch logic with cloning
- Add helpers: GetDetectionChain, GetRootDetection, GetChainDepth
- Update docs and add chain tests (2-level, 3-level)
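
Helpers over a parent-linked chain might behave roughly as follows. The `DetectedFrom` struct here is a hypothetical plain-Go stand-in for the protobuf message, and `detectionChain`, `rootDetection`, and `chainDepth` are illustrative counterparts of the helpers named above; their real signatures and ordering are not shown in this PR text.

```go
package main

import "fmt"

// DetectedFrom is a simplified stand-in for the protobuf message; Parent
// points at the detection that triggered this one (nil at the root).
type DetectedFrom struct {
	Name   string
	Parent *DetectedFrom
}

// detectionChain returns the chain from the immediate source up to the root.
func detectionChain(d *DetectedFrom) []*DetectedFrom {
	var chain []*DetectedFrom
	for cur := d; cur != nil; cur = cur.Parent {
		chain = append(chain, cur)
	}
	return chain
}

// rootDetection follows Parent links to the original event, or returns nil.
func rootDetection(d *DetectedFrom) *DetectedFrom {
	if d == nil {
		return nil
	}
	for d.Parent != nil {
		d = d.Parent
	}
	return d
}

// chainDepth counts the links in the chain (0 for a nil input).
func chainDepth(d *DetectedFrom) int {
	return len(detectionChain(d))
}

func main() {
	// Assumed 3-level chain: execve -> suspicious_exec -> lateral_movement input.
	root := &DetectedFrom{Name: "execve"}
	mid := &DetectedFrom{Name: "suspicious_exec", Parent: root}
	leaf := &DetectedFrom{Name: "lateral_movement_input", Parent: mid}
	fmt.Println(chainDepth(leaf), rootDetection(leaf).Name)
}
```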

- Add Common Expression Language (CEL) for dynamic conditions and field extraction
- Support short form (field names) and long form (CEL expressions) for output fields
- Add helper functions: getData(), getDataInt(), getDataUInt(), hasData()
- Rename extract_fields -> fields, source -> expression for clarity
- Update all existing YAML detectors to use CEL syntax
- Add comprehensive documentation for CEL features
- All unit tests pass (38.3% coverage)
- YAML detectors load successfully with CEL compilation

Breaking changes:
- YAML schema: extract_fields -> fields, source -> expression
- Field extraction now uses CEL expressions instead of JSONPath-like paths

Implements CEL global list variables to share common lists (e.g., shell binaries, sensitive paths) across YAML detectors. Lists are defined in {detector-dir}/lists/ subdirectory and exposed as CEL variables.

- Add list schema and loader with validation (uppercase snake_case names)
- Register lists as CEL variables (list<string>) in environment
- Pass lists to CEL evaluation context at runtime
- Add comprehensive unit and integration tests (16 new tests)
- Add example shell_binaries list and detector
- Update documentation with usage examples

Lists are compiled into CEL at load time for zero runtime overhead and compile-time type safety.
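
The name validation step could look like the sketch below. The regular expression is an assumption about what "uppercase snake_case" means here (a leading letter, then letters or digits, with single underscores between segments); the loader in this PR may accept a different set of names, and `validListName` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"regexp"
)

// listNameRe is an assumed definition of "uppercase snake_case".
var listNameRe = regexp.MustCompile(`^[A-Z][A-Z0-9]*(_[A-Z0-9]+)*$`)

// validListName reports whether name could be exposed as a CEL list variable
// under the assumed naming rule.
func validListName(name string) bool {
	return listNameRe.MatchString(name)
}

func main() {
	for _, name := range []string{"SHELL_BINARIES", "shell_binaries", "SHELL__BINARIES"} {
		fmt.Printf("%s: %v\n", name, validListName(name))
	}
}
```

Rejecting bad names at load time fits the commit's goal of surfacing errors when detectors are compiled rather than during event evaluation.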

Expose all core datastores (Process, Container, System, Kernel, DNS, Syscall) as namespaced CEL functions for use in YAML detector conditions and outputs.

Key changes:
- Add datastores.Registry field to YAMLDetector, rebuild CEL env in Init()
- Implement 12 datastore functions: process.get/getAncestry/getChildren, container.get/getByName, system.info, kernel.resolveSymbol/getSymbolAddress, dns.getResponse, syscall.getName/getId
- Add comprehensive test coverage with mock datastores
- Update documentation with examples and usage patterns

Functions return null for not-found entities, handle time.Time conversion, and integrate seamlessly with existing CEL expressions.

Add 8 string utility functions for YAML detectors:
- split(str, delimiter): Split string into list
- join(list, delimiter): Join list into string
- trim(str): Remove leading/trailing whitespace
- replace(str, old, new): Replace all occurrences
- upper(str): Convert to uppercase
- lower(str): Convert to lowercase
- basename(path): Get filename from path
- dirname(path): Get directory from path

Functions are available in both conditions and output expressions. All functions handle CEL's various list representations ([]string, []interface{}, []ref.Val). Includes comprehensive unit tests and documentation updates.

Support proto-native PipelineEvents to avoid redundant conversions. Detector outputs now flow as protobuf without trace.Event conversion.

- Make PipelineEvent.ProtoEvent public
- Update ToProto() to handle proto-native events (Event == nil)
- Add matchPoliciesProto() for proto-native filtering
- Unify sink stage to handle both event types
- Pool all PipelineEvents uniformly

Eliminates proto→trace→proto conversion overhead for detector outputs.

github-actions bot added area/ebpf kind/documentation area/testing area/UX area/events area/build area/api labels Oct 30, 2025

github-actions bot assigned yanivagman Oct 30, 2025

yanivagman force-pushed the detectors branch 4 times, most recently from 8cc653f to 817333b Compare November 4, 2025 16:03

github-actions bot added area/filtering area/flags area/grpc labels Nov 4, 2025

yanivagman force-pushed the detectors branch 14 times, most recently from 10f7269 to a659dbe Compare November 12, 2025 17:25

yanivagman added 29 commits November 27, 2025 16:02

feat(detectors): update to use new datastore error handling

61f2919

Update example detectors and tests to use error returns instead of bool. Check errors.Is(err, datastores.ErrNotFound) for not-found cases.

chain test

e21960a

complete signatures to detectors migration

8edb3a1

yanivagman force-pushed the detectors branch from 4ca54eb to 1444cab Compare November 27, 2025 14:03
