feat(logs): add live tailing functionality for logs #42214
Conversation
Size Change: 0 B · Total Size: 3.41 MB
8 files reviewed, 5 comments
products/logs/frontend/logsLogic.tsx
Outdated
```
        latest_time_bucket = time_bucket
    }
    // insert all new logs into the sparkline data as their own bucket - we'll later aggregate them
    values.sparkline.push({ time: time_bucket.toISOString(), level: log.level, count: 1 })
```
logic: Mutating values.sparkline directly violates immutability principles - causes unpredictable state behavior and breaks Kea's reactivity model
```suggestion
const sparklineWithNewLogs = [...values.sparkline, { time: time_bucket.toISOString(), level: log.level, count: 1 }]
```
```
let latest_time_bucket = dayjs(last_bucket)
for (const log of logs) {
    const time_bucket = dayjs.unix(Math.floor(dayjs(log.timestamp).unix() / interval) * interval)
    if (time_bucket.isAfter(latest_time_bucket)) {
        latest_time_bucket = time_bucket
    }
    // insert all new logs into the sparkline data as their own bucket - we'll later aggregate them
    values.sparkline.push({ time: time_bucket.toISOString(), level: log.level, count: 1 })
}
actions.setSparkline(
    values.sparkline
        .sort(
            // sort buckets by time, then level
            (a, b) => dayjs(a.time).diff(dayjs(b.time)) || a.level.localeCompare(b.level)
        )
        .reduce((acc, curr) => {
            // aggregate buckets by time and level
            const index = acc.findIndex(
                (item) => dayjs(item.time).isSame(dayjs(curr.time)) && item.level === curr.level
            )
            if (index === -1) {
                // haven't seen this time/level combo before, add it to the accumulator
                acc.push(curr)
            } else {
                // increase existing bucket
                acc[index].count += curr.count
            }
            return acc
            // Keep the overall sparkline time range the same by dropping older buckets
        }, [])
        .filter((item) => latest_time_bucket.diff(dayjs(item.time), 'seconds') <= duration)
)
```
style: Collect new sparkline entries first, then create new immutable array - avoids mutating state and eliminates need for intermediate array
```
let latest_time_bucket = dayjs(last_bucket)
const newSparklineEntries = []
for (const log of logs) {
    const time_bucket = dayjs.unix(Math.floor(dayjs(log.timestamp).unix() / interval) * interval)
    if (time_bucket.isAfter(latest_time_bucket)) {
        latest_time_bucket = time_bucket
    }
    newSparklineEntries.push({ time: time_bucket.toISOString(), level: log.level, count: 1 })
}
actions.setSparkline(
    [...values.sparkline, ...newSparklineEntries]
        .sort(
            // sort buckets by time, then level
            (a, b) => dayjs(a.time).diff(dayjs(b.time)) || a.level.localeCompare(b.level)
        )
        .reduce((acc, curr) => {
            // aggregate buckets by time and level
            const index = acc.findIndex(
                (item) => dayjs(item.time).isSame(dayjs(curr.time)) && item.level === curr.level
            )
            if (index === -1) {
                // haven't seen this time/level combo before, add it to the accumulator
                acc.push(curr)
            } else {
                // increase existing bucket
                acc[index].count += curr.count
            }
            return acc
            // Keep the overall sparkline time range the same by dropping older buckets
        }, [])
        .filter((item) => latest_time_bucket.diff(dayjs(item.time), 'seconds') <= duration)
)
```
```
    }
},
pollForNewLogs: async () => {
    if (!values.liveTailRunning || values.orderBy !== 'latest' || document.hidden) {
```
logic: Skips polling when tab hidden but doesn't resume when tab becomes visible - live tail stops working after tab switch
Pull request overview
This PR adds live tailing functionality to the logs viewer, allowing users to continuously poll for and display new log entries in real-time. The implementation includes frontend polling logic with exponential backoff, a ClickHouse-based checkpoint system to track log ingestion lag, and UI enhancements with visual indicators for new entries.
Key Changes:
- Implements live tail polling with exponential backoff (1-5s intervals) that fetches new logs matching current filters
- Adds Kafka metrics table and checkpoint system to ensure complete log retrieval by filtering on observed_timestamp
- Introduces live tail toggle button with disabled states based on query requirements (latest ordering, open-ended time range)
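The 1-5s exponential backoff could be sketched as a small helper like the following (the names `MIN_POLL_MS`, `MAX_POLL_MS`, and `nextPollDelay` are illustrative, not the identifiers used in `logsLogic.tsx`):

```typescript
// Hypothetical sketch of a 1-5s exponential backoff between polls.
const MIN_POLL_MS = 1000
const MAX_POLL_MS = 5000

// Double the delay after each empty poll, capped at MAX_POLL_MS;
// reset to MIN_POLL_MS whenever new logs arrive.
function nextPollDelay(current: number, gotNewLogs: boolean): number {
    if (gotNewLogs) {
        return MIN_POLL_MS
    }
    return Math.min(current * 2, MAX_POLL_MS)
}
```

Each empty poll doubles the wait up to the cap, while a poll that returns new logs resets it, keeping the tail responsive without hammering ClickHouse when nothing is arriving.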
Reviewed changes
Copilot reviewed 11 out of 14 changed files in this pull request and generated 14 comments.
| File | Description |
|---|---|
| `products/logs/frontend/logsLogic.tsx` | Core live tail logic including polling, checkpoint management, sparkline updates, and state management |
| `products/logs/frontend/LogsScene.tsx` | UI components for live tail toggle button and status display |
| `products/logs/backend/api.py` | API handling for live checkpoint parameter and optimization bypass during live tailing |
| `products/logs/backend/logs_query_runner.py` | Query builder integration for checkpoint filtering and result field mapping |
| `posthog/schema.py` | Python schema updates for liveLogsCheckpoint field |
| `frontend/src/queries/schema/schema-general.ts` | TypeScript schema for checkpoint and new log indicators |
| `posthog/hogql/database/schema/logs.py` | Database schema for Kafka metrics table |
| `posthog/hogql/database/database.py` | Database registration for logs_kafka_metrics table |
| `bin/clickhouse-logs.sql` | ClickHouse table and materialized view for tracking Kafka consumption metrics |
| `frontend/src/lib/lemon-ui/icons/icons.tsx` | New pause circle icon for UI |
| `frontend/src/lib/lemon-ui/LemonTable/LemonTable.tsx` | Type definition for new row status |
| `frontend/src/lib/lemon-ui/LemonTable/LemonTable.scss` | Flash animation for newly arrived logs |
```
if (response.results.length > 0) {
    // the live_logs_checkpoint is the latest known timestamp for which we know we have all logs up to that point
    // it's returned from clickhouse as a value on every log row - but the value is fixed per query
    actions.setLiveLogsCheckpoint(response.results[0].live_logs_checkpoint)
}
```
Copilot AI · Nov 27, 2025
When the initial query fetches logs, it should also set the liveLogsCheckpoint from the first log entry (similar to what's done in pollForNewLogs). Without this, if the user clicks "Live tail" immediately after the first query returns results, the checkpoint will be null and all logs will be refetched. Consider extracting this checkpoint-setting logic into a helper function or adding it to the fetchLogs success handler.
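The helper Copilot suggests might look something like this (a sketch only; `LogRow` and `checkpointFromResults` are hypothetical names, and the real row type lives in the query schema):

```typescript
// Hypothetical shape of one result row; the actual type comes from the query schema.
interface LogRow {
    live_logs_checkpoint: string
}

// Extract the checkpoint from the first row, or null for an empty result set.
// Calling this from both the initial fetch handler and pollForNewLogs would keep
// the checkpoint-setting logic in one place.
function checkpointFromResults(results: LogRow[]): string | null {
    return results.length > 0 ? results[0].live_logs_checkpoint : null
}
```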
```
expireLiveTail: async ({}, breakpoint) => {
    await breakpoint(30000)
    if (values.liveTailRunning) {
        return
    }
    actions.setLiveTailExpired(true)
},
```
Copilot AI · Nov 27, 2025
The expiration logic is inverted. When live tail is running, it returns early without setting the expiration flag. This means the expiration will only be set when live tail is NOT running, which defeats the purpose. The condition should be if (!values.liveTailRunning) to expire only when live tail is not active, or the entire approach should be reconsidered. Additionally, calling expireLiveTail when stopping live tail (line 750) is problematic since it won't actually expire (due to the inverted logic) and wastes a 30s timeout.
Nope, Copilot has this backwards: we're expiring the ability to click the button when it's stopped, not expiring the running live tail.
products/logs/frontend/logsLogic.tsx
Outdated
```
setServiceNames: (serviceNames: LogsQuery['serviceNames']) => ({ serviceNames }),
setWrapBody: (wrapBody: boolean) => ({ wrapBody }),
setPrettifyJson: (prettifyJson: boolean) => ({ prettifyJson }),
setLiveLogsCheckpoint: (liveLogsCheckpoint: string) => ({ liveLogsCheckpoint }),
```
Copilot AI · Nov 27, 2025
The setLiveLogsCheckpoint action expects a non-nullable string, but the reducer initializes it as string | null and it can be set to null elsewhere in the code (e.g., line 662 suggestion). The action signature should be (liveLogsCheckpoint: string | null) to match the actual usage and the reducer type.
```suggestion
setLiveLogsCheckpoint: (liveLogsCheckpoint: string | null) => ({ liveLogsCheckpoint }),
```
```
const liveTailController = new AbortController()
const signal = liveTailController.signal
actions.cancelInProgressLiveTail(liveTailController)
let duration = 0
```
Copilot AI · Nov 27, 2025
The variable name duration is misleading as it represents the time taken to execute the API request, not the duration of the live tail. Consider renaming it to requestDuration or apiCallDuration for clarity.
products/logs/frontend/logsLogic.tsx
Outdated
```
const first_bucket = values.sparklineData.dates[0]
const last_bucket = values.sparklineData.dates[values.sparklineData.dates.length - 1]
const duration = dayjs(last_bucket).diff(first_bucket, 'seconds')
```
Copilot AI · Nov 27, 2025
Variable name duration in the addLogsToSparkline function conflicts with the duration variable in pollForNewLogs and represents a different concept (time range of sparkline vs. API request time). Consider renaming to sparklineTimeRangeSeconds or sparklineDurationSeconds for clarity.
jonmcwest left a comment
Please can we:
- remove the unused setLastSparklineUpdate
- remove the sparkline reducer and let the loader handle the sparkline state
Yep, those were both conflict artefacts (I resolved these conflicts like 10 times :/)

Looking forward to using this, thanks very much @charlesvien && @frankh
```
class LogsKafkaMetricsTable(DANGEROUS_NoTeamIdCheckTable):
    """
    Table stores meta information about kafka consumption _not_ scoped to teams
```
Nice
Problem
Brought in this external contributor's PR, #41598
Changes
Screen.Recording.2025-11-27.at.15.03.24.mov
How did you test this code?
ran locally
Changelog: (features only) Is this feature complete?