Conversation

@frankh frankh (Contributor) commented Nov 27, 2025

Problem

Brought in this external contributor's PR, #41598.

Changes

  • Added a "Live Tail" toggle button next to the search button (requires an open-ended time range and latest ordering)
  • Implemented polling that continuously fetches new log entries matching the current query filters and appends them to the logs list (while slicing the list to stay within the log entry limit)
  • Implemented exponential backoff: the polling interval starts at 1s per request and increases x1.5 each time a query returns no new logs to append, up to a maximum delay of 5s per request
  • Added a "live logs checkpoint" subquery to the logs query, which gets the latest observed timestamp at which we know we've received all logs from capture, and filters new logs by this
  • "Live tail" can only be started within 30 seconds of a query; after that you need to search again before live tailing
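
The backoff schedule described above can be sketched as follows. The constants mirror the numbers in the bullet (1s start, x1.5 growth, 5s cap), but the function name and shape are illustrative, not the PR's actual implementation:

```typescript
// Illustrative backoff schedule: 1s start, x1.5 on each empty poll, 5s cap,
// reset to 1s whenever new logs arrive. Names are assumptions, not the PR's code.
const MIN_POLL_MS = 1000
const MAX_POLL_MS = 5000
const BACKOFF_FACTOR = 1.5

function nextPollDelay(currentDelayMs: number, newLogsReceived: boolean): number {
    if (newLogsReceived) {
        // fresh logs: go back to the fastest polling rate
        return MIN_POLL_MS
    }
    // nothing new: back off gradually, but never slower than one request per 5s
    return Math.min(currentDelayMs * BACKOFF_FACTOR, MAX_POLL_MS)
}
```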
Screen.Recording.2025-11-27.at.15.03.24.mov

How did you test this code?

Ran locally.

Changelog: (features only) Is this feature complete?

@frankh frankh requested a review from jonmcwest November 27, 2025 15:05
@frankh frankh requested a review from a team as a code owner November 27, 2025 15:05
Copilot AI review requested due to automatic review settings November 27, 2025 15:05
Copilot finished reviewing on behalf of frankh November 27, 2025 15:09
@github-actions github-actions bot (Contributor) commented Nov 27, 2025

Size Change: 0 B

Total Size: 3.41 MB

Filename                    Size
frontend/dist/toolbar.js    3.41 MB

compressed-size-action

@greptile-apps greptile-apps bot (Contributor) left a comment

8 files reviewed, 5 comments

                    latest_time_bucket = time_bucket
                }
                // insert all new logs into the sparkline data as their own bucket - we'll later aggregate them
                values.sparkline.push({ time: time_bucket.toISOString(), level: log.level, count: 1 })

logic: Mutating values.sparkline directly violates immutability principles - causes unpredictable state behavior and breaks Kea's reactivity model

Suggested change
values.sparkline.push({ time: time_bucket.toISOString(), level: log.level, count: 1 })
const sparklineWithNewLogs = [...values.sparkline, { time: time_bucket.toISOString(), level: log.level, count: 1 }]
Path: products/logs/frontend/logsLogic.tsx, line 853

Comment on lines 846 to 926
            let latest_time_bucket = dayjs(last_bucket)
            for (const log of logs) {
                const time_bucket = dayjs.unix(Math.floor(dayjs(log.timestamp).unix() / interval) * interval)
                if (time_bucket.isAfter(latest_time_bucket)) {
                    latest_time_bucket = time_bucket
                }
                // insert all new logs into the sparkline data as their own bucket - we'll later aggregate them
                values.sparkline.push({ time: time_bucket.toISOString(), level: log.level, count: 1 })
            }
            actions.setSparkline(
                values.sparkline
                    .sort(
                        // sort buckets by time, then level
                        (a, b) => dayjs(a.time).diff(dayjs(b.time)) || a.level.localeCompare(b.level)
                    )
                    .reduce((acc, curr) => {
                        // aggregate buckets by time and level
                        const index = acc.findIndex(
                            (item) => dayjs(item.time).isSame(dayjs(curr.time)) && item.level === curr.level
                        )
                        if (index === -1) {
                            // haven't seen this time/level combo before, add it to the accumulator
                            acc.push(curr)
                        } else {
                            // increase existing bucket
                            acc[index].count += curr.count
                        }
                        return acc
                        // Keep the overall sparkline time range the same by dropping older buckets
                    }, [])
                    .filter((item) => latest_time_bucket.diff(dayjs(item.time), 'seconds') <= duration)
            )

style: Collect new sparkline entries first, then create new immutable array - avoids mutating state and eliminates need for intermediate array

            let latest_time_bucket = dayjs(last_bucket)
            const newSparklineEntries = []
            for (const log of logs) {
                const time_bucket = dayjs.unix(Math.floor(dayjs(log.timestamp).unix() / interval) * interval)
                if (time_bucket.isAfter(latest_time_bucket)) {
                    latest_time_bucket = time_bucket
                }
                newSparklineEntries.push({ time: time_bucket.toISOString(), level: log.level, count: 1 })
            }
            actions.setSparkline(
                [...values.sparkline, ...newSparklineEntries]
                    .sort(
                        // sort buckets by time, then level
                        (a, b) => dayjs(a.time).diff(dayjs(b.time)) || a.level.localeCompare(b.level)
                    )
                    .reduce((acc, curr) => {
                        // aggregate buckets by time and level
                        const index = acc.findIndex(
                            (item) => dayjs(item.time).isSame(dayjs(curr.time)) && item.level === curr.level
                        )
                        if (index === -1) {
                            // haven't seen this time/level combo before, add it to the accumulator
                            acc.push(curr)
                        } else {
                            // increase existing bucket
                            acc[index].count += curr.count
                        }
                        return acc
                        // Keep the overall sparkline time range the same by dropping older buckets
                    }, [])
                    .filter((item) => latest_time_bucket.diff(dayjs(item.time), 'seconds') <= duration)
            )
Path: products/logs/frontend/logsLogic.tsx, lines 846-877
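
The aggregation step in the snippets above (merging per-log entries into per-time, per-level buckets) can be shown dependency-free. This is a simplified sketch, not the PR's dayjs-based code; it assumes ISO-8601 timestamps, which sort correctly as strings:

```typescript
// Simplified sketch of the bucket aggregation: sum counts per (time, level)
// pair, then sort by time and level. The Bucket shape is assumed from the
// snippet above, not taken verbatim from the PR.
interface Bucket {
    time: string // ISO-8601 timestamp of the bucket start
    level: string
    count: number
}

function aggregateBuckets(buckets: Bucket[]): Bucket[] {
    const byKey = new Map<string, Bucket>()
    for (const b of buckets) {
        const key = `${b.time}|${b.level}`
        const existing = byKey.get(key)
        if (existing) {
            existing.count += b.count
        } else {
            // copy so we never mutate the caller's objects
            byKey.set(key, { ...b })
        }
    }
    // sort by time, then level, matching the original comparator;
    // ISO timestamps sort lexicographically, so localeCompare is enough here
    return [...byKey.values()].sort(
        (a, b) => a.time.localeCompare(b.time) || a.level.localeCompare(b.level)
    )
}
```

Using a Map keyed on `time|level` also avoids the O(n^2) `findIndex` scan inside the reduce.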

        }
    },
    pollForNewLogs: async () => {
        if (!values.liveTailRunning || values.orderBy !== 'latest' || document.hidden) {

logic: Skips polling when tab hidden but doesn't resume when tab becomes visible - live tail stops working after tab switch

Path: products/logs/frontend/logsLogic.tsx, line 754
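
One way to address the issue above is to resume the poll loop from a `visibilitychange` listener, keeping the decision itself pure and testable. The names here (`liveTailRunning`, `pollForNewLogs`) stand in for the logic's real value and action; they are assumptions, not the actual PostHog API:

```typescript
// Pure decision for a hypothetical visibilitychange handler, e.g.:
//   document.addEventListener('visibilitychange', () => {
//       if (shouldResumePolling(document.hidden, values.liveTailRunning)) {
//           actions.pollForNewLogs()
//       }
//   })
function shouldResumePolling(tabHidden: boolean, liveTailRunning: boolean): boolean {
    // Resume only when the tab is visible again and live tail is still on.
    return !tabHidden && liveTailRunning
}
```

The listener would be registered on logic mount and removed on unmount so a backgrounded tab picks up polling immediately instead of waiting for the next timer tick.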

Copilot AI (Contributor) left a comment

Pull request overview

This PR adds live tailing functionality to the logs viewer, allowing users to continuously poll for and display new log entries in real time. The implementation includes frontend polling logic with exponential backoff, a ClickHouse-based checkpoint system to track log ingestion lag, and UI enhancements with visual indicators for new entries.

Key Changes:

  • Implements live tail polling with exponential backoff (1-5s intervals) that fetches new logs matching current filters
  • Adds Kafka metrics table and checkpoint system to ensure complete log retrieval by filtering on observed_timestamp
  • Introduces live tail toggle button with disabled states based on query requirements (latest ordering, open-ended time range)

Reviewed changes

Copilot reviewed 11 out of 14 changed files in this pull request and generated 14 comments.

Summary per file:

  • products/logs/frontend/logsLogic.tsx: Core live tail logic including polling, checkpoint management, sparkline updates, and state management
  • products/logs/frontend/LogsScene.tsx: UI components for the live tail toggle button and status display
  • products/logs/backend/api.py: API handling for the live checkpoint parameter and optimization bypass during live tailing
  • products/logs/backend/logs_query_runner.py: Query builder integration for checkpoint filtering and result field mapping
  • posthog/schema.py: Python schema updates for the liveLogsCheckpoint field
  • frontend/src/queries/schema/schema-general.ts: TypeScript schema for checkpoint and new log indicators
  • posthog/hogql/database/schema/logs.py: Database schema for the Kafka metrics table
  • posthog/hogql/database/database.py: Database registration for the logs_kafka_metrics table
  • bin/clickhouse-logs.sql: ClickHouse table and materialized view for tracking Kafka consumption metrics
  • frontend/src/lib/lemon-ui/icons/icons.tsx: New pause circle icon for UI
  • frontend/src/lib/lemon-ui/LemonTable/LemonTable.tsx: Type definition for new row status
  • frontend/src/lib/lemon-ui/LemonTable/LemonTable.scss: Flash animation for newly arrived logs


Comment on lines 780 to 839
            if (response.results.length > 0) {
                // the live_logs_checkpoint is the latest known timestamp for which we know we have all logs up to that point
                // it's returned from clickhouse as a value on every log row - but the value is fixed per query
                actions.setLiveLogsCheckpoint(response.results[0].live_logs_checkpoint)
            }

Copilot AI Nov 27, 2025


When the initial query fetches logs, it should also set the liveLogsCheckpoint from the first log entry (similar to what's done in pollForNewLogs). Without this, if the user clicks "Live tail" immediately after the first query returns results, the checkpoint will be null and all logs will be refetched. Consider extracting this checkpoint-setting logic into a helper function or adding it to the fetchLogs success handler.
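
Copilot's suggestion to extract the checkpoint-setting into a helper could look like the sketch below. The row shape and helper name are assumptions for illustration, not the PR's code:

```typescript
// Hypothetical shared helper: pull the checkpoint from any query response so
// both the initial fetch and the poll loop can set it the same way.
interface LogRow {
    live_logs_checkpoint: string
}

function extractCheckpoint(results: LogRow[]): string | null {
    // The checkpoint value is fixed per query, so the first row is enough;
    // an empty result set yields no checkpoint.
    return results.length > 0 ? results[0].live_logs_checkpoint : null
}
```

Both `fetchLogs` and `pollForNewLogs` would then call something like `actions.setLiveLogsCheckpoint(extractCheckpoint(response.results))`.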

Comment on lines +709 to +757
    expireLiveTail: async ({}, breakpoint) => {
        await breakpoint(30000)
        if (values.liveTailRunning) {
            return
        }
        actions.setLiveTailExpired(true)
    },

Copilot AI Nov 27, 2025


The expiration logic is inverted. When live tail is running, it returns early without setting the expiration flag. This means the expiration will only be set when live tail is NOT running, which defeats the purpose. The condition should be if (!values.liveTailRunning) to expire only when live tail is not active, or the entire approach should be reconsidered. Additionally, calling expireLiveTail when stopping live tail (line 750) is problematic since it won't actually expire (due to the inverted logic) and wastes a 30s timeout.

frankh (Contributor, Author) replied:

Nope, Copilot has this backwards: we're expiring the ability to click the button once live tail is stopped, not expiring the running live tail.
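
The behaviour the author describes, stopping live tail opens a 30-second window after which the button expires unless live tail is restarted, can be sketched outside Kea with a plain timeout. The names and structure are illustrative, not the actual listener:

```typescript
// Hypothetical sketch of the intended expiry flow. Kea's breakpoint(30000)
// plays the role of the setTimeout here; this is not the real listener code.
const LIVE_TAIL_EXPIRY_MS = 30_000

// Pure decision: expire only if live tail did not restart during the window.
function shouldExpireLiveTail(liveTailRunningNow: boolean): boolean {
    return !liveTailRunningNow
}

function scheduleLiveTailExpiry(
    isLiveTailRunning: () => boolean,
    setExpired: (expired: boolean) => void,
    delayMs: number = LIVE_TAIL_EXPIRY_MS
): ReturnType<typeof setTimeout> {
    return setTimeout(() => {
        if (shouldExpireLiveTail(isLiveTailRunning())) {
            setExpired(true)
        }
    }, delayMs)
}
```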

setServiceNames: (serviceNames: LogsQuery['serviceNames']) => ({ serviceNames }),
setWrapBody: (wrapBody: boolean) => ({ wrapBody }),
setPrettifyJson: (prettifyJson: boolean) => ({ prettifyJson }),
setLiveLogsCheckpoint: (liveLogsCheckpoint: string) => ({ liveLogsCheckpoint }),

Copilot AI Nov 27, 2025


The setLiveLogsCheckpoint action expects a non-nullable string, but the reducer initializes it as string | null and it can be set to null elsewhere in the code (e.g., line 662 suggestion). The action signature should be (liveLogsCheckpoint: string | null) to match the actual usage and the reducer type.

Suggested change
setLiveLogsCheckpoint: (liveLogsCheckpoint: string) => ({ liveLogsCheckpoint }),
setLiveLogsCheckpoint: (liveLogsCheckpoint: string | null) => ({ liveLogsCheckpoint }),

const liveTailController = new AbortController()
const signal = liveTailController.signal
actions.cancelInProgressLiveTail(liveTailController)
let duration = 0

Copilot AI Nov 27, 2025


The variable name duration is misleading as it represents the time taken to execute the API request, not the duration of the live tail. Consider renaming it to requestDuration or apiCallDuration for clarity.


const first_bucket = values.sparklineData.dates[0]
const last_bucket = values.sparklineData.dates[values.sparklineData.dates.length - 1]
const duration = dayjs(last_bucket).diff(first_bucket, 'seconds')

Copilot AI Nov 27, 2025


Variable name duration in the addLogsToSparkline function conflicts with the duration variable in pollForNewLogs and represents a different concept (time range of sparkline vs. API request time). Consider renaming to sparklineTimeRangeSeconds or sparklineDurationSeconds for clarity.

@frankh frankh force-pushed the charles/logs-live-tailing branch 2 times, most recently from bf0cf42 to a5480ac on November 27, 2025 16:31
@jonmcwest jonmcwest (Contributor) left a comment

Please can we

  • remove the unused setLastSparklineUpdate
  • remove the sparkline reducer and let the loader handle the sparkline state

@frankh frankh (Contributor, Author) commented Nov 27, 2025

Yep, those were both conflict artefacts (I resolved these conflicts like 10 times :/)

@jonmcwest jonmcwest (Contributor)

Looking forward to using this, thanks very much @charlesvien && @frankh

@wiz-7ad640923b wiz-7ad640923b bot commented Nov 27, 2025

Wiz Scan Summary

Scanner                  Findings
Vulnerabilities          -
Sensitive Data           1 Medium
Secrets                  -
IaC Misconfigurations    -
SAST Findings            -
Total                    1 Medium

View scan details in Wiz


@frankh frankh enabled auto-merge (squash) November 27, 2025 17:07

class LogsKafkaMetricsTable(DANGEROUS_NoTeamIdCheckTable):
    """
    Table stores meta information about kafka consumption _not_ scoped to teams
A Member commented:

Nice

@frankh frankh force-pushed the charles/logs-live-tailing branch from 6484fc7 to 9181686 on November 27, 2025 19:32
@frankh frankh force-pushed the charles/logs-live-tailing branch from 9181686 to 2292da5 on November 28, 2025 09:24
@frankh frankh merged commit 0ab14a0 into master Nov 28, 2025
195 of 198 checks passed
@frankh frankh deleted the charles/logs-live-tailing branch November 28, 2025 10:29
5 participants