wait until processImpl completes before closing stream chan#1677
wait until processImpl completes before closing stream chan#1677nmerkulov wants to merge 3 commits intoClickHouse:mainfrom
Conversation
There was a problem hiding this comment.
Pull Request Overview
This PR fixes a race condition in query processing where the stream channel could be closed while a background goroutine was still attempting to write to it, causing a panic.
- Adds synchronization to wait for background goroutine completion before returning from
process()function when context is cancelled - Prevents "send on closed channel" panic during query cancellation
- Ensures proper cleanup and resource management without breaking API changes
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
| // Wait for goroutine to finish before returning | ||
| select { | ||
| case <-errCh: | ||
| case <-doneCh: | ||
| } |
There was a problem hiding this comment.
Corrected spelling of 'untill' to 'until' in the PR title.
|
@nmerkulov thanks for the PR :) do you mind adding a test case to lock the behavior? |
<ClickHouse/clickhouse-go#1677> UPD исправил фикс на надежный. теперь канал не будет закрыт, пока горутина-писатель не завершится Семантически так правильней commit_hash:ef4786541bee52c63054552b0c593ce70fc6c75b
| case <-ctx.Done(): | ||
| c.cancel() | ||
| // Wait for goroutine to finish before returning | ||
| select { |
There was a problem hiding this comment.
My bit of concern with this approach is, if you look at the background goroutine, it doesn't use/watch for passed in ctx for cancellation. It does give it to upstream handle method but we cannot gurantee that handle returns as soon as passed in ctx get cancelled.
the problem is, the whole point of canceling this method is to stop doing handle and return immediately.
With this change, we may indirectly may end-up waiting for handle to "finish".
I would handle the situation ideally like following.
- make the goroutine listen with two
selectcase. a. handle to finish b. context to be done. - Based on which ever comes first (a) or (b), we exit the go-routine.
Does it make sense?
There was a problem hiding this comment.
Also we may have same problem during firstBlock as well
https://github.com/nmerkulov/clickhouse-go/blob/24bbc84a5899c097a66469e063b3ddb3eba39045/conn_process.go#L30-L46
Fix race condition in query processing
Problem
There was a race condition between closing the
streamchannel and writing to it from a background goroutine inconn_query.go.Race condition scenario:
conn_query.gostarts a goroutine that callsc.process(ctx, onProcess)(line 45)c.process()spawns another goroutine internally that callsc.processImpl()(line 106 inconn_process.go)c.processImpl()reads data from the server and callson.data(block)for each data block (line 170 inconn_process.go)c.process()returns withctx.Err()conn_query.goimmediately closes thestreamchannel (line 54)streamviaon.data(block)panic: send on closed channelStack trace example:
Solution
Modified
conn_process.goto ensure that theprocess()function waits for the internal goroutine to complete before returning, even when the context is cancelled.Changes:
ctx.Done()is triggered, after callingc.cancel(), we now wait for the goroutine to finish by selecting onerrChordoneChon.datawill not be called afterprocess()returnsstreamchannel can now be safely closed inconn_query.gowithout race conditionsImpact