Commit 52b8f7f
client pool: Prevent a deadlock in mongoc_client_pool_pop()
I was seeing deadlocks in the client when it lost contact with mongo.
The client was getting stuck on a futex in mongoc_client_pool_pop(),
specifically

    mongoc_cond_wait(&pool->cond, &pool->mutex);

and never making any progress even when mongo came back on-line.
This problem was introduced by commit a8c1da4 ("Client pool tries to
repair unhealthy connections...") which started shutting down excess
connections, i.e. where there were more connections than minPoolSize.
As part of this it would also decrement pool->size. This all happens in
mongoc_client_pool_push().
The problem here was that if you had a single active connection and a
minPoolSize of 0 (the default) then it would try to remove the oldest
client from the queue (in this case there wasn't one) and then decrement
pool->size.
pool->size would now be 0.
Then we move into the part of the above commit that deals with duff
connections. In this case, with no working mongo, it will want to
destroy this connection, which it does, but it will also decrement
pool->size again, which would now equal -1, except pool->size is a
uint32_t.
Then in mongoc_client_pool_pop() we have this check

    if (pool->size < pool->max_pool_size) {

pool->size is now some _large_ number, UINT32_MAX perhaps, and so that
evaluates to false and we move to the else branch, which then
executes

    mongoc_cond_wait(&pool->cond, &pool->mutex);

and this is where we get stuck. It thinks there are too many connections
and at the same time nothing is ever going to be returned to the queue
(we only had a single active connection), and even if we did return a
connection to the pool, UINT32_MAX - 1 is still likely going to be >
pool->max_pool_size.
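
The wraparound described above can be reproduced in isolation. This is a minimal sketch, not the libmongoc code: struct counter_pool and both helper functions are hypothetical names invented for illustration, and only the two counters involved in the check are modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the pool's counters; not the real mongoc struct. */
struct counter_pool {
    uint32_t size;          /* clients the pool believes exist */
    uint32_t max_pool_size; /* upper bound on clients */
};

/* Same shape as the check quoted above: may pop() create a new client? */
static bool counter_pool_can_create(const struct counter_pool *pool)
{
    return pool->size < pool->max_pool_size;
}

/* Replays the two decrements described above, starting from one client. */
static uint32_t counter_pool_double_decrement(struct counter_pool *pool)
{
    assert(pool->size == 1);
    pool->size--; /* 1 -> 0: accounted for a queued client that did not exist */
    pool->size--; /* 0 -> UINT32_MAX: unsigned arithmetic wraps around */
    return pool->size;
}
```

With size starting at 1 and an arbitrary max_pool_size of 100, counter_pool_can_create() is true before the double decrement and false afterwards, which corresponds to the point where pop() falls through to the wait.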
So to avoid this problem we simply don't want to try and remove
connections from the queue that don't exist and more importantly don't
decrement pool->size when we don't actually remove an old connection.
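
The idea of the fix can be sketched on a toy model. This is an illustration under stated assumptions, not the actual patch: struct trim_pool, trim_queue_pop_head() and trim_pool_trim_excess() are hypothetical names, and the idle queue is reduced to a simple count.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: `queued` counts idle clients sitting in the queue,
 * `size` counts every client the pool is accounting for. */
struct trim_pool {
    uint32_t size;
    uint32_t min_pool_size;
    uint32_t queued;
};

/* Removes the oldest idle client if there is one.
 * Returns 1 if a client was removed, 0 if the queue was empty. */
static int trim_queue_pop_head(struct trim_pool *pool)
{
    if (pool->queued == 0) {
        return 0;
    }
    pool->queued--;
    return 1;
}

/* Guarded trim: only decrement size when a client was really removed. */
static void trim_pool_trim_excess(struct trim_pool *pool)
{
    if (pool->size > pool->min_pool_size && trim_queue_pop_head(pool)) {
        pool->size--;
    }
}
```

With one active client and an empty queue, trim_pool_trim_excess() leaves size at 1, so a later decrement for a destroyed connection takes it to 0 rather than wrapping.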
After this change, I can start mongo, start the client, kill mongo;
the client shows errors but keeps trying to re-connect. Restart mongo and
the client is then happy again.
Signed-off-by: Andrew Clayton <[email protected]>
Closes #1871
Parent: dc9e04b
1 file changed, 4 insertions(+), 2 deletions(-)
Hunk: original lines 202-209, new lines 202-211; original lines 205-206 replaced by new lines 205-208.