Skip to content

[enhance] buffer storage#83

Open
iaojnh wants to merge 44 commits intomainfrom
feat/buffer_storage_vec
Open

[enhance] buffer storage#83
iaojnh wants to merge 44 commits intomainfrom
feat/buffer_storage_vec

Conversation

@iaojnh
Copy link
Collaborator

@iaojnh iaojnh commented Feb 9, 2026

resolve #64

@Cuiyus

This comment was marked as outdated.

@greptile-apps
Copy link

greptile-apps bot commented Feb 27, 2026

Greptile Summary

This PR introduces a new buffer pool-based storage backend with several critical implementation bugs that must be fixed before merging.

Critical bugs identified:

  1. Data corruption in WrappedSegment::fetch() (buffer_storage.cc:75-88): The segment-relative offset is passed as an absolute file offset to get_buffer(), then added again via pointer arithmetic. This causes reads to access the wrong file location and can result in out-of-bounds buffer access when offset ≥ len.

  2. Race condition in acquire_buffer() (buffer_pool.cc:184-190): The recycle() function is called without holding the required mutex_, creating a window where concurrent threads can race during eviction and placement of blocks.

  3. Memory leak in VecBufferPool destructor (buffer_pool.h:101-103): All char* buffers allocated via ailego_malloc() and stored in free_buffers_ or lp_map_.entries_[*].buffer are never freed on destruction, causing permanent memory leaks on every pool teardown.

  4. File descriptor leak in constructor (buffer_pool.cc:140-150): If fstat() fails after a successful open(), the exception is thrown before the constructor completes, preventing the destructor from running and leaking the file descriptor.

These are not edge cases — fetch() is on the hot read path, and the mutex and fd leaks occur during normal initialization and shutdown.

Confidence Score: 1/5

  • Not safe to merge — critical data corruption in fetch() causes silent data corruption and buffer overflow; mutex and fd leaks occur during normal operation.
  • All four prior comments describe concrete, reproducible bugs verified in the actual code: (1) fetch() double-applies offset leading to wrong file reads and OOB access; (2) unprotected recycle() call enables race conditions; (3) destructor leaks all allocated buffers; (4) fd leak on constructor failure. These are not edge cases or speculative — they are critical bugs on hot paths or during normal initialization/teardown.
  • src/core/utility/buffer_storage.cc (offset bug), src/ailego/buffer/buffer_pool.cc (mutex and fd leaks), src/include/zvec/ailego/buffer/buffer_pool.h (destructor memory leak)

Sequence Diagram

sequenceDiagram
    participant Caller
    participant WrappedSegment
    participant BufferStorage
    participant VecBufferPoolHandle
    participant VecBufferPool
    participant LPMap

    Caller->>WrappedSegment: fetch(offset, buf, len)
    WrappedSegment->>BufferStorage: get_buffer(offset❌, len, segment_id)
    note over WrappedSegment,BufferStorage: BUG: should be absolute file offset,<br/>not segment-relative offset

    BufferStorage->>VecBufferPoolHandle: get_block(offset, len, segment_id)
    VecBufferPoolHandle->>VecBufferPool: acquire_buffer(segment_id, offset, len, 5)
    VecBufferPool->>LPMap: acquire_block(segment_id)
    alt Block already cached
        LPMap-->>VecBufferPool: cached buffer ptr
    else Not cached — fetch from disk
        VecBufferPool->>VecBufferPool: pread(fd, buffer, len, offset)
        VecBufferPool->>LPMap: set_block_acquired(segment_id, buffer)
        note over VecBufferPool,LPMap: BUG: recycle() called without mutex_<br/>before reaching this point (line 185)
        LPMap-->>VecBufferPool: placed buffer ptr
    end
    VecBufferPool-->>VecBufferPoolHandle: buffer ptr
    VecBufferPoolHandle-->>BufferStorage: buffer ptr
    BufferStorage-->>WrappedSegment: buffer ptr

    WrappedSegment->>WrappedSegment: memmove(buf, buffer + offset❌, len)
    note over WrappedSegment: BUG: offset applied twice → OOB if offset≥len
    WrappedSegment-->>Caller: len
Loading

Comments Outside Diff (1)

  1. src/core/utility/buffer_storage.cc, line 75-88 (link)

    fetch() uses wrong file offset and double-applies the segment offset

    The offset parameter is segment-relative (0-based within the segment's data), but the code passes it directly to get_buffer(offset, len, segment_id_) as if it were an absolute file offset. The correct absolute offset should be computed as shown in the read() method below (lines 99-101):

    size_t buffer_offset = segment_header_start_offset_ +
                           segment_header_->content_offset +
                           segment_->meta()->data_index;

    Additionally, the offset is applied twice: once implicitly in get_buffer() and again explicitly via + offset in the memmove. This causes reads with non-zero offset to access the wrong file location, and if offset >= len, the pointer arithmetic will escape the allocated buffer entirely (buffer overflow).

    The correct implementation should mirror the read() method:

    size_t buffer_offset = segment_header_start_offset_ +
                           segment_header_->content_offset +
                           segment_->meta()->data_index;
    memmove(buf,
            (const uint8_t *)(owner_->get_buffer(buffer_offset, capacity_, segment_id_)) +
                offset,
            len);

Last reviewed commit: 40d6af9

greptile-apps[bot]

This comment was marked as resolved.

@greptile-apps

This comment was marked as resolved.

chinaux and others added 9 commits March 2, 2026 11:29
Co-authored-by: yinzefeng.yzf <yinzefeng.yzf@alibaba-inc.com>
Changed 'your-org' placeholder to 'alibaba' in the git clone command to reflect the correct repository URL.
…BUTING (#150)

- README.md: remove spurious space in align=" center" → align="center"
  (logo was not centered on GitHub due to invalid HTML attribute value)
- CONTRIBUTING.md: correct Python prerequisite from '>= 3.9' to '3.10 - 3.12'
  to match pyproject.toml classifiers and CI matrix (cp310, cp312)
JalinWang and others added 18 commits March 2, 2026 11:29
* docs: remove x.com in join us

* docs: join discord with widget
Signed-off-by: Salman Muin Kayser Chishti <13schishti@gmail.com>
Signed-off-by: Salman Muin Kayser Chishti <13schishti@gmail.com>
* feat: linux ci with github runner

* fix: add CFLAGS with -march=native

* fix: fix linux x86 ci

* fix: with zen3 in x86 ci

* feat: nightly report with github-runner

* feat: refactor ci workflow

* feat: rename workflow

* feat: rename job

* feat: update ci badge

* feat: remove python coverage
* add Jina Embeddings v5 support

* fix ruff format for jina_embedding_function.py
@iaojnh

This comment was marked as outdated.

@iaojnh

This comment was marked as outdated.

Comment on lines +184 to +190
for (int i = 0; i < retry; i++) {
lp_map_.recycle(free_buffers_);
found = free_buffers_.try_dequeue(buffer);
if (found) {
break;
}
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

recycle() called without holding mutex_ in acquire_buffer()

The header file (buffer_pool.h:76) documents that recycle() must be called under lock. However, at line 185, lp_map_.recycle(free_buffers_) is invoked without holding mutex_. The mutex is only acquired later at line 206 for set_block_acquired().

This creates a race window where concurrent threads can race during the eviction and placement phases. One thread can validate and evict a block while another thread is attempting to acquire and place the same block.

Wrap the recycle call with the mutex:

for (int i = 0; i < retry; i++) {
  {
    std::lock_guard<std::mutex> lock(mutex_);
    lp_map_.recycle(free_buffers_);
  }
  found = free_buffers_.try_dequeue(buffer);
  if (found) break;
}

Comment on lines +101 to +103
~VecBufferPool() {
close(fd_);
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

VecBufferPool destructor leaks all pooled buffers

The destructor only closes the file descriptor but never frees the char* buffers. All buffers allocated via ailego_malloc() in init() (line 158) and enqueued into free_buffers_ are never deallocated. Additionally, any buffers currently held in lp_map_.entries_[*].buffer are also leaked on destruction. This causes a permanent memory leak every time the buffer pool is destroyed.

The destructor should drain both the free pool and any buffers still held in the LPMap:

~VecBufferPool() {
  // Free all buffers in the free list
  char *buf = nullptr;
  while (free_buffers_.try_dequeue(buf)) {
    ailego_free(buf);
  }
  // Free any buffers still pinned in the map
  for (size_t i = 0; i < lp_map_.entry_num(); ++i) {
    char *b = lp_map_.evict_block(i);
    if (b) ailego_free(b);
  }
  close(fd_);
}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Enhance]: Buffer Storage