@emesare emesare commented Jun 1, 2025

This PR stabilizes the format for 1.0. It refactors the format and the Rust API, yielding significant on-disk and in-memory savings along with general performance improvements, while also making the format flexible enough for upcoming features such as fuzzy matching or other matching algorithms.

@emesare emesare self-assigned this Jun 1, 2025
@emesare emesare added the enhancement New feature or request label Jun 1, 2025
@emesare emesare force-pushed the next branch 3 times, most recently from 6c0fd4c to 6242e48 Compare June 4, 2025 16:24
@emesare emesare force-pushed the next branch 4 times, most recently from eaba638 to f44be64 Compare June 10, 2025 21:35
@emesare emesare force-pushed the next branch 6 times, most recently from f531459 to 48a7cb8 Compare June 23, 2025 00:48
@emesare emesare force-pushed the next branch 3 times, most recently from aef7ed0 to ce5d80b Compare July 1, 2025 20:56
- Update to flatbuffers `25.2.10`
- Add fuzzing targets for type and function `from_bytes`
- Update examples
- Simplify type spec
- Make constraints generic and remove specialized constraint lists
- Space optimizations for type and function specs
- More tests with greater coverage
- Introduce the concept of a WARP `File` and `Chunk`s
- Make chunk compression configurable
- Make the `Type` object's class field unboxed (decreases memory pressure)
- Use standard directory structure for Rust API
- Move tests to the `tests` directory for easier discovery
- Remove almost all uses of `unwrap` (needed for server-side parsing)
- Refactor `TypeMetadata`
- Add `mock` module for easy mocking in tests and examples
- Make `Symbol` space optimized
- Switch to using `.warp` extension to represent general analysis data instead of just signatures
- Add format version to `File` and `Chunk` (allow for breaking changes later)
- Make analysis data (signatures and types) copy-on-write (see the `ChunkHandler` impls)
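
To make a few of the items above concrete (a format version on each container, parsing without `unwrap`, and copy-on-write data), here is a minimal sketch. It is not the WARP API or on-disk layout: `DemoChunk`, `ParseError`, the `DEMO` magic, and the 10-byte header are all hypothetical names and choices made up for illustration, and the copy-on-write part uses `std::borrow::Cow` from the standard library, which may or may not resemble what the `ChunkHandler` impls do.

```rust
use std::borrow::Cow;

// Hypothetical layout for illustration only; NOT the real WARP format:
// 4-byte magic | u16 version (LE) | u32 payload length (LE) | payload bytes
const MAGIC: [u8; 4] = *b"DEMO";
const HEADER_LEN: usize = 10;
const SUPPORTED_VERSION: u16 = 1;

#[derive(Debug, PartialEq)]
enum ParseError {
    Truncated,
    BadMagic,
    UnsupportedVersion(u16),
}

#[derive(Debug)]
struct DemoChunk<'a> {
    version: u16,
    // Borrowed from the input buffer until something needs to modify it.
    payload: Cow<'a, [u8]>,
}

// Bounds-checked little-endian reads; `None` means the buffer is too short.
fn read_u16_le(buf: &[u8], at: usize) -> Option<u16> {
    let b = buf.get(at..at.checked_add(2)?)?;
    Some(u16::from_le_bytes([b[0], b[1]]))
}

fn read_u32_le(buf: &[u8], at: usize) -> Option<u32> {
    let b = buf.get(at..at.checked_add(4)?)?;
    Some(u32::from_le_bytes([b[0], b[1], b[2], b[3]]))
}

impl<'a> DemoChunk<'a> {
    /// Parses untrusted bytes without panicking: every failure is an `Err`.
    fn from_bytes(buf: &'a [u8]) -> Result<Self, ParseError> {
        let magic = buf.get(..4).ok_or(ParseError::Truncated)?;
        if magic != &MAGIC[..] {
            return Err(ParseError::BadMagic);
        }
        // Checking the version first leaves room for breaking changes later.
        let version = read_u16_le(buf, 4).ok_or(ParseError::Truncated)?;
        if version != SUPPORTED_VERSION {
            return Err(ParseError::UnsupportedVersion(version));
        }
        let len = read_u32_le(buf, 6).ok_or(ParseError::Truncated)? as usize;
        let end = HEADER_LEN.checked_add(len).ok_or(ParseError::Truncated)?;
        let payload = buf.get(HEADER_LEN..end).ok_or(ParseError::Truncated)?;
        Ok(DemoChunk { version, payload: Cow::Borrowed(payload) })
    }

    /// Clones the payload into an owned buffer only on first mutation.
    fn payload_mut(&mut self) -> &mut Vec<u8> {
        self.payload.to_mut()
    }
}

/// Builds a well-formed demo chunk; used here only to exercise the parser.
fn encode(version: u16, payload: &[u8]) -> Vec<u8> {
    let mut out = MAGIC.to_vec();
    out.extend_from_slice(&version.to_le_bytes());
    out.extend_from_slice(&(payload.len() as u32).to_le_bytes());
    out.extend_from_slice(payload);
    out
}
```

Calling `DemoChunk::from_bytes(&encode(1, b"abc"))` yields a chunk whose payload still borrows the input. The first call to `payload_mut()` copies it, and the original buffer is left untouched. Returning `Result` instead of unwrapping is what makes this kind of parser safe to run on server-side input and straightforward to fuzz.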

This work is being done to allow for networked WARP information and, more generally, to make the WARP format usable in a wider set of scenarios. After this commit, any breaking changes to the format will be held off until 2.0, if that ever becomes a thing.
