
Best‐Practices: Serialization and Deserialization framework for Rust

Suryansh Gupta edited this page Feb 3, 2026 · 1 revision

Purpose

We want a stable and efficient format for structured data in our Rust code. The choice should balance speed, memory behavior, schema evolution, and a smooth experience for contributors.

Requirements

  1. Low allocation during reads where practical
  2. Predictable performance for both reads and writes
  3. Clear schema evolution story
  4. Simple build and code generation steps
  5. Good Rust support and long term stability

Options considered

JSON with serde

Pros

  • Very easy to use
  • Clean integration with normal Rust structs
  • No external tools needed
  • Great for readability and debugging

Cons

  • High allocation cost for large or frequent messages
  • Slower parsing compared with binary formats
  • Larger data size on the wire or disk
  • No enforced schema evolution rules

Allocation: High. In the usual pattern, decoding into Rust structs or into serde_json::Value produces owned data structures that allocate on the heap. This is simple and predictable, but not ideal for hot paths.
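One reason text decoding tends to allocate is that an escaped string cannot be used in place: the decoded bytes differ from the bytes in the input. The sketch below illustrates this with a hypothetical helper, decode_json_str, which is not part of serde or serde_json and handles only a few escapes. It borrows the input when no escape is present and allocates a new String otherwise.

```rust
use std::borrow::Cow;

/// Decode the body of a JSON string literal (the text between the quotes).
/// Illustrative only: handles the escapes \" \\ \n \t and rejects the rest.
fn decode_json_str<'a>(raw: &'a str) -> Result<Cow<'a, str>, String> {
    // Fast path: no escapes, so the input slice can be borrowed as is.
    if !raw.contains('\\') {
        return Ok(Cow::Borrowed(raw));
    }
    // Slow path: escapes change the bytes, so a new String is allocated.
    let mut out = String::with_capacity(raw.len());
    let mut chars = raw.chars();
    while let Some(c) = chars.next() {
        if c != '\\' {
            out.push(c);
            continue;
        }
        match chars.next() {
            Some('"') => out.push('"'),
            Some('\\') => out.push('\\'),
            Some('n') => out.push('\n'),
            Some('t') => out.push('\t'),
            other => return Err(format!("unsupported escape: {:?}", other)),
        }
    }
    Ok(Cow::Owned(out))
}
```

Calling decode_json_str("plain") returns a borrowed slice, while decode_json_str("a\\nb") has to build an owned String. Decoding into structs with String fields takes the owned route for every string.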


Protobuf with prost

Pros

  • Compact binary encoding
  • Mature and actively used Rust support
  • Clear and predictable schema evolution rules
  • Generates normal Rust types
  • Good community and tooling support

Cons

  • Requires a build script and generated code
  • Read path allocates to build Rust structs
  • Not a zero-copy format

Allocation: Similar in nature to JSON when decoding into owned Rust values. Protobuf decoding also constructs an owned object graph, so it allocates for strings, byte fields, and collections. The main benefits over JSON come from a smaller encoded size and typically faster parsing, not from avoiding allocations.
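To make the compact encoding and the evolution rules more concrete, here is a minimal std-only sketch of the Protobuf wire format: base-128 varints and field keys of the form (field_number << 3) | wire_type. It is not how prost is implemented, and the function names are hypothetical. Only wire types 0 (varint) and 2 (length-delimited) are handled. The reader skips fields it does not know, which is the mechanism that lets older readers accept messages from newer writers.

```rust
/// Append `value` as a base-128 varint: low 7 bits first, high bit means "more bytes follow".
fn put_varint(buf: &mut Vec<u8>, mut value: u64) {
    while value >= 0x80 {
        buf.push(((value as u8) & 0x7f) | 0x80);
        value >>= 7;
    }
    buf.push(value as u8);
}

/// Read a varint at `*pos` and advance `*pos`. Returns None on truncated or overlong input.
fn get_varint(buf: &[u8], pos: &mut usize) -> Option<u64> {
    let mut value = 0u64;
    for shift in (0..64u32).step_by(7) {
        let byte = *buf.get(*pos)?;
        *pos += 1;
        value |= u64::from(byte & 0x7f) << shift;
        if byte & 0x80 == 0 {
            return Some(value);
        }
    }
    None
}

/// Append a field key: (field_number << 3) | wire_type.
fn put_key(buf: &mut Vec<u8>, field: u32, wire_type: u8) {
    put_varint(buf, (u64::from(field) << 3) | u64::from(wire_type));
}

/// Return the last varint value stored under field number `wanted`,
/// skipping every field this reader does not care about.
fn find_varint_field(buf: &[u8], wanted: u32) -> Option<u64> {
    let mut pos = 0;
    let mut found = None;
    while pos < buf.len() {
        let key = get_varint(buf, &mut pos)?;
        let (field, wire_type) = ((key >> 3) as u32, (key & 7) as u8);
        match wire_type {
            0 => {
                let value = get_varint(buf, &mut pos)?;
                if field == wanted {
                    found = Some(value);
                }
            }
            2 => {
                // Length-delimited payload: read the length, then jump over the bytes.
                let len = get_varint(buf, &mut pos)? as usize;
                pos = pos.checked_add(len)?;
                if pos > buf.len() {
                    return None;
                }
            }
            _ => return None, // other wire types are out of scope for this sketch
        }
    }
    found
}

/// A message as a newer writer might produce it: a string in field 2, then 150 in field 1.
fn sample_message() -> Vec<u8> {
    let mut buf = Vec::new();
    put_key(&mut buf, 2, 2);
    put_varint(&mut buf, 2);
    buf.extend_from_slice(b"hi");
    put_key(&mut buf, 1, 0);
    put_varint(&mut buf, 150);
    buf
}
```

A reader that only knows field 1 can call find_varint_field(&sample_message(), 1) and still get 150, because the string in field 2 is skipped by length rather than interpreted.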


FlatBuffers

Pros

  • Can read data in place from a single buffer without extra copies in many situations
  • Very low allocation in read heavy lookup paths when using buffer views
  • Good fit for some performance sensitive read patterns
  • Compact binary representation

Cons

  • Builder pattern for writes is more verbose
  • Rust support is documented as experimental and APIs may change between minor versions
  • Accessor based API can feel mechanical to use
  • Schema evolution requires care and discipline

Allocation: Reads can work over the original buffer, so additional allocation on the read path can be very low when the API is used in that style. Writes use a builder that owns a growing Vec<u8>, which may reallocate as it grows unless enough capacity is reserved up front.
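The two behaviors described above can be sketched with plain std types. This is not the FlatBuffers wire format or its generated API: the layout (u32 id, u16 name length, name bytes, all little-endian) and the names RecordView and encode_record are assumptions made for illustration. The view borrows the buffer and never allocates, and the writer reserves space before appending.

```rust
/// Read-only view over one encoded record. It holds a borrowed slice only.
/// Hypothetical layout: u32 id (LE) | u16 name_len (LE) | name bytes (UTF-8).
struct RecordView<'a> {
    buf: &'a [u8],
}

impl<'a> RecordView<'a> {
    /// Validate bounds and UTF-8 once so the accessors stay cheap.
    fn new(buf: &'a [u8]) -> Option<Self> {
        let len_bytes = buf.get(4..6)?;
        let name_len = u16::from_le_bytes([len_bytes[0], len_bytes[1]]) as usize;
        let name = buf.get(6..6 + name_len)?;
        std::str::from_utf8(name).ok()?;
        Some(RecordView { buf: &buf[..6 + name_len] })
    }

    /// Decode the id straight from the buffer.
    fn id(&self) -> u32 {
        u32::from_le_bytes([self.buf[0], self.buf[1], self.buf[2], self.buf[3]])
    }

    /// Return the name as a slice of the original buffer, without copying.
    fn name(&self) -> &'a str {
        let buf: &'a [u8] = self.buf;
        // Checked in `new`, so this cannot fail.
        std::str::from_utf8(&buf[6..]).unwrap()
    }
}

/// Append one record to `out`, reserving the full size first so that a single
/// record triggers at most one reallocation.
fn encode_record(out: &mut Vec<u8>, id: u32, name: &str) {
    assert!(name.len() <= u16::MAX as usize, "name too long for u16 length");
    out.reserve(6 + name.len());
    out.extend_from_slice(&id.to_le_bytes());
    out.extend_from_slice(&(name.len() as u16).to_le_bytes());
    out.extend_from_slice(name.as_bytes());
}
```

If the caller creates the output with Vec::with_capacity(64) and writes a small record, the capacity does not change, so no reallocation happens on the write path. RecordView::new(&buf) then reads id and name directly from those bytes.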


Cap’n Proto

Pros

  • Designed for direct traversal of message buffers without a separate decode step
  • Good fit for memory mapped or shared buffers
  • Clear schema evolution model
  • Rust crate is stable and maintained
  • Supports zero copy style access in many scenarios

Cons

  • Learning curve around segments, arenas, and pointer layout
  • Smaller ecosystem compared with Protobuf
  • Requires schema based code generation and a build script
  • Write side uses arena style allocation, so memory use depends on message layout and configuration

Allocation: Cap’n Proto encodes messages in a layout that allows in-place traversal of the buffer, which can avoid extra allocations on the read path. On the write path, it relies on an arena of one or more segments that grow as needed, so allocation behavior depends on message size and growth pattern. We do not yet have internal measurements that compare write side behavior directly against FlatBuffers for our workloads.
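A toy model helps show why write-side allocation depends on sizing and growth. The sketch below is not the capnp crate's allocator: SegmentArena, its doubling growth policy, and its method names are all assumptions. It only mirrors the general idea of an arena made of segments of 64-bit words, where an object never spans two segments and a new segment is allocated when the current one is full.

```rust
/// One segment: a zeroed block of 64-bit words plus a bump pointer.
struct Segment {
    words: Vec<u64>,
    used: usize,
}

/// Toy arena of segments. The doubling policy is an assumption of this sketch.
struct SegmentArena {
    segments: Vec<Segment>,
    next_segment_words: usize,
}

impl SegmentArena {
    fn new(first_segment_words: usize) -> Self {
        SegmentArena {
            segments: Vec::new(),
            next_segment_words: first_segment_words.max(1),
        }
    }

    /// Reserve `words` words and return (segment index, word offset).
    fn allocate(&mut self, words: usize) -> (usize, usize) {
        // Try to bump-allocate inside the most recent segment.
        if let Some(index) = self.segments.len().checked_sub(1) {
            let seg = &mut self.segments[index];
            if seg.words.len() - seg.used >= words {
                let offset = seg.used;
                seg.used += words;
                return (index, offset);
            }
        }
        // Otherwise allocate a new segment large enough for the request.
        let size = words.max(self.next_segment_words);
        self.segments.push(Segment { words: vec![0u64; size], used: words });
        self.next_segment_words = size * 2;
        (self.segments.len() - 1, 0)
    }

    /// Total words reserved from the heap, used or not.
    fn capacity_words(&self) -> usize {
        self.segments.iter().map(|s| s.words.len()).sum()
    }

    /// Words actually handed out to callers.
    fn used_words(&self) -> usize {
        self.segments.iter().map(|s| s.used).sum()
    }
}
```

Starting with SegmentArena::new(4) and allocating 3, 1, 2, 6, and 100 words produces three segments, while a first segment sized for the whole message would need only one heap allocation. Comparing capacity_words() with used_words() shows how much reserved space a given growth pattern leaves idle.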


Comparison summary

| Feature | JSON (serde) | Protobuf (prost) | FlatBuffers | Cap’n Proto |
| --- | --- | --- | --- | --- |
| Encoding Format | Text-based JSON | Compact binary tag-length-value format | Binary format with vtables and offset-addressed tables | Binary pointer-based format with segments |
| Read Performance | Slowest for large or frequent messages | Usually faster than JSON, often good enough for many workloads | Very fast when reading directly from buffers | Very fast when traversing message buffers directly |
| Write Performance | Simple, moderate speed | Good, encoding cost usually lower than JSON | Good, but builder API can be verbose | Good, arena-based writes, cost depends on message layout |
| Allocation on Read | High, typical usage builds owned Rust values | High, also builds owned Rust values | Very low when using buffer views instead of materializing structs | Very low when traversing buffers directly, may increase if data is copied |
| Allocation on Write | Allocations in string builder or buffers | Allocations for message buffers and repeated fields | Single growing buffer that may reallocate if not sized up front | Arena or segment growth, allocations depend on segment sizing |
| Schema Evolution | By convention only, no enforced rules | Strong and well-documented evolution rules | Workable but requires discipline and careful schema changes | Strong evolution model with explicit rules |
| Rust Developer Experience | Easiest to adopt, integrates naturally with structs and enums | Straightforward once code generation is wired into the build | Less natural, accessor-style API and experimental status in Rust | Requires learning segments and builders, API is less familiar for the team |
| Tooling and Ecosystem | Very broad support, many helper crates | Strong ecosystem across languages and good Rust integration | Good general story, Rust support smaller and less mature | Solid core tools, smaller ecosystem and fewer examples than Protobuf |
| Inspection and Debuggability | Human readable, easy to inspect and log | Needs decoded view or specific tools to inspect binary messages | Requires schema-aware tools or helper code to inspect buffers | Requires schema-aware tools, binary layout less intuitive to read directly |
| Code Generation | None required for typical usage | Requires .proto schemas and a build script to generate Rust code | Requires schema files and generated code, plus builder or accessor helpers | Requires schema files, generated code, and build integration |
| Weak Spots | Slow and large for hot paths, no built-in schema guarantees | No zero-copy read, still allocates for full object graphs | Experimental Rust API, more complex schema and builder mental model | Less internal experience, arena behavior and layout need careful profiling |

Recommendation

For most use cases that need a good balance of performance, clarity, and long term stability, Protobuf remains the default choice. It provides predictable behavior, strong schema evolution rules, and a mature Rust toolchain, and we already understand how to integrate it into our builds.

JSON remains useful for configuration, small human edited files, and readable examples. It should not be used for performance sensitive data paths.

For workloads that require very low allocation or direct use of memory mapped buffers, Cap’n Proto and FlatBuffers are both worth exploring. At this point we do not have enough internal experience or measurements to clearly favor one over the other. Teams that hit such bottlenecks should run focused experiments with their real data and access patterns, then choose the format that best fits their constraints and build setup.

This guidance therefore proposes:

  • Use Protobuf as the primary format for structured data in this library
  • Continue to use JSON for configuration and debugging oriented structures
  • Treat Cap’n Proto and FlatBuffers as advanced options to evaluate when a specific workload clearly needs lower allocation or a more specialized memory layout
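For the focused experiments suggested above, counting heap allocations around a decode call is often a useful first measurement. The following is a minimal sketch using only std: it wraps the system allocator with a counter. The names CountingAlloc, count_allocs, and split_owned are hypothetical, and split_owned merely stands in for a real decoder that builds owned values.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Number of allocation requests seen so far, process-wide.
static ALLOC_CALLS: AtomicUsize = AtomicUsize::new(0);

/// Wraps the system allocator and counts every allocation request.
struct CountingAlloc;

unsafe impl GlobalAlloc for CountingAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        ALLOC_CALLS.fetch_add(1, Ordering::Relaxed);
        unsafe { System.alloc(layout) }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) }
    }
}

#[global_allocator]
static GLOBAL: CountingAlloc = CountingAlloc;

/// Run `f` and return how many allocation requests were made while it ran.
/// The counter is process-wide, so other threads add noise to the result.
fn count_allocs<F: FnOnce()>(f: F) -> usize {
    let before = ALLOC_CALLS.load(Ordering::Relaxed);
    f();
    ALLOC_CALLS.load(Ordering::Relaxed) - before
}

/// Stand-in for an owned decode: every field becomes its own String.
fn split_owned(line: &str) -> Vec<String> {
    line.split(',').map(str::to_owned).collect()
}
```

Wrapping a call such as split_owned("a,b,c") in count_allocs reports at least one allocation per field plus the Vec itself, while a borrowing equivalent such as line.split(',').count() should report none on a quiet thread. Replacing the closure body with a real decode of representative data gives a first comparison between formats. The measurement is best run single threaded.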