Define Data Migration Path from Hyperledger Fabric Snapshot to Fabric-X #21

@cendhu

Problem Statement & Motivation

To provide a viable upgrade path for existing Hyperledger Fabric users, we need to define a process for migrating ledger data from a Fabric network to Fabric-X.

The official peer ledger snapshot feature in Hyperledger Fabric (described in the Fabric documentation) seems like the best starting point, because it provides a verifiable copy of the world state at a specific block height.

However, a direct 1-to-1 import is not possible due to key architectural differences:

  1. No Channels: Fabric-X does not have the concept of channels, whereas Fabric scopes all ledger data by channel. We must define how data from one or more channels maps into the single state model of Fabric-X.

  2. No Private Data: Fabric-X does not have private data collections (PDCs). The Fabric snapshot contains hashes of private data, which are irrelevant to Fabric-X and must be excluded.

This issue is to design the export/import process and define the target data set for this migration.

Analysis of Fabric Snapshot Data

Based on the documentation, a Fabric snapshot directory contains several key data components:

  • Public State (World State): The complete key-value state from the state database (LevelDB or CouchDB). This is the primary data to migrate.

  • Private Data Hashes: Hashes of data stored in PDCs. These must be ignored.

  • Transaction IDs: A list of all processed transaction IDs. Fabric uses this list to reject duplicate transaction IDs, which prevents replay attacks. It should be considered for migration.

  • Collection Config History: The history of PDC configurations. This must be ignored.

  • Metadata Files (_snapshot_signable_metadata.json): Contains channel name, block height, and hashes of the snapshot files for verification.
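
The classification above, plus the hash check that the metadata file makes possible, can be sketched roughly as follows. This is only an illustration under stated assumptions: the data file names (public_state.data, txids.data, and so on) and the JSON field name snapshot_files_raw_hashes are taken from memory of the Fabric snapshot layout and must be checked against a real snapshot directory. The helper names classify and verify_file_hashes are hypothetical, and SHA-256 with hex encoding is assumed for the file hashes.

```python
import hashlib
import json
from pathlib import Path
from typing import List

# ASSUMPTION: these file names reflect the Fabric snapshot layout as recalled;
# confirm them against an actual snapshot before relying on this mapping.
MIGRATE = {"public_state.data", "public_state.metadata"}
OPTIONAL = {"txids.data", "txids.metadata"}
IGNORE = {
    "private_state_hashes.data", "private_state_hashes.metadata",
    "confighistory.data", "confighistory.metadata",
}
SIGNABLE_METADATA = "_snapshot_signable_metadata.json"


def classify(file_name: str) -> str:
    """Map a snapshot file name to its proposed migration treatment."""
    if file_name in MIGRATE:
        return "migrate"
    if file_name in OPTIONAL:
        return "optional"
    if file_name in IGNORE:
        return "ignore"
    return "unknown"


def verify_file_hashes(snapshot_dir) -> List[str]:
    """Return the names of files whose SHA-256 differs from the metadata.

    ASSUMPTION: the metadata JSON holds a 'snapshot_files_raw_hashes' object
    mapping file name -> hex-encoded SHA-256 digest.
    """
    snapshot_dir = Path(snapshot_dir)
    meta = json.loads((snapshot_dir / SIGNABLE_METADATA).read_text())
    mismatched = []
    for name, expected in meta["snapshot_files_raw_hashes"].items():
        actual = hashlib.sha256((snapshot_dir / name).read_bytes()).hexdigest()
        if actual != expected.lower():
            mismatched.append(name)
    return mismatched
```

An exporter would presumably refuse to continue if verify_file_hashes returns a non-empty list, since the snapshot could then no longer be treated as a verifiable copy.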

Proposed Tasks & Acceptance Criteria

This effort needs to be broken down into two main parts: (1) defining the data export and mapping, and (2) defining the Fabric-X import mechanism.

Part 1: Data Export & Mapping Definition

  • Define the Canonical Data Set: Specify exactly which data from the Fabric snapshot will be part of the migration.

  • Resolve the Channel-to-Global State Mapping: This is the most critical design decision. A strategy must be defined for handling data from multiple Fabric channels.

    • Option 1 (Prefixing): The migration tool prefixes all keys with their channel of origin (e.g., mychannel_namespace_key123). This preserves all data in a single Fabric-X instance but may require chaincode modification.

    • Option 2 (One-to-One): Each Fabric channel migrates to its own Fabric-X instance. This is simpler but less flexible for users with many channels.

Decision: A clear strategy must be chosen and documented.
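
One detail worth weighing for Option 1: a plain underscore separator, as in the mychannel_namespace_key123 example, can map two different (channel, namespace, key) triples to the same string whenever a namespace or key itself contains an underscore. The sketch below shows that collision and one possible alternative, a length-prefixed encoding. The function names and the encoding are hypothetical illustrations, not a proposed Fabric-X format.

```python
def prefix_key_naive(channel: str, namespace: str, key: str) -> str:
    """Underscore-joined key, following the example format in Option 1."""
    return f"{channel}_{namespace}_{key}"


def prefix_key_length_prefixed(channel: str, namespace: str, key: str) -> str:
    """Hypothetical alternative: prefix channel and namespace with their lengths.

    The lengths make the component boundaries unambiguous regardless of which
    characters appear in the namespace or key.
    """
    return f"{len(channel)}:{channel}{len(namespace)}:{namespace}{key}"


def split_key(mapped: str):
    """Invert prefix_key_length_prefixed back into (channel, namespace, key)."""
    length, rest = mapped.split(":", 1)
    channel, rest = rest[:int(length)], rest[int(length):]
    length, rest = rest.split(":", 1)
    namespace, key = rest[:int(length)], rest[int(length):]
    return channel, namespace, key


# Two distinct source keys collide under the naive scheme...
assert prefix_key_naive("ch", "a_b", "k") == prefix_key_naive("ch", "a", "b_k")
# ...but stay distinct, and recoverable, with length prefixes.
assert prefix_key_length_prefixed("ch", "a_b", "k") != prefix_key_length_prefixed("ch", "a", "b_k")
```

Whatever encoding is chosen, it probably needs to be reversible so that a post-migration check can trace each Fabric-X key back to its channel of origin.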

  • Design the "Exporter" Tool:
    • The tool must be able to parse a standard Fabric snapshot directory.
    • It must apply the channel mapping logic (e.g., prefixing keys).
    • It must filter out and ignore all data related to PDCs.
    • It must output a single, verifiable "genesis data file" or "snapshot artifact" that Fabric-X can understand.
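
As a rough sketch of the exporter requirements above, the function below takes records that have already been decoded from a snapshot, drops everything that is not public state, applies the underscore prefix from the Option 1 example, and writes a deterministic artifact with a SHA-256 digest. Everything here is an assumption made for illustration: the record shape (kind, namespace, key, value), the JSON Lines artifact format, and the name export_artifact. Decoding the real snapshot files is deliberately left out.

```python
import base64
import hashlib
import json


def export_artifact(records, channel: str, out_path: str):
    """Write a hypothetical JSON Lines artifact and return (sha256_hex, count).

    ASSUMPTION: each record is a dict with 'kind', 'namespace', 'key' and a
    bytes 'value', produced by some snapshot decoder not shown here.
    """
    lines = []
    for rec in records:
        # Filter out anything related to PDCs (hashes, collection configs).
        if rec["kind"] != "public":
            continue
        mapped_key = f"{channel}_{rec['namespace']}_{rec['key']}"
        value_b64 = base64.b64encode(rec["value"]).decode("ascii")
        lines.append(json.dumps({"key": mapped_key, "value": value_b64}, sort_keys=True))

    # Sort so the artifact bytes, and therefore the digest, do not depend on
    # the order in which records were read.
    lines.sort()
    payload = ("\n".join(lines) + "\n").encode("utf-8")
    with open(out_path, "wb") as f:
        f.write(payload)
    return hashlib.sha256(payload).hexdigest(), len(lines)
```

Returning the digest and record count alongside the file gives the import side something to check against, which is one way to make the artifact "verifiable" in the sense used above.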

Part 2: Fabric-X Ingestion/Bootstrap

  • Define the "Bootstrap from Snapshot" Feature in Fabric-X:

    • Fabric-X needs a new startup procedure (e.g., committer start --from-snapshot ).
    • This procedure must be a one-time operation for a new network instance.
  • Implement the Ingestion Logic:

    • On boot, Fabric-X must parse the "genesis data file" from the exporter.
    • It must populate its state database with all the K-V pairs.
    • It must (if included) populate its transaction ID history.
  • Define Verification Process: A method must be created to verify that the state in Fabric-X matches the state from the Fabric snapshot (e.g., a post-migration checksum tool or K-V-pair count).
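
The one-time ingestion and the checksum-style verification could look something like the sketch below. It is a minimal illustration, assuming a hypothetical artifact in JSON Lines form (one object per line with a "key" string and a base64 "value") and an in-memory dict standing in for the Fabric-X state database. The names bootstrap_from_artifact, state_digest, and AlreadyBootstrappedError are invented for this example.

```python
import base64
import hashlib
import json


class AlreadyBootstrappedError(RuntimeError):
    """Raised when a bootstrap is attempted on a non-empty state."""


def bootstrap_from_artifact(state: dict, artifact_lines) -> int:
    """Load K-V pairs into an empty state; return the number of keys loaded."""
    if state:
        # Enforce the one-time rule: only a brand-new instance may bootstrap.
        raise AlreadyBootstrappedError("state database is not empty")
    for line in artifact_lines:
        line = line.strip()
        if not line:
            continue
        rec = json.loads(line)
        state[rec["key"]] = base64.b64decode(rec["value"])
    return len(state)


def state_digest(state: dict) -> str:
    """SHA-256 over all pairs in sorted key order, independent of insert order."""
    h = hashlib.sha256()
    for key in sorted(state):
        for part in (key.encode("utf-8"), state[key]):
            # Length-prefix each field so adjacent fields cannot blur together.
            h.update(len(part).to_bytes(8, "big"))
            h.update(part)
    return h.hexdigest()
```

A verification tool could then compare both the pair count and state_digest of the migrated state against the same values computed from the exported data, and report a failed migration if either differs.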
