diff --git a/CIPs/cip-145.md b/CIPs/cip-145.md
index 964a19e..1024d2b 100644
--- a/CIPs/cip-145.md
+++ b/CIPs/cip-145.md
@@ -31,43 +31,90 @@ older Data Events in the stream. This type of fork in a stream is easily possibl
 protocol level. Prior to this CIP, Time Events and Data Events were not considered by the branch pruning rules described
 below, and would result in pruning of Data Events coinciding with Time Events.
 
+This specification includes precedence rules for determining the default tip of a diverged stream. It is simpler for
+applications to be provided a default state instead of each application having to define its own rules for determining
+the tip. Applications can choose to engage with the complexity of conflict resolution but don't have to. Using the
+rules from this spec, there is _some_ eventually consistent state of a stream. This holds true even if the stream is in
+a diverged state and the controller never rectifies this.
+
+A key advantage of this specification is that all Data Events for a stream will be discoverable from the stream tip, even
+if they are not part of the prev-chain from the tip to the Init Event. This is different from the previous protocol
+implementation, where nodes could recover from conflicts but had no way to account for it in the stream.
+
 ### Definitions
 * A Ceramic `synchronization layer` synchronizes all the events for a Ceramic stream.
 * A Ceramic `aggregation layer` converts all the events in a Ceramic stream to its corresponding tip state.
-* A `fork point` is the last common ancestor of two events `A` and `B`.
-* A `merge point` is the first common descendant of two events `A` and `B`.
+* A `branch` for an event is the set of all events transitively covered by that event.
+* A `pruned branch` is any branch that is not covered by the tip.
+* A `fork point` for a branch is the earliest event on that branch that is not on another branch.
+* A `merge point` for two events is an event that transitively covers both events.
 * `A` is a `covered event` if another event `B` has `A`'s CID in its `prev` field.
+  * Said differently, if event `B` has event `A`'s CID in its `prev` field, then event `B` "covers" event `A`. By
+    transitivity, event `B` also covers every event covered by event `A`.
+  * The `time` of a Data Event is the timestamp of the earliest Time Event that covers the Data Event.
 * `A` is an `uncovered event` if there is no event with `A`'s CID in its `prev` field.
-* A `diverged stream` has more than one uncovered event.
-* A `converged stream` has a single uncovered event.
+* A `dominant Data Event` is one that is not covered by another Data Event.
+* A `non-dominant Data Event` is one that is covered by another Data Event. This applies transitively through Time
+  Events.
+* A `diverged stream` has more than one dominant Data Event.
+* A `converged stream` has a single dominant Data Event.
 * An `invalid Data Event` is one that has either an invalid signature or an expired CACAO.
 * A `valid Data Event` is one that has a valid signature and a valid CACAO (if applicable).
+* A `Merge Event` is a Data Event that has multiple previous CIDs.
+* A `prev-chain` is the list of events from the tip to the Init Event, following the first prev CID.
+* A `prev-DAG` is the DAG of events from the tip to the Init Event, following all prev CIDs.
 
 ## Specification
 
 Any stream where there are Data Events without a common descendant is in a diverged state. When a stream diverges,
 Ceramic uses the rules below to determine which of the uncovered events will serve as the tip of the stream and which
 branches are pruned. Since pruning branches that contain Data Events would result in the pruning of valid data, we allow
-a Data Event to contain multiple ancestor events in its `prev` field so that a new merge point Data Event can cover
-multiple events.
-
-1. If a stream is in a diverged state (see events `A` and `B` in fig. 5), we only consider branches that contain valid,
-   uncovered Data Events.
-2. For branches that contain valid, uncovered Data Events, we only consider the first Data Event on a branch after the
-   fork.
-3. The Data Event that is covered by the earliest Time Event wins (see event `A` in fig. 5).
-4. If two uncovered Data Events are covered by Time Events at the same block height, the Data Event with the lower CID
-   wins.
-
-Using multi-prev Data Events allows us to reduce the number of uncovered events and converge the stream so that there is
-only a single uncovered event, without any data abandoned on pruned branches. The stream's converged/diverged state can
-be determined by looking at the `prev` fields of all the Data Events for that stream.
+a Data Event to contain multiple ancestor events in its `prev` field so that a new Merge Event can cover multiple events.
+
+1. If a stream is in a diverged state, each uncovered event is a candidate tip.
+2. Branches that do not contain dominant Data Events cannot be the tip.
+3. For branches that contain dominant Data Events, consider the earliest Data Event on each branch after a fork point.
+4. The branch with the earliest Data Event becomes the tip. An anchored Data Event is considered earlier than an
+   unanchored Data Event.
+5. If multiple earliest Data Events have the same time, then the Data Event with the lowest CID becomes the tip.
+
+A side effect of these rules is that even a stream that is not in a diverged state can include Time Events that are not
+covered by the tip. If a node becomes aware of such uncovered Time Events, it may include them in a multi-prev so that
+they become discoverable from the tip.
+
+These rules will determine the default tip of the stream, indicated as such to an application. By default, this tip
+will be the first CID in the `prev` list of the Merge Event but the application may choose to select a different
+candidate tip as the source for its Merge Event.
+
+An application can follow the tip back to the Init Event by selecting the first `prev` CID of each Data Event.
+
+For an example of these rules in action, see the figure below:
+![Alt text](../assets/cip-145/rules1.png)
+
+1. Based on rule (1), the stream state has 4 candidate tips, `Time 5`, `Time 6`, `Time 4`, and `Data F`. One of these
+   candidate tips will become the tip of the stream.
+   ![Alt text](../assets/cip-145/rules2.png)
+2. The stream forks between the branches for `Time 1` and `Data A`. Based on rules (3) and (4), the branch for `Data A`
+   is the only branch considered for tip selection.
+   ![Alt text](../assets/cip-145/rules3.png)
+3. This branch later forks into additional branches for `Data E`, `Time 4`, and `Data F`. Based on rules (3) and (4),
+   the branches for `Data E` and `Data F` are the only branches considered for tip selection.
+   ![Alt text](../assets/cip-145/rules4.png)
+4. Based on rule (4), since there is a Time Event for `Data E` but not for `Data F`, only the branch for `Data E` is
+   considered for tip selection.
+   ![Alt text](../assets/cip-145/rules5.png)
+5. `Time 6` is the tip of the surviving branch, and therefore becomes the tip of the stream.
+   ![Alt text](../assets/cip-145/rules6.png)
+
+Using multi-prev Data Events allows us to reduce the number of dominant Data Events and converge the stream so that
+there is only a single dominant Data Event, without any data abandoned on pruned branches. The stream's
+converged/diverged state can be determined by looking at the `prev` fields of all the Data Events for that stream.
 
 Events that have invalid signatures cannot be tips, even if uncovered. This has important implications for Data Events
-with expired CACAOs. In figures 5 and 6, if we assume event `A` has an expired CACAO, the aggregation layer can choose
+with expired CACAOs. In figures 5 and 6, if we assume event `A` has an expired CACAO, the application layer can choose
 to repair the stream by creating a new Data Event `C` with `{ "prev": [CID_Data_B, CID_Time_2] }`. This effectively
 prunes the branch with Time Event `2` and merges Data Event `B` into the new event `C`. How this merge is implemented is
-at the discretion of the aggregation layer.
+at the discretion of the application.
 
 ### Example
 > ![lattice](../assets/cip-145/lattice.png)
diff --git a/assets/cip-145/rules1.png b/assets/cip-145/rules1.png
new file mode 100644
index 0000000..793fcfd
Binary files /dev/null and b/assets/cip-145/rules1.png differ
diff --git a/assets/cip-145/rules2.png b/assets/cip-145/rules2.png
new file mode 100644
index 0000000..d02603c
Binary files /dev/null and b/assets/cip-145/rules2.png differ
diff --git a/assets/cip-145/rules3.png b/assets/cip-145/rules3.png
new file mode 100644
index 0000000..8843946
Binary files /dev/null and b/assets/cip-145/rules3.png differ
diff --git a/assets/cip-145/rules4.png b/assets/cip-145/rules4.png
new file mode 100644
index 0000000..35877fb
Binary files /dev/null and b/assets/cip-145/rules4.png differ
diff --git a/assets/cip-145/rules5.png b/assets/cip-145/rules5.png
new file mode 100644
index 0000000..8fbd3c1
Binary files /dev/null and b/assets/cip-145/rules5.png differ
diff --git a/assets/cip-145/rules6.png b/assets/cip-145/rules6.png
new file mode 100644
index 0000000..82e8a72
Binary files /dev/null and b/assets/cip-145/rules6.png differ
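
Reviewer sketch (not part of the patch): the definitions this change adds (`covered event`, `uncovered event`, `dominant Data Event`, the `time` of a Data Event) and the ordering in rules (4) and (5) can be illustrated with a small in-memory model. Everything below is an assumption made for illustration: the `Event` class, the string CIDs, the numeric timestamps, and the helper names are hypothetical and do not come from any Ceramic implementation. The sketch also does not implement the fork-point walk in rule (3); it only shows how candidate tips and dominant Data Events fall out of the `prev` fields, and how two Data Events would be ordered.

```python
from dataclasses import dataclass
from math import inf


@dataclass(frozen=True)
class Event:
    cid: str                # hypothetical stand-in for a real CID
    kind: str               # "init", "data", or "time"
    prev: tuple = ()        # CIDs listed in the event's `prev` field
    timestamp: float = inf  # only meaningful for Time Events


def index(events):
    """Map CID -> Event for a list of events."""
    return {e.cid: e for e in events}


def covered_by(stream, cid):
    """CIDs transitively covered by the event `cid` (its branch)."""
    seen, stack = set(), list(stream[cid].prev)
    while stack:
        c = stack.pop()
        if c not in seen:
            seen.add(c)
            stack.extend(stream[c].prev)
    return seen


def uncovered(stream):
    """Candidate tips (rule 1): events whose CID is in no `prev` field."""
    referenced = {p for e in stream.values() for p in e.prev}
    return {c for c in stream if c not in referenced}


def dominant_data_events(stream):
    """Data Events not covered, directly or through Time Events, by another Data Event."""
    data = {c for c, e in stream.items() if e.kind == "data"}
    covered = set()
    for c in data:
        covered |= covered_by(stream, c) & data
    return data - covered


def data_event_time(stream, cid):
    """Timestamp of the earliest Time Event covering `cid`; inf if unanchored."""
    times = [e.timestamp for c, e in stream.items()
             if e.kind == "time" and cid in covered_by(stream, c)]
    return min(times, default=inf)


def precedence_key(stream, cid):
    """Rules (4) and (5): anchored before unanchored, earlier first, then lowest CID.
    Comparing CIDs as strings is an assumption of this sketch."""
    return (data_event_time(stream, cid), cid)


# A diverged stream: `data-a` is anchored by `time-1`, `data-b` is unanchored.
stream = index([
    Event("init", "init"),
    Event("data-a", "data", ("init",)),
    Event("data-b", "data", ("init",)),
    Event("time-1", "time", ("data-a",), timestamp=100),
])
winner = min(dominant_data_events(stream), key=lambda c: precedence_key(stream, c))

# A Merge Event covering both candidate tips converges the stream.
merged = index([*stream.values(), Event("merge-c", "data", ("time-1", "data-b"))])
```

In this toy stream the candidate tips are `time-1` and `data-b`, both Data Events are dominant, and `data-a` sorts first because it is anchored. After `merge-c` is added, it is the only dominant Data Event and the only uncovered event, which matches the definition of a converged stream.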