@erights erights commented Sep 16, 2025

Closes: #XXXX
Refs: #2673, #2675

Description

For data that would round trip through JSON(*), I think people have not appreciated how close smallcaps is to JSON. For such data, if the encoding to JSON has no special strings, i.e., strings that begin with a character between ! and _, then smallcaps and JSON are essentially equivalent (modulo required hardening).

This PR only revises the smallcaps cheatsheet to make this clearer.

(*) without procedural interventions such as toJSON, replacers, or resolvers.
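As a rough illustration of the "no special strings" condition, the sketch below walks JSON-compatible data and reports whether any string or property name starts with a character in a given range. The helpers `startsInRange` and `hasSpecialString` are hypothetical and are not part of the marshal package. The default bounds follow the `!`..`_` range stated above and should be checked against the cheatsheet before relying on them. Applying the same check to property names is also an assumption.

```javascript
// Hypothetical helper: does `str` begin with a character between `lo` and `hi`?
const startsInRange = (str, lo, hi) => {
  if (str.length === 0) return false;
  const c = str.charAt(0);
  return lo <= c && c <= hi;
};

// Hypothetical helper: does any string or property name in JSON-compatible
// data start with a character in the (assumed) special range?
const hasSpecialString = (value, lo = '!', hi = '_') => {
  if (typeof value === 'string') return startsInRange(value, lo, hi);
  if (Array.isArray(value)) {
    return value.some(v => hasSpecialString(v, lo, hi));
  }
  if (value !== null && typeof value === 'object') {
    return Object.entries(value).some(
      ([k, v]) => startsInRange(k, lo, hi) || hasSpecialString(v, lo, hi),
    );
  }
  return false; // numbers, booleans, and null carry no strings
};

const plain = JSON.parse('{"name":"alice","tags":["a","b"],"n":3}');
const special = JSON.parse('{"name":"#alice"}');
console.log(hasSpecialString(plain)); // false
console.log(hasSpecialString(special)); // true
```

Data for which this returns `false` is the case where the description above expects smallcaps and JSON to agree.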

Security Considerations

To the extent we can start using smallcaps where we currently use JSON, we set ourselves up for better security. Once we start using the non-trapping integrity trait shim (#2673 and #2675), knowing that a value is passable will tell us that it can be marshaled without reentrancy vulnerabilities.

Scaling Considerations

smallcaps encoding builds on JSON encoding, and so will always be somewhat slower than the JSON it builds on. We won't know how much, or how to speed it up, until we measure.

Documentation Considerations

The point. This PR only expands the cheatsheet. We should have a more complete explanation with all the caveats somewhere else. Where?

Testing Considerations

This PR makes expanded claims about invariants. At some point, we should have fast-check tests for all these invariants.

Compatibility Considerations

This PR itself is only a doc change, so in that sense, none.

If we start switching some JSON uses to smallcaps, non-passable values may cause throws rather than silent misbehavior.
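For a concrete picture of "silent misbehavior", the sketch below shows how plain `JSON.stringify` drops or rewrites a function value without complaint. The `assertEncodable` and `strictStringify` helpers are hypothetical stand-ins for a fail-fast encoder. They are not the passability check that marshal performs, only an illustration of throwing instead of silently changing the data.

```javascript
// JSON.stringify silently drops or rewrites values it cannot represent.
const lossyRecord = JSON.stringify({ callback: () => {}, d: 1 });
const lossyArray = JSON.stringify([() => {}, 1]);
console.log(lossyRecord); // {"d":1}
console.log(lossyArray); // [null,1]

// Hypothetical stand-in for a fail-fast check: reject functions and symbols
// anywhere in the data rather than letting them vanish.
const assertEncodable = (value, path = '$') => {
  const t = typeof value;
  if (t === 'function' || t === 'symbol') {
    throw new TypeError(`${path}: cannot encode a ${t}`);
  }
  if (Array.isArray(value)) {
    value.forEach((v, i) => assertEncodable(v, `${path}[${i}]`));
  } else if (t === 'object' && value !== null) {
    for (const [k, v] of Object.entries(value)) {
      assertEncodable(v, `${path}.${k}`);
    }
  }
};

// Hypothetical strict encoder: check first, then encode.
const strictStringify = value => {
  assertEncodable(value);
  return JSON.stringify(value);
};
```

With this sketch, `strictStringify({ d: 1 })` encodes normally, while `strictStringify({ callback: () => {} })` throws a `TypeError` naming the offending path.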

Upgrade Considerations

This PR itself is only a doc change, so in that sense, none.

If we start switching some durable storage from JSON to smallcaps, we'll have an upgrade nightmare. We should not change durably stored JSON encodings to smallcaps without great care. Not recommended.

@erights erights requested a review from Copilot September 16, 2025 07:57

@Copilot Copilot AI left a comment

Pull Request Overview

This PR expands the documentation for Smallcaps encoding readability invariants in the marshal package by clarifying when smallcaps and JSON encodings behave identically.

  • Reformats existing documentation with better emphasis on special strings
  • Adds a new "Readability Invariants" section explaining when smallcaps and JSON are equivalent
  • Provides clearer guidance for users working with simple values

@erights erights requested a review from kriskowal September 16, 2025 07:58

erights commented Sep 16, 2025

@kriskowal, I only changed one *.md file and all my CI tests are red. And again on retry. Help!

erights commented Sep 16, 2025

> @kriskowal, I only changed one *.md file and all my CI tests are red. And again on retry. Help!

Seems healed. Never mind.

@erights erights marked this pull request as ready for review September 16, 2025 18:33

@Copilot Copilot AI left a comment

Pull Request Overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated 2 comments.

| selector | Selector | `makeSelector('foo')` | `"%foo"` // converting from symbol |
| passable symbols | Symbol | `passableSymbolForName('foo')` | `"%foo"` // in transition |
| copyArray | List | `[a,b]` | `[<a>,<b>]` |
| copyRecord | Struct | `{x:a,y:b}` | `{<x>:<a>,<y>:<b>}` |
Contributor

What pass-styles are allowed for x and y? Is it only string, so that <x> and <y> are simply the encoded strings?

Contributor Author

Yes. Resolving.

Contributor

Not in scope for this PR, but it would be nice to clarify that in this doc. I ended up finding the information in the ocapn spec, but besides that, most of the doc is pretty standalone.

Contributor Author

Unresolving this one because it should be simple enough to do. I'll consider it as a drive-by.

@erights erights requested a review from mhofman September 16, 2025 22:32

@mhofman mhofman left a comment

An actual review of the invariants as requested

Every JSON encoding with no special strings anywhere decodes to itself.
## Readability Invariants

For every JSON encoding with no special strings, the JSON and smallcaps decodings are the same.
Contributor

I think Copilot had a point here. What does it mean for a "JSON encoding" to have a special string? "Special string" belongs to the realm of the value to be encoded (or that was decoded).

Contributor Author

In any JSON encoding, there is a concrete lexical element for "string" that is always quoted. https://www.json.org/json-en.html


For every JSON encoding with no special strings, the JSON and smallcaps decodings are the same.

For every value with no special strings that round trips through JSON, the JSON and smallcaps encodings are the same.
Contributor

I thought we established that the ocapn specification does not define a canonical encoding. Isn't the encoding allowed to be different yet equivalent, so that a smallcaps implementation is not required to follow the exact semantics of JSON encoding (e.g., in struct field order)? Furthermore, there are many ways to encode strings, and the indentation settings of JSON.stringify would similarly produce different encodings.
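To illustrate "different yet equivalent" in plain JSON terms, the sketch below builds four distinct JSON texts (differing in field order, indentation, and string escaping) that all decode to the same data. It says nothing about what a smallcaps implementation actually emits. The `sameData` helper is a hypothetical, order-insensitive comparison written only for this example.

```javascript
const a = JSON.stringify({ x: 1, y: 'A' }); // {"x":1,"y":"A"}
const b = JSON.stringify({ y: 'A', x: 1 }); // different field order
const c = JSON.stringify({ x: 1, y: 'A' }, null, 2); // indented
const d = '{"x":1,"y":"\\u0041"}'; // "A" written as a \u escape

// Hypothetical helper: structural equality that ignores property order.
const sameData = (p, q) => {
  if (p === q) return true;
  if (typeof p !== 'object' || typeof q !== 'object') return false;
  if (p === null || q === null) return false;
  if (Array.isArray(p) !== Array.isArray(q)) return false;
  const pk = Object.keys(p);
  const qk = Object.keys(q);
  return (
    pk.length === qk.length &&
    pk.every(
      k =>
        Object.prototype.hasOwnProperty.call(q, k) && sameData(p[k], q[k]),
    )
  );
};

const texts = [a, b, c, d];
const distinctTexts = new Set(texts).size;
const allEquivalent = texts.every(t => sameData(JSON.parse(t), JSON.parse(a)));
console.log(distinctTexts); // 4
console.log(allEquivalent); // true
```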

Contributor Author

I don't understand. I'm not referring to any concrete encoding mandated by ocapn. Indeed, ocapn itself is likely to mandate something completely different. I'm only speaking here about smallcaps vs. JSON, and about the smallcaps concrete encoding of the ocapn abstract syntax/data model.

@erights erights force-pushed the markm-another-smallcaps-invariant branch 3 times, most recently from f3fe9d6 to 42854fa Compare September 18, 2025 00:37
@erights erights force-pushed the markm-another-smallcaps-invariant branch from 42854fa to eb014a9 Compare September 19, 2025 00:43