2025 Cross-Origin Storage (COS)
Created: 2025-06-03
Last updated: 2025-06-03
Presenters:
Present (optional):
- You
- You
- And you
- …
Issue: https://github.com/Igalia/webengineshackfest/issues/68
Slides: Cross-Origin Storage (COS)
Problem: several web apps might want to use the same model, so each of them ends up downloading the same large files (on the order of GBs, sometimes several GBs).
This has several downsides (wasted bandwidth, limited free storage on some devices, etc.).
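As a rough illustration of the direction being discussed (a sketch only, not the finalized API), a page would request large files by content hash rather than by URL, so that bytes already stored by any origin can be reused after a user permission prompt. The names `navigator.crossOriginStorage` and `requestFileHandles` and the hash-descriptor shape below are assumptions based on the explainer and may differ from the actual proposal:

```ts
// Hypothetical sketch only: `crossOriginStorage`, `requestFileHandles`, and the
// hash-descriptor shape are assumed names, not a shipped or finalized API.
const files = [
  // Applications typically need more than one file (hence an array), e.g.
  // model weights split into hundreds of chunks.
  { algorithm: 'SHA-256', value: '7a3f…' }, // placeholder digest
  { algorithm: 'SHA-256', value: '9c0d…' }, // placeholder digest
];

// The user agent would prompt the user and, if the hashes are already stored
// (possibly put there by a different origin), return handles without a new download.
const handles = await (navigator as any).crossOriginStorage.requestFileHandles(files);
```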
Two observations about resources:
- Often hosted on several CDNs, but under different URLs, so using hashes might be a better idea than URLs
- (missed the second one: was it about having fallbacks for the same resource?)
There are already some expressions of support (https://github.com/explainers-by-googlers/cross-origin-storage/issues?q=is%3Aissue%20state%3Aopen%20label%3A%22expression%20of%20support%22):
- Transformers.js
- WebLLM
- Why not just use HTTP headers instead of a JS API? Something similar to CORS, which would disable cache partitioning when allowed.
  - The idea would be to use hashes rather than URLs.
  - Aren't ETag headers usable as hashes?
  - How would this work with URLs?
  - One option: the user agent issues a HEAD request for the URL, gets back the ETag response header, compares that hash with its cache, and, if the user allows it, serves the file from the cache instead of downloading it (see the sketch after this list).
  - A custom scheme (cos://<hash>)?
- Permissions: Don't reveal that the user has rejected the request.
  - A: User agents could lie; this requires deeper thinking.
- Why an array?
  - Applications typically need more than one file, e.g. WebLLM, where the weights are split into chunks (several hundred, depending on the model).
- Can file names be mapped to the SHA-256 hashes, e.g. for a cache contents UI?
  - Difficult for i18n.
- [Anne] Previous attempts failed due to fingerprinting concerns etc.
- [Tom] Different perspective now with huge AI models
- [Chris] Not having a cross-origin caching mechanism will effectively prevent broader usage of WebLLM etc., because users' disks will fill up quickly
- [Anne] Model versions will also change, so you may have to cache multiple versions of the model → isn't there kingmaking in the hash?
- [Shane] Global approval of certain data files?
- [Shane] Locale data loading use case
  - … language packs for digitally disadvantaged languages
- Why would an allowlist be OK? What is the reasoning behind that? Wouldn't each country want models from their own companies/universities/etc. to be on the allowlist (at least when it comes to speech recognition)?
- [Chris] Wouldn’t need the allowlist personally
- Rely on file sizes? Minimum size?
- [Tom] It’s easy to inflate files to make them bigger
- The Cross-Origin Storage API seems to be an augmentation of the Cache API
- [Chris] It’s based on HTTP requests, so we’re back to the monarch making
- [Shane] For the allowlist of non-prompt resources, you could start with an empty list and let the user decide whether there should be prompting in the future when the same resource is requested by a different origin.
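Below is a minimal sketch of the HEAD-request idea from the HTTP-headers discussion above. It assumes the server's ETag is, or encodes, a content hash that can be compared with a locally known digest, which was precisely the open question raised there:

```ts
// Sketch of the "HEAD request + ETag comparison" idea. Assumes the server's
// ETag encodes a content hash, which is not guaranteed in general.
async function matchesKnownHash(url: string, knownHash: string): Promise<boolean> {
  const response = await fetch(url, { method: 'HEAD' }); // headers only, no body
  const etag = response.headers.get('ETag');
  if (!etag) return false;
  // Strip an optional weak-validator prefix (W/) and the surrounding quotes.
  const normalized = etag.replace(/^(W\/)?"|"$/g, '');
  return normalized === knownHash; // only then reuse an already stored copy
}
```

If the hashes match and the user allows it, the user agent could serve the file from a shared store instead of downloading it again; if ETags cannot be trusted as hashes, the comparison could only happen after a full download, which would help with storage but not with the download itself.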