Skip to content

Conversation

@theduke
Copy link
Contributor

@theduke theduke commented Apr 23, 2025

This commit contains two notable changes:

  • The VirtualFile trait is extended with a new as_owned_buffer() method.
    This allows cheap copying of file data if the file is already fully in
    memory, and enables optimisations for code that is aware of this
    functionality by allowing zero-copy cloning.
    This is especially useful for data that is already mmaped.

    The new method has an empty default impl, but is implemented for
    appropriate files (ReadOnlyFile, webc volume files).

  • Modifies the Runtime::load_module method to take a new
    HashedModuleData struct

    This has multiple benefits:

    • HashedModuleData is safe by construction, thanks to private fields,
      so the hash it contains will always be valid for the module data.
      This removes a source of bugs due to passing a wrong hash for module
      data to module loaders etc.
    • HashedModuleData can be cheaply creating from a
      BinaryPackageCommand, preventing redundant cloning of data
    • The load_module method now receives owned data that is cheaply
      clonable.
      Previously it received a slice, which introduces the neccessity of
      cloning the data if it needs to be passed on to a different thread.

In combination this PR can significantly reduce memory usage for
high-concurrency scenarios.

Closes #5527

@theduke theduke requested a review from Arshia001 April 23, 2025 17:13
@promptless
Copy link

promptless bot commented Apr 23, 2025

✅ No documentation updates required.

@theduke theduke force-pushed the optimise-wasm-compilation branch 3 times, most recently from 7831689 to 901db2b Compare July 2, 2025 08:52
@theduke theduke requested a review from Arshia001 July 2, 2025 08:52
This commit contains two notable changes:

* The VirtualFile trait is extended with a new as_owned_buffer() method.
  This allows cheap copying of file data if the file is already fully in
  memory, and enables optimisations for code that is aware of this
  functionality by allowing zero-copy cloning.
  This is especially useful for data that is already mmaped.

  The new method has an empty default impl, but is implemented for
  appropriate files (ReadOnlyFile, webc volume files).

* Modifies the Runtime::load_module method to take a new
  `HashedModuleData` struct

  This has multiple benefits:
  - HashedModuleData is safe by construction, thanks to private fields,
    so the hash it contains will always be valid for the module data.
    This removes a source of bugs due to passing a wrong hash for module
    data to module loaders etc.
  - HashedModuleData can be cheaply creating from a
    BinaryPackageCommand, preventing redundant cloning of data
  - The load_module method now receives owned data that is cheaply
    clonable.
    Previously it received a slice, which introduces the neccessity of
    cloning the data if it needs to be passed on to a different thread.

In combination this PR can significantly reduce memory usage for
high-concurrency scenarios.
Copy link
Member

@syrusakbary syrusakbary left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PR looks good, but we need to do the small change to preserve API compatibility as discussed in Slack

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR improves module loading performance and API correctness by introducing a HashedModuleData struct and extending the VirtualFile trait. The changes focus on reducing memory allocations and eliminating the possibility of hash/data mismatches in the module loading pipeline.

  • Introduces HashedModuleData struct that safely pairs WebAssembly module data with its hash
  • Adds as_owned_buffer() method to VirtualFile trait for zero-copy optimizations
  • Deprecates old load_module methods in favor of new load_hashed_module methods

Reviewed Changes

Copilot reviewed 12 out of 12 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
lib/wasix/src/runtime/module_cache/hashed_module.rs New module defining HashedModuleData struct with safe hash/data pairing
lib/wasix/src/runtime/mod.rs Updates Runtime trait with new hashed module loading methods and deprecates old ones
lib/wasix/src/state/linker.rs Updates module loading to use HashedModuleData and as_owned_buffer() optimization
lib/wasix/src/bin_factory/mod.rs Refactors executable loading to use OwnedBuffer and new hashed module APIs
lib/wasix/src/bin_factory/exec.rs Updates function signatures to accept HashedModuleData instead of byte slices
lib/wasix/src/os/command/builtins/cmd_wasmer.rs Updates command execution to use new HashedModuleData API
lib/virtual-fs/src/lib.rs Extends VirtualFile trait with as_owned_buffer() method
lib/virtual-fs/src/mem_fs/file.rs Implements as_owned_buffer() for memory filesystem files
lib/virtual-fs/src/webc_volume_fs.rs Implements as_owned_buffer() for webc volume files
lib/package/src/utils.rs Adds is_container() utility function for container detection
examples/wasi.rs Updates example to use new HashedModuleData API

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

Keep the old load_module methods as deprecated, but switch to new
load_module_hashed.
@theduke theduke force-pushed the optimise-wasm-compilation branch from bc7305f to 00543bd Compare September 23, 2025 10:07
@theduke theduke force-pushed the optimise-wasm-compilation branch from 9aa1b1f to 0d93bc4 Compare September 23, 2025 11:33
@theduke theduke merged commit b60af6b into main Sep 23, 2025
92 checks passed
@theduke theduke deleted the optimise-wasm-compilation branch September 23, 2025 20:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

WASIX: Improve module loading API correctness and performance

3 participants