Skip to content

Conversation

shoaibkamil
Copy link
Contributor

This PR makes it possible to specify two environment variables HL_METAL_COMPILER and HL_METAL_LINKER to make it possible for Halide to compile Metal shaders into embedded .metallibs (Apple's libraries for Metal code). For example:

$ HL_METAL_COMPILER="xcrun -sdk macosx metal" HL_METAL_LINKER="xcrun -sdk macosx metallib" ./correctness_hello_gpu

This works for AOT code as well.

Looking for feedback on where to document this, and any other concerns folks may have. Feedback on how to test on the buildbots is also very welcome.

@alexreinking
Copy link
Member

Is there any reason we wouldn't want to make this the default behavior?

@alexreinking
Copy link
Member

Regarding documentation, doc/Metal.md would be a natural place to put this.

@abadams
Copy link
Member

abadams commented Sep 17, 2025

I think it can't be the default because it rules out cross compilation.

@alexreinking
Copy link
Member

The other question I have is whether using this option raises any sort of compatibility concerns such that it should be a target option?

@shoaibkamil
Copy link
Contributor Author

Is there any reason we wouldn't want to make this the default behavior?

I wanted to make this strictly opt-in, given it doesn't work with cross-compilation, and not invisible/automatic because users who want compiled shaders likely want to know if compilation failed for whatever reason.

The other question I have is whether using this option raises any sort of compatibility concerns such that it should be a target option?

I haven't been able to think of a scenario where this might happen. Apple's Metal IR/metallib seems to be intermediate formats compatible with any Apple-supported GPU. Shaders in text from can target both iOS and macOS, and it's unclear whether a single metallib will work on both (since the SDK is specified when calling the Metal compiler), but this isn't a real concern given we already generate CPU-side code that isn't portable across the two platforms.

@alexreinking
Copy link
Member

The hannk failures are fixed in #8815 (not yet merged)

@zvookin
Copy link
Member

zvookin commented Sep 23, 2025

For security reasons, I'm a little uneasy doing environment variable lookups with no way to disable them at all. E.g. if someone is using Halide as JIT in a process, an attacker could arrange for the env vars to be set and thus hijack compilation. (It also might be difficult to set them in an application when one does want to use them, but I'm less concerned about that because at least in that a case the developer is aware they exist.)

A potential solution is to call setenv explicitly in application code before invoking any Halide compilations (that could use Metal, etc.). This presumes setenv is a thing everywhere and that we want to standardize mutating the environment as part of our API. If we go down that path, this should be carefully documented as the approach we expect people to take. E.g. adding a page on all of Halide's environment variable dependencies. (Or if one already exists, adding these to that page.) An API call to clear all Halide environment variables might also be useful. (It basically says "Hey, I don't want any non-default values creeping into Halide compilation from the environment.")

@alexreinking
Copy link
Member

alexreinking commented Sep 23, 2025

Adding to @zvookin's point... The values of the environment variables in this PR each reach a call to system, which invokes the shell. If there's no reason to trust their values, this is a pretty serious issue.

We overuse environment variables in this codebase, anyway. For example, I only recently discovered we can pass arbitrary extra flags to LLVM's options parser via HL_LLVM_ARGS.

Is there anything more limited we can do here?

For instance, it might be acceptable to sniff the target to determine the correct SDK to call for metal / metallib. It's reasonable to ask users to put xcrun on the PATH. Then use execvp to call xcrun. The only need for an environment variable would be to gate the feature behind a boolean flag.

@alexreinking
Copy link
Member

E.g. adding a page on all of Halide's environment variable dependencies.

I've raised this issue before and I was told that we don't want to document them because doing so would enshrine them as something other than hacks.

I'm a little uneasy doing environment variable lookups with no way to disable them at all. [...] we want to standardize mutating the environment as part of our API. If we go down that path, this should be carefully documented as the approach we expect people to take.

We inconsistently use getenv vs our get_env_variable helper. We could start by standardizing on get_env_variable and then:

  1. Replacing all the strings with an enum of environment variables we expose
  2. Document each of these with Doxygen doc strings
  3. Add an environment variable that acts as a floodgate. Something like HL_NO_ENV. When enabled, no other environment data is read.
  4. Add C++ APIs for individually setting each option without needing to go through setenv.

I'm sure @zvookin could refine this sketch into something better.

@abadams
Copy link
Member

abadams commented Sep 23, 2025

To move this specific case forwards without shaving a herd of yaks, maybe they could be globals in libHalide with a setter. If someone really wants them driven by env vars they can write:

set_metal_linker(get_env_variable("HL_METAL_LINKER"));

in their JIT compiler or AOT generator.

globals have downsides, but they're no worse than env vars. It's hard to imagine a context where you want to use a different metal compiler/linker per compilation within the same process.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants