Commit 9884afb

Merge branch 'main' into no_default_as_host_vec
2 parents 5f72c5d + 3d80022 commit 9884afb

145 files changed: +7284 -6218 lines changed


.github/workflows/rust.yml

Lines changed: 2 additions & 4 deletions
@@ -7,8 +7,6 @@ on:
   push:
     paths-ignore:
       - '**.md'
-    branches:
-      - master

 env:
   RUST_LOG: info
@@ -33,7 +31,7 @@ jobs:
         uses: actions/checkout@v2

       - name: Install CUDA
-        uses: Jimver/cuda-toolkit@v0.2.4
+        uses: Jimver/cuda-toolkit@v0.2.21
         id: cuda-toolkit
         with:
           cuda: '11.2.2'
@@ -74,4 +72,4 @@ jobs:
       - name: Check documentation
         env:
           RUSTDOCFLAGS: -Dwarnings
-        run: cargo doc --workspace --all-features --document-private-items --no-deps --exclude "optix" --exclude "path_tracer" --exclude "denoiser" --exclude "add" --exclude "ex*"
+        run: cargo doc --workspace --all-features --document-private-items --no-deps --exclude "optix" --exclude "path_tracer" --exclude "denoiser" --exclude "add" --exclude "ex*"

Cargo.toml

Lines changed: 1 addition & 0 deletions
@@ -1,4 +1,5 @@
 [workspace]
+resolver = "2"
 members = [
     "crates/*",
     "crates/optix/examples/ex*",

README.md

Lines changed: 26 additions & 9 deletions
@@ -7,15 +7,22 @@
 </p>

 <h3>
-<a href="guide">Guide</a>
+<a href="https://rust-gpu.github.io/Rust-CUDA/guide/index.html">Guide</a>
 <span> | </span>
 <a href="guide/src/guide/getting_started.md">Getting Started</a>
 <span> | </span>
 <a href="guide/src/features.md">Features</a>
 </h3>
-<strong>⚠️ The project is still in early development, expect bugs, safety issues, and things that don't work ⚠️</strong>
+<strong>⚠️ The project is still in early development, expect bugs, safety issues, and things that don't work ⚠️</strong>
 </div>

+<br/>
+
+> [!IMPORTANT]
+> This project is no longer dormant and is [being
+> rebooted](https://rust-gpu.github.io/blog/2025/01/27/rust-cuda-reboot).
+> Please contribute!
+
 ## Goal

 The Rust CUDA Project is a project aimed at making Rust a tier-1 language for extremely fast GPU computing
@@ -28,7 +35,7 @@ Historically, general purpose high performance GPU computing has been done using
 provides a way to use Fortran/C/C++ code for GPU computing in tandem with CPU code with a single source. It also provides
 many libraries, tools, forums, and documentation to supplement the single-source CPU/GPU code.

-CUDA is exclusively an NVIDIA-only toolkit. Many tools have been proposed for cross-platform GPU computing such as
+CUDA is exclusively an NVIDIA-only toolkit. Many tools have been proposed for cross-platform GPU computing such as
 OpenCL, Vulkan Computing, and HIP. However, CUDA remains the most used toolkit for such tasks by far. This is why it is
 imperative to make Rust a viable option for use with the CUDA toolkit.

@@ -38,12 +45,12 @@ in recent years it has been shown time and time again that a specialized solutio
 of projects such as rust-gpu (for Rust -> SPIR-V).

 Our hope is that with this project we can push the Rust GPU computing industry forward and make Rust an excellent language
-for such tasks. Rust offers plenty of benefits such as `__restrict__` performance benefits for every kernel, An excellent module/crate system,
+for such tasks. Rust offers plenty of benefits such as `__restrict__` performance benefits for every kernel, An excellent module/crate system,
 delimiting of unsafe areas of CPU/GPU code with `unsafe`, high level wrappers to low level CUDA libraries, etc.

 ## Structure

-The scope of the Rust CUDA Project is quite broad, it spans the entirety of the CUDA ecosystem, with libraries and tools to make it
+The scope of the Rust CUDA Project is quite broad, it spans the entirety of the CUDA ecosystem, with libraries and tools to make it
 usable using Rust. Therefore, the project contains many crates for all corners of the CUDA ecosystem.

 The current line-up of libraries is the following:
@@ -52,7 +59,7 @@ The current line-up of libraries is the following:
   - Generates highly optimized PTX code which can be loaded by the CUDA Driver API to execute on the GPU.
   - For the near future it will be CUDA-only, but it may be used to target amdgpu in the future.
 - `cuda_std` for GPU-side functions and utilities, such as thread index queries, memory allocation, warp intrinsics, etc.
-  - *Not* a low level library, provides many utility functions to make it easier to write cleaner and more reliable GPU kernels.
+  - _Not_ a low level library, provides many utility functions to make it easier to write cleaner and more reliable GPU kernels.
   - Closely tied to `rustc_codegen_nvvm` which exposes GPU features through it internally.
 - [`cudnn`](https://github.com/Rust-GPU/Rust-CUDA/tree/master/crates/cudnn) for a collection of GPU-accelerated primitives for deep neural networks.
 - `cust` for CPU-side CUDA features such as launching GPU kernels, GPU memory allocation, device queries, etc.
@@ -67,12 +74,23 @@ In addition to many "glue" crates for things such as high level wrappers for cer
 ## Related Projects

 Other projects related to using Rust on the GPU:
+
 - 2016: [glassful](https://github.com/kmcallister/glassful) Subset of Rust that compiles to GLSL.
 - 2017: [inspirv-rust](https://github.com/msiglreith/inspirv-rust) Experimental Rust MIR -> SPIR-V Compiler.
 - 2018: [nvptx](https://github.com/japaric-archived/nvptx) Rust to PTX compiler using the `nvptx` target for rustc (using the LLVM PTX backend).
-- 2020: [accel](https://github.com/termoshtt/accel) Higher level library that relied on the same mechanism that `nvptx` does.
+- 2020: [accel](https://github.com/termoshtt/accel) Higher-level library that relied on the same mechanism that `nvptx` does.
 - 2020: [rlsl](https://github.com/MaikKlein/rlsl) Experimental Rust -> SPIR-V compiler (predecessor to rust-gpu)
-- 2020: [rust-gpu](https://github.com/EmbarkStudios/rust-gpu) Rustc codegen backend to compile Rust to SPIR-V for use in shaders, similar mechanism as our project.
+- 2020: [rust-gpu](https://github.com/Rust-GPU/rust-gpu) `rustc` compiler backend to compile Rust to SPIR-V for use in shaders, similar mechanism as our project.
+
+## Usage
+```bash
+## setup your environment like:
+### export OPTIX_ROOT=/opt/NVIDIA-OptiX-SDK-9.0.0-linux64-x86_64
+### export OPTIX_ROOT_DIR=/opt/NVIDIA-OptiX-SDK-9.0.0-linux64-x86_64
+
+## build proj
+cargo build
+```

 ## License

@@ -86,4 +104,3 @@ at your discretion.
 ### Contribution

 Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.
-

crates/blastoff/Cargo.toml

Lines changed: 3 additions & 3 deletions
@@ -6,11 +6,11 @@ authors = ["Riccardo D'Ambrosio <[email protected]>"]
 repository = "https://github.com/Rust-GPU/Rust-CUDA"

 [dependencies]
-bitflags = "1.3.2"
+bitflags = "2.8"
 cublas_sys = { version = "0.1", path = "../cublas_sys" }
 cust = { version = "0.3", path = "../cust", features = ["impl_num_complex"] }
-num-complex = "0.4.0"
-half = { version = "1.8.0", optional = true }
+num-complex = "0.4.6"
+half = { version = "2.4.1", optional = true }

 [package.metadata.docs.rs]
 rustdoc-args = ["--html-in-header", "katex-header.html", "--cfg", "docsrs"]

crates/blastoff/src/context.rs

Lines changed: 11 additions & 4 deletions
@@ -140,7 +140,13 @@ impl CublasContext {
     ) -> Result<T> {
         unsafe {
             // cudaStream_t is the same as CUstream
-            sys::v2::cublasSetStream_v2(self.raw, mem::transmute(stream.as_inner())).to_result()?;
+            sys::v2::cublasSetStream_v2(
+                self.raw,
+                mem::transmute::<*mut cust::sys::CUstream_st, *mut cublas_sys::v2::CUstream_st>(
+                    stream.as_inner(),
+                ),
+            )
+            .to_result()?;
             let res = func(self)?;
             // reset the stream back to NULL just in case someone calls with_stream, then drops the stream, and tries to
             // execute a raw sys function with the context's handle.
@@ -227,10 +233,11 @@ impl CublasContext {
     /// ```
     pub fn set_math_mode(&self, math_mode: MathMode) -> Result<()> {
         unsafe {
-            Ok(
-                sys::v2::cublasSetMathMode(self.raw, mem::transmute(math_mode.bits()))
-                    .to_result()?,
+            Ok(sys::v2::cublasSetMathMode(
+                self.raw,
+                mem::transmute::<u32, cublas_sys::v2::cublasMath_t>(math_mode.bits()),
             )
+            .to_result()?)
         }
     }
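
Both `mem::transmute` calls in this file now spell out their source and target types instead of relying on inference. A minimal sketch of that pattern, assuming two opaque FFI types that describe the same underlying object; `StreamA`, `StreamB`, and `reinterpret_stream` are hypothetical stand-ins, not names from `cust` or `cublas_sys`:

```rust
use std::mem;

// Hypothetical opaque types standing in for two separately generated
// bindings of the same C struct (assumption, not the real crate types).
#[repr(C)]
pub struct StreamA {
    _private: [u8; 0],
}

#[repr(C)]
pub struct StreamB {
    _private: [u8; 0],
}

/// Reinterprets a pointer to `StreamA` as a pointer to `StreamB`.
pub fn reinterpret_stream(ptr: *mut StreamA) -> *mut StreamB {
    // SAFETY: both sides are thin raw pointers of identical size, and the
    // pointer is not dereferenced here. Naming both types in the turbofish
    // means a later change to either binding fails to compile instead of
    // silently transmuting to something else.
    unsafe { mem::transmute::<*mut StreamA, *mut StreamB>(ptr) }
}
```

For plain pointer-to-pointer conversions like this one, `ptr.cast::<StreamB>()` gives the same result without `transmute`.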

crates/blastoff/src/level1.rs

Lines changed: 2 additions & 2 deletions
@@ -24,8 +24,8 @@ fn check_stride<T: BlasDatatype>(x: &impl GpuBuffer<T>, n: usize, stride: Option
     );
 }

-/// Scalar and Vector-based operations such as `min`, `max`, `axpy`, `copy`, `dot`, `nrm2`, `rot`, `rotg`, `rotm`, `rotmg`, `scal`, and `swap`.
-
+/// Scalar and Vector-based operations such as `min`, `max`, `axpy`, `copy`, `dot`,
+/// `nrm2`, `rot`, `rotg`, `rotm`, `rotmg`, `scal`, and `swap`.
 impl CublasContext {
     /// Same as [`CublasContext::amin`] but with an explicit stride.
     ///

crates/blastoff/src/lib.rs

Lines changed: 2 additions & 7 deletions
@@ -93,22 +93,17 @@ pub(crate) mod private {

 /// An optional operation to apply to a matrix before a matrix operation. This includes
 /// no operation, transpose, or conjugate transpose.
-#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
+#[derive(Debug, Default, Clone, Copy, PartialEq, Eq, Hash)]
 pub enum MatrixOp {
     /// No operation, leave the matrix as is. This is the default.
+    #[default]
     None,
     /// Transpose the matrix in place.
     Transpose,
     /// Conjugate transpose the matrix in place.
     ConjugateTranspose,
 }

-impl Default for MatrixOp {
-    fn default() -> Self {
-        MatrixOp::None
-    }
-}
-
 impl MatrixOp {
     /// Returns the corresponding `cublasOperation_t` for this operation.
     pub fn to_raw(self) -> sys::v2::cublasOperation_t {
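
The hunk above swaps a hand-written `impl Default` for the derive plus a `#[default]` variant attribute. A standalone sketch of the resulting enum, reduced to what is needed to show that the derived default is still `MatrixOp::None` (the `to_raw` method is left out, and a toolchain with `#[default]` support, Rust 1.62 or later, is assumed):

```rust
/// Reduced copy of the enum from the diff, without the cuBLAS conversion.
#[derive(Debug, Default, Clone, Copy, PartialEq, Eq, Hash)]
pub enum MatrixOp {
    /// No operation; marked as the value returned by `MatrixOp::default()`.
    #[default]
    None,
    /// Transpose the matrix.
    Transpose,
    /// Conjugate transpose the matrix.
    ConjugateTranspose,
}

/// Returns the operation to apply, falling back to the derived default
/// when the caller does not specify one.
pub fn op_or_default(op: Option<MatrixOp>) -> MatrixOp {
    op.unwrap_or_default()
}
```

Behaviour is unchanged from the removed manual impl; the attribute only moves the choice of default next to the variant it names.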

crates/cuda_builder/Cargo.toml

Lines changed: 2 additions & 2 deletions
@@ -11,6 +11,6 @@ readme = "../../README.md"
 [dependencies]
 rustc_codegen_nvvm = { version = "0.3", path = "../rustc_codegen_nvvm" }
 nvvm = { path = "../nvvm", version = "0.1" }
-serde = { version = "1.0.130", features = ["derive"] }
-serde_json = "1.0.68"
+serde = { version = "1.0.217", features = ["derive"] }
+serde_json = "1.0.138"
 find_cuda_helper = { version = "0.2", path = "../find_cuda_helper" }

crates/cuda_builder/src/lib.rs

Lines changed: 12 additions & 10 deletions
@@ -246,13 +246,13 @@ impl CudaBuilder {

     /// Emit LLVM IR, the exact same as rustc's `--emit=llvm-ir`.
     pub fn emit_llvm_ir(mut self, emit_llvm_ir: bool) -> Self {
-        self.emit = emit_llvm_ir.then(|| EmitOption::LlvmIr);
+        self.emit = emit_llvm_ir.then_some(EmitOption::LlvmIr);
         self
     }

     /// Emit LLVM Bitcode, the exact same as rustc's `--emit=llvm-bc`.
     pub fn emit_llvm_bitcode(mut self, emit_llvm_bitcode: bool) -> Self {
-        self.emit = emit_llvm_bitcode.then(|| EmitOption::Bitcode);
+        self.emit = emit_llvm_bitcode.then_some(EmitOption::Bitcode);
         self
     }

@@ -376,10 +376,13 @@ fn invoke_rustc(builder: &CudaBuilder) -> Result<PathBuf, CudaBuilderError> {

     let new_path = get_new_path_var();

-    let mut rustflags = vec![format!(
-        "-Zcodegen-backend={}",
-        rustc_codegen_nvvm.display(),
-    )];
+    let mut rustflags = vec![
+        format!("-Zcodegen-backend={}", rustc_codegen_nvvm.display()),
+        "-Zcrate-attr=feature(register_tool)".into(),
+        "-Zcrate-attr=register_tool(nvvm_internal)".into(),
+        "-Zcrate-attr=no_std".into(),
+        "-Zsaturating_float_casts=false".into(),
+    ];

     if let Some(emit) = &builder.emit {
         let string = match emit {
@@ -432,13 +435,12 @@ fn invoke_rustc(builder: &CudaBuilder) -> Result<PathBuf, CudaBuilderError> {
     }

     let mut cargo = Command::new("cargo");
-    cargo.args(&[
+    cargo.args([
         "build",
         "--lib",
         "--message-format=json-render-diagnostics",
         "-Zbuild-std=core,alloc",
-        "--target",
-        "nvptx64-nvidia-cuda",
+        "--target=nvptx64-nvidia-cuda",
     ]);

     cargo.args(&builder.build_args);
@@ -523,7 +525,7 @@ fn get_last_artifact(out: &str) -> Option<PathBuf> {
             }
         })
         .filter(|line| line.reason == "compiler-artifact")
-        .last()
+        .next_back()
         .expect("Did not find output file in rustc output");

     let mut filenames = last
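
Two of the changes in this file are small idiom swaps: `bool::then(|| value)` becomes `bool::then_some(value)`, and `.last()` on a filtered iterator becomes `.next_back()`. A rough sketch of both, using a cut-down `EmitOption` and two hypothetical helpers (`emit_for`, `last_artifact`) that are not part of `cuda_builder`:

```rust
/// Cut-down copy of the enum referenced in the diff.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum EmitOption {
    LlvmIr,
    Bitcode,
}

/// `then_some` builds `Some(value)` when the flag is true and `None`
/// otherwise. The argument is evaluated eagerly, which costs nothing for
/// a plain enum variant, so the closure form is unnecessary.
pub fn emit_for(flag: bool) -> Option<EmitOption> {
    flag.then_some(EmitOption::LlvmIr)
}

/// Hypothetical helper: returns the last line mentioning
/// "compiler-artifact". On a double-ended iterator, `next_back` searches
/// from the end, while `last` would walk every element from the front.
pub fn last_artifact<'a>(lines: &[&'a str]) -> Option<&'a str> {
    lines
        .iter()
        .copied()
        .filter(|line| line.contains("compiler-artifact"))
        .next_back()
}
```

Both replacements keep the original results; they differ only in when the value is constructed and in which direction the iterator is consumed.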

crates/cuda_std/Cargo.toml

Lines changed: 4 additions & 4 deletions
@@ -8,8 +8,8 @@ repository = "https://github.com/Rust-GPU/Rust-CUDA"
 readme = "../../README.md"

 [dependencies]
-vek = { version = "0.15.1", default-features = false, features = ["libm"] }
+vek = { version = "0.17.1", default-features = false, features = ["libm"] }
 cuda_std_macros = { version = "0.2", path = "../cuda_std_macros" }
-half = "1.7.1"
-bitflags = "1.3.2"
-paste = "1.0.5"
+half = "2.4.1"
+bitflags = "2.8"
+paste = "1.0.15"
