Commit 81fc555

Merge branch 'master' into libSparse
2 parents 85de8e3 + 4075446 commit 81fc555

8 files changed, +134 -93 lines changed

.github/dependabot.yml

Lines changed: 1 addition & 1 deletion
@@ -4,4 +4,4 @@ updates:
   - package-ecosystem: "github-actions"
     directory: "/" # Location of package manifests
     schedule:
-      interval: "weekly"
+      interval: "monthly"

.github/workflows/CI.yml

Lines changed: 10 additions & 27 deletions
@@ -14,48 +14,31 @@ jobs:
       matrix:
         version:
           - '1'
+          - 'lts'
         os:
           - ubuntu-latest
           - windows-latest
           - macOS-13 # intel
-          - macOS-14 # arm
         arch:
           - x64
-          - aarch64
-        exclude:
-          - os: ubuntu-latest
+        include:
+          - os: macOS-latest
             arch: aarch64
-          - os: windows-latest
+            version: 'lts'
+          - os: macOS-latest
             arch: aarch64
-          - os: macOS-13
-            arch: x86
-          - os: macOS-13
-            arch: aarch64
-          - os: macOS-14
-            arch: x86
-          - os: macOS-14
-            arch: x64
+            version: '1'
     steps:
-      - uses: actions/checkout@v4
+      - uses: actions/checkout@v5
       - uses: julia-actions/setup-julia@v2
         with:
           version: ${{ matrix.version }}
           arch: ${{ matrix.arch }}
-      - uses: actions/cache@v4
-        env:
-          cache-name: cache-artifacts
-        with:
-          path: ~/.julia/artifacts
-          key: ${{ runner.os }}-test-${{ env.cache-name }}-${{ hashFiles('**/Project.toml') }}
-          restore-keys: |
-            ${{ runner.os }}-test-${{ env.cache-name }}-
-            ${{ runner.os }}-test-
-            ${{ runner.os }}-
+      - uses: julia-actions/cache@v2
       - uses: julia-actions/julia-buildpkg@v1
       - uses: julia-actions/julia-runtest@v1
       - uses: julia-actions/julia-processcoverage@v1
-      - uses: codecov/codecov-action@v4
+      - uses: codecov/codecov-action@v5
         with:
-          file: lcov.info
+          files: lcov.info
           token: ${{ secrets.CODECOV_TOKEN }}
-

.github/workflows/TagBot.yml

Lines changed: 1 addition & 16 deletions
@@ -6,20 +6,7 @@ on:
   workflow_dispatch:
     inputs:
       lookback:
-        default: 3
-permissions:
-  actions: read
-  checks: read
-  contents: write
-  deployments: read
-  issues: read
-  discussions: read
-  packages: read
-  pages: read
-  pull-requests: read
-  repository-projects: read
-  security-events: read
-  statuses: read
+        default: "3"
 jobs:
   TagBot:
     if: github.event_name == 'workflow_dispatch' || github.actor == 'JuliaTagBot'
@@ -28,6 +15,4 @@ jobs:
       - uses: JuliaRegistries/TagBot@v1
         with:
           token: ${{ secrets.GITHUB_TOKEN }}
-          # Edit the following line to reflect the actual name of the GitHub Secret containing your private key
           ssh: ${{ secrets.DOCUMENTER_KEY }}
-          # ssh: ${{ secrets.NAME_OF_MY_SSH_PRIVATE_KEY_SECRET }}

Project.toml

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 name = "AppleAccelerate"
 uuid = "13e28ba4-7ad8-5781-acae-3021b1ed3924"
-version = "0.4.1"
+version = "0.4.4"
 
 [deps]
 Libdl = "8f399da3-3557-5675-b5ff-fb832c97cbdb"

README.md

Lines changed: 27 additions & 3 deletions
@@ -4,7 +4,7 @@ This provides a Julia interface to some of the
 [macOS Accelerate framework](https://developer.apple.com/documentation/accelerate). At
 the moment, this package provides:
 1. Access to Accelerate BLAS and LAPACK using the [libblastrampoline](https://github.com/JuliaLinearAlgebra/libblastrampoline) framework,
-2. An interface to the [array-oriented functions](https://developer.apple.com/library/mac/documentation/Performance/Conceptual/vecLib/index.html#//apple_ref/doc/uid/TP30000414-357225),
+2. An interface to the [array-oriented functions](https://developer.apple.com/documentation/accelerate/veclib),
    which provide a vectorised form for many common mathematical functions
 
 The performance is significantly better than using standard libm functions in some cases, though there does appear to be some reduced accuracy.
@@ -36,9 +36,33 @@ Some additional functions that are also available:
 * `fdiv(x,y)`: divide (`x ./ y` in Base)
 * `sincos(x)`: returns `(sin(x), cos(x))`
 
-## Setting the number of threads
+## BLAS Multi-threading
+
+Accelerate BLAS is multi-threaded by default. Starting with macOS 15 (Sequoia), `get_num_threads()` and `set_num_threads()` are available. The Accelerate API only allows the user to choose single-threaded operation or multi-threaded operation. Thus, `set_num_threads(1)` will give single-threaded operation, and any number greater than 1 will give multi-threaded operation. `get_num_threads()` will return `1` for single-threaded operation and `Sys.CPU_THREADS` for multi-threaded operation. The following example is on Apple M2 Max, where `Sys.CPU_THREADS` is `8`.
+
+```julia
+julia> using AppleAccelerate
+
+julia> AppleAccelerate.get_num_threads() # Default is multi-threaded. Return value is `Sys.CPU_THREADS`
+8
+
+julia> AppleAccelerate.set_num_threads(1) # Set single-threaded operation
+
+julia> AppleAccelerate.get_num_threads()
+1
+
+julia> AppleAccelerate.set_num_threads(4) # Set multi-threaded operation, with input value > 1.
+
+julia> AppleAccelerate.get_num_threads() # Return value is `Sys.CPU_THREADS` for multi-threaded operation
+8
+```
+
+On older versions of macOS (< 15), the number of threads can be changed with the `VECLIB_MAXIMUM_THREADS` environment variable before starting Julia:
+
+```sh
+VECLIB_MAXIMUM_THREADS=1 julia
+```
 
-Accelerate is multithreaded by default. You can set an upper limit on the number of threads through the `VECLIB_MAXIMUM_THREADS` environment variable; for example, for single-threaded execution, start Julia as `VECLIB_MAXIMUM_THREADS=1 julia`. Accelerate does not support the `BLAS.set_num_threads(nthreads)` and `BLAS.get_num_threads()` API used by other BLAS backends (`set_num_threads` is a no-op and `get_num_threads` returns a hardcoded default).
 
 ## Example
 
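
The two-mode behaviour described in the README change above can be modelled in plain Julia. This is an illustrative sketch only, not the package's implementation: `ThreadingMode`, `requested_mode`, and `reported_threads` are hypothetical names, rejecting counts below 1 with an `ArgumentError` is an assumption of the sketch, and the machine's thread count is passed in explicitly so nothing here calls into Accelerate.

```julia
# Hypothetical model of the two-mode threading semantics described in the README.
# None of these names exist in AppleAccelerate.jl, and no Accelerate call is made.
@enum ThreadingMode::Cuint MULTI_THREADED SINGLE_THREADED

# Map a requested thread count onto one of the two available modes.
# Rejecting n < 1 is an assumption made for this sketch.
function requested_mode(n::Integer)
    n >= 1 || throw(ArgumentError("thread count must be at least 1, got $n"))
    return n == 1 ? SINGLE_THREADED : MULTI_THREADED
end

# Thread count reported for a mode, given the machine's thread count.
reported_threads(mode::ThreadingMode, ncpu::Integer = Sys.CPU_THREADS) =
    mode == SINGLE_THREADED ? 1 : Int(ncpu)

# Walk through the README scenario on an assumed 8-thread machine.
for n in (1, 4)
    mode = requested_mode(n)
    println("requested ", n, " -> ", mode, ", reports ", reported_threads(mode, 8))
end
```

The point of the model is that any request greater than 1 collapses to the same multi-threaded mode, so the reported count reflects the machine rather than the requested number.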

src/AppleAccelerate.jl

Lines changed: 69 additions & 23 deletions
@@ -1,11 +1,14 @@
 module AppleAccelerate
-
 using LinearAlgebra, Libdl
-#using LAPACK_jll, LAPACK32_jll # Needed if use_external_lapack == true
 
-# For now, only use BLAS from Accelerate (that is to say, vecLib)
-global const libacc = "/System/Library/Frameworks/Accelerate.framework/Accelerate"
-global const libacc_info_plist = "/System/Library/Frameworks/Accelerate.framework/Versions/Current/Resources/Info.plist"
+const libacc = "/System/Library/Frameworks/Accelerate.framework/Accelerate"
+const libacc_info_plist = "/System/Library/Frameworks/Accelerate.framework/Versions/Current/Resources/Info.plist"
+
+# VecLib Threading API: /Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/System/Library/Frameworks/vecLib.framework/Headers/thread_api.h
+@enum Threading::Cuint begin
+    BLAS_THREADING_MULTI_THREADED
+    BLAS_THREADING_SINGLE_THREADED
+end
 
 function forward_accelerate(interface::Symbol;
                             new_lapack::Bool = interface == :ilp64,
@@ -27,12 +30,16 @@ function forward_accelerate(interface::Symbol;
 end
 
 """
-    load_accelerate(;clear = true, verbose = false)
+    load_accelerate(; clear = false, verbose = false, load_ilp64 = true)
 
-Load Accelerate, replacing the current LBT forwarding tables if `clear` is `true`.
-Attempts to load the ILP64 symbols if `load_ilp64` is `true`, and errors out if unable.
+Load Accelerate, replacing the current LBT forwarding tables if `clear` is `true`. `clear`
+is `false` by default to allow for OpenBLAS to act as a fallback for operations missing
+from Accelerate, such as `gemmt`. Attempts to load the ILP64 symbols if `load_ilp64` is
+`true`, and errors out if unable.
 """
-function load_accelerate(;clear::Bool = true, verbose::Bool = false, load_ilp64::Bool = true, use_external_lapack::Bool = true)
+function load_accelerate(; clear::Bool = false,
+                         verbose::Bool = false,
+                         load_ilp64::Bool = true)
     # Silently exit on non-Accelerate-capable platforms
     @static if !Sys.isapple()
         return
@@ -44,25 +51,17 @@ function load_accelerate(;clear::Bool = true, verbose::Bool = false, load_ilp64:
 
     # Check to see if we can load ILP64 symbols
     if load_ilp64 && dlsym_e(libacc_hdl, "dgemm\$NEWLAPACK\$ILP64") == C_NULL
-        error("Unable to load ILP64 interface from '$(libacc)'; You are running macOS version $(get_macos_version()), you need v13.4+")
+        @error "Unable to load ILP64 interface from '$(libacc)'; You are running macOS version $(get_macos_version()), you need v13.4+"
     end
 
     # First, load :lp64 symbols, optionally clearing the current LBT forwarding tables
     forward_accelerate(:lp64; new_lapack=true, clear, verbose)
     if load_ilp64
         forward_accelerate(:ilp64; new_lapack=true, verbose)
     end
-
-    # Next, load an external LAPACK, if requested
-    if use_external_lapack
-        if load_ilp64
-            BLAS.lbt_forward(LAPACK_jll.liblapack_path; suffix_hint="64_", verbose)
-        end
-        BLAS.lbt_forward(LAPACK32_jll.liblapack32_path; verbose)
-    end
 end
 
-function get_macos_version()
+function get_macos_version(normalize=true)
     @static if !Sys.isapple()
         return nothing
     end
@@ -78,17 +77,64 @@ function get_macos_version()
         return nothing
     end
 
-    return VersionNumber(only(m.captures))
+    ver = VersionNumber(only(m.captures))
+    if normalize && ver.major == 16
+        return VersionNumber(26, ver.minor, ver.patch)
+    end
+    return ver
+end
+
+function set_num_threads(n::LinearAlgebra.BlasInt)
+    @static if Sys.isapple()
+        if get_macos_version() < v"15"
+            @warn "The threading API is only available in macOS 15 and later"
+            return -1
+        end
+    else
+        return -1
+    end
+
+    retval::Cint = -1
+    if n == 1
+        retval = ccall((:BLASSetThreading, libacc), Cint, (Cint,), BLAS_THREADING_SINGLE_THREADED)
+    elseif n > 1
+        retval = ccall((:BLASSetThreading, libacc), Cint, (Cint,), BLAS_THREADING_MULTI_THREADED)
+    end
+    @assert retval == 0 "AppleAccelerate: Call to BlasSetThreading failed"
+    return nothing
+end
+
+function get_num_threads()::LinearAlgebra.BlasInt
+    @static if Sys.isapple()
+        if get_macos_version() < v"15"
+            @warn "The threading API is only available in macOS 15 and later"
+            return -1
+        end
+    else
+        return -1
+    end
+
+    retval::Threading = ccall((:BLASGetThreading, libacc), Threading, ())
+    if retval == BLAS_THREADING_SINGLE_THREADED
+        return LinearAlgebra.BlasInt(1)
+    elseif retval == BLAS_THREADING_MULTI_THREADED
+        return ccall((:APPLE_NTHREADS, libacc), LinearAlgebra.BlasInt, ())
+    else
+        @error "AppleAccelerate: Call to BlasGetThreading failed"
+    end
 end
 
 function __init__()
     Sys.isapple() || return
     ver = get_macos_version()
     # Default to loading the ILP64 interface on macOS 13.3+
     # dsptrf has a bug in the initial release of the $NEWLAPACK symbols in 13.3.
-    # Thus use macOS 13.4 for ILP64 and a correct LAPACK
-    ver < v"13.4" && return
-    load_accelerate(; load_ilp64=true, use_external_lapack=false)
+    # Thus use macOS 13.4 for ILP64, a correct LAPACK, and threading APIs
+    if ver < v"13.4"
+        @info "AppleAccelerate.jl needs macOS 13.4 or later"
+        return
+    end
+    load_accelerate(; clear = false, load_ilp64=true)
 end
 
 if Sys.isapple()
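
The version normalization added to `get_macos_version` in this file can be exercised in isolation. A minimal sketch, assuming only what the diff shows (a reported major version of 16 is rewritten to 26, with minor and patch preserved); `normalize_macos_version` is a hypothetical helper name, not a function in the package.

```julia
# Hypothetical standalone version of the normalization step shown in the diff:
# a reported major version of 16 is mapped to 26; anything else is returned unchanged.
function normalize_macos_version(ver::VersionNumber)
    if ver.major == 16
        return VersionNumber(26, ver.minor, ver.patch)
    end
    return ver
end

# Show how a few versions compare against the macOS 15 threshold
# that the diff uses for the threading API.
for v in (v"13.4", v"15.1", v"16.0")
    n = normalize_macos_version(v)
    println(v, " -> ", n, " (at least macOS 15: ", n >= v"15", ")")
end
```

Keeping the rewrite in one small function means version comparisons elsewhere, such as the `< v"15"` and `< v"13.4"` checks, operate on a single consistent numbering.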

src/Util.jl

Lines changed: 2 additions & 2 deletions
@@ -7,8 +7,8 @@ tupletypelength(a)=length(a.parameters)
 
 const OPS = Dict{Symbol,Tuple{Symbol, Symbol, Symbol}}(:+ => (:vadd, :vsadd, :vsadd),
                                                        :- => (:vsub, :vssub, :svsub),
-                                                       :* => (:vadd, :vsmul, :vsmul),
-                                                       :/ => (:vadd, :vsdiv, :vsdiv),)
+                                                       :* => (:vmul, :vsmul, :vsmul),
+                                                       :/ => (:vdiv, :vsdiv, :vsdiv),)
 
 macro replaceBase(fs...)
     b = Expr(:block)

test/runtests.jl

Lines changed: 23 additions & 20 deletions
@@ -338,8 +338,8 @@ end
     =#
 end
 
-if AppleAccelerate.get_macos_version() < v"13.3"
-    @info("AppleAccelerate.jl needs macOS >= 13.3 for BLAS forwarding. Not testing forwarding capabilities.")
+if AppleAccelerate.get_macos_version() < v"13.4"
+    @info("AppleAccelerate.jl needs macOS >= 13.4 for BLAS forwarding. Not testing forwarding capabilities.")
     exit(0)
 end
 
@@ -386,24 +386,27 @@ end
     @test BLAS.dot(a, a) ≈ 14f0
 end
 
-# Run all the LinearAlgebra stdlib tests, but with Accelerate. We still
-# use `Base.runtests()` to get multithreaded, distributed execution
-# to cut down on CI times, and also to restart workers that trip over
-# the testing RSS limit. In order for distributed workers to use Accelerate,
-# we'll modify the test source code so that it imports Accelerate:
-
-@testset "Full LinearAlgebra test suite" begin; mktempdir() do dir
-    cp(joinpath(Sys.BINDIR, Base.DATAROOTDIR, "julia", "test"), dir; force=true, follow_symlinks=true)
-
-    # Prepend `using AppleAccelerate` to `testdefs.jl`, so that all test workers load Accelerate
-    testdefs_path = joinpath(dir, "testdefs.jl")
-    chmod(testdefs_path, 0o644)
-    testdefs_content = String(read(testdefs_path))
-    open(testdefs_path, write=true) do io
-        println(io, accelerate_header)
-        println(io, testdefs_content)
+@testset "BLAS threading tests" begin
+    if AppleAccelerate.get_macos_version() >= v"15"
+        AppleAccelerate.set_num_threads(1)
+        @test AppleAccelerate.get_num_threads() == 1
+        AppleAccelerate.set_num_threads(4)
+        @test AppleAccelerate.get_num_threads() == Sys.CPU_THREADS
+    else
+        @test AppleAccelerate.get_num_threads() == -1
     end
+end
+
+linalg_stdlib_test_path = joinpath(dirname(pathof(LinearAlgebra)), "..", "test")
+
+# TODO: Re-enable after fixing https://github.com/JuliaLinearAlgebra/AppleAccelerate.jl/issues/87
+# Don't run blas.jl tests since the "strided interface blas" tests are currently failing
+#=
+@testset verbose=true "LinearAlgebra.jl BLAS tests" begin
+    joinpath(linalg_stdlib_test_path, "blas.jl") |> include
+end
+=#
 
-    run(`$(Base.julia_cmd()) --project=$(Base.active_project()) $(dir)/runtests.jl LinearAlgebra/blas LinearAlgebra/lapack`)
-end;
+@testset verbose=true "LinearAlgebra.jl LAPACK tests" begin
+    joinpath(linalg_stdlib_test_path, "lapack.jl") |> include
 end
