Skip to content

rank(::QRPivoted) is very inefficient #1128

@ajinkya-k

Description

@ajinkya-k

The current implementation of rank(::QRPivoted) (see below) is very inefficient. I will submit a PR shortly to fix this.

More detailed benchmarking here: https://ajinkya-k.github.io/hidden/2024-11-28-allocations-runtime/

function rank(A::QRPivoted; atol::Real=0, rtol::Real=min(size(A)...) * eps(real(float(one(eltype(A.Q))))) * iszero(atol))
    m = min(size(A)...)
    m == 0 && return 0
    tol = max(atol, rtol*abs(A.R[1,1]))
    return something(findfirst(i -> abs(A.R[i,i]) <= tol, 1:m), m+1) - 1
end

This is because of the repeated calls to A.R inside findfirst. Each call recomputes the factor R, causing too many allocations and slowing down the method. This is irrelevant for small matrices but becomes problematic for large matrices (see examples below).

The following is a faster implementation, because it calls A.factors instead of A.R and only collects the diagonal elements, which is all we need.
This cuts down the runtime by many orders of magnitude for larger matrices.

function rankfast(A::QRPivoted; atol::Real=0, rtol::Real=min(size(A)...) * eps(real(float(one(eltype(A.Q))))) * iszero(atol))
    m = min(size(A)...)
    m == 0 && return 0
    rs = diag(getfield(A, :factors))
    tol = max(atol, rtol*abs(rs[1]))
    return something(findfirst(abs.(rs) .<= tol), m+1) - 1
end

Here's a sample comparison for a matrix of size 1000 x 100 and rank 50.
Note that the current method takes milliseconds, while the faster one works in microseconds

@be rank($Mqr)
Benchmark: 42 samples with 1 evaluation
 min    294.166 μs (157 allocs: 3.971 MiB)
 median 397.667 μs (157 allocs: 3.971 MiB)
 mean   2.323 ms (157 allocs: 3.971 MiB, 13.20% gc time)
 max    74.098 ms (157 allocs: 3.971 MiB, 99.40% gc time)

julia> @be rankfast($Mqr)
Benchmark: 10820 samples with 58 evaluations
 min    79.034 ns (6 allocs: 1.047 KiB)
 median 96.259 ns (6 allocs: 1.047 KiB)
 mean   134.789 ns (6 allocs: 1.047 KiB, 0.18% gc time)
 max    62.378 μs (6 allocs: 1.047 KiB, 99.07% gc time)

Here are also some plots of run time and bytes allocated for matrices of size 1000 x 100 as the rank increases from 1 to 100:
**Note: ** Y axis is log-scaled

Image

Image

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions