This would make writing Enzyme rules much easier, since it would let us avoid mixed activity. It would also help avoid GPU synchronization: the truncation error could be stored in a GPU-side vector and read back only when needed. Would it make sense to add another interface for `svd_trunc!`, `eig_trunc!`, and friends, like
`svd_trunc!(A, USVh, epsilon::AbstractVector{T}; alg) where {T<:Real}`, where `epsilon` has length 1 and is overwritten with the truncation error?
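To make the calling convention concrete, here is a minimal CPU-only sketch of what I have in mind. `svd_trunc_sketch!` and its `rank` keyword are hypothetical stand-ins built on `LinearAlgebra.svd!`, not the existing `svd_trunc!` or its `alg` machinery, and I am assuming the truncation error is the 2-norm of the discarded singular values.

```julia
using LinearAlgebra

# Hypothetical stand-in, not a MatrixAlgebraKit function: keep the `rank`
# largest singular values and write the truncation error into `epsilon`.
function svd_trunc_sketch!(A::AbstractMatrix, epsilon::AbstractVector{T};
                           rank::Integer) where {T<:Real}
    length(epsilon) == 1 || throw(ArgumentError("epsilon must have length 1"))
    F = svd!(A)                          # overwrites A with workspace data
    k = min(rank, length(F.S))
    discarded = @view F.S[(k + 1):end]
    # Array-valued reduction plus broadcast: the error is written into
    # `epsilon` without ever being returned as a scalar.
    epsilon .= sqrt.(sum(abs2, discarded; dims=1))
    return F.U[:, 1:k], F.S[1:k], F.Vt[1:k, :]
end

A = [3.0 0.0; 0.0 4.0]
epsilon = zeros(1)
U, S, Vh = svd_trunc_sketch!(copy(A), epsilon; rank=1)
```

The point is that every output is an array: the factors come back as the return value and the error lives in `epsilon`, so a rule never has to deal with a bare `Real` next to array outputs. On a GPU, `epsilon` would presumably be a device array that is only copied to the host when someone actually needs the number.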