[linux-nvidia-6.8] arm64: contpte: fix set_access_flags() no-op check for SMMU/ATS faults by nvmochs · Pull Request #335 · NVIDIA/NV-Kernels

nvmochs · 2026-03-09T16:14:31Z

contpte_ptep_set_access_flags() compared the gathered ptep_get() value against the requested entry to detect no-ops. ptep_get() ORs AF/dirty from all sub-PTEs in the CONT block, so a dirty sibling can make the target appear already-dirty. When the gathered value matches entry, the function returns 0 even though the target sub-PTE still has PTE_RDONLY set in hardware.

For a CPU with FEAT_HAFDBS this gathered view is fine, since hardware may set AF/dirty on any sub-PTE and CPU TLB behavior is effectively gathered across the CONT range. But page-table walkers that evaluate each descriptor individually (e.g. a CPU without DBM support, or an SMMU without HTTU, or with HA/HD disabled in CD.TCR) can keep faulting on the unchanged target sub-PTE, causing an infinite fault loop.
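The gathering effect can be sketched with a small user-space model. This is not kernel code: the bit values, `model_pte_t`, `model_gathered_get()` and `model_fill_block()` are hypothetical stand-ins, and the block size of 16 assumes a 4K granule. The model also assumes a writable mapping, so a gathered dirty bit clears the read-only bit in the returned value.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model only: these bit values are illustrative, not arm64 encodings. */
#define MODEL_PTE_AF     (1u << 0)   /* access flag */
#define MODEL_PTE_DIRTY  (1u << 1)   /* dirty */
#define MODEL_PTE_RDONLY (1u << 2)   /* read-only as seen by a table walker */
#define MODEL_CONT_PTES  16          /* assumed sub-PTEs per CONT block */

typedef uint32_t model_pte_t;

/*
 * Hypothetical stand-in for the gathered read: start from the target's raw
 * value, then fold in AF/dirty seen on any sub-PTE in the block. Marking the
 * result dirty also clears RDONLY (writable mapping assumed).
 */
model_pte_t model_gathered_get(const model_pte_t *block, int idx)
{
	assert(idx >= 0 && idx < MODEL_CONT_PTES);
	model_pte_t pte = block[idx];

	for (int i = 0; i < MODEL_CONT_PTES; i++) {
		if (block[i] & MODEL_PTE_AF)
			pte |= MODEL_PTE_AF;
		if (block[i] & MODEL_PTE_DIRTY) {
			pte |= MODEL_PTE_DIRTY;
			pte &= ~MODEL_PTE_RDONLY;
		}
	}
	return pte;
}

/* Build a clean, read-only block, then mark exactly one sibling dirty. */
void model_fill_block(model_pte_t *block, int dirty_sibling)
{
	assert(dirty_sibling >= 0 && dirty_sibling < MODEL_CONT_PTES);
	for (int i = 0; i < MODEL_CONT_PTES; i++)
		block[i] = MODEL_PTE_AF | MODEL_PTE_RDONLY;
	block[dirty_sibling] = MODEL_PTE_AF | MODEL_PTE_DIRTY;
}
```

With sibling 3 dirty, the gathered value for sub-PTE 0 looks dirty and writable, while the raw `block[0]` still carries the read-only bit that a per-descriptor walker would fault on.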

Gathering can therefore cause false no-ops when only a sibling has been updated:

- write faults: target still has PTE_RDONLY (needs PTE_RDONLY cleared)
- read faults: target still lacks PTE_AF
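Both cases can be reproduced in a toy model of the old comparison. The sketch below is an assumption-laden illustration, not the kernel's code: the bit values and the helpers `gathered_view()`, `old_check_is_noop()` and `target_still_stale()` are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model only: these bit values are illustrative, not arm64 encodings. */
#define MODEL_PTE_AF     (1u << 0)
#define MODEL_PTE_DIRTY  (1u << 1)
#define MODEL_PTE_RDONLY (1u << 2)
#define MODEL_CONT_PTES  16          /* assumed sub-PTEs per CONT block */

typedef uint32_t model_pte_t;

/* Gathered view: target's raw value plus AF/dirty from every sub-PTE. */
static model_pte_t gathered_view(const model_pte_t *block, int idx)
{
	assert(idx >= 0 && idx < MODEL_CONT_PTES);
	model_pte_t pte = block[idx];

	for (int i = 0; i < MODEL_CONT_PTES; i++) {
		if (block[i] & MODEL_PTE_AF)
			pte |= MODEL_PTE_AF;
		if (block[i] & MODEL_PTE_DIRTY) {
			pte |= MODEL_PTE_DIRTY;
			pte &= ~MODEL_PTE_RDONLY; /* writable mapping assumed */
		}
	}
	return pte;
}

/* Old-style check: no-op when the gathered view equals the requested entry. */
bool old_check_is_noop(const model_pte_t *block, int idx, model_pte_t entry)
{
	return gathered_view(block, idx) == entry;
}

/* True when the raw target still differs from what the fault asked for. */
bool target_still_stale(const model_pte_t *block, int idx, model_pte_t entry)
{
	assert(idx >= 0 && idx < MODEL_CONT_PTES);
	return block[idx] != entry;
}
```

In this model a walker that reads only the raw target descriptor would fault again whenever `old_check_is_noop()` and `target_still_stale()` are both true, which is the loop described above.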

Fix by checking each sub-PTE against the requested AF/dirty/write state (the same bits consumed by __ptep_set_access_flags()), using raw per-PTE values rather than the gathered ptep_get() view, before returning no-op. Keep using the raw target PTE for the write-bit unfold decision.
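The shape of the per-sub-PTE check can be sketched in the same toy model. This is a simplified illustration rather than the patch itself: the bit values, `MODEL_ACCESS_MASK` and `all_subptes_match()` are hypothetical names, and the mask only approximates the AF/dirty/write bits mentioned above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model only: these bit values are illustrative, not arm64 encodings. */
#define MODEL_PTE_AF     (1u << 0)
#define MODEL_PTE_DIRTY  (1u << 1)
#define MODEL_PTE_RDONLY (1u << 2)
#define MODEL_PTE_WRITE  (1u << 3)
#define MODEL_CONT_PTES  16          /* assumed sub-PTEs per CONT block */

/* Bits the model compares; stands in for the AF/dirty/write state. */
#define MODEL_ACCESS_MASK \
	(MODEL_PTE_AF | MODEL_PTE_DIRTY | MODEL_PTE_RDONLY | MODEL_PTE_WRITE)

typedef uint32_t model_pte_t;

/*
 * Return true only if every raw sub-PTE already carries the requested
 * access bits. No gathering: one stale sub-PTE is enough to say "not a
 * no-op".
 */
bool all_subptes_match(const model_pte_t *block, model_pte_t entry)
{
	assert(block);
	model_pte_t want = entry & MODEL_ACCESS_MASK;

	for (int i = 0; i < MODEL_CONT_PTES; i++) {
		if ((block[i] & MODEL_ACCESS_MASK) != want)
			return false;
	}
	return true;
}
```

Under this check the dirty-sibling case no longer short-circuits: a block with one dirty sub-PTE and fifteen read-only ones does not match a dirty, writable entry, so the caller would go on to update the page table instead of returning 0.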

Per Arm ARM (DDI 0487) D8.7.1 ("The Contiguous bit"), any sub-PTE in a CONT range may become the effective cached translation and software must maintain consistent attributes across the range.

Fixes: 4602e57 ("arm64/mm: wire up PTE_CONT for user mappings")
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Zi Yan <ziy@nvidia.com>
Cc: Breno Leitao <leitao@debian.org>
Cc: stable@vger.kernel.org
Reviewed-by: Alistair Popple <apopple@nvidia.com>
Reviewed-by: James Houghton <jthoughton@google.com>
Reviewed-by: Ryan Roberts <ryan.roberts@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: Breno Leitao <leitao@debian.org>

Acked-by: Balbir Singh <balbirs@nvidia.com>

(backported from commit 97c5550)
[mochs: minor context adjustment due to lack of contpte_clear_young_dirty_ptes()]

This v7.0 fix patch is needed to resolve a hang in pageable D2H copy. See nvb#5931592.

The fix was tested using the cuda_d2h_pageable.py script attached to the bug.

LP: https://bugs.launchpad.net/ubuntu/+source/linux-nvidia/+bug/2143602

Fixes: 4602e57 ("arm64/mm: wire up PTE_CONT for user mappings")
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Zi Yan <ziy@nvidia.com>
Cc: Breno Leitao <leitao@debian.org>
Cc: stable@vger.kernel.org
Reviewed-by: Alistair Popple <apopple@nvidia.com>
Reviewed-by: James Houghton <jthoughton@google.com>
Reviewed-by: Ryan Roberts <ryan.roberts@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: Breno Leitao <leitao@debian.org>
Signed-off-by: Piotr Jaroszynski <pjaroszynski@nvidia.com>
Acked-by: Balbir Singh <balbirs@nvidia.com>
Signed-off-by: Will Deacon <will@kernel.org>
(backported from commit 97c5550)
[mochs: minor context adjustment due to lack of contpte_clear_young_dirty_ptes()]
Signed-off-by: Matthew R. Ochs <mochs@nvidia.com>

clsotog

Acked-by: Carol L Soto <csoto@nvidia.com>

jamieNguyenNVIDIA · 2026-03-09T17:01:16Z

Acked-by: Jamie Nguyen <jamien@nvidia.com>

nvmochs · 2026-03-09T18:11:34Z

PR submitted to Canonical.

nvmochs requested review from clsotog, jamieNguyenNVIDIA and nirmoy March 9, 2026 16:14

clsotog approved these changes Mar 9, 2026

jamieNguyenNVIDIA approved these changes Mar 9, 2026

nvmochs wants to merge 1 commit into NVIDIA:24.04_linux-nvidia from nvmochs:contpte_fix_68
