[linux-nvidia-6.8] arm64: contpte: fix set_access_flags() no-op check for SMMU/ATS faults #335

Open
nvmochs wants to merge 1 commit into NVIDIA:24.04_linux-nvidia from nvmochs:contpte_fix_68

nvmochs commented Mar 9, 2026

contpte_ptep_set_access_flags() compared the gathered ptep_get() value against the requested entry to detect no-ops. ptep_get() ORs AF/dirty from all sub-PTEs in the CONT block, so a dirty sibling can make the target appear already-dirty. When the gathered value matches entry, the function returns 0 even though the target sub-PTE still has PTE_RDONLY set in hardware.
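
The gathered read described above can be modeled in a few lines of user-space C. This is only a toy sketch, not the kernel code: the `toy_*` names are hypothetical, the bit positions are assumptions chosen to resemble arm64, and a 16-entry CONT block (4K pages) is assumed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model only: names and bit positions are illustrative assumptions. */
#define TOY_CONT_PTES 16            /* assumed sub-PTEs per CONT block */
#define TOY_RDONLY (1ULL << 7)
#define TOY_AF     (1ULL << 10)
#define TOY_WRITE  (1ULL << 51)
#define TOY_DIRTY  (1ULL << 55)

typedef uint64_t toy_pte_t;

/* Dirty if software-dirty, or writable with the read-only bit cleared. */
static bool toy_pte_dirty(toy_pte_t pte)
{
	return (pte & TOY_DIRTY) || ((pte & TOY_WRITE) && !(pte & TOY_RDONLY));
}

/* Mark dirty; a writable PTE also loses its read-only bit. */
static toy_pte_t toy_pte_mkdirty(toy_pte_t pte)
{
	pte |= TOY_DIRTY;
	if (pte & TOY_WRITE)
		pte &= ~TOY_RDONLY;
	return pte;
}

/* Gathered read: start from the target and fold in AF/dirty of every sub-PTE. */
static toy_pte_t toy_gathered_get(const toy_pte_t *block, int idx)
{
	toy_pte_t pte = block[idx];

	for (int i = 0; i < TOY_CONT_PTES; i++) {
		if (toy_pte_dirty(block[i]))
			pte = toy_pte_mkdirty(pte);
		if (block[i] & TOY_AF)
			pte |= TOY_AF;
	}
	return pte;
}
```

In this model, a block whose sub-PTEs are all writable, read-only and young, with a single dirty sibling, yields a gathered value for the target that equals the dirty entry requested by a write fault, even though the raw target still carries the read-only bit.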

For a CPU with FEAT_HAFDBS this gathered view is fine, since hardware may set AF/dirty on any sub-PTE and CPU TLB behavior is effectively gathered across the CONT range. But page-table walkers that evaluate each descriptor individually (e.g. a CPU without DBM support, or an SMMU without HTTU, or with HA/HD disabled in CD.TCR) can keep faulting on the unchanged target sub-PTE, causing an infinite fault loop.

Gathering can therefore cause false no-ops when only a sibling has been updated:

  • write faults: target still has PTE_RDONLY (needs PTE_RDONLY cleared)
  • read faults: target still lacks PTE_AF
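
Both cases can be reproduced with a small simulation. The sketch below is a simplified user-space model under stated assumptions: the `sim_*` names are hypothetical, the bit positions only resemble arm64, the walker never updates AF/dirty itself, and the update path simply writes the requested entry to every sub-PTE (address bits are not modeled).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model only: names and bit positions are illustrative assumptions. */
#define SIM_CONT_PTES 16
#define SIM_RDONLY (1ULL << 7)
#define SIM_AF     (1ULL << 10)
#define SIM_WRITE  (1ULL << 51)
#define SIM_DIRTY  (1ULL << 55)

typedef uint64_t sim_pte_t;

static bool sim_dirty(sim_pte_t p)
{
	return (p & SIM_DIRTY) || ((p & SIM_WRITE) && !(p & SIM_RDONLY));
}

static sim_pte_t sim_mkdirty(sim_pte_t p)
{
	p |= SIM_DIRTY;
	if (p & SIM_WRITE)
		p &= ~SIM_RDONLY;
	return p;
}

/* Gathered view: AF/dirty of all sub-PTEs folded into the target. */
static sim_pte_t sim_gathered_get(const sim_pte_t *block, int idx)
{
	sim_pte_t p = block[idx];

	for (int i = 0; i < SIM_CONT_PTES; i++) {
		if (sim_dirty(block[i]))
			p = sim_mkdirty(p);
		p |= block[i] & SIM_AF;
	}
	return p;
}

/* Walker without hardware AF/dirty updates: judges one descriptor alone. */
static bool sim_walker_faults(sim_pte_t p, bool is_write)
{
	if (!(p & SIM_AF))
		return true;
	return is_write && (p & SIM_RDONLY);
}

/* Pre-fix behaviour: treat "gathered == entry" as a no-op. */
static int sim_old_set_access_flags(sim_pte_t *block, int idx, sim_pte_t entry)
{
	if (sim_gathered_get(block, idx) == entry)
		return 0;
	for (int i = 0; i < SIM_CONT_PTES; i++)
		block[i] = entry;   /* simplification: address bits not modeled */
	return 1;
}

/* Count faults taken on the target before the access succeeds (bounded). */
static int sim_faults_until_progress(sim_pte_t *block, int idx, sim_pte_t entry,
				     bool is_write, int max_faults)
{
	int faults = 0;

	while (faults < max_faults && sim_walker_faults(block[idx], is_write)) {
		faults++;
		sim_old_set_access_flags(block, idx, entry);
	}
	return faults;
}
```

With one sibling already dirty (write case) or already young (read case), `sim_faults_until_progress()` runs into `max_faults` because the target is never touched. With no sibling updated, a single fault is enough.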

Fix by checking each sub-PTE against the requested AF/dirty/write state (the same bits consumed by __ptep_set_access_flags()), using raw per-PTE values rather than the gathered ptep_get() view, before returning no-op. Keep using the raw target PTE for the write-bit unfold decision.
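
A minimal sketch of such a per-sub-PTE check follows. It is not the patch itself: `demo_all_subptes_match()` and the `DEMO_*` bit positions are hypothetical, and the mask is an assumption based on the AF/dirty/write state named above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model only: names and bit positions are illustrative assumptions. */
#define DEMO_CONT_PTES 16
#define DEMO_RDONLY (1ULL << 7)
#define DEMO_AF     (1ULL << 10)
#define DEMO_WRITE  (1ULL << 51)
#define DEMO_DIRTY  (1ULL << 55)

typedef uint64_t demo_pte_t;

/*
 * Return true only if every raw sub-PTE already carries the requested
 * access-flag state; only then is skipping the update safe.
 */
static bool demo_all_subptes_match(const demo_pte_t *block, demo_pte_t entry)
{
	const demo_pte_t mask = DEMO_RDONLY | DEMO_AF | DEMO_WRITE | DEMO_DIRTY;

	for (int i = 0; i < DEMO_CONT_PTES; i++) {
		/* Compare raw per-PTE bits, never a gathered value. */
		if ((block[i] ^ entry) & mask)
			return false;
	}
	return true;
}
```

Because the comparison uses raw values, one dirty sibling no longer hides a target that still has its read-only bit set; the check reports a match only once the whole range is consistent.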

Per Arm ARM (DDI 0487) D8.7.1 ("The Contiguous bit"), any sub-PTE in a CONT range may become the effective cached translation and software must maintain consistent attributes across the range.

Fixes: 4602e57 ("arm64/mm: wire up PTE_CONT for user mappings")
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Zi Yan <ziy@nvidia.com>
Cc: Breno Leitao <leitao@debian.org>
Cc: stable@vger.kernel.org
Reviewed-by: Alistair Popple <apopple@nvidia.com>
Reviewed-by: James Houghton <jthoughton@google.com>
Reviewed-by: Ryan Roberts <ryan.roberts@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: Breno Leitao <leitao@debian.org>

Acked-by: Balbir Singh <balbirs@nvidia.com>

(backported from commit 97c5550)
[mochs: minor context adjustment due to lack of contpte_clear_young_dirty_ptes()]


This upstream v7.0 fix is needed to resolve a hang in pageable D2H (device-to-host) copies. See nvb#5931592.

The fix was tested using the cuda_d2h_pageable.py script attached to the bug.


LP: https://bugs.launchpad.net/ubuntu/+source/linux-nvidia/+bug/2143602

contpte_ptep_set_access_flags() compared the gathered ptep_get() value
against the requested entry to detect no-ops. ptep_get() ORs AF/dirty
from all sub-PTEs in the CONT block, so a dirty sibling can make the
target appear already-dirty. When the gathered value matches entry, the
function returns 0 even though the target sub-PTE still has PTE_RDONLY
set in hardware.

For a CPU with FEAT_HAFDBS this gathered view is fine, since hardware may
set AF/dirty on any sub-PTE and CPU TLB behavior is effectively gathered
across the CONT range. But page-table walkers that evaluate each
descriptor individually (e.g. a CPU without DBM support, or an SMMU
without HTTU, or with HA/HD disabled in CD.TCR) can keep faulting on the
unchanged target sub-PTE, causing an infinite fault loop.

Gathering can therefore cause false no-ops when only a sibling has been
updated:
 - write faults: target still has PTE_RDONLY (needs PTE_RDONLY cleared)
 - read faults:  target still lacks PTE_AF

Fix by checking each sub-PTE against the requested AF/dirty/write state
(the same bits consumed by __ptep_set_access_flags()), using raw
per-PTE values rather than the gathered ptep_get() view, before
returning no-op. Keep using the raw target PTE for the write-bit unfold
decision.

Per Arm ARM (DDI 0487) D8.7.1 ("The Contiguous bit"), any sub-PTE in a CONT
range may become the effective cached translation and software must
maintain consistent attributes across the range.

Fixes: 4602e57 ("arm64/mm: wire up PTE_CONT for user mappings")
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Zi Yan <ziy@nvidia.com>
Cc: Breno Leitao <leitao@debian.org>
Cc: stable@vger.kernel.org
Reviewed-by: Alistair Popple <apopple@nvidia.com>
Reviewed-by: James Houghton <jthoughton@google.com>
Reviewed-by: Ryan Roberts <ryan.roberts@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: Breno Leitao <leitao@debian.org>
Signed-off-by: Piotr Jaroszynski <pjaroszynski@nvidia.com>
Acked-by: Balbir Singh <balbirs@nvidia.com>
Signed-off-by: Will Deacon <will@kernel.org>
(backported from commit 97c5550)
[mochs: minor context adjustment due to lack of contpte_clear_young_dirty_ptes()]
Signed-off-by: Matthew R. Ochs <mochs@nvidia.com>

clsotog left a comment

Acked-by: Carol L Soto <csoto@nvidia.com>

@jamieNguyenNVIDIA

Acked-by: Jamie Nguyen <jamien@nvidia.com>

nvmochs commented Mar 9, 2026

PR submitted to Canonical.
