From: Piotr Jaroszynski <pjaroszynski@nvidia.com>
To: Jason Gunthorpe <jgg@nvidia.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>,
Ryan Roberts <ryan.roberts@arm.com>,
Will Deacon <will@kernel.org>,
linux-arm-kernel@lists.infradead.org, linux-mm@kvack.org,
Alistair Popple <apopple@nvidia.com>,
John Hubbard <jhubbard@nvidia.com>, Zi Yan <ziy@nvidia.com>,
Breno Leitao <leitao@debian.org>,
stable@vger.kernel.org
Subject: Re: [PATCH] arm64: contpte: fix set_access_flags() no-op check for SMMU/ATS faults
Date: Wed, 4 Mar 2026 09:16:58 -0800
Message-ID: <aahnVXb1IF2yVV5x@box>
In-Reply-To: <20260304153949.GP972761@nvidia.com>

On Wed, Mar 04, 2026 at 11:39:49AM -0400, Jason Gunthorpe wrote:
> On Wed, Mar 04, 2026 at 03:01:51PM +0000, Catalin Marinas wrote:
> > Good point. For the AF bit, the hardware is not allowed to cache it in
> > the TLB, so we can't get an AF fault for an unrelated VA nearby.
>
> The way we have read the spec is there is no restriction on what PTE
> the HW accesses when it encounters a CONT group.
>
> To be concrete, the spec seems to say it is legal to make HW that
> fetches the PTE at the VA, sees the CONT bit, and then always fetches
> the 0th PTE from the group and only uses that for permission checks.
>
> Therefore SW should never assume that HW will read any particular
> sub-PTE under any scenario.
>
> It seems current cores don't do this, and it is a bit silly to do, but
> I can imagine an optimization where the core does a cache line fetch to
> read the PTE so it can freely snap to the PTE at the start of the
> cache line for permission checks. Consolidating permission storage to
> fewer PTEs would reduce atomic memory traffic if the TLB is thrashing.

"The Contiguous bit" section I quoted in the change says this:

    The entry is permitted to be cached in a TLB as though it is one of a
    number of adjacent translation table entries that point to a contiguous
    OA range with consistent attributes and permissions.

    Software is required to ensure that all of the adjacent translation
    table entries for the contiguous region point to a contiguous OA range
    with consistent attributes and permissions.

I think your example is valid, as any of the sub-PTEs can be cached.

Another valid example: you first access address A, and the PTE for A gets
cached as the value for the whole 2MB region. Then you access a different
address B within the region and fault based on the cached attributes. In
this case the SMMU never had to read the PTE for B, as the entry cached
when accessing A already covers it. If the fault handling code only read
the PTE for B, it could see that e.g. RDONLY was already cleared there,
treat the fault as a no-op, and hit the problem again.

In summary, I don't see a way to skip reading and fixing all the
sub-PTEs. The previous code already reads all of them, so the fix does
not add any new overhead.
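
To make the A/B scenario concrete, here is a rough user-space model of
the two kinds of no-op check. It is only a sketch, not the kernel code
or the patch: the helper names (noop_check_single, noop_check_all,
fix_all), the group size of 16 and FLAG_MASK are made up for
illustration, and the bit positions are only meant to resemble the
arm64 AF and RDONLY bits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model: names, group size and bit positions are illustrative. */
#define CONT_PTES   16u            /* entries in one contiguous group (assumed) */
#define PTE_AF      (1ull << 10)   /* access flag */
#define PTE_RDONLY  (1ull << 7)    /* set means read-only */
#define FLAG_MASK   (PTE_AF | PTE_RDONLY)

typedef uint64_t pte_t;

/* Narrow check: looks only at the sub-PTE for the faulting address. */
static bool noop_check_single(const pte_t *group, size_t idx, pte_t want)
{
	assert(idx < CONT_PTES);
	return (group[idx] & FLAG_MASK) == (want & FLAG_MASK);
}

/* Wide check: a no-op only if every sub-PTE already has the wanted flags. */
static bool noop_check_all(const pte_t *group, pte_t want)
{
	for (size_t i = 0; i < CONT_PTES; i++)
		if ((group[i] & FLAG_MASK) != (want & FLAG_MASK))
			return false;
	return true;
}

/* Rewrite the flags in every sub-PTE so that all entries agree. */
static void fix_all(pte_t *group, pte_t want)
{
	for (size_t i = 0; i < CONT_PTES; i++)
		group[i] = (group[i] & ~FLAG_MASK) | (want & FLAG_MASK);
}
```

If only the sub-PTE for B has RDONLY cleared while the entry cached
from A is still read-only, noop_check_single() reports "nothing to do"
for a write fault at B, whereas noop_check_all() does not, and
fix_all() then brings the whole group into agreement.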
>
> Jason