From: Catalin Marinas <catalin.marinas@arm.com>
To: Ryan Roberts <ryan.roberts@arm.com>
Cc: Will Deacon <will@kernel.org>, Ard Biesheuvel <ardb@kernel.org>,
Marc Zyngier <maz@kernel.org>,
Oliver Upton <oliver.upton@linux.dev>,
James Morse <james.morse@arm.com>,
Suzuki K Poulose <suzuki.poulose@arm.com>,
Zenghui Yu <yuzenghui@huawei.com>,
Andrey Ryabinin <ryabinin.a.a@gmail.com>,
Alexander Potapenko <glider@google.com>,
Andrey Konovalov <andreyknvl@gmail.com>,
Dmitry Vyukov <dvyukov@google.com>,
Vincenzo Frascino <vincenzo.frascino@arm.com>,
Andrew Morton <akpm@linux-foundation.org>,
Anshuman Khandual <anshuman.khandual@arm.com>,
Matthew Wilcox <willy@infradead.org>, Yu Zhao <yuzhao@google.com>,
Mark Rutland <mark.rutland@arm.com>,
linux-arm-kernel@lists.infradead.org,
linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH v1 11/14] arm64/mm: Wire up PTE_CONT for user mappings
Date: Sun, 16 Jul 2023 08:09:52 -0700
Message-ID: <ZLQIQAtq6NfSjX1C@arm.com>
In-Reply-To: <f59228dd-2186-c882-3774-c9778918cd31@arm.com>
On Tue, Jul 04, 2023 at 12:09:31PM +0100, Ryan Roberts wrote:
> On 03/07/2023 16:17, Catalin Marinas wrote:
> > Hi Ryan,
> >
> > Some comments below. I did not have time to trim down the quoted text,
> > so you may need to scroll through it.
>
> Thanks for the review!
>
> Looking at the comments, I think they all relate to implementation. Does that
> imply that you are happy with the shape/approach?
I can't really tell yet as there are a few dependencies and I haven't
applied them to look at the bigger picture. My preference would be to
handle the large folio breaking/making in the core code via APIs like
set_ptes() and eliminate the loop heuristics in the arm64
code to fold/unfold. Maybe it's not entirely possible; I need to look at
the bigger picture with all the series applied (and on a bigger screen,
as I'm writing this reply on a laptop in flight).
> Talking with Anshuman yesterday, he suggested putting this behind a new Kconfig
> option that defaults to disabled and also adding a command line option to
> disable it when compiled in. I think that makes sense for now at least to reduce
> risk of performance regression?
I'm fine with a Kconfig option (maybe expert) but default enabled,
otherwise it won't get enough coverage. AFAICT, the biggest risk of
regression is the heuristics for folding/unfolding. In general the
overhead should be offset by the reduced TLB pressure but we may find
some pathological case where this gets in the way.
> > On Thu, Jun 22, 2023 at 03:42:06PM +0100, Ryan Roberts wrote:
> >> + /*
> >> + * No need to flush here; This is always "more permissive" so we
> >> + * can only be _adding_ the access or dirty bit. And since the
> >> + * tlb can't cache an entry without the AF set and the dirty bit
> >> + * is a SW bit, there can be no confusion. For HW access
> >> + * management, we technically only need to update the flag on a
> >> + * single pte in the range. But for SW access management, we
> >> + * need to update all the ptes to prevent extra faults.
> >> + */
> >
> > On pre-DBM hardware, a PTE_RDONLY entry (writable from the kernel
> > perspective but clean) may be cached in the TLB and we do need flushing.
>
> I don't follow; The Arm ARM says:
>
> IPNQBP When an Access flag fault is generated, the translation table entry
> causing the fault is not cached in a TLB.
>
> So the entry can only be in the TLB if AF is already 1. And given the dirty bit
> is SW, it shouldn't affect the TLB state. And this function promises to only
> change the bits so they are more permissive (so AF=0 -> AF=1, D=0 -> D=1).
>
> So I'm not sure what case you are describing here?
The comment for this function states that it sets the access/dirty flags
as well as the write permission. Prior to DBM, the page is marked
PTE_RDONLY and we take a fault. This function marks the page dirty by
setting the software PTE_DIRTY bit (no need to worry) but also clearing
PTE_RDONLY so that a subsequent access won't fault again. We do need the
TLBI here since PTE_RDONLY is allowed to be cached in the TLB.
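To illustrate the point above, here is a minimal user-space sketch, not
kernel code. It assumes a toy single-entry TLB that keeps a cached entry
until it is explicitly invalidated; every toy_* name is hypothetical. The
bit positions follow the arm64 pgtable definitions but are only used to
make the model readable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit positions follow arm64's definitions; this is only a toy model. */
#define PTE_VALID  (1ULL << 0)
#define PTE_RDONLY (1ULL << 7)   /* AP[2]: read-only when set */
#define PTE_AF     (1ULL << 10)  /* hardware Access flag */
#define PTE_DIRTY  (1ULL << 55)  /* software dirty bit */

/* Hypothetical single-entry TLB holding a snapshot of the PTE. */
struct toy_tlb {
	bool valid;
	uint64_t pte;
};

static uint64_t toy_pte;      /* the one PTE living in "memory" */
static struct toy_tlb toy_tlb;

/* Walk the "page table"; entries with AF clear are never cached. */
static bool toy_tlb_fill(void)
{
	if (!(toy_pte & PTE_VALID) || !(toy_pte & PTE_AF))
		return false;
	toy_tlb.pte = toy_pte;
	toy_tlb.valid = true;
	return true;
}

/* A read succeeds once a translation with AF set is available. */
static bool toy_read_access(void)
{
	return toy_tlb.valid || toy_tlb_fill();
}

/* A write faults if the translation in use still has PTE_RDONLY set. */
static bool toy_write_access(void)
{
	if (!toy_tlb.valid && !toy_tlb_fill())
		return false;
	return !(toy_tlb.pte & PTE_RDONLY);
}

/* Models the pre-DBM update: set AF and SW dirty, clear PTE_RDONLY. */
static void toy_set_access_flags(bool flush)
{
	toy_pte |= PTE_AF | PTE_DIRTY;
	toy_pte &= ~PTE_RDONLY;
	if (flush)
		toy_tlb.valid = false;  /* stands in for the TLBI */
}

/* Count write faults taken within max_attempts retries of the write. */
int toy_count_write_faults(bool flush_after_update, int max_attempts)
{
	int faults = 0;

	assert(max_attempts > 0);

	/* Clean, writable-by-the-kernel page: AF set, still PTE_RDONLY. */
	toy_pte = PTE_VALID | PTE_AF | PTE_RDONLY;
	toy_tlb.valid = false;
	toy_read_access();  /* caches the read-only entry in the TLB */

	for (int i = 0; i < max_attempts; i++) {
		if (toy_write_access())
			break;
		faults++;
		toy_set_access_flags(flush_after_update);
	}
	return faults;
}
```

In this model, toy_count_write_faults(true, 3) takes a single fault, while
toy_count_write_faults(false, 3) faults on every attempt because the stale
read-only entry is never dropped. Real hardware may evict the entry on its
own, so the sketch only shows why the update cannot be assumed to be
visible without the TLBI.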
Sorry, I did not reply to your other comments (we can talk in person in
about a week's time). I also noticed you had figured out the above, but I
had written it already.
--
Catalin