All of lore.kernel.org
 help / color / mirror / Atom feed
From: Alistair Popple <apopple@nvidia.com>
To: Ryan Roberts <ryan.roberts@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>,
	Will Deacon <will@kernel.org>, Ard Biesheuvel <ardb@kernel.org>,
	Marc Zyngier <maz@kernel.org>,
	Oliver Upton <oliver.upton@linux.dev>,
	James Morse <james.morse@arm.com>,
	Suzuki K Poulose <suzuki.poulose@arm.com>,
	Zenghui Yu <yuzenghui@huawei.com>,
	Andrey Ryabinin <ryabinin.a.a@gmail.com>,
	Alexander Potapenko <glider@google.com>,
	Andrey Konovalov <andreyknvl@gmail.com>,
	Dmitry Vyukov <dvyukov@google.com>,
	Vincenzo Frascino <vincenzo.frascino@arm.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Anshuman Khandual <anshuman.khandual@arm.com>,
	Matthew Wilcox <willy@infradead.org>, Yu Zhao <yuzhao@google.com>,
	Mark Rutland <mark.rutland@arm.com>,
	David Hildenbrand <david@redhat.com>,
	Kefeng Wang <wangkefeng.wang@huawei.com>,
	John Hubbard <jhubbard@nvidia.com>, Zi Yan <ziy@nvidia.com>,
	Barry Song <21cnbao@gmail.com>, Yang Shi <shy828301@gmail.com>,
	linux-arm-kernel@lists.infradead.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH v3 02/15] mm: Batch-clear PTE ranges during zap_pte_range()
Date: Fri, 08 Dec 2023 12:30:40 +1100	[thread overview]
Message-ID: <87h6kta3ap.fsf@nvdebian.thelocal> (raw)
In-Reply-To: <20231204105440.61448-3-ryan.roberts@arm.com>


Ryan Roberts <ryan.roberts@arm.com> writes:

> Convert zap_pte_range() to clear a set of ptes in a batch. A given batch
> maps a physically contiguous block of memory, all belonging to the same
> folio. This will likely improve performance by a tiny amount due to
> removing duplicate calls to mark the folio dirty and accessed. And also
> provides us with a future opportunity to batch the rmap removal.
>
> However, the primary motivation for this change is to reduce the number
> of tlb maintenance operations that the arm64 backend has to perform
> during exit and other syscalls that cause zap_pte_range() (e.g. munmap,
> madvise(DONTNEED), etc.), as it is about to add transparent support for
> the "contiguous bit" in its ptes. By clearing ptes using the new
> clear_ptes() API, the backend doesn't have to perform an expensive
> unfold operation when a PTE being cleared is part of a contpte block.
> Instead it can just clear the whole block immediately.
>
> This change addresses the core-mm refactoring only, and introduces
> clear_ptes() with a default implementation that calls
> ptep_get_and_clear_full() for each pte in the range. Note that this API
> returns the pte at the beginning of the batch, but with the dirty and
> young bits set if ANY of the ptes in the cleared batch had those bits
> set; this information is applied to the folio by the core-mm. Given the
> batch is garranteed to cover only a single folio, collapsing this state

Nit: s/garranteed/guaranteed/

> does not lose any useful information.
>
> A separate change will implement clear_ptes() in the arm64 backend to
> realize the performance improvement as part of the work to enable
> contpte mappings.
>
> Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
> ---
>  include/asm-generic/tlb.h |  9 ++++++
>  include/linux/pgtable.h   | 26 ++++++++++++++++
>  mm/memory.c               | 63 ++++++++++++++++++++++++++-------------
>  mm/mmu_gather.c           | 14 +++++++++
>  4 files changed, 92 insertions(+), 20 deletions(-)

<snip>

> diff --git a/mm/mmu_gather.c b/mm/mmu_gather.c
> index 4f559f4ddd21..57b4d5f0dfa4 100644
> --- a/mm/mmu_gather.c
> +++ b/mm/mmu_gather.c
> @@ -47,6 +47,20 @@ static bool tlb_next_batch(struct mmu_gather *tlb)
>  	return true;
>  }
>  
> +unsigned int tlb_get_guaranteed_space(struct mmu_gather *tlb)
> +{
> +	struct mmu_gather_batch *batch = tlb->active;
> +	unsigned int nr_next = 0;
> +
> +	/* Allocate next batch so we can guarrantee at least one batch. */
> +	if (tlb_next_batch(tlb)) {
> +		tlb->active = batch;

Rather than calling tlb_next_batch(tlb) and then undoing some of what it
does I think it would be clearer to factor out the allocation part of
tlb_next_batch(tlb) into a separate function (eg. tlb_alloc_batch) that
you can call from both here and tlb_next_batch().

Otherwise I think this overall direction looks better than trying to
play funny games in the arch layer as it's much clearer what's going on
to core-mm code.

 - Alistair

> +		nr_next = batch->next->max;
> +	}
> +
> +	return batch->max - batch->nr + nr_next;
> +}
> +
>  #ifdef CONFIG_SMP
>  static void tlb_flush_rmap_batch(struct mmu_gather_batch *batch, struct vm_area_struct *vma)
>  {


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

WARNING: multiple messages have this Message-ID (diff)
From: Alistair Popple <apopple@nvidia.com>
To: Ryan Roberts <ryan.roberts@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>,
	Will Deacon <will@kernel.org>, Ard Biesheuvel <ardb@kernel.org>,
	Marc Zyngier <maz@kernel.org>,
	Oliver Upton <oliver.upton@linux.dev>,
	James Morse <james.morse@arm.com>,
	Suzuki K Poulose <suzuki.poulose@arm.com>,
	Zenghui Yu <yuzenghui@huawei.com>,
	Andrey Ryabinin <ryabinin.a.a@gmail.com>,
	Alexander Potapenko <glider@google.com>,
	Andrey Konovalov <andreyknvl@gmail.com>,
	Dmitry Vyukov <dvyukov@google.com>,
	Vincenzo Frascino <vincenzo.frascino@arm.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Anshuman Khandual <anshuman.khandual@arm.com>,
	Matthew Wilcox <willy@infradead.org>, Yu Zhao <yuzhao@google.com>,
	Mark Rutland <mark.rutland@arm.com>,
	David Hildenbrand <david@redhat.com>,
	Kefeng Wang <wangkefeng.wang@huawei.com>,
	John Hubbard <jhubbard@nvidia.com>, Zi Yan <ziy@nvidia.com>,
	Barry Song <21cnbao@gmail.com>, Yang Shi <shy828301@gmail.com>,
	linux-arm-kernel@lists.infradead.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH v3 02/15] mm: Batch-clear PTE ranges during zap_pte_range()
Date: Fri, 08 Dec 2023 12:30:40 +1100	[thread overview]
Message-ID: <87h6kta3ap.fsf@nvdebian.thelocal> (raw)
In-Reply-To: <20231204105440.61448-3-ryan.roberts@arm.com>


Ryan Roberts <ryan.roberts@arm.com> writes:

> Convert zap_pte_range() to clear a set of ptes in a batch. A given batch
> maps a physically contiguous block of memory, all belonging to the same
> folio. This will likely improve performance by a tiny amount due to
> removing duplicate calls to mark the folio dirty and accessed. And also
> provides us with a future opportunity to batch the rmap removal.
>
> However, the primary motivation for this change is to reduce the number
> of tlb maintenance operations that the arm64 backend has to perform
> during exit and other syscalls that cause zap_pte_range() (e.g. munmap,
> madvise(DONTNEED), etc.), as it is about to add transparent support for
> the "contiguous bit" in its ptes. By clearing ptes using the new
> clear_ptes() API, the backend doesn't have to perform an expensive
> unfold operation when a PTE being cleared is part of a contpte block.
> Instead it can just clear the whole block immediately.
>
> This change addresses the core-mm refactoring only, and introduces
> clear_ptes() with a default implementation that calls
> ptep_get_and_clear_full() for each pte in the range. Note that this API
> returns the pte at the beginning of the batch, but with the dirty and
> young bits set if ANY of the ptes in the cleared batch had those bits
> set; this information is applied to the folio by the core-mm. Given the
> batch is garranteed to cover only a single folio, collapsing this state

Nit: s/garranteed/guaranteed/

> does not lose any useful information.
>
> A separate change will implement clear_ptes() in the arm64 backend to
> realize the performance improvement as part of the work to enable
> contpte mappings.
>
> Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
> ---
>  include/asm-generic/tlb.h |  9 ++++++
>  include/linux/pgtable.h   | 26 ++++++++++++++++
>  mm/memory.c               | 63 ++++++++++++++++++++++++++-------------
>  mm/mmu_gather.c           | 14 +++++++++
>  4 files changed, 92 insertions(+), 20 deletions(-)

<snip>

> diff --git a/mm/mmu_gather.c b/mm/mmu_gather.c
> index 4f559f4ddd21..57b4d5f0dfa4 100644
> --- a/mm/mmu_gather.c
> +++ b/mm/mmu_gather.c
> @@ -47,6 +47,20 @@ static bool tlb_next_batch(struct mmu_gather *tlb)
>  	return true;
>  }
>  
> +unsigned int tlb_get_guaranteed_space(struct mmu_gather *tlb)
> +{
> +	struct mmu_gather_batch *batch = tlb->active;
> +	unsigned int nr_next = 0;
> +
> +	/* Allocate next batch so we can guarrantee at least one batch. */
> +	if (tlb_next_batch(tlb)) {
> +		tlb->active = batch;

Rather than calling tlb_next_batch(tlb) and then undoing some of what it
does I think it would be clearer to factor out the allocation part of
tlb_next_batch(tlb) into a separate function (eg. tlb_alloc_batch) that
you can call from both here and tlb_next_batch().

Otherwise I think this overall direction looks better than trying to
play funny games in the arch layer as it's much clearer what's going on
to core-mm code.

 - Alistair

> +		nr_next = batch->next->max;
> +	}
> +
> +	return batch->max - batch->nr + nr_next;
> +}
> +
>  #ifdef CONFIG_SMP
>  static void tlb_flush_rmap_batch(struct mmu_gather_batch *batch, struct vm_area_struct *vma)
>  {



  reply	other threads:[~2023-12-08  1:31 UTC|newest]

Thread overview: 82+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-12-04 10:54 [PATCH v3 00/15] Transparent Contiguous PTEs for User Mappings Ryan Roberts
2023-12-04 10:54 ` Ryan Roberts
2023-12-04 10:54 ` [PATCH v3 01/15] mm: Batch-copy PTE ranges during fork() Ryan Roberts
2023-12-04 10:54   ` Ryan Roberts
2023-12-04 15:47   ` David Hildenbrand
2023-12-04 15:47     ` David Hildenbrand
2023-12-04 16:00     ` David Hildenbrand
2023-12-04 16:00       ` David Hildenbrand
2023-12-04 17:27       ` David Hildenbrand
2023-12-04 17:27         ` David Hildenbrand
2023-12-05 11:30         ` Ryan Roberts
2023-12-05 11:30           ` Ryan Roberts
2023-12-05 12:04           ` David Hildenbrand
2023-12-05 12:04             ` David Hildenbrand
2023-12-05 14:16             ` Ryan Roberts
2023-12-05 14:16               ` Ryan Roberts
2023-12-08  0:32   ` Alistair Popple
2023-12-08  0:32     ` Alistair Popple
2023-12-12 11:51     ` Ryan Roberts
2023-12-12 11:51       ` Ryan Roberts
2023-12-04 10:54 ` [PATCH v3 02/15] mm: Batch-clear PTE ranges during zap_pte_range() Ryan Roberts
2023-12-04 10:54   ` Ryan Roberts
2023-12-08  1:30   ` Alistair Popple [this message]
2023-12-08  1:30     ` Alistair Popple
2023-12-12 11:57     ` Ryan Roberts
2023-12-12 11:57       ` Ryan Roberts
2023-12-04 10:54 ` [PATCH v3 03/15] arm64/mm: set_pte(): New layer to manage contig bit Ryan Roberts
2023-12-04 10:54   ` Ryan Roberts
2023-12-04 10:54 ` [PATCH v3 04/15] arm64/mm: set_ptes()/set_pte_at(): " Ryan Roberts
2023-12-04 10:54   ` Ryan Roberts
2023-12-04 10:54 ` [PATCH v3 05/15] arm64/mm: pte_clear(): " Ryan Roberts
2023-12-04 10:54   ` Ryan Roberts
2023-12-04 10:54 ` [PATCH v3 06/15] arm64/mm: ptep_get_and_clear(): " Ryan Roberts
2023-12-04 10:54   ` Ryan Roberts
2023-12-04 10:54 ` [PATCH v3 07/15] arm64/mm: ptep_test_and_clear_young(): " Ryan Roberts
2023-12-04 10:54   ` Ryan Roberts
2023-12-04 10:54 ` [PATCH v3 08/15] arm64/mm: ptep_clear_flush_young(): " Ryan Roberts
2023-12-04 10:54   ` Ryan Roberts
2023-12-04 10:54 ` [PATCH v3 09/15] arm64/mm: ptep_set_wrprotect(): " Ryan Roberts
2023-12-04 10:54   ` Ryan Roberts
2023-12-04 10:54 ` [PATCH v3 10/15] arm64/mm: ptep_set_access_flags(): " Ryan Roberts
2023-12-04 10:54   ` Ryan Roberts
2023-12-04 10:54 ` [PATCH v3 11/15] arm64/mm: ptep_get(): " Ryan Roberts
2023-12-04 10:54   ` Ryan Roberts
2023-12-04 10:54 ` [PATCH v3 12/15] arm64/mm: Split __flush_tlb_range() to elide trailing DSB Ryan Roberts
2023-12-04 10:54   ` Ryan Roberts
2023-12-12 11:35   ` Will Deacon
2023-12-12 11:35     ` Will Deacon
2023-12-12 11:47     ` Ryan Roberts
2023-12-12 11:47       ` Ryan Roberts
2023-12-14 11:53       ` Ryan Roberts
2023-12-14 11:53         ` Ryan Roberts
2023-12-14 12:13         ` Will Deacon
2023-12-14 12:13           ` Will Deacon
2023-12-14 12:30           ` Robin Murphy
2023-12-14 12:30             ` Robin Murphy
2023-12-14 14:28             ` Ryan Roberts
2023-12-14 14:28               ` Ryan Roberts
2023-12-14 15:22             ` Jean-Philippe Brucker
2023-12-14 15:22               ` Jean-Philippe Brucker
2023-12-14 16:45               ` Jonathan Cameron
2023-12-14 16:45                 ` Jonathan Cameron
2023-12-04 10:54 ` [PATCH v3 13/15] arm64/mm: Wire up PTE_CONT for user mappings Ryan Roberts
2023-12-04 10:54   ` Ryan Roberts
2023-12-04 10:54 ` [PATCH v3 14/15] arm64/mm: Implement ptep_set_wrprotects() to optimize fork() Ryan Roberts
2023-12-04 10:54   ` Ryan Roberts
2023-12-08  1:37   ` Alistair Popple
2023-12-08  1:37     ` Alistair Popple
2023-12-12 11:59     ` Ryan Roberts
2023-12-12 11:59       ` Ryan Roberts
2023-12-15  4:32       ` Alistair Popple
2023-12-15  4:32         ` Alistair Popple
2023-12-15 14:05         ` Ryan Roberts
2023-12-15 14:05           ` Ryan Roberts
2023-12-04 10:54 ` [PATCH v3 15/15] arm64/mm: Implement clear_ptes() to optimize exit() Ryan Roberts
2023-12-04 10:54   ` Ryan Roberts
2023-12-08  1:45   ` Alistair Popple
2023-12-08  1:45     ` Alistair Popple
2023-12-12 12:02     ` Ryan Roberts
2023-12-12 12:02       ` Ryan Roberts
2023-12-05  3:41 ` [PATCH v3 00/15] Transparent Contiguous PTEs for User Mappings John Hubbard
2023-12-05  3:41   ` John Hubbard

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87h6kta3ap.fsf@nvdebian.thelocal \
    --to=apopple@nvidia.com \
    --cc=21cnbao@gmail.com \
    --cc=akpm@linux-foundation.org \
    --cc=andreyknvl@gmail.com \
    --cc=anshuman.khandual@arm.com \
    --cc=ardb@kernel.org \
    --cc=catalin.marinas@arm.com \
    --cc=david@redhat.com \
    --cc=dvyukov@google.com \
    --cc=glider@google.com \
    --cc=james.morse@arm.com \
    --cc=jhubbard@nvidia.com \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mark.rutland@arm.com \
    --cc=maz@kernel.org \
    --cc=oliver.upton@linux.dev \
    --cc=ryabinin.a.a@gmail.com \
    --cc=ryan.roberts@arm.com \
    --cc=shy828301@gmail.com \
    --cc=suzuki.poulose@arm.com \
    --cc=vincenzo.frascino@arm.com \
    --cc=wangkefeng.wang@huawei.com \
    --cc=will@kernel.org \
    --cc=willy@infradead.org \
    --cc=yuzenghui@huawei.com \
    --cc=yuzhao@google.com \
    --cc=ziy@nvidia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.