linux-mm.kvack.org archive mirror
From: Ingo Molnar <mingo@kernel.org>
To: Ankur Arora <ankur.a.arora@oracle.com>
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, x86@kernel.org,
	torvalds@linux-foundation.org, akpm@linux-foundation.org,
	bp@alien8.de, dave.hansen@linux.intel.com, hpa@zytor.com,
	mingo@redhat.com, luto@kernel.org, peterz@infradead.org,
	paulmck@kernel.org, rostedt@goodmis.org, tglx@linutronix.de,
	willy@infradead.org, jon.grimm@amd.com, bharata@amd.com,
	raghavendra.kt@amd.com, boris.ostrovsky@oracle.com,
	konrad.wilk@oracle.com
Subject: Re: [PATCH v3 1/4] x86/clear_page: extend clear_page*() for multi-page clearing
Date: Mon, 14 Apr 2025 08:32:29 +0200
Message-ID: <Z_yr_cmXti4kXHaX@gmail.com>
In-Reply-To: <20250414034607.762653-2-ankur.a.arora@oracle.com>


* Ankur Arora <ankur.a.arora@oracle.com> wrote:

> clear_page*() variants now take a page-aligned length parameter and
> clears the whole region.

Please read your changelogs and fix typos. ;-)

> +void clear_pages_orig(void *page, unsigned int length);
> +void clear_pages_rep(void *page, unsigned int length);
> +void clear_pages_erms(void *page, unsigned int length);

What unit is 'length' in? If it's bytes, why is this interface
artificially limiting itself to ~4GB? On x86-64 there's very little (if
any) performance difference between iterating with a 32-bit and a
64-bit length.

Even if we end up only exposing a 32-bit length API to the generic MM 
layer, there's no reason to limit the x86-64 assembly code in such a 
fashion.
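
A minimal userspace sketch of the ~4GB concern, assuming 'length' is a
byte count and 4 KiB pages. The helper names (as_u32_len(),
max_len_u32(), SKETCH_PAGE_SIZE) are hypothetical and exist only for
this illustration; they are not part of the patch:

```c
#include <stdint.h>

/* Assumption: 4 KiB pages, as on a default x86-64 configuration. */
#define SKETCH_PAGE_SIZE 4096ULL

/* 'unsigned int' is 32 bits wide on x86-64 Linux. */
_Static_assert(sizeof(unsigned int) == 4, "sketch assumes 32-bit unsigned int");

/* What a 32-bit 'length' parameter actually receives for a byte count. */
static unsigned int as_u32_len(uint64_t bytes)
{
	return (unsigned int)bytes;	/* high 32 bits are silently dropped */
}

/* Largest page-aligned byte count that still fits in 'unsigned int'. */
static uint64_t max_len_u32(void)
{
	return (uint64_t)UINT32_MAX & ~(SKETCH_PAGE_SIZE - 1);
}
```

With these definitions a request for exactly 4 GiB truncates to a
length of zero, and the largest region that can be expressed is 4 GiB
minus one page. An 'unsigned long' length has no such limit on x86-64.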

>  static inline void clear_page(void *page)
>  {
> +	unsigned int length = PAGE_SIZE;
>  	/*
> -	 * Clean up KMSAN metadata for the page being cleared. The assembly call
> +	 * Clean up KMSAN metadata for the pages being cleared. The assembly call
>  	 * below clobbers @page, so we perform unpoisoning before it.

>  	 */
> -	kmsan_unpoison_memory(page, PAGE_SIZE);
> -	alternative_call_2(clear_page_orig,
> -			   clear_page_rep, X86_FEATURE_REP_GOOD,
> -			   clear_page_erms, X86_FEATURE_ERMS,
> +	kmsan_unpoison_memory(page, length);
> +
> +	alternative_call_2(clear_pages_orig,
> +			   clear_pages_rep, X86_FEATURE_REP_GOOD,
> +			   clear_pages_erms, X86_FEATURE_ERMS,
>  			   "=D" (page),
> -			   "D" (page),
> +			   ASM_INPUT("D" (page), "S" (length)),
>  			   "cc", "memory", "rax", "rcx");
>  }
>  
> diff --git a/arch/x86/lib/clear_page_64.S b/arch/x86/lib/clear_page_64.S
> index a508e4a8c66a..bce516263b69 100644
> --- a/arch/x86/lib/clear_page_64.S
> +++ b/arch/x86/lib/clear_page_64.S
> @@ -13,20 +13,35 @@
>   */
>  
>  /*
> - * Zero a page.
> - * %rdi	- page
> + * Zero kernel page aligned region.
> + *
> + * Input:
> + * %rdi	- destination
> + * %esi	- length
> + *
> + * Clobbers: %rax, %rcx
>   */
> -SYM_TYPED_FUNC_START(clear_page_rep)
> -	movl $4096/8,%ecx
> +SYM_TYPED_FUNC_START(clear_pages_rep)
> +	movl %esi, %ecx
>  	xorl %eax,%eax
> +	shrl $3,%ecx
>  	rep stosq
>  	RET
> -SYM_FUNC_END(clear_page_rep)
> -EXPORT_SYMBOL_GPL(clear_page_rep)
> +SYM_FUNC_END(clear_pages_rep)
> +EXPORT_SYMBOL_GPL(clear_pages_rep)
>  
> -SYM_TYPED_FUNC_START(clear_page_orig)
> +/*
> + * Original page zeroing loop.
> + * Input:
> + * %rdi	- destination
> + * %esi	- length
> + *
> + * Clobbers: %rax, %rcx, %rflags
> + */
> +SYM_TYPED_FUNC_START(clear_pages_orig)
> +	movl   %esi, %ecx
>  	xorl   %eax,%eax
> -	movl   $4096/64,%ecx
> +	shrl   $6,%ecx

So if the natural input parameter is RCX, why is this function using
RSI as the input 'length' parameter? This causes unnecessary register
shuffling.
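
A rough userspace sketch of the idea, not the kernel's
alternative_call_2() plumbing: with GCC-style inline asm the length can
be bound to RCX directly via the "+c" constraint, so the string
instruction finds its count where it expects it. The function name
zero_rep_stosb() is hypothetical, and the memset() fallback is only
there so the sketch builds on other architectures:

```c
#include <string.h>

/*
 * Hypothetical stand-in for a 'rep stosb' based clearing routine.
 * RDI = destination, RCX = byte count, AL = fill value (0).
 * Assumes the direction flag is clear, as the x86-64 ABI guarantees
 * at function entry.
 */
static inline void zero_rep_stosb(void *addr, unsigned long len)
{
#if defined(__x86_64__)
	__asm__ __volatile__("rep stosb"
			     : "+D" (addr), "+c" (len)	/* both are advanced/consumed */
			     : "a" (0)
			     : "memory");
#else
	memset(addr, 0, len);	/* fallback for non-x86-64 builds */
#endif
}
```

For a 'rep stosq' variant the count in RCX is in quadwords rather than
bytes, so a shift is still needed somewhere, but it can be applied to
RCX in place instead of copying from RSI first.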

> +/*
> + * Zero kernel page aligned region.
> + *
> + * Input:
> + * %rdi	- destination
> + * %esi	- length
> + *
> + * Clobbers: %rax, %rcx
> + */
> +SYM_TYPED_FUNC_START(clear_pages_erms)
> +	movl %esi, %ecx
>  	xorl %eax,%eax
>  	rep stosb
>  	RET

Same observation: unnecessary register shuffling.

Also, please rename this (now-) terribly named interface:

> +void clear_pages_orig(void *page, unsigned int length);
> +void clear_pages_rep(void *page, unsigned int length);
> +void clear_pages_erms(void *page, unsigned int length);

Because the 'pages' is now a bit misleading, and why is the starting 
address called a 'page'?

So a more sensible namespace would be to follow memset nomenclature:

	void memzero_page_aligned_*(void *addr, unsigned long len);

... and note the intentional abbreviation to 'len'.
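
To illustrate the suggested calling convention (an address plus an
'unsigned long' byte count), here is a portable C model of a 64-bytes-
per-iteration clearing loop. This is only a sketch: the name
memzero_page_aligned_sketch() and the SKETCH_PAGE_SIZE constant are
made up for the example, and the real routines would remain x86-64
assembly:

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 4 KiB pages. */
#define SKETCH_PAGE_SIZE 4096UL

/* Hypothetical C model: zero 'len' bytes starting at page-aligned 'addr'. */
static void memzero_page_aligned_sketch(void *addr, unsigned long len)
{
	uint64_t *p = addr;
	unsigned long chunks = len >> 6;	/* 64 bytes per iteration */
	unsigned long i;

	/* Both the start address and the length must be page aligned. */
	assert(((uintptr_t)addr % SKETCH_PAGE_SIZE) == 0);
	assert((len % SKETCH_PAGE_SIZE) == 0);

	while (chunks--) {
		for (i = 0; i < 8; i++)	/* eight 8-byte stores */
			p[i] = 0;
		p += 8;
	}
}
```

Because 'len' is an 'unsigned long', the same signature covers a single
page and arbitrarily large page-aligned regions.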

Also, since most of these changes are to x86 architecture code, this is 
a new interface only used by x86, and the MM glue is minimal, I'd like 
to merge this series via the x86 tree, if the glue gets acks from MM 
folks.

Thanks,

	Ingo


