From: Catalin Marinas <catalin.marinas@arm.com>
To: Will Deacon <will@kernel.org>
Cc: Linus Walleij <linusw@kernel.org>, Marc Zyngier <maz@kernel.org>,
Oliver Upton <oupton@kernel.org>, Joey Gouly <joey.gouly@arm.com>,
Suzuki K Poulose <suzuki.poulose@arm.com>,
Zenghui Yu <yuzenghui@huawei.com>,
Ryan Roberts <ryan.roberts@arm.com>,
Ankur Arora <ankur.a.arora@oracle.com>,
David Hildenbrand <david@kernel.org>,
linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev,
James Clark <james.clark2@arm.com>
Subject: Re: [PATCH] arm64: Implement clear_pages()
Date: Tue, 3 Mar 2026 15:45:15 +0000
Message-ID: <aacCCwgvo2J8CCEI@arm.com>
In-Reply-To: <aab0SvA0dU0R0pnl@willie-the-truck>
On Tue, Mar 03, 2026 at 02:46:34PM +0000, Will Deacon wrote:
> On Tue, Mar 03, 2026 at 11:06:13AM +0100, Linus Walleij wrote:
> > On QEMU:
> >
> > Before this patch:    After this patch:
> > 2.38 GB/s             2.41 GB/s
>
> I really don't think we should pay attention to performance under QEMU
> as it doesn't necessarily have any correlation with real hardware.
I agree.
> > diff --git a/arch/arm64/include/asm/page.h b/arch/arm64/include/asm/page.h
> > index b39cc1127e1f..916a3e7c9a19 100644
> > --- a/arch/arm64/include/asm/page.h
> > +++ b/arch/arm64/include/asm/page.h
> > @@ -20,7 +20,18 @@ struct page;
> > struct vm_area_struct;
> >
> > extern void copy_page(void *to, const void *from);
> > -extern void clear_page(void *to);
> > +extern void clear_pages_asm(void *addr, unsigned int nbytes);
> > +
> > +static inline void clear_pages(void *addr, unsigned int npages)
> > +{
> > +	clear_pages_asm(addr, npages * PAGE_SIZE);
> > +}
> > +#define clear_pages clear_pages
>
> Hmm. From what I can tell, this just turns a branch in C code into a
> branch in assembly, so it's hard to correlate that meaningfully with
> the performance improvement you see.
>
> If we have CPUs that are this sensitive to branches, perhaps we'd be
> better off taking the opposite approach and moving more code into C
> so that the compiler can optimise the control flow for us?
I think it's more than the loop branch: there is also the whole DCZID_EL0
read to decide whether to use DC ZVA or STNP. I wonder why we didn't do
that with an alternative rather than always reading the sysreg.

That said, I wouldn't mind rewriting this in C if the numbers don't get
worse. It is a bit more involved if we keep the DC ZVA use, though with
alternatives it's maybe not that bad (mte_set_mem_tag_range() is an
example of doing something similar in C, but for clear_page() we don't
need to deal with unaligned boundaries).
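
To make the shape of such a C version concrete, here is a rough
user-space sketch, not kernel code. It assumes 4K pages, takes the
DCZID_EL0 value as a plain argument instead of reading the sysreg, and
uses memset() as a stand-in for both DC ZVA and the STNP stores. All
names (struct zva_info, decode_dczid(), zero_block(),
clear_pages_sketch(), SKETCH_PAGE_SIZE) are hypothetical. The decode is
done once and cached, which is roughly the effect an alternative would
have.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096u      /* assumption: 4K pages */
#define DCZID_BS_MASK    0xfu       /* DCZID_EL0.BS, bits [3:0] */
#define DCZID_DZP        (1u << 4)  /* DCZID_EL0.DZP, bit 4 */

/* Hypothetical cached decode, so the hot path never reads the sysreg. */
struct zva_info {
	int    usable;  /* 0 when DZP prohibits DC ZVA */
	size_t block;   /* bytes zeroed by one DC ZVA */
};

static struct zva_info decode_dczid(uint64_t dczid)
{
	struct zva_info info;

	info.usable = !(dczid & DCZID_DZP);
	/* BS is log2 of the block size in 4-byte words. */
	info.block = (size_t)4 << (dczid & DCZID_BS_MASK);
	return info;
}

/* Stand-in for "dc zva" (or a plain store): zero one aligned block. */
static void zero_block(unsigned char *p, size_t block)
{
	memset(p, 0, block);
}

static void clear_pages_sketch(void *addr, unsigned int npages,
			       const struct zva_info *info)
{
	unsigned char *p = addr;
	unsigned char *end = p + (size_t)npages * SKETCH_PAGE_SIZE;
	/* Fallback: 16 bytes, what one STNP of two 64-bit registers covers. */
	size_t step = info->usable ? info->block : 16;

	/* Page-aligned input means no unaligned head or tail to handle. */
	assert((uintptr_t)addr % SKETCH_PAGE_SIZE == 0);
	/* Both are powers of two, so the step divides the page size. */
	assert(step <= SKETCH_PAGE_SIZE);

	for (; p < end; p += step)
		zero_block(p, step);
}
```

The only decision left in the loop is the step size, which is fixed
before the first iteration. Whether the compiler turns that into better
code than the current assembly is exactly what would need measuring on
real hardware.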
--
Catalin