From: Catalin Marinas <catalin.marinas@arm.com>
To: Kristina Martsenko <kristina.martsenko@arm.com>
Cc: linux-arm-kernel@lists.infradead.org,
Will Deacon <will@kernel.org>,
Mark Rutland <mark.rutland@arm.com>,
Robin Murphy <robin.murphy@arm.com>,
Marc Zyngier <maz@kernel.org>
Subject: Re: [PATCH 4/5] arm64: lib: Use MOPS for memcpy() routines
Date: Thu, 17 Oct 2024 12:57:22 +0100 [thread overview]
Message-ID: <ZxD7oiW1j0EI2DUw@arm.com> (raw)
In-Reply-To: <bf06d3e8-736e-4bcf-9aa8-2d1112fcfda6@arm.com>
On Wed, Oct 16, 2024 at 02:08:27PM +0100, Kristina Martsenko wrote:
> On 04/10/2024 11:07, Catalin Marinas wrote:
> > On Thu, Oct 03, 2024 at 05:46:08PM +0100, Kristina Martsenko wrote:
> >> On 02/10/2024 16:29, Catalin Marinas wrote:
> >>> On Mon, Sep 30, 2024 at 05:10:50PM +0100, Kristina Martsenko wrote:
> >>>> diff --git a/arch/arm64/lib/memcpy.S b/arch/arm64/lib/memcpy.S
> >>>> index 4ab48d49c451..9b99106fb95f 100644
> >>>> --- a/arch/arm64/lib/memcpy.S
> >>>> +++ b/arch/arm64/lib/memcpy.S
> >>>> @@ -57,7 +57,7 @@
> >>>> The loop tail is handled by always copying 64 bytes from the end.
> >>>> */
> >>>>
> >>>> -SYM_FUNC_START(__pi_memcpy)
> >>>> +SYM_FUNC_START_LOCAL(__pi_memcpy_generic)
> >>>> add srcend, src, count
> >>>> add dstend, dstin, count
> >>>> cmp count, 128
> >>>> @@ -238,7 +238,24 @@ L(copy64_from_start):
> >>>> stp B_l, B_h, [dstin, 16]
> >>>> stp C_l, C_h, [dstin]
> >>>> ret
> >>>> +SYM_FUNC_END(__pi_memcpy_generic)
> >>>> +
> >>>> +#ifdef CONFIG_AS_HAS_MOPS
> >>>> + .arch_extension mops
> >>>> +SYM_FUNC_START(__pi_memcpy)
> >>>> +alternative_if_not ARM64_HAS_MOPS
> >>>> + b __pi_memcpy_generic
> >>>> +alternative_else_nop_endif
> >>>
> >>> I'm fine with patching the branch but I wonder whether, for the time
> >>> being, we should use alternative_if instead and the NOP to fall through
> >>> the default implementation. The hardware in the field doesn't have
> >>> FEAT_MOPS yet and they may see a slight penalty introduced by the
> >>> branch, especially for small memcpys. Just guessing, I haven't done any
> >>> benchmarks.
> >>
> >> My thinking was that this way it doesn't have to be changed again in the
> >> future. But I'm fine with switching to alternative_if for v2.
> >
> > The other option is to benchmark the proposed patches a bit and see if
> > we notice any difference on current hardware. Not sure exactly what
> > benchmarks would exercise these paths. For copy_page(), I suspect the
> > branch is probably lost in the noise. It's more like small copies that
> > might notice.
> >
> > Yet another option is to leave the patches as they are and see if anyone
> > complains, we swap them over then ;).
>
> I tried benchmarking a kernel build and hackbench on a Morello board (with
> usercopy patches applied as well) but didn't see any significant performance
> difference between the branch and NOP so I would leave the patches as they are.
That's great. Thanks for checking.
--
Catalin
next prev parent reply other threads:[~2024-10-17 11:59 UTC|newest]
Thread overview: 20+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-09-30 16:10 [PATCH 0/5] arm64: Use memory copy instructions in kernel routines Kristina Martsenko
2024-09-30 16:10 ` [PATCH 1/5] arm64: probes: Disable kprobes/uprobes on MOPS instructions Kristina Martsenko
2024-10-02 10:28 ` Catalin Marinas
2024-09-30 16:10 ` [PATCH 2/5] arm64: mops: Handle MOPS exceptions from EL1 Kristina Martsenko
2024-09-30 16:10 ` [PATCH 3/5] arm64: mops: Document booting requirement for HCR_EL2.MCE2 Kristina Martsenko
2024-10-02 10:38 ` Catalin Marinas
2024-10-02 13:31 ` Kristina Martsenko
2024-10-02 17:09 ` Catalin Marinas
2024-09-30 16:10 ` [PATCH 4/5] arm64: lib: Use MOPS for memcpy() routines Kristina Martsenko
2024-10-02 15:29 ` Catalin Marinas
2024-10-03 16:46 ` Kristina Martsenko
2024-10-04 10:07 ` Catalin Marinas
2024-10-16 13:08 ` Kristina Martsenko
2024-10-17 11:57 ` Catalin Marinas [this message]
2024-09-30 16:10 ` [PATCH 5/5] arm64: lib: Use MOPS for copy_page() and clear_page() Kristina Martsenko
2024-10-02 15:37 ` Catalin Marinas
2024-10-02 16:20 ` [PATCH 0/5] arm64: Use memory copy instructions in kernel routines Catalin Marinas
2024-10-03 16:49 ` Kristina Martsenko
2024-10-04 10:10 ` Catalin Marinas
2024-10-17 18:00 ` Catalin Marinas
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ZxD7oiW1j0EI2DUw@arm.com \
--to=catalin.marinas@arm.com \
--cc=kristina.martsenko@arm.com \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=mark.rutland@arm.com \
--cc=maz@kernel.org \
--cc=robin.murphy@arm.com \
--cc=will@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.