From: david laight <david.laight@runbox.com>
To: Mark Rutland <mark.rutland@arm.com>
Cc: Jinjie Ruan <ruanjinjie@huawei.com>,
linux@armlinux.org.uk, catalin.marinas@arm.com, will@kernel.org,
chris@zankel.net, jcmvbkbc@gmail.com, akpm@linux-foundation.org,
macro@orcam.me.uk, charlie@rivosinc.com, deller@gmx.de,
ldv@strace.io, rostedt@goodmis.org, tglx@linutronix.de,
linux-arm-kernel@lists.infradead.org,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH 2/2] arm64: Avoid memcpy() for syscall_get_arguments()
Date: Mon, 1 Dec 2025 10:26:33 +0000
Message-ID: <20251201102633.17a99afc@pumpkin>
In-Reply-To: <aS1qYhHhZK3CD_bU@J2N7QTR9R3>
On Mon, 1 Dec 2025 10:13:54 +0000
Mark Rutland <mark.rutland@arm.com> wrote:
> On Thu, Nov 27, 2025 at 08:36:30PM +0800, Jinjie Ruan wrote:
> > Do not use memcpy() to extract syscall arguments from struct pt_regs
> > but rather just perform direct assignments.
> >
> > The performance benchmarks with Generic Entry patch[1] with audit on
> > from perf bench basic syscall on kunpeng920 gives roughly a 1%
> > performance uplift and also aligns the implementation with
> > x86 and RISC-V.
> >
> > | Metric | W/O this patch | With this patch | Change |
> > | ---------- | -------------- | --------------- | --------- |
> > | Total time | 2.241 [sec] | 2.211 [sec] | ↓1.36% |
> > | usecs/op | 0.224157 | 0.221146 | ↓1.36% |
> > | ops/sec | 4,461,157 | 4,501,409 | ↑0.9% |
> >
> > Before:
> > <syscall_get_arguments.constprop.0>:
> > aa0103e2 mov x2, x1
> > 91002003 add x3, x0, #0x8
> > f9408804 ldr x4, [x0, #272]
> > f8008444 str x4, [x2], #8
> > a9409404 ldp x4, x5, [x0, #8]
> > a9009424 stp x4, x5, [x1, #8]
> > a9418400 ldp x0, x1, [x0, #24]
> > a9010440 stp x0, x1, [x2, #16]
> > f9401060 ldr x0, [x3, #32]
> > f9001040 str x0, [x2, #32]
> > d65f03c0 ret
> > d503201f nop
> >
> > After:
> > a9408e82 ldp x2, x3, [x20, #8]
> > 2a1603e0 mov w0, w22
> > f9400e84 ldr x4, [x20, #24]
> > f9408a81 ldr x1, [x20, #272]
> > 9401c4ba bl ffff800080215ca8 <__audit_syscall_entry>
>
> It's probably worth noting that __audit_syscall_entry() only takes 4
> syscall arguments, and hence the compiler has elided the copy of
> regs->regs[4] and regs->regs[5], which it apparently couldn't manage
> before.
Hasn't it actually inlined the call and completely optimised away the args[] array?
It looks (from the asm) as though syscall_get_arguments() is followed by:
	fn(args[0], args[1], args[2], args[3])
so the values are loaded from pt_regs straight into the argument registers.
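To make that concrete, here is an untested user-space sketch of the pattern.
The names mock_regs, get_args_memcpy(), get_args_direct(), consume4() and the
two caller functions are made up for illustration; only the bodies of the two
get_args helpers follow the hunk quoted below. With optimisation enabled the
compiler is free to inline either helper and never materialise the caller's
temporary array, and the two unused stores can then be dropped as well.
Whether a given compiler manages that for the memcpy() form is exactly what
the disassembly has to show.

```c
#include <string.h>

/* Hypothetical stand-in for struct pt_regs; illustration only. */
struct mock_regs {
	unsigned long regs[31];
	unsigned long orig_x0;
};

/* Old shape: first argument by hand, the remaining five via memcpy(). */
static inline void get_args_memcpy(const struct mock_regs *regs,
				   unsigned long *args)
{
	args[0] = regs->orig_x0;
	args++;
	memcpy(args, &regs->regs[1], 5 * sizeof(args[0]));
}

/* New shape: six plain assignments. */
static inline void get_args_direct(const struct mock_regs *regs,
				   unsigned long *args)
{
	args[0] = regs->orig_x0;
	args[1] = regs->regs[1];
	args[2] = regs->regs[2];
	args[3] = regs->regs[3];
	args[4] = regs->regs[4];
	args[5] = regs->regs[5];
}

/* Hypothetical consumer that, like the callee in the asm, takes only four. */
unsigned long consume4(unsigned long a0, unsigned long a1,
		       unsigned long a2, unsigned long a3)
{
	return a0 + 10 * a1 + 100 * a2 + 1000 * a3;
}

/* Caller using the memcpy() helper; args[4] and args[5] are never read. */
unsigned long caller_memcpy(const struct mock_regs *regs)
{
	unsigned long args[6];

	get_args_memcpy(regs, args);
	return consume4(args[0], args[1], args[2], args[3]);
}

/* Caller using the direct-assignment helper; same observable result. */
unsigned long caller_direct(const struct mock_regs *regs)
{
	unsigned long args[6];

	get_args_direct(regs, args);
	return consume4(args[0], args[1], args[2], args[3]);
}
```

Compiling this at -O2 and comparing the code generated for caller_memcpy()
and caller_direct() is a quick way to check what a particular toolchain does.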
David
>
> > [1]: https://lore.kernel.org/all/20251126071446.3234218-1-ruanjinjie@huawei.com/
> > Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
> > ---
> > arch/arm64/include/asm/syscall.h | 8 +++++---
> > 1 file changed, 5 insertions(+), 3 deletions(-)
> >
> > diff --git a/arch/arm64/include/asm/syscall.h b/arch/arm64/include/asm/syscall.h
> > index f3853047c28e..f3564ba97f7e 100644
> > --- a/arch/arm64/include/asm/syscall.h
> > +++ b/arch/arm64/include/asm/syscall.h
> > @@ -82,9 +82,11 @@ static inline void syscall_get_arguments(struct task_struct *task,
> > unsigned long *args)
> > {
> > args[0] = regs->orig_x0;
> > - args++;
> > -
> > - memcpy(args, &regs->regs[1], 5 * sizeof(args[0]));
> > + args[1] = regs->regs[1];
> > + args[2] = regs->regs[2];
> > + args[3] = regs->regs[3];
> > + args[4] = regs->regs[4];
> > + args[5] = regs->regs[5];
> > }
>
> FWIW, I think this is clearer than the 'args++' and the memcpy(), so I'm
> happy with this regardless of the performance concern.
>
> However, as Dmitry says, we should keep this structurally the same as
> syscall_set_arguments(), and so we should update that in the same way.
>
> Mark.
>
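On the syscall_set_arguments() point quoted above, a matching change might
look something like the sketch below. This is an assumption on my part, not
the in-tree code: it supposes the set side writes six values into
regs->regs[0..5] and also mirrors args[0] into orig_x0 so that a later
syscall_get_arguments() returns the new value. mock_regs and both function
names are hypothetical user-space stand-ins.

```c
/* Hypothetical stand-in for struct pt_regs; illustration only. */
struct mock_regs {
	unsigned long regs[31];
	unsigned long orig_x0;
};

/* Get side, written as direct assignments (as in the quoted hunk). */
static inline void get_args_direct(const struct mock_regs *regs,
				   unsigned long *args)
{
	args[0] = regs->orig_x0;
	args[1] = regs->regs[1];
	args[2] = regs->regs[2];
	args[3] = regs->regs[3];
	args[4] = regs->regs[4];
	args[5] = regs->regs[5];
}

/* Set side in the same style (assumed semantics, see the note above). */
static inline void set_args_direct(struct mock_regs *regs,
				   const unsigned long *args)
{
	regs->regs[0] = args[0];
	regs->regs[1] = args[1];
	regs->regs[2] = args[2];
	regs->regs[3] = args[3];
	regs->regs[4] = args[4];
	regs->regs[5] = args[5];
	/* Keep orig_x0 in step so a later get returns the new first argument. */
	regs->orig_x0 = args[0];
}
```

Keeping the two helpers line-for-line symmetric makes it easy to see that a
set followed by a get round-trips all six values.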