Linux-ARM-Kernel Archive on lore.kernel.org
 help / color / mirror / Atom feed
From: Ada Couprie Diaz <ada.coupriediaz@arm.com>
To: Jinjie Ruan <ruanjinjie@huawei.com>, mark.rutland@arm.com
Cc: peterz@infradead.org, catalin.marinas@arm.com, oleg@redhat.com,
	song@kernel.org, will@kernel.org, kees@kernel.org,
	thuth@redhat.com, ryan.roberts@arm.com,
	anshuman.khandual@arm.com, kevin.brodsky@arm.com,
	pengcan@kylinos.cn, broonie@kernel.org, luto@kernel.org,
	ldv@strace.io, linux-arm-kernel@lists.infradead.org,
	wad@chromium.org, yeoreum.yun@arm.com,
	linux-kernel@vger.kernel.org, james.morse@arm.com,
	tglx@kernel.org, liqiang01@kylinos.cn, linusw@kernel.org
Subject: Re: [PATCH v15 07/11] arm64: syscall: Introduce syscall_exit_to_user_mode_work()
Date: Mon, 29 Jun 2026 15:48:33 +0100	[thread overview]
Message-ID: <feb4296b-66ad-4119-b53d-015e0957efca@arm.com> (raw)
In-Reply-To: <83aed271-5f28-4f30-a800-b94192cb06a6@huawei.com>

On 25/06/2026 10:18, Jinjie Ruan wrote:

> On 6/24/2026 10:37 PM, Ada Couprie Diaz wrote:
>> On 11/05/2026 10:20, Jinjie Ruan wrote:
>>> Refactor the system call exit path to align with the generic entry
>>> framework. This consolidates thread flag checking, rseq handling, and
>>> syscall tracing into a structure that mirrors the generic
>>> syscall_exit_to_user_mode_work() implementation.
>>>
>>> [Rationale]
>>> The generic entry code employs a hierarchical approach for
>>> syscall exit work:
>>>
>>> 1. syscall_exit_to_user_mode_work(): The entry point that handles
>>>      rseq and checks if further exit work (tracing/audit) is required.
>>>
>>> 2. syscall_exit_work(): Performs the actual tracing, auditing, and
>>>      ptrace reporting.
>>>
>>> [Changes]
>>> - Rename and Encapsulate: Rename syscall_trace_exit() to
>>>     syscall_exit_work() and make it static, as it is now an internal
>>>     helper for the exit path.
>>>
>>> - New Entry Point: Implement syscall_exit_to_user_mode_work() to
>>>     replace the manual flag-reading logic in el0_svc_common(). This
>>>     function now encapsulates the rseq_syscall() call and the
>>>     conditional execution of syscall_exit_work().
>>>
>>> - Simplify el0_svc_common(): Remove the complex conditional checks
>>>     for tracing and CONFIG_DEBUG_RSEQ at the end of the syscall path,
>>>     delegating this responsibility to the new helper.
>> It is indeed simpler, however to me there are two changes to the behaviour,
>> which are not called out (apologies if I missed some prior discussion
>> when I looked for some) :
>> 1. As pointed by the removed comment, in mainline we *always* trace on exit
>>     if we traced on entry. This is why there are two `has_syscall_work()`
>> checks
>>     on exit, with a re-read of the flags after syscall execution in between.
>>     This change only checks once on exit after updating the flags, so if
>>     there was work on entry but the flags got cleared, it *won't* trace
>> on exit.
>>     Is this desired ? Can this change of behaviour have an impact ?
> Hi, Ada,
>
> After rework, `syscall_exit_to_user_mode_work()` will be executed
> unconditionally, regardless of whether the conditions below evaluate to
> true or false. [...]

Hi Jinjie,

Indeed, my worry was about the actual work in 
`syscall_exit_to_user_mode_work()`
being gated by a different truth table, but it makes sense now, I will 
detail below.
> [...] You can see how this is handled in the finer-grained
> refactoring split which will be shown in v16.
I have seen that you just posted v16, thanks for splitting things up a 
bit more !
>
> 	if (!has_syscall_work(flags) && !IS_ENABLED(CONFIG_DEBUG_RSEQ))
>
>>> - Helper Migration: Move has_syscall_work() to asm/syscall.h
>>>     to allow its reuse across ptrace.c and syscall.c.
>>>
>>> - Clean up RSEQ: Remove the explicit IS_ENABLED(CONFIG_DEBUG_RSEQ)
>>>     check in the caller, as rseq_syscall() is already a no-op when the
>>>     config is disabled.
>> 2. `rseq_syscall()` is indeed a no-op, but removing the explicit check here
>>     does change the behaviour : in mainline we *always* trace on exit if
>>     `CONFIG_DEBUG_RSEQ` is enabled, bypassing the `has_syscall_work()`
>> check.
>>     This change does not bypass the `has_syscall_work()` check if
>>     `CONFIG_DEBUG_RSEQ` is enabled, so there might be a change of behaviour.
>>     Same questions as above : is this change desired ? Can it have an
>> impact ?
> This should not introduce any functional changes.
>
> Except for "audit", the internal code execution of
> `syscall_trace_exit()` is gated by the "_TIF_SYSCALL_TRACEPOINT,
> _TIF_SYSCALL_TRACE, or _TIF_SINGLESTEP" TIF flags.
>
> And gating audit_syscall_exit() behind `_TIF_SYSCALL_AUDIT` introduces
> no functional changes.
>
> The `SYSCALL_AUDIT` flag and its context are
> statically allocated via audit_alloc() at fork and only freed via
> audit_free() at do_exit(). Since the flag remains persistent and static
> throughout syscall execution, checking the `_TIF_SYSCALL_AUDIT` flag is
> completely equivalent to evaluating audit_context() in
> audit_syscall_exit().
Thank you for the additional details : I did miss that 
`audit_syscall_exit()`
did its own check internally for `SYSCALL_WORK_SYSCALL_AUDIT`, which
is part of `SYSCALL_WORK_{ENTRY,EXIT}`, so this indeed does not change
the behaviour.

As you moved `rseq_debug_syscall_return()` outside of `syscall_exit_work()`,
it will indeed always be executed (potentially being no-op),
which removes the need for the removed check and does not change
the actual behaviour, because the rest of `syscall_trace_exit()` used
the exit TIFs as well anyway, as far as I understand.

> I probably moved too fast with this refactoring. I'll split this into
> smaller, more granular steps in v16 to make the logic clearer and easier
> to follow."
There were a bit more subtleties than I expected, so I think it is good
to have split it in more self-contained patches !
>> I understand that the change is to align with the generic entry, but it
>> seems
>> like this could have an impact that I do not really understand, so I prefer
>> asking !
>>
>> Apart from the above everything looks OK to me, but I'd like
>> some confirmation that the change of behaviours either do not exist or
>> are OK !
> Thank you for the review.
>
>> Thanks,
>> Ada
>>
>>> Cc: Mark Rutland <mark.rutland@arm.com>
>>> Cc: Will Deacon <will@kernel.org>
>>> Cc: Catalin Marinas <catalin.marinas@arm.com>
>>> Reviewed-by: Linus Walleij <linusw@kernel.org>
>>> Reviewed-by: Yeoreum Yun <yeoreum.yun@arm.com>
>>> Reviewed-by: Kevin Brodsky <kevin.brodsky@arm.com>
>>> Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
>>> ---
>>> v15
>>> - Make syscall_exit_to_user_mode_work() __always_inline to keep
>>>     the fast-path performance as Sashiko pointed out.
>>> ---
>>>    arch/arm64/include/asm/syscall.h | 18 +++++++++++++++++-
>>>    arch/arm64/kernel/ptrace.c       |  5 +----
>>>    arch/arm64/kernel/syscall.c      | 20 +-------------------
>>>    3 files changed, 19 insertions(+), 24 deletions(-)
>>>
>>> diff --git a/arch/arm64/include/asm/syscall.h b/arch/arm64/include/
>>> asm/syscall.h
>>> index 30b203ef156b..b331e09b937f 100644
>>> --- a/arch/arm64/include/asm/syscall.h
>>> +++ b/arch/arm64/include/asm/syscall.h
>>> @@ -8,6 +8,7 @@
>>>    #include <uapi/linux/audit.h>
>>>    #include <linux/compat.h>
>>>    #include <linux/err.h>
>>> +#include <linux/rseq.h>
>>>      typedef long (*syscall_fn_t)(const struct pt_regs *regs);
>>>    @@ -121,6 +122,21 @@ static inline int syscall_get_arch(struct
>>> task_struct *task)
>>>    }
>>>      int syscall_trace_enter(struct pt_regs *regs, unsigned long flags);
>>> -void syscall_trace_exit(struct pt_regs *regs, unsigned long flags);
>>> +void syscall_exit_work(struct pt_regs *regs, unsigned long flags);
>>> +
>>> +static inline bool has_syscall_work(unsigned long flags)
>>> +{
>>> +    return unlikely(flags & _TIF_SYSCALL_WORK);
>>> +}
>>> +
>>> +static __always_inline void syscall_exit_to_user_mode_work(struct
>>> pt_regs *regs)
>>> +{
>>> +    unsigned long flags = read_thread_flags();
>>              ^-- This only reflects the post-syscall flags
>>
>>> +
>>> +    rseq_syscall(regs);
>>> +
>>> +    if (has_syscall_work(flags) || flags & _TIF_SINGLESTEP)
>>> +        syscall_exit_work(regs, flags);
>>> +}
>>>      #endif    /* __ASM_SYSCALL_H */
>>> diff --git a/arch/arm64/kernel/ptrace.c b/arch/arm64/kernel/ptrace.c
>>> index 15a45eeb56da..256aa20377e1 100644
>>> --- a/arch/arm64/kernel/ptrace.c
>>> +++ b/arch/arm64/kernel/ptrace.c
>>> @@ -28,7 +28,6 @@
>>>    #include <linux/hw_breakpoint.h>
>>>    #include <linux/regset.h>
>>>    #include <linux/elf.h>
>>> -#include <linux/rseq.h>
>>>      #include <asm/compat.h>
>>>    #include <asm/cpufeature.h>
>>> @@ -2454,10 +2453,8 @@ int syscall_trace_enter(struct pt_regs *regs,
>>> unsigned long flags)
>>>        return syscall;
>>>    }
>>>    -void syscall_trace_exit(struct pt_regs *regs, unsigned long flags)
>>> +void syscall_exit_work(struct pt_regs *regs, unsigned long flags)
>>>    {
>>> -    rseq_syscall(regs);
>>> -
>>>        audit_syscall_exit(regs);
>>       ^-- This was always called if entry had work or CONFIG_DEBUG_RSEQ
>> was enabled,
>>           which is not the case anymore (same for the rest of the function)
> As explained above, thank you!
>
>>>          if (flags & _TIF_SYSCALL_TRACEPOINT)
>>> diff --git a/arch/arm64/kernel/syscall.c b/arch/arm64/kernel/syscall.c
>>> index f6f87b042995..dac7bcc4bbdf 100644
>>> --- a/arch/arm64/kernel/syscall.c
>>> +++ b/arch/arm64/kernel/syscall.c
>>> @@ -54,11 +54,6 @@ static void invoke_syscall(struct pt_regs *regs,
>>> unsigned int scno,
>>>        syscall_set_return_value(current, regs, 0, ret);
>>>    }
>>>    -static inline bool has_syscall_work(unsigned long flags)
>>> -{
>>> -    return unlikely(flags & _TIF_SYSCALL_WORK);
>>> -}
>>> -
>>>    static void el0_svc_common(struct pt_regs *regs, int scno, int sc_nr,
>>>                   const syscall_fn_t syscall_table[])
>>>    {
>>> @@ -120,21 +115,8 @@ static void el0_svc_common(struct pt_regs *regs,
>>> int scno, int sc_nr,
>>>        }
>>>          invoke_syscall(regs, scno, sc_nr, syscall_table);
>>> -
>>> -    /*
>>> -     * The tracing status may have changed under our feet, so we have to
>>> -     * check again. However, if we were tracing entry, then we always
>>> trace
>>> -     * exit regardless, as the old entry assembly did.
>>> -     */
>>> -    if (!has_syscall_work(flags) && !IS_ENABLED(CONFIG_DEBUG_RSEQ)) {
>>                        ^-- We always traced exit if CONFIG_DEBUG_RSEQ is
>> enabled
>>           ^-- `flags` is unchanged since entry, and exit was always
>> traced if there was work.
> As explained above, thank you!
>
> Best regards,
> Jinjie
Thanks again for the additional details !
Kind regards,
Ada


  reply	other threads:[~2026-06-29 14:49 UTC|newest]

Thread overview: 34+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-05-11  9:20 [PATCH v15 00/11] arm64: entry: Convert to Generic Entry Jinjie Ruan
2026-05-11  9:20 ` [PATCH v15 01/11] entry: Fix potential syscall truncation in syscall_trace_enter() Jinjie Ruan
2026-05-27 12:21   ` Linus Walleij
2026-06-24 13:34   ` Ada Couprie Diaz
2026-06-24 17:34   ` Thomas Gleixner
2026-06-25 11:33     ` Jinjie Ruan
2026-05-11  9:20 ` [PATCH v15 02/11] arm64/ptrace: Refactor syscall_trace_enter/exit() to accept flags parameter Jinjie Ruan
2026-06-24 13:38   ` Ada Couprie Diaz
2026-05-11  9:20 ` [PATCH v15 03/11] arm64/ptrace: Use syscall_get_nr() helper for syscall_trace_enter() Jinjie Ruan
2026-06-24 13:42   ` Ada Couprie Diaz
2026-06-25  8:51     ` Jinjie Ruan
2026-05-11  9:20 ` [PATCH v15 04/11] arm64/ptrace: Expand secure_computing() in place Jinjie Ruan
2026-06-24 13:43   ` Ada Couprie Diaz
2026-05-11  9:20 ` [PATCH v15 05/11] arm64/ptrace: Use syscall_get_arguments() helper for audit Jinjie Ruan
2026-06-24 13:44   ` Ada Couprie Diaz
2026-05-11  9:20 ` [PATCH v15 06/11] arm64: ptrace: Move rseq_syscall() before audit_syscall_exit() Jinjie Ruan
2026-06-24 13:46   ` Ada Couprie Diaz
2026-05-11  9:20 ` [PATCH v15 07/11] arm64: syscall: Introduce syscall_exit_to_user_mode_work() Jinjie Ruan
2026-06-24 14:37   ` Ada Couprie Diaz
2026-06-25  9:18     ` Jinjie Ruan
2026-06-29 14:48       ` Ada Couprie Diaz [this message]
2026-05-11  9:21 ` [PATCH v15 08/11] arm64/ptrace: Define and use _TIF_SYSCALL_EXIT_WORK Jinjie Ruan
2026-06-24 14:53   ` Ada Couprie Diaz
2026-06-25 11:41     ` Jinjie Ruan
2026-06-29 15:05       ` Ada Couprie Diaz
2026-05-11  9:21 ` [PATCH v15 09/11] arm64/ptrace: Skip syscall exit reporting for PTRACE_SYSEMU_SINGLESTEP Jinjie Ruan
2026-06-24 14:55   ` Ada Couprie Diaz
2026-05-11  9:21 ` [PATCH v15 10/11] arm64: entry: Convert to generic entry Jinjie Ruan
2026-06-24 15:32   ` Ada Couprie Diaz
2026-06-25 11:27     ` Jinjie Ruan
2026-05-11  9:21 ` [PATCH v15 11/11] arm64: Inline el0_svc_common() Jinjie Ruan
2026-06-24 15:36   ` Ada Couprie Diaz
2026-06-17 16:27 ` [PATCH v15 00/11] arm64: entry: Convert to Generic Entry Ada Couprie Diaz
2026-06-24 15:44   ` Ada Couprie Diaz

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=feb4296b-66ad-4119-b53d-015e0957efca@arm.com \
    --to=ada.coupriediaz@arm.com \
    --cc=anshuman.khandual@arm.com \
    --cc=broonie@kernel.org \
    --cc=catalin.marinas@arm.com \
    --cc=james.morse@arm.com \
    --cc=kees@kernel.org \
    --cc=kevin.brodsky@arm.com \
    --cc=ldv@strace.io \
    --cc=linusw@kernel.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=liqiang01@kylinos.cn \
    --cc=luto@kernel.org \
    --cc=mark.rutland@arm.com \
    --cc=oleg@redhat.com \
    --cc=pengcan@kylinos.cn \
    --cc=peterz@infradead.org \
    --cc=ruanjinjie@huawei.com \
    --cc=ryan.roberts@arm.com \
    --cc=song@kernel.org \
    --cc=tglx@kernel.org \
    --cc=thuth@redhat.com \
    --cc=wad@chromium.org \
    --cc=will@kernel.org \
    --cc=yeoreum.yun@arm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox