From: Charlie Jenkins <charlie@rivosinc.com>
To: Brian Gerst <brgerst@gmail.com>
Cc: Alexandre Ghiti <alex@ghiti.fr>,
	Paul Walmsley <paul.walmsley@sifive.com>,
	Palmer Dabbelt <palmer@dabbelt.com>,
	Huacai Chen <chenhuacai@kernel.org>,
	WANG Xuerui <kernel@xen0n.name>,
	Thomas Gleixner <tglx@linutronix.de>,
	Peter Zijlstra <peterz@infradead.org>,
	Andy Lutomirski <luto@kernel.org>,
	Alexandre Ghiti <alexghiti@rivosinc.com>,
	linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org,
	loongarch@lists.linux.dev
Subject: Re: [PATCH v2 2/4] riscv: entry: Split ret_from_fork() into user and kernel
Date: Fri, 24 Jan 2025 10:26:24 -0800
Message-ID: <Z5PbUBqs2Eh8JCIw@ghost>
In-Reply-To: <CAMzpN2gL9GWx-YzJd+sgQ30=gk3TN342W8Memz5j1C02jgQfrg@mail.gmail.com>

On Fri, Jan 24, 2025 at 08:08:44AM -0500, Brian Gerst wrote:
> On Fri, Jan 24, 2025 at 2:53 AM Charlie Jenkins <charlie@rivosinc.com> wrote:
> >
> > On Fri, Jan 24, 2025 at 08:19:18AM +0100, Alexandre Ghiti wrote:
> > > Hi Charlie,
> > >
> > > On 23/01/2025 20:14, Charlie Jenkins wrote:
> > > > This function was unified into a single function in commit ab9164dae273
> > > > ("riscv: entry: Consolidate ret_from_kernel_thread into ret_from_fork").
> > > > However that imposed a performance degradation. Partially reverting this
> > > > commit to have ret_from_fork() split again results in a 1% increase on
> > > > the number of times fork is able to be called per second.
> > > >
> > > > Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
> > > > ---
> > > >   arch/riscv/include/asm/asm-prototypes.h |  3 ++-
> > > >   arch/riscv/kernel/entry.S               | 13 ++++++++++---
> > > >   arch/riscv/kernel/process.c             | 17 +++++++++++------
> > > >   3 files changed, 23 insertions(+), 10 deletions(-)
> > > >
> > > > diff --git a/arch/riscv/include/asm/asm-prototypes.h b/arch/riscv/include/asm/asm-prototypes.h
> > > > index 733ff609778797001006c33bba9e3cc5b1f15387..bfc8ea5f9319b19449ec59493b45b926df888832 100644
> > > > --- a/arch/riscv/include/asm/asm-prototypes.h
> > > > +++ b/arch/riscv/include/asm/asm-prototypes.h
> > > > @@ -52,7 +52,8 @@ DECLARE_DO_ERROR_INFO(do_trap_ecall_s);
> > > >   DECLARE_DO_ERROR_INFO(do_trap_ecall_m);
> > > >   DECLARE_DO_ERROR_INFO(do_trap_break);
> > > > -asmlinkage void ret_from_fork(void *fn_arg, int (*fn)(void *), struct pt_regs *regs);
> > > > +asmlinkage void ret_from_fork_kernel(void *fn_arg, int (*fn)(void *), struct pt_regs *regs);
> > > > +asmlinkage void ret_from_fork_user(struct pt_regs *regs);
> > > >   asmlinkage void handle_bad_stack(struct pt_regs *regs);
> > > >   asmlinkage void do_page_fault(struct pt_regs *regs);
> > > >   asmlinkage void do_irq(struct pt_regs *regs);
> > > > diff --git a/arch/riscv/kernel/entry.S b/arch/riscv/kernel/entry.S
> > > > index 9225c322279aa90e737b1d7144db084319cf8103..9386ef7444267f0b9bf8a0550f4e31deaeb85881 100644
> > > > --- a/arch/riscv/kernel/entry.S
> > > > +++ b/arch/riscv/kernel/entry.S
> > > > @@ -319,14 +319,21 @@ SYM_CODE_END(handle_kernel_stack_overflow)
> > > >   ASM_NOKPROBE(handle_kernel_stack_overflow)
> > > >   #endif
> > > > -SYM_CODE_START(ret_from_fork_asm)
> > > > +SYM_CODE_START(ret_from_fork_kernel_asm)
> > > >     call schedule_tail
> > > >     move a0, s1 /* fn */
> > > >     move a1, s0 /* fn_arg */
> > > >     move a2, sp /* pt_regs */
> > > > -   call ret_from_fork
> > > > +   call ret_from_fork_kernel
> > > >     j ret_from_exception
> > > > -SYM_CODE_END(ret_from_fork_asm)
> > > > +SYM_CODE_END(ret_from_fork_kernel_asm)
> > > > +
> > > > +SYM_CODE_START(ret_from_fork_user_asm)
> > > > +   call schedule_tail
> > > > +   move a0, sp /* pt_regs */
> > > > +   call ret_from_fork_user
> > > > +   j ret_from_exception
> > > > +SYM_CODE_END(ret_from_fork_user_asm)
> > > >   #ifdef CONFIG_IRQ_STACKS
> > > >   /*
> > > > diff --git a/arch/riscv/kernel/process.c b/arch/riscv/kernel/process.c
> > > > index 0d07e6d8f6b57beba438dbba5e8c74a014582bee..5f15236cb526bd9fe61636ed372b4b76c94df946 100644
> > > > --- a/arch/riscv/kernel/process.c
> > > > +++ b/arch/riscv/kernel/process.c
> > > > @@ -38,7 +38,8 @@ unsigned long __stack_chk_guard __read_mostly;
> > > >   EXPORT_SYMBOL(__stack_chk_guard);
> > > >   #endif
> > > > -extern asmlinkage void ret_from_fork_asm(void);
> > > > +extern asmlinkage void ret_from_fork_kernel_asm(void);
> > > > +extern asmlinkage void ret_from_fork_user_asm(void);
> > > >   void noinstr arch_cpu_idle(void)
> > > >   {
> > > > @@ -208,14 +209,18 @@ int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src)
> > > >     return 0;
> > > >   }
> > > > -asmlinkage void ret_from_fork(void *fn_arg, int (*fn)(void *), struct pt_regs *regs)
> > > > +asmlinkage void ret_from_fork_kernel(void *fn_arg, int (*fn)(void *), struct pt_regs *regs)
> > > >   {
> > > > -   if (unlikely(fn))
> > > > -           fn(fn_arg);
> > > > +   fn(fn_arg);
> > > >     syscall_exit_to_user_mode(regs);
> > > >   }
> > > > +asmlinkage void ret_from_fork_user(struct pt_regs *regs)
> > > > +{
> > > > +   syscall_exit_to_user_mode(regs);
> > > > +}
> > > > +
> > > >   int copy_thread(struct task_struct *p, const struct kernel_clone_args *args)
> > > >   {
> > > >     unsigned long clone_flags = args->flags;
> > > > @@ -238,6 +243,7 @@ int copy_thread(struct task_struct *p, const struct kernel_clone_args *args)
> > > >             p->thread.s[0] = (unsigned long)args->fn;
> > > >             p->thread.s[1] = (unsigned long)args->fn_arg;
> > > > +           p->thread.ra = (unsigned long)ret_from_fork_kernel_asm;
> > > >     } else {
> > > >             *childregs = *(current_pt_regs());
> > > >             /* Turn off status.VS */
> > > > @@ -247,12 +253,11 @@ int copy_thread(struct task_struct *p, const struct kernel_clone_args *args)
> > > >             if (clone_flags & CLONE_SETTLS)
> > > >                     childregs->tp = tls;
> > > >             childregs->a0 = 0; /* Return value of fork() */
> > > > -           p->thread.s[0] = 0;
> > > > +           p->thread.ra = (unsigned long)ret_from_fork_user_asm;
> > > >     }
> > > >     p->thread.riscv_v_flags = 0;
> > > >     if (has_vector())
> > > >             riscv_v_thread_alloc(p);
> > > > -   p->thread.ra = (unsigned long)ret_from_fork_asm;
> > > >     p->thread.sp = (unsigned long)childregs; /* kernel sp */
> > > >     return 0;
> > > >   }
> > > >
> > >
> > > Can you benchmark this change on some HW? I'm not sure we would indeed gain
> > > this 1%.
> >
> > It reduces the syscall path by 3 instructions. Two come from no longer
> > needing to move fn and fn_arg:
> >
> > move a0, s1 /* fn */
> > move a1, s0 /* fn_arg */
> >
> > And one for not needing to do the conditional. This one is also saved on
> > kernel threads.
> >
> > It's a very small improvement, but there are only something like 100
> > instructions along the direct syscall path, so it ends up being a large
> > percentage. On hardware, moving registers is very cheap and this branch
> > will almost always be correctly predicted, so the cost is close to
> > zero. I just figured that since I am making changes around here it would
> > be nice if it was optimal instead of being close to optimal.
> 
> That may be the case on the child process side, but compared to the
> cost of the fork on the parent process side (allocating and
> initializing a new task struct), it's minuscule.
> 

Yes, that is a good point. The change will probably have an unnoticeable
effect, so it mostly depends on how people want this code to look.
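
For anyone who wants to check this on hardware, a rough user-space
sketch of a forks-per-second measurement could look like the code below.
This is an assumption about how one might measure it, not the benchmark
behind the 1% figure in the commit message. The names now_sec(),
forks_per_second() and FORK_BENCH_MAIN are made up for illustration.
Each iteration also includes the child's exit and the parent's
waitpid(), so the parent-side cost mentioned above is part of the
number.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Monotonic clock reading in seconds. */
static double now_sec(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}

/*
 * Fork `iterations` children one at a time. Each child exits at once
 * and the parent reaps it before forking again.
 * Returns fork+exit+wait round trips per second, or -1.0 on error.
 */
static double forks_per_second(long iterations)
{
	double start, elapsed;
	long i;

	if (iterations <= 0)
		return -1.0;

	start = now_sec();
	for (i = 0; i < iterations; i++) {
		pid_t pid = fork();

		if (pid < 0)
			return -1.0;
		if (pid == 0)
			_exit(0);	/* child: exit as soon as fork() returns */
		if (waitpid(pid, NULL, 0) != pid)
			return -1.0;
	}
	elapsed = now_sec() - start;

	return elapsed > 0.0 ? (double)iterations / elapsed : -1.0;
}

#ifdef FORK_BENCH_MAIN
int main(void)
{
	/* 10000 is an arbitrary iteration count chosen for illustration. */
	printf("%.0f forks/s\n", forks_per_second(10000));
	return 0;
}
#endif
```

Building with -DFORK_BENCH_MAIN and running it several times on kernels
with and without the patch would give a rough comparison. Run-to-run
noise is likely to be larger than a 3 instruction difference.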

- Charlie

> 
> 
> Brian Gerst
