From: "Christopher M. Riedl" <cmr@codefail.de>
To: "Christophe Leroy" <christophe.leroy@csgroup.eu>,
	<linuxppc-dev@lists.ozlabs.org>
Subject: Re: [PATCH 2/8] powerpc/signal: Add unsafe_copy_{vsx,fpr}_from_user()
Date: Sat, 06 Feb 2021 11:39:13 -0600
Message-ID: <C92MQRHFCFEA.37OV051PFFY6@geist>
In-Reply-To: <fce0b2d0-58a3-a94d-a8e9-d104fc2b3058@csgroup.eu>

On Sat Feb 6, 2021 at 10:32 AM CST, Christophe Leroy wrote:
>
>
> On 20/10/2020 at 04:01, Christopher M. Riedl wrote:
> > On Fri Oct 16, 2020 at 10:48 AM CDT, Christophe Leroy wrote:
> >>
> >>
> >> On 15/10/2020 at 17:01, Christopher M. Riedl wrote:
> >>> Reuse the "safe" implementation from signal.c except for calling
> >>> unsafe_copy_from_user() to copy into a local buffer. Unlike the
> >>> unsafe_copy_{vsx,fpr}_to_user() functions the "copy from" functions
> >>> cannot use unsafe_get_user() directly to bypass the local buffer since
> >>> doing so significantly reduces signal handling performance.
> >>
> >> Why can't the functions use unsafe_get_user(), and why does doing so
> >> significantly reduce signal handling performance? How significant is
> >> the reduction? I would expect that avoiding an intermediate memory
> >> area would be more efficient.
> >>
> > 
> > Here is a comparison; 'unsafe-signal64-regs' avoids the intermediate buffer:
> > 
> > 	|                      | hash   | radix  |
> > 	| -------------------- | ------ | ------ |
> > 	| linuxppc/next        | 289014 | 158408 |
> > 	| unsafe-signal64      | 298506 | 253053 |
> > 	| unsafe-signal64-regs | 254898 | 220831 |
> > 
> > I have not figured out the 'why' yet. As you mentioned in your series,
> > technically calling __copy_tofrom_user() is overkill for these
> > operations. The only obvious difference between unsafe_put_user() and
> > unsafe_get_user() is that we don't have asm-goto for the 'get' variant.
> > Instead we wrap with unsafe_op_wrap(), which inserts a conditional and
> > then a goto to the label.
> > 
> > Implementations:
> > 
> > 	#define unsafe_copy_fpr_from_user(task, from, label)   do {            \
> > 	       struct task_struct *__t = task;                                 \
> > 	       u64 __user *buf = (u64 __user *)from;                           \
> > 	       int i;                                                          \
> > 									       \
> > 	       for (i = 0; i < ELF_NFPREG - 1; i++)                            \
> > 		       unsafe_get_user(__t->thread.TS_FPR(i), &buf[i], label); \
> > 	       unsafe_get_user(__t->thread.fp_state.fpscr, &buf[i], label);    \
> > 	} while (0)
> > 
> > 	#define unsafe_copy_vsx_from_user(task, from, label)   do {            \
> > 	       struct task_struct *__t = task;                                 \
> > 	       u64 __user *buf = (u64 __user *)from;                           \
> > 	       int i;                                                          \
> > 									       \
> > 	       for (i = 0; i < ELF_NVSRHALFREG ; i++)                          \
> > 		       unsafe_get_user(__t->thread.fp_state.fpr[i][TS_VSRLOWOFFSET], \
> > 				       &buf[i], label);                        \
> > 	} while (0)
> > 
>
> Do you have CONFIG_PROVE_LOCKING or CONFIG_DEBUG_ATOMIC_SLEEP enabled in
> your config ?

I don't have these set in my config (ppc64le_defconfig). I think I
figured this out: the lower signal throughput comes from the
barrier_nospec() in __get_user_nocheck(). When looping, we incur that
cost on every iteration. Commenting it out results in signal performance
of ~316K w/ hash on the unsafe-signal64-regs branch. Obviously the
barrier is there for a reason, but it is quite costly.

This also explains why the copy_{fpr,vsx}_to_user() direction does not
suffer from the slowdown: there is no need for barrier_nospec() there.
>
> If yes, could you try together with the patch from Alexey
> https://patchwork.ozlabs.org/project/linuxppc-dev/patch/20210204121612.32721-1-aik@ozlabs.ru/
> ?
>
> Thanks
> Christophe

