linuxppc-dev.lists.ozlabs.org archive mirror
* [PATCH v1 0/2] epoll: Save one stac/clac pair in epoll_put_uevent().
@ 2025-10-23  0:04 Kuniyuki Iwashima
  2025-10-23  0:04 ` [PATCH v1 1/2] uaccess: Add __user_write_access_begin() Kuniyuki Iwashima
  2025-10-23  0:04 ` [PATCH v1 2/2] epoll: Use __user_write_access_begin() and unsafe_put_user() in epoll_put_uevent() Kuniyuki Iwashima
  0 siblings, 2 replies; 16+ messages in thread
From: Kuniyuki Iwashima @ 2025-10-23  0:04 UTC
  To: Catalin Marinas, Will Deacon, Madhavan Srinivasan,
	Michael Ellerman, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	Jens Axboe, Christian Brauner, Linus Torvalds
  Cc: Nicholas Piggin, Christophe Leroy, Alexandre Ghiti,
	H. Peter Anvin, Eric Dumazet, Kuniyuki Iwashima,
	Kuniyuki Iwashima, x86, linux-arm-kernel, linuxppc-dev,
	linux-riscv, linux-kernel

epoll_put_uevent() calls __put_user() twice, and each call is
compiled to a call to an out-of-line function: __put_user_nocheck_4()
and __put_user_nocheck_8(), respectively.

Both functions wrap mov with a stac/clac pair, which is expensive
on an AMD EPYC 7B12 64-Core Processor platform.

  __put_user_nocheck_4  /proc/kcore [Percent: local period]
  Percent │
    89.91 │      stac
     0.19 │      mov  %eax,(%rcx)
     0.15 │      xor  %ecx,%ecx
     9.69 │      clac
     0.06 │    ← retq

This overhead was noticeable while testing neper/tcp_rr with 1000
flows per thread.

  Overhead  Shared O  Symbol
    10.08%  [kernel]  [k] _copy_to_iter
     7.12%  [kernel]  [k] ip6_output
     6.40%  [kernel]  [k] sock_poll
     5.71%  [kernel]  [k] move_addr_to_user
     4.39%  [kernel]  [k] __put_user_nocheck_4
     ...
     1.06%  [kernel]  [k] ep_try_send_events
     ...                  ^- epoll_put_uevent() was inlined
     0.78%  [kernel]  [k] __put_user_nocheck_8

Patch 1 adds a new uaccess helper that is inlined to a bare stac,
without address masking or access_ok().  The range check is skipped
because the user buffer is already checked in ep_check_params().

Patch 2 uses the helper and unsafe_put_user() in epoll_put_uevent().
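
To illustrate the saving, here is a minimal user-space sketch of the
before/after shape of epoll_put_uevent().  It is not kernel code: the
mock_* names, struct mock_epoll_event, and the counters are all
hypothetical stand-ins, with mock_stac()/mock_clac() counting where
the real stac/clac instructions would be issued and plain stores
standing in for unsafe_put_user().

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical counters standing in for the real stac/clac instructions. */
static int stac_count, clac_count;

static void mock_stac(void) { stac_count++; }
static void mock_clac(void) { clac_count++; }

/* Hypothetical stand-in for struct epoll_event. */
struct mock_epoll_event {
	uint32_t events;
	uint64_t data;
};

/* Models an out-of-line __put_user(): every store pays its own pair. */
static int mock_put_user_u32(uint32_t val, uint32_t *dst)
{
	mock_stac();
	*dst = val;
	mock_clac();
	return 0;
}

static int mock_put_user_u64(uint64_t val, uint64_t *dst)
{
	mock_stac();
	*dst = val;
	mock_clac();
	return 0;
}

/* Before: two independent stores, hence two stac/clac pairs. */
static struct mock_epoll_event *
put_uevent_before(uint32_t revents, uint64_t data,
		  struct mock_epoll_event *uevent)
{
	if (mock_put_user_u32(revents, &uevent->events) ||
	    mock_put_user_u64(data, &uevent->data))
		return NULL;

	return uevent + 1;
}

/* After: both stores share a single access window, hence one pair. */
static struct mock_epoll_event *
put_uevent_after(uint32_t revents, uint64_t data,
		 struct mock_epoll_event *uevent)
{
	mock_stac();			/* models __user_write_access_begin() */
	uevent->events = revents;	/* models unsafe_put_user() */
	uevent->data = data;		/* models unsafe_put_user() */
	mock_clac();			/* models user_access_end() */

	return uevent + 1;
}
```

Calling put_uevent_before() once bumps each counter by two, while
put_uevent_after() bumps each by one.  The real helpers also have a
fault path (a label passed to unsafe_put_user()) that this mock does
not model.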


Kuniyuki Iwashima (2):
  uaccess: Add __user_write_access_begin().
  epoll: Use __user_write_access_begin() and unsafe_put_user() in
    epoll_put_uevent().

 arch/arm64/include/asm/uaccess.h   |  1 +
 arch/powerpc/include/asm/uaccess.h | 13 ++++++++++---
 arch/riscv/include/asm/uaccess.h   |  1 +
 arch/x86/include/asm/uaccess.h     |  1 +
 include/linux/eventpoll.h          | 13 ++++++++-----
 include/linux/uaccess.h            |  1 +
 6 files changed, 22 insertions(+), 8 deletions(-)

-- 
2.51.1.814.gb8fa24458f-goog




Thread overview: 16+ messages
2025-10-23  0:04 [PATCH v1 0/2] epoll: Save one stac/clac pair in epoll_put_uevent() Kuniyuki Iwashima
2025-10-23  0:04 ` [PATCH v1 1/2] uaccess: Add __user_write_access_begin() Kuniyuki Iwashima
2025-10-23  5:37   ` Linus Torvalds
2025-10-23  8:29     ` David Laight
2025-10-24  5:31       ` Kuniyuki Iwashima
2025-10-23  0:04 ` [PATCH v1 2/2] epoll: Use __user_write_access_begin() and unsafe_put_user() in epoll_put_uevent() Kuniyuki Iwashima
2025-10-23 19:40   ` Dave Hansen
2025-10-24  5:16     ` Kuniyuki Iwashima
2025-10-24 14:05       ` Dave Hansen
2025-10-24 14:47         ` David Laight
2025-10-28  5:32         ` Kuniyuki Iwashima
2025-10-28  9:54           ` David Laight
2025-10-28 16:42             ` Kuniyuki Iwashima
2025-10-28 16:58               ` Linus Torvalds
2025-10-29  1:42                 ` Andrew Cooper
2025-10-28 22:30               ` David Laight
