[PATCH 0/2] execve scalability issues, part 1

linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed

From: Mateusz Guzik <mjguzik@gmail.com>
To: linux-kernel@vger.kernel.org
Cc: dennis@kernel.org, tj@kernel.org, cl@linux.com,
	akpm@linux-foundation.org, shakeelb@google.com,
	linux-mm@kvack.org, Mateusz Guzik <mjguzik@gmail.com>
Subject: [PATCH 0/2] execve scalability issues, part 1
Date: Mon, 21 Aug 2023 22:28:27 +0200	[thread overview]
Message-ID: <20230821202829.2163744-1-mjguzik@gmail.com> (raw)

To start I figured I'm going to bench about as friendly case as it gets
-- statically linked *separate* binaries all doing execve in a loop.

I borrowed the bench from found here:
http://apollo.backplane.com/DFlyMisc/doexec.c

$ cc -static -O2 -o static-doexec doexec.c
$ ./static-doexec $(nproc)

It prints a result every second (warning: first line is garbage).

My test box is temporarily only 26 cores and even at this scale I run
into massive lock contention stemming from back-to-back calls to
percpu_counter_init (and _destroy later).

While not a panacea, one simple thing to do here is to batch these ops.
Since the term "batching" is already used in the file, I decided to
refer to it as "grouping" instead.

Even if this code could be patched to dodge these counters,  I would
argue a high-traffic alloc/free consumer is only a matter of time so it
makes sense to facilitate it.

With the fix I get an ok win, to quote from the commit:
> Even at a very modest scale of 26 cores (ops/s):
> before: 133543.63
> after:  186061.81 (+39%)

> While with the patch these allocations remain a significant problem,
> the primary bottleneck shifts to:
> 
>     __pv_queued_spin_lock_slowpath+1
>     _raw_spin_lock_irqsave+57
>     folio_lruvec_lock_irqsave+91
>     release_pages+590
>     tlb_batch_pages_flush+61
>     tlb_finish_mmu+101
>     exit_mmap+327
>     __mmput+61
>     begin_new_exec+1245
>     load_elf_binary+712
>     bprm_execve+644
>     do_execveat_common.isra.0+429
>     __x64_sys_execve+50
>     do_syscall_64+46
>     entry_SYSCALL_64_after_hwframe+110

I intend to do more work on the area to mostly sort it out, but I would
not mind if someone else took the hammer to folio. :)

With this out of the way I'll be looking at some form of caching to
eliminate these allocs as a problem.

Thoughts?

Mateusz Guzik (2):
  pcpcntr: add group allocation/free
  fork: group allocation of per-cpu counters for mm struct

 include/linux/percpu_counter.h | 19 ++++++++---
 kernel/fork.c                  | 13 ++------
 lib/percpu_counter.c           | 61 ++++++++++++++++++++++++----------
 3 files changed, 60 insertions(+), 33 deletions(-)

-- 
2.39.2

next             reply	other threads:[~2023-08-21 20:28 UTC|newest]

Thread overview: 31+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-08-21 20:28 Mateusz Guzik [this message]
2023-08-21 20:28 ` [PATCH 1/2] pcpcntr: add group allocation/free Mateusz Guzik
2023-08-22 13:37   ` Vegard Nossum
2023-08-22 14:06     ` Mateusz Guzik
2023-08-22 17:02   ` Dennis Zhou
2023-08-21 20:28 ` [PATCH 2/2] fork: group allocation of per-cpu counters for mm struct Mateusz Guzik
2023-08-21 21:20   ` Matthew Wilcox
2023-08-21 20:42 ` [PATCH 0/2] execve scalability issues, part 1 Matthew Wilcox
2023-08-21 20:44   ` [PATCH 1/7] mm: Make folios_put() the basis of release_pages() Matthew Wilcox (Oracle)
2023-08-21 20:44     ` [PATCH 2/7] mm: Convert free_unref_page_list() to use folios Matthew Wilcox (Oracle)
2023-08-21 20:44     ` [PATCH 3/7] mm: Add free_unref_folios() Matthew Wilcox (Oracle)
2023-08-21 20:44     ` [PATCH 4/7] mm: Use folios_put() in __folio_batch_release() Matthew Wilcox (Oracle)
2023-08-21 20:44     ` [PATCH 5/7] memcg: Add mem_cgroup_uncharge_batch() Matthew Wilcox (Oracle)
2023-08-21 20:44     ` [PATCH 6/7] mm: Remove use of folio list from folios_put() Matthew Wilcox (Oracle)
2023-08-21 20:44     ` [PATCH 7/7] mm: Use free_unref_folios() in put_pages_list() Matthew Wilcox (Oracle)
2023-08-21 21:07 ` [PATCH 0/2] execve scalability issues, part 1 Dennis Zhou
2023-08-21 21:39   ` Mateusz Guzik
2023-08-21 22:29     ` Mateusz Guzik
2023-08-22  9:51       ` Jan Kara
2023-08-22 14:24         ` Mateusz Guzik
2023-08-23  9:49           ` Jan Kara
2023-08-23 10:49             ` David Laight
2023-08-23 12:01               ` Mateusz Guzik
2023-08-23 12:13             ` Mateusz Guzik
2023-08-23 15:47               ` Jan Kara
2023-08-23 16:10                 ` Mateusz Guzik
2023-08-23 16:41                   ` Jan Kara
2023-08-23 17:12                     ` Mateusz Guzik
2023-08-23 20:27             ` Dennis Zhou
2023-08-24  9:19               ` Jan Kara
2023-08-26 18:33 ` Mateusz Guzik

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20230821202829.2163744-1-mjguzik@gmail.com \
    --to=mjguzik@gmail.com \
    --cc=akpm@linux-foundation.org \
    --cc=cl@linux.com \
    --cc=dennis@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=shakeelb@google.com \
    --cc=tj@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).