From: Barret Rhoden <brho@google.com>
To: ebiederm@xmission.com
Cc: Christian Brauner <christian.brauner@ubuntu.com>,
Andrew Morton <akpm@linux-foundation.org>,
Alexey Gladkov <legion@kernel.org>,
William Cohen <wcohen@redhat.com>,
Viresh Kumar <viresh.kumar@linaro.org>,
Alexey Dobriyan <adobriyan@gmail.com>,
Chris Hyser <chris.hyser@oracle.com>,
Peter Collingbourne <pcc@google.com>,
Xiaofeng Cao <caoxiaofeng@yulong.com>,
David Hildenbrand <david@redhat.com>,
Cyrill Gorcunov <gorcunov@gmail.com>,
linux-kernel@vger.kernel.org
Subject: [PATCH v3 0/3] prlimit and set/getpriority tasklist_lock optimizations
Date: Thu, 6 Jan 2022 12:20:38 -0500
Message-ID: <20220106172041.522167-1-brho@google.com>
The tasklist_lock popped up as a scalability bottleneck on some testing
workloads. The readlocks in do_prlimit and set/getpriority are not
necessary in all cases.

Based on a cycles profile, it looked like ~87% of the time was spent in
the kernel, ~42% of which was just trying to get *some* spinlock
(queued_spin_lock_slowpath, not necessarily the tasklist_lock).

The big offenders (with rough percentages in cycles of the overall trace):

- do_wait 11%
- setpriority 8% (this patchset)
- kill 8%
- do_exit 5%
- clone 3%
- prlimit64 2% (this patchset)
- getrlimit 1% (this patchset)

I can't easily test this patchset on the original workload for various
reasons. Instead, I used the microbenchmark below to at least verify
there was some improvement. This patchset had a 28% speedup (12% from
baseline to set/getprio, then another 14% for prlimit).
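
As a rough check on how the 12% and 14% figures combine into 28%, here is
a minimal sketch. It assumes each gain was measured relative to the
previous step, so the factors multiply; the text above does not state
how the numbers were derived. compound_speedup() is a hypothetical
helper, not part of the patchset.

```c
#include <assert.h>

/*
 * Combine two successive fractional speedups (0.12 means 12%).
 * Assumption: the second gain is relative to the result of the first,
 * so the factors multiply rather than add.
 */
double compound_speedup(double first, double second)
{
	/* A gain of -100% or worse is not meaningful here. */
	assert(first > -1.0 && second > -1.0);
	return (1.0 + first) * (1.0 + second) - 1.0;
}
```

Under that assumption, compound_speedup(0.12, 0.14) evaluates to about
0.277, which rounds to the 28% quoted above.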
One interesting thing is that my libc's getrlimit() was calling
prlimit64, so hoisting the read_lock(tasklist_lock) into sys_prlimit64
had no effect - it essentially optimized the older syscalls only. I
didn't do that in this patchset, but figured I'd mention it since it was
an option from the previous patch's discussion.
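
For readers who want to see the two userspace interfaces side by side,
the sketch below reads the same limit through getrlimit() and through
glibc's prlimit() wrapper (pid 0 means the calling process) and compares
the results. limits_match() is a hypothetical helper. It only shows that
both paths report the same values; it does not show which syscall a
given libc issues for getrlimit(). That can be observed externally, for
example with "strace -e trace=getrlimit,prlimit64".

```c
#define _GNU_SOURCE		/* prlimit() is a GNU extension in glibc */
#include <assert.h>
#include <stddef.h>
#include <sys/resource.h>
#include <sys/time.h>

/*
 * Return 1 if getrlimit() and prlimit() report the same soft and hard
 * limits for 'resource' on the calling process, 0 if they differ, and
 * -1 if either call fails.
 */
int limits_match(int resource)
{
	struct rlimit via_getrlimit, via_prlimit;

	assert(resource >= 0);

	if (getrlimit(resource, &via_getrlimit) != 0)
		return -1;
	/* pid 0 = calling process; NULL new_limit = read-only query. */
	if (prlimit(0, resource, NULL, &via_prlimit) != 0)
		return -1;

	return via_getrlimit.rlim_cur == via_prlimit.rlim_cur &&
	       via_getrlimit.rlim_max == via_prlimit.rlim_max;
}
```

limits_match(RLIMIT_CPU) is expected to return 1 on a glibc system, since
both calls query the same per-process limit.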
v2: https://lore.kernel.org/lkml/20220105212828.197013-1-brho@google.com/
- update_rlimit_cpu on the group_leader instead of for_each_thread.
- update_rlimit_cpu still returns 0 or -ESRCH, even though we don't care
  about the error here. It felt safer that way in case someone uses
  that function again.

v1: https://lore.kernel.org/lkml/20211213220401.1039578-1-brho@google.com/

#include <sys/resource.h>
#include <sys/time.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(int argc, char **argv)
{
	pid_t child;
	struct rlimit rlim[1];

	/* Fan out to as many as 2^6 = 64 processes, each running the loop. */
	fork(); fork(); fork(); fork(); fork(); fork();

	for (int i = 0; i < 5000; i++) {
		child = fork();
		if (child < 0)
			exit(1);
		if (child > 0) {
			/* Parent: let the child spin for 1 ms, then reap it. */
			usleep(1000);
			kill(child, SIGTERM);
			waitpid(child, NULL, 0);
		} else {
			/* Child: hammer the syscalls under test until killed. */
			for (;;) {
				setpriority(PRIO_PROCESS, 0,
					    getpriority(PRIO_PROCESS, 0));
				getrlimit(RLIMIT_CPU, rlim);
			}
		}
	}
	return 0;
}
Barret Rhoden (3):
  setpriority: only grab the tasklist_lock for PRIO_PGRP
  prlimit: make do_prlimit() static
  prlimit: do not grab the tasklist_lock

 include/linux/posix-timers.h   |   2 +-
 include/linux/resource.h       |   2 -
 kernel/sys.c                   | 127 +++++++++++++++++----------------
 kernel/time/posix-cpu-timers.c |  12 +++-
 4 files changed, 76 insertions(+), 67 deletions(-)
--
2.34.1.448.ga2b2bfdf31-goog