From: Fox Chen <foxhlchen@gmail.com>
To: gregkh@linuxfoundation.org, tj@kernel.org
Cc: Fox Chen <foxhlchen@gmail.com>, linux-kernel@vger.kernel.org
Subject: [PATCH 0/2] kernfs: speed up concurrency performance
Date: Wed, 2 Dec 2020 22:58:35 +0800
Message-ID: <20201202145837.48040-1-foxhlchen@gmail.com>
Hello,
kernfs is an important facility that supports pseudo file systems and cgroup.
Currently, with a global mutex, reading files concurrently from kernfs (e.g. /sys)
is very slow.
This problem was reported by Brice Goglin in the thread:
Re: [PATCH 1/4] drivers core: Introduce CPU type sysfs interface
https://lore.kernel.org/lkml/X60dvJoT4fURcnsF@kroah.com/
I independently confirmed this on a 96-core AWS c5.metal server by running
open+read+close on /sys/devices/system/cpu/cpu15/topology/core_id 1000 times.
With a single thread, each open+read+close takes ~2.5 us.
With one thread per core (96 threads running simultaneously), each
open+read+close takes ~540 us (without much variation), which is about 200x
slower than the single-threaded case.
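For anyone who wants to reproduce the measurement, below is a minimal
user-space sketch of such a test. It is not the program used for the numbers
above; the names bench_open_read_close(), worker_main() and ITERATIONS are
made up for illustration, and the start flag is only an approximate way to
get all threads running at the same time.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>

#define ITERATIONS 1000          /* iterations per thread, as in the test above */

struct worker {
	const char *path;        /* file to open+read+close */
	double avg_us;           /* mean microseconds per iteration, -1.0 on error */
};

static atomic_int start_flag;    /* set to 1 once all threads have been created */

static double now_us(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return ts.tv_sec * 1e6 + ts.tv_nsec / 1e3;
}

static void *worker_main(void *arg)
{
	struct worker *w = arg;
	char buf[64];
	double start;

	while (!atomic_load(&start_flag))
		sched_yield();   /* wait so that all threads run together */

	start = now_us();
	for (int i = 0; i < ITERATIONS; i++) {
		int fd = open(w->path, O_RDONLY);
		ssize_t n;

		if (fd < 0) {
			w->avg_us = -1.0;
			return NULL;
		}
		n = read(fd, buf, sizeof(buf));
		close(fd);
		if (n < 0) {
			w->avg_us = -1.0;
			return NULL;
		}
	}
	w->avg_us = (now_us() - start) / ITERATIONS;
	return NULL;
}

/*
 * Run n_threads workers at once and return the mean per-operation latency
 * in microseconds across all threads, or -1.0 if any thread failed.
 */
double bench_open_read_close(const char *path, int n_threads)
{
	pthread_t *tids;
	struct worker *ws;
	double sum = 0.0;
	int created = 0, failed = 0;

	assert(n_threads > 0);
	tids = calloc(n_threads, sizeof(*tids));
	ws = calloc(n_threads, sizeof(*ws));
	if (!tids || !ws) {
		free(tids);
		free(ws);
		return -1.0;
	}

	atomic_store(&start_flag, 0);
	for (int i = 0; i < n_threads; i++) {
		ws[i].path = path;
		if (pthread_create(&tids[i], NULL, worker_main, &ws[i]) != 0) {
			failed = 1;
			break;
		}
		created++;
	}
	atomic_store(&start_flag, 1);   /* release every thread that exists */

	for (int i = 0; i < created; i++) {
		pthread_join(tids[i], NULL);
		if (ws[i].avg_us < 0)
			failed = 1;
		sum += ws[i].avg_us;
	}
	free(tids);
	free(ws);
	return failed ? -1.0 : sum / created;
}
```

Calling bench_open_read_close() on the core_id path once with 1 thread and
once with 96 threads, then comparing the two return values, should give a
rough idea of the slowdown. Older glibc versions need -pthread when linking.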
The problem can only be observed on large machines (>=16 cores).
The more cores you have, the slower it can be.
Perf shows that CPUs spend most of the time (>80%) waiting on mutex locks in
kernfs_iop_permission and kernfs_dop_revalidate.
This patchset contains the following 2 patches:
0001-kernfs-replace-the-mutex-in-kernfs_iop_permission-wi.patch
0002-kernfs-remove-mutex-in-kernfs_dop_revalidate.patch
0001 replaces the mutex lock in kernfs_iop_permission with a new rwlock, and
0002 removes the mutex lock in kernfs_dop_revalidate.
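To illustrate why a reader/writer lock helps a read-mostly path, here is a
small user-space analogy using POSIX pthread_rwlock_t. This is only a sketch
and not the code in the patches: struct node_attrs, node_may_read() and
node_set_mode() are hypothetical names, and the kernel uses its own locking
primitives rather than pthreads.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <sys/types.h>

/* Hypothetical stand-in for per-node attributes; not the kernfs layout. */
struct node_attrs {
	pthread_rwlock_t lock;
	mode_t mode;
	uid_t uid;
};

void node_attrs_init(struct node_attrs *a, mode_t mode, uid_t uid)
{
	pthread_rwlock_init(&a->lock, NULL);
	a->mode = mode;
	a->uid = uid;
}

/* Read path: any number of callers may hold the read lock at once. */
int node_may_read(struct node_attrs *a, uid_t caller)
{
	int ok;

	pthread_rwlock_rdlock(&a->lock);
	ok = (a->uid == caller) ? !!(a->mode & 0400) : !!(a->mode & 0004);
	pthread_rwlock_unlock(&a->lock);
	return ok;
}

/* Write path: exclusive, so readers wait only while attributes change. */
void node_set_mode(struct node_attrs *a, mode_t mode)
{
	pthread_rwlock_wrlock(&a->lock);
	a->mode = mode;
	pthread_rwlock_unlock(&a->lock);
}
```

With a plain mutex every node_may_read() caller would serialize behind the
others, even though none of them modifies anything. With the rwlock, readers
only contend with the comparatively rare writers.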
After applying this patchset, the multi-thread performance becomes linear,
ranging from ~30 us for the fastest thread to ~150 us for the slowest. That is
very close to what I measured on a normal ext4 file system, where the fastest
thread took ~20 us and the slowest ~100 us.
I believe the remaining gap is largely due to spin_locks in the filesystem
code, which is normal.
Although it's still slower than the single-threaded case, users can benefit
from this patchset, especially those working in the HPC realm who have lots of
CPU cores and want to fetch system information from sysfs.
I tried my best to solve this problem. If I have made any stupid mistakes,
please kindly point them out. I would appreciate it greatly.
Fox
fs/kernfs/dir.c | 9 +++------
fs/kernfs/inode.c | 16 ++++++++--------
include/linux/kernfs.h | 1 +
3 files changed, 12 insertions(+), 14 deletions(-)
--
2.29.2