From: Dmitry Monakhov <dmonakhov@openvz.org>
To: linux-ext4@vger.kernel.org
Cc: tytso@mit.edu
Subject: Re: [PATCH] ext4: improve smp scalability for inode generation
Date: Wed, 18 Oct 2017 21:08:21 +0300
Message-ID: <87376gpbvu.fsf@openvz.org>
In-Reply-To: <8760bcpdc8.fsf@openvz.org>
Dmitry Monakhov <dmonakhov@openvz.org> writes:
> ->s_next_generation is protected by s_next_gen_lock, but its usage
> pattern is very primitive and can be replaced with atomic ops.
>
> This significantly improves the creation/unlink scenario on SMP systems;
> for example, the lat_fs_create_unlink test [1] on a 2x E5-2680 (32 vcpu)
> system shows ~20% improvement.
> | nr_tsk | wo/ patch | w/ patch |
> |--------+-----------+----------|
> |      1 |       137 |      140 |
> |      2 |       224 |      233 |
> |      4 |       356 |      372 |
> |      8 |       439 |      519 |
> |     16 |       443 |      585 |
> |     32 |       598 |      695 |
> |     64 |       559 |      707 |
> |    128 |       385 |      437 |
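
For readers following along, the lock-to-atomic conversion described in the quoted text can be sketched in user space with C11 atomics. This is only an analogy: `struct sb_info` and `next_generation()` are hypothetical names, and the kernel patch itself would use the kernel's own atomic primitives rather than `<stdatomic.h>`.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-superblock state; not the kernel struct. */
struct sb_info {
	_Atomic uint32_t next_generation;
};

/*
 * Hand out the current generation number and advance the counter in a
 * single atomic step. This replaces a lock / read / increment / unlock
 * sequence, so concurrent callers never serialize on a spinlock.
 */
static inline uint32_t next_generation(struct sb_info *sbi)
{
	return atomic_fetch_add_explicit(&sbi->next_generation, 1,
					 memory_order_relaxed);
}
```

Relaxed ordering is assumed to be enough here because the counter only has to hand out distinct values; nothing else is published through it.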
FYI, with lazytime enabled lat_fs_create_unlink is ~16x slower.
The reason is quite obvious: ext4_update_other_inodes_time() increases
lock contention on inode_hash_lock by a factor of (4k/256) = 16, i.e. one
lookup for every inode that shares the block (4k block / 256-byte inode).
->ext4_do_update_inode
  ->ext4_update_other_inodes_time
    for (i = 0; i < inodes_per_block; i++, ino++, buf += inode_size)
      ->find_inode_nowait
        ->spin_lock(&inode_hash_lock) -> 16x contention increase
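
The 16x figure follows directly from the loop bound in the trace. A minimal sketch of the arithmetic, taking the numbers in this message at face value (`lookups_per_inode_update()` is a hypothetical helper, and the real loop may skip some inodes):

```c
#include <stddef.h>

/*
 * Number of find_inode_nowait() calls, and hence inode_hash_lock
 * acquisitions, triggered by one inode update when every inode in the
 * same block is visited once.
 */
static inline size_t lookups_per_inode_update(size_t block_size,
					       size_t inode_size)
{
	size_t inodes_per_block = block_size / inode_size;

	return inodes_per_block;
}
```

With a 4096-byte block and 256-byte inodes this gives 16, which matches the slowdown reported above.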
inode_hash_lock is a known problem. I have patches to convert inode_hash_table
to per-bucket locks, similar to dentry_hash, but this requires massive changes
in various filesystems, so it will take a long time to get merged.
Currently lazytime amplifies the problem significantly. Maybe it is reasonable
to use spin_trylock inside find_inode_nowait to make it a truly lightweight hint?
[-- Attachment #2: lazytime_trylock.patch --]
[-- Type: text/x-patch, Size: 410 bytes --]
diff --git a/fs/inode.c b/fs/inode.c
index d1e35b5..a5b1cba1 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -1360,7 +1360,9 @@ struct inode *find_inode_nowait(struct super_block *sb,
 	struct inode *inode, *ret_inode = NULL;
 	int mval;
 
-	spin_lock(&inode_hash_lock);
+	if (!spin_trylock(&inode_hash_lock))
+		return NULL;
+
 	hlist_for_each_entry(inode, head, i_hash) {
 		if (inode->i_sb != sb)
 			continue;
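
The behaviour this change is after can be modelled in user space. The sketch below is an assumption-laden analogy, not kernel code: `struct entry`, `bucket`, `hash_trylock()`, `hash_unlock()` and `find_entry_nowait()` are hypothetical names, and a C11 `atomic_flag` stands in for the spinlock.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a hashed inode; not the kernel struct. */
struct entry {
	unsigned long ino;
	struct entry *next;
};

static atomic_flag hash_lock = ATOMIC_FLAG_INIT;	/* models inode_hash_lock */
static struct entry *bucket;				/* single hash chain */

/* Try to take the lock once; never spin. Returns true on success. */
static bool hash_trylock(void)
{
	return !atomic_flag_test_and_set_explicit(&hash_lock,
						  memory_order_acquire);
}

static void hash_unlock(void)
{
	atomic_flag_clear_explicit(&hash_lock, memory_order_release);
}

/*
 * Best-effort lookup: returns NULL both when the entry is absent and
 * when the lock is currently held by someone else.
 */
static struct entry *find_entry_nowait(unsigned long ino)
{
	struct entry *e, *found = NULL;

	if (!hash_trylock())
		return NULL;	/* contended: give up instead of waiting */

	for (e = bucket; e; e = e->next) {
		if (e->ino == ino) {
			found = e;
			break;
		}
	}
	hash_unlock();
	return found;
}
```

The trade-off is that a caller can no longer tell "not in the hash" from "lock was busy". That is presumably acceptable only if every caller treats the lookup as an optional optimization, which is the question being asked above.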