public inbox for linux-ext4@vger.kernel.org
From: Dmitry Monakhov <dmonakhov@openvz.org>
To: linux-ext4@vger.kernel.org
Cc: tytso@mit.edu
Subject: Re: [PATCH] ext4: improve smp scalability for inode generation
Date: Wed, 18 Oct 2017 21:08:21 +0300
Message-ID: <87376gpbvu.fsf@openvz.org>
In-Reply-To: <8760bcpdc8.fsf@openvz.org>

[-- Attachment #1: Type: text/plain, Size: 1489 bytes --]


Dmitry Monakhov <dmonakhov@openvz.org> writes:

> ->s_next_generation is protected by s_next_gen_lock, but its usage
> pattern is very primitive and can be replaced with atomic ops.
>
> This significantly improves the create/unlink scenario on SMP systems;
> for example, the lat_fs_create_unlink test [1] on a 2x E5-2680 (32 vCPU)
> system shows ~20% improvement.
> | nr_tsk | wo/ patch | w/ patch |
> |--------+-----------+----------|
> |      1 |       137 |      140 |
> |      2 |       224 |      233 |
> |      4 |       356 |      372 |
> |      8 |       439 |      519 |
> |     16 |       443 |      585 |
> |     32 |       598 |      695 |
> |     64 |       559 |      707 |
> |    128 |       385 |      437 |
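
For readers without the original patch at hand, here is a minimal userspace
sketch of the idea quoted above: a counter that is only ever read and
incremented does not need a lock around it. This is an illustration in C11
atomics, not the kernel change itself; next_generation and new_generation()
are hypothetical names standing in for ->s_next_generation and its users.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for ->s_next_generation. */
static _Atomic uint32_t next_generation;

/*
 * Return the current generation and advance the counter in one atomic
 * read-modify-write step, so no separate lock is needed. Relaxed ordering
 * is assumed to be enough because callers only need distinct values.
 */
static uint32_t new_generation(void)
{
	return atomic_fetch_add_explicit(&next_generation, 1,
					 memory_order_relaxed);
}
```

Two consecutive calls from one thread return consecutive values; concurrent
callers each get a distinct value without serializing on a spinlock.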

FYI, with lazytime enabled lat_fs_create_unlink is ~16x slower.
The reason is fairly obvious: ext4_update_other_inodes_time() increases
lock contention on inode_hash_lock by a factor of (4k/256) = 16, i.e. one
lookup per inode in a 4k block of 256-byte inodes.

->ext4_do_update_inode
  ->ext4_update_other_inodes_time
    for (i = 0; i < inodes_per_block; i++, ino++, buf += inode_size)
      ->find_inode_nowait
        ->spin_lock(&inode_hash_lock) -> 16x contention increase
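
To make the amplification concrete, here is a small userspace sketch that
follows the loop as written in the call chain above and counts how many
times a global lock would be taken per inode update. lookup_stub() and
hash_lock_acquisitions are hypothetical; the real function may skip some
entries, so treat the count as an upper bound.

```c
#include <assert.h>

/* Hypothetical counter: how often the global hash lock would be taken. */
static unsigned long hash_lock_acquisitions;

/* Stand-in for find_inode_nowait(): one lock acquisition per call. */
static void lookup_stub(unsigned long ino)
{
	(void)ino;
	hash_lock_acquisitions++;
}

/* Walk every inode slot in one inode table block, as the loop above does. */
static unsigned int scan_block(unsigned long first_ino,
			       unsigned int block_size,
			       unsigned int inode_size)
{
	unsigned int inodes_per_block;
	unsigned int i;

	assert(inode_size != 0 && block_size % inode_size == 0);
	inodes_per_block = block_size / inode_size;

	for (i = 0; i < inodes_per_block; i++)
		lookup_stub(first_ino + i);

	return inodes_per_block;
}
```

With a 4096-byte block and 256-byte inodes, scan_block() visits 16 slots,
which matches the 16x figure.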

inode_hash_lock is a known problem. I have patches that convert
inode_hash_table to per-bucket locks, similar to dentry_hash, but this
requires massive changes in various filesystems, so it will take a long
time to get merged.
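
For illustration only, a per-bucket locking scheme looks roughly like the
userspace sketch below. This is not the patch set mentioned above; the
table size, struct names and helpers are all hypothetical, and C11
atomic_flag stands in for a kernel spinlock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define NR_BUCKETS 64	/* hypothetical table size */

struct entry {
	unsigned long ino;
	struct entry *next;
};

struct bucket {
	atomic_flag lock;	/* one small spinlock per bucket */
	struct entry *head;
};

static struct bucket table[NR_BUCKETS];

static void bucket_lock(struct bucket *b)
{
	while (atomic_flag_test_and_set_explicit(&b->lock,
						 memory_order_acquire))
		;	/* spin until the previous holder clears the flag */
}

static void bucket_unlock(struct bucket *b)
{
	atomic_flag_clear_explicit(&b->lock, memory_order_release);
}

static void table_init(void)
{
	for (size_t i = 0; i < NR_BUCKETS; i++) {
		atomic_flag_clear(&table[i].lock);
		table[i].head = NULL;
	}
}

static void table_insert(struct entry *e)
{
	struct bucket *b;

	assert(e != NULL);
	b = &table[e->ino % NR_BUCKETS];

	bucket_lock(b);
	e->next = b->head;
	b->head = e;
	bucket_unlock(b);
}

/* Only the bucket that can hold @ino is locked, not the whole table. */
static struct entry *table_lookup(unsigned long ino)
{
	struct bucket *b = &table[ino % NR_BUCKETS];
	struct entry *e;

	bucket_lock(b);
	for (e = b->head; e != NULL; e = e->next)
		if (e->ino == ino)
			break;
	bucket_unlock(b);
	return e;
}
```

Lookups that hash to different buckets no longer contend with each other,
which is the property the global inode_hash_lock lacks.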

Currently lazytime amplifies the problem significantly. Maybe it would be
reasonable to use spin_trylock inside find_inode_nowait to make it a truly
lightweight hint?
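
The semantics of a trylock-based lookup can be sketched in userspace as
follows. This is a toy model, assuming callers treat a NULL result as "no
answer right now"; find_nowait(), lock_try() and lock_release() are
hypothetical names, and C11 atomic_flag stands in for the kernel spinlock.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct item {
	unsigned long key;
	struct item *next;
};

static atomic_flag hash_lock = ATOMIC_FLAG_INIT;	/* single global lock */
static struct item *chain;				/* one hash chain */

/* Take the lock only if it is free; never spin. */
static bool lock_try(void)
{
	return !atomic_flag_test_and_set_explicit(&hash_lock,
						  memory_order_acquire);
}

static void lock_release(void)
{
	atomic_flag_clear_explicit(&hash_lock, memory_order_release);
}

/*
 * Best-effort lookup: NULL means either "not found" or "lock was busy".
 * Callers must be able to tolerate a missed hit.
 */
static struct item *find_nowait(unsigned long key)
{
	struct item *it;

	if (!lock_try())
		return NULL;	/* contended: give up instead of waiting */

	for (it = chain; it != NULL; it = it->next)
		if (it->key == key)
			break;

	lock_release();
	return it;
}
```

The trade-off is that a caller can no longer tell a miss from contention,
which is acceptable only if the lookup is purely an optimization.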


[-- Attachment #2: lazytime_trylock.patch --]
[-- Type: text/x-patch, Size: 410 bytes --]

diff --git a/fs/inode.c b/fs/inode.c
index d1e35b5..a5b1cba1 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -1360,7 +1360,9 @@ struct inode *find_inode_nowait(struct super_block *sb,
 	struct inode *inode, *ret_inode = NULL;
 	int mval;
 
-	spin_lock(&inode_hash_lock);
+	if (!spin_trylock(&inode_hash_lock))
+		return NULL;
+
 	hlist_for_each_entry(inode, head, i_hash) {
 		if (inode->i_sb != sb)
 			continue;

