From: Theodore Ts'o <tytso@mit.edu>
To: Lucas Nussbaum <lucas.nussbaum@loria.fr>
Cc: linux-ext4@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	Emmanuel Jeanvoine <emmanuel.jeanvoine@inria.fr>
Subject: [PATCH, RFC] fs: only call sync_filesystem() when remounting read-only
Date: Sat, 8 Mar 2014 11:08:18 -0500
Message-ID: <20140308160818.GC11633@thunk.org>
In-Reply-To: <20140305141343.GA26225@xanadu.blop.info>

On Wed, Mar 05, 2014 at 03:13:43PM +0100, Lucas Nussbaum wrote:
> TL;DR: we experience long temporary hangs when doing multiple mount -o
> remount at the same time as other I/O on an ext4 filesystem.
>
> When starting hundreds of LXC containers simultaneously on a system, the
> boot of some containers was hanging. We tracked this down to an
> initscript's use of mount -o remount, which was hanging in D state.
>
> We reproduced the problem outside of LXC, with the script available at
> [0]. That script initiates 1000 mount -o remount, and performs some
> writes using a big cp to the same filesystem during the remounts....

+linux-fsdevel since the patch modifies fs/super.c

Lucas, can you try this patch?  I'm pretty sure this is what's going
on.  It turns out each "mount -o remount" is implying a syncfs(), so
your test case is identical to copying a large file while having
thousands of processes calling syncfs() on the file system, with the
predictable results.
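
To make the comparison concrete, here is a minimal user-space sketch of
what each such process is effectively doing.  It assumes Linux with
glibc, where syncfs(2) is declared in <unistd.h> under _GNU_SOURCE.
The helper names sync_fs_of_path() and hammer_syncfs() are made up for
this illustration; they are not part of the patch or of the reproducer
script.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Flush the whole file system that contains `path`, roughly what an
 * unpatched "mount -o remount" ends up doing as a side effect.
 * Returns 0 on success, -1 with errno set on failure.
 */
static int sync_fs_of_path(const char *path)
{
	int fd, ret, saved_errno;

	assert(path != NULL);
	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1;

	ret = syncfs(fd);	/* waits for all dirty data on that fs */
	saved_errno = errno;
	close(fd);
	errno = saved_errno;
	return ret;
}

/*
 * Hypothetical helper: call sync_fs_of_path() n times in a row and
 * return how many calls succeeded.
 */
static int hammer_syncfs(const char *path, int n)
{
	int i, ok = 0;

	for (i = 0; i < n; i++)
		if (sync_fs_of_path(path) == 0)
			ok++;
	return ok;
}
```

Running something like hammer_syncfs("/mnt/test", 1000) from many
processes while a large cp writes to the same file system should show
the same kind of contention; "/mnt/test" is only a placeholder path.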

Folks on linux-fsdevel, any objections if I carry this patch in the
ext4 tree?  I don't think it should cause problems for other file
systems, since any file system that tries to rely on the implied
syncfs() is going to be subject to races, but it might make such a
race condition bug much more visible...

						- Ted

commit 8862c3c69acc205b59b00baed67e50446e2fd093
Author: Theodore Ts'o <tytso@mit.edu>
Date:   Sat Mar 8 11:05:35 2014 -0500

    fs: only call sync_filesystem() when remounting read-only

    Currently "mount -o remount" always implies a syncfs() on the file
    system.  This can cause a problem if a workload calls "mount -o
    remount" many, many times while concurrent I/O is happening:

       http://article.gmane.org/gmane.comp.file-systems.ext4/42876

    Whether or not it would ever be sane for a workload to call "mount
    -o remount" gazillions of times when they are effectively no-ops,
    it seems stupid for a remount to imply a syncfs().

    It's possible that there is some file system which is relying on the
    implied syncfs(), but that's arguably broken, since aside from the
    remount read-only case, there's nothing that will prevent other writes
    from sneaking in between the sync_filesystem() and the call to
    sb->s_op->remount_fs().

    Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

diff --git a/fs/super.c b/fs/super.c
index 80d5cf2..0fc87ac 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -717,10 +717,9 @@ int do_remount_sb(struct super_block *sb, int flags, void *data, int force)
 			if (retval)
 				return retval;
 		}
+		sync_filesystem(sb);
 	}
-	sync_filesystem(sb);
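
The behavioural change can be sketched with a small user-space model.
This is not kernel code: struct model_sb, MODEL_RDONLY and the
model_*() functions are hypothetical stand-ins invented for the
illustration, and the sketch assumes (as the subject line suggests)
that the block the call moves into is the read-write to read-only
remount branch of do_remount_sb().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the read-only mount flag. */
#define MODEL_RDONLY 1

/* Hypothetical stand-in for struct super_block. */
struct model_sb {
	int flags;	/* current mount flags */
	int sync_calls;	/* number of simulated sync_filesystem() calls */
};

/* Counts calls instead of doing any I/O. */
static void model_sync_filesystem(struct model_sb *sb)
{
	sb->sync_calls++;
}

/* Before the patch: every remount syncs, even a no-op one. */
static void model_remount_old(struct model_sb *sb, int new_flags)
{
	model_sync_filesystem(sb);
	sb->flags = new_flags;
}

/* After the patch: sync only on a read-write -> read-only transition. */
static void model_remount_new(struct model_sb *sb, int new_flags)
{
	bool remount_ro = (new_flags & MODEL_RDONLY) &&
			  !(sb->flags & MODEL_RDONLY);

	if (remount_ro)
		model_sync_filesystem(sb);
	sb->flags = new_flags;
}
```

In this model, 1000 no-op remounts cost 1000 syncs with the old logic
and none with the new one, while a remount to read-only still syncs
exactly once.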

Thread overview: 24+ messages
2014-03-05 14:13 Extremely slow remounts with concurrent I/O Lucas Nussbaum
2014-03-06 13:56 ` [PATCH, RFC] jbd2: don't write non-commit blocks synchronously Theodore Ts'o
2014-03-06 17:28 ` Lucas Nussbaum
2014-03-06 18:27 ` Theodore Ts'o
2014-03-06 18:45 ` Lucas Nussbaum
2014-03-06 18:37 ` Lucas Nussbaum
2014-03-08 16:08 ` Theodore Ts'o [this message]
2014-03-10 11:45 ` [PATCH, RFC] fs: only call sync_filesystem() when remounting read-only Lucas Nussbaum
2014-03-10 14:41 ` Theodore Ts'o
2014-03-10 12:15 ` Lucas Nussbaum
2014-03-13 0:36 ` Dave Chinner
2014-03-13 1:16 ` Theodore Ts'o
2014-03-13 3:14 ` Theodore Ts'o
2014-03-13 6:04 ` Dave Chinner
2014-03-13 12:55 ` Theodore Ts'o
2014-03-13 7:39 ` Christoph Hellwig
2014-03-13 14:20 ` [PATCH] fs: push sync_filesystem() down to the file system's remount_fs() Theodore Ts'o
[not found] ` <1394720456-16629-1-git-send-email-tytso-3s7WtUTddSA@public.gmane.org>
2014-03-13 16:23 ` Jan Kara
2014-03-13 16:28 ` Steven Whitehouse
2014-03-13 23:15 ` [Cluster-devel] " Theodore Ts'o
[not found] ` <20140313231506.GB16785-AKGzg7BKzIDYtjvyW6yDsg@public.gmane.org>
2014-03-14 12:13 ` Jan Kara
2014-03-14 0:33 ` Steve French
2014-03-14 1:23 ` Theodore Ts'o
2014-03-13 7:19 ` Extremely slow remounts with concurrent I/O Dave Chinner