linux-fsdevel.vger.kernel.org archive mirror
From: Al Viro <viro@ZenIV.linux.org.uk>
To: Jaegeuk Kim <jaegeuk@kernel.org>
Cc: linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	linux-f2fs-devel@lists.sourceforge.net
Subject: Re: [PATCH v2] vfs: introduce UMOUNT_WAIT which waits for umount completion
Date: Wed, 20 Sep 2017 19:38:25 +0100
Message-ID: <20170920183825.GD32076@ZenIV.linux.org.uk>
In-Reply-To: <20170920173831.GA7151@jaegeuk-macbookpro.roam.corp.google.com>

On Wed, Sep 20, 2017 at 10:38:31AM -0700, Jaegeuk Kim wrote:
> This patch introduces UMOUNT_WAIT flag for umount(2) which let user wait for
> umount(2) to complete filesystem shutdown. This should fix a kernel panic
> triggered when a living filesystem tries to access dead block device after
> device_shutdown done by kernel_restart as below.

NAK.  This is just papering over the race you've got; it does not fix it.
You count upon the kernel threads in question having already gotten past
scheduling delayed fput, but what's there to guarantee that?  You are
essentially adding a "flush all pending fput that had already been
scheduled" syscall.  It
	a) doesn't belong in umount(2) and
	b) doesn't fix the race.
It might change the timing enough to have your specific reproducer survive,
but that kind of approach is simply wrong.

Incidentally, the name is a misnomer - it does *NOT* wait for completion of
fs shutdown.  Proof: have a filesystem mounted in two namespaces and issue
that thing in one of them.  Then observe how it's still alive, well and
accessible in another.

The only case that gets affected by it is when another mount is heading for
shutdown and is in a very specific part of that.  That is waited for.
If it's just before *OR* just past that stage, you are fucked.

And yes, "just past" is also affected.  Look:
CPU1: delayed_fput()
        struct llist_node *node = llist_del_all(&delayed_fput_list);
delayed_fput_list is empty now
        llist_for_each_entry_safe(f, t, node, f_u.fu_llist)
                __fput(f);
CPU2: your umount UMOUNT_WAIT
	flush_delayed_fput()
		does nothing, the list is empty
	....
	flush_scheduled_work()
		waits for delayed_fput() to finish
CPU1:
	finish __fput()
	call mntput() from it
	schedule_delayed_work(&delayed_mntput_work, 1);
CPU2:
	OK, everything scheduled prior to the call of flush_scheduled_work() has
	completed, we are done.
	return from umount(2)
	(in bogus userland code) tell it to shut devices down
...
oops, that delayed_mntput_work we'd scheduled there got to run.  Too bad...
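
To make the ordering problem above concrete, here is a minimal single-threaded sketch in userland C. It is not kernel code: the toy_* helpers, the fixed-size queue and simulate_race() are all hypothetical names invented for illustration, and the only thing the sketch assumes is the property the trace relies on, namely that a flush waits for work queued before the flush was called and not for work that those items queue while running.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_WORK 8

typedef void (*work_fn)(void);

/* Toy FIFO standing in for a workqueue (hypothetical, not a kernel API). */
static work_fn queue[MAX_WORK];
static size_t head, tail;

static int devices_alive = 1;   /* cleared when "userland" shuts devices down */
static int touched_dead_device; /* set if work runs after the shutdown */

static void toy_schedule(work_fn fn)
{
	assert(tail < MAX_WORK);
	queue[tail++] = fn;
}

/* Assumed flush semantics: wait only for items queued before this call. */
static void toy_flush(void)
{
	size_t barrier = tail;

	while (head < barrier)
		queue[head++]();
}

/* Models the worker getting around to whatever is still pending. */
static void toy_run_pending(void)
{
	while (head < tail)
		queue[head++]();
}

/* Stand-in for the delayed mntput work: it needs the device to be alive. */
static void toy_delayed_mntput(void)
{
	if (!devices_alive)
		touched_dead_device = 1;
}

/* Stand-in for delayed_fput(): finishing __fput() schedules more work. */
static void toy_delayed_fput(void)
{
	toy_schedule(toy_delayed_mntput);
}

/* Replays the trace; returns 1 if work ran against a dead device. */
int simulate_race(void)
{
	head = tail = 0;
	devices_alive = 1;
	touched_dead_device = 0;

	toy_schedule(toy_delayed_fput); /* CPU1: delayed_fput() is in flight */
	toy_flush();                    /* CPU2: umount with UMOUNT_WAIT */
	devices_alive = 0;              /* CPU2: userland shuts devices down */
	toy_run_pending();              /* CPU1: the late mntput work runs */
	return touched_dead_device;
}
```

Calling simulate_race() returns 1: the flush returns once the fput item is done, but the mntput item that it queued is past the barrier and runs only after the devices are gone. Flushing a second time would hide the problem in this toy, but only because nothing here queues a third level of work, which is the sense in which the flag changes timing without closing the race.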

