linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: NeilBrown <neilb@suse.com>
To: Jeff Layton <jlayton@redhat.com>, Al Viro <viro@zeniv.linux.org.uk>
Cc: Michal Koutny <mkoutny@suse.com>
Cc: Linux NFS Mailing List <linux-nfs@vger.kernel.org>,
	linux-fsdevel@vger.kernel.org,
	lkml <linux-kernel@vger.kernel.org>
Subject: Do we really need d_weak_revalidate???
Date: Fri, 11 Aug 2017 14:31:10 +1000	[thread overview]
Message-ID: <87bmnmrai9.fsf@notabene.neil.brown.name> (raw)

[-- Attachment #1: Type: text/plain, Size: 2113 bytes --]


Funny story.  4.5 years ago we discarded the FS_REVAL_DOT superblock
flag and introduced the d_weak_revalidate dentry operation instead.
We duly removed the flag from NFS superblocks and NFSv4 superblocks,
and added the new dentry operation to NFS dentries .... but not to NFSv4
dentries.

And nobody noticed.

Until today.

A customer reports a situation where mount(....,MS_REMOUNT,..) on an NFS
filesystem hangs because the network has been deconfigured.  This makes
perfect sense and I suggested a code change to fix the problem.
However when a colleague was trying to reproduce the problem to validate
the fix, he couldn't.  Then nor could I.

The problem is trivially reproducible with NFSv3, and not at all with
NFSv4.  The reason is the missing d_weak_revalidate.

We could simply add d_weak_revalidate for NFSv4, but given that it
has been missing for 4.5 years, and the only time anyone noticed was
when the ommission resulted in a better user experience, I do wonder if
we need to.  Can we just discard d_weak_revalidate?  What purpose does
it serve?  I couldn't find one.

Thanks,
NeilBrown

For reference, see
Commit: ecf3d1f1aa74 ("vfs: kill FS_REVAL_DOT by adding a d_weak_revalidate dentry op")



To reproduce the problem at home, on a system that uses systemd:
1/ place (or find) a filesystem image in a file on an NFS filesystem.
2/ mount the nfs filesystem with "noac" - choose v3 or v4
3/ loop-mount the filesystem image read-only somewhere
4/ reboot

If you choose v4, the reboot will succeed, possibly after a 90second timeout.
If you choose v3, the reboot will hang indefinitely in systemd-shutdown while
remounting the nfs filesystem read-only.

If you don't use "noac" it can still hang, but only if something slows
down the reboot enough that attributes have timed out by the time that
systemd-shutdown runs.  This happens for our customer.

If the loop-mounted filesystem is not read-only, you get other problems.

We really want systemd to figure out that the loop-mount needs to be
unmounted first.  I have ideas concerning that, but it is messy.  But
that isn't the only bug here.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 832 bytes --]

             reply	other threads:[~2017-08-11  4:31 UTC|newest]

Thread overview: 34+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-08-11  4:31 NeilBrown [this message]
2017-08-11  5:55 ` Do we really need d_weak_revalidate??? Trond Myklebust
2017-08-11 11:01   ` Jeff Layton
2017-08-13 23:36     ` NeilBrown
2017-08-14 10:10       ` Jeff Layton
2017-08-16  2:43         ` NeilBrown
2017-08-16 11:34           ` Jeff Layton
2017-08-16 23:47             ` NeilBrown
2017-08-17  2:20             ` Ian Kent
2017-08-18  5:24               ` NeilBrown
2017-08-18  6:47                 ` Ian Kent
2017-08-18  6:55                   ` Ian Kent
2017-08-21  6:23                   ` NeilBrown
2017-08-21  6:32                     ` Ian Kent
2017-08-21  7:46                       ` NeilBrown
2017-08-23  1:06                       ` NeilBrown
2017-08-23  2:32                         ` Ian Kent
2017-08-23  2:40                           ` Ian Kent
2017-08-23  2:54                             ` Ian Kent
2017-08-23  7:51                               ` Ian Kent
2017-08-24  3:21                             ` NeilBrown
2017-08-24  4:35                               ` Ian Kent
2017-08-24  4:07                           ` NeilBrown
2017-08-24  4:47                             ` Ian Kent
2017-08-24  4:58                             ` Ian Kent
2017-08-24 11:03                             ` Michael Kerrisk (man-pages)
2017-08-25  0:05                               ` Ian Kent
2017-08-25  5:32                               ` [PATCH manpages] stat.2: correct AT_NO_AUTOMOUNT text and general revisions NeilBrown
2017-09-14 13:38                                 ` Michael Kerrisk (man-pages)
2017-09-14 22:25                                   ` NeilBrown
2017-09-16 13:11                                     ` Michael Kerrisk (man-pages)
2017-09-08 15:15                             ` Do we really need d_weak_revalidate??? David Howells
2017-08-13 23:29   ` NeilBrown
2017-08-24  6:34     ` NeilBrown

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87bmnmrai9.fsf@notabene.neil.brown.name \
    --to=neilb@suse.com \
    --cc=jlayton@redhat.com \
    --cc=mkoutny@suse.com \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).