From: NeilBrown <neilb@suse.com>
To: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Steve Dickson <SteveD@redhat.com>,
Linux NFS Mailing list <linux-nfs@vger.kernel.org>
Subject: Re: [PATCH 3/8] mountd: remove 'dev_missing' checks
Date: Fri, 19 Aug 2016 11:28:30 +1000
Message-ID: <8737m1im2p.fsf@notabene.neil.brown.name>
In-Reply-To: <20160818135754.GA21470@fieldses.org>
On Thu, Aug 18 2016, J. Bruce Fields wrote:
> Not really arguing--I'll trust your judgement--just some random ideas:
>
> On Thu, Aug 18, 2016 at 11:32:52AM +1000, NeilBrown wrote:
>> On Wed, Aug 17 2016, J. Bruce Fields wrote:
>> > In which case what it really wants to say is "before nfs mounts" (or
>> > even "before nfs mounts of localhost"; and vice versa on shutdown). I
>> > can't tell if there's an easy way to say that.
>>
>> I'd be happy with a difficult/complex way, if it was reliable.
>> Could we write a systemd generator which parses /etc/fstab, determines
>> all mount points which are loop-back NFS mounts (or even just any NFS
>> mounts) and creates a drop-in for nfs-server which adds
>> Before=mount-point.mount
>> for each /mount/point.
>>
>> Could that be reliable? I might try.
>
> Digging around... we've also got this callout from mount to start-statd,
> can we use something like that to make loopback nfs mounts wait on nfs
> server startup?
An nfs mount already waits for the server to start up. The ordering
dependency between NFS mounts and the nfs-server only really matters at
shutdown, and we cannot enhance mount.nfs to wait for a negative amount
of time (also known as "time travel").
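
Since systemd stops units in the reverse of their start order, a
"Before=" on nfs-server is what gets the unmount to happen before the
server stops. A rough sketch of the generator idea quoted above, in
Python purely for illustration: the function names are hypothetical,
the unit-name escaping is a simplified approximation of what
"systemd-escape --path" does, and fstab octal escapes such as \040 are
not decoded.

```python
import re

# Assumption: only these fstab types are treated as NFS mounts.
NFS_TYPES = {"nfs", "nfs4"}


def mount_unit_name(path):
    """Approximate the systemd mount unit name for a mount point.

    Simplified: collapses repeated slashes, maps "/" to "-", and
    hex-escapes anything outside [A-Za-z0-9:_.] (and a leading ".").
    """
    trimmed = re.sub(r"/+", "/", path).strip("/")
    if not trimmed:
        return "-.mount"  # the root file system
    out = []
    for i, ch in enumerate(trimmed):
        if ch == "/":
            out.append("-")
        elif (ch.isascii() and ch.isalnum()) or ch in ":_" or (ch == "." and i > 0):
            out.append(ch)
        else:
            out.extend("\\x%02x" % b for b in ch.encode("utf-8"))
    return "".join(out) + ".mount"


def nfs_mount_points(fstab_text):
    """Return the mount points of all NFS entries in fstab-formatted text."""
    points = []
    for line in fstab_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) >= 3 and fields[2] in NFS_TYPES:
            points.append(fields[1])
    return points


def drop_in(fstab_text):
    """Build drop-in text ordering nfs-server before every NFS mount."""
    units = [mount_unit_name(p) for p in nfs_mount_points(fstab_text)]
    if not units:
        return ""
    return "[Unit]\n" + "".join("Before=%s\n" % u for u in units)
```

A real generator would write this text to a drop-in file under an
nfs-server.service.d/ directory inside the output directory systemd
passes as its first argument, rather than returning a string.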
>
>> > Is that the only risk, though? Maybe so--presumably you've killed any
>> > users, so any write data associated with opens should be flushed. And
>> > if you do a sync after that you take care of write delegations too.
>>
>> In the easily reproducible case, all user processes are gone.
>> It would be worth checking what happens if processes are accessing a
>> filesystem from an unreachable server at shutdown.
>> "kill -9" should get rid of them all now, so it might be OK.
>> "sync" would hang though. I'd be happy for that to cause a delay of a
>> minute or so, but hopefully systemd would (or could be told to) kill -9
>> a sync if it took too long.
>
> We shouldn't have to resort to that in the loopback nfs case, where we
> control ordering. So in that case, I'm just pointing out that:
>
> kill -9 all users of the filesystem
> shutdown nfs server
> umount nfs filesystems
>
> isn't the right ordering, because in the presence of write delegations
> there could still be writeback data.
Yes, that does make a good case for getting the ordering right, rather
than just getting the shutdown-sequence not to block. Thanks,
>
> (OK, actually, knfsd doesn't currently implement write delegations--but
> we shouldn't depend on that assumption.)
>
> Adding a sync between the first two steps might help, though the write
> delegations themselves could still linger, and I don't know how the
> client will behave when it finds it can't return them.
>
> So it'd be nice if we could just order the umount before the server
> shutdown.
>
> The case of a remote server shut down too early is different of course.
>
>> > Looking at rpcbind(8).... Shouldn't "-w" prevent this by loading some
>> > registrations before it starts responding to requests?
>>
>> "-w" (which isn't listed in the SYNOPSIS!) only applies to a warm-start
>> where the daemons which previously registered are still running.
>> The problem case is that the daemons haven't registered yet (so we don't
>> necessarily know what port number they will get).
>
> We probably know the port in the specific case of nfsd, and could fake
> up rpcbind's state file if necessary. Eh, your idea's not as bad:
>
>> To address the issue in rpcbind, we would need a flag to say "don't
>> respond to lookup requests, just accept registrations", then when all
>> registrations are complete, send some message to rpcbind to say "OK,
>> respond to lookups now". That could even be done by killing and
>> restarting with "-w", though that is a bit ugly.
>>
>> I'm leaning towards having mount retry after RPC_PROGNOTREGISTERED for
>> fg like it does with bg.
>
> Anyway, sounds OK to me.
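
To make the retry idea concrete, here is a minimal sketch of the
control flow only, in Python for illustration. It is not the mount.nfs
code: ProgNotRegistered, mount_with_retry and try_mount are
hypothetical names, and the exception merely stands in for an
RPC_PROGNOTREGISTERED reply from rpcbind.

```python
import time


class ProgNotRegistered(Exception):
    """Stand-in for an RPC_PROGNOTREGISTERED reply (hypothetical)."""


def mount_with_retry(try_mount, timeout_s, delay_s=1.0,
                     clock=time.monotonic, sleep=time.sleep):
    """Call try_mount() until it succeeds or timeout_s has elapsed.

    Only "program not registered" is treated as temporary; any other
    exception propagates immediately as a permanent failure.
    """
    deadline = clock() + timeout_s
    while True:
        try:
            return try_mount()
        except ProgNotRegistered:
            if clock() >= deadline:
                raise  # give up: the server never registered in time
            sleep(delay_s)
```

The clock and sleep parameters exist only so the loop can be exercised
without real delays.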
Thanks,
NeilBrown