From: "J. Bruce Fields" <bfields@fieldses.org>
To: NeilBrown <neilb@suse.com>
Cc: Linux NFS Mailing <linux-nfs@vger.kernel.org>
Subject: Re: [PATCH] NFSDv4: use export cache flushtime for changeid on V4ROOT objects.
Date: Mon, 6 Feb 2017 17:28:25 -0500
Message-ID: <20170206222825.GD19704@fieldses.org>
In-Reply-To: <87k293vxj2.fsf@notabene.neil.brown.name>

On Tue, Feb 07, 2017 at 08:07:13AM +1100, NeilBrown wrote:
> On Tue, Jan 31 2017, J. Bruce Fields wrote:
>
> > On Tue, Jan 31, 2017 at 09:28:37AM +1100, NeilBrown wrote:
> >> On Mon, Jan 30 2017, J. Bruce Fields wrote:
> >>
> >> > On Mon, Jan 30, 2017 at 05:17:00PM +1100, NeilBrown wrote:
> >> >>
> >> >> If you change the set of filesystems that are exported, then
> >> >> the contents of various directories in the NFSv4 pseudo-root
> >> >> are likely to change. However the change-id of those
> >> >> directories is currently tied to the underlying directory,
> >> >> so the client may not see the changes in a timely fashion.
> >> >
> >> > Oh, good catch.
> >> >
> >> >> This patch changes the change-id number to be derived from the
> >> >> "flush_time" of the export cache. Whenever any changes are
> >> >> made to the set of exported filesystems, this flush_time is
> >> >> updated. The result is that clients see changes to the set
> >> >> of exported filesystems much more quickly, often immediately.
> >> >
> >> > And, a clever solution, as usual....
> >> >
> >> > I wonder if it's completely right yet, though. Off the top of my head:
> >> > can't the client see the new flush time before it sees the new contents?
> >> > If so, a client that caches both during that window could cache the old
> >> > contents indefinitely.
> >>
> >> uhm....
> >> Yes, it could see the new flush time before it sees the new contents.
> >> When it sees that new flush time (i.e. new change attribute), it will
> >> invalidate its cached contents and ask for the contents again.
> >
> > The problem comes if it's still possible for the client to read (and
> > cache) the old contents at this point, in which case the client's cache
> > will incorrectly associate old contents with new change attribute.
>
> I agree with this.
>
> >
> >> It will then definitely get new contents.
> >
> > So the problem with changing change attribute before contents is:
> >
> > - client retrieves old contents and new attribute, caches.
> > - client revalidates cache at an arbitrarily later time, sees
> > it's still the new attribute, continues caching old contents.
> >
> > So usually I believe you want the two changes--contents and change
> > attribute--to be atomic or, if that's not possible, for them to be
> > changed in that order.
>
> I believe that setting ->flush_time atomically effects both changes.
>
> >
> > I haven't thought through how that applies to this case, but I think it
> > should be possible if in-progress rpc's hold references to objects in
> > the flushed cache?
>
> How would it do that?
> In NFSv4 'READDIR' and 'GETATTR' are separate operations.
> If the client sends READDIR and then GETATTR, it must not assume that
> the change number in the GETATTR reply implies anything about the
> READDIR reply.
> But it (presumably) sends them in the other order, so if GETATTR gets a
> new change number, then when nfsd4_encode_dirent_fattr() calls
> nfsd_crossmnt() it will find the changes to the exports table, though it
> may need to wait for an upcall to complete.
>
> You are right to be cautious, but I think ->flush_time effectively
> provides the needed atomicity.
Yeah, I just hadn't thought it through. So long as the only "content"
we care about is readdir/lookup results, and so long as those always
require nfsd_crossmnt() and a new cache lookup, then I agree this works.
Thanks!
--b.
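
The ordering concern raised earlier in the thread (old contents cached
under a new change attribute) can be illustrated with a toy model. This
is only a sketch under stated assumptions: the dictionaries, the
run() function and the revalidate() helper are hypothetical, and none of
it is nfsd or NFS client code. The client is assumed to refetch contents
only when the change attribute differs from the one it cached.

```python
def run(order):
    """Apply two server-side updates in `order`, with a client
    revalidation between them and another one afterwards.

    Toy model only: `server` and `client` are hypothetical stand-ins,
    not real NFS structures.
    """
    server = {"contents": ("old",), "change": 1}
    client = {"contents": ("old",), "change": 1}  # warm cache

    def revalidate():
        change = server["change"]                    # models GETATTR
        if change != client["change"]:
            client["contents"] = server["contents"]  # models READDIR
            client["change"] = change

    steps = {
        "change": lambda: server.update(change=2),
        "contents": lambda: server.update(contents=("new",)),
    }

    steps[order[0]]()
    revalidate()   # client looks during the window between updates
    steps[order[1]]()
    revalidate()   # client looks again, arbitrarily later
    return client["contents"]


# Attribute first, contents second: stale contents stay cached.
stale = run(("change", "contents"))
# Contents first, attribute second: the later revalidation refetches.
fresh = run(("contents", "change"))
```

In this model the first ordering leaves the client holding the old
contents under the new change attribute, while the second does not. An
update that makes both visible at once has no window between the two
steps, so the hazard cannot arise.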
Thread overview: 6+ messages
2017-01-30 6:17 [PATCH] NFSDv4: use export cache flushtime for changeid on V4ROOT objects NeilBrown
2017-01-30 15:35 ` J. Bruce Fields
2017-01-30 22:28 ` NeilBrown
2017-01-31 14:38 ` J. Bruce Fields
2017-02-06 21:07 ` NeilBrown
2017-02-06 22:28 ` J. Bruce Fields [this message]