From: Trond Myklebust <trondmy@kernel.org>
To: Benjamin Coddington <bcodding@redhat.com>,
    Anna Schumaker <anna@kernel.org>
Cc: linux-nfs@vger.kernel.org
Subject: Re: [PATCH] NFSv4: Ensure we revalidate data after OPEN expired
Date: Tue, 14 Feb 2023 15:30:10 -0500
Message-ID: <a44dfa9367526593f1be28dad281e2b6d50aaa2e.camel@kernel.org>
In-Reply-To: <7e97897a29878a56236ef8e15bce7a295d5e8a41.1676403514.git.bcodding@redhat.com>

On Tue, 2023-02-14 at 14:39 -0500, Benjamin Coddington wrote:
> We've observed that if the NFS client experiences a network partition and
> the server revokes the client's state, the client may not revalidate cached
> data for an open file during recovery. If the file is extended by a second
> client during this network partition, the first client will correctly
> update the file's size and attributes during recovery, but another
> extending write will discard the second client's data.

I'm having trouble fully understanding your problem description. Is the
issue that both clients are opening the same file with something like
O_WRONLY|O_DSYNC|O_APPEND?

If so, what if the network partition happens during the write() system
call of client 1, so that the page cache is updated but the flush of
the write data ends up being delayed by the partition?

In that case, client 2 doesn't know that client 1 has writes
outstanding, so it may write its data to where the server thinks the EOF
offset is. However, once client 1 is able to recover its open state, it
will still have dirty page cache data that is going to overwrite that
same offset.
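
To make that sequence concrete, here is a toy Python model of it. This is
only a sketch and not kernel code: Server, Client, append_buffered() and
flush() are hypothetical names, the page cache is reduced to a list of
(offset, payload) ranges, and the partition is modelled purely by the order
in which the two clients flush.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    """Toy stand-in for the file contents held by the NFS server."""
    data: bytearray = field(default_factory=bytearray)

    def write(self, offset: int, payload: bytes) -> None:
        end = offset + len(payload)
        if end > len(self.data):
            # Extend the file with zeroes up to the end of the write.
            self.data.extend(b"\0" * (end - len(self.data)))
        self.data[offset:end] = payload


@dataclass
class Client:
    """Toy client that buffers writes as (offset, payload) dirty ranges."""
    dirty: list = field(default_factory=list)

    def append_buffered(self, known_eof: int, payload: bytes) -> None:
        # The append offset is resolved against the EOF the client knows
        # about at write() time, then the data sits in the cache.
        self.dirty.append((known_eof, payload))

    def flush(self, server: Server) -> None:
        for offset, payload in self.dirty:
            server.write(offset, payload)
        self.dirty.clear()


def simulate() -> bytes:
    server = Server(bytearray(b"base"))
    client1, client2 = Client(), Client()

    # Client 1's write() dirties its cache at offset 4; the partition
    # then delays the flush.
    client1.append_buffered(len(server.data), b"AAAA")

    # During the partition, client 2 appends at the server's EOF, which
    # is still 4 because client 1's data never arrived.
    client2.append_buffered(len(server.data), b"BBBB")
    client2.flush(server)

    # Client 1 recovers its open state and flushes the stale dirty range
    # to the same offset.
    client1.flush(server)
    return bytes(server.data)
```

In this model, the data from client 2 is gone after client 1 flushes,
regardless of whether client 1 refreshed the file size during recovery,
because the offset of the dirty range was fixed before the partition.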
>
> In the case where another client opened the file during the network
> partition and the server revoked the first client's state, the recovery can
> forego optimizations and instead attempt to avoid corruption.
>
> It's a little tricky to solve this in a per-file way during recovery
> without plumbing arguments or introducing new flags. This patch side-steps
> the per-file complexity by simply checking if the client is within a
> NOGRACE recovery window, and if so, invalidates data during the open
> recovery.
>
I don't see how this scenario can ever be made fully safe. If people
care, then we should probably have the open recovery of client 1 fail
altogether in this case (subject to some switch similar to the existing
'recover_lost_locks' kernel option).
--
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@hammerspace.com