From: Bruce Fields <bfields@fieldses.org>
To: Stan Hu <stanhu@gmail.com>
Cc: linux-nfs@vger.kernel.org
Subject: Re: Stale data after file is renamed while another process has an open file handle
Date: Wed, 19 Sep 2018 16:02:14 -0400
Message-ID: <20180919200214.GB14422@fieldses.org>
In-Reply-To: <CAMBWrQmLDwhESALcH2kamrRr4VhRsgVP0kAMeiE2J0uubcfNNw@mail.gmail.com>

On Wed, Sep 19, 2018 at 10:39:19AM -0700, Stan Hu wrote:
> On Tue, Sep 18, 2018 at 11:19 AM J. Bruce Fields <bfields@fieldses.org> wrote:
> 
> > We know node B has that cat loop that will keep reopening the file.
> >
> > The only way node B could avoid translating those open syscalls into
> > on-the-wire OPENs is if the client holds a delegation.
> >
> > But it can't hold a delegation on the file that was newly renamed to
> > test.txt--delegations are revoked on rename, and it would need to do
> > another OPEN after the rename to get a new delegation.  Similarly the
> > file that gets renamed over should have its delegation revoked--and we
> > can see that the client does return that delegation.  The OPEN here is
> > actually part of that delegation return process--the CLAIM_DELEGATE_CUR
> > value on "claim type" is telling the server that this is an open that
> > the client had cached locally under the delegation it is about to
> > return.
> >
> > Looks like a client bug to me, possibly some sort of race handling the
> > delegation return and the new open.
> >
> > It might help if it were possible to confirm that this is still
> > reproducible on the latest upstream kernel.
> 
> Thanks for that information. I did more testing, and it looks like
> this stale file problem only appears to happen when the NFS client
> protocol is 4.0 (via the vers=4.0 mount option). 4.1 doesn't appear to
> have the problem.
> 
> I've also confirmed this problem happens on the mainline kernel
> version (4.19.0-rc4). Do you have any idea why 4.1 would be working
> but 4.0 has this bug?

No.  I mean, the 4.1/4.0 differences are complicated, so it's not too
surprising a bug could hit one and not the other, but I don't have an
explanation for this one off the top of my head.

> https://s3.amazonaws.com/gitlab-support/nfs/nfs-4.0-kernel-4.19-0-rc4-rename.pcap
> is the latest capture that also includes the NFS callbacks. Here's
> what I see after the first RENAME from Node A:
> 
> Node B: DELEGRETURN StateId: 0xa93
> NFS server: DELEGRETURN
> Node A: RENAME From: test2.txt To: test.txt
> NFS server: RENAME
> Node B: GETATTR
> NFS Server: GETATTR (with old inode)
> Node B: READ StateId: 0xa93
> NFS Server: READ

Presumably the GETATTR and READ use a filehandle for the old file (the
one that was renamed over)?

That's what's weird, and indicates a possible client bug.  It should be
doing a new OPEN("test.txt").

Also, READ shouldn't be using the stateid that was returned in
DELEGRETURN.  And the server should reject any attempt to use that
stateid.  I wonder if you misread the stateids--may be worth taking a
closer look to see if they're really bit-for-bit identical.  (They're
128 bits, so that 0xa93 is either a hash or just some subset of the
stateid.)
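
A minimal sketch of what a bit-for-bit comparison could look like,
assuming the full 16-byte stateids have been copied out of the capture
as hex strings. An NFSv4 stateid is a 4-byte seqid followed by 12
opaque bytes; the function names and the sample values below are
hypothetical and are not taken from the pcap.

```python
import struct


def parse_stateid(hex_str):
    """Split a 16-byte NFSv4 stateid given as hex into (seqid, other)."""
    raw = bytes.fromhex(hex_str)
    if len(raw) != 16:
        raise ValueError("a stateid is 16 bytes, got %d" % len(raw))
    # XDR encodes unsigned integers big-endian.
    (seqid,) = struct.unpack(">I", raw[:4])
    return seqid, raw[4:]


def compare_stateids(a_hex, b_hex):
    """Report whether two stateids match exactly or only in 'other'."""
    a_seq, a_other = parse_stateid(a_hex)
    b_seq, b_other = parse_stateid(b_hex)
    return {
        "identical": (a_seq, a_other) == (b_seq, b_other),
        "same_other": a_other == b_other,
        "seqid_delta": b_seq - a_seq,
    }


# Hypothetical values, for illustration only.
returned = "00000001" + "aa" * 12
used_in_read = "00000002" + "aa" * 12
result = compare_stateids(returned, used_in_read)
```

Two stateids that differ anywhere in these 16 bytes are not the same
stateid, even if a shortened label in a trace summary shows the same
value for both.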

(Apologies, I haven't gotten a chance to look at it myself.)

> In comparison, if I don't have a process with an open file to
> test.txt, things work and the trace looks like:
> 
> Node B: DELEGRETURN StateId: 0xa93
> NFS server: DELEGRETURN
> Node A: RENAME From: test2.txt To: test.txt
> NFS server: RENAME
> Node B: OPEN test.txt
> NFS Server: OPEN StateID: 0xa93
> Node B: CLOSE StateID: 0xa93
> NFS Server: CLOSE
> Node B: OPEN test.txt
> NFS Server: OPEN StateId: 0xa93
> Node B: READ StateID: 0xa93
> NFS Server: READ
> 
> In the first case, since the client reused the StateId that it should
> have released in DELEGRETURN, does this suggest that perhaps the
> client isn't properly releasing that delegation? How might the open
> file affect this behavior? Any pointers to where things might be going
> awry in the code base would be appreciated here.

I'd expect the first trace to look more like this one, with new OPENs
and CLOSEs after the rename.
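
The difference between the two traces can be checked mechanically. The
following is only a sketch: it assumes each trace has been transcribed
by hand into (sender, operation, stateid label) tuples, and it treats
the short 0xa93 label as if it identified the stateid, which it may
not. A hit is therefore a hint to look at the full stateids, not proof
of reuse. The function name is hypothetical.

```python
def find_stale_uses(events):
    """Flag operations that use a stateid label after DELEGRETURN
    without an intervening OPEN that hands the label out again."""
    returned = set()
    stale = []
    for index, (sender, op, stateid) in enumerate(events):
        if stateid is None:
            continue
        if op == "DELEGRETURN":
            returned.add(stateid)
        elif op == "OPEN":
            # An OPEN reply carrying the label counts as a fresh grant.
            returned.discard(stateid)
        elif stateid in returned:
            stale.append((index, sender, op, stateid))
    return stale


# First trace (process holding test.txt open), transcribed from above.
failing = [
    ("Node B", "DELEGRETURN", "0xa93"),
    ("NFS server", "DELEGRETURN", None),
    ("Node A", "RENAME", None),
    ("NFS server", "RENAME", None),
    ("Node B", "GETATTR", None),
    ("NFS server", "GETATTR", None),
    ("Node B", "READ", "0xa93"),
    ("NFS server", "READ", None),
]

# Second trace (no process holding the file open).
working = [
    ("Node B", "DELEGRETURN", "0xa93"),
    ("NFS server", "DELEGRETURN", None),
    ("Node A", "RENAME", None),
    ("NFS server", "RENAME", None),
    ("Node B", "OPEN", None),
    ("NFS server", "OPEN", "0xa93"),
    ("Node B", "CLOSE", "0xa93"),
    ("NFS server", "CLOSE", None),
    ("Node B", "OPEN", None),
    ("NFS server", "OPEN", "0xa93"),
    ("Node B", "READ", "0xa93"),
    ("NFS server", "READ", None),
]

stale_in_failing = find_stale_uses(failing)
stale_in_working = find_stale_uses(working)
```

With these inputs only the READ in the first trace is flagged, since no
OPEN appears between the DELEGRETURN and the READ.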

--b.

> 
> 
> 
> >
> > --b.
> >
> > >
> > > On Mon, Sep 17, 2018 at 3:16 PM Stan Hu <stanhu@gmail.com> wrote:
> > > >
> > > > Attached is the compressed pcap of port 2049 traffic. The file is
> > > > pretty large because the while loop generated a fair amount of
> > > > traffic.
> > > >
> > > > On Mon, Sep 17, 2018 at 3:01 PM J. Bruce Fields <bfields@fieldses.org> wrote:
> > > > >
> > > > > On Mon, Sep 17, 2018 at 02:37:16PM -0700, Stan Hu wrote:
> > > > > > On Mon, Sep 17, 2018 at 2:15 PM J. Bruce Fields <bfields@fieldses.org> wrote:
> > > > > >
> > > > > > > Sounds like a bug to me, but I'm not sure where.  What filesystem are
> > > > > > > you exporting?  How much time do you think passes between steps 1 and 4?
> > > > > > > (I *think* it's possible you could hit a bug caused by low ctime
> > > > > > > granularity if you could get from step 1 to step 4 in less than a
> > > > > > > millisecond.)
> > > > > >
> > > > > > For CentOS, I am exporting xfs. In Ubuntu, I think I was using ext4.
> > > > > >
> > > > > > Steps 1 through 4 are all done by hand, so I don't think we're hitting
> > > > > > a millisecond issue. Just for good measure, I've done experiments
> > > > > > where I waited a few minutes between steps 1 and 4.
> > > > > >
> > > > > > > Those kernel versions--are those the client (node A and B) versions, or
> > > > > > > the server versions?
> > > > > >
> > > > > > The client and server kernel versions are the same across the board. I
> > > > > > didn't mix and match kernels.
> > > > > >
> > > > > > > > Note that with an Isilon NFS server, instead of seeing stale content,
> > > > > > > > I see "Stale file handle" errors indefinitely unless I perform one of
> > > > > > > > the corrective steps.
> > > > > > >
> > > > > > > You see "stale file handle" errors from the "cat test1.txt"?  That's
> > > > > > > also weird.
> > > > > >
> > > > > > Yes, this is the problem I'm actually more concerned about, which led
> > > > > > to this investigation in the first place.
> > > > >
> > > > > It might be useful to look at the packets on the wire.  So, run
> > > > > something on the server like:
> > > > >
> > > > > >         tcpdump -w tmp.pcap -s 0 -i eth0
> > > > >
> > > > > (replace eth0 by the relevant interface), then run the test, then kill
> > > > > the tcpdump and take a look at tmp.pcap in wireshark, or send tmp.pcap
> > > > > to the list (as long as there's no sensitive info in there).
> > > > >
> > > > > What we'd be looking for:
> > > > >         - does the rename cause the directory's change attribute to
> > > > >           change?
> > > > >         - does the server give out a delegation, and, if so, does it
> > > > >           return it before allowing the rename?
> > > > >         - does the client do an open by filehandle or an open by name
> > > > >           after the rename?
> > > > >
> > > > > --b.
