public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* Cache invalidation bug in NFS v3 - trivially reproducible
@ 2005-10-11  9:09 Leif Nixon
  2005-10-11  9:22 ` Jakob Oestergaard
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Leif Nixon @ 2005-10-11  9:09 UTC (permalink / raw)
  To: linux-kernel

Hi,

We have come across a bug where a NFS v3 client fails to invalidate
its data cache for a file even though it realizes that the file
attributes have changed. We have been able to recreate the bug on a
range of kernel versions and different underlying file systems.

Here's a minimal way to reproduce the error (there seems to be some
timing issues involved, but this has worked at least 90% of the time):

  NFS client n1                NFS client n2

  $ echo 1 > f
			       $ cat f
			       1
  $ touch .
  $ echo 2 > f
			       $ touch f
			       $ cat f
			       1

Now client n2 is stuck in a state where it uses its old cached data
forever (or at least for several hours):

  NFS client n1                NFS client n2

  $ cat f
  2
			       $ cat f
			       1

However, "stat f" gives the same output on both clients. "touch f" on
either machine corrects the situation; n2 invalidates its data cache.

Interestingly, the second write to the file ("echo 2 > f") must
not change the size of the file. If you do "echo foo > f" instead,
the erroneous behaviour isn't triggered.

We have seen this on a range of kernels between 2.6.9 and 2.6.13.2 on
Debian, CentOS, RHEL, Fedora and vanilla kernel.org, on both clients
and server. We have *not* been able to reproduce the bug with Linux
clients and a Solaris server, neither with Solaris clients and a Linux
server. Underlying file systems have been ext3 and xfs (and Solaris
ufs). We have tried varying mount options, but to no avail; the bug
persists, even with "noac".


Hypothesis:

When n2 does "touch f" and wants to do SETATTR, it first has to do a
LOOKUP (because n1 has updated the attributes on cwd with "touch .").
It seems that when n2 receives the updated attributes for f as a part
of the LOOKUP reply, it updates its attribute cache without
invalidating its data cache, leading to the anomalous situation.

If the "touch ." is omitted, n2 receives the updated file attributes
via an explicit GETATTR on f, and then everything works properly.

-- 
Leif Nixon                       -            Systems expert
------------------------------------------------------------
National Supercomputer Centre    -      Linkoping University
------------------------------------------------------------

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2005-10-14 13:34 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2005-10-11  9:09 Cache invalidation bug in NFS v3 - trivially reproducible Leif Nixon
2005-10-11  9:22 ` Jakob Oestergaard
2005-10-11 14:47 ` Trond Myklebust
2005-10-12  4:31   ` Willy Tarreau
2005-10-13 22:34 ` Trond Myklebust
2005-10-14 13:34   ` Leif Nixon

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox