Linux NFS development
 help / color / mirror / Atom feed
From: Jeff Layton <jlayton@redhat.com>
To: "Myklebust, Trond" <Trond.Myklebust@netapp.com>
Cc: "jiali@redhat.com" <jiali@redhat.com>,
	"linux-nfs@vger.kernel.org" <linux-nfs@vger.kernel.org>
Subject: Re: [PATCH] nfs: don't extend writes to cover entire page if pagecache is invalid
Date: Mon, 10 Dec 2012 11:05:47 -0500	[thread overview]
Message-ID: <20121210110547.2eade853@corrin.poochiereds.net> (raw)
In-Reply-To: <4FA345DA4F4AE44899BD2B03EEEC2FA91191BACE@SACEXCMBX04-PRD.hq.netapp.com>

On Mon, 10 Dec 2012 14:42:23 +0000
"Myklebust, Trond" <Trond.Myklebust@netapp.com> wrote:

> On Mon, 2012-12-10 at 09:25 -0500, Jeff Layton wrote:
> > Jian reported that the following sequence would leave "testfile" with
> > corrupt data:
> > 
> >     # mount localhost:/export /mnt/nfs/ -o vers=3
> >     # echo abc > /mnt/nfs/testfile; echo def >> /export/testfile; echo ghi >> /mnt/nfs/testfile
> >     # cat -v /export/testfile
> >     abc
> >     ^@^@^@^@ghi
> > 
> > While there's no locking involved here, the operations are serialized,
> > so CTO should prevent corruption.
> > 
> > The first write to the file is fine and writes 4 bytes. The file is then
> > extended on the server. When it's reopened a GETATTR is issued and the
> > size change is noticed. This causes NFS_INO_INVALID_DATA to be set on
> > the file. Because the file is opened for write only,
> > nfs_want_read_modify_write() returns 0 to nfs_write_begin().
> > nfs_updatepage then calls nfs_write_pageuptodate() to see if it should
> > extend the nfs_page to cover the whole page. NFS_INO_INVALID_DATA is
> > still set on the file at that point, but that flag is ignored and
> > nfs_pageuptodate erroneously extends the write to cover the whole page,
> > with the write done on the server side filled in with zeroes.
> > 
> > This patch just has that function check for NFS_INO_INVALID_DATA in
> > addition to NFS_INO_REVAL_PAGECACHE. This fixes the bug, but looking
> > over the code, I wonder if we might have a similar bug in
> > nfs_revalidate_size(). The difference between those two flags is very
> > subtle, so it seems like we ought to be checking for
> > NFS_INO_INVALID_DATA in most of the places that we look for
> > NFS_INO_REVAL_PAGECACHE.
> > 
> > I believe this is regression introduced by commit 8d197a568. The code
> > did check for NFS_INO_INVALID_DATA prior to that patch.
> > 
> > Original bug report is here:
> > 
> >     https://bugzilla.redhat.com/show_bug.cgi?id=885743
> 
> Hi Jeff,
> 
> The point of NFS_INO_REVAL_PAGECACHE is to be able to distinguish
> between situations where the file size or change attribute is in doubt
> (and so the page cache validity is in doubt), and situations where other
> attributes are in doubt (such as ACCESS caches, file owner, nlinks,...).
> This is why we're only checking for NFS_INO_REVAL_PAGECACHE and not
> NFS_INO_INVALID_DATA in the functions you mention above.
> 

Ok, now I'm really confused... I thought that:

NFS_INO_INVALID_DATA == pagecache data was known to be wrong
NFS_INO_REVAL_PAGECACHE == pagecache data is questionable

...but if I understand what you're saying above, the "data" that
NFS_INO_INVALID_DATA refers to is in peripheral caches, like the ACCESS
cache? I guess then I don't quite understand what NFS_INO_INVALID_ATTR
and NFS_INO_INVALID_ACCESS are for. Aren't they supposed to indicate that
the attrs and access caches are invalid?

> So the fix here should be to set NFS_INO_REVAL_PAGECACHE when the change
> attribute and/or size change is noticed. The only function I can see
> that appears to get this wrong is nfs_wcc_update_inode(). Does fixing
> that lay the bug to rest?
> 

No, that doesn't fix it. There's another place that misses setting that
flag too in nfs_update_inode() where the size changes. I added it there
too and that also didn't fix it. Here's why (I think):

When the size change is noticed via the CTO GETATTR call, then my
patched kernel now sets NFS_INO_REVAL_PAGECACHE in nfs_update_inode.
Now cache_validity is set to 0x2a. The kernel then makes an ACCESS call
that also ultimately ends up in nfs_update_inode.

nfs_update_inode saves off the cache_validity flags and clears all of
them except for NFS_INO_INVALID_DATA. As best I can tell, the
saved_cache_validity is then ending up discarded and on exit the
cache_validity is ending up set to just NFS_INO_INVALID_DATA.

It seems like we ought to be restoring the save_cache_validity flags
before exiting that function, but I confess that I don't quite grasp
the logic in nfs_update_inode. Thoughts?

-- 
Jeff Layton <jlayton@redhat.com>

  reply	other threads:[~2012-12-10 16:05 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-12-10 14:25 [PATCH] nfs: don't extend writes to cover entire page if pagecache is invalid Jeff Layton
2012-12-10 14:42 ` Myklebust, Trond
2012-12-10 16:05   ` Jeff Layton [this message]
2012-12-10 16:28     ` Myklebust, Trond
2012-12-10 16:53       ` Jeff Layton
2012-12-10 17:02         ` Myklebust, Trond
2012-12-10 17:20           ` Jeff Layton
2012-12-10 18:00             ` Myklebust, Trond
2012-12-10 18:08               ` Jeff Layton
2012-12-10 18:55                 ` Myklebust, Trond
2012-12-10 18:28               ` Jeff Layton
2012-12-10 19:07                 ` Myklebust, Trond

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20121210110547.2eade853@corrin.poochiereds.net \
    --to=jlayton@redhat.com \
    --cc=Trond.Myklebust@netapp.com \
    --cc=jiali@redhat.com \
    --cc=linux-nfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox