From: malahal naineni <malahal@us.ibm.com>
To: NeilBrown <neilb@suse.de>
Cc: linux-nfs@vger.kernel.org
Subject: Re: NFSv3/v2: Fix data corruption with NFS short reads.
Date: Wed, 10 Apr 2013 22:55:06 -0500 [thread overview]
Message-ID: <20130411035506.GA32459@us.ibm.com> (raw)
In-Reply-To: <20130411112121.74d996c5@notabene.brown>
NeilBrown [neilb@suse.de] wrote:
> On Fri, 29 Mar 2013 19:57:06 -0000 malahal naineni <malahal@us.ibm.com> wrote:
>
> > This bug seems to be present in v2.6.37 or lower versions. The code was
> > re-organized in v2.6.38 that eliminated the bug. Current upstream code
> > doesn't have this bug. This may be applicable to some longterm releases!
> >
> > Here are the bug details:
> >
> > 1. nfs_read_rpcsetup(), args.count and res.count are both set to the
> > actual number of bytes to be read. Let us assume that the request is
> > for 16K, so arg.count = res.count = 16K
> > 2. nfs3_xdr_readres() conditionally sets res.count to to the actual
> > number of bytes read. This condition is true for the first response
> > as res.count was set to args.count before the first request. Let us
> > say the server returned only 4K bytes. res.count=4K
> > 3. Another read request is sent for the remaining data. Note that
> > res.count is NOT updated. It is still set to the actual amount of
> > bytes we got in the first response. The client will send a READ
> > request for the remaining 12K.
>
> This is looks like a real bug, but I think the "NOT" above is the best thing
> to fix.
No doubt, a real bug! Easily reproduced with an instrumented nfsd.
> i.e. when another read request is set, res.count *SHOULD*BE* updated. That
> makes it consistent with the original send, and consistency is good!
I thought about it, but the resp->count is NOT really used as far as I
know. And the current upstream code unconditionally sets the resp->count
in xdr function. So I chose the upstream method! I agree, your patch is
more consistent with the existing code.
> Index: linux-2.6.32-SLE11-SP1-LTSS/fs/nfs/read.c
> ===================================================================
> --- linux-2.6.32-SLE11-SP1-LTSS.orig/fs/nfs/read.c 2013-03-20 16:24:31.426605189 +1100
> +++ linux-2.6.32-SLE11-SP1-LTSS/fs/nfs/read.c 2013-04-11 11:19:57.670724540 +1000
> @@ -368,6 +368,7 @@ static void nfs_readpage_retry(struct rp
> argp->offset += resp->count;
> argp->pgbase += resp->count;
> argp->count -= resp->count;
> + resp->count = argp->count;
> nfs4_restart_rpc(task, NFS_SERVER(data->inode)->nfs_client);
> return;
> out:
This patch should fix the bug as well.
> This would old affect clients with servers which would sometimes
> return partial reads, and I don't think the Linux NFS server does.
> What server have you seen this against?
We came across under Ganesha development, and I was told the same thing
that Linux NFS server doesn't do this. I didn't bother to post then, but
now we saw the same thing with linux NFS server, so I decided to post it
now. Although linux NFS server doesn't in itself create short reads but
it just calls the underlying back-end file system and sends without a
short read "check" to NFS clients. In other words, the short read
behaviour actually depends on the back-end file system rather than linux
NFS server. We saw this bug with our GPFS file system in combination
with HSM -- request for reading data on tape would fail until it is
brought back to disk.
Regards, Malahal.
prev parent reply other threads:[~2013-04-11 3:55 UTC|newest]
Thread overview: 3+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-03-29 19:57 [PATCH] NFSv3/v2: Fix data corruption with NFS short reads Malahal Naineni
2013-04-11 1:21 ` NeilBrown
2013-04-11 3:55 ` malahal naineni [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20130411035506.GA32459@us.ibm.com \
--to=malahal@us.ibm.com \
--cc=linux-nfs@vger.kernel.org \
--cc=neilb@suse.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox