From: malahal naineni <malahal@us.ibm.com>
To: NeilBrown <neilb@suse.de>
Cc: linux-nfs@vger.kernel.org
Subject: Re: NFSv3/v2: Fix data corruption with NFS short reads.
Date: Wed, 10 Apr 2013 22:55:06 -0500 [thread overview]
Message-ID: <20130411035506.GA32459@us.ibm.com> (raw)
In-Reply-To: <20130411112121.74d996c5@notabene.brown>
NeilBrown [neilb@suse.de] wrote:
> On Fri, 29 Mar 2013 19:57:06 -0000 malahal naineni <malahal@us.ibm.com> wrote:
>
> > This bug seems to be present in v2.6.37 or lower versions. The code was
> > re-organized in v2.6.38 that eliminated the bug. Current upstream code
> > doesn't have this bug. This may be applicable to some longterm releases!
> >
> > Here are the bug details:
> >
> > 1. nfs_read_rpcsetup(), args.count and res.count are both set to the
> > actual number of bytes to be read. Let us assume that the request is
> > for 16K, so arg.count = res.count = 16K
> > 2. nfs3_xdr_readres() conditionally sets res.count to to the actual
> > number of bytes read. This condition is true for the first response
> > as res.count was set to args.count before the first request. Let us
> > say the server returned only 4K bytes. res.count=4K
> > 3. Another read request is sent for the remaining data. Note that
> > res.count is NOT updated. It is still set to the actual amount of
> > bytes we got in the first response. The client will send a READ
> > request for the remaining 12K.
>
> This is looks like a real bug, but I think the "NOT" above is the best thing
> to fix.
No doubt, a real bug! Easily reproduced with an instrumented nfsd.
> i.e. when another read request is set, res.count *SHOULD*BE* updated. That
> makes it consistent with the original send, and consistency is good!
I thought about it, but the resp->count is NOT really used as far as I
know. And the current upstream code unconditionally sets the resp->count
in xdr function. So I chose the upstream method! I agree, your patch is
more consistent with the existing code.
> Index: linux-2.6.32-SLE11-SP1-LTSS/fs/nfs/read.c
> ===================================================================
> --- linux-2.6.32-SLE11-SP1-LTSS.orig/fs/nfs/read.c 2013-03-20 16:24:31.426605189 +1100
> +++ linux-2.6.32-SLE11-SP1-LTSS/fs/nfs/read.c 2013-04-11 11:19:57.670724540 +1000
> @@ -368,6 +368,7 @@ static void nfs_readpage_retry(struct rp
> argp->offset += resp->count;
> argp->pgbase += resp->count;
> argp->count -= resp->count;
> + resp->count = argp->count;
> nfs4_restart_rpc(task, NFS_SERVER(data->inode)->nfs_client);
> return;
> out:
This patch should fix the bug as well.
> This would old affect clients with servers which would sometimes
> return partial reads, and I don't think the Linux NFS server does.
> What server have you seen this against?
We came across under Ganesha development, and I was told the same thing
that Linux NFS server doesn't do this. I didn't bother to post then, but
now we saw the same thing with linux NFS server, so I decided to post it
now. Although linux NFS server doesn't in itself create short reads but
it just calls the underlying back-end file system and sends without a
short read "check" to NFS clients. In other words, the short read
behaviour actually depends on the back-end file system rather than linux
NFS server. We saw this bug with our GPFS file system in combination
with HSM -- request for reading data on tape would fail until it is
brought back to disk.
Regards, Malahal.
prev parent reply other threads:[~2013-04-11 3:55 UTC|newest]
Thread overview: 3+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-03-29 19:57 [PATCH] NFSv3/v2: Fix data corruption with NFS short reads Malahal Naineni
2013-04-11 1:21 ` NeilBrown
2013-04-11 3:55 ` malahal naineni [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20130411035506.GA32459@us.ibm.com \
--to=malahal@us.ibm.com \
--cc=linux-nfs@vger.kernel.org \
--cc=neilb@suse.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.