public inbox for linux-nfs@vger.kernel.org
* Re: [PATCH] Short write in nfsd becomes a full write to the client
@ 2009-06-27  4:24 Dale Stimson
       [not found] ` <20090627042449.GC15665-NmLOIDrUSDQtrE7AZYN0JQC/G2K4zDHf@public.gmane.org>
  0 siblings, 1 reply; 7+ messages in thread
From: Dale Stimson @ 2009-06-27  4:24 UTC (permalink / raw)
  To: linux-nfs; +Cc: David Shaw, J. Bruce Fields

On Thu, 5 Mar 2009 20:16:14 -0500, David Shaw <dshaw-wh+mT2OhP0WF0gnf/s2wvA@public.gmane.org> wrote:
> If a filesystem being written to via NFS returns a short write count
> (as opposed to an error) to nfsd, nfsd treats that as a success for
> the entire write, rather than the short count that actually succeeded.
> 
> For example, given an 8192-byte write, if the underlying filesystem
> only writes 4096 bytes, nfsd will ack back to the nfs client that all
> 8192 bytes were written.  The nfs client does have retry logic for
> short writes, but this is never called as the client is told the
> complete write succeeded.
...
> Here is a patch to properly return the short write count to the
> client.
[patch elided]
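
(For readers unfamiliar with short-write handling, here is a minimal
sketch of the kind of retry logic the quoted text refers to.  It is
userspace Python, not the kernel NFS client code and not the elided
patch; the helper name write_all and its "write" parameter are
hypothetical, introduced only for illustration.  The point is that the
retry only happens if the writer is told the true, shorter count.)

```python
import os


def write_all(fd, data, write=os.write):
    """Write all of `data` to `fd`, retrying after short writes.

    `write` defaults to os.write, which returns the number of bytes
    actually written (possibly fewer than requested). The parameter is
    a hypothetical hook so a fake writer can be substituted in tests.
    """
    view = memoryview(data)
    total = 0
    while total < len(view):
        # Offer only the part that has not been accepted yet.
        written = write(fd, view[total:])
        if written == 0:
            # A zero count for a non-empty request means no progress;
            # stop instead of spinning forever.
            raise OSError("write made no progress")
        total += written
    return total
```

If the layer below reported 8192 bytes when only 4096 were written, a
loop like this would exit after one pass and the remaining data would
be silently lost, which is the situation the quoted patch description
addresses.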

I bring this to your attention so you may, if you choose, look into
this further:

Problem synopsis:
An old client (running RHL 9 with kernel "2.4.20-43.9.legacy")
attempts to seek on a file mounted over nfs.  The operation fails
with "Illegal seek" or "Input/Output error".  The server is running
Fedora 11 kernel-PAE-2.6.29.5-191.fc11.i686, which includes the
short write patch.  When this kernel is re-built without the short
write patch, everything works as before.

Details are at
https://bugzilla.redhat.com/show_bug.cgi?id=508174

Caveats: I am specifically referring to the patch file
"linux-2.6-nfsd-report-short-writes.patch" as newly included in
Fedora's file kernel-2.6.29.5-191.fc11.src.rpm .  I can't vouch
that that patch file is identical to what was posted to this list
or merged for 2.6.30-rc1.

(As an aside, in this case, the client was attempting a simple gcc
compile and link.  The failing programs (invoked by gcc) were the
assembler ("as") and "ld".)


* Re: [PATCH] Short write in nfsd becomes a full write to the client
       [not found] ` <20090627042449.GC15665-NmLOIDrUSDQtrE7AZYN0JQC/G2K4zDHf@public.gmane.org>
@ 2009-06-29 14:59   ` J. Bruce Fields
  2009-06-29 19:29     ` Dale Stimson
  0 siblings, 1 reply; 7+ messages in thread
From: J. Bruce Fields @ 2009-06-29 14:59 UTC (permalink / raw)
  To: Dale Stimson; +Cc: linux-nfs, David Shaw

On Fri, Jun 26, 2009 at 09:24:49PM -0700, Dale Stimson wrote:
> On Thu, 5 Mar 2009 20:16:14 -0500, David Shaw <dshaw-wh+mT2OhP0WF0gnf/s2wvA@public.gmane.org> wrote:
> > If a filesystem being written to via NFS returns a short write count
> > (as opposed to an error) to nfsd, nfsd treats that as a success for
> > the entire write, rather than the short count that actually succeeded.
> > 
> > For example, given an 8192-byte write, if the underlying filesystem
> > only writes 4096 bytes, nfsd will ack back to the nfs client that all
> > 8192 bytes were written.  The nfs client does have retry logic for
> > short writes, but this is never called as the client is told the
> > complete write succeeded.
> ...
> > Here is a patch to properly return the short write count to the
> > client.
> [patch elided]
> 
> I bring this to your attention so you may, if you choose, look into
> this further:
> 
> Problem synopsis:
> An old client (running RHL 9 with kernel "2.4.20-43.9.legacy")
> attempts to seek on a file mounted over nfs.  The operation fails
> with "Illegal seek" or "Input/Output error".  The server is running
> Fedora 11 kernel-PAE-2.6.29.5-191.fc11.i686, which includes the
> short write patch.  When this kernel is re-built without the short
> write patch, everything works as before.

Does that server kernel have some version of
a0d24b295aed7a9daf4ca36bd4784e4d40f82303 "nfsd: fix hung up of nfs
client while sync write data to nfs server" applied?

If not, would it be possible to get a network trace?  (Run "tcpdump -s0
-wTMP", then run the test case, then kill tcpdump and mail me a copy of
TMP.)
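
(A small sketch of the capture command above, for anyone scripting it.
The helper name tcpdump_argv and its optional parameters are
hypothetical.  The "-i" interface option and the "port 2049" filter are
additions that were not part of the request above, and 2049 is assumed
to be the NFS port in use.  Running the resulting command normally
needs root.)

```python
import shlex


def tcpdump_argv(outfile, interface=None, nfs_only=False):
    """Build an argument list equivalent to `tcpdump -s0 -w<outfile>`.

    -s 0 asks for full packets rather than truncated ones, and -w
    writes the raw capture to `outfile` for later analysis.
    """
    argv = ["tcpdump", "-s", "0", "-w", outfile]
    if interface:
        # Optional addition: capture on one interface only.
        argv += ["-i", interface]
    if nfs_only:
        # Optional addition: keep only traffic on the assumed NFS port.
        argv += ["port", "2049"]
    return argv


# Printable form of the basic command, suitable for pasting into a shell.
command_line = shlex.join(tcpdump_argv("TMP"))
```

The list can be handed to subprocess.Popen; stopping the process with
SIGINT corresponds to "kill tcpdump" above.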

--b.

> 
> Details are at
> https://bugzilla.redhat.com/show_bug.cgi?id=508174
> 
> Caveats: I am specifically referring to the patch file
> "linux-2.6-nfsd-report-short-writes.patch" as newly included in
> Fedora's file kernel-2.6.29.5-191.fc11.src.rpm .  I can't vouch
> that that patch file is identical to what was posted to this list
> or merged for 2.6.30-rc1.
> 
> (As an aside, in this case, the client was attempting a simple gcc
> compile and link.  The failing programs (invoked by gcc) were the
> assembler ("as") and "ld".)


* Re: [PATCH] Short write in nfsd becomes a full write to the client
  2009-06-29 14:59   ` J. Bruce Fields
@ 2009-06-29 19:29     ` Dale Stimson
       [not found]       ` <20090629192951.GA3851-NmLOIDrUSDQtrE7AZYN0JQC/G2K4zDHf@public.gmane.org>
  0 siblings, 1 reply; 7+ messages in thread
From: Dale Stimson @ 2009-06-29 19:29 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: linux-nfs, David Shaw

On Mon, Jun 29, 2009 at 10:59:52AM -0400, J. Bruce Fields wrote:
> On Fri, Jun 26, 2009 at 09:24:49PM -0700, Dale Stimson wrote:
> > On Thu, 5 Mar 2009 20:16:14 -0500, David Shaw <dshaw-wh+mT2OhP0WF0gnf/s2wvA@public.gmane.org> wrote:
> > > If a filesystem being written to via NFS returns a short write count
> > > (as opposed to an error) to nfsd, nfsd treats that as a success for
> > > the entire write, rather than the short count that actually succeeded.
> > > 
> > > For example, given an 8192-byte write, if the underlying filesystem
> > > only writes 4096 bytes, nfsd will ack back to the nfs client that all
> > > 8192 bytes were written.  The nfs client does have retry logic for
> > > short writes, but this is never called as the client is told the
> > > complete write succeeded.
> > ...
> > > Here is a patch to properly return the short write count to the
> > > client.
> > [patch elided]
> > 
> > I bring this to your attention so you may, if you choose, look into
> > this further:
> > 
> > Problem synopsis:
> > An old client (running RHL 9 with kernel "2.4.20-43.9.legacy")
> > attempts to seek on a file mounted over nfs.  The operation fails
> > with "Illegal seek" or "Input/Output error".  The server is running
> > Fedora 11 kernel-PAE-2.6.29.5-191.fc11.i686, which includes the
> > short write patch.  When this kernel is re-built without the short
> > write patch, everything works as before.
> 
> Does that server kernel have some version of
> a0d24b295aed7a9daf4ca36bd4784e4d40f82303 "nfsd: fix hung up of nfs
> client while sync write data to nfs server" applied?

No.  As far as I can see that patch has been merged only in 2.6.30 and
is not in any of the 2.6.29.? releases.  There is definitely no
Fedora-specific version of that patch present.  The only
Fedora-specific patch related to NFS applied in
kernel-2.6.29.5-191.fc11.src.rpm is
linux-2.6-nfsd-report-short-writes.patch.

> If not, would it be possible to get a network trace?  (Run "tcpdump -s0
> -wTMP", then run the test case, then kill tcpdump and mail me a copy of
> TMP.)
> 
> --b.

Capture done, to be emailed off-list.

> > Details are at
> > https://bugzilla.redhat.com/show_bug.cgi?id=508174
> > 
> > Caveats: I am specifically referring to the patch file
> > "linux-2.6-nfsd-report-short-writes.patch" as newly included in
> > Fedora's file kernel-2.6.29.5-191.fc11.src.rpm .  I can't vouch
> > that that patch file is identical to what was posted to this list
> > or merged for 2.6.30-rc1.
> > 
> > (As an aside, in this case, the client was attempting a simple gcc
> > compile and link.  The failing programs (invoked by gcc) were the
> > assembler ("as") and "ld".)


* Re: [PATCH] Short write in nfsd becomes a full write to the client
       [not found]       ` <20090629192951.GA3851-NmLOIDrUSDQtrE7AZYN0JQC/G2K4zDHf@public.gmane.org>
@ 2009-06-29 19:47         ` J. Bruce Fields
  2009-06-29 19:49           ` J. Bruce Fields
  0 siblings, 1 reply; 7+ messages in thread
From: J. Bruce Fields @ 2009-06-29 19:47 UTC (permalink / raw)
  To: Dale Stimson; +Cc: linux-nfs, David Shaw

On Mon, Jun 29, 2009 at 12:29:51PM -0700, Dale Stimson wrote:
> On Mon, Jun 29, 2009 at 10:59:52AM -0400, J. Bruce Fields wrote:
> > On Fri, Jun 26, 2009 at 09:24:49PM -0700, Dale Stimson wrote:
> > > On Thu, 5 Mar 2009 20:16:14 -0500, David Shaw <dshaw-wh+mT2OhP0WF0gnf/s2wvA@public.gmane.org> wrote:
> > > > If a filesystem being written to via NFS returns a short write count
> > > > (as opposed to an error) to nfsd, nfsd treats that as a success for
> > > > the entire write, rather than the short count that actually succeeded.
> > > > 
> > > > For example, given an 8192-byte write, if the underlying filesystem
> > > > only writes 4096 bytes, nfsd will ack back to the nfs client that all
> > > > 8192 bytes were written.  The nfs client does have retry logic for
> > > > short writes, but this is never called as the client is told the
> > > > complete write succeeded.
> > > ...
> > > > Here is a patch to properly return the short write count to the
> > > > client.
> > > [patch elided]
> > > 
> > > I bring this to your attention so you may, if you choose, look into
> > > this further:
> > > 
> > > Problem synopsis:
> > > An old client (running RHL 9 with kernel "2.4.20-43.9.legacy")
> > > attempts to seek on a file mounted over nfs.  The operation fails
> > > with "Illegal seek" or "Input/Output error".  The server is running
> > > Fedora 11 kernel-PAE-2.6.29.5-191.fc11.i686, which includes the
> > > short write patch.  When this kernel is re-built without the short
> > > write patch, everything works as before.
> > 
> > Does that server kernel have some version of
> > a0d24b295aed7a9daf4ca36bd4784e4d40f82303 "nfsd: fix hung up of nfs
> > client while sync write data to nfs server" applied?
> 
> No.  As far as I can see that patch has been merged only in 2.6.30 and
> is not in any of the 2.6.29.? releases.  There is definitely no
> Fedora-specific version of that patch present.  The only
> Fedora-specific patch related to NFS applied in
> kernel-2.6.29.5-191.fc11.src.rpm is
> linux-2.6-nfsd-report-short-writes.patch.

That patch fixes a problem with the short write patch, so it would be
worth retesting with it applied.

--b.


* Re: [PATCH] Short write in nfsd becomes a full write to the client
  2009-06-29 19:47         ` J. Bruce Fields
@ 2009-06-29 19:49           ` J. Bruce Fields
  2009-06-30 15:22             ` [solved] " Dale Stimson
  0 siblings, 1 reply; 7+ messages in thread
From: J. Bruce Fields @ 2009-06-29 19:49 UTC (permalink / raw)
  To: Dale Stimson; +Cc: linux-nfs, David Shaw

On Mon, Jun 29, 2009 at 03:47:11PM -0400, bfields wrote:
> On Mon, Jun 29, 2009 at 12:29:51PM -0700, Dale Stimson wrote:
> > On Mon, Jun 29, 2009 at 10:59:52AM -0400, J. Bruce Fields wrote:
> > > On Fri, Jun 26, 2009 at 09:24:49PM -0700, Dale Stimson wrote:
> > > > On Thu, 5 Mar 2009 20:16:14 -0500, David Shaw <dshaw-wh+mT2OhP0WF0gnf/s2wvA@public.gmane.org> wrote:
> > > > > If a filesystem being written to via NFS returns a short write count
> > > > > (as opposed to an error) to nfsd, nfsd treats that as a success for
> > > > > the entire write, rather than the short count that actually succeeded.
> > > > > 
> > > > > For example, given an 8192-byte write, if the underlying filesystem
> > > > > only writes 4096 bytes, nfsd will ack back to the nfs client that all
> > > > > 8192 bytes were written.  The nfs client does have retry logic for
> > > > > short writes, but this is never called as the client is told the
> > > > > complete write succeeded.
> > > > ...
> > > > > Here is a patch to properly return the short write count to the
> > > > > client.
> > > > [patch elided]
> > > > 
> > > > I bring this to your attention so you may, if you choose, look into
> > > > this further:
> > > > 
> > > > Problem synopsis:
> > > > An old client (running RHL 9 with kernel "2.4.20-43.9.legacy")
> > > > attempts to seek on a file mounted over nfs.  The operation fails
> > > > with "Illegal seek" or "Input/Output error".  The server is running
> > > > Fedora 11 kernel-PAE-2.6.29.5-191.fc11.i686, which includes the
> > > > short write patch.  When this kernel is re-built without the short
> > > > write patch, everything works as before.
> > > 
> > > Does that server kernel have some version of
> > > a0d24b295aed7a9daf4ca36bd4784e4d40f82303 "nfsd: fix hung up of nfs
> > > client while sync write data to nfs server" applied?
> > 
> > No.  As far as I can see that patch has been merged only in 2.6.30 and
> > is not in any of the 2.6.29.? releases.  There is definitely no
> > Fedora-specific version of that patch present.  The only
> > Fedora-specific patch related to NFS applied in
> > kernel-2.6.29.5-191.fc11.src.rpm is
> > linux-2.6-nfsd-report-short-writes.patch.
> 
> That patch fixes a problem with the short write patch, so it would be
> worth retesting with it applied.

After looking at the trace: yes, looks like the same problem.  (The
(successful) FILE_SYNC write request returned with a 0 length.)  So
retest with the new patch applied, then poke the Fedora people to apply
it as well.
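
(To make the symptom concrete, here is a rough sketch of the check one
can apply by eye to each WRITE reply in a trace.  It is illustrative
Python, not a packet parser and not nfsd or client code; WriteReply,
classify, and the category names are hypothetical.  How the RHL 9
client actually reacts to such a reply is not shown here.)

```python
from dataclasses import dataclass


@dataclass
class WriteReply:
    status_ok: bool   # server reported success for the WRITE
    requested: int    # bytes the client asked to write
    count: int        # bytes the server says it wrote


def classify(reply):
    """Label a WRITE reply by how its count compares to the request."""
    if not reply.status_ok:
        return "error"
    if reply.count < 0 or reply.count > reply.requested:
        return "invalid"
    if reply.requested > 0 and reply.count == 0:
        # Success with nothing written: the pattern seen in the trace.
        return "zero-count"
    if reply.count < reply.requested:
        return "short"
    return "full"
```

A genuine short write is something a client can retry from the new
offset, whereas a successful reply with a zero count gives it no
progress to build on.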

--b.


* [solved] Re: [PATCH] Short write in nfsd becomes a full write to the client
  2009-06-29 19:49           ` J. Bruce Fields
@ 2009-06-30 15:22             ` Dale Stimson
       [not found]               ` <20090630152209.GA3320-NmLOIDrUSDQtrE7AZYN0JQC/G2K4zDHf@public.gmane.org>
  0 siblings, 1 reply; 7+ messages in thread
From: Dale Stimson @ 2009-06-30 15:22 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: linux-nfs, David Shaw

On Mon, Jun 29, 2009 at 03:49:25PM -0400, J. Bruce Fields wrote:
> On Mon, Jun 29, 2009 at 03:47:11PM -0400, bfields wrote:
> > On Mon, Jun 29, 2009 at 12:29:51PM -0700, Dale Stimson wrote:
> > > On Mon, Jun 29, 2009 at 10:59:52AM -0400, J. Bruce Fields wrote:
> > > > On Fri, Jun 26, 2009 at 09:24:49PM -0700, Dale Stimson wrote:
> > > > > On Thu, 5 Mar 2009 20:16:14 -0500, David Shaw <dshaw@jabberwocky.com> wrote:
> > > > > > If a filesystem being written to via NFS returns a short write count
> > > > > > (as opposed to an error) to nfsd, nfsd treats that as a success for
> > > > > > the entire write, rather than the short count that actually succeeded.
> > > > > > 
> > > > > > For example, given an 8192-byte write, if the underlying filesystem
> > > > > > only writes 4096 bytes, nfsd will ack back to the nfs client that all
> > > > > > 8192 bytes were written.  The nfs client does have retry logic for
> > > > > > short writes, but this is never called as the client is told the
> > > > > > complete write succeeded.
> > > > > ...
> > > > > > Here is a patch to properly return the short write count to the
> > > > > > client.
> > > > > [patch elided]
> > > > > 
> > > > > I bring this to your attention so you may, if you choose, look into
> > > > > this further:
> > > > > 
> > > > > Problem synopsis:
> > > > > An old client (running RHL 9 with kernel "2.4.20-43.9.legacy")
> > > > > attempts to seek on a file mounted over nfs.  The operation fails
> > > > > with "Illegal seek" or "Input/Output error".  The server is running
> > > > > Fedora 11 kernel-PAE-2.6.29.5-191.fc11.i686, which includes the
> > > > > short write patch.  When this kernel is re-built without the short
> > > > > write patch, everything works as before.
> > > > 
> > > > Does that server kernel have some version of
> > > > a0d24b295aed7a9daf4ca36bd4784e4d40f82303 "nfsd: fix hung up of nfs
> > > > client while sync write data to nfs server" applied?
> > > 
> > > No.  As far as I can see that patch has been merged only in 2.6.30
> > > and is not in any of the 2.6.29.? releases.  There is definitely no
> > > Fedora-specific version of that patch present.  The only
> > > Fedora-specific patch related to NFS applied in
> > > kernel-2.6.29.5-191.fc11.src.rpm is
> > > linux-2.6-nfsd-report-short-writes.patch.
> > 
> > That patch fixes a problem with the short write patch, so it would be
> > worth retesting with it applied.
> 
> After looking at the trace: yes, looks like the same problem.  (The
> (successful) FILE_SYNC write request returned with a 0 length.)  So
> retest with the new patch applied, then poke the Fedora people to apply
> it as well.
> 
> --b.

Success.

I applied patch a0d24b295aed7a9daf4ca36bd4784e4d40f82303
"nfsd: fix hung up of nfs client while sync write data to nfs server"
to the otherwise unmodified Fedora 11 kernel 2.6.29.5-191.fc11.i686.PAE
and the problem was resolved.

Thank you for your help.  I will update
https://bugzilla.redhat.com/show_bug.cgi?id=508174
and suggest that the patch be applied to future Fedora 2.6.29-based kernels.


* Re: [solved] Re: [PATCH] Short write in nfsd becomes a full write to the client
       [not found]               ` <20090630152209.GA3320-NmLOIDrUSDQtrE7AZYN0JQC/G2K4zDHf@public.gmane.org>
@ 2009-06-30 15:22                 ` J. Bruce Fields
  0 siblings, 0 replies; 7+ messages in thread
From: J. Bruce Fields @ 2009-06-30 15:22 UTC (permalink / raw)
  To: Dale Stimson; +Cc: linux-nfs, David Shaw

On Tue, Jun 30, 2009 at 08:22:09AM -0700, Dale Stimson wrote:
> On Mon, Jun 29, 2009 at 03:49:25PM -0400, J. Bruce Fields wrote:
> > After looking at the trace: yes, looks like the same problem.  (The
> > (successful) FILE_SYNC write request returned with a 0 length.)  So
> > retest with the new patch applied, then poke the Fedora people to apply
> > it as well.
> > 
> > --b.
> 
> Success.
> 
> I applied patch a0d24b295aed7a9daf4ca36bd4784e4d40f82303
> "nfsd: fix hung up of nfs client while sync write data to nfs server"
> to the otherwise unmodified Fedora 11 kernel 2.6.29.5-191.fc11.i686.PAE
> and the problem was resolved.
> 
> Thank you for your help.  I will update
> https://bugzilla.redhat.com/show_bug.cgi?id=508174
> and suggest that the patch be applied to future Fedora 2.6.29-based kernels.

Thanks for following up.

--b.


end of thread, other threads:[~2009-06-30 15:22 UTC | newest]

Thread overview: 7+ messages
2009-06-27  4:24 [PATCH] Short write in nfsd becomes a full write to the client Dale Stimson
     [not found] ` <20090627042449.GC15665-NmLOIDrUSDQtrE7AZYN0JQC/G2K4zDHf@public.gmane.org>
2009-06-29 14:59   ` J. Bruce Fields
2009-06-29 19:29     ` Dale Stimson
     [not found]       ` <20090629192951.GA3851-NmLOIDrUSDQtrE7AZYN0JQC/G2K4zDHf@public.gmane.org>
2009-06-29 19:47         ` J. Bruce Fields
2009-06-29 19:49           ` J. Bruce Fields
2009-06-30 15:22             ` [solved] " Dale Stimson
     [not found]               ` <20090630152209.GA3320-NmLOIDrUSDQtrE7AZYN0JQC/G2K4zDHf@public.gmane.org>
2009-06-30 15:22                 ` J. Bruce Fields
