* Re: [PATCH] Short write in nfsd becomes a full write to the client
@ 2009-06-27 4:24 Dale Stimson
[not found] ` <20090627042449.GC15665-NmLOIDrUSDQtrE7AZYN0JQC/G2K4zDHf@public.gmane.org>
0 siblings, 1 reply; 7+ messages in thread
From: Dale Stimson @ 2009-06-27 4:24 UTC (permalink / raw)
To: linux-nfs; +Cc: David Shaw, J. Bruce Fields
On Thu, 5 Mar 2009 20:16:14 -0500, David Shaw <dshaw-wh+mT2OhP0WF0gnf/s2wvA@public.gmane.org> wrote:
> If a filesystem being written to via NFS returns a short write count
> (as opposed to an error) to nfsd, nfsd treats that as a success for
> the entire write, rather than the short count that actually succeeded.
>
> For example, given a 8192 byte write, if the underlying filesystem
> only writes 4096 bytes, nfsd will ack back to the nfs client that all
> 8192 bytes were written. The nfs client does have retry logic for
> short writes, but this is never called as the client is told the
> complete write succeeded.
...
> Here is a patch to properly return the short write count to the
> client.
[patch elided]
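The behaviour described in the quoted text can be sketched in C. This is only an illustration under stated assumptions, not the elided patch and not kernel code: `struct short_store`, `store_write`, `server_write` and `client_write_all` are hypothetical names, and the "server" is an in-memory buffer that accepts at most `limit` bytes per call. It shows a server reporting the count actually written and a client resending the remainder.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical backing store that accepts at most `limit` bytes per
 * call, i.e. one that can return a short write count. */
struct short_store {
    char   data[16384];
    size_t used;
    size_t limit;
    int    calls;
};

static ssize_t store_write(struct short_store *s, const char *buf, size_t len)
{
    size_t n = len < s->limit ? len : s->limit;

    if (n > sizeof(s->data) - s->used)
        n = sizeof(s->data) - s->used;
    memcpy(s->data + s->used, buf, n);
    s->used += n;
    s->calls++;
    return (ssize_t)n;
}

/* Server side of the sketch: report the count the store accepted.
 * The problem described above corresponds to returning `len` here
 * regardless of what the store did. */
static ssize_t server_write(struct short_store *s, const char *buf, size_t len)
{
    assert(s != NULL && buf != NULL);
    return store_write(s, buf, len);
}

/* Client side of the sketch: resend the unwritten tail until the whole
 * buffer is accepted; a reply that makes no progress is an error. */
static ssize_t client_write_all(struct short_store *s, const char *buf, size_t len)
{
    size_t done = 0;

    while (done < len) {
        ssize_t n = server_write(s, buf + done, len - done);

        if (n <= 0)
            return -1;
        done += (size_t)n;
    }
    return (ssize_t)done;
}
```

With `limit` set to 4096, an 8192-byte request completes in two calls, which is the retry path that is never reached if the server acknowledges the full 8192 bytes on the first reply.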
I bring this to your attention so you may, if you choose, look into
this further:
Problem synopsis:
An old client (running RHL 9 with kernel "2.4.20-43.9.legacy")
attempts to seek on a file mounted over nfs. The operation fails
with "Illegal seek" or "Input/Output error". The server is running
Fedora 11 kernel-PAE-2.6.29.5-191.fc11.i686, which includes the
short write patch. When this kernel is re-built without the short
write patch, everything works as before.
Details are at
https://bugzilla.redhat.com/show_bug.cgi?id=508174
Caveats: I am specifically referring to the patch file
"linux-2.6-nfsd-report-short-writes.patch" as newly included in
Fedora's file kernel-2.6.29.5-191.fc11.src.rpm . I can't vouch
that that patch file is identical to what was posted to this list
or merged for 2.6.30-rc1.
(As an aside, in this case, the client was attempting a simple gcc
compile and link. The failing programs (invoked by gcc) were the
assembler ("as") and "ld".)
[parent not found: <20090627042449.GC15665-NmLOIDrUSDQtrE7AZYN0JQC/G2K4zDHf@public.gmane.org>]
* Re: [PATCH] Short write in nfsd becomes a full write to the client
[not found] ` <20090627042449.GC15665-NmLOIDrUSDQtrE7AZYN0JQC/G2K4zDHf@public.gmane.org>
@ 2009-06-29 14:59 ` J. Bruce Fields
2009-06-29 19:29 ` Dale Stimson
0 siblings, 1 reply; 7+ messages in thread
From: J. Bruce Fields @ 2009-06-29 14:59 UTC (permalink / raw)
To: Dale Stimson; +Cc: linux-nfs, David Shaw
On Fri, Jun 26, 2009 at 09:24:49PM -0700, Dale Stimson wrote:
> On Thu, 5 Mar 2009 20:16:14 -0500, David Shaw <dshaw-wh+mT2OhP0WF0gnf/s2wvA@public.gmane.org> wrote:
> > If a filesystem being written to via NFS returns a short write count
> > (as opposed to an error) to nfsd, nfsd treats that as a success for
> > the entire write, rather than the short count that actually succeeded.
> >
> > For example, given a 8192 byte write, if the underlying filesystem
> > only writes 4096 bytes, nfsd will ack back to the nfs client that all
> > 8192 bytes were written. The nfs client does have retry logic for
> > short writes, but this is never called as the client is told the
> > complete write succeeded.
> ...
> > Here is a patch to properly return the short write count to the
> > client.
> [patch elided]
>
> I bring this to your attention so you may, if you choose, look into
> this further:
>
> Problem synopsis:
> An old client (running RHL 9 with kernel "2.4.20-43.9.legacy")
> attempts to seek on a file mounted over nfs. The operation fails
> with "Illegal seek" or "Input/Output error". The server is running
> Fedora 11 kernel-PAE-2.6.29.5-191.fc11.i686, which includes the
> short write patch. When this kernel is re-built without the short
> write patch, everything works as before.

Does that server kernel have some version of
a0d24b295aed7a9daf4ca36bd4784e4d40f82303 "nfsd: fix hung up of nfs
client while sync write data to nfs server" applied?

If not, would it be possible to get a network trace? (Run "tcpdump -s0
-wTMP", then run the test case, then kill tcpdump and mail me a copy of
TMP.)

--b.

>
> Details are at
> https://bugzilla.redhat.com/show_bug.cgi?id=508174
>
> Caveats: I am specifically referring to the patch file
> "linux-2.6-nfsd-report-short-writes.patch" as newly included in
> Fedora's file kernel-2.6.29.5-191.fc11.src.rpm . I can't vouch
> that that patch file is identical to what was posted to this list
> or merged for 2.6.30-rc1.
>
> (As an aside, in this case, the client was attempting a simple gcc
> compile and link. The failing programs (invoked by gcc) were the
> assembler ("as") and "ld".)
* Re: [PATCH] Short write in nfsd becomes a full write to the client
2009-06-29 14:59 ` J. Bruce Fields
@ 2009-06-29 19:29 ` Dale Stimson
[not found] ` <20090629192951.GA3851-NmLOIDrUSDQtrE7AZYN0JQC/G2K4zDHf@public.gmane.org>
0 siblings, 1 reply; 7+ messages in thread
From: Dale Stimson @ 2009-06-29 19:29 UTC (permalink / raw)
To: J. Bruce Fields; +Cc: linux-nfs, David Shaw
On Mon, Jun 29, 2009 at 10:59:52AM -0400, J. Bruce Fields wrote:
> On Fri, Jun 26, 2009 at 09:24:49PM -0700, Dale Stimson wrote:
> > On Thu, 5 Mar 2009 20:16:14 -0500, David Shaw <dshaw-wh+mT2OhP0WF0gnf/s2wvA@public.gmane.org> wrote:
> > > If a filesystem being written to via NFS returns a short write count
> > > (as opposed to an error) to nfsd, nfsd treats that as a success for
> > > the entire write, rather than the short count that actually succeeded.
> > >
> > > For example, given a 8192 byte write, if the underlying filesystem
> > > only writes 4096 bytes, nfsd will ack back to the nfs client that all
> > > 8192 bytes were written. The nfs client does have retry logic for
> > > short writes, but this is never called as the client is told the
> > > complete write succeeded.
> > ...
> > > Here is a patch to properly return the short write count to the
> > > client.
> > [patch elided]
> >
> > I bring this to your attention so you may, if you choose, look into
> > this further:
> >
> > Problem synopsis:
> > An old client (running RHL 9 with kernel "2.4.20-43.9.legacy")
> > attempts to seek on a file mounted over nfs. The operation fails
> > with "Illegal seek" or "Input/Output error". The server is running
> > Fedora 11 kernel-PAE-2.6.29.5-191.fc11.i686, which includes the
> > short write patch. When this kernel is re-built without the short
> > write patch, everything works as before.
>
> Does that server kernel have some version of
> a0d24b295aed7a9daf4ca36bd4784e4d40f82303 "nfsd: fix hung up of nfs
> client while sync write data to nfs server" applied?

No.

As far as I can see that patch has been merged only in 2.6.30 and is
not in any of the 2.6.29.? releases. There is definitely no
Fedora-specific version of that patch present. The only Fedora-specific
patch related to NFS applied in kernel-2.6.29.5-191.fc11.src.rpm is
linux-2.6-nfsd-report-short-writes.patch.

> If not, would it be possible to get a network trace? (Run "tcpdump -s0
> -wTMP", then run the test case, then kill tcpdump and mail me a copy of
> TMP.)
>
> --b.

Capture done, to be emailed off-list.

> > Details are at
> > https://bugzilla.redhat.com/show_bug.cgi?id=508174
> >
> > Caveats: I am specifically referring to the patch file
> > "linux-2.6-nfsd-report-short-writes.patch" as newly included in
> > Fedora's file kernel-2.6.29.5-191.fc11.src.rpm . I can't vouch
> > that that patch file is identical to what was posted to this list
> > or merged for 2.6.30-rc1.
> >
> > (As an aside, in this case, the client was attempting a simple gcc
> > compile and link. The failing programs (invoked by gcc) were the
> > assembler ("as") and "ld".)
[parent not found: <20090629192951.GA3851-NmLOIDrUSDQtrE7AZYN0JQC/G2K4zDHf@public.gmane.org>]
* Re: [PATCH] Short write in nfsd becomes a full write to the client
[not found] ` <20090629192951.GA3851-NmLOIDrUSDQtrE7AZYN0JQC/G2K4zDHf@public.gmane.org>
@ 2009-06-29 19:47 ` J. Bruce Fields
2009-06-29 19:49 ` J. Bruce Fields
0 siblings, 1 reply; 7+ messages in thread
From: J. Bruce Fields @ 2009-06-29 19:47 UTC (permalink / raw)
To: Dale Stimson; +Cc: linux-nfs, David Shaw
On Mon, Jun 29, 2009 at 12:29:51PM -0700, Dale Stimson wrote:
> On Mon, Jun 29, 2009 at 10:59:52AM -0400, J. Bruce Fields wrote:
> > On Fri, Jun 26, 2009 at 09:24:49PM -0700, Dale Stimson wrote:
> > > On Thu, 5 Mar 2009 20:16:14 -0500, David Shaw <dshaw-wh+mT2OhP0WF0gnf/s2wvA@public.gmane.org> wrote:
> > > > If a filesystem being written to via NFS returns a short write count
> > > > (as opposed to an error) to nfsd, nfsd treats that as a success for
> > > > the entire write, rather than the short count that actually succeeded.
> > > >
> > > > For example, given a 8192 byte write, if the underlying filesystem
> > > > only writes 4096 bytes, nfsd will ack back to the nfs client that all
> > > > 8192 bytes were written. The nfs client does have retry logic for
> > > > short writes, but this is never called as the client is told the
> > > > complete write succeeded.
> > > ...
> > > > Here is a patch to properly return the short write count to the
> > > > client.
> > > [patch elided]
> > >
> > > I bring this to your attention so you may, if you choose, look into
> > > this further:
> > >
> > > Problem synopsis:
> > > An old client (running RHL 9 with kernel "2.4.20-43.9.legacy")
> > > attempts to seek on a file mounted over nfs. The operation fails
> > > with "Illegal seek" or "Input/Output error". The server is running
> > > Fedora 11 kernel-PAE-2.6.29.5-191.fc11.i686, which includes the
> > > short write patch. When this kernel is re-built without the short
> > > write patch, everything works as before.
> >
> > Does that server kernel have some version of
> > a0d24b295aed7a9daf4ca36bd4784e4d40f82303 "nfsd: fix hung up of nfs
> > client while sync write data to nfs server" applied?
>
> No. As far as I can see that patch has been merged only in 2.6.30 and
> is not in any of the 2.6.29.? releases. There is definitely no
> Fedora-specific version of that patch present. The only Fedora-specific
> patch related to NFS applied in kernel-2.6.29.5-191.fc11.src.rpm is
> linux-2.6-nfsd-report-short-writes.patch.

That patch fixes a problem with the short write patch, so it would be
worth retesting with it applied.

--b.
* Re: [PATCH] Short write in nfsd becomes a full write to the client
2009-06-29 19:47 ` J. Bruce Fields
@ 2009-06-29 19:49 ` J. Bruce Fields
2009-06-30 15:22 ` [solved] " Dale Stimson
0 siblings, 1 reply; 7+ messages in thread
From: J. Bruce Fields @ 2009-06-29 19:49 UTC (permalink / raw)
To: Dale Stimson; +Cc: linux-nfs, David Shaw
On Mon, Jun 29, 2009 at 03:47:11PM -0400, bfields wrote:
> On Mon, Jun 29, 2009 at 12:29:51PM -0700, Dale Stimson wrote:
> > On Mon, Jun 29, 2009 at 10:59:52AM -0400, J. Bruce Fields wrote:
> > > On Fri, Jun 26, 2009 at 09:24:49PM -0700, Dale Stimson wrote:
> > > > On Thu, 5 Mar 2009 20:16:14 -0500, David Shaw <dshaw-wh+mT2OhP0WF0gnf/s2wvA@public.gmane.org> wrote:
> > > > > If a filesystem being written to via NFS returns a short write count
> > > > > (as opposed to an error) to nfsd, nfsd treats that as a success for
> > > > > the entire write, rather than the short count that actually succeeded.
> > > > >
> > > > > For example, given a 8192 byte write, if the underlying filesystem
> > > > > only writes 4096 bytes, nfsd will ack back to the nfs client that all
> > > > > 8192 bytes were written. The nfs client does have retry logic for
> > > > > short writes, but this is never called as the client is told the
> > > > > complete write succeeded.
> > > > ...
> > > > > Here is a patch to properly return the short write count to the
> > > > > client.
> > > > [patch elided]
> > > >
> > > > I bring this to your attention so you may, if you choose, look into
> > > > this further:
> > > >
> > > > Problem synopsis:
> > > > An old client (running RHL 9 with kernel "2.4.20-43.9.legacy")
> > > > attempts to seek on a file mounted over nfs. The operation fails
> > > > with "Illegal seek" or "Input/Output error". The server is running
> > > > Fedora 11 kernel-PAE-2.6.29.5-191.fc11.i686, which includes the
> > > > short write patch. When this kernel is re-built without the short
> > > > write patch, everything works as before.
> > >
> > > Does that server kernel have some version of
> > > a0d24b295aed7a9daf4ca36bd4784e4d40f82303 "nfsd: fix hung up of nfs
> > > client while sync write data to nfs server" applied?
> >
> > No. As far as I can see that patch has been merged only in 2.6.30 and
> > is not in any of the 2.6.29.? releases. There is definitely no
> > Fedora-specific version of that patch present. The only Fedora-specific
> > patch related to NFS applied in kernel-2.6.29.5-191.fc11.src.rpm is
> > linux-2.6-nfsd-report-short-writes.patch.
>
> That patch fixes a problem with the short write patch, so it would be
> worth retesting with it applied.

After looking at the trace: yes, looks like the same problem. (The
(successful) FILE_SYNC write request returned with a 0 length.) So
retest with the new patch applied, then poke the Fedora people to apply
it as well.

--b.
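The symptom seen in the trace, a successful write reply carrying a 0 length, can be related to the client-side error with a small C sketch. This is an assumption-laden illustration, not the RHL 9 client code: `classify_write_reply` and the `write_outcome` values are hypothetical, and mapping a zero-length success to EIO is only one plausible client policy that would be consistent with the "Input/Output error" reported earlier in the thread.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical outcomes for a write reply whose status was "success". */
enum write_outcome {
    WRITE_DONE,        /* server accepted everything that was sent  */
    WRITE_RETRY_REST,  /* short write: resend the unwritten tail    */
    WRITE_FAILED       /* no progress: surface an error to the user */
};

/* Classify a successful reply by comparing the byte count the server
 * reported against the byte count the client requested. On failure,
 * *err receives an errno-style code; otherwise it is set to 0. */
static enum write_outcome classify_write_reply(size_t requested,
                                               size_t reported,
                                               int *err)
{
    assert(err != NULL);
    *err = 0;

    if (reported >= requested)
        return WRITE_DONE;
    if (reported > 0)
        return WRITE_RETRY_REST;

    /* Assumed policy: a zero count on a non-empty request means the
     * retry loop cannot advance, so report an I/O error. */
    *err = EIO;
    return WRITE_FAILED;
}
```

Under this policy an 8192-byte request answered with 4096 is retried, while the same request answered with 0 fails outright, which is why a server that reports 0 for a write it actually completed looks like an I/O failure to the client.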
* [solved] Re: [PATCH] Short write in nfsd becomes a full write to the client
2009-06-29 19:49 ` J. Bruce Fields
@ 2009-06-30 15:22 ` Dale Stimson
[not found] ` <20090630152209.GA3320-NmLOIDrUSDQtrE7AZYN0JQC/G2K4zDHf@public.gmane.org>
0 siblings, 1 reply; 7+ messages in thread
From: Dale Stimson @ 2009-06-30 15:22 UTC (permalink / raw)
To: J. Bruce Fields; +Cc: linux-nfs, David Shaw
On Mon, Jun 29, 2009 at 03:49:25PM -0400, J. Bruce Fields wrote:
> On Mon, Jun 29, 2009 at 03:47:11PM -0400, bfields wrote:
> > On Mon, Jun 29, 2009 at 12:29:51PM -0700, Dale Stimson wrote:
> > > On Mon, Jun 29, 2009 at 10:59:52AM -0400, J. Bruce Fields wrote:
> > > > On Fri, Jun 26, 2009 at 09:24:49PM -0700, Dale Stimson wrote:
> > > > > On Thu, 5 Mar 2009 20:16:14 -0500, David Shaw <dshaw@jabberwocky.com> wrote:
> > > > > > If a filesystem being written to via NFS returns a short write count
> > > > > > (as opposed to an error) to nfsd, nfsd treats that as a success for
> > > > > > the entire write, rather than the short count that actually succeeded.
> > > > > >
> > > > > > For example, given a 8192 byte write, if the underlying filesystem
> > > > > > only writes 4096 bytes, nfsd will ack back to the nfs client that all
> > > > > > 8192 bytes were written. The nfs client does have retry logic for
> > > > > > short writes, but this is never called as the client is told the
> > > > > > complete write succeeded.
> > > > > ...
> > > > > > Here is a patch to properly return the short write count to the
> > > > > > client.
> > > > > [patch elided]
> > > > >
> > > > > I bring this to your attention so you may, if you choose, look into
> > > > > this further:
> > > > >
> > > > > Problem synopsis:
> > > > > An old client (running RHL 9 with kernel "2.4.20-43.9.legacy")
> > > > > attempts to seek on a file mounted over nfs. The operation fails
> > > > > with "Illegal seek" or "Input/Output error". The server is running
> > > > > Fedora 11 kernel-PAE-2.6.29.5-191.fc11.i686, which includes the
> > > > > short write patch. When this kernel is re-built without the short
> > > > > write patch, everything works as before.
> > > >
> > > > Does that server kernel have some version of
> > > > a0d24b295aed7a9daf4ca36bd4784e4d40f82303 "nfsd: fix hung up of nfs
> > > > client while sync write data to nfs server" applied?
> > >
> > > No. As far as I can see that patch has been merged only in 2.6.30 and
> > > is not in any of the 2.6.29.? releases. There is definitely no
> > > Fedora-specific version of that patch present. The only Fedora-specific
> > > patch related to NFS applied in kernel-2.6.29.5-191.fc11.src.rpm is
> > > linux-2.6-nfsd-report-short-writes.patch.
> >
> > That patch fixes a problem with the short write patch, so it would be
> > worth retesting with it applied.
>
> After looking at the trace: yes, looks like the same problem. (The
> (successful) FILE_SYNC write request returned with a 0 length.) So
> retest with the new patch applied, then poke the Fedora people to apply
> it as well.
>
> --b.

Success.

I applied patch a0d24b295aed7a9daf4ca36bd4784e4d40f82303
"nfsd: fix hung up of nfs client while sync write data to nfs server"
to the otherwise unmodified Fedora 11 kernel 2.6.29.5-191.fc11.i686.PAE
and the problem was resolved.

Thank you for your help. I will update
https://bugzilla.redhat.com/show_bug.cgi?id=508174
and suggest that the patch be applied to future Fedora 2.6.29-based kernels.
[parent not found: <20090630152209.GA3320-NmLOIDrUSDQtrE7AZYN0JQC/G2K4zDHf@public.gmane.org>]
* Re: [solved] Re: [PATCH] Short write in nfsd becomes a full write to the client
[not found] ` <20090630152209.GA3320-NmLOIDrUSDQtrE7AZYN0JQC/G2K4zDHf@public.gmane.org>
@ 2009-06-30 15:22 ` J. Bruce Fields
0 siblings, 0 replies; 7+ messages in thread
From: J. Bruce Fields @ 2009-06-30 15:22 UTC (permalink / raw)
To: Dale Stimson; +Cc: linux-nfs, David Shaw
On Tue, Jun 30, 2009 at 08:22:09AM -0700, Dale Stimson wrote:
> On Mon, Jun 29, 2009 at 03:49:25PM -0400, J. Bruce Fields wrote:
> > After looking at the trace: yes, looks like the same problem. (The
> > (successful) FILE_SYNC write request returned with a 0 length.) So
> > retest with the new patch applied, then poke the Fedora people to apply
> > it as well.
> >
> > --b.
>
> Success.
>
> I applied patch a0d24b295aed7a9daf4ca36bd4784e4d40f82303
> "nfsd: fix hung up of nfs client while sync write data to nfs server"
> to the otherwise unmodified Fedora 11 kernel 2.6.29.5-191.fc11.i686.PAE
> and the problem was resolved.
>
> Thank you for your help. I will update
> https://bugzilla.redhat.com/show_bug.cgi?id=508174
> and suggest that the patch be applied to future Fedora 2.6.29-based kernels.

Thanks for following up.

--b.