* NFS corruption on 2.6.11.7
From: Kenneth Johansson @ 2005-05-23 22:47 UTC
To: Linux Kernel Mailing List

I have both the server and the client running 2.6.11.7 and see severe
data corruption when reading from the server (maybe on write as well; I
have not tested that).

If I copy the data over with scp or ftp I get correct data. NFS also
works OK with a Mac OS X 10.4 client.

Running gen.sh on the server and then cmp.sh on the client results in an
md5 checksum difference on 5-12 files. I have never had a single run
without errors.

This is what cat /proc/mounts reports for the NFS mount:

:/export/home/ken /home/ken nfs rw,v3,rsize=32768,wsize=32768,hard,udp,lock,addr=amd 0 0

[-- Attachment #1.2: gen.sh --]
[-- Type: application/x-shellscript, Size: 176 bytes --]

[-- Attachment #1.3: cmp.sh --]
[-- Type: application/x-shellscript, Size: 147 bytes --]
* Re: NFS corruption on 2.6.11.7
From: Trond Myklebust @ 2005-05-24 0:35 UTC
To: Kenneth Johansson; +Cc: Linux Kernel Mailing List

On Tue, 24.05.2005 at 00:47 (+0200), Kenneth Johansson wrote:
> I have both the server and client running 2.6.11.7 and have some severe
> data corruption when reading from the server (maybe on write also I have
> not tested).
>
> If I copy the data over with scp or ftp I get correct data. Also nfs
> works OK with a mac os x 10.4 client.
>
> Running gen.sh on the server and then cmp.sh on the client results in a
> md5 checksum difference on 5-12 files I have never done one run where
> there was no errors.
>
> This is what cat /proc/mounts reports on the nfs mount
>
> :/export/home/ken /home/ken nfs rw,v3,rsize=32768,wsize=32768,hard,udp,lock,addr=amd 0 0
>

I'm seeing no problems at all with this on a loopback mount with
2.6.12-rc4. Mind giving us some more details on your setup?

Cheers,
  Trond

[-- Attachment #2: sum_org --]
[-- Type: text/plain, Size: 1100 bytes --]

9e068cf97d3382d033e9fd2ad811b6a3  0
79b40e24ed13f1c16007e844d8fe9d93  1
4f3abe3f5e6de2e9e59454de1a768a9f  2
dfe58bae3169382d47ab018565cd2cc9  3
56135ada6249175b1f78e596ecbfd859  4
441156cb860423f4903b01353dcd986e  5
b5b2ffbcfb0902476fa67139238a66c4  6
8c5cd6943b9f02fa752b9ee5faeb30d2  7
ee8428a730822171053970b461bd5c27  8
78e4a42024075801df90e2adc4c314aa  9
d5c48cc0007a212f27621b180195f061  10
ab610700befe732d8abbf23de41b2f0d  11
af6c2a4372a7537725d8245cada83b72  12
9ec40dbfeeac74e69ecd7d16c3e6bcb1  13
542e624e404b1c277db7190288028201  14
7be4bd273006e5f535a365f917e9cb88  15
75a61da2dbd52484d091e9abc594eca7  16
4b9a14953134873e718b9251a573a0c2  17
40b04c291ae3843aec667cb1c90d50e6  18
c96e00d3ce1cb0a690f9cf51e42aa21f  19
8231093f537ad37fc1825de7ae7d1623  20
2051e8708251d0e32b4078882303e1d1  21
1878e49986ebbc1c3fedf3fc253f8949  22
81a10eb96087cdf75cb1250daa54c033  23
b0eade848db80c36f567a3994214868f  24
4f09df57a60e4de43e662a535a131b7a  25
a5b0991219a0d6ef5d1aaec5dc95d28a  26
c3bc7156aca526b81ade9bf6ddc991bf  27
b0794f598cfd8a5e61a741453c2c0142  28
dccff34e9b630fd575faefe4bd3f4a29  29

[-- Attachment #3: sum_new --]
[-- Type: text/plain, Size: 1100 bytes --]

9e068cf97d3382d033e9fd2ad811b6a3  0
79b40e24ed13f1c16007e844d8fe9d93  1
4f3abe3f5e6de2e9e59454de1a768a9f  2
dfe58bae3169382d47ab018565cd2cc9  3
56135ada6249175b1f78e596ecbfd859  4
441156cb860423f4903b01353dcd986e  5
b5b2ffbcfb0902476fa67139238a66c4  6
8c5cd6943b9f02fa752b9ee5faeb30d2  7
ee8428a730822171053970b461bd5c27  8
78e4a42024075801df90e2adc4c314aa  9
d5c48cc0007a212f27621b180195f061  10
ab610700befe732d8abbf23de41b2f0d  11
af6c2a4372a7537725d8245cada83b72  12
9ec40dbfeeac74e69ecd7d16c3e6bcb1  13
542e624e404b1c277db7190288028201  14
7be4bd273006e5f535a365f917e9cb88  15
75a61da2dbd52484d091e9abc594eca7  16
4b9a14953134873e718b9251a573a0c2  17
40b04c291ae3843aec667cb1c90d50e6  18
c96e00d3ce1cb0a690f9cf51e42aa21f  19
8231093f537ad37fc1825de7ae7d1623  20
2051e8708251d0e32b4078882303e1d1  21
1878e49986ebbc1c3fedf3fc253f8949  22
81a10eb96087cdf75cb1250daa54c033  23
b0eade848db80c36f567a3994214868f  24
4f09df57a60e4de43e662a535a131b7a  25
a5b0991219a0d6ef5d1aaec5dc95d28a  26
c3bc7156aca526b81ade9bf6ddc991bf  27
b0794f598cfd8a5e61a741453c2c0142  28
dccff34e9b630fd575faefe4bd3f4a29  29
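The two attached listings, sum_org and sum_new, are md5sum-style output, and comparing them is the check the thread keeps coming back to. The contents of gen.sh and cmp.sh are not shown in the archive, so the following is only a minimal Python sketch of that comparison, assuming one "<hash>  <name>" pair per line. The function names `parse_md5_listing` and `mismatched_files` are hypothetical and do not come from the original scripts.

```python
def parse_md5_listing(text):
    """Parse md5sum-style text ("<hash>  <name>" per line) into {name: hash}."""
    sums = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        digest, name = line.split(None, 1)
        sums[name.strip()] = digest
    return sums


def mismatched_files(reference, candidate):
    """Return the names whose hash in candidate differs from reference."""
    ref = parse_md5_listing(reference)
    new = parse_md5_listing(candidate)
    return sorted(name for name in ref if new.get(name) != ref[name])


# First two entries of the sum_org / sum_new attachments, which are identical.
sum_org = (
    "9e068cf97d3382d033e9fd2ad811b6a3  0\n"
    "79b40e24ed13f1c16007e844d8fe9d93  1\n"
)
sum_new = sum_org

print(mismatched_files(sum_org, sum_new))  # prints []
```

An empty result corresponds to the clean loopback run reported above; on the failing setup, 5-12 names would be expected in the list.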
* Re: NFS corruption on 2.6.11.7
From: Trond Myklebust @ 2005-05-24 1:04 UTC
To: Kenneth Johansson; +Cc: Linux Kernel Mailing List

On Mon, 23.05.2005 at 20:35 (-0400), Trond Myklebust wrote:
> On Tue, 24.05.2005 at 00:47 (+0200), Kenneth Johansson wrote:

> > This is what cat /proc/mounts reports on the nfs mount
> >
> > :/export/home/ken /home/ken nfs rw,v3,rsize=32768,wsize=32768,hard,udp,lock,addr=amd 0 0
> >

BTW: Why is /proc/mounts reporting the server as being an empty string?
Normally, the "mount" program should be setting that to whatever you
specified on the command line.

Cheers,
  Trond
* Re: NFS corruption on 2.6.11.7
From: Kenneth Johansson @ 2005-05-24 10:15 UTC
To: Trond Myklebust; +Cc: Linux Kernel Mailing List

On Mon, 2005-05-23 at 20:35 -0400, Trond Myklebust wrote:
> On Tue, 24.05.2005 at 00:47 (+0200), Kenneth Johansson wrote:
> > I have both the server and client running 2.6.11.7 and have some severe
> > data corruption when reading from the server (maybe on write also I have
> > not tested).
> >
> > If I copy the data over with scp or ftp I get correct data. Also nfs
> > works OK with a mac os x 10.4 client.
> >
> > Running gen.sh on the server and then cmp.sh on the client results in a
> > md5 checksum difference on 5-12 files I have never done one run where
> > there was no errors.
> >
> > This is what cat /proc/mounts reports on the nfs mount
> >
> > :/export/home/ken /home/ken nfs rw,v3,rsize=32768,wsize=32768,hard,udp,lock,addr=amd 0 0
> >
>
> I'm seeing no problems at all with this on a loopback mount with
> 2.6.12-rc4. Mind giving us some more details on your setup?
>
> Cheers,
>   Trond

I did some more investigation into what type of data error I get, and it
looks a bit strange. I always get 28 consecutive wrong bytes. Sometimes
this is data repeated from earlier in the file, but not always. Does
anybody know what cache line size this CPU has?

processor       : 0
vendor_id       : AuthenticAMD
cpu family      : 6
model           : 8
model name      : AMD Athlon(TM) XP 2200+
stepping        : 0
cpu MHz         : 1802.998
cache size      : 256 KB
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 1
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 mmx fxsr sse pni syscall mmxext 3dnowext 3dnow
bogomips        : 3547.13

Here is a sample of three files with errors in them. Lines marked "-"
are the original data and lines marked "+" are the data as read over NFS.

file 13 "od -Ax -tx1z"

-924dc0 df b3 0c 89 2d a2 83 da 1c 08 f2 66 da f6 6b f4 >....-......f..k.<
+924dc0 43 11 2a f4 98 09 d5 76 aa 26 83 00 24 3d 11 fd >C.*....v.&..$=..<

-924dd0 af c2 44 57 9a 13 01 43 84 bf 99 c3 1b 16 8a 00 >..DW...C........<
+924dd0 3e 64 d7 bd 4f 8d 26 cf 4f 4f 2c 62 1b 16 8a 00 >>d..O.&.OO,b....<

28 consecutive bytes are wrong.
The data is a repeat of earlier data in the file.

$ grep "43 11 2a f4 98 09 d5 76 aa 26 83 00 24 3d 11 fd" 13_org
924d40 43 11 2a f4 98 09 d5 76 aa 26 83 00 24 3d 11 fd >C.*....v.&..$=..<

$ grep "43 11 2a f4 98 09 d5 76 aa 26 83 00 24 3d 11 fd" 13_err
924d40 43 11 2a f4 98 09 d5 76 aa 26 83 00 24 3d 11 fd >C.*....v.&..$=..<
924dc0 43 11 2a f4 98 09 d5 76 aa 26 83 00 24 3d 11 fd >C.*....v.&..$=..<

924dc0 is a copy of 924d40.
128 bytes offset.

file 14 "od -Ax -tx1z"

-0912f0 91 45 bb cd eb 4f 01 d3 69 27 88 b5 7d 7d 17 8d >.E...O..i'..}}..<
+0912f0 b8 3f 4e 5d 2e 86 ed c0 51 79 fe ec 3e 53 c9 29 >.?N]....Qy..>S.)<

-091300 7d 94 8e f9 81 d0 c2 4a b5 8e c6 af b0 03 4c 16 >}......J......L.<
+091300 d9 05 ac 0d fc eb 00 71 17 bd fb 3e b0 03 4c 16 >.......q...>..L.<

$ grep "b8 3f 4e 5d 2e 86 ed c0 51 79 fe ec 3e 53 c9 29" 14_err
0912b0 b8 3f 4e 5d 2e 86 ed c0 51 79 fe ec 3e 53 c9 29 >.?N]....Qy..>S.)<
0912f0 b8 3f 4e 5d 2e 86 ed c0 51 79 fe ec 3e 53 c9 29 >.?N]....Qy..>S.)<

28 bytes wrong.
64 bytes offset.

file 16 "od -Ax -tx1z"

-635200 c3 1d f2 b8 c4 d5 12 c1 3f 48 e6 9d dc 98 1f e5 >........?H......<
+635200 c3 1d f2 b8 c4 d5 12 c1 00 10 00 00 00 d0 ec 08 >................<

-635210 9e 54 e7 f1 49 5b 1e d0 9f e2 7c 26 24 cb 98 24 >.T..I[....|&$..$<
+635210 00 10 00 00 00 90 14 08 00 10 00 00 00 50 25 06 >.............P%.<

-635220 25 fc 63 2a bf 07 b4 c0 cf a1 67 9b ef 01 5d 6d >%.c*......g...]m<
+635220 00 10 00 00 bf 07 b4 c0 cf a1 67 9b ef 01 5d 6d >..........g...]m<

28 bytes wrong.
This time the data is not from this file.
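The "28 bytes wrong" and "128 bytes offset" figures above can be checked mechanically from the od rows. The following is a small Python sketch of that check, using the two rows quoted for file 13. The helper name `mismatch_runs` is hypothetical, and the input covers only the 32 bytes shown in the dump, not the whole file.

```python
def mismatch_runs(good, bad):
    """Return (start, length) for each contiguous run of differing bytes."""
    runs = []
    start = None
    for i, (g, b) in enumerate(zip(good, bad)):
        if g != b:
            if start is None:
                start = i  # a new run of bad bytes begins here
        elif start is not None:
            runs.append((start, i - start))
            start = None
    if start is not None:
        # The run extends to the end of the compared region.
        runs.append((start, min(len(good), len(bad)) - start))
    return runs


BASE = 0x924DC0  # file offset of the first row quoted for file 13

good = bytes.fromhex(
    "df b3 0c 89 2d a2 83 da 1c 08 f2 66 da f6 6b f4"
    "af c2 44 57 9a 13 01 43 84 bf 99 c3 1b 16 8a 00"
)
bad = bytes.fromhex(
    "43 11 2a f4 98 09 d5 76 aa 26 83 00 24 3d 11 fd"
    "3e 64 d7 bd 4f 8d 26 cf 4f 4f 2c 62 1b 16 8a 00"
)

for start, length in mismatch_runs(good, bad):
    print(f"{BASE + start:#x}: {length} bytes differ")

# Distance between each corrupted row and the earlier row it duplicates.
print(0x924DC0 - 0x924D40)  # file 13: 128
print(0x0912F0 - 0x0912B0)  # file 14: 64
```

For the file 13 rows this reports a single run of 28 bytes starting at 0x924dc0, which matches the description in the message.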
* Re: NFS corruption on 2.6.11.7
From: Trond Myklebust @ 2005-05-24 12:01 UTC
To: Kenneth Johansson; +Cc: Linux Kernel Mailing List

On Tue, 24.05.2005 at 12:15 (+0200), Kenneth Johansson wrote:
> > > :/export/home/ken /home/ken nfs rw,v3,rsize=32768,wsize=32768,hard,udp,lock,addr=amd 0 0
> > >
> >
> > I'm seeing no problems at all with this on a loopback mount with
> > 2.6.12-rc4. Mind giving us some more details on your setup?
> >
> > Cheers,
> >   Trond

Does the above export line mean that you are running with amd? If so,
could you retry using an ordinary NFS mount (preferably a loopback mount,
i.e. mount something over "localhost").

Again, could you please give us more details on how you are doing these
tests: what hardware (i.e. what NIC, switch, server, memory, ...), lsmod
output (and ditto for the server).

How are you using your scripts? Are you first running one on the server,
then the other on the client? Are you deleting the old files before you
start a new run? And so on.
* Re: NFS corruption on 2.6.11.7
From: Kenneth Johansson @ 2005-05-24 14:58 UTC
To: Trond Myklebust; +Cc: Linux Kernel Mailing List

On Tue, 2005-05-24 at 08:01 -0400, Trond Myklebust wrote:
> On Tue, 24.05.2005 at 12:15 (+0200), Kenneth Johansson wrote:
> > > > :/export/home/ken /home/ken nfs rw,v3,rsize=32768,wsize=32768,hard,udp,lock,addr=amd 0 0
> > > >
> > >
> > > I'm seeing no problems at all with this on a loopback mount with
> > > 2.6.12-rc4. Mind giving us some more details on your setup?
> > >
> > > Cheers,
> > >   Trond
>
> Does the above export line mean that you are running with amd? If so,

This only means that I had no imagination when naming the computer and
simply used the name of the manufacturer of its CPU.

> could you retry using an ordinary NFS mount (preferably a loopback mount
> - i.e. mount something over "localhost").

This works OK.

> Again, please could you give us more details on how you are doing these
> tests: what hardware (i.e. what NIC, switch, server, memory,...), lsmod
> output, (and ditto for the server).

The only new thing is:

0000:00:0e.0 Ethernet controller: D-Link System Inc Gigabit Ethernet Adapter (rev 11)

The driver is sk98lin, compiled into the kernel. Everything else has been
the same for over a year. Hmm, I did change the switch as well, but I do
not remember which model I got.

I do not get any problems reading with an OS X client, also at gigabit
speed, but that client's CPU is much slower, so it is not exactly the same
thing.

> How are you using your scripts? Are you first running one on the server,
> then the other on the client, are you deleting the old files before you
> start a new run, etc.

I telnet to the server and run the gen part, then run the cmp part on the
client. And yes, I do delete the files; otherwise they would more or less
only be in the cache.
* Re: NFS corruption on 2.6.11.7
From: Kenneth Johansson @ 2005-05-25 20:13 UTC
To: Trond Myklebust; +Cc: Linux Kernel Mailing List

On Tue, 2005-05-24 at 08:01 -0400, Trond Myklebust wrote:
> Again, please could you give us more details on how you are doing these
> tests: what hardware (i.e. what NIC, switch, server, memory,...), lsmod
> output, (and ditto for the server).

After changing the mount option to use tcp instead of udp I have now
read several gigabytes without a single error.

Is there some fundamental difference in how NFS over UDP and TCP is
handled regarding the packet contents, such as TCP using the TCP checksum
and UDP not using the UDP checksum, or something like that?

Are there any counters for checksum errors in UDP and TCP that can be
read? I failed to spot anything in /proc.
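On the question of counters: Linux exposes per-protocol statistics in /proc/net/snmp as pairs of lines, one with counter names and one with values. The Python sketch below parses that layout. The sample text and its numbers are invented for illustration, and whether a given kernel counts checksum failures under `Udp: InErrors` (or under a separate `InCsumErrors` field, which only newer kernels are believed to have) is an assumption that should be verified on the machine in question. The function name `parse_proc_net_snmp` is hypothetical.

```python
def parse_proc_net_snmp(text):
    """Parse /proc/net/snmp-style text into {protocol: {counter: value}}."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    stats = {}
    # The file comes in pairs: a line of counter names, then a line of values.
    for names_line, values_line in zip(lines[0::2], lines[1::2]):
        proto, names = names_line.split(":", 1)
        _, values = values_line.split(":", 1)
        stats[proto] = dict(zip(names.split(), (int(v) for v in values.split())))
    return stats


# Illustrative sample only; these values are made up, not taken from the thread.
SAMPLE = """\
Udp: InDatagrams NoPorts InErrors OutDatagrams
Udp: 1200 3 7 1100
"""

stats = parse_proc_net_snmp(SAMPLE)
print(stats["Udp"]["InErrors"])  # prints 7

# On a live system the same function could be fed the real file:
#   with open("/proc/net/snmp") as f:
#       stats = parse_proc_net_snmp(f.read())
```

Sampling the counters before and after a cmp.sh run and comparing the two snapshots would show whether the error counts move during the transfer.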
[parent not found: <20050525.131616.59655785.davem@davemloft.net>]
* Re: NFS corruption on 2.6.11.7
From: Kenneth Johansson @ 2005-05-25 20:34 UTC
To: David S.Miller; +Cc: trond.myklebust, linux-kernel

On Wed, 2005-05-25 at 13:16 -0700, David S.Miller wrote:
> From: Kenneth Johansson <ken@kenjo.org>
> Date: Wed, 25 May 2005 22:13:27 +0200
>
> > Is there some fundamental difference in how nfs over udp and tcp is
> > handled regarding the packet contents like tcp using the tcp checksum
> > and udp not using the udp checksum or something like that?
> >
> > Are there any counters for checksum errors in udp and tcp that can be
> > read ?? I failed to spot anything in /proc.
>
> If you are on a gigabit or faster network, IPv4 fragment sequence
> numbers can wrap and if you are very unlucky the checksums will
> match as well corrupting your data. This is a fatal limitation of
> the small 16-bit IPv4 fragment ID.
>
> Use TCP for NFS unless you want NFS data corruption.
>

Unlikely to be the case this time. I get a sequence of 28 bytes that is
wrong in the data, and often the wrong data is a copy of data 64 or 128
bytes earlier in the file. If this were not on a PC with cache coherency,
I would guess that someone forgot to do a cache invalidate/flush.

But I do wonder why I only see this problem with NFS over UDP.
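To put a number on the 16-bit fragment ID limitation quoted above, here is a rough back-of-the-envelope Python sketch. It assumes one IP ID is consumed per UDP datagram, that each datagram carries about rsize=32768 bytes (headers ignored), and that the link is saturated at 1 Gbit/s with nothing but this traffic between the two hosts. All of these are simplifying assumptions, and the function name `seconds_until_ip_id_wrap` is hypothetical.

```python
def seconds_until_ip_id_wrap(datagram_bytes, link_bits_per_second):
    """Estimate how long a saturated link takes to cycle through all IP IDs."""
    ids = 2 ** 16  # the IPv4 Identification field is 16 bits wide
    bytes_per_second = link_bits_per_second / 8
    return ids * datagram_bytes / bytes_per_second


t = seconds_until_ip_id_wrap(32768, 1_000_000_000)
print(f"{t:.1f} s")  # prints 17.2 s
```

Under these assumptions the ID space wraps in roughly 17 seconds, which is shorter than the fragment reassembly timeout commonly used by default on Linux (30 seconds), so stale fragments from a lost datagram can still be queued when their ID is reused. This only illustrates the mechanism David Miller describes; as the reply above notes, it does not obviously explain a 28-byte pattern.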
Thread overview: 8+ messages
2005-05-23 22:47 NFS corruption on 2.6.11.7 Kenneth Johansson
2005-05-24  0:35 ` Trond Myklebust
2005-05-24  1:04   ` Trond Myklebust
2005-05-24 10:15   ` Kenneth Johansson
2005-05-24 12:01     ` Trond Myklebust
2005-05-24 14:58       ` Kenneth Johansson
2005-05-25 20:13       ` Kenneth Johansson
     [not found]         ` <20050525.131616.59655785.davem@davemloft.net>
2005-05-25 20:34           ` Kenneth Johansson