* Possible problem with zero-copy TCP and sendfile()
@ 2001-04-17 13:10 Jan Kasprzak
  2001-04-17 14:41 ` Andi Kleen
  2001-04-17 14:48 ` Alan Cox
  0 siblings, 2 replies; 18+ messages in thread
From: Jan Kasprzak @ 2001-04-17 13:10 UTC
To: linux-kernel

Hello,

I have discovered a possible problem on my host. The short story:
when downloading ISO images from this host (which runs 2.4.3 + zerocopy
and ProFTPd with sendfile()), the image is sometimes corrupted (the MD5
checksum of the downloaded file does not match).

The long story: my server is an Athlon 850 on an ASUS A7V with 256M RAM,
seven IDE discs and one SCSI disc. The controllers and NIC are as follows
(output of lspci):

00:04.1 IDE interface: VIA Technologies, Inc. VT82C586 IDE [Apollo] (rev 10)
00:0a.0 SCSI storage controller: Adaptec AIC-7881U
00:0c.0 Ethernet controller: 3Com Corporation 3c905C-TX [Fast Etherlink] (rev 74)
00:11.0 Unknown mass storage controller: Promise Technology, Inc.: Unknown device 0d30 (rev 02)

The server runs Linux 2.4.3 with the zero-copy patches and ProFTPd 1.2.2rc1
compiled with --enable-sendfile. The FTP area is on a RAID-1 volume, which
is created over two LVM partitions (each LV spans three physical disks).
I hope RAID-1 can speed things up for multiple simultaneous users.

Red Hat Linux 7.1 was released yesterday, and since then the server has
had about 220 anonymous FTP users and has been pushing data at almost the
full 100 Mbps Ethernet speed (currently the 2-hour average is 89.7 Mbps
according to MRTG).

Today I've got about three complaints about corrupted ISO images.
When I run md5sum on the server itself, the MD5 checksums, of course,
match perfectly. I've tried to download the files from another machine
on the same net, and the MD5 sums were correct. However, I have one report
of a corrupted download even from the same physical network.

In the last 24 hours the server pushed out about 660 gigabytes of
Red Hat 7.1. Is this amount (i.e. three reports out of 660 gigabytes)
a serious problem?

Also note that I have no corrupted download report for rsync.
But I think rsyncd does not use sendfile(), and of course the vast majority
of people use FTP, not rsync, for downloading.

-Yenya

--
\ Jan "Yenya" Kasprzak <kas at fi.muni.cz>       http://www.fi.muni.cz/~kas/
\\ PGP: finger kas at aisa.fi.muni.cz   0D99A7FB206605D7 8B35FCDE05B18A5E //
\\\             Czech Linux Homepage:  http://www.linux.cz/              ///
Mantra: "everything is a stream of bytes". Repeat until enlightened. --Linus
* Re: Possible problem with zero-copy TCP and sendfile() 2001-04-17 13:10 Possible problem with zero-copy TCP and sendfile() Jan Kasprzak @ 2001-04-17 14:41 ` Andi Kleen 2001-04-17 15:50 ` Jan Kasprzak 2001-04-17 14:48 ` Alan Cox 1 sibling, 1 reply; 18+ messages in thread From: Andi Kleen @ 2001-04-17 14:41 UTC (permalink / raw) To: Jan Kasprzak; +Cc: linux-kernel On Tue, Apr 17, 2001 at 03:10:07PM +0200, Jan Kasprzak wrote: > 00:0c.0 Ethernet controller: 3Com Corporation 3c905C-TX [Fast Etherlink] (rev 74) IIRC the problem came up earlier. Some versions of 3com NICs seem to make problems with the hardware checksum. There were some fixes in the driver later; could you try it with 2.4.4pre3 (which includes zerocopy) ? -Andi ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Possible problem with zero-copy TCP and sendfile() 2001-04-17 14:41 ` Andi Kleen @ 2001-04-17 15:50 ` Jan Kasprzak [not found] ` <20010417175916.A11824@gruyere.muc.suse.de> 0 siblings, 1 reply; 18+ messages in thread From: Jan Kasprzak @ 2001-04-17 15:50 UTC (permalink / raw) To: Andi Kleen; +Cc: linux-kernel Andi Kleen wrote: : On Tue, Apr 17, 2001 at 03:10:07PM +0200, Jan Kasprzak wrote: : > 00:0c.0 Ethernet controller: 3Com Corporation 3c905C-TX [Fast Etherlink] (rev 74) : : IIRC the problem came up earlier. Some versions of 3com NICs seem to make : problems with the hardware checksum. There were some fixes in the driver : later; could you try it with 2.4.4pre3 (which includes zerocopy) ? : I was not able to boot 2.4.4pre3 at all: It panicked when initializing aic7xxx. So I've changed the config to old_aic7xxx, but it locked up on starting up RAID arrays. BTW, patch-2.4.4pre3 does not contain any significant change to 3c59x.c (the only change is adding some #include file). Now I am back to 2.4.3 and I'll try to run proftpd without sendfile(). -Y. -- \ Jan "Yenya" Kasprzak <kas at fi.muni.cz> http://www.fi.muni.cz/~kas/ \\ PGP: finger kas at aisa.fi.muni.cz 0D99A7FB206605D7 8B35FCDE05B18A5E // \\\ Czech Linux Homepage: http://www.linux.cz/ /// ///... in B its 'extrn' not 'extern'. Alan (yes I programmed in B)\\\ ^ permalink raw reply [flat|nested] 18+ messages in thread
[parent not found: <20010417175916.A11824@gruyere.muc.suse.de>]
* Re: Possible problem with zero-copy TCP and sendfile()
  [not found] ` <20010417175916.A11824@gruyere.muc.suse.de>
@ 2001-04-17 17:07 ` Jan Kasprzak
  2001-04-17 17:40 ` Jan Kasprzak
  0 siblings, 1 reply; 18+ messages in thread
From: Jan Kasprzak @ 2001-04-17 17:07 UTC
To: Andi Kleen; +Cc: David S. Miller, pavel, linux-kernel

Andi Kleen wrote:
: I guess to debug this problem it would be useful to get some idea about the
: nature of the corruption. Could you enable sendfile() again, and when a
: user complains ask to download it again and provide a
: cmp -cl fileA fileB | head -500 listing of their differences?

Well, here it is:

$ cmp -cl seawolf-sendfile.iso seawolf-i386-SRPMS.iso
160628609   0 ^@    276 M->
160628610   0 ^@     32 ^Z
160628611   0 ^@     14 ^L
160628612   0 ^@     55 -
160628613   0 ^@    116 N
160628614   0 ^@    300 M-@
160628615   0 ^@    150 h
160628616   0 ^@    210 M-^H
160628617   0 ^@    271 M-9
160628618   0 ^@    307 M-G
160628619   0 ^@    377 M-^?
[ all bytes in sendfile()d image changed to zero until: ]
160661374   0 ^@    376 M-~
160661375   0 ^@    231 M-^Y
160661376   0 ^@    205 M-^E
160661377   1 ^A    364 M-t
160661378 103 C     277 M-?
160661379 104 D      13 ^K
160661380  60 0      50 (
160661381  60 0     360 M-p
160661382  61 1      77 ?
160661383   1 ^A    304 M-D
160661384   0 ^@    133 [
160661385 114 L     131 Y
160661386 111 I     377 M-^?
160661387 116 N     123 S
160661388 125 U     234 M-^\
160661389 130 X     250 M-(

Which simply means, that at 160628609 it started to send
the CD image from the beginning. Yes, the original image contains 0x8000
zeros, and then the text "\001CD001\001\000LINUX".

So it has probably nothing to do with 3c59x driver, but with sendfile()
or ProFTPd's use of sendfile().

If anybody wants to test it, I've left running ProFTPd with
sendfile() enabled at ftp.linux.cz, port 2121.

Thanks,

-Yenya

--
\ Jan "Yenya" Kasprzak <kas at fi.muni.cz>       http://www.fi.muni.cz/~kas/
\\ PGP: finger kas at aisa.fi.muni.cz   0D99A7FB206605D7 8B35FCDE05B18A5E //
\\\             Czech Linux Homepage:  http://www.linux.cz/              ///
///... in B its 'extrn' not 'extern'. Alan (yes I programmed in B)\\\
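[Editorial note] The arithmetic behind that conclusion can be checked mechanically. The following is only a sketch, not anything taken from ProFTPd or the kernel: `is_iso9660_descriptor`, `restart_gap` and the macro names are made up for illustration. It relies on two facts: `cmp -l` prints 1-based byte numbers, and an ISO 9660 image starts with a 32768-byte system area followed by a volume descriptor, which in this image begins with `\001CD001\001`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* An ISO 9660 image begins with a 16-sector (16 * 2048 byte) system area. */
#define ISO9660_SYSTEM_AREA 0x8000L

/* Byte numbers taken from the cmp -cl listing above (1-based). */
#define FIRST_ZEROED_BYTE   160628609L  /* first byte reported as different */
#define FIRST_HEADER_BYTE   160661377L  /* where "\001CD001" shows up */

/* Hypothetical helper: does buf[pos..] start with the descriptor
 * header "\001CD001\001" (type 1, standard identifier, version 1)? */
int is_iso9660_descriptor(const unsigned char *buf, size_t len, size_t pos)
{
    static const unsigned char sig[7] = { 0x01, 'C', 'D', '0', '0', '1', 0x01 };

    assert(buf != NULL);
    if (pos > len || len - pos < sizeof sig)
        return 0;
    return memcmp(buf + pos, sig, sizeof sig) == 0;
}

/* Distance between the first zeroed byte and the header in the bad
 * download. It equals the system area size if the stream restarted
 * from file offset 0 at FIRST_ZEROED_BYTE. */
long restart_gap(void)
{
    return FIRST_HEADER_BYTE - FIRST_ZEROED_BYTE;
}
```

Here `restart_gap()` evaluates to 32768, i.e. 0x8000, which matches the "0x8000 zeros, then the header" layout described in the message. In 0-based terms the restart would be at offset 160628608 of the download, or slightly earlier if the correct data happened to contain zeros just before that point, since cmp only lists bytes that differ.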
* Re: Possible problem with zero-copy TCP and sendfile()
  2001-04-17 17:07 ` Jan Kasprzak
@ 2001-04-17 17:40 ` Jan Kasprzak
  0 siblings, 0 replies; 18+ messages in thread
From: Jan Kasprzak @ 2001-04-17 17:40 UTC
To: Andi Kleen; +Cc: David S. Miller, pavel, linux-kernel

Jan Kasprzak wrote:
: $ cmp -cl seawolf-sendfile.iso seawolf-i386-SRPMS.iso
[...]
:
: Which simply means, that at 160628609 it started to send
: the CD image from the beginning.

Well, I did strace of proftpd, and it _may_ be a mis-interpretation
of the sendfile(2) semantics on the proftpd side. The relevant part
of strace follows:

gettimeofday({987527927, 46167}, NULL)  = 0
fcntl64(12, F_GETFL)                    = 0x802 (flags O_RDWR|O_NONBLOCK)
fcntl64(12, F_SETFL, O_RDWR)            = 0
sendfile(12, 9, [0], 678244352)         = 138133872
--- SIGALRM (Alarm clock) ---
rt_sigaction(SIGALRM, {0x804f520, [], SA_INTERRUPT|0x4000000}, NULL, 8) = 0
rt_sigaction(SIGALRM, NULL, {0x804f520, [], SA_INTERRUPT|0x4000000}, 8) = 0
rt_sigaction(SIGALRM, {0x804f520, [], SA_INTERRUPT|0x4000000}, NULL, 8) = 0
alarm(300)                              = 0
sigreturn()                             = ? (mask now [])
fcntl64(12, F_SETFL, O_RDWR|O_NONBLOCK) = 0
alarm(0)                                = 300
alarm(300)                              = 0
alarm(0)                                = 300
alarm(300)                              = 0
getpid()                                = 24482
geteuid32()                             = 14
getegid32()                             = 50
flock(6, LOCK_EX)                       = 0
lseek(6, 644, SEEK_SET)                 = 644
read(6, "\242_\0\0\16\0\0\0002\0\0\0\0\0\0\0I\10\0\0\0\0\0\0ftp"..., 644) = 644
lseek(6, 644, SEEK_SET)                 = 644
write(6, "\242_\0\0\16\0\0\0002\0\0\0\0\0\0\0I\10\0\0\0\0\0\0ftp"..., 644) = 644
flock(6, LOCK_UN)                       = 0
fcntl64(12, F_GETFL)                    = 0x802 (flags O_RDWR|O_NONBLOCK)
fcntl64(12, F_SETFL, O_RDWR)            = 0
sendfile(12, 9, [0], 540110480)         = 103469424

Now the fd 6 is the control connection, fd 9 is the file on disk,
and fd 12 is the data connection. The ProFTPd seems to set alarm
to 300 seconds (to detect stalled clients), but when interrupted,
something strange happens: either sendfile does not update the offset
in its third parameter, or it fails to update the offset in
the filedescriptor, or something like that.

Maybe ProFTPd should pass the non-zero value (actual offset?)
to sendfile() second time? What is the expected semantics of sendfile()
wrt. restarting transfers and being interrupted by SIGALRM?

-Yenya

--
\ Jan "Yenya" Kasprzak <kas at fi.muni.cz>       http://www.fi.muni.cz/~kas/
\\ PGP: finger kas at aisa.fi.muni.cz   0D99A7FB206605D7 8B35FCDE05B18A5E //
\\\             Czech Linux Homepage:  http://www.linux.cz/              ///
Mantra: "everything is a stream of bytes". Repeat until enlightened. --Linus
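[Editorial note] As far as the Linux sendfile(2) manual page describes it, when the third argument is non-NULL the kernel starts reading at *offset and advances *offset by the number of bytes it sent, and a return value smaller than count is not an error. In the trace above the second call's count is 678244352 - 138133872 = 540110480, so the count was reduced correctly, but the offset shown by strace is 0 again. A restart-safe caller might look roughly like the sketch below. `send_range` is a hypothetical helper, not ProFTPd code, and it assumes a blocking out_fd (there is no EAGAIN handling).

```c
#include <assert.h>
#include <errno.h>
#include <sys/sendfile.h>
#include <sys/types.h>

/* Hypothetical helper: send count bytes of in_fd, starting at file
 * offset start, to out_fd. Returns 0 when everything was sent and
 * -1 on error with errno set. Assumes out_fd is in blocking mode. */
int send_range(int out_fd, int in_fd, off_t start, size_t count)
{
    off_t offset = start;   /* the kernel advances this for us */

    assert(start >= 0);

    while (count > 0) {
        ssize_t n = sendfile(out_fd, in_fd, &offset, count);

        if (n == -1) {
            if (errno == EINTR)
                continue;   /* interrupted before any byte went out */
            return -1;
        }
        if (n == 0) {       /* in_fd ended before count bytes were sent */
            errno = EIO;
            return -1;
        }
        /* A short count (for example after SIGALRM) is not an error.
         * offset already points past the bytes that were sent, so only
         * the remaining count is adjusted here. */
        count -= (size_t)n;
    }
    return 0;
}
```

The design point is that the same `offset` variable is passed to every call and is never reset or incremented by the caller; only `count` is tracked in user space.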
* Re: Possible problem with zero-copy TCP and sendfile() 2001-04-17 13:10 Possible problem with zero-copy TCP and sendfile() Jan Kasprzak 2001-04-17 14:41 ` Andi Kleen @ 2001-04-17 14:48 ` Alan Cox 2001-04-17 15:02 ` Jan Kasprzak 1 sibling, 1 reply; 18+ messages in thread From: Alan Cox @ 2001-04-17 14:48 UTC (permalink / raw) To: Jan Kasprzak; +Cc: linux-kernel > The long story: My server is Athlon 850 on ASUS A7V, 256M RAM. > Seven IDE discs, one SCSI disc. The controllers and NIC are as follows > (output of lspci): See the VIA chipset report on www.theregister.co.uk about corruption problems with VIA chipsets. The cases seen on Linux included short and also sometimes stale/corrupted DMA transfers. Nothing in your report says it is or isnt going to be a VIA chipset problem but once a fixed BIOS is out for your board that would be a good first step. If it still does it then, its worth digging for kernel naughties ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Possible problem with zero-copy TCP and sendfile() 2001-04-17 14:48 ` Alan Cox @ 2001-04-17 15:02 ` Jan Kasprzak 2001-04-17 16:04 ` Alan Cox 0 siblings, 1 reply; 18+ messages in thread From: Jan Kasprzak @ 2001-04-17 15:02 UTC (permalink / raw) To: Alan Cox; +Cc: linux-kernel Alan Cox wrote: : > The long story: My server is Athlon 850 on ASUS A7V, 256M RAM. : > Seven IDE discs, one SCSI disc. The controllers and NIC are as follows : > (output of lspci): : : See the VIA chipset report on www.theregister.co.uk about corruption problems : with VIA chipsets. The cases seen on Linux included short and also sometimes : stale/corrupted DMA transfers. : : Nothing in your report says it is or isnt going to be a VIA chipset problem : but once a fixed BIOS is out for your board that would be a good first step. : If it still does it then, its worth digging for kernel naughties : I don't think I have 686b southbridge. I have 686 (without "b"): 00:00.0 Host bridge: VIA Technologies, Inc.: Unknown device 0305 (rev 02) 00:01.0 PCI bridge: VIA Technologies, Inc.: Unknown device 8305 00:04.0 ISA bridge: VIA Technologies, Inc. VT82C686 [Apollo Super] (rev 22) 00:04.1 IDE interface: VIA Technologies, Inc. VT82C586 IDE [Apollo] (rev 10) 00:04.2 USB Controller: VIA Technologies, Inc. VT82C586B USB (rev 10) 00:04.3 USB Controller: VIA Technologies, Inc. VT82C586B USB (rev 10) 00:04.4 Host bridge: VIA Technologies, Inc. VT82C686 [Apollo Super ACPI] (rev 30 [...] -Yenya -- \ Jan "Yenya" Kasprzak <kas at fi.muni.cz> http://www.fi.muni.cz/~kas/ \\ PGP: finger kas at aisa.fi.muni.cz 0D99A7FB206605D7 8B35FCDE05B18A5E // \\\ Czech Linux Homepage: http://www.linux.cz/ /// ///... in B its 'extrn' not 'extern'. Alan (yes I programmed in B)\\\ ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Possible problem with zero-copy TCP and sendfile() 2001-04-17 15:02 ` Jan Kasprzak @ 2001-04-17 16:04 ` Alan Cox 2001-04-17 16:15 ` Jan Kasprzak 0 siblings, 1 reply; 18+ messages in thread From: Alan Cox @ 2001-04-17 16:04 UTC (permalink / raw) To: Jan Kasprzak; +Cc: Alan Cox, linux-kernel > : but once a fixed BIOS is out for your board that would be a good first step. > : If it still does it then, its worth digging for kernel naughties > : > I don't think I have 686b southbridge. I have 686 (without "b"): Ok. What revision of 3c90x card do you have ? ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Possible problem with zero-copy TCP and sendfile()
  2001-04-17 16:04 ` Alan Cox
@ 2001-04-17 16:15 ` Jan Kasprzak
  2001-04-17 18:24 ` Pekka Pietikainen
  2001-04-17 20:10 ` Jesse S Sipprell
  0 siblings, 2 replies; 18+ messages in thread
From: Jan Kasprzak @ 2001-04-17 16:15 UTC
To: Alan Cox; +Cc: linux-kernel

Alan Cox wrote:
: > : but once a fixed BIOS is out for your board that would be a good first step.
: > : If it still does it then, its worth digging for kernel naughties
: > :
: > I don't think I have 686b southbridge. I have 686 (without "b"):
:
: Ok. What revision of 3c90x card do you have ?
:
PCI: Found IRQ 11 for device 00:0c.0
3c59x.c:LK1.1.13 27 Jan 2001 Donald Becker and others. http://www.scyld.com/network/vortex.html
See Documentation/networking/vortex.txt
eth0: 3Com PCI 3c905C Tornado at 0xa000, 00:50:da:06:95:21, IRQ 11
  product code 5957 rev 00.13 date 07-17-99
  8K byte-wide RAM 5:3 Rx:Tx split, autoselect/Autonegotiate interface.
  MII transceiver found at address 24, status 782d.
  Enabling bus-master transmits and whole-frame receives.
eth0: scatter/gather enabled. h/w checksums enabled

Some more progress: I now downgraded to proftpd without sendfile().
The CPU usage is now nearly 100% (with ~170 FTP users; with sendfile()
it was under 50% with >320 FTP users). But nevertheless, the downloaded
images now seem to be OK.

Should I try the stock 2.4.3 without zero-copy patches?

-Yenya

--
\ Jan "Yenya" Kasprzak <kas at fi.muni.cz>       http://www.fi.muni.cz/~kas/
\\ PGP: finger kas at aisa.fi.muni.cz   0D99A7FB206605D7 8B35FCDE05B18A5E //
\\\             Czech Linux Homepage:  http://www.linux.cz/              ///
///... in B its 'extrn' not 'extern'. Alan (yes I programmed in B)\\\
* Re: Possible problem with zero-copy TCP and sendfile() 2001-04-17 16:15 ` Jan Kasprzak @ 2001-04-17 18:24 ` Pekka Pietikainen 2001-04-17 20:10 ` Jesse S Sipprell 1 sibling, 0 replies; 18+ messages in thread From: Pekka Pietikainen @ 2001-04-17 18:24 UTC (permalink / raw) To: Jan Kasprzak; +Cc: linux-kernel On Tue, Apr 17, 2001 at 06:15:24PM +0200, Jan Kasprzak wrote: > Some more progress: I now downgraded to proftpd without sendfile(). > The CPU usage is now nearly 100% (with ~170 FTP users; with sendfile() > it was under 50% with >320 FTP users). But nevertheless, the downloaded > images now seem to be OK. > > Should I try the stock 2.4.3 without zero-copy patches? It might also be useful to try 2.4.3+zc with the dev->features |= NETIF_F_SG; in the 3c59x driver taken out (so it won't use zero-copy) Since it starts from the beginning instead of corrupting random packets I doubt it's a hardware problem, though. -- Pekka Pietikainen ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Possible problem with zero-copy TCP and sendfile() 2001-04-17 16:15 ` Jan Kasprzak 2001-04-17 18:24 ` Pekka Pietikainen @ 2001-04-17 20:10 ` Jesse S Sipprell 2001-04-17 20:23 ` David S. Miller 2001-04-17 20:36 ` Jan Kasprzak 1 sibling, 2 replies; 18+ messages in thread From: Jesse S Sipprell @ 2001-04-17 20:10 UTC (permalink / raw) To: Jan Kasprzak; +Cc: Alan Cox, linux-kernel, proftpd-devel On Tue, Apr 17, 2001 at 06:15:24PM +0200, Jan Kasprzak wrote: > Alan Cox wrote: > : > : but once a fixed BIOS is out for your board that would be a good first step. > : > : If it still does it then, its worth digging for kernel naughties > : > : > : > I don't think I have 686b southbridge. I have 686 (without "b"): > : > : Ok. What revision of 3c90x card do you have ? > : > PCI: Found IRQ 11 for device 00:0c.0 > 3c59x.c:LK1.1.13 27 Jan 2001 Donald Becker and others. http://www.scyld.com/network/vortex.html > See Documentation/networking/vortex.txt > eth0: 3Com PCI 3c905C Tornado at 0xa000, 00:50:da:06:95:21, IRQ 11 > product code 5957 rev 00.13 date 07-17-99 > 8K byte-wide RAM 5:3 Rx:Tx split, autoselect/Autonegotiate interface. > MII transceiver found at address 24, status 782d. > Enabling bus-master transmits and whole-frame receives. > eth0: scatter/gather enabled. h/w checksums enabled > > Some more progress: I now downgraded to proftpd without sendfile(). > The CPU usage is now nearly 100% (with ~170 FTP users; with sendfile() > it was under 50% with >320 FTP users). But nevertheless, the downloaded > images now seem to be OK. After cursory examination of proftpd, it appears that there is a misuse of the sendfile() call under Linux, which may be responsible for the corruption. The code was originally based on BSD semantics. Under Linux, the offset argument is not being used correctly to determine how much data has been sent in the case of EINTR. A patch will be coming out soon, as it is a fairly trivial fix. 
-- "In the event of a failure, the system can be configured to automatically restart itself. This feature of Windows NT Server provides maximum system up-time." -- Reliability and Fault Tolerance in Windows NT Server, MSC ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Possible problem with zero-copy TCP and sendfile()
  2001-04-17 20:10 ` Jesse S Sipprell
@ 2001-04-17 20:23 ` David S. Miller
  2001-04-17 20:44 ` Jesse S Sipprell
  2001-04-17 20:36 ` Jan Kasprzak
  1 sibling, 1 reply; 18+ messages in thread
From: David S. Miller @ 2001-04-17 20:23 UTC
To: Jesse S Sipprell; +Cc: Jan Kasprzak, Alan Cox, linux-kernel, proftpd-devel

Jesse S Sipprell writes:
 > A patch will be coming out soon, as it is a fairly trivial fix.

Thank you for tracking this down.

One more subtle note, for the case of error handling. There is a
change to sendfile() in the zerocopy patches which causes sendfile()
to act more like sendmsg() when errors occur.

Specifically, sendmsg() works roughly like the following when an
error happens:

handle_error:
	if (sent_something)
		return how_much_we_sent;
	else
		return ERROR_CODE;

So when an error happens, and the kernel was able to send some of
the data, you see something like this in the trace:

	sendmsg() = N
	...
	sendmsg() = ERROR_CODE

sendfile() used to act differently, and this made it difficult to
directly transform a sendmsg()+local_buffer based server into a
sendfile() one because the error handling was so different.

Previously, sendfile() wouldn't give you the partial transfer length,
you'd just get the error _regardless_ of whether any data was sent
successfully during that call. Alexey, myself, and others considered
this behavior bogus and inconsistent. So it was changed.

The long and short of it is that sendfile() now acts just like
sendmsg() when errors happen mid-send.

Later,
David S. Miller
davem@redhat.com
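[Editorial note] The return-value convention described in this message can be modelled in a few lines of user-space C. This is only an illustration of the rule, not kernel code; `finish_send` is a made-up name, and `err` stands for whatever errno value the failing operation would have produced.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>

/* Hypothetical model of the convention: sent is the number of bytes
 * that went out before the error err (an errno value) occurred. */
ssize_t finish_send(size_t sent, int err)
{
    assert(err > 0);

    if (sent > 0)
        return (ssize_t)sent;   /* report the partial transfer first */

    errno = err;                /* nothing was sent: report the error */
    return -1;
}
```

Under this convention a caller sees the partial count from one call and the error itself only from a later call, if the condition persists. A short return therefore seems best treated as "continue from the updated position" rather than as either success or failure of the whole transfer.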
* Re: Possible problem with zero-copy TCP and sendfile() 2001-04-17 20:23 ` David S. Miller @ 2001-04-17 20:44 ` Jesse S Sipprell 2001-04-17 20:50 ` David S. Miller 0 siblings, 1 reply; 18+ messages in thread From: Jesse S Sipprell @ 2001-04-17 20:44 UTC (permalink / raw) To: David S. Miller; +Cc: Jesse S Sipprell, Jan Kasprzak, Alan Cox, linux-kernel On Tue, Apr 17, 2001 at 01:23:07PM -0700, David S. Miller wrote: > One more subtle note, for the case of error handling. There is a > change to sendfile() in the zerocopy patches which causes sendfile() > to act more like sendmsg() when errors occur. How is this likely to affect applications? Currently, the glibc2.1 sendfile interface looks like: ssize_t sendfile(int out_fd, int in_fd, off_t *offset, size_t count); On error, -1 is returned in the usual fashion and offset is purported to be updated to point to the next byte following the last one sent. Will the zerocopy patches break this? > > Specifically, sendmsg() works roughly like the following when an > error happens: > > handle_error: > if (sent_something) > return how_much_we_sent; > else > return ERROR_CODE; > > So when an error happens, and the kernel was able to send some of > the data, you see something like this in the trace: > > sendmsg() = N > ... > sendmsg() = ERROR_CODE > > sendfile() used to act differently, and this made it difficult to > directly transform a sendmsg()+local_buffer based server into a > sendfile() one because the error handling was so different. > > Previously, sendfile() wouldn't give you the partial transfer length, > you'd just get the error _regardless_ of whether any data was sent > successfully during that call. Alexey, myself, and others considered > this behavior bogus and inconsistent. So it was changed. > > The long and short of it is that sendfile() now acts just like > sendmsg() when errors happen mid-send. > > Later, > David S. 
Miller > davem@redhat.com -- "In the event of a failure, the system can be configured to automatically restart itself. This feature of Windows NT Server provides maximum system up-time." -- Reliability and Fault Tolerance in Windows NT Server, MSC ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Possible problem with zero-copy TCP and sendfile() 2001-04-17 20:44 ` Jesse S Sipprell @ 2001-04-17 20:50 ` David S. Miller 0 siblings, 0 replies; 18+ messages in thread From: David S. Miller @ 2001-04-17 20:50 UTC (permalink / raw) To: Jesse S Sipprell; +Cc: Jan Kasprzak, Alan Cox, linux-kernel Jesse S Sipprell writes: > On error, -1 is returned in the usual fashion and offset is purported to be > updated to point to the next byte following the last one sent. > > Will the zerocopy patches break this? No, they should not. Later, David S. Miller davem@redhat.com ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Possible problem with zero-copy TCP and sendfile()
  2001-04-17 20:10 ` Jesse S Sipprell
  2001-04-17 20:23 ` David S. Miller
@ 2001-04-17 20:36 ` Jan Kasprzak
  2001-04-17 21:22 ` Wolfgang Rohdewald
  1 sibling, 1 reply; 18+ messages in thread
From: Jan Kasprzak @ 2001-04-17 20:36 UTC
To: Jesse S Sipprell; +Cc: Alan Cox, linux-kernel, proftpd-devel, pavel

Jesse S Sipprell wrote:
: After cursory examination of proftpd, it appears that there is a misuse of the
: sendfile() call under Linux, which may be responsible for the corruption. The
: code was originally based on BSD semantics. Under Linux, the offset argument
: is not being used correctly to determine how much data has been sent in the
: case of EINTR.
:
: A patch will be coming out soon, as it is a fairly trivial fix.
:
FWIW, I've fixed ProFTPd on my server with the following patch.
Sorry for making noise @ linux-kernel list, it was totally unrelated
to the Linux kernel:

--- proftpd-1.2.2rc1/src/data.c.sendfile	Thu Feb 15 15:24:53 2001
+++ proftpd-1.2.2rc1/src/data.c	Tue Apr 17 21:35:24 2001
@@ -760,7 +760,9 @@
      *
      * ssize_t sendfile(int out_fd, int in_fd, off_t *offset, size_t count)
      */
-    if((len = sendfile(session.d->outf->fd, retr_fd, offset, count)) == -1) {
+    len = sendfile(session.d->outf->fd, retr_fd, offset, count);
+    if (len == -1 || len > 0 && len < count) {
+      errno = EINTR;
 #elif defined(HAVE_BSD_SENDFILE)
     /* BSD semantics for sendfile are flexible...it'd be nice if we could
      * standardize on something like it. The semantics are:
@@ -797,7 +799,9 @@
     if((count -= len) <= 0)
       break;
 
+#if !defined(HAVE_LINUX_SENDFILE)
     *offset += len;
+#endif
 
     if(TimeoutStalled)
       reset_timer(TIMER_STALLED, ANY_MODULE);

-Yenya

--
\ Jan "Yenya" Kasprzak <kas at fi.muni.cz>       http://www.fi.muni.cz/~kas/
\\ PGP: finger kas at aisa.fi.muni.cz   0D99A7FB206605D7 8B35FCDE05B18A5E //
\\\             Czech Linux Homepage:  http://www.linux.cz/              ///
Mantra: "everything is a stream of bytes". Repeat until enlightened. --Linus
* Re: Possible problem with zero-copy TCP and sendfile()
  2001-04-17 20:36 ` Jan Kasprzak
@ 2001-04-17 21:22 ` Wolfgang Rohdewald
  2001-04-17 21:43 ` Martin Josefsson
  2001-04-18 6:09 ` Jan Kasprzak
  0 siblings, 2 replies; 18+ messages in thread
From: Wolfgang Rohdewald @ 2001-04-17 21:22 UTC
To: linux-kernel; +Cc: Jan Kasprzak

On Tuesday 17 April 2001 22:36, Jan Kasprzak wrote:
> +    if (len == -1 || len > 0 && len < count) {

Are you sure there are no missing ()?

if ((len == -1) || (len > 0) && (len < count)) {

assuming that && has precedence over || (I believe so)
* Re: Possible problem with zero-copy TCP and sendfile()
  2001-04-17 21:22 ` Wolfgang Rohdewald
@ 2001-04-17 21:43 ` Martin Josefsson
  2001-04-18 6:09 ` Jan Kasprzak
  1 sibling, 0 replies; 18+ messages in thread
From: Martin Josefsson @ 2001-04-17 21:43 UTC
To: Wolfgang Rohdewald; +Cc: linux-kernel, Jan Kasprzak

On Tue, 17 Apr 2001, Wolfgang Rohdewald wrote:

> On Tuesday 17 April 2001 22:36, Jan Kasprzak wrote:
> > +    if (len == -1 || len > 0 && len < count) {
>
> are you sure there are no missing () ?
>
> if ((len == -1) || (len > 0) && (len < count)) {
>
> assuming that && has precedence over || (I believe so)

I don't think this makes it that much cleaner.
If you want to make it clear what this does, you should write it more
like this:

if (len == -1 || (len > 0 && len < count))

I don't think it's the ==, < and > that are confusing, but the || and &&.

/Martin
* Re: Possible problem with zero-copy TCP and sendfile()
  2001-04-17 21:22 ` Wolfgang Rohdewald
  2001-04-17 21:43 ` Martin Josefsson
@ 2001-04-18 6:09 ` Jan Kasprzak
  1 sibling, 0 replies; 18+ messages in thread
From: Jan Kasprzak @ 2001-04-18 6:09 UTC
To: Wolfgang Rohdewald; +Cc: linux-kernel

Wolfgang Rohdewald wrote:
: On Tuesday 17 April 2001 22:36, Jan Kasprzak wrote:
: > +    if (len == -1 || len > 0 && len < count) {
:
: are you sure there are no missing () ?
:
: if ((len == -1) || (len > 0) && (len < count)) {
:
: assumig that && has precedence over || (I believe so)

Yes, but the precedence of ==, <, and > is even higher.

However, I've found a problem with the previous patch: The first
chunk should read:

-    if((len = sendfile(session.d->outf->fd, retr_fd, offset, count)) == -1) {
+    len = sendfile(session.d->outf->fd, retr_fd, offset, count);
+    if (len == -1 || len > 0 && len < count) {
+      if (len != -1)
+        errno = EINTR;

i.e. we should not overwrite errno, when it is valid.

-Yenya

PS.: You can find the C operators precedence for example at
http://www.howstuffworks.com/c14.htm (found by Google).

--
\ Jan "Yenya" Kasprzak <kas at fi.muni.cz>       http://www.fi.muni.cz/~kas/
\\ PGP: finger kas at aisa.fi.muni.cz   0D99A7FB206605D7 8B35FCDE05B18A5E //
\\\             Czech Linux Homepage:  http://www.linux.cz/              ///
Mantra: "everything is a stream of bytes". Repeat until enlightened. --Linus
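[Editorial note] For readers who want to convince themselves about the grouping: in C the relational and equality operators bind tighter than `&&`, which in turn binds tighter than `||`, so the condition in the patch parses as `len == -1 || (len > 0 && len < count)`. The sketch below compares the two spellings. The function names are made up, and both operands are plain `long` here, which sidesteps whatever signed/unsigned conversions the real `len` and `count` types in ProFTPd might involve.

```c
#include <assert.h>

/* The condition exactly as written in the patch. */
int needs_retry_as_written(long len, long count)
{
    return len == -1 || len > 0 && len < count;
}

/* The same condition with the grouping spelled out. */
int needs_retry_explicit(long len, long count)
{
    return (len == -1) || ((len > 0) && (len < count));
}

/* Compare both forms over a small range of return values. */
void check_grouping(void)
{
    for (long len = -2; len <= 12; len++)
        assert(needs_retry_as_written(len, 10) ==
               needs_retry_explicit(len, 10));
}
```

Both forms agree for every input, so the extra parentheses are purely a readability aid. Some compilers can warn about the unparenthesized `&&` inside `||`, which is another reason to prefer the explicit form.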
Thread overview: 18+ messages
2001-04-17 13:10 Possible problem with zero-copy TCP and sendfile() Jan Kasprzak
2001-04-17 14:41 ` Andi Kleen
2001-04-17 15:50 ` Jan Kasprzak
[not found] ` <20010417175916.A11824@gruyere.muc.suse.de>
2001-04-17 17:07 ` Jan Kasprzak
2001-04-17 17:40 ` Jan Kasprzak
2001-04-17 14:48 ` Alan Cox
2001-04-17 15:02 ` Jan Kasprzak
2001-04-17 16:04 ` Alan Cox
2001-04-17 16:15 ` Jan Kasprzak
2001-04-17 18:24 ` Pekka Pietikainen
2001-04-17 20:10 ` Jesse S Sipprell
2001-04-17 20:23 ` David S. Miller
2001-04-17 20:44 ` Jesse S Sipprell
2001-04-17 20:50 ` David S. Miller
2001-04-17 20:36 ` Jan Kasprzak
2001-04-17 21:22 ` Wolfgang Rohdewald
2001-04-17 21:43 ` Martin Josefsson
2001-04-18 6:09 ` Jan Kasprzak