* select() doesn't respect SO_RCVLOWAT ?
From: Felix Matathias @ 2005-03-10 21:58 UTC
To: linux-kernel

Dear all,

I am running a 2.4.21-9.0.3.ELsmp #1 kernel and I can setsockopt and
getsockopt correctly the SO_RCVLOWAT option, but select() seems to mark a
socket readable even if a single byte is ready to be read. Then, a read()
blocks until the specified number of bytes in SO_RCVLOWAT makes it to the
socket buffer.

This is the exact opposite behaviour of what I would have
expected/desired. Our application receives data at a rate of many kHz and
we want to avoid reading the socket until a predetermined amount of data
has been sent, to avoid partial reads. SO_RCVLOWAT seemed to be a nice way
to implement that.

An earlier message by Alan Cox was a bit cryptic:

"But is the cost of all those special case checks and all the handling
for it such as select computing if enough tcp packets together accumulated
worth the cost on every app not using LOWAT for the microscopic gain given
that essentially nobody uses it."

Does this mean that select() in Linux will wake up no matter what
SO_RCVLOWAT is set to?

Best Regards,

Felix Matathias

P.S. I would appreciate it if you could also cc your response to me.

--
Felix Matathias of Columbia University, Nevis Labs
Brookhaven National Lab      cell : 631-988-3694
Bldg 1005, 3-304             web : http://www.matathias.com
Upton, NY, 11973             photo: http://www.pbase.com/matathias
tel/fax :631-344-7622/3253   email: felix@nevis.columbia.edu
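For readers following along, the set-then-read-back check described in the message above might look roughly like the sketch below. The helper name set_and_get_rcvlowat is hypothetical and does not come from the thread. Note that a successful round trip only shows that the kernel stored the value; it says nothing about whether select() takes it into account, which is the question being asked.

```c
#include <sys/socket.h>

/*
 * Set SO_RCVLOWAT on fd, then read it back with getsockopt().
 * Returns the value the kernel reports, or -1 on error.
 * (set_and_get_rcvlowat is a hypothetical helper name.)
 */
int set_and_get_rcvlowat(int fd, int lowat)
{
    int reported = 0;
    socklen_t len = sizeof(reported);

    if (setsockopt(fd, SOL_SOCKET, SO_RCVLOWAT, &lowat, sizeof(lowat)) < 0)
        return -1;
    if (getsockopt(fd, SOL_SOCKET, SO_RCVLOWAT, &reported, &len) < 0)
        return -1;
    return reported;
}
```

Comparing the return value with the requested one is the "setting/checking" step that comes up later in the thread.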
* Re: select() doesn't respect SO_RCVLOWAT ?
From: Willy Tarreau @ 2005-03-11 5:43 UTC
To: Felix Matathias
Cc: linux-kernel

On Thu, Mar 10, 2005 at 04:58:51PM -0500, Felix Matathias wrote:
>
> I am running a 2.4.21-9.0.3.ELsmp #1 kernel and I can setsockopt and
> getsockopt correctly the SO_RCVLOWAT option, but select() seems to mark a
> socket readable even if a single byte is ready to be read. Then, a read()
> blocks until the specified number of bytes in SO_RCVLOWAT makes it to the
> socket buffer.

As discussed in a previous thread, if you use select(), you should also
use non-blocking sockets. There are cases where select() can wake you up
without anything to read, e.g. if there is a packet waiting with a wrong
checksum.

> This is the exact opposite behaviour of what I would have
> expected/desired. Our application receives data at many kHz rate and we
> want to avoid reading the socket until a predetermined amount of data is
> sent, to avoid partial reads. SO_RCVLOWAT seemed to be a nice way to
> implement that.

I too came across this problem a long time ago and concluded that LOWAT
was not really usable on Linux. But in the end, this is not really a big
deal: as long as your application doesn't eat all the CPU, it does not
change anything performance-wise, and when it starts to eat a lot of CPU,
the latency will increase, letting more data come in before each read.

> An earlier message by Alan Cox was a bit cryptic:
>
> "But is the cost of all those special case checks and all the handling
> for it such as select computing if enough tcp packets together accumulated
> worth the cost on every app not using LOWAT for the microscopic gain given
> that essentially nobody uses it."
>
> Does this mean that select() in Linux will wake up no matter what
> SO_RCVLOWAT is set to ?

Yes.

Regards,
Willy
* Re: select() doesn't respect SO_RCVLOWAT ?
From: Alan Cox @ 2005-03-11 19:09 UTC
To: Felix Matathias
Cc: Linux Kernel Mailing List

On Thu, 2005-03-10 at 21:58, Felix Matathias wrote:
> Dear all,
>
> I am running a 2.4.21-9.0.3.ELsmp #1 kernel and I can setsockopt and
> getsockopt correctly the SO_RCVLOWAT option

The only value the code supported, at least in the past, was 1.
Are you sure you are actually setting/checking it OK?
* Re: select() doesn't respect SO_RCVLOWAT ?
From: Felix Matathias @ 2005-03-11 20:26 UTC
To: Alan Cox
Cc: Linux Kernel Mailing List

Dear Alan,

I am positive. I can setsockopt, and then getsockopt returns the value
that I requested.

Stevens very clearly states that SO_RCVLOWAT has a direct impact on
select(), and I assumed that this would be the case for Linux.
What is the rationale for not complying with that? Is it the
micromanagement of select() that you dislike? Isn't a significant
reduction in the number of read operations a real gain in high speed
networking?

Best Regards,

Felix

On Fri, 11 Mar 2005, Alan Cox wrote:

> On Thu, 2005-03-10 at 21:58, Felix Matathias wrote:
>> Dear all,
>>
>> I am running a 2.4.21-9.0.3.ELsmp #1 kernel and I can setsockopt and
>> getsockopt correctly the SO_RCVLOWAT option
>
> The only value the code at least used to support was setting it to 1.
> Are you sure you are actually setting/checking ok ?
>
* Re: select() doesn't respect SO_RCVLOWAT ?
From: Alan Cox @ 2005-03-14 13:24 UTC
To: Felix Matathias
Cc: Linux Kernel Mailing List

On Fri, 2005-03-11 at 20:26, Felix Matathias wrote:
> Dear Alan,
>
> I am positive. I can setsockopt, and then, getsockopt returns the value
> that I requested.

OK, I misremembered: it's SO_SNDLOWAT that is locked to one in Linux.

> Stevens very clearly states that SO_RCVLOWAT has a direct impact on
> select() and I assumed that this would be the case for Linux.
> What is the rationale for not complying with that ? Is it the micromanagement
> of select() that you dislike ? Isn't a significant reduction in the
> amount of read operations a real gain in high speed networking ?

I believe that, since we implement SO_RCVLOWAT, it's a bug. Stevens and
1003.1g both agree with your expectations. The right list is probably
netdev@oss.sgi.com, however.

Alan
* Re: select() doesn't respect SO_RCVLOWAT ?
From: YOSHIFUJI Hideaki / 吉藤英明 @ 2005-03-14 13:34 UTC
To: alan, felix
Cc: linux-kernel, netdev

In article <1110806662.15927.108.camel@localhost.localdomain> (at Mon, 14 Mar 2005 13:24:24 +0000), Alan Cox <alan@lxorguk.ukuu.org.uk> says:

> 1003.1g both agree with your expectations. The right list is probably
> netdev@oss.sgi.com however.

I've just forwarded this thread to netdev.

--yoshfuji
* RE: select() doesn't respect SO_RCVLOWAT ?
From: Robert White @ 2005-03-22 2:30 UTC
To: 'Felix Matathias', 'Alan Cox'
Cc: 'Linux Kernel Mailing List'

-----Original Message-----
From: linux-kernel-owner@vger.kernel.org
[mailto:linux-kernel-owner@vger.kernel.org] On Behalf Of Felix Matathias
Sent: Friday, March 11, 2005 12:27 PM

> Isn't a significant reduction in the amount of read operations
> a real gain in high speed networking ?

In a word? No.

Here at my company we make various pieces of cell phone test equipment in
diverse configurations. One of these involves an XScale based Linux board
and a DSTnI based board running RTXC. The XScale board has several devices
connected to it and a private Ethernet segment connects the two boards
(blah blah blah, I'll skip the boring parts, but suffice it to say that I
have intimate control over the RTXC end of the link and we are doing 1-5
millisecond timing of real time events +/- uniform event delay factoring).

The cost of receiving large numbers of small packets dwarfs the cost of
read(). Depending on your actual network media, you will get better
sustained throughput by worrying about transmit fragmentation than you
will ever get by concerning yourself with (well written) small-read()
buffer reassembly.

Consider Ethernet, where sending one byte uses something like 70 (?) bytes
of wire bandwidth. If you have the slightest chance of recognizing framing
or sagging in your datastream, using TCP_CORK to make sure you only
transmit if you have more than ~45 characters pending can make a real
difference. Your particular mileage will, of course, vary. [On our box the
win/win point was to cork for ~800 bytes or a known end-of-frame, whichever
came first; said calculation included the DSTnI board's byte-copy and task
switching rates and a bunch of other things.]

In practical terms, if you can get to the read() before more data arrives,
then, unless you _really_ have something better to do, you might as well do
the read(). If your processing takes longer than the strobe on the read()
you will get some backlogging between reads that you will make up next
time.

There is a "natural speed" for any given application, and as long as your
data is slower than this speed, the practical load doesn't matter much.
Something somewhere is going to have to combine fragments, or not, so it is
only when your particular application starts to waste too much time in
context switching (the "real overhead" cost of a syscall) that you need to
sweat the syscall density. If you are "always" passing in a read buffer
that is bigger than the pending data, [e.g. if Y != X for all
"Y = read(fd,buf,X);"] then you are pretty much under the power curve and
doing nicely.

Regardless, the real place to maximize network throughput is in intelligent
write() combining. "The media is always slower than the computer" is the
watch-phrase for eking out your best throughput.

Rob White,
Casabyte, Inc.

P.S. "High Speed Networking" is not the same thing as "Fair Resource Usage
Networking" for the purposes of this discussion... 8-)
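As a rough illustration of the corking idea in this message, the sketch below holds a frame's pieces in the kernel with the Linux-specific TCP_CORK option and releases them at end-of-frame. It assumes a connected, blocking TCP socket. The helpers set_cork, write_all and send_frame are hypothetical names, and the size-based threshold mentioned in the message (~800 bytes) is not implemented here; only the end-of-frame case is shown.

```c
#include <errno.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Turn TCP_CORK on (1) or off (0) for a TCP socket; returns 0 or -1. */
int set_cork(int fd, int on)
{
    return setsockopt(fd, IPPROTO_TCP, TCP_CORK, &on, sizeof(on));
}

/* Write all len bytes, retrying on short writes and EINTR. */
int write_all(int fd, const void *data, size_t len)
{
    const unsigned char *p = data;

    while (len > 0) {
        ssize_t n = write(fd, p, len);

        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/*
 * Queue a header and a payload while corked, then remove the cork at
 * end-of-frame so the queued bytes can leave in as few segments as
 * possible. Returns 0 on success, -1 on error.
 */
int send_frame(int fd, const void *hdr, size_t hlen,
               const void *body, size_t blen)
{
    int rc = -1;

    if (set_cork(fd, 1) < 0)
        return -1;
    if (write_all(fd, hdr, hlen) == 0 && write_all(fd, body, blen) == 0)
        rc = 0;
    if (set_cork(fd, 0) < 0)    /* uncork even if a write failed */
        rc = -1;
    return rc;
}
```

A sender that builds frames from several small write() calls would call send_frame() (or cork and uncork around its own writes) rather than letting each small write go out as its own packet.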