* Re: packet re-ordering on SMP machines.
From: Ben Greear @ 2002-08-25 18:32 UTC
To: jamal; +Cc: netdev

jamal wrote:

> NAPI fixes packet reordering problems.

It does indeed.  I just patched the e1000 with the latest NAPI patch
I could find (from Aug 15 or so), and the re-ordering problems went away.

The number of packets dropped decreased too, but I still see about 1 out
of 1000 packets dropped due to rx-FIFO or rx-dropped.  This is when trying
to run 60,000 pps of 1514 byte packets from one port to the other on the
same dual-port e1000 NIC (copper).  It will generate up to about 72,000 pps
without dropping too many more...

I will do some more tests on two single-port NICs soon to see if that
performs better.

Also, I see the hard_start_xmit call failing 5876 times out of 2719493
calls (for example).  The code that calls the method looks like this:

    spin_lock_bh(&odev->xmit_lock);
    if (!netif_queue_stopped(odev)) {
        if (odev->hard_start_xmit(next->skb, odev)) {
            if (net_ratelimit()) {
                printk(KERN_INFO "Hard xmit error\n");
            }
            next->errors++;
            next->last_ok = 0;
        } else {
            next->last_ok = 1;
            next->sofar++;
            next->tx_bytes += (next->cur_pkt_size + 4); /* count csum */
        }
        next->next_tx_ns = getRelativeCurNs() + next->ipg;
    } else {
        /* Re-try it next time */
        next->last_ok = 0;
    }
    spin_unlock_bh(&odev->xmit_lock);

I have not seen hard_start_xmit fail on other drivers, even when
over-driving them well beyond their capabilities.

Any ideas what causes the hard_start_xmit errors?

Thanks,
Ben

-- 
Ben Greear <greearb@candelatech.com>       <Ben_Greear AT excite.com>
President of Candela Technologies Inc      http://www.candelatech.com
ScryMUD:  http://scry.wanfear.com     http://scry.wanfear.com/~greear

* Re: packet re-ordering on SMP machines.
From: jamal @ 2002-08-26  0:52 UTC
To: Ben Greear; +Cc: netdev

On Sun, 25 Aug 2002, Ben Greear wrote:

> jamal wrote:
>
> > NAPI fixes packet reordering problems.
>
> It does indeed.  I just patched the e1000 with the latest NAPI patch
> I could find (from Aug 15 or so), and the re-ordering problems went away.
>
> The number of packets dropped decreased too, but I still see about 1 out
> of 1000 packets dropped due to rx-FIFO or rx-dropped.  This is when trying
> to run 60,000 pps of 1514 byte packets from one port to the other on the
> same dual-port e1000 NIC (copper).  It will generate up to about 72,000 pps
> without dropping too many more...

That doesn't sound impressive at all. I know it's about .8 of wire rate
but you should be able to exceed that.
Robert was generating in the range of 800Kpps with that NIC if I recall
correctly.

> I will do some more tests on two single-port NICs soon to see if that
> performs better.

You should see better numbers.
Also if you have SMP, tie each NIC to a CPU.
Additionally, get the skb recycler patch from Robert; it should improve
things even more.

> Also, I see the hard_start_xmit call failing 5876 times out of 2719493
> calls (for example).  The code that calls the method looks like this:

I don't have access to that NIC. But a stoopid question: Have you tried
increasing the transmit queue via ifconfig? 1000 packets is reasonable
for gige.

cheers,
jamal

* Re: packet re-ordering on SMP machines.
From: Ben Greear @ 2002-08-26  4:34 UTC
To: jamal; +Cc: netdev

jamal wrote:

> That doesn't sound impressive at all. I know it's about .8 of wire rate
> but you should be able to exceed that.
> Robert was generating in the range of 800Kpps with that NIC if I recall
> correctly.

I had only tested 1514 byte pkts, so I was getting around 880Mbps,
which is pretty good as far as I know.

I see about 255 kpps when sending 64 byte pkts to myself.  Still
dropping about 1 in 4000 packets at this speed.  I think most of Robert's
tests didn't involve actually doing something with the received packet
though, and I am inspecting it for latency, sequence number, etc.

I'm even doing a __get_timeofday() call to calculate the latency...need
to find a faster way to do that...

If I only allocate/scan 1 per 100 packets (i.e. alloc one packet and send
it 100 times), then I get a more respectable 365kpps.  Robert's patch
should definitely help!

> Also if you have SMP, tie each NIC to a CPU.

That's with the irq_affinity thing in proc, right?

> Additionally, get the skb recycler patch from Robert; it should improve
> things even more.

Do you happen to have a URL for this?

Actually, the various network tweaks are relatively hard to find
(at least to find the most up-to-date copies).  It would be great if
there was a place where they were all concentrated.

> > Also, I see the hard_start_xmit call failing 5876 times out of 2719493
> > calls (for example).  The code that calls the method looks like this:
>
> I don't have access to that NIC. But a stoopid question: Have you tried
> increasing the transmit queue via ifconfig? 1000 packets is reasonable
> for gige.

I upped it, but it didn't stop the errors.  The NIC is still performing,
so it may not be a real problem...

Thanks for the info,
Ben
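
The wire-rate figures traded above can be checked with a short calculation. The following is a minimal sketch in C, not anything posted in the thread. It assumes that the quoted packet sizes exclude the 4-byte frame check sequence (which the "+ 4" in the transmit code suggests) and that each frame also costs 8 bytes of preamble and a 12-byte inter-frame gap on the wire. The function names are hypothetical.

```c
#include <assert.h>

/* Per-frame Ethernet overhead that is not part of the packet itself. */
#define FCS_BYTES       4   /* frame check sequence (CRC) */
#define PREAMBLE_BYTES  8   /* preamble plus start-of-frame delimiter */
#define IFG_BYTES      12   /* minimum inter-frame gap */

/*
 * Maximum frames per second for frames of frame_bytes (FCS excluded)
 * on a link running at link_bps bits per second.
 */
static double wire_rate_pps(double link_bps, unsigned frame_bytes)
{
    assert(frame_bytes > 0);
    unsigned on_wire = frame_bytes + FCS_BYTES + PREAMBLE_BYTES + IFG_BYTES;
    return link_bps / (on_wire * 8.0);
}

/* Fraction of wire rate reached by a measured packet rate. */
static double wire_rate_fraction(double measured_pps, double link_bps,
                                 unsigned frame_bytes)
{
    return measured_pps / wire_rate_pps(link_bps, frame_bytes);
}
```

Under these assumptions, wire_rate_pps(1e9, 1514) comes to roughly 81,274 frames per second, so 72,000 pps would be about 0.89 of wire rate and 60,000 pps about 0.74.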

* Re: packet re-ordering on SMP machines.
From: jamal @ 2002-08-26 11:20 UTC
To: Ben Greear; +Cc: netdev

On Sun, 25 Aug 2002, Ben Greear wrote:

> jamal wrote:
>
> > That doesn't sound impressive at all. I know it's about .8 of wire rate
> > but you should be able to exceed that.
> > Robert was generating in the range of 800Kpps with that NIC if I recall
> > correctly.
>
> I had only tested 1514 byte pkts, so I was getting around 880Mbps,
> which is pretty good as far as I know.

There's no reason you shouldn't be able to do wire rate.

> I see about 255 kpps when sending 64 byte pkts to myself.  Still
> dropping about 1 in 4000 packets at this speed.  I think most of Robert's
> tests didn't involve actually doing something with the received packet
> though, and I am inspecting it for latency, sequence number, etc.
>
> I'm even doing a __get_timeofday() call to calculate the latency...need
> to find a faster way to do that...

Ouch. For latency or sequencing you don't really need to inspect all
packets. Read academic papers on the subject. You probably need about 5%
of the total packets.
Also you don't have to do the checks at runtime; you can do them once the
run is complete (which you should be able to tell since you control both
send and receive).

> If I only allocate/scan 1 per 100 packets (i.e. alloc one packet and send
> it 100 times), then I get a more respectable 365kpps.  Robert's patch
> should definitely help!

Yes, clearly you will benefit.

> > Also if you have SMP, tie each NIC to a CPU.
>
> That's with the irq_affinity thing in proc, right?

Yes.

> > Additionally, get the skb recycler patch from Robert; it should improve
> > things even more.
>
> Do you happen to have a URL for this?
>
> Actually, the various network tweaks are relatively hard to find
> (at least to find the most up-to-date copies).  It would be great if
> there was a place where they were all concentrated.

Robert's site is the main repository; it may have READMEs with URLs
pointing to various locations.
ftp://130.238.98.12/pub/Linux/net-development/
and look at the recycling and NAPI sub-directories.

> > > Also, I see the hard_start_xmit call failing 5876 times out of 2719493
> > > calls (for example).  The code that calls the method looks like this:
> >
> > I don't have access to that NIC. But a stoopid question: Have you tried
> > increasing the transmit queue via ifconfig? 1000 packets is reasonable
> > for gige.
>
> I upped it, but it didn't stop the errors.  The NIC is still performing,
> so it may not be a real problem...

I don't have this NIC. When Robert shows up he may be able to explain
this.

cheers,
jamal
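
The sampling advice above (inspect roughly 5% of packets and defer the analysis until the run is over) could be sketched as follows. This is a hypothetical user-space illustration in C, not code from the packet generator under discussion. The names rx_log, rx_log_packet, SAMPLE_INTERVAL and MAX_SAMPLES are assumptions made for the example, and the timestamps are taken as already available in nanoseconds.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SAMPLE_INTERVAL 20    /* keep 1 packet in 20, i.e. 5% */
#define MAX_SAMPLES     4096  /* fixed buffer, no allocation in the hot path */

struct rx_sample {
    uint32_t seq;    /* sequence number carried in the packet */
    uint64_t tx_ns;  /* send timestamp carried in the packet */
    uint64_t rx_ns;  /* receive timestamp */
};

struct rx_log {
    struct rx_sample samples[MAX_SAMPLES];
    size_t count;    /* samples stored so far */
    uint64_t seen;   /* packets seen so far */
};

/* Hot path: only store raw values, and only for sampled packets.
 * Returns 1 if the packet was recorded, 0 otherwise. */
static int rx_log_packet(struct rx_log *log, uint32_t seq,
                         uint64_t tx_ns, uint64_t rx_ns)
{
    int sampled = 0;

    assert(log != NULL);
    if (log->seen % SAMPLE_INTERVAL == 0 && log->count < MAX_SAMPLES) {
        struct rx_sample *s = &log->samples[log->count++];
        s->seq = seq;
        s->tx_ns = tx_ns;
        s->rx_ns = rx_ns;
        sampled = 1;
    }
    log->seen++;
    return sampled;
}

/* After the run: mean latency over the samples, 0 if there are none. */
static uint64_t rx_log_mean_latency_ns(const struct rx_log *log)
{
    uint64_t total = 0;

    if (log->count == 0)
        return 0;
    for (size_t i = 0; i < log->count; i++)
        total += log->samples[i].rx_ns - log->samples[i].tx_ns;
    return total / log->count;
}

/* After the run: sampled packets that arrived behind an earlier sample. */
static size_t rx_log_reordered(const struct rx_log *log)
{
    size_t n = 0;

    for (size_t i = 1; i < log->count; i++)
        if (log->samples[i].seq < log->samples[i - 1].seq)
            n++;
    return n;
}
```

The design choice here is that the receive path does no arithmetic beyond a modulo and a store; the latency and ordering checks run once, after the sender has finished.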

* Re: packet re-ordering on SMP machines.
From: Xiaoliang (David) Wei @ 2002-08-26 23:03 UTC
To: Ben Greear, jamal, Cheng Jin, Cheng Hu, Steven Low; +Cc: netdev

Hi Ben and Jamal,
    Are you sure that gettimeofday per packet is a big overhead on a
Gbps connection?
    Did you compare the performance with gettimeofday per packet and
without? I guess RFC 1323 specifies that each packet should have a
timestamp (although not from gettimeofday).
    Also, what's your testbed's configuration, Ben? (I guess we could
use faster hardware to overcome this effect...)
    Thank you:)

ps: I am working on some high speed TCP experiment and may want to call
gettimeofday for every packet...

-David
Xiaoliang (David) Wei        Graduate Student in CS@Caltech
http://www.cs.caltech.edu/~weixl

* Re: packet re-ordering on SMP machines.
From: Ben Greear @ 2002-08-26 23:20 UTC
To: Xiaoliang (David) Wei; +Cc: jamal, Cheng Jin, Cheng Hu, Steven Low, netdev

Xiaoliang (David) Wei wrote:

> Hi Ben and Jamal,
>     Are you sure that gettimeofday per packet is a big overhead on a
> Gbps connection?
>     Did you compare the performance with gettimeofday per packet and
> without? I guess RFC 1323 specifies that each packet should have a
> timestamp (although not from gettimeofday).
>     Also, what's your testbed's configuration, Ben? (I guess we could
> use faster hardware to overcome this effect...)
>     Thank you:)
>
> ps: I am working on some high speed TCP experiment and may want to call
> gettimeofday for every packet...

Actually, now that I think back, I believe the generic ethernet code
timestamps each skb when it's received anyway....

So, my hit probably comes mostly from allocating new buffers and
potentially the gettimeofday that is done then.  I have not benchmarked
the kernel gettimeofday call in any sort of isolated case.

It does not appear that the CPU is what is limiting my particular test.
I think it's either the NIC or the driver, or more likely, the way
I'm driving it...

Ben

* Re: packet re-ordering on SMP machines.
From: jamal @ 2002-08-27 10:59 UTC
To: Xiaoliang (David) Wei; +Cc: Ben Greear, Cheng Jin, Cheng Hu, Steven Low, netdev

On Mon, 26 Aug 2002, Xiaoliang (David) Wei wrote:

> Hi Ben and Jamal,
>     Are you sure that gettimeofday per packet is a big overhead on a
> Gbps connection?

We may be talking about different things;
I am talking about do_gettimeofday -- which is very expensive.
Anyone who has time could look at improving that. It is run per incoming
packet.

>     Did you compare the performance with gettimeofday per packet and
> without?

I think it would be pretty noticeable if you got rid of the
per-incoming-packet calls to do_gettimeofday.

> I guess RFC 1323 specifies that each packet should have a
> timestamp (although not from gettimeofday).

In Linux, this is cleverly based on the system clock (jiffies).

cheers,
jamal

* Re: packet re-ordering on SMP machines.
From: Andi Kleen @ 2002-08-27 11:12 UTC
To: jamal; +Cc: Xiaoliang (David) Wei, Ben Greear, Cheng Jin, Cheng Hu, Steven Low, netdev

On Tue, Aug 27, 2002 at 06:59:33AM -0400, jamal wrote:
> On Mon, 26 Aug 2002, Xiaoliang (David) Wei wrote:
>
> > Hi Ben and Jamal,
> >     Are you sure that gettimeofday per packet is a big overhead on a
> > Gbps connection?
>
> We may be talking about different things;
> I am talking about do_gettimeofday -- which is very expensive.
> Anyone who has time could look at improving that. It is run per incoming
> packet.

That is because of the lock it takes. Locks are always slow.

Older kernels used gettimeoffset, which ran without a lock, but that was
changed because in some very obscure cases it could cause non-monotonic
timestamps when the user turns on timestamp receiving to user space
(kernel protocols do not care).

Possibilities:

- Ignore the problem and switch back to gettimeoffset again
- Switch to gettimeoffset but add some correction step for the unlikely
  case that someone wants the timestamp from user space
  (would be my preferred solution)
- Implement lockless gettimeofday like x86-64 or sparc
  (good one too, but likely slower than last)

-Andi
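
For the third possibility, one common way to let readers fetch a two-word time value without taking a lock is a sequence counter: the writer bumps the counter before and after each update, and a reader retries if it saw an odd or changed counter. The sketch below is a user-space illustration using C11 atomics with a single writer. It is an assumption that this resembles what the x86-64 or sparc code does; the thread does not show those implementations, and all names here are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

static atomic_uint time_seq;         /* even: stable, odd: update in progress */
static _Atomic uint64_t cur_sec;     /* protected by time_seq */
static _Atomic uint64_t cur_usec;    /* protected by time_seq */

/* Writer side (e.g. a timer tick). Assumes exactly one writer. */
static void time_update(uint64_t sec, uint64_t usec)
{
    unsigned s = atomic_load_explicit(&time_seq, memory_order_relaxed);

    assert((s & 1u) == 0);  /* no other update may be in flight */
    atomic_store_explicit(&time_seq, s + 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&cur_sec, sec, memory_order_relaxed);
    atomic_store_explicit(&cur_usec, usec, memory_order_relaxed);
    atomic_store_explicit(&time_seq, s + 2, memory_order_release);
}

/* Reader side: no lock taken, retries only if it raced with the writer. */
static void time_read(uint64_t *sec, uint64_t *usec)
{
    unsigned before, after;

    do {
        before = atomic_load_explicit(&time_seq, memory_order_acquire);
        *sec = atomic_load_explicit(&cur_sec, memory_order_relaxed);
        *usec = atomic_load_explicit(&cur_usec, memory_order_relaxed);
        atomic_thread_fence(memory_order_acquire);
        after = atomic_load_explicit(&time_seq, memory_order_relaxed);
    } while ((before & 1u) || before != after);
}
```

Readers never write to shared memory in this scheme, so many CPUs can read the time concurrently without bouncing a lock cache line; the cost is an occasional retry when a read overlaps an update.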

* Re: packet re-ordering on SMP machines.
From: jamal @ 2002-08-27 12:05 UTC
To: Andi Kleen; +Cc: Xiaoliang (David) Wei, Ben Greear, Cheng Jin, Cheng Hu, Steven Low, netdev

On Tue, 27 Aug 2002, Andi Kleen wrote:

> That is because of the lock it takes. Locks are always slow.

xtime_lock?

> Older kernels used gettimeoffset, which ran without a lock, but that was
> changed because in some very obscure cases it could cause non-monotonic
> timestamps when the user turns on timestamp receiving to user space
> (kernel protocols do not care).
>
> Possibilities:
>
> - Ignore the problem and switch back to gettimeoffset again

Is it safe to call gettimeoffset without the lock?

> - Switch to gettimeoffset but add some correction step for the unlikely
>   case that someone wants the timestamp from user space
>   (would be my preferred solution)
> - Implement lockless gettimeofday like x86-64 or sparc
>   (good one too, but likely slower than last)

ia64 seems to also have the lock.

cheers,
jamal

* Re: packet re-ordering on SMP machines.
From: Andi Kleen @ 2002-08-27 12:20 UTC
To: jamal; +Cc: Andi Kleen, Xiaoliang (David) Wei, Ben Greear, Cheng Jin, Cheng Hu, Steven Low, netdev

On Tue, Aug 27, 2002 at 08:05:04AM -0400, jamal wrote:
> On Tue, 27 Aug 2002, Andi Kleen wrote:
>
> > That is because of the lock it takes. Locks are always slow.
>
> xtime_lock?

Yes. It also has some other overhead.

> > Older kernels used gettimeoffset, which ran without a lock, but that was
> > changed because in some very obscure cases it could cause non-monotonic
> > timestamps when the user turns on timestamp receiving to user space
> > (kernel protocols do not care).
> >
> > Possibilities:
> >
> > - Ignore the problem and switch back to gettimeoffset again
>
> Is it safe to call gettimeoffset without the lock?

Of course. The only problem is that the clock can be non-monotonic
sometimes and not be in sync with gettimeofday, but at least the kernel
users of packet timestamps do not care.

The only problem is the socket option, but it is obscure enough that I
would not worry too much about it.

> > - Switch to gettimeoffset but add some correction step for the unlikely
> >   case that someone wants the timestamp from user space
> >   (would be my preferred solution)
> > - Implement lockless gettimeofday like x86-64 or sparc
> >   (good one too, but likely slower than last)
>
> ia64 seems to also have the lock.

Quick fix is to just use gettimeoffset in netif_rx again. Should
be fine for you.

-Andi
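
The thread does not say what the "correction step" for user-space timestamps would look like. One possible reading, offered purely as an assumption, is to clamp any value that would go backwards to the last value handed out. The C sketch below shows only that clamp; it is not thread-safe, it does nothing to keep the result in sync with gettimeofday, and the names are hypothetical.

```c
#include <assert.h>

struct stamp {
    long sec;
    long usec;
};

/* Last value handed to user space. Not thread-safe: a real version
 * would need its own protection, which is part of the cost being debated. */
static struct stamp last_reported;

/* Returns nonzero if a is strictly earlier than b. */
static int stamp_before(struct stamp a, struct stamp b)
{
    return a.sec < b.sec || (a.sec == b.sec && a.usec < b.usec);
}

/* Never report a time earlier than one already reported. */
static struct stamp stamp_make_monotonic(struct stamp fast)
{
    assert(fast.usec >= 0 && fast.usec < 1000000);
    if (stamp_before(fast, last_reported))
        return last_reported;
    last_reported = fast;
    return fast;
}
```

Because the clamp runs only when a timestamp is actually delivered to user space, the receive path itself could keep using the cheaper unlocked read.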

* Re: packet re-ordering on SMP machines.
From: kuznet @ 2002-08-27 13:06 UTC
To: Andi Kleen; +Cc: netdev

Hello!

> Of course. The only problem is that the clock can be non-monotonic
> sometimes and not be in sync with gettimeofday, but at least the kernel
> users of packet timestamps do not care.

What kernel users? Where did you find them? :-)

> The only problem is the socket option, but it is obscure enough that I
> would not worry too much about it.

I am very sorry, but passing the timestamp to user level is the only
purpose of timestamping and it _MUST_ be monotonic and synchronous to
time of day, otherwise it is completely useless.

In short, this timestamp must be synchronous to time of day.

> > > - Implement lockless

You have been talking about this for ages. :-)

Actually, the problem is solved very easily. Deprecate SIOCGSTAMP,
and either count users of SO_TIMESTAMP and enable timestamping only
when it is required, or, alternatively, move timestamp retrieval to
socket level.

Alexey

* Re: packet re-ordering on SMP machines.
From: Andi Kleen @ 2002-08-27 13:13 UTC
To: kuznet; +Cc: Andi Kleen, netdev

On Tue, Aug 27, 2002 at 05:06:30PM +0400, A.N.Kuznetsov wrote:
> Hello!
>
> > Of course. The only problem is that the clock can be non-monotonic
> > sometimes and not be in sync with gettimeofday, but at least the kernel
> > users of packet timestamps do not care.
>
> What kernel users? Where did you find them? :-)

Hmm, I thought TCP used it, but it seems to use jiffies directly.

Ok, no kernel users then. Not sure about sunrpc and out of tree stuff
like SCTP.

> > The only problem is the socket option, but it is obscure enough that I
> > would not worry too much about it.
>
> I am very sorry, but passing the timestamp to user level is the only
> purpose of timestamping and it _MUST_ be monotonic and synchronous to
> time of day, otherwise it is completely useless.

The make-monotonic step doesn't need to be in netif_rx. My old proposal
was to move it to the socket layer. Then it would only be done when
needed. Unfortunately it could get somewhat inaccurate when the queueing
delay is too long.

> > > > - Implement lockless
>
> You have been talking about this for ages. :-)

It is nearly there for x86-64 ;) (code is in for vsyscalls, just kernel
do_gettimeofday doesn't use it yet)

> Actually, the problem is solved very easily. Deprecate SIOCGSTAMP,
> and either count users of SO_TIMESTAMP and enable timestamping only
> when it is required, or, alternatively, move timestamp retrieval to
> socket level.

Moving it later may make it useless for RTT purposes when the queueing
delays are too long. But if no kernel users exist then just making it
a global refcnt could work nicely. Then most people would not eat the
overhead when count == 0.

-Andi
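
The "global refcnt" idea discussed above could look roughly like this. It is a hypothetical user-space sketch in C: timestamp_user_add and timestamp_user_remove stand in for whatever a socket would call when it turns SO_TIMESTAMP on or off, and expensive_clock_ns is a made-up placeholder for the costly clock read, not a kernel function.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

static atomic_int timestamp_users;   /* how many consumers want stamps */
static unsigned long clock_reads;    /* counts costly reads, for illustration */

struct pkt {
    uint64_t stamp_ns;               /* 0 means "not stamped" */
};

/* Placeholder for the expensive, locked clock read. */
static uint64_t expensive_clock_ns(void)
{
    clock_reads++;
    return clock_reads * 1000;
}

/* Called when a consumer enables timestamps. */
static void timestamp_user_add(void)
{
    atomic_fetch_add(&timestamp_users, 1);
}

/* Called when a consumer disables timestamps or goes away. */
static void timestamp_user_remove(void)
{
    int prev = atomic_fetch_sub(&timestamp_users, 1);

    assert(prev > 0);  /* every remove must pair with an earlier add */
}

/* Receive path: pay for the clock only while somebody is listening. */
static void rx_packet(struct pkt *p)
{
    if (atomic_load(&timestamp_users) > 0)
        p->stamp_ns = expensive_clock_ns();
    else
        p->stamp_ns = 0;
}
```

With the count at zero the receive path costs one atomic load per packet, which is the "most people would not eat the overhead" case; the stamp is still taken at receive time whenever a consumer exists, so queueing delay does not distort it.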

* Re: packet re-ordering on SMP machines.
From: kuznet @ 2002-08-27 13:24 UTC
To: Andi Kleen; +Cc: ak, netdev

Hello!

> Moving it later may make it useless for RTT purposes when the queueing
> delays are too long.

Absolutely wrong. RTT is always calculated end-to-end, otherwise it is
some meaningless quantity, be it sctp, rpc or something.

The only place where the precision of the timestamp is more or less
interesting is tcpdump. But not enough to make it not monotonic. :-)

Alexey

* Re: packet re-ordering on SMP machines.
From: Harald Welte @ 2002-09-15  8:42 UTC
To: Andi Kleen; +Cc: kuznet, netdev

On Tue, Aug 27, 2002 at 03:13:17PM +0200, Andi Kleen wrote:
> On Tue, Aug 27, 2002 at 05:06:30PM +0400, A.N.Kuznetsov wrote:
> > Hello!
> >
> > > Of course. The only problem is that the clock can be non-monotonic
> > > sometimes and not be in sync with gettimeofday, but at least the kernel
> > > users of packet timestamps do not care.
> >
> > What kernel users? Where did you find them? :-)
>
> Hmm, I thought TCP used it, but it seems to use jiffies directly.
>
> Ok, no kernel users then. Not sure about sunrpc and out of tree stuff
> like SCTP.

The iptables ULOG target passes the skb receive timestamp to userspace,
where it is (depending on local ulogd configuration) written in
logging/accounting databases.  (ULOG is in the kernel tree).

The issue is that ULOG is batching multiple packets (or parts of packets)
into one netlink message sent to userspace.  If userspace were to make
the timestamp, it would be very inaccurate.

There is at least one more iptables extension (out of the kernel tree)
using it - but I wouldn't consider this as important.

> -Andi

-- 
Live long and prosper
- Harald Welte / laforge@gnumonks.org               http://www.gnumonks.org/
============================================================================
GCS/E/IT d- s-: a-- C+++ UL++++$ P+++ L++++$ E--- W- N++ o? K- w--- O- M+
V-- PS++ PE-- Y++ PGP++ t+ 5-- !X !R tv-- b+++ !DI !D G+ e* h--- r++ y+(*)

* Re: packet re-ordering on SMP machines.
From: Alexey Kuznetsov @ 2002-09-15 21:55 UTC
To: Harald Welte; +Cc: ak, netdev

Hello!

> The iptables ULOG target passes the skb receive timestamp to userspace,

No different from a packet socket.

> one netlink message sent to userspace.  If userspace were to make
> the timestamp,

Nobody proposed to do this in userspace. This would not even be
"inaccuracy": the userspace time of read() is not correlated to the real
one at all, e.g. if userspace is going to do some DNS, times will differ
unpredictably.

Alexey

* Re: packet re-ordering on SMP machines.
From: Cheng Jin @ 2002-08-27 17:22 UTC
To: Andi Kleen; +Cc: jamal, Xiaoliang (David) Wei, Ben Greear, Cheng Hu, Steven Low, netdev@oss.sgi.com

Hi, Andi,

> Quick fix is to just use gettimeoffset in netif_rx again. Should
> be fine for you.

There doesn't appear to be a function called gettimeoffset in 2.4.18
anymore.  The closest I found was do_fast_gettimeoffset in
"arch/i386/kernel/time.c".  This appears to be the unlocked version that
you are referring to, except I can't tell why the higher 32 bits (edx) of
the timestamp aren't used (maybe the asm code takes care of it, but it
seems that the result is stored in edx).

What you said about a light-weight gettime function makes sense.  For our
purpose of timing RTTs, any gettime function with a resolution higher than
1 ms will probably be enough.  The time doesn't need to be exactly in sync
with the one obtained from the locking version of the gettime function.

Thanks,

Cheng

* Re: packet re-ordering on SMP machines.
From: Andi Kleen @ 2002-08-27 17:33 UTC
To: Cheng Jin; +Cc: Andi Kleen, jamal, Xiaoliang (David) Wei, Ben Greear, Cheng Hu, Steven Low, netdev@oss.sgi.com

On Tue, Aug 27, 2002 at 10:22:13AM -0700, Cheng Jin wrote:
> Hi, Andi,
>
> > Quick fix is to just use gettimeoffset in netif_rx again. Should
> > be fine for you.
>
> There doesn't appear to be a function called gettimeoffset in 2.4.18
> anymore.  The closest I found was do_fast_gettimeoffset in
> "arch/i386/kernel/time.c".  This appears to be the unlocked version that

Yes, I mean do_fast_gettimeoffset.

> you are referring to, except I can't tell why the higher 32 bits (edx) of
> the timestamp aren't used (maybe the asm code takes care of it, but it
> seems that the result is stored in edx).

32 bits of precision are probably enough for this.

> What you said about a light-weight gettime function makes sense.  For our
> purpose of timing RTTs, any gettime function with a resolution higher than
> 1 ms will probably be enough.  The time doesn't need to be exactly in sync
> with the one obtained from the locking version of the gettime function.

TSC should be fine then.

-Andi
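
Why the low 32 bits of a cycle counter can be enough is easiest to see with unsigned arithmetic: as long as fewer than 2^32 cycles pass between two readings, subtracting the low words gives the right difference even when the counter wraps in between. The C sketch below illustrates only that property. It assumes the offset is measured from a recent reference point such as the last timer tick, and the helper names and the cycles-per-microsecond parameter are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Cycles elapsed between two readings of the low 32 bits of a cycle
 * counter. Correct across one wrap, provided fewer than 2^32 cycles
 * actually passed. */
static uint32_t cycles_elapsed(uint32_t last_low, uint32_t now_low)
{
    return (uint32_t)(now_low - last_low);
}

/* Convert a cycle count to microseconds for a given clock speed. */
static uint32_t cycles_to_usec(uint32_t cycles, uint32_t cycles_per_usec)
{
    assert(cycles_per_usec > 0);
    return cycles / cycles_per_usec;
}
```

At an assumed 1 GHz clock (1000 cycles per microsecond), 2^32 cycles is about 4.29 seconds, while a 10 ms interval is only 10,000,000 cycles, so an offset taken since a recent tick fits comfortably in 32 bits.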

* Re: packet re-ordering on SMP machines.
From: Xiaoliang (David) Wei @ 2002-08-27 19:43 UTC
To: jamal, Andi Kleen; +Cc: Ben Greear, Cheng Jin, Cheng Hu, Steven Low, netdev

> > That is because of the lock it takes. Locks are always slow.
>
> xtime_lock?

I guess so, after looking at do_gettimeofday.

> > Possibilities:
> >
> > - Ignore the problem and switch back to gettimeoffset again
>
> Is it safe to call gettimeoffset without the lock?

What's the possible danger of ignoring the lock? Can I read xtime
directly?

* Re: packet re-ordering on SMP machines.
From: jamal @ 2002-08-25 15:56 UTC
To: linux-kernel; +Cc: netdev

NAPI fixes packet reordering problems.

Could people please post network related questions to netdev?
I think it even says so in the FAQ Richard Gooch maintains.
Believe it or not, quite a few people are not subscribed to lk.

cheers,
jamal

Thread overview: 19+ messages
     [not found] <Pine.GSO.4.30.0208251149320.29461-100000@shell.cyberus.ca>
2002-08-25 18:32 ` packet re-ordering on SMP machines Ben Greear
2002-08-26  0:52   ` jamal
2002-08-26  4:34     ` Ben Greear
2002-08-26 11:20       ` jamal
2002-08-26 23:03       ` Xiaoliang (David) Wei
2002-08-26 23:20         ` Ben Greear
2002-08-27 10:59         ` jamal
2002-08-27 11:12           ` Andi Kleen
2002-08-27 12:05             ` jamal
2002-08-27 12:20               ` Andi Kleen
2002-08-27 13:06                 ` kuznet
2002-08-27 13:13                   ` Andi Kleen
2002-08-27 13:24                     ` kuznet
2002-09-15  8:42                     ` Harald Welte
2002-09-15 21:55                       ` Alexey Kuznetsov
2002-08-27 17:22                 ` Cheng Jin
2002-08-27 17:33                   ` Andi Kleen
2002-08-27 19:43               ` Xiaoliang (David) Wei
2002-08-25 15:56 jamal