From: Eric Dumazet
Subject: Re: Network latency regressions from 2.6.22 to 2.6.29 (results with IRQ affinity)
Date: Mon, 20 Apr 2009 20:46:34 +0200
Message-ID: <49ECC30A.9040501@cosmosbay.com>
References: <49E78A79.6050604@cosmosbay.com> <49E78C1E.9060405@cosmosbay.com> <20090416.160002.09845606.davem@davemloft.net> <49EA2D7F.3080405@cosmosbay.com> <49ECB775.6030202@cosmosbay.com>
To: Christoph Lameter
Cc: David Miller, Michael Chan, Ben Hutchings, netdev@vger.kernel.org

Christoph Lameter wrote:
> On Mon, 20 Apr 2009, Eric Dumazet wrote:
>
>>> Sounds very good. If I just knew what you are measuring.
>>
>> Rephrasing my email, I was measuring latencies on the receiving machine,
>> using tcpdump and taking the difference between 'answer_time' and
>> 'request_time'. The thing is that these timestamps don't include hardware
>> delays: we note the time when the RX interrupt delivers the packet, and
>> the time right before the frame is given to hardware.
>>
>> 21:04:23.780421 IP 192.168.20.112.9001 > 192.168.20.110.9000: UDP, length 300 (request)
>> 21:04:23.780428 IP 192.168.20.110.9000 > 192.168.20.112.9001: UDP, length 300 (answer)
>>
>> Here, [21:04:23.780428 - 21:04:23.780421] = 7 us
>>
>> So my results are extensively published :)
>
> But they are not comparable with my results. There could be other effects
> in the system call API etc that have caused this regression. Plus tcpdump
> causes additional copies of the packet to be delivered to user space.

Yep, this was all mentioned in my mail.

I wanted to compare latencies on the receiver only, ruling out the hardware
and ruling out the sender (no need to reboot it).

These latencies are higher than the ones measured without tcpdump, since
tcpdump adds extra copies of each packet.

System call API effects are included in my tests, since the measured
interval covers:

t0: packet received from the NIC
    -> wakeup of the user process, scheduler...
    User process returns from recvfrom()   (copy from system to user space)
    User process does the sendto()         (copy from user to system space)
t1: -> dev_start_xmit() is called, packet given to the NIC driver
    (the NIC is idle during the tests, so it should really send the packet asap)
    User process calls recvfrom() again and blocks (this part is not
    accounted in the latency, as in your test)
t2: NIC driver acknowledges the TX

delta = t1 - t0
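To make the measured path concrete, here is a minimal sketch of such an
"answer" process. It is illustrative only, not the actual test program:
the port and the 300 byte payload are taken from the tcpdump trace above.

/*
 * Minimal UDP echo loop: the user-space side timed between t0 and t1.
 * Illustrative sketch, not the actual test program.
 */
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>

int main(void)
{
	struct sockaddr_in addr, peer;
	socklen_t peerlen;
	char buf[300];
	ssize_t len;
	int fd;

	fd = socket(AF_INET, SOCK_DGRAM, 0);
	if (fd < 0) {
		perror("socket");
		return 1;
	}

	memset(&addr, 0, sizeof(addr));
	addr.sin_family = AF_INET;
	addr.sin_addr.s_addr = htonl(INADDR_ANY);
	addr.sin_port = htons(9000);	/* port taken from the trace */
	if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
		perror("bind");
		return 1;
	}

	for (;;) {
		peerlen = sizeof(peer);
		/* blocks until the RX interrupt (t0) wakes us up;
		 * returning from recvfrom() copies kernel -> user */
		len = recvfrom(fd, buf, sizeof(buf), 0,
			       (struct sockaddr *)&peer, &peerlen);
		if (len < 0)
			continue;
		/* copies user -> kernel; the frame reaches the NIC
		 * driver at "t1" */
		sendto(fd, buf, len, 0,
		       (struct sockaddr *)&peer, peerlen);
	}
}

Everything between the two comments (scheduler wakeup, both copies, both
system calls) lands inside the measured 7 us.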
One thing that could hurt is the TX done interrupt, but this happens a few
us after "t1", so it doesn't hurt your workload, since the next frame is
received at least 100 us after the last answer... (the cpu is 99% idle)

Point is that even with tcpdump running, latencies are very good on
2.6.30-rc2, and were very good with 2.6.22. I see no significant
increase/decrease...

>>> CONFIG_HPET_TIMER=y
>>> CONFIG_HPET_EMULATE_RTC=y
>>> CONFIG_NR_CPUS=32
>>> CONFIG_SCHED_SMT=y
>>
>> OK, I had "# CONFIG_SCHED_SMT is not set"
>> I'll try with this option set
>
> Should not be relevant since the processor has no hyperthreading.
>
>> Are you running a 32 or 64 bit kernel ?
>
> Test was done using a 64 bit kernel.

Ok, I'll try 64 bit too :)

1 us is the time needed to access about 10 falsely shared cache lines, and
64 bit arches store fewer pointers/longs per cache line. So a 64 bit kernel
could be slower on this kind of workload in the general case (if several
cpus play the game).
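As a back-of-the-envelope check, a trivial program (assuming 64 byte cache
lines, typical for this class of hardware) shows the pointer density
difference:

/* Assumes 64 byte cache lines; build with -m32 and natively
 * to compare 32 bit and 64 bit pointer density. */
#include <stdio.h>

#define CACHE_LINE 64

int main(void)
{
	printf("pointer size      : %zu bytes\n", sizeof(void *));
	printf("pointers per line : %zu\n", CACHE_LINE / sizeof(void *));
	return 0;
}

Built with -m32 it prints 16 pointers per line; as a 64 bit binary it
prints 8, so a pointer-heavy structure touches roughly twice as many cache
lines when several cpus play the game.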