From mboxrd@z Thu Jan 1 00:00:00 1970
From: Badalian Vyacheslav
Subject: Re: e1000: Question about polling
Date: Wed, 20 Feb 2008 12:15:01 +0300
Message-ID: <47BBEF95.6010307@bigtelecom.ru>
References: <47B94D5C.2070509@bigtelecom.ru>
 <36D9DB17C6DE9E40B059440DB8D95F520474680C@orsmsx418.amr.corp.intel.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: netdev@vger.kernel.org
To: "Brandeburg, Jesse"
Return-path:
Received: from mail.bigtelecom.ru ([87.255.0.61]:50052 "EHLO mail.bigtelecom.ru"
 rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1757890AbYBTJPK
 (ORCPT ); Wed, 20 Feb 2008 04:15:10 -0500
In-Reply-To: <36D9DB17C6DE9E40B059440DB8D95F520474680C@orsmsx418.amr.corp.intel.com>
Sender: netdev-owner@vger.kernel.org
List-ID:

Very big thanks for this answer. It answers all my questions, and
probably my future questions too.
Thanks again!

> Badalian Vyacheslav wrote:
>
>> Hello all.
>>
>> Interesting thing:
>>
>> I have a PC that does NAT. Bandwidth is about 600 Mbit/s.
>>
>> It has 4 CPUs (2 x Core 2 Duo, HT off, 3.2 GHz).
>>
>> irqbalance in the kernel is off.
>>
>> nat2 ~ # cat /proc/irq/217/smp_affinity
>> 00000001
>
> this binds all irq 217 interrupts to cpu 0
>
>> nat2 ~ # cat /proc/irq/218/smp_affinity
>> 00000003
>
> do you mean to be balancing interrupts between cpu 0 and cpu 1 here?
> 1 = cpu 0
> 2 = cpu 1
> 4 = cpu 2
> 8 = cpu 3
>
> so 1 + 2 = 3 for irq 218, i.e. balancing between the two.
>
> sometimes the cpus will have a paired cache; depending on your bios it
> will be organized like cpu 0/2 = shared cache and cpu 1/3 = shared
> cache. you can find this out by looking at "physical id" and "core id"
> in /proc/cpuinfo.
>
>> Softirq (SI) load on CPU0 and CPU1 is about 90%.
>>
>> Good... now try:
>> echo ffffffff > /proc/irq/217/smp_affinity
>> echo ffffffff > /proc/irq/218/smp_affinity
>>
>> Result: 100% SI on CPU0.
>>
>> Question: why?
>
> because as each adapter generating interrupts gets rotated through
> cpu0, it gets "stuck" on cpu0: the napi scheduling can only run one
> poll at a time, so each adapter is always waiting in line behind the
> other to run its napi poll, always fills its quota (work_done is
> always != 0), and keeps interrupts disabled "forever".
>
>> I have heard that binding the IRQ of one netdevice to one CPU can
>> give about 30% more performance... but I have 4 CPUs... I expected
>> even more performance if I echo "ffffffff" into smp_affinity.
>
> only if your performance is not cache limited but cpu horsepower
> limited. you're sacrificing cache coherency for cpu power, but if
> that works for you then great.
>
>> The picture looks like this:
>> CPUs 0-3 each get over 50% SI... bandwidth goes up... 55% SI...
>> bandwidth goes up... then 100% SI on CPU0.
>>
>> I remember a patch for a problem like this... it patched the function
>> e1000_clean... the kernel on this PC has that patch (2.6.24-rc7-git2)
>> and the e1000 driver works much better (I get 1.5-2x the bandwidth
>> before I hit 100% SI), but I think it is still not getting 100% of
>> what it could =)
>
> the patch helps a little because it decreases the amount of time the
> driver spends in napi mode, basically shortening the exit condition
> (which re-enables interrupts, and therefore balancing) to
> work_done < budget, not work_done == 0.
>
>> Thanks for the answers and sorry for my English.
>
> you basically can't get much more than one cpu can do for each nic.
> it's possible to get a little more, but my guess is you won't get
> much. The best thing you can do is make sure as much traffic as
> possible stays in the same cache, on two different cores.
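
Just to check that I understand the masks and the shared cache: my plan
is to first look up which logical cpus share a package (and therefore an
L2 cache on Core 2 Duo) and then pin one port per core. This is only an
untested sketch on my side - the irq numbers are the ones from above,
but the cpu 0/2 pairing is a guess until I check /proc/cpuinfo:

  # "physical id" groups cpus by package, "core id" by core
  grep -E 'processor|physical id|core id' /proc/cpuinfo

  # if cpu 0 and cpu 2 really share a cache, pin one port to each
  echo 1 > /proc/irq/217/smp_affinity   # bit 0 -> cpu 0
  echo 4 > /proc/irq/218/smp_affinity   # bit 2 -> cpu 2

  # confirm where the interrupts actually land
  watch -n1 cat /proc/interrupts

Does that look right?
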
> you can try turning off NAPI mode, either in the kernel .config or by
> building the sourceforge driver with CFLAGS_EXTRA=-DE1000_NO_NAPI.
> this seems counterintuitive, but with the non-napi e1000 pushing
> packets onto the per-cpu backlog queue, you may actually get better
> performance due to the balancing.
>
> some day soon (maybe) we'll have some coherent way to have one tx and
> one rx interrupt per core, and enough queues for each port to be able
> to handle one queue per core.
>
> good luck,
> Jesse
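
P.S. I will also try the non-NAPI build you suggest. If I read the
sourceforge package right, the steps should be roughly the following
(untested on my side; the tarball version is a placeholder and the
src/ layout is my assumption):

  tar xzf e1000-<version>.tar.gz
  cd e1000-<version>/src
  make CFLAGS_EXTRA=-DE1000_NO_NAPI
  make install
  # after taking the interfaces down:
  rmmod e1000 && modprobe e1000

Then I will compare the per-cpu SI load against the NAPI build again.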