From: Eric Dumazet <dada1@cosmosbay.com>
To: Kenny Chang <kchang@athenacr.com>
Cc: netdev@vger.kernel.org, "David S. Miller" <davem@davemloft.net>,
	Christoph Lameter <cl@linux-foundation.org>
Subject: Re: Multicast packet loss
Date: Sun, 01 Mar 2009 18:03:12 +0100
Message-ID: <49AABFD0.5090204@cosmosbay.com>
In-Reply-To: <49A8FAFF.7060104@cosmosbay.com>

Eric Dumazet wrote:
> Kenny Chang wrote:
>> It's been a while since I updated this thread.  We've been running
>> through the different suggestions and tabulating their effects, as well
>> as trying out an Intel card.  The short story is that setting affinity
>> and MSI works to some extent, and the Intel card doesn't seem to change
>> things significantly.  The results don't seem consistent enough for us
>> to be able to point to a smoking gun.
>>
>> It does look like the 2.6.29-rc4 kernel performs okay with the Intel
>> card, but this is not a real-time build and it's not likely to be in a
>> supported Ubuntu distribution real soon.  We've reached the point where
>> we'd like to look for an expert dedicated to work on this problem for a
>> period of time.  The final result being some sort of solution to produce
>> a realtime configuration with a reasonably "aged" kernel (.24~.28) that
>> has multicast performance greater than or equal to that of 2.6.15.
>>
>> If anybody is interested in devoting some compensated time to this
>> issue, we're offering up a bounty:
>> http://www.athenacr.com/bounties/multicast-performance/
>>
>> For completeness, here's the table of our experiment results:
>>
>> ====================  ===============  ====  ========  ==============  ==============  ==============  ==============
>> Kernel                flavor           IRQ   affinity  *4x mcasttest*  *5x mcasttest*  *6x mcasttest*  *Mtools2* [4]_
>> ====================  ===============  ====  ========  ==============  ==============  ==============  ==============
>> Intel e1000e
>> --------------------  ---------------  ----  --------  --------------  --------------  --------------  --------------
>> 2.6.24.19             rt                     any       OK              Maybe           X
>> 2.6.24.19             rt                     CPU0      OK              OK              X
>> 2.6.24.19             generic                any       X
>> 2.6.24.19             generic                CPU0      OK
>> 2.6.29-rc3            vanilla-server         any       X
>> 2.6.29-rc3            vanilla-server         CPU0      OK
>> 2.6.29-rc4            vanilla-generic        any       X                                               OK
>> 2.6.29-rc4            vanilla-generic        CPU0      OK              OK              OK [5]_         OK
>> --------------------  ---------------  ----  --------  --------------  --------------  --------------  --------------
>> Broadcom BNX2
>> --------------------  ---------------  ----  --------  --------------  --------------  --------------  --------------
>> 2.6.24-19             rt               MSI   any       OK              OK              X
>> 2.6.24-19             rt               MSI   CPU0      OK              Maybe           X
>> 2.6.24-19             rt               APIC  any       OK              OK              X
>> 2.6.24-19             rt               APIC  CPU0      OK              Maybe           X
>> 2.6.24-19-bnx-latest  rt               APIC  CPU0      OK              X
>> 2.6.24-19             server           MSI   any       X
>> 2.6.24-19             server           MSI   CPU0      OK
>> 2.6.24-19             generic          APIC  any       X
>> 2.6.24-19             generic          APIC  CPU0      OK
>> 2.6.27-11             generic          APIC  any       X
>> 2.6.27-11             generic          APIC  CPU0      OK              10% drop
>> 2.6.28-8              generic          APIC  any       OK              X
>> 2.6.28-8              generic          APIC  CPU0      OK              OK              0.5% drop
>> 2.6.29-rc3            vanilla-server   MSI   any       X
>> 2.6.29-rc3            vanilla-server   MSI   CPU0      X
>> 2.6.29-rc3            vanilla-server   APIC  any       OK              X
>> 2.6.29-rc3            vanilla-server   APIC  CPU0      OK              OK
>> 2.6.29-rc4            vanilla-generic  APIC  any       X
>> 2.6.29-rc4            vanilla-generic  APIC  CPU0      OK              3% drop         10% drop        X
>> ====================  ===============  ====  ========  ==============  ==============  ==============  ==============
>>
>>
>> * [4] MTools2 is a test from 29West: http://www.29west.com/docs/TestNet/
>> * [5] In 5 trials, 1 of the trials dropped 2%, 4 of the trials dropped
>> nothing.
>>
>> Kenny
>>
> 
> Hi Kenny
> 
> I am investigating how to reduce contention (and schedule() calls) on this workload.
> 

I bound the NIC (gigabit BNX2) IRQ to CPU 0, so that the oprofile results for this CPU
show where ksoftirqd is spending its time.
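
For readers who want to reproduce the binding, here is a minimal sketch of one common way to do it: writing a hexadecimal CPU bitmask to /proc/irq/<N>/smp_affinity. The message does not say how the IRQ was actually bound, so this is an assumption. The helper names and the IRQ number in the usage note are hypothetical, and the plain hex mask assumes a machine with at most 32 CPUs (the profiled box has 8).

```python
def affinity_mask(cpus):
    """Return the hex bitmask for the given CPU numbers (bit n set = CPU n allowed)."""
    mask = 0
    for cpu in cpus:
        if cpu < 0:
            raise ValueError("CPU numbers must be non-negative")
        mask |= 1 << cpu
    return format(mask, "x")


def bind_irq(irq, cpus, proc_root="/proc/irq"):
    """Write the mask to <proc_root>/<irq>/smp_affinity (needs root on a real system)."""
    path = f"{proc_root}/{irq}/smp_affinity"
    with open(path, "w") as f:
        f.write(affinity_mask(cpus))
```

Calling bind_irq(16, [0]) would pin a hypothetical IRQ 16 to CPU 0. The real IRQ number for the NIC has to be looked up in /proc/interrupts.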

We can see the scheduler at work :)

Also, one thing to note is __copy_skb_header(): 9.49 % of CPU 0 time.
The problem comes from dst_clone() (6.05 % of the total, so roughly 2/3 of
__copy_skb_header()), which touches a highly contended cache line (the other CPUs
are decrementing the dst refcount).

CPU: Core 2, speed 3000.05 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) 
with a unit mask of 0x00 (Unhalted core cycles) count 100000
Samples on CPU 0
(samples for other cpus 1..7 omitted)
samples  cum. samples  %        cum. %     symbol name
23750    23750          9.8159   9.8159    try_to_wake_up
22972    46722          9.4944  19.3103    __copy_skb_header
20217    66939          8.3557  27.6660    enqueue_task_fair
14565    81504          6.0197  33.6857    sock_def_readable
13454    94958          5.5606  39.2463    task_rq_lock
13381    108339         5.5304  44.7767    resched_task
13090    121429         5.4101  50.1868    udp_queue_rcv_skb
11441    132870         4.7286  54.9154    skb_queue_tail
10109    142979         4.1781  59.0935    sock_queue_rcv_skb
10024    153003         4.1429  63.2364    __wake_up_sync
9952     162955         4.1132  67.3496    update_curr
8761     171716         3.6209  70.9705    sched_clock_cpu
7414     179130         3.0642  74.0347    rb_insert_color
7381     186511         3.0506  77.0853    select_task_rq_fair
6749     193260         2.7894  79.8747    __slab_alloc
5881     199141         2.4306  82.3053    __wake_up_common
5432     204573         2.2451  84.5504    __skb_clone
4306     208879         1.7797  86.3300    kmem_cache_alloc
3524     212403         1.4565  87.7865    place_entity
2783     215186         1.1502  88.9367    skb_clone
2576     217762         1.0647  90.0014    __udp4_lib_rcv
2430     220192         1.0043  91.0057    bnx2_poll_work
2184     222376         0.9027  91.9084    ipt_do_table
2090     224466         0.8638  92.7722    ip_route_input
1877     226343         0.7758  93.5479    __alloc_skb
1495     227838         0.6179  94.1658    native_sched_clock
1166     229004         0.4819  94.6477    __update_sched_clock
1083     230087         0.4476  95.0953    netif_receive_skb
1062     231149         0.4389  95.5343    activate_task
644      231793         0.2662  95.8004    __kmalloc_track_caller
638      232431         0.2637  96.0641    nf_iterate
549      232980         0.2269  96.2910    skb_put
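
To put a number on "scheduler at work", the report rows can be parsed and the percentages of scheduler-related symbols summed. The following is a rough sketch only. The grouping in SCHED_SYMBOLS is my own assumption about which symbols belong to the scheduler and wakeup path, not something the profile states, and SAMPLE contains just the first five rows of the report above.

```python
# First five rows of the report above; paste the full listing to analyze everything.
SAMPLE = """\
samples  cum. samples  %        cum. %     symbol name
23750    23750          9.8159   9.8159    try_to_wake_up
22972    46722          9.4944  19.3103    __copy_skb_header
20217    66939          8.3557  27.6660    enqueue_task_fair
14565    81504          6.0197  33.6857    sock_def_readable
13454    94958          5.5606  39.2463    task_rq_lock
"""

# Assumption: symbols counted as scheduler/wakeup work for this rough grouping.
SCHED_SYMBOLS = {
    "try_to_wake_up", "enqueue_task_fair", "task_rq_lock", "resched_task",
    "__wake_up_sync", "update_curr", "sched_clock_cpu", "rb_insert_color",
    "select_task_rq_fair", "__wake_up_common", "place_entity",
    "native_sched_clock", "__update_sched_clock", "activate_task",
}


def parse_report(text):
    """Map symbol name -> (samples, percent); header and malformed lines are skipped."""
    rows = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 5 or not fields[0].isdigit():
            continue
        samples, _cum, pct, _cum_pct, symbol = fields
        rows[symbol] = (int(samples), float(pct))
    return rows


def share(rows, symbols):
    """Sum the percent column over the symbols of interest."""
    return sum(pct for sym, (_samples, pct) in rows.items() if sym in symbols)


if __name__ == "__main__":
    rows = parse_report(SAMPLE)
    print(f"scheduler-related share: {share(rows, SCHED_SYMBOLS):.2f} %")
```

The same share() helper works for any other group of symbols, for example the skb cloning functions.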

