netdev.vger.kernel.org archive mirror
From: Eric Dumazet <dada1@cosmosbay.com>
To: Christoph Lameter <cl@linux.com>
Cc: jesse.brandeburg@intel.com, netdev@vger.kernel.org,
	bhutchiings@solarflare.com, mchan@broadcom.com,
	David Miller <davem@davemloft.net>
Subject: Re: udp ping pong with various process bindings (and correct cpu mappings)
Date: Fri, 24 Apr 2009 23:18:03 +0200
Message-ID: <49F22C8B.9000102@cosmosbay.com>
In-Reply-To: <alpine.DEB.1.10.0904241554100.29995@qirst.com>

Christoph Lameter wrote:
> Here are the results of a 40 byte udpping (http://gentwo.org/ll) run on
> kernels from 2.6.22 to 2.6.30-rc3 on a Dell 1950 dual quad core 3.3GHz.
> One system stays on a fixed 2.6.22 kernel; the kernel version on the other is varied.
> 
> Nice graph at http://gentwo.org/results/udpping-results.pdf
> 
> Summary:
> - Loss of ~1.5usec on fastest path (same cpu) since 2.6.22
> - Different cpu same core loses 2-3 usecs vs. same cpu
> - Different cpu different core loses ~ 8 usecs vs same cpu
> - Maximum is usually reached when threads are on different sockets, but
>   sometimes the same socket different core is worse (2.6.26/2.6.27).
> - Up to 9 usecs variance in a basic network operation just because
>   of process placement.
> 
> Same CPU
> Kernel		Test 1	Test 2	Test 3	Test 4	Average
> 2.6.22		83.03	82.9	82.89	82.92	82.94
> 2.6.23		83.35	82.81	82.83	82.86	82.96
> 2.6.24		82.66	82.56	82.64	82.73	82.65
> 2.6.25		84.28	84.29	84.37	84.3	84.31
> 2.6.26		84.72	84.38	84.41	84.68	84.55
> 2.6.27		84.56	84.44	84.41	84.58	84.5
> 2.6.28		84.7	84.43	84.47	84.48	84.52
> 2.6.29		84.91	84.67	84.69	84.75	84.76
> 2.6.30-rc2	84.94	84.72	84.69	84.93	84.82
> 2.6.30-rc3	84.88	84.7	84.73	84.89	84.8
> 
> Same core, different processor (l2 is shared)
> Kernel		Test 1	Test 2	Test 3	Test 4	Average
> 2.6.22		84.6	84.71	84.52	84.53	84.59
> 2.6.23		84.59	84.5	84.33	84.34	84.44
> 2.6.24		84.28	84.3	84.38	84.28	84.31
> 2.6.25		86.12	85.8	86.2	86.04	86.04
> 2.6.26		86.61	86.46	86.49	86.7	86.57
> 2.6.27		87	87.01	87	86.95	86.99
> 2.6.28		86.53	86.44	86.26	86.24	86.37
> 2.6.29		85.88	85.94	86.1	85.69	85.9
> 2.6.30-rc2	86.03	85.93	85.99	86.06	86
> 2.6.30-rc3	85.73	85.88	85.67	85.94	85.81
> 
> Same Socket, different core (l2 not shared)
> Kernel		Test 1	Test 2	Test 3	Test 4	Average
> 2.6.22		90.08	89.72	90	89.9	89.93
> 2.6.23		89.72	90.1	89.99	89.86	89.92
> 2.6.24		89.18	89.28	89.25	89.22	89.23
> 2.6.25		90.83	90.78	90.87	90.61	90.77
> 2.6.26		90.51	91.25	91.8	91.69	91.31
> 2.6.27		91.98	91.93	91.97	91.91	91.95
> 2.6.28		91.72	91.7	91.84	91.75	91.75
> 2.6.29		89.85	89.85	90.14	89.9	89.94
> 2.6.30-rc2	90.78	90.8	90.87	90.73	90.8
> 2.6.30-rc3	90.84	90.94	91.05	90.84	90.92
> 
> Different Socket
> Kernel		Test 1	Test 2	Test 3	Test 4	Average
> 2.6.22		91.64	91.65	91.61	91.68	91.645
> 2.6.23		91.9	91.84	91.92	91.83	91.873
> 2.6.24		91.33	91.24	91.42	91.38	91.343
> 2.6.25		92.39	92.04	92.3	92.23	92.240
> 2.6.26		90.64	90.57	90.6	90.08	90.473
> 2.6.27		91.14	91.26	90.9	91.09	91.098
> 2.6.28		92.3	91.92	92.3	92.23	92.188
> 2.6.29		90.57	89.83	89.9	90.41	90.178
> 2.6.30-rc2	90.59	90.97	90.27	91.69	90.880
> 2.6.30-rc3	92.08	91.32	91.21	92.06	91.668
> 
> 

Thanks Christoph for doing this

I believe we can restore the pre-2.6.25 performance level with a few small changes.

[The problem is that in 2.6.25, UDP memory accounting forced us to add a callback
to sock_def_write_space() at skb TX completion time. This function
then wakes up all thread(s) blocked in the recvfrom() syscall. Once awakened,
the thread(s) block again because no frame was received.]
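
To make the cost concrete, here is a small user-space model of that behaviour.
It is only a sketch: struct blocked_receiver, wake_receiver(), tx_completion()
and count_spurious() are hypothetical names invented for illustration, not
kernel code. It assumes the receive queue stays empty, so every wakeup caused
by a TX completion is wasted.

```c
#include <assert.h>

/* Hypothetical model of one thread blocked in recvfrom(). */
struct blocked_receiver {
        int rx_queue_len;   /* datagrams waiting to be read */
        int wakeups;        /* times the thread was made runnable */
        int spurious;       /* wakeups that found nothing to read */
};

/*
 * Models an unkeyed wakeup: the thread runs, rechecks its receive
 * queue, and goes back to sleep if the queue is still empty.
 */
static void wake_receiver(struct blocked_receiver *r)
{
        r->wakeups++;
        if (r->rx_queue_len == 0)
                r->spurious++;
}

/*
 * Models the write-space callback run at TX completion: it wakes every
 * waiter on the socket, including threads that only wait for input.
 */
static void tx_completion(struct blocked_receiver *waiters, int n)
{
        for (int i = 0; i < n; i++)
                wake_receiver(&waiters[i]);
}

/*
 * Scenario: 'nthreads' receivers are blocked, 'sends' frames complete
 * transmission, and nothing is received. Returns the wasted wakeups.
 */
static int count_spurious(int nthreads, int sends)
{
        struct blocked_receiver waiters[16] = {0};
        int total = 0;

        assert(nthreads >= 0 && nthreads <= 16);
        for (int s = 0; s < sends; s++)
                tx_completion(waiters, nthreads);
        for (int i = 0; i < nthreads; i++)
                total += waiters[i].spurious;
        return total;
}
```

In this model the wasted wakeups grow as nthreads * sends, which is why a
ping-pong test with one send per receive is sensitive to it.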


Davide Libenzi added an opaque 'key' argument to wakeups so that eventpoll
can avoid unnecessary wakeups. This infrastructure could be used on other paths.
(The most important one is the receive path, because writers are rarely blocked
on a full send buffer.)

commit 37e5540b3c9d838eb20f2ca8ea2eb8072271e403
Author: Davide Libenzi <davidel@xmailserver.org>
Date:   Tue Mar 31 15:24:21 2009 -0700

    epoll keyed wakeups: make sockets use keyed wakeups

    Add support for event-aware wakeups to the sockets code.  Events are
    delivered to the wakeup target, so that epoll can avoid spurious wakeups
    for non-interesting events.

commit 2dfa4eeab0fc7e8633974f2770945311b31eedf6

    epoll keyed wakeups: teach epoll about hints coming with the wakeup key

    Use the events hint now sent by some devices, to avoid unnecessary wakeups
    for events that are of no interest for the caller.  This code handles both
    devices that are sending keyed events, and the ones that are not (and
    even the ones that sometimes send events, and sometimes don't).

We can add support for these keys in the regular socket code, so that a process
waiting on receive won't be scheduled just because a TX completion occurred.


The standard way is to use autoremove_wake_function():

int autoremove_wake_function(wait_queue_t *wait, unsigned mode, int sync, void *key)
{
        int ret = default_wake_function(wait, mode, sync, key);

        if (ret)
                list_del_init(&wait->task_list);
        return ret;
}


/* this function ignores "key" argument */
int default_wake_function(wait_queue_t *curr, unsigned mode, int sync,
                          void *key)
{
        return try_to_wake_up(curr->private, mode, sync);
}


The new 'keyed' wakeups, on the other hand, can do better:

static int ep_poll_callback(wait_queue_t *wait, unsigned mode, int sync, void *key)
{
        int pwake = 0;
        unsigned long flags;
        struct epitem *epi = ep_item_from_wait(wait);
        struct eventpoll *ep = epi->ep;

        spin_lock_irqsave(&ep->lock, flags);


...
        /*
         * Check the events coming with the callback. At this stage, not
         * every device reports the events in the "key" parameter of the
         * callback. We need to be able to handle both cases here, hence the
         * test for "key" != NULL before the event match test.
         */
        if (key && !((unsigned long) key & epi->event.events))
                goto out_unlock;

}
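
The same key test could in principle be layered in front of an
autoremove-style wake function for a blocked receiver. Below is a user-space
sketch of that idea, not an actual patch: struct wait_entry, key_of(),
model_autoremove_wake() and model_receiver_wake() are hypothetical names, and
the choice of POLLIN | POLLERR as the "interesting" mask is my assumption.

```c
#include <assert.h>
#include <poll.h>
#include <stddef.h>

/* Hypothetical stand-in for a wait queue entry and its sleeping task. */
struct wait_entry {
        int queued;         /* 1 while still linked on the wait queue */
        int task_running;   /* 1 once the task has been made runnable */
};

/* Packs an event mask into the opaque key, as keyed wakeups do. */
static void *key_of(unsigned long events)
{
        return (void *)events;
}

/*
 * Models autoremove_wake_function(): ignore the key, try to wake the
 * task, and unlink the entry only if the wakeup actually happened.
 */
static int model_autoremove_wake(struct wait_entry *w, void *key)
{
        (void)key;
        if (w->task_running)
                return 0;       /* nothing to wake */
        w->task_running = 1;
        w->queued = 0;
        return 1;
}

/*
 * Hypothetical keyed variant for a receiver: a non-NULL key that
 * carries no input or error event is dropped before any wakeup.
 * A NULL key (unkeyed caller) still wakes, as in ep_poll_callback().
 */
static int model_receiver_wake(struct wait_entry *w, void *key)
{
        unsigned long bits = (unsigned long)key;

        if (bits && !(bits & (POLLIN | POLLERR)))
                return 0;
        return model_autoremove_wake(w, key);
}
```

With this shape, a wakeup keyed with POLLOUT leaves the receiver asleep and
still queued, while POLLIN or an unkeyed wakeup behaves exactly as before.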


I'll try to cook a patch in the following days, unless someone beats me to it :)

Thanks

