Poor localhost net performance on recent stable kernel
From: Kelly Burkhart @ 2010-04-15 15:44 UTC
  To: netdev, linux-kernel

Hello,

While working on upgrading distributions, I've noticed that local
network communication is much slower on 2.6.33.2 than on our old
kernel, 2.6.16.60 (SLES 10.2).

Running netperf UDP_RR against localhost, I get around 150000
transactions per second (tps) on the new kernel vs. 290000 tps with
the old kernel.  The netperf command:

netperf -T 1 -H 127.0.0.1 -t UDP_RR -c -C -- -r 100

TCP_RR had similar results.  The problem did not exist with TCP_STREAM.
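
(For reference: -T 1 binds the test to CPU 1, -H sets the target host,
-t selects the test type, -c and -C request local and remote CPU
utilization reporting, and -- -r 100 sets the request and response
size to 100 bytes.  The TCP_RR runs used the same options, just with
-t TCP_RR:

netperf -T 1 -H 127.0.0.1 -t TCP_RR -c -C -- -r 100 )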

While trying to track this down, I wrote a test program that writes
a 32-bit integer to a pipe and then reads it back:

static void tst_pipe0(int sleep_us)
{
    int pipefd[2];
    int idx;
    uint32_t tarr[ITERS];

    printf("tst_pipe0 -- sleep %dus\n", sleep_us);

    if (pipe(pipefd) < 0)
        err_exit("pipe");

    for (idx = 0; idx < ITERS; ++idx) {
        uint32_t btsc;  /* TSC immediately before the write */
        uint32_t rtsc;  /* value read back from the pipe */
        uint32_t etsc;  /* TSC immediately after the read */

        get_tscl(btsc);
        if (write(pipefd[1], &btsc, sizeof(btsc)) != (ssize_t)sizeof(btsc))
            err_exit("write");
        if (read(pipefd[0], &rtsc, sizeof(rtsc)) != (ssize_t)sizeof(rtsc))
            err_exit("read");
        get_tscl(etsc);
        tarr[idx] = etsc - btsc;  /* round-trip cost in cycles */
        do_sleep(sleep_us);
    }
    prt_avg(tarr, ITERS);
    close(pipefd[0]);
    close(pipefd[1]);
    printf("\n");
}
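
ITERS and the helpers aren't shown above.  Treat the following as
minimal stand-ins rather than the exact code from my harness:
get_tscl() grabs the low 32 bits of the TSC via rdtsc, do_sleep()
wraps usleep(), err_exit() is perror()+exit(), and prt_avg() prints
the mean.  In the full file the helpers precede tst_pipe0() and a
main() that sweeps the sleep values follows it:

#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <unistd.h>

#define ITERS 1000  /* iteration count; the exact value isn't critical */

/* Low 32 bits of the x86 timestamp counter. */
#define get_tscl(low) \
    __asm__ __volatile__("rdtsc" : "=a" (low) : : "edx")

static void err_exit(const char *msg)
{
    perror(msg);
    exit(1);
}

static void do_sleep(int sleep_us)
{
    if (sleep_us > 0)
        usleep(sleep_us);
}

static void prt_avg(const uint32_t *tarr, int n)
{
    uint64_t sum = 0;
    int idx;

    for (idx = 0; idx < n; ++idx)
        sum += tarr[idx];
    printf("avg: %llu cycles\n", (unsigned long long)(sum / n));
}

int main(void)
{
    int us;

    tst_pipe0(0);                      /* no sleep between iterations */
    for (us = 1; us <= 1500; us *= 2)  /* doubling sweep: 1us..~1ms */
        tst_pipe0(us);
    return 0;
}

Built and run along the lines of:

gcc -O2 -Wall -o pipetest pipetest.c && ./pipetest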

There's a dramatic difference when there's a sleep between iterations
on the new kernel.  On the old kernel the write/read round trip takes
1100-1300 cycles with or without a sleep.  On the new kernel, with no
sleep, the round trip is about 1400 cycles.  It doubles with a 1us
sleep, then gradually increases to 12000-14000 cycles and stabilizes
as I increase the sleep time to 1500us.  (At 2.93GHz, 12000-14000
cycles is roughly 4-5us per round trip.)  I'm not sure whether this is
related to the netperf difference or is a completely different
scheduling issue.

I'm running on an Intel Xeon X5570 @ 2.93GHz.  Varying the
tickless (NOHZ), preemption, and HZ kernel config options doesn't
substantially change the magnitude of the difference.

Does anyone have any ideas about what could be causing the netperf
issue?  And is the pipe microbenchmark meaningful, and if so, what
does it mean?

Thanks,

-Kelly


Re: Poor localhost net performance on recent stable kernel
From: Andrew Morton @ 2010-04-28 19:25 UTC
  To: Kelly Burkhart; +Cc: netdev, linux-kernel

On Thu, 15 Apr 2010 10:44:44 -0500
Kelly Burkhart <kelly.burkhart@gmail.com> wrote:

> While working on upgrading distributions, I've noticed that local
> network communication is much slower on 2.6.33.2 than on our old
> kernel, 2.6.16.60 (SLES 10.2).
> 
> Running netperf UDP_RR against localhost, I get around 150000
> transactions per second (tps) on the new kernel vs. 290000 tps with
> the old kernel.  The netperf command:
> 
> netperf -T 1 -H 127.0.0.1 -t UDP_RR -c -C -- -r 100

I ran this command on a Red Hat 2.6.18-1.2868 kernel and on 2.6.34-rc5.

2.6.18-1.2868: 43903.29 per second
2.6.34-rc5:    72506.11 per second

IIRC, localhost communications have always exhibited quite large
variations between kernel versions, depending on various vagaries
of alignment, cacheline sharing, etc.

> TCP_RR had similar results.  The problem did not exist with TCP_STREAM.
> 
> While trying to track this down, I wrote a test program that writes
> a 32-bit integer to a pipe and then reads it back:
> 
> [ test program snipped ]
> 
> There's a dramatic difference when there's a sleep between iterations
> on the new kernel.  On the old kernel the write/read round trip takes
> 1100-1300 cycles with or without a sleep.  On the new kernel, with no
> sleep, the round trip is about 1400 cycles.  It doubles with a 1us
> sleep, then gradually increases to 12000-14000 cycles and stabilizes
> as I increase the sleep time to 1500us.  (At 2.93GHz, 12000-14000
> cycles is roughly 4-5us per round trip.)  I'm not sure whether this is
> related to the netperf difference or is a completely different
> scheduling issue.
> 
> I'm running on an Intel Xeon X5570 @ 2.93GHz.  Varying the
> tickless (NOHZ), preemption, and HZ kernel config options doesn't
> substantially change the magnitude of the difference.
> 
> Does anyone have any ideas about what could be causing the netperf
> issue?  And is the pipe microbenchmark meaningful, and if so, what
> does it mean?

Pipes don't share much code with udp-to-localhost - this is probably
something different.

If you were using two processes then I'd cheerily blame the scheduler,
because blaming the scheduler for WeirdShitWhichBroke is usually
correct.  But as you're using a single process, the pipe code itself
is a more likely source for any slowdowns.
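
If you want to see where the new kernel is spending the extra cycles,
something like

    perf record -g ./pipetest
    perf report

(perf has been in the kernel tree since 2.6.31; ./pipetest being
whatever you called the test binary) should show whether the time is
going to pipe_read()/pipe_write(), the scheduler, or somewhere else
entirely.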

As for the strange behavior with sleeps: dunno.  There are various
adjustments made to the sleep duration when performing short sleeps -
some in-kernel, perhaps some in glibc.  Plus we've been evolving the
internal implementation of sleeps, and changes in x86 clocksources and
NOHZ could affect the accuracy of the sleep duration.  So perhaps
what's happening is that different kernels sleep for different actual
durations when asked for the same short sleep.
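
One way to test that theory would be to time the sleeps themselves
with a monotonic clock on both kernels and compare requested versus
actual durations.  A minimal sketch (link with -lrt on older glibc):

#include <stdio.h>
#include <time.h>
#include <unistd.h>

int main(void)
{
    struct timespec t0, t1;
    int us;

    for (us = 1; us <= 1500; us *= 2) {
        clock_gettime(CLOCK_MONOTONIC, &t0);
        usleep(us);
        clock_gettime(CLOCK_MONOTONIC, &t1);
        /* elapsed time in microseconds */
        printf("requested %4d us, slept %7ld us\n", us,
               (t1.tv_sec - t0.tv_sec) * 1000000L +
               (t1.tv_nsec - t0.tv_nsec) / 1000);
    }
    return 0;
}

If the two kernels turn the same request into very different actual
sleep durations, that would account for at least part of what the pipe
test sees.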

If it's not that then it's probably the scheduler ;) But even the
scheduler would have trouble causing these sorts of effects if the
machine is otherwise idle.

