public inbox for linux-ia64@vger.kernel.org
From: "Luck, Tony" <tony.luck@intel.com>
To: torvalds@osdl.org
Cc: linux-ia64@vger.kernel.org
Subject: pipe performance regression on ia64
Date: Tue, 18 Jan 2005 17:41:16 +0000
Message-ID: <200501181741.j0IHfGf30058@unix-os.sc.intel.com>

David Mosberger pointed out to me that the 2.6.11-rc1 kernel scores
very badly on ia64 in the lmbench pipe throughput test (bw_pipe) compared
with earlier kernels.

Nanhai Zou looked into this, and found that the performance loss
began with Linus' patch to speed up pipe performance by allocating
a circular list of pages.

Here's his analysis:

>OK, I know the reason now.
>
>The regression we saw comes from the scheduler load balancer.
>
>A pipe is the kind of workload where the writer and reader never run at
>the same time. They are synchronized by a semaphore. One end is always
>sleeping while the other end is working.
>
>To keep the cache hot, we do not want the writer and reader
>to be balanced onto 2 cpus. That is why, in fs/pipe.c, the kernel uses
>wake_up_interruptible_sync() instead of wake_up_interruptible() to wake
>up the process.
>
>Now, the load balancer still balances the processes if any other
>cpu is idle.  Note that on an HT-enabled x86 the load balancer will
>first balance the process to a cpu in the SMT domain, without a cache
>miss penalty.
>
>So, when we run bw_pipe on a lightly loaded SMP machine, the load
>balancer is always trying to spread the 2 processes out while
>wake_up_interruptible_sync() is always trying to draw them back onto
>1 cpu.
>
>Linus's patch greatly reduces the chance of calling
>wake_up_interruptible_sync().
>
>For the bw_pipe writer or reader, the buffer size is 64k.  In a 16k page
>kernel, the old kernel will call wake_up_interruptible_sync() 4 times but
>the new kernel will call it only 1 time.
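
The arithmetic above can be sketched as a small C helper. This is a
simplified model, not kernel code: wakeups_for is a hypothetical name, and
it assumes one wakeup each time the pipe's buffering fills, plus one for a
final partial chunk. The new-kernel case additionally assumes the circular
list holds at least four 16k pages.

```c
#include <stddef.h>

/* Hypothetical helper (not from the kernel or lmbench): number of wakeups
 * needed to move `total` bytes through a pipe that can buffer `capacity`
 * bytes before the writer has to wake the reader.
 * Simplified model; `capacity` must be nonzero. */
size_t wakeups_for(size_t total, size_t capacity)
{
    return (total + capacity - 1) / capacity;   /* ceiling division */
}
```

With a 64k transfer and a single 16k page of buffering,
wakeups_for(64 * 1024, 16 * 1024) gives 4. With four 16k pages available,
wakeups_for(64 * 1024, 4 * 16 * 1024) gives 1.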
>
>Now the load balancer wins, and the processes run on 2 cpus most of
>the time.  They pay a lot of cache miss penalty.
>
>To prove this, just run 4 instances of bw_pipe on a 4-way Tiger so that
>the load balancer is not so active.
>
>Or simply add some code at the top of main() in bw_pipe.c:
>
>{
>  /* bit 0 set: restrict this process to cpu 0 */
>  long affinity = 1;
>  sched_setaffinity(getpid(), sizeof(long), &affinity);
>}
>then make and run bw_pipe again.
>
>Now I get a throughput of 5GB...
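
On a current glibc, sched_setaffinity() takes a cpu_set_t rather than a
bare long, so the same experiment might be written as below. This is a
minimal sketch: pin_to_cpu is a hypothetical helper name, not part of
bw_pipe.c, and it assumes Linux with glibc.

```c
#define _GNU_SOURCE   /* exposes sched_setaffinity() and the CPU_* macros */
#include <sched.h>

/* Hypothetical helper: restrict the calling process to one cpu.
 * Returns 0 on success, -1 on error with errno set. */
int pin_to_cpu(int cpu)
{
    cpu_set_t mask;

    CPU_ZERO(&mask);        /* start with an empty cpu set */
    CPU_SET(cpu, &mask);    /* allow only the requested cpu */

    /* pid 0 means "the calling process" */
    return sched_setaffinity(0, sizeof(mask), &mask);
}
```

Calling pin_to_cpu(0) at the top of main() would correspond to the
affinity mask of 1 used in the snippet quoted above.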

-Tony

