From: Ingo Molnar <mingo@elte.hu>
To: Linus Torvalds <torvalds@osdl.org>
Cc: "Luck, Tony" <tony.luck@intel.com>,
	linux-ia64@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: pipe performance regression on ia64
Date: Wed, 19 Jan 2005 13:52:57 +0100
Message-ID: <20050119125257.GA8112@elte.hu>
In-Reply-To: <Pine.LNX.4.58.0501180951050.8178@ppc970.osdl.org>


* Linus Torvalds <torvalds@osdl.org> wrote:

> The "wake_up_sync()" hack only helps for the special case where we
> know the writer is going to write more. Of course, we could make the
> pipe code use that "synchronous" write unconditionally, and benchmarks
> would look better, but I suspect it would hurt real life.

not just that, it's incorrect scheduling, because it introduces the
potential to delay the woken up task by a long time, amounting to a
missed wakeup.

> I don't know how to make the benchmark look repeatable and good,
> though.  The CPU affinity thing may be the right thing.

the fundamental bw_pipe scenario is this: the wakeup happens before the
waker suspends (because it's userspace that decides when to suspend).
So the kernel rightfully notifies another, idle CPU to run the freshly
woken task. If the cross-CPU notification is fast enough for the target
CPU to 'grab' the task, then we'll get the "slow" benchmark case: the
waker remains on this CPU and the wakee runs on another CPU. If this CPU
happens to suspend the waker fast enough, before that other CPU has had
the chance to grab the task (we 'steal the task back'), then we'll see
the "fast" benchmark scenario.
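
as a rough illustration of that race, here is a toy model in Python. it
is not scheduler code: the function name and both parameters are
hypothetical, and it assumes the outcome depends only on which of the
two delays is shorter.

```python
def wakee_cpu(notify_latency_us: float, waker_suspend_us: float) -> str:
    """Toy model of the race: which CPU ends up running the wakee?

    notify_latency_us: assumed time for the idle CPU to receive the
        cross-CPU notification and pick the woken task up.
    waker_suspend_us: assumed time until the waker blocks, which frees
        its own CPU to run the wakee.
    """
    if notify_latency_us < waker_suspend_us:
        # the idle CPU grabs the task first: waker and wakee run on
        # different CPUs, which is the "slow" benchmark case
        return "remote"
    # the waker suspends first and its CPU takes the task back:
    # the "fast" benchmark case
    return "local"
```

in this model wakee_cpu(2.0, 5.0) gives "remote" and wakee_cpu(5.0, 2.0)
gives "local".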

i've seen traces where a single bw_pipe testrun showed _both_ variants
in chunks lasting hundreds of milliseconds, probably due to cacheline placement
putting the overhead sometimes above the critical latency, sometimes
below it.

so there will always be this 'latency and tendency to reschedule on
another CPU' thing that will act as a barrier between 'really good' and
'really bad' numbers, and if a test happens to be around that boundary
it will fluctuate back and forth.
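
a small simulation sketch of that boundary effect, in Python. the
names, the uniform jitter and the numbers are all assumptions made for
illustration, not measurements: each wakeup gets a cross-CPU latency
drawn around a mean, and lands in the "slow" case whenever that latency
falls below the critical threshold (the other CPU wins the race).

```python
import random


def slow_fraction(mean_latency_us, threshold_us, jitter_us,
                  runs=10000, seed=0):
    """Fraction of wakeups that end up in the 'slow' (cross-CPU) case.

    All parameters are hypothetical model inputs, not kernel values.
    """
    rng = random.Random(seed)  # fixed seed keeps the sketch repeatable
    slow = 0
    for _ in range(runs):
        # per-wakeup latency: the mean plus uniform jitter
        latency = mean_latency_us + rng.uniform(-jitter_us, jitter_us)
        if latency < threshold_us:
            slow += 1  # the other CPU grabbed the task in time
    return slow / runs
```

with a mean well below the threshold every wakeup is "slow", well above
it none are, and with the mean sitting right at the threshold the
result is a mix of both.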

and this property also has another effect: _worse_ scheduling decisions
(not waking up an idle CPU when we could) can result in _better_ bw_pipe
numbers. Also, a _slower_ scheduler can sometimes move the bw_pipe
workload below the threshold, resulting in _better_ numbers. So as far
as SMP systems are concerned, bw_pipe numbers have to be considered very
carefully.

this is a generic thing: message passing performance always scales
inversely with the quality of distribution of SMP tasks. The better we
are at spreading out tasks, the worse message passing latency gets.
(nothing will beat passive, work-less 'message passing' between two
tasks on the same CPU.)
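
one way to see the same-CPU case from userspace is to pin both ends of
a pipe to one CPU. the sketch below is Linux-only Python, and the
function name and parameters are hypothetical. note that it is a
one-byte ping-pong (a latency-style test), not lmbench's bw_pipe, so it
only illustrates the affinity idea.

```python
import os
import time


def pipe_round_trips(rounds, cpus=None):
    """Bounce one byte between parent and child over two pipes.

    cpus: optional set of CPU ids to pin both processes to;
          None leaves placement to the scheduler.
    Returns the elapsed seconds for `rounds` round trips.
    """
    saved = os.sched_getaffinity(0)
    if cpus is not None:
        os.sched_setaffinity(0, cpus)  # the child inherits this on fork
    p2c_r, p2c_w = os.pipe()  # parent -> child
    c2p_r, c2p_w = os.pipe()  # child -> parent
    pid = os.fork()
    if pid == 0:
        try:
            # child: echo every byte straight back
            os.close(p2c_w)
            os.close(c2p_r)
            for _ in range(rounds):
                os.write(c2p_w, os.read(p2c_r, 1))
        finally:
            os._exit(0)
    try:
        os.close(p2c_r)
        os.close(c2p_w)
        start = time.perf_counter()
        for _ in range(rounds):
            os.write(p2c_w, b"x")  # wake the child
            os.read(c2p_r, 1)      # block until it answers
        elapsed = time.perf_counter() - start
        os.close(p2c_w)
        os.close(c2p_r)
        os.waitpid(pid, 0)
    finally:
        os.sched_setaffinity(0, saved)  # restore the caller's affinity
    return elapsed
```

comparing pipe_round_trips(10000, {min(os.sched_getaffinity(0))})
against pipe_round_trips(10000) on an otherwise idle SMP box gives a
feel for the difference; the actual numbers depend on the machine and
the kernel.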

	Ingo
