public inbox for linux-ia64@vger.kernel.org
From: David Mosberger-Tang <David.Mosberger@acm.org>
To: linux-ia64@vger.kernel.org
Subject: Re: Attribute spinlock contention ticks to caller.
Date: Mon, 19 Sep 2005 17:52:11 +0000
Message-ID: <ed5aea4305091910526c0e475@mail.gmail.com>
In-Reply-To: <20050914222644.GA5036@lnx-holt.americas.sgi.com>

And as Stephane already explained, if you use the right tool, there is
no need for the hack that you suggest.  You can either use a
q-syscollect-like approach (which will give you call-counts, but not
necessarily distribute the time accurately) or you can unwind the
call-stack and even distribute the time correctly.  That's all doable
today without any special-case hacks.
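
To make the difference concrete, here is a minimal sketch of how one set
of samples yields different profiles depending on how ticks are
attributed. It assumes each sample already carries an unwound call stack,
leaf first. The function names, the sample data, and the
"spinlock_contention" frame name are all hypothetical; this is not the
output or API of any real tool.

```python
from collections import Counter

# Hypothetical name for the out-of-line lock contention routine.
LOCK_FRAMES = {"spinlock_contention"}


def flat_profile(samples):
    """Charge each sample to the function at the sampled IP (the leaf)."""
    return Counter(stack[0] for stack in samples)


def caller_profile(samples):
    """Charge each sample to the innermost frame that is not a lock routine."""
    counts = Counter()
    for stack in samples:
        # Fall back to the leaf if every frame is a lock routine.
        frame = next((f for f in stack if f not in LOCK_FRAMES), stack[0])
        counts[frame] += 1
    return counts


def inclusive_profile(samples):
    """Charge each sample once to every distinct function on its stack."""
    counts = Counter()
    for stack in samples:
        counts.update(set(stack))
    return counts


# Made-up samples: each tuple is one unwound call stack, leaf first.
samples = [
    ("spinlock_contention", "enqueue_packet", "net_rx"),
    ("spinlock_contention", "enqueue_packet", "net_rx"),
    ("spinlock_contention", "update_inode", "sys_write"),
    ("copy_data", "sys_write"),
]

flat = flat_profile(samples)
by_caller = caller_profile(samples)
inclusive = inclusive_profile(samples)
```

With full stacks, the flat view still shows that three of the four
samples are lock contention, while the caller and inclusive views show
which paths are contending. Neither view has to be given up for the
other.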

  --david

On 9/19/05, Robin Holt <holt@sgi.com> wrote:
> On Sun, Sep 18, 2005 at 06:18:20PM -0700, David Mosberger-Tang wrote:
> > Well, it's an example where attributing the spinlock contention time
> > to the caller would have completely obfuscated the problem.
> 
> Either way, we have obfuscation.  In the one case (attributing to caller),
> the obfuscation can be resolved by looking at the code.  In the other
> (multiple paths contending on independent locks), the obfuscation can
> only be resolved by repeating the test with different sampling.
> 
> Although that sounds simple, what if the test is difficult to execute?
> What if this appeared to be a one-time aberration that was captured during
> one of many iterations?  The chance to capture it again is gone.
> 
> For a more complete illustration, I would like to elaborate my previous
> example.  I had a sample file produced by our benchmarkers.  They had
> received the results on their third run after tweaking some app settings
> and the results were nearly impossible to believe.  This happened to be
> an MPI job where all ranks barrier at the end of a phase so one single
> rank being slow results in the entire application being slow.
> 
> After the third run, they repeated with the app settings from the
> second run and then repeated again with the settings from the third
> run.  Neither run showed any signs of a similar problem.  The customer
> acceptance test continued.  Before the customer would accept the results,
> they needed that anomaly explained.
> 
> Fortunately, the customer had required a sampling output from every
> run so data had been taken using perfmon and retained.  This was on a
> 2.4 based system.  The system had eight Ethernet adapters spread across
> the machine.  Interrupts for each were targeted to different cpus.
> 
> Because sampling was showing the caller, this turned into a simple
> question: why was there so much network receive activity?  On some of
> the cpus, we noticed a significant number of processes were trying to
> enqueue network packets at the same time.  The sample IP showed we were
> in a bundle after a spinlock was acquired.
> 
> Had we not provided the caller, we would have been left with something
> that was nearly impossible to diagnose definitively.  With the unroll,
> it became a simple matter of looking at the enabled network services and
> finding somebody had run a network benchmark using all eight network
> adapters.  We contacted the group responsible for network benchmarks
> and the problem was isolated and explained to the customer's satisfaction.
> 
> I hope this illustrates that one way of sampling makes it slightly more
> difficult to determine that the source of slowdown is contention on
> a lock, whereas the other way of sampling results in it being impossible
> to determine the source of a problem.  Given the choices, I would say
> the right way to do the sampling is to attribute the samples to
> the caller.
> 
> Thanks,
> Robin
> 


-- 
Mosberger Consulting LLC, voice/fax: 510-744-9372,
http://www.mosberger-consulting.com/
35706 Runckel Lane, Fremont, CA 94536

Thread overview: 17+ messages
2005-09-14 22:26 Attribute spinlock contention ticks to caller Robin Holt
2005-09-15  0:10 ` Keith Owens
2005-09-15  6:34 ` Stephane Eranian
2005-09-15  8:19 ` Stephane Eranian
2005-09-15 17:14 ` Robin Holt
2005-09-15 17:23 ` Robin Holt
2005-09-15 17:37 ` Luck, Tony
2005-09-15 22:29 ` Robin Holt
2005-09-15 22:54 ` Zou Nan hai
2005-09-16  9:37 ` Stephane Eranian
2005-09-16 22:29 ` Robin Holt
2005-09-17  1:08 ` David Mosberger-Tang
2005-09-18 23:06 ` Robin Holt
2005-09-19  1:18 ` David Mosberger-Tang
2005-09-19  8:35 ` Stephane Eranian
2005-09-19 15:17 ` Robin Holt
2005-09-19 17:52 ` David Mosberger-Tang [this message]
