From: David Mosberger-Tang <David.Mosberger@acm.org>
To: linux-ia64@vger.kernel.org
Subject: Re: Attribute spinlock contention ticks to caller.
Date: Mon, 19 Sep 2005 17:52:11 +0000
Message-ID: <ed5aea4305091910526c0e475@mail.gmail.com>
In-Reply-To: <20050914222644.GA5036@lnx-holt.americas.sgi.com>

And as Stephane already explained, if you use the right tool, there is
no need for the hack that you suggest.  You can either use a
q-syscollect-like approach (which will give you call-counts, but not
necessarily distribute the time accurately) or you can unwind the
call-stack and even distribute the time correctly.  That's all doable
today without any special-case hacks.
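To make the distinction concrete, here is a minimal sketch of post-processing unwound samples in user space. It assumes each sample has already been captured as a call stack, innermost frame first. The function names in the sample data (spin_contend, enqueue_packet, and so on) are hypothetical placeholders, not real kernel symbols, and the sample counts are made up for illustration. The flat profile shows how much time went to lock contention overall, while the per-caller breakdown shows which code path was contending, so neither view has to be sacrificed in the kernel.

```python
from collections import Counter

# Hypothetical samples: one tuple per sample, innermost frame first.
SAMPLES = (
    [("spin_contend", "enqueue_packet", "rx_interrupt")] * 3
    + [("spin_contend", "update_inode", "sys_write")]
    + [("enqueue_packet", "rx_interrupt")]
    + [("copy_user", "sys_write")] * 2
)

# Assumed set of routines that only spin waiting for a lock.
LOCK_FUNCS = {"spin_contend"}


def flat_profile(samples):
    """Charge each sample to the function the sampled IP was in."""
    return Counter(stack[0] for stack in samples)


def contention_by_caller(samples, lock_funcs):
    """For samples that landed in a lock routine, charge the first
    non-lock frame found while unwinding outward."""
    profile = Counter()
    for stack in samples:
        if stack[0] not in lock_funcs:
            continue
        caller = next((f for f in stack[1:] if f not in lock_funcs), None)
        profile[caller] += 1
    return profile


if __name__ == "__main__":
    print("flat:", dict(flat_profile(SAMPLES)))
    print("contention by caller:", dict(contention_by_caller(SAMPLES, LOCK_FUNCS)))
```

Because the attribution happens when the samples are analyzed rather than when they are taken, the same data can be re-cut either way after the fact.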

  --david

On 9/19/05, Robin Holt <holt@sgi.com> wrote:
> On Sun, Sep 18, 2005 at 06:18:20PM -0700, David Mosberger-Tang wrote:
> > Well, it's an example where attributing the spinlock contention time
> > to the caller would have completely obfuscated the problem.
> 
> Either way, we have obfuscation.  In the one case (attributing to caller),
> the obfuscation can be resolved by looking at the code.  In the other
> (multiple paths contending on independent locks), the obfuscation can
> only be resolved by repeating the test with different sampling.
> 
> Although that sounds simple, what if the test is difficult to execute?
> What if this appeared to be a one-time aberration that was captured during
> one of many iterations?  The chance to capture it again is gone.
> 
> For a more complete illustration, I would like to elaborate my previous
> example.  I had a sample file produced by our benchmarkers.  They had
> received the results on their third run after tweaking some app settings
> and the results were nearly impossible to believe.  This happened to be
> an MPI job where all ranks barrier at the end of a phase so one single
> rank being slow results in the entire application being slow.
> 
> After the third run, they repeated with the app settings from the
> second run and then repeated again with the settings from the third
> run.  Neither run showed any signs of a similar problem.  The customer
> acceptance test continued.  Before the customer would accept the results,
> they needed that anomaly explained.
> 
> Fortunately, the customer had required a sampling output from every
> run so data had been taken using perfmon and retained.  This was on a
> 2.4 based system.  The system had eight Ethernet adapters spread across
> the machine.  Interrupts for each were targeted to different cpus.
> 
> Because sampling was showing the caller, this turned into a simple
> question: why was there so much network receive activity?  On some of
> the cpus, we noticed a significant number of processes were trying to
> en-queue network packets at the same time.  The sample IP showed we were
> in a bundle after a spinlock was acquired.
> 
> Had we not provided the caller, we would have been left with something
> that was nearly impossible to diagnose definitively.  With the unroll,
> it became a simple matter of looking at the enabled network services and
> finding somebody had run a network benchmark using all eight network
> adapters.  We contacted the group responsible for network benchmarks
> and the problem was isolated and explained to the customer's satisfaction.
> 
> I hope this illustrates that one way of sampling makes it slightly more
> difficult to determine that the source of slowdown is contention on
> a lock, whereas the other way of sampling results in it being impossible
> to determine the source of a problem.  Given the choices, I would say
> the right way to do the sampling is to not attribute the samples to
> the caller.
> 
> Thanks,
> Robin
> 


-- 
Mosberger Consulting LLC, voice/fax: 510-744-9372,
http://www.mosberger-consulting.com/
35706 Runckel Lane, Fremont, CA 94536

