From: Frederic Weisbecker <fweisbec@gmail.com>
To: sparclinux@vger.kernel.org
Subject: Re: [PATCH 7/7] sparc64: Add function graph tracer support.
Date: Fri, 16 Apr 2010 15:44:21 +0000
Message-ID: <20100416154419.GH5162@nowhere>
In-Reply-To: <20100412.234300.212396783.davem@davemloft.net>

On Fri, Apr 16, 2010 at 02:12:32AM -0700, David Miller wrote:
> 
> Hey Frederic, I just wanted you to know that I'm slowly but
> surely trying to make progress on these crashes.
> 
> I'm trying various different things to narrow down the source of the
> corruptions, so here's what I've done so far.
> 
> I did some things to eliminate various aspects of the function tracing
> code paths, and see if the problem persists.
> 
> First, I made function_trace_call() unconditionally return
> immediately.
> 
> Next, I restored function_trace_call() back to normal, and instead
> made trace_function() return immediately.
> 
> I could not reproduce the corruptions in either of these cases with
> the function tracer enabled in situations where I was normally
> guaranteed to see a crash.
> 
> So the only part of the code paths left is the ring buffer and the
> filling in of the entries.
> 
> Therefore, what I'm doing now is trying things like running various
> hacked up variants of the ring buffer benchmark module while doing
> things that usually trigger the bug (for me a "make -j128" is usually
> enough) hoping I can trigger corruption.  No luck on that so far but
> I'll keep trying this angle just to make sure.
> 
> BTW, I noticed that every single time we see the corruptions now, we
> always see that "hrtimer: interrupt took xxx ns" message first.  I
> have never seen the corruption messages without that reaching the logs
> first.
> 
> Have you?
> 
> That might be an important clue, who knows...


Yep that's what I told you in my previous mail :)

"""(note the hrtimer warnings are normal. This is a hang prevention
that was first added because of the function graph tracer but
eventually serves as a general protection for hrtimer. It's roughly
similar to the balancing problem: servicing timers is so slow
that timers re-expire before we exit the servicing loop,
so we risk an endless loop)."""

This comes from the early days of the function graph tracer.
To work on it, I was sometimes using VirtualBox with the function
graph tracer and noticed it was making the system so slow that hrtimers
were hanging (in fact this was also partly caused by guest switches).

Hence we added this hang protection. That's ok, hrtimer
can sort out this situation, though if it happens too often,
some timers may be frequently delayed.
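
To illustrate the idea, here is a minimal userspace sketch of such a
hang protection. It is a simplified model, not the kernel's hrtimer
code: struct sim, service_expired(), timer_interrupt() and the
MAX_RETRIES bound are all hypothetical names and values chosen for
the example. A single periodic timer is serviced in a loop; if it is
still expired after a few passes, the handler gives up, reports how
long it spent, and pushes the next event out by that amount.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define MAX_RETRIES 3	/* assumed bound for this sketch */

/* Hypothetical model: one periodic timer and a simulated clock (ns). */
struct sim {
	uint64_t now;		/* simulated current time */
	uint64_t next_expiry;	/* when the timer is next due */
	uint64_t period;	/* timer period */
	uint64_t service_cost;	/* time spent running the handler once */
};

/* Run the timer if it has expired. Returns true if it was serviced. */
static bool service_expired(struct sim *s)
{
	if (s->next_expiry > s->now)
		return false;
	s->now += s->service_cost;	/* servicing takes time */
	s->next_expiry += s->period;	/* periodic timer re-arms itself */
	return true;
}

/*
 * Service expired timers, but never loop forever. Returns true if a
 * hang was detected, i.e. the timer kept re-expiring while it was
 * being serviced.
 */
static bool timer_interrupt(struct sim *s)
{
	uint64_t entry = s->now;
	uint64_t delta;
	int i;

	assert(s->period > 0);

	for (i = 0; i < MAX_RETRIES; i++) {
		service_expired(s);
		if (s->next_expiry > s->now)
			return false;	/* caught up, normal exit */
	}

	/* Still behind after MAX_RETRIES passes: back off instead of spinning. */
	delta = s->now - entry;
	printf("interrupt took %llu ns\n", (unsigned long long)delta);
	s->next_expiry = s->now + delta;	/* delay the next event */
	return true;
}
```

With service_cost well below period the first pass catches up and
nothing is reported. With service_cost above period the timer is
already expired again after each pass, so the loop bails out after
MAX_RETRIES passes and the next expiry is delayed, which is the
"some timers may be delayed" cost mentioned above.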

That said, I think it also means there is a problem. It's normal
for this to happen in a guest, but not on a normal box. Maybe there
is contention in the tracer fast path that slows down the machine.

Do you have CONFIG_DEBUG_LOCKDEP enabled? This was one of the
sources of this contention (fixed recently in -tip, but targeted
at .35).

