From: Philippe Gerum <rpm@xenomai.org>
To: Jan Kiszka <jan.kiszka@domain.hid>
Cc: xenomai-core <xenomai@xenomai.org>
Subject: [Xenomai-core] Re: XENO_OPT_DEBUG impact
Date: Mon, 20 Nov 2006 14:22:57 +0100
Message-ID: <1164028977.5006.92.camel@domain.hid>
In-Reply-To: <456193E2.8030605@domain.hid>

On Mon, 2006-11-20 at 12:39 +0100, Jan Kiszka wrote:
> Philippe Gerum wrote:
> > On Mon, 2006-11-20 at 11:01 +0100, Jan Kiszka wrote:
> >> Philippe Gerum wrote:
> >>> On Mon, 2006-11-20 at 10:20 +0100, Jan Kiszka wrote:
> >>>> Philippe Gerum wrote:
> >>>>> On Fri, 2006-11-17 at 20:10 +0100, Jan Kiszka wrote:
> >>>>>> Philippe Gerum wrote:
> >>>>>>> On Fri, 2006-11-17 at 19:41 +0100, Jan Kiszka wrote:
> >>>>>>>> I'm currently seeing two potential "misuses" of the common switch:
> >>>>>>>>
> >>>>>>>>  - the posix skin (Gilles, how heavy-weighted are those checks?)
> >>>>>>>>    => CONFIG_XENO_OPT_DEBUG_POSIX
> >>>>>>>>
> >>>>>>>>  - CONFIG_XENO_SPINLOCK_DEBUG => CONFIG_XENO_OPT_DEBUG_SPINLOCK
> >>>>>>>>
> >>>>>>>> Both should be explicitly controllable in Kconfig.
> >>>>>>>>
> >>>>>>> Nack for CONFIG_XENO_OPT_DEBUG_SPINLOCK. Most of the issue we tracked
> >>>>>>> with Gilles regarding the domain migration code had side-effects on the
> >>>>>>> nucleus lock. So having CONFIG_XENO_OPT_DEBUG enabled for identifying
> >>>>>>> internal state weirdnesses - like those triggered by migration bugs -
> >>>>>>> implies enabling the spinlock watchdogs too.
> >>>>>> Ok, if it only makes sense to have both enabled at the same time, then
> >>>>>> let us create XENO_OPT_DEBUG_NUCLEUS. It should include both, but it
> >>>>>> shall not be automatically on when, say, only XENO_OPT_DEBUG_RTDM is
> >>>>>> required.
> >>>>> No objection.
> >>>>>
> >>>> Looking at the spinlock debugging code: it serves two inseparable
> >>>> purposes, a watchdog for stuck locks + lock statistics. The latter makes
> >>>> this feature pop up when XENO_OPT_STATS is set on an SMP box - a rather
> >>>> surprising effect. Do we still need the stats? If not, I would kick them
> >>>> out in favour of using the latency tracer for such analysis, making
> >>>> spinlock debugging a real pure debug feature.
> >>>>
> >>> The spinlock stats are about uncovering a problem, the latency tracer is
> >>> about finding where the problem lies. Both are orthogonal.
> >> Not fully true: the tracer provides the same information when you enable
> >> CONFIG_IPIPE_TRACE_IRQSOFF. When you disable CONFIG_IPIPE_TRACE_MCOUNT,
> >> you even get this at comparable (if not lower) costs. I once played with
> >> the spinlock debug code before I decided to invest time in the tracer. I
> >> think I even posted a patch to enable that code on UP. But I didn't find
> >> the spinlock stats useful enough, even for the scenario "lock length
> >> analysis".
> >>
> >> We now basically have two ways to get the same information (or please
> >> explain what is missing with the tracer). Besides the redundancy, there
> >> is the problem that one of these ways comes in via two different,
> >> orthogonal paths (STATS+SMP || DEBUG). That's not very consistent IMHO.
> >>
> > 
> > Nothing is missing in the tracer. The point is that you don't
> > immediately know that you are having a spinlock issue which would make
> > you build the tracer support, and having those stats is a cheap way to
> > detect such a problem in a lightweight manner.
> 
> If it were cheap, we wouldn't discuss it here. Actually, due to its
> inline nature, this instrumentation is fairly costly. That's ok, as long
> as you can explicitly ask for such a feature.
> 

You are talking about two different issues here:
#1 - having SMP+STAT enable SPINLOCK_DEBUG is suboptimal
#2 - because you don't like #1, we should kill SPINLOCK_DEBUG entirely,
and only rely on the tracer to provide spinlock latency tracing.

I agree with your conclusion regarding #1. I need to be sure that #2 is
not going to kill us too, during SMP debugging sessions.

Fixing #1 is a matter of decoupling config options, and does not require
#2. Going for #2 requires making sure that we are not going to add
temporal perturbations caused by the tracer. (Btw, it would be quite
easy to reduce the impact of SPINLOCK_DEBUG on the I-cache by moving
the stamping code out of line, so this is not bad code "by design",
it's just a suboptimal implementation.)
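
To illustrate the out-of-line idea, here is a minimal user-space sketch,
not the actual nucleus code. All names (demo_lock_t, demo_stamp_acquire(),
demo_clock_ns() and so on) are hypothetical, a C11 atomic_flag stands in
for the real lock, and the noinline attribute assumes GCC or Clang. The
inline fast path only carries a call, so the stamping body exists once
instead of being expanded at every lock site.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical lock type; not the Xenomai nucleus lock. */
typedef struct demo_lock {
    atomic_flag held;
    uint64_t acquired_ns;       /* timestamp of the last acquisition */
    uint64_t max_held_ns;       /* longest hold time observed so far */
    unsigned long acquisitions; /* number of successful acquisitions */
} demo_lock_t;

#define DEMO_LOCK_INIT { ATOMIC_FLAG_INIT, 0, 0, 0 }

/* Stand-in clock source (assumption: a monotonic POSIX clock). */
static uint64_t demo_clock_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Out of line: a single copy shared by all call sites. */
__attribute__((noinline)) void demo_stamp_acquire(demo_lock_t *l)
{
    assert(l != NULL);
    l->acquired_ns = demo_clock_ns();
    l->acquisitions++;
}

__attribute__((noinline)) void demo_stamp_release(demo_lock_t *l)
{
    assert(l != NULL);
    uint64_t held = demo_clock_ns() - l->acquired_ns;
    if (held > l->max_held_ns)
        l->max_held_ns = held;
}

/* Inline fast path: spin, then one call into the stamping code. */
static inline void demo_lock(demo_lock_t *l)
{
    while (atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire))
        ; /* spin until the holder releases the lock */
    demo_stamp_acquire(l);
}

static inline void demo_unlock(demo_lock_t *l)
{
    demo_stamp_release(l);
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}
```

The stats fields are only written while the lock is held, so they need no
extra protection in this sketch. The trade-off is one extra call per
lock/unlock in exchange for a smaller inline footprint.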

> But now we have the situation that the (default y!) XENO_OPT_STAT
> feature on UP is far more costly than on SMP.

You mean the opposite, I guess.

>  You know that the stats
> are very useful already without any spinlock instrumentation, i.e. for
> analysing the RT-system load. My feeling is that, for SMP, we currently
> have a huge config mess here. And this is what I'm trying to address,
> /maybe/ also by removing redundant instrumentation means.
> 

I would not call a mess something you don't happen to like; it may still
serve legitimate purposes. It's just a feature after all, which has
proven to be quite useful to the people debugging SMP issues. It's not
redundant in my mind, for the reasons already given. This does not
rule out improving the config situation, though.
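
As a rough sketch of what improving the config situation could mean, the
fragment below contrasts the coupling summarized earlier in this thread,
(STATS && SMP) || DEBUG, with a dedicated switch. The option names are the
ones mentioned in this thread, CONFIG_XENO_OPT_DEBUG_NUCLEUS is only the
proposed name, the DEMO_* macros are hypothetical, and the #define lines
simulate a .config rather than reproduce the real headers.

```c
#include <assert.h>

/* Simulated .config: an SMP build with statistics enabled (assumption). */
#define CONFIG_SMP 1
#define CONFIG_XENO_OPT_STATS 1
/* CONFIG_XENO_OPT_DEBUG and CONFIG_XENO_OPT_DEBUG_NUCLEUS are left unset. */

/* Coupled rule as summarized in the thread: (STATS && SMP) || DEBUG. */
#if (defined(CONFIG_XENO_OPT_STATS) && defined(CONFIG_SMP)) || \
    defined(CONFIG_XENO_OPT_DEBUG)
#define DEMO_SPINLOCK_DEBUG_COUPLED 1
#else
#define DEMO_SPINLOCK_DEBUG_COUPLED 0
#endif

/* Decoupled rule: only the dedicated (proposed) switch enables it. */
#if defined(CONFIG_XENO_OPT_DEBUG_NUCLEUS)
#define DEMO_SPINLOCK_DEBUG_DECOUPLED 1
#else
#define DEMO_SPINLOCK_DEBUG_DECOUPLED 0
#endif

/* With stats on SMP and no debug option set, only the coupled rule fires. */
static_assert(DEMO_SPINLOCK_DEBUG_COUPLED == 1,
              "coupled rule pulls in spinlock debugging via STATS+SMP");
static_assert(DEMO_SPINLOCK_DEBUG_DECOUPLED == 0,
              "decoupled rule stays off until explicitly requested");
```

With the decoupled rule, enabling the stats on an SMP box no longer pulls
in the spinlock instrumentation as a side effect.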

> > Running with the tracer
> > enabled usually means that you are chasing an issue you have already
> > detected.
> 
> Again, tracer != mcount. It can be used just like those spinlock stats:
> to *detect* long locking periods. Have a look.
> 

Relax, I have already had a look a fair number of times, and I agree with
you that the tracer provides a very useful set of latency tracing data.
The point is that I'm worried about the perturbations the tracer adds,
which are real, mcount or not, and I don't want to go on a wild goose
chase when tracking SMP latency issues.

On the other hand, only idiots never change their minds, so let's move on
the smart way: please submit your ideal fix for that issue. Since Gilles
and I are usually the ones who bang their heads on SMP issues, we will
experiment with the tracer as an SMP latency tracking tool for Xenomai.
If we actually save some debug time using the tracer, or at least don't
lose any, then I will merge this patch.

> Jan
> 
-- 
Philippe.
