Re: 2.6.16-rc6-rt1 - Ingo Molnar

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Ingo Molnar <mingo@elte.hu>
To: Esben Nielsen <simlo@phys.au.dk>
Cc: linux-kernel@vger.kernel.org, Thomas Gleixner <tglx@linutronix.de>
Subject: Re: 2.6.16-rc6-rt1
Date: Tue, 14 Mar 2006 11:18:11 +0100	[thread overview]
Message-ID: <20060314101811.GA10450@elte.hu> (raw)
In-Reply-To: <Pine.LNX.4.44L0.0603140944050.1291-100000@lifa01.phys.au.dk>


* Esben Nielsen <simlo@phys.au.dk> wrote:

> > Thomas' testing method has the advantage that it utilizes the kernel's
> > PI mechanism directly, hence it is easy to keep it uptodate without
> > having to port the kernel's PI code to userspace.
> 
> I call that a disadvantage. I the impression you work like this
> 0) Write or fix code
> 1) Try to compile the kernel
> 2) On compile error goto 0
> 3) Try to boot the kernel
> 4) If the kernel doesn't boot goto 0
> 5) Test whatever you have changed
> 6) If your test fails goto 0
> 
> Just the time spend in 1) is between a few seconds to figure out 
> simple syntax errors and up to several minuttes to recompile a lot of 
> the kernel. 3) takes a minute or two, 5) usually also takes some time, 
> depending on how much you have set up automaticly. In short: Each 
> iteration is minuttes.

correct workload but wrong timing assumptions. For me to reboot into a 
completely rebuilt kernel is about 90 seconds (from the point of having 
saved the change in the editor, to the point i can ssh into the freshly 
booted up box). To reboot into a partially rebuilt kernel is less than 
30 seconds. (It is important to keep this particular latency as low as 
possible, for a number of other reasons as well, not just pure 
development time.)

> The way I work is
> 0) Fix whatever code (in TestRTMutex, rt.c or wherever)
> 1) Try to compile rttest and run the tests (done as one step with make)
> 2) If something fails goto 0,
> 
> Each iteration takes a few seoncds. I can do it within Emacs (please, 
> no flame wars! :-) where I in the compile buffer can jump directly to 
> the lines in the C-code or in the test-scripts where the error is 
> reported. I can also to some degree (as shown below) find SMP 
> deadlocks without having a SMP machine.

the PI code, while currently seeing alot of changes, isnt supposed to 
change all that often in the long run. Hence it is far more important 
to:

- _always_ have a testsuite available without maintainance overhead, 
  even if we only do small fixes to the PI code. With your method, both 
  the userspace PI code, and the kernel-space PI code has to be updated, 
  all the time.

- the have the _real_ PI code utilized. The real scheduler, and on a 
  real box.

- to be able to do stress-tests too, which is much less possible and 
  practical in a simulated PI environment.

> The point is: Even though you have to maintain an extra level of stubs 
> in userspace you gain much speedier development cycle. You gain 
> quality as you can test the logic in a more controlled manner 
> independent of the real timing on the target. You are forced to think 
> isolation and therefore get an overall better architecture.

the same benefit can be gotten by simply cutting down on the kernel 
compilation time and on the install-new-kernel-and-reboot latency.

> > thanks, applied. [NOTE: had to apply it by hand because the patch was
> > whitespace damaged, it had all tabs converted to spaces.]
> 
> Not again! I even used Pico from within Pine to paste it in...

hm, did you use Ctrl-R to read the patchfile in? That's pretty much the 
only good way to get a patch into Pine.

> > > 2) There is a spinlock deadlock when doing the following test on SMP:
> > >
> > > threads:   1            2
> > >          lock 1         +
> > >           +          lock 2
> > > test:   lockcount 1   lockcount 1
> > >
> > >          lock 2      lock 1            <- spin deadlocks here
> > >           -             -
> > > test:   lockcount 1   lockcount 1
> > >
> > > This happens because both tasks tries to lock both tasks's pi_lock but
> > > in opposit order.  I don't have fix for that one yet.
> >
> > well, this is a circular dependency deadlock - which is illegal in the
> > kernel,
> 
> I still find it better to gracefully go to sleep or report a bug than 
> just crashing the machine.

we do report it.

> > and which we detect for futex locks too - so it shouldnt happen.
> 
> You mean that you have to run deadlock detection for all futexes to 
> avoid crashing the kernel? [...]

no. We have to run deadlock detection to avoid things like circular lock 
dependencies causing an infinite schedule+wakeup 'storm' during priority 
boosting. (like possible with your wakeup based method i think) Note 
that deadlock detection and priority boosting is 'merged', so there is 
no CPU overhead from it. There is no global lock for deadlock detection 
anymore, etc.

> Anyway: As far as I can see the deadlock happens in 
> adjust_prio_chain() no matter what the detect_deadlock argument is.  
> The bug is because you to lock current->pi_lock and owner->pi_lock at 
> the same time in different order. As far as I can see that can happen 
> from futex_lock_pi() as well.

ok, checking this.

> We are basicly back to the discussions I had last fall: Doing deadlock 
> detection and PI is almost the same thing. You have to somehow 
> traverse the list of locks. So to protect the kernel from crashing in 
> the PI code when futexes deadlock, you have to traverse the list of 
> locks without spin deadlocking to detect the futex deadlock. If you 
> can do that, you could just as well do the PI that way in the first 
> place.

we are doing it precisely that way - PI and deadlock detection is 
'merged'. We do it in one go, in adjust_prio_chain().

	Ingo

next prev parent reply	other threads:[~2006-03-14 10:20 UTC|newest]

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2006-03-12 22:02 2.6.16-rc6-rt1 Ingo Molnar
2006-03-12 23:17 ` 2.6.16-rc6-rt1 Michal Piotrowski
2006-03-12 23:36   ` 2.6.16-rc6-rt1 Michal Piotrowski
2006-03-13  6:43     ` 2.6.16-rc6-rt1 Thomas Gleixner
2006-03-13  9:25   ` 2.6.16-rc6-rt1 Ingo Molnar
2006-03-13 13:49     ` 2.6.16-rc6-rt1 Michal Piotrowski
2006-03-13  6:56 ` [PATCH] 2.6.16-rc6-rt1: Fix redefinition and unknown symbol Jan Altenberg
2006-03-13  9:11   ` Ingo Molnar
2006-03-13 13:21 ` 2.6.16-rc6-rt1 Esben Nielsen
2006-03-13 14:29   ` 2.6.16-rc6-rt1 Ingo Molnar
2006-03-13 16:06     ` 2.6.16-rc6-rt1 Esben Nielsen
2006-03-14  0:22   ` 2.6.16-rc6-rt1 Esben Nielsen
2006-03-14  8:12     ` 2.6.16-rc6-rt1 Ingo Molnar
2006-03-14  8:33       ` 2.6.16-rc6-rt1 Thomas Gleixner
2006-03-14 10:02       ` 2.6.16-rc6-rt1 Esben Nielsen
2006-03-14 10:18         ` Ingo Molnar [this message]
2006-03-14 22:02           ` 2.6.16-rc6-rt1 Esben Nielsen
2006-03-14 22:11             ` 2.6.16-rc6-rt1 Ingo Molnar
2006-03-14 22:22               ` 2.6.16-rc6-rt1 Thomas Gleixner
2006-03-15 11:21                 ` 2.6.16-rc6-rt1 Esben Nielsen
2006-03-14 22:28               ` 2.6.16-rc6-rt1 Esben Nielsen
2006-03-14 10:58         ` 2.6.16-rc6-rt1 Thomas Gleixner
2006-03-14 20:40           ` 2.6.16-rc6-rt1 Esben Nielsen
2006-03-14 22:00             ` 2.6.16-rc6-rt1 Thomas Gleixner
2006-03-14  1:54 ` 2.6.16-rc6-rt1 Steven Rostedt
2006-03-14  7:51   ` 2.6.16-rc6-rt1 Ingo Molnar
2006-03-14  2:22 ` 2.6.16-rc6-rt1 Steven Rostedt
2006-03-14  8:00   ` 2.6.16-rc6-rt1 Ingo Molnar
2006-03-15 16:25 ` 2.6.16-rc6-rt1 John Richard Moser

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20060314101811.GA10450@elte.hu \
    --to=mingo@elte.hu \
    --cc=linux-kernel@vger.kernel.org \
    --cc=simlo@phys.au.dk \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.