Re: 2.6.16-rc6-rt1 - Ingo Molnar


From: Ingo Molnar <mingo@elte.hu>
To: Esben Nielsen <simlo@phys.au.dk>
Cc: linux-kernel@vger.kernel.org, Thomas Gleixner <tglx@linutronix.de>
Subject: Re: 2.6.16-rc6-rt1
Date: Tue, 14 Mar 2006 11:18:11 +0100
Message-ID: <20060314101811.GA10450@elte.hu>
In-Reply-To: <Pine.LNX.4.44L0.0603140944050.1291-100000@lifa01.phys.au.dk>


* Esben Nielsen <simlo@phys.au.dk> wrote:

> > Thomas' testing method has the advantage that it utilizes the kernel's
> > PI mechanism directly, hence it is easy to keep it up to date without
> > having to port the kernel's PI code to userspace.
> 
> I call that a disadvantage. I have the impression you work like this:
> 0) Write or fix code
> 1) Try to compile the kernel
> 2) On compile error goto 0
> 3) Try to boot the kernel
> 4) If the kernel doesn't boot goto 0
> 5) Test whatever you have changed
> 6) If your test fails goto 0
> 
> Just the time spent in 1) ranges from a few seconds to figure out 
> simple syntax errors to several minutes to recompile a lot of 
> the kernel. 3) takes a minute or two, 5) usually also takes some time, 
> depending on how much you have set up automatically. In short: Each 
> iteration takes minutes.

correct workflow but wrong timing assumptions. For me, rebooting into a 
completely rebuilt kernel takes about 90 seconds (from the point of 
having saved the change in the editor, to the point i can ssh into the 
freshly booted box). Rebooting into a partially rebuilt kernel takes 
less than 30 seconds. (It is important to keep this particular latency 
as low as possible, for a number of other reasons as well, not just 
pure development time.)

> The way I work is
> 0) Fix whatever code (in TestRTMutex, rt.c or wherever)
> 1) Try to compile rttest and run the tests (done as one step with make)
> 2) If something fails goto 0,
> 
> Each iteration takes a few seconds. I can do it within Emacs (please, 
> no flame wars! :-) where, in the compile buffer, I can jump directly to 
> the lines in the C code or in the test scripts where the error is 
> reported. I can also to some degree (as shown below) find SMP 
> deadlocks without having an SMP machine.

the PI code, while currently seeing a lot of changes, isn't supposed to 
change all that often in the long run. Hence it is far more important 
to:

- _always_ have a testsuite available without maintenance overhead, 
  even if we only do small fixes to the PI code. With your method, both 
  the userspace PI code and the kernel-space PI code have to be updated, 
  all the time.

- to have the _real_ PI code utilized. The real scheduler, and on a 
  real box.

- to be able to do stress-tests too, which is much less possible and 
  practical in a simulated PI environment.

> The point is: Even though you have to maintain an extra level of stubs 
> in userspace you gain a much speedier development cycle. You gain 
> quality as you can test the logic in a more controlled manner 
> independent of the real timing on the target. You are forced to think 
> in terms of isolation and therefore get an overall better architecture.

the same benefit can be gotten by simply cutting down on the kernel 
compilation time and on the install-new-kernel-and-reboot latency.

> > thanks, applied. [NOTE: had to apply it by hand because the patch was
> > whitespace damaged, it had all tabs converted to spaces.]
> 
> Not again! I even used Pico from within Pine to paste it in...

hm, did you use Ctrl-R to read the patchfile in? That's pretty much the 
only good way to get a patch into Pine.

> > > 2) There is a spinlock deadlock when doing the following test on SMP:
> > >
> > > threads:   1            2
> > >          lock 1         +
> > >           +          lock 2
> > > test:   lockcount 1   lockcount 1
> > >
> > >          lock 2      lock 1            <- spin deadlocks here
> > >           -             -
> > > test:   lockcount 1   lockcount 1
> > >
> > > This happens because both tasks try to lock both tasks' pi_lock but
> > > in opposite order.  I don't have a fix for that one yet.
> >
> > well, this is a circular dependency deadlock - which is illegal in the
> > kernel,
> 
> I still find it better to gracefully go to sleep or report a bug than 
> just crashing the machine.

we do report it.

> > and which we detect for futex locks too - so it shouldn't happen.
> 
> You mean that you have to run deadlock detection for all futexes to 
> avoid crashing the kernel? [...]

no. We have to run deadlock detection to avoid things like circular lock 
dependencies causing an infinite schedule+wakeup 'storm' during priority 
boosting. (as is possible with your wakeup-based method, i think) Note 
that deadlock detection and priority boosting are 'merged', so there is 
no extra CPU overhead from it. There is no global lock for deadlock 
detection anymore, etc.

> Anyway: As far as I can see the deadlock happens in 
> adjust_prio_chain() no matter what the detect_deadlock argument is.  
> The bug is because you lock current->pi_lock and owner->pi_lock at 
> the same time in different order. As far as I can see that can happen 
> from futex_lock_pi() as well.

ok, checking this.
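
[Editorial illustration, not part of the original message or of the -rt 
patch.] The scenario quoted above is a classic ABBA deadlock: two tasks 
each take the two per-task locks in opposite order. One common way to 
rule this out is to always acquire the pair in a fixed global order. The 
sketch below is a userspace toy in Python; the Task class, lock_pair() 
and the choice of id() as the ordering key are assumptions made for 
illustration, and nothing here claims to be the fix that was eventually 
applied to the kernel.

```python
import threading


class Task:
    """Toy stand-in for a task that carries a per-task lock (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.pi_lock = threading.Lock()


def lock_pair(a, b):
    """Acquire both tasks' pi_lock in one fixed global order.

    Ordering by id() means every caller agrees on which lock comes
    first, no matter which task it passes as `a` and which as `b`.
    """
    if a is b:
        raise ValueError("lock_pair() needs two distinct tasks")
    first, second = sorted((a, b), key=id)
    first.pi_lock.acquire()
    second.pi_lock.acquire()
    return first, second


def unlock_pair(first, second):
    """Release in the reverse order of acquisition."""
    second.pi_lock.release()
    first.pi_lock.release()


def worker(me, other, rounds, done):
    # Each worker names itself first, so the two workers pass the
    # tasks in opposite argument order, as in the quoted test.
    for _ in range(rounds):
        first, second = lock_pair(me, other)
        unlock_pair(first, second)
    done.append(me.name)


def run_demo(rounds=1000):
    """Run two workers against each other; return who finished."""
    t1, t2 = Task("T1"), Task("T2")
    done = []
    threads = [
        threading.Thread(target=worker, args=(t1, t2, rounds, done)),
        threading.Thread(target=worker, args=(t2, t1, rounds, done)),
    ]
    for th in threads:
        th.start()
    for th in threads:
        th.join(timeout=10)
    return sorted(done)
```

Calling run_demo() should return both task names, since neither worker 
can end up holding one lock while waiting for the other in the opposite 
order.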

> We are basically back to the discussions I had last fall: Doing deadlock 
> detection and PI is almost the same thing. You have to somehow 
> traverse the list of locks. So to protect the kernel from crashing in 
> the PI code when futexes deadlock, you have to traverse the list of 
> locks without spin deadlocking to detect the futex deadlock. If you 
> can do that, you could just as well do the PI that way in the first 
> place.

we are doing it precisely that way - PI and deadlock detection are 
'merged'. We do it in one go, in adjust_prio_chain().
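
[Editorial illustration, not part of the original message.] The idea of 
doing priority boosting and deadlock detection in a single walk of the 
lock chain can be modelled in a few lines of userspace Python. This is 
only a sketch under stated assumptions: the Lock and Task classes and 
block_on() are hypothetical names, a higher number means a higher 
priority (the opposite of the kernel's convention), and no locking or 
real scheduling is modelled. It is not the code of adjust_prio_chain().

```python
class Lock:
    def __init__(self, name):
        self.name = name
        self.owner = None        # Task currently holding this lock


class Task:
    def __init__(self, name, prio):
        self.name = name
        self.prio = prio         # toy convention: larger = more urgent
        self.blocked_on = None   # Lock this task is waiting for, if any


def block_on(task, lock):
    """Make `task` wait for `lock` and walk the owner chain once.

    During the same walk, each owner is boosted to at least the
    waiter's priority, and a circular dependency is detected if the
    chain leads back to `task`. Returns True on deadlock (the wait
    is then not recorded), False otherwise.
    """
    task.blocked_on = lock
    prio = task.prio
    cur = lock
    while cur is not None and cur.owner is not None:
        owner = cur.owner
        if owner is task:
            # The chain loops back to the waiter: circular dependency.
            # Boosts applied earlier in this walk are left in place.
            task.blocked_on = None
            return True
        if owner.prio < prio:
            owner.prio = prio    # priority inheritance step
        prio = max(prio, owner.prio)
        cur = owner.blocked_on   # follow the chain to the next lock
    return False
```

For example, with A holding L1 and B holding L2, block_on(B, L1) boosts 
A, and a later block_on(A, L2) walks L2 -> B -> L1 -> A and reports the 
cycle without any separate detection pass.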

	Ingo
