From: Jan Kiszka <jan.kiszka@domain.hid>
To: rpm@xenomai.org
Cc: xenomai@xenomai.org
Subject: Re: [Xenomai-core] [Xenomai-help] Sporadic PC freeze after rt_task_start
Date: Thu, 19 Jul 2007 19:18:26 +0200
Message-ID: <469F9CE2.9080603@domain.hid>
In-Reply-To: <1184861035.28303.108.camel@domain.hid>

Philippe Gerum wrote:
> On Thu, 2007-07-19 at 17:35 +0200, Jan Kiszka wrote:
>> Philippe Gerum wrote:
>>> On Thu, 2007-07-19 at 14:40 +0200, Jan Kiszka wrote:
>>>> Philippe Gerum wrote:
>>>>>> And when looking at the holders of rpilock, I think one issue could be
>>>>>> that we hold that lock while calling into xnpod_renice_root [1], i.e.
>>>>>> doing a potential context switch. Was this checked to be safe?
>>>>> xnpod_renice_root() does no reschedule immediately on purpose, we would
>>>>> never have been able to run any SMP config more than a couple of seconds
>>>>> otherwise. (See the NOSWITCH bit).
>>>> OK, then it's not the cause.
>>>>
>>>>>> Furthermore, that code path reveals that we take nklock nested into
>>>>>> rpilock [2]. I haven't found a spot for the other way around (and I hope
>>>>>> there is none)
>>>>> xnshadow_start().
>>>> Nope, that one is not holding nklock. But I found an offender...
>>> Gasp. xnshadow_renice() kills us too.
>> Looks like we are approaching mainline "qualities" here - but they have
>> at least lockdep (and still face nasty races regularly).
>>
> 
> We only have a 2-level locking depth at most, that barely qualifies for
> being compared to the situation with mainline. Most often, the more
> radical the solution, the less relevant it is: simple nesting on very
> few levels is not bad, a buggy nesting sequence is.
> 
>> As long as you can't avoid nesting or the inner lock only protects
>> really, really trivial code (list manipulation etc.), I would say there
>> is one lock too much... Did I mention that I consider nesting to be
>> evil? :-> Besides correctness, there is also an increasing worst-case
>> behaviour issue with each additional nesting level.
>>
> 
> In this case, we do not want the RPI manipulation to affect the
> worst-case of all other threads by holding the nklock. This is
> fundamentally a migration-related issue, which is a situation that must
> not impact all other contexts relying on the nklock. Given this, you
> need to protect the RPI list and prevent the scheduler data from being
> altered at the same time, there is no cheap trick to avoid this.
> 
> We need to keep the rpilock, otherwise we would have significantly large
> latency penalties, especially when domain migrations are frequent, and
> yes, we do need RPI, otherwise the sequence for emulated RTOS services
> would be plain wrong (e.g. task creation).

If rpilock is known to protect potentially costly code, you _must not_
hold other locks while taking it. Otherwise, you gain nothing by using
two locks and in fact make things worse (the overhead of taking two
locks instead of just one). That all relates to the worst case, of
course, the one thing we are worried about most.

In that light, the nesting nklock->rpilock must go away, independently
of the ordering bug. The other way around might be a different thing,
though I'm not sure if there is actually so much difference between the
locks in the worst case.
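
To illustrate the ordering rule discussed above, here is a minimal
sketch of a lock-order checker, assuming rpilock is always the outer
lock and nklock the inner one. It is not Xenomai code: lock_trace,
trace_acquire, trace_release and the rank values are hypothetical names
made up for this example, and no real locking takes place.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_NESTING 4

/* Hypothetical ranks: a lock may only be taken while every lock
 * already held has a strictly lower rank (rpilock outer, nklock inner). */
enum lock_rank { RANK_RPILOCK = 1, RANK_NKLOCK = 2 };

/* Per-context record of the locks currently held, innermost last. */
struct lock_trace {
    enum lock_rank held[MAX_NESTING];
    size_t depth;
};

/* Record an acquisition; returns false if it would invert the order,
 * re-take a held lock, or exceed the tracked nesting depth. */
static bool trace_acquire(struct lock_trace *t, enum lock_rank next)
{
    assert(t != NULL);
    if (t->depth == MAX_NESTING)
        return false;
    if (t->depth > 0 && t->held[t->depth - 1] >= next)
        return false;
    t->held[t->depth++] = next;
    return true;
}

/* Record a release; returns false unless `rank` is the innermost lock. */
static bool trace_release(struct lock_trace *t, enum lock_rank rank)
{
    assert(t != NULL);
    if (t->depth == 0 || t->held[t->depth - 1] != rank)
        return false;
    t->depth--;
    return true;
}
```

With these ranks, taking rpilock and then nklock is accepted, while a
path that already holds nklock and then asks for rpilock is rejected.
That is the inverted sequence which can deadlock against the first one.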

What is the actual _combined_ lock holding time in the longest
nklock/rpilock nesting path? Is that one really larger than any other
pre-existing nklock path? Only in that case does it make sense to think
about splitting, though you will still be left with precisely the same
(or rather a few cycles more) CPU-local latency. Is there really no
chance to split the lock paths?
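
As a rough way to reason about the combined holding time, here is a toy
cost model. Everything in it is an assumption for illustration: the
struct, the function names and the nanosecond figures are hypothetical,
not measurements, and contention on the inner lock is ignored.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical costs of one code path, in nanoseconds. */
struct section_cost {
    unsigned outer_ns;          /* work done under the outer lock only */
    unsigned inner_ns;          /* work done under the inner lock      */
    unsigned lock_overhead_ns;  /* cost of one acquire/release pair    */
};

/* One lock covers both sections. */
static unsigned single_lock_hold_ns(const struct section_cost *c)
{
    assert(c != NULL);
    return c->lock_overhead_ns + c->outer_ns + c->inner_ns;
}

/* Nested: the outer lock stays held across the whole inner section. */
static unsigned nested_outer_hold_ns(const struct section_cost *c)
{
    assert(c != NULL);
    return 2 * c->lock_overhead_ns + c->outer_ns + c->inner_ns;
}

/* Split: the outer lock is dropped before the inner lock is taken. */
static unsigned split_outer_hold_ns(const struct section_cost *c)
{
    assert(c != NULL);
    return c->lock_overhead_ns + c->outer_ns;
}

/* Split paths run back to back still cost the local CPU both sections. */
static unsigned split_local_latency_ns(const struct section_cost *c)
{
    assert(c != NULL);
    return 2 * c->lock_overhead_ns + c->outer_ns + c->inner_ns;
}
```

With, say, outer_ns = 500, inner_ns = 2000 and lock_overhead_ns = 50,
the nested outer lock is held for 2600 ns versus 2550 ns for a single
lock, so nesting buys nothing for other users of the outer lock. Only
the split variant shortens the outer hold time (550 ns), while the
CPU-local latency stays at 2600 ns.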

> Ok, the rpilock is local, the nesting level is bearable, let's focus on
> putting this thingy straight.

The whole RPI thing, though required for some scenarios, remains ugly
and error-prone (including worst-case latency issues). I can only
underline my recommendation to switch off complexity in Xenomai when one
doesn't need it - which often includes RPI. Sorry, Philippe, but I think
we have to be honest with the users here. RPI remains problematic, at
least wrt your beloved latency.

Jan