From: Jan Kiszka <jan.kiszka@domain.hid>
To: Philippe Gerum <rpm@xenomai.org>
Cc: xenomai@xenomai.org
Subject: Re: [Xenomai-help] kernel oopses when killing realtime task
Date: Fri, 12 Nov 2010 10:14:04 +0100
Message-ID: <4CDD055C.6040103@domain.hid>
In-Reply-To: <1289551711.1937.108.camel@domain.hid>


On 12.11.2010 09:48, Philippe Gerum wrote:
> On Tue, 2010-11-09 at 14:12 +0100, Jan Kiszka wrote:
>> On 09.11.2010 10:36, Philippe Gerum wrote:
>>> On Tue, 2010-11-09 at 09:39 +0100, Jan Kiszka wrote:
>>>> On 09.11.2010 09:26, Philippe Gerum wrote:
>>>>> On Tue, 2010-11-09 at 09:01 +0100, Jan Kiszka wrote:
>>>>>> On 07.11.2010 17:22, Jan Kiszka wrote:
>>>>>>> On 07.11.2010 16:15, Philippe Gerum wrote:
>>>>>>>> The following patches implement the teardown approach. The basic idea
>>>>>>>> is:
>>>>>>>> - neither break nor improve old setups with legacy I-pipe patches not
>>>>>>>> providing the revised ipipe_control_irq call.
>>>>>>>> - fix the SMP race when detaching interrupts.
>>>>>>>
>>>>>>> Looks good.
>>>>>>
>>>>>> This actually causes one regression: I've just learned that people are
>>>>>> already happily using MSIs with Xenomai in the field. This is perfectly
>>>>>> fine as long as you don't fiddle with rtdm_irq_disable/enable in
>>>>>> non-root contexts or while hard IRQs are disabled. The latter requirement
>>>>>> would be violated by this fix now.
>>>>>
>>>>> What we could do is handle this corner-case in the ipipe directly, going
>>>>> for a nop when IRQs are off on a per-arch basis only to please those
>>>>> users,
>>>>
>>>> Don't we disable hard IRQs also when the root domain is the only
>>>> registered one? I'm worried about just pushing regressions around, in
>>>> this case to plain Linux use-cases of MSI (which are not broken in any
>>>> way - except for powerpc).
>>>
>>> The idea is to provide an ad hoc ipipe service for this, to be used by
>>> the HAL. A service that would check the controller for the target IRQ,
>>> and handle MSI ones conditionally. For sure, we just can't put those
>>> conditionals bluntly into the chip mask handler and expect the kernel
>>> to be happy.
>>>
>>> In fact, we already have __ipipe_enable/disable_irq available from the
>>> internal Adeos interface, but they are mostly wrappers for now. We could
>>> make them a bit smarter, and handle the MSI issue as well. We would
>>> then tell the HAL to switch to using those arch-agnostic helpers
>>> generally, instead of peeking directly into the chip controller structs
>>> like today.
>>
>> This belongs to I-pipe, like we already have ipipe_end, just properly
>> wrapped to avoid descriptor access. That's specifically important if we
>> want to emulate MSI masking in software. I've the generic I-pipe
>> infrastructure ready, but the backend, so far consisting of x86 MSI
>> hardening, unfortunately needs to be rewritten.
>>
>>>
>>> If that ipipe "feature" is not detected by the HAL, then we would
>>> refrain from disabling the IRQ in xnintr_detach. In effect, this would
>>> leave the SMP race window open, but since we need recent ipipes to get
>>> it plugged already anyway (for the revised ipipe_control_irq), we would
>>> still remain in the current situation:
>>> - old patches? no SMP race fix, no regression
>>> - new patches? SMP race fix avail, no regression
>>
>> Sounds good.
> 
> Now that I have slept on it, I find the approach of working around
> pipeline limitations this way to be incorrect.
> 
> Basically, the issue is that we still don't have 100% reliable handling
> of MSI interrupts (actually, we only have partial handling, and solely
> for x86), but this is no reason to introduce code in the pipeline
> interface which would perpetuate this fact. I see this as an "all or
> nothing" issue: either MSI is fully handled and there shall be no
> restriction on applying common operations such as masking/unmasking on
> the related IRQs, or it is not, and we should not export "conditionally
> working" APIs.
> 
> In the latter case, the decision to rely on MSI support belongs to the
> user, who should then know about the pending restrictions and decide for
> themselves whether to use MSI. So I'm heading to this solution
> instead:
> 
> - when detaching the last handler for a given IRQ, instead of forcibly
> disabling the IRQ line, the nucleus would just make sure that such IRQ
> is already in a disabled state, and bail out on error if not (probably
> with a kernel warning to make the issue obvious).

Fiddling with the IRQ "line" state is a workaround for the missing
synchronize_irq service in Xenomai/I-pipe. If we had this, all this
disabling would become unnecessary.
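
To make the point concrete, here is a minimal user-space sketch of what
a synchronize_irq-style teardown could look like. It is a model using
C11 atomics, not I-pipe or Xenomai code, and every name in it
(irq_slot, irq_slot_dispatch, irq_slot_detach_sync, count_hit) is
hypothetical. The idea: the dispatch path announces itself in a counter
before it looks at the slot, and teardown unpublishes the handler and
then waits for that counter to drain, without touching the line state.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical descriptor for illustration; not an I-pipe or Xenomai type. */
struct irq_slot {
    atomic_int  in_flight;    /* handlers currently executing, on any CPU */
    atomic_bool attached;     /* cleared first when tearing down */
    void (*handler)(void *);  /* only called while attached is set */
    void *cookie;
};

/* Example handler: counts how often it ran via the int behind cookie. */
static void count_hit(void *cookie)
{
    ++*(int *)cookie;
}

/* Interrupt entry path: announce ourselves before looking at the slot. */
static void irq_slot_dispatch(struct irq_slot *s)
{
    atomic_fetch_add(&s->in_flight, 1);
    if (atomic_load(&s->attached))
        s->handler(s->cookie);
    atomic_fetch_sub(&s->in_flight, 1);
}

/* Teardown: unpublish, then wait for every in-flight handler to drain. */
static void irq_slot_detach_sync(struct irq_slot *s)
{
    atomic_store(&s->attached, false);
    while (atomic_load(&s->in_flight) > 0)
        ; /* busy-wait; kernel code would relax the CPU here */
    /* From here on, releasing s->cookie cannot race with the handler. */
}
```

Because the counter is incremented before the attached flag is read, a
dispatcher that still saw the flag set is guaranteed to be visible to
the waiting loop, so the caller may free its resources once
irq_slot_detach_sync() returns.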

> 
> - track the IRQ line state from xnintr_enable/xnintr_disable routines,
> so that xnintr_detach can determine whether the call is legit. Of
> course, this also means that any attempt to bypass the nucleus and
> enable/disable nucleus-managed interrupts at PIC level would break that
> logic, but doing so would be the root bug anyway.
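
For reference, the two quoted points amount to roughly the following
model. This is only a sketch of the proposal as worded above, not the
nucleus code: the structure, the function names and the choice of
-EBUSY/-EINVAL as error codes are all assumptions made for
illustration.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical per-IRQ bookkeeping; not the nucleus' xnintr structure. */
struct intr_sketch {
    bool line_enabled;  /* tracked by the enable/disable wrappers only */
    int  nr_handlers;   /* handlers currently attached to this IRQ */
};

static void intr_enable(struct intr_sketch *i)  { i->line_enabled = true; }
static void intr_disable(struct intr_sketch *i) { i->line_enabled = false; }

static int intr_attach(struct intr_sketch *i)
{
    i->nr_handlers++;
    return 0;
}

/*
 * Detach never masks the line itself. When the last handler goes away,
 * it only checks that the caller has already disabled the line.
 */
static int intr_detach(struct intr_sketch *i)
{
    if (i->nr_handlers == 0)
        return -EINVAL;  /* nothing attached (assumed error code) */
    if (i->nr_handlers == 1 && i->line_enabled)
        return -EBUSY;   /* the proposal would also emit a kernel warning */
    i->nr_handlers--;
    return 0;
}
```

The tracked flag is only as good as the wrappers: enabling or disabling
the line behind their back leaves line_enabled stale, which is the
breakage the quoted paragraph concedes.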
> 
> The advantage of doing so would be three-fold:
> 
> - no pipeline code to acknowledge (or even perpetuate) the fact that MSI
> support is half working, half broken. We need to fix it properly, so
> that we can use it 100% reliably, from whatever context commonly allowed
> for enabling/disabling IRQs (and not "from root domain with IRQs on"
> only). Typically, I fail to see how one would cope with such a limitation,
> if a real-time handler detects that some device is going wild and really
> needs to shut it down before the whole system crashes.

MSIs are edge-triggered. Only broken hardware continuously sending bogus
messages can theoretically cause trouble. In practice (i.e. in the absence
of broken HW), we see a single spurious IRQ at worst.

> 
> - we enforce the API usage requirement to disable an interrupt line with
> rtdm_irq_disable(), before eventually detaching the last IRQ handler for
> it, which is common sense anyway.

That's an easy-to-get-wrong API. It would apply to non-shared IRQs only
(aka MSIs). No-go IMHO.

> 
> - absolutely no change for people who currently rely on partial MSI
> support, provided they duly disable IRQ lines before detaching their
> last handler via the appropriate RTDM interface.
> 
> Can we agree on this?
> 

Nope, don't think so. The only option I see (besides using my original
proposal of a dummy handler for deregistering - still much simpler than
the current patches) is to emulate MSI masking in the same run, thus
providing solutions for both issues.
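
As a rough illustration of the dummy-handler idea (one possible reading
of it, not the original patch), detaching could swap the handler
pointer to a no-op instead of masking the line, so that a late or
spurious MSI still finds something valid to call. All names below are
hypothetical, and the model uses C11 atomics in user space.

```c
#include <stdatomic.h>
#include <stddef.h>

typedef void (*irq_handler_t)(void *);

/* Hypothetical vector slot; not an I-pipe or Xenomai structure. */
struct irq_vec {
    _Atomic(irq_handler_t) handler;  /* never NULL once initialized */
    void *cookie;
};

static atomic_int spurious_hits;     /* IRQs that arrived after detach */

/* Dummy installed on detach: swallow the IRQ and just count it. */
static void dummy_handler(void *cookie)
{
    (void)cookie;
    atomic_fetch_add(&spurious_hits, 1);
}

/* Example of a real handler: counts hits via the int behind cookie. */
static void sample_handler(void *cookie)
{
    ++*(int *)cookie;
}

static void irq_vec_init(struct irq_vec *v)
{
    atomic_init(&v->handler, dummy_handler);
    v->cookie = NULL;
}

static void irq_vec_attach(struct irq_vec *v, irq_handler_t h, void *cookie)
{
    v->cookie = cookie;              /* set before the handler is published */
    atomic_store(&v->handler, h);
}

/* Detach without touching the line: later IRQs land in the dummy. */
static void irq_vec_detach(struct irq_vec *v)
{
    atomic_store(&v->handler, dummy_handler);
}

static void irq_vec_dispatch(struct irq_vec *v)
{
    irq_handler_t h = atomic_load(&v->handler);
    h(v->cookie);
}
```

Note that the swap alone does not wait for a handler that is already
running on another CPU when the detach happens; this sketch only covers
interrupts that arrive afterwards.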

Jan



