From: Philippe Gerum <rpm@xenomai.org>
To: rpm@xenomai.org
Cc: xenomai-core <xenomai@xenomai.org>
Subject: Re: [Xenomai-core] local_irq_save/local_irq_restore in real-time interrupt handler and slab corruption.
Date: Tue, 13 Nov 2007 19:04:58 +0100
Message-ID: <4739E74A.1080705@domain.hid>
In-Reply-To: <4739E29B.3080908@domain.hid>
Philippe Gerum wrote:
> Gilles Chanteperdrix wrote:
>> On Nov 13, 2007 6:10 PM, Philippe Gerum <rpm@xenomai.org> wrote:
>>> Gilles Chanteperdrix wrote:
>>>> On Nov 13, 2007 3:17 PM, Jan Kiszka <jan.kiszka@domain.hid> wrote:
>>>>> Gilles Chanteperdrix wrote:
>>>>>> Hi,
>>>>>>
>>>>>> I am chasing a slab corruption bug which happens on a Xenomai+RTnet
>>>>>> enabled box under heavy non real-time network load (which passes
>>>>>> through rtnet and rtmac_vnic to Linux, which does NAT and resends it to
>>>>>> another rtmac_vnic). When reading some I-pipe tracer traces, I
>>>>>> noticed that I had forgotten to replace a local_irq_save/local_irq_restore
>>>>>> with local_irq_save_hw/local_irq_restore_hw in a real-time interrupt
>>>>>> handler. I fixed this bug, and the slab corruption seems to be gone.
>>>>> Hope you mean rtdm_lock_irqsave/irqrestore instead. Otherwise Xenomai's
>>>>> domain state would not be updated appropriately - which is at least unclean.
>>>> It is some low level secondary timer handling code, there is no rtdm
>>>> involved. The code protected by the interrupt masking routines is one
>>>> or two inline assembly instructions.
>>>>
>>>>> BTW, CONFIG_IPIPE_DEBUG_CONTEXT should have caught this bug as well.
>>>> I am using an old I-pipe patch without CONFIG_IPIPE_DEBUG_CONTEXT.
>>>> The I-pipe patch and Xenomai update are scheduled for once the porting
>>>> of RT applications and drivers is finished.
>>>>
>>>> Besides, the BUG_ON(!ipipe_root_domain_p) checks in ipipe_restore_root and
>>>> ipipe_unstall_root are unconditional.
>>>>
>>> What bothers me is that, even looking at the old 1.3 series and onward,
>>> the code should exhibit a call chain like
>>> local_irq_restore -> raw_local_irq_restore() -> __ipipe_restore_root ->
>>> __ipipe_unstall_root -> __ipipe_sync_stage, without touching the current
>>> domain pointer, which is ok, since well, it has to be right in the first
>>> place. If we were running over a real-time handler, then I assume the
>>> Xenomai domain was active. So BUG_ON() should have triggered if present
>>> in __ipipe_unstall_root.
>> I am using an I-pipe arm 1.5-04 (now that I have done cat
>> /proc/ipipe/version, I really feel ashamed). And it has no BUG_ON in
>> __ipipe_unstall_root or __ipipe_restore_root. I promise, one day, I
>> will switch to Xenomai 2.4.
>>
>>> Additionally, calling __ipipe_sync_pipeline() would sync the current
>>> stage, i.e. Xenomai, and run the real-time ISRs, not the Linux handlers.
>>>
>>> Mm, ok, in short: I have no clue.
>> The system runs stably, so I have to assume that calling
>> local_irq_restore in a real-time interrupt handler can cause slab
>> corruption. Strange.
>>
>
> I guess this is likely not on your critical path, but when time allows,
> I'd be interested to know whether this bug still occurs when using
> purely kernel-only tasking, assuming that you currently see this bug
> with userland tasks. Basically, I wonder if migrating shadows between
> both domains would not reveal the bug, since your real-time handler
> starts being preemptible by hw IRQs as soon as it returns from
> __ipipe_unstall_root, which forces local_irq_enable_hw().
>
Well, the 1.5 series still has a deep log, so you would also have to
make sure that no IRQ is pending in the pipeline.
--
Philippe.
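
Illustration (not part of the original message): the difference discussed
in this thread between the pipelined local_irq_save/local_irq_restore and
the _hw variants can be sketched with a small user-space toy model. This
is only a sketch under stated assumptions, not I-pipe code: every sim_*
name is hypothetical, the two booleans stand in for the CPU interrupt flag
and the root domain stall bit, and the handler is assumed to be entered
with hardware IRQs off. The one behavior taken from the thread is that the
unstall path forces hardware IRQs back on.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Toy model only: these are NOT the real I-pipe data structures. */
static bool hw_irqs_on = true;     /* stands in for the CPU interrupt flag */
static bool root_stalled = false;  /* stands in for the root (Linux) stall bit */

/* Stand-ins for local_irq_save_hw()/local_irq_restore_hw():
 * they only touch the simulated CPU interrupt flag. */
static bool sim_irq_save_hw(void)
{
    bool was_on = hw_irqs_on;
    hw_irqs_on = false;
    return was_on;
}

static void sim_irq_restore_hw(bool was_on)
{
    hw_irqs_on = was_on;
}

/* Stand-in for the pipelined local_irq_save(): sets the virtual stall bit. */
static bool sim_irq_save(void)
{
    bool was_stalled = root_stalled;
    root_stalled = true;
    return was_stalled;
}

/* Stand-in for the pipelined local_irq_restore(). As described in the
 * thread, unstalling the root stage forces hardware IRQs back on. */
static void sim_irq_restore(bool was_stalled)
{
    root_stalled = was_stalled;
    if (!was_stalled)
        hw_irqs_on = true;
}

/* Assumed entry state of a real-time handler: hw IRQs off, root unstalled. */
static void sim_enter_rt_handler(void)
{
    hw_irqs_on = false;
    root_stalled = false;
}

/* Each handler returns true if it is left preemptible by hardware IRQs. */
bool sim_handler_buggy(void)
{
    sim_enter_rt_handler();
    bool flags = sim_irq_save();      /* wrong variant for this context */
    /* ... one or two instructions that must not be interrupted ... */
    sim_irq_restore(flags);           /* side effect: hw IRQs enabled */
    return hw_irqs_on;
}

bool sim_handler_fixed(void)
{
    sim_enter_rt_handler();
    bool flags = sim_irq_save_hw();
    /* ... one or two instructions that must not be interrupted ... */
    sim_irq_restore_hw(flags);        /* entry state restored exactly */
    return hw_irqs_on;
}

void sim_demo(void)
{
    bool buggy = sim_handler_buggy();
    bool fixed = sim_handler_fixed();
    assert(buggy);    /* handler became preemptible by hw IRQs */
    assert(!fixed);   /* handler still runs with hw IRQs off */
    printf("buggy leaves hw IRQs on: %d, fixed leaves hw IRQs on: %d\n",
           buggy, fixed);
}
```

Calling sim_demo() from a main() exercises both variants. The model says
nothing about why the slab gets corrupted; it only shows how the restore
call could change the hardware masking state behind the handler's back.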