From: Philippe Gerum <rpm@xenomai.org>
To: Jan Kiszka <jan.kiszka@domain.hid>
Cc: "xenomai@xenomai.org" <xenomai@xenomai.org>
Subject: Re: [Xenomai-help] kernel oopses when killing realtime task
Date: Fri, 08 Oct 2010 11:00:05 +0200
Message-ID: <1286528405.13186.106.camel@domain.hid>
In-Reply-To: <1286528239.13186.104.camel@domain.hid>
On Fri, 2010-10-08 at 10:57 +0200, Philippe Gerum wrote:
> On Fri, 2010-10-08 at 10:41 +0200, Jan Kiszka wrote:
> > Am 08.10.2010 10:17, Philippe Gerum wrote:
> > > On Fri, 2010-10-08 at 09:01 +0200, Pavel Machek wrote:
> > >> Hi!
> > >>
> > >>>> I have... quite an interesting setup here.
> > >>>>
> > >>>> SMP machine, with special PCI card; that card has GPIOs and serial
> > >>>> ports. Unfortunately, there's only one interrupt, shared between
> > >>>> serials and GPIO pins, and serials are way too complex to be handled
> > >>>> by realtime layer.
> > >>>>
> > >>>> So I ended up with
> > >>>>
> > >>>> // we also have an interrupt handler:
> > >>>> ret = rtdm_irq_request(&my_context->irq_handle,
> > >>>> gpio_rt_config.irq, demo_interrupt,
> > >>>> RTDM_IRQTYPE_SHARED,
> > >>>> context->device->proc_name, my_context);
> > >>>>
> > >>>> and
> > >>>>
> > >>>> static int demo_interrupt(rtdm_irq_t *irq_context)
> > >>>> {
> > >>>> struct demodrv_context *ctx;
> > >>>> int dev_id;
> > >>>> int ret = RTDM_IRQ_HANDLED; // usual return value
> > >>>> unsigned pending, output;
> > >>>>
> > >>>> ctx = rtdm_irq_get_arg(irq_context, struct demodrv_context);
> > >>>> dev_id = ctx->dev_id;
> > >>>>
> > >>>> if (!ctx->ready) {
> > >>>> printk(KERN_CRIT "Unexpected interrupt\n");
> > >>>> return XN_ISR_PROPAGATE;
> > >>>
> > >>> Who sets ready and when? Looks racy.
> > >>
> > >> Debugging aid; yes, this one is racy.
> > >>
> > >>>> rtdm_lock_put(&ctx->lock);
> > >>>>
> > >>>> /* We need to propagate the interrupt, so that PMC-6L serials
> > >>>> work. Result is that interrupt latencies can't be
> > >>>> guaranteed when serials are in use. */
> > >>>>
> > >>>> return RTDM_IRQ_HANDLED;
> > >>>> }
> > >>>>
> > >>>> Unregistration is:
> > >>>> my_context->ready = 0;
> > >>>> rtdm_irq_disable(&my_context->irq_handle);
> > >>>
> > >>> Where is rtdm_irq_free? Again, this ready flag looks racy.
> > >>
> > >> Aha, sorry, I quoted wrong snippet. rtdm_irq_free() follows
> > >> immediately, like this:
> > >>
> > >> int demo_close_rt(struct rtdm_dev_context *context,
> > >> rtdm_user_info_t *user_info)
> > >> {
> > >> struct demodrv_context *my_context;
> > >> rtdm_lockctx_t lock_ctx;
> > >> // get the context
> > >> my_context = (struct demodrv_context *)context->dev_private;
> > >>
> > >> // if we need to do some stuff with preemption disabled:
> > >> rtdm_lock_get_irqsave(&my_context->lock, lock_ctx);
> > >>
> > >> my_context->ready = 0;
> > >> rtdm_irq_disable(&my_context->irq_handle);
> > >>
> > >>
> > >> // free irq in RTDM
> > >> rtdm_irq_free(&my_context->irq_handle);
> > >>
> > >> // destroy our interrupt signal/event
> > >> rtdm_event_destroy(&my_context->irq_event);
> > >>
> > >> // other stuff here
> > >> rtdm_lock_put_irqrestore(&my_context->lock, lock_ctx);
> > >>
> > >> return 0;
> > >> }
> > >>
> > >> Now... I'm aware that lock_get/put around irq_free should be
> > >> unnecessary, as should be irq_disable and my ->ready flag. Those were
> > >> my attempts to work around the problem. I'll attach the full source at
> > >> the end.
> > >>
> > >>>> Unfortunately, when the userspace app is run and killed repeatedly (so
> > >>>> that interrupt is registered/unregistered all the time), I get
> > >>>> oopses in __ipipe_dispatch_wired() -- it seems to call into the NULL
> > >>>> pointer.
> > >>>>
> > >>>> I decided that "wired" interrupt when the source is shared between
> > >>>> Linux and Xenomai, is wrong thing, so I disable "wired" interrupts
> > >>>> altogether, but that only moved oops to __virq_end.
> > >>>
> > >>> This is wrong. The only way to get deterministically shared IRQs across
> > >>> domains is via the wired path, either using the pattern Gilles cited or,
> > >>> in a slight variation, signaling down via a separate rtdm_nrtsig.
> > >>
> > >> For now, I'm trying to get it not to oops; deterministic latencies are
> > >> the next topic :-(.
> > >
> > > The main issue is that we don't lock our IRQ descriptors (the pipeline
> > > ones) when running the handlers, so another CPU clearing them via
> > > ipipe_virtualize_irq() may well sink the boat...
> > >
> > > The unwritten rule has always been to assume that drivers would stop
> > > _and_ drain interrupts on all CPUs before unregistering handlers, then
> > > exiting the code. Granted, that's a bit much.
> >
> > IIRC, we drain at nucleus-level if statistics are enabled. I guess we
> > should make this unconditional.
>
> Draining is currently performed after the descriptor release via
> rthal_irq_release() in this code, and it depends on the stat counters to
> determine whether the IRQ handler is still running on any CPU it seems.
> A saner way would be to define a draining service in the pipeline, and
> have rtdm_irq_free() invoke it early.
s,rtdm_irq_free,xnintr_detach,
>
--
Philippe.
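
For illustration only, here is a minimal userspace sketch of the "stop, drain, then release" ordering discussed above. It is not the pipeline or nucleus code: struct irq_desc, irq_attach(), irq_dispatch() and irq_detach_drained() are hypothetical names invented for this example, and C11 atomics stand in for whatever per-CPU bookkeeping a real draining service would use. The point is only the ordering: refuse new deliveries first, wait until no handler is in flight on any CPU, and clear the descriptor last.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one IRQ descriptor; not the real I-pipe layout. */
struct irq_desc {
    atomic_bool enabled;      /* false: new deliveries are refused */
    atomic_int in_flight;     /* dispatchers currently inside the descriptor */
    void (*handler)(void *);  /* only read while counted and enabled */
    void *cookie;
};

static void irq_desc_init(struct irq_desc *d)
{
    atomic_init(&d->enabled, false);
    atomic_init(&d->in_flight, 0);
    d->handler = NULL;
    d->cookie = NULL;
}

static void irq_attach(struct irq_desc *d, void (*handler)(void *), void *cookie)
{
    assert(handler != NULL);
    d->handler = handler;
    d->cookie = cookie;
    /* Publish the handler before allowing deliveries. */
    atomic_store(&d->enabled, true);
}

/* Dispatch path: returns 1 if the handler ran, 0 if delivery was refused. */
static int irq_dispatch(struct irq_desc *d)
{
    int ran = 0;

    /* Announce ourselves before looking at the descriptor. */
    atomic_fetch_add(&d->in_flight, 1);
    if (atomic_load(&d->enabled)) {
        d->handler(d->cookie);
        ran = 1;
    }
    atomic_fetch_sub(&d->in_flight, 1);
    return ran;
}

/* Detach path: stop, drain, and only then clear the descriptor. */
static void irq_detach_drained(struct irq_desc *d)
{
    atomic_store(&d->enabled, false);       /* 1. refuse new deliveries */
    while (atomic_load(&d->in_flight) > 0)  /* 2. wait for running handlers */
        ;                                   /*    (busy-wait, sketch only) */
    d->handler = NULL;                      /* 3. now safe to clear */
    d->cookie = NULL;
}

/* Example handler: counts how many times it was invoked. */
static void count_hit(void *cookie)
{
    ++*(int *)cookie;
}
```

Because the dispatcher bumps in_flight before testing enabled, and the detach side clears enabled before reading in_flight, at least one side always notices the other, so the handler pointer is never cleared under a running handler. Clearing the descriptor first and draining afterwards, as the current code is described to do, leaves exactly that window open.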