From: Remy Bohmer <linux@bohmer.net>
To: linux-rt-users <linux-rt-users@vger.kernel.org>,
	Thomas Gleixner <tglx@linutronix.de>
Cc: LKML <linux-kernel@vger.kernel.org>
Subject: 2.6.31-rt11 freeze on userland start on ARM
Date: Mon, 21 Sep 2009 20:36:34 +0200
Message-ID: <3efb10970909211136g4e74c8b3vc339d548cdd0959f@mail.gmail.com>

Hi all,

I am integrating the 2.6.31-rt11 kernel on our ARM9-based (Atmel
at91sam9261) board.
The kernel boots fine, but when userland starts the linuxrc process
and the first 'echo' from the /etc/init.d/rcS script is written to the
serial console (DBGU), the system locks up completely; no character
from userland ever makes it to the terminal.

I found the cause of the lockup and have a workaround, but I could use
some good suggestions on how to solve it the correct way.

What happens is that the kernel continuously schedules one IRQ thread,
namely IRQ1-atmel_serial, and this IRQ thread keeps getting scheduled
forever...

Looking more closely, I noticed that starting an IRQ thread for this
driver is new compared to 2.6.24/26-RT.
Note that the DBGU interrupt is the so-called system interrupt, and it
is shared with the timer interrupt. The timer interrupt has IRQF_TIMER
set, which includes IRQF_NODELAY. This differs from 2.6.24/26, where
sharing a line with an IRQF_NODELAY interrupt made all handlers on
that line run in IRQF_NODELAY context as well.
As a result, we now have an interrupt handler running as a NODELAY
handler that shares its line with an interrupt handler running in
thread context.
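
To make the mixed-context situation concrete, here is a minimal
user-space sketch that checks whether a shared line carries both
NODELAY and threaded handlers. It is not kernel code: the DEMO_* flag
values, struct demo_action and both functions are hypothetical names
invented for illustration, and the bit values are placeholders rather
than the kernel's definitions. The only property taken from the text
above is that the timer flag includes the NODELAY bit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Placeholder bit values for illustration only (NOT the kernel's). */
#define DEMO_IRQF_SHARED  0x1u
#define DEMO_IRQF_NODELAY 0x2u
/* As described above, the timer flag includes the NODELAY bit. */
#define DEMO_IRQF_TIMER   (0x4u | DEMO_IRQF_NODELAY)

/* Hypothetical stand-in for one handler registered on an IRQ line. */
struct demo_action {
	const char *name;
	unsigned int flags;
};

/* True if this handler would run directly in interrupt context. */
static bool demo_is_nodelay(const struct demo_action *a)
{
	return (a->flags & DEMO_IRQF_NODELAY) != 0;
}

/*
 * True if the line has at least one NODELAY handler and at least one
 * handler that would run in an IRQ thread.
 */
static bool demo_line_is_mixed(const struct demo_action *actions, size_t n)
{
	bool saw_nodelay = false;
	bool saw_threaded = false;
	size_t i;

	assert(actions != NULL || n == 0);

	for (i = 0; i < n; i++) {
		if (demo_is_nodelay(&actions[i]))
			saw_nodelay = true;
		else
			saw_threaded = true;
	}
	return saw_nodelay && saw_threaded;
}
```

With a timer entry flagged DEMO_IRQF_SHARED | DEMO_IRQF_TIMER and a
serial entry flagged only DEMO_IRQF_SHARED, demo_line_is_mixed()
reports the mixed case; adding DEMO_IRQF_NODELAY to the serial entry
makes it return false.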

So, as workaround/test I made this change:

Index: linux-2.6.31/drivers/serial/atmel_serial.c
===================================================================
--- linux-2.6.31.orig/drivers/serial/atmel_serial.c	2009-09-21 19:44:48.000000000 +0200
+++ linux-2.6.31/drivers/serial/atmel_serial.c	2009-09-21 19:45:15.000000000 +0200
@@ -808,7 +808,8 @@ static int atmel_startup(struct uart_por
 	/*
 	 * Allocate the IRQ
 	 */
-	retval = request_irq(port->irq, atmel_interrupt, IRQF_SHARED,
+	retval = request_irq(port->irq, atmel_interrupt,
+			IRQF_SHARED | IRQF_NODELAY,
 			tty ? tty->name : "atmel_serial", port);
 	if (retval) {
 		printk("atmel_serial: atmel_startup - Can't get irq\n");
---

This change makes the atmel_serial interrupt handler run as an
IRQF_NODELAY handler again, just as on 2.6.24/26, and the board boots
properly again with 2.6.31.
Does anyone have ideas on how to fix this properly, or is anyone
interested in more debugging information? (I have an ETM tracer
hooked up...)

Note that this driver actually needs the NODELAY flag on preempt-RT to
avoid missing characters, because the hardware has only a 1-byte FIFO
and no flow control ;-)  (I will provide a clean patch later.)
For now, it at least exposes a bug in the new IRQ-threading mechanism...

I also have a few related questions, apart from investigating the
root cause of this bug:
What is the rationale behind the per-driver IRQ thread? What is the
gain for RT? My first impression is that it would increase latencies
when a line is shared with NODELAY interrupts. All handlers need to
run, so the master interrupt cannot be enabled again until all IRQ
threads have run, which means the NODELAY handler must wait for all of
them. Giving different priorities to IRQ threads that share the same
source would therefore increase the latencies even more.
If different drivers share the same interrupt line, additional
scheduling overhead may be added to the latencies as well...
At first glance the former implementation seems more efficient. I
guess it was changed for a good reason, so I must be missing something
here... I hope someone can explain.
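
The concern about extra scheduling overhead can be put into a small
back-of-the-envelope model. The sketch below is user-space C, not
kernel code, and it encodes only the assumption stated in the question
above: on a single CPU, the line stays disabled until every handler
has run, and each IRQ thread costs one context switch. The function
name and parameters are hypothetical, and the model says nothing about
how the kernel actually behaves.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Estimated time (in microseconds) until a shared line can be
 * re-enabled, under the assumptions described above.
 *
 * handler_us:          run time of each handler on the line
 * n:                   number of handlers
 * switch_us:           assumed cost of one context switch
 * per_handler_threads: true  = one IRQ thread per handler,
 *                      false = one IRQ thread runs all handlers
 */
static unsigned long demo_unmask_delay_us(const unsigned long *handler_us,
					  size_t n, unsigned long switch_us,
					  bool per_handler_threads)
{
	unsigned long total = 0;
	size_t switches;
	size_t i;

	assert(handler_us != NULL || n == 0);

	for (i = 0; i < n; i++)
		total += handler_us[i];

	if (n == 0)
		switches = 0;
	else
		switches = per_handler_threads ? n : 1;

	return total + (unsigned long)switches * switch_us;
}
```

For three handlers of 50, 30 and 20 us and a 10 us context switch,
this model gives 130 us with one thread per handler and 110 us with a
single thread for the whole line. The numbers are made up; only the
difference (one extra switch per additional thread) reflects the
argument.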

Kind regards,

Remy
