From: Pieter Palmers <pieterp@joow.be>
To: Lee Revell <rlrevell@joe-job.com>
Cc: Ingo Molnar <mingo@elte.hu>, Steven Rostedt <rostedt@goodmis.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	linux-kernel <linux-kernel@vger.kernel.org>
Subject: status of: tasklet_unlock_wait() causes soft lockup with -rt and ieee1394 audio
Date: Sun, 21 Jan 2007 20:30:45 -0300
Message-ID: <45B3F7A5.2050708@joow.be>
In-Reply-To: <1152371924.4736.169.camel@mindpipe>

Dear all,

What is the status with respect to this problem? I see that in the 
current -rt patch the problematic code piece is different. I personally 
haven't tried to reproduce this myself on a more recent kernel, but I 
just got a report from one of our users who experienced the same problem 
with 2.6.19-rt15 and RT preemption (desktop preemption works fine).

Are the latest -rt patches expected to fix this issue? If so, I'll
test them; otherwise I'll skip the effort.

Thanks,

Pieter

Lee Revell wrote:
> Pieter has found this bug in -rt:
> 
> We are experiencing 'soft' deadlocks when running our code (Freebob, 
> i.e. userspace lib for firewire audio) on RT kernels. After a few 
> seconds of system freeze, I get a kernel panic message that signals a soft lockup.
> 
> I've uploaded photos of the panic here:
> http://freebob.sourceforge.net/old/img_3378.jpg (without flash)
> http://freebob.sourceforge.net/old/img_3377.jpg (with flash)
> Both are of suboptimal quality, unfortunately, but all the info is
> readable on one or the other.
> 
> The problems occur when an ISO stream (receive and/or transmit) is shut
> down in a SCHED_FIFO thread, more precisely when running the freebob
> jackd backend in real-time mode. Even more precisely, they only seem
> to occur when jackd is shut down. There are no problems when jackd is
> started without RT scheduling.
> 
> I haven't been able to reproduce this with other test programs that
> shut down streams in a SCHED_FIFO thread.
> 
> The problem is not reproducible on non-RT kernels; it occurs only on
> kernels configured for PREEMPT_RT. With PREEMPT_DESKTOP there is no
> problem. The preemption setting was the only change between the two
> tests; all other kernel settings (threaded IRQs etc.) were unchanged.
> 
> My tests are performed on 2.6.17-rt1, but the lockups are confirmed for 
> PREEMPT_RT configured kernels 2.6.14 and 2.6.16.
> 
> Some extra information:
> 
> Lee Revell wrote:
> 
>> <...>
>>
>> It seems that the -rt patch changes tasklet_kill:
>>
>> Unpatched 2.6.17:
>>
>> void tasklet_kill(struct tasklet_struct *t)
>> {
>>         if (in_interrupt())
>>                 printk("Attempt to kill tasklet from interrupt\n");
>>
>>         while (test_and_set_bit(TASKLET_STATE_SCHED, &t->state)) {
>>                 do
>>                         yield();
>>                 while (test_bit(TASKLET_STATE_SCHED, &t->state));
>>         }
>>         tasklet_unlock_wait(t);
>>         clear_bit(TASKLET_STATE_SCHED, &t->state);
>> }
>>
>> 2.6.17-rt:
>>
>> void tasklet_kill(struct tasklet_struct *t)
>> {
>>         if (in_interrupt())
>>                 printk("Attempt to kill tasklet from interrupt\n");
>>
>>         while (test_and_set_bit(TASKLET_STATE_SCHED, &t->state)) {
>>                 do
>>                         msleep(1);
>>                 while (test_bit(TASKLET_STATE_SCHED, &t->state));
>>         }
>>         tasklet_unlock_wait(t);
>>         clear_bit(TASKLET_STATE_SCHED, &t->state);
>> }
>>
>> You should ask Ingo & the other -rt developers what the intent of this
>> change was.  Obviously it loops forever waiting for the state bit to
>> change.
> 
> On Thu, 2006-07-06 at 22:14 +0200, Pieter Palmers wrote:
> 
>>> I've put the debugging printk's into tasklet_kill. One interesting 
>>> remark is that now that they are in place, I had to start/stop jackd 
>>> multiple times before deadlock occurs. Without the printk's the machine 
>>> always locked up on the first pass. However I stopped after the first 
>>> lockup, so maybe this is not really significant.
>>>
>>> Anyway, the new tasklet_kill looks like this:
>>>
>>> void tasklet_kill(struct tasklet_struct *t)
>>> {
>>> 	printk("enter tasklet_kill\n");
>>> 	if (in_interrupt())
>>> 		printk("Attempt to kill tasklet from interrupt\n");
>>> 	
>>> 	printk("passed interrupt check\n");
>>>
>>> 	while (test_and_set_bit(TASKLET_STATE_SCHED, &t->state)) {
>>> 		do
>>> 			msleep(1);
>>> 		while (test_bit(TASKLET_STATE_SCHED, &t->state));
>>> 	}
>>> 	printk("passed test_and_set_bit\n");
>>> 	
>>> 	tasklet_unlock_wait(t);
>>> 	printk("passed tasklet_unlock_wait\n");
>>> 	
>>> 	clear_bit(TASKLET_STATE_SCHED, &t->state);
>>> }
>>>
>>> And the last line printed before lockup is:
>>> "passed test_and_set_bit"
>>   
> This makes the change in tasklet_unlock_wait() the prime suspect for
> this problem.
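
One possible mechanism, not confirmed anywhere in this message: under
PREEMPT_RT tasklets run in schedulable kernel threads, so a SCHED_FIFO
caller that busy-waits on TASKLET_STATE_RUN could keep the CPU away from
the very handler it is waiting for. The sketch below is a userspace model
of that idea on a single CPU with strict priorities, not kernel code. The
names wait_for_tasklet, struct model, run_bit and work_left are
hypothetical, and it assumes tasklet_unlock_wait() spins without sleeping.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical single-CPU model; nothing here is a kernel API. */
struct model {
	bool run_bit;   /* stands in for TASKLET_STATE_RUN */
	int work_left;  /* ticks the tasklet handler still needs */
};

/*
 * Simulate a high-priority waiter polling run_bit while a lower-priority
 * handler needs `work` ticks of CPU time to finish and clear the bit.
 * Returns the tick at which the waiter sees the bit clear, or -1 if it
 * has not seen it clear within max_ticks.
 */
int wait_for_tasklet(int work, bool waiter_sleeps, int max_ticks)
{
	struct model m = { .run_bit = true, .work_left = work };

	assert(work > 0 && max_ticks > 0);

	for (int tick = 0; tick < max_ticks; tick++) {
		/* The waiter has the higher priority, so it polls first. */
		if (!m.run_bit)
			return tick;

		/*
		 * Busy-wait: the waiter stays runnable for the whole tick,
		 * so the lower-priority handler gets no CPU time.
		 */
		if (!waiter_sleeps)
			continue;

		/* Sleeping waiter: the handler runs for this tick. */
		if (--m.work_left == 0)
			m.run_bit = false;
	}
	return -1;
}
```

In this model wait_for_tasklet(3, false, 1000) returns -1 no matter how
large the tick budget is, while wait_for_tasklet(3, true, 1000) returns 3.
That would match the observation that only the PREEMPT_RT plus SCHED_FIFO
combination hangs, but it remains an assumption until checked against the
actual -rt tasklet_unlock_wait().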


Thread overview: 9+ messages
2006-07-08 15:18 tasklet_unlock_wait() causes soft lockup with -rt and ieee1394 audio Lee Revell
2006-07-09  1:51 ` Steven Rostedt
2006-07-09  2:12   ` Lee Revell
2006-07-09 11:17     ` Pieter Palmers
2006-07-10 14:41       ` Daniel Walker
2007-01-21 23:30 ` Pieter Palmers [this message]
2007-01-22  9:10   ` status of: " Ingo Molnar
2007-01-25  9:04     ` Pieter Palmers
2007-01-27 10:19     ` Pieter Palmers
