From: Mark Hounschell <markh@compro.net>
To: Oleg Nesterov <oleg@tv-sign.ru>
Cc: Andrew Morton <akpm@linux-foundation.org>, linux-kernel@vger.kernel.org
Subject: Re: floppy.c soft lockup
Date: Thu, 31 May 2007 14:01:00 -0400
Message-ID: <465F0D5C.6020909@compro.net>
In-Reply-To: <20070531170604.GA79@tv-sign.ru>

Oleg Nesterov wrote:
> On 05/31, Mark Hounschell wrote:
>> Andrew Morton wrote:
>>> On Tue, 29 May 2007 13:31:05 -0400 Mark Hounschell <markh@compro.net> wrote:
>>>
>>>> Changes in floppy.c between 2.6.17 and 2.6.18 have broken an application I have. I have tracked
>>>> it down to a single line of code. When the following patch is applied to the version in 2.6.18,
>>>> my application works.
>>>>
>>>> --- linux-2.6.18/drivers/block/floppy.c 2006-09-19 23:42:06.000000000 -0400
>>>> +++ linux-2.6.18-crt/drivers/block/floppy.c     2007-05-29 09:12:20.000000000 -0400
>>>> @@ -893,7 +893,6 @@
>>>>                 set_current_state(TASK_RUNNING);
>>>>                 remove_wait_queue(&fdc_wait, &wait);
>>>>
>>>> -               flush_scheduled_work();
>>>>         }
>>>>         command_status = FD_COMMAND_NONE;
>>>>
>>> Interesting.  I'd expect that the calling process is spinning, with realtime
>>> policy and is expecting some other process to do something (ie: run a workqueue).
>>>
>>> If you keep the process and irq affinities, and disable the realtime policy
>>> does that also prevent the problem?
>>>
>> Yes it does.
>>
>>> It would be interesting if you could capture a few task traces while it is stuck:
>>> echo 1 > /proc/sys/kernel/sysrq then do ALT-SYSRQ-P a bunch of times and ALT-SYSRQ-T,
>>> see if you can work out where the CPU is stuck.
>>>
>> I've attached the syslog output as a result of doing the above. I can't really make any kind of
>> determination from it myself as I don't really know what I'm looking at.
> 
> Could you show the full output? There are no events/* or process doing ioctl()
> in sysrq.txt you attached.
> 
>>> Also, 2.6.22-rc3 might have accidentally fixed this.
>>>
>> No. Same thing there.  The traces attached are using 2.6.22-rc3.
>>
>> Basically the main RT-process (which is a CPU bound process on processor-2) signals a
>> thread to do some I/O. That RT-thread (running on the other processor) does a simple 
> 
> If the main RT-process monopolizes processor-2, flush_workqueue() (or cancel_work_sync())
> can of course hang, and there is nothing we can do about that.
> 
>> ioctl(Q->DevSpec1, FDSETPRM, &medprm)
>>
>> and there is no return from the call. That thread is hung.
> 
> What happens if you kill the main RT-process?
> 
> Could you try the patch below? Just to see if it makes any difference.
> 
> Oleg.
> 
> (against 2.6.22-rcX)
> 
> --- OLD/drivers/block/floppy.c~	2007-04-03 13:04:58.000000000 +0400
> +++ OLD/drivers/block/floppy.c	2007-05-31 20:50:18.000000000 +0400
> @@ -862,6 +862,8 @@ static void set_fdc(int drive)
>  		FDCS->reset = 1;
>  }
>  
> +static DECLARE_WORK(floppy_work, NULL);
> +
>  /* locks the driver */
>  static int _lock_fdc(int drive, int interruptible, int line)
>  {
> @@ -893,7 +895,7 @@ static int _lock_fdc(int drive, int inte
>  		set_current_state(TASK_RUNNING);
>  		remove_wait_queue(&fdc_wait, &wait);
>  
> -		flush_scheduled_work();
> +		cancel_work_sync(&floppy_work);
>  	}
>  	command_status = FD_COMMAND_NONE;
>  
> @@ -992,8 +994,6 @@ static void empty(void)
>  {
>  }
>  
> -static DECLARE_WORK(floppy_work, NULL);
> -
>  static void schedule_bh(void (*handler) (void))
>  {
>  	PREPARE_WORK(&floppy_work, (work_func_t)handler);
> 

The patch does make it work. Would you like me to try again to get a trace with
something meaningful in it?

Regards
Mark
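
For readers following the thread, the difference between the two calls in
the patch can be illustrated with a small userspace model. This is only a
sketch, not kernel code: struct model_work, struct model_cpu_queue and both
functions are hypothetical names invented here, and the model assumes that
some unrelated work item is queued on the CPU that the RT-process
monopolizes, which the traces in this thread have not confirmed. Under that
assumption, waiting for every queue to drain blocks, while cancelling only
floppy_work does not, unless floppy_work itself is running on the starved CPU.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_CPUS     2	/* assumption: two-CPU machine, as in the report */
#define MAX_PENDING 4

/* Hypothetical stand-in for a work item; not the kernel's struct work_struct. */
struct model_work {
	const char *name;
};

/* Hypothetical stand-in for one per-CPU worker thread (events/N). */
struct model_cpu_queue {
	bool worker_starved;			/* CPU is monopolized by an RT task */
	const struct model_work *running;	/* item being executed, or NULL */
	const struct model_work *pending[MAX_PENDING];
	size_t nr_pending;
};

/* Flush-style wait: the caller blocks until every queue has fully drained. */
static bool flush_all_would_hang(const struct model_cpu_queue *q, size_t n)
{
	assert(q != NULL);
	for (size_t i = 0; i < n; i++) {
		bool busy = q[i].running != NULL || q[i].nr_pending > 0;

		/* A busy queue whose worker never gets the CPU never drains. */
		if (busy && q[i].worker_starved)
			return true;
	}
	return false;
}

/*
 * Cancel-style wait: drop the one item if it is still pending, and wait
 * only if that same item is currently executing somewhere.
 */
static bool cancel_one_would_hang(struct model_cpu_queue *q, size_t n,
				  const struct model_work *w)
{
	bool hang = false;

	assert(q != NULL);
	for (size_t i = 0; i < n; i++) {
		size_t kept = 0;

		/* Remove w from the pending list, keeping unrelated items. */
		for (size_t j = 0; j < q[i].nr_pending; j++) {
			if (q[i].pending[j] != w)
				q[i].pending[kept++] = q[i].pending[j];
		}
		q[i].nr_pending = kept;

		/* Only a running instance of w on a starved CPU blocks us. */
		if (q[i].running == w && q[i].worker_starved)
			hang = true;
	}
	return hang;
}
```

With an unrelated item pending on the starved queue, flush_all_would_hang()
reports a hang and cancel_one_would_hang() for floppy_work does not. If
floppy_work is marked as running on the starved queue, the cancel-style wait
blocks as well, which matches Oleg's remark that either call can hang in
that case.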

