From: Mark Hounschell <dmarkh@cfl.rr.com>
To: Oleg Nesterov <oleg@tv-sign.ru>
Cc: Mark Hounschell <markh@compro.net>,
Andrew Morton <akpm@linux-foundation.org>,
linux-kernel@vger.kernel.org, Ingo Molnar <mingo@elte.hu>
Subject: Re: floppy.c soft lockup
Date: Fri, 01 Jun 2007 13:11:42 -0400
Message-ID: <4660534E.6050903@cfl.rr.com>
In-Reply-To: <20070601151605.GA108@tv-sign.ru>
Oleg Nesterov wrote:
> On 06/01, Mark Hounschell wrote:
>> Oleg Nesterov wrote:
>>> Yes, but see above. flush_scheduled_work() needs a cooperation from events/2
>>> which is bound to CPU 2.
>>>
>> Again I don't understand why flush_scheduled_work() running on behalf of a process
>> affinitized to processor-1 requires cooperation from events/2 (affinitized to processor-2)
>> when there is an events/1 already affinitized to processor 1?
>
> flush_workqueue() blocks until any scheduled work on any CPU has run to
> completion. If we have some work_struct pending on CPU 2, it can be completed
> only when events/2 executes it.
>
>>> If you changed irq/X/smp_affinity, the patch I sent should help, because
>>> floppy_work can't be scheduled on CPU 2, but still I don't think it is right
>>> to run 100% cpu-bound RT-process.
>> The patch you sent helps with no other intervention from me. But then so does
>> the patch mentioned in the original post. I am able to bang on the floppies pretty
>> hard doing all kinds of things with no trouble using either.
>
> This patch replaces flush_scheduled_work() with cancel_work_sync(). The latter
> can still hang if the floppy interrupt happens on CPU 2 and does schedule_bh(),
> events/2 starts running floppy_work->func() and preempted by RT-thread. This is
> very unlikely, but possible.
>
> From your original post:
> >
> > The application only runs on SMP machines and uses process and irq affinities
>
> if the floppy interrupt can't happen on CPU 2, the above scenario is not possible.
>
All the irq affinities but one are set to processor-1. The only exception
is the irq from an rtom (Real-Time Option Module). Its irq is handled by
processor-2.
>> As far as a 100% cpu-bound RT-process goes, well I say I don't intentionally relinquish
>> the processor but it's not really 100% cpu-bound. Running xosview I see some spare time.
>
> Well, I don't know what is xosview, sorry :) so I don't understand what does
> "spare time" precisely mean. If this thread does some i/o or something which
> can sleep, then...
>
I don't understand the _real_ meaning of spare time either, but xosview
is just a little graphical window showing information obtained from the
proc fs, I think.
> OK. In that case we may have another reason for deadlock, say a pending
> floppy_work needs open_lock or test_and_set_bit(0, &fdc_busy).
>
> Could you apply the trivial patch below, and change the i/o thread to do
>
> prctl(1234); // hangs ???
> printf(something);
> ioctl(Q->DevSpec1, FDSETPRM, &medprm); // this hangs
>
> to see if prctl() hangs or not? This way we can narrow the problem.
> (of course, you can just kill the above ioctl() if this is possible).
>
> Thanks!
>
> Oleg.
>
> --- OLD/kernel/sys.c~ 2007-04-03 13:05:02.000000000 +0400
> +++ OLD/kernel/sys.c 2007-06-01 18:56:22.000000000 +0400
> @@ -2147,6 +2147,11 @@ asmlinkage long sys_prctl(int option, un
> {
> long error;
>
> + if (option == 1234) {
> + flush_scheduled_work();
> + return 0;
> + }
> +
> error = security_task_prctl(option, arg2, arg3, arg4, arg5);
> if (error)
> return error;
>
> -
OK, the prctl never returned. I just replaced the ioctl with it and added
a printf before and after; I only get the one before. The thread hangs
at this point just as if I'd done the ioctl.
Regards
Mark
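
For reference, the test described above can be sketched roughly as follows.
This is only an illustration, not the actual application code: the name
PR_TEST_FLUSH and the helper probe_flush() are made up here, and 1234 is the
option number from the debug patch quoted above. It writes its markers to
stderr, which is unbuffered, so the "before" line is visible even if the
call never returns.

```c
#include <errno.h>
#include <stdio.h>
#include <sys/prctl.h>

/* Hypothetical name; 1234 is the option number used by the debug patch. */
#define PR_TEST_FLUSH 1234

/*
 * Print a marker, call prctl(PR_TEST_FLUSH), print a second marker.
 * With the debug patch applied, the call runs flush_scheduled_work() and
 * returns 0 once the flush completes. Without the patch the option is
 * unknown, so the call is expected to fail with EINVAL.
 * Returns the prctl() result and leaves errno as prctl() set it.
 */
static int probe_flush(void)
{
    int ret;
    int saved_errno;

    fprintf(stderr, "before prctl(%d)\n", PR_TEST_FLUSH);

    ret = prctl(PR_TEST_FLUSH, 0UL, 0UL, 0UL, 0UL);
    saved_errno = errno;    /* fprintf() below may clobber errno */

    if (ret == -1)
        fprintf(stderr, "after prctl: failed, errno=%d\n", saved_errno);
    else
        fprintf(stderr, "after prctl: returned %d\n", ret);

    errno = saved_errno;
    return ret;
}
```

Calling probe_flush() in the i/o thread in place of the FDSETPRM ioctl gives
the situation reported above: only the "before" line appears if the flush
hangs.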