From: Mark Hounschell <markh@compro.net>
To: Oleg Nesterov <oleg@tv-sign.ru>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	linux-kernel@vger.kernel.org, Ingo Molnar <mingo@elte.hu>
Subject: Re: floppy.c soft lockup
Date: Fri, 01 Jun 2007 10:10:35 -0400
Message-ID: <466028DB.3060509@compro.net>
In-Reply-To: <20070601110058.GA83@tv-sign.ru>

Oleg Nesterov wrote:
> I hope Ingo will correct me if I am wrong,
> 
> On 05/31, Mark Hounschell wrote:
>> Oleg Nesterov wrote:
>>> So, the main question is: is it possible that one of RT processes/threads pins itself
>>> to some CPU and eats 100% cpu power?
>>>
>> The main process is pinned to a processor(2) with all _non-kernel_  processes/threads forced over to processor 1.
>> Any already affinitized processes or kernel threads are left as is. Only user land stuff is moved. The main process
>> is for sure _not_ relinquishing it's processor(2) intentionally.
> 
> This means that a non-rt kernel thread bound to CPU 2 can't run. In particular,
> events/2. This means that the problem is not directly connected to floppy.c,
> any flush_scheduled_work() (or schedule_on_each_cpu()) can't succeed.
> 

Well, I have multiple I/O threads for many other types of I/O that don't have any problems.
And until these changes in 2.6.18 I didn't have any problems with the floppy either. I have multiple
ethernet threads, multiple SCSI (SG) device threads, multiple RS-232 device threads, parallel port,
and others, all with no problems.


> You can change irq/X/smp_affinity, but smp_apic_timer_interrupt() still can
> queue work_struct on CPU 2 (for example, mm/slab.c uses per-cpu reap_work).
> Since events/2 is blocked by the main RT thread, such a work_struct can't be
> executed, and so flush_scheduled_work() hangs.
> 

I don't mean to sound stupid, but why would a process running on processor 1 require anything
from events/2 when there is an events/1? Forgive my ignorance please.

>> All the I/O threads, floppy included, are running
>> on the other processor(1). During this failure only 1 or 2 of the I/O threads are actually doing anything.
>> I assume that what ever is going on in the kernel/floppy driver on behalf of the floppy thread is being done on processor 1? 
>> Processor 1 has lots of CPU time available.
> 
> Yes, but see above. flush_scheduled_work() needs a cooperation from events/2
> which is bound to CPU 2.
> 

Again, I don't understand why flush_scheduled_work(), running on behalf of a process
affinitized to processor 1, requires cooperation from events/2 (affinitized to processor 2)
when there is an events/1 already affinitized to processor 1. Again though, forgive my
ignorance please.

> If you changed irq/X/smp_affinity, the patch I sent should help, because
> floppy_work can't be scheduled on CPU 2, but still I don't think it is right
> to run 100% cpu-bound RT-process.
> 
> Oleg.
> 

The patch you sent helps with no other intervention from me. But then so does
the patch mentioned in the original post. With either one applied I am able to bang
on the floppies pretty hard, doing all kinds of things, with no trouble.

As far as a 100% cpu-bound RT-process goes: I did say I don't intentionally relinquish
the processor, but it's not really 100% cpu-bound. Running xosview I see some spare time.

Thanks
Mark
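
For reference, the dependency on events/2 that Oleg describes above can be
pictured with a toy model. This is a sketch in Python, not kernel code. It
assumes, following Oleg's explanation, that the flush has to wait for the
per-CPU queue on every CPU to drain, not only the queue on the caller's CPU.
The function name is borrowed from the kernel for readability; the queue
layout, the runnable_cpus parameter, and the work item names used as plain
strings are all illustrative assumptions.

```python
from collections import deque


def flush_scheduled_work(queues, runnable_cpus):
    """Toy model: drain per-CPU queues and report which CPUs block the flush.

    queues        -- dict mapping CPU number to a deque of pending work items
    runnable_cpus -- set of CPUs whose worker thread (events/N) can actually run
    Returns a sorted list of CPUs whose queue never drains. An empty list
    means the flush would complete; a non-empty list means it would hang.
    """
    while True:
        progressed = False
        for cpu, pending in queues.items():
            # A worker only makes progress if it can get onto its CPU.
            if pending and cpu in runnable_cpus:
                pending.popleft()
                progressed = True
        if not progressed:
            break
    return sorted(cpu for cpu, pending in queues.items() if pending)


# CPU 2 is monopolized by the RT process, so events/2 never gets to run.
queues = {1: deque(["floppy_work"]), 2: deque(["reap_work"])}
print(flush_scheduled_work(queues, runnable_cpus={1}))
```

In this model the caller's CPU is irrelevant: a single item stuck on the
CPU 2 queue is enough to keep the flush from finishing, even though CPU 1 is
mostly idle.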