public inbox for linux-kernel@vger.kernel.org
* Reproducible SMP kernel deadlock in SCSI generic driver (sg)
@ 2002-05-04 22:08 Jurgen Botz
  0 siblings, 0 replies; 3+ messages in thread
From: Jurgen Botz @ 2002-05-04 22:08 UTC
  To: linux-kernel

The sg module reproducibly deadlocks the kernel for me after some time
of heavy I/O on an SMP system.  This appears to be true in /all/ kernel
versions... I can reproduce it very reliably now in 2.4.19-pre8 and
2.5.13, and I've had problems with CD ripping on my SMP workstation at
least throughout the 2.4 series (I just never fully investigated before).
The bug is almost certainly in sg.c; here is what I've narrowed down...

- Deadlock when ripping CDs using the generic device under SMP after some
  amount of heavy I/O; a higher transfer rate seems to make it happen
  sooner.
- Happens with SCSI CD drives as well as IDE drives using ide-scsi or
  USB drives using usb-storage.
- Deadlock happens sooner with usb-storage than with a real SCSI device,
  but will eventually happen in either case.  In the worst case I've seen
  the deadlock after ~100MB transferred; in the best case, after ~3-4GB
  (i.e. about 5-6 CDs ripped).
- No deadlock when using the sr device on an SMP kernel.
- No deadlock with sg or sr on a UP kernel.

Searching lkml didn't turn up any recent reports of anything like this,
but I suspect that's because not too many people are ripping CDs on
SMP systems these days... however, if anyone out there /does/ and
doesn't see lockups, please let me know.

:j

-- 
Jürgen Botz                       | While differing widely in the various
jurgen@botz.org                   | little bits we know, in our infinite
                                  | ignorance we are all equal. -Karl Popper




* Re: Reproducible SMP kernel deadlock in SCSI generic driver (sg)
@ 2002-05-05 17:45 Douglas Gilbert
  2002-05-12 22:24 ` Jurgen Botz
  0 siblings, 1 reply; 3+ messages in thread
From: Douglas Gilbert @ 2002-05-05 17:45 UTC
  To: Jurgen Botz, linux-kernel

Jurgen Botz <jurgen@botz.org>
> The sg module reproducibly deadlocks the kernel for me after some time
> of heavy I/O on an SMP system.  This appears to be true in /all/ kernel
> versions... I can reproduce it very reliably now in 2.4.19-pre8 and
> 2.5.13, and I've had problems with CD ripping on my SMP workstation at
> least throughout the 2.4 series (I just never fully investigated before).
> The bug is almost certainly in sg.c; here is what I've narrowed down...

Jurgen,
Which version of cdparanoia (or whatever) are you using?
You mention a deadlock: is the machine completely locked
up, or are just sg and the device inoperable? Since sg doesn't
take any "big" locks (e.g. io_request_lock), it
shouldn't be able to lock up your machine without help
(from other drivers).

Assuming you can still execute commands on your box after the
"deadlock", I'm interested in WCHAN from ps. Here are some
ps variants:
  ps -eo cmd,wchan
  ps -eo fname,tty,pid,stat,pcpu,wchan
  ps -eo pid,stat,pcpu,nwchan,wchan=WIDE-WCHAN-COLUMN -o args
The line for cdparanoia would be useful.

BTW ps needs to find the correct System.map for
the WCHAN output to be relevant.
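
To pull out just the cdparanoia line from one of the ps variants
above, something like the following may help. This is only a sketch:
filter_wchan is a hypothetical helper name (not part of ps), and the
"wchan:32" width suffix assumes a procps-style ps.

```shell
# Hypothetical helper: keep the ps header line plus any row whose last
# column (the command name) matches the name given as the first argument.
filter_wchan() {
    awk -v name="$1" 'NR == 1 || $NF == name'
}

# Assumes a procps-style ps; "wchan:32" widens the WCHAN column so that
# long kernel symbol names are less likely to be truncated.
ps -eo pid,stat,pcpu,wchan:32,comm | filter_wchan cdparanoia
```

If the process is stuck, the WCHAN column of that row names the kernel
function it is sleeping in.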

Doug Gilbert


* Re: Reproducible SMP kernel deadlock in SCSI generic driver (sg)
  2002-05-05 17:45 Reproducible SMP kernel deadlock in SCSI generic driver (sg) Douglas Gilbert
@ 2002-05-12 22:24 ` Jurgen Botz
  0 siblings, 0 replies; 3+ messages in thread
From: Jurgen Botz @ 2002-05-12 22:24 UTC
  To: Douglas Gilbert; +Cc: linux-kernel

Douglas Gilbert wrote:
> Jurgen Botz <jurgen@botz.org>
> > The sg module reproducibly deadlocks the kernel for me after some time
> > of heavy I/O on an SMP system.  This appears to be true in /all/ kernel
> > versions... I can reproduce it very reliably now in 2.4.19-pre8 and

This appears to have been a false alarm... my humble apologies!

I'm not sure exactly what was going on, but it may be that I had
some miscompiled kernels... all kernels I was testing with when
I reported this were compiled on Red Hat 7.2.93 (the Skipjack
beta) with gcc-2.96.  A 2.4.19-pre8 SMP kernel compiled with
gcc-3.0.4 does not exhibit this problem.

:j


-- 
Jürgen Botz                       | While differing widely in the various
jurgen@botz.org                   | little bits we know, in our infinite
                                  | ignorance we are all equal. -Karl Popper


