From: Jeff Wiegley <jeffw@cyte.com>
To: Andrew Morton <akpm@osdl.org>
Cc: linux-kernel@vger.kernel.org
Subject: Re: amd64 cdrom access locks system
Date: Thu, 09 Jun 2005 08:36:24 -0700
Message-ID: <42A861F8.9000301@cyte.com>
In-Reply-To: <20050608052354.7b70052c.akpm@osdl.org>
Andrew Morton wrote:
> Jeff Wiegley <jeffw@cyte.com> wrote:
>
>>I've been having this problem in 2.6.12-rc2 and 2.6.12-rc6.
>>
>> Any continued access to /dev/hda causes a complete and total
>> lock up of the system. Nothing is logged to /var/log/kernel
>> or /var/log/messages. Just a solid freeze.
>>
>> This happens with at least cdparanoia and cdrecord as well.
>>
>> The machine is an AMD64 FX55 CPU running in a shuttle
>> ST20G5 chassis.
>
>
> Can you identify an earlier kernel which worked OK?
>
> How locked up is it? Does sysrq-P not work? Is it pingable? Tried
> enabling the nmi watchdog?
Sorry for the length of this message. Based on your suggestions I
tried a variety of tests, and although none of them fixed the
problem I at least have a wealth of information (possibly useless)
to report here...
When it locks up while running cdparanoia the machine is *not* pingable.
The NMI watchdog doesn't seem to be resetting anything, though I've
never used it before, so maybe it's not enabled at all even though
I did compile in ACPI and passed nmi_watchdog=1 on the kernel
command line. (/proc/sys/kernel/unknown_nmi_panic is present, if that
helps.)
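One way to tell whether the NMI watchdog is actually firing is to watch the NMI row of /proc/interrupts: if the counts never increase, the watchdog is probably not running. A minimal Python sketch of that check, assuming the usual layout of one count column per CPU after the "NMI:" label; the helper names here are made up for illustration:

```python
def nmi_counts(interrupts_text):
    """Return the per-CPU NMI counts found in /proc/interrupts text.

    Returns an empty list if there is no "NMI:" row.
    """
    for line in interrupts_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "NMI:":
            counts = []
            for field in fields[1:]:
                # Stop at the first non-numeric field (newer kernels append
                # a description after the per-CPU counts).
                if not field.isdigit():
                    break
                counts.append(int(field))
            return counts
    return []


def watchdog_is_ticking(before_text, after_text):
    """True if any CPU's NMI count increased between the two samples."""
    before = nmi_counts(before_text)
    after = nmi_counts(after_text)
    if not before or len(before) != len(after):
        return False
    return any(a > b for a, b in zip(after, before))


def read_interrupts(path="/proc/interrupts"):
    """Read the current interrupt counters as text."""
    with open(path) as handle:
        return handle.read()
```

To use it, call read_interrupts() twice a few seconds apart and pass both samples to watchdog_is_ticking().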
sysrq-P does work. (I won't include what it printed, since I have a
panic trace instead.)
Ah.. you need to be at a real console and not in X to get
anything from sysrq. (I didn't know that.) Anyhow, when I run
"cdparanoia -d /dev/hda 5" from a real console it starts up, copies
a few sectors from the drive and then panics with this. (I didn't
see the panic before because I was in X11.)
warning: many lost ticks.
Your time source seems to be instable or some driver is hogging interupts
rip default_idle+0x24/0x30
Falling back to HPET
divide error: 0000 [1] PREEMPT
CPU 0
Modules linked in: deflate zlib_deflate twofish serpent aes blowfish des
sha256 sha1 crypto_null xfrm_user xfrm4_tunnel ipcomp esp4 ah4 af_key
ipv6 alim15x3 ide_generic reiserfs tun thermal processor fan button ac
battery i2c_ali15x3 i2c_ali1535 i2c_core ehci_hcd usbhid ohci_hcd tg3
ohci1394 sbp2 ieee1394 psmouse ide_disk ide_cd ide_core sata_uli sr_mod
cdrom sd_mod sata_promise libata sg usb_storage scsi_mod unix
Pid: 0, comm: swapper Not tainted 2.6.12-rc6-jw9
RIP: 0010:[<ffffffff80112704>] <ffffffff80112704>{timer_interrupt+244}
RSP: 0000:ffffffff803a09c0 EFLAGS: 00010046
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
RBP: 00000000ffffffff R08: 0000000000000000 R09: 0000000000010101
R10: 000000000000000e R11: 0000000000000000 R12: ffffffff803a0a38
R13: 0000000000000000 R14: 0000000000000000 R15: ffffffff803e1fb0
FS: 00002aaaab1a2640(0000) GS:ffffffff803d9f40(0000) knlGS:0000000000000000
CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 00002aaaab51f430 CR3: 0000000074704000 CR4: 00000000000006e0
Process swapper (pid: 0, threadinfo ffffffff803e0000, task ffffffff802f62c0)
Stack: 0000000000000086 ffffffff802f6e00 0000000000000000 ffffffff803a0a38
0000000000000000 ffffffff801547fc 0000000000000000 0000000000000000
ffffffff803da040 ffffffff802f6e00
Call Trace: <IRQ> <ffffffff801547fc>{handle_IRQ_event+44}
<ffffffff80154905>{__do_IRQ+213}
<ffffffff80111402>{do_IRQ+66} <ffffffff8010edbd>{ret_from_intr+0}
<ffffffff80138108>{__do_softirq+72}
<ffffffff801381a5>{do_softirq+53}
<ffffffff801382bc>{irq_exit+76} <ffffffff80111407>{do_IRQ+71}
<ffffffff8010edbd>{ret_from_intr+0} <EOI>
<ffffffff8010eeed>{retint_kernel+38}
<ffffffff8010c820>{default_idle+0}
<ffffffff8010c844>{default_idle+36}
<ffffffff8010c991>{cpu_idle+49} <ffffffff803e2863>{start_kernel+435}
<ffffffff803e223f>{x86_64_start_kernel+319}
Code: 48 f7 f6 48 01 05 ba 67 29 00 e9 dd 00 00 00 83 f8 03 75 0d
RIP <ffffffff80112704>{timer_interrupt+244} RSP <ffffffff803a09c0>
<0>Kernel panic - not syncing: Aiee, killing interrupt handler!
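One detail can be read straight off the trace. Assuming the "Code:" bytes begin at the faulting RIP (which appears to be how this x86_64 oops format prints them), the first three bytes, 48 f7 f6, decode to a 64-bit divide by %rsi, and the register dump shows RSI as zero, which would be consistent with the "divide error". A minimal Python sketch of that decoding follows; the helper names decode_f7_reg and reg_value are hypothetical, and the decoder only handles this one instruction form:

```python
import re

# 64-bit register names indexed by register number (REX.B supplies bit 3).
REG64 = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi",
         "r8", "r9", "r10", "r11", "r12", "r13", "r14", "r15"]

# Opcode F7 is a group; the ModRM "reg" field selects the operation.
F7_GROUP = {2: "not", 3: "neg", 4: "mul", 5: "imul", 6: "div", 7: "idiv"}


def decode_f7_reg(code_hex):
    """Decode a leading 'REX.W F7 /n' instruction with a register operand."""
    raw = bytes.fromhex(code_hex.replace(" ", ""))
    rex, opcode, modrm = raw[0], raw[1], raw[2]
    if (rex & 0xF8) != 0x48 or opcode != 0xF7:
        raise ValueError("not a REX.W F7 instruction")
    mod, op, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if mod != 3 or op not in F7_GROUP:
        raise ValueError("unsupported ModRM form")
    rm |= (rex & 1) << 3  # extend the register number with REX.B
    return f"{F7_GROUP[op]} %{REG64[rm]}"


def reg_value(oops_text, name):
    """Pull a 64-bit register value out of the oops register dump."""
    match = re.search(rf"\b{name.upper()}: ([0-9a-f]{{16}})\b", oops_text)
    if match is None:
        raise KeyError(name)
    return int(match.group(1), 16)


code = "48 f7 f6 48 01 05 ba 67 29 00 e9 dd 00 00 00 83 f8 03 75 0d"
regs = "RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000"
print(decode_f7_reg(code))     # div %rsi
print(reg_value(regs, "rsi"))  # 0
```

This only identifies the faulting instruction and its divisor; it says nothing about why the divisor in timer_interrupt ended up as zero.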
Ummm... I have no idea what to do with all of that; I hope it's
highly meaningful to you. From what I can read it looks like an
interrupt problem since the call trace is all IRQ-thingies.
Would it do any good to try a different physical CDRom drive? The
current one is listed as
hda: SONY DVD RW DRU-500A, ATAPI CD/DVD-ROM drive.
(I have had problems with it being unable to burn DVD-R under Windows
(blech!), which it should be able to do, so maybe the drive is
dead or misbehaving. But I don't want to rip apart my cases and swap
drives if it's a higher-level IRQ/driver problem.)
I can use an external USB cdrom reader without problems.
By the way, this Shuttle case/motherboard is relatively new and
seems to contain a lot of chipset devices that are not recognized
by the kernel:
lspci: [with recognized stuff snipped out]
0000:00:00.0 Host bridge: ATI Technologies Inc: Unknown device 5950 (rev 01)
0000:00:01.0 PCI bridge: ATI Technologies Inc: Unknown device 5a3f
0000:00:06.0 PCI bridge: ATI Technologies Inc: Unknown device 5a38
0000:00:1d.0 0403: ALi Corporation: Unknown device 5461
0000:00:1e.0 ISA bridge: ALi Corporation: Unknown device 1573 (rev 31)
0000:00:1f.0 IDE interface: ALi Corporation M5229 IDE (rev c7)
0000:00:1f.1 RAID bus controller: ALi Corporation: Unknown device 5287 (rev 02)
0000:01:05.0 VGA compatible controller: ATI Technologies Inc: Unknown device 5954
Maybe these unknown devices are causing problems with interrupt
assignments or service? I would be willing to help in any way possible
to get this case/motherboard fully supported but I'm not a chipset
or kernel guru. But I've got some time and I'm willing to reboot
and test as much as is needed.
Back to the CD burner. I think 2.6.8 worked and was the latest
working kernel. If I remember correctly it worked really slowly
(max speed 4 or 8) because DMA couldn't be enabled for the device. But
I can't confirm that now, since I had to switch to 2.6.9 or higher to
get the ULi serial ATA driver, so I can't boot 2.6.8 on this machine:
it won't find a root filesystem. (I borrowed a Promise SATA
card to do the original install.)
Hmmm.. hdparm /dev/hda:
IO_support = 0 (default 16-bit)
unmaskirq = 0 (off)
using_dma = 0 (off)
keepsettings = 0 (off)
readonly = 0 (off)
readahead = 256 (on)
HDIO_GETGEO failed: Invalid argument
and I cannot manually enable DMA either (I kind of expected to be able
to)...
root@mail:~# hdparm -d 1 /dev/hda
/dev/hda:
setting using_dma to 1 (on)
HDIO_SET_DMA failed: Operation not permitted
using_dma = 0 (off)
Sorry I can't help more. The deepest I ever got into the kernel
workings was a port-mapped, non-interrupt-driven device driver for
an NTSC capture device (subject to NDA) in the 2.2 kernel. Everything
else is grand voodoo to me.
I hope you can help with this. Thanks.
--
Jeff Wiegley, PhD
Cyte.Com, LLC
(ignore:cea2d3a38843531c7def1deff59114de)