* [Qemu-devel] Intermittant linux kernel panic on arm
@ 2007-03-07 18:14 Quentin Barnes
2007-03-10 0:05 ` Rob Landley
0 siblings, 1 reply; 3+ messages in thread
From: Quentin Barnes @ 2007-03-07 18:14 UTC (permalink / raw)
To: qemu-devel
This is my first post to the list. Hopefully, it will go well.
I've been using the ARM qemu for Linux development for some basic
work, but wanted to expand and do more with it. I outgrew the
initrd limitation and needed a disk. Since I've been using a disk,
I've been getting intermittent panics during the udevd phase of
boot. Once the system is up though, it's been stable.
The panic occurs about 50%-75% of the time. I've had this problem on both
0.9.0 and the 2007-03-07_05 snapshot. I've had this problem using both
my own 2.6.19-1 ARM kernel made directly from kernel.org as well as
Aurelien Jarno's 2.6.18 Debian ARM kernel at
http://people.debian.org/~aurel32/arm-versatile/vmlinuz-2.6.18-4-versatile
My standard invocation:
$ qemu-system-arm -M versatilepb -k en-us -kernel zImage \
    -initrd initrd.img-2.6.18-4-versatile -hda hda.img \
    -monitor stdio -append "root=/dev/sda1"
My host system is Red Hat Fedora Core 6 (FC6) with a Linux 2.6.19-1 i686
kernel. My "disk" is a 20GB qcow image.
Based on the stack traceback, I thought the panic might have to do
with the SCSI chip emulation not doing residuals correctly, so I
built a kernel with SYM_SETUP_RESIDUAL_SUPPORT set to 0. That didn't
change anything.
I've googled around and checked the mailing list archives and can't
find anything like this. Why haven't other people seen this?
Am I doing something unusual or wrong?
Any thoughts or ideas to try?
Quentin
Here's the panic specifics:
=================================
Gdb attached to qemu in gdbserver mode with breakpoint on "sym_evaluate_dp":
=====
Breakpoint 1, sym_evaluate_dp (np=0xffd00000, cp=0xffd00c00, scr=1342180112,
ofs=0xc09dd9fc) at drivers/scsi/sym53c8xx_2/sym_hipd.c:3570
3570 in drivers/scsi/sym53c8xx_2/sym_hipd.c
(gdb) c
Continuing.
Breakpoint 1, sym_evaluate_dp (np=0xffd00000, cp=0xffd04c00, scr=1342180112,
ofs=0xc063b9fc) at drivers/scsi/sym53c8xx_2/sym_hipd.c:3570
3570 in drivers/scsi/sym53c8xx_2/sym_hipd.c
(gdb) c
=====
After this continue, the kernel panics with a fault at ffd05a98.
This time cp=0xffd04c00 is different. On each of the other couple of
dozen entries, the function was called with the same value, cp=0xffd00c00.
Boot console output:
=====
PCI: enabling device 0000:00:0c.0 (0140 -> 0143)
sym0: <895a> rev 0x0 at pci 0000:00:0c.0 irq 27
sym0: No NVRAM, ID 7, Fast-40, LVD, parity checking
sym0: SCSI BUS has been reset.
scsi0 : sym-2.2.3
scsi 0:0:0:0: Direct-Access QEMU QEMU HARDDISK 0.9. PQ: 0 ANSI: 3
target0:0:0: tagged command queuing enabled, command queue depth 16.
target0:0:0: Beginning Domain Validation
target0:0:0: Domain Validation skipping write tests
target0:0:0: Ending Domain Validation
scsi 0:0:2:0: CD-ROM QEMU QEMU CD-ROM 0.9. PQ: 0 ANSI: 3
target0:0:2: tagged command queuing enabled, command queue depth 16.
target0:0:2: Beginning Domain Validation
target0:0:2: Domain Validation skipping write tests
target0:0:2: Ending Domain Validation
SCSI device sda: 41943040 512-byte hdwr sectors (21475 MB)
sda: Write Protect is off
sda: Mode Sense: 13 00 00 00
SCSI device sda: drive cache: write back
SCSI device sda: 41943040 512-byte hdwr sectors (21475 MB)
sda: Write Protect is off
sda: Mode Sense: 13 00 00 00
SCSI device sda: drive cache: write back
sda: sda1 sda2 < sda5 >
sd 0:0:0:0: Attached scsi disk sda
[...]
INIT: version 2.86 booting
Starting the hotplug events dispatcher: udevd.
Synthesizing the initial hotplug events...done.
Waiting for /dev to be fully populated...Unable to handle kernel paging request at virtual address ffd05a98
pgd = c05a8000
[ffd05a98] *pgd=00a6d011, *pte=00000000, *ppte=00000000
Internal error: Oops: 17 [#1]
Modules linked in:
CPU: 0
PC is at sym_evaluate_dp+0x9c/0x184
LR is at 0xff0000da
pc : [<c01a97a0>] lr : [<ff0000da>] Not tainted
sp : c0bc79e4 ip : ffd05aa0 fp : c0bc79f4
r10: 00000000 r9 : 00000000 r8 : 000009f8
r7 : 0000027e r6 : c7da6180 r5 : ffd00000 r4 : c0bc79fc
r3 : 00000ea0 r2 : 0000005f r1 : ffd04c00 r0 : 000001c9
Flags: nzCv IRQs off FIQs on Mode SVC_32 Segment user
Control: 3137
Table: 005A8000 DAC: 00000015
Process scsi_id (pid: 976, stack limit = 0xc0bc6258)
[... Stack contents removed ...]
Backtrace:
[<c01a9704>] (sym_evaluate_dp+0x0/0x184) from [<c01a99fc>] (sym_compute_residual+0x78/0xe4)
r4 = FFD04C00
[<c01a9984>] (sym_compute_residual+0x0/0xe4) from [<c01acfb4>] (sym_interrupt+0xf8/0x18e0)
r4 = FFD04C00
[<c01acebc>] (sym_interrupt+0x0/0x18e0) from [<c01a7844>] (sym53c8xx_intr+0x3c/0x6c)
[<c01a7808>] (sym53c8xx_intr+0x0/0x6c) from [<c005c458>] (handle_IRQ_event+0x44/0x84)
r5 = 00000000 r4 = C7D79E60
[<c005c414>] (handle_IRQ_event+0x0/0x84) from [<c005dba8>] (handle_level_irq+0xac/0x104)
r7 = C7DA4800 r6 = 00000001 r5 = 0000001B r4 = C029A6C0
[<c005dafc>] (handle_level_irq+0x0/0x104) from [<c0023780>] (asm_do_IRQ+0x4c/0x68)
r5 = F1140000 r4 = 00000000
[<c0023734>] (asm_do_IRQ+0x0/0x68) from [<c02443f0>] (__irq_svc+0x30/0xa0)
r4 = FFFFFFFF
[<c019a760>] (scsi_dispatch_cmd+0x0/0x25c) from [<c019fa40>] (scsi_request_fn+0x250/0x31c)
r7 = C7DA03E8 r6 = C7DABC00 r5 = C7DA4800 r4 = C7DA4000
[<c019f7f0>] (scsi_request_fn+0x0/0x31c) from [<c013f974>] (elv_insert+0x80/0x1c4)
[<c013f8f4>] (elv_insert+0x0/0x1c4) from [<c013fb6c>] (__elv_add_request+0xb4/0xb8)
r7 = C0BC7CB0 r6 = 00000002 r5 = C7DABC00 r4 = C7DA03E8
[<c013fab8>] (__elv_add_request+0x0/0xb8) from [<c0142914>] (blk_execute_rq_nowait+0x80/0xa8)
r6 = 00000002 r5 = C7DABC00 r4 = C7DA03E8
[<c0142894>] (blk_execute_rq_nowait+0x0/0xa8) from [<c01429c4>] (blk_execute_rq+0x88/0xa8)
r6 = C7DA59E0 r5 = 00000000 r4 = C7DA03E8
[<c014293c>] (blk_execute_rq+0x0/0xa8) from [<c0146204>] (sg_io+0x28c/0x3b4)
[<c0145f78>] (sg_io+0x0/0x3b4) from [<c0146850>] (scsi_cmd_ioctl+0x1e4/0x41c)
[<c014666c>] (scsi_cmd_ioctl+0x0/0x41c) from [<c01b1520>] (sd_ioctl+0x90/0xc0)
[<c01b1490>] (sd_ioctl+0x0/0xc0) from [<c0144708>] (blkdev_driver_ioctl+0x50/0x5c)
[<c01446b8>] (blkdev_driver_ioctl+0x0/0x5c) from [<c0144eac>] (blkdev_ioctl+0x754/0x7b0)
r5 = BE992400 r4 = FFFFFDFD
[<c0144758>] (blkdev_ioctl+0x0/0x7b0) from [<c00a2464>] (block_ioctl+0x2c/0x30)
[<c00a2438>] (block_ioctl+0x0/0x30) from [<c0089060>] (do_ioctl+0x34/0x74)
[<c008902c>] (do_ioctl+0x0/0x74) from [<c0089304>] (vfs_ioctl+0x264/0x294)
r5 = BE992400 r4 = C0BA0D20
[<c00890a0>] (vfs_ioctl+0x0/0x294) from [<c0089374>] (sys_ioctl+0x40/0x64)
r7 = 00000036 r6 = 00002285 r5 = FFFFFFF7 r4 = C0BA0D20
[<c0089334>] (sys_ioctl+0x0/0x64) from [<c00228c0>] (ret_fast_syscall+0x0/0x2c)
r6 = 00016A88 r5 = 00000003 r4 = 00000006
Code: e35e0000 e2632060 aa00000f ea000008 (e53c3008)
<0>Kernel panic - not syncing: Aiee, killing interrupt handler!
=====
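As a cross-check on the oops above, here is a rough Python sketch (not part of the original report) that decodes the instruction word shown in parentheses on the "Code:" line and recomputes the address it touched. The register values are copied from the dump. The helper name `decode_ldr_str_imm` is made up for illustration, and the 4K page size is an assumption about this kernel configuration.

```python
def decode_ldr_str_imm(word):
    """Decode an ARM (A32) single data transfer with a 12-bit immediate offset."""
    if (word >> 25) & 0b111 != 0b010:
        raise ValueError("not an immediate-offset LDR/STR encoding")
    return {
        "pre_indexed": bool(word & (1 << 24)),  # P: apply offset before the access
        "add": bool(word & (1 << 23)),          # U: add (1) or subtract (0) the offset
        "byte": bool(word & (1 << 22)),         # B: byte (1) or word (0) access
        "writeback": bool(word & (1 << 21)),    # W: write the address back to Rn
        "load": bool(word & (1 << 20)),         # L: load (1) or store (0)
        "rn": (word >> 16) & 0xF,               # base register number
        "rd": (word >> 12) & 0xF,               # destination register number
        "imm12": word & 0xFFF,                  # unsigned offset magnitude
    }


insn = decode_ldr_str_imm(0xE53C3008)  # faulting word from the "Code:" line
ip = 0xFFD05AA0                        # r12 (ip) from the register dump
cp = 0xFFD04C00                        # r1 in the dump, cp in the gdb session

offset = insn["imm12"] if insn["add"] else -insn["imm12"]
address = (ip + offset) & 0xFFFFFFFF if insn["pre_indexed"] else ip

print(f"rn=r{insn['rn']} rd=r{insn['rd']} offset={offset:+d}")
print(f"effective address = {address:#010x}")
print(f"offset from cp    = {address - cp:#x}")
# Assumes 4K pages; compares the page frame of the access with that of cp.
print(f"same 4K page as cp: {(address >> 12) == (cp >> 12)}")
```

The word decodes as a pre-indexed load through r12 (ip) with an offset of -8, and the recomputed address matches the ffd05a98 reported in the oops. That address is 0xe98 bytes past cp and, under the 4K assumption, in a different page from cp itself. Whether that offset is still inside the structure the driver meant to walk is not something this sketch can tell.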
Contents of "cp" on entry to sym_evaluate_dp() just before it panicked:
=====
(gdb) print *cp
$6 = {phys = {head = {go = {start = 0x50000058, restart = 0x500004c0},
savep = 0x500007e8, lastp = 0x50000b10, status = "\000\204\000@"},
pm0 = {sg = {size = 0xffffff26, addr = 0x5840da}, ret = 0x50001338},
pm1 = {sg = {size = 0x0, addr = 0x0}, ret = 0x0}, select = {
sel_scntl4 = 0x0, sel_sxfer = 0x0, sel_id = 0x0, sel_scntl3 = 0x7},
smsg = {size = 0x8, addr = 0xa44f9c}, smsg_ext = {size = 0x0,
addr = 0x7d524de}, cmd = {size = 0x6, addr = 0xa44f5c}, sense = {
size = 0x0, addr = 0x0}, wresid = {size = 0x0, addr = 0x0}, data = {{
size = 0x0, addr = 0x0} <repeats 80 times>, {size = 0x2000,
addr = 0x7eb8000}, {size = 0x2000, addr = 0x7dce000}, {size = 0x2000,
addr = 0x7d6e000}, {size = 0x2000, addr = 0x7df0000}, {size = 0x2000,
addr = 0x7de4000}, {size = 0x2000, addr = 0x9d0000}, {size = 0x1000,
addr = 0x71f000}, {size = 0x1000, addr = 0x7c25000}, {size = 0x1000,
addr = 0xaff000}, {size = 0x1000, addr = 0xae2000}, {size = 0x1000,
addr = 0x73d000}, {size = 0x1000, addr = 0x722000}, {size = 0x1000,
addr = 0x6a2000}, {size = 0x1000, addr = 0x742000}, {size = 0x1000,
addr = 0x634000}, {size = 0xfe, addr = 0x584000}}}, cmd = 0xc7da6180,
cdb_buf = "\022\001\000\000\376\000\000\000\b\000\000\000\000\000\000",
sns_bbuf = '\0' <repeats 31 times>, data_len = 0xfe, segments = 0x1,
order = 0x20, odd_byte_adjustment = 0x0, nego_status = 0x1,
xerr_status = 0x0, extra_bytes = 0x0,
scsi_smsg = "\300 e\001\003\001\n\037\000\000\000",
scsi_smsg2 = '\0' <repeats 11 times>, sensecmd = "\000\000\000\000\000",
sv_scsi_status = 0x0, sv_xerr_status = 0x0, sv_resid = 0x0,
ccb_ba = 0xa44c00, tag = 0x32, target = 0x0, lun = 0x0, link_ccbh = 0x0,
link_ccbq = {flink = 0xffd004fc, blink = 0xffd004fc}, startp = 0x500007e8,
goalp = 0x500007f8, ext_sg = 0xffffffff, ext_ofs = 0x0, to_abort = 0x0,
tags_si = 0x1}
=====
=================================
* Re: [Qemu-devel] Intermittant linux kernel panic on arm
2007-03-07 18:14 [Qemu-devel] Intermittant linux kernel panic on arm Quentin Barnes
@ 2007-03-10 0:05 ` Rob Landley
2007-03-12 5:33 ` Quentin Barnes
0 siblings, 1 reply; 3+ messages in thread
From: Rob Landley @ 2007-03-10 0:05 UTC (permalink / raw)
To: qemu-devel; +Cc: Quentin Barnes
On Wednesday 07 March 2007 1:14 pm, Quentin Barnes wrote:
> This is my first post to the list. Hopefully, it will go well.
>
> I've been using the ARM qemu for Linux development for some basic
> work, but wanted to expand and do more with it. I outgrew the
> initrd limitation and needed a disk. Since I've been using a disk,
> I've been getting intermittant panics during the udevd phase of
> boot. Once the system is up though, it's been stable.
>
> The panic occurs about 50%-75% the time. I've had this problem on both
> 0.9.0 and the 2007-03-07_05 snapshot. I've had this problem using both
> my own 2.6.19-1 ARM kernel made directly from kernel.org as well as
> Aurelien Jarno's 2.6.18 Debian ARM kernel at
> http://people.debian.org/~aurel32/arm-versatile/vmlinuz-2.6.18-4-versatile
The first thing I'd do is try to figure out what udev is doing to trigger
this panic.
> r7 = 00000036 r6 = 00002285 r5 = FFFFFFF7 r4 = C0BA0D20
> [<c0089334>] (sys_ioctl+0x0/0x64) from [<c00228c0>] (ret_fast_syscall+0x0/0x2c)
Is there any way you can figure out which ioctl this is? Presumably udev read
something from /sys that told it to mknod something in /dev. I'm not quite
sure where an ioctl comes into this...
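One way to approach that question from the trace alone, as a rough sketch: assume the r6 value printed for the sys_ioctl frame (0x2285) is the ioctl request number. The backtrace does not guarantee this, since it only shows saved registers. Under that assumption the value can be matched against a few request numbers from the Linux headers. The table below is a small hand-picked subset, and `name_ioctl` is a made-up helper name.

```python
# Request numbers as defined in the Linux headers
# (<scsi/sg.h>, <scsi/scsi.h>, <linux/fs.h>); deliberately incomplete.
KNOWN_IOCTLS = {
    0x2285: "SG_IO",
    0x2282: "SG_GET_VERSION_NUM",
    0x5382: "SCSI_IOCTL_GET_IDLUN",
    0x1260: "BLKGETSIZE",
}


def name_ioctl(request):
    """Return the symbolic name for a request number, if it is in the table."""
    return KNOWN_IOCTLS.get(request, f"unknown ({request:#x})")


r6 = 0x00002285  # value shown next to the sys_ioctl frame in the backtrace
print(name_ioctl(r6))
```

A match here is only a hint. Breaking on scsi_cmd_ioctl in gdb and printing the command argument would be the more reliable check.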
If you could get a small C program that triggers the panic, and a
kernel .config you built your kernel with, that would be helpful.
Rob
--
Vista: Windows Millenium Second Edition
* Re: [Qemu-devel] Intermittant linux kernel panic on arm
2007-03-10 0:05 ` Rob Landley
@ 2007-03-12 5:33 ` Quentin Barnes
0 siblings, 0 replies; 3+ messages in thread
From: Quentin Barnes @ 2007-03-12 5:33 UTC (permalink / raw)
To: Rob Landley; +Cc: qemu-devel
On Fri, Mar 09, 2007 at 07:05:27PM -0500, Rob Landley wrote:
>On Wednesday 07 March 2007 1:14 pm, Quentin Barnes wrote:
>> This is my first post to the list. Hopefully, it will go well.
>>
>> I've been using the ARM qemu for Linux development for some basic
>> work, but wanted to expand and do more with it. I outgrew the
>> initrd limitation and needed a disk. Since I've been using a disk,
>> I've been getting intermittant panics during the udevd phase of
>> boot. Once the system is up though, it's been stable.
>>
>> The panic occurs about 50%-75% the time. I've had this problem on both
>> 0.9.0 and the 2007-03-07_05 snapshot. I've had this problem using both
>> my own 2.6.19-1 ARM kernel made directly from kernel.org as well as
>> Aurelien Jarno's 2.6.18 Debian ARM kernel at
>> http://people.debian.org/~aurel32/arm-versatile/vmlinuz-2.6.18-4-versatile
>
>The first thing I'd do is try to figure out what's udev doing to trigger this
>panic?
>
>> r7 = 00000036 r6 = 00002285 r5 = FFFFFFF7 r4 = C0BA0D20
>> [<c0089334>] (sys_ioctl+0x0/0x64) from [<c00228c0>] (ret_fast_syscall+0x0/0x2c)
>
>Is there any way you can figure out which ioctl this is?
>Presumably udev read something from /sys that told it to mknod
>something in /dev. I'm not quite sure where an ioctl comes into
>this...
>
>If you could get a small C program that triggers the panic, and a
>kernel .config you built your kernel with, that would be helpful.
I don't know whether a small C program would trigger the panic.
The same ioctl is issued earlier in startup without causing a panic.
However, I could still give it a try at some point if we have
no other ideas.
The ioctl is an SG_IO that issues a SCSI INQUIRY command:
==============
Breakpoint 3, scsi_dispatch_cmd (cmd=0xc7db7180) at drivers/scsi/scsi.c:475
475 struct Scsi_Host *host = cmd->device->host;
1: x/i $pc 0xc019e7f4 <scsi_dispatch_cmd+12>: ldr r1, [r0]
(gdb) print /c cmd->cmnd
$8 = {0x12, 0x1, 0x0, 0x0, 0xfe, 0x0 <repeats 11 times>}
==============
The SCSI INQUIRY command is properly formed and dispatched with EVPD=1
to read VPD page 0x00.
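For reference, the bytes printed by gdb can be decoded field by field. This is a minimal sketch that assumes the 6-byte INQUIRY layout from SPC-3 (EVPD in bit 0 of byte 1, page code in byte 2, allocation length in bytes 3-4). The function name `decode_inquiry_cdb` is made up for illustration.

```python
def decode_inquiry_cdb(cdb):
    """Decode the first six bytes of a SCSI INQUIRY CDB (opcode 0x12)."""
    if len(cdb) < 6 or cdb[0] != 0x12:
        raise ValueError("not an INQUIRY CDB")
    return {
        "evpd": bool(cdb[1] & 0x01),                  # request a VPD page
        "page_code": cdb[2],                          # which VPD page
        "allocation_length": (cdb[3] << 8) | cdb[4],  # bytes the initiator accepts
        "control": cdb[5],
    }


# cmd->cmnd as printed by gdb: {0x12, 0x1, 0x0, 0x0, 0xfe, 0x0 <repeats 11 times>}
cmnd = [0x12, 0x01, 0x00, 0x00, 0xFE] + [0x00] * 11
fields = decode_inquiry_cdb(cmnd)
print(fields)
```

The decoded allocation length of 0xfe agrees with data_len = 0xfe and the final scatter entry of size 0xfe in the earlier dump of "cp".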
It calls: sym_interrupt() -> sym_wakeup_done() -> sym_complete_ok().
In sym_complete_ok(), it executes:
	if (cp->phys.head.lastp != cp->goalp)
		resid = sym_compute_residual(np, cp);
cp->phys.head.lastp is 0x50000b10 and cp->goalp is 0x500007f8. Since
they're not equal, the driver thinks there is a residual.
cp->startp is 0x500007e8, which seems to make sense to me. I would
expect "lastp" to fall between "startp" and "goalp", but it doesn't.
I'm just guessing here, though, since I don't know SCSI at all.
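The observation can be checked numerically with a short sketch that uses only the three pointer values printed in the dump. The expectation that lastp should lie inside the [startp, goalp] window is the guess stated above, not something confirmed from the driver source, and `within` is a made-up helper name.

```python
startp = 0x500007E8  # cp->startp
goalp = 0x500007F8   # cp->goalp
lastp = 0x50000B10   # cp->phys.head.lastp


def within(low, value, high):
    """True if value lies in the closed interval [low, high]."""
    return low <= value <= high


driver_sees_residual = lastp != goalp   # the test made in sym_complete_ok()
in_window = within(startp, lastp, goalp)

print(f"goalp - startp = {goalp - startp:#x}")
print(f"lastp - startp = {lastp - startp:#x}")
print(f"lastp - goalp  = {lastp - goalp:#x}")
print(f"driver sees residual: {driver_sees_residual}, lastp in window: {in_window}")
```

With these values the window is only 0x10 bytes wide, while lastp sits 0x318 bytes beyond goalp. The inequality test therefore fires even though lastp is nowhere near the range described by startp and goalp.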
Any ideas what might be wrong?
Partial contents of "cp" leading up to the panic:
==============
$17 = {phys = {head = {go = {start = 0x50000058, restart = 0x500004c0},
savep = 0x500007e8, lastp = 0x50000b10, status = "\000\204\000@"},
[...]
order = 0x20, odd_byte_adjustment = 0x0, nego_status = 0x1,
xerr_status = 0x0, extra_bytes = 0x0,
scsi_smsg = "\xc0 g\001\003\001\n\037\000\000\000",
scsi_smsg2 = '\0' <repeats 11 times>, sensecmd = "\000\000\000\000\000",
sv_scsi_status = 0x0, sv_xerr_status = 0x0, sv_resid = 0x0,
ccb_ba = 0x7db3c00, tag = 0x33, target = 0x0, lun = 0x0, link_ccbh = 0x0,
link_ccbq = {flink = 0xffd004fc, blink = 0xffd004fc}, startp = 0x500007e8,
goalp = 0x500007f8, ext_sg = 0xffffffff, ext_ofs = 0x0, to_abort = 0x0,
tags_si = 0x1}
==============
A strange thing to note is that this panic is only intermittent when
in graphics mode, but happens 100% of the time when qemu is in tty
console mode. If the boot makes it past this point, this system is
really stable. I've done hours of builds on it without it falling
over.
>Rob
>--
>Vista: Windows Millenium Second Edition
Quentin