From: John Morrissey <jwm@horde.net>
To: kvm@vger.kernel.org
Subject: Re: BUG() with SCSI-interfaced disk images
Date: Wed, 7 Jan 2009 15:19:18 -0500
Message-ID: <20090107201918.GA12071@boost.horde.net>
In-Reply-To: <20081226210028.GA25127@boost.horde.net>

On Fri, Dec 26, 2008 at 04:00:28PM -0500, John Morrissey wrote:
> I'm encountering a kernel BUG() in guests using SCSI-interfaced disk
> images. I've tried with the Debian packaging of KVM 79 and 82; both
> exhibit the same behavior (disclaimer: Debian has about a dozen patches in
> their kvm packaging, but they all seem to be changes to the build/install
> process or security-related).

Not to be pushy, but does anyone have any ideas on this, or is there any
additional information I can provide? I'm afraid I'm in a bit over my head
when it comes to debugging kernel internals.

john

> IDE-interfaced disk images seem fine. Host and guest are up-to-date Debian
> lenny (32-bit/i386) running kernel 2.6.26 (Debian
> linux-image-2.6.26-1-amd64 2.6.26-12).
> 
> After a few minutes of disk activity (fsck(8)ing a fairly empty ~20GB
> filesystem is a reliable trigger), the kernel BUGs (oops output below).
> 
> I was previously using KVM 72, and tried upgrading to 79 because both
> Debian lenny and Ubuntu hardy guests were panicking due to sym
> disconnects/timeouts. 79 makes the lenny guest start BUGging as described
> above. 82 is not perceivably different from 79 for the lenny guest.
> 
> FWIW, the upgrade to 79 allowed the Ubuntu hardy guest to stay up,
> although it emits:
> 
> Dec 25 00:28:51 vicar kernel: [106621.553272] sd 2:0:0:0: [sda] Sense Key : No Sense [current] 
> Dec 25 00:28:51 vicar kernel: [106621.553279] Info fld=0x0
> Dec 25 00:28:51 vicar kernel: [106621.553280] sd 2:0:0:0: [sda] Add. Sense: No additional sense information
> 
> at seemingly random intervals. The upgrade to 82 made the hardy guest
> start BUGging on soft lockups at random intervals (I can provide the full
> output if anyone's interested, but I'm much more interested in the lenny
> guest oops at this point).
> 
> john
> 
> 
> run via libvirt:
> /usr/bin/kvm -S -M pc -m 512 -smp 1 -name test -monitor pty \
> 	-boot c -drive file=image.qcow,if=scsi,index=0,boot=on \
> 	-net nic,macaddr=00:0c:29:1e:ea:b9,vlan=0,model=e1000 \
> 	-net tap,fd=17,script=,vlan=0,ifname=vnet2 \
> 	-net nic,macaddr=00:0c:29:1e:ea:c3,vlan=1,model=e1000 \
> 	-net tap,fd=18,script=,vlan=1,ifname=vnet3 \
> 	-serial pty -parallel none -usb -vnc 0.0.0.0:1
> 
> [The KVMWiki asks whether the problem is reproducible with
>  -no-kvm-irqchip, -no-kvm-pit, or -no-kvm, but when I tried invoking the
>  above command line by hand (outside of libvirt), the VNC console was
>  always blank and there was no console output on the serial pty. If this
>  would be useful information to have in this case, I'd love to know what
>  I'm doing wrong, or if there's a way to specify additional command line
>  arguments with libvirt.]
> 
> oops generated in the guest:
> [  140.101828] sym0: unexpected disconnect
> [  140.102748] BUG: unable to handle kernel NULL pointer dereference at 00000358
> [  140.103818] IP: [<e08e2670>] :sym53c8xx:sym_int_sir+0x547/0x118f
> [  140.106449] *pdpt = 000000001f5f9001 *pde = 0000000000000000 
> [  140.107356] Oops: 0000 [#1] SMP 
> [  140.107864] Modules linked in: loop virtio_balloon psmouse pcspkr serio_raw i2c_piix4 i2c_core button evdev ext3 jbd mbcache sd_mod ide_cd_mod cdrom ata_generic libata dock ide_pci_generic floppy virtio_pci virtio_ring virtio sym53c8xx scsi_transport_spi scsi_mod e1000 uhci_hcd usbcore piix ide_core thermal processor fan thermal_sys
> [  140.108062] 
> [  140.108062] Pid: 131, comm: pdflush Not tainted (2.6.26-1-686-bigmem #1)
> [  140.108062] EIP: 0060:[<e08e2670>] EFLAGS: 00010287 CPU: 0
> [  140.108062] EIP is at sym_int_sir+0x547/0x118f [sym53c8xx]
> [  140.108062] EAX: 0000000a EBX: 00000000 ECX: 1f98c084 EDX: 00000030
> [  140.108062] ESI: df98c084 EDI: df98c000 EBP: df98c000 ESP: de0f3ba0
> [  140.108062]  DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
> [  140.108062] Process pdflush (pid: 131, ti=de0f2000 task=df48e520 task.ti=de0f2000)
> [  140.108062] Stack: 00000000 000144d6 7f5a222c c011a853 0021d496 00000000 00000000 00000000 
> [  140.108062]        00000000 df98c000 e08e08cd 00000000 00000000 00000001 00000000 df98c000 
> [  140.108062]        00000084 e08e3f2f df988c00 00000046 00000000 df544400 00000196 00000000 
> [  140.108062] Call Trace:
> [  140.108062]  [<c011a853>] pvclock_clocksource_read+0x4b/0xd0
> [  140.108062]  [<e08e08cd>] sym_recover_scsi_int+0xb3/0x10d [sym53c8xx]
> [  140.108062]  [<e08e3f2f>] sym_interrupt+0x3ee/0x5fd [sym53c8xx]
> [  140.108062]  [<e08df3dc>] sym53c8xx_intr+0x35/0x56 [sym53c8xx]
> [  140.108062]  [<c0158e4e>] handle_IRQ_event+0x23/0x51
> [  140.108062]  [<c0159f4d>] handle_fasteoi_irq+0x71/0xa4
> [  140.108062]  [<c010afd2>] do_IRQ+0x4d/0x63
> [  140.108062]  [<c01092a7>] common_interrupt+0x23/0x28
> [  140.108062]  [<c01300d8>] ptrace_request+0x1ec/0x278
> [  140.108062]  [<c012d0c6>] __do_softirq+0x57/0xd3
> [  140.108062]  [<c012d187>] do_softirq+0x45/0x53
> [  140.108062]  [<c012d43e>] irq_exit+0x35/0x67
> [  140.108062]  [<c01152b6>] smp_apic_timer_interrupt+0x6b/0x75
> [  140.108062]  [<c0109364>] apic_timer_interrupt+0x28/0x30
> [  140.108062]  [<c02c9953>] _spin_unlock_irqrestore+0x7/0x10
> [  140.108062]  [<e0865a94>] scsi_dispatch_cmd+0x197/0x205 [scsi_mod]
> [  140.108062]  [<e086ab2e>] scsi_request_fn+0x264/0x32a [scsi_mod]
> [  140.108063]  [<c01dcbd6>] __generic_unplug_device+0x1a/0x1c
> [  140.108063]  [<c01dd3e9>] __make_request+0x2fe/0x348
> [  140.108063]  [<c01dc008>] generic_make_request+0x34d/0x37b
> [  140.108063]  [<c015f9f1>] mempool_alloc+0x1c/0xba
> [  140.108063]  [<c01dd0e4>] submit_bio+0xc6/0xcd
> [  140.108063]  [<c019cdff>] bio_alloc_bioset+0x9b/0xf3
> [  140.108063]  [<c0199983>] submit_bh+0xcf/0xed
> [  140.108063]  [<c019b32e>] __block_write_full_page+0x1fa/0x2da
> [  140.108063]  [<c019eb73>] blkdev_get_block+0x0/0x43
> [  140.108063]  [<c019b4ef>] block_write_full_page+0xe1/0xea
> [  140.108063]  [<c019eb73>] blkdev_get_block+0x0/0x43
> [  140.108063]  [<c01626d5>] __writepage+0x8/0x21
> [  140.108063]  [<c0162b50>] write_cache_pages+0x16a/0x27b
> [  140.108063]  [<c01626cd>] __writepage+0x0/0x21
> [  140.108063]  [<c0162c61>] generic_writepages+0x0/0x21
> [  140.108063]  [<c0162c7b>] generic_writepages+0x1a/0x21
> [  140.108063]  [<c0162ca2>] do_writepages+0x20/0x30
> [  140.108063]  [<c0196525>] __writeback_single_inode+0x127/0x251
> [  140.108063]  [<c019691c>] sync_sb_inodes+0x17c/0x233
> [  140.108063]  [<c0196c93>] writeback_inodes+0x53/0x99
> [  140.108063]  [<c01638c1>] pdflush+0x0/0x1cc
> [  140.108063]  [<c016357c>] wb_kupdate+0x7b/0xdb
> [  140.108063]  [<c01639f0>] pdflush+0x12f/0x1cc
> [  140.108063]  [<c0163501>] wb_kupdate+0x0/0xdb
> [  140.108063]  [<c0138643>] kthread+0x38/0x5d
> [  140.108063]  [<c013860b>] kthread+0x0/0x5d
> [  140.108063]  [<c01094f3>] kernel_thread_helper+0x7/0x10
> [  140.108063]  =======================
> [  140.108063] Code: 93 4c 01 00 00 52 50 68 42 76 8e e0 eb 4e 8d 83 b0 00 00 00 e8 32 71 96 df 8d 93 4c 01 00 00 52 50 68 7c 76 8e e0 eb 59 8b 1c 24 <8b> 93 58 03 00 00 8b 82 84 00 00 00 8b 1a 8b 70 60 85 f6 74 29 
> [  140.108063] EIP: [<e08e2670>] sym_int_sir+0x547/0x118f [sym53c8xx] SS:ESP 0068:de0f3ba0
> [  140.162446] Kernel panic - not syncing: Fatal exception in interrupt
> 
> vendor_id       : GenuineIntel
> cpu family      : 6
> model           : 23
> model name      : Intel(R) Xeon(R) CPU           L5420  @ 2.50GHz
> stepping        : 6
> cpu MHz         : 2500.087
> cache size      : 6144 KB
> [...]
> wp              : yes
> flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca
> cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall lm
> constant_tsc arch_perfmon pebs bts rep_good pni monitor ds_cpl vmx est tm2
> ssse3 cx16 xtpr dca sse4_1 lahf_lm
> bogomips        : 5000.23
> clflush size    : 64
> cache_alignment : 64
> address sizes   : 38 bits physical, 48 bits virtual
> power management:
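
For anyone reading the oops above, here is a rough sketch (not a real
disassembler) of how the "Code:" line can be tied back to the fault address.
The byte in <...> marks the start of the faulting instruction. The helper
names below are hypothetical, and the decoder handles only the one x86
encoding that appears at that position (opcode 8b with a 32-bit
displacement); anything else raises an error.

```python
# "Code:" bytes copied from the guest oops quoted above.
# The token in <...> is the first byte of the faulting instruction.
CODE_LINE = (
    "93 4c 01 00 00 52 50 68 42 76 8e e0 eb 4e 8d 83 b0 00 00 00 "
    "e8 32 71 96 df 8d 93 4c 01 00 00 52 50 68 7c 76 8e e0 eb 59 "
    "8b 1c 24 <8b> 93 58 03 00 00 8b 82 84 00 00 00 8b 1a 8b 70 60 "
    "85 f6 74 29"
)

# 32-bit register names in ModR/M encoding order.
REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def faulting_bytes(code_line):
    """Return the bytes from the <xx> marker onward as a list of ints."""
    tokens = code_line.split()
    start = next(i for i, tok in enumerate(tokens) if tok.startswith("<"))
    return [int(tok.strip("<>"), 16) for tok in tokens[start:]]


def decode_mov_disp32(insn):
    """Decode only 'mov r32, [r32 + disp32]' (opcode 0x8b, ModR/M mod=10).

    Hypothetical helper for illustration: SIB-byte forms and every other
    opcode are rejected rather than guessed at.
    """
    opcode, modrm = insn[0], insn[1]
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if opcode != 0x8B or mod != 2 or rm == 4:
        raise ValueError("encoding not handled by this sketch")
    disp = int.from_bytes(bytes(insn[2:6]), "little", signed=True)
    return REG32[reg], REG32[rm], disp


dest, base, disp = decode_mov_disp32(faulting_bytes(CODE_LINE))
print(f"mov {dest}, [{base} + {disp:#x}]")
```

Under those assumptions the faulting instruction reads through EBX at offset
0x358. EBX is 00000000 in the register dump and the reported fault address is
00000358, so the two are consistent with a NULL structure pointer being
dereferenced in sym_int_sir. This does not say why the pointer is NULL. For a
full disassembly, scripts/decodecode in a kernel source tree should be a
better tool than this sketch.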

-- 
John Morrissey          _o            /\         ----  __o
jwm@horde.net        _-< \_          /  \       ----  <  \,
www.horde.net/    __(_)/_(_)________/    \_______(_) /_(_)__
