From: Alexander Graf <agraf@suse.de>
To: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Avi Kivity <avi@redhat.com>,
	"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	Joerg Roedel <joerg.roedel@amd.com>,
	Sheng Yang <sheng@linux.intel.com>
Subject: Re: KVM guest crashes
Date: Mon, 26 Jan 2009 16:53:21 +0100
Message-ID: <497DDC71.3000608@suse.de>
In-Reply-To: <20090123223644.GA4031@amt.cnet>

Marcelo Tosatti wrote:
> Hi Alexander,
>
> On Thu, Jan 22, 2009 at 09:29:46PM +0100, Alexander Graf wrote:
>
>   
>> Following the discussion on IRC, I tried -no-kvm-irqchip and found some
>> virtual machines broken after >1 day of stress testing again:
>>
>> + sudo -u contain2 env -i qemu-kvm -localtime -kernel virtio-kernel
>> -initrd virtio-initrd -nographic -append 'quiet clocksource=acpi_pm
>> cifsuser=contain2 cifspass=contain2
>> root=cifs://contain2:contain2@172.16.2.1/contain2
>> realroot=//172.16.2.1/users/contain2
>> ip=172.16.2.2:172.16.2.1::255.255.255.0::eth0:none console=ttyS0
>> dhcp=off builder=1' -net nic,model=virtio,macaddr=52:54:00:12:34:2 -net
>> tap,ifname=tap2,script=/bin/true -m 2000 -nographic -smp 4
>> -no-kvm-irqchip /dev/null
>> qemu: loading initrd (0x1daf359 bytes) at 0x000000007b240000
>> Stuck ??
>> Stuck ??
>> BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
>> IP: [<ffffffff802b539a>] kfree+0x18b/0x26e
>> PGD 0
>> Oops: 0000 [1] SMP
>> last sysfs file:
>> CPU 2
>> Modules linked in:
>> Supported: Yes
>> Pid: 0, comm: swapper Tainted: G S        2.6.27.7-9-default #1
>> RIP: 0010:[<ffffffff802b539a>]  [<ffffffff802b539a>] kfree+0x18b/0x26e
>> RSP: 0018:ffff88007a493e90  EFLAGS: 00010046
>> RAX: 0000000000000002 RBX: ffff8800010397f0 RCX: ffff88007a480778
>> RDX: ffffe20000000000 RSI: ffff8800010397f0 RDI: ffff88007a5ae140
>> RBP: 0000000000000000 R08: ffff8800010395d0 R09: ffff88007a493eb8
>> R10: ffffffff80a59980 R11: ffffffff8021c5d9 R12: 0000000000000001
>> R13: ffff88007ac04080 R14: 0000000010200042 R15: ffff88007a5ae140
>> FS:  0000000000000000(0000) GS:ffff88007a461f40(0000) knlGS:0000000000000000
>> CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
>> CR2: 0000000000000000 CR3: 0000000000201000 CR4: 00000000000006e0
>> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
>> Process swapper (pid: 0, threadinfo ffff88007a48a000, task ffff88007a488280)
>> Stack:  ffffffff8023df9c ffffffff8073a108 0000000000000286 ffffffff8024a1eb
>>  ffffffff80259d80 ffff8800010397f0 0000000000000000 0000000000000001
>>  000000000000000a 0000000010200042 0000000000000010 ffffffff802831d0
>> Call Trace:
>>  [<ffffffff802831d0>] __rcu_process_callbacks+0x189/0x203
>>  [<ffffffff80283271>] rcu_process_callbacks+0x27/0x47
>>  [<ffffffff802464ed>] __do_softirq+0x84/0x115
>>  [<ffffffff8020dc9c>] call_softirq+0x1c/0x28
>>  [<ffffffff8020f067>] do_softirq+0x3c/0x81
>>  [<ffffffff80246204>] irq_exit+0x3f/0x83
>>  [<ffffffff8021ce5f>] smp_apic_timer_interrupt+0x95/0xae
>>  [<ffffffff8020d4a3>] apic_timer_interrupt+0x83/0x90
>>  [<ffffffff80221f1d>] native_safe_halt+0x2/0x3
>>  [<ffffffff80213465>] default_idle+0x38/0x54
>>  [<ffffffff8020b34a>] cpu_idle+0xa9/0xf1
>>
>>
>> Code: 01 00 00 00 e8 4c fa ff ff 48 83 3d a0 19 44 00 00 49 8b 44 dd 08
>> 48 8d 78 40 75 04 0f 0b eb fe e8 e5 cc f6 ff 90 e9 c7 00 00 00 <8b> 55
>> 00 3b 55 04 73 0f 89 d0 4c 89 7c c5 18 8d 42 01 e9 ad 00
>> RIP  [<ffffffff802b539a>] kfree+0x18b/0x26e
>>  RSP <ffff88007a493e90>
>> CR2: 0000000000000000
>> ---[ end trace 4eaa2a86a8e2da22 ]---
>>
>>
>> After two days of continuous stress testing, I also got the Intel
>> machine w/ current git down:
>>
>> + sudo -u contain1 env -i /usr/local/bin/qemu-system-x86_64 -localtime
>> -kernel virtio-kernel -initrd virtio-initrd -nographic -append 'quiet
>> clocksource=acpi_pm cifsuser=contain1 cifspass=contain1
>> root=cifs://contain1:contain1@172.16.1.1/contain1
>> realroot=//172.16.1.1/users/contain1
>> ip=172.16.1.2:172.16.1.1::255.255.255.0::eth0:none console=ttyS0
>> dhcp=off builder=1' -net nic,model=virtio,macaddr=52:54:00:12:34:1 -net
>> tap,ifname=tap1,script=/bin/true -m 2000 -nographic -smp 8 /dev/null
>> qemu: loading initrd (0x1daf359 bytes) at 0x000000007b240000
>> Stuck ??
>>
>> No backtrace here though. That's all I got from the serial console.
>>
>> The only issue I had with the UP guests so far was this:
>>
>> + taskset -c 6 sudo -u contain6 env -i qemu-kvm -localtime -kernel
>> virtio-kernel -initrd virtio-initrd -nographic -append 'quiet
>> clocksource=acpi_pm cifsuser=contain6 cifspass=contain6
>> root=cifs://contain6:contain6@172.16.6.1/contain6
>> realroot=//172.16.6.1/users/contain6
>> ip=172.16.6.2:172.16.6.1::255.255.255.0::eth0:none console=ttyS0
>> dhcp=off builder=1' -net nic,model=virtio,macaddr=52:54:00:12:34:6 -net
>> tap,ifname=tap6,script=/bin/true -m 2000 -nographic /dev/null
>> qemu: loading initrd (0x1daf359 bytes) at 0x000000007b240000
>> ..MP-BIOS bug: 8254 timer not connected to IO-APIC
>> Kernel panic - not syncing: IO-APIC + timer doesn't work!  Boot with
>> apic=debug and send a report.  Then try booting with the 'noapic' option.
>>
>> which can be annoying at times too. Can't we just detect that the
>> guest is running the timer check and give it its interrupts? Or should
>> the PIT reinjection thing help here?
>>     
>
> There are a number of problems that can result in this error, and the
> problems are possibly different between the in-kernel PIT and userspace
> PIT emulation (note it also happens with in-kernel PIT, just much more
> rarely now). You can use the no_timer_check kernel option to bypass it.
>   

Hm - that option skips the whole check, but treats it as failed. I
haven't seen any way to skip the check and tell Linux that things
are OK :-(.
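
For reference, a minimal sketch of how the option would be added to the
guest command line passed via -append. The helper name add_kernel_opt and
the shortened APPEND string are made up for illustration and are not part
of the test setup above. Whether the timer check then passes depends on
the guest kernel, as noted.

```shell
# Hypothetical helper: append a kernel option to an existing command line
# unless it is already present as a separate word.
add_kernel_opt() {
    cmdline=$1
    opt=$2
    case " $cmdline " in
        *" $opt "*) printf '%s\n' "$cmdline" ;;       # already there, keep as is
        *)          printf '%s\n' "$cmdline $opt" ;;  # add it at the end
    esac
}

# Shortened stand-in for the -append string used in the commands above.
APPEND='quiet clocksource=acpi_pm console=ttyS0'
APPEND=$(add_kernel_opt "$APPEND" no_timer_check)
echo "$APPEND"
```

The resulting string would then go into the existing invocation as
-append "$APPEND", with the other arguments left unchanged.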

Alex

