* EPT: Misconfiguration
@ 2011-01-20 11:48 Ruben Kerkhof
  2011-01-20 11:59 ` Ruben Kerkhof
  2011-01-21 13:22 ` Marcelo Tosatti
  0 siblings, 2 replies; 19+ messages in thread
From: Ruben Kerkhof @ 2011-01-20 11:48 UTC
  To: kvm

I'm suddenly getting lots of the following errors on a server running
2.36.7, but I have no idea what it means:

2011-01-20T12:41:18.358603+01:00 phy005 kernel: EPT: Misconfiguration.
2011-01-20T12:41:18.358621+01:00 phy005 kernel: EPT: GPA: 0x3dbff6b0
2011-01-20T12:41:18.358624+01:00 phy005 kernel: ept_misconfig_inspect_spte: spte 0x50743e007 level 4
2011-01-20T12:41:18.358627+01:00 phy005 kernel: ept_misconfig_inspect_spte: spte 0x523de2007 level 3
2011-01-20T12:41:18.358629+01:00 phy005 kernel: ept_misconfig_inspect_spte: spte 0x62336f007 level 2
2011-01-20T12:41:18.360109+01:00 phy005 kernel: ept_misconfig_inspect_spte: spte 0x1603a0730500d277 level 1
2011-01-20T12:41:18.360137+01:00 phy005 kernel: ept_misconfig_inspect_spte: rsvd_bits = 0x3a00000000000
2011-01-20T12:41:18.360151+01:00 phy005 kernel: ------------[ cut here ]------------
2011-01-20T12:41:18.360155+01:00 phy005 kernel: WARNING: at arch/x86/kvm/vmx.c:3425 handle_ept_misconfig+0x152/0x1d8 [kvm_intel]()
2011-01-20T12:41:18.360160+01:00 phy005 kernel: Hardware name: X8DTU
2011-01-20T12:41:18.363296+01:00 phy005 kernel: Modules linked in: tun ipmi_devintf ipmi_si ipmi_msghandler bridge 8021q garp stp llc bonding xt_comment xt_recent ip6t_REJECT nf_conntrack_ipv6 ip6table_filter ip6_tables ipv6 kvm_intel kvm igb i2c_i801 iTCO_wdt i2c_core ioatdma joydev iTCO_vendor_support serio_raw dca 3w_9xxx [last unloaded: scsi_wait_scan]
2011-01-20T12:41:18.363312+01:00 phy005 kernel: Pid: 3595, comm: qemu-kvm Tainted: G D W 2.6.34.7-66.tilaa.fc13.x86_64 #1
2011-01-20T12:41:18.363314+01:00 phy005 kernel: Call Trace:
2011-01-20T12:41:18.364385+01:00 phy005 kernel: [<ffffffff8104d11f>] warn_slowpath_common+0x7c/0x94
2011-01-20T12:41:18.364455+01:00 phy005 kernel: [<ffffffff8104d14b>] warn_slowpath_null+0x14/0x16
2011-01-20T12:41:18.364462+01:00 phy005 kernel: [<ffffffffa00ba7fb>] handle_ept_misconfig+0x152/0x1d8 [kvm_intel]
2011-01-20T12:41:18.364466+01:00 phy005 kernel: [<ffffffffa00bb401>] vmx_handle_exit+0x204/0x23a [kvm_intel]
2011-01-20T12:41:18.370619+01:00 phy005 kernel: [<ffffffffa0075998>] kvm_arch_vcpu_ioctl_run+0x7cd/0xa74 [kvm]
2011-01-20T12:41:18.370731+01:00 phy005 kernel: [<ffffffffa00645ba>] kvm_vcpu_ioctl+0xfd/0x56e [kvm]
2011-01-20T12:41:18.370737+01:00 phy005 kernel: [<ffffffff8100a60e>] ? apic_timer_interrupt+0xe/0x20
2011-01-20T12:41:18.370741+01:00 phy005 kernel: [<ffffffff8111aa2f>] vfs_ioctl+0x32/0xa6
2011-01-20T12:41:18.371562+01:00 phy005 kernel: [<ffffffff8111afa2>] do_vfs_ioctl+0x483/0x4c9
2011-01-20T12:41:18.371577+01:00 phy005 kernel: [<ffffffff8111b03e>] sys_ioctl+0x56/0x79
2011-01-20T12:41:18.371581+01:00 phy005 kernel: [<ffffffff81009c72>] system_call_fastpath+0x16/0x1b
2011-01-20T12:41:18.372244+01:00 phy005 kernel: ---[ end trace 7d57b311d4a5b22c ]---
2011-01-20T12:41:57.568322+01:00 phy005 kernel: general protection fault: 0000 [#2] SMP
2011-01-20T12:41:57.568335+01:00 phy005 kernel: last sysfs file: /sys/devices/system/cpu/cpu15/topology/thread_siblings
2011-01-20T12:41:57.568339+01:00 phy005 kernel: CPU 0

Kind regards,

Ruben
* Re: EPT: Misconfiguration
  2011-01-20 11:48 EPT: Misconfiguration Ruben Kerkhof
@ 2011-01-20 11:59 ` Ruben Kerkhof
  2011-01-21 13:22 ` Marcelo Tosatti
  1 sibling, 0 replies; 19+ messages in thread
From: Ruben Kerkhof @ 2011-01-20 11:59 UTC
  To: kvm

On Thu, Jan 20, 2011 at 12:48, Ruben Kerkhof <ruben@rubenkerkhof.com> wrote:
> I'm suddenly getting lots of the following errors on a server running
> 2.36.7, but I have no idea what it means:

Sorry, that should be 2.34.7.

Kind regards,

Ruben
* Re: EPT: Misconfiguration
  2011-01-20 11:48 EPT: Misconfiguration Ruben Kerkhof
  2011-01-20 11:59 ` Ruben Kerkhof
@ 2011-01-21 13:22 ` Marcelo Tosatti
  2011-01-25 14:44   ` Ruben Kerkhof
  1 sibling, 1 reply; 19+ messages in thread
From: Marcelo Tosatti @ 2011-01-21 13:22 UTC
  To: Ruben Kerkhof; +Cc: kvm

On Thu, Jan 20, 2011 at 12:48:00PM +0100, Ruben Kerkhof wrote:
> I'm suddenly getting lots of the following errors on a server running
> 2.36.7, but I have no idea what it means:
>
> 2011-01-20T12:41:18.358603+01:00 phy005 kernel: EPT: Misconfiguration.
> 2011-01-20T12:41:18.358621+01:00 phy005 kernel: EPT: GPA: 0x3dbff6b0
> 2011-01-20T12:41:18.358624+01:00 phy005 kernel:
> ept_misconfig_inspect_spte: spte 0x50743e007 level 4
> 2011-01-20T12:41:18.358627+01:00 phy005 kernel:
> ept_misconfig_inspect_spte: spte 0x523de2007 level 3
> 2011-01-20T12:41:18.358629+01:00 phy005 kernel:
> ept_misconfig_inspect_spte: spte 0x62336f007 level 2
> 2011-01-20T12:41:18.360109+01:00 phy005 kernel:
> ept_misconfig_inspect_spte: spte 0x1603a0730500d277 level 1
> 2011-01-20T12:41:18.360137+01:00 phy005 kernel:
> ept_misconfig_inspect_spte: rsvd_bits = 0x3a00000000000
> 2011-01-20T12:41:18.360151+01:00 phy005 kernel: ------------[ cut here
> ]------------

A shadow pagetable entry in memory has bits 45-49 set, which is not
allowed. It's probably bad memory if these errors were not present before
with the same workload and host software. It would be useful to see what
memtest86 says.
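The reading of the rsvd_bits line given above can be reproduced by masking the level 1 entry against the physical-address bits the CPU does not implement. The following is a minimal Python sketch, not the kernel's own code: it assumes the host CPU reports a 40-bit physical address width (MAXPHYADDR), and `rsvd_mask` and `set_bits` are hypothetical helper names introduced only for illustration.

```python
# Sketch only: MAXPHYADDR = 40 is an assumption about this host's CPU,
# and the helper names are made up for illustration.
MAXPHYADDR = 40


def rsvd_mask(first_bit, last_bit):
    """Return a mask with bits first_bit..last_bit (inclusive) set."""
    return ((1 << (last_bit - first_bit + 1)) - 1) << first_bit


def set_bits(value):
    """List the positions of the bits that are set in a 64-bit value."""
    return [i for i in range(64) if (value >> i) & 1]


spte = 0x1603a0730500d277          # level 1 entry from the log
mask = rsvd_mask(MAXPHYADDR, 51)   # address bits above the assumed width
rsvd = spte & mask

print(hex(rsvd))       # 0x3a00000000000
print(set_bits(rsvd))  # [45, 47, 48, 49]
```

Under that assumption the result matches the logged rsvd_bits value, and the offending bits are 45, 47, 48 and 49, all inside the 45-49 range mentioned above.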
* Re: EPT: Misconfiguration
  2011-01-21 13:22 ` Marcelo Tosatti
@ 2011-01-25 14:44   ` Ruben Kerkhof
  2011-01-25 17:39     ` Avi Kivity
  0 siblings, 1 reply; 19+ messages in thread
From: Ruben Kerkhof @ 2011-01-25 14:44 UTC
  To: Marcelo Tosatti; +Cc: kvm

Hi Marcelo,

On Fri, Jan 21, 2011 at 14:22, Marcelo Tosatti <mtosatti@redhat.com> wrote:
> On Thu, Jan 20, 2011 at 12:48:00PM +0100, Ruben Kerkhof wrote:
>> I'm suddenly getting lots of the following errors on a server running
>> 2.36.7, but I have no idea what it means:
>>
>> 2011-01-20T12:41:18.358603+01:00 phy005 kernel: EPT: Misconfiguration.
>> 2011-01-20T12:41:18.358621+01:00 phy005 kernel: EPT: GPA: 0x3dbff6b0
>> 2011-01-20T12:41:18.358624+01:00 phy005 kernel:
>> ept_misconfig_inspect_spte: spte 0x50743e007 level 4
>> 2011-01-20T12:41:18.358627+01:00 phy005 kernel:
>> ept_misconfig_inspect_spte: spte 0x523de2007 level 3
>> 2011-01-20T12:41:18.358629+01:00 phy005 kernel:
>> ept_misconfig_inspect_spte: spte 0x62336f007 level 2
>> 2011-01-20T12:41:18.360109+01:00 phy005 kernel:
>> ept_misconfig_inspect_spte: spte 0x1603a0730500d277 level 1
>> 2011-01-20T12:41:18.360137+01:00 phy005 kernel:
>> ept_misconfig_inspect_spte: rsvd_bits = 0x3a00000000000
>> 2011-01-20T12:41:18.360151+01:00 phy005 kernel: ------------[ cut here
>> ]------------
>
> A shadow pagetable entry in memory has bits 45-49 set, which is not
> allowed. It's probably bad memory if these errors were not present before
> with the same workload and host software. It would be useful to see what
> memtest86 says.

I did 2 memtest86+ passes, but no errors were found.

Just to be safe, we replaced all memory. The machine ran stably over
the weekend, but now gives exactly the same error.

Is there anything else which could cause this?

Kind regards,

Ruben
* Re: EPT: Misconfiguration
  2011-01-25 14:44   ` Ruben Kerkhof
@ 2011-01-25 17:39     ` Avi Kivity
  2011-01-25 18:29       ` Ruben Kerkhof
  0 siblings, 1 reply; 19+ messages in thread
From: Avi Kivity @ 2011-01-25 17:39 UTC
  To: Ruben Kerkhof; +Cc: Marcelo Tosatti, kvm

On 01/25/2011 04:44 PM, Ruben Kerkhof wrote:
> Hi Marcelo,
>
> On Fri, Jan 21, 2011 at 14:22, Marcelo Tosatti <mtosatti@redhat.com> wrote:
> > On Thu, Jan 20, 2011 at 12:48:00PM +0100, Ruben Kerkhof wrote:
> >> I'm suddenly getting lots of the following errors on a server running
> >> 2.36.7, but I have no idea what it means:
> >>
> >> 2011-01-20T12:41:18.358603+01:00 phy005 kernel: EPT: Misconfiguration.
> >> 2011-01-20T12:41:18.358621+01:00 phy005 kernel: EPT: GPA: 0x3dbff6b0
> >> 2011-01-20T12:41:18.358624+01:00 phy005 kernel:
> >> ept_misconfig_inspect_spte: spte 0x50743e007 level 4
> >> 2011-01-20T12:41:18.358627+01:00 phy005 kernel:
> >> ept_misconfig_inspect_spte: spte 0x523de2007 level 3
> >> 2011-01-20T12:41:18.358629+01:00 phy005 kernel:
> >> ept_misconfig_inspect_spte: spte 0x62336f007 level 2
> >> 2011-01-20T12:41:18.360109+01:00 phy005 kernel:
> >> ept_misconfig_inspect_spte: spte 0x1603a0730500d277 level 1
> >> 2011-01-20T12:41:18.360137+01:00 phy005 kernel:
> >> ept_misconfig_inspect_spte: rsvd_bits = 0x3a00000000000
> >> 2011-01-20T12:41:18.360151+01:00 phy005 kernel: ------------[ cut here
> >> ]------------
> >
> > A shadow pagetable entry in memory has bits 45-49 set, which is not
> > allowed. It's probably bad memory if these errors were not present before
> > with the same workload and host software. It would be useful to see what
> > memtest86 says.
>
> I did 2 memtest86+ passes, but no errors were found.
>
> Just to be safe, we replaced all memory. The machine ran stably over
> the weekend, but now gives exactly the same error.
>
> Is there anything else which could cause this?

Try updating the BIOS.

When you say "suddenly", this was with no changes to software and hardware?

Is cooling adequate?

How much memory is on that machine? Even outside the reserved bits the
address looks way too large.

-- 
error compiling committee.c: too many arguments to function
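The remark that the address looks too large even outside the reserved bits can be checked by extracting the page-frame address from the entries in the log and comparing it with the amount of installed RAM. This is a rough Python sketch under stated assumptions: a 40-bit physical address width, the 48 GB of host memory reported later in the thread, and a hypothetical helper `spte_phys_addr`. Real physical addresses can sit somewhat above the installed size because of memory holes, so the comparison is only a plausibility check.

```python
# Sketch only: the 40-bit width is an assumption; 48 GB is the host
# memory size the reporter gives later in the thread.
MAXPHYADDR = 40
HOST_RAM_BYTES = 48 * 2**30


def spte_phys_addr(spte):
    """Extract the page-frame address (bits 12..MAXPHYADDR-1) of an entry."""
    addr_mask = ((1 << MAXPHYADDR) - 1) & ~0xfff
    return spte & addr_mask


leaf = 0x1603a0730500d277    # corrupted level 1 entry from the log
parent = 0x62336f007         # level 2 entry from the same walk

for name, entry in (("level 1", leaf), ("level 2", parent)):
    addr = spte_phys_addr(entry)
    verdict = "plausible" if addr < HOST_RAM_BYTES else "beyond host RAM"
    print(name, hex(addr), f"{addr / 2**30:.1f} GiB", verdict)
```

With these inputs the level 2 entry points at roughly 24.6 GiB, while the level 1 entry points at about 460 GiB, far beyond what a 48 GB machine could back.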
* Re: EPT: Misconfiguration 2011-01-25 17:39 ` Avi Kivity @ 2011-01-25 18:29 ` Ruben Kerkhof 2011-01-26 9:52 ` Avi Kivity 0 siblings, 1 reply; 19+ messages in thread From: Ruben Kerkhof @ 2011-01-25 18:29 UTC (permalink / raw) To: Avi Kivity; +Cc: Marcelo Tosatti, kvm Hi Avi, On Tue, Jan 25, 2011 at 18:39, Avi Kivity <avi@redhat.com> wrote: > On 01/25/2011 04:44 PM, Ruben Kerkhof wrote: >> >> Hi Marcello, >> >> On Fri, Jan 21, 2011 at 14:22, Marcelo Tosatti<mtosatti@redhat.com> >> wrote: >> > On Thu, Jan 20, 2011 at 12:48:00PM +0100, Ruben Kerkhof wrote: >> >> I'm suddenly getting lots of the following errors on a server running >> >> 2.36.7, but I have no idea what it means: >> >> >> >> 2011-01-20T12:41:18.358603+01:00 phy005 kernel: EPT: Misconfiguration. >> >> 2011-01-20T12:41:18.358621+01:00 phy005 kernel: EPT: GPA: 0x3dbff6b0 >> >> 2011-01-20T12:41:18.358624+01:00 phy005 kernel: >> >> ept_misconfig_inspect_spte: spte 0x50743e007 level 4 >> >> 2011-01-20T12:41:18.358627+01:00 phy005 kernel: >> >> ept_misconfig_inspect_spte: spte 0x523de2007 level 3 >> >> 2011-01-20T12:41:18.358629+01:00 phy005 kernel: >> >> ept_misconfig_inspect_spte: spte 0x62336f007 level 2 >> >> 2011-01-20T12:41:18.360109+01:00 phy005 kernel: >> >> ept_misconfig_inspect_spte: spte 0x1603a0730500d277 level 1 >> >> 2011-01-20T12:41:18.360137+01:00 phy005 kernel: >> >> ept_misconfig_inspect_spte: rsvd_bits = 0x3a00000000000 >> >> 2011-01-20T12:41:18.360151+01:00 phy005 kernel: ------------[ cut here >> >> ]------------ >> > >> > A shadow pagetable entry in memory has bits 45-49 set, which is not >> > allowed. Its probably bad memory if this errors were not present before >> > with the same workload and host software. Would be useful to see what >> > memtest86 says. >> >> I did 2 memtest86+ passes, but no errors were found. >> >> Just to be save, we replaced all memory. The machine has been running >> stable over the weekend, but now gives exactly the same error. 
>> >> Is there anything else which could cause this? > > Try updating the BIOS. That's the first thing we did. It's a Supermicro with an X8DTU-F board, updated to bios version 2.0b (which includes the latest microcode). The procs are Intel 5620's > When you say "suddenly", this was with no changes to software and hardware? The host software and hardware hasn't changed in the two months since the machine has been running. 2.6.34.7 kernel and qemu-kvm 0.13. We host customer vms on it though, so virtual machines come and go. Various operating systems, a mixture of Linux, FreeBSD and Windows 2008 R2. We have other machines with the same config without these problems though. > Is cooling adequate? Yes. > How much memory is on that machine? Even outside the reserved bits the > address looks way too large. 48GB. This time I have a few different messages though: 2011-01-25T11:58:50.001208+01:00 phy005 kernel: general protection fault: 0000 [#1] SMP 2011-01-25T11:58:50.001310+01:00 phy005 kernel: last sysfs file: /sys/devices/system/cpu/cpu15/topology/thread_siblings 2011-01-25T11:58:50.001316+01:00 phy005 kernel: CPU 12 2011-01-25T11:58:50.001323+01:00 phy005 kernel: Modules linked in: tun ipmi_devintf ipmi_si ipmi_msghandler bridge 8021q garp stp llc bonding xt_comment xt_recent ip6t_REJECT nf_conntrack_ipv6 ip6table_filter ip6_tables ipv6 kvm_intel kvm igb i2c_i801 iTCO_wdt i2c_core ioatdma joydev iTCO_vendor_support dca serio_raw 3w_9xxx [last unloaded: scsi_wait_scan] 2011-01-25T11:58:50.001327+01:00 phy005 kernel: 2011-01-25T11:58:50.001331+01:00 phy005 kernel: Pid: 1849, comm: qemu-kvm Not tainted 2.6.34.7-66.tilaa.fc13.x86_64 #1 X8DTU/X8DTU 2011-01-25T11:58:50.001336+01:00 phy005 kernel: RIP: 0010:[<ffffffff810d0216>] [<ffffffff810d0216>] __free_pages+0x9/0x26 2011-01-25T11:58:50.001339+01:00 phy005 kernel: RSP: 0018:ffff8802fbe45ab8 EFLAGS: 00010216 2011-01-25T11:58:50.001343+01:00 phy005 kernel: RAX: ffff88061ef8c000 RBX: ffff8803131ec100 RCX: 0000000000000000 
2011-01-25T11:58:50.001348+01:00 phy005 kernel: RDX: 00000000000000ff RSI: 0000000000000000 RDI: 1603a07305001568 2011-01-25T11:58:50.001352+01:00 phy005 kernel: RBP: ffff8802fbe45ab8 R08: ffffea000a83b7f0 R09: 0000000000000004 2011-01-25T11:58:50.001356+01:00 phy005 kernel: R10: 0000000000000000 R11: ffff8802fbe45b38 R12: 0000000000000100 2011-01-25T11:58:50.001359+01:00 phy005 kernel: R13: 0000000000000001 R14: ffff8802e934c010 R15: ffff8802e934c010 2011-01-25T11:58:50.001363+01:00 phy005 kernel: FS: 00007f1f14844700(0000) GS:ffff880655480000(0000) knlGS:0000000000000000 2011-01-25T11:58:50.001366+01:00 phy005 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b 2011-01-25T11:58:50.001370+01:00 phy005 kernel: CR2: 00000000b72f6cb0 CR3: 0000000ba561c000 CR4: 00000000000026e0 2011-01-25T11:58:50.001374+01:00 phy005 kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 2011-01-25T11:58:50.001378+01:00 phy005 kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 2011-01-25T11:58:50.001382+01:00 phy005 kernel: Process qemu-kvm (pid: 1849, threadinfo ffff8802fbe44000, task ffff8802ea11aee0) 2011-01-25T11:58:50.001385+01:00 phy005 kernel: Stack: 2011-01-25T11:58:50.001389+01:00 phy005 kernel: ffff8802fbe45af8 ffffffff810ee455 0000000000000206 ffffc9001e2d4000 2011-01-25T11:58:50.001392+01:00 phy005 kernel: <0> ffff8802e934c010 ffff880b680a2050 0000000000000000 ffff880b680a2000 2011-01-25T11:58:50.001396+01:00 phy005 kernel: <0> ffff8802fbe45b08 ffffffff810ee504 ffff8802fbe45b28 ffffffffa0065d70 2011-01-25T11:58:50.001399+01:00 phy005 kernel: Call Trace: 2011-01-25T11:58:50.001402+01:00 phy005 kernel: [<ffffffff810ee455>] __vunmap+0x8e/0xbd 2011-01-25T11:58:50.001406+01:00 phy005 kernel: [<ffffffff810ee504>] vfree+0x2e/0x30 2011-01-25T11:58:50.001410+01:00 phy005 kernel: [<ffffffffa0065d70>] kvm_free_physmem_slot+0x2a/0xa4 [kvm] 2011-01-25T11:58:50.001414+01:00 phy005 kernel: [<ffffffffa00663fa>] kvm_free_physmem+0x32/0x4b 
[kvm] 2011-01-25T11:58:50.001417+01:00 phy005 kernel: [<ffffffffa006f90e>] kvm_arch_destroy_vm+0xf1/0x13d [kvm] 2011-01-25T11:58:50.001421+01:00 phy005 kernel: [<ffffffffa00664ce>] kvm_put_kvm+0xbb/0xe2 [kvm] 2011-01-25T11:58:50.001424+01:00 phy005 kernel: [<ffffffffa0066d04>] kvm_vcpu_release+0x18/0x1c [kvm] 2011-01-25T11:58:50.001427+01:00 phy005 kernel: [<ffffffff8110ef2b>] __fput+0x12a/0x1dc 2011-01-25T11:58:50.001438+01:00 phy005 kernel: [<ffffffff8110eff7>] fput+0x1a/0x1c 2011-01-25T11:58:50.001441+01:00 phy005 kernel: [<ffffffff8110c067>] filp_close+0x68/0x72 2011-01-25T11:58:50.001444+01:00 phy005 kernel: [<ffffffff8104f298>] put_files_struct+0x6a/0xcc 2011-01-25T11:58:50.001447+01:00 phy005 kernel: [<ffffffff8104f33b>] exit_files+0x41/0x46 2011-01-25T11:58:50.001450+01:00 phy005 kernel: [<ffffffff81050c36>] do_exit+0x295/0x752 2011-01-25T11:58:50.001453+01:00 phy005 kernel: [<ffffffff8104816f>] ? default_wake_function+0x12/0x14 2011-01-25T11:58:50.001459+01:00 phy005 kernel: [<ffffffff81051174>] do_group_exit+0x81/0xab 2011-01-25T11:58:50.001463+01:00 phy005 kernel: [<ffffffff8105e5cd>] get_signal_to_deliver+0x3a6/0x3c8 2011-01-25T11:58:50.001466+01:00 phy005 kernel: [<ffffffff81092b6a>] ? audit_buffer_free+0x75/0x7a 2011-01-25T11:58:50.001469+01:00 phy005 kernel: [<ffffffff81009038>] do_signal+0x72/0x6b8 2011-01-25T11:58:50.001472+01:00 phy005 kernel: [<ffffffff8110efce>] ? 
__fput+0x1cd/0x1dc 2011-01-25T11:58:50.001478+01:00 phy005 kernel: [<ffffffff810096a6>] do_notify_resume+0x28/0x86 2011-01-25T11:58:50.001482+01:00 phy005 kernel: [<ffffffff81009f3e>] int_signal+0x12/0x17 2011-01-25T11:58:50.001486+01:00 phy005 kernel: Code: ff ff 41 8b 46 08 41 29 06 4c 89 e7 57 9d 0f 1f 44 00 00 48 83 c4 18 5b 41 5c 41 5d 41 5e 41 5f c9 c3 55 48 89 e5 0f 1f 44 00 00 <f0> ff 4f 08 0f 94 c0 84 c0 74 10 85 f6 75 07 e8 63 fe ff ff eb 2011-01-25T11:58:50.001489+01:00 phy005 kernel: RIP [<ffffffff810d0216>] __free_pages+0x9/0x26 2011-01-25T11:58:50.001494+01:00 phy005 kernel: RSP <ffff8802fbe45ab8> 2011-01-25T11:58:50.001497+01:00 phy005 kernel: ---[ end trace 643b51f38991abec ]--- 2011-01-25T11:58:50.001500+01:00 phy005 kernel: Fixing recursive fault but reboot is needed! and a bit later: 2011-01-25T12:06:32.673937+01:00 phy005 kernel: qemu-kvm: Corrupted page table at address 7f37b37ff000 2011-01-25T12:06:32.673959+01:00 phy005 kernel: PGD c201d1067 PUD 94e538067 PMD 61e5bf067 PTE 1603a0730500e067 2011-01-25T12:06:32.673962+01:00 phy005 kernel: Bad pagetable: 0009 [#2] SMP 2011-01-25T12:06:32.673965+01:00 phy005 kernel: last sysfs file: /sys/devices/system/cpu/cpu15/topology/thread_siblings 2011-01-25T12:06:32.673967+01:00 phy005 kernel: CPU 2 2011-01-25T12:06:32.673972+01:00 phy005 kernel: Modules linked in: tun ipmi_devintf ipmi_si ipmi_msghandler bridge 8021q garp stp llc bonding xt_comment xt_recent ip6t_REJECT nf_conntrack_ipv6 ip6table_filter ip6_tables ipv6 kvm_intel kvm igb i2c_i801 iTCO_wdt i2c_core ioatdma joydev iTCO_vendor_support dca serio_raw 3w_9xxx [last unloaded: scsi_wait_scan] 2011-01-25T12:06:32.673978+01:00 phy005 kernel: 2011-01-25T12:06:32.673981+01:00 phy005 kernel: Pid: 2428, comm: qemu-kvm Tainted: G D 2.6.34.7-66.tilaa.fc13.x86_64 #1 X8DTU/X8DTU 2011-01-25T12:06:32.673985+01:00 phy005 kernel: RIP: 0010:[<ffffffff81213bd7>] [<ffffffff81213bd7>] copy_user_generic_string+0x17/0x40 2011-01-25T12:06:32.673987+01:00 phy005 
kernel: RSP: 0018:ffff88061df85ba0 EFLAGS: 00010202 2011-01-25T12:06:32.673989+01:00 phy005 kernel: RAX: ffff88061df84000 RBX: ffff88061df85e98 RCX: 0000000000000005 2011-01-25T12:06:32.673992+01:00 phy005 kernel: RDX: 0000000000000720 RSI: 00007f37b37ff000 RDI: ffff8805db642453 2011-01-25T12:06:32.673999+01:00 phy005 kernel: RBP: ffff88061df85bc8 R08: 0000000000000b76 R09: 0000000000000b80 2011-01-25T12:06:32.674003+01:00 phy005 kernel: R10: 0000000000000650 R11: 0000000000000004 R12: ffff8805db642453 2011-01-25T12:06:32.674007+01:00 phy005 kernel: R13: 0000000000000725 R14: 0000000000000725 R15: 0000000000000725 2011-01-25T12:06:32.674011+01:00 phy005 kernel: FS: 00007f37e20e2700(0000) GS:ffff880002040000(0000) knlGS:0000000000000000 2011-01-25T12:06:32.674014+01:00 phy005 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b 2011-01-25T12:06:32.674018+01:00 phy005 kernel: CR2: 00007f37b37ff000 CR3: 0000000c23570000 CR4: 00000000000026e0 2011-01-25T12:06:32.674022+01:00 phy005 kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 2011-01-25T12:06:32.674036+01:00 phy005 kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 2011-01-25T12:06:32.674041+01:00 phy005 kernel: Process qemu-kvm (pid: 2428, threadinfo ffff88061df84000, task ffff88061c1aaee0) 2011-01-25T12:06:32.674044+01:00 phy005 kernel: Stack: 2011-01-25T12:06:32.674048+01:00 phy005 kernel: ffffffff8139f26e 0000000000000000 0000000000000725 00007f37b37ff000 2011-01-25T12:06:32.674052+01:00 phy005 kernel: <0> ffff8805db642453 ffff88061df85c08 ffffffff8139f475 0000000000000063 2011-01-25T12:06:32.674056+01:00 phy005 kernel: <0> 0000000000000b76 0000000000000000 ffff88061e089a00 0000000000000b76 2011-01-25T12:06:32.674058+01:00 phy005 kernel: Call Trace: 2011-01-25T12:06:32.674061+01:00 phy005 kernel: [<ffffffff8139f26e>] ? 
copy_from_user+0x2f/0x31 2011-01-25T12:06:32.674063+01:00 phy005 kernel: [<ffffffff8139f475>] memcpy_fromiovecend+0x57/0x82 2011-01-25T12:06:32.674075+01:00 phy005 kernel: [<ffffffff8139fb49>] skb_copy_datagram_from_iovec+0x5d/0x1ea 2011-01-25T12:06:32.674077+01:00 phy005 kernel: [<ffffffff8111c413>] ? pollwake+0x0/0x54 2011-01-25T12:06:32.674080+01:00 phy005 kernel: [<ffffffff8139f475>] ? memcpy_fromiovecend+0x57/0x82 2011-01-25T12:06:32.674082+01:00 phy005 kernel: [<ffffffffa00408f5>] tun_get_user+0x1bd/0x3e3 [tun] 2011-01-25T12:06:32.674084+01:00 phy005 kernel: [<ffffffffa0040b42>] ? tun_chr_aio_write+0x0/0x98 [tun] 2011-01-25T12:06:32.674087+01:00 phy005 kernel: [<ffffffffa0040bba>] tun_chr_aio_write+0x78/0x98 [tun] 2011-01-25T12:06:32.674089+01:00 phy005 kernel: [<ffffffff8110d995>] do_sync_readv_writev+0xc1/0x100 2011-01-25T12:06:32.674091+01:00 phy005 kernel: [<ffffffff811d78b4>] ? selinux_file_permission+0xa7/0xb3 2011-01-25T12:06:32.674094+01:00 phy005 kernel: [<ffffffff8110d6f9>] ? copy_from_user+0x2f/0x31 2011-01-25T12:06:32.674097+01:00 phy005 kernel: [<ffffffff811cdb0b>] ? security_file_permission+0x16/0x18 2011-01-25T12:06:32.674099+01:00 phy005 kernel: [<ffffffff8110e68c>] do_readv_writev+0xa7/0x127 2011-01-25T12:06:32.674101+01:00 phy005 kernel: [<ffffffff81065601>] ? 
sys_timer_settime+0x259/0x2ab 2011-01-25T12:06:32.674104+01:00 phy005 kernel: [<ffffffff8110e74f>] vfs_writev+0x43/0x4e 2011-01-25T12:06:32.674106+01:00 phy005 kernel: [<ffffffff8110e83f>] sys_writev+0x4a/0x93 2011-01-25T12:06:32.674109+01:00 phy005 kernel: [<ffffffff81009c72>] system_call_fastpath+0x16/0x1b 2011-01-25T12:06:32.674112+01:00 phy005 kernel: Code: 06 88 07 48 ff c6 48 ff c7 ff c9 75 f2 31 c0 c3 0f 1f 40 00 21 d2 74 30 83 fa 08 72 27 89 f9 83 e1 07 74 15 83 e9 08 f7 d9 29 ca <8a> 06 88 07 48 ff c6 48 ff c7 ff c9 75 f2 89 d1 c1 e9 03 83 e2 2011-01-25T12:06:32.674116+01:00 phy005 kernel: RIP [<ffffffff81213bd7>] copy_user_generic_string+0x17/0x40 2011-01-25T12:06:32.674118+01:00 phy005 kernel: RSP <ffff88061df85ba0> 2011-01-25T12:06:32.674120+01:00 phy005 kernel: ---[ end trace 643b51f38991abed ]--- 2011-01-25T12:38:49.416943+01:00 phy005 kernel: EPT: Misconfiguration. 2011-01-25T12:38:49.417518+01:00 phy005 kernel: EPT: GPA: 0x2abff038 2011-01-25T12:38:49.417526+01:00 phy005 kernel: ept_misconfig_inspect_spte: spte 0x5f49e9007 level 4 2011-01-25T12:38:49.417532+01:00 phy005 kernel: ept_misconfig_inspect_spte: spte 0x5db595007 level 3 2011-01-25T12:38:49.417553+01:00 phy005 kernel: ept_misconfig_inspect_spte: spte 0x5d5da7007 level 2 2011-01-25T12:38:49.417558+01:00 phy005 kernel: ept_misconfig_inspect_spte: spte 0x1603a07305006277 level 1 2011-01-25T12:38:49.419858+01:00 phy005 kernel: ept_misconfig_inspect_spte: rsvd_bits = 0x3a00000000000 2011-01-25T12:38:49.419881+01:00 phy005 kernel: ------------[ cut here ]------------ 2011-01-25T12:38:49.419884+01:00 phy005 kernel: WARNING: at arch/x86/kvm/vmx.c:3425 handle_ept_misconfig+0x152/0x1d8 [kvm_intel]() 2011-01-25T12:38:49.419886+01:00 phy005 kernel: Hardware name: X8DTU 2011-01-25T12:38:49.419890+01:00 phy005 kernel: Modules linked in: tun ipmi_devintf ipmi_si ipmi_msghandler bridge 8021q garp stp llc bonding xt_comment xt_recent ip6t_REJECT nf_conntrack_ipv6 ip6table_filter ip6_tables ipv6 kvm_intel 
kvm igb i2c_i801 iTCO_wdt i2c_core ioatdma joydev iTCO_vendor_support dca serio_raw 3w_9xxx [last unloaded: scsi_wait_scan] 2011-01-25T12:38:49.419893+01:00 phy005 kernel: Pid: 4475, comm: qemu-kvm Tainted: G D 2.6.34.7-66.tilaa.fc13.x86_64 #1 2011-01-25T12:38:49.419896+01:00 phy005 kernel: Call Trace: 2011-01-25T12:38:49.419900+01:00 phy005 kernel: [<ffffffff8104d11f>] warn_slowpath_common+0x7c/0x94 2011-01-25T12:38:49.419907+01:00 phy005 kernel: [<ffffffff8104d14b>] warn_slowpath_null+0x14/0x16 2011-01-25T12:38:49.420860+01:00 phy005 kernel: [<ffffffffa00ba7fb>] handle_ept_misconfig+0x152/0x1d8 [kvm_intel] 2011-01-25T12:38:49.420887+01:00 phy005 kernel: [<ffffffffa00bb401>] vmx_handle_exit+0x204/0x23a [kvm_intel] 2011-01-25T12:38:49.420891+01:00 phy005 kernel: [<ffffffffa0075998>] kvm_arch_vcpu_ioctl_run+0x7cd/0xa74 [kvm] 2011-01-25T12:38:49.420893+01:00 phy005 kernel: [<ffffffffa00645ba>] kvm_vcpu_ioctl+0xfd/0x56e [kvm] 2011-01-25T12:38:49.422064+01:00 phy005 kernel: [<ffffffff8111aa2f>] vfs_ioctl+0x32/0xa6 2011-01-25T12:38:49.422090+01:00 phy005 kernel: [<ffffffff8111afa2>] do_vfs_ioctl+0x483/0x4c9 2011-01-25T12:38:49.422092+01:00 phy005 kernel: [<ffffffff8111b03e>] sys_ioctl+0x56/0x79 2011-01-25T12:38:49.422096+01:00 phy005 kernel: [<ffffffff81009c72>] system_call_fastpath+0x16/0x1b 2011-01-25T12:38:49.422099+01:00 phy005 kernel: ---[ end trace 643b51f38991abee ]--- 2011-01-25T13:16:57.541111+01:00 phy005 kernel: br0: port 39(vnet74) entering disabled state 2011-01-25T13:16:57.588110+01:00 phy005 kernel: device vnet74 left promiscuous mode 2011-01-25T13:16:57.588169+01:00 phy005 kernel: br0: port 39(vnet74) entering disabled state 2011-01-25T13:16:58.192440+01:00 phy005 kernel: BUG: Bad page map in process qemu-kvm pte:1603a0730500d067 pmd:61059f067 2011-01-25T13:16:58.192462+01:00 phy005 kernel: addr:00007f97fe1ff000 vm_flags:80100073 anon_vma:ffff88061dd04440 mapping:(null) index:7f97fe1ff 2011-01-25T13:16:58.193253+01:00 phy005 kernel: Pid: 4444, comm: 
qemu-kvm Tainted: G D W 2.6.34.7-66.tilaa.fc13.x86_64 #1 2011-01-25T13:16:58.193275+01:00 phy005 kernel: Call Trace: 2011-01-25T13:16:58.193280+01:00 phy005 kernel: [<ffffffff810e135a>] print_bad_pte+0x203/0x21c 2011-01-25T13:16:58.193284+01:00 phy005 kernel: [<ffffffff810e13be>] vm_normal_page+0x4b/0x64 2011-01-25T13:16:58.194123+01:00 phy005 kernel: [<ffffffff810e1c5a>] unmap_vmas+0x492/0x92c 2011-01-25T13:16:58.194132+01:00 phy005 kernel: [<ffffffff810e73ff>] exit_mmap+0xce/0x132 2011-01-25T13:16:58.194138+01:00 phy005 kernel: [<ffffffff8104ad7a>] mmput+0x5e/0xca 2011-01-25T13:16:58.194142+01:00 phy005 kernel: [<ffffffff8104f0d5>] exit_mm+0x114/0x121 2011-01-25T13:16:58.195242+01:00 phy005 kernel: [<ffffffff81050bf5>] do_exit+0x254/0x752 2011-01-25T13:16:58.195253+01:00 phy005 kernel: [<ffffffff8104816f>] ? default_wake_function+0x12/0x14 2011-01-25T13:16:58.195256+01:00 phy005 kernel: [<ffffffff81051174>] do_group_exit+0x81/0xab 2011-01-25T13:16:58.195260+01:00 phy005 kernel: [<ffffffff8105e5cd>] get_signal_to_deliver+0x3a6/0x3c8 2011-01-25T13:16:58.195264+01:00 phy005 kernel: [<ffffffff81092b6a>] ? audit_buffer_free+0x75/0x7a 2011-01-25T13:16:58.196201+01:00 phy005 kernel: [<ffffffff81009038>] do_signal+0x72/0x6b8 2011-01-25T13:16:58.196212+01:00 phy005 kernel: [<ffffffff8110efce>] ? 
__fput+0x1cd/0x1dc 2011-01-25T13:16:58.196216+01:00 phy005 kernel: [<ffffffff810096a6>] do_notify_resume+0x28/0x86 2011-01-25T13:16:58.196219+01:00 phy005 kernel: [<ffffffff81009f3e>] int_signal+0x12/0x17 2011-01-25T13:17:00.006943+01:00 phy005 kernel: br1: port 39(vnet75) entering disabled state 2011-01-25T13:17:00.511943+01:00 phy005 kernel: device vnet75 left promiscuous mode 2011-01-25T13:17:00.511991+01:00 phy005 kernel: br1: port 39(vnet75) entering disabled state 2011-01-25T13:17:18.748195+01:00 phy005 kernel: device vnet74 entered promiscuous mode 2011-01-25T13:17:18.752020+01:00 phy005 kernel: br0: port 39(vnet74) entering forwarding state 2011-01-25T13:17:18.754127+01:00 phy005 kernel: device vnet75 entered promiscuous mode 2011-01-25T13:17:18.756087+01:00 phy005 kernel: br1: port 39(vnet75) entering forwarding state 2011-01-25T13:17:24.416116+01:00 phy005 kernel: kvm: 16063: cpu0 unhandled wrmsr: 0x198 data 0 2011-01-25T13:17:24.416135+01:00 phy005 kernel: kvm: 16063: cpu1 unhandled wrmsr: 0x198 data 0 2011-01-25T13:17:29.051982+01:00 phy005 kernel: vnet74: no IPv6 routers present 2011-01-25T13:17:29.166986+01:00 phy005 kernel: vnet75: no IPv6 routers present 2011-01-25T15:01:38.735441+01:00 phy005 kernel: BUG: unable to handle kernel paging request at fffff6b192918010 2011-01-25T15:01:38.735756+01:00 phy005 kernel: IP: [<ffffffffa0079826>] kvm_mmu_zap_page+0x28a/0x299 [kvm] 2011-01-25T15:01:38.735762+01:00 phy005 kernel: PGD 0 2011-01-25T15:01:38.735766+01:00 phy005 kernel: Oops: 0000 [#3] SMP 2011-01-25T15:01:38.735770+01:00 phy005 kernel: last sysfs file: /sys/devices/system/cpu/cpu15/topology/thread_siblings 2011-01-25T15:01:38.735773+01:00 phy005 kernel: CPU 10 2011-01-25T15:01:38.735780+01:00 phy005 kernel: Modules linked in: tun ipmi_devintf ipmi_si ipmi_msghandler bridge 8021q garp stp llc bonding xt_comment xt_recent ip6t_REJECT nf_conntrack_ipv6 ip6table_filter ip6_tables ipv6 kvm_intel kvm igb i2c_i801 iTCO_wdt i2c_core ioatdma joydev 
iTCO_vendor_support dca serio_raw 3w_9xxx [last unloaded: scsi_wait_scan] 2011-01-25T15:01:38.735783+01:00 phy005 kernel: 2011-01-25T15:01:38.735788+01:00 phy005 kernel: Pid: 2465, comm: qemu-kvm Tainted: G B D W 2.6.34.7-66.tilaa.fc13.x86_64 #1 X8DTU/X8DTU 2011-01-25T15:01:38.735792+01:00 phy005 kernel: RIP: 0010:[<ffffffffa0079826>] [<ffffffffa0079826>] kvm_mmu_zap_page+0x28a/0x299 [kvm] 2011-01-25T15:01:38.735796+01:00 phy005 kernel: RSP: 0018:ffff880c243cdb58 EFLAGS: 00010206 2011-01-25T15:01:38.735800+01:00 phy005 kernel: RAX: 00000cb192918000 RBX: ffff88030010b8c0 RCX: 0000000000000000 2011-01-25T15:01:38.735804+01:00 phy005 kernel: RDX: ffffea0000000000 RSI: ffff880310adfff8 RDI: ffff8803000f68f8 2011-01-25T15:01:38.735807+01:00 phy005 kernel: RBP: ffff880c243cdb88 R08: ffff880310adf018 R09: 0000000000000004 2011-01-25T15:01:38.735811+01:00 phy005 kernel: R10: 0000000000000000 R11: ffffea000ac288c0 R12: ffff880c201fc000 2011-01-25T15:01:38.735819+01:00 phy005 kernel: R13: ffff880310adfff8 R14: 00000000000001ff R15: 0000000000000000 2011-01-25T15:01:38.735823+01:00 phy005 kernel: FS: 0000000000000000(0000) GS:ffff8800020c0000(0000) knlGS:0000000000000000 2011-01-25T15:01:38.735826+01:00 phy005 kernel: CS: 0010 DS: 002b ES: 002b CR0: 000000008005003b 2011-01-25T15:01:38.735830+01:00 phy005 kernel: CR2: fffff6b192918010 CR3: 0000000001a42000 CR4: 00000000000026e0 2011-01-25T15:01:38.735833+01:00 phy005 kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 2011-01-25T15:01:38.735836+01:00 phy005 kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 2011-01-25T15:01:38.735840+01:00 phy005 kernel: Process qemu-kvm (pid: 2465, threadinfo ffff880c243cc000, task ffff880c235f1770) 2011-01-25T15:01:38.735843+01:00 phy005 kernel: Stack: 2011-01-25T15:01:38.735847+01:00 phy005 kernel: 0000000000000002 ffff880c201fc000 ffff88030010b810 ffff880c201fe328 2011-01-25T15:01:38.735851+01:00 phy005 kernel: <0> ffff880c22d01568 
ffff880c235f1770 ffff880c243cdbb8 ffffffffa0079a42 2011-01-25T15:01:38.735855+01:00 phy005 kernel: <0> ffffea002a7039c0 ffff880c201fc000 ffff880c201fc000 0000000000000001 2011-01-25T15:01:38.735858+01:00 phy005 kernel: Call Trace: 2011-01-25T15:01:38.735862+01:00 phy005 kernel: [<ffffffffa0079a42>] kvm_mmu_zap_all+0x35/0x60 [kvm] 2011-01-25T15:01:38.735866+01:00 phy005 kernel: [<ffffffffa006ecde>] kvm_arch_flush_shadow+0x16/0x22 [kvm] 2011-01-25T15:01:38.735870+01:00 phy005 kernel: [<ffffffffa0064b0a>] kvm_mmu_notifier_release+0x31/0x44 [kvm] 2011-01-25T15:01:38.735875+01:00 phy005 kernel: [<ffffffff810fac37>] __mmu_notifier_release+0x4f/0x7b 2011-01-25T15:01:38.735879+01:00 phy005 kernel: [<ffffffff810e735d>] exit_mmap+0x2c/0x132 2011-01-25T15:01:38.735882+01:00 phy005 kernel: [<ffffffff8104ad7a>] mmput+0x5e/0xca 2011-01-25T15:01:38.735886+01:00 phy005 kernel: [<ffffffff8104f0d5>] exit_mm+0x114/0x121 2011-01-25T15:01:38.735890+01:00 phy005 kernel: [<ffffffff81050bf5>] do_exit+0x254/0x752 2011-01-25T15:01:38.735893+01:00 phy005 kernel: [<ffffffffa006709a>] ? vcpu_put+0x28/0x2d [kvm] 2011-01-25T15:01:38.735897+01:00 phy005 kernel: [<ffffffff81051174>] do_group_exit+0x81/0xab 2011-01-25T15:01:38.735902+01:00 phy005 kernel: [<ffffffff8105e5cd>] get_signal_to_deliver+0x3a6/0x3c8 2011-01-25T15:01:38.735906+01:00 phy005 kernel: [<ffffffff81009038>] do_signal+0x72/0x6b8 2011-01-25T15:01:38.735910+01:00 phy005 kernel: [<ffffffff8111aa2f>] ? vfs_ioctl+0x32/0xa6 2011-01-25T15:01:38.735914+01:00 phy005 kernel: [<ffffffff8111afa2>] ? 
do_vfs_ioctl+0x483/0x4c9 2011-01-25T15:01:38.735918+01:00 phy005 kernel: [<ffffffff810096a6>] do_notify_resume+0x28/0x86 2011-01-25T15:01:38.735922+01:00 phy005 kernel: [<ffffffff81009f3e>] int_signal+0x12/0x17 2011-01-25T15:01:38.735928+01:00 phy005 kernel: Code: 41 5e 44 89 f8 41 5f c9 c3 48 ba 00 f0 ff ff ff ff 0f 00 4c 89 ee 48 21 d0 48 ba 00 00 00 00 00 ea ff ff 48 c1 e8 0c 48 6b c0 38 <48> 8b 7c 10 10 e8 a3 f3 ff ff e9 06 fe ff ff 55 48 89 e5 41 57 2011-01-25T15:01:38.735938+01:00 phy005 kernel: RIP [<ffffffffa0079826>] kvm_mmu_zap_page+0x28a/0x299 [kvm] 2011-01-25T15:01:38.735942+01:00 phy005 kernel: RSP <ffff880c243cdb58> 2011-01-25T15:01:38.735946+01:00 phy005 kernel: CR2: fffff6b192918010 2011-01-25T15:01:38.735950+01:00 phy005 kernel: ---[ end trace 643b51f38991abef ]--- 2011-01-25T15:01:38.735954+01:00 phy005 kernel: Fixing recursive fault but reboot is needed! and 2011-01-25T17:33:57.393780+01:00 phy005 kernel: BUG: unable to handle kernel paging request at ffffea7192918310 2011-01-25T17:33:57.393888+01:00 phy005 kernel: IP: [<ffffffff81034880>] gup_pte_range+0x94/0xd3 2011-01-25T17:33:57.393895+01:00 phy005 kernel: PGD 118600067 PUD 0 2011-01-25T17:33:57.393897+01:00 phy005 kernel: Oops: 0000 [#4] SMP 2011-01-25T17:33:57.393900+01:00 phy005 kernel: last sysfs file: /sys/devices/system/cpu/cpu15/topology/thread_siblings 2011-01-25T17:33:57.393902+01:00 phy005 kernel: CPU 4 2011-01-25T17:33:57.393906+01:00 phy005 kernel: Modules linked in: tun ipmi_devintf ipmi_si ipmi_msghandler bridge 8021q garp stp llc bonding xt_comment xt_recent ip6t_REJECT nf_conntrack_ipv6 ip6table_filter ip6_tables ipv6 kvm_intel kvm igb i2c_i801 iTCO_wdt i2c_core ioatdma joydev iTCO_vendor_support dca serio_raw 3w_9xxx [last unloaded: scsi_wait_scan] 2011-01-25T17:33:57.393913+01:00 phy005 kernel: 2011-01-25T17:33:57.393915+01:00 phy005 kernel: Pid: 3630, comm: qemu-kvm Tainted: G B D W 2.6.34.7-66.tilaa.fc13.x86_64 #1 X8DTU/X8DTU 2011-01-25T17:33:57.393918+01:00 phy005 kernel: 
RIP: 0010:[<ffffffff81034880>] [<ffffffff81034880>] gup_pte_range+0x94/0xd3 2011-01-25T17:33:57.393920+01:00 phy005 kernel: RSP: 0018:ffff880bac24bab8 EFLAGS: 00010082 2011-01-25T17:33:57.393923+01:00 phy005 kernel: RAX: ffffea7192918310 RBX: 00003ffffffff000 RCX: 0000000000000007 2011-01-25T17:33:57.393925+01:00 phy005 kernel: RDX: 00007fce4fc00000 RSI: 00007fce4fbff000 RDI: 1603a0730500e067 2011-01-25T17:33:57.393927+01:00 phy005 kernel: RBP: ffff880bac24bad8 R08: ffff880bac24bbc8 R09: ffff880bac24bb84 2011-01-25T17:33:57.393929+01:00 phy005 kernel: R10: ffff880315507ff8 R11: ffffea0000000000 R12: 0000000000000207 2011-01-25T17:33:57.393931+01:00 phy005 kernel: R13: ffffc00000000fff R14: 0000000000000007 R15: 0000000000000001 2011-01-25T17:33:57.393935+01:00 phy005 kernel: FS: 00007fce993a2700(0000) GS:ffff880655400000(0000) knlGS:0000000000000000 2011-01-25T17:33:57.393938+01:00 phy005 kernel: CS: 0010 DS: 002b ES: 002b CR0: 0000000080050033 2011-01-25T17:33:57.393941+01:00 phy005 kernel: CR2: ffffea7192918310 CR3: 0000000bac235000 CR4: 00000000000026e0 2011-01-25T17:33:57.393944+01:00 phy005 kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 2011-01-25T17:33:57.393948+01:00 phy005 kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 2011-01-25T17:33:57.393951+01:00 phy005 kernel: Process qemu-kvm (pid: 3630, threadinfo ffff880bac24a000, task ffff880bac380000) 2011-01-25T17:33:57.393955+01:00 phy005 kernel: Stack: 2011-01-25T17:33:57.393958+01:00 phy005 kernel: 00007fce4fc00000 00007fce4fc00000 00007fce4fc00000 ffff880bac3b43e8 2011-01-25T17:33:57.393962+01:00 phy005 kernel: <0> ffff880bac24bb38 ffffffff81034a15 00007fce4fbfffff 00007fce4fbfffff 2011-01-25T17:33:57.393966+01:00 phy005 kernel: <0> ffff880bac24bb84 ffff880bac24bbc8 ffff880bac1bb9c8 ffff880bac2357f8 2011-01-25T17:33:57.393969+01:00 phy005 kernel: Call Trace: 2011-01-25T17:33:57.393973+01:00 phy005 kernel: [<ffffffff81034a15>] gup_pud_range+0x156/0x192 
2011-01-25T17:33:57.393977+01:00 phy005 kernel: [<ffffffff81034b15>] get_user_pages_fast+0xc4/0x172 2011-01-25T17:33:57.393981+01:00 phy005 kernel: [<ffffffff8102f16b>] ? device_change_notifier+0x5d/0x120 2011-01-25T17:33:57.393985+01:00 phy005 kernel: [<ffffffffa00656a7>] hva_to_pfn+0x41/0x123 [kvm] 2011-01-25T17:33:57.393989+01:00 phy005 kernel: [<ffffffffa00657bd>] ? gfn_to_hva+0x16/0x72 [kvm] 2011-01-25T17:33:57.393993+01:00 phy005 kernel: [<ffffffffa0065bfa>] gfn_to_pfn+0x6a/0x6e [kvm] 2011-01-25T17:33:57.393997+01:00 phy005 kernel: [<ffffffffa007e862>] tdp_page_fault+0x80/0x10c [kvm] 2011-01-25T17:33:57.393999+01:00 phy005 kernel: [<ffffffffa008450f>] ? apic_update_ppr+0x22/0x57 [kvm] 2011-01-25T17:33:57.394002+01:00 phy005 kernel: [<ffffffff812c0437>] ? device_find_child+0x12/0x81 2011-01-25T17:33:57.394004+01:00 phy005 kernel: [<ffffffffa007c5cd>] kvm_mmu_page_fault+0x1f/0x98 [kvm] 2011-01-25T17:33:57.394007+01:00 phy005 kernel: [<ffffffffa00ba97a>] handle_ept_violation+0xf9/0x102 [kvm_intel] 2011-01-25T17:33:57.394010+01:00 phy005 kernel: [<ffffffffa00bb401>] vmx_handle_exit+0x204/0x23a [kvm_intel] 2011-01-25T17:33:57.394012+01:00 phy005 kernel: [<ffffffffa0075998>] kvm_arch_vcpu_ioctl_run+0x7cd/0xa74 [kvm] 2011-01-25T17:33:57.394015+01:00 phy005 kernel: [<ffffffffa00645ba>] kvm_vcpu_ioctl+0xfd/0x56e [kvm] 2011-01-25T17:33:57.394017+01:00 phy005 kernel: [<ffffffff810206c4>] ? lapic_next_event+0x1d/0x21 2011-01-25T17:33:57.394020+01:00 phy005 kernel: [<ffffffff81071435>] ? clockevents_program_event+0x7a/0x83 2011-01-25T17:33:57.394023+01:00 phy005 kernel: [<ffffffff8107258d>] ? 
tick_dev_program_event+0x3c/0xfc 2011-01-25T17:33:57.394026+01:00 phy005 kernel: [<ffffffff8111aa2f>] vfs_ioctl+0x32/0xa6 2011-01-25T17:33:57.394030+01:00 phy005 kernel: [<ffffffff8111afa2>] do_vfs_ioctl+0x483/0x4c9 2011-01-25T17:33:57.394034+01:00 phy005 kernel: [<ffffffff8111b03e>] sys_ioctl+0x56/0x79 2011-01-25T17:33:57.394037+01:00 phy005 kernel: [<ffffffff81009c72>] system_call_fastpath+0x16/0x1b 2011-01-25T17:33:57.394040+01:00 phy005 kernel: Code: 21 d8 49 01 c2 49 8b 3a 49 89 fe 4d 21 ee 4d 21 e6 49 39 ce 75 49 48 89 f8 0f 1f 40 00 48 21 d8 48 c1 e8 0c 48 6b c0 38 4c 01 d8 <66> 83 38 00 48 89 c7 79 04 48 8b 78 10 f0 ff 47 08 49 63 39 48 2011-01-25T17:33:57.394043+01:00 phy005 kernel: RIP [<ffffffff81034880>] gup_pte_range+0x94/0xd3 2011-01-25T17:33:57.394055+01:00 phy005 kernel: RSP <ffff880bac24bab8> 2011-01-25T17:33:57.394058+01:00 phy005 kernel: CR2: ffffea7192918310 2011-01-25T17:33:57.394060+01:00 phy005 kernel: ---[ end trace 643b51f38991abf0 ]--- Regards, Ruben Thanks, Ruben ^ permalink raw reply [flat|nested] 19+ messages in thread
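The faulting addresses in the two oopses above can be recomputed from their own register dumps. A minimal sketch, under these assumptions: the Code: bytes before each faulting instruction are read as "mask, shift right by 12, multiply by 0x38, add a base"; 0x38 is sizeof(struct page) on this kernel; and ffffea0000000000 (R11 in the gup_pte_range dump, RDX in the kvm_mmu_zap_page dump) is the base of the struct page array. None of this is confirmed elsewhere in the thread, and all names below are illustrative.

```python
# Values copied by hand from the two 2011-01-25 oopses quoted above.
STRUCT_PAGE_BASE = 0xFFFFEA0000000000  # R11 / RDX in the dumps; assumed struct page array base
STRUCT_PAGE_SIZE = 0x38                # multiplier in the Code: bytes; assumed sizeof(struct page)

# gup_pte_range oops (17:33:57)
GUP_PTE = 0x1603A0730500E067   # RDI: the page-table entry that was loaded
GUP_MASK = 0x00003FFFFFFFF000  # RBX: mask applied before the shift
GUP_CR2 = 0xFFFFEA7192918310   # faulting address

# kvm_mmu_zap_page oops (15:01:38)
ZAP_RAX = 0x00000CB192918000   # assumed to be pfn * STRUCT_PAGE_SIZE at the fault
ZAP_MASK = 0x000FFFFFFFFFF000  # immediate operand in the Code: bytes
ZAP_CR2 = 0xFFFFF6B192918010   # faulting address


def struct_page_address(entry, mask):
    """Address of the struct page that a page-table entry would point to."""
    pfn = (entry & mask) >> 12
    return STRUCT_PAGE_BASE + pfn * STRUCT_PAGE_SIZE


# Forward direction: does the bogus PTE in RDI explain CR2?
gup_fault = struct_page_address(GUP_PTE, GUP_MASK)
print(hex(gup_fault), gup_fault == GUP_CR2)

# Backward direction: which entry bits would produce the RAX seen in
# kvm_mmu_zap_page, and do they match the 0x1603a0730500 pattern?
zap_entry_bits = (ZAP_RAX // STRUCT_PAGE_SIZE) << 12
pattern_bits = (0x1603A0730500 << 16) & ZAP_MASK
print(hex(zap_entry_bits), zap_entry_bits == pattern_bits)

# 0x10 is the displacement read from the faulting instruction (assumed).
print(STRUCT_PAGE_BASE + ZAP_RAX + 0x10 == ZAP_CR2)
```

If these assumptions hold, all three comparisons come out true, which would mean both faults follow from a page-table entry carrying the same 0x1603a0730500 pattern rather than from two unrelated problems.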
* Re: EPT: Misconfiguration
From: Avi Kivity @ 2011-01-26 9:52 UTC
To: Ruben Kerkhof; +Cc: Marcelo Tosatti, kvm

On 01/25/2011 08:29 PM, Ruben Kerkhof wrote:
> > When you say "suddenly", this was with no changes to software and hardware?
>
> The host software and hardware hasn't changed in the two months since
> the machine has been running. 2.6.34.7 kernel and qemu-kvm 0.13.
>
> We host customer vms on it though, so virtual machines come and go.
> Various operating systems, a mixture of Linux, FreeBSD and Windows
> 2008 R2. We have other machines with the same config without these
> problems though.

Are those other machines running a similar workload?

The traces look awfully like bad hardware, though that can also be explained by random memory corruption due to a bug.

> This time I have a few different messages though:
>
> 2011-01-25T11:58:50.001208+01:00 phy005 kernel: general protection fault: 0000 [#1] SMP
>
> RSI: 0000000000000000 RDI: 1603a07305001568
>
> 2011-01-25T11:58:50.001486+01:00 phy005 kernel: Code: ff ff 41 8b 46
> 08 41 29 06 4c 89 e7 57 9d 0f 1f 44 00 00 48 83 c4 18 5b 41 5c 41 5d
> 41 5e 41 5f c9 c3 55 48 89 e5 0f 1f 44 00 00<f0> ff 4f 08 0f 94 c0 84
> c0 74 10 85 f6 75 07 e8 63 fe ff ff eb

    lock decl 0x8(%rdi)

%rdi is completely crap, looks like corruption again. Strangely, it is similar to the bad spte from the previous trace: 0x1603a0730500d277. The upper 48 bits are identical, the lower 16 bits are different.

> 2011-01-25T12:06:32.673937+01:00 phy005 kernel: qemu-kvm: Corrupted
> page table at address 7f37b37ff000
> 2011-01-25T12:06:32.673959+01:00 phy005 kernel: PGD c201d1067 PUD
> 94e538067 PMD 61e5bf067 PTE 1603a0730500e067

Here are those magic 48 bits again, in the PTE entry.

> 2011-01-25T12:38:49.416943+01:00 phy005 kernel: EPT: Misconfiguration.
> 2011-01-25T12:38:49.417518+01:00 phy005 kernel: EPT: GPA: 0x2abff038
> 2011-01-25T12:38:49.417526+01:00 phy005 kernel:
> ept_misconfig_inspect_spte: spte 0x5f49e9007 level 4
> 2011-01-25T12:38:49.417532+01:00 phy005 kernel:
> ept_misconfig_inspect_spte: spte 0x5db595007 level 3
> 2011-01-25T12:38:49.417553+01:00 phy005 kernel:
> ept_misconfig_inspect_spte: spte 0x5d5da7007 level 2
> 2011-01-25T12:38:49.417558+01:00 phy005 kernel:
> ept_misconfig_inspect_spte: spte 0x1603a07305006277 level 1

Again.

> 2011-01-25T13:16:58.192440+01:00 phy005 kernel: BUG: Bad page map in
> process qemu-kvm pte:1603a0730500d067 pmd:61059f067

Again.

However, these all came from a single boot, yes? If so they can be the same corruption. Please collect more traces, with reboots in between.

--
error compiling committee.c: too many arguments to function
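The repeated pattern pointed out in this message can be checked mechanically. A minimal sketch, assuming the values are copied correctly by hand from the traces quoted above; `RSVD_BITS` is the `rsvd_bits` value printed in the first report of the thread, and the dictionary labels are only illustrative:

```python
# Corrupt 64-bit values quoted in this message, copied from the traces.
values = {
    "bad spte (previous trace)": 0x1603A0730500D277,
    "RDI in general protection fault": 0x1603A07305001568,
    "PTE in corrupted page table": 0x1603A0730500E067,
    "level 1 spte (EPT misconfig)": 0x1603A07305006277,
    "pte in bad page map": 0x1603A0730500D067,
}

# rsvd_bits as printed by ept_misconfig_inspect_spte in the first report.
RSVD_BITS = 0x3A00000000000


def split(value):
    """Return (upper 48 bits, lower 16 bits) of a 64-bit value."""
    return value >> 16, value & 0xFFFF


for name, value in values.items():
    upper, lower = split(value)
    print(f"{name}: upper48={upper:#014x} lower16={lower:#06x}")

# One distinct prefix means every value shares the same upper 48 bits.
prefixes = {split(v)[0] for v in values.values()}
print("distinct upper-48-bit prefixes:", len(prefixes))

# Any overlap with the reserved-bit mask is what triggers the EPT
# misconfiguration report for the level 1 spte.
reserved_hits = {name: hex(v & RSVD_BITS) for name, v in values.items()}
print(reserved_hits["level 1 spte (EPT misconfig)"])
```

With these inputs there is a single shared prefix, 0x1603a0730500, and that prefix on its own sets every bit in the reserved mask, so any entry carrying it would be reported as misconfigured regardless of its lower 16 bits.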
* Re: EPT: Misconfiguration
From: Ruben Kerkhof @ 2011-01-26 15:00 UTC
To: Avi Kivity; +Cc: Marcelo Tosatti, kvm

On Wed, Jan 26, 2011 at 10:52, Avi Kivity <avi@redhat.com> wrote:
> On 01/25/2011 08:29 PM, Ruben Kerkhof wrote:
>>
>> > When you say "suddenly", this was with no changes to software and
>> > hardware?
>>
>> The host software and hardware hasn't changed in the two months since
>> the machine has been running. 2.6.34.7 kernel and qemu-kvm 0.13.
>>
>> We host customer vms on it though, so virtual machines come and go.
>> Various operating systems, a mixture of Linux, FreeBSD and Windows
>> 2008 R2. We have other machines with the same config without these
>> problems though.
>
> Are those other machines running a similar workload?

Yes, similar, or they're more heavily loaded.

On this machine, about half of the 48GB memory was used for virtual machines.

> The traces look awfully like bad hardware, though that can also be explained
> by random memory corruption due to a bug.

Yeah, that's what I'm expecting. We already replaced the memory; the next step is to move the disks over to another server to make sure it's not the board or CPUs.

>> This time I have a few different messages though:
>>
>> 2011-01-25T11:58:50.001208+01:00 phy005 kernel: general protection fault:
>> 0000 [#1] SMP
>>
>> RSI: 0000000000000000 RDI: 1603a07305001568
>>
>> 2011-01-25T11:58:50.001486+01:00 phy005 kernel: Code: ff ff 41 8b 46
>> 08 41 29 06 4c 89 e7 57 9d 0f 1f 44 00 00 48 83 c4 18 5b 41 5c 41 5d
>> 41 5e 41 5f c9 c3 55 48 89 e5 0f 1f 44 00 00<f0> ff 4f 08 0f 94 c0 84
>> c0 74 10 85 f6 75 07 e8 63 fe ff ff eb
>
> lock decl 0x8(%rdi)
>
> %rdi is completely crap, looks like corruption again. Strangely, it is
> similar to the bad spte from the previous trace: 0x1603a0730500d277. The
> upper 48 bits are identical, the lower 16 bits are different.
>>
>> 2011-01-25T12:06:32.673937+01:00 phy005 kernel: qemu-kvm: Corrupted
>> page table at address 7f37b37ff000
>> 2011-01-25T12:06:32.673959+01:00 phy005 kernel: PGD c201d1067 PUD
>> 94e538067 PMD 61e5bf067 PTE 1603a0730500e067
>
> Here are those magic 48 bits again, in the PTE entry.
>>
>> 2011-01-25T12:38:49.416943+01:00 phy005 kernel: EPT: Misconfiguration.
>> 2011-01-25T12:38:49.417518+01:00 phy005 kernel: EPT: GPA: 0x2abff038
>> 2011-01-25T12:38:49.417526+01:00 phy005 kernel:
>> ept_misconfig_inspect_spte: spte 0x5f49e9007 level 4
>> 2011-01-25T12:38:49.417532+01:00 phy005 kernel:
>> ept_misconfig_inspect_spte: spte 0x5db595007 level 3
>> 2011-01-25T12:38:49.417553+01:00 phy005 kernel:
>> ept_misconfig_inspect_spte: spte 0x5d5da7007 level 2
>> 2011-01-25T12:38:49.417558+01:00 phy005 kernel:
>> ept_misconfig_inspect_spte: spte 0x1603a07305006277 level 1
>
> Again.
>
>> 2011-01-25T13:16:58.192440+01:00 phy005 kernel: BUG: Bad page map in
>> process qemu-kvm pte:1603a0730500d067 pmd:61059f067
>
> Again.
>
> However, these all came from a single boot, yes?

Correct.

> If so they can be the same
> corruption. Please collect more traces, with reboots in between.

Ok, thanks, will do.

Kind regards,

Ruben
* Re: EPT: Misconfiguration
From: Ruben Kerkhof @ 2011-02-10 15:23 UTC
To: Avi Kivity; +Cc: Marcelo Tosatti, kvm

On Wed, Jan 26, 2011 at 16:00, Ruben Kerkhof <ruben@rubenkerkhof.com> wrote:
> On Wed, Jan 26, 2011 at 10:52, Avi Kivity <avi@redhat.com> wrote:
>> On 01/25/2011 08:29 PM, Ruben Kerkhof wrote:
>>>
>>> > When you say "suddenly", this was with no changes to software and
>>> > hardware?
>>>
>>> The host software and hardware hasn't changed in the two months since
>>> the machine has been running. 2.6.34.7 kernel and qemu-kvm 0.13.
>>>
>>> We host customer vms on it though, so virtual machines come and go.
>>> Various operating systems, a mixture of Linux, FreeBSD and Windows
>>> 2008 R2. We have other machines with the same config without these
>>> problems though.
>>
>> Are those other machines running a similar workload?
>
> Yes, similar, or they're more heavily loaded.
>
> On this machine, about half of the 48GB memory was used for virtual machines.
>
>> The traces look awfully like bad hardware, though that can also be explained
>> by random memory corruption due to a bug.
>
> Yeah, that's what I'm expecting. We already replaced the memory; the next
> step is to move the disks over to another server to make sure it's not
> the board or CPUs.
>
>>> This time I have a few different messages though:
>>>
>>> 2011-01-25T11:58:50.001208+01:00 phy005 kernel: general protection fault:
>>> 0000 [#1] SMP
>>>
>>> RSI: 0000000000000000 RDI: 1603a07305001568
>>>
>>> 2011-01-25T11:58:50.001486+01:00 phy005 kernel: Code: ff ff 41 8b 46
>>> 08 41 29 06 4c 89 e7 57 9d 0f 1f 44 00 00 48 83 c4 18 5b 41 5c 41 5d
>>> 41 5e 41 5f c9 c3 55 48 89 e5 0f 1f 44 00 00<f0> ff 4f 08 0f 94 c0 84
>>> c0 74 10 85 f6 75 07 e8 63 fe ff ff eb
>>
>> lock decl 0x8(%rdi)
>>
>> %rdi is completely crap, looks like corruption again. Strangely, it is
>> similar to the bad spte from the previous trace: 0x1603a0730500d277. The
>> upper 48 bits are identical, the lower 16 bits are different.
>>>
>>> 2011-01-25T12:06:32.673937+01:00 phy005 kernel: qemu-kvm: Corrupted
>>> page table at address 7f37b37ff000
>>> 2011-01-25T12:06:32.673959+01:00 phy005 kernel: PGD c201d1067 PUD
>>> 94e538067 PMD 61e5bf067 PTE 1603a0730500e067
>>
>> Here are those magic 48 bits again, in the PTE entry.
>>>
>>> 2011-01-25T12:38:49.416943+01:00 phy005 kernel: EPT: Misconfiguration.
>>> 2011-01-25T12:38:49.417518+01:00 phy005 kernel: EPT: GPA: 0x2abff038
>>> 2011-01-25T12:38:49.417526+01:00 phy005 kernel:
>>> ept_misconfig_inspect_spte: spte 0x5f49e9007 level 4
>>> 2011-01-25T12:38:49.417532+01:00 phy005 kernel:
>>> ept_misconfig_inspect_spte: spte 0x5db595007 level 3
>>> 2011-01-25T12:38:49.417553+01:00 phy005 kernel:
>>> ept_misconfig_inspect_spte: spte 0x5d5da7007 level 2
>>> 2011-01-25T12:38:49.417558+01:00 phy005 kernel:
>>> ept_misconfig_inspect_spte: spte 0x1603a07305006277 level 1
>>
>> Again.
>>
>>> 2011-01-25T13:16:58.192440+01:00 phy005 kernel: BUG: Bad page map in
>>> process qemu-kvm pte:1603a0730500d067 pmd:61059f067
>>
>> Again.
>>
>> However, these all came from a single boot, yes?
>
> Correct.
>
>> If so they can be the same
>> corruption. Please collect more traces, with reboots in between.

This machine has been running for a week without problems, but then we started to get the following oopses again: 2011-02-06T19:45:35.221555+01:00 phy005 kernel: BUG: unable to handle kernel paging request at ffffea71929180e0 2011-02-06T19:45:35.222194+01:00 phy005 kernel: IP: [<ffffffff81034880>] gup_pte_range+0x94/0xd3 2011-02-06T19:45:35.222199+01:00 phy005 kernel: PGD 118600067 PUD 0 2011-02-06T19:45:35.222203+01:00 phy005 kernel: Oops: 0000 [#1] SMP 2011-02-06T19:45:35.222221+01:00 phy005 kernel: last sysfs file: /sys/devices/system/cpu/cpu15/topology/thread_siblings 2011-02-06T19:45:35.222224+01:00 phy005 kernel: CPU 4 2011-02-06T19:45:35.222229+01:00 phy005 kernel: Modules linked in: tun ipmi_devintf ipmi_si ipmi_msghandler bridge 8021q garp stp llc bonding xt_comment xt_recent ip6t_REJECT nf_conntrack_ipv6 ip6table_filter ip6_tables ipv6 kvm_intel kvm i2c_i801 i2c_core iTCO_wdt serio_raw igb iTCO_vendor_support joydev ioatdma dca 3w_9xxx [last unloaded: scsi_wait_scan] 2011-02-06T19:45:35.222231+01:00 phy005 kernel: 2011-02-06T19:45:35.222233+01:00 phy005 kernel: Pid: 3650, comm: qemu-kvm Not tainted 2.6.34.7-66.tilaa.fc13.x86_64 #1 X8DTU/X8DTU 2011-02-06T19:45:35.222236+01:00 phy005 kernel: RIP: 0010:[<ffffffff81034880>] [<ffffffff81034880>] gup_pte_range+0x94/0xd3 2011-02-06T19:45:35.222239+01:00 phy005 kernel: RSP: 0018:ffff88060b9bda78 EFLAGS: 00010082 2011-02-06T19:45:35.222241+01:00 phy005 kernel: RAX: ffffea71929180e0 RBX: 00003ffffffff000 RCX: 0000000000000005 2011-02-06T19:45:35.222243+01:00 phy005 kernel: RDX: 00007fe54e400000 RSI: 00007fe54e3ff000 RDI: 1603a07305004067 2011-02-06T19:45:35.222245+01:00 phy005 kernel: RBP: ffff88060b9bda98 R08: ffff880b94384560 R09: ffff88060b9bdb44 2011-02-06T19:45:35.222248+01:00 phy005 kernel: R10: ffff880606b2fff8 R11: ffffea0000000000 R12: 0000000000000205 2011-02-06T19:45:35.222251+01:00 phy005 kernel: R13: ffffc00000000fff R14: 0000000000000005 R15: 0000000000000000 2011-02-06T19:45:35.222255+01:00 phy005 
kernel: FS: 00007fe64cb0e700(0000) GS:ffff880655400000(0000) knlGS:0000000000000000 2011-02-06T19:45:35.222259+01:00 phy005 kernel: CS: 0010 DS: 002b ES: 002b CR0: 0000000080050033 2011-02-06T19:45:35.222263+01:00 phy005 kernel: CR2: ffffea71929180e0 CR3: 0000000bff06d000 CR4: 00000000000026e0 2011-02-06T19:45:35.222267+01:00 phy005 kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 2011-02-06T19:45:35.222271+01:00 phy005 kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 2011-02-06T19:45:35.222274+01:00 phy005 kernel: Process qemu-kvm (pid: 3650, threadinfo ffff88060b9bc000, task ffff880623ed2ee0) 2011-02-06T19:45:35.222278+01:00 phy005 kernel: Stack: 2011-02-06T19:45:35.222281+01:00 phy005 kernel: 00007fe54e400000 00007fe54e400000 00007fe54e400000 ffff88053a0d2388 2011-02-06T19:45:35.222285+01:00 phy005 kernel: <0> ffff88060b9bdaf8 ffffffff81034a15 00007fe54e3fffff 00007fe54e3fffff 2011-02-06T19:45:35.222289+01:00 phy005 kernel: <0> ffff88060b9bdb44 ffff880b94384560 ffff880bff06eca8 ffff880bff06d7f8 2011-02-06T19:45:35.222292+01:00 phy005 kernel: Call Trace: 2011-02-06T19:45:35.222296+01:00 phy005 kernel: [<ffffffff81034a15>] gup_pud_range+0x156/0x192 2011-02-06T19:45:35.222300+01:00 phy005 kernel: [<ffffffff81034b15>] get_user_pages_fast+0xc4/0x172 2011-02-06T19:45:35.222304+01:00 phy005 kernel: [<ffffffff81131fbc>] ? bio_add_page+0x36/0x38 2011-02-06T19:45:35.222308+01:00 phy005 kernel: [<ffffffff81134730>] dio_get_page+0x54/0x127 2011-02-06T19:45:35.222312+01:00 phy005 kernel: [<ffffffff81135317>] __blockdev_direct_IO+0x41d/0xa36 2011-02-06T19:45:35.222316+01:00 phy005 kernel: [<ffffffffa0080f69>] ? x86_emulate_insn+0x1ff8/0x2d61 [kvm] 2011-02-06T19:45:35.222320+01:00 phy005 kernel: [<ffffffff8113379b>] blkdev_direct_IO+0x4e/0x50 2011-02-06T19:45:35.222324+01:00 phy005 kernel: [<ffffffff81132c49>] ? 
blkdev_get_blocks+0x0/0x8d 2011-02-06T19:45:35.222328+01:00 phy005 kernel: [<ffffffff810cb516>] generic_file_direct_write+0xed/0x16d 2011-02-06T19:45:35.222331+01:00 phy005 kernel: [<ffffffff810cb72c>] __generic_file_aio_write+0x196/0x281 2011-02-06T19:45:35.222335+01:00 phy005 kernel: [<ffffffff811d5352>] ? file_has_perm+0xa4/0xc6 2011-02-06T19:45:35.222339+01:00 phy005 kernel: [<ffffffff81133043>] ? blkdev_aio_write+0x0/0x69 2011-02-06T19:45:35.222343+01:00 phy005 kernel: [<ffffffff8113306d>] blkdev_aio_write+0x2a/0x69 2011-02-06T19:45:35.222347+01:00 phy005 kernel: [<ffffffff81133043>] ? blkdev_aio_write+0x0/0x69 2011-02-06T19:45:35.222351+01:00 phy005 kernel: [<ffffffff8113d4eb>] aio_rw_vect_retry+0x85/0x18e 2011-02-06T19:45:35.222355+01:00 phy005 kernel: [<ffffffff8113e9b3>] aio_run_iocb+0x77/0x10f 2011-02-06T19:45:35.222359+01:00 phy005 kernel: [<ffffffff8113f508>] do_io_submit+0x558/0x7ce 2011-02-06T19:45:35.222363+01:00 phy005 kernel: [<ffffffff8113f78e>] sys_io_submit+0x10/0x12 2011-02-06T19:45:35.222366+01:00 phy005 kernel: [<ffffffff81009c72>] system_call_fastpath+0x16/0x1b 2011-02-06T19:45:35.222372+01:00 phy005 kernel: Code: 21 d8 49 01 c2 49 8b 3a 49 89 fe 4d 21 ee 4d 21 e6 49 39 ce 75 49 48 89 f8 0f 1f 40 00 48 21 d8 48 c1 e8 0c 48 6b c0 38 4c 01 d8 <66> 83 38 00 48 89 c7 79 04 48 8b 78 10 f0 ff 47 08 49 63 39 48 2011-02-06T19:45:35.222376+01:00 phy005 kernel: RIP [<ffffffff81034880>] gup_pte_range+0x94/0xd3 2011-02-06T19:45:35.222379+01:00 phy005 kernel: RSP <ffff88060b9bda78> 2011-02-06T19:45:35.222382+01:00 phy005 kernel: CR2: ffffea71929180e0 2011-02-06T19:45:35.222386+01:00 phy005 kernel: ---[ end trace beed2b54d0bb8a00 ]--- and 2011-02-06T19:47:15.023129+01:00 phy005 kernel: qemu-kvm: Corrupted page table at address 7fbde15ff64c 2011-02-06T19:47:15.023207+01:00 phy005 kernel: PGD 5ff58a067 PUD 612668067 PMD 5937b7067 PTE 1603a07305008067 2011-02-06T19:47:15.023214+01:00 phy005 kernel: Bad pagetable: 000d [#2] SMP 
2011-02-06T19:47:15.023219+01:00 phy005 kernel: last sysfs file: /sys/devices/pci0000:00/0000:00:09.0/0000:05:00.0/host0/scsi_host/host0/stats 2011-02-06T19:47:15.023226+01:00 phy005 kernel: CPU 13 2011-02-06T19:47:15.023232+01:00 phy005 kernel: Modules linked in: tun ipmi_devintf ipmi_si ipmi_msghandler bridge 8021q garp stp llc bonding xt_comment xt_recent ip6t_REJECT nf_conntrack_ipv6 ip6table_filter ip6_tables ipv6 kvm_intel kvm i2c_i801 i2c_core iTCO_wdt serio_raw igb iTCO_vendor_support joydev ioatdma dca 3w_9xxx [last unloaded: scsi_wait_scan] 2011-02-06T19:47:15.023236+01:00 phy005 kernel: 2011-02-06T19:47:15.023239+01:00 phy005 kernel: Pid: 3387, comm: qemu-kvm Tainted: G D 2.6.34.7-66.tilaa.fc13.x86_64 #1 X8DTU/X8DTU 2011-02-06T19:47:15.023244+01:00 phy005 kernel: RIP: 0033:[<00000000004abb73>] [<00000000004abb73>] 0x4abb73 2011-02-06T19:47:15.023247+01:00 phy005 kernel: RSP: 002b:00007fbdf3c00680 EFLAGS: 00010206 2011-02-06T19:47:15.023251+01:00 phy005 kernel: RAX: 00007fbde15ff000 RBX: 000000000000064c RCX: 0000000001abe968 2011-02-06T19:47:15.023254+01:00 phy005 kernel: RDX: 0000000001abe850 RSI: 0000000000000000 RDI: 000000003d600000 2011-02-06T19:47:15.023257+01:00 phy005 kernel: RBP: 0000000001f2ab00 R08: 0000000000000003 R09: 0000000002000000 2011-02-06T19:47:15.023260+01:00 phy005 kernel: R10: 000000000000c050 R11: 00007fbdec000818 R12: 0000000000000025 2011-02-06T19:47:15.023269+01:00 phy005 kernel: R13: 0000000000000003 R14: 000000003d600640 R15: 0000000000000000 2011-02-06T19:47:15.023273+01:00 phy005 kernel: FS: 00007fbdf3c01700(0000) GS:ffff8806554a0000(0000) knlGS:0000000000000000 2011-02-06T19:47:15.023276+01:00 phy005 kernel: CS: 0010 DS: 002b ES: 002b CR0: 0000000080050033 2011-02-06T19:47:15.023280+01:00 phy005 kernel: CR2: 00007fbde15ff64c CR3: 0000000606858000 CR4: 00000000000026e0 2011-02-06T19:47:15.023283+01:00 phy005 kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 2011-02-06T19:47:15.023286+01:00 phy005 
kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 2011-02-06T19:47:15.023290+01:00 phy005 kernel: Process qemu-kvm (pid: 3387, threadinfo ffff88060689e000, task ffff8805ff5a9770) 2011-02-06T19:47:15.023294+01:00 phy005 kernel: 2011-02-06T19:47:15.023296+01:00 phy005 kernel: RIP [<00000000004abb73>] 0x4abb73 2011-02-06T19:47:15.023298+01:00 phy005 kernel: RSP <00007fbdf3c00680> 2011-02-06T19:47:15.023300+01:00 phy005 kernel: ---[ end trace beed2b54d0bb8a01 ]--- followed by 2011-02-06T21:20:32.882972+01:00 phy005 kernel: BUG: unable to handle kernel paging request at fffff6b192918010 2011-02-06T21:20:32.883252+01:00 phy005 kernel: IP: [<ffffffffa0078826>] kvm_mmu_zap_page+0x28a/0x299 [kvm] 2011-02-06T21:20:32.883259+01:00 phy005 kernel: PGD 0 2011-02-06T21:20:32.883263+01:00 phy005 kernel: Oops: 0000 [#5] SMP 2011-02-06T21:20:32.883267+01:00 phy005 kernel: last sysfs file: /sys/devices/system/cpu/cpu15/topology/thread_siblings 2011-02-06T21:20:32.883271+01:00 phy005 kernel: CPU 8 2011-02-06T21:20:32.883278+01:00 phy005 kernel: Modules linked in: tun ipmi_devintf ipmi_si ipmi_msghandler bridge 8021q garp stp llc bonding xt_comment xt_recent ip6t_REJECT nf_conntrack_ipv6 ip6table_filter ip6_tables ipv6 kvm_intel kvm i 2c_i801 i2c_core iTCO_wdt serio_raw igb iTCO_vendor_support joydev ioatdma dca 3w_9xxx [last unloaded: scsi_wait_scan] 2011-02-06T21:20:32.883286+01:00 phy005 kernel: 2011-02-06T21:20:32.883290+01:00 phy005 kernel: Pid: 13247, comm: qemu-kvm Tainted: G D 2.6.34.7-66.tilaa.fc13.x 86_64 #1 X8DTU/X8DTU 2011-02-06T21:20:32.883295+01:00 phy005 kernel: RIP: 0010:[<ffffffffa0078826>] [<ffffffffa0078826>] kvm_mmu_zap_page+0x28a/0x299 [kvm] 2011-02-06T21:20:32.883300+01:00 phy005 kernel: RSP: 0018:ffff880312bdfb58 EFLAGS: 00010206 2011-02-06T21:20:32.883303+01:00 phy005 kernel: RAX: 00000cb192918000 RBX: ffff8802d16ae210 RCX: 0000000000000000 2011-02-06T21:20:32.883307+01:00 phy005 kernel: RDX: ffffea0000000000 RSI: ffff88060bb07ff8 RDI: 
0000000000000200 2011-02-06T21:20:32.883311+01:00 phy005 kernel: RBP: ffff880312bdfb88 R08: dead000000100100 R09: 0000000000000004 2011-02-06T21:20:32.883315+01:00 phy005 kernel: R10: 0000000000000000 R11: 0000000000000010 R12: ffff880853ae0000 2011-02-06T21:20:32.883319+01:00 phy005 kernel: R13: ffff88060bb07ff8 R14: 00000000000001ff R15: 0000000000000000 2011-02-06T21:20:32.883323+01:00 phy005 kernel: FS: 0000000000000000(0000) GS:ffff880002080000(0000) knlGS:0000000000000000 2011-02-06T21:20:32.883327+01:00 phy005 kernel: CS: 0010 DS: 002b ES: 002b CR0: 000000008005003b 2011-02-06T21:20:32.883331+01:00 phy005 kernel: CR2: fffff6b192918010 CR3: 0000000001a42000 CR4: 00000000000026e0 2011-02-06T21:20:32.883335+01:00 phy005 kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 2011-02-06T21:20:32.883338+01:00 phy005 kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 2011-02-06T21:20:32.883343+01:00 phy005 kernel: Process qemu-kvm (pid: 13247, threadinfo ffff880312bde000, task ffff880268ad8000) 2011-02-06T21:20:32.883347+01:00 phy005 kernel: Stack: 2011-02-06T21:20:32.883351+01:00 phy005 kernel: 0000000000000002 ffff880853ae0000 ffff8802d16ae160 ffff880853ae2328 2011-02-06T21:20:32.883355+01:00 phy005 kernel: <0> ffff880c22d426e8 ffff880268ad8000 ffff880312bdfbb8 ffffffffa0078a42 2011-02-06T21:20:32.883358+01:00 phy005 kernel: <0> ffffea00134a16c8 ffff880853ae0000 ffff880853ae0000 0000000000000001 2011-02-06T21:20:32.883362+01:00 phy005 kernel: Call Trace: 2011-02-06T21:20:32.883366+01:00 phy005 kernel: [<ffffffffa0078a42>] kvm_mmu_zap_all+0x35/0x60 [kvm] 2011-02-06T21:20:32.883371+01:00 phy005 kernel: [<ffffffffa006dcde>] kvm_arch_flush_shadow+0x16/0x22 [kvm] 2011-02-06T21:20:32.883375+01:00 phy005 kernel: [<ffffffffa0063b0a>] kvm_mmu_notifier_release+0x31/0x44 [kvm] 2011-02-06T21:20:32.883379+01:00 phy005 kernel: [<ffffffff810fac37>] __mmu_notifier_release+0x4f/0x7b 2011-02-06T21:20:32.883383+01:00 phy005 kernel: 
[<ffffffff810e735d>] exit_mmap+0x2c/0x132 2011-02-06T21:20:32.883386+01:00 phy005 kernel: [<ffffffff8104ad7a>] mmput+0x5e/0xca 2011-02-06T21:20:32.883390+01:00 phy005 kernel: [<ffffffff8104f0d5>] exit_mm+0x114/0x121 2011-02-06T21:20:32.883394+01:00 phy005 kernel: [<ffffffff81050bf5>] do_exit+0x254/0x752 2011-02-06T21:20:32.883398+01:00 phy005 kernel: [<ffffffff81051174>] do_group_exit+0x81/0xab 2011-02-06T21:20:32.883403+01:00 phy005 kernel: [<ffffffff8105e5cd>] get_signal_to_deliver+0x3a6/0x3c8 2011-02-06T21:20:32.883406+01:00 phy005 kernel: [<ffffffff81009038>] do_signal+0x72/0x6b8 2011-02-06T21:20:32.883410+01:00 phy005 kernel: [<ffffffff8111aa2f>] ? vfs_ioctl+0x32/0xa6 2011-02-06T21:20:32.883413+01:00 phy005 kernel: [<ffffffff8111afa2>] ? do_vfs_ioctl+0x483/0x4c9 2011-02-06T21:20:32.883416+01:00 phy005 kernel: [<ffffffff810096a6>] do_notify_resume+0x28/0x86 2011-02-06T21:20:32.883420+01:00 phy005 kernel: [<ffffffff81009f3e>] int_signal+0x12/0x17 2011-02-06T21:20:32.883426+01:00 phy005 kernel: Code: 41 5e 44 89 f8 41 5f c9 c3 48 ba 00 f0 ff ff ff ff 0f 00 4c 89 ee 48 21 d0 48 ba 00 00 00 00 00 ea ff ff 48 c1 e8 0c 48 6b c0 38 <48> 8b 7c 10 10 e8 a3 f3 ff ff e9 06 fe ff ff 55 48 89 e5 41 57 2011-02-06T21:20:32.883431+01:00 phy005 kernel: RIP [<ffffffffa0078826>] kvm_mmu_zap_page+0x28a/0x299 [kvm] 2011-02-06T21:20:32.883434+01:00 phy005 kernel: RSP <ffff880312bdfb58> 2011-02-06T21:20:32.883437+01:00 phy005 kernel: CR2: fffff6b192918010 2011-02-06T21:20:32.883441+01:00 phy005 kernel: ---[ end trace beed2b54d0bb8a04 ]--- 2011-02-06T21:20:32.883444+01:00 phy005 kernel: Fixing recursive fault but reboot is needed! after which we rebooted the machine and replaced the motherboard and cpus (we already replaced the memory before). 
But 2 days ago we got this oops: 2011-02-08T15:56:19.902104+01:00 phy005 kernel: BUG: unable to handle kernel paging request at ffffea71929181c0 2011-02-08T15:56:19.902686+01:00 phy005 kernel: IP: [<ffffffff81034880>] gup_pte_range+0x94/0xd3 2011-02-08T15:56:19.902693+01:00 phy005 kernel: PGD 118600067 PUD 0 2011-02-08T15:56:19.902699+01:00 phy005 kernel: Oops: 0000 [#1] SMP 2011-02-08T15:56:19.902703+01:00 phy005 kernel: last sysfs file: /sys/devices/system/cpu/cpu15/cache/index2/shared_cpu_m ap 2011-02-08T15:56:19.902708+01:00 phy005 kernel: CPU 8 2011-02-08T15:56:19.902715+01:00 phy005 kernel: Modules linked in: tun ipmi_devintf ipmi_si ipmi_msghandler bridge 8021q garp stp llc bonding xt_comment xt_recent ip6t_REJECT nf_conntrack_ipv6 ip6table_filter ip6_tables ipv6 kvm_intel kvm i gb i2c_i801 iTCO_wdt ioatdma i2c_core iTCO_vendor_support dca serio_raw joydev 3w_9xxx [last unloaded: scsi_wait_scan] 2011-02-08T15:56:19.902770+01:00 phy005 kernel: 2011-02-08T15:56:19.902775+01:00 phy005 kernel: Pid: 3346, comm: qemu-kvm Not tainted 2.6.34.7-66.tilaa.fc13.x86_64 #1 X 8DTU/X8DTU 2011-02-08T15:56:19.902781+01:00 phy005 kernel: RIP: 0010:[<ffffffff81034880>] [<ffffffff81034880>] gup_pte_range+0x94/ 0xd3 2011-02-08T15:56:19.902785+01:00 phy005 kernel: RSP: 0018:ffff880c21bc1a78 EFLAGS: 00010086 2011-02-08T15:56:19.902789+01:00 phy005 kernel: RAX: ffffea71929181c0 RBX: 00003ffffffff000 RCX: 0000000000000005 2011-02-08T15:56:19.902793+01:00 phy005 kernel: RDX: 00007fa2ca200000 RSI: 00007fa2ca1ff000 RDI: 1603a07305008067 2011-02-08T15:56:19.902797+01:00 phy005 kernel: RBP: ffff880c21bc1a98 R08: ffff88060fdfad60 R09: ffff880c21bc1b44 2011-02-08T15:56:19.902801+01:00 phy005 kernel: R10: ffff88061493fff8 R11: ffffea0000000000 R12: 0000000000000205 2011-02-08T15:56:19.902805+01:00 phy005 kernel: R13: ffffc00000000fff R14: 0000000000000005 R15: 0000000000000000 2011-02-08T15:56:19.902810+01:00 phy005 kernel: FS: 00007fa2d8724700(0000) GS:ffff880002080000(0000) 
knlGS:000000000000 0000 2011-02-08T15:56:19.902820+01:00 phy005 kernel: CS: 0010 DS: 002b ES: 002b CR0: 0000000080050033 2011-02-08T15:56:19.902825+01:00 phy005 kernel: CR2: ffffea71929181c0 CR3: 0000000c231f9000 CR4: 00000000000026e0 2011-02-08T15:56:19.902829+01:00 phy005 kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 2011-02-08T15:56:19.902833+01:00 phy005 kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 2011-02-08T15:56:19.902837+01:00 phy005 kernel: Process qemu-kvm (pid: 3346, threadinfo ffff880c21bc0000, task ffff880c2 264ddc0) 2011-02-08T15:56:19.902841+01:00 phy005 kernel: Stack: 2011-02-08T15:56:19.902844+01:00 phy005 kernel: 00007fa2ca200000 00007fa2ca201000 00007fa2ca201000 ffff880c22c3d280 2011-02-08T15:56:19.902848+01:00 phy005 kernel: <0> ffff880c21bc1af8 ffffffff81034a15 00007fa2ca200fff 00007fa2ca200fff 2011-02-08T15:56:19.902852+01:00 phy005 kernel: <0> ffff880c21bc1b44 ffff88060fdfad60 ffff880c2231a458 ffff880c231f97f8 2011-02-08T15:56:19.902855+01:00 phy005 kernel: Call Trace: 2011-02-08T15:56:19.902859+01:00 phy005 kernel: [<ffffffff81034a15>] gup_pud_range+0x156/0x192 2011-02-08T15:56:19.902863+01:00 phy005 kernel: [<ffffffff81034b15>] get_user_pages_fast+0xc4/0x172 2011-02-08T15:56:19.902867+01:00 phy005 kernel: [<ffffffff81131fbc>] ? bio_add_page+0x36/0x38 2011-02-08T15:56:19.902871+01:00 phy005 kernel: [<ffffffff81134730>] dio_get_page+0x54/0x127 2011-02-08T15:56:19.902875+01:00 phy005 kernel: [<ffffffff81135317>] __blockdev_direct_IO+0x41d/0xa36 2011-02-08T15:56:19.902880+01:00 phy005 kernel: [<ffffffffa008bf69>] ? x86_emulate_insn+0x1ff8/0x2d61 [kvm] 2011-02-08T15:56:19.902884+01:00 phy005 kernel: [<ffffffff8113379b>] blkdev_direct_IO+0x4e/0x50 2011-02-08T15:56:19.902888+01:00 phy005 kernel: [<ffffffff81132c49>] ? 
blkdev_get_blocks+0x0/0x8d 2011-02-08T15:56:19.902892+01:00 phy005 kernel: [<ffffffff810cb516>] generic_file_direct_write+0xed/0x16d 2011-02-08T15:56:19.902896+01:00 phy005 kernel: [<ffffffff810cb72c>] __generic_file_aio_write+0x196/0x281 2011-02-08T15:56:19.902899+01:00 phy005 kernel: [<ffffffff81133043>] ? blkdev_aio_write+0x0/0x69 2011-02-08T15:56:19.902909+01:00 phy005 kernel: [<ffffffff81133043>] ? blkdev_aio_write+0x0/0x69 2011-02-08T15:56:19.902914+01:00 phy005 kernel: [<ffffffff8113d4eb>] aio_rw_vect_retry+0x85/0x18e 2011-02-08T15:56:19.902919+01:00 phy005 kernel: [<ffffffff8113e9b3>] aio_run_iocb+0x77/0x10f 2011-02-08T15:56:19.902923+01:00 phy005 kernel: [<ffffffff8113f508>] do_io_submit+0x558/0x7ce 2011-02-08T15:56:19.902927+01:00 phy005 kernel: [<ffffffff8113f78e>] sys_io_submit+0x10/0x12 2011-02-08T15:56:19.902932+01:00 phy005 kernel: [<ffffffff81009c72>] system_call_fastpath+0x16/0x1b 2011-02-08T15:56:19.902938+01:00 phy005 kernel: Code: 21 d8 49 01 c2 49 8b 3a 49 89 fe 4d 21 ee 4d 21 e6 49 39 ce 75 49 48 89 f8 0f 1f 40 00 48 21 d8 48 c1 e8 0c 48 6b c0 38 4c 01 d8 <66> 83 38 00 48 89 c7 79 04 48 8b 78 10 f0 ff 47 08 49 63 39 48 2011-02-08T15:56:19.903077+01:00 phy005 kernel: RIP [<ffffffff81034880>] gup_pte_range+0x94/0xd3 2011-02-08T15:56:19.903081+01:00 phy005 kernel: RSP <ffff880c21bc1a78> 2011-02-08T15:56:19.903084+01:00 phy005 kernel: CR2: ffffea71929181c0 2011-02-08T15:56:19.903088+01:00 phy005 kernel: ---[ end trace 174c28940e9fd0a7 ]--- and yesterday this one: 2011-02-09T07:40:15.636528+01:00 phy005 kernel: BUG: unable to handle kernel NULL pointer dereference at (null) 2011-02-09T07:40:15.636635+01:00 phy005 kernel: IP: [<ffffffffa0082db8>] gfn_to_rmap+0x20/0x6e [kvm] 2011-02-09T07:40:15.636639+01:00 phy005 kernel: PGD 0 2011-02-09T07:40:15.636643+01:00 phy005 kernel: Oops: 0000 [#3] SMP 2011-02-09T07:40:15.636647+01:00 phy005 kernel: last sysfs file: /sys/devices/system/cpu/cpu15/topology/thread_siblings 2011-02-09T07:40:15.636650+01:00 
phy005 kernel: CPU 2 2011-02-09T07:40:15.636656+01:00 phy005 kernel: Modules linked in: tun ipmi_devintf ipmi_si ipmi_msghandler bridge 8021q garp stp llc bonding xt_comment xt_recent ip6t_REJECT nf_conntrack_ipv6 ip6table_filter ip6_tables ipv6 kvm_intel kvm igb i2c_i801 iTCO_wdt ioatdma i2c_core iTCO_vendor_support dca serio_raw joydev 3w_9xxx [last unloaded: scsi_wait_scan] 2011-02-09T07:40:15.636663+01:00 phy005 kernel: 2011-02-09T07:40:15.636666+01:00 phy005 kernel: Pid: 2572, comm: qemu-kvm Tainted: G D 2.6.34.7-66.tilaa.fc13.x86_64 #1 X8DTU/X8DTU 2011-02-09T07:40:15.636670+01:00 phy005 kernel: RIP: 0010:[<ffffffffa0082db8>] [<ffffffffa0082db8>] gfn_to_rmap+0x20/0x6e [kvm] 2011-02-09T07:40:15.636673+01:00 phy005 kernel: RSP: 0018:ffff88061cbcbcd8 EFLAGS: 00010246 2011-02-09T07:40:15.636677+01:00 phy005 kernel: RAX: 0000000000000000 RBX: 1603a07305004fff RCX: ffff88061cbcbd08 2011-02-09T07:40:15.636680+01:00 phy005 kernel: RDX: 0000000000000023 RSI: 1603a07305004fff RDI: 0000000000000000 2011-02-09T07:40:15.636683+01:00 phy005 kernel: RBP: ffff88061cbcbce8 R08: 0000000000000023 R09: 0000000000000000 2011-02-09T07:40:15.636686+01:00 phy005 kernel: R10: 0000000000000000 R11: ffffffffa0082c7f R12: 0000000000000001 2011-02-09T07:40:15.636689+01:00 phy005 kernel: R13: 0000000000311763 R14: ffff8809b8b01ce0 R15: 0000000000000000 2011-02-09T07:40:15.636692+01:00 phy005 kernel: FS: 0000000000000000(0000) GS:ffff880002040000(0000) knlGS:0000000000000000 2011-02-09T07:40:15.636695+01:00 phy005 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b 2011-02-09T07:40:15.636699+01:00 phy005 kernel: CR2: 0000000000000000 CR3: 0000000001a42000 CR4: 00000000000026e0 2011-02-09T07:40:15.636702+01:00 phy005 kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 2011-02-09T07:40:15.636705+01:00 phy005 kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 2011-02-09T07:40:15.636709+01:00 phy005 kernel: Process qemu-kvm (pid: 2572, 
threadinfo ffff88061cbca000, task ffff88061cf04650) 2011-02-09T07:40:15.636711+01:00 phy005 kernel: Stack: 2011-02-09T07:40:15.636715+01:00 phy005 kernel: ffff88036c471ff8 ffff880c23984000 ffff88061cbcbd18 ffffffffa0082ea9 2011-02-09T07:40:15.636718+01:00 phy005 kernel: <0> ffff8809b8b01ce0 ffff880c23984000 ffff88036c471ff8 00000000000001ff 2011-02-09T07:40:15.636721+01:00 phy005 kernel: <0> ffff88061cbcbd58 ffffffffa008363b 0000000000000200 ffff880c23984000 2011-02-09T07:40:15.636724+01:00 phy005 kernel: Call Trace: 2011-02-09T07:40:15.636728+01:00 phy005 kernel: [<ffffffffa0082ea9>] rmap_remove+0xa3/0x1a0 [kvm] 2011-02-09T07:40:15.636731+01:00 phy005 kernel: [<ffffffffa008363b>] kvm_mmu_zap_page+0x9f/0x299 [kvm] 2011-02-09T07:40:15.636734+01:00 phy005 kernel: [<ffffffffa0083a42>] kvm_mmu_zap_all+0x35/0x60 [kvm] 2011-02-09T07:40:15.636738+01:00 phy005 kernel: [<ffffffffa0078cde>] kvm_arch_flush_shadow+0x16/0x22 [kvm] 2011-02-09T07:40:15.636741+01:00 phy005 kernel: [<ffffffffa006eb0a>] kvm_mmu_notifier_release+0x31/0x44 [kvm] 2011-02-09T07:40:15.636744+01:00 phy005 kernel: [<ffffffff810fac37>] __mmu_notifier_release+0x4f/0x7b 2011-02-09T07:40:15.636748+01:00 phy005 kernel: [<ffffffff810e735d>] exit_mmap+0x2c/0x132 2011-02-09T07:40:15.636751+01:00 phy005 kernel: [<ffffffff8104ad7a>] mmput+0x5e/0xca 2011-02-09T07:40:15.636754+01:00 phy005 kernel: [<ffffffff8104f0d5>] exit_mm+0x114/0x121 2011-02-09T07:40:15.636757+01:00 phy005 kernel: [<ffffffff81050bf5>] do_exit+0x254/0x752 2011-02-09T07:40:15.636760+01:00 phy005 kernel: [<ffffffff8100a60e>] ? 
apic_timer_interrupt+0xe/0x20 2011-02-09T07:40:15.636764+01:00 phy005 kernel: [<ffffffff81051174>] do_group_exit+0x81/0xab 2011-02-09T07:40:15.636767+01:00 phy005 kernel: [<ffffffff810511b5>] sys_exit_group+0x17/0x1b 2011-02-09T07:40:15.636771+01:00 phy005 kernel: [<ffffffff81009c72>] system_call_fastpath+0x16/0x1b 2011-02-09T07:40:15.636777+01:00 phy005 kernel: Code: 88 ff ff ff b8 01 00 00 00 c9 c3 55 48 89 e5 41 54 53 0f 1f 44 00 00 41 89 d4 48 89 f3 e8 7b c7 fe ff 41 83 fc 01 48 89 c7 75 0d <48> 2b 18 48 c1 e3 03 48 03 58 18 eb 39 41 8d 4c 24 ff be 01 00 2011-02-09T07:40:15.636785+01:00 phy005 kernel: RIP [<ffffffffa0082db8>] gfn_to_rmap+0x20/0x6e [kvm] 2011-02-09T07:40:15.636788+01:00 phy005 kernel: RSP <ffff88061cbcbcd8> 2011-02-09T07:40:15.636791+01:00 phy005 kernel: CR2: 0000000000000000 2011-02-09T07:40:15.637743+01:00 phy005 kernel: ---[ end trace 174c28940e9fd0a9 ]--- 2011-02-09T07:40:15.637751+01:00 phy005 kernel: Fixing recursive fault but reboot is needed! So it doesn't seem to be a hardware problem since we replaced all that. Kind regards, Ruben
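Editorial aside, not part of the original message: the two gup_pte_range oopses from 2011-02-06 and 2011-02-08 can be cross-checked against the recurring bad value. The sketch below assumes, from my own reading of the Code: bytes before the faulting instruction (and %rbx,%rax; shr $0xc,%rax; imul $0x38,%rax; add %r11,%rax), that RDI holds the PTE, RBX the PFN mask, R11 the base of the struct page array, and 0x38 the size of one struct page. Under those assumptions the faulting address in CR2 appears to be derived directly from the corrupted PTE. The function names are hypothetical.

```python
# Constants are copied from the register dumps in the oopses; what they
# mean is an assumption based on decoding the Code: bytes by hand.
VMEMMAP_BASE = 0xFFFFEA0000000000  # R11 in both gup_pte_range oopses
PFN_MASK = 0x00003FFFFFFFF000      # RBX in both gup_pte_range oopses
PAGE_SHIFT = 12                    # shr $0xc
STRUCT_PAGE_SIZE = 0x38            # imul $0x38


def struct_page_address(pte: int) -> int:
    """Address of the struct page the kernel would compute for this PTE."""
    pfn = (pte & PFN_MASK) >> PAGE_SHIFT
    return (VMEMMAP_BASE + pfn * STRUCT_PAGE_SIZE) & 0xFFFFFFFFFFFFFFFF


def upper_48_bits(value: int) -> int:
    """Drop the low 16 bits, leaving the part the bad values share."""
    return value >> 16


# (RDI, CR2) pairs taken from the 2011-02-06 and 2011-02-08 oopses.
OOPSES = [
    (0x1603A07305004067, 0xFFFFEA71929180E0),
    (0x1603A07305008067, 0xFFFFEA71929181C0),
]

if __name__ == "__main__":
    for pte, cr2 in OOPSES:
        computed = struct_page_address(pte)
        print(f"pte={pte:#018x} computed={computed:#018x} "
              f"cr2={cr2:#018x} match={computed == cr2}")
        print(f"  upper 48 bits: {upper_48_bits(pte):#014x}")
```

If the decoding is right, both faults come from following a PTE whose upper 48 bits are the same 0x1603a0730500 pattern seen in the earlier traces, so the pattern survived the memory, motherboard and CPU replacement.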
* Re: EPT: Misconfiguration 2011-02-10 15:23 ` Ruben Kerkhof @ 2011-02-13 2:07 ` Ruben Kerkhof 2011-02-13 13:03 ` Avi Kivity 2011-02-13 12:58 ` Avi Kivity 1 sibling, 1 reply; 19+ messages in thread From: Ruben Kerkhof @ 2011-02-13 2:07 UTC (permalink / raw) To: Avi Kivity; +Cc: Marcelo Tosatti, kvm On Thu, Feb 10, 2011 at 16:23, Ruben Kerkhof <ruben@rubenkerkhof.com> wrote: > On Wed, Jan 26, 2011 at 16:00, Ruben Kerkhof <ruben@rubenkerkhof.com> wrote: >> On Wed, Jan 26, 2011 at 10:52, Avi Kivity <avi@redhat.com> wrote: >>> On 01/25/2011 08:29 PM, Ruben Kerkhof wrote: >>>> >>>> > When you say "suddenly", this was with no changes to software and >>>> > hardware? >>>> >>>> The host software and hardware hasn't changed in the two months since >>>> the machine has been running. 2.6.34.7 kernel and qemu-kvm 0.13. >>>> >>>> We host customer vms on it though, so virtual machines come and go. >>>> Various operating systems, a mixture of Linux, FreeBSD and Windows >>>> 2008 R2. We have other machines with the same config without these >>>> problems though. >>> >>> Are those other machines running a similar workload? >> >> Yes, similar, or they're more heavily loaded. >> >> On this machine, about half of the 48GB memory was used for virtual machines. >> >>> The traces look awfully like bad hardware, though that can also be explained >>> by random memory corruption due to a bug. >> >> Yeah, that's what I'm expecting. We already replaced the memory, next >> step is to move the disks over to another server to make sure it's not >> the board or cpu's. 
>> >>>> This time I have a few different messages though: >>>> >>>> 2011-01-25T11:58:50.001208+01:00 phy005 kernel: general protection fault: >>>> 0000 [#1] SMP >>>> >>>> RSI: 0000000000000000 RDI: 1603a07305001568 >>>> >>>> 2011-01-25T11:58:50.001486+01:00 phy005 kernel: Code: ff ff 41 8b 46 >>>> 08 41 29 06 4c 89 e7 57 9d 0f 1f 44 00 00 48 83 c4 18 5b 41 5c 41 5d >>>> 41 5e 41 5f c9 c3 55 48 89 e5 0f 1f 44 00 00<f0> ff 4f 08 0f 94 c0 84 >>>> c0 74 10 85 f6 75 07 e8 63 fe ff ff eb >>> >>> lock decl 0x8(%rdi) >>> >>> %rdi is completely crap, looks like corruption again. Strangely, it is >>> similar to the bad spte from the previous trace: 0x1603a0730500d277. The >>> upper 48 bits are identical, the lower 16 bits are different.: >>>> >>>> 2011-01-25T12:06:32.673937+01:00 phy005 kernel: qemu-kvm: Corrupted >>>> page table at address 7f37b37ff000 >>>> 2011-01-25T12:06:32.673959+01:00 phy005 kernel: PGD c201d1067 PUD >>>> 94e538067 PMD 61e5bf067 PTE 1603a0730500e067 >>> >>> Here are those magic 48 bits again, in the PTE entry. >>>> >>>> 2011-01-25T12:38:49.416943+01:00 phy005 kernel: EPT: Misconfiguration. >>>> 2011-01-25T12:38:49.417518+01:00 phy005 kernel: EPT: GPA: 0x2abff038 >>>> 2011-01-25T12:38:49.417526+01:00 phy005 kernel: >>>> ept_misconfig_inspect_spte: spte 0x5f49e9007 level 4 >>>> 2011-01-25T12:38:49.417532+01:00 phy005 kernel: >>>> ept_misconfig_inspect_spte: spte 0x5db595007 level 3 >>>> 2011-01-25T12:38:49.417553+01:00 phy005 kernel: >>>> ept_misconfig_inspect_spte: spte 0x5d5da7007 level 2 >>>> 2011-01-25T12:38:49.417558+01:00 phy005 kernel: >>>> ept_misconfig_inspect_spte: spte 0x1603a07305006277 level 1 >>> >>> Again. >>> >>>> 2011-01-25T13:16:58.192440+01:00 phy005 kernel: BUG: Bad page map in >>>> process qemu-kvm pte:1603a0730500d067 pmd:61059f067 >>> >>> Again. >>> >>> However, these all came from a single boot, yes? >> >> Correct. >> >>> If so they can be the same >>> corruption. Please collect more traces, with reboots in between. 
> > This machine has been running for a week without problems, but then we > started to get the following oopses again: > > 2011-02-06T19:45:35.221555+01:00 phy005 kernel: BUG: unable to handle > kernel paging request at ffffea71929180e0 > 2011-02-06T19:45:35.222194+01:00 phy005 kernel: IP: > [<ffffffff81034880>] gup_pte_range+0x94/0xd3 > 2011-02-06T19:45:35.222199+01:00 phy005 kernel: PGD 118600067 PUD 0 > 2011-02-06T19:45:35.222203+01:00 phy005 kernel: Oops: 0000 [#1] SMP > 2011-02-06T19:45:35.222221+01:00 phy005 kernel: last sysfs file: > /sys/devices/system/cpu/cpu15/topology/thread_siblings > 2011-02-06T19:45:35.222224+01:00 phy005 kernel: CPU 4 > 2011-02-06T19:45:35.222229+01:00 phy005 kernel: Modules linked in: tun > ipmi_devintf ipmi_si ipmi_msghandler bridge 8021q garp stp llc bonding > xt_comment xt_recent ip6t_REJECT nf_conntrack_ipv6 ip6table_filter > ip6_tables ipv6 kvm_intel kvm i2c_i801 i2c_core iTCO_wdt serio_raw igb > iTCO_vendor_support joydev ioatdma dca 3w_9xxx [last unloaded: > scsi_wait_scan] > 2011-02-06T19:45:35.222231+01:00 phy005 kernel: > 2011-02-06T19:45:35.222233+01:00 phy005 kernel: Pid: 3650, comm: > qemu-kvm Not tainted 2.6.34.7-66.tilaa.fc13.x86_64 #1 X8DTU/X8DTU > 2011-02-06T19:45:35.222236+01:00 phy005 kernel: RIP: > 0010:[<ffffffff81034880>] [<ffffffff81034880>] > gup_pte_range+0x94/0xd3 > 2011-02-06T19:45:35.222239+01:00 phy005 kernel: RSP: > 0018:ffff88060b9bda78 EFLAGS: 00010082 > 2011-02-06T19:45:35.222241+01:00 phy005 kernel: RAX: ffffea71929180e0 > RBX: 00003ffffffff000 RCX: 0000000000000005 > 2011-02-06T19:45:35.222243+01:00 phy005 kernel: RDX: 00007fe54e400000 > RSI: 00007fe54e3ff000 RDI: 1603a07305004067 > 2011-02-06T19:45:35.222245+01:00 phy005 kernel: RBP: ffff88060b9bda98 > R08: ffff880b94384560 R09: ffff88060b9bdb44 > 2011-02-06T19:45:35.222248+01:00 phy005 kernel: R10: ffff880606b2fff8 > R11: ffffea0000000000 R12: 0000000000000205 > 2011-02-06T19:45:35.222251+01:00 phy005 kernel: R13: ffffc00000000fff > R14: 
0000000000000005 R15: 0000000000000000 > 2011-02-06T19:45:35.222255+01:00 phy005 kernel: FS: > 00007fe64cb0e700(0000) GS:ffff880655400000(0000) > knlGS:0000000000000000 > 2011-02-06T19:45:35.222259+01:00 phy005 kernel: CS: 0010 DS: 002b ES: > 002b CR0: 0000000080050033 > 2011-02-06T19:45:35.222263+01:00 phy005 kernel: CR2: ffffea71929180e0 > CR3: 0000000bff06d000 CR4: 00000000000026e0 > 2011-02-06T19:45:35.222267+01:00 phy005 kernel: DR0: 0000000000000000 > DR1: 0000000000000000 DR2: 0000000000000000 > 2011-02-06T19:45:35.222271+01:00 phy005 kernel: DR3: 0000000000000000 > DR6: 00000000ffff0ff0 DR7: 0000000000000400 > 2011-02-06T19:45:35.222274+01:00 phy005 kernel: Process qemu-kvm (pid: > 3650, threadinfo ffff88060b9bc000, task ffff880623ed2ee0) > 2011-02-06T19:45:35.222278+01:00 phy005 kernel: Stack: > 2011-02-06T19:45:35.222281+01:00 phy005 kernel: 00007fe54e400000 > 00007fe54e400000 00007fe54e400000 ffff88053a0d2388 > 2011-02-06T19:45:35.222285+01:00 phy005 kernel: <0> ffff88060b9bdaf8 > ffffffff81034a15 00007fe54e3fffff 00007fe54e3fffff > 2011-02-06T19:45:35.222289+01:00 phy005 kernel: <0> ffff88060b9bdb44 > ffff880b94384560 ffff880bff06eca8 ffff880bff06d7f8 > 2011-02-06T19:45:35.222292+01:00 phy005 kernel: Call Trace: > 2011-02-06T19:45:35.222296+01:00 phy005 kernel: [<ffffffff81034a15>] > gup_pud_range+0x156/0x192 > 2011-02-06T19:45:35.222300+01:00 phy005 kernel: [<ffffffff81034b15>] > get_user_pages_fast+0xc4/0x172 > 2011-02-06T19:45:35.222304+01:00 phy005 kernel: [<ffffffff81131fbc>] ? > bio_add_page+0x36/0x38 > 2011-02-06T19:45:35.222308+01:00 phy005 kernel: [<ffffffff81134730>] > dio_get_page+0x54/0x127 > 2011-02-06T19:45:35.222312+01:00 phy005 kernel: [<ffffffff81135317>] > __blockdev_direct_IO+0x41d/0xa36 > 2011-02-06T19:45:35.222316+01:00 phy005 kernel: [<ffffffffa0080f69>] ? 
> x86_emulate_insn+0x1ff8/0x2d61 [kvm] > 2011-02-06T19:45:35.222320+01:00 phy005 kernel: [<ffffffff8113379b>] > blkdev_direct_IO+0x4e/0x50 > 2011-02-06T19:45:35.222324+01:00 phy005 kernel: [<ffffffff81132c49>] ? > blkdev_get_blocks+0x0/0x8d > 2011-02-06T19:45:35.222328+01:00 phy005 kernel: [<ffffffff810cb516>] > generic_file_direct_write+0xed/0x16d > 2011-02-06T19:45:35.222331+01:00 phy005 kernel: [<ffffffff810cb72c>] > __generic_file_aio_write+0x196/0x281 > 2011-02-06T19:45:35.222335+01:00 phy005 kernel: [<ffffffff811d5352>] ? > file_has_perm+0xa4/0xc6 > 2011-02-06T19:45:35.222339+01:00 phy005 kernel: [<ffffffff81133043>] ? > blkdev_aio_write+0x0/0x69 > 2011-02-06T19:45:35.222343+01:00 phy005 kernel: [<ffffffff8113306d>] > blkdev_aio_write+0x2a/0x69 > 2011-02-06T19:45:35.222347+01:00 phy005 kernel: [<ffffffff81133043>] ? > blkdev_aio_write+0x0/0x69 > 2011-02-06T19:45:35.222351+01:00 phy005 kernel: [<ffffffff8113d4eb>] > aio_rw_vect_retry+0x85/0x18e > 2011-02-06T19:45:35.222355+01:00 phy005 kernel: [<ffffffff8113e9b3>] > aio_run_iocb+0x77/0x10f > 2011-02-06T19:45:35.222359+01:00 phy005 kernel: [<ffffffff8113f508>] > do_io_submit+0x558/0x7ce > 2011-02-06T19:45:35.222363+01:00 phy005 kernel: [<ffffffff8113f78e>] > sys_io_submit+0x10/0x12 > 2011-02-06T19:45:35.222366+01:00 phy005 kernel: [<ffffffff81009c72>] > system_call_fastpath+0x16/0x1b > 2011-02-06T19:45:35.222372+01:00 phy005 kernel: Code: 21 d8 49 01 c2 > 49 8b 3a 49 89 fe 4d 21 ee 4d 21 e6 49 39 ce 75 49 48 89 f8 0f 1f 40 > 00 48 21 d8 48 c1 e8 0c 48 6b c0 38 4c 01 d8 <66> 83 38 00 48 89 c7 79 > 04 48 8b 78 10 f0 ff 47 08 49 63 39 48 > 2011-02-06T19:45:35.222376+01:00 phy005 kernel: RIP > [<ffffffff81034880>] gup_pte_range+0x94/0xd3 > 2011-02-06T19:45:35.222379+01:00 phy005 kernel: RSP <ffff88060b9bda78> > 2011-02-06T19:45:35.222382+01:00 phy005 kernel: CR2: ffffea71929180e0 > 2011-02-06T19:45:35.222386+01:00 phy005 kernel: ---[ end trace > beed2b54d0bb8a00 ]--- > > and > > 2011-02-06T19:47:15.023129+01:00 
phy005 kernel: qemu-kvm: Corrupted > page table at address 7fbde15ff64c > 2011-02-06T19:47:15.023207+01:00 phy005 kernel: PGD 5ff58a067 PUD > 612668067 PMD 5937b7067 PTE 1603a07305008067 > 2011-02-06T19:47:15.023214+01:00 phy005 kernel: Bad pagetable: 000d [#2] SMP > 2011-02-06T19:47:15.023219+01:00 phy005 kernel: last sysfs file: > /sys/devices/pci0000:00/0000:00:09.0/0000:05:00.0/host0/scsi_host/host0/stats > 2011-02-06T19:47:15.023226+01:00 phy005 kernel: CPU 13 > 2011-02-06T19:47:15.023232+01:00 phy005 kernel: Modules linked in: tun > ipmi_devintf ipmi_si ipmi_msghandler bridge 8021q garp stp llc bonding > xt_comment xt_recent ip6t_REJECT nf_conntrack_ipv6 ip6table_filter > ip6_tables ipv6 kvm_intel kvm i2c_i801 i2c_core iTCO_wdt serio_raw igb > iTCO_vendor_support joydev ioatdma dca 3w_9xxx [last unloaded: > scsi_wait_scan] > 2011-02-06T19:47:15.023236+01:00 phy005 kernel: > 2011-02-06T19:47:15.023239+01:00 phy005 kernel: Pid: 3387, comm: > qemu-kvm Tainted: G D 2.6.34.7-66.tilaa.fc13.x86_64 #1 > X8DTU/X8DTU > 2011-02-06T19:47:15.023244+01:00 phy005 kernel: RIP: > 0033:[<00000000004abb73>] [<00000000004abb73>] 0x4abb73 > 2011-02-06T19:47:15.023247+01:00 phy005 kernel: RSP: > 002b:00007fbdf3c00680 EFLAGS: 00010206 > 2011-02-06T19:47:15.023251+01:00 phy005 kernel: RAX: 00007fbde15ff000 > RBX: 000000000000064c RCX: 0000000001abe968 > 2011-02-06T19:47:15.023254+01:00 phy005 kernel: RDX: 0000000001abe850 > RSI: 0000000000000000 RDI: 000000003d600000 > 2011-02-06T19:47:15.023257+01:00 phy005 kernel: RBP: 0000000001f2ab00 > R08: 0000000000000003 R09: 0000000002000000 > 2011-02-06T19:47:15.023260+01:00 phy005 kernel: R10: 000000000000c050 > R11: 00007fbdec000818 R12: 0000000000000025 > 2011-02-06T19:47:15.023269+01:00 phy005 kernel: R13: 0000000000000003 > R14: 000000003d600640 R15: 0000000000000000 > 2011-02-06T19:47:15.023273+01:00 phy005 kernel: FS: > 00007fbdf3c01700(0000) GS:ffff8806554a0000(0000) > knlGS:0000000000000000 > 2011-02-06T19:47:15.023276+01:00 phy005 
kernel: CS: 0010 DS: 002b ES: > 002b CR0: 0000000080050033 > 2011-02-06T19:47:15.023280+01:00 phy005 kernel: CR2: 00007fbde15ff64c > CR3: 0000000606858000 CR4: 00000000000026e0 > 2011-02-06T19:47:15.023283+01:00 phy005 kernel: DR0: 0000000000000000 > DR1: 0000000000000000 DR2: 0000000000000000 > 2011-02-06T19:47:15.023286+01:00 phy005 kernel: DR3: 0000000000000000 > DR6: 00000000ffff0ff0 DR7: 0000000000000400 > 2011-02-06T19:47:15.023290+01:00 phy005 kernel: Process qemu-kvm (pid: > 3387, threadinfo ffff88060689e000, task ffff8805ff5a9770) > 2011-02-06T19:47:15.023294+01:00 phy005 kernel: > 2011-02-06T19:47:15.023296+01:00 phy005 kernel: RIP > [<00000000004abb73>] 0x4abb73 > 2011-02-06T19:47:15.023298+01:00 phy005 kernel: RSP <00007fbdf3c00680> > 2011-02-06T19:47:15.023300+01:00 phy005 kernel: ---[ end trace > beed2b54d0bb8a01 ]--- > > followed by > > 2011-02-06T21:20:32.882972+01:00 phy005 kernel: BUG: unable to handle > kernel paging request at fffff6b192918010 > 2011-02-06T21:20:32.883252+01:00 phy005 kernel: IP: > [<ffffffffa0078826>] kvm_mmu_zap_page+0x28a/0x299 [kvm] > 2011-02-06T21:20:32.883259+01:00 phy005 kernel: PGD 0 > 2011-02-06T21:20:32.883263+01:00 phy005 kernel: Oops: 0000 [#5] SMP > 2011-02-06T21:20:32.883267+01:00 phy005 kernel: last sysfs file: > /sys/devices/system/cpu/cpu15/topology/thread_siblings > 2011-02-06T21:20:32.883271+01:00 phy005 kernel: CPU 8 > 2011-02-06T21:20:32.883278+01:00 phy005 kernel: Modules linked in: tun > ipmi_devintf ipmi_si ipmi_msghandler bridge 8021q > garp stp llc bonding xt_comment xt_recent ip6t_REJECT > nf_conntrack_ipv6 ip6table_filter ip6_tables ipv6 kvm_intel kvm i > 2c_i801 i2c_core iTCO_wdt serio_raw igb iTCO_vendor_support joydev > ioatdma dca 3w_9xxx [last unloaded: scsi_wait_scan] > 2011-02-06T21:20:32.883286+01:00 phy005 kernel: > 2011-02-06T21:20:32.883290+01:00 phy005 kernel: Pid: 13247, comm: > qemu-kvm Tainted: G D 2.6.34.7-66.tilaa.fc13.x > 86_64 #1 X8DTU/X8DTU > 2011-02-06T21:20:32.883295+01:00 phy005 
kernel: RIP: > 0010:[<ffffffffa0078826>] [<ffffffffa0078826>] > kvm_mmu_zap_page+0x28a/0x299 [kvm] > 2011-02-06T21:20:32.883300+01:00 phy005 kernel: RSP: > 0018:ffff880312bdfb58 EFLAGS: 00010206 > 2011-02-06T21:20:32.883303+01:00 phy005 kernel: RAX: 00000cb192918000 > RBX: ffff8802d16ae210 RCX: 0000000000000000 > 2011-02-06T21:20:32.883307+01:00 phy005 kernel: RDX: ffffea0000000000 > RSI: ffff88060bb07ff8 RDI: 0000000000000200 > 2011-02-06T21:20:32.883311+01:00 phy005 kernel: RBP: ffff880312bdfb88 > R08: dead000000100100 R09: 0000000000000004 > 2011-02-06T21:20:32.883315+01:00 phy005 kernel: R10: 0000000000000000 > R11: 0000000000000010 R12: ffff880853ae0000 > 2011-02-06T21:20:32.883319+01:00 phy005 kernel: R13: ffff88060bb07ff8 > R14: 00000000000001ff R15: 0000000000000000 > 2011-02-06T21:20:32.883323+01:00 phy005 kernel: FS: > 0000000000000000(0000) GS:ffff880002080000(0000) > knlGS:0000000000000000 > 2011-02-06T21:20:32.883327+01:00 phy005 kernel: CS: 0010 DS: 002b ES: > 002b CR0: 000000008005003b > 2011-02-06T21:20:32.883331+01:00 phy005 kernel: CR2: fffff6b192918010 > CR3: 0000000001a42000 CR4: 00000000000026e0 > 2011-02-06T21:20:32.883335+01:00 phy005 kernel: DR0: 0000000000000000 > DR1: 0000000000000000 DR2: 0000000000000000 > 2011-02-06T21:20:32.883338+01:00 phy005 kernel: DR3: 0000000000000000 > DR6: 00000000ffff0ff0 DR7: 0000000000000400 > 2011-02-06T21:20:32.883343+01:00 phy005 kernel: Process qemu-kvm (pid: > 13247, threadinfo ffff880312bde000, task ffff880268ad8000) > 2011-02-06T21:20:32.883347+01:00 phy005 kernel: Stack: > 2011-02-06T21:20:32.883351+01:00 phy005 kernel: 0000000000000002 > ffff880853ae0000 ffff8802d16ae160 ffff880853ae2328 > 2011-02-06T21:20:32.883355+01:00 phy005 kernel: <0> ffff880c22d426e8 > ffff880268ad8000 ffff880312bdfbb8 ffffffffa0078a42 > 2011-02-06T21:20:32.883358+01:00 phy005 kernel: <0> ffffea00134a16c8 > ffff880853ae0000 ffff880853ae0000 0000000000000001 > 2011-02-06T21:20:32.883362+01:00 phy005 kernel: Call Trace: > 
2011-02-06T21:20:32.883366+01:00 phy005 kernel: [<ffffffffa0078a42>] > kvm_mmu_zap_all+0x35/0x60 [kvm] > 2011-02-06T21:20:32.883371+01:00 phy005 kernel: [<ffffffffa006dcde>] > kvm_arch_flush_shadow+0x16/0x22 [kvm] > 2011-02-06T21:20:32.883375+01:00 phy005 kernel: [<ffffffffa0063b0a>] > kvm_mmu_notifier_release+0x31/0x44 [kvm] > 2011-02-06T21:20:32.883379+01:00 phy005 kernel: [<ffffffff810fac37>] > __mmu_notifier_release+0x4f/0x7b > 2011-02-06T21:20:32.883383+01:00 phy005 kernel: [<ffffffff810e735d>] > exit_mmap+0x2c/0x132 > 2011-02-06T21:20:32.883386+01:00 phy005 kernel: [<ffffffff8104ad7a>] > mmput+0x5e/0xca > 2011-02-06T21:20:32.883390+01:00 phy005 kernel: [<ffffffff8104f0d5>] > exit_mm+0x114/0x121 > 2011-02-06T21:20:32.883394+01:00 phy005 kernel: [<ffffffff81050bf5>] > do_exit+0x254/0x752 > 2011-02-06T21:20:32.883398+01:00 phy005 kernel: [<ffffffff81051174>] > do_group_exit+0x81/0xab > 2011-02-06T21:20:32.883403+01:00 phy005 kernel: [<ffffffff8105e5cd>] > get_signal_to_deliver+0x3a6/0x3c8 > 2011-02-06T21:20:32.883406+01:00 phy005 kernel: [<ffffffff81009038>] > do_signal+0x72/0x6b8 > 2011-02-06T21:20:32.883410+01:00 phy005 kernel: [<ffffffff8111aa2f>] ? > vfs_ioctl+0x32/0xa6 > 2011-02-06T21:20:32.883413+01:00 phy005 kernel: [<ffffffff8111afa2>] ? 
> do_vfs_ioctl+0x483/0x4c9 > 2011-02-06T21:20:32.883416+01:00 phy005 kernel: [<ffffffff810096a6>] > do_notify_resume+0x28/0x86 > 2011-02-06T21:20:32.883420+01:00 phy005 kernel: [<ffffffff81009f3e>] > int_signal+0x12/0x17 > 2011-02-06T21:20:32.883426+01:00 phy005 kernel: Code: 41 5e 44 89 f8 > 41 5f c9 c3 48 ba 00 f0 ff ff ff ff 0f 00 4c 89 ee 48 21 d0 48 ba 00 > 00 00 00 00 ea ff ff 48 c1 e8 0c 48 6b c0 38 <48> 8b 7c 10 10 e8 a3 f3 > ff ff e9 06 fe ff ff 55 48 89 e5 41 57 > 2011-02-06T21:20:32.883431+01:00 phy005 kernel: RIP > [<ffffffffa0078826>] kvm_mmu_zap_page+0x28a/0x299 [kvm] > 2011-02-06T21:20:32.883434+01:00 phy005 kernel: RSP <ffff880312bdfb58> > 2011-02-06T21:20:32.883437+01:00 phy005 kernel: CR2: fffff6b192918010 > 2011-02-06T21:20:32.883441+01:00 phy005 kernel: ---[ end trace > beed2b54d0bb8a04 ]--- > 2011-02-06T21:20:32.883444+01:00 phy005 kernel: Fixing recursive fault > but reboot is needed! > > after which we rebooted the machine and replaced the motherboard and > cpus (we already replaced the memory before). 
> > But 2 days ago we got this oops: > > 2011-02-08T15:56:19.902104+01:00 phy005 kernel: BUG: unable to handle > kernel paging request at ffffea71929181c0 > 2011-02-08T15:56:19.902686+01:00 phy005 kernel: IP: > [<ffffffff81034880>] gup_pte_range+0x94/0xd3 > 2011-02-08T15:56:19.902693+01:00 phy005 kernel: PGD 118600067 PUD 0 > 2011-02-08T15:56:19.902699+01:00 phy005 kernel: Oops: 0000 [#1] SMP > 2011-02-08T15:56:19.902703+01:00 phy005 kernel: last sysfs file: > /sys/devices/system/cpu/cpu15/cache/index2/shared_cpu_m > ap > 2011-02-08T15:56:19.902708+01:00 phy005 kernel: CPU 8 > 2011-02-08T15:56:19.902715+01:00 phy005 kernel: Modules linked in: tun > ipmi_devintf ipmi_si ipmi_msghandler bridge 8021q > garp stp llc bonding xt_comment xt_recent ip6t_REJECT > nf_conntrack_ipv6 ip6table_filter ip6_tables ipv6 kvm_intel kvm i > gb i2c_i801 iTCO_wdt ioatdma i2c_core iTCO_vendor_support dca > serio_raw joydev 3w_9xxx [last unloaded: scsi_wait_scan] > 2011-02-08T15:56:19.902770+01:00 phy005 kernel: > 2011-02-08T15:56:19.902775+01:00 phy005 kernel: Pid: 3346, comm: > qemu-kvm Not tainted 2.6.34.7-66.tilaa.fc13.x86_64 #1 X > 8DTU/X8DTU > 2011-02-08T15:56:19.902781+01:00 phy005 kernel: RIP: > 0010:[<ffffffff81034880>] [<ffffffff81034880>] gup_pte_range+0x94/ > 0xd3 > 2011-02-08T15:56:19.902785+01:00 phy005 kernel: RSP: > 0018:ffff880c21bc1a78 EFLAGS: 00010086 > 2011-02-08T15:56:19.902789+01:00 phy005 kernel: RAX: ffffea71929181c0 > RBX: 00003ffffffff000 RCX: 0000000000000005 > 2011-02-08T15:56:19.902793+01:00 phy005 kernel: RDX: 00007fa2ca200000 > RSI: 00007fa2ca1ff000 RDI: 1603a07305008067 > 2011-02-08T15:56:19.902797+01:00 phy005 kernel: RBP: ffff880c21bc1a98 > R08: ffff88060fdfad60 R09: ffff880c21bc1b44 > 2011-02-08T15:56:19.902801+01:00 phy005 kernel: R10: ffff88061493fff8 > R11: ffffea0000000000 R12: 0000000000000205 > 2011-02-08T15:56:19.902805+01:00 phy005 kernel: R13: ffffc00000000fff > R14: 0000000000000005 R15: 0000000000000000 > 2011-02-08T15:56:19.902810+01:00 
phy005 kernel: FS: > 00007fa2d8724700(0000) GS:ffff880002080000(0000) knlGS:000000000000 > 0000 > 2011-02-08T15:56:19.902820+01:00 phy005 kernel: CS: 0010 DS: 002b ES: > 002b CR0: 0000000080050033 > 2011-02-08T15:56:19.902825+01:00 phy005 kernel: CR2: ffffea71929181c0 > CR3: 0000000c231f9000 CR4: 00000000000026e0 > 2011-02-08T15:56:19.902829+01:00 phy005 kernel: DR0: 0000000000000000 > DR1: 0000000000000000 DR2: 0000000000000000 > 2011-02-08T15:56:19.902833+01:00 phy005 kernel: DR3: 0000000000000000 > DR6: 00000000ffff0ff0 DR7: 0000000000000400 > 2011-02-08T15:56:19.902837+01:00 phy005 kernel: Process qemu-kvm (pid: > 3346, threadinfo ffff880c21bc0000, task ffff880c2 > 264ddc0) > 2011-02-08T15:56:19.902841+01:00 phy005 kernel: Stack: > 2011-02-08T15:56:19.902844+01:00 phy005 kernel: 00007fa2ca200000 > 00007fa2ca201000 00007fa2ca201000 ffff880c22c3d280 > 2011-02-08T15:56:19.902848+01:00 phy005 kernel: <0> ffff880c21bc1af8 > ffffffff81034a15 00007fa2ca200fff 00007fa2ca200fff > 2011-02-08T15:56:19.902852+01:00 phy005 kernel: <0> ffff880c21bc1b44 > ffff88060fdfad60 ffff880c2231a458 ffff880c231f97f8 > 2011-02-08T15:56:19.902855+01:00 phy005 kernel: Call Trace: > 2011-02-08T15:56:19.902859+01:00 phy005 kernel: [<ffffffff81034a15>] > gup_pud_range+0x156/0x192 > 2011-02-08T15:56:19.902863+01:00 phy005 kernel: [<ffffffff81034b15>] > get_user_pages_fast+0xc4/0x172 > 2011-02-08T15:56:19.902867+01:00 phy005 kernel: [<ffffffff81131fbc>] ? > bio_add_page+0x36/0x38 > 2011-02-08T15:56:19.902871+01:00 phy005 kernel: [<ffffffff81134730>] > dio_get_page+0x54/0x127 > 2011-02-08T15:56:19.902875+01:00 phy005 kernel: [<ffffffff81135317>] > __blockdev_direct_IO+0x41d/0xa36 > 2011-02-08T15:56:19.902880+01:00 phy005 kernel: [<ffffffffa008bf69>] ? > x86_emulate_insn+0x1ff8/0x2d61 [kvm] > 2011-02-08T15:56:19.902884+01:00 phy005 kernel: [<ffffffff8113379b>] > blkdev_direct_IO+0x4e/0x50 > 2011-02-08T15:56:19.902888+01:00 phy005 kernel: [<ffffffff81132c49>] ? 
> blkdev_get_blocks+0x0/0x8d > 2011-02-08T15:56:19.902892+01:00 phy005 kernel: [<ffffffff810cb516>] > generic_file_direct_write+0xed/0x16d > 2011-02-08T15:56:19.902896+01:00 phy005 kernel: [<ffffffff810cb72c>] > __generic_file_aio_write+0x196/0x281 > 2011-02-08T15:56:19.902899+01:00 phy005 kernel: [<ffffffff81133043>] ? > blkdev_aio_write+0x0/0x69 > 2011-02-08T15:56:19.902909+01:00 phy005 kernel: [<ffffffff81133043>] ? > blkdev_aio_write+0x0/0x69 > 2011-02-08T15:56:19.902914+01:00 phy005 kernel: [<ffffffff8113d4eb>] > aio_rw_vect_retry+0x85/0x18e > 2011-02-08T15:56:19.902919+01:00 phy005 kernel: [<ffffffff8113e9b3>] > aio_run_iocb+0x77/0x10f > 2011-02-08T15:56:19.902923+01:00 phy005 kernel: [<ffffffff8113f508>] > do_io_submit+0x558/0x7ce > 2011-02-08T15:56:19.902927+01:00 phy005 kernel: [<ffffffff8113f78e>] > sys_io_submit+0x10/0x12 > 2011-02-08T15:56:19.902932+01:00 phy005 kernel: [<ffffffff81009c72>] > system_call_fastpath+0x16/0x1b > 2011-02-08T15:56:19.902938+01:00 phy005 kernel: Code: 21 d8 49 01 c2 > 49 8b 3a 49 89 fe 4d 21 ee 4d 21 e6 49 39 ce 75 49 48 89 f8 0f 1f 40 > 00 48 21 d8 48 c1 e8 0c 48 6b c0 38 4c 01 d8 <66> 83 38 00 48 89 c7 79 > 04 48 8b 78 10 f0 ff 47 08 49 63 39 48 > 2011-02-08T15:56:19.903077+01:00 phy005 kernel: RIP > [<ffffffff81034880>] gup_pte_range+0x94/0xd3 > 2011-02-08T15:56:19.903081+01:00 phy005 kernel: RSP <ffff880c21bc1a78> > 2011-02-08T15:56:19.903084+01:00 phy005 kernel: CR2: ffffea71929181c0 > 2011-02-08T15:56:19.903088+01:00 phy005 kernel: ---[ end trace > 174c28940e9fd0a7 ]--- > > and yesterday this one: > > 2011-02-09T07:40:15.636528+01:00 phy005 kernel: BUG: unable to handle > kernel NULL pointer dereference at (null) > 2011-02-09T07:40:15.636635+01:00 phy005 kernel: IP: > [<ffffffffa0082db8>] gfn_to_rmap+0x20/0x6e [kvm] > 2011-02-09T07:40:15.636639+01:00 phy005 kernel: PGD 0 > 2011-02-09T07:40:15.636643+01:00 phy005 kernel: Oops: 0000 [#3] SMP > 2011-02-09T07:40:15.636647+01:00 phy005 kernel: last sysfs file: > 
/sys/devices/system/cpu/cpu15/topology/thread_siblings > 2011-02-09T07:40:15.636650+01:00 phy005 kernel: CPU 2 > 2011-02-09T07:40:15.636656+01:00 phy005 kernel: Modules linked in: tun > ipmi_devintf ipmi_si ipmi_msghandler bridge 8021q garp stp llc bonding > xt_comment xt_recent ip6t_REJECT nf_conntrack_ipv6 ip6table_filter > ip6_tables ipv6 kvm_intel kvm igb i2c_i801 iTCO_wdt ioatdma i2c_core > iTCO_vendor_support dca serio_raw joydev 3w_9xxx [last unloaded: > scsi_wait_scan] > 2011-02-09T07:40:15.636663+01:00 phy005 kernel: > 2011-02-09T07:40:15.636666+01:00 phy005 kernel: Pid: 2572, comm: > qemu-kvm Tainted: G D 2.6.34.7-66.tilaa.fc13.x86_64 #1 > X8DTU/X8DTU > 2011-02-09T07:40:15.636670+01:00 phy005 kernel: RIP: > 0010:[<ffffffffa0082db8>] [<ffffffffa0082db8>] gfn_to_rmap+0x20/0x6e > [kvm] > 2011-02-09T07:40:15.636673+01:00 phy005 kernel: RSP: > 0018:ffff88061cbcbcd8 EFLAGS: 00010246 > 2011-02-09T07:40:15.636677+01:00 phy005 kernel: RAX: 0000000000000000 > RBX: 1603a07305004fff RCX: ffff88061cbcbd08 > 2011-02-09T07:40:15.636680+01:00 phy005 kernel: RDX: 0000000000000023 > RSI: 1603a07305004fff RDI: 0000000000000000 > 2011-02-09T07:40:15.636683+01:00 phy005 kernel: RBP: ffff88061cbcbce8 > R08: 0000000000000023 R09: 0000000000000000 > 2011-02-09T07:40:15.636686+01:00 phy005 kernel: R10: 0000000000000000 > R11: ffffffffa0082c7f R12: 0000000000000001 > 2011-02-09T07:40:15.636689+01:00 phy005 kernel: R13: 0000000000311763 > R14: ffff8809b8b01ce0 R15: 0000000000000000 > 2011-02-09T07:40:15.636692+01:00 phy005 kernel: FS: > 0000000000000000(0000) GS:ffff880002040000(0000) > knlGS:0000000000000000 > 2011-02-09T07:40:15.636695+01:00 phy005 kernel: CS: 0010 DS: 0000 ES: > 0000 CR0: 000000008005003b > 2011-02-09T07:40:15.636699+01:00 phy005 kernel: CR2: 0000000000000000 > CR3: 0000000001a42000 CR4: 00000000000026e0 > 2011-02-09T07:40:15.636702+01:00 phy005 kernel: DR0: 0000000000000000 > DR1: 0000000000000000 DR2: 0000000000000000 > 2011-02-09T07:40:15.636705+01:00 phy005 
kernel: DR3: 0000000000000000 > DR6: 00000000ffff0ff0 DR7: 0000000000000400 > 2011-02-09T07:40:15.636709+01:00 phy005 kernel: Process qemu-kvm (pid: > 2572, threadinfo ffff88061cbca000, task ffff88061cf04650) > 2011-02-09T07:40:15.636711+01:00 phy005 kernel: Stack: > 2011-02-09T07:40:15.636715+01:00 phy005 kernel: ffff88036c471ff8 > ffff880c23984000 ffff88061cbcbd18 ffffffffa0082ea9 > 2011-02-09T07:40:15.636718+01:00 phy005 kernel: <0> ffff8809b8b01ce0 > ffff880c23984000 ffff88036c471ff8 00000000000001ff > 2011-02-09T07:40:15.636721+01:00 phy005 kernel: <0> ffff88061cbcbd58 > ffffffffa008363b 0000000000000200 ffff880c23984000 > 2011-02-09T07:40:15.636724+01:00 phy005 kernel: Call Trace: > 2011-02-09T07:40:15.636728+01:00 phy005 kernel: [<ffffffffa0082ea9>] > rmap_remove+0xa3/0x1a0 [kvm] > 2011-02-09T07:40:15.636731+01:00 phy005 kernel: [<ffffffffa008363b>] > kvm_mmu_zap_page+0x9f/0x299 [kvm] > 2011-02-09T07:40:15.636734+01:00 phy005 kernel: [<ffffffffa0083a42>] > kvm_mmu_zap_all+0x35/0x60 [kvm] > 2011-02-09T07:40:15.636738+01:00 phy005 kernel: [<ffffffffa0078cde>] > kvm_arch_flush_shadow+0x16/0x22 [kvm] > 2011-02-09T07:40:15.636741+01:00 phy005 kernel: [<ffffffffa006eb0a>] > kvm_mmu_notifier_release+0x31/0x44 [kvm] > 2011-02-09T07:40:15.636744+01:00 phy005 kernel: [<ffffffff810fac37>] > __mmu_notifier_release+0x4f/0x7b > 2011-02-09T07:40:15.636748+01:00 phy005 kernel: [<ffffffff810e735d>] > exit_mmap+0x2c/0x132 > 2011-02-09T07:40:15.636751+01:00 phy005 kernel: [<ffffffff8104ad7a>] > mmput+0x5e/0xca > 2011-02-09T07:40:15.636754+01:00 phy005 kernel: [<ffffffff8104f0d5>] > exit_mm+0x114/0x121 > 2011-02-09T07:40:15.636757+01:00 phy005 kernel: [<ffffffff81050bf5>] > do_exit+0x254/0x752 > 2011-02-09T07:40:15.636760+01:00 phy005 kernel: [<ffffffff8100a60e>] ? 
> apic_timer_interrupt+0xe/0x20 > 2011-02-09T07:40:15.636764+01:00 phy005 kernel: [<ffffffff81051174>] > do_group_exit+0x81/0xab > 2011-02-09T07:40:15.636767+01:00 phy005 kernel: [<ffffffff810511b5>] > sys_exit_group+0x17/0x1b > 2011-02-09T07:40:15.636771+01:00 phy005 kernel: [<ffffffff81009c72>] > system_call_fastpath+0x16/0x1b > 2011-02-09T07:40:15.636777+01:00 phy005 kernel: Code: 88 ff ff ff b8 > 01 00 00 00 c9 c3 55 48 89 e5 41 54 53 0f 1f 44 00 00 41 89 d4 48 89 > f3 e8 7b c7 fe ff 41 83 fc 01 48 89 c7 75 0d <48> 2b 18 48 c1 e3 03 48 > 03 58 18 eb 39 41 8d 4c 24 ff be 01 00 > 2011-02-09T07:40:15.636785+01:00 phy005 kernel: RIP > [<ffffffffa0082db8>] gfn_to_rmap+0x20/0x6e [kvm] > 2011-02-09T07:40:15.636788+01:00 phy005 kernel: RSP <ffff88061cbcbcd8> > 2011-02-09T07:40:15.636791+01:00 phy005 kernel: CR2: 0000000000000000 > 2011-02-09T07:40:15.637743+01:00 phy005 kernel: ---[ end trace > 174c28940e9fd0a9 ]--- > 2011-02-09T07:40:15.637751+01:00 phy005 kernel: Fixing recursive fault > but reboot is needed! > > So it doesn't seem to be a hardware problem since we replaced all that. > > Kind regards, > > Ruben And tonight we had another one of those errors we had a few weeks ago: 2011-02-13T02:56:28.694496+01:00 phy005 kernel: EPT: Misconfiguration. 
2011-02-13T02:56:28.694908+01:00 phy005 kernel: EPT: GPA: 0x2edff000 2011-02-13T02:56:28.694914+01:00 phy005 kernel: ept_misconfig_inspect_spte: spte 0x25602d007 level 4 2011-02-13T02:56:28.694916+01:00 phy005 kernel: ept_misconfig_inspect_spte: spte 0x3df3e2007 level 3 2011-02-13T02:56:28.694919+01:00 phy005 kernel: ept_misconfig_inspect_spte: spte 0x5e90c7007 level 2 2011-02-13T02:56:28.694925+01:00 phy005 kernel: ept_misconfig_inspect_spte: spte 0x1603a0730500d277 level 1 2011-02-13T02:56:28.694928+01:00 phy005 kernel: ept_misconfig_inspect_spte: rsvd_bits = 0x3a00000000000 2011-02-13T02:56:28.694930+01:00 phy005 kernel: ------------[ cut here ]------------ 2011-02-13T02:56:28.694933+01:00 phy005 kernel: WARNING: at arch/x86/kvm/vmx.c:3425 handle_ept_misconfig+0x152/0x1d8 [kvm_intel]() 2011-02-13T02:56:28.694936+01:00 phy005 kernel: Hardware name: X8DTU 2011-02-13T02:56:28.694941+01:00 phy005 kernel: Modules linked in: tun ipmi_devintf ipmi_si ipmi_msghandler bridge 8021q garp stp llc bonding xt_comment xt_recent ip6t_REJECT nf_conntrack_ipv6 ip6table_filter ip6_tables ipv6 kvm_intel kvm i2c_i801 i2c_core iTCO_wdt igb ioatdma dca iTCO_vendor_support joydev serio_raw microcode 3w_9xxx [last unloaded: scsi_wait_scan] 2011-02-13T02:56:28.695004+01:00 phy005 kernel: Pid: 4756, comm: qemu-kvm Not tainted 2.6.34.7-66.tilaa.fc13.x86_64 #1 2011-02-13T02:56:28.695008+01:00 phy005 kernel: Call Trace: 2011-02-13T02:56:28.695013+01:00 phy005 kernel: [<ffffffff8104d11f>] warn_slowpath_common+0x7c/0x94 2011-02-13T02:56:28.695020+01:00 phy005 kernel: [<ffffffff8104d14b>] warn_slowpath_null+0x14/0x16 2011-02-13T02:56:28.695024+01:00 phy005 kernel: [<ffffffffa00c97fb>] handle_ept_misconfig+0x152/0x1d8 [kvm_intel] 2011-02-13T02:56:28.695028+01:00 phy005 kernel: [<ffffffffa00ca401>] vmx_handle_exit+0x204/0x23a [kvm_intel] 2011-02-13T02:56:28.695033+01:00 phy005 kernel: [<ffffffffa0084998>] kvm_arch_vcpu_ioctl_run+0x7cd/0xa74 [kvm] 2011-02-13T02:56:28.695037+01:00 phy005 kernel: 
[<ffffffffa00735ba>] kvm_vcpu_ioctl+0xfd/0x56e [kvm] 2011-02-13T02:56:28.695042+01:00 phy005 kernel: [<ffffffff810feaab>] ? virt_to_head_page+0xe/0x2f 2011-02-13T02:56:28.695046+01:00 phy005 kernel: [<ffffffff810cc6ca>] ? mempool_kfree+0xe/0x10 2011-02-13T02:56:28.695051+01:00 phy005 kernel: [<ffffffff810cc857>] ? mempool_free+0x76/0x7b 2011-02-13T02:56:28.695055+01:00 phy005 kernel: [<ffffffff8111aa2f>] vfs_ioctl+0x32/0xa6 2011-02-13T02:56:28.695060+01:00 phy005 kernel: [<ffffffff8111afa2>] do_vfs_ioctl+0x483/0x4c9 2011-02-13T02:56:28.695065+01:00 phy005 kernel: [<ffffffff8111b03e>] sys_ioctl+0x56/0x79 2011-02-13T02:56:28.695070+01:00 phy005 kernel: [<ffffffff81009c72>] system_call_fastpath+0x16/0x1b 2011-02-13T02:56:28.695074+01:00 phy005 kernel: ---[ end trace d95032626ea304ca ]--- Any help would be much appreciated. It seems very strange that I'm the first one who runs into this. I've found two bugreports which report the same, the first one at https://partner-bugzilla.redhat.com/show_bug.cgi?format=multiple&id=613691, but that's a duplicate of https://partner-bugzilla.redhat.com/show_bug.cgi?id=606131 which I'm not authorized to see... Kind regards, Ruben ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: EPT: Misconfiguration 2011-02-13 2:07 ` Ruben Kerkhof @ 2011-02-13 13:03 ` Avi Kivity 2011-02-13 14:40 ` Ruben Kerkhof 2011-02-15 17:16 ` Marcelo Tosatti 0 siblings, 2 replies; 19+ messages in thread From: Avi Kivity @ 2011-02-13 13:03 UTC (permalink / raw) To: Ruben Kerkhof; +Cc: Marcelo Tosatti, kvm On 02/13/2011 04:07 AM, Ruben Kerkhof wrote: > And tonight we had another one of those errors we had a few weeks ago: > > 2011-02-13T02:56:28.694496+01:00 phy005 kernel: EPT: Misconfiguration. > 2011-02-13T02:56:28.694908+01:00 phy005 kernel: EPT: GPA: 0x2edff000 This GPA indexes into the 511th entry of the spte. Marcelo, does this remind you of https://bugzilla.kernel.org/show_bug.cgi?id=27052 by any chance? > 2011-02-13T02:56:28.694914+01:00 phy005 kernel: > ept_misconfig_inspect_spte: spte 0x25602d007 level 4 > 2011-02-13T02:56:28.694916+01:00 phy005 kernel: > ept_misconfig_inspect_spte: spte 0x3df3e2007 level 3 > 2011-02-13T02:56:28.694919+01:00 phy005 kernel: > ept_misconfig_inspect_spte: spte 0x5e90c7007 level 2 > 2011-02-13T02:56:28.694925+01:00 phy005 kernel: > ept_misconfig_inspect_spte: spte 0x1603a0730500d277 level 1 Magic 1603a073........ pte. 
> 2011-02-13T02:56:28.694928+01:00 phy005 kernel: > ept_misconfig_inspect_spte: rsvd_bits = 0x3a00000000000 > 2011-02-13T02:56:28.694930+01:00 phy005 kernel: ------------[ cut here > ]------------ > 2011-02-13T02:56:28.694933+01:00 phy005 kernel: WARNING: at > arch/x86/kvm/vmx.c:3425 handle_ept_misconfig+0x152/0x1d8 [kvm_intel]() > 2011-02-13T02:56:28.694936+01:00 phy005 kernel: Hardware name: X8DTU > 2011-02-13T02:56:28.694941+01:00 phy005 kernel: Modules linked in: tun > ipmi_devintf ipmi_si ipmi_msghandler bridge 8021q garp stp llc bonding > xt_comment xt_recent ip6t_REJECT nf_conntrack_ipv6 ip6table_filter > ip6_tables ipv6 kvm_intel kvm i2c_i801 i2c_core iTCO_wdt igb ioatdma > dca iTCO_vendor_support joydev serio_raw microcode 3w_9xxx [last > unloaded: scsi_wait_scan] > 2011-02-13T02:56:28.695004+01:00 phy005 kernel: Pid: 4756, comm: > qemu-kvm Not tainted 2.6.34.7-66.tilaa.fc13.x86_64 #1 > 2011-02-13T02:56:28.695008+01:00 phy005 kernel: Call Trace: > 2011-02-13T02:56:28.695013+01:00 phy005 kernel: [<ffffffff8104d11f>] > warn_slowpath_common+0x7c/0x94 > 2011-02-13T02:56:28.695020+01:00 phy005 kernel: [<ffffffff8104d14b>] > warn_slowpath_null+0x14/0x16 > 2011-02-13T02:56:28.695024+01:00 phy005 kernel: [<ffffffffa00c97fb>] > handle_ept_misconfig+0x152/0x1d8 [kvm_intel] > 2011-02-13T02:56:28.695028+01:00 phy005 kernel: [<ffffffffa00ca401>] > vmx_handle_exit+0x204/0x23a [kvm_intel] > 2011-02-13T02:56:28.695033+01:00 phy005 kernel: [<ffffffffa0084998>] > kvm_arch_vcpu_ioctl_run+0x7cd/0xa74 [kvm] > 2011-02-13T02:56:28.695037+01:00 phy005 kernel: [<ffffffffa00735ba>] > kvm_vcpu_ioctl+0xfd/0x56e [kvm] > 2011-02-13T02:56:28.695042+01:00 phy005 kernel: [<ffffffff810feaab>] ? > virt_to_head_page+0xe/0x2f > 2011-02-13T02:56:28.695046+01:00 phy005 kernel: [<ffffffff810cc6ca>] ? > mempool_kfree+0xe/0x10 > 2011-02-13T02:56:28.695051+01:00 phy005 kernel: [<ffffffff810cc857>] ? 
> mempool_free+0x76/0x7b > 2011-02-13T02:56:28.695055+01:00 phy005 kernel: [<ffffffff8111aa2f>] > vfs_ioctl+0x32/0xa6 > 2011-02-13T02:56:28.695060+01:00 phy005 kernel: [<ffffffff8111afa2>] > do_vfs_ioctl+0x483/0x4c9 > 2011-02-13T02:56:28.695065+01:00 phy005 kernel: [<ffffffff8111b03e>] > sys_ioctl+0x56/0x79 > 2011-02-13T02:56:28.695070+01:00 phy005 kernel: [<ffffffff81009c72>] > system_call_fastpath+0x16/0x1b > 2011-02-13T02:56:28.695074+01:00 phy005 kernel: ---[ end trace > d95032626ea304ca ]--- > > Any help would be much appreciated. It seems very strange that I'm the > first one who runs into this. > I've found two bugreports which report the same, the first one at > https://partner-bugzilla.redhat.com/show_bug.cgi?format=multiple&id=613691, > but that's a duplicate of > https://partner-bugzilla.redhat.com/show_bug.cgi?id=606131 which I'm > not authorized to see... These don't appear to be related. Are you running ksm, btw? -- error compiling committee.c: too many arguments to function ^ permalink raw reply [flat|nested] 19+ messages in thread
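For anyone following Avi's remark that the GPA lands on the 511th entry, here is a minimal sketch. It assumes the usual 4-level layout (9 index bits per level above a 12-bit page offset). The helper names are made up for illustration; the three constants are copied from the log quoted above. The last check reflects my reading that the logged rsvd_bits value is the set of reserved bits found set in the level-1 spte; the thread does not confirm that.

```python
def ept_indices(gpa):
    """Split a guest-physical address into table indices, level 4 first."""
    return [(gpa >> (12 + 9 * (level - 1))) & 0x1FF for level in (4, 3, 2, 1)]


def set_bits(value):
    """Return the positions of all bits set in a 64-bit value."""
    return [bit for bit in range(64) if (value >> bit) & 1]


GPA = 0x2EDFF000                # "EPT: GPA: 0x2edff000"
SPTE = 0x1603A0730500D277       # "spte 0x1603a0730500d277 level 1"
RSVD_BITS = 0x3A00000000000     # "rsvd_bits = 0x3a00000000000"

indices = ept_indices(GPA)
print("indices (level 4..1):", indices)
print("reserved bits reported:", set_bits(RSVD_BITS))
print("all reported bits set in spte:", SPTE & RSVD_BITS == RSVD_BITS)
```

The level-1 index comes out as 511, i.e. the last slot of the last-level table page, which matches what Avi describes.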
* Re: EPT: Misconfiguration 2011-02-13 13:03 ` Avi Kivity @ 2011-02-13 14:40 ` Ruben Kerkhof 2011-02-15 17:16 ` Marcelo Tosatti 1 sibling, 0 replies; 19+ messages in thread From: Ruben Kerkhof @ 2011-02-13 14:40 UTC (permalink / raw) To: Avi Kivity; +Cc: Marcelo Tosatti, kvm On Sun, Feb 13, 2011 at 14:03, Avi Kivity <avi@redhat.com> wrote: > On 02/13/2011 04:07 AM, Ruben Kerkhof wrote: >> >> And tonight we had another one of those errors we had a few weeks ago: >> >> 2011-02-13T02:56:28.694496+01:00 phy005 kernel: EPT: Misconfiguration. >> 2011-02-13T02:56:28.694908+01:00 phy005 kernel: EPT: GPA: 0x2edff000 > > This GPA indexes into the 511th entry of the spte. Marcelo, does this > remind you of https://bugzilla.kernel.org/show_bug.cgi?id=27052 by any > chance? > >> 2011-02-13T02:56:28.694914+01:00 phy005 kernel: >> ept_misconfig_inspect_spte: spte 0x25602d007 level 4 >> 2011-02-13T02:56:28.694916+01:00 phy005 kernel: >> ept_misconfig_inspect_spte: spte 0x3df3e2007 level 3 >> 2011-02-13T02:56:28.694919+01:00 phy005 kernel: >> ept_misconfig_inspect_spte: spte 0x5e90c7007 level 2 >> 2011-02-13T02:56:28.694925+01:00 phy005 kernel: >> ept_misconfig_inspect_spte: spte 0x1603a0730500d277 level 1 > > Magic 1603a073........ pte. 
> >> 2011-02-13T02:56:28.694928+01:00 phy005 kernel: >> ept_misconfig_inspect_spte: rsvd_bits = 0x3a00000000000 >> 2011-02-13T02:56:28.694930+01:00 phy005 kernel: ------------[ cut here >> ]------------ >> 2011-02-13T02:56:28.694933+01:00 phy005 kernel: WARNING: at >> arch/x86/kvm/vmx.c:3425 handle_ept_misconfig+0x152/0x1d8 [kvm_intel]() >> 2011-02-13T02:56:28.694936+01:00 phy005 kernel: Hardware name: X8DTU >> 2011-02-13T02:56:28.694941+01:00 phy005 kernel: Modules linked in: tun >> ipmi_devintf ipmi_si ipmi_msghandler bridge 8021q garp stp llc bonding >> xt_comment xt_recent ip6t_REJECT nf_conntrack_ipv6 ip6table_filter >> ip6_tables ipv6 kvm_intel kvm i2c_i801 i2c_core iTCO_wdt igb ioatdma >> dca iTCO_vendor_support joydev serio_raw microcode 3w_9xxx [last >> unloaded: scsi_wait_scan] >> 2011-02-13T02:56:28.695004+01:00 phy005 kernel: Pid: 4756, comm: >> qemu-kvm Not tainted 2.6.34.7-66.tilaa.fc13.x86_64 #1 >> 2011-02-13T02:56:28.695008+01:00 phy005 kernel: Call Trace: >> 2011-02-13T02:56:28.695013+01:00 phy005 kernel: [<ffffffff8104d11f>] >> warn_slowpath_common+0x7c/0x94 >> 2011-02-13T02:56:28.695020+01:00 phy005 kernel: [<ffffffff8104d14b>] >> warn_slowpath_null+0x14/0x16 >> 2011-02-13T02:56:28.695024+01:00 phy005 kernel: [<ffffffffa00c97fb>] >> handle_ept_misconfig+0x152/0x1d8 [kvm_intel] >> 2011-02-13T02:56:28.695028+01:00 phy005 kernel: [<ffffffffa00ca401>] >> vmx_handle_exit+0x204/0x23a [kvm_intel] >> 2011-02-13T02:56:28.695033+01:00 phy005 kernel: [<ffffffffa0084998>] >> kvm_arch_vcpu_ioctl_run+0x7cd/0xa74 [kvm] >> 2011-02-13T02:56:28.695037+01:00 phy005 kernel: [<ffffffffa00735ba>] >> kvm_vcpu_ioctl+0xfd/0x56e [kvm] >> 2011-02-13T02:56:28.695042+01:00 phy005 kernel: [<ffffffff810feaab>] ? >> virt_to_head_page+0xe/0x2f >> 2011-02-13T02:56:28.695046+01:00 phy005 kernel: [<ffffffff810cc6ca>] ? >> mempool_kfree+0xe/0x10 >> 2011-02-13T02:56:28.695051+01:00 phy005 kernel: [<ffffffff810cc857>] ? 
>> mempool_free+0x76/0x7b >> 2011-02-13T02:56:28.695055+01:00 phy005 kernel: [<ffffffff8111aa2f>] >> vfs_ioctl+0x32/0xa6 >> 2011-02-13T02:56:28.695060+01:00 phy005 kernel: [<ffffffff8111afa2>] >> do_vfs_ioctl+0x483/0x4c9 >> 2011-02-13T02:56:28.695065+01:00 phy005 kernel: [<ffffffff8111b03e>] >> sys_ioctl+0x56/0x79 >> 2011-02-13T02:56:28.695070+01:00 phy005 kernel: [<ffffffff81009c72>] >> system_call_fastpath+0x16/0x1b >> 2011-02-13T02:56:28.695074+01:00 phy005 kernel: ---[ end trace >> d95032626ea304ca ]--- >> >> Any help would be much appreciated. It seems very strange that I'm the >> first one who runs into this. >> I've found two bugreports which report the same, the first one at >> >> https://partner-bugzilla.redhat.com/show_bug.cgi?format=multiple&id=613691, >> but that's a duplicate of >> https://partner-bugzilla.redhat.com/show_bug.cgi?id=606131 which I'm >> not authorized to see... > > These don't appear to be related. Are you running ksm, btw? No. Kind regards, Ruben ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: EPT: Misconfiguration 2011-02-13 13:03 ` Avi Kivity 2011-02-13 14:40 ` Ruben Kerkhof @ 2011-02-15 17:16 ` Marcelo Tosatti 2011-02-15 19:04 ` Ruben Kerkhof 1 sibling, 1 reply; 19+ messages in thread From: Marcelo Tosatti @ 2011-02-15 17:16 UTC (permalink / raw) To: Avi Kivity; +Cc: Ruben Kerkhof, kvm On Sun, Feb 13, 2011 at 03:03:40PM +0200, Avi Kivity wrote: > On 02/13/2011 04:07 AM, Ruben Kerkhof wrote: > >And tonight we had another one of those errors we had a few weeks ago: > > > >2011-02-13T02:56:28.694496+01:00 phy005 kernel: EPT: Misconfiguration. > >2011-02-13T02:56:28.694908+01:00 phy005 kernel: EPT: GPA: 0x2edff000 > > This GPA indexes into the 511th entry of the spte. Marcelo, does > this remind you of https://bugzilla.kernel.org/show_bug.cgi?id=27052 > by any chance? This and the others reported. So yes, it looks like something is corrupting memory. Ruben, you can try to boot with the slub_debug=ZFPU kernel option. Is there any reason for not upgrading to FC14? ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: EPT: Misconfiguration 2011-02-15 17:16 ` Marcelo Tosatti @ 2011-02-15 19:04 ` Ruben Kerkhof 2011-02-24 21:15 ` Ruben Kerkhof 0 siblings, 1 reply; 19+ messages in thread From: Ruben Kerkhof @ 2011-02-15 19:04 UTC (permalink / raw) To: Marcelo Tosatti; +Cc: Avi Kivity, kvm Hi Marcelo, On Tue, Feb 15, 2011 at 18:16, Marcelo Tosatti <mtosatti@redhat.com> wrote: > On Sun, Feb 13, 2011 at 03:03:40PM +0200, Avi Kivity wrote: >> On 02/13/2011 04:07 AM, Ruben Kerkhof wrote: >> >And tonight we had another one of those errors we had a few weeks ago: >> > >> >2011-02-13T02:56:28.694496+01:00 phy005 kernel: EPT: Misconfiguration. >> >2011-02-13T02:56:28.694908+01:00 phy005 kernel: EPT: GPA: 0x2edff000 >> >> This GPA indexes into the 511th entry of the spte. Marcelo, does >> this remind you of https://bugzilla.kernel.org/show_bug.cgi?id=27052 >> by any chance? > > This and the others reported. So yes, it looks something is corrupting > memory. Ruben, you can try to boot with slub_debug=ZFPU kernel option. Sure, but not for a while; I'm first moving all my customers off this machine. We've had to reboot it 5 or 6 times in the last couple of weeks. As soon as that's done I'm going to test the hell out of it. Now that we have moved a few of the VMs we don't see any oopses, so either it only triggers under load, or there's a specific guest which is triggering it. > Is there any reason for not upgrading to FC14? I haven't had a reason to upgrade yet; all our other machines are running fine, using the same kernel. Plus I'm still finding lots of issues unrelated to kvm on F14: broken ssh in combination with openldap, ipmi bugs, selinux policy etc. On top of that, it takes a lot of time to test all our images etc. I'll probably skip the F14 kernel and go straight to 2.6.38, since that should bring significant improvements like THP, async pagefaults etc. Kind regards, Ruben ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: EPT: Misconfiguration 2011-02-15 19:04 ` Ruben Kerkhof @ 2011-02-24 21:15 ` Ruben Kerkhof 2011-02-27 10:46 ` Avi Kivity 0 siblings, 1 reply; 19+ messages in thread From: Ruben Kerkhof @ 2011-02-24 21:15 UTC (permalink / raw) To: Marcelo Tosatti; +Cc: Avi Kivity, kvm [-- Attachment #1: Type: text/plain, Size: 2145 bytes --] Hi Marcelo, On Tue, Feb 15, 2011 at 20:04, Ruben Kerkhof <ruben@rubenkerkhof.com> wrote: > Hi Marcelo, > > On Tue, Feb 15, 2011 at 18:16, Marcelo Tosatti <mtosatti@redhat.com> wrote: >> On Sun, Feb 13, 2011 at 03:03:40PM +0200, Avi Kivity wrote: >>> On 02/13/2011 04:07 AM, Ruben Kerkhof wrote: >>> >And tonight we had another one of those errors we had a few weeks ago: >>> > >>> >2011-02-13T02:56:28.694496+01:00 phy005 kernel: EPT: Misconfiguration. >>> >2011-02-13T02:56:28.694908+01:00 phy005 kernel: EPT: GPA: 0x2edff000 >>> >>> This GPA indexes into the 511th entry of the spte. Marcelo, does >>> this remind you of https://bugzilla.kernel.org/show_bug.cgi?id=27052 >>> by any chance? >> >> This and the others reported. So yes, it looks something is corrupting >> memory. Ruben, you can try to boot with slub_debug=ZFPU kernel option. Ok, there are now only 6 vms left on this host, and I've booted it with the slub_debug=ZFPU option. After a few hours, I got the following result: 2011-02-24T21:41:30.818496+01:00 phy005 kernel: ============================================================================= 2011-02-24T21:41:30.818517+01:00 phy005 kernel: BUG kmalloc-2048 (Not tainted): Object padding overwritten 2011-02-24T21:41:30.818523+01:00 phy005 kernel: ----------------------------------------------------------------------------- 2011-02-24T21:41:30.818526+01:00 phy005 kernel: 2011-02-24T21:41:30.818530+01:00 phy005 kernel: INFO: 0xffff8806230752ca-0xffff8806230752cf. 
First byte 0x0 instead of 0x5a 2011-02-24T21:41:30.818534+01:00 phy005 kernel: INFO: Allocated in __netdev_alloc_skb+0x34/0x51 age=2231 cpu=8 pid=0 2011-02-24T21:41:30.818537+01:00 phy005 kernel: INFO: Freed in skb_release_data+0xc9/0xce age=2368 cpu=8 pid=2159 2011-02-24T21:41:30.818541+01:00 phy005 kernel: INFO: Slab 0xffffea00157a9880 objects=15 used=13 fp=0xffff8806230752d0 flags=0x40000000004083 2011-02-24T21:41:30.818545+01:00 phy005 kernel: INFO: Object 0xffff880623074a88 @offset=19080 fp=0xffff8806230752d0 The rest of the output is attached since it's quite large. Kind regards, Ruben [-- Attachment #2: messages.gz --] [-- Type: application/x-gzip, Size: 46355 bytes --] ^ permalink raw reply [flat|nested] 19+ messages in thread
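To see where the overwritten bytes sit relative to the object, the addresses in the report above can simply be subtracted. This is a rough sketch, not output from any kernel tool. The variable names are mine, and the slot stride is an inference: it assumes the fp value in the "INFO: Object" line points at the directly adjacent slot, which the thread does not state (the reported @offset being an exact multiple of that stride is at least consistent with it).

```python
KMALLOC_SIZE = 2048                 # cache named in the report: kmalloc-2048

obj = 0xffff880623074a88            # "INFO: Object 0x... @offset=19080"
obj_offset = 19080
next_free = 0xffff8806230752d0      # "fp=0x..." from the same line
bad_lo = 0xffff8806230752ca         # start of the overwritten range
bad_hi = 0xffff8806230752cf         # end of the overwritten range (inclusive)

# Assumption: fp is the neighbouring slot, so the difference is the slot stride.
stride = next_free - obj
stride_fits_offset = obj_offset % stride == 0

rel_lo = bad_lo - obj               # first bad byte, relative to object start
rel_hi = bad_hi - obj               # last bad byte, relative to object start
length = rel_hi - rel_lo + 1
past_payload = rel_lo - KMALLOC_SIZE
bytes_left_in_slot = stride - 1 - rel_hi

print(f"stride={stride} (offset is a multiple: {stride_fits_offset})")
print(f"bad bytes at object+{rel_lo}..{rel_hi} ({length} bytes), "
      f"{past_payload} bytes past the 2048-byte payload, "
      f"{bytes_left_in_slot} bytes before the next slot")
```

Under that assumption the damaged bytes are the very last bytes of the slot, well past the 2048-byte payload, which fits the "Object padding overwritten" wording.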
* Re: EPT: Misconfiguration 2011-02-24 21:15 ` Ruben Kerkhof @ 2011-02-27 10:46 ` Avi Kivity 2011-03-05 18:57 ` Ruben Kerkhof 0 siblings, 1 reply; 19+ messages in thread From: Avi Kivity @ 2011-02-27 10:46 UTC (permalink / raw) To: Ruben Kerkhof; +Cc: Marcelo Tosatti, kvm, netdev Copying netdev: looks like memory corruption in the networking stack. Archive link: http://www.spinics.net/lists/kvm/msg50651.html (for the attachment). On 02/24/2011 11:15 PM, Ruben Kerkhof wrote: > > > > On Tue, Feb 15, 2011 at 18:16, Marcelo Tosatti<mtosatti@redhat.com> wrote: > > >> This and the others reported. So yes, it looks something is corrupting > >> memory. Ruben, you can try to boot with slub_debug=ZFPU kernel option. > > Ok, there are now only 6 vms left on this host, and I've booted it > with the slub_debug=ZFPU option. > After a few hours, I got the following result: > > 2011-02-24T21:41:30.818496+01:00 phy005 kernel: > ============================================================================= > 2011-02-24T21:41:30.818517+01:00 phy005 kernel: BUG kmalloc-2048 (Not > tainted): Object padding overwritten > 2011-02-24T21:41:30.818523+01:00 phy005 kernel: > ----------------------------------------------------------------------------- > 2011-02-24T21:41:30.818526+01:00 phy005 kernel: > 2011-02-24T21:41:30.818530+01:00 phy005 kernel: INFO: > 0xffff8806230752ca-0xffff8806230752cf. First byte 0x0 instead of 0x5a > 2011-02-24T21:41:30.818534+01:00 phy005 kernel: INFO: Allocated in > __netdev_alloc_skb+0x34/0x51 age=2231 cpu=8 pid=0 > 2011-02-24T21:41:30.818537+01:00 phy005 kernel: INFO: Freed in > skb_release_data+0xc9/0xce age=2368 cpu=8 pid=2159 > 2011-02-24T21:41:30.818541+01:00 phy005 kernel: INFO: Slab > 0xffffea00157a9880 objects=15 used=13 fp=0xffff8806230752d0 > flags=0x40000000004083 > 2011-02-24T21:41:30.818545+01:00 phy005 kernel: INFO: Object > 0xffff880623074a88 @offset=19080 fp=0xffff8806230752d0 > > The rest of the output is attached since it's quite large. 
> > Kind regards, > > Ruben -- error compiling committee.c: too many arguments to function ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: EPT: Misconfiguration 2011-02-27 10:46 ` Avi Kivity @ 2011-03-05 18:57 ` Ruben Kerkhof 0 siblings, 0 replies; 19+ messages in thread From: Ruben Kerkhof @ 2011-03-05 18:57 UTC (permalink / raw) To: Avi Kivity; +Cc: Marcelo Tosatti, kvm, netdev On Sun, Feb 27, 2011 at 11:46, Avi Kivity <avi@redhat.com> wrote: > > Copying netdev: looks like memory corruption in the networking stack. > > Archive link: http://www.spinics.net/lists/kvm/msg50651.html (for the > attachment). There's now only a single guest running on this host (Ubuntu Maverick). I've also upgraded the host kernel to 2.6.38-rc6, and this just happened (after a day or so): 2011-03-05T19:41:58.328866+01:00 phy005 kernel: [85271.656862] BUG kmalloc-2048 (Not tainted): Object padding overwritten 2011-03-05T19:41:58.328870+01:00 phy005 kernel: [85271.656864] ----------------------------------------------------------------------------- 2011-03-05T19:41:58.328875+01:00 phy005 kernel: [85271.656866] 2011-03-05T19:41:58.328885+01:00 phy005 kernel: [85271.656870] INFO: 0xffff880c0d52a960-0xffff880c0d52a967. 
First byte 0x0 instead of 0x5a 2011-03-05T19:41:58.328890+01:00 phy005 kernel: [85271.656880] INFO: Allocated in __netdev_alloc_skb+0x1f/0x3b age=16039 cpu=5 pid=0 2011-03-05T19:41:58.328894+01:00 phy005 kernel: [85271.656886] INFO: Freed in skb_release_data+0xa5/0xaa age=0 cpu=5 pid=1766 2011-03-05T19:41:58.328898+01:00 phy005 kernel: [85271.656890] INFO: Slab 0xffffea002a2ea0c0 objects=15 used=13 fp=0xffff880c0d52a120 flags=0xc00000000040c1 2011-03-05T19:41:58.328902+01:00 phy005 kernel: [85271.656894] INFO: Object 0xffff880c0d52a120 @offset=8480 fp=0xffff880c0d52d2d0 2011-03-05T19:41:58.328905+01:00 phy005 kernel: [85271.656895] 2011-03-05T19:41:58.328909+01:00 phy005 kernel: [85271.656897] Bytes b4 0xffff880c0d52a110: 14 89 12 05 01 00 00 00 5a 5a 5a 5a 5a 5a 5a 5a ........ZZZZZZZZ 2011-03-05T19:41:58.328913+01:00 phy005 kernel: [85271.656909] Object 0xffff880c0d52a120: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk We have a quite complex network stack, two interfaces (igb) attached to bond0, with on top two bridges and on that two vlans. The guest is running a vpn and an IPv6 tunnel. Let me know if more info is needed. Kind regards, Ruben ^ permalink raw reply [flat|nested] 19+ messages in thread
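The hex dump lines in the report above can be read against the poison values the kernel defines in include/linux/poison.h (0x5a POISON_INUSE, 0x6b POISON_FREE, 0xa5 POISON_END). Below is a small sketch that labels runs of bytes accordingly. The parsing helper is made up for illustration and only handles lines shaped like the two quoted here; anything that is not a known poison value is labelled "other", with no claim about what it is.

```python
from itertools import groupby

# Poison values from the kernel's include/linux/poison.h.
POISON_NAMES = {0x5A: "POISON_INUSE", 0x6B: "POISON_FREE", 0xA5: "POISON_END"}


def parse_dump_line(line):
    """Return (address, bytes) for a 'Label 0xADDR: xx xx ...' dump line."""
    header, _, rest = line.partition(": ")
    address = int(header.split()[-1], 16)
    data = bytes(int(tok, 16) for tok in rest.split()[:16])
    return address, data


def runs(data):
    """Collapse the bytes into (label, run_length) pairs."""
    labelled = (POISON_NAMES.get(b, "other") for b in data)
    return [(name, len(list(group))) for name, group in groupby(labelled)]


before = "Bytes b4 0xffff880c0d52a110: 14 89 12 05 01 00 00 00 5a 5a 5a 5a 5a 5a 5a 5a"
obj = "Object 0xffff880c0d52a120: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b"

for line in (before, obj):
    address, data = parse_dump_line(line)
    print(hex(address), runs(data))
```

The object itself still shows the free poison intact in the bytes dumped, so the part shown here was not written to after the free; the reported damage is in the padding further along.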
* Re: EPT: Misconfiguration 2011-02-10 15:23 ` Ruben Kerkhof 2011-02-13 2:07 ` Ruben Kerkhof @ 2011-02-13 12:58 ` Avi Kivity 2011-02-13 14:36 ` Ruben Kerkhof 1 sibling, 1 reply; 19+ messages in thread From: Avi Kivity @ 2011-02-13 12:58 UTC (permalink / raw) To: Ruben Kerkhof; +Cc: Marcelo Tosatti, kvm, Andrea Arcangeli On 02/10/2011 05:23 PM, Ruben Kerkhof wrote: > > This machine has been running for a week without problems, but then we > started to get the following oopses again: > > 2011-02-06T19:45:35.221555+01:00 phy005 kernel: BUG: unable to handle > kernel paging request at ffffea71929180e0 > 2011-02-06T19:45:35.222194+01:00 phy005 kernel: IP: > [<ffffffff81034880>] gup_pte_range+0x94/0xd3 > 2011-02-06T19:45:35.222199+01:00 phy005 kernel: PGD 118600067 PUD 0 > 2011-02-06T19:45:35.222203+01:00 phy005 kernel: Oops: 0000 [#1] SMP > 2011-02-06T19:45:35.222221+01:00 phy005 kernel: last sysfs file: > /sys/devices/system/cpu/cpu15/topology/thread_siblings > 2011-02-06T19:45:35.222224+01:00 phy005 kernel: CPU 4 > 2011-02-06T19:45:35.222229+01:00 phy005 kernel: Modules linked in: tun > ipmi_devintf ipmi_si ipmi_msghandler bridge 8021q garp stp llc bonding > xt_comment xt_recent ip6t_REJECT nf_conntrack_ipv6 ip6table_filter > ip6_tables ipv6 kvm_intel kvm i2c_i801 i2c_core iTCO_wdt serio_raw igb > iTCO_vendor_support joydev ioatdma dca 3w_9xxx [last unloaded: > scsi_wait_scan] > 2011-02-06T19:45:35.222231+01:00 phy005 kernel: > 2011-02-06T19:45:35.222233+01:00 phy005 kernel: Pid: 3650, comm: > qemu-kvm Not tainted 2.6.34.7-66.tilaa.fc13.x86_64 #1 X8DTU/X8DTU > 2011-02-06T19:45:35.222236+01:00 phy005 kernel: RIP: > 0010:[<ffffffff81034880>] [<ffffffff81034880>] > gup_pte_range+0x94/0xd3 > 2011-02-06T19:45:35.222239+01:00 phy005 kernel: RSP: > 0018:ffff88060b9bda78 EFLAGS: 00010082 > 2011-02-06T19:45:35.222241+01:00 phy005 kernel: RAX: ffffea71929180e0 > RBX: 00003ffffffff000 RCX: 0000000000000005 > 2011-02-06T19:45:35.222243+01:00 phy005 kernel: RDX: 00007fe54e400000 > 
RSI: 00007fe54e3ff000 RDI: 1603a07305004067 > 2011-02-06T19:45:35.222245+01:00 phy005 kernel: RBP: ffff88060b9bda98 > R08: ffff880b94384560 R09: ffff88060b9bdb44 > 2011-02-06T19:45:35.222248+01:00 phy005 kernel: R10: ffff880606b2fff8 > R11: ffffea0000000000 R12: 0000000000000205 > 2011-02-06T19:45:35.222251+01:00 phy005 kernel: R13: ffffc00000000fff > R14: 0000000000000005 R15: 0000000000000000 > 2011-02-06T19:45:35.222255+01:00 phy005 kernel: FS: > 00007fe64cb0e700(0000) GS:ffff880655400000(0000) > knlGS:0000000000000000 > 2011-02-06T19:45:35.222259+01:00 phy005 kernel: CS: 0010 DS: 002b ES: > 002b CR0: 0000000080050033 > 2011-02-06T19:45:35.222263+01:00 phy005 kernel: CR2: ffffea71929180e0 > CR3: 0000000bff06d000 CR4: 00000000000026e0 > 2011-02-06T19:45:35.222267+01:00 phy005 kernel: DR0: 0000000000000000 > DR1: 0000000000000000 DR2: 0000000000000000 > 2011-02-06T19:45:35.222271+01:00 phy005 kernel: DR3: 0000000000000000 > DR6: 00000000ffff0ff0 DR7: 0000000000000400 > 2011-02-06T19:45:35.222274+01:00 phy005 kernel: Process qemu-kvm (pid: > 3650, threadinfo ffff88060b9bc000, task ffff880623ed2ee0) > 2011-02-06T19:45:35.222278+01:00 phy005 kernel: Stack: > 2011-02-06T19:45:35.222281+01:00 phy005 kernel: 00007fe54e400000 > 00007fe54e400000 00007fe54e400000 ffff88053a0d2388 > 2011-02-06T19:45:35.222285+01:00 phy005 kernel:<0> ffff88060b9bdaf8 > ffffffff81034a15 00007fe54e3fffff 00007fe54e3fffff > 2011-02-06T19:45:35.222289+01:00 phy005 kernel:<0> ffff88060b9bdb44 > ffff880b94384560 ffff880bff06eca8 ffff880bff06d7f8 > 2011-02-06T19:45:35.222292+01:00 phy005 kernel: Call Trace: > 2011-02-06T19:45:35.222296+01:00 phy005 kernel: [<ffffffff81034a15>] > gup_pud_range+0x156/0x192 > 2011-02-06T19:45:35.222300+01:00 phy005 kernel: [<ffffffff81034b15>] > get_user_pages_fast+0xc4/0x172 > 2011-02-06T19:45:35.222304+01:00 phy005 kernel: [<ffffffff81131fbc>] ? 
> bio_add_page+0x36/0x38 > 2011-02-06T19:45:35.222308+01:00 phy005 kernel: [<ffffffff81134730>] > dio_get_page+0x54/0x127 > 2011-02-06T19:45:35.222312+01:00 phy005 kernel: [<ffffffff81135317>] > __blockdev_direct_IO+0x41d/0xa36 > 2011-02-06T19:45:35.222316+01:00 phy005 kernel: [<ffffffffa0080f69>] ? > x86_emulate_insn+0x1ff8/0x2d61 [kvm] > 2011-02-06T19:45:35.222320+01:00 phy005 kernel: [<ffffffff8113379b>] > blkdev_direct_IO+0x4e/0x50 > 2011-02-06T19:45:35.222324+01:00 phy005 kernel: [<ffffffff81132c49>] ? > blkdev_get_blocks+0x0/0x8d > 2011-02-06T19:45:35.222328+01:00 phy005 kernel: [<ffffffff810cb516>] > generic_file_direct_write+0xed/0x16d > 2011-02-06T19:45:35.222331+01:00 phy005 kernel: [<ffffffff810cb72c>] > __generic_file_aio_write+0x196/0x281 > 2011-02-06T19:45:35.222335+01:00 phy005 kernel: [<ffffffff811d5352>] ? > file_has_perm+0xa4/0xc6 > 2011-02-06T19:45:35.222339+01:00 phy005 kernel: [<ffffffff81133043>] ? > blkdev_aio_write+0x0/0x69 > 2011-02-06T19:45:35.222343+01:00 phy005 kernel: [<ffffffff8113306d>] > blkdev_aio_write+0x2a/0x69 > 2011-02-06T19:45:35.222347+01:00 phy005 kernel: [<ffffffff81133043>] ? 
> blkdev_aio_write+0x0/0x69 > 2011-02-06T19:45:35.222351+01:00 phy005 kernel: [<ffffffff8113d4eb>] > aio_rw_vect_retry+0x85/0x18e > 2011-02-06T19:45:35.222355+01:00 phy005 kernel: [<ffffffff8113e9b3>] > aio_run_iocb+0x77/0x10f > 2011-02-06T19:45:35.222359+01:00 phy005 kernel: [<ffffffff8113f508>] > do_io_submit+0x558/0x7ce > 2011-02-06T19:45:35.222363+01:00 phy005 kernel: [<ffffffff8113f78e>] > sys_io_submit+0x10/0x12 > 2011-02-06T19:45:35.222366+01:00 phy005 kernel: [<ffffffff81009c72>] > system_call_fastpath+0x16/0x1b > 2011-02-06T19:45:35.222372+01:00 phy005 kernel: Code: 21 d8 49 01 c2 > 49 8b 3a 49 89 fe 4d 21 ee 4d 21 e6 49 39 ce 75 49 48 89 f8 0f 1f 40 > 00 48 21 d8 48 c1 e8 0c 48 6b c0 38 4c 01 d8<66> 83 38 00 48 89 c7 79 > 04 48 8b 78 10 f0 ff 47 08 49 63 39 48 > 2011-02-06T19:45:35.222376+01:00 phy005 kernel: RIP > [<ffffffff81034880>] gup_pte_range+0x94/0xd3 > 2011-02-06T19:45:35.222379+01:00 phy005 kernel: RSP<ffff88060b9bda78> > 2011-02-06T19:45:35.222382+01:00 phy005 kernel: CR2: ffffea71929180e0 > 2011-02-06T19:45:35.222386+01:00 phy005 kernel: ---[ end trace > beed2b54d0bb8a00 ]--- > Hm, outside any kvm code. > and > > 2011-02-06T19:47:15.023129+01:00 phy005 kernel: qemu-kvm: Corrupted > page table at address 7fbde15ff64c > 2011-02-06T19:47:15.023207+01:00 phy005 kernel: PGD 5ff58a067 PUD > 612668067 PMD 5937b7067 PE 1603a07305008067 Again outside kvm, and again the magic pte 1603axxxxx. > followed by > > 2011-02-06T21:20:32.882972+01:00 phy005 kernel: BUG: unable to handle > kernel paging request at fffff6b192918010 > 2011-02-06T21:20:32.883252+01:00 phy005 kernel: IP: > [<ffffffffa0078826>] kvm_mmu_zap_page+0x28a/0x299 [kvm] Well, after something goes bad, nothing good can come out of it. > after which we rebooted the machine and replaced the motherboard and > cpus (we already replaced the memory before). 
> > But 2 days ago we got this oops: > > 2011-02-08T15:56:19.902104+01:00 phy005 kernel: BUG: unable to handle > kernel paging request at ffffea71929181c0 > 2011-02-08T15:56:19.902686+01:00 phy005 kernel: IP: > [<ffffffff81034880>] gup_pte_range+0x94/0xd3 > 2011-02-08T15:56:19.902693+01:00 phy005 kernel: PGD 118600067 PUD 0 > 2011-02-08T15:56:19.902699+01:00 phy005 kernel: Oops: 0000 [#1] SMP > 2011-02-08T15:56:19.902703+01:00 phy005 kernel: last sysfs file: > /sys/devices/system/cpu/cpu15/cache/index2/shared_cpu_map > 2011-02-08T15:56:19.902708+01:00 phy005 kernel: CPU 8 > 2011-02-08T15:56:19.902715+01:00 phy005 kernel: Modules linked in: tun > ipmi_devintf ipmi_si ipmi_msghandler bridge 8021q > garp stp llc bonding xt_comment xt_recent ip6t_REJECT > nf_conntrack_ipv6 ip6table_filter ip6_tables ipv6 kvm_intel kvm igb > i2c_i801 iTCO_wdt ioatdma i2c_core iTCO_vendor_support dca > serio_raw joydev 3w_9xxx [last unloaded: scsi_wait_scan] > 2011-02-08T15:56:19.902770+01:00 phy005 kernel: > 2011-02-08T15:56:19.902775+01:00 phy005 kernel: Pid: 3346, comm: > qemu-kvm Not tainted 2.6.34.7-66.tilaa.fc13.x86_64 #1 X8DTU/X8DTU > 2011-02-08T15:56:19.902781+01:00 phy005 kernel: RIP: > 0010:[<ffffffff81034880>] [<ffffffff81034880>] gup_pte_range+0x94/0xd3 > 2011-02-08T15:56:19.902785+01:00 phy005 kernel: RSP: > 0018:ffff880c21bc1a78 EFLAGS: 00010086 > 2011-02-08T15:56:19.902789+01:00 phy005 kernel: RAX: ffffea71929181c0 > RBX: 00003ffffffff000 RCX: 0000000000000005 > 2011-02-08T15:56:19.902793+01:00 phy005 kernel: RDX: 00007fa2ca200000 > RSI: 00007fa2ca1ff000 RDI: 1603a07305008067 > 2011-02-08T15:56:19.902797+01:00 phy005 kernel: RBP: ffff880c21bc1a98 > R08: ffff88060fdfad60 R09: ffff880c21bc1b44 > 2011-02-08T15:56:19.902801+01:00 phy005 kernel: R10: ffff88061493fff8 > R11: ffffea0000000000 R12: 0000000000000205 > 2011-02-08T15:56:19.902805+01:00 phy005 kernel: R13: ffffc00000000fff > R14: 0000000000000005 R15: 0000000000000000 > 2011-02-08T15:56:19.902810+01:00
phy005 kernel: FS: > 00007fa2d8724700(0000) GS:ffff880002080000(0000) knlGS:000000000000 > 0000 > 2011-02-08T15:56:19.902820+01:00 phy005 kernel: CS: 0010 DS: 002b ES: > 002b CR0: 0000000080050033 > 2011-02-08T15:56:19.902825+01:00 phy005 kernel: CR2: ffffea71929181c0 > CR3: 0000000c231f9000 CR4: 00000000000026e0 > 2011-02-08T15:56:19.902829+01:00 phy005 kernel: DR0: 0000000000000000 > DR1: 0000000000000000 DR2: 0000000000000000 > 2011-02-08T15:56:19.902833+01:00 phy005 kernel: DR3: 0000000000000000 > DR6: 00000000ffff0ff0 DR7: 0000000000000400 > 2011-02-08T15:56:19.902837+01:00 phy005 kernel: Process qemu-kvm (pid: > 3346, threadinfo ffff880c21bc0000, task ffff880c2 > 264ddc0) > 2011-02-08T15:56:19.902841+01:00 phy005 kernel: Stack: > 2011-02-08T15:56:19.902844+01:00 phy005 kernel: 00007fa2ca200000 > 00007fa2ca201000 00007fa2ca201000 ffff880c22c3d280 > 2011-02-08T15:56:19.902848+01:00 phy005 kernel:<0> ffff880c21bc1af8 > ffffffff81034a15 00007fa2ca200fff 00007fa2ca200fff > 2011-02-08T15:56:19.902852+01:00 phy005 kernel:<0> ffff880c21bc1b44 > ffff88060fdfad60 ffff880c2231a458 ffff880c231f97f8 > 2011-02-08T15:56:19.902855+01:00 phy005 kernel: Call Trace: > 2011-02-08T15:56:19.902859+01:00 phy005 kernel: [<ffffffff81034a15>] > gup_pud_range+0x156/0x192 > 2011-02-08T15:56:19.902863+01:00 phy005 kernel: [<ffffffff81034b15>] > get_user_pages_fast+0xc4/0x172 > 2011-02-08T15:56:19.902867+01:00 phy005 kernel: [<ffffffff81131fbc>] ? > bio_add_page+0x36/0x38 > 2011-02-08T15:56:19.902871+01:00 phy005 kernel: [<ffffffff81134730>] > dio_get_page+0x54/0x127 > 2011-02-08T15:56:19.902875+01:00 phy005 kernel: [<ffffffff81135317>] > __blockdev_direct_IO+0x41d/0xa36 > 2011-02-08T15:56:19.902880+01:00 phy005 kernel: [<ffffffffa008bf69>] ? > x86_emulate_insn+0x1ff8/0x2d61 [kvm] > 2011-02-08T15:56:19.902884+01:00 phy005 kernel: [<ffffffff8113379b>] > blkdev_direct_IO+0x4e/0x50 > 2011-02-08T15:56:19.902888+01:00 phy005 kernel: [<ffffffff81132c49>] ? 
> blkdev_get_blocks+0x0/0x8d > 2011-02-08T15:56:19.902892+01:00 phy005 kernel: [<ffffffff810cb516>] > generic_file_direct_write+0xed/0x16d > 2011-02-08T15:56:19.902896+01:00 phy005 kernel: [<ffffffff810cb72c>] > __generic_file_aio_write+0x196/0x281 > 2011-02-08T15:56:19.902899+01:00 phy005 kernel: [<ffffffff81133043>] ? > blkdev_aio_write+0x0/0x69 > 2011-02-08T15:56:19.902909+01:00 phy005 kernel: [<ffffffff81133043>] ? > blkdev_aio_write+0x0/0x69 > 2011-02-08T15:56:19.902914+01:00 phy005 kernel: [<ffffffff8113d4eb>] > aio_rw_vect_retry+0x85/0x18e > 2011-02-08T15:56:19.902919+01:00 phy005 kernel: [<ffffffff8113e9b3>] > aio_run_iocb+0x77/0x10f > 2011-02-08T15:56:19.902923+01:00 phy005 kernel: [<ffffffff8113f508>] > do_io_submit+0x558/0x7ce > 2011-02-08T15:56:19.902927+01:00 phy005 kernel: [<ffffffff8113f78e>] > sys_io_submit+0x10/0x12 > 2011-02-08T15:56:19.902932+01:00 phy005 kernel: [<ffffffff81009c72>] > system_call_fastpath+0x16/0x1b > 2011-02-08T15:56:19.902938+01:00 phy005 kernel: Code: 21 d8 49 01 c2 > 49 8b 3a 49 89 fe 4d 21 ee 4d 21 e6 49 39 ce 75 49 48 89 f8 0f 1f 40 > 00 48 21 d8 48 c1 e8 0c 48 6b c0 38 4c 01 d8<66> 83 38 00 48 89 c7 79 > 04 48 8b 78 10 f0 ff 47 08 49 63 39 48 > 2011-02-08T15:56:19.903077+01:00 phy005 kernel: RIP > [<ffffffff81034880>] gup_pte_range+0x94/0xd3 > 2011-02-08T15:56:19.903081+01:00 phy005 kernel: RSP<ffff880c21bc1a78> > 2011-02-08T15:56:19.903084+01:00 phy005 kernel: CR2: ffffea71929181c0 > 2011-02-08T15:56:19.903088+01:00 phy005 kernel: ---[ end trace > 174c28940e9fd0a7 ]--- > Again outside kvm. 
> and yesterday this one: > > 2011-02-09T07:40:15.636528+01:00 phy005 kernel: BUG: unable to handle > kernel NULL pointer dereference at (null) > 2011-02-09T07:40:15.636635+01:00 phy005 kernel: IP: > [<ffffffffa0082db8>] gfn_to_rmap+0x20/0x6e [kvm] > 2011-02-09T07:40:15.636639+01:00 phy005 kernel: PGD 0 > 2011-02-09T07:40:15.636643+01:00 phy005 kernel: Oops: 0000 [#3] SMP > 2011-02-09T07:40:15.636647+01:00 phy005 kernel: last sysfs file: > /sys/devices/system/cpu/cpu15/topology/thread_siblings > 2011-02-09T07:40:15.636650+01:00 phy005 kernel: CPU 2 > 2011-02-09T07:40:15.636656+01:00 phy005 kernel: Modules linked in: tun > ipmi_devintf ipmi_si ipmi_msghandler bridge 8021q garp stp llc bonding > xt_comment xt_recent ip6t_REJECT nf_conntrack_ipv6 ip6table_filter > ip6_tables ipv6 kvm_intel kvm igb i2c_i801 iTCO_wdt ioatdma i2c_core > iTCO_vendor_support dca serio_raw joydev 3w_9xxx [last unloaded: > scsi_wait_scan] > 2011-02-09T07:40:15.636663+01:00 phy005 kernel: > 2011-02-09T07:40:15.636666+01:00 phy005 kernel: Pid: 2572, comm: > qemu-kvm Tainted: G D 2.6.34.7-66.tilaa.fc13.x86_64 #1 > X8DTU/X8DTU > 2011-02-09T07:40:15.636670+01:00 phy005 kernel: RIP: > 0010:[<ffffffffa0082db8>] [<ffffffffa0082db8>] gfn_to_rmap+0x20/0x6e > [kvm] > 2011-02-09T07:40:15.636673+01:00 phy005 kernel: RSP: > 0018:ffff88061cbcbcd8 EFLAGS: 00010246 > 2011-02-09T07:40:15.636677+01:00 phy005 kernel: RAX: 0000000000000000 > RBX: 1603a07305004fff RCX: ffff88061cbcbd08 > 2011-02-09T07:40:15.636680+01:00 phy005 kernel: RDX: 0000000000000023 > RSI: 1603a07305004fff RDI: 0000000000000000 > 2011-02-09T07:40:15.636683+01:00 phy005 kernel: RBP: ffff88061cbcbce8 > R08: 0000000000000023 R09: 0000000000000000 > 2011-02-09T07:40:15.636686+01:00 phy005 kernel: R10: 0000000000000000 > R11: ffffffffa0082c7f R12: 0000000000000001 > 2011-02-09T07:40:15.636689+01:00 phy005 kernel: R13: 0000000000311763 > R14: ffff8809b8b01ce0 R15: 0000000000000000 > 2011-02-09T07:40:15.636692+01:00 phy005 kernel: FS: > 
0000000000000000(0000) GS:ffff880002040000(0000) > knlGS:0000000000000000 > 2011-02-09T07:40:15.636695+01:00 phy005 kernel: CS: 0010 DS: 0000 ES: > 0000 CR0: 000000008005003b > 2011-02-09T07:40:15.636699+01:00 phy005 kernel: CR2: 0000000000000000 > CR3: 0000000001a42000 CR4: 00000000000026e0 > 2011-02-09T07:40:15.636702+01:00 phy005 kernel: DR0: 0000000000000000 > DR1: 0000000000000000 DR2: 0000000000000000 > 2011-02-09T07:40:15.636705+01:00 phy005 kernel: DR3: 0000000000000000 > DR6: 00000000ffff0ff0 DR7: 0000000000000400 > 2011-02-09T07:40:15.636709+01:00 phy005 kernel: Process qemu-kvm (pid: > 2572, threadinfo ffff88061cbca000, task ffff88061cf04650) > 2011-02-09T07:40:15.636711+01:00 phy005 kernel: Stack: > 2011-02-09T07:40:15.636715+01:00 phy005 kernel: ffff88036c471ff8 > ffff880c23984000 ffff88061cbcbd18 ffffffffa0082ea9 > 2011-02-09T07:40:15.636718+01:00 phy005 kernel:<0> ffff8809b8b01ce0 > ffff880c23984000 ffff88036c471ff8 00000000000001ff > 2011-02-09T07:40:15.636721+01:00 phy005 kernel:<0> ffff88061cbcbd58 > ffffffffa008363b 0000000000000200 ffff880c23984000 > 2011-02-09T07:40:15.636724+01:00 phy005 kernel: Call Trace: > 2011-02-09T07:40:15.636728+01:00 phy005 kernel: [<ffffffffa0082ea9>] > rmap_remove+0xa3/0x1a0 [kvm] > 2011-02-09T07:40:15.636731+01:00 phy005 kernel: [<ffffffffa008363b>] > kvm_mmu_zap_page+0x9f/0x299 [kvm] > 2011-02-09T07:40:15.636734+01:00 phy005 kernel: [<ffffffffa0083a42>] > kvm_mmu_zap_all+0x35/0x60 [kvm] > 2011-02-09T07:40:15.636738+01:00 phy005 kernel: [<ffffffffa0078cde>] > kvm_arch_flush_shadow+0x16/0x22 [kvm] > 2011-02-09T07:40:15.636741+01:00 phy005 kernel: [<ffffffffa006eb0a>] > kvm_mmu_notifier_release+0x31/0x44 [kvm] > 2011-02-09T07:40:15.636744+01:00 phy005 kernel: [<ffffffff810fac37>] > __mmu_notifier_release+0x4f/0x7b > 2011-02-09T07:40:15.636748+01:00 phy005 kernel: [<ffffffff810e735d>] > exit_mmap+0x2c/0x132 > 2011-02-09T07:40:15.636751+01:00 phy005 kernel: [<ffffffff8104ad7a>] > mmput+0x5e/0xca > 
2011-02-09T07:40:15.636754+01:00 phy005 kernel: [<ffffffff8104f0d5>] > exit_mm+0x114/0x121 > 2011-02-09T07:40:15.636757+01:00 phy005 kernel: [<ffffffff81050bf5>] > do_exit+0x254/0x752 > 2011-02-09T07:40:15.636760+01:00 phy005 kernel: [<ffffffff8100a60e>] ? > apic_timer_interrupt+0xe/0x20 > 2011-02-09T07:40:15.636764+01:00 phy005 kernel: [<ffffffff81051174>] > do_group_exit+0x81/0xab > 2011-02-09T07:40:15.636767+01:00 phy005 kernel: [<ffffffff810511b5>] > sys_exit_group+0x17/0x1b > 2011-02-09T07:40:15.636771+01:00 phy005 kernel: [<ffffffff81009c72>] > system_call_fastpath+0x16/0x1b > 2011-02-09T07:40:15.636777+01:00 phy005 kernel: Code: 88 ff ff ff b8 > 01 00 00 00 c9 c3 55 48 89 e5 41 54 53 0f 1f 44 00 00 41 89 d4 48 89 > f3 e8 7b c7 fe ff 41 83 fc 01 48 89 c7 75 0d<48> 2b 18 48 c1 e3 03 48 > 03 58 18 eb 39 41 8d 4c 24 ff be 01 00 > 2011-02-09T07:40:15.636785+01:00 phy005 kernel: RIP > [<ffffffffa0082db8>] gfn_to_rmap+0x20/0x6e [kvm] > 2011-02-09T07:40:15.636788+01:00 phy005 kernel: RSP<ffff88061cbcbcd8> > 2011-02-09T07:40:15.636791+01:00 phy005 kernel: CR2: 0000000000000000 > 2011-02-09T07:40:15.637743+01:00 phy005 kernel: ---[ end trace > 174c28940e9fd0a9 ]--- > 2011-02-09T07:40:15.637751+01:00 phy005 kernel: Fixing recursive fault > but reboot is needed! > In kvm. Was there a reboot between the two? > So it doesn't seem to be a hardware problem since we replaced all that. I agree. And your other machines are stable? When you say "identical software", are those exactly the same binaries? copying Andrea for possible insight into the non-kvm oopses. -- error compiling committee.c: too many arguments to function ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: EPT: Misconfiguration 2011-02-13 12:58 ` Avi Kivity @ 2011-02-13 14:36 ` Ruben Kerkhof 0 siblings, 0 replies; 19+ messages in thread From: Ruben Kerkhof @ 2011-02-13 14:36 UTC (permalink / raw) To: Avi Kivity; +Cc: Marcelo Tosatti, kvm, Andrea Arcangeli Hi Avi, On Sun, Feb 13, 2011 at 13:58, Avi Kivity <avi@redhat.com> wrote: > On 02/10/2011 05:23 PM, Ruben Kerkhof wrote: >> >> This machine has been running for a week without problems, but then we >> started to get the following oopses again: >> >> 2011-02-06T19:45:35.221555+01:00 phy005 kernel: BUG: unable to handle >> kernel paging request at ffffea71929180e0 >> 2011-02-06T19:45:35.222194+01:00 phy005 kernel: IP: >> [<ffffffff81034880>] gup_pte_range+0x94/0xd3 >> 2011-02-06T19:45:35.222199+01:00 phy005 kernel: PGD 118600067 PUD 0 >> 2011-02-06T19:45:35.222203+01:00 phy005 kernel: Oops: 0000 [#1] SMP >> 2011-02-06T19:45:35.222221+01:00 phy005 kernel: last sysfs file: >> /sys/devices/system/cpu/cpu15/topology/thread_siblings >> 2011-02-06T19:45:35.222224+01:00 phy005 kernel: CPU 4 >> 2011-02-06T19:45:35.222229+01:00 phy005 kernel: Modules linked in: tun >> ipmi_devintf ipmi_si ipmi_msghandler bridge 8021q garp stp llc bonding >> xt_comment xt_recent ip6t_REJECT nf_conntrack_ipv6 ip6table_filter >> ip6_tables ipv6 kvm_intel kvm i2c_i801 i2c_core iTCO_wdt serio_raw igb >> iTCO_vendor_support joydev ioatdma dca 3w_9xxx [last unloaded: >> scsi_wait_scan] >> 2011-02-06T19:45:35.222231+01:00 phy005 kernel: >> 2011-02-06T19:45:35.222233+01:00 phy005 kernel: Pid: 3650, comm: >> qemu-kvm Not tainted 2.6.34.7-66.tilaa.fc13.x86_64 #1 X8DTU/X8DTU >> 2011-02-06T19:45:35.222236+01:00 phy005 kernel: RIP: >> 0010:[<ffffffff81034880>] [<ffffffff81034880>] >> gup_pte_range+0x94/0xd3 >> 2011-02-06T19:45:35.222239+01:00 phy005 kernel: RSP: >> 0018:ffff88060b9bda78 EFLAGS: 00010082 >> 2011-02-06T19:45:35.222241+01:00 phy005 kernel: RAX: ffffea71929180e0 >> RBX: 00003ffffffff000 RCX: 0000000000000005 >> 
2011-02-06T19:45:35.222243+01:00 phy005 kernel: RDX: 00007fe54e400000 >> RSI: 00007fe54e3ff000 RDI: 1603a07305004067 >> 2011-02-06T19:45:35.222245+01:00 phy005 kernel: RBP: ffff88060b9bda98 >> R08: ffff880b94384560 R09: ffff88060b9bdb44 >> 2011-02-06T19:45:35.222248+01:00 phy005 kernel: R10: ffff880606b2fff8 >> R11: ffffea0000000000 R12: 0000000000000205 >> 2011-02-06T19:45:35.222251+01:00 phy005 kernel: R13: ffffc00000000fff >> R14: 0000000000000005 R15: 0000000000000000 >> 2011-02-06T19:45:35.222255+01:00 phy005 kernel: FS: >> 00007fe64cb0e700(0000) GS:ffff880655400000(0000) >> knlGS:0000000000000000 >> 2011-02-06T19:45:35.222259+01:00 phy005 kernel: CS: 0010 DS: 002b ES: >> 002b CR0: 0000000080050033 >> 2011-02-06T19:45:35.222263+01:00 phy005 kernel: CR2: ffffea71929180e0 >> CR3: 0000000bff06d000 CR4: 00000000000026e0 >> 2011-02-06T19:45:35.222267+01:00 phy005 kernel: DR0: 0000000000000000 >> DR1: 0000000000000000 DR2: 0000000000000000 >> 2011-02-06T19:45:35.222271+01:00 phy005 kernel: DR3: 0000000000000000 >> DR6: 00000000ffff0ff0 DR7: 0000000000000400 >> 2011-02-06T19:45:35.222274+01:00 phy005 kernel: Process qemu-kvm (pid: >> 3650, threadinfo ffff88060b9bc000, task ffff880623ed2ee0) >> 2011-02-06T19:45:35.222278+01:00 phy005 kernel: Stack: >> 2011-02-06T19:45:35.222281+01:00 phy005 kernel: 00007fe54e400000 >> 00007fe54e400000 00007fe54e400000 ffff88053a0d2388 >> 2011-02-06T19:45:35.222285+01:00 phy005 kernel:<0> ffff88060b9bdaf8 >> ffffffff81034a15 00007fe54e3fffff 00007fe54e3fffff >> 2011-02-06T19:45:35.222289+01:00 phy005 kernel:<0> ffff88060b9bdb44 >> ffff880b94384560 ffff880bff06eca8 ffff880bff06d7f8 >> 2011-02-06T19:45:35.222292+01:00 phy005 kernel: Call Trace: >> 2011-02-06T19:45:35.222296+01:00 phy005 kernel: [<ffffffff81034a15>] >> gup_pud_range+0x156/0x192 >> 2011-02-06T19:45:35.222300+01:00 phy005 kernel: [<ffffffff81034b15>] >> get_user_pages_fast+0xc4/0x172 >> 2011-02-06T19:45:35.222304+01:00 phy005 kernel: [<ffffffff81131fbc>] ? 
>> bio_add_page+0x36/0x38 >> 2011-02-06T19:45:35.222308+01:00 phy005 kernel: [<ffffffff81134730>] >> dio_get_page+0x54/0x127 >> 2011-02-06T19:45:35.222312+01:00 phy005 kernel: [<ffffffff81135317>] >> __blockdev_direct_IO+0x41d/0xa36 >> 2011-02-06T19:45:35.222316+01:00 phy005 kernel: [<ffffffffa0080f69>] ? >> x86_emulate_insn+0x1ff8/0x2d61 [kvm] >> 2011-02-06T19:45:35.222320+01:00 phy005 kernel: [<ffffffff8113379b>] >> blkdev_direct_IO+0x4e/0x50 >> 2011-02-06T19:45:35.222324+01:00 phy005 kernel: [<ffffffff81132c49>] ? >> blkdev_get_blocks+0x0/0x8d >> 2011-02-06T19:45:35.222328+01:00 phy005 kernel: [<ffffffff810cb516>] >> generic_file_direct_write+0xed/0x16d >> 2011-02-06T19:45:35.222331+01:00 phy005 kernel: [<ffffffff810cb72c>] >> __generic_file_aio_write+0x196/0x281 >> 2011-02-06T19:45:35.222335+01:00 phy005 kernel: [<ffffffff811d5352>] ? >> file_has_perm+0xa4/0xc6 >> 2011-02-06T19:45:35.222339+01:00 phy005 kernel: [<ffffffff81133043>] ? >> blkdev_aio_write+0x0/0x69 >> 2011-02-06T19:45:35.222343+01:00 phy005 kernel: [<ffffffff8113306d>] >> blkdev_aio_write+0x2a/0x69 >> 2011-02-06T19:45:35.222347+01:00 phy005 kernel: [<ffffffff81133043>] ? 
>> blkdev_aio_write+0x0/0x69 >> 2011-02-06T19:45:35.222351+01:00 phy005 kernel: [<ffffffff8113d4eb>] >> aio_rw_vect_retry+0x85/0x18e >> 2011-02-06T19:45:35.222355+01:00 phy005 kernel: [<ffffffff8113e9b3>] >> aio_run_iocb+0x77/0x10f >> 2011-02-06T19:45:35.222359+01:00 phy005 kernel: [<ffffffff8113f508>] >> do_io_submit+0x558/0x7ce >> 2011-02-06T19:45:35.222363+01:00 phy005 kernel: [<ffffffff8113f78e>] >> sys_io_submit+0x10/0x12 >> 2011-02-06T19:45:35.222366+01:00 phy005 kernel: [<ffffffff81009c72>] >> system_call_fastpath+0x16/0x1b >> 2011-02-06T19:45:35.222372+01:00 phy005 kernel: Code: 21 d8 49 01 c2 >> 49 8b 3a 49 89 fe 4d 21 ee 4d 21 e6 49 39 ce 75 49 48 89 f8 0f 1f 40 >> 00 48 21 d8 48 c1 e8 0c 48 6b c0 38 4c 01 d8<66> 83 38 00 48 89 c7 79 >> 04 48 8b 78 10 f0 ff 47 08 49 63 39 48 >> 2011-02-06T19:45:35.222376+01:00 phy005 kernel: RIP >> [<ffffffff81034880>] gup_pte_range+0x94/0xd3 >> 2011-02-06T19:45:35.222379+01:00 phy005 kernel: RSP<ffff88060b9bda78> >> 2011-02-06T19:45:35.222382+01:00 phy005 kernel: CR2: ffffea71929180e0 >> 2011-02-06T19:45:35.222386+01:00 phy005 kernel: ---[ end trace >> beed2b54d0bb8a00 ]--- >> > > Hm, outside any kvm code. > >> and >> >> 2011-02-06T19:47:15.023129+01:00 phy005 kernel: qemu-kvm: Corrupted >> page table at address 7fbde15ff64c >> 2011-02-06T19:47:15.023207+01:00 phy005 kernel: PGD 5ff58a067 PUD >> 612668067 PMD 5937b7067 PE 1603a07305008067 > > Again outside kvm, and again the magic pte 1603axxxxx. > > >> followed by >> >> 2011-02-06T21:20:32.882972+01:00 phy005 kernel: BUG: unable to handle >> kernel paging request at fffff6b192918010 >> 2011-02-06T21:20:32.883252+01:00 phy005 kernel: IP: >> [<ffffffffa0078826>] kvm_mmu_zap_page+0x28a/0x299 [kvm] > > Well, after something goes bad, nothing good can come out of it. > >> after which we rebooted the machine and replaced the motherboard and >> cpus (we already replaced the memory before). 
>> >> But 2 days ago we got this oops: >> >> 2011-02-08T15:56:19.902104+01:00 phy005 kernel: BUG: unable to handle >> kernel paging request at ffffea71929181c0 >> 2011-02-08T15:56:19.902686+01:00 phy005 kernel: IP: >> [<ffffffff81034880>] gup_pte_range+0x94/0xd3 >> 2011-02-08T15:56:19.902693+01:00 phy005 kernel: PGD 118600067 PUD 0 >> 2011-02-08T15:56:19.902699+01:00 phy005 kernel: Oops: 0000 [#1] SMP >> 2011-02-08T15:56:19.902703+01:00 phy005 kernel: last sysfs file: >> /sys/devices/system/cpu/cpu15/cache/index2/shared_cpu_m >> ap >> 2011-02-08T15:56:19.902708+01:00 phy005 kernel: CPU 8 >> 2011-02-08T15:56:19.902715+01:00 phy005 kernel: Modules linked in: tun >> ipmi_devintf ipmi_si ipmi_msghandler bridge 8021q >> garp stp llc bonding xt_comment xt_recent ip6t_REJECT >> nf_conntrack_ipv6 ip6table_filter ip6_tables ipv6 kvm_intel kvm i >> gb i2c_i801 iTCO_wdt ioatdma i2c_core iTCO_vendor_support dca >> serio_raw joydev 3w_9xxx [last unloaded: scsi_wait_scan] >> 2011-02-08T15:56:19.902770+01:00 phy005 kernel: >> 2011-02-08T15:56:19.902775+01:00 phy005 kernel: Pid: 3346, comm: >> qemu-kvm Not tainted 2.6.34.7-66.tilaa.fc13.x86_64 #1 X >> 8DTU/X8DTU >> 2011-02-08T15:56:19.902781+01:00 phy005 kernel: RIP: >> 0010:[<ffffffff81034880>] [<ffffffff81034880>] gup_pte_range+0x94/ >> 0xd3 >> 2011-02-08T15:56:19.902785+01:00 phy005 kernel: RSP: >> 0018:ffff880c21bc1a78 EFLAGS: 00010086 >> 2011-02-08T15:56:19.902789+01:00 phy005 kernel: RAX: ffffea71929181c0 >> RBX: 00003ffffffff000 RCX: 0000000000000005 >> 2011-02-08T15:56:19.902793+01:00 phy005 kernel: RDX: 00007fa2ca200000 >> RSI: 00007fa2ca1ff000 RDI: 1603a07305008067 >> 2011-02-08T15:56:19.902797+01:00 phy005 kernel: RBP: ffff880c21bc1a98 >> R08: ffff88060fdfad60 R09: ffff880c21bc1b44 >> 2011-02-08T15:56:19.902801+01:00 phy005 kernel: R10: ffff88061493fff8 >> R11: ffffea0000000000 R12: 0000000000000205 >> 2011-02-08T15:56:19.902805+01:00 phy005 kernel: R13: ffffc00000000fff >> R14: 0000000000000005 R15: 0000000000000000 >> 
2011-02-08T15:56:19.902810+01:00 phy005 kernel: FS: >> 00007fa2d8724700(0000) GS:ffff880002080000(0000) knlGS:000000000000 >> 0000 >> 2011-02-08T15:56:19.902820+01:00 phy005 kernel: CS: 0010 DS: 002b ES: >> 002b CR0: 0000000080050033 >> 2011-02-08T15:56:19.902825+01:00 phy005 kernel: CR2: ffffea71929181c0 >> CR3: 0000000c231f9000 CR4: 00000000000026e0 >> 2011-02-08T15:56:19.902829+01:00 phy005 kernel: DR0: 0000000000000000 >> DR1: 0000000000000000 DR2: 0000000000000000 >> 2011-02-08T15:56:19.902833+01:00 phy005 kernel: DR3: 0000000000000000 >> DR6: 00000000ffff0ff0 DR7: 0000000000000400 >> 2011-02-08T15:56:19.902837+01:00 phy005 kernel: Process qemu-kvm (pid: >> 3346, threadinfo ffff880c21bc0000, task ffff880c2 >> 264ddc0) >> 2011-02-08T15:56:19.902841+01:00 phy005 kernel: Stack: >> 2011-02-08T15:56:19.902844+01:00 phy005 kernel: 00007fa2ca200000 >> 00007fa2ca201000 00007fa2ca201000 ffff880c22c3d280 >> 2011-02-08T15:56:19.902848+01:00 phy005 kernel:<0> ffff880c21bc1af8 >> ffffffff81034a15 00007fa2ca200fff 00007fa2ca200fff >> 2011-02-08T15:56:19.902852+01:00 phy005 kernel:<0> ffff880c21bc1b44 >> ffff88060fdfad60 ffff880c2231a458 ffff880c231f97f8 >> 2011-02-08T15:56:19.902855+01:00 phy005 kernel: Call Trace: >> 2011-02-08T15:56:19.902859+01:00 phy005 kernel: [<ffffffff81034a15>] >> gup_pud_range+0x156/0x192 >> 2011-02-08T15:56:19.902863+01:00 phy005 kernel: [<ffffffff81034b15>] >> get_user_pages_fast+0xc4/0x172 >> 2011-02-08T15:56:19.902867+01:00 phy005 kernel: [<ffffffff81131fbc>] ? >> bio_add_page+0x36/0x38 >> 2011-02-08T15:56:19.902871+01:00 phy005 kernel: [<ffffffff81134730>] >> dio_get_page+0x54/0x127 >> 2011-02-08T15:56:19.902875+01:00 phy005 kernel: [<ffffffff81135317>] >> __blockdev_direct_IO+0x41d/0xa36 >> 2011-02-08T15:56:19.902880+01:00 phy005 kernel: [<ffffffffa008bf69>] ? 
>> x86_emulate_insn+0x1ff8/0x2d61 [kvm] >> 2011-02-08T15:56:19.902884+01:00 phy005 kernel: [<ffffffff8113379b>] >> blkdev_direct_IO+0x4e/0x50 >> 2011-02-08T15:56:19.902888+01:00 phy005 kernel: [<ffffffff81132c49>] ? >> blkdev_get_blocks+0x0/0x8d >> 2011-02-08T15:56:19.902892+01:00 phy005 kernel: [<ffffffff810cb516>] >> generic_file_direct_write+0xed/0x16d >> 2011-02-08T15:56:19.902896+01:00 phy005 kernel: [<ffffffff810cb72c>] >> __generic_file_aio_write+0x196/0x281 >> 2011-02-08T15:56:19.902899+01:00 phy005 kernel: [<ffffffff81133043>] ? >> blkdev_aio_write+0x0/0x69 >> 2011-02-08T15:56:19.902909+01:00 phy005 kernel: [<ffffffff81133043>] ? >> blkdev_aio_write+0x0/0x69 >> 2011-02-08T15:56:19.902914+01:00 phy005 kernel: [<ffffffff8113d4eb>] >> aio_rw_vect_retry+0x85/0x18e >> 2011-02-08T15:56:19.902919+01:00 phy005 kernel: [<ffffffff8113e9b3>] >> aio_run_iocb+0x77/0x10f >> 2011-02-08T15:56:19.902923+01:00 phy005 kernel: [<ffffffff8113f508>] >> do_io_submit+0x558/0x7ce >> 2011-02-08T15:56:19.902927+01:00 phy005 kernel: [<ffffffff8113f78e>] >> sys_io_submit+0x10/0x12 >> 2011-02-08T15:56:19.902932+01:00 phy005 kernel: [<ffffffff81009c72>] >> system_call_fastpath+0x16/0x1b >> 2011-02-08T15:56:19.902938+01:00 phy005 kernel: Code: 21 d8 49 01 c2 >> 49 8b 3a 49 89 fe 4d 21 ee 4d 21 e6 49 39 ce 75 49 48 89 f8 0f 1f 40 >> 00 48 21 d8 48 c1 e8 0c 48 6b c0 38 4c 01 d8<66> 83 38 00 48 89 c7 79 >> 04 48 8b 78 10 f0 ff 47 08 49 63 39 48 >> 2011-02-08T15:56:19.903077+01:00 phy005 kernel: RIP >> [<ffffffff81034880>] gup_pte_range+0x94/0xd3 >> 2011-02-08T15:56:19.903081+01:00 phy005 kernel: RSP<ffff880c21bc1a78> >> 2011-02-08T15:56:19.903084+01:00 phy005 kernel: CR2: ffffea71929181c0 >> 2011-02-08T15:56:19.903088+01:00 phy005 kernel: ---[ end trace >> 174c28940e9fd0a7 ]--- >> > > Again outside kvm. 
> >> and yesterday this one: >> >> 2011-02-09T07:40:15.636528+01:00 phy005 kernel: BUG: unable to handle >> kernel NULL pointer dereference at (null) >> 2011-02-09T07:40:15.636635+01:00 phy005 kernel: IP: >> [<ffffffffa0082db8>] gfn_to_rmap+0x20/0x6e [kvm] >> 2011-02-09T07:40:15.636639+01:00 phy005 kernel: PGD 0 >> 2011-02-09T07:40:15.636643+01:00 phy005 kernel: Oops: 0000 [#3] SMP >> 2011-02-09T07:40:15.636647+01:00 phy005 kernel: last sysfs file: >> /sys/devices/system/cpu/cpu15/topology/thread_siblings >> 2011-02-09T07:40:15.636650+01:00 phy005 kernel: CPU 2 >> 2011-02-09T07:40:15.636656+01:00 phy005 kernel: Modules linked in: tun >> ipmi_devintf ipmi_si ipmi_msghandler bridge 8021q garp stp llc bonding >> xt_comment xt_recent ip6t_REJECT nf_conntrack_ipv6 ip6table_filter >> ip6_tables ipv6 kvm_intel kvm igb i2c_i801 iTCO_wdt ioatdma i2c_core >> iTCO_vendor_support dca serio_raw joydev 3w_9xxx [last unloaded: >> scsi_wait_scan] >> 2011-02-09T07:40:15.636663+01:00 phy005 kernel: >> 2011-02-09T07:40:15.636666+01:00 phy005 kernel: Pid: 2572, comm: >> qemu-kvm Tainted: G D 2.6.34.7-66.tilaa.fc13.x86_64 #1 >> X8DTU/X8DTU >> 2011-02-09T07:40:15.636670+01:00 phy005 kernel: RIP: >> 0010:[<ffffffffa0082db8>] [<ffffffffa0082db8>] gfn_to_rmap+0x20/0x6e >> [kvm] >> 2011-02-09T07:40:15.636673+01:00 phy005 kernel: RSP: >> 0018:ffff88061cbcbcd8 EFLAGS: 00010246 >> 2011-02-09T07:40:15.636677+01:00 phy005 kernel: RAX: 0000000000000000 >> RBX: 1603a07305004fff RCX: ffff88061cbcbd08 >> 2011-02-09T07:40:15.636680+01:00 phy005 kernel: RDX: 0000000000000023 >> RSI: 1603a07305004fff RDI: 0000000000000000 >> 2011-02-09T07:40:15.636683+01:00 phy005 kernel: RBP: ffff88061cbcbce8 >> R08: 0000000000000023 R09: 0000000000000000 >> 2011-02-09T07:40:15.636686+01:00 phy005 kernel: R10: 0000000000000000 >> R11: ffffffffa0082c7f R12: 0000000000000001 >> 2011-02-09T07:40:15.636689+01:00 phy005 kernel: R13: 0000000000311763 >> R14: ffff8809b8b01ce0 R15: 0000000000000000 >> 
2011-02-09T07:40:15.636692+01:00 phy005 kernel: FS: >> 0000000000000000(0000) GS:ffff880002040000(0000) >> knlGS:0000000000000000 >> 2011-02-09T07:40:15.636695+01:00 phy005 kernel: CS: 0010 DS: 0000 ES: >> 0000 CR0: 000000008005003b >> 2011-02-09T07:40:15.636699+01:00 phy005 kernel: CR2: 0000000000000000 >> CR3: 0000000001a42000 CR4: 00000000000026e0 >> 2011-02-09T07:40:15.636702+01:00 phy005 kernel: DR0: 0000000000000000 >> DR1: 0000000000000000 DR2: 0000000000000000 >> 2011-02-09T07:40:15.636705+01:00 phy005 kernel: DR3: 0000000000000000 >> DR6: 00000000ffff0ff0 DR7: 0000000000000400 >> 2011-02-09T07:40:15.636709+01:00 phy005 kernel: Process qemu-kvm (pid: >> 2572, threadinfo ffff88061cbca000, task ffff88061cf04650) >> 2011-02-09T07:40:15.636711+01:00 phy005 kernel: Stack: >> 2011-02-09T07:40:15.636715+01:00 phy005 kernel: ffff88036c471ff8 >> ffff880c23984000 ffff88061cbcbd18 ffffffffa0082ea9 >> 2011-02-09T07:40:15.636718+01:00 phy005 kernel:<0> ffff8809b8b01ce0 >> ffff880c23984000 ffff88036c471ff8 00000000000001ff >> 2011-02-09T07:40:15.636721+01:00 phy005 kernel:<0> ffff88061cbcbd58 >> ffffffffa008363b 0000000000000200 ffff880c23984000 >> 2011-02-09T07:40:15.636724+01:00 phy005 kernel: Call Trace: >> 2011-02-09T07:40:15.636728+01:00 phy005 kernel: [<ffffffffa0082ea9>] >> rmap_remove+0xa3/0x1a0 [kvm] >> 2011-02-09T07:40:15.636731+01:00 phy005 kernel: [<ffffffffa008363b>] >> kvm_mmu_zap_page+0x9f/0x299 [kvm] >> 2011-02-09T07:40:15.636734+01:00 phy005 kernel: [<ffffffffa0083a42>] >> kvm_mmu_zap_all+0x35/0x60 [kvm] >> 2011-02-09T07:40:15.636738+01:00 phy005 kernel: [<ffffffffa0078cde>] >> kvm_arch_flush_shadow+0x16/0x22 [kvm] >> 2011-02-09T07:40:15.636741+01:00 phy005 kernel: [<ffffffffa006eb0a>] >> kvm_mmu_notifier_release+0x31/0x44 [kvm] >> 2011-02-09T07:40:15.636744+01:00 phy005 kernel: [<ffffffff810fac37>] >> __mmu_notifier_release+0x4f/0x7b >> 2011-02-09T07:40:15.636748+01:00 phy005 kernel: [<ffffffff810e735d>] >> exit_mmap+0x2c/0x132 >> 
2011-02-09T07:40:15.636751+01:00 phy005 kernel: [<ffffffff8104ad7a>] >> mmput+0x5e/0xca >> 2011-02-09T07:40:15.636754+01:00 phy005 kernel: [<ffffffff8104f0d5>] >> exit_mm+0x114/0x121 >> 2011-02-09T07:40:15.636757+01:00 phy005 kernel: [<ffffffff81050bf5>] >> do_exit+0x254/0x752 >> 2011-02-09T07:40:15.636760+01:00 phy005 kernel: [<ffffffff8100a60e>] ? >> apic_timer_interrupt+0xe/0x20 >> 2011-02-09T07:40:15.636764+01:00 phy005 kernel: [<ffffffff81051174>] >> do_group_exit+0x81/0xab >> 2011-02-09T07:40:15.636767+01:00 phy005 kernel: [<ffffffff810511b5>] >> sys_exit_group+0x17/0x1b >> 2011-02-09T07:40:15.636771+01:00 phy005 kernel: [<ffffffff81009c72>] >> system_call_fastpath+0x16/0x1b >> 2011-02-09T07:40:15.636777+01:00 phy005 kernel: Code: 88 ff ff ff b8 >> 01 00 00 00 c9 c3 55 48 89 e5 41 54 53 0f 1f 44 00 00 41 89 d4 48 89 >> f3 e8 7b c7 fe ff 41 83 fc 01 48 89 c7 75 0d<48> 2b 18 48 c1 e3 03 48 >> 03 58 18 eb 39 41 8d 4c 24 ff be 01 00 >> 2011-02-09T07:40:15.636785+01:00 phy005 kernel: RIP >> [<ffffffffa0082db8>] gfn_to_rmap+0x20/0x6e [kvm] >> 2011-02-09T07:40:15.636788+01:00 phy005 kernel: RSP<ffff88061cbcbcd8> >> 2011-02-09T07:40:15.636791+01:00 phy005 kernel: CR2: 0000000000000000 >> 2011-02-09T07:40:15.637743+01:00 phy005 kernel: ---[ end trace >> 174c28940e9fd0a9 ]--- >> 2011-02-09T07:40:15.637751+01:00 phy005 kernel: Fixing recursive fault >> but reboot is needed! >> > > In kvm. Was there a reboot between the two? No, there wasn't. 
I've just looked back at the logs and there was another oops in between: 2011-02-09T04:28:01.890999+01:00 phy005 kernel: general protection fault: 0000 [#2] SMP 2011-02-09T04:28:01.891122+01:00 phy005 kernel: last sysfs file: /sys/devices/system/cpu/cpu15/cache/index2/shared_cpu_m ap 2011-02-09T04:28:01.891127+01:00 phy005 kernel: CPU 12 2011-02-09T04:28:01.891137+01:00 phy005 kernel: Modules linked in: tun ipmi_devintf ipmi_si ipmi_msghandler bridge 8021q garp stp llc bonding xt_comment xt_recent ip6t_REJECT nf_conntrack_ipv6 ip6table_filter ip6_tables ipv6 kvm_intel kvm i gb i2c_i801 iTCO_wdt ioatdma i2c_core iTCO_vendor_support dca serio_raw joydev 3w_9xxx [last unloaded: scsi_wait_scan] 2011-02-09T04:28:01.891144+01:00 phy005 kernel: 2011-02-09T04:28:01.891148+01:00 phy005 kernel: Pid: 19782, comm: find Tainted: G D 2.6.34.7-66.tilaa.fc13.x86_6 4 #1 X8DTU/X8DTU 2011-02-09T04:28:01.891154+01:00 phy005 kernel: RIP: 0010:[<ffffffff81158aa4>] [<ffffffff81158aa4>] proc_fd_instantiate +0x88/0x127 2011-02-09T04:28:01.891157+01:00 phy005 kernel: RSP: 0018:ffff880245677da8 EFLAGS: 00010206 2011-02-09T04:28:01.891161+01:00 phy005 kernel: RAX: 1603a07305000000 RBX: ffff8808076ada40 RCX: ffff88058bbbddc0 2011-02-09T04:28:01.891164+01:00 phy005 kernel: RDX: 000000000000022a RSI: ffff8808076ada40 RDI: ffff88062293ee80 2011-02-09T04:28:01.891168+01:00 phy005 kernel: RBP: ffff880245677dc8 R08: ffff8808076a91d0 R09: ffffffff81158a1c 2011-02-09T04:28:01.891172+01:00 phy005 kernel: R10: 0000000000000002 R11: ffff880245677d08 R12: ffff88062293ee00 2011-02-09T04:28:01.891176+01:00 phy005 kernel: R13: ffff8805b3897bf8 R14: ffff8808076a9430 R15: ffff8807ddd76c00 2011-02-09T04:28:01.891180+01:00 phy005 kernel: FS: 00007f09aa8e07a0(0000) GS:ffff880655480000(0000) knlGS:000000000000 0000 2011-02-09T04:28:01.891184+01:00 phy005 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 2011-02-09T04:28:01.891188+01:00 phy005 kernel: CR2: 0000000000e43080 CR3: 00000007d6d6c000 CR4: 
00000000000026e0 2011-02-09T04:28:01.891192+01:00 phy005 kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 2011-02-09T04:28:01.891196+01:00 phy005 kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 2011-02-09T04:28:01.891199+01:00 phy005 kernel: Process find (pid: 19782, threadinfo ffff880245676000, task ffff88058bbb 8000) 2011-02-09T04:28:01.891202+01:00 phy005 kernel: Stack: 2011-02-09T04:28:01.891206+01:00 phy005 kernel: ffff880245677e78 0000000000000003 ffff8802bfe0af00 ffff8808076ada40 2011-02-09T04:28:01.891209+01:00 phy005 kernel: <0> ffff880245677e38 ffffffff811564b8 ffff880245677e38 ffffffff81158a1c 2011-02-09T04:28:01.891213+01:00 phy005 kernel: <0> ffffffff8111b530 ffff880245677f38 0000000300119d45 ffff880245677e78 2011-02-09T04:28:01.891216+01:00 phy005 kernel: Call Trace: 2011-02-09T04:28:01.891220+01:00 phy005 kernel: [<ffffffff811564b8>] proc_fill_cache+0xa7/0x13f 2011-02-09T04:28:01.891224+01:00 phy005 kernel: [<ffffffff81158a1c>] ? proc_fd_instantiate+0x0/0x127 2011-02-09T04:28:01.891227+01:00 phy005 kernel: [<ffffffff8111b530>] ? filldir+0x0/0xd0 2011-02-09T04:28:01.891231+01:00 phy005 kernel: [<ffffffff8111b530>] ? filldir+0x0/0xd0 2011-02-09T04:28:01.891235+01:00 phy005 kernel: [<ffffffff811586c8>] proc_readfd_common+0x159/0x1a3 2011-02-09T04:28:01.891239+01:00 phy005 kernel: [<ffffffff81158a1c>] ? proc_fd_instantiate+0x0/0x127 2011-02-09T04:28:01.891242+01:00 phy005 kernel: [<ffffffff8111b530>] ? 
filldir+0x0/0xd0 2011-02-09T04:28:01.891246+01:00 phy005 kernel: [<ffffffff8115873e>] proc_readfd+0x15/0x17 2011-02-09T04:28:01.891250+01:00 phy005 kernel: [<ffffffff8111b731>] vfs_readdir+0x77/0xb4 2011-02-09T04:28:01.891254+01:00 phy005 kernel: [<ffffffff8111b8b7>] sys_getdents+0x81/0xd1 2011-02-09T04:28:01.891258+01:00 phy005 kernel: [<ffffffff81009c72>] system_call_fastpath+0x16/0x1b 2011-02-09T04:28:01.891263+01:00 phy005 kernel: Code: e8 08 3e 2f 00 49 8b 44 24 08 44 3b 28 0f 83 9c 00 00 00 45 89 ed 49 c1 e5 03 4c 03 68 08 49 8b 45 00 48 85 c0 0f 84 84 00 00 00 <f6> 40 3c 01 74 0a 66 41 81 8e aa 00 00 00 40 01 f6 40 3c 02 74 2011-02-09T04:28:01.891275+01:00 phy005 kernel: RIP [<ffffffff81158aa4>] proc_fd_instantiate+0x88/0x127 2011-02-09T04:28:01.891279+01:00 phy005 kernel: RSP <ffff880245677da8> 2011-02-09T04:28:01.891283+01:00 phy005 kernel: ---[ end trace 174c28940e9fd0a8 ]--- > >> So it doesn't seem to be a hardware problem since we replaced all that. > > I agree. And your other machines are stable? Yes, the other ones have been running for ages without problems. We've been using 2.6.34.7 for about three months now. > When you say "identical software", are those exactly the same binaries? Yes, the same (kickstarted) install, the same rpms. > copying Andrea for possible insight into the non-kvm oopses. > > -- > error compiling committee.c: too many arguments to function Kind regards, Ruben ^ permalink raw reply [flat|nested] 19+ messages in thread
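A note on the "magic pte 1603axxxxx" observation: the same upper bits show up in registers across the gup_pte_range, gfn_to_rmap and proc_fd_instantiate oopses quoted above. The following is a minimal sketch, not something posted in the thread, of how one might scan a saved kernel log for such a recurring value. The function name `recurring_upper_bits`, the 16-bit grouping, and the choice to skip values starting with `ffff` or `0000` (ordinary kernel and user addresses) are assumptions made for illustration; the sample lines are register dumps copied from the oopses in this thread.

```python
import re
from collections import Counter

# A standalone run of exactly 16 lowercase hex digits (a 64-bit value as
# printed in oops register dumps). A leading "0x" is tolerated because "x"
# is not a hex digit.
HEX64 = re.compile(r"(?<![0-9a-f])[0-9a-f]{16}(?![0-9a-f])")


def recurring_upper_bits(log_text, low_bits=16, skip_prefixes=("ffff", "0000")):
    """Count 64-bit values in log_text, grouped by everything above low_bits.

    Values that look like ordinary kernel pointers, user addresses or zero
    (hypothetical filter: they start with one of skip_prefixes) are ignored.
    """
    counts = Counter()
    for token in HEX64.findall(log_text):
        if token.startswith(skip_prefixes):
            continue
        counts[int(token, 16) >> low_bits] += 1
    return counts


# Register lines taken from the oopses quoted in this thread.
SAMPLE = """\
RSI: 00007fe54e3ff000 RDI: 1603a07305004067
PMD 5937b7067 PE 1603a07305008067
RAX: 0000000000000000 RBX: 1603a07305004fff RCX: ffff88061cbcbd08
RAX: 1603a07305000000 RBX: ffff8808076ada40 RCX: ffff88058bbbddc0
"""

if __name__ == "__main__":
    for upper, seen in recurring_upper_bits(SAMPLE).most_common():
        print(f"{upper:#x} (upper bits) seen {seen} time(s)")
```

Run against a full log, a single dominant group would suggest the same bad value is being written into unrelated structures, which fits the oopses landing both inside and outside kvm code. It does not by itself say where the value comes from.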
End of thread; newest message ~2011-03-05 18:58 UTC.

Thread overview: 19+ messages
2011-01-20 11:48 EPT: Misconfiguration Ruben Kerkhof
2011-01-20 11:59 ` Ruben Kerkhof
2011-01-21 13:22 ` Marcelo Tosatti
2011-01-25 14:44 ` Ruben Kerkhof
2011-01-25 17:39 ` Avi Kivity
2011-01-25 18:29 ` Ruben Kerkhof
2011-01-26 9:52 ` Avi Kivity
2011-01-26 15:00 ` Ruben Kerkhof
2011-02-10 15:23 ` Ruben Kerkhof
2011-02-13 2:07 ` Ruben Kerkhof
2011-02-13 13:03 ` Avi Kivity
2011-02-13 14:40 ` Ruben Kerkhof
2011-02-15 17:16 ` Marcelo Tosatti
2011-02-15 19:04 ` Ruben Kerkhof
2011-02-24 21:15 ` Ruben Kerkhof
2011-02-27 10:46 ` Avi Kivity
2011-03-05 18:57 ` Ruben Kerkhof
2011-02-13 12:58 ` Avi Kivity
2011-02-13 14:36 ` Ruben Kerkhof