* 2.6.38.1 general protection fault
From: Tomasz Chmielewski @ 2011-03-25 9:32 UTC
  To: kvm@vger.kernel.org

I got this on a 2.6.38.1 system which (I think) had some problem accessing guest image on a btrfs filesystem.


general protection fault: 0000 [#1] SMP
last sysfs file: /sys/kernel/uevent_seqnum
CPU 0
Modules linked in: ipt_MASQUERADE vhost_net kvm_intel kvm iptable_filter xt_tcpudp iptable_nat nf_nat nf_conntrack_ipv4 nf_conntrack nf_defrag_ipv4 ip_tables x_tables bridge stp btrfs zlib_deflate crc32c libcrc32c coretemp f71882fg snd_pcm snd_timer snd soundcore i2c_i801 snd_page_alloc tpm_tis tpm tpm_bios pcspkr i7core_edac edac_core r8169 mii raid10 raid456 async_pq async_xor xor async_memcpy async_raid6_recov raid6_pq async_tx raid1 raid0 ahci libahci sata_nv sata_sil sata_via 3w_9xxx 3w_xxxx [last unloaded: scsi_wait_scan]

Pid: 10199, comm: kvm Not tainted 2.6.38.1 #1 MSI MS-7522/MSI X58 Pro-E (MS-7522)
RIP: 0010:[<ffffffffa02cae20>] [<ffffffffa02cae20>] kvm_unmap_rmapp+0x20/0x70 [kvm]
RSP: 0018:ffff880508ee9bf0 EFLAGS: 00010202
RAX: 00008805d6b087f8 RBX: ffff8805b7b10000 RCX: 0000000000000050
RDX: 0000000000000000 RSI: 00008805d6b087f8 RDI: ffff8805b7b10000
RBP: ffff880508ee9c10 R08: ffff8801061d4000 R09: ffffc9001f19aff0
R10: 0000000000000030 R11: 0000000000000000 R12: 0000000000000000
R13: ffffc9001f19aff8 R14: 0000000000000060 R15: ffff8801061d4000
FS: 00007f7ca25d6730(0000) GS:ffff8800bf400000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000462b10 CR3: 00000003ac47f000 CR4: 00000000000026e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process kvm (pid: 10199, threadinfo ffff880508ee8000, task ffff88001b5a5b00)
Stack:
 ffffffffffffffcf 00000000000220ff 0000000000000001 ffff8801061d4050
 ffff880508ee9c80 ffffffffa02c8a54 0000000000000030 ffffffffa02cae00
 0000000000000000 00007f7c80a2b000 ffff8805b7b10000 0000000000000001
Call Trace:
 [<ffffffffa02c8a54>] kvm_handle_hva+0xb4/0x170 [kvm]
 [<ffffffffa02cae00>] ? kvm_unmap_rmapp+0x0/0x70 [kvm]
 [<ffffffffa02c8b27>] kvm_unmap_hva+0x17/0x20 [kvm]
 [<ffffffffa02b1e72>] kvm_mmu_notifier_invalidate_range_start+0x62/0xb0 [kvm]
 [<ffffffff8113ea11>] __mmu_notifier_invalidate_range_start+0x51/0x70
 [<ffffffff8111e2c1>] copy_page_range+0x3b1/0x460
 [<ffffffff812c5628>] ? rb_insert_color+0x98/0x140
 [<ffffffff81060cdc>] dup_mm+0x2fc/0x500
 [<ffffffff810617fe>] copy_process+0x8be/0x11b0
 [<ffffffff81062165>] do_fork+0x75/0x350
 [<ffffffff81177bcd>] ? mntput+0x1d/0x40
 [<ffffffff8115b095>] ? fput+0x1e5/0x270
 [<ffffffff815aa7f5>] ? _raw_spin_lock_irq+0x15/0x20
 [<ffffffff81075141>] ? sigprocmask+0x91/0x110
 [<ffffffff81014ab8>] sys_clone+0x28/0x30
 [<ffffffff8100c3e3>] stub_clone+0x13/0x20
 [<ffffffff8100c0c2>] ? system_call_fastpath+0x16/0x1b
Code: 49 89 01 eb 91 66 0f 1f 44 00 00 55 48 89 e5 41 55 41 54 53 48 83 ec 08 0f 1f 44 00 00 45 31 e4 48 89 fb 49 89 f5 eb 1d 0f 1f 00 <f6> 06 01 74 38 48 8b 15 a4 66 02 00 48 89 df 41 bc 01 00 00 00
RIP [<ffffffffa02cae20>] kvm_unmap_rmapp+0x20/0x70 [kvm]
 RSP <ffff880508ee9bf0>
---[ end trace 85201a339b7635fc ]---

-- 
Tomasz Chmielewski
http://wpkg.org

* Re: 2.6.38.1 general protection fault
From: Avi Kivity @ 2011-03-26 9:15 UTC
  To: Tomasz Chmielewski; +Cc: kvm@vger.kernel.org, Andrea Arcangeli

On 03/25/2011 11:32 AM, Tomasz Chmielewski wrote:
> I got this on a 2.6.38.1 system which (I think) had some problem accessing guest image on a btrfs filesystem.
>
>
> general protection fault: 0000 [#1] SMP
> last sysfs file: /sys/kernel/uevent_seqnum
> CPU 0
> Modules linked in: ipt_MASQUERADE vhost_net kvm_intel kvm iptable_filter xt_tcpudp iptable_nat nf_nat nf_conntrack_ipv4 nf_conntrack nf_defrag_ipv4 ip_tables x_tables bridge stp btrfs zlib_deflate crc32c libcrc32c coretemp f71882fg snd_pcm snd_timer snd soundcore i2c_i801 snd_page_alloc tpm_tis tpm tpm_bios pcspkr i7core_edac edac_core r8169 mii raid10 raid456 async_pq async_xor xor async_memcpy async_raid6_recov raid6_pq async_tx raid1 raid0 ahci libahci sata_nv sata_sil sata_via 3w_9xxx 3w_xxxx [last unloaded: scsi_wait_scan]
>
> Pid: 10199, comm: kvm Not tainted 2.6.38.1 #1 MSI MS-7522/MSI X58 Pro-E (MS-7522)
> RIP: 0010:[<ffffffffa02cae20>] [<ffffffffa02cae20>] kvm_unmap_rmapp+0x20/0x70 [kvm]
> RSP: 0018:ffff880508ee9bf0 EFLAGS: 00010202
> RAX: 00008805d6b087f8 RBX: ffff8805b7b10000 RCX: 0000000000000050
> RDX: 0000000000000000 RSI: 00008805d6b087f8 RDI: ffff8805b7b10000
> RBP: ffff880508ee9c10 R08: ffff8801061d4000 R09: ffffc9001f19aff0
> R10: 0000000000000030 R11: 0000000000000000 R12: 0000000000000000
> R13: ffffc9001f19aff8 R14: 0000000000000060 R15: ffff8801061d4000
> FS: 00007f7ca25d6730(0000) GS:ffff8800bf400000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> CR2: 0000000000462b10 CR3: 00000003ac47f000 CR4: 00000000000026e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Process kvm (pid: 10199, threadinfo ffff880508ee8000, task ffff88001b5a5b00)
> Stack:
> ffffffffffffffcf 00000000000220ff 0000000000000001 ffff8801061d4050
> ffff880508ee9c80 ffffffffa02c8a54 0000000000000030 ffffffffa02cae00
> 0000000000000000 00007f7c80a2b000 ffff8805b7b10000 0000000000000001
> Call Trace:
> [<ffffffffa02c8a54>] kvm_handle_hva+0xb4/0x170 [kvm]
> [<ffffffffa02cae00>] ? kvm_unmap_rmapp+0x0/0x70 [kvm]
> [<ffffffffa02c8b27>] kvm_unmap_hva+0x17/0x20 [kvm]
> [<ffffffffa02b1e72>] kvm_mmu_notifier_invalidate_range_start+0x62/0xb0 [kvm]
> [<ffffffff8113ea11>] __mmu_notifier_invalidate_range_start+0x51/0x70
> [<ffffffff8111e2c1>] copy_page_range+0x3b1/0x460
> [<ffffffff812c5628>] ? rb_insert_color+0x98/0x140
> [<ffffffff81060cdc>] dup_mm+0x2fc/0x500
> [<ffffffff810617fe>] copy_process+0x8be/0x11b0
> [<ffffffff81062165>] do_fork+0x75/0x350
> [<ffffffff81177bcd>] ? mntput+0x1d/0x40
> [<ffffffff8115b095>] ? fput+0x1e5/0x270
> [<ffffffff815aa7f5>] ? _raw_spin_lock_irq+0x15/0x20
> [<ffffffff81075141>] ? sigprocmask+0x91/0x110
> [<ffffffff81014ab8>] sys_clone+0x28/0x30
> [<ffffffff8100c3e3>] stub_clone+0x13/0x20
> [<ffffffff8100c0c2>] ? system_call_fastpath+0x16/0x1b
> Code: 49 89 01 eb 91 66 0f 1f 44 00 00 55 48 89 e5 41 55 41 54 53 48 83 ec 08 0f 1f 44 00 00 45 31 e4 48 89 fb 49 89 f5 eb 1d 0f 1f 00 <f6> 06 01 74 38 48 8b 15 a4 66 02 00 48 89 df 41 bc 01 00 00 00
> RIP [<ffffffffa02cae20>] kvm_unmap_rmapp+0x20/0x70 [kvm]
> RSP <ffff880508ee9bf0>
> ---[ end trace 85201a339b7635fc ]---
>
>

   0:   55                      push   %rbp
   1:   48 89 e5                mov    %rsp,%rbp
   4:   41 55                   push   %r13
   6:   41 54                   push   %r12
   8:   53                      push   %rbx
   9:   48 83 ec 08             sub    $0x8,%rsp
   d:   0f 1f 44 00 00          nopl   0x0(%rax,%rax,1)
  12:   45 31 e4                xor    %r12d,%r12d
  15:   48 89 fb                mov    %rdi,%rbx
  18:   49 89 f5                mov    %rsi,%r13
  1b:   eb 1d                   jmp    0x3a
  1d:   0f 1f 00                nopl   (%rax)
  20:   f6 06 01                testb  $0x1,(%rsi)

Looks like the top 16 bits of %rsi are flipped.

Also weird to see a fork(). What's your qemu command line?

-- 
I have a truly marvellous patch that fixes the bug which this signature is too narrow to contain.

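To make the "top 16 bits flipped" observation concrete, here is a minimal C sketch that is not part of the thread. It inverts bits 48-63 of the faulting %rsi value from the oops and compares the resulting prefix with %rbx from the same register dump. The helper names and the mask constant are assumptions chosen for illustration; only the two register values come from the report.

```c
#include <stdint.h>

/* Mask covering bits 48-63, i.e. the "top 16 bits" mentioned above. */
#define TOP16_MASK UINT64_C(0xffff000000000000)

/* Hypothetical helper: invert the top 16 bits of a 64-bit value. */
static uint64_t flip_top16(uint64_t v)
{
    return v ^ TOP16_MASK;
}

/* Hypothetical helper: extract bits 48-63 as a 16-bit quantity. */
static uint16_t top16(uint64_t v)
{
    return (uint16_t)(v >> 48);
}
```

Applied to RSI from the oops (00008805d6b087f8), flip_top16() gives ffff8805d6b087f8, whose top 16 bits match those of RBX (ffff8805b7b10000). That is consistent with the reading given in the message above, but it does not by itself show what value the pointer was supposed to hold.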
* Re: 2.6.38.1 general protection fault
From: Tomasz Chmielewski @ 2011-03-26 10:42 UTC
  To: Avi Kivity; +Cc: kvm@vger.kernel.org, Andrea Arcangeli

On 26.03.2011 10:15, Avi Kivity wrote:
> On 03/25/2011 11:32 AM, Tomasz Chmielewski wrote:
>> I got this on a 2.6.38.1 system which (I think) had some problem
>> accessing guest image on a btrfs filesystem.
>>
>>
>> general protection fault: 0000 [#1] SMP

(...)

> 0: 55 push %rbp
> 1: 48 89 e5 mov %rsp,%rbp
> 4: 41 55 push %r13
> 6: 41 54 push %r12
> 8: 53 push %rbx
> 9: 48 83 ec 08 sub $0x8,%rsp
> d: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1)
> 12: 45 31 e4 xor %r12d,%r12d
> 15: 48 89 fb mov %rdi,%rbx
> 18: 49 89 f5 mov %rsi,%r13
> 1b: eb 1d jmp 0x3a
> 1d: 0f 1f 00 nopl (%rax)
> 20: f6 06 01 testb $0x1,(%rsi)
>
>
> Looks like the top 16 bits of %rsi are flipped.
>
> Also weird to see a fork(). What's your qemu command line?

/usr/bin/kvm -monitor unix:/var/run/qemu-server/113.mon,server,nowait -vnc unix:/var/run/qemu-server/113.vnc,password -pidfile /var/run/qemu-server/113.pid -daemonize -usbdevice tablet -name swcache -smp sockets=1,cores=1 -nodefaults -boot menu=on -vga cirrus -tdf -k de -drive file=/var/lib/vz/template/iso/systemrescuecd-x86-2.0.0.iso,if=ide,index=2,media=cdrom -drive file=/var/lib/vz/images/113/vm-113-disk-1.raw,if=scsi,index=0,cache=none,boot=on -m 1024 -netdev type=tap,id=vlan0d0,ifname=tap113i0d0,script=/var/lib/qemu-server/bridge-vlan,vhost=on -device virtio-net-pci,mac=DE:42:48:50:D8:69,netdev=vlan0d0 -netdev type=tap,id=vlan100d0,ifname=tap113i100d0,script=/var/lib/qemu-server/bridge-vlan,vhost=on -device virtio-net-pci,mac=72:D2:6E:8E:07:4D,netdev=vlan100d0

-- 
Tomasz Chmielewski
http://wpkg.org

* Re: 2.6.38.1 general protection fault
From: Avi Kivity @ 2011-03-27 9:42 UTC
  To: Tomasz Chmielewski; +Cc: kvm@vger.kernel.org, Andrea Arcangeli

On 03/26/2011 12:42 PM, Tomasz Chmielewski wrote:
> On 26.03.2011 10:15, Avi Kivity wrote:
> > On 03/25/2011 11:32 AM, Tomasz Chmielewski wrote:
> >> I got this on a 2.6.38.1 system which (I think) had some problem
> >> accessing guest image on a btrfs filesystem.
> >>
> >>
> >> general protection fault: 0000 [#1] SMP
>
> (...)
>
> > 0: 55 push %rbp
> > 1: 48 89 e5 mov %rsp,%rbp
> > 4: 41 55 push %r13
> > 6: 41 54 push %r12
> > 8: 53 push %rbx
> > 9: 48 83 ec 08 sub $0x8,%rsp
> > d: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1)
> > 12: 45 31 e4 xor %r12d,%r12d
> > 15: 48 89 fb mov %rdi,%rbx
> > 18: 49 89 f5 mov %rsi,%r13
> > 1b: eb 1d jmp 0x3a
> > 1d: 0f 1f 00 nopl (%rax)
> > 20: f6 06 01 testb $0x1,(%rsi)
> >
> >
> > Looks like the top 16 bits of %rsi are flipped.
> >
> > Also weird to see a fork(). What's your qemu command line?
>
> /usr/bin/kvm -monitor unix:/var/run/qemu-server/113.mon,server,nowait -vnc unix:/var/run/qemu-server/113.vnc,password -pidfile /var/run/qemu-server/113.pid -daemonize -usbdevice tablet -name swcache -smp sockets=1,cores=1 -nodefaults -boot menu=on -vga cirrus -tdf -k de -drive file=/var/lib/vz/template/iso/systemrescuecd-x86-2.0.0.iso,if=ide,index=2,media=cdrom -drive file=/var/lib/vz/images/113/vm-113-disk-1.raw,if=scsi,index=0,cache=none,boot=on -m 1024 -netdev type=tap,id=vlan0d0,ifname=tap113i0d0,script=/var/lib/qemu-server/bridge-vlan,vhost=on -device virtio-net-pci,mac=DE:42:48:50:D8:69,netdev=vlan0d0 -netdev type=tap,id=vlan100d0,ifname=tap113i100d0,script=/var/lib/qemu-server/bridge-vlan,vhost=on -device virtio-net-pci,mac=72:D2:6E:8E:07:4D,netdev=vlan100d0
>

Okay, the fork came from the ,script=.

The issue with %rsi looks like a use-after-free, however kvm_mmu_notifier_invalidate_range_start appears to be properly srcu protected.

-- 
error compiling committee.c: too many arguments to function

* Re: 2.6.38.1 general protection fault
From: Tomasz Chmielewski @ 2011-03-28 6:24 UTC
  To: Avi Kivity; +Cc: kvm@vger.kernel.org, Andrea Arcangeli

On 27.03.2011 11:42, Avi Kivity wrote:

(...)

> Okay, the fork came from the ,script=.
>
> The issue with %rsi looks like a use-after-free, however
> kvm_mmu_notifier_invalidate_range_start appears to be properly srcu
> protected.

FYI, I saw this one as well:

http://www.virtall.com/files/temp/kvm.txt

If you need to look at the config, it's available here:

http://www.virtall.com/files/temp/config-2.6.38.1

-- 
Tomasz Chmielewski
http://wpkg.org

* Re: 2.6.38.1 general protection fault
From: Avi Kivity @ 2011-03-28 9:19 UTC
  To: Tomasz Chmielewski; +Cc: kvm@vger.kernel.org, Andrea Arcangeli, Marcelo Tosatti

On 03/28/2011 08:24 AM, Tomasz Chmielewski wrote:
> On 27.03.2011 11:42, Avi Kivity wrote:
>
> (...)
>
>> Okay, the fork came from the ,script=.
>>
>> The issue with %rsi looks like a use-after-free, however
>> kvm_mmu_notifier_invalidate_range_start appears to be properly srcu
>> protected.
>
> FYI, I saw this one as well:
>
> http://www.virtall.com/files/temp/kvm.txt

Similar pattern - top 16 bits of %rsi are flipped.

Marcelo, what was the option to enable padding for allocations and overrun detection? Also use-after-free?

-- 
error compiling committee.c: too many arguments to function

* Re: 2.6.38.1 general protection fault
From: Andrea Arcangeli @ 2011-03-28 17:54 UTC
  To: Avi Kivity; +Cc: Tomasz Chmielewski, kvm@vger.kernel.org, Marcelo Tosatti

Hello everyone,

On Mon, Mar 28, 2011 at 11:19:51AM +0200, Avi Kivity wrote:
> On 03/28/2011 08:24 AM, Tomasz Chmielewski wrote:
> > On 27.03.2011 11:42, Avi Kivity wrote:
> >
> > (...)
> >
> >> Okay, the fork came from the ,script=.
> >>
> >> The issue with %rsi looks like a use-after-free, however
> >> kvm_mmu_notifier_invalidate_range_start appears to be properly srcu
> >> protected.
> >
> > FYI, I saw this one as well:
> >
> > http://www.virtall.com/files/temp/kvm.txt
>
> Similar pattern - top 16 bits of %rsi are flipped.
>
> Marcelo, what was the option to enable padding for allocations and
> overrun detection? Also use-after-free?

BTW, is it genuine that a protection fault is generated instead of a page fault while dereferencing address 0x00008805d6b087f8? I would normally expect a page fault from a memory dereference that doesn't alter processor state/segments.

The other GPF happened in pmdp_clear_flush_notify inside collapse_huge_page.

* Re: 2.6.38.1 general protection fault
From: Avi Kivity @ 2011-03-28 18:02 UTC
  To: Andrea Arcangeli; +Cc: Tomasz Chmielewski, kvm@vger.kernel.org, Marcelo Tosatti

On 03/28/2011 07:54 PM, Andrea Arcangeli wrote:
> BTW, is it genuine that a protection fault is generated instead of a page
> fault while dereferencing address 0x00008805d6b087f8? I would normally
> expect a page fault from a memory dereference that doesn't alter
> processor state/segments.

Yes. Bits 48-63 of the address must be equal to bit 47, or a #GP is generated (non-canonical address).

-- 
error compiling committee.c: too many arguments to function

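The canonical-address rule stated above can be written down as a short predicate. The following is a sketch only, assuming 48-bit virtual addresses as in the explanation; the function name is hypothetical and is not taken from the kernel or from the thread.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper, assuming 48-bit virtual addressing:
 * an address is canonical when bits 48-63 all equal bit 47.
 * Shifting right by 47 leaves bits 47-63 (17 bits), which must
 * therefore be either all zeros or all ones.
 */
static bool is_canonical48(uint64_t addr)
{
    uint64_t upper = addr >> 47;

    return upper == 0 || upper == UINT64_C(0x1ffff);
}
```

For the address quoted in the question, 0x00008805d6b087f8, bit 47 is set while bits 48-63 are clear, so the predicate returns false. With the top 16 bits set (0xffff8805d6b087f8) it returns true.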
* Re: 2.6.38.1 general protection fault
From: Andrea Arcangeli @ 2011-03-28 20:04 UTC
  To: Avi Kivity; +Cc: Tomasz Chmielewski, kvm@vger.kernel.org, Marcelo Tosatti

On Mon, Mar 28, 2011 at 08:02:47PM +0200, Avi Kivity wrote:
> On 03/28/2011 07:54 PM, Andrea Arcangeli wrote:
> > BTW, is it genuine that a protection fault is generated instead of a page
> > fault while dereferencing address 0x00008805d6b087f8? I would normally
> > expect a page fault from a memory dereference that doesn't alter
> > processor state/segments.
>
> Yes. Bits 48-63 of the address must be equal to bit 47, or a #GP is
> generated (non-canonical address).

Ok, when you said 16 bit reversed I didn't match it to bit 48 and max 128TB of user address space. I thought it was a good idea to check because in the past I've seen GPFs that were hardware issues triggering on normal memory dereference, but this is probably not the case.

Tomasz, how easily can you reproduce? Could you upload to the site the output of objdump -dr arch/x86/kvm/mmu.o too? (my assembly is vastly different than the one shown so far, I may find more info in the oops if I get the assembly of the caller too and of the iteration of the loop that runs in that function before the GPF)

khugepaged is present in your second trace (and khugepaged is mangling over some memslot range with guest gfn mapped or kvm_unmap_rmapp wouldn't be called in the first place, hope the memslots are all ok) but probably you didn't get the right alignment so likely the THP are mapped as 4k pages in the guest, which must work fine too. I wonder if that might be related to that (my qemu-kvm I keep it patched with the patch below which isn't yet polished enough to be digestible for qemu, wrong alignments, x86 4M alignment not handled yet, and not sure if the DONTFORK fix to prevent OOM with hotplug/migrate is acceptable in that position).

Can you try to "echo 0 >/sys/kernel/mm/transparent_hugepage/khugepaged/scan_sleep_millisecs" and then run "cat /proc/`pgrep qemu`/smaps >/dev/null" once per minute (or find the right pid by hand if you've more than one qemu process running). This debug trick will only work for 2.6.38.1, as 2.6.39 has a native THP handling in the smaps file, but in 2.6.38.1 it should flush all sptes mapped on THP just like fork (this might help to reproduce).

I'm also surprised this happened during the fork that initializes the tap interface, shouldn't that fork run before any sptes are established? (we're running the spte invalidate with mmu notifier in the parent before wrprotecting the ptes during fork)

I also wonder if it's a memslot race of some kind, I don't see anything wrong in the rmapp handling at the moment.

This isn't a patch to try, I'm only showing it here for reference as I suspect it might hide the bug. I'm now going to reverse it and see if I can reproduce, in case having large sptes (instead of 4k sptes) always mapped on host THP changes something.

Thanks!

diff --git a/exec.c b/exec.c
index bb0c1be..f60e5fe 100644
--- a/exec.c
+++ b/exec.c
@@ -2856,6 +2856,18 @@ static ram_addr_t last_ram_offset(void)
     return last;
 }
 
+#if defined(__linux__) && defined(__x86_64__)
+/*
+ * Align on the max transparent hugepage size so that
+ * "(gfn ^ pfn) & (HPAGE_SIZE-1) == 0" to allow KVM to
+ * take advantage of hugepages with NPT/EPT or to
+ * ensure the first 2M of the guest physical ram will
+ * be mapped by the same hugetlb for QEMU (it is worth
+ * it even without NPT/EPT).
+ */
+#define PREFERRED_RAM_ALIGN (2*1024*1024)
+#endif
+
 ram_addr_t qemu_ram_alloc_from_ptr(DeviceState *dev, const char *name,
                                    ram_addr_t size, void *host)
 {
@@ -2902,9 +2914,15 @@ ram_addr_t qemu_ram_alloc_from_ptr(DeviceState *dev, const char *name,
                                    PROT_EXEC|PROT_READ|PROT_WRITE,
                                    MAP_SHARED | MAP_ANONYMOUS, -1, 0);
 #else
-            new_block->host = qemu_vmalloc(size);
+#ifdef PREFERRED_RAM_ALIGN
+            if (size >= PREFERRED_RAM_ALIGN)
+                new_block->host = qemu_memalign(PREFERRED_RAM_ALIGN, size);
+            else
+#endif
+                new_block->host = qemu_vmalloc(size);
 #endif
             qemu_madvise(new_block->host, size, QEMU_MADV_MERGEABLE);
+            qemu_madvise(new_block->host, size, QEMU_MADV_DONTFORK);
         }
     }

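The alignment condition quoted in the patch comment can be illustrated with a small standalone C sketch. This is not QEMU or KVM code: the helper names and the use of plain byte addresses (rather than gfn/pfn frame numbers) are assumptions made for illustration, and the 2 MiB constant simply mirrors PREFERRED_RAM_ALIGN from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* 2 MiB, mirroring PREFERRED_RAM_ALIGN in the patch above. */
#define HPAGE_SIZE_BYTES (UINT64_C(2) * 1024 * 1024)

/*
 * Hypothetical helper: true when two byte addresses have the same
 * offset within a 2 MiB huge page. Note the parentheses: in C, ==
 * binds more tightly than &, so the mask must be applied before
 * the comparison with zero.
 */
static bool same_hpage_offset(uint64_t a, uint64_t b)
{
    return ((a ^ b) & (HPAGE_SIZE_BYTES - 1)) == 0;
}

/* Hypothetical helper: round an address up to a 2 MiB boundary. */
static uint64_t hpage_align_up(uint64_t addr)
{
    return (addr + HPAGE_SIZE_BYTES - 1) & ~(HPAGE_SIZE_BYTES - 1);
}
```

Under these assumptions, a RAM block whose host address is 2 MiB aligned satisfies the condition against a guest range that also starts on a 2 MiB boundary, which is presumably what requesting PREFERRED_RAM_ALIGN from qemu_memalign() is meant to arrange.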
* Re: 2.6.38.1 general protection fault
From: Tomasz Chmielewski @ 2011-03-28 20:14 UTC
  To: Andrea Arcangeli; +Cc: Avi Kivity, kvm@vger.kernel.org, Marcelo Tosatti

On 28.03.2011 22:04, Andrea Arcangeli wrote:

> Tomasz, how easily can you reproduce?

Well, this server runs 10 VMs or so, and it happens after 1-2 days of uptime.

I reverted now to a 2.6.35.x, as it had enough downtime with 2.6.38 already ;) so I'd rather not experiment anymore for some time with a kernel known to cause problems.

> Could you upload to the site the
> output of objdump -dr arch/x86/kvm/mmu.o too?

http://virtall.com/files/temp/mmu-objdump.txt

-- 
Tomasz Chmielewski
http://wpkg.org

* Re: 2.6.38.1 general protection fault
From: Thomas Treutner @ 2011-04-20 9:28 UTC
  To: Tomasz Chmielewski; +Cc: kvm@vger.kernel.org

On 03/28/2011 10:14 PM, Tomasz Chmielewski wrote:
> On 28.03.2011 22:04, Andrea Arcangeli wrote:
>
>> Tomasz, how easily can you reproduce?
>
> Well, this server runs 10 VMs or so, and it happens after 1-2 days of
> uptime.
>
> I reverted now to a 2.6.35.x, as it had enough downtime with 2.6.38
> already ;) so I'd rather not experiment anymore for some time with a
> kernel known to cause problems.

Tomasz, to which exact kernel version (host+guests) did you switch and is it now stable?

thanks,
-t

* Re: 2.6.38.1 general protection fault
From: Tomasz Chmielewski @ 2011-04-20 10:54 UTC
  To: Thomas Treutner; +Cc: kvm@vger.kernel.org

On 20.04.2011 11:28, Thomas Treutner wrote:
> On 03/28/2011 10:14 PM, Tomasz Chmielewski wrote:
>> On 28.03.2011 22:04, Andrea Arcangeli wrote:
>>
>>> Tomasz, how easily can you reproduce?
>>
>> Well, this server runs 10 VMs or so, and it happens after 1-2 days of
>> uptime.
>>
>> I reverted now to a 2.6.35.x, as it had enough downtime with 2.6.38
>> already ;) so I'd rather not experiment anymore for some time with a
>> kernel known to cause problems.
>
> Tomasz, to which exact kernel version (host+guests) did you switch and
> is it now stable?

I've switched the host to the latest 2.6.35.x and it's stable.

Guest kernel doesn't seem to make a difference here, but the majority of them are running a 2.6.38.x kernel (I had some weird issues with "events/0" taking 100% CPU on guests when I used 2.6.35, which made the guests crawling slow).

-- 
Tomasz Chmielewski
http://wpkg.org

* Re: 2.6.38.1 general protection fault
From: Marcelo Tosatti @ 2011-03-29 13:34 UTC
  To: Avi Kivity; +Cc: Tomasz Chmielewski, kvm@vger.kernel.org, Andrea Arcangeli

On Mon, Mar 28, 2011 at 11:19:51AM +0200, Avi Kivity wrote:
> On 03/28/2011 08:24 AM, Tomasz Chmielewski wrote:
> >On 27.03.2011 11:42, Avi Kivity wrote:
> >
> >(...)
> >
> >>Okay, the fork came from the ,script=.
> >>
> >>The issue with %rsi looks like a use-after-free, however
> >>kvm_mmu_notifier_invalidate_range_start appears to be properly srcu
> >>protected.
> >
> >FYI, I saw this one as well:
> >
> >http://www.virtall.com/files/temp/kvm.txt
>
> Similar pattern - top 16 bits of %rsi are flipped.
>
> Marcelo, what was the option to enable padding for allocations and
> overrun detection? Also use-after-free?

slub_debug=ZFPU boot kernel parameter.

Documentation/vm/slub.txt:

Possible debug options are
        F               Sanity checks on (enables SLAB_DEBUG_FREE. Sorry
                        SLAB legacy issues)
        Z               Red zoning
        P               Poisoning (object and padding)
        U               User tracking (free and alloc)
