From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path: 
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1755531Ab2LSPnX (ORCPT );
	Wed, 19 Dec 2012 10:43:23 -0500
Received: from mx1.redhat.com ([209.132.183.28]:8890 "EHLO mx1.redhat.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1751481Ab2LSPnN (ORCPT );
	Wed, 19 Dec 2012 10:43:13 -0500
Message-ID: <1355931777.3224.562.camel@bling.home>
Subject: Re: [PATCH 0/7] KVM: Alleviate mmu_lock hold time when we start dirty logging
From: Alex Williamson 
To: Takuya Yoshikawa 
Cc: Takuya Yoshikawa , mtosatti@redhat.com, gleb@redhat.com,
	kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Date: Wed, 19 Dec 2012 08:42:57 -0700
In-Reply-To: <20121219213037.b234f9d4f187df2132e65576@gmail.com>
References: <20121218162558.65a8bfd3.yoshikawa_takuya_b1@lab.ntt.co.jp>
	<20121219213037.b234f9d4f187df2132e65576@gmail.com>
Content-Type: text/plain; charset="UTF-8"
Mime-Version: 1.0
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID: 
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, 2012-12-19 at 21:30 +0900, Takuya Yoshikawa wrote:
> Ccing Alex,
> 
> I tested kvm.git next branch before Alex's patch set was applied, and
> did not see the bug.
> Although I'm not 100% sure, it is possible
> that something got broken by a patch in the following series:
> 
> [01/10] KVM: Restrict non-existing slot state transitions
> [02/10] KVM: Check userspace_addr when modifying a memory slot
> [03/10] KVM: Fix iommu map/unmap to handle memory slot moves
> [04/10] KVM: Minor memory slot optimization
> [05/10] KVM: Rename KVM_MEMORY_SLOTS -> KVM_USER_MEM_SLOTS
> [06/10] KVM: Make KVM_PRIVATE_MEM_SLOTS optional
> [07/10] KVM: struct kvm_memory_slot.user_alloc -> bool
> [08/10] KVM: struct kvm_memory_slot.flags -> u32
> [09/10] KVM: struct kvm_memory_slot.id -> short
> [10/10] KVM: Increase user memory slots on x86 to 125
> 
> If I can find time, I will check which one caused the problem tomorrow.

Please let me know if you can identify one of these as the culprit.
They're all very simple, but there's always a chance I've missed a hard
coding of slot numbers somewhere.  Thanks,

Alex

> On Tue, 18 Dec 2012 16:25:58 +0900
> Takuya Yoshikawa wrote:
> 
> > IMPORTANT NOTE (not about this patch set):
> > 
> > I have hit the following bug many times with the current next branch,
> > even WITHOUT my patches. Although I do not know a way to reproduce this
> > yet, it seems that something was broken around slot->dirty_bitmap. I am
> > now investigating the new code in __kvm_set_memory_region().
> > 
> > The bug:
> > [ 575.238063] BUG: unable to handle kernel paging request at 00000002efe83a77
> > [ 575.238185] IP: [] mark_page_dirty_in_slot+0x19/0x20 [kvm]
> > [ 575.238308] PGD 0
> > [ 575.238343] Oops: 0002 [#1] SMP
> > 
> > The call trace:
> > [ 575.241207] Call Trace:
> > [ 575.241257] [] kvm_write_guest_cached+0x91/0xb0 [kvm]
> > [ 575.241370] [] kvm_arch_vcpu_ioctl_run+0x1109/0x12c0 [kvm]
> > [ 575.241488] [] ? kvm_arch_vcpu_ioctl_run+0xa5/0x12c0 [kvm]
> > [ 575.241595] [] ? mutex_lock_killable_nested+0x274/0x340
> > [ 575.241706] [] ? kvm_set_ioapic_irq+0x20/0x20 [kvm]
> > [ 575.241813] [] kvm_vcpu_ioctl+0x559/0x670 [kvm]
> > [ 575.241913] [] ? kvm_vm_ioctl+0x1b8/0x570 [kvm]
> > [ 575.242007] [] ? native_sched_clock+0x13/0x80
> > [ 575.242125] [] ? sched_clock+0x9/0x10
> > [ 575.242208] [] ? sched_clock_cpu+0xbd/0x110
> > [ 575.242298] [] ? fget_light+0x3c/0x140
> > [ 575.242381] [] do_vfs_ioctl+0x98/0x570
> > [ 575.242463] [] ? fget_light+0xa1/0x140
> > [ 575.246393] [] ? fget_light+0x3c/0x140
> > [ 575.250363] [] sys_ioctl+0x91/0xb0
> > [ 575.254327] [] system_call_fastpath+0x16/0x1b