From mboxrd@z Thu Jan  1 00:00:00 1970
From: Avi Kivity
Subject: Re: 32-bit FreeBSD under 64-bit KVM
Date: Thu, 08 Mar 2007 17:12:24 +0200
Message-ID: <45F027D8.9070001@qumranet.com>
References: <45E38511.9000006@aurel32.net>
 <45E3D902.7000102@qumranet.com>
 <20070301214843.GA28822@volta.aurel32.net>
 <45E92B4D.8030501@qumranet.com>
 <45EECD2D.20209@aurel32.net>
 <45EECD5A.5090801@qumranet.com>
 <20070307174737.GA8340@farad.aurel32.net>
 <45EEFEB3.1090809@qumranet.com>
 <45EF5039.4010907@aurel32.net>
Mime-Version: 1.0
Content-Type: multipart/mixed;
 boundary="------------000100060308010403070801"
Cc: kvm-devel
To: Aurelien Jarno
In-Reply-To: <45EF5039.4010907-rXXEIb44qovR7s880joybQ@public.gmane.org>
Sender: kvm-devel-bounces-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f@public.gmane.org
Errors-To: kvm-devel-bounces-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f@public.gmane.org
List-Id: kvm.vger.kernel.org

This is a multi-part message in MIME format.
--------------000100060308010403070801
Content-Type: text/plain; charset=ISO-8859-15; format=flowed
Content-Transfer-Encoding: 7bit

Aurelien Jarno wrote:
>> Okay, an mmu bug.  Been a while since we've seen one.
>>
>> Please post a URL for the .iso so I can take a look, along with exact
>> instructions for reproducing the bug.
>>
>
> I have put a qcow image on http://aurel32.free.fr . You have to bunzip2
> the image and then run:
>
>   qemu-system-x86_64 -hda gnu_kfreebsd.qcow
>
> Then wait for the system to boot and look at the boot process. The
> kernel should boot fine, but when INIT is started, you will get a
> "SEGMENTATION VIOLATION" from the kernel.
>
> At least is what I observe here on two different computers, both with
> AMD CPU (Turion 64 X2 and Athlon 64 X2), for kvm versions 14 (maybe also
> before) through 16. Note also that the problem is not always
> reproducible if the system load is high (for example running cpuburn on
> both cores).
>

The attached patch should fix it.  If you're using the external module,
you'll need to apply with 'patch -p3' in the kernel/ subdirectory.

-- 
error compiling committee.c: too many arguments to function

--------------000100060308010403070801
Content-Type: text/x-patch;
 name="kvm-fix-nonpae-pde-writes.patch"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline;
 filename="kvm-fix-nonpae-pde-writes.patch"

commit 6ee9853b015f8807f497ffad39b142ddc1403aa9
Author: Avi Kivity
Date:   Thu Mar 8 17:13:32 2007 +0200

    KVM: MMU: Fix guest writes to nonpae pde

    KVM shadow page tables are always in pae mode, regardless of the guest
    setting.  This means that a guest pde (mapping 4MB of memory) is mapped
    to two shadow pdes (mapping 2MB each).

    When the guest writes to a pte or pde, we intercept the write and
    emulate it.  We also remove any shadowed mappings corresponding to the
    write.  Since the mmu did not account for the doubling in the number of
    pdes, it removed the wrong entry, resulting in a mismatch between shadow
    page tables and guest page tables, followed shortly by guest memory
    corruption.

    This patch fixes the problem by detecting the special case of writing
    to a non-pae pde and adjusting the address and number of shadow pdes
    zapped accordingly.

    Signed-off-by: Avi Kivity

diff --git a/drivers/kvm/mmu.c b/drivers/kvm/mmu.c
index a7b3e2a..f5d45b0 100644
--- a/drivers/kvm/mmu.c
+++ b/drivers/kvm/mmu.c
@@ -1093,22 +1093,40 @@ out:
 	return r;
 }
 
+static void mmu_pre_write_zap_pte(struct kvm_vcpu *vcpu,
+				  struct kvm_mmu_page *page,
+				  u64 *spte)
+{
+	u64 pte;
+	struct kvm_mmu_page *child;
+
+	pte = *spte;
+	if (is_present_pte(pte)) {
+		if (page->role.level == PT_PAGE_TABLE_LEVEL)
+			rmap_remove(vcpu, spte);
+		else {
+			child = page_header(pte & PT64_BASE_ADDR_MASK);
+			mmu_page_remove_parent_pte(vcpu, child, spte);
+		}
+	}
+	*spte = 0;
+}
+
 void kvm_mmu_pre_write(struct kvm_vcpu *vcpu, gpa_t gpa, int bytes)
 {
 	gfn_t gfn = gpa >> PAGE_SHIFT;
 	struct kvm_mmu_page *page;
-	struct kvm_mmu_page *child;
 	struct hlist_node *node, *n;
 	struct hlist_head *bucket;
 	unsigned index;
 	u64 *spte;
-	u64 pte;
 	unsigned offset = offset_in_page(gpa);
 	unsigned pte_size;
 	unsigned page_offset;
 	unsigned misaligned;
 	int level;
 	int flooded = 0;
+	int npte;
 
 	pgprintk("%s: gpa %llx bytes %d\n", __FUNCTION__, gpa, bytes);
 	if (gfn == vcpu->last_pt_write_gfn) {
@@ -1144,22 +1162,26 @@ void kvm_mmu_pre_write(struct kvm_vcpu *vcpu, gpa_t gpa, int bytes)
 		}
 		page_offset = offset;
 		level = page->role.level;
+		npte = 1;
 		if (page->role.glevels == PT32_ROOT_LEVEL) {
 			page_offset <<= 1;	/* 32->64 */
+			/*
+			 * A 32-bit pde maps 4MB while the shadow pdes map
+			 * only 2MB.  So we need to double the offset again
+			 * and zap two pdes instead of one.
+			 */
+			if (level == PT32_ROOT_LEVEL) {
+				page_offset <<= 1;
+				npte = 2;
+			}
 			page_offset &= ~PAGE_MASK;
 		}
 		spte = __va(page->page_hpa);
 		spte += page_offset / sizeof(*spte);
-		pte = *spte;
-		if (is_present_pte(pte)) {
-			if (level == PT_PAGE_TABLE_LEVEL)
-				rmap_remove(vcpu, spte);
-			else {
-				child = page_header(pte & PT64_BASE_ADDR_MASK);
-				mmu_page_remove_parent_pte(vcpu, child, spte);
-			}
+		while (npte--) {
+			mmu_pre_write_zap_pte(vcpu, page, spte);
+			++spte;
 		}
-		*spte = 0;
 	}
 }
 

--------------000100060308010403070801--
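
The offset arithmetic described in the commit message can be checked outside the kernel. The following is a minimal user-space sketch, not kernel code: the names `shadow_slots`, `shadow_slots_for_write` and `GUEST_LEVELS_32` are hypothetical, and it assumes 4KB pages, 4-byte entries in a non-pae guest and 8-byte shadow entries. Given the byte offset of a guest write within a guest page-table page, it returns the index of the first shadow entry to zap and how many entries to zap.

```c
#include <assert.h>

#define PAGE_SIZE        4096u
/* Non-pae guest: two paging levels (stands in for PT32_ROOT_LEVEL). */
#define GUEST_LEVELS_32  2

/* Hypothetical result type: first shadow entry index and entries to zap. */
struct shadow_slots {
	unsigned first;
	unsigned count;
};

/*
 * offset:       byte offset of the guest write within the guest table page
 * guest_levels: number of guest paging levels (2 for a non-pae guest)
 * level:        level of the guest table being written (1 = pte, 2 = pde)
 */
static struct shadow_slots shadow_slots_for_write(unsigned offset,
						  int guest_levels, int level)
{
	struct shadow_slots s;
	unsigned page_offset = offset;

	assert(offset < PAGE_SIZE);

	s.count = 1;
	if (guest_levels == GUEST_LEVELS_32) {
		/* 4-byte guest entry -> 8-byte shadow entry */
		page_offset <<= 1;
		if (level == GUEST_LEVELS_32) {
			/* one 4MB guest pde -> two 2MB shadow pdes */
			page_offset <<= 1;
			s.count = 2;
		}
		/* wrap the offset into a single shadow page */
		page_offset &= PAGE_SIZE - 1;
	}
	s.first = page_offset / 8;	/* 8 bytes per shadow entry */
	return s;
}
```

For example, a write to guest pde 3 (byte offset 12) in a non-pae guest gives first = 6 and count = 2, that is shadow pdes 6 and 7, which together cover 12MB to 16MB. Without the second shift the result would be shadow pde 3, which covers 6MB to 8MB; that is the wrong entry the commit message refers to.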