From mboxrd@z Thu Jan 1 00:00:00 1970
From: Marcelo Tosatti
Subject: Re: cr3 OOS optimisation breaks 32-bit GNU/kFreeBSD guest
Date: Sun, 22 Feb 2009 22:47:13 -0300
Message-ID: <20090223014713.GA11438@amt.cnet>
References: <20090223003305.GW12976@hall.aurel32.net>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: kvm@vger.kernel.org
To: Aurelien Jarno
Return-path:
Received: from mx2.redhat.com ([66.187.237.31]:38205 "EHLO mx2.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752169AbZBWBr6 (ORCPT ); Sun, 22 Feb 2009 20:47:58 -0500
Content-Disposition: inline
In-Reply-To: <20090223003305.GW12976@hall.aurel32.net>
Sender: kvm-owner@vger.kernel.org
List-ID:

On Mon, Feb 23, 2009 at 01:33:05AM +0100, Aurelien Jarno wrote:
> Hi,
>
> Since kvm-81, I have noticed that GNU/kFreeBSD 32-bit guest are crashing
> under high load (during a compilation for example) with the following
> error message:
>
> | Fatal trap 12: page fault while in kernel mode
> | fault virtual address = 0x4
> | fault code = supervisor read, page not present
> | instruction pointer = 0x20:0xc0a4fc00
> | stack pointer = 0x28:0xe66d7a70
> | frame pointer = 0x28:0xe66d7a80
> | code segment = base 0x0, limit 0xfffff, type 0x1b
> | = DPL 0, pres 1, def32 1, gran 1
> | processor eflags = interrupt enabled, resume, IOPL = 0
> | current process = 24037 (bash)
> | trap number = 12
> | panic: page fault
> | Uptime: 4m7s
> | Cannot dump. No dump device defined.
> | Automatic reboot in 15 seconds - press a key on the console to abort
>
> I haven't tried yet with a plain FreeBSD guest, but I also expect it to
> crash given the kernel (version 7.1) is almost the same.
> A closer investigation has shown that the following commit is causing the
> problem:
>
> | commit 6364a3918cb5c28376849e7fca3e09bd66b859f3
> | Author: Marcelo Tosatti
> | Date: Mon Dec 1 22:32:04 2008 -0200
> |
> | KVM: MMU: skip global pgtables on sync due to cr3 switch
> |
> | Skip syncing global pages on cr3 switch (but not on cr4/cr0). This is
> | important for Linux 32-bit guests with PAE, where the kmap page is
> | marked as global.
> |
> | Signed-off-by: Marcelo Tosatti
> | Signed-off-by: Avi Kivity
>
> As expected, loading the KVM module with oos_shadow=0 workaround the
> problem. Please note that the guest is running in 32-bit mode, does not
> use PAE, and uses global pages. My host has an Intel Q9450 CPU, and the
> problem appears with both a 2.6.26 and a 2.6.28 64-bit kernel.
>
> Does anybody see any problem in this patch? How can I further
> debug the problem?

Aurelien,

Maybe there is a bug in the syncing code (e.g. not all global pages are
sync'ed when the OS requests a global sync), or FreeBSD is "relying" on
invlpg/cr3 write to sync global pages (remember that TLB entries can be
invalidated internally by the CPU).

If you want to debug it, I would suggest looping over all MMU pages in
mmu_sync_global, after the kvm_sync_page loop, and

	WARN_ON(sp->unsync && sp->global);

If that fails, check if the unsync and global flags mean what they are
supposed to.

Sorry for the trouble and thanks for the detailed report, I will take a
close look at it this week.
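
For reference, the check suggested above can be modelled in plain userspace C.
This is only a sketch of the idea, not KVM code: struct shadow_page,
model_sync_page, model_sync_global and count_stale_global are hypothetical
stand-ins, and the "listed" array is an assumed model of whatever list the
global sync walks. The point it illustrates is that a global page left unsync
after a global sync means the sync list missed it.

```c
/* Userspace model only: every name here is a hypothetical stand-in,
 * not a real KVM structure or function. */
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

struct shadow_page {
    bool unsync;  /* shadow copy is out of sync with the guest page table */
    bool global;  /* page holds guest global translations */
};

/* Stand-in for syncing one page: clears its unsync flag. */
static void model_sync_page(struct shadow_page *sp)
{
    sp->unsync = false;
}

/* Stand-in for the global sync: syncs only the pages whose indices are
 * tracked in 'listed'. A global unsync page missing from that list is
 * the kind of bug the check below is meant to expose. */
static void model_sync_global(struct shadow_page *pages,
                              const size_t *listed, size_t n_listed)
{
    for (size_t i = 0; i < n_listed; i++)
        model_sync_page(&pages[listed[i]]);
}

/* The suggested check: after the sync loop, walk all pages and report
 * any that are still both unsync and global. Returns how many. */
static size_t count_stale_global(const struct shadow_page *pages, size_t n)
{
    size_t stale = 0;

    for (size_t i = 0; i < n; i++) {
        if (pages[i].unsync && pages[i].global) {
            fprintf(stderr, "WARN: page %zu still unsync and global\n", i);
            stale++;
        }
    }
    return stale;
}
```

In the real code the equivalent would be the WARN_ON inside the loop over all
MMU pages; a non-zero count in this model corresponds to the warning firing.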