From mboxrd@z Thu Jan  1 00:00:00 1970
From: Marcelo Tosatti <mtosatti@redhat.com>
Subject: Re: Out of sync shadow core breaks Hurd
Date: Thu, 20 Nov 2008 10:48:21 +0100
Message-ID: <20081120094821.GA990@dmt.cnet>
References: <20081112190037.GA4009@volta.aurel32.net>
Mime-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Transfer-Encoding: QUOTED-PRINTABLE
Cc: kvm@vger.kernel.org
To: Aurelien Jarno <aurelien@aurel32.net>
Return-path: <kvm-owner@vger.kernel.org>
Received: from mx2.redhat.com ([66.187.237.31]:53398 "EHLO mx2.redhat.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1750867AbYKTMuf (ORCPT <rfc822;kvm@vger.kernel.org>);
	Thu, 20 Nov 2008 07:50:35 -0500
Content-Disposition: inline
In-Reply-To: <20081112190037.GA4009@volta.aurel32.net>
Sender: kvm-owner@vger.kernel.org
List-ID: <kvm.vger.kernel.org>

Hi Aurelien,

On Wed, Nov 12, 2008 at 08:00:37PM +0100, Aurelien Jarno wrote:
> Hi,
>
> Starting with kvm-76 (and including kvm-79), Hurd does not boot anymore
> under KVM. The ext2fs translator issues a strange error message:
>
> | Hurd server bootstrap: ext2fs.static[device:hd0s3] execext2fs.static: /build/bui
> | ldd/hurd-20080607/build-tree/hurd/ext2fs/dir.c:494: dirscanblock: Assertion `dp-
> | >dn->dirents[idx] == -1 || dp->dn->dirents[idx] == nentries' failed.           -
> | >dn->dirents[idx] == -1 || dp->dn->dirents[idx] == nentries' failed.
>
> Bisecting the problem, I have found that it comes from this patch:
>
> | 641fb03992b20aa640781a245f6b7136f0b845e4 is first bad commit
> | commit 641fb03992b20aa640781a245f6b7136f0b845e4
> | Author: Marcelo Tosatti <mtosatti@redhat.com>
> | Date:   Tue Sep 23 13:18:39 2008 -0300
> |
> |     KVM: MMU: out of sync shadow core v2
> |
> |     Allow guest pagetables to go out of sync.
> |
> |     Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
> |     Signed-off-by: Avi Kivity <avi@redhat.com>
>
> The problem can be workarounded loading the kvm module with
> oos_shadow=0.
>
> The easiest way to reproduce the problem is to download a ready to use
> Hurd image [1]. The error message from the ext2fs translator is not
> exactly the same, but it still fails.

It seems Hurd does not always explicitly flush the TLB via cr0/cr3/cr4
writes or invlpg after updating pagetables. Debugging shows that OOS is
properly syncing the sptes wrt the guest pagetables, and that all pages
are synced before guest re-entry on TLB flush exits.

The Intel TLB doc says (5.1 "Invalidation Instructions"):

(Other instructions and operations may invalidate entries in the TLBs
and the paging structure caches, but the instructions identified above
are recommended.)

As a test, syncing the shadow roots on every exit makes Hurd happy:

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 7a2aeba..47e2550 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3052,6 +3052,8 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 
 	kvm_lapic_sync_from_vapic(vcpu);
 
+	kvm_mmu_sync_roots(vcpu);
+
 	r = kvm_x86_ops->handle_exit(kvm_run, vcpu);
 out:
 	return r;
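
To illustrate the reasoning above, here is a toy userspace model (not KVM
code) of why a guest that skips the TLB flush can see stale translations
with OOS, and why syncing on every exit hides it. Everything in it
(struct toy_mmu, guest_write_pte, guest_other_exit, the single-level
"page table") is made up for illustration and only assumes that shadow
entries are resynced when the guest flushes:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define NPAGES 4

/* Hypothetical one-level model: the guest's table and the shadow copy
 * that the "hardware" actually walks. */
struct toy_mmu {
    int guest_pte[NPAGES];
    int shadow_pte[NPAGES];
    bool sync_on_every_exit;    /* models the test hack above */
};

static void toy_init(struct toy_mmu *m, bool sync_on_every_exit)
{
    memset(m, 0, sizeof(*m));
    m->sync_on_every_exit = sync_on_every_exit;
}

/* Bring the shadow entries back in line with the guest table. */
static void toy_sync(struct toy_mmu *m)
{
    memcpy(m->shadow_pte, m->guest_pte, sizeof(m->shadow_pte));
}

/* Guest stores a PTE. In this model the store is not trapped, so the
 * shadow copy is left untouched (out of sync). */
static void guest_write_pte(struct toy_mmu *m, size_t idx, int frame)
{
    assert(idx < NPAGES);
    m->guest_pte[idx] = frame;
}

/* Guest flushes the TLB (cr3 write, invlpg, ...): this traps and the
 * model resyncs, as a well-behaved guest expects. */
static void guest_flush_tlb(struct toy_mmu *m)
{
    toy_sync(m);
}

/* Any exit unrelated to the MMU. It only resyncs when the
 * sync-on-every-exit hack is enabled. */
static void guest_other_exit(struct toy_mmu *m)
{
    if (m->sync_on_every_exit)
        toy_sync(m);
}

/* Translation as seen by the hardware: always via the shadow copy. */
static int hw_translate(const struct toy_mmu *m, size_t idx)
{
    assert(idx < NPAGES);
    return m->shadow_pte[idx];
}
```

With sync_on_every_exit off, a guest_write_pte() followed only by
guest_other_exit() leaves hw_translate() returning the old entry until
guest_flush_tlb() is called. With it on, the new entry is visible after
any exit, which is what the test patch would be doing for Hurd if the
missing-flush theory is right.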

It would be necessary to confirm this by hacking Hurd to flush on every
pagetable update. Perhaps something like

RCS file: /sources/hurd/gnumach/i386/intel/pmap.c,v
retrieving revision 1.4.2.22
diff -u -r1.4.2.22 pmap.c
--- pmap.c  11 Nov 2008 02:24:18 -0000  1.4.2.22
+++ pmap.c  20 Nov 2008 12:47:01 -0000
@@ -82,7 +82,7 @@
 #include <i386/proc_reg.h>
 #include <i386/locore.h>
 
-#define    WRITE_PTE(pte_p, pte_entry)     *(pte_p) = (pte_entry);
+#define    WRITE_PTE(pte_p, pte_entry)     *(pte_p) = (pte_entry); flush_tlb();
 
 /*
  * Private data structures.