From mboxrd@z Thu Jan 1 00:00:00 1970
From: Avi Kivity
Subject: Re: [PATCH] Fix SMP shadow instantiation race
Date: Mon, 10 Dec 2007 23:27:45 +0200
Message-ID: <475DAF51.8060804@qumranet.com>
References: <20071210161907.GA13917@dmt> <475D726A.2040901@qumranet.com> <20071210191208.GA15500@dmt>
In-Reply-To: <20071210191208.GA15500@dmt>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
To: Marcelo Tosatti
Cc: kvm-devel
List-Id: kvm.vger.kernel.org

Marcelo Tosatti wrote:
> On Mon, Dec 10, 2007 at 07:07:54PM +0200, Avi Kivity wrote:
>
>> Marcelo Tosatti wrote:
>>
>>> There is a race where VCPU0 is shadowing a pagetable entry while VCPU1
>>> is updating it, which results in a stale shadow copy.
>>>
>>> Fix that by comparing the contents of the cached guest pte with the
>>> current guest pte after write-protecting the guest pagetable.
>>>
>>> Attached program kvm_shadow_race.c demonstrates the problem.
>>>
>> Where is it?
>>
>
> Attached.
>

Can you explain what it does? I get the same results on both host and
guest (successful completion).

>>> diff --git a/drivers/kvm/paging_tmpl.h b/drivers/kvm/paging_tmpl.h
>>> index 72d4816..4fece01 100644
>>> --- a/drivers/kvm/paging_tmpl.h
>>> +++ b/drivers/kvm/paging_tmpl.h
>>> @@ -66,6 +66,7 @@ struct guest_walker {
>>>  	int level;
>>>  	gfn_t table_gfn[PT_MAX_FULL_LEVELS];
>>>  	pt_element_t pte;
>>> +	gpa_t pte_gpa;
>>>
>> I think this needs to be an array like table_gfn[]. The guest may play
>> with the pde (and upper entries) as well as the pte.
>>
>
> I was working under the assumption that the only significant bits of
> upper entries (WRITEABLE and PRESENT) that can be changed by the guest
> must be reflected first in the lower level pte's.
>
> Isnt that a fair assumption to make?
>

The other bits (including the physical addresses) may change too.

There is no requirement that the changes to pde write/present bits be
reflected on pte write/present bits.

Consider a unix kernel implementing fork() by write-protecting the pud
tables. It can write protect the entire user address space by clearing
the write bit on the first 256 pgd entries.

(I don't think Linux does that; maybe that's a worthwhile optimization)

-- 
Any sufficiently difficult bug is indistinguishable from a feature.
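
To illustrate the point about upper-level entries, here is a minimal sketch
in C. It is a toy model, not code from paging_tmpl.h: struct walk_snapshot,
walk_writable(), stale_pte_only() and stale_all_levels() are hypothetical
names, and the entry layout is reduced to the present and writable bits. It
assumes a four-level walk in which write access is granted only when every
level allows it, so a guest can revoke write access at the pgd without
touching the pte.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bit definitions for a toy page-table entry. */
#define PT_PRESENT_MASK  (1ULL << 0)
#define PT_WRITABLE_MASK (1ULL << 1)
#define WALK_LEVELS      4	/* assumed: pgd, pud, pmd, pte */

/*
 * Hypothetical snapshot of one guest walk: entry[0] is the pgd entry,
 * entry[WALK_LEVELS - 1] is the last-level pte.
 */
struct walk_snapshot {
	uint64_t entry[WALK_LEVELS];
};

/* Write access requires every level to be both present and writable. */
static bool walk_writable(const struct walk_snapshot *w)
{
	const uint64_t need = PT_PRESENT_MASK | PT_WRITABLE_MASK;

	for (int i = 0; i < WALK_LEVELS; i++) {
		if ((w->entry[i] & need) != need)
			return false;
	}
	return true;
}

/* Recheck that compares only the cached last-level pte. */
static bool stale_pte_only(const struct walk_snapshot *cached,
			   const struct walk_snapshot *now)
{
	return cached->entry[WALK_LEVELS - 1] != now->entry[WALK_LEVELS - 1];
}

/* Recheck that compares the cached entry at every level of the walk. */
static bool stale_all_levels(const struct walk_snapshot *cached,
			     const struct walk_snapshot *now)
{
	for (int i = 0; i < WALK_LEVELS; i++) {
		if (cached->entry[i] != now->entry[i])
			return true;
	}
	return false;
}
```

If a snapshot is taken while all four entries are present and writable, and
the guest then clears only the writable bit in entry[0], walk_writable()
changes from true to false while stale_pte_only() still reports no change.
stale_all_levels() does report it, which is the reason for keeping a
per-level array rather than a single pte_gpa.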