From mboxrd@z Thu Jan 1 00:00:00 1970 From: Jeremy Fitzhardinge Subject: Re: xen.git branch reorg / success with 2.6.30-rc3 pv_ops dom0 Date: Thu, 11 Jun 2009 08:18:15 -0700 Message-ID: <4A312037.10300@goop.org> References: <20090522080655.GA24960@edu.joroinen.fi> <20090604202656.GR24960@edu.joroinen.fi> <1244197217.27370.146.camel@zakaz.uk.xensource.com> <20090605112347.GY24960@edu.joroinen.fi> <1244201864.27370.172.camel@zakaz.uk.xensource.com> <20090605133850.GA24960@edu.joroinen.fi> <1244209979.27370.188.camel@zakaz.uk.xensource.com> <20090605154130.GB24960@edu.joroinen.fi> <1244217948.27370.213.camel@zakaz.uk.xensource.com> <1244218353.27370.216.camel@zakaz.uk.xensource.com> <20090605181925.GC24960@edu.joroinen.fi> <1244475935.27370.309.camel@zakaz.uk.xensource.com> <1244476858.27370.325.camel@zakaz.uk.xensource.com> <4A2E9BC3.4060507@goop.org> <1244710938.27370.502.camel@zakaz.uk.xensource.com> Mime-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-15; format=flowed Content-Transfer-Encoding: 7bit Return-path: In-Reply-To: <1244710938.27370.502.camel@zakaz.uk.xensource.com> List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xensource.com Errors-To: xen-devel-bounces@lists.xensource.com To: Ian Campbell Cc: Xen-devel List-Id: xen-devel@lists.xenproject.org On 06/11/09 02:02, Ian Campbell wrote: > On Tue, 2009-06-09 at 13:28 -0400, Jeremy Fitzhardinge wrote: > >> Ian Campbell wrote: >> >>> I wonder how this interacts with the logic in >>> arch/x86/xen/mmu.c:xen_pin_page() which holds the lock while waiting for >>> the (deferred) pin multicall to occur? Hmm, no this is about the >>> PagePinned flag on the struct page which is out of date WRT the actual >>> pinned status as Xen sees it -- we update the PagePinned flag early in >>> xen_pin_page() long before Xen the pin hypercall so this window is the >>> other way round to what would be needed to trigger this bug. >>> >>> >> Yes, it looks like you could get a bad mapping here. An obvious fix >> would be to defer clearing the pinned flag in the page struct until >> after the hypercall has issued. That would make the racy >> kmap_atomic_pte map RO, which would be fine unless it actually tries to >> modify it (but I can't imagine it would do that unlocked). >> > > But would it redo the mapping after taking the lock? It doesn't look > like it does (why would it). So we could end up writing to an unpinned > pte via a R/O mapping. > Hm, yep. One thing I noticed is that set_pte() is used very rarely, so it would be no cost to always use a hypercall in that case. But xen_set_pte_at() ends up calling xen_set_pte() as well, and I think that's more common. Certainly we need to make sure that we're actually taking advantage of late-pin by direct writing unpinned ptes. I've been thinking of rearranging the set_pte(_at) pvops a little bit anyway; its not obvious we're really getting much benefit from using the update_va_mapping hypercall, and if we're not using it, then the set_pte_at pvop is taking a lot of unused parameters. If we switch to just using mmu_update, then we can just pass the address and pte value. But we could also pass the struct page * (which makes a bit of conceptual sense), so we could easy directly test whether the pte is pinned, and either use a direct write or hypercall accordingly. > As an experiment I tried the simple approach of flushing the multicalls > explicitly in xen_unpin_page and then clearing the Pinned bit and it all > goes a bit wrong. eip is "ptep->pte_low = 0" so I think the unpinned but > R/O theory holds... > Yes, I think the theory is sound. But I'm curious why Pasi seems to be able to hit the race easily, but we have not... J