From mboxrd@z Thu Jan 1 00:00:00 1970 From: Jeremy Fitzhardinge Subject: Re: xen.git branch reorg / success with 2.6.30-rc3 pv_ops dom0 Date: Wed, 22 Jul 2009 11:16:25 -0700 Message-ID: <4A675779.1030403@goop.org> References: <20090522080655.GA24960@edu.joroinen.fi> <20090604202656.GR24960@edu.joroinen.fi> <1244197217.27370.146.camel@zakaz.uk.xensource.com> <20090605112347.GY24960@edu.joroinen.fi> <1244201864.27370.172.camel@zakaz.uk.xensource.com> <20090605133850.GA24960@edu.joroinen.fi> <1244209979.27370.188.camel@zakaz.uk.xensource.com> <20090605154130.GB24960@edu.joroinen.fi> <1244217948.27370.213.camel@zakaz.uk.xensource.com> <1244218353.27370.216.camel@zakaz.uk.xensource.com> <20090605181925.GC24960@edu.joroinen.fi> <1244475935.27370.309.camel@zakaz.uk.xensource.com> <1244476858.27370.325.camel@zakaz.uk.xensource.com> <4A2E9BC3.4060507@goop.org> <1244710938.27370.502.camel@zakaz.uk.xensource.com> <1244711914.27370.511.camel@zakaz.uk.xensource.com> Mime-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-15 Content-Transfer-Encoding: 7bit Return-path: In-Reply-To: <1244711914.27370.511.camel@zakaz.uk.xensource.com> List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xensource.com Errors-To: xen-devel-bounces@lists.xensource.com To: Ian Campbell Cc: Xen-devel List-Id: xen-devel@lists.xenproject.org On 06/11/09 02:18, Ian Campbell wrote: > Pasi, to validate the theory that you are seeing races between unpinning > and kmap_atomic_pte can you give this biguglystick approach to solving > it a go. > I gave up trying to solve this in any kind of clever way and just decided to go with a slightly cleaned-up version of this patch. Unfortunately, I don't think it actually solves the problem because it doesn't prevent unpin from happening while the page is still kmapped; it just ends up using the spinlock as a barrier to move the race around to some timing which is presumably mostly avoids the problem. In principle the fix is to take the lock in xen_kmap_atomic and release it in xen_kunmap_atomic. Unfortunately this is pretty ugly and complex because kmaps are 1) inherently per-cpu, and 2) there can be 2 levels of kmapping at once. This means that we'd need 2 per-cpu locks to allow us to hold these locks for the mapping duration without introducing deadlocks. However unpin (and pin, in principle) need to take *all* these locks to exclude kmap on all cpus. This is total overkill, since we only really care about excluding kmap and unpin from a given pagetable, which suggests that the locks should be part of the mm rather than global. Unfortunately k(un)map_atomic_pte don't get the mm of the pagetable they're trying to pin, and I don't think we can assume its the current mm. Another (pretty hideous) approach might be to make unpin check the state of the KMAP_PTE[01] slots in each CPU's kmap fixmaps and see if a mapping exists for a page that its currently unpinning. This also has the problem of being inherently racy; if we unpin the page, there's going to be a little window after the unpin and before the kmap pte update (even if they're back-to-back batched hypercalls). I guess we could have a global rw spinlock; kmap/unmap takes it for reading, and so can be concurrent with all other kmaps, but pin/unpin takes it for writing to exclude them. That would work, but runs the risk of pin/unpin from being livelocked out (I don't think rwspins will block new readers if there's a pending writer). Ugly, but its the only thing I can think of which actually solves the problem. Oh, crap, we don't have a kunmap_atomic_pte hook. Thoughts? Am I overthinking this and missing something obvious? (PS: avoid CONFIG_HIGHPTE) J > diff --git a/arch/x86/xen/mmu.c b/arch/x86/xen/mmu.c > index 1729178..beeb8e8 100644 > --- a/arch/x86/xen/mmu.c > +++ b/arch/x86/xen/mmu.c > @@ -1145,9 +1145,12 @@ static int xen_unpin_page(struct mm_struct *mm, struct page *page, > return 0; /* never need to flush on unpin */ > } > > +static DEFINE_SPINLOCK(hack_lock); /* Hack to sync unpin against kmap_atomic_pte */ > + > /* Release a pagetables pages back as normal RW */ > static void __xen_pgd_unpin(struct mm_struct *mm, pgd_t *pgd) > { > + spin_lock(&hack_lock); > xen_mc_batch(); > > xen_do_pin(MMUEXT_UNPIN_TABLE, PFN_DOWN(__pa(pgd))); > @@ -1173,6 +1176,7 @@ static void __xen_pgd_unpin(struct mm_struct *mm, pgd_t *pgd) > __xen_pgd_walk(mm, pgd, xen_unpin_page, USER_LIMIT); > > xen_mc_issue(0); > + spin_unlock(&hack_lock); > } > > static void xen_pgd_unpin(struct mm_struct *mm) > @@ -1521,6 +1525,9 @@ static void xen_pgd_free(struct mm_struct *mm, pgd_t *pgd) > static void *xen_kmap_atomic_pte(struct page *page, enum km_type type) > { > pgprot_t prot = PAGE_KERNEL; > + void *ret; > + > + spin_lock(&hack_lock); > > if (PagePinned(page)) > prot = PAGE_KERNEL_RO; > @@ -1530,7 +1537,11 @@ static void *xen_kmap_atomic_pte(struct page *page, enum km_type type) > page_to_pfn(page), type, > (unsigned long)pgprot_val(prot) & _PAGE_RW ? "WRITE" : "READ"); > > - return kmap_atomic_prot(page, type, prot); > + ret = kmap_atomic_prot(page, type, prot); > + > + spin_unlock(&hack_lock); > + > + return ret; > } > #endif > > > >