From mboxrd@z Thu Jan 1 00:00:00 1970
From: Andrew Cooper
Subject: Re: [RFC PATCH] xen: free_domheap_pages: delay page scrub to tasklet
Date: Mon, 19 May 2014 10:59:08 +0100
Message-ID: <5379D5EC.3040604@citrix.com>
References: <1400468276-8683-1-git-send-email-bob.liu@oracle.com>
In-Reply-To: <1400468276-8683-1-git-send-email-bob.liu@oracle.com>
To: Bob Liu
Cc: keir@xen.org, ian.campbell@citrix.com, jbeulich@suse.com, xen-devel@lists.xenproject.org, boris.ostrovsky@oracle.com
List-Id: xen-devel@lists.xenproject.org

On 19/05/14 03:57, Bob Liu wrote:
> Because of page scrub, it's very slow to destroy a domain with large
> memory.
> It took around 10 minutes when destroy a guest of nearly 1 TB of memory.
>
> [root@ca-test111 ~]# time xm des 5
> real    10m51.582s
> user    0m0.115s
> sys     0m0.039s
> [root@ca-test111 ~]#
>
> Use perf we can see what happened, thanks for Boris's help and provide this
> useful tool for xen.
> [root@x4-4 bob]# perf report
>     22.32%  xl  [xen.syms]  [k] page_get_owner_and_reference
>     20.82%  xl  [xen.syms]  [k] relinquish_memory
>     20.63%  xl  [xen.syms]  [k] put_page
>     17.10%  xl  [xen.syms]  [k] scrub_one_page
>      4.74%  xl  [xen.syms]  [k] unmap_domain_page
>      2.24%  xl  [xen.syms]  [k] get_page
>      1.49%  xl  [xen.syms]  [k] free_heap_pages
>      1.06%  xl  [xen.syms]  [k] _spin_lock
>      0.78%  xl  [xen.syms]  [k] __put_page_type
>      0.75%  xl  [xen.syms]  [k] map_domain_page
>      0.57%  xl  [xen.syms]  [k] free_page_type
>      0.52%  xl  [xen.syms]  [k] is_iomem_page
>      0.42%  xl  [xen.syms]  [k] free_domheap_pages
>      0.31%  xl  [xen.syms]  [k] put_page_from_l1e
>      0.27%  xl  [xen.syms]  [k] check_lock
>      0.27%  xl  [xen.syms]  [k] __mfn_valid
>
> This patch try to delay scrub_one_page() to a tasklet which will be scheduled on
> all online physical cpus, so that it's much faster to return from 'xl/xm
> destroy xxx'.
>
> Tested on a guest with 30G memory.
> Before this patch:
> [root@x4-4 bob]# time xl des PV-30G
>
> real    0m16.014s
> user    0m0.010s
> sys     0m13.976s
> [root@x4-4 bob]#
>
> After:
> [root@x4-4 bob]# time xl des PV-30G
>
> real    0m3.581s
> user    0m0.003s
> sys     0m1.554s
> [root@x4-4 bob]#
>
> The destroy time reduced from 16s to 3s.
>
> Signed-off-by: Bob Liu
> ---
>  xen/common/page_alloc.c | 39 ++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 38 insertions(+), 1 deletion(-)
>
> diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c
> index 601319c..2ca59a1 100644
> --- a/xen/common/page_alloc.c
> +++ b/xen/common/page_alloc.c
> @@ -79,6 +79,10 @@ PAGE_LIST_HEAD(page_offlined_list);
>  /* Broken page list, protected by heap_lock. */
>  PAGE_LIST_HEAD(page_broken_list);
>
> +PAGE_LIST_HEAD(page_scrub_list);
> +static DEFINE_SPINLOCK(scrub_list_spinlock);
> +static struct tasklet scrub_page_tasklet;
> +
>  /*************************
>   * BOOT-TIME ALLOCATOR
>   */
> @@ -1417,6 +1421,25 @@ void free_xenheap_pages(void *v, unsigned int order)
>  #endif
>
>
> +static void scrub_free_pages(unsigned long unuse)
> +{
> +    struct page_info *pg;
> +
> +    for ( ; ; )
> +    {

A tasklet function is expected to return.  I don't see how this works at
all...

> +        while ( page_list_empty(&page_scrub_list) )
> +            cpu_relax();
> +
> +        spin_lock(&scrub_list_spinlock);
> +        pg = page_list_remove_head(&page_scrub_list);
> +        spin_unlock(&scrub_list_spinlock);
> +        if (pg)
> +        {
> +            scrub_one_page(pg);
> +            free_heap_pages(pg, 0);
> +        }
> +    }
> +}
>
>  /*************************
>   * DOMAIN-HEAP SUB-ALLOCATOR
> @@ -1425,6 +1448,7 @@ void free_xenheap_pages(void *v, unsigned int order)
>  void init_domheap_pages(paddr_t ps, paddr_t pe)
>  {
>      unsigned long smfn, emfn;
> +    unsigned int cpu;
>
>      ASSERT(!in_irq());
>
> @@ -1435,6 +1459,9 @@ void init_domheap_pages(paddr_t ps, paddr_t pe)
>          return;
>
>      init_heap_pages(mfn_to_page(smfn), emfn - smfn);
> +    tasklet_init(&scrub_page_tasklet, scrub_free_pages, 0);
> +    for_each_online_cpu(cpu)
> +        tasklet_schedule_on_cpu(&scrub_page_tasklet, cpu);

So now you have an infinite loop doing nothing, running on all cpus in
tasklet context?

>  }
>
>
> @@ -1564,8 +1591,17 @@ void free_domheap_pages(struct page_info *pg, unsigned int order)
>           * domain has died we assume responsibility for erasure.
>           */
>          if ( unlikely(d->is_dying) )
> +        {
> +            /*
> +             * Add page to page_scrub_list to speed up domain destroy, those
> +             * pages will be zeroed later by scrub_page_tasklet.
> +             */

Spaces/tabs

~Andrew

> +            spin_lock(&scrub_list_spinlock);
>              for ( i = 0; i < (1 << order); i++ )
> -                scrub_one_page(&pg[i]);
> +                page_list_add_tail(&pg[i], &page_scrub_list);
> +            spin_unlock(&scrub_list_spinlock);
> +            goto out;
> +        }
>
>          free_heap_pages(pg, order);
>      }
> @@ -1583,6 +1619,7 @@ void free_domheap_pages(struct page_info *pg, unsigned int order)
>          drop_dom_ref = 0;
>      }
>
> +out:
>      if ( drop_dom_ref )
>          put_domain(d);
>  }
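
For illustration only, the review point that a tasklet function is expected
to return can be sketched in standalone C.  This is not Xen code and not a
proposed fix: struct page, struct scrub_list, scrub_list_add(), scrub_batch()
and SKETCH_PAGE_SIZE are hypothetical stand-ins, and the locking that a real
shared list would need is left out.  The handler scrubs a bounded number of
pages per call and then returns, instead of spinning forever.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096 /* hypothetical page size for this sketch */

/* Hypothetical stand-in for struct page_info: one page plus a list link. */
struct page {
    struct page *next;
    unsigned char data[SKETCH_PAGE_SIZE];
};

/* Hypothetical stand-in for page_scrub_list (no locking in this sketch). */
struct scrub_list {
    struct page *head;
};

/* Queue a page for later scrubbing (pushes onto the head of the list). */
static void scrub_list_add(struct scrub_list *l, struct page *pg)
{
    pg->next = l->head;
    l->head = pg;
}

/*
 * Scrub at most 'budget' queued pages, then return.
 * Returns the number of pages scrubbed.  If l->head is still non-NULL
 * afterwards, the caller should arrange for another pass later.
 */
static unsigned int scrub_batch(struct scrub_list *l, unsigned int budget)
{
    unsigned int done = 0;

    assert(budget > 0);

    while ( done < budget && l->head != NULL )
    {
        struct page *pg = l->head;

        l->head = pg->next;                    /* unlink from the queue */
        pg->next = NULL;
        memset(pg->data, 0, sizeof(pg->data)); /* zero the page contents */
        done++;
    }

    return done;
}
```

In such a scheme the tasklet body would call scrub_batch() once and be
scheduled again only while the list is non-empty, so every invocation
returns promptly and an empty list costs nothing.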