Re: [PATCH 2/2] xen: spread page scrubbing across all idle CPU

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Bob Liu <bob.liu@oracle.com>
To: Jan Beulich <JBeulich@suse.com>
Cc: Bob Liu <lliubbo@gmail.com>,
	keir@xen.org, ian.campbell@citrix.com, andrew.cooper3@citrix.com,
	xen-devel@lists.xenproject.org
Subject: Re: [PATCH 2/2] xen: spread page scrubbing across all idle CPU
Date: Wed, 11 Jun 2014 10:51:36 +0800	[thread overview]
Message-ID: <5397C438.301@oracle.com> (raw)
In-Reply-To: <53972E79020000780001989C@mail.emea.novell.com>


On 06/10/2014 10:12 PM, Jan Beulich wrote:
>>>> On 10.06.14 at 14:18, <lliubbo@gmail.com> wrote:
>> --- a/xen/common/domain.c
>> +++ b/xen/common/domain.c
>> @@ -593,6 +593,10 @@ int domain_kill(struct domain *d)
>>              BUG_ON(rc != -EAGAIN);
>>              break;
>>          }
>> +
>> +        tasklet_init(&global_scrub_tasklet, scrub_free_pages, 1);
>> +        tasklet_schedule_on_cpu(&global_scrub_tasklet, smp_processor_id());
> 
> Apart from this being a single global that you can't validly re-use
> this way (as already pointed out by Andrew), it also remains
> unclear why you want this scheduled on the current CPU rather
> than just anywhere.
> 

The reason is the same as you noticed later, if also schedule this
tasklet on other CPUs then their respective vCPU won't get run at all
until the scrubbing is done.

So I use idle_loop() instead of tasklet on other CPUs.

>> --- a/xen/common/page_alloc.c
>> +++ b/xen/common/page_alloc.c
>> @@ -79,6 +79,14 @@ PAGE_LIST_HEAD(page_offlined_list);
>>  /* Broken page list, protected by heap_lock. */
>>  PAGE_LIST_HEAD(page_broken_list);
>>  
>> +/* Global scrub page list, protected by scrub_lock */
>> +PAGE_LIST_HEAD(global_scrub_list);
>> +DEFINE_SPINLOCK(scrub_lock);
>> +struct tasklet global_scrub_tasklet;
>> +
>> +DEFINE_PER_CPU(struct page_list_head, scrub_list);
>> +DEFINE_PER_CPU(struct page_list_head, scrubbed_list);
> 
> All static I suppose?
> 
>> @@ -1680,6 +1693,62 @@ void scrub_one_page(struct page_info *pg)
>>      unmap_domain_page(p);
>>  }
>>  
>> +#define SCRUB_BATCH 1024
>> +void scrub_free_pages(unsigned long is_tasklet)
>> +{
>> +    struct page_info *pg;
>> +    unsigned int i, cpu, empty = 0;
>> +
>> +    cpu = smp_processor_id();
>> +    do
>> +    {
>> +        if ( page_list_empty(&this_cpu(scrub_list)) )
>> +        {
>> +            /* Delist a batch of pages from global scrub list */
>> +            spin_lock(&scrub_lock);
>> +            for( i = 0; i < SCRUB_BATCH; i++ )
>> +            {
>> +                pg = page_list_remove_head(&global_scrub_list);
>> +                if ( !pg )
>> +                {
>> +                    empty = 1;
>> +                    break;
>> +                }
>> +                page_list_add_tail(pg, &this_cpu(scrub_list));
>> +             }
>> +             spin_unlock(&scrub_lock);
>> +        }
>> +
>> +        /* Scrub percpu list */
>> +        while ( !page_list_empty(&this_cpu(scrub_list)) )
>> +        {
>> +            pg = page_list_remove_head(&this_cpu(scrub_list));
>> +            ASSERT(pg);
>> +            scrub_one_page(pg);
>> +            page_list_add_tail(pg, &this_cpu(scrubbed_list));
>> +        }
> 
> So you scrub up to 1k pages at a time here - how long does that take?
> Did you take into consideration how much higher a wakeup latency this
> may cause when the invocation comes from the idle vCPU?
> 

An extra softirq_pending(cpu) check can be added in this while loop, I
think it's no more a problem if scrubbing only 1 page every time.

>> +        /* Free percpu scrubbed list */
>> +        if ( !page_list_empty(&this_cpu(scrubbed_list)) )
>> +        {
>> +            spin_lock(&heap_lock);
>> +            while ( !page_list_empty(&this_cpu(scrubbed_list)) )
>> +            {
>> +                pg = page_list_remove_head(&this_cpu(scrubbed_list));
>> +                __free_heap_pages(pg, 0);
>> +            }
>> +            spin_unlock(&heap_lock);
>> +        }
>> +
>> +        /* Global list is empty */
>> +        if (empty)
>> +            return;
>> +    } while ( !softirq_pending(cpu) );
>> +
>> +    if( is_tasklet )
>> +        tasklet_schedule_on_cpu(&global_scrub_tasklet, cpu);
> 
> So you re-schedule this tasklet immediately - while this may be
> acceptable inside the hypervisor, did you consider the effect this
> will have on the guest (normally Dom0)? Its respective vCPU won't
> get run _at all_ until you're done scrubbing.
> 

Yes, that's a problem. I don't have any better idea right now.

What I'm trying is doing the scrubbing on current CPU as well as on all
idle vcpus in parallel.
I also considered your suggestion about doing the scrubbing in the
background as well as on the allocation path. But I think it's more
unacceptable for users to get blocked randomly for a uncertain time when
allocating a large mount of memory.
That's why I still chose the sync way that once 'xl destroy' return all
memory are scrubbed.

> In the end, this being an improvement by about 50% according to
> the numbers you gave, I'm not convinced the downsides of this are
> worth the gain.
> 
> Jan
> 

-- 
Regards,
-Bob

next prev parent reply	other threads:[~2014-06-11  2:51 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-06-10 12:18 [PATCH 1/2] xen: introduce a no lock version function of free_heap_pages Bob Liu
2014-06-10 12:18 ` [PATCH 2/2] xen: spread page scrubbing across all idle CPU Bob Liu
2014-06-10 12:42   ` Andrew Cooper
2014-06-10 14:12   ` Jan Beulich
2014-06-11  2:51     ` Bob Liu [this message]
2014-06-11  8:13       ` Jan Beulich
2014-06-11 10:13         ` George Dunlap
2014-06-11 10:22           ` Jan Beulich
2014-06-11 10:36           ` Bob Liu
2014-06-11 10:45             ` George Dunlap
2014-06-11 11:17               ` Bob Liu
2014-06-11 10:48             ` Andrew Cooper
2014-06-10 12:32 ` [PATCH 1/2] xen: introduce a no lock version function of free_heap_pages Andrew Cooper

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5397C438.301@oracle.com \
    --to=bob.liu@oracle.com \
    --cc=JBeulich@suse.com \
    --cc=andrew.cooper3@citrix.com \
    --cc=ian.campbell@citrix.com \
    --cc=keir@xen.org \
    --cc=lliubbo@gmail.com \
    --cc=xen-devel@lists.xenproject.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.