All of lore.kernel.org
 help / color / mirror / Atom feed
From: Bob Liu <bob.liu@oracle.com>
To: Jan Beulich <JBeulich@suse.com>
Cc: Bob Liu <lliubbo@gmail.com>,
	keir@xen.org, ian.campbell@citrix.com, andrew.cooper3@citrix.com,
	xen-devel@lists.xenproject.org
Subject: Re: [RFC PATCH v2] xen: free_domheap_pages: delay page scrub to idle loop
Date: Tue, 20 May 2014 16:47:48 +0800	[thread overview]
Message-ID: <537B16B4.7060707@oracle.com> (raw)
In-Reply-To: <537B2C500200007800013F18@mail.emea.novell.com>


On 05/20/2014 04:20 PM, Jan Beulich wrote:
>>>> On 20.05.14 at 04:15, <lliubbo@gmail.com> wrote:
>> So I use a percpu scrub page list in this patch, the tradeoff is we may not 
>> use all idle cpus. It depends on free_domheap_pages() runs on which cpu.
> 
> So this means the time until all memory can be used again a completely
> arbitrary amount of time can pass, as it depends on the freeing CPU
> to be idle long enough (many minutes according to your observations)
> - even if all the rest of the system was idle.
> 
> I see the problem with the lock contention, but I think an at least
> slightly more sophisticated solution is going to be needed.
> 
>> @@ -633,6 +635,9 @@ static struct page_info *alloc_heap_pages(
>>                      goto found;
>>          } while ( zone-- > zone_lo ); /* careful: unsigned zone may wrap */
>>  
>> +        if ( scrub_free_pages() )
>> +            continue;
> 
> This will recover memory only from the current CPU's list - the larger
> the system, the less likely that this will turn up anything. Furthermore
> you're creating an almost unbounded loop here - for order > 0 the
> ability of scrub_pages() to make memory available doesn't mean that
> on the next iteration the loop wouldn't come back here.
> 
>> @@ -1417,6 +1422,23 @@ void free_xenheap_pages(void *v, unsigned int order)
>>  #endif
>>  
>>  
>> +unsigned long scrub_free_pages(void)
> 
> The return value and the local variable below could easily be
> unsigned int.
> 
>> +{
>> +    struct page_info *pg;
>> +    unsigned long nr_scrubed = 0;
>> +
>> +    /* Scrub around 400M memory every time */
>> +    while ( nr_scrubed < 100000 )
> 
> Without explanation such a hard coded number wouldn't be acceptable
> in any case. How long does it take to scrub 400Mb on a _slow_ system?
> I hope you realize that the amount of work you do here affects the
> wakeup time of a vCPU supposed to run on the given CPU.
> 
>> @@ -1564,8 +1586,15 @@ void free_domheap_pages(struct page_info *pg, unsigned int order)
>>           * domain has died we assume responsibility for erasure.
>>           */
>>          if ( unlikely(d->is_dying) )
>> +        {
>> +            /*
>> +             * Add page to page_scrub_list to speed up domain destroy, those
>> +             * pages will be zeroed later by scrub_page_tasklet.
>> +             */
>>              for ( i = 0; i < (1 << order); i++ )
>> -                scrub_one_page(&pg[i]);
>> +                page_list_add_tail( &pg[i], &this_cpu(page_scrub_list) );
>> +            goto out;
>> +        }
> 
> If done this way, I see no reason why you couldn't add the page in one
> chunk to the list (i.e. even if order > 0), by making use of PFN_ORDER()
> to communicate the order to the scrubbing routine.
> 
> But having sent a v2 patch without the conceptional questions being

Sorry, I also realised this version is immature.

> sorted out I consider kind of odd anyway. I.e. before sending another
> version I think you need to
> - explain that the latency gain here outweighs the performance effects
>   on other guests,
> - explain why alternative approaches (like the suggested flagging of the
>   pages as needing scrubbing during freeing, and doing the scrubbing in

Could you expand a bit more on how to use the idle cpus to do the
scrubbing in background without impact other guests?
I don't have a good solution in mind yet.

>   the background as well as on the allocation path) are worse, or at least

Do you mean we can delay the page scrubbing to alloc_heap_pages()?

free_domheap_pages()
{
    if ( unlikely(d->is_dying) )
    {
        //set page flag to need scrubbing, and then free
        free_heap_pages(pg);
    }
}

alloc_heap_pages()
{
    for ( ; ; )
    {
        if ( (pg = page_list_remove_head(&heap(node, zone, j))) )
            goto found;
    }

found:
    if (page tagged with need scrubbing)
        scrub_one_page(pg);
}

Thanks,
-Bob

  reply	other threads:[~2014-05-20  8:48 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-05-20  2:15 [RFC PATCH v2] xen: free_domheap_pages: delay page scrub to idle loop Bob Liu
2014-05-20  8:20 ` Jan Beulich
2014-05-20  8:47   ` Bob Liu [this message]
2014-05-20  9:46     ` Jan Beulich
2014-05-20 13:56 ` Konrad Rzeszutek Wilk

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=537B16B4.7060707@oracle.com \
    --to=bob.liu@oracle.com \
    --cc=JBeulich@suse.com \
    --cc=andrew.cooper3@citrix.com \
    --cc=ian.campbell@citrix.com \
    --cc=keir@xen.org \
    --cc=lliubbo@gmail.com \
    --cc=xen-devel@lists.xenproject.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.