Re: [PATCH v2 3/3] xen: use idle vcpus to scrub pages

xen-devel.lists.xenproject.org archive mirror
 help / color / mirror / Atom feed

From: Bob Liu <bob.liu@oracle.com>
To: Jan Beulich <JBeulich@suse.com>
Cc: Bob Liu <lliubbo@gmail.com>,
	keir@xen.org, ian.campbell@citrix.com,
	George.Dunlap@eu.citrix.com, andrew.cooper3@citrix.com,
	xen-devel@lists.xenproject.org
Subject: Re: [PATCH v2 3/3] xen: use idle vcpus to scrub pages
Date: Mon, 07 Jul 2014 20:20:48 +0800	[thread overview]
Message-ID: <53BA90A0.8050907@oracle.com> (raw)
In-Reply-To: <53B3A664.7070401@oracle.com>


On 07/02/2014 02:27 PM, Bob Liu wrote:
> 
> On 07/01/2014 08:59 PM, Jan Beulich wrote:
>>>>> On 01.07.14 at 14:25, <bob.liu@oracle.com> wrote:
>>> On 07/01/2014 05:12 PM, Jan Beulich wrote:
>>>>>>> On 30.06.14 at 15:39, <lliubbo@gmail.com> wrote:
>>>>> @@ -948,6 +954,7 @@ static void free_heap_pages(
>>>>>      {
>>>>>          if ( !tainted )
>>>>>          {
>>>>> +            node_need_scrub[node] = 1;
>>>>>              for ( i = 0; i < (1 << order); i++ )
>>>>>                  pg[i].count_info |= PGC_need_scrub;
>>>>>          }
>>>>
>>>> Iirc it was more than this single place where you set
>>>> PGC_need_scrub, and hence where you'd now need to set the
>>>> other flag too.
>>>>
>>>
>>> I'm afraid this is the only place where PGC_need_scrub was set.
>>
>> Ah, indeed - I misremembered others, they are all tests for the flag.
>>
>>> I'm sorry for all of the coding style problems.
>>>
>>> By the way is there any script which can be used to check the code
>>> before submitting? Something like ./scripts/checkpatch.pl under linux.
>>
>> No, there isn't. But avoiding (or spotting) hard tabs should be easy
>> enough, and other things you ought to simply inspect your patch for
>> - after all that's no different from what reviewers do.
>>
>>>>> +    }
>>>>> +
>>>>> +    /* free percpu free list */
>>>>> +    if ( !page_list_empty(local_free_list) )
>>>>> +    {
>>>>> +        spin_lock(&heap_lock);
>>>>> +        page_list_for_each_safe( pg, tmp, local_free_list )
>>>>> +        {
>>>>> +            order = PFN_ORDER(pg);
>>>>> +            page_list_del(pg, local_free_list);
>>>>> +            for ( i = 0; i < (1 << order); i++ )
>>>>> +	    {
>>>>> +                pg[i].count_info |= PGC_state_free;
>>>>> +                pg[i].count_info &= ~PGC_need_scrub;
>>>>
>>>> This needs to happen earlier - the scrub flag should be cleared right
>>>> after scrubbing, and the free flag should imo be set when the page
>>>> gets freed. That's for two reasons:
>>>> 1) Hypervisor allocations don't need scrubbed pages, i.e. they can
>>>> allocate memory regardless of the scrub flag's state.
>>>
>>> AFAIR, the reason I set those flags here is to avoid a panic happen.
>>
>> That's pretty vague a statement.
>>
>>>> 2) You still detain the memory on the local lists from allocation. On a
>>>> many-node system, the 16Mb per node can certainly sum up (which
>>>> is not to say that I don't view the 16Mb on a single node as already
>>>> problematic).
>>>
>>> Right, but we can adjust SCRUB_BATCH_ORDER.
>>> Anyway I'll take a retry as you suggested.
>>
>> You should really drop the idea of removing pages temporarily.
>> All you need to do is make sure a page being allocated and getting
>> simultaneously scrubbed by another CPU won't get passed to the
>> caller until the scrubbing finished. In particular it's no problem if
>> the allocating CPU occasionally ends up scrubbing a page already
>> being scrubbed elsewhere.
>>
> 
> Yes, I also like to drop percpu lists which can make things simper. But
> I'm afraid which also means I can't use any spinlock(&heap_lock) any
> more because of potential heavy lock contentions. I'm not sure whether
> things can work fine without heap_lock.
> 

In my attempt to get rid of heap_lock, there was a panic happen when
iterating the heap free list. My implementation is like this:
scrub_free_pages()
{
    for ( zone = 0; zone < NR_ZONES; zone++ )
    {
        for ( order = MAX_ORDER; order >= 0; order-- )
        {
            page_list_for_each_safe( pg, tmp, &heap(node, zone, order) )
            {
                if ( !test_bit(_PGC_need_scrub, &(pg->count_info)) )
                    continue;
                for ( i = 0; i < (1 << order); i++ )
                {
                    if ( test_bit(_PGC_need_scrub, &(pg->count_info)) )
                    {
                        scrub_one_page(&pg[i]);
                        clear_bit(_PGC_need_scrub, &pg[i].count_info);
                    }
                }
            }
        }
    }
}

The panic was in page_list_next().

I didn't find a good way to iterate the free list without holding
heap_lock, but if holding the lock it might be heavy lock contention
then I have to remove pages temporarily from heap free list to a percpu
list.

-- 
Regards,
-Bob

next prev parent reply	other threads:[~2014-07-07 12:21 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-06-30 13:39 [PATCH v2 1/3] xen: delay page scrubbing to allocation path Bob Liu
2014-06-30 13:39 ` [PATCH v2 2/3] xen: introduce function merge_free_trunks Bob Liu
2014-06-30 15:58   ` Jan Beulich
2014-07-01  8:14     ` Bob Liu
2014-07-01  8:27       ` Jan Beulich
2014-06-30 13:39 ` [PATCH v2 3/3] xen: use idle vcpus to scrub pages Bob Liu
2014-07-01  9:12   ` Jan Beulich
2014-07-01 12:25     ` Bob Liu
2014-07-01 12:59       ` Jan Beulich
2014-07-02  6:27         ` Bob Liu
2014-07-07 12:20           ` Bob Liu [this message]
2014-07-15  9:16         ` Bob Liu
2014-07-23  0:38           ` Konrad Rzeszutek Wilk
2014-07-23  1:30             ` Bob Liu
2014-07-23  7:28           ` Jan Beulich
2014-07-24  2:08             ` Bob Liu
2014-07-24  6:24               ` Jan Beulich
2014-07-25  0:42                 ` Bob Liu
2014-07-25  6:51                   ` Jan Beulich
2014-07-25  7:28                     ` Bob Liu
2014-07-25  7:36                       ` Jan Beulich
2014-07-25  8:18                         ` Bob Liu
2014-07-25  8:28                           ` Jan Beulich
2014-06-30 15:56 ` [PATCH v2 1/3] xen: delay page scrubbing to allocation path Jan Beulich
2014-07-01  8:12   ` Bob Liu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=53BA90A0.8050907@oracle.com \
    --to=bob.liu@oracle.com \
    --cc=George.Dunlap@eu.citrix.com \
    --cc=JBeulich@suse.com \
    --cc=andrew.cooper3@citrix.com \
    --cc=ian.campbell@citrix.com \
    --cc=keir@xen.org \
    --cc=lliubbo@gmail.com \
    --cc=xen-devel@lists.xenproject.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).