Re: [PATCH v2 3/3] xen: use idle vcpus to scrub pages

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Bob Liu <bob.liu@oracle.com>
To: Jan Beulich <JBeulich@suse.com>
Cc: Bob Liu <lliubbo@gmail.com>,
	keir@xen.org, ian.campbell@citrix.com,
	George.Dunlap@eu.citrix.com, andrew.cooper3@citrix.com,
	xen-devel@lists.xenproject.org
Subject: Re: [PATCH v2 3/3] xen: use idle vcpus to scrub pages
Date: Mon, 07 Jul 2014 20:20:48 +0800	[thread overview]
Message-ID: <53BA90A0.8050907@oracle.com> (raw)
In-Reply-To: <53B3A664.7070401@oracle.com>


On 07/02/2014 02:27 PM, Bob Liu wrote:
> 
> On 07/01/2014 08:59 PM, Jan Beulich wrote:
>>>>> On 01.07.14 at 14:25, <bob.liu@oracle.com> wrote:
>>> On 07/01/2014 05:12 PM, Jan Beulich wrote:
>>>>>>> On 30.06.14 at 15:39, <lliubbo@gmail.com> wrote:
>>>>> @@ -948,6 +954,7 @@ static void free_heap_pages(
>>>>>      {
>>>>>          if ( !tainted )
>>>>>          {
>>>>> +            node_need_scrub[node] = 1;
>>>>>              for ( i = 0; i < (1 << order); i++ )
>>>>>                  pg[i].count_info |= PGC_need_scrub;
>>>>>          }
>>>>
>>>> Iirc it was more than this single place where you set
>>>> PGC_need_scrub, and hence where you'd now need to set the
>>>> other flag too.
>>>>
>>>
>>> I'm afraid this is the only place where PGC_need_scrub was set.
>>
>> Ah, indeed - I misremembered others, they are all tests for the flag.
>>
>>> I'm sorry for all of the coding style problems.
>>>
>>> By the way is there any script which can be used to check the code
>>> before submitting? Something like ./scripts/checkpatch.pl under linux.
>>
>> No, there isn't. But avoiding (or spotting) hard tabs should be easy
>> enough, and other things you ought to simply inspect your patch for
>> - after all that's no different from what reviewers do.
>>
>>>>> +    }
>>>>> +
>>>>> +    /* free percpu free list */
>>>>> +    if ( !page_list_empty(local_free_list) )
>>>>> +    {
>>>>> +        spin_lock(&heap_lock);
>>>>> +        page_list_for_each_safe( pg, tmp, local_free_list )
>>>>> +        {
>>>>> +            order = PFN_ORDER(pg);
>>>>> +            page_list_del(pg, local_free_list);
>>>>> +            for ( i = 0; i < (1 << order); i++ )
>>>>> +	    {
>>>>> +                pg[i].count_info |= PGC_state_free;
>>>>> +                pg[i].count_info &= ~PGC_need_scrub;
>>>>
>>>> This needs to happen earlier - the scrub flag should be cleared right
>>>> after scrubbing, and the free flag should imo be set when the page
>>>> gets freed. That's for two reasons:
>>>> 1) Hypervisor allocations don't need scrubbed pages, i.e. they can
>>>> allocate memory regardless of the scrub flag's state.
>>>
>>> AFAIR, the reason I set those flags here is to avoid a panic happen.
>>
>> That's pretty vague a statement.
>>
>>>> 2) You still detain the memory on the local lists from allocation. On a
>>>> many-node system, the 16Mb per node can certainly sum up (which
>>>> is not to say that I don't view the 16Mb on a single node as already
>>>> problematic).
>>>
>>> Right, but we can adjust SCRUB_BATCH_ORDER.
>>> Anyway I'll take a retry as you suggested.
>>
>> You should really drop the idea of removing pages temporarily.
>> All you need to do is make sure a page being allocated and getting
>> simultaneously scrubbed by another CPU won't get passed to the
>> caller until the scrubbing finished. In particular it's no problem if
>> the allocating CPU occasionally ends up scrubbing a page already
>> being scrubbed elsewhere.
>>
> 
> Yes, I also like to drop percpu lists which can make things simper. But
> I'm afraid which also means I can't use any spinlock(&heap_lock) any
> more because of potential heavy lock contentions. I'm not sure whether
> things can work fine without heap_lock.
> 

In my attempt to get rid of heap_lock, there was a panic happen when
iterating the heap free list. My implementation is like this:
scrub_free_pages()
{
    for ( zone = 0; zone < NR_ZONES; zone++ )
    {
        for ( order = MAX_ORDER; order >= 0; order-- )
        {
            page_list_for_each_safe( pg, tmp, &heap(node, zone, order) )
            {
                if ( !test_bit(_PGC_need_scrub, &(pg->count_info)) )
                    continue;
                for ( i = 0; i < (1 << order); i++ )
                {
                    if ( test_bit(_PGC_need_scrub, &(pg->count_info)) )
                    {
                        scrub_one_page(&pg[i]);
                        clear_bit(_PGC_need_scrub, &pg[i].count_info);
                    }
                }
            }
        }
    }
}

The panic was in page_list_next().

I didn't find a good way to iterate the free list without holding
heap_lock, but if holding the lock it might be heavy lock contention
then I have to remove pages temporarily from heap free list to a percpu
list.

-- 
Regards,
-Bob

next prev parent reply	other threads:[~2014-07-07 12:21 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-06-30 13:39 [PATCH v2 1/3] xen: delay page scrubbing to allocation path Bob Liu
2014-06-30 13:39 ` [PATCH v2 2/3] xen: introduce function merge_free_trunks Bob Liu
2014-06-30 15:58   ` Jan Beulich
2014-07-01  8:14     ` Bob Liu
2014-07-01  8:27       ` Jan Beulich
2014-06-30 13:39 ` [PATCH v2 3/3] xen: use idle vcpus to scrub pages Bob Liu
2014-07-01  9:12   ` Jan Beulich
2014-07-01 12:25     ` Bob Liu
2014-07-01 12:59       ` Jan Beulich
2014-07-02  6:27         ` Bob Liu
2014-07-07 12:20           ` Bob Liu [this message]
2014-07-15  9:16         ` Bob Liu
2014-07-23  0:38           ` Konrad Rzeszutek Wilk
2014-07-23  1:30             ` Bob Liu
2014-07-23  7:28           ` Jan Beulich
2014-07-24  2:08             ` Bob Liu
2014-07-24  6:24               ` Jan Beulich
2014-07-25  0:42                 ` Bob Liu
2014-07-25  6:51                   ` Jan Beulich
2014-07-25  7:28                     ` Bob Liu
2014-07-25  7:36                       ` Jan Beulich
2014-07-25  8:18                         ` Bob Liu
2014-07-25  8:28                           ` Jan Beulich
2014-06-30 15:56 ` [PATCH v2 1/3] xen: delay page scrubbing to allocation path Jan Beulich
2014-07-01  8:12   ` Bob Liu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=53BA90A0.8050907@oracle.com \
    --to=bob.liu@oracle.com \
    --cc=George.Dunlap@eu.citrix.com \
    --cc=JBeulich@suse.com \
    --cc=andrew.cooper3@citrix.com \
    --cc=ian.campbell@citrix.com \
    --cc=keir@xen.org \
    --cc=lliubbo@gmail.com \
    --cc=xen-devel@lists.xenproject.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.