Re: [PATCH v5 1/8] mm: Place unscrubbed pages at the end of pagelist

xen-devel.lists.xenproject.org archive mirror
 help / color / mirror / Atom feed

From: Boris Ostrovsky <boris.ostrovsky@oracle.com>
To: Jan Beulich <jbeulich@suse.com>, xen-devel@lists.xen.org
Cc: sstabellini@kernel.org, wei.liu2@citrix.com,
	George.Dunlap@eu.citrix.com, andrew.cooper3@citrix.com,
	ian.jackson@eu.citrix.com, tim@xen.org
Subject: Re: [PATCH v5 1/8] mm: Place unscrubbed pages at the end of pagelist
Date: Sat, 22 Jul 2017 22:00:02 -0400	[thread overview]
Message-ID: <f00890f5-a092-985e-0914-92a24b739161@oracle.com> (raw)
In-Reply-To: <595290B202000078001014DA@prv-mh.provo.novell.com>



On 06/27/2017 01:06 PM, Jan Beulich wrote:
>>>> Boris Ostrovsky <boris.ostrovsky@oracle.com> 06/22/17 8:55 PM >>>
>> I kept node_need_scrub[] as a global array and not a "per-node". I think splitting
>> it should be part of making heap_lock a per-node lock, together with increasing
>> scrub concurrency by having more than one CPU scrub a node.
> 
> Agreed - I hadn't meant my earlier comment to necessarily affect this series.
> 
>> @@ -798,11 +814,26 @@ static struct page_info *alloc_heap_pages(
>>       return NULL;
>>   
>>    found:
>> +
>> +    if ( pg->u.free.first_dirty != INVALID_DIRTY_IDX )
>> +        first_dirty_pg = pg + pg->u.free.first_dirty;
>> +
>>       /* We may have to halve the chunk a number of times. */
>>       while ( j != order )
>>       {
>> -        PFN_ORDER(pg) = --j;
>> -        page_list_add_tail(pg, &heap(node, zone, j));
>> +        unsigned int first_dirty;
>> +
>> +        if ( first_dirty_pg && ((pg + (1 << j)) > first_dirty_pg) )
> 
> Despite the various examples of doing it this way, please at least use 1u.
> 
>> +        {
>> +            if ( pg < first_dirty_pg )
>> +                first_dirty = (first_dirty_pg - pg) / sizeof(*pg);
> 
> Pointer subtraction already includes the involved division. 


Yes, this was a mistake.

> Otoh I wonder
> if you couldn't get away without pointer comparison/subtraction here
> altogether.


Without comparison I can only assume that first_dirty is zero (i.e. the 
whole buddy is potentially dirty). Is there something else I could do?


> 
>> @@ -849,13 +880,22 @@ static int reserve_offlined_page(struct page_info *head)
>>   {
>>       unsigned int node = phys_to_nid(page_to_maddr(head));
>>       int zone = page_to_zone(head), i, head_order = PFN_ORDER(head), count = 0;
>> -    struct page_info *cur_head;
>> +    struct page_info *cur_head, *first_dirty_pg = NULL;
>>       int cur_order;
>>   
>>       ASSERT(spin_is_locked(&heap_lock));
>>   
>>       cur_head = head;
>>   
>> +    /*
>> +     * We may break the buddy so let's mark the head as clean. Then, when
>> +     * merging chunks back into the heap, we will see whether the chunk has
>> +     * unscrubbed pages and set its first_dirty properly.
>> +     */
>> +    if (head->u.free.first_dirty != INVALID_DIRTY_IDX)
> 
> Coding style.
> 
>> @@ -892,8 +934,25 @@ static int reserve_offlined_page(struct page_info *head)
>>               {
>>               merge:
>>                   /* We don't consider merging outside the head_order. */
>> -                page_list_add_tail(cur_head, &heap(node, zone, cur_order));
>> -                PFN_ORDER(cur_head) = cur_order;
>> +
>> +                /* See if any of the pages indeed need scrubbing. */
>> +                if ( first_dirty_pg && (cur_head + (1 << cur_order) > first_dirty_pg) )
>> +                {
>> +                    if ( cur_head < first_dirty_pg )
>> +                        i = (first_dirty_pg - cur_head) / sizeof(*cur_head);

I assume the same comment as above applies here.

>> +                    else
>> +                        i = 0;
>> +
>> +                    for ( ; i < (1 << cur_order); i++ )
>> +                        if ( test_bit(_PGC_need_scrub,
>> +                                      &cur_head[i].count_info) )
>> +                        {
>> +                            first_dirty = i;
>> +                            break;
>> +                        }
> 
> Perhaps worth having ASSERT(first_dirty != INVALID_DIRTY_IDX) here? Or are
> there cases where ->u.free.first_dirty of a page may be wrong?


When we merge in free_heap_pages we don't clear first_dirty of the 
successor buddy (at some point I did have this done but you questioned 
whether it was needed and I dropped it).


> 
>> @@ -977,35 +1090,53 @@ static void free_heap_pages(
>>   
>>           if ( (page_to_mfn(pg) & mask) )
>>           {
>> +            struct page_info *predecessor = pg - mask;
>> +
>>               /* Merge with predecessor block? */
>> -            if ( !mfn_valid(_mfn(page_to_mfn(pg-mask))) ||
>> -                 !page_state_is(pg-mask, free) ||
>> -                 (PFN_ORDER(pg-mask) != order) ||
>> -                 (phys_to_nid(page_to_maddr(pg-mask)) != node) )
>> +            if ( !mfn_valid(_mfn(page_to_mfn(predecessor))) ||
>> +                 !page_state_is(predecessor, free) ||
>> +                 (PFN_ORDER(predecessor) != order) ||
>> +                 (phys_to_nid(page_to_maddr(predecessor)) != node) )
>>                   break;
>> -            pg -= mask;
>> -            page_list_del(pg, &heap(node, zone, order));
>> +
>> +            page_list_del(predecessor, &heap(node, zone, order));
>> +
>> +            if ( predecessor->u.free.first_dirty != INVALID_DIRTY_IDX )
>> +                need_scrub = true;
> 
> I'm afraid I continue to be confused by this: Why does need_scrub depend on
> the state of pages not being the subject of the current free operation? I
> realize that at this point in the series the path can't be taken yet, but
> won't later patches need to rip it out or change it anyway, in which case it
> would be better to introduce the then correct check (if any) only there?


Right, at this point we indeed will never have the 'if' evaluate to true 
since heap is always clean. And when we switch to scrubbing from idle 
need_scrub is never looked at.

I suspect this check will not be needed in the intermediate patches neither.

> 
>> --- a/xen/include/asm-x86/mm.h
>> +++ b/xen/include/asm-x86/mm.h
>> @@ -88,7 +88,15 @@ struct page_info
>>           /* Page is on a free list: ((count_info & PGC_count_mask) == 0). */
>>           struct {
>>               /* Do TLBs need flushing for safety before next page use? */
>> -            bool_t need_tlbflush;
>> +            unsigned long need_tlbflush:1;
>> +
>> +            /*
>> +             * Index of the first *possibly* unscrubbed page in the buddy.
>> +             * One more than maximum possible order (MAX_ORDER+1) to
> 
> Why +1 here and hence ...

Don't we have MAX_ORDER+1 orders?

> 
>> +             * accommodate INVALID_DIRTY_IDX.
>> +             */
>> +#define INVALID_DIRTY_IDX (-1UL & (((1UL<<MAX_ORDER) + 2) - 1))
>> +            unsigned long first_dirty:MAX_ORDER + 2;
> 
> ... why +2 instead of +1? And isn't the expression INVALID_DIRTY_IDX wrongly
> parenthesized (apart from lacking blanks around the shift operator)? I'd
> expect you want a value with MAX_ORDER+1 set bits, i.e.
> (1UL << (MAX_ORDER + 1)) - 1. ANDing with -1UL seems quite pointless too.

Yes to parentheses and AND. Should be (1UL << (MAX_ORDER + 2)) - 1

-boris

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

next prev parent reply	other threads:[~2017-07-23  2:00 UTC|newest]

Thread overview: 35+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-06-22 18:57 [PATCH v5 0/8] Memory scrubbing from idle loop Boris Ostrovsky
2017-06-22 18:57 ` [PATCH v5 1/8] mm: Place unscrubbed pages at the end of pagelist Boris Ostrovsky
2017-06-27 17:06   ` Jan Beulich
2017-07-23  2:00     ` Boris Ostrovsky [this message]
2017-07-31 14:45       ` Jan Beulich
2017-07-31 16:03         ` Boris Ostrovsky
2017-08-02  9:24           ` Jan Beulich
2017-08-02 15:31             ` Boris Ostrovsky
2017-06-22 18:57 ` [PATCH v5 2/8] mm: Extract allocation loop from alloc_heap_pages() Boris Ostrovsky
2017-06-27 17:59   ` Jan Beulich
2017-06-22 18:57 ` [PATCH v5 3/8] mm: Scrub pages in alloc_heap_pages() if needed Boris Ostrovsky
2017-06-27 18:00   ` Jan Beulich
2017-07-23  2:07     ` Boris Ostrovsky
2017-07-31 15:16       ` Jan Beulich
2017-07-31 16:07         ` Boris Ostrovsky
2017-06-22 18:57 ` [PATCH v5 4/8] mm: Scrub memory from idle loop Boris Ostrovsky
2017-06-23  8:36   ` Dario Faggioli
2017-06-27 18:01   ` Jan Beulich
2017-07-23  2:14     ` Boris Ostrovsky
2017-07-31 15:20       ` Jan Beulich
2017-07-31 16:15         ` Boris Ostrovsky
2017-08-02  9:27           ` Jan Beulich
2017-06-22 18:57 ` [PATCH v5 5/8] spinlock: Introduce spin_lock_cb() Boris Ostrovsky
2017-06-22 18:57 ` [PATCH v5 6/8] mm: Keep heap accessible to others while scrubbing Boris Ostrovsky
2017-06-27 19:28   ` Jan Beulich
2017-06-27 19:31     ` Jan Beulich
2017-07-23  2:28     ` Boris Ostrovsky
2017-08-02  8:34       ` Jan Beulich
2017-06-22 18:57 ` [PATCH v5 7/8] mm: Print number of unscrubbed pages in 'H' debug handler Boris Ostrovsky
2017-06-22 18:57 ` [PATCH v5 8/8] mm: Make sure pages are scrubbed Boris Ostrovsky
2017-06-27 19:29   ` Jan Beulich
2017-06-23  9:36 ` [PATCH v5 0/8] Memory scrubbing from idle loop Jan Beulich
2017-06-23 13:11   ` Boris Ostrovsky
2017-06-23 13:22     ` Jan Beulich
2017-06-23 13:29       ` Boris Ostrovsky

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=f00890f5-a092-985e-0914-92a24b739161@oracle.com \
    --to=boris.ostrovsky@oracle.com \
    --cc=George.Dunlap@eu.citrix.com \
    --cc=andrew.cooper3@citrix.com \
    --cc=ian.jackson@eu.citrix.com \
    --cc=jbeulich@suse.com \
    --cc=sstabellini@kernel.org \
    --cc=tim@xen.org \
    --cc=wei.liu2@citrix.com \
    --cc=xen-devel@lists.xen.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).