From: Julien Grall <julien.grall@arm.com>
To: Boris Ostrovsky <boris.ostrovsky@oracle.com>, xen-devel@lists.xen.org
Cc: sstabellini@kernel.org, wei.liu2@citrix.com,
George.Dunlap@eu.citrix.com, andrew.cooper3@citrix.com,
ian.jackson@eu.citrix.com, tim@xen.org, jbeulich@suse.com
Subject: Re: [PATCHES v8 1/8] mm: Place unscrubbed pages at the end of pagelist
Date: Thu, 17 Aug 2017 11:30:02 +0100
Message-ID: <1fb6bc7c-b303-8bac-46df-3d88c3459c11@arm.com>
In-Reply-To: <1502908394-9760-2-git-send-email-boris.ostrovsky@oracle.com>
Hi Boris,
On 16/08/17 19:33, Boris Ostrovsky wrote:
> .. so that it's easy to find pages that need to be scrubbed (those pages are
> now marked with _PGC_need_scrub bit).
>
> We keep track of the first unscrubbed page in a page buddy using first_dirty
> field. For now it can have two values, 0 (whole buddy needs scrubbing) or
> INVALID_DIRTY_IDX (the buddy does not need to be scrubbed). Subsequent patches
> will allow scrubbing to be interrupted, resulting in first_dirty taking any
> value.
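
[Aside for readers of the archive: the invariant being introduced here is
that pages [first_dirty, 2^order) of a free buddy may still be dirty, while
everything below first_dirty is known clean. A minimal standalone sketch of
how a consumer would read that field follows; the helper name, the simplified
page type and the MAX_ORDER value are hypothetical, not code from this patch.

    #define SKETCH_MAX_ORDER     18
    #define SKETCH_INVALID_DIRTY ((1UL << (SKETCH_MAX_ORDER + 1)) - 1)

    struct sketch_page { unsigned long first_dirty; };

    /* Index of the first page that may still need scrubbing, or 2^order
     * if the whole buddy is already known to be clean. */
    static unsigned long first_to_scrub(const struct sketch_page *head,
                                        unsigned int order)
    {
        if ( head->first_dirty == SKETCH_INVALID_DIRTY )
            return 1UL << order;          /* nothing left to scrub */
        return head->first_dirty;         /* scrub from here to the end */
    }

With this patch first_dirty only ever takes the value 0 or the invalid index;
the later patches in the series let scrubbing be interrupted, after which any
intermediate value becomes legal.]
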
>
> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
For the ARM bits:
Acked-by: Julien Grall <julien.grall@arm.com>
Cheers,
> ---
> Changes in v8:
> * Changed x86's definition of page_info.u.free from using bitfields to natural
> datatypes
> * Swapped order of bitfields in page_info.u.free for ARM
> * Added BUILD_BUG_ON to check page_info.u.free.first_dirty size on x86, moved
> previously defined BUILD_BUG_ON from init_heap_pages() to init_boot_pages()
> (to avoid introducing extra '#ifdef x86' and to keep both together)
>
> xen/common/page_alloc.c | 159 ++++++++++++++++++++++++++++++++++++++++-------
> xen/include/asm-arm/mm.h | 17 ++++-
> xen/include/asm-x86/mm.h | 15 +++++
> 3 files changed, 167 insertions(+), 24 deletions(-)
>
> diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c
> index 444ecf3..a39fd81 100644
> --- a/xen/common/page_alloc.c
> +++ b/xen/common/page_alloc.c
> @@ -261,7 +261,11 @@ void __init init_boot_pages(paddr_t ps, paddr_t pe)
> #ifdef CONFIG_X86
> const unsigned long *badpage = NULL;
> unsigned int i, array_size;
> +
> + BUILD_BUG_ON(8 * sizeof(((struct page_info *)0)->u.free.first_dirty) <
> + MAX_ORDER + 1);
> #endif
> + BUILD_BUG_ON(sizeof(((struct page_info *)0)->u) != sizeof(unsigned long));
>
> ps = round_pgup(ps);
> pe = round_pgdown(pe);
> @@ -375,6 +379,8 @@ typedef struct page_list_head heap_by_zone_and_order_t[NR_ZONES][MAX_ORDER+1];
> static heap_by_zone_and_order_t *_heap[MAX_NUMNODES];
> #define heap(node, zone, order) ((*_heap[node])[zone][order])
>
> +static unsigned long node_need_scrub[MAX_NUMNODES];
> +
> static unsigned long *avail[MAX_NUMNODES];
> static long total_avail_pages;
>
> @@ -670,13 +676,30 @@ static void check_low_mem_virq(void)
> }
> }
>
> +/* Pages that need a scrub are added to tail, otherwise to head. */
> +static void page_list_add_scrub(struct page_info *pg, unsigned int node,
> + unsigned int zone, unsigned int order,
> + unsigned int first_dirty)
> +{
> + PFN_ORDER(pg) = order;
> + pg->u.free.first_dirty = first_dirty;
> +
> + if ( first_dirty != INVALID_DIRTY_IDX )
> + {
> + ASSERT(first_dirty < (1U << order));
> + page_list_add_tail(pg, &heap(node, zone, order));
> + }
> + else
> + page_list_add(pg, &heap(node, zone, order));
> +}
> +
> /* Allocate 2^@order contiguous pages. */
> static struct page_info *alloc_heap_pages(
> unsigned int zone_lo, unsigned int zone_hi,
> unsigned int order, unsigned int memflags,
> struct domain *d)
> {
> - unsigned int i, j, zone = 0, nodemask_retry = 0;
> + unsigned int i, j, zone = 0, nodemask_retry = 0, first_dirty;
> nodeid_t first_node, node = MEMF_get_node(memflags), req_node = node;
> unsigned long request = 1UL << order;
> struct page_info *pg;
> @@ -790,12 +813,26 @@ static struct page_info *alloc_heap_pages(
> return NULL;
>
> found:
> +
> + first_dirty = pg->u.free.first_dirty;
> +
> /* We may have to halve the chunk a number of times. */
> while ( j != order )
> {
> - PFN_ORDER(pg) = --j;
> - page_list_add_tail(pg, &heap(node, zone, j));
> - pg += 1 << j;
> + j--;
> + page_list_add_scrub(pg, node, zone, j,
> + (1U << j) > first_dirty ?
> + first_dirty : INVALID_DIRTY_IDX);
> + pg += 1U << j;
> +
> + if ( first_dirty != INVALID_DIRTY_IDX )
> + {
> + /* Adjust first_dirty */
> + if ( first_dirty >= 1U << j )
> + first_dirty -= 1U << j;
> + else
> + first_dirty = 0; /* We've moved past original first_dirty */
> + }
> }
>
> ASSERT(avail[node][zone] >= request);
> @@ -842,12 +879,20 @@ static int reserve_offlined_page(struct page_info *head)
> unsigned int node = phys_to_nid(page_to_maddr(head));
> int zone = page_to_zone(head), i, head_order = PFN_ORDER(head), count = 0;
> struct page_info *cur_head;
> - int cur_order;
> + unsigned int cur_order, first_dirty;
>
> ASSERT(spin_is_locked(&heap_lock));
>
> cur_head = head;
>
> + /*
> + * We may break the buddy so let's mark the head as clean. Then, when
> + * merging chunks back into the heap, we will see whether the chunk has
> + * unscrubbed pages and set its first_dirty properly.
> + */
> + first_dirty = head->u.free.first_dirty;
> + head->u.free.first_dirty = INVALID_DIRTY_IDX;
> +
> page_list_del(head, &heap(node, zone, head_order));
>
> while ( cur_head < (head + (1 << head_order)) )
> @@ -858,6 +903,8 @@ static int reserve_offlined_page(struct page_info *head)
> if ( page_state_is(cur_head, offlined) )
> {
> cur_head++;
> + if ( first_dirty != INVALID_DIRTY_IDX && first_dirty )
> + first_dirty--;
> continue;
> }
>
> @@ -884,9 +931,20 @@ static int reserve_offlined_page(struct page_info *head)
> {
> merge:
> /* We don't consider merging outside the head_order. */
> - page_list_add_tail(cur_head, &heap(node, zone, cur_order));
> - PFN_ORDER(cur_head) = cur_order;
> + page_list_add_scrub(cur_head, node, zone, cur_order,
> + (1U << cur_order) > first_dirty ?
> + first_dirty : INVALID_DIRTY_IDX);
> cur_head += (1 << cur_order);
> +
> + /* Adjust first_dirty if needed. */
> + if ( first_dirty != INVALID_DIRTY_IDX )
> + {
> + if ( first_dirty >= 1U << cur_order )
> + first_dirty -= 1U << cur_order;
> + else
> + first_dirty = 0;
> + }
> +
> break;
> }
> }
> @@ -911,9 +969,53 @@ static int reserve_offlined_page(struct page_info *head)
> return count;
> }
>
> +static void scrub_free_pages(unsigned int node)
> +{
> + struct page_info *pg;
> + unsigned int zone;
> +
> + ASSERT(spin_is_locked(&heap_lock));
> +
> + if ( !node_need_scrub[node] )
> + return;
> +
> + for ( zone = 0; zone < NR_ZONES; zone++ )
> + {
> + unsigned int order = MAX_ORDER;
> +
> + do {
> + while ( !page_list_empty(&heap(node, zone, order)) )
> + {
> + unsigned int i;
> +
> + /* Unscrubbed pages are always at the end of the list. */
> + pg = page_list_last(&heap(node, zone, order));
> + if ( pg->u.free.first_dirty == INVALID_DIRTY_IDX )
> + break;
> +
> + for ( i = pg->u.free.first_dirty; i < (1U << order); i++)
> + {
> + if ( test_bit(_PGC_need_scrub, &pg[i].count_info) )
> + {
> + scrub_one_page(&pg[i]);
> + pg[i].count_info &= ~PGC_need_scrub;
> + node_need_scrub[node]--;
> + }
> + }
> +
> + page_list_del(pg, &heap(node, zone, order));
> + page_list_add_scrub(pg, node, zone, order, INVALID_DIRTY_IDX);
> +
> + if ( node_need_scrub[node] == 0 )
> + return;
> + }
> + } while ( order-- != 0 );
> + }
> +}
> +
> /* Free 2^@order set of pages. */
> static void free_heap_pages(
> - struct page_info *pg, unsigned int order)
> + struct page_info *pg, unsigned int order, bool need_scrub)
> {
> unsigned long mask, mfn = page_to_mfn(pg);
> unsigned int i, node = phys_to_nid(page_to_maddr(pg)), tainted = 0;
> @@ -953,10 +1055,20 @@ static void free_heap_pages(
> /* This page is not a guest frame any more. */
> page_set_owner(&pg[i], NULL); /* set_gpfn_from_mfn snoops pg owner */
> set_gpfn_from_mfn(mfn + i, INVALID_M2P_ENTRY);
> +
> + if ( need_scrub )
> + pg[i].count_info |= PGC_need_scrub;
> }
>
> avail[node][zone] += 1 << order;
> total_avail_pages += 1 << order;
> + if ( need_scrub )
> + {
> + node_need_scrub[node] += 1 << order;
> + pg->u.free.first_dirty = 0;
> + }
> + else
> + pg->u.free.first_dirty = INVALID_DIRTY_IDX;
>
> if ( tmem_enabled() )
> midsize_alloc_zone_pages = max(
> @@ -980,6 +1092,12 @@ static void free_heap_pages(
>
> page_list_del(predecessor, &heap(node, zone, order));
>
> + /* Keep predecessor's first_dirty if it is already set. */
> + if ( predecessor->u.free.first_dirty == INVALID_DIRTY_IDX &&
> + pg->u.free.first_dirty != INVALID_DIRTY_IDX )
> + predecessor->u.free.first_dirty = (1U << order) +
> + pg->u.free.first_dirty;
> +
> pg = predecessor;
> }
> else
> @@ -999,12 +1117,14 @@ static void free_heap_pages(
> order++;
> }
>
> - PFN_ORDER(pg) = order;
> - page_list_add_tail(pg, &heap(node, zone, order));
> + page_list_add_scrub(pg, node, zone, order, pg->u.free.first_dirty);
>
> if ( tainted )
> reserve_offlined_page(pg);
>
> + if ( need_scrub )
> + scrub_free_pages(node);
> +
> spin_unlock(&heap_lock);
> }
>
> @@ -1225,7 +1345,7 @@ unsigned int online_page(unsigned long mfn, uint32_t *status)
> spin_unlock(&heap_lock);
>
> if ( (y & PGC_state) == PGC_state_offlined )
> - free_heap_pages(pg, 0);
> + free_heap_pages(pg, 0, false);
>
> return ret;
> }
> @@ -1294,7 +1414,7 @@ static void init_heap_pages(
> nr_pages -= n;
> }
>
> - free_heap_pages(pg+i, 0);
> + free_heap_pages(pg + i, 0, false);
> }
> }
>
> @@ -1621,7 +1741,7 @@ void free_xenheap_pages(void *v, unsigned int order)
>
> memguard_guard_range(v, 1 << (order + PAGE_SHIFT));
>
> - free_heap_pages(virt_to_page(v), order);
> + free_heap_pages(virt_to_page(v), order, false);
> }
>
> #else
> @@ -1675,12 +1795,9 @@ void free_xenheap_pages(void *v, unsigned int order)
> pg = virt_to_page(v);
>
> for ( i = 0; i < (1u << order); i++ )
> - {
> - scrub_one_page(&pg[i]);
> pg[i].count_info &= ~PGC_xen_heap;
> - }
>
> - free_heap_pages(pg, order);
> + free_heap_pages(pg, order, true);
> }
>
> #endif
> @@ -1789,7 +1906,7 @@ struct page_info *alloc_domheap_pages(
> if ( d && !(memflags & MEMF_no_owner) &&
> assign_pages(d, pg, order, memflags) )
> {
> - free_heap_pages(pg, order);
> + free_heap_pages(pg, order, false);
> return NULL;
> }
>
> @@ -1857,11 +1974,7 @@ void free_domheap_pages(struct page_info *pg, unsigned int order)
> scrub = 1;
> }
>
> - if ( unlikely(scrub) )
> - for ( i = 0; i < (1 << order); i++ )
> - scrub_one_page(&pg[i]);
> -
> - free_heap_pages(pg, order);
> + free_heap_pages(pg, order, scrub);
> }
>
> if ( drop_dom_ref )
> diff --git a/xen/include/asm-arm/mm.h b/xen/include/asm-arm/mm.h
> index ef84b72..3b3d38f 100644
> --- a/xen/include/asm-arm/mm.h
> +++ b/xen/include/asm-arm/mm.h
> @@ -43,8 +43,16 @@ struct page_info
> } inuse;
> /* Page is on a free list: ((count_info & PGC_count_mask) == 0). */
> struct {
> + /*
> + * Index of the first *possibly* unscrubbed page in the buddy.
> + * One more bit than maximum possible order to accommodate
> + * INVALID_DIRTY_IDX.
> + */
> +#define INVALID_DIRTY_IDX ((1UL << (MAX_ORDER + 1)) - 1)
> + unsigned long first_dirty:MAX_ORDER + 1;
> +
> /* Do TLBs need flushing for safety before next page use? */
> - bool_t need_tlbflush;
> + bool need_tlbflush:1;
> } free;
>
> } u;
> @@ -107,6 +115,13 @@ struct page_info
> #define PGC_count_width PG_shift(9)
> #define PGC_count_mask ((1UL<<PGC_count_width)-1)
>
> +/*
> + * Page needs to be scrubbed. Since this bit can only be set on a page that is
> + * free (i.e. in PGC_state_free) we can reuse PGC_allocated bit.
> + */
> +#define _PGC_need_scrub _PGC_allocated
> +#define PGC_need_scrub PGC_allocated
> +
> extern mfn_t xenheap_mfn_start, xenheap_mfn_end;
> extern vaddr_t xenheap_virt_end;
> #ifdef CONFIG_ARM_64
> diff --git a/xen/include/asm-x86/mm.h b/xen/include/asm-x86/mm.h
> index 2bf3f33..86b1723 100644
> --- a/xen/include/asm-x86/mm.h
> +++ b/xen/include/asm-x86/mm.h
> @@ -87,6 +87,14 @@ struct page_info
>
> /* Page is on a free list: ((count_info & PGC_count_mask) == 0). */
> struct {
> + /*
> + * Index of the first *possibly* unscrubbed page in the buddy.
> + * One more bit than maximum possible order to accommodate
> + * INVALID_DIRTY_IDX.
> + */
> +#define INVALID_DIRTY_IDX ((1UL << (MAX_ORDER + 1)) - 1)
> + unsigned int first_dirty;
> +
> /* Do TLBs need flushing for safety before next page use? */
> bool_t need_tlbflush;
> } free;
> @@ -233,6 +241,13 @@ struct page_info
> #define PGC_count_width PG_shift(9)
> #define PGC_count_mask ((1UL<<PGC_count_width)-1)
>
> +/*
> + * Page needs to be scrubbed. Since this bit can only be set on a page that is
> + * free (i.e. in PGC_state_free) we can reuse PGC_allocated bit.
> + */
> +#define _PGC_need_scrub _PGC_allocated
> +#define PGC_need_scrub PGC_allocated
> +
> #define is_xen_heap_page(page) ((page)->count_info & PGC_xen_heap)
> #define is_xen_heap_mfn(mfn) \
> (__mfn_valid(mfn) && is_xen_heap_page(__mfn_to_page(mfn)))
>
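
One more note, mostly for people skimming the archive: the INVALID_DIRTY_IDX
encoding in the two mm.h hunks above works because the largest valid index
into a buddy is 2^MAX_ORDER - 1, so a field of MAX_ORDER + 1 bits leaves its
all-ones value free to serve as the sentinel; the BUILD_BUG_ON added to
init_boot_pages() checks that the x86 field is wide enough for that. A
standalone sketch of the arithmetic, using an illustrative MAX_ORDER value
rather than the real one:

    #include <assert.h>

    #define SKETCH_MAX_ORDER   18
    #define SKETCH_INVALID_IDX ((1UL << (SKETCH_MAX_ORDER + 1)) - 1)

    int main(void)
    {
        /* Largest index a 2^MAX_ORDER-page buddy can ever need. */
        unsigned long max_valid_idx = (1UL << SKETCH_MAX_ORDER) - 1;

        /* The all-ones value of a (MAX_ORDER + 1)-bit field cannot
         * collide with any valid index, so it is safe as a sentinel. */
        assert(SKETCH_INVALID_IDX > max_valid_idx);
        return 0;
    }
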
--
Julien Grall