[PATCH] permute with 2MB chunk

* [PATCH] permute with 2MB chunk
@ 2008-03-18 18:03 Jean Guyader
  2008-03-19  9:42 ` Cui, Dexuan
  0 siblings, 1 reply; 7+ messages in thread
From: Jean Guyader @ 2008-03-18 18:03 UTC (permalink / raw)
  To: xen-devel

[-- Attachment #1: Type: text/plain, Size: 196 bytes --]


The memory permutation causes a slowdown in the case of a save/restore (bug
1143). It works better when the mixing is done in 2MB chunks.

Signed-off-by: Jean Guyader <jean.guyader@eu.citrix.com>
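
For readers following the bit manipulation, here is a minimal standalone sketch of the 2MB-chunk mixing described above. It assumes 4KB pages, so the low 9 bits of a page index (512 pages) stay in place and only the two bit fields above them are swapped. The name permute_2mb, the CHUNK_BITS macro, the unsigned types and the assert are illustrative assumptions, not part of the patch.

```c
#include <assert.h>

/* 512 pages of 4KB each form one 2MB chunk, so 9 low bits are kept. */
#define CHUNK_BITS 9u

/* Illustrative rewrite of the patched permute(); order_nr is the number
 * of bits needed to index all pages and must be at least CHUNK_BITS. */
static unsigned long permute_2mb(unsigned long i, unsigned int order_nr)
{
    assert(order_nr >= CHUNK_BITS);
    assert(order_nr < 8 * sizeof(unsigned long));

    unsigned int upper = order_nr - CHUNK_BITS;
    /* Same split as the patch: roughly half the upper bits, rounded
     * up to an even count. */
    unsigned int shift_low = upper / 2 + (upper / 2) % 2;
    unsigned int shift_high = upper - shift_low;

    /* Input layout:  [ high | low | offset within chunk ] */
    unsigned long offset = i & ((1UL << CHUNK_BITS) - 1);
    unsigned long low = (i >> CHUNK_BITS) & ((1UL << shift_low) - 1);
    unsigned long high = i >> (CHUNK_BITS + shift_low);

    /* Output layout: [ low | high | offset within chunk ] */
    return offset | (low << (shift_high + CHUNK_BITS)) | (high << CHUNK_BITS);
}
```

With order_nr = 17 (a 512MB domain), pages 0..511 map to themselves, page 512 maps to 8192, and page 8192 maps back to 512, so whole 2MB chunks are reordered while the pages inside each chunk stay contiguous.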

[-- Attachment #2: fix-permute.patch --]
[-- Type: text/x-diff, Size: 1911 bytes --]

diff -r 59b8768d0d0d tools/libxc/xc_domain_save.c
--- a/tools/libxc/xc_domain_save.c	Wed Mar 05 11:18:25 2008 +0000
+++ b/tools/libxc/xc_domain_save.c	Tue Mar 18 17:45:03 2008 +0000
@@ -125,34 +125,22 @@ static inline int count_bits ( int nr, v
     return count;
 }
 
-static inline int permute( int i, int nr, int order_nr  )
+static inline int permute(unsigned long i, unsigned long order_nr)
 {
     /* Need a simple permutation function so that we scan pages in a
        pseudo random order, enabling us to get a better estimate of
        the domain's page dirtying rate as we go (there are often
        contiguous ranges of pfns that have similar behaviour, and we
        want to mix them up. */
+  
+  unsigned char keep = 9; /* chunk of 2 MB */
+  unsigned char shift_low = (order_nr - keep) / 2 + ((order_nr - keep) / 2) % 2;
+  unsigned char shift_high = order_nr - keep - shift_low;
 
-    /* e.g. nr->oder 15->4 16->4 17->5 */
-    /* 512MB domain, 128k pages, order 17 */
+  unsigned long high = (i >> (keep + shift_low));
+  unsigned long low = (i >> keep) & ((1 << shift_low) - 1);
 
-    /*
-      QPONMLKJIHGFEDCBA
-             QPONMLKJIH
-      GFEDCBA
-     */
-
-    /*
-      QPONMLKJIHGFEDCBA
-                  EDCBA
-             QPONM
-      LKJIHGF
-      */
-
-    do { i = ((i>>(order_nr-10)) | ( i<<10 ) ) & ((1<<order_nr)-1); }
-    while ( i >= nr ); /* this won't ever loop if nr is a power of 2 */
-
-    return i;
+  return (i & ((1 << keep) - 1)) | (low << (shift_high + keep)) | (high << keep);
 }
 
 static uint64_t tv_to_us(struct timeval *new)
@@ -1126,7 +1114,7 @@ int xc_domain_save(int xc_handle, int io
                    (batch < MAX_BATCH_SIZE) && (N < p2m_size);
                    N++ )
             {
-                int n = permute(N, p2m_size, order_nr);
+                int n = permute(N, order_nr);
 
                 if ( debug )
                 {

[-- Attachment #3: Type: text/plain, Size: 138 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel


* RE: [PATCH] permute with 2MB chunk
  2008-03-18 18:03 [PATCH] permute with 2MB chunk Jean Guyader
@ 2008-03-19  9:42 ` Cui, Dexuan
  2008-03-19 10:00   ` Tian, Kevin
  2008-03-19 10:08   ` Keir Fraser
  0 siblings, 2 replies; 7+ messages in thread
From: Cui, Dexuan @ 2008-03-19  9:42 UTC (permalink / raw)
  To: Jean Guyader, xen-devel

Hi Jean,
The patch does fix the bug. Great!

I ran a test in which I changed xc_hvm_build() to invoke xc_domain_memory_populate_physmap() in the same pfn order that the old permute() produces; I then created an HVM guest and saw almost the same slowness in it!
It looks like the old version of permute() can incur a high cache-miss rate, which would explain the slowness after S/R?

However, I still have questions.
As I remember it, the slowness in this bug:
1) Only happens with HVM guests (PV guests do not have this issue). Is there any difference between HVM and PV here?
2) Only happens with S/R and local non-live migration, not with local live migration. Is there any difference between live and non-live here?
Also, when we suffer from the slowness, a local live migration of the HVM guest brings performance back to normal!

Can you reproduce these on your side? If so, can you help explain them?

Many thanks!

-- Dexuan


* RE: [PATCH] permute with 2MB chunk
  2008-03-19  9:42 ` Cui, Dexuan
@ 2008-03-19 10:00   ` Tian, Kevin
  2008-03-19 10:08   ` Keir Fraser
  1 sibling, 0 replies; 7+ messages in thread
From: Tian, Kevin @ 2008-03-19 10:00 UTC (permalink / raw)
  To: Cui, Dexuan, Jean Guyader, xen-devel

 
I am also a bit curious whether the original intent of permute() still
holds at 2M granularity with this patch:

    /* Need a simple permutation function so that we scan pages in a
       pseudo random order, enabling us to get a better estimate of
       the domain's page dirtying rate as we go (there are often
       contiguous ranges of pfns that have similar behaviour, and we
       want to mix them up. */

If not, maybe permute() can be removed instead, or given some
counterpart on the restore side? :-)

Thanks,
Kevin

* Re: [PATCH] permute with 2MB chunk
  2008-03-19  9:42 ` Cui, Dexuan
  2008-03-19 10:00   ` Tian, Kevin
@ 2008-03-19 10:08   ` Keir Fraser
  2008-03-20  9:05     ` Ian Pratt
  1 sibling, 1 reply; 7+ messages in thread
From: Keir Fraser @ 2008-03-19 10:08 UTC (permalink / raw)
  To: Cui, Dexuan, Jean Guyader, xen-devel

On 19/3/08 09:42, "Cui, Dexuan" <dexuan.cui@intel.com> wrote:

> However, I still have questions:
> For the bug, I remember the slowness
> 1) Only happens to HVM guest (PV-guest has not this issue);   -- any
> difference between HVM and PV here??
> 2) Only happens to S/R and local non-live migration, but doesn't happen to
> local live migration. -- any difference between live and non-live here??
> And when we suffer from the slowness, "local live migrating" the HVM guest can
> make the performance back to normal!
> 
> Can you reproduce these in your side? If so, can you help to explain them?

We also tested building an HVM guest with the permuted ordering of pages,
versus reverse ordering, versus normal ordering. Only the permuted ordering
showed the problem. We assume that the permute() function has an unfortunate
interaction with the memory allocator in certain HVM guest OSes, causing
poor cache utilisation.
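
To make the permuted ordering concrete, here is a standalone sketch of the old permute() that the patch removes. The loop body is copied from the removed lines; the name permute_old, the unsigned types and the asserts are assumptions added for illustration. The function is a rotate-left by 10 bits within order_nr bits, so consecutive scan positions land 1024 pages (4MB with 4KB pages) apart. Whether that scatter is what hurts the guest's cache utilisation remains the assumption stated above, not something this sketch demonstrates.

```c
#include <assert.h>

/* Illustrative copy of the pre-patch permute(): rotate the page index
 * left by 10 bits inside an order_nr-bit field, retrying while the
 * result falls outside [0, nr). */
static unsigned long permute_old(unsigned long i, unsigned long nr,
                                 unsigned int order_nr)
{
    assert(order_nr > 10 && order_nr < 8 * sizeof(unsigned long) - 10);
    assert(i < nr && nr <= (1UL << order_nr));

    unsigned long mask = (1UL << order_nr) - 1;

    do {
        i = ((i >> (order_nr - 10)) | (i << 10)) & mask;
    } while (i >= nr); /* never loops if nr is a power of 2 */

    return i;
}
```

For a 512MB domain (nr = 131072, order_nr = 17), scan positions 0, 1, 2 map to pfns 0, 1024, 2048, and position 128 wraps around to pfn 1, so no two neighbouring scan positions touch neighbouring pages.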

The fact that live migration made the bug go away can perhaps be explained
by the fact that multiple rounds of page transmission add an extra layer of
randomisation to page allocations at the receiver?

 -- Keir


* RE: [PATCH] permute with 2MB chunk
  2008-03-19 10:08   ` Keir Fraser
@ 2008-03-20  9:05     ` Ian Pratt
  2008-03-20  9:13       ` Keir Fraser
  2008-03-25 12:34       ` Jean Guyader
  0 siblings, 2 replies; 7+ messages in thread
From: Ian Pratt @ 2008-03-20  9:05 UTC (permalink / raw)
  To: Keir Fraser, Cui, Dexuan, Jean Guyader (Intern), xen-devel; +Cc: Ian Pratt

> We also tested building an HVM guest with the permuted ordering of
> pages, versus reverse ordering, versus normal ordering. Only the
> permuted ordering showed the problem. We assume that the permute()
> function has an unfortunate interaction with the memory allocator in
> certain HVM guest OSes, causing poor cache utilisation.

It's still very odd that the permutation fn only seems to affect Linux
running as an HVM guest and not as a PV guest. I still think there's
something we're not quite understanding.

Jean: have you definitely verified that building a domain with the
permute function does not affect Linux PV guests?

Thanks,
Ian


* Re: [PATCH] permute with 2MB chunk
  2008-03-20  9:05     ` Ian Pratt
@ 2008-03-20  9:13       ` Keir Fraser
  2008-03-25 12:34       ` Jean Guyader
  1 sibling, 0 replies; 7+ messages in thread
From: Keir Fraser @ 2008-03-20  9:13 UTC (permalink / raw)
  To: Ian Pratt, Cui, Dexuan, Jean Guyader (Intern), xen-devel

On 20/3/08 09:05, "Ian Pratt" <Ian.Pratt@eu.citrix.com> wrote:

>> We also tested building an HVM guest with the permuted ordering of
>> pages, versus reverse ordering, versus normal ordering. Only the
>> permuted ordering showed the problem. We assume that the permute()
>> function has an unfortunate interaction with the memory allocator in
>> certain HVM guest OSes, causing poor cache utilisation.
>
> It's still very odd that the permutation fn only seems to affect Linux
> running as an HVM guest and not as a PV guest. I still think there's
> something we're not quite understanding.
> 
> Jean: have you definitely verified that building a domain with the
> permute function does not affect Linux PV guests?

Dexuan has also claimed in private email that the 2MB permute function
speeds up kernel builds in save-restored HVM guests from 62s to 58s. We
don't know the confidence intervals for those figures though. Still, there's
presumably something rather fragile underlying all this...

 -- Keir


* Re: [PATCH] permute with 2MB chunk
  2008-03-20  9:05     ` Ian Pratt
  2008-03-20  9:13       ` Keir Fraser
@ 2008-03-25 12:34       ` Jean Guyader
  1 sibling, 0 replies; 7+ messages in thread
From: Jean Guyader @ 2008-03-25 12:34 UTC (permalink / raw)
  To: Ian Pratt; +Cc: xen-devel, Keir Fraser, Cui, Dexuan

[-- Attachment #1: Type: text/plain, Size: 897 bytes --]

Ian Pratt wrote:
>> We also tested building an HVM guest with the permuted ordering of
>> pages, versus reverse ordering, versus normal ordering. Only the
>> permuted ordering showed the problem. We assume that the permute()
>> function has an unfortunate interaction with the memory allocator in
>> certain HVM guest OSes, causing poor cache utilisation.
>
> It's still very odd that the permutation fn only seems to affect Linux
> running as an HVM guest and not as a PV guest. I still think there's
> something we're not quite understanding.
> 
> Jean: have you definitely verified that building a domain with the
> permute function does not affect Linux PV guests?
> 

Here is a new version of the permute patch; it should be applied instead of
the previous one. It now works with PV guests. Sorry for the delay.

Signed-off-by: Jean Guyader <jean.guyader@eu.citrix.com>
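
As a rough illustration of what changes in this version, the sketch below combines the order calculation and the range check: the chunk permutation is re-applied until the result falls inside [0, nr), which keeps the mapping one-to-one even when the number of pages is not a power of 2. The names order_of, permute_in_range and visits_each_page_once, the unsigned types and the asserts are assumptions made for this example; only the bit manipulation follows the patch.

```c
#include <assert.h>
#include <stdlib.h>

#define CHUNK_BITS 9u /* 512 pages of 4KB = one 2MB chunk */

/* Bits needed to index nr pages, e.g. 15 -> 4, 16 -> 4, 17 -> 5. */
static unsigned int order_of(unsigned long nr)
{
    unsigned int order = 0;
    for (unsigned long v = nr - 1; v; v >>= 1)
        order++;
    return order;
}

/* Swap the two bit fields above the chunk offset, repeating until the
 * result is a valid page index below nr. */
static unsigned long permute_in_range(unsigned long i, unsigned long nr,
                                      unsigned int order_nr)
{
    assert(i < nr && order_nr >= CHUNK_BITS);

    unsigned int shift_high = (order_nr - CHUNK_BITS) / 2;
    unsigned int shift_low = order_nr - CHUNK_BITS - shift_high;

    do {
        unsigned long high = i >> (CHUNK_BITS + shift_low);
        unsigned long low = (i >> CHUNK_BITS) & ((1UL << shift_low) - 1);
        i = (i & ((1UL << CHUNK_BITS) - 1)) |
            (low << (shift_high + CHUNK_BITS)) | (high << CHUNK_BITS);
    } while (i >= nr);

    return i;
}

/* Returns 1 if every index in [0, nr) is produced exactly once. */
static int visits_each_page_once(unsigned long nr)
{
    unsigned int order = order_of(nr);
    unsigned char *seen = calloc(nr, 1);
    int ok = (seen != NULL);

    for (unsigned long i = 0; ok && i < nr; i++) {
        unsigned long n = permute_in_range(i, nr, order);
        if (n >= nr || seen[n])
            ok = 0;
        else
            seen[n] = 1;
    }
    free(seen);
    return ok;
}
```

With nr = 1280 pages (order 11), page 800 first maps to 1312, which is out of range, and the second pass brings it back to 800; pages 512 and 1024 simply swap.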

-- 
Jean Guyader

[-- Attachment #2: fix-permute2.patch --]
[-- Type: text/x-diff, Size: 2620 bytes --]

diff -r 76c9cf11ce23 tools/libxc/xc_domain_save.c
--- a/tools/libxc/xc_domain_save.c	Fri Mar 21 09:45:34 2008 +0000
+++ b/tools/libxc/xc_domain_save.c	Tue Mar 25 12:31:42 2008 +0000
@@ -123,6 +123,32 @@ static inline int count_bits ( int nr, v
     for ( i = 0; i < (nr / (sizeof(unsigned long)*8)); i++, p++ )
         count += hweight32(*p);
     return count;
+}
+
+static inline int permute(unsigned long i, unsigned long nr, unsigned long order_nr)
+{
+    /* Need a simple permutation function so that we scan pages in a
+       pseudo random order, enabling us to get a better estimate of
+       the domain's page dirtying rate as we go (there are often
+       contiguous ranges of pfns that have similar behaviour, and we
+       want to mix them up. */
+  
+  unsigned char keep = 9; /* chunk of 2 MB */
+  unsigned char shift_high = (order_nr - keep) / 2;
+  unsigned char shift_low = order_nr - keep - (order_nr - keep) / 2;
+  
+  /* Check if the permutation gives an out of range number. */
+  do
+  {
+    unsigned long high = (i >> (keep + shift_low));
+    unsigned long low = (i >> keep) & ((1 << shift_low) - 1);
+    i = (i & ((1 << keep) - 1)) |
+      (low << (shift_high + keep)) | (high << keep);
+  }
+  while (i >= nr);
+
+
+  return (i);
 }
 
 static uint64_t tv_to_us(struct timeval *new)
@@ -735,6 +761,7 @@ static xen_pfn_t *map_and_save_p2m_table
         p2m_frame_list[i/FPP] = mfn_to_pfn(p2m_frame_list[i/FPP]);
     }
 
+    memset(&ctxt, 0, sizeof (ctxt));
     if ( xc_vcpu_getcontext(xc_handle, dom, 0, &ctxt.c) )
     {
         ERROR("Could not get vcpu context");
@@ -828,6 +855,8 @@ int xc_domain_save(int xc_handle, int io
 
     /* base of the region in which domain memory is mapped */
     unsigned char *region_base = NULL;
+
+    int order_nr = 0;
 
     /* bitmap of pages:
        - that should be sent this iteration (unless later marked as skip);
@@ -937,6 +966,11 @@ int xc_domain_save(int xc_handle, int io
 
     /* pretend we sent all the pages last iteration */
     sent_last_iter = p2m_size;
+
+    /* calculate the power of 2 order of p2m_size, e.g.
+       15->4 16->4 17->5 */
+    for ( i = p2m_size-1, order_nr = 0; i ; i >>= 1, order_nr++ )
+      continue;
 
     /* Setup to_send / to_fix and to_skip bitmaps */
     to_send = malloc(BITMAP_SIZE);
@@ -1088,7 +1122,7 @@ int xc_domain_save(int xc_handle, int io
                    (batch < MAX_BATCH_SIZE) && (N < p2m_size);
                    N++ )
             {
-                int n = N;
+                int n = permute(N, p2m_size, order_nr);
 
                 if ( debug )
                 {
