[PATCH] scrub pages on guest termination

All of lore.kernel.org
 help / color / mirror / Atom feed

* [PATCH] scrub pages on guest termination
@ 2008-05-23 15:00 Ben Guthro
  2008-05-23 16:04 ` Keir Fraser
  0 siblings, 1 reply; 4+ messages in thread
From: Ben Guthro @ 2008-05-23 15:00 UTC (permalink / raw)
  To: xen-devel; +Cc: Robert Phillips


[-- Attachment #1.1: Type: text/plain, Size: 1107 bytes --]

This patch solves the following problem.  When a large VS terminates, the node locks
up. The node locks up because the page_scrub_kick routine sends a softirq to
all processors instructing them to run the page scrub code.  There they interfere
with each other as they serialize behind the page_scrub_lock.

The patch does two things:

(1) In page_scrub_kick, only a single cpu is interrupted.  Some cpu other than
the calling cpu is chosen (if available) because we assume the calling cpu
has other higher priority work to do.

(2) In page_scrub_softirq, if more than one cpu is online, the first cpu
to start scrubbing designates itself as the primary_scrubber.  As such
it is dedicated to scrubbing pages until the list is empty.  Other cpus
might call page_scrub_softirq but they spend only 1 msec scrubbing before
returning to check for other higher priority work.  But, with multiple cpus
online, the node can afford to have one cpu dedicated to scrubbing when
that work needs to be done.

Signed-off-by: Robert Phillips <rphillips@virtualiron.com>
Signed-off-by: Ben Guthro <bguthro@virtualiron.com>


[-- Attachment #1.2: Type: text/html, Size: 1466 bytes --]

[-- Attachment #2: xen-page-scrub.patch --]
[-- Type: text/plain, Size: 2859 bytes --]

diff -r 29dc52031954 xen/common/page_alloc.c
--- a/xen/common/page_alloc.c
+++ b/xen/common/page_alloc.c
@@ -984,16 +984,23 @@
     void             *p;
     int               i;
     s_time_t          start = NOW();
+    static int        primary_scrubber = -1;
 
-    /* Aim to do 1ms of work every 10ms. */
+    /* Unless SMP, aim to do 1ms of work every 10ms. */
     do {
         spin_lock(&page_scrub_lock);
 
         if ( unlikely((ent = page_scrub_list.next) == &page_scrub_list) )
         {
+            if (primary_scrubber == smp_processor_id())
+                primary_scrubber = -1;
             spin_unlock(&page_scrub_lock);
             return;
         }
+        
+        /* If SMP, dedicate a cpu to scrubbing til the job is done */
+        if (primary_scrubber == -1 && num_online_cpus() > 1)
+            primary_scrubber = smp_processor_id();
         
         /* Peel up to 16 pages from the list. */
         for ( i = 0; i < 16; i++ )
@@ -1020,7 +1027,7 @@
             unmap_domain_page(p);
             free_heap_pages(pfn_dom_zone_type(page_to_mfn(pg)), pg, 0);
         }
-    } while ( (NOW() - start) < MILLISECS(1) );
+    } while ( primary_scrubber == smp_processor_id() || (NOW() - start) < MILLISECS(1) );
 
     set_timer(&this_cpu(page_scrub_timer), NOW() + MILLISECS(10));
 }
diff -r 29dc52031954 xen/include/xen/mm.h
--- a/xen/include/xen/mm.h
+++ b/xen/include/xen/mm.h
@@ -90,10 +90,21 @@
         if ( !list_empty(&page_scrub_list) )    \
             raise_softirq(PAGE_SCRUB_SOFTIRQ);  \
     } while ( 0 )
-#define page_scrub_kick()                                               \
-    do {                                                                \
-        if ( !list_empty(&page_scrub_list) )                            \
-            cpumask_raise_softirq(cpu_online_map, PAGE_SCRUB_SOFTIRQ);  \
+
+#define page_scrub_kick()                                       \
+    do {                                                        \
+        if ( !list_empty(&page_scrub_list) ) {                  \
+            int cpu;                                            \
+            /* Try to use some other cpu. */                    \
+            for_each_online_cpu(cpu) {                          \
+                if (cpu != smp_processor_id()) {                \
+                    cpu_raise_softirq(cpu, PAGE_SCRUB_SOFTIRQ); \
+                    break;                                      \
+                }                                               \
+            }                                                   \
+            if (cpu >= NR_CPUS)                                 \
+                raise_softirq(PAGE_SCRUB_SOFTIRQ);              \
+        }                                                       \
     } while ( 0 )
 unsigned long avail_scrub_pages(void);
 

[-- Attachment #3: Type: text/plain, Size: 138 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] scrub pages on guest termination
  2008-05-23 15:00 [PATCH] scrub pages on guest termination Ben Guthro
@ 2008-05-23 16:04 ` Keir Fraser
  2008-05-23 17:01   ` Ben Guthro
  0 siblings, 1 reply; 4+ messages in thread
From: Keir Fraser @ 2008-05-23 16:04 UTC (permalink / raw)
  To: Ben Guthro, xen-devel; +Cc: Robert Phillips

[-- Attachment #1.1: Type: text/plain, Size: 937 bytes --]

The aim of the loop was to scrub enough pages in a batch that lock
contention is kept tolerably low. Even if 16 pages is not sufficient for
that, I¹m surprised a Œnode¹ (you mean a whole system, presumably?) would
appear to lock up. Maybe pages would be scrubbed slower than we¹d like, but
still CPUs should be able to get the spinlock often enough to evaluate
whether they have spent 1ms in the loop and hence get out of there.

What sort of system were you seeing the lockup on? Does it have very many
physical CPUs?

 -- Keir

On 23/5/08 16:00, "Ben Guthro" <bguthro@virtualiron.com> wrote:

> This patch solves the following problem.  When a large VS terminates, the node
> locks
> up. The node locks up because the page_scrub_kick routine sends a softirq to
> all processors instructing them to run the page scrub code.  There they
> interfere
> with each other as they serialize behind the page_scrub_lock.

[-- Attachment #1.2: Type: text/html, Size: 1387 bytes --]

[-- Attachment #2: Type: text/plain, Size: 138 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] scrub pages on guest termination
  2008-05-23 16:04 ` Keir Fraser
@ 2008-05-23 17:01   ` Ben Guthro
  2008-05-23 17:19     ` Keir Fraser
  0 siblings, 1 reply; 4+ messages in thread
From: Ben Guthro @ 2008-05-23 17:01 UTC (permalink / raw)
  To: Keir Fraser; +Cc: xen-devel, Robert Phillips


[-- Attachment #1.1: Type: text/plain, Size: 1494 bytes --]

Yes, sorry -  should have removed our terminology from the description.
Node=physical machine
VS=HVM guest w/ pv-on-hvm drivers
Looking back at the original bug report - it seems to indicate it was 
migrating from a system with 2 processors to one with 8

Specifcally - from
Dell Precision WorkStation 380
Processor:    Intel(R) Pentium(R) D CPU 2.80GHz
# of CPUs:    2
Speed:    2.8GHz

to

Supermicro X7DB8
Processor:    Genuine Intel(R) CPU @ 2.13GHz
# of CPUs:    8
Speed:    2.133 GHz




Keir Fraser wrote:
> The aim of the loop was to scrub enough pages in a batch that lock 
> contention is kept tolerably low. Even if 16 pages is not sufficient 
> for that, I'm surprised a 'node' (you mean a whole system, 
> presumably?) would appear to lock up. Maybe pages would be scrubbed 
> slower than we'd like, but still CPUs should be able to get the 
> spinlock often enough to evaluate whether they have spent 1ms in the 
> loop and hence get out of there.
>
> What sort of system were you seeing the lockup on? Does it have very 
> many physical CPUs?
>
>  -- Keir
>
> On 23/5/08 16:00, "Ben Guthro" <bguthro@virtualiron.com> wrote:
>
>     This patch solves the following problem.  When a large VS
>     terminates, the node locks
>     up. The node locks up because the page_scrub_kick routine sends a
>     softirq to
>     all processors instructing them to run the page scrub code.  There
>     they interfere
>     with each other as they serialize behind the page_scrub_lock.
>
>


[-- Attachment #1.2: Type: text/html, Size: 2418 bytes --]

[-- Attachment #2: Type: text/plain, Size: 138 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] scrub pages on guest termination
  2008-05-23 17:01   ` Ben Guthro
@ 2008-05-23 17:19     ` Keir Fraser
  0 siblings, 0 replies; 4+ messages in thread
From: Keir Fraser @ 2008-05-23 17:19 UTC (permalink / raw)
  To: Ben Guthro; +Cc: xen-devel, Robert Phillips

[-- Attachment #1.1: Type: text/plain, Size: 1522 bytes --]

On 23/5/08 18:01, "Ben Guthro" <bguthro@virtualiron.com> wrote:

> Yes, sorry -  should have removed our terminology from the description.
> Node=physical machine
> VS=HVM guest w/ pv-on-hvm drivers
> Looking back at the original bug report - it seems to indicate it was
> migrating from a system with 2 processors to one with 8

It¹s very surprising that lock contention would cause such a severe lack of
progress on an 8-CPU system. If the lock is that hotly contended then even
the usage of it in free_domheap_pages() has to be questionable.

I¹m inclined to say that if we want to address this then we should do it in
one or more of the following ways:
 1. Count CPUs into the scrub function with an atomic_t and beyond a limit
all other CPUs bail straight out after re-setting their timer.
 2. Increase scrub batch size to reduce proportion of time that each loop
iteration holds the lock.
 3. Turn the spin_lock() into a spin_trylock() so that the timeout check can
be guaranteed to execute frequently.
 4. Eliminate the global lock by building a lock-free linked list, or by
maintaining per-CPU hashed work queues with work stealing, or... etc.

The patch as-is at least suffers from the issue that the Œprimary scrubber¹
should be regularly checking for softirq work. But I¹m not sure such a
sizeable change to the scheduling policy for scrubbing (such as it is!) is
necessary or desirable.

Option 4 is on the morally highest ground but is of course the most work.
:-)

 -- Keir

[-- Attachment #1.2: Type: text/html, Size: 2060 bytes --]

[-- Attachment #2: Type: text/plain, Size: 138 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2008-05-23 17:19 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-05-23 15:00 [PATCH] scrub pages on guest termination Ben Guthro
2008-05-23 16:04 ` Keir Fraser
2008-05-23 17:01   ` Ben Guthro
2008-05-23 17:19     ` Keir Fraser

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.