* [PATCH v3 0/5] Reduce cross CPU IPI interference
@ 2011-11-13 10:17 Gilad Ben-Yossef
2011-11-13 10:17 ` [PATCH v3 1/5] smp: Introduce a generic on_each_cpu_mask function Gilad Ben-Yossef
` (4 more replies)
0 siblings, 5 replies; 14+ messages in thread
From: Gilad Ben-Yossef @ 2011-11-13 10:17 UTC (permalink / raw)
To: linux-kernel
Cc: Gilad Ben-Yossef, Peter Zijlstra, Frederic Weisbecker,
Russell King, linux-mm, Christoph Lameter, Pekka Enberg,
Matt Mackall, Sasha Levin, Rik van Riel, Andi Kleen
We have lots of infrastructure in place to partition a multi-core system such
that we have a group of CPUs dedicated to specific tasks: cgroups, scheduler
and interrupt affinity, and the isolcpus boot parameter. Still, kernel code
will sometimes interrupt all CPUs in the system via IPIs for various needs.
These IPIs are useful and cannot be avoided altogether, but in certain cases
it is possible to interrupt only specific CPUs that have useful work to do,
rather than the entire system.
This patch set, inspired by discussions with Peter Zijlstra and Frederic
Weisbecker when testing the nohz task patch set, is a first stab at trying to
explore doing this by locating the places where such global IPI calls are
being made and turning a global IPI into an IPI for a specific group of CPUs.
The purpose of the patch set is to get feedback on whether this is the right
way to go for dealing with this issue and, indeed, whether the issue is even
worth dealing with at all. Based on the feedback from this patch set I plan
to offer further patches that address similar issues in other code paths.
The patch set creates an on_each_cpu_mask infrastructure API (derived from
existing arch specific versions in Tile and ARM) and uses it to turn two
global IPI invocations into per-CPU-group invocations.
This 3rd version incorporates changes due to reviewers' feedback.
The major changes from the previous version of the patch are:
- Reverted to the much simpler way of handling cpumask allocation in slub.c
flush_all() that was used in the first iteration of the patch, at the
suggestion of Andi K., Christoph L. and Pekka E., after testing with fault
injection of memory allocation failures showed that this is safe even for
the CPUMASK_OFFSTACK=y case.
- Rewrote the patch that handles per-cpu page cache draining to only
calculate which CPUs to IPI when a drain is requested, instead of tracking
the CPUs as allocations and deallocations progress, in similar fashion to
what was done in the other patch for the slub cache, at the suggestion of
Christoph L. and Rik v.R. The code is now much smaller and touches
only non-fast-path code.
The patch set was compiled for arm and boot tested on x86 in UP and SMP modes,
with and without CONFIG_CPUMASK_OFFSTACK, and was further tested by running
hackbench on x86 in SMP mode in a 4-CPU VM with no obvious regressions.
I also artificially exercised SLUB flush_all via the debug interface and observed
the difference in IPI count across processors with and without the patch - from
an IPI on all processors but one without the patch, to a subset (and often no
IPI at all) with it.
I further used the fault injection framework to force cpumask allocation
failures for the CPUMASK_OFFSTACK=y case, triggering the code via the slub
sysfs debug interface and by running ./hackbench 1000 for page_alloc, with no
critical failures.
I believe it's as good as this patch set is going to get :-)
Signed-off-by: Gilad Ben-Yossef <gilad@benyossef.com>
Acked-by: Chris Metcalf <cmetcalf@tilera.com>
CC: Peter Zijlstra <a.p.zijlstra@chello.nl>
CC: Frederic Weisbecker <fweisbec@gmail.com>
CC: Russell King <linux@arm.linux.org.uk>
CC: linux-mm@kvack.org
CC: Christoph Lameter <cl@linux-foundation.org>
CC: Pekka Enberg <penberg@kernel.org>
CC: Matt Mackall <mpm@selenic.com>
CC: Sasha Levin <levinsasha928@gmail.com>
CC: Rik van Riel <riel@redhat.com>
CC: Andi Kleen <andi@firstfloor.org>
Gilad Ben-Yossef (5):
smp: Introduce a generic on_each_cpu_mask function
arm: Move arm over to generic on_each_cpu_mask
tile: Move tile to use generic on_each_cpu_mask
slub: Only IPI CPUs that have per cpu obj to flush
mm: Only IPI CPUs to drain local pages if they exist
arch/arm/kernel/smp_tlb.c | 20 +++++---------------
arch/tile/include/asm/smp.h | 7 -------
arch/tile/kernel/smp.c | 19 -------------------
include/linux/smp.h | 16 ++++++++++++++++
kernel/smp.c | 20 ++++++++++++++++++++
mm/page_alloc.c | 18 +++++++++++++++++-
mm/slub.c | 15 ++++++++++++++-
7 files changed, 72 insertions(+), 43 deletions(-)
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
* [PATCH v3 1/5] smp: Introduce a generic on_each_cpu_mask function
2011-11-13 10:17 [PATCH v3 0/5] Reduce cross CPU IPI interference Gilad Ben-Yossef
@ 2011-11-13 10:17 ` Gilad Ben-Yossef
2011-11-15 15:51 ` Christoph Lameter
2011-11-13 10:17 ` [PATCH v3 2/5] arm: Move arm over to generic on_each_cpu_mask Gilad Ben-Yossef
` (3 subsequent siblings)
4 siblings, 1 reply; 14+ messages in thread
From: Gilad Ben-Yossef @ 2011-11-13 10:17 UTC (permalink / raw)
To: linux-kernel
Cc: Gilad Ben-Yossef, Peter Zijlstra, Frederic Weisbecker,
Russell King, linux-mm, Christoph Lameter, Pekka Enberg,
Matt Mackall, Rik van Riel, Andi Kleen
on_each_cpu_mask calls a function on processors specified by the cpumask,
which may include the local processor.
All the limitations of smp_call_function_many() apply.
Signed-off-by: Gilad Ben-Yossef <gilad@benyossef.com>
Acked-by: Chris Metcalf <cmetcalf@tilera.com>
CC: Peter Zijlstra <a.p.zijlstra@chello.nl>
CC: Frederic Weisbecker <fweisbec@gmail.com>
CC: Russell King <linux@arm.linux.org.uk>
CC: linux-mm@kvack.org
CC: Christoph Lameter <cl@linux-foundation.org>
CC: Pekka Enberg <penberg@kernel.org>
CC: Matt Mackall <mpm@selenic.com>
CC: Rik van Riel <riel@redhat.com>
CC: Andi Kleen <andi@firstfloor.org>
---
include/linux/smp.h | 16 ++++++++++++++++
kernel/smp.c | 20 ++++++++++++++++++++
2 files changed, 36 insertions(+), 0 deletions(-)
diff --git a/include/linux/smp.h b/include/linux/smp.h
index 8cc38d3..60628d7 100644
--- a/include/linux/smp.h
+++ b/include/linux/smp.h
@@ -102,6 +102,13 @@ static inline void call_function_init(void) { }
int on_each_cpu(smp_call_func_t func, void *info, int wait);
/*
+ * Call a function on processors specified by mask, which might include
+ * the local one.
+ */
+void on_each_cpu_mask(const struct cpumask *mask, void (*func)(void *),
+ void *info, bool wait);
+
+/*
* Mark the boot cpu "online" so that it can call console drivers in
* printk() and can access its per-cpu storage.
*/
@@ -132,6 +139,15 @@ static inline int up_smp_call_function(smp_call_func_t func, void *info)
local_irq_enable(); \
0; \
})
+#define on_each_cpu_mask(mask, func, info, wait) \
+ do { \
+ if (cpumask_test_cpu(0, (mask))) { \
+ local_irq_disable(); \
+ (func)(info); \
+ local_irq_enable(); \
+ } \
+ } while (0)
+
static inline void smp_send_reschedule(int cpu) { }
#define num_booting_cpus() 1
#define smp_prepare_boot_cpu() do {} while (0)
diff --git a/kernel/smp.c b/kernel/smp.c
index db197d6..7c0cbd7 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -701,3 +701,23 @@ int on_each_cpu(void (*func) (void *info), void *info, int wait)
return ret;
}
EXPORT_SYMBOL(on_each_cpu);
+
+/*
+ * Call a function on processors specified by cpumask, which may include
+ * the local processor. All the limitation specified in smp_call_function_many
+ * apply.
+ */
+void on_each_cpu_mask(const struct cpumask *mask, void (*func)(void *),
+ void *info, bool wait)
+{
+ int cpu = get_cpu();
+
+ smp_call_function_many(mask, func, info, wait);
+ if (cpumask_test_cpu(cpu, mask)) {
+ local_irq_disable();
+ func(info);
+ local_irq_enable();
+ }
+ put_cpu();
+}
+EXPORT_SYMBOL(on_each_cpu_mask);
--
1.7.0.4
* [PATCH v3 2/5] arm: Move arm over to generic on_each_cpu_mask
2011-11-13 10:17 [PATCH v3 0/5] Reduce cross CPU IPI interference Gilad Ben-Yossef
2011-11-13 10:17 ` [PATCH v3 1/5] smp: Introduce a generic on_each_cpu_mask function Gilad Ben-Yossef
@ 2011-11-13 10:17 ` Gilad Ben-Yossef
2011-11-13 10:17 ` [PATCH v3 3/5] tile: Move tile to use " Gilad Ben-Yossef
` (2 subsequent siblings)
4 siblings, 0 replies; 14+ messages in thread
From: Gilad Ben-Yossef @ 2011-11-13 10:17 UTC (permalink / raw)
To: linux-kernel
Cc: Gilad Ben-Yossef, Peter Zijlstra, Frederic Weisbecker,
Russell King, linux-mm, Christoph Lameter, Pekka Enberg,
Matt Mackall, Rik van Riel, Andi Kleen
Note that the generic version takes the mask as its first parameter.
Signed-off-by: Gilad Ben-Yossef <gilad@benyossef.com>
Acked-by: Chris Metcalf <cmetcalf@tilera.com>
CC: Peter Zijlstra <a.p.zijlstra@chello.nl>
CC: Frederic Weisbecker <fweisbec@gmail.com>
CC: Russell King <linux@arm.linux.org.uk>
CC: linux-mm@kvack.org
CC: Christoph Lameter <cl@linux-foundation.org>
CC: Pekka Enberg <penberg@kernel.org>
CC: Matt Mackall <mpm@selenic.com>
CC: Rik van Riel <riel@redhat.com>
CC: Andi Kleen <andi@firstfloor.org>
---
arch/arm/kernel/smp_tlb.c | 20 +++++---------------
1 files changed, 5 insertions(+), 15 deletions(-)
diff --git a/arch/arm/kernel/smp_tlb.c b/arch/arm/kernel/smp_tlb.c
index 7dcb352..02c5d2c 100644
--- a/arch/arm/kernel/smp_tlb.c
+++ b/arch/arm/kernel/smp_tlb.c
@@ -13,18 +13,6 @@
#include <asm/smp_plat.h>
#include <asm/tlbflush.h>
-static void on_each_cpu_mask(void (*func)(void *), void *info, int wait,
- const struct cpumask *mask)
-{
- preempt_disable();
-
- smp_call_function_many(mask, func, info, wait);
- if (cpumask_test_cpu(smp_processor_id(), mask))
- func(info);
-
- preempt_enable();
-}
-
/**********************************************************************/
/*
@@ -87,7 +75,7 @@ void flush_tlb_all(void)
void flush_tlb_mm(struct mm_struct *mm)
{
if (tlb_ops_need_broadcast())
- on_each_cpu_mask(ipi_flush_tlb_mm, mm, 1, mm_cpumask(mm));
+ on_each_cpu_mask(mm_cpumask(mm), ipi_flush_tlb_mm, mm, 1);
else
local_flush_tlb_mm(mm);
}
@@ -98,7 +86,8 @@ void flush_tlb_page(struct vm_area_struct *vma, unsigned long uaddr)
struct tlb_args ta;
ta.ta_vma = vma;
ta.ta_start = uaddr;
- on_each_cpu_mask(ipi_flush_tlb_page, &ta, 1, mm_cpumask(vma->vm_mm));
+ on_each_cpu_mask(mm_cpumask(vma->vm_mm), ipi_flush_tlb_page,
+ &ta, 1);
} else
local_flush_tlb_page(vma, uaddr);
}
@@ -121,7 +110,8 @@ void flush_tlb_range(struct vm_area_struct *vma,
ta.ta_vma = vma;
ta.ta_start = start;
ta.ta_end = end;
- on_each_cpu_mask(ipi_flush_tlb_range, &ta, 1, mm_cpumask(vma->vm_mm));
+ on_each_cpu_mask(mm_cpumask(vma->vm_mm), ipi_flush_tlb_range,
+ &ta, 1);
} else
local_flush_tlb_range(vma, start, end);
}
--
1.7.0.4
* [PATCH v3 3/5] tile: Move tile to use generic on_each_cpu_mask
2011-11-13 10:17 [PATCH v3 0/5] Reduce cross CPU IPI interference Gilad Ben-Yossef
2011-11-13 10:17 ` [PATCH v3 1/5] smp: Introduce a generic on_each_cpu_mask function Gilad Ben-Yossef
2011-11-13 10:17 ` [PATCH v3 2/5] arm: Move arm over to generic on_each_cpu_mask Gilad Ben-Yossef
@ 2011-11-13 10:17 ` Gilad Ben-Yossef
2011-11-13 10:17 ` [PATCH v3 4/5] slub: Only IPI CPUs that have per cpu obj to flush Gilad Ben-Yossef
2011-11-13 10:17 ` [PATCH v3 5/5] mm: Only IPI CPUs to drain local pages if they exist Gilad Ben-Yossef
4 siblings, 0 replies; 14+ messages in thread
From: Gilad Ben-Yossef @ 2011-11-13 10:17 UTC (permalink / raw)
To: linux-kernel
Cc: Gilad Ben-Yossef, Peter Zijlstra, Frederic Weisbecker,
Russell King, linux-mm, Christoph Lameter, Pekka Enberg,
Matt Mackall, Rik van Riel, Andi Kleen
The API is the same as the tile private one, so just remove
the private versions of the functions.
Signed-off-by: Gilad Ben-Yossef <gilad@benyossef.com>
Acked-by: Chris Metcalf <cmetcalf@tilera.com>
CC: Peter Zijlstra <a.p.zijlstra@chello.nl>
CC: Frederic Weisbecker <fweisbec@gmail.com>
CC: Russell King <linux@arm.linux.org.uk>
CC: linux-mm@kvack.org
CC: Christoph Lameter <cl@linux-foundation.org>
CC: Pekka Enberg <penberg@kernel.org>
CC: Matt Mackall <mpm@selenic.com>
CC: Rik van Riel <riel@redhat.com>
CC: Andi Kleen <andi@firstfloor.org>
---
arch/tile/include/asm/smp.h | 7 -------
arch/tile/kernel/smp.c | 19 -------------------
2 files changed, 0 insertions(+), 26 deletions(-)
diff --git a/arch/tile/include/asm/smp.h b/arch/tile/include/asm/smp.h
index 532124a..1aa759a 100644
--- a/arch/tile/include/asm/smp.h
+++ b/arch/tile/include/asm/smp.h
@@ -43,10 +43,6 @@ void evaluate_message(int tag);
/* Boot a secondary cpu */
void online_secondary(void);
-/* Call a function on a specified set of CPUs (may include this one). */
-extern void on_each_cpu_mask(const struct cpumask *mask,
- void (*func)(void *), void *info, bool wait);
-
/* Topology of the supervisor tile grid, and coordinates of boot processor */
extern HV_Topology smp_topology;
@@ -91,9 +87,6 @@ void print_disabled_cpus(void);
#else /* !CONFIG_SMP */
-#define on_each_cpu_mask(mask, func, info, wait) \
- do { if (cpumask_test_cpu(0, (mask))) func(info); } while (0)
-
#define smp_master_cpu 0
#define smp_height 1
#define smp_width 1
diff --git a/arch/tile/kernel/smp.c b/arch/tile/kernel/smp.c
index c52224d..a44e103 100644
--- a/arch/tile/kernel/smp.c
+++ b/arch/tile/kernel/smp.c
@@ -87,25 +87,6 @@ void send_IPI_allbutself(int tag)
send_IPI_many(&mask, tag);
}
-
-/*
- * Provide smp_call_function_mask, but also run function locally
- * if specified in the mask.
- */
-void on_each_cpu_mask(const struct cpumask *mask, void (*func)(void *),
- void *info, bool wait)
-{
- int cpu = get_cpu();
- smp_call_function_many(mask, func, info, wait);
- if (cpumask_test_cpu(cpu, mask)) {
- local_irq_disable();
- func(info);
- local_irq_enable();
- }
- put_cpu();
-}
-
-
/*
* Functions related to starting/stopping cpus.
*/
--
1.7.0.4
* [PATCH v3 4/5] slub: Only IPI CPUs that have per cpu obj to flush
2011-11-13 10:17 [PATCH v3 0/5] Reduce cross CPU IPI interference Gilad Ben-Yossef
` (2 preceding siblings ...)
2011-11-13 10:17 ` [PATCH v3 3/5] tile: Move tile to use " Gilad Ben-Yossef
@ 2011-11-13 10:17 ` Gilad Ben-Yossef
2011-11-13 12:20 ` Hillf Danton
2011-11-15 15:54 ` Christoph Lameter
2011-11-13 10:17 ` [PATCH v3 5/5] mm: Only IPI CPUs to drain local pages if they exist Gilad Ben-Yossef
4 siblings, 2 replies; 14+ messages in thread
From: Gilad Ben-Yossef @ 2011-11-13 10:17 UTC (permalink / raw)
To: linux-kernel
Cc: Gilad Ben-Yossef, Peter Zijlstra, Frederic Weisbecker,
Russell King, linux-mm, Christoph Lameter, Pekka Enberg,
Matt Mackall, Sasha Levin, Rik van Riel, Andi Kleen
flush_all() is called for each kmem_cache_destroy(). So every cache
being destroyed dynamically ends up sending an IPI to each CPU in the
system, regardless of whether the cache has ever been used there.
For example, if you close the InfiniBand ipath driver char device file,
the close file op calls kmem_cache_destroy(). So running some
infiniband config tool on a single CPU dedicated to system tasks
might interrupt the other 127 CPUs I dedicated to some CPU
intensive task.
I suspect there is a good chance that every line in the output of "git
grep kmem_cache_destroy linux/ | grep '\->'" has a similar scenario.
This patch attempts to rectify this issue by sending the IPI that flushes
the per cpu objects back to the free lists only to CPUs that seem to
have such objects.
The check of which CPUs to IPI is racy, but we don't care, since asking a
CPU without per cpu objects to flush does no damage, and as far as I
can tell flush_all by itself is racy against allocs on remote
CPUs anyway, so if you meant flush_all to be deterministic, you
had to arrange for locking regardless.
Without this patch the following artificial test case:
$ cd /sys/kernel/slab
$ for DIR in *; do cat $DIR/alloc_calls > /dev/null; done
produces 166 IPIs on a cpuset-isolated CPU. With it, it produces none.
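One way to count such IPIs on x86 (a measurement sketch, not part of the
patch; the "CAL" label and the canned snapshot below are assumptions, and a
real run would snapshot /proc/interrupts before and after the workload and
diff the two) is to sum the function-call-interrupt row:

```shell
# Sum the function-call-IPI ("CAL") row of a /proc/interrupts-style
# file. "CAL" is the x86 name; other architectures label it differently.
sum_cal_ipis() {
    awk '/^ *CAL:/ { for (i = 2; i <= NF; i++) if ($i ~ /^[0-9]+$/) s += $i }
         END { print s + 0 }' "$1"
}

# Example against a canned two-CPU snapshot (made up for illustration):
cat > /tmp/interrupts.snapshot <<'EOF'
           CPU0       CPU1
 CAL:       100        66   Function call interrupts
EOF
sum_cal_ipis /tmp/interrupts.snapshot   # prints 166
```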
The memory allocation failure code path for the CPUMASK_OFFSTACK=y
config was tested using the fault injection framework.
Signed-off-by: Gilad Ben-Yossef <gilad@benyossef.com>
Acked-by: Chris Metcalf <cmetcalf@tilera.com>
CC: Peter Zijlstra <a.p.zijlstra@chello.nl>
CC: Frederic Weisbecker <fweisbec@gmail.com>
CC: Russell King <linux@arm.linux.org.uk>
CC: linux-mm@kvack.org
CC: Christoph Lameter <cl@linux-foundation.org>
CC: Pekka Enberg <penberg@kernel.org>
CC: Matt Mackall <mpm@selenic.com>
CC: Sasha Levin <levinsasha928@gmail.com>
CC: Rik van Riel <riel@redhat.com>
CC: Andi Kleen <andi@firstfloor.org>
---
mm/slub.c | 15 ++++++++++++++-
1 files changed, 14 insertions(+), 1 deletions(-)
diff --git a/mm/slub.c b/mm/slub.c
index 7d2a996..caf4b3a 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -2006,7 +2006,20 @@ static void flush_cpu_slab(void *d)
static void flush_all(struct kmem_cache *s)
{
- on_each_cpu(flush_cpu_slab, s, 1);
+ cpumask_var_t cpus;
+ struct kmem_cache_cpu *c;
+ int cpu;
+
+ if (likely(zalloc_cpumask_var(&cpus, GFP_ATOMIC))) {
+ for_each_online_cpu(cpu) {
+ c = per_cpu_ptr(s->cpu_slab, cpu);
+ if (c && c->page)
+ cpumask_set_cpu(cpu, cpus);
+ }
+ on_each_cpu_mask(cpus, flush_cpu_slab, s, 1);
+ free_cpumask_var(cpus);
+ } else
+ on_each_cpu(flush_cpu_slab, s, 1);
}
/*
--
1.7.0.4
* [PATCH v3 5/5] mm: Only IPI CPUs to drain local pages if they exist
2011-11-13 10:17 [PATCH v3 0/5] Reduce cross CPU IPI interference Gilad Ben-Yossef
` (3 preceding siblings ...)
2011-11-13 10:17 ` [PATCH v3 4/5] slub: Only IPI CPUs that have per cpu obj to flush Gilad Ben-Yossef
@ 2011-11-13 10:17 ` Gilad Ben-Yossef
2011-11-15 16:00 ` Christoph Lameter
4 siblings, 1 reply; 14+ messages in thread
From: Gilad Ben-Yossef @ 2011-11-13 10:17 UTC (permalink / raw)
To: linux-kernel
Cc: Gilad Ben-Yossef, Peter Zijlstra, Frederic Weisbecker,
Russell King, linux-mm, Christoph Lameter, Pekka Enberg,
Matt Mackall, Sasha Levin, Rik van Riel, Andi Kleen
Calculate a cpumask of the CPUs that hold per-cpu pages in any zone,
and send the IPI that requests CPUs to drain these pages to the
buddy allocator only to those CPUs, so that CPUs with no such pages
when asked to flush are not interrupted.
The memory allocation failure code path for the CPUMASK_OFFSTACK=y
config was tested using the fault injection framework.
Signed-off-by: Gilad Ben-Yossef <gilad@benyossef.com>
Acked-by: Chris Metcalf <cmetcalf@tilera.com>
CC: Peter Zijlstra <a.p.zijlstra@chello.nl>
CC: Frederic Weisbecker <fweisbec@gmail.com>
CC: Russell King <linux@arm.linux.org.uk>
CC: linux-mm@kvack.org
CC: Christoph Lameter <cl@linux-foundation.org>
CC: Pekka Enberg <penberg@kernel.org>
CC: Matt Mackall <mpm@selenic.com>
CC: Sasha Levin <levinsasha928@gmail.com>
CC: Rik van Riel <riel@redhat.com>
CC: Andi Kleen <andi@firstfloor.org>
---
mm/page_alloc.c | 18 +++++++++++++++++-
1 files changed, 17 insertions(+), 1 deletions(-)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 9dd443d..44dc6c5 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1119,7 +1119,23 @@ void drain_local_pages(void *arg)
*/
void drain_all_pages(void)
{
- on_each_cpu(drain_local_pages, NULL, 1);
+ int cpu;
+ struct zone *zone;
+ cpumask_var_t cpus;
+ struct per_cpu_pageset *pageset;
+
+ if (likely(zalloc_cpumask_var(&cpus, GFP_ATOMIC))) {
+ for_each_populated_zone(zone) {
+ for_each_online_cpu(cpu) {
+ pageset = per_cpu_ptr(zone->pageset, cpu);
+ if (pageset->pcp.count)
+ cpumask_set_cpu(cpu, cpus);
+ }
+ }
+ on_each_cpu_mask(cpus, drain_local_pages, NULL, 1);
+ free_cpumask_var(cpus);
+ } else
+ on_each_cpu(drain_local_pages, NULL, 1);
}
#ifdef CONFIG_HIBERNATION
--
1.7.0.4
* Re: [PATCH v3 4/5] slub: Only IPI CPUs that have per cpu obj to flush
2011-11-13 10:17 ` [PATCH v3 4/5] slub: Only IPI CPUs that have per cpu obj to flush Gilad Ben-Yossef
@ 2011-11-13 12:20 ` Hillf Danton
2011-11-13 14:57 ` Gilad Ben-Yossef
2011-11-15 15:54 ` Christoph Lameter
1 sibling, 1 reply; 14+ messages in thread
From: Hillf Danton @ 2011-11-13 12:20 UTC (permalink / raw)
To: Gilad Ben-Yossef
Cc: linux-kernel, Peter Zijlstra, Frederic Weisbecker, Russell King,
linux-mm, Christoph Lameter, Pekka Enberg
On Sun, Nov 13, 2011 at 6:17 PM, Gilad Ben-Yossef <gilad@benyossef.com> wrote:
> flush_all() is called for each kmem_cahce_destroy(). So every cache
> being destroyed dynamically ended up sending an IPI to each CPU in the
> system, regardless if the cache has ever been used there.
>
> For example, if you close the Infinband ipath driver char device file,
> the close file ops calls kmem_cache_destroy(). So running some
> infiniband config tool on one a single CPU dedicated to system tasks
> might interrupt the rest of the 127 CPUs I dedicated to some CPU
> intensive task.
>
> I suspect there is a good chance that every line in the output of "git
> grep kmem_cache_destroy linux/ | grep '\->'" has a similar scenario.
>
> This patch attempts to rectify this issue by sending an IPI to flush
> the per cpu objects back to the free lists only to CPUs that seems to
> have such objects.
>
> The check which CPU to IPI is racy but we don't care since asking a
> CPU without per cpu objects to flush does no damage and as far as I
> can tell the flush_all by itself is racy against allocs on remote
> CPUs anyway, so if you meant the flush_all to be determinstic, you
> had to arrange for locking regardless.
>
> Without this patch the following artificial test case:
>
> $ cd /sys/kernel/slab
> $ for DIR in *; do cat $DIR/alloc_calls > /dev/null; done
>
> produces 166 IPIs on an cpuset isolated CPU. With it it produces none.
>
> The code path of memory allocation failure for CPUMASK_OFFSTACK=y
> config was tested using fault injection framework.
>
> Signed-off-by: Gilad Ben-Yossef <gilad@benyossef.com>
> Acked-by: Chris Metcalf <cmetcalf@tilera.com>
> CC: Peter Zijlstra <a.p.zijlstra@chello.nl>
> CC: Frederic Weisbecker <fweisbec@gmail.com>
> CC: Russell King <linux@arm.linux.org.uk>
> CC: linux-mm@kvack.org
> CC: Christoph Lameter <cl@linux-foundation.org>
> CC: Pekka Enberg <penberg@kernel.org>
> CC: Matt Mackall <mpm@selenic.com>
> CC: Sasha Levin <levinsasha928@gmail.com>
> CC: Rik van Riel <riel@redhat.com>
> CC: Andi Kleen <andi@firstfloor.org>
> ---
> mm/slub.c | 15 ++++++++++++++-
> 1 files changed, 14 insertions(+), 1 deletions(-)
>
> diff --git a/mm/slub.c b/mm/slub.c
> index 7d2a996..caf4b3a 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -2006,7 +2006,20 @@ static void flush_cpu_slab(void *d)
>
> static void flush_all(struct kmem_cache *s)
> {
> - on_each_cpu(flush_cpu_slab, s, 1);
> + cpumask_var_t cpus;
> + struct kmem_cache_cpu *c;
> + int cpu;
> +
> + if (likely(zalloc_cpumask_var(&cpus, GFP_ATOMIC))) {
Perhaps the technique of local_cpu_mask defined in kernel/sched_rt.c
could be used to replace the above atomic allocation.
Best regards
Hillf
> + for_each_online_cpu(cpu) {
> + c = per_cpu_ptr(s->cpu_slab, cpu);
> + if (c && c->page)
> + cpumask_set_cpu(cpu, cpus);
> + }
> + on_each_cpu_mask(cpus, flush_cpu_slab, s, 1);
> + free_cpumask_var(cpus);
> + } else
> + on_each_cpu(flush_cpu_slab, s, 1);
> }
>
> /*
> --
> 1.7.0.4
>
* Re: [PATCH v3 4/5] slub: Only IPI CPUs that have per cpu obj to flush
2011-11-13 12:20 ` Hillf Danton
@ 2011-11-13 14:57 ` Gilad Ben-Yossef
2011-11-14 13:19 ` Hillf Danton
0 siblings, 1 reply; 14+ messages in thread
From: Gilad Ben-Yossef @ 2011-11-13 14:57 UTC (permalink / raw)
To: Hillf Danton
Cc: linux-kernel, Peter Zijlstra, Frederic Weisbecker, Russell King,
linux-mm, Christoph Lameter, Pekka Enberg
On Sun, Nov 13, 2011 at 2:20 PM, Hillf Danton <dhillf@gmail.com> wrote:
>
> On Sun, Nov 13, 2011 at 6:17 PM, Gilad Ben-Yossef <gilad@benyossef.com> wrote:
>
...
>
> > diff --git a/mm/slub.c b/mm/slub.c
> > index 7d2a996..caf4b3a 100644
> > --- a/mm/slub.c
> > +++ b/mm/slub.c
> > @@ -2006,7 +2006,20 @@ static void flush_cpu_slab(void *d)
> >
> > static void flush_all(struct kmem_cache *s)
> > {
> > - on_each_cpu(flush_cpu_slab, s, 1);
> > + cpumask_var_t cpus;
> > + struct kmem_cache_cpu *c;
> > + int cpu;
> > +
> > + if (likely(zalloc_cpumask_var(&cpus, GFP_ATOMIC))) {
>
> Perhaps, the technique of local_cpu_mask defined in kernel/sched_rt.c
> could be used to replace the above atomic allocation.
>
Thank you for taking the time to review my patch :-)
That is indeed the direction I went with in the previous iteration of
this patch, with the small change that, since the allocation only
actually occurs for CPUMASK_OFFSTACK=y systems, which by definition are
systems with lots and lots of CPUs, it is actually better to allocate
the cpumask per kmem_cache rather than per CPU: on systems where it
matters we are bound to have more CPUs (e.g. 4096) than kmem_caches
(~160). See https://lkml.org/lkml/2011/10/23/151.
I then went ahead and further optimized the code to only incur the
memory overhead of allocating those cpumasks on CPUMASK_OFFSTACK=y
systems. See https://lkml.org/lkml/2011/10/23/152.
As you can see from the discussion that evolved, there seems to be
agreement that the code complexity overhead involved is simply not
worth it for what is, unlike sched_rt, a rather esoteric case, and one
where allocation failure is easily dealt with.
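[For reference, the sched_rt-style preallocation discussed above would look
roughly like this - a kernel-style pseudocode sketch of what kernel/sched_rt.c
does with local_cpu_mask; the flush_cpu_mask name is invented here, and this
approach was deliberately not adopted in the patch:]

```
/* Preallocated per-CPU mask, a la local_cpu_mask in kernel/sched_rt.c.
 * Trades a permanent per-CPU footprint for never allocating in flush_all(). */
static DEFINE_PER_CPU(cpumask_var_t, flush_cpu_mask);

/* One-time setup, e.g. from an initcall, so the OFFSTACK case has
 * storage ready before any flush: */
for_each_possible_cpu(cpu)
        zalloc_cpumask_var_node(&per_cpu(flush_cpu_mask, cpu),
                                GFP_KERNEL, cpu_to_node(cpu));

/* flush_all() would then fill per_cpu(flush_cpu_mask, smp_processor_id())
 * under get_cpu()/put_cpu() instead of zalloc_cpumask_var(GFP_ATOMIC). */
```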
Thanks!
Gilad
--
Gilad Ben-Yossef
Chief Coffee Drinker
gilad@benyossef.com
Israel Cell: +972-52-8260388
US Cell: +1-973-8260388
http://benyossef.com
"Unfortunately, cache misses are an equal opportunity pain provider."
-- Mike Galbraith, LKML
* Re: [PATCH v3 4/5] slub: Only IPI CPUs that have per cpu obj to flush
2011-11-13 14:57 ` Gilad Ben-Yossef
@ 2011-11-14 13:19 ` Hillf Danton
2011-11-14 13:57 ` Gilad Ben-Yossef
0 siblings, 1 reply; 14+ messages in thread
From: Hillf Danton @ 2011-11-14 13:19 UTC (permalink / raw)
To: Gilad Ben-Yossef
Cc: linux-kernel, Peter Zijlstra, Frederic Weisbecker, Russell King,
linux-mm, Christoph Lameter, Pekka Enberg
On Sun, Nov 13, 2011 at 10:57 PM, Gilad Ben-Yossef <gilad@benyossef.com> wrote:
> On Sun, Nov 13, 2011 at 2:20 PM, Hillf Danton <dhillf@gmail.com> wrote:
>>
>> On Sun, Nov 13, 2011 at 6:17 PM, Gilad Ben-Yossef <gilad@benyossef.com> wrote:
>>
> ...
>>
>> > diff --git a/mm/slub.c b/mm/slub.c
>> > index 7d2a996..caf4b3a 100644
>> > --- a/mm/slub.c
>> > +++ b/mm/slub.c
>> > @@ -2006,7 +2006,20 @@ static void flush_cpu_slab(void *d)
>> >
>> > static void flush_all(struct kmem_cache *s)
>> > {
>> > - on_each_cpu(flush_cpu_slab, s, 1);
>> > + cpumask_var_t cpus;
>> > + struct kmem_cache_cpu *c;
>> > + int cpu;
>> > +
>> > + if (likely(zalloc_cpumask_var(&cpus, GFP_ATOMIC))) {
>>
>> Perhaps, the technique of local_cpu_mask defined in kernel/sched_rt.c
>> could be used to replace the above atomic allocation.
>>
>
> Thank you for taking the time to review my patch :-)
>
> That is indeed the direction I went with inthe previous iteration of
> this patch, with the small change that because of observing that the
> allocation will only actually occurs for CPUMASK_OFFSTACK=y which by
> definition are systems with lots and lots of CPUs and, it is actually
> better to allocate the cpumask per kmem_cache rather then per CPU,
> since on system where it matters we are bound to have more CPUs (e.g.
> 4096) then kmem_caches (~160). See
> https://lkml.org/lkml/2011/10/23/151.
>
> I then went a head and further optimized the code to only incur the
> memory overhead of allocating those cpumasks for CPUMASK_OFFSTACK=y
> systems. See https://lkml.org/lkml/2011/10/23/152.
>
> As you can see from the discussion that evolved, there seems to be an
> agreement that the code complexity overhead involved is simply not
> worth it for what is, unlike sched_rt, a rather esoteric case and one
> where allocation failure is easily dealt with.
>
Even with the introduced overhead of allocation, IPIs could not go down
as much as we wish, right?
* Re: [PATCH v3 4/5] slub: Only IPI CPUs that have per cpu obj to flush
2011-11-14 13:19 ` Hillf Danton
@ 2011-11-14 13:57 ` Gilad Ben-Yossef
0 siblings, 0 replies; 14+ messages in thread
From: Gilad Ben-Yossef @ 2011-11-14 13:57 UTC (permalink / raw)
To: Hillf Danton
Cc: linux-kernel, Peter Zijlstra, Frederic Weisbecker, Russell King,
linux-mm, Christoph Lameter, Pekka Enberg
On Mon, Nov 14, 2011 at 3:19 PM, Hillf Danton <dhillf@gmail.com> wrote:
> On Sun, Nov 13, 2011 at 10:57 PM, Gilad Ben-Yossef <gilad@benyossef.com> wrote:
...
>>> Perhaps, the technique of local_cpu_mask defined in kernel/sched_rt.c
>>> could be used to replace the above atomic allocation.
>>>
>>
>> Thank you for taking the time to review my patch :-)
>>
>> That is indeed the direction I went with in the previous iteration of
>> this patch, with the small change that, because the allocation only
>> actually occurs for CPUMASK_OFFSTACK=y systems, which by definition
>> have lots and lots of CPUs, it is actually better to allocate the
>> cpumask per kmem_cache rather than per CPU, since on systems where it
>> matters we are bound to have more CPUs (e.g. 4096) than kmem_caches
>> (~160). See
>> https://lkml.org/lkml/2011/10/23/151.
>>
>> I then went ahead and further optimized the code to only incur the
>> memory overhead of allocating those cpumasks for CPUMASK_OFFSTACK=y
>> systems. See https://lkml.org/lkml/2011/10/23/152.
>>
>> As you can see from the discussion that evolved, there seems to be an
>> agreement that the code complexity overhead involved is simply not
>> worth it for what is, unlike sched_rt, a rather esoteric case and one
>> where allocation failure is easily dealt with.
>>
> Even with the introduced overhead of allocation, IPIs could not go down
> as much as we wish, right?
>
My apologies, but I don't think I follow you -
If processor A needs processor B to do something, an IPI is the right
thing to do. Let's call them useful IPIs.
What I am trying to tackle is the places where processor B doesn't
really have anything to do and processor A is simply blindly sending
IPIs to the whole system. I call them useless IPIs.
I don't see a reason why *useless* IPIs can't go down to zero, or very
close to that. Useful IPIs are fine :-)
Thanks,
Gilad
--
Gilad Ben-Yossef
Chief Coffee Drinker
gilad@benyossef.com
Israel Cell: +972-52-8260388
US Cell: +1-973-8260388
http://benyossef.com
"Unfortunately, cache misses are an equal opportunity pain provider."
-- Mike Galbraith, LKML
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v3 1/5] smp: Introduce a generic on_each_cpu_mask function
2011-11-13 10:17 ` [PATCH v3 1/5] smp: Introduce a generic on_each_cpu_mask function Gilad Ben-Yossef
@ 2011-11-15 15:51 ` Christoph Lameter
2011-11-22 10:06 ` Gilad Ben-Yossef
0 siblings, 1 reply; 14+ messages in thread
From: Christoph Lameter @ 2011-11-15 15:51 UTC (permalink / raw)
To: Gilad Ben-Yossef
Cc: linux-kernel, Peter Zijlstra, Frederic Weisbecker, Russell King,
linux-mm, Pekka Enberg, Matt Mackall, Rik van Riel, Andi Kleen
On Sun, 13 Nov 2011, Gilad Ben-Yossef wrote:
> on_each_cpu_mask calls a function on processors specified by cpumask,
> which may include the local processor.
Reviewed-by: Christoph Lameter <cl@linux.com>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v3 4/5] slub: Only IPI CPUs that have per cpu obj to flush
2011-11-13 10:17 ` [PATCH v3 4/5] slub: Only IPI CPUs that have per cpu obj to flush Gilad Ben-Yossef
2011-11-13 12:20 ` Hillf Danton
@ 2011-11-15 15:54 ` Christoph Lameter
1 sibling, 0 replies; 14+ messages in thread
From: Christoph Lameter @ 2011-11-15 15:54 UTC (permalink / raw)
To: Gilad Ben-Yossef
Cc: linux-kernel, Peter Zijlstra, Frederic Weisbecker, Russell King,
linux-mm, Pekka Enberg, Matt Mackall, Sasha Levin, Rik van Riel,
Andi Kleen
On Sun, 13 Nov 2011, Gilad Ben-Yossef wrote:
> @@ -2006,7 +2006,20 @@ static void flush_cpu_slab(void *d)
> + if (likely(zalloc_cpumask_var(&cpus, GFP_ATOMIC))) {
> + for_each_online_cpu(cpu) {
> + c = per_cpu_ptr(s->cpu_slab, cpu);
> + if (c && c->page)
c will never be null. No need to check.
Otherwise
Acked-by: Christoph Lameter <cl@linux.com>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v3 5/5] mm: Only IPI CPUs to drain local pages if they exist
2011-11-13 10:17 ` [PATCH v3 5/5] mm: Only IPI CPUs to drain local pages if they exist Gilad Ben-Yossef
@ 2011-11-15 16:00 ` Christoph Lameter
0 siblings, 0 replies; 14+ messages in thread
From: Christoph Lameter @ 2011-11-15 16:00 UTC (permalink / raw)
To: Gilad Ben-Yossef
Cc: linux-kernel, Peter Zijlstra, Frederic Weisbecker, Russell King,
linux-mm, Pekka Enberg, Matt Mackall, Sasha Levin, Rik van Riel,
Andi Kleen
On Sun, 13 Nov 2011, Gilad Ben-Yossef wrote:
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 9dd443d..44dc6c5 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -1119,7 +1119,23 @@ void drain_local_pages(void *arg)
> */
> void drain_all_pages(void)
> {
> - on_each_cpu(drain_local_pages, NULL, 1);
> + int cpu;
> + struct zone *zone;
> + cpumask_var_t cpus;
> + struct per_cpu_pageset *pageset;
We usually name such pointers "pcp" in the page allocator.
> +
> + if (likely(zalloc_cpumask_var(&cpus, GFP_ATOMIC))) {
> + for_each_populated_zone(zone) {
> + for_each_online_cpu(cpu) {
> + pageset = per_cpu_ptr(zone->pageset, cpu);
> + if (pageset->pcp.count)
> + cpumask_set_cpu(cpu, cpus);
> + }
The pagesets are allocated on bootup from the per cpu areas. You may get a
better access pattern by using for_each_online_cpu as the outer loop
because there is a likelihood of linearly increasing accesses as you loop
through the zones for a particular cpu.
Acked-by: Christoph Lameter <cl@linux.com>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v3 1/5] smp: Introduce a generic on_each_cpu_mask function
2011-11-15 15:51 ` Christoph Lameter
@ 2011-11-22 10:06 ` Gilad Ben-Yossef
0 siblings, 0 replies; 14+ messages in thread
From: Gilad Ben-Yossef @ 2011-11-22 10:06 UTC (permalink / raw)
To: Christoph Lameter
Cc: linux-kernel, Peter Zijlstra, Frederic Weisbecker, Russell King,
linux-mm, Pekka Enberg, Matt Mackall, Rik van Riel, Andi Kleen
On Tue, Nov 15, 2011 at 5:51 PM, Christoph Lameter <cl@linux.com> wrote:
> On Sun, 13 Nov 2011, Gilad Ben-Yossef wrote:
>
>> on_each_cpu_mask calls a function on processors specified by cpumask,
>> which may include the local processor.
>
> Reviewed-by: Christoph Lameter <cl@linux.com>
>
Thanks :-)
v4 is on the way.
Gilad
--
Gilad Ben-Yossef
Chief Coffee Drinker
gilad@benyossef.com
Israel Cell: +972-52-8260388
US Cell: +1-973-8260388
http://benyossef.com
"Unfortunately, cache misses are an equal opportunity pain provider."
-- Mike Galbraith, LKML
^ permalink raw reply [flat|nested] 14+ messages in thread
end of thread, other threads:[~2011-11-22 10:06 UTC | newest]
Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-11-13 10:17 [PATCH v3 0/5] Reduce cross CPU IPI interference Gilad Ben-Yossef
2011-11-13 10:17 ` [PATCH v3 1/5] smp: Introduce a generic on_each_cpu_mask function Gilad Ben-Yossef
2011-11-15 15:51 ` Christoph Lameter
2011-11-22 10:06 ` Gilad Ben-Yossef
2011-11-13 10:17 ` [PATCH v3 2/5] arm: Move arm over to generic on_each_cpu_mask Gilad Ben-Yossef
2011-11-13 10:17 ` [PATCH v3 3/5] tile: Move tile to use " Gilad Ben-Yossef
2011-11-13 10:17 ` [PATCH v3 4/5] slub: Only IPI CPUs that have per cpu obj to flush Gilad Ben-Yossef
2011-11-13 12:20 ` Hillf Danton
2011-11-13 14:57 ` Gilad Ben-Yossef
2011-11-14 13:19 ` Hillf Danton
2011-11-14 13:57 ` Gilad Ben-Yossef
2011-11-15 15:54 ` Christoph Lameter
2011-11-13 10:17 ` [PATCH v3 5/5] mm: Only IPI CPUs to drain local pages if they exist Gilad Ben-Yossef
2011-11-15 16:00 ` Christoph Lameter