* Re: [RFC 1/3] Framework for accurate node based statistics
[not found] ` <20051206200607.GY11190@wotan.suse.de>
@ 2005-12-06 22:52 ` Christoph Lameter
2005-12-07 5:50 ` Keith Owens
2005-12-07 18:39 ` Luck, Tony
0 siblings, 2 replies; 5+ messages in thread
From: Christoph Lameter @ 2005-12-06 22:52 UTC (permalink / raw)
To: Andi Kleen
Cc: linux-kernel, Hugh Dickins, Nick Piggin, linux-mm, linux-ia64,
Marcelo Tosatti
On Tue, 6 Dec 2005, Andi Kleen wrote:
> Ok we'll need a local64_t then. No big deal - can be easily added.
> Or perhaps better a long_local_t so that 32bit doesn't need to
> pay the cost.
I just saw that ia64 already has local_t as 64 bit, so that is no problem
for us. Here is a patch that would convert the framework to use local_t.
Is that okay?
The problem with this solution is that the use of local_t will lead to the
use of atomic operations (in case the preemption status is unknown). It
may be better to use atomic operations and simply drop the per_cpu stuff.
That way the summing of the per cpu variables is avoided and the
stats are accurate in real time.
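
The trade-off described above can be sketched in plain userspace C. This is only an illustration, not the kernel code: every name prefixed with sketch_ is hypothetical, the node dimension is left out, the sizes are arbitrary, and C11 atomics stand in for the kernel's atomic operations. Scheme 1 keeps per-cpu deltas that readers cannot see until they are folded; scheme 2 pays for an atomic operation on every update but is always current.

```c
#include <assert.h>
#include <stdatomic.h>

#define SKETCH_NR_CPUS  4	/* hypothetical sizes, for illustration only */
#define SKETCH_NR_ITEMS 2

/* Scheme 1: per-cpu deltas that are folded into a global total later. */
static long sketch_diff[SKETCH_NR_CPUS][SKETCH_NR_ITEMS];
static long sketch_global[SKETCH_NR_ITEMS];

static void sketch_mod_percpu(int cpu, int item, long delta)
{
	assert(cpu >= 0 && cpu < SKETCH_NR_CPUS);
	assert(item >= 0 && item < SKETCH_NR_ITEMS);
	sketch_diff[cpu][item] += delta;	/* cheap, but not yet visible globally */
}

/* Fold one cpu's pending deltas into the global totals and reset them. */
static void sketch_fold(int cpu)
{
	assert(cpu >= 0 && cpu < SKETCH_NR_CPUS);
	for (int i = 0; i < SKETCH_NR_ITEMS; i++) {
		sketch_global[i] += sketch_diff[cpu][i];
		sketch_diff[cpu][i] = 0;
	}
}

/* Scheme 2: one shared atomic counter per item, always up to date. */
static atomic_long sketch_atomic[SKETCH_NR_ITEMS];

static void sketch_mod_atomic(int item, long delta)
{
	assert(item >= 0 && item < SKETCH_NR_ITEMS);
	atomic_fetch_add(&sketch_atomic[item], delta);
}

static long sketch_read_atomic(int item)
{
	return atomic_load(&sketch_atomic[item]);
}
```

With scheme 1, sketch_global[] lags until sketch_fold() has run for every cpu that touched the counter; with scheme 2, sketch_read_atomic() reflects each update as soon as it completes.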
Seems that local.h is rarely used. There was an obvious mistake in there
for ia64.
Index: linux-2.6.15-rc5/mm/page_alloc.c
=================================
--- linux-2.6.15-rc5.orig/mm/page_alloc.c 2005-12-06 10:13:49.000000000 -0800
+++ linux-2.6.15-rc5/mm/page_alloc.c 2005-12-06 14:43:41.000000000 -0800
@@ -560,26 +560,25 @@ static int rmqueue_bulk(struct zone *zon
static spinlock_t node_stat_lock;
unsigned long vm_stat_global[NR_STAT_ITEMS];
unsigned long vm_stat_node[MAX_NUMNODES][NR_STAT_ITEMS];
-int vm_stat_diff[NR_CPUS][MAX_NUMNODES][NR_STAT_ITEMS];
+DEFINE_PER_CPU(local_t [MAX_NUMNODES][NR_STAT_ITEMS], vm_stat_diff);
void refresh_vm_stats(void) {
- int cpu;
int node;
int i;
spin_lock(&node_stat_lock);
- cpu = get_cpu();
for_each_online_node(node)
for(i = 0; i < NR_STAT_ITEMS; i++) {
- int * p = vm_stat_diff[cpu][node]+i;
- if (*p) {
- vm_stat_node[node][i] += *p;
- vm_stat_global[i] += *p;
- *p = 0;
+ long v;
+
+ v = cpu_local_read(vm_stat_diff[node][i]);
+ if (v) {
+ vm_stat_node[node][i] += v;
+ vm_stat_global[i] += v;
+ cpu_local_set(vm_stat_diff[node][i], 0);
}
}
- put_cpu();
spin_unlock(&node_stat_lock);
}
Index: linux-2.6.15-rc5/include/linux/page-flags.h
=================================
--- linux-2.6.15-rc5.orig/include/linux/page-flags.h 2005-12-06 10:15:59.000000000 -0800
+++ linux-2.6.15-rc5/include/linux/page-flags.h 2005-12-06 14:47:03.000000000 -0800
@@ -8,6 +8,7 @@
#include <linux/percpu.h>
#include <linux/cache.h>
#include <asm/pgtable.h>
+#include <asm/local.h>
/*
* Various page->flags bits:
@@ -169,12 +170,19 @@ enum node_stat_item { NR_MAPPED, NR_PAGE
extern unsigned long vm_stat_global[NR_STAT_ITEMS];
extern unsigned long vm_stat_node[MAX_NUMNODES][NR_STAT_ITEMS];
-extern int vm_stat_diff[NR_CPUS][MAX_NUMNODES][NR_STAT_ITEMS];
+DECLARE_PER_CPU(local_t [MAX_NUMNODES][NR_STAT_ITEMS], vm_stat_diff);
static inline void mod_node_page_state(int node, enum node_stat_item item, int delta)
{
- vm_stat_diff[get_cpu()][node][item] += delta;
- put_cpu();
+ cpu_local_add(delta, vm_stat_diff[node][item]);
+}
+
+/*
+ * For use when we know that preemption is disabled. Avoids atomic operations.
+ */
+static inline void __mod_node_page_state(int node, enum node_stat_item item, int delta)
+{
+ __local_add(delta, &__get_cpu_var(vm_stat_diff[node][item]));
}
#define inc_node_page_state(node, item) mod_node_page_state(node, item, 1)
Index: linux-2.6.15-rc5/include/asm-ia64/local.h
=================================
--- linux-2.6.15-rc5.orig/include/asm-ia64/local.h 2005-12-03 21:10:42.000000000 -0800
+++ linux-2.6.15-rc5/include/asm-ia64/local.h 2005-12-06 14:39:47.000000000 -0800
@@ -17,7 +17,7 @@ typedef struct {
#define local_set(l, i) atomic64_set(&(l)->val, i)
#define local_inc(l) atomic64_inc(&(l)->val)
#define local_dec(l) atomic64_dec(&(l)->val)
-#define local_add(l) atomic64_add(&(l)->val)
+#define local_add(i, l) atomic64_add((i), &(l)->val)
#define local_sub(l) atomic64_sub(&(l)->val)
/* Non-atomic variants, i.e., preemption disabled and won't be touched in interrupt, etc. */
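
The asm-ia64/local.h hunk changes local_add from a one-argument macro to a two-argument one, because atomic64_add needs both an increment and a target. A minimal userspace sketch of the corrected shape follows. It assumes C11 atomics as a stand-in for the kernel primitives, and all sketch_ names are hypothetical, not the kernel's definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdatomic.h>

/* Userspace stand-in for local_t; NOT the kernel's definition. */
typedef struct { atomic_long val; } sketch_local_t;

/* Stand-in for atomic64_add(i, v): the increment comes first, then the target. */
static inline void sketch_atomic64_add(long i, atomic_long *v)
{
	assert(v != NULL);
	atomic_fetch_add(v, i);
}

/* Corrected macro shape: both the increment and the counter are passed on. */
#define sketch_local_add(i, l)	sketch_atomic64_add((i), &(l)->val)
#define sketch_local_read(l)	atomic_load(&(l)->val)
```

The old one-argument form had no increment to hand to atomic64_add, so it could not have expanded to a valid call. That fits the observation that local.h was rarely used.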
* Re: [RFC 1/3] Framework for accurate node based statistics
From: Keith Owens @ 2005-12-07 5:50 UTC (permalink / raw)
To: Christoph Lameter
Cc: Andi Kleen, linux-kernel, Hugh Dickins, Nick Piggin, linux-mm,
linux-ia64, Marcelo Tosatti
On Tue, 6 Dec 2005 14:52:33 -0800 (PST),
Christoph Lameter <clameter@engr.sgi.com> wrote:
>+DEFINE_PER_CPU(local_t [MAX_NUMNODES][NR_STAT_ITEMS], vm_stat_diff);
How big is that array going to get? The total per cpu data area is
limited to 64K on IA64 and we already use at least 34K.
* Re: [RFC 1/3] Framework for accurate node based statistics
From: Christoph Lameter @ 2005-12-07 18:24 UTC (permalink / raw)
To: Keith Owens
Cc: Andi Kleen, linux-kernel, Hugh Dickins, Nick Piggin, linux-mm,
linux-ia64, Marcelo Tosatti
On Wed, 7 Dec 2005, Keith Owens wrote:
> On Tue, 6 Dec 2005 14:52:33 -0800 (PST),
> Christoph Lameter <clameter@engr.sgi.com> wrote:
> >+DEFINE_PER_CPU(local_t [MAX_NUMNODES][NR_STAT_ITEMS], vm_stat_diff);
>
> How big is that array going to get? The total per cpu data area is
> limited to 64K on IA64 and we already use at least 34K.
Maximum around 1k nodes and I guess we may end up with 16 counters:
1024*16*8 = 131k ?
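
The arithmetic can be checked against the per-cpu limit mentioned earlier in the thread. This is a sketch using only the figures quoted here: 1024 nodes, 16 counters, 8 bytes per counter (assuming a 64-bit local_t as on ia64), a 64K per-cpu area, and 34K already in use. The sketch_ names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Figures quoted in this thread; the sketch_ names are illustrative only. */
#define SKETCH_MAX_NODES	1024
#define SKETCH_STAT_ITEMS	16
#define SKETCH_COUNTER_BYTES	8		/* assumes a 64-bit local_t */
#define SKETCH_PERCPU_LIMIT	(64 * 1024)	/* ia64 per-cpu area limit */
#define SKETCH_PERCPU_USED	(34 * 1024)	/* "at least 34K" already used */

/* Size of one cpu's [node][item] delta array in bytes. */
static size_t sketch_vm_stat_diff_bytes(void)
{
	return (size_t)SKETCH_MAX_NODES * SKETCH_STAT_ITEMS * SKETCH_COUNTER_BYTES;
}

/* Nonzero if an object of this size fits in what is left of the per-cpu area. */
static int sketch_fits_in_percpu_area(size_t bytes)
{
	assert(SKETCH_PERCPU_USED <= SKETCH_PERCPU_LIMIT);
	return bytes <= (size_t)(SKETCH_PERCPU_LIMIT - SKETCH_PERCPU_USED);
}
```

The product is 131072 bytes (128 KiB), twice the entire 64K area, while at most 30K of that area is still free.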
* RE: [RFC 1/3] Framework for accurate node based statistics
From: Luck, Tony @ 2005-12-07 18:39 UTC (permalink / raw)
To: Christoph Lameter, Keith Owens
Cc: Andi Kleen, linux-kernel, Hugh Dickins, Nick Piggin, linux-mm,
linux-ia64, Marcelo Tosatti
>> How big is that array going to get? The total per cpu data area is
>> limited to 64K on IA64 and we already use at least 34K.
>
> Maximum around 1k nodes and I guess we may end up with 16 counters:
>
> 1024*16*8 = 131k ?
Ouch.
Can you live with a pointer to that monster block of space in the
per-cpu area?
Otherwise the next step up is a 256K per cpu area ... which I wouldn't
want to make the default (so we'll have another 2*X explosion in the
number of possible configs to test).
-Tony
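
The pointer idea can be sketched roughly as follows, in userspace C. It assumes calloc as a stand-in for whatever kernel allocator would be used, and a plain array standing in for the per-cpu area; all sketch_ names and sizes are hypothetical. Only one pointer per cpu occupies the size-limited slot, and the large [node][item] block lives elsewhere.

```c
#include <assert.h>
#include <stdlib.h>

#define SKETCH_NR_CPUS		4	/* hypothetical sizes */
#define SKETCH_MAX_NODES	1024
#define SKETCH_STAT_ITEMS	16

/* Only this pointer lives in the (size-limited) per-cpu slot. */
struct sketch_percpu {
	long (*vm_stat_diff)[SKETCH_STAT_ITEMS];	/* [node][item], allocated separately */
};

static struct sketch_percpu sketch_cpu[SKETCH_NR_CPUS];

/* Allocate the zeroed counter block for one cpu; returns 0 on success, -1 on failure. */
static int sketch_alloc_stats(int cpu)
{
	assert(cpu >= 0 && cpu < SKETCH_NR_CPUS);
	sketch_cpu[cpu].vm_stat_diff =
		calloc(SKETCH_MAX_NODES, sizeof(long[SKETCH_STAT_ITEMS]));
	return sketch_cpu[cpu].vm_stat_diff ? 0 : -1;
}

/* Update one counter through the per-cpu pointer (one extra dereference). */
static void sketch_mod(int cpu, int node, int item, long delta)
{
	assert(sketch_cpu[cpu].vm_stat_diff != NULL);
	sketch_cpu[cpu].vm_stat_diff[node][item] += delta;
}

static void sketch_free_stats(int cpu)
{
	free(sketch_cpu[cpu].vm_stat_diff);
	sketch_cpu[cpu].vm_stat_diff = NULL;
}
```

The cost is an extra pointer dereference on every update, in exchange for keeping the per-cpu footprint at a single pointer regardless of node count.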
* RE: [RFC 1/3] Framework for accurate node based statistics
From: Christoph Lameter @ 2005-12-07 18:47 UTC (permalink / raw)
To: Luck, Tony
Cc: Keith Owens, Andi Kleen, linux-kernel, Hugh Dickins, Nick Piggin,
linux-mm, linux-ia64, Marcelo Tosatti
On Wed, 7 Dec 2005, Luck, Tony wrote:
> Can you live with a pointer to that monster block of space in the
> per-cpu area?
>
> Otherwise the next step up is a 256K per cpu area ... which I wouldn't
> want to make the default (so we'll have another 2*X explosion in the
> number of possible configs to test).
Let's wait. I just did this to show how local_t could be implemented. This
is an RFC and the major problems (e.g. the 3 second delay)
have not been addressed, so this is all vaporware for now.