* [PATCH] NET : convert ip_rt_acct to per_cpu variables
@ 2007-11-16 8:59 Eric Dumazet
2007-11-16 9:12 ` David Miller
0 siblings, 1 reply; 5+ messages in thread
From: Eric Dumazet @ 2007-11-16 8:59 UTC (permalink / raw)
To: David Miller; +Cc: netdev@vger.kernel.org
Hi David
I am chasing NR_CPUS syndrom for your NR_CPUS=4096 machines :)
(Feel free to give me an account on one of them just for fun)
I made this patch but have no idea if it actually works
(only compile tested) because I dont know how to use this
CONFIG_NET_CLS_ROUTE stuff
Thank you
[PATCH] NET : NET_CLS_ROUTE : convert ip_rt_acct to per_cpu variables
ip_rt_acct needs 4096 bytes per cpu to perform some accounting.
It is actually allocated as a single huge array [4096*NR_CPUS]
(rounded up to a power of two)
Converting it to a per cpu variable is wanted to :
- Save space on machines were num_possible_cpus() < NR_CPUS
- Better NUMA placement (each cpu gets memory on its node)
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
include/net/route.h | 2 +-
net/ipv4/ip_input.c | 2 +-
net/ipv4/route.c | 17 ++---------------
3 files changed, 4 insertions(+), 17 deletions(-)
diff --git a/include/net/route.h b/include/net/route.h
index f7ce625..3044fd1 100644
--- a/include/net/route.h
+++ b/include/net/route.h
@@ -103,7 +103,7 @@ struct rt_cache_stat
unsigned int out_hlist_search;
};
-extern struct ip_rt_acct *ip_rt_acct;
+DECLARE_PER_CPU(struct ip_rt_acct[256], ip_rt_acct);
struct in_device;
extern int ip_rt_init(void);
diff --git a/net/ipv4/ip_input.c b/net/ipv4/ip_input.c
index 5b8a760..82f9a52 100644
--- a/net/ipv4/ip_input.c
+++ b/net/ipv4/ip_input.c
@@ -347,7 +347,7 @@ static int ip_rcv_finish(struct sk_buff *skb)
#ifdef CONFIG_NET_CLS_ROUTE
if (unlikely(skb->dst->tclassid)) {
- struct ip_rt_acct *st = ip_rt_acct + 256*smp_processor_id();
+ struct ip_rt_acct *st = __get_cpu_var(ip_rt_acct);
u32 idx = skb->dst->tclassid;
st[idx&0xFF].o_packets++;
st[idx&0xFF].o_bytes+=skb->len;
diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index f0b28f9..83b319a 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -2855,12 +2855,12 @@ ctl_table ipv4_route_table[] = {
#endif
#ifdef CONFIG_NET_CLS_ROUTE
-struct ip_rt_acct *ip_rt_acct;
+DEFINE_PER_CPU(struct ip_rt_acct [256], ip_rt_acct);
/* This code sucks. But you should have seen it before! --RR */
/* IP route accounting ptr for this logical cpu number. */
-#define IP_RT_ACCT_CPU(i) (ip_rt_acct + i * 256)
+#define IP_RT_ACCT_CPU(i) (per_cpu(ip_rt_acct, i))
#ifdef CONFIG_PROC_FS
static int ip_rt_acct_read(char *buffer, char **start, off_t offset,
@@ -2923,19 +2923,6 @@ int __init ip_rt_init(void)
rt_hash_rnd = (int) ((num_physpages ^ (num_physpages>>8)) ^
(jiffies ^ (jiffies >> 7)));
-#ifdef CONFIG_NET_CLS_ROUTE
- {
- int order;
- for (order = 0;
- (PAGE_SIZE << order) < 256 * sizeof(struct ip_rt_acct) * NR_CPUS; order++)
- /* NOTHING */;
- ip_rt_acct = (struct ip_rt_acct *)__get_free_pages(GFP_KERNEL, order);
- if (!ip_rt_acct)
- panic("IP: failed to allocate ip_rt_acct\n");
- memset(ip_rt_acct, 0, PAGE_SIZE << order);
- }
-#endif
-
ipv4_dst_ops.kmem_cachep =
kmem_cache_create("ip_dst_cache", sizeof(struct rtable), 0,
SLAB_HWCACHE_ALIGN|SLAB_PANIC, NULL);
^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [PATCH] NET : convert ip_rt_acct to per_cpu variables
2007-11-16 8:59 [PATCH] NET : convert ip_rt_acct to per_cpu variables Eric Dumazet
@ 2007-11-16 9:12 ` David Miller
2007-11-16 9:35 ` Eric Dumazet
0 siblings, 1 reply; 5+ messages in thread
From: David Miller @ 2007-11-16 9:12 UTC (permalink / raw)
To: dada1; +Cc: netdev
From: Eric Dumazet <dada1@cosmosbay.com>
Date: Fri, 16 Nov 2007 09:59:03 +0100
> [PATCH] NET : NET_CLS_ROUTE : convert ip_rt_acct to per_cpu variables
>
> ip_rt_acct needs 4096 bytes per cpu to perform some accounting.
> It is actually allocated as a single huge array [4096*NR_CPUS]
> (rounded up to a power of two)
>
> Converting it to a per cpu variable is wanted to :
> - Save space on machines were num_possible_cpus() < NR_CPUS
> - Better NUMA placement (each cpu gets memory on its node)
>
> Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
This is better in one sense but worse in another.
At least the previous code dynamically allocated the thing,
now at least one copy is taking up core kernel text image
space.
I think it's an alloc_percpu() candidate, what do you think?
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] NET : convert ip_rt_acct to per_cpu variables
2007-11-16 9:12 ` David Miller
@ 2007-11-16 9:35 ` Eric Dumazet
2007-11-16 10:58 ` Eric Dumazet
0 siblings, 1 reply; 5+ messages in thread
From: Eric Dumazet @ 2007-11-16 9:35 UTC (permalink / raw)
To: David Miller; +Cc: netdev
On Fri, 16 Nov 2007 01:12:43 -0800 (PST)
David Miller <davem@davemloft.net> wrote:
> From: Eric Dumazet <dada1@cosmosbay.com>
> Date: Fri, 16 Nov 2007 09:59:03 +0100
>
> > [PATCH] NET : NET_CLS_ROUTE : convert ip_rt_acct to per_cpu variables
> >
> > ip_rt_acct needs 4096 bytes per cpu to perform some accounting.
> > It is actually allocated as a single huge array [4096*NR_CPUS]
> > (rounded up to a power of two)
> >
> > Converting it to a per cpu variable is wanted to :
> > - Save space on machines were num_possible_cpus() < NR_CPUS
> > - Better NUMA placement (each cpu gets memory on its node)
> >
> > Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
>
> This is better in one sense but worse in another.
>
> At least the previous code dynamically allocated the thing,
> now at least one copy is taking up core kernel text image
> space.
Oh I see, you think that adding 4096 null bytes to the static percpu area might be a problem.
>
> I think it's an alloc_percpu() candidate, what do you think?
>
Yes absolutely, I will submit a new version.
Thank you
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] NET : convert ip_rt_acct to per_cpu variables
2007-11-16 9:35 ` Eric Dumazet
@ 2007-11-16 10:58 ` Eric Dumazet
2007-11-16 11:32 ` David Miller
0 siblings, 1 reply; 5+ messages in thread
From: Eric Dumazet @ 2007-11-16 10:58 UTC (permalink / raw)
To: Eric Dumazet; +Cc: David Miller, netdev
On Fri, 16 Nov 2007 10:35:46 +0100
Eric Dumazet <dada1@cosmosbay.com> wrote:
> On Fri, 16 Nov 2007 01:12:43 -0800 (PST)
> David Miller <davem@davemloft.net> wrote:
>
> > From: Eric Dumazet <dada1@cosmosbay.com>
> > Date: Fri, 16 Nov 2007 09:59:03 +0100
> >
> > > [PATCH] NET : NET_CLS_ROUTE : convert ip_rt_acct to per_cpu variables
> > >
> > > ip_rt_acct needs 4096 bytes per cpu to perform some accounting.
> > > It is actually allocated as a single huge array [4096*NR_CPUS]
> > > (rounded up to a power of two)
> > >
> > > Converting it to a per cpu variable is wanted to :
> > > - Save space on machines were num_possible_cpus() < NR_CPUS
> > > - Better NUMA placement (each cpu gets memory on its node)
> > >
> > > Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
> >
> > This is better in one sense but worse in another.
> >
> > At least the previous code dynamically allocated the thing,
> > now at least one copy is taking up core kernel text image
> > space.
>
> Oh I see, you think that adding 4096 null bytes to the static percpu area might be a problem.
>
> >
> > I think it's an alloc_percpu() candidate, what do you think?
> >
>
> Yes absolutely, I will submit a new version.
>
> Thank you
[PATCH] NET : NET_CLS_ROUTE : convert ip_rt_acct to per_cpu variables
ip_rt_acct needs 4096 bytes per cpu to perform some accounting.
It is actually allocated as a single huge array [4096*NR_CPUS]
(rounded up to a power of two)
Converting it to a per cpu variable is wanted to :
- Save space on machines were num_possible_cpus() < NR_CPUS
- Better NUMA placement (each cpu gets memory on its node)
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
net/ipv4/ip_input.c | 2 +-
net/ipv4/route.c | 15 +++------------
2 files changed, 4 insertions(+), 13 deletions(-)
diff --git a/net/ipv4/ip_input.c b/net/ipv4/ip_input.c
index 5b8a760..4068e17 100644
--- a/net/ipv4/ip_input.c
+++ b/net/ipv4/ip_input.c
@@ -347,7 +347,7 @@ static int ip_rcv_finish(struct sk_buff *skb)
#ifdef CONFIG_NET_CLS_ROUTE
if (unlikely(skb->dst->tclassid)) {
- struct ip_rt_acct *st = ip_rt_acct + 256*smp_processor_id();
+ struct ip_rt_acct *st = per_cpu_ptr(ip_rt_acct, smp_processor_id());
u32 idx = skb->dst->tclassid;
st[idx&0xFF].o_packets++;
st[idx&0xFF].o_bytes+=skb->len;
diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index 70529a9..856807c 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -2855,12 +2855,10 @@ ctl_table ipv4_route_table[] = {
#endif
#ifdef CONFIG_NET_CLS_ROUTE
-struct ip_rt_acct *ip_rt_acct;
-
-/* This code sucks. But you should have seen it before! --RR */
+struct ip_rt_acct *ip_rt_acct __read_mostly;
/* IP route accounting ptr for this logical cpu number. */
-#define IP_RT_ACCT_CPU(i) (ip_rt_acct + i * 256)
+#define IP_RT_ACCT_CPU(cpu) (per_cpu_ptr(ip_rt_acct, cpu))
#ifdef CONFIG_PROC_FS
static int ip_rt_acct_read(char *buffer, char **start, off_t offset,
@@ -2920,16 +2918,9 @@ int __init ip_rt_init(void)
(jiffies ^ (jiffies >> 7)));
#ifdef CONFIG_NET_CLS_ROUTE
- {
- int order;
- for (order = 0;
- (PAGE_SIZE << order) < 256 * sizeof(struct ip_rt_acct) * NR_CPUS; order++)
- /* NOTHING */;
- ip_rt_acct = (struct ip_rt_acct *)__get_free_pages(GFP_KERNEL, order);
+ ip_rt_acct = __alloc_percpu(256 * sizeof(struct ip_rt_acct));
if (!ip_rt_acct)
panic("IP: failed to allocate ip_rt_acct\n");
- memset(ip_rt_acct, 0, PAGE_SIZE << order);
- }
#endif
ipv4_dst_ops.kmem_cachep =
^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [PATCH] NET : convert ip_rt_acct to per_cpu variables
2007-11-16 10:58 ` Eric Dumazet
@ 2007-11-16 11:32 ` David Miller
0 siblings, 0 replies; 5+ messages in thread
From: David Miller @ 2007-11-16 11:32 UTC (permalink / raw)
To: dada1; +Cc: netdev
From: Eric Dumazet <dada1@cosmosbay.com>
Date: Fri, 16 Nov 2007 11:58:52 +0100
> [PATCH] NET : NET_CLS_ROUTE : convert ip_rt_acct to per_cpu variables
>
> ip_rt_acct needs 4096 bytes per cpu to perform some accounting.
> It is actually allocated as a single huge array [4096*NR_CPUS]
> (rounded up to a power of two)
>
> Converting it to a per cpu variable is wanted to :
> - Save space on machines were num_possible_cpus() < NR_CPUS
> - Better NUMA placement (each cpu gets memory on its node)
>
> Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Applied to net-2.6.25, thanks Eric.
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2007-11-16 11:32 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2007-11-16 8:59 [PATCH] NET : convert ip_rt_acct to per_cpu variables Eric Dumazet
2007-11-16 9:12 ` David Miller
2007-11-16 9:35 ` Eric Dumazet
2007-11-16 10:58 ` Eric Dumazet
2007-11-16 11:32 ` David Miller
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).