* [RFC] make per interface sysctl entries configurable
@ 2009-10-25 17:54 Octavian Purdila
2009-10-25 18:07 ` Denys Fedoryschenko
` (3 more replies)
0 siblings, 4 replies; 8+ messages in thread
From: Octavian Purdila @ 2009-10-25 17:54 UTC
To: Eric Dumazet; +Cc: Benjamin LaHaise, Stephen Hemminger, Cosmin Ratiu, netdev
[-- Attachment #1: Type: text/plain, Size: 436 bytes --]
RFC patches are attached.
Another possible approach: add an interface flag and use it to decide whether
we want per interface sysctl entries or not.
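A minimal sketch of the interface-flag alternative, written as plain userspace C rather than kernel code. The flag name DEMO_IFF_NO_PER_DEV_SYSCTL, struct demo_netdev and both functions are hypothetical and only illustrate the idea; none of them is an existing kernel interface.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag bit; NOT an existing kernel IFF_* value. */
#define DEMO_IFF_NO_PER_DEV_SYSCTL 0x1u

/* Hypothetical stand-in for a network device. */
struct demo_netdev {
	const char *name;
	unsigned int flags;
	bool sysctl_registered;
};

/* Stand-in for the expensive per-interface registration step. */
static void demo_sysctl_register(struct demo_netdev *dev)
{
	assert(dev != NULL);
	dev->sysctl_registered = true;
}

/*
 * Create per-interface entries only when the flag is clear.
 * Returns true if entries were created for this device.
 */
static bool demo_netdev_setup(struct demo_netdev *dev)
{
	assert(dev != NULL);
	if (dev->flags & DEMO_IFF_NO_PER_DEV_SYSCTL)
		return false;
	demo_sysctl_register(dev);
	return true;
}
```

Compared with a build-time option, a flag like this would let the decision be made per device at creation time, at the cost of one more flag to define and document.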
Benchmarks for creating 1000 interfaces (with the ndst module previously posted
on the list, on a ppc750 @ 800 MHz machine):
- without the patches:
real 4m 38.27s
user 0m 0.00s
sys 2m 18.90s
- with the patches:
real 0m 0.10s
user 0m 0.00s
sys 0m 0.05s
Thanks,
tavi
[-- Attachment #2: 818108.diff --]
[-- Type: text/x-patch, Size: 2756 bytes --]
net: CONFIG_NET_SYSCTL_DEV: make per interface sysctl entries optional
Add a CONFIG_NET_SYSCTL_DEV config option (we should probably
rename it to a better name) to enable/disable per interface sysctl
entries.
--- //packages/linux_2.6.31/rc7/src/arch/powerpc/configs/ixia_ppc750_defconfig
+++ //packages/linux_2.6.31/rc7/src/arch/powerpc/configs/ixia_ppc750_defconfig
@@ -287,6 +287,7 @@
#
# Networking options
#
+# CONFIG_NET_SYSCTL_DEV is not set
CONFIG_PACKET=y
CONFIG_PACKET_MMAP=y
CONFIG_UNIX=y
--- //packages/linux_2.6.31/rc7/src/net/Kconfig
+++ //packages/linux_2.6.31/rc7/src/net/Kconfig
@@ -25,6 +25,15 @@
menu "Networking options"
+config NET_SYSCTL_DEV
+ bool "Per device sysctl entries"
+ default y
+ depends on SYSCTL
+ help
+ Enables per device sysctl entries. You want this enabled unless
+ your system has a large number of interfaces and you want to reduce
+ memory usage.
+
source "net/packet/Kconfig"
source "net/unix/Kconfig"
source "net/xfrm/Kconfig"
--- //packages/linux_2.6.31/rc7/src/net/ipv4/devinet.c
+++ //packages/linux_2.6.31/rc7/src/net/ipv4/devinet.c
@@ -97,7 +97,7 @@
static BLOCKING_NOTIFIER_HEAD(inetaddr_chain);
static void inet_del_ifa(struct in_device *in_dev, struct in_ifaddr **ifap,
int destroy);
-#ifdef CONFIG_SYSCTL
+#ifdef CONFIG_NET_SYSCTL_DEV
static void devinet_sysctl_register(struct in_device *idev);
static void devinet_sysctl_unregister(struct in_device *idev);
#else
@@ -1397,7 +1397,6 @@
return ret;
}
-
#define DEVINET_SYSCTL_ENTRY(attr, name, mval, proc, sysctl) \
{ \
.ctl_name = NET_IPV4_CONF_ ## attr, \
@@ -1531,6 +1530,7 @@
kfree(t);
}
+#ifdef CONFIG_NET_SYSCTL_DEV
static void devinet_sysctl_register(struct in_device *idev)
{
neigh_sysctl_register(idev->dev, idev->arp_parms, NET_IPV4,
@@ -1544,6 +1544,7 @@
__devinet_sysctl_unregister(&idev->cnf);
neigh_sysctl_unregister(idev->arp_parms);
}
+#endif
static struct ctl_table ctl_forward_entry[] = {
{
--- //packages/linux_2.6.31/rc7/src/net/ipv6/addrconf.c
+++ //packages/linux_2.6.31/rc7/src/net/ipv6/addrconf.c
@@ -100,7 +100,7 @@
#define INFINITY_LIFE_TIME 0xFFFFFFFF
#define TIME_DELTA(a,b) ((unsigned long)((long)(a) - (long)(b)))
-#ifdef CONFIG_SYSCTL
+#ifdef CONFIG_NET_SYSCTL_DEV
static void addrconf_sysctl_register(struct inet6_dev *idev);
static void addrconf_sysctl_unregister(struct inet6_dev *idev);
#else
@@ -4412,6 +4412,7 @@
kfree(t);
}
+#ifdef CONFIG_NET_SYSCTL_DEV
static void addrconf_sysctl_register(struct inet6_dev *idev)
{
neigh_sysctl_register(idev->dev, idev->nd_parms, NET_IPV6,
@@ -4427,7 +4428,7 @@
__addrconf_sysctl_unregister(&idev->cnf);
neigh_sysctl_unregister(idev->nd_parms);
}
-
+#endif
#endif
[-- Attachment #3: 818110.diff --]
[-- Type: text/x-patch, Size: 1277 bytes --]
net: CONFIG_NET_SYSCTL_DEV: make per interface dev_snmp6 proc entries optional
Use same CONFIG_NET_SYSCTL_DEV config option (we should probably
rename it to a better name) to enable/disable per interface dev_snmp6
proc entries.
--- //packages/linux_2.6.31/rc7/src/include/net/ipv6.h
+++ //packages/linux_2.6.31/rc7/src/include/net/ipv6.h
@@ -604,8 +604,14 @@
extern void udplite6_proc_exit(void);
extern int ipv6_misc_proc_init(void);
extern void ipv6_misc_proc_exit(void);
+
+#ifdef CONFIG_NET_SYSCTL_DEV
extern int snmp6_register_dev(struct inet6_dev *idev);
extern int snmp6_unregister_dev(struct inet6_dev *idev);
+#else
+static inline int snmp6_register_dev(struct inet6_dev *idev) { return 0; }
+static inline int snmp6_unregister_dev(struct inet6_dev *idev) { return 0; }
+#endif
#else
static inline int ac6_proc_init(struct net *net) { return 0; }
--- //packages/linux_2.6.31/rc7/src/net/ipv6/proc.c
+++ //packages/linux_2.6.31/rc7/src/net/ipv6/proc.c
@@ -232,6 +232,7 @@
.release = single_release,
};
+#ifdef CONFIG_NET_SYSCTL_DEV
int snmp6_register_dev(struct inet6_dev *idev)
{
struct proc_dir_entry *p;
@@ -266,6 +267,7 @@
idev->stats.proc_dir_entry = NULL;
return 0;
}
+#endif
static int ipv6_proc_init_net(struct net *net)
{
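The stub pattern used in the ipv6.h hunk above can be sketched in userspace C as follows. DEMO_PER_DEV_ENTRIES is a hypothetical macro standing in for CONFIG_NET_SYSCTL_DEV, and struct demo_idev and the demo_* functions are made up for illustration; this is a sketch of the general technique, not the patch itself.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical device state; only tracks whether entries exist. */
struct demo_idev {
	int registered;
};

#ifdef DEMO_PER_DEV_ENTRIES
/* Real implementations, built only when the option is enabled. */
static int demo_register_dev(struct demo_idev *idev)
{
	idev->registered = 1;
	return 0;
}

static int demo_unregister_dev(struct demo_idev *idev)
{
	idev->registered = 0;
	return 0;
}
#else
/* Stubs: same signatures, no work done, always report success. */
static inline int demo_register_dev(struct demo_idev *idev)
{
	(void)idev;
	return 0;
}

static inline int demo_unregister_dev(struct demo_idev *idev)
{
	(void)idev;
	return 0;
}
#endif

/* Callers contain no #ifdef and compile the same way in both cases. */
static int demo_add_dev(struct demo_idev *idev)
{
	assert(idev != NULL);
	return demo_register_dev(idev);
}

static int demo_del_dev(struct demo_idev *idev)
{
	assert(idev != NULL);
	return demo_unregister_dev(idev);
}
```

Building with -DDEMO_PER_DEV_ENTRIES selects the real functions; without it the calls collapse to the inline stubs, so the call sites stay free of conditional compilation.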
* Re: [RFC] make per interface sysctl entries configurable
2009-10-25 17:54 [RFC] make per interface sysctl entries configurable Octavian Purdila
@ 2009-10-25 18:07 ` Denys Fedoryschenko
2009-10-25 21:37 ` Eric Dumazet
` (2 subsequent siblings)
3 siblings, 0 replies; 8+ messages in thread
From: Denys Fedoryschenko @ 2009-10-25 18:07 UTC
To: Octavian Purdila
Cc: Eric Dumazet, Benjamin LaHaise, Stephen Hemminger, Cosmin Ratiu,
netdev
Very interesting patch, because I run PPPoE, and sysctl locking is my number
one issue (according to perf and oprofile) during massive PPPoE logins and
during normal operation.
I will probably try to apply it on one of the loaded (but redundant, in case
it crashes) PPPoE servers.
On Sunday 25 October 2009 19:54:49 Octavian Purdila wrote:
> RFC patches are attached.
>
> Another possible approach: add an interface flag and use it to decide
> whether we want per interface sysctl entries or not.
>
> Benchmarks for creating 1000 interfaces (with the ndst module previously
> posted on the list, on a ppc750 @ 800 MHz machine):
>
> - without the patches:
>
> real 4m 38.27s
> user 0m 0.00s
> sys 2m 18.90s
>
> - with the patches:
>
> real 0m 0.10s
> user 0m 0.00s
> sys 0m 0.05s
>
> Thanks,
> tavi
* Re: [RFC] make per interface sysctl entries configurable
2009-10-25 17:54 [RFC] make per interface sysctl entries configurable Octavian Purdila
2009-10-25 18:07 ` Denys Fedoryschenko
@ 2009-10-25 21:37 ` Eric Dumazet
2009-10-25 22:21 ` Octavian Purdila
2009-10-26 4:31 ` Stephen Hemminger
2009-10-26 15:24 ` Denys Fedoryschenko
3 siblings, 1 reply; 8+ messages in thread
From: Eric Dumazet @ 2009-10-25 21:37 UTC
To: Octavian Purdila
Cc: Benjamin LaHaise, Stephen Hemminger, Cosmin Ratiu, netdev
Octavian Purdila wrote:
> RFC patches are attached.
>
> Another possible approach: add an interface flag and use it to decide whether
> we want per interface sysctl entries or not.
>
Hmm, could we speed up sysctl instead, by adding an rbtree or something?
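To illustrate why an ordered structure could help, here is a small userspace sketch that counts string comparisons for a name lookup. It deliberately swaps in a sorted array with binary search instead of an rbtree to stay short, and it assumes (not verified against the sysctl code) that registration cost is dominated by a linear scan over existing entry names. All demo_* names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Linear lookup: compares the key against entries in order.
 * Returns the number of strcmp() calls; *found is set to 1 on a hit.
 */
static size_t demo_linear_lookup(const char *const *names, size_t n,
				 const char *key, int *found)
{
	size_t cmps = 0;

	assert(names != NULL && key != NULL && found != NULL);
	*found = 0;
	for (size_t i = 0; i < n; i++) {
		cmps++;
		if (strcmp(names[i], key) == 0) {
			*found = 1;
			break;
		}
	}
	return cmps;
}

/*
 * Binary search over names sorted in strcmp() order.
 * Returns the number of strcmp() calls; *found is set to 1 on a hit.
 */
static size_t demo_sorted_lookup(const char *const *names, size_t n,
				 const char *key, int *found)
{
	size_t lo = 0, hi = n, cmps = 0;

	assert(names != NULL && key != NULL && found != NULL);
	*found = 0;
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;
		int c = strcmp(names[mid], key);

		cmps++;
		if (c == 0) {
			*found = 1;
			break;
		}
		if (c < 0)
			lo = mid + 1;
		else
			hi = mid;
	}
	return cmps;
}
```

With n existing names, a miss costs n comparisons in the linear version and about log2(n) in the sorted one, so creating n interfaces one after another goes from roughly n^2 comparisons to roughly n log n. A real tree would also need cheap insertion, which a sorted array does not give.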
* Re: [RFC] make per interface sysctl entries configurable
2009-10-25 21:37 ` Eric Dumazet
@ 2009-10-25 22:21 ` Octavian Purdila
2009-10-25 22:32 ` Denys Fedoryschenko
2009-10-26 9:01 ` Cosmin Ratiu
0 siblings, 2 replies; 8+ messages in thread
From: Octavian Purdila @ 2009-10-25 22:21 UTC
To: Eric Dumazet; +Cc: Benjamin LaHaise, Stephen Hemminger, Cosmin Ratiu, netdev
On Sunday 25 October 2009 23:37:19 you wrote:
> Octavian Purdila wrote:
> > RFC patches are attached.
> >
> > Another possible approach: add an interface flag and use it to decide
> > whether we want per interface sysctl entries or not.
>
> Hmm, could we speed up sysctl instead, by adding an rbtree or something?
>
Very good point, I think this is the best solution for people using a
moderately high number of interfaces (a few thousand).
But for really large setups there is another issue: memory consumption. In
fact, in order to scale to 128K interfaces and still have a significant
amount of memory available to applications, we also had to disable sysfs and
compile struct device out of net_device with #ifdef CONFIG_SYSFS.
I would also argue that when you have such a large number of interfaces you
don't need to change settings on a per interface basis. Or at least this is our
case :) and I suspect that the case with a large number of PPP interfaces is
similar.
* Re: [RFC] make per interface sysctl entries configurable
2009-10-25 22:21 ` Octavian Purdila
@ 2009-10-25 22:32 ` Denys Fedoryschenko
2009-10-26 9:01 ` Cosmin Ratiu
1 sibling, 0 replies; 8+ messages in thread
From: Denys Fedoryschenko @ 2009-10-25 22:32 UTC
To: Octavian Purdila
Cc: Eric Dumazet, Benjamin LaHaise, Stephen Hemminger, Cosmin Ratiu,
netdev
On Monday 26 October 2009 00:21:48 Octavian Purdila wrote:
> Very good point, I think this is the best solution for people using a
> moderately high number of interfaces (a few thousand).
>
> But for really large setups there is another issue: memory consumption. In
> fact, in order to scale to 128K interfaces and still have a significant
> amount of memory available to applications, we also had to disable sysfs
> and compile struct device out of net_device with #ifdef CONFIG_SYSFS.
>
> I would also argue that when you have such a large number of interfaces you
> don't need to change settings on a per interface basis. Or at least this is
> our case :) and I suspect that the case with a large number of PPP
> interfaces is similar.
I will also add that sysctl -a (via busybox) on a PPPoE server with 2k
interfaces takes ages to complete.
* Re: [RFC] make per interface sysctl entries configurable
2009-10-25 17:54 [RFC] make per interface sysctl entries configurable Octavian Purdila
2009-10-25 18:07 ` Denys Fedoryschenko
2009-10-25 21:37 ` Eric Dumazet
@ 2009-10-26 4:31 ` Stephen Hemminger
2009-10-26 15:24 ` Denys Fedoryschenko
3 siblings, 0 replies; 8+ messages in thread
From: Stephen Hemminger @ 2009-10-26 4:31 UTC
To: Octavian Purdila; +Cc: Eric Dumazet, Benjamin LaHaise, Cosmin Ratiu, netdev
On Sun, 25 Oct 2009 19:54:49 +0200
Octavian Purdila <opurdila@ixiacom.com> wrote:
>
> RFC patches are attached.
>
> Another possible approach: add an interface flag and use it to decide whether
> we want per interface sysctl entries or not.
>
> Benchmarks for creating 1000 interfaces (with the ndst module previously posted
> on the list, on a ppc750 @ 800 MHz machine):
>
> - without the patches:
>
> real 4m 38.27s
> user 0m 0.00s
> sys 2m 18.90s
>
> - with the patches:
>
> real 0m 0.10s
> user 0m 0.00s
> sys 0m 0.05s
>
> Thanks,
> tavi
I would rather optimize the algorithm than give up and make it
not available. It should be possible to do better by just using some
better programming.
* Re: [RFC] make per interface sysctl entries configurable
2009-10-25 22:21 ` Octavian Purdila
2009-10-25 22:32 ` Denys Fedoryschenko
@ 2009-10-26 9:01 ` Cosmin Ratiu
1 sibling, 0 replies; 8+ messages in thread
From: Cosmin Ratiu @ 2009-10-26 9:01 UTC
To: Octavian Purdila
Cc: Eric Dumazet, Benjamin LaHaise, Stephen Hemminger, netdev
On Monday 26 October 2009 00:21:48 Octavian Purdila wrote:
> On Sunday 25 October 2009 23:37:19 you wrote:
> > Octavian Purdila wrote:
> > > RFC patches are attached.
> > >
> > > Another possible approach: add an interface flag and use it to decide
> > > whether we want per interface sysctl entries or not.
> >
> > Hmm, could we speed up sysctl instead, by adding an rbtree or something?
>
> Very good point, I think this is the best solution for people using a
> moderately high number of interfaces (a few thousand).
>
> But for really large setups there is another issue: memory consumption. In
> fact, in order to scale to 128K interfaces and still have a significant
> amount of memory available to applications, we also had to disable sysfs
> and compile struct device out of net_device with #ifdef CONFIG_SYSFS.
>
> I would also argue that when you have such a large number of interfaces you
> don't need to change settings on a per interface basis. Or at least this is
> our case :) and I suspect that the case with a large number of PPP
> interfaces is similar.
>
Another possible approach: shared settings for an interface group. If you have
a large number of interfaces of the same type it would be nice if you could
change some setting for the whole group instead of globally or individually.
Is this approach feasible, or am I talking rubbish?
Cosmin.
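A rough userspace sketch of the shared-group idea: every interface in a group points at one settings block instead of carrying its own copy. The types, the field names and the demo_* functions are all hypothetical and only show the data layout under discussion, not any existing kernel structure.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical settings shared by every interface in one group. */
struct demo_group_conf {
	int forwarding;
	int rp_filter;
};

/* Hypothetical interface: holds a pointer to the group, not a copy. */
struct demo_iface {
	const char *name;
	const struct demo_group_conf *group;
};

/* Read a setting for one interface through its group. */
static int demo_iface_forwarding(const struct demo_iface *ifc)
{
	assert(ifc != NULL && ifc->group != NULL);
	return ifc->group->forwarding;
}

/* Change a setting once; every member of the group observes it. */
static void demo_group_set_forwarding(struct demo_group_conf *grp, int val)
{
	assert(grp != NULL);
	grp->forwarding = val;
}
```

In this layout the per-interface cost is one pointer, and a single write to the group changes the effective value for all members, which sits between the global and the per-interface settings discussed in the thread.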
* Re: [RFC] make per interface sysctl entries configurable
2009-10-25 17:54 [RFC] make per interface sysctl entries configurable Octavian Purdila
` (2 preceding siblings ...)
2009-10-26 4:31 ` Stephen Hemminger
@ 2009-10-26 15:24 ` Denys Fedoryschenko
3 siblings, 0 replies; 8+ messages in thread
From: Denys Fedoryschenko @ 2009-10-26 15:24 UTC
To: Octavian Purdila
Cc: Eric Dumazet, Benjamin LaHaise, Stephen Hemminger, Cosmin Ratiu,
netdev
I tested it on a PPPoE server with 1k customers. It works flawlessly.
When there is a problem on the network and a massive number of users disconnect
and then log in again, the bottleneck is a lock somewhere in sysctl creation
(according to perf). After 200-300 interfaces PPPoE starts dying: the connection
rate drops to 20-50 customers per minute and the load average jumps to 70-100 (I
guess pppd processes are waiting their turn). With this patch I am able to
sustain a login rate of 200-300 customers per minute, and perf top is "clear" now.
This option is definitely optional and doesn't cut any functionality by
default; it just gives more choice. And for a PPP (pppoe/pptp) NAS it is very
useful.
On Sunday 25 October 2009 19:54:49 Octavian Purdila wrote:
> RFC patches are attached.
>
> Another possible approach: add an interface flag and use it to decide
> whether we want per interface sysctl entries or not.
>
> Benchmarks for creating 1000 interfaces (with the ndst module previously
> posted on the list, on a ppc750 @ 800 MHz machine):
>
> - without the patches:
>
> real 4m 38.27s
> user 0m 0.00s
> sys 2m 18.90s
>
> - with the patches:
>
> real 0m 0.10s
> user 0m 0.00s
> sys 0m 0.05s
>
> Thanks,
> tavi