* [PATCH net-next 0/4] batch calls to fib_flush and arp_ifdown
@ 2016-01-04 23:10 Salam Noureddine
2016-01-04 23:10 ` [PATCH net-next 1/4] net: add event_list to struct net and provide utility functions Salam Noureddine
` (4 more replies)
0 siblings, 5 replies; 9+ messages in thread
From: Salam Noureddine @ 2016-01-04 23:10 UTC (permalink / raw)
To: David S. Miller, Eric Dumazet, Jiri Pirko, Alexei Starovoitov,
Daniel Borkmann, Eric W. Biederman, netdev
Cc: Salam Noureddine
fib_flush walks the whole fib in a net_namespace and is called for
each net_device being closed or unregistered. This can be very expensive
when dealing with 100k or more routes in the fib and removal of a lot
of interfaces. These four patches deal with this issue by calling fib_flush
just once for each net namespace and introduce a new function arp_ifdown_all
that does a similar optimization for the neighbour table.
Salam Noureddine (4):
net: add event_list to struct net and provide utility functions
net: dev: add batching to net_device notifiers
net: core: introduce neigh_ifdown_all for all down interfaces
net: fib: avoid calling fib_flush for each device when doing batch
close and unregister
include/linux/netdevice.h | 2 ++
include/net/arp.h | 1 +
include/net/neighbour.h | 1 +
include/net/net_namespace.h | 22 ++++++++++++++++++++++
include/net/netns/ipv4.h | 1 +
net/core/dev.c | 39 ++++++++++++++++++++++++++++++++++++---
net/core/neighbour.c | 38 +++++++++++++++++++++++++++++++-------
net/ipv4/arp.c | 4 ++++
net/ipv4/fib_frontend.c | 16 ++++++++++++++--
9 files changed, 112 insertions(+), 12 deletions(-)
--
1.8.1.4
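A rough cost model of the problem described in this cover letter can be sketched as follows. It assumes that one fib_flush visits every route in the namespace exactly once, which is a simplification, and the function names are hypothetical and not part of the series.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical cost model, not kernel code: one flush is assumed to
 * visit every route in the namespace exactly once. */

/* Route visits when a flush runs once for every device going away. */
unsigned long long visits_per_device(unsigned long long n_devices,
				     unsigned long long n_routes)
{
	/* Guard against overflow of the product. */
	assert(n_routes == 0 || n_devices <= ULLONG_MAX / n_routes);
	return n_devices * n_routes;
}

/* Route visits when a flush runs once per affected namespace. */
unsigned long long visits_batched(unsigned long long n_namespaces,
				  unsigned long long n_routes)
{
	assert(n_routes == 0 || n_namespaces <= ULLONG_MAX / n_routes);
	return n_namespaces * n_routes;
}
```

As an illustrative figure, 1000 devices in a single namespace with 100k routes give 100,000,000 route visits in the per-device model against 100,000 in the batched model.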
* [PATCH net-next 1/4] net: add event_list to struct net and provide utility functions
2016-01-04 23:10 [PATCH net-next 0/4] batch calls to fib_flush and arp_ifdown Salam Noureddine
@ 2016-01-04 23:10 ` Salam Noureddine
2016-01-04 23:10 ` [PATCH net-next 2/4] net: dev: add batching to net_device notifiers Salam Noureddine
` (3 subsequent siblings)
4 siblings, 0 replies; 9+ messages in thread
From: Salam Noureddine @ 2016-01-04 23:10 UTC (permalink / raw)
To: David S. Miller, Eric Dumazet, Jiri Pirko, Alexei Starovoitov,
Daniel Borkmann, Eric W. Biederman, netdev
Cc: Salam Noureddine
Signed-off-by: Salam Noureddine <noureddine@arista.com>
---
include/net/net_namespace.h | 22 ++++++++++++++++++++++
1 file changed, 22 insertions(+)
diff --git a/include/net/net_namespace.h b/include/net/net_namespace.h
index 4089abc..4cf47de 100644
--- a/include/net/net_namespace.h
+++ b/include/net/net_namespace.h
@@ -58,6 +58,7 @@ struct net {
struct list_head list; /* list of network namespaces */
struct list_head cleanup_list; /* namespaces on death row */
struct list_head exit_list; /* Use only net_mutex */
+ struct list_head event_list; /* net_device notifier list */
struct user_namespace *user_ns; /* Owning user namespace */
spinlock_t nsid_lock;
@@ -380,4 +381,25 @@ static inline void fnhe_genid_bump(struct net *net)
atomic_inc(&net->fnhe_genid);
}
+#ifdef CONFIG_NET_NS
+static inline void net_add_event_list(struct list_head *head, struct net *net)
+{
+ if (!list_empty(&net->event_list))
+ list_add_tail(&net->event_list, head);
+}
+
+static inline void net_del_event_list(struct net *net)
+{
+ list_del_init(&net->event_list);
+}
+#else
+static inline void net_add_event_list(struct list_head *head, struct net *net)
+{
+}
+
+static inline void net_del_event_list(struct net *net)
+{
+}
+#endif
+
#endif /* __NET_NET_NAMESPACE_H */
--
1.8.1.4
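The queue-once idea behind event_list can be sketched in userspace as below. This is a minimal sketch, not the kernel implementation: list_node, fake_net and the helper names are hypothetical stand-ins for struct list_head, struct net and the list.h helpers. The sketch adds a namespace only when its node is unlinked, which is one reading of the intent of net_add_event_list(), and it assumes the node is initialised before first use, something the patch as posted does not show.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal userspace stand-in for the kernel's struct list_head. */
struct list_node {
	struct list_node *next, *prev;
};

void list_node_init(struct list_node *n)
{
	n->next = n;
	n->prev = n;
}

/* True when the node points to itself, i.e. it is not on any list. */
int list_node_unlinked(const struct list_node *n)
{
	return n->next == n;
}

void list_node_add_tail(struct list_node *n, struct list_node *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

/* Unlink and re-initialise, in the spirit of list_del_init(). */
void list_node_del_init(struct list_node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	list_node_init(n);
}

/* Hypothetical namespace object carrying one embedded list node. */
struct fake_net {
	struct list_node event_list;
};

/* Queue the namespace only if it is not already queued. */
void fake_net_queue_once(struct list_node *head, struct fake_net *net)
{
	assert(head && net);
	if (list_node_unlinked(&net->event_list))
		list_node_add_tail(&net->event_list, head);
}

void fake_net_dequeue(struct fake_net *net)
{
	assert(net);
	list_node_del_init(&net->event_list);
}

/* Number of namespaces currently queued on head. */
size_t fake_net_queued(const struct list_node *head)
{
	size_t n = 0;
	const struct list_node *p;

	for (p = head->next; p != head; p = p->next)
		n++;
	return n;
}
```

Queueing the same namespace several times leaves a single entry on the list, so a later walk over the list visits each namespace once regardless of how many of its devices were in the batch.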
* [PATCH net-next 2/4] net: dev: add batching to net_device notifiers
2016-01-04 23:10 [PATCH net-next 0/4] batch calls to fib_flush and arp_ifdown Salam Noureddine
2016-01-04 23:10 ` [PATCH net-next 1/4] net: add event_list to struct net and provide utility functions Salam Noureddine
@ 2016-01-04 23:10 ` Salam Noureddine
2016-01-04 23:10 ` [PATCH net-next 3/4] net: core: introduce neigh_ifdown_all for all down interfaces Salam Noureddine
` (2 subsequent siblings)
4 siblings, 0 replies; 9+ messages in thread
From: Salam Noureddine @ 2016-01-04 23:10 UTC (permalink / raw)
To: David S. Miller, Eric Dumazet, Jiri Pirko, Alexei Starovoitov,
Daniel Borkmann, Eric W. Biederman, netdev
Cc: Salam Noureddine
This can be used to optimize bringing down and unregistering
net_devices by running certain cleanup operations only on the
net namespace instead of on each net_device.
Signed-off-by: Salam Noureddine <noureddine@arista.com>
---
include/linux/netdevice.h | 2 ++
net/core/dev.c | 39 ++++++++++++++++++++++++++++++++++++---
2 files changed, 38 insertions(+), 3 deletions(-)
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index c20b814..1b12269 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -2183,6 +2183,8 @@ struct netdev_lag_lower_state_info {
#define NETDEV_BONDING_INFO 0x0019
#define NETDEV_PRECHANGEUPPER 0x001A
#define NETDEV_CHANGELOWERSTATE 0x001B
+#define NETDEV_UNREGISTER_BATCH 0x001C
+#define NETDEV_DOWN_BATCH 0x001D
int register_netdevice_notifier(struct notifier_block *nb);
int unregister_netdevice_notifier(struct notifier_block *nb);
diff --git a/net/core/dev.c b/net/core/dev.c
index 914b4a2..77410a3 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -1439,11 +1439,16 @@ static int __dev_close(struct net_device *dev)
int dev_close_many(struct list_head *head, bool unlink)
{
struct net_device *dev, *tmp;
+ struct net *net, *net_tmp;
+ LIST_HEAD(net_head);
/* Remove the devices that don't need to be closed */
- list_for_each_entry_safe(dev, tmp, head, close_list)
+ list_for_each_entry_safe(dev, tmp, head, close_list) {
if (!(dev->flags & IFF_UP))
list_del_init(&dev->close_list);
+ else
+ net_add_event_list(&net_head, dev_net(dev));
+ }
__dev_close_many(head);
@@ -1454,6 +1459,11 @@ int dev_close_many(struct list_head *head, bool unlink)
list_del_init(&dev->close_list);
}
+ list_for_each_entry_safe(net, net_tmp, &net_head, event_list) {
+ call_netdevice_notifiers(NETDEV_DOWN_BATCH, net->loopback_dev);
+ net_del_event_list(net);
+ }
+
return 0;
}
EXPORT_SYMBOL(dev_close_many);
@@ -1572,8 +1582,12 @@ rollback:
call_netdevice_notifier(nb, NETDEV_GOING_DOWN,
dev);
call_netdevice_notifier(nb, NETDEV_DOWN, dev);
+ call_netdevice_notifier(nb, NETDEV_DOWN_BATCH,
+ dev);
}
call_netdevice_notifier(nb, NETDEV_UNREGISTER, dev);
+ call_netdevice_notifier(nb, NETDEV_UNREGISTER_BATCH,
+ dev);
}
}
@@ -1614,8 +1628,12 @@ int unregister_netdevice_notifier(struct notifier_block *nb)
call_netdevice_notifier(nb, NETDEV_GOING_DOWN,
dev);
call_netdevice_notifier(nb, NETDEV_DOWN, dev);
+ call_netdevice_notifier(nb, NETDEV_DOWN_BATCH,
+ dev);
}
call_netdevice_notifier(nb, NETDEV_UNREGISTER, dev);
+ call_netdevice_notifier(nb, NETDEV_UNREGISTER_BATCH,
+ dev);
}
}
unlock:
@@ -6187,10 +6205,12 @@ void __dev_notify_flags(struct net_device *dev, unsigned int old_flags,
rtmsg_ifinfo(RTM_NEWLINK, dev, gchanges, GFP_ATOMIC);
if (changes & IFF_UP) {
- if (dev->flags & IFF_UP)
+ if (dev->flags & IFF_UP) {
call_netdevice_notifiers(NETDEV_UP, dev);
- else
+ } else {
call_netdevice_notifiers(NETDEV_DOWN, dev);
+ call_netdevice_notifiers(NETDEV_DOWN_BATCH, dev);
+ }
}
if (dev->flags & IFF_UP &&
@@ -6427,7 +6447,9 @@ static void net_set_todo(struct net_device *dev)
static void rollback_registered_many(struct list_head *head)
{
struct net_device *dev, *tmp;
+ struct net *net, *net_tmp;
LIST_HEAD(close_head);
+ LIST_HEAD(net_head);
BUG_ON(dev_boot_phase);
ASSERT_RTNL();
@@ -6504,6 +6526,15 @@ static void rollback_registered_many(struct list_head *head)
#endif
}
+ list_for_each_entry(dev, head, unreg_list) {
+ net_add_event_list(&net_head, dev_net(dev));
+ }
+ list_for_each_entry_safe(net, net_tmp, &net_head, event_list) {
+ call_netdevice_notifiers(NETDEV_UNREGISTER_BATCH,
+ net->loopback_dev);
+ net_del_event_list(net);
+ }
+
synchronize_net();
list_for_each_entry(dev, head, unreg_list)
@@ -7065,6 +7096,7 @@ static void netdev_wait_allrefs(struct net_device *dev)
/* Rebroadcast unregister notification */
call_netdevice_notifiers(NETDEV_UNREGISTER, dev);
+ call_netdevice_notifiers(NETDEV_UNREGISTER_BATCH, dev);
__rtnl_unlock();
rcu_barrier();
@@ -7581,6 +7613,7 @@ int dev_change_net_namespace(struct net_device *dev, struct net *net, const char
the device is just moving and can keep their slaves up.
*/
call_netdevice_notifiers(NETDEV_UNREGISTER, dev);
+ call_netdevice_notifiers(NETDEV_UNREGISTER_BATCH, dev);
rcu_barrier();
call_netdevice_notifiers(NETDEV_UNREGISTER_FINAL, dev);
rtmsg_ifinfo(RTM_DELLINK, dev, ~0U, GFP_KERNEL);
--
1.8.1.4
* [PATCH net-next 3/4] net: core: introduce neigh_ifdown_all for all down interfaces
2016-01-04 23:10 [PATCH net-next 0/4] batch calls to fib_flush and arp_ifdown Salam Noureddine
2016-01-04 23:10 ` [PATCH net-next 1/4] net: add event_list to struct net and provide utility functions Salam Noureddine
2016-01-04 23:10 ` [PATCH net-next 2/4] net: dev: add batching to net_device notifiers Salam Noureddine
@ 2016-01-04 23:10 ` Salam Noureddine
2016-01-04 23:10 ` [PATCH net-next 4/4] net: fib: avoid calling fib_flush for each device when doing batch close and unregister Salam Noureddine
2016-01-05 0:35 ` [PATCH net-next 0/4] batch calls to fib_flush and arp_ifdown Eric W. Biederman
4 siblings, 0 replies; 9+ messages in thread
From: Salam Noureddine @ 2016-01-04 23:10 UTC (permalink / raw)
To: David S. Miller, Eric Dumazet, Jiri Pirko, Alexei Starovoitov,
Daniel Borkmann, Eric W. Biederman, netdev
Cc: Salam Noureddine
This cleans up neighbour entries for all interfaces in the down
state, avoiding walking the whole neighbour table for each interface
being brought down.
Signed-off-by: Salam Noureddine <noureddine@arista.com>
---
include/net/arp.h | 1 +
include/net/neighbour.h | 1 +
net/core/neighbour.c | 38 +++++++++++++++++++++++++++++++-------
net/ipv4/arp.c | 4 ++++
4 files changed, 37 insertions(+), 7 deletions(-)
diff --git a/include/net/arp.h b/include/net/arp.h
index 5e0f891..0efee66 100644
--- a/include/net/arp.h
+++ b/include/net/arp.h
@@ -43,6 +43,7 @@ void arp_send(int type, int ptype, __be32 dest_ip,
const unsigned char *src_hw, const unsigned char *th);
int arp_mc_map(__be32 addr, u8 *haddr, struct net_device *dev, int dir);
void arp_ifdown(struct net_device *dev);
+void arp_ifdown_all(void);
struct sk_buff *arp_create(int type, int ptype, __be32 dest_ip,
struct net_device *dev, __be32 src_ip,
diff --git a/include/net/neighbour.h b/include/net/neighbour.h
index 8b68384..8785d7b 100644
--- a/include/net/neighbour.h
+++ b/include/net/neighbour.h
@@ -318,6 +318,7 @@ int neigh_update(struct neighbour *neigh, const u8 *lladdr, u8 new, u32 flags);
void __neigh_set_probe_once(struct neighbour *neigh);
void neigh_changeaddr(struct neigh_table *tbl, struct net_device *dev);
int neigh_ifdown(struct neigh_table *tbl, struct net_device *dev);
+int neigh_ifdown_all(struct neigh_table *tbl);
int neigh_resolve_output(struct neighbour *neigh, struct sk_buff *skb);
int neigh_connected_output(struct neighbour *neigh, struct sk_buff *skb);
int neigh_direct_output(struct neighbour *neigh, struct sk_buff *skb);
diff --git a/net/core/neighbour.c b/net/core/neighbour.c
index f18ae91..bfbd97a 100644
--- a/net/core/neighbour.c
+++ b/net/core/neighbour.c
@@ -54,7 +54,8 @@ do { \
static void neigh_timer_handler(unsigned long arg);
static void __neigh_notify(struct neighbour *n, int type, int flags);
static void neigh_update_notify(struct neighbour *neigh);
-static int pneigh_ifdown(struct neigh_table *tbl, struct net_device *dev);
+static int pneigh_ifdown(struct neigh_table *tbl, struct net_device *dev,
+ bool all_down);
#ifdef CONFIG_PROC_FS
static const struct file_operations neigh_stat_seq_fops;
@@ -192,7 +193,8 @@ static void pneigh_queue_purge(struct sk_buff_head *list)
}
}
-static void neigh_flush_dev(struct neigh_table *tbl, struct net_device *dev)
+static void neigh_flush_dev(struct neigh_table *tbl, struct net_device *dev,
+ bool all_down)
{
int i;
struct neigh_hash_table *nht;
@@ -210,6 +212,12 @@ static void neigh_flush_dev(struct neigh_table *tbl, struct net_device *dev)
np = &n->next;
continue;
}
+ if (!dev && n->dev && all_down) {
+ if (n->dev->flags & IFF_UP) {
+ np = &n->next;
+ continue;
+ }
+ }
rcu_assign_pointer(*np,
rcu_dereference_protected(n->next,
lockdep_is_held(&tbl->lock)));
@@ -245,7 +253,7 @@ static void neigh_flush_dev(struct neigh_table *tbl, struct net_device *dev)
void neigh_changeaddr(struct neigh_table *tbl, struct net_device *dev)
{
write_lock_bh(&tbl->lock);
- neigh_flush_dev(tbl, dev);
+ neigh_flush_dev(tbl, dev, false);
write_unlock_bh(&tbl->lock);
}
EXPORT_SYMBOL(neigh_changeaddr);
@@ -253,8 +261,8 @@ EXPORT_SYMBOL(neigh_changeaddr);
int neigh_ifdown(struct neigh_table *tbl, struct net_device *dev)
{
write_lock_bh(&tbl->lock);
- neigh_flush_dev(tbl, dev);
- pneigh_ifdown(tbl, dev);
+ neigh_flush_dev(tbl, dev, false);
+ pneigh_ifdown(tbl, dev, false);
write_unlock_bh(&tbl->lock);
del_timer_sync(&tbl->proxy_timer);
@@ -263,6 +271,19 @@ int neigh_ifdown(struct neigh_table *tbl, struct net_device *dev)
}
EXPORT_SYMBOL(neigh_ifdown);
+int neigh_ifdown_all(struct neigh_table *tbl)
+{
+ write_lock_bh(&tbl->lock);
+ neigh_flush_dev(tbl, NULL, true);
+ pneigh_ifdown(tbl, NULL, true);
+ write_unlock_bh(&tbl->lock);
+
+ del_timer_sync(&tbl->proxy_timer);
+ pneigh_queue_purge(&tbl->proxy_queue);
+ return 0;
+}
+EXPORT_SYMBOL(neigh_ifdown_all);
+
static struct neighbour *neigh_alloc(struct neigh_table *tbl, struct net_device *dev)
{
struct neighbour *n = NULL;
@@ -645,7 +666,8 @@ int pneigh_delete(struct neigh_table *tbl, struct net *net, const void *pkey,
return -ENOENT;
}
-static int pneigh_ifdown(struct neigh_table *tbl, struct net_device *dev)
+static int pneigh_ifdown(struct neigh_table *tbl, struct net_device *dev,
+ bool all_down)
{
struct pneigh_entry *n, **np;
u32 h;
@@ -653,7 +675,9 @@ static int pneigh_ifdown(struct neigh_table *tbl, struct net_device *dev)
for (h = 0; h <= PNEIGH_HASHMASK; h++) {
np = &tbl->phash_buckets[h];
while ((n = *np) != NULL) {
- if (!dev || n->dev == dev) {
+ if ((!dev && !all_down) || (all_down && n->dev &&
+ !(n->dev->flags & IFF_UP)) ||
+ n->dev == dev) {
*np = n->next;
if (tbl->pdestructor)
tbl->pdestructor(n);
diff --git a/net/ipv4/arp.c b/net/ipv4/arp.c
index 59b3e0e..1328244 100644
--- a/net/ipv4/arp.c
+++ b/net/ipv4/arp.c
@@ -1219,6 +1219,10 @@ void arp_ifdown(struct net_device *dev)
neigh_ifdown(&arp_tbl, dev);
}
+void arp_ifdown_all(void)
+{
+ neigh_ifdown_all(&arp_tbl);
+}
/*
* Called once on startup.
--
1.8.1.4
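The three-way condition added to pneigh_ifdown() in this patch can be read as a standalone predicate. The sketch below is a userspace approximation under assumptions: sketch_dev, SKETCH_IFF_UP and sketch_should_flush are hypothetical names, and the assertion encodes the observation that the calls shown in the patch never combine a device filter with all_down.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_IFF_UP 0x1 /* stand-in for the kernel's IFF_UP flag */

/* Hypothetical device object; only the flags matter here. */
struct sketch_dev {
	unsigned int flags;
};

/*
 * Decide whether an entry bound to entry_dev is dropped. An entry is
 * dropped when any of the following holds:
 *   1. no filter device and all_down is false (drop everything);
 *   2. all_down is true and the entry's device exists and is not up;
 *   3. the entry's device equals the filter (also true if both are NULL).
 */
bool sketch_should_flush(const struct sketch_dev *filter, bool all_down,
			 const struct sketch_dev *entry_dev)
{
	/* The calls shown in the patch pass either a device or all_down. */
	assert(!(filter && all_down));

	if (!filter && !all_down)
		return true;
	if (all_down && entry_dev && !(entry_dev->flags & SKETCH_IFF_UP))
		return true;
	return entry_dev == filter;
}
```

With no filter and all_down set, an entry on an interface that is still up is kept and an entry on a down interface is dropped, which is what lets a single table walk clean up after a whole batch of interfaces.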
* [PATCH net-next 4/4] net: fib: avoid calling fib_flush for each device when doing batch close and unregister
2016-01-04 23:10 [PATCH net-next 0/4] batch calls to fib_flush and arp_ifdown Salam Noureddine
` (2 preceding siblings ...)
2016-01-04 23:10 ` [PATCH net-next 3/4] net: core: introduce neigh_ifdown_all for all down interfaces Salam Noureddine
@ 2016-01-04 23:10 ` Salam Noureddine
2016-01-05 0:35 ` [PATCH net-next 0/4] batch calls to fib_flush and arp_ifdown Eric W. Biederman
4 siblings, 0 replies; 9+ messages in thread
From: Salam Noureddine @ 2016-01-04 23:10 UTC (permalink / raw)
To: David S. Miller, Eric Dumazet, Jiri Pirko, Alexei Starovoitov,
Daniel Borkmann, Eric W. Biederman, netdev
Cc: Salam Noureddine
Call fib_flush at the end when closing or unregistering multiple
devices. This can save walking the fib many times and greatly
reduce rtnl_lock hold time when unregistering many devices with
a fib having hundreds of thousands of routes.
Signed-off-by: Salam Noureddine <noureddine@arista.com>
---
include/net/netns/ipv4.h | 1 +
net/ipv4/fib_frontend.c | 16 ++++++++++++++--
2 files changed, 15 insertions(+), 2 deletions(-)
diff --git a/include/net/netns/ipv4.h b/include/net/netns/ipv4.h
index d75be32..d59a078 100644
--- a/include/net/netns/ipv4.h
+++ b/include/net/netns/ipv4.h
@@ -111,5 +111,6 @@ struct netns_ipv4 {
#endif
#endif
atomic_t rt_genid;
+ bool needs_fib_flush;
};
#endif
diff --git a/net/ipv4/fib_frontend.c b/net/ipv4/fib_frontend.c
index 4734475..808426e 100644
--- a/net/ipv4/fib_frontend.c
+++ b/net/ipv4/fib_frontend.c
@@ -1161,11 +1161,22 @@ static int fib_netdev_event(struct notifier_block *this, unsigned long event, vo
unsigned int flags;
if (event == NETDEV_UNREGISTER) {
- fib_disable_ip(dev, event, true);
+ if (fib_sync_down_dev(dev, event, true))
+ net->ipv4.needs_fib_flush = true;
rt_flush_dev(dev);
return NOTIFY_DONE;
}
+ if (event == NETDEV_UNREGISTER_BATCH || event == NETDEV_DOWN_BATCH) {
+ if (net->ipv4.needs_fib_flush) {
+ fib_flush(net);
+ net->ipv4.needs_fib_flush = false;
+ }
+ rt_cache_flush(net);
+ arp_ifdown_all();
+ return NOTIFY_DONE;
+ }
+
in_dev = __in_dev_get_rtnl(dev);
if (!in_dev)
return NOTIFY_DONE;
@@ -1182,7 +1193,8 @@ static int fib_netdev_event(struct notifier_block *this, unsigned long event, vo
rt_cache_flush(net);
break;
case NETDEV_DOWN:
- fib_disable_ip(dev, event, false);
+ if (fib_sync_down_dev(dev, event, false))
+ net->ipv4.needs_fib_flush = true;
break;
case NETDEV_CHANGE:
flags = dev_get_flags(dev);
--
1.8.1.4
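The deferred-flush pattern in this patch can be sketched in userspace as follows. It is only an approximation: the event codes, sketch_ns and sketch_handle_event are hypothetical, and the routes_marked_dead argument stands in for a non-zero return from fib_sync_down_dev() as used in the diff.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical event codes standing in for the NETDEV_* events. */
enum sketch_event {
	SKETCH_DOWN,
	SKETCH_UNREGISTER,
	SKETCH_DOWN_BATCH,
	SKETCH_UNREGISTER_BATCH,
};

/* Hypothetical per-namespace state. */
struct sketch_ns {
	bool needs_flush;     /* set by per-device events */
	unsigned int flushes; /* number of full-table walks performed */
};

/*
 * Per-device events only record that a flush is pending. A batch event
 * performs at most one flush and clears the flag. routes_marked_dead is
 * ignored for batch events.
 */
void sketch_handle_event(struct sketch_ns *ns, enum sketch_event ev,
			 bool routes_marked_dead)
{
	assert(ns);

	switch (ev) {
	case SKETCH_DOWN:
	case SKETCH_UNREGISTER:
		if (routes_marked_dead)
			ns->needs_flush = true;
		break;
	case SKETCH_DOWN_BATCH:
	case SKETCH_UNREGISTER_BATCH:
		if (ns->needs_flush) {
			ns->flushes++;
			ns->needs_flush = false;
		}
		break;
	}
}
```

Feeding three per-device down events followed by one batch event into the handler results in a single flush, and a second batch event with nothing pending does no further work.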
* Re: [PATCH net-next 0/4] batch calls to fib_flush and arp_ifdown
2016-01-04 23:10 [PATCH net-next 0/4] batch calls to fib_flush and arp_ifdown Salam Noureddine
` (3 preceding siblings ...)
2016-01-04 23:10 ` [PATCH net-next 4/4] net: fib: avoid calling fib_flush for each device when doing batch close and unregister Salam Noureddine
@ 2016-01-05 0:35 ` Eric W. Biederman
2016-01-05 1:12 ` Salam Noureddine
4 siblings, 1 reply; 9+ messages in thread
From: Eric W. Biederman @ 2016-01-05 0:35 UTC (permalink / raw)
To: Salam Noureddine
Cc: David S. Miller, Eric Dumazet, Jiri Pirko, Alexei Starovoitov,
Daniel Borkmann, netdev
Salam Noureddine <noureddine@arista.com> writes:
> fib_flush walks the whole fib in a net_namespace and is called for
> each net_device being closed or unregistered. This can be very expensive
> when dealing with 100k or more routes in the fib and removal of a lot
> of interfaces. These four patches deal with this issue by calling fib_flush
> just once for each net namespace and introduce a new function arp_ifdown_all
> that does a similar optimization for the neighbour table.
Two things would be very valuable with this patchset.
Some numbers on how much your changes have improved the code in the case
you care about. I suspect the improvements are not subtle so this
should not be hard.
Can you please provide a justification for event_list. Just skimming
through it appears that event_list becomes a duplicate of the list of
batched network devices that are passed to dev_close_many and friends.
If the list is actually a duplicate it appears foolish to create it.
Eric
> Salam Noureddine (4):
> net: add event_list to struct net and provide utility functions
> net: dev: add batching to net_device notifiers
> net: core: introduce neigh_ifdown_all for all down interfaces
> net: fib: avoid calling fib_flush for each device when doing batch
> close and unregister
>
> include/linux/netdevice.h | 2 ++
> include/net/arp.h | 1 +
> include/net/neighbour.h | 1 +
> include/net/net_namespace.h | 22 ++++++++++++++++++++++
> include/net/netns/ipv4.h | 1 +
> net/core/dev.c | 39 ++++++++++++++++++++++++++++++++++++---
> net/core/neighbour.c | 38 +++++++++++++++++++++++++++++++-------
> net/ipv4/arp.c | 4 ++++
> net/ipv4/fib_frontend.c | 16 ++++++++++++++--
> 9 files changed, 112 insertions(+), 12 deletions(-)
* Re: [PATCH net-next 0/4] batch calls to fib_flush and arp_ifdown
2016-01-05 0:35 ` [PATCH net-next 0/4] batch calls to fib_flush and arp_ifdown Eric W. Biederman
@ 2016-01-05 1:12 ` Salam Noureddine
2016-01-12 1:53 ` Salam Noureddine
0 siblings, 1 reply; 9+ messages in thread
From: Salam Noureddine @ 2016-01-05 1:12 UTC (permalink / raw)
To: Eric W. Biederman
Cc: David S. Miller, Eric Dumazet, Jiri Pirko, Alexei Starovoitov,
Daniel Borkmann, netdev
On Mon, Jan 4, 2016 at 4:35 PM, Eric W. Biederman <ebiederm@xmission.com> wrote:
> Two things would be very valuable with this patchset.
>
> Some numbers on how much your changes have improved the code in the case
> you care about. I suspect the improvements are not subtle so this
> should not be hard.
>
> Can you please provide a justification for event_list. Just skimming
> through it appears that event_list becomes a duplicate of the list of
> batched network devices that are passed to dev_close_many and friends.
> If the list is actually a duplicate it appears foolish to create it.
>
> Eric
The performance test I ran tries to unregister 1000 dummy interfaces
with 512K routes in the fib. Without the patch I could unregister 35
interfaces per second and with the patch it jumped to 620 interfaces
per second. 512K is a lot of routes but I am assuming we would get a
good improvement even with 100K routes in the fib.
I am using event_list to put all the net namespaces in the current
net_device batch on a list and only call the NETDEV_UNREGISTER_BATCH
on those namespaces. It would be possible to just call the notifier
for NETDEV_DOWN/UNREGISTER_BATCH for all the devices on the list and
rely on the needs_fib_flush flag to only call fib_flush once per
namespace but it seems like a waste to me.
Salam
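The throughput figures quoted in this reply can be turned into a ratio and a total run time with a few lines of arithmetic. This is a sketch only: the helper names are hypothetical, and it assumes the per-second rates stay constant over the whole run, which the message does not state.

```c
#include <assert.h>

/* Ratio of two throughput figures; both must be positive. */
double speedup(double before_per_sec, double after_per_sec)
{
	assert(before_per_sec > 0.0 && after_per_sec > 0.0);
	return after_per_sec / before_per_sec;
}

/* Seconds needed to unregister n interfaces at a steady rate. */
double seconds_for(double n_interfaces, double rate_per_sec)
{
	assert(rate_per_sec > 0.0);
	return n_interfaces / rate_per_sec;
}
```

With the numbers above, speedup(35, 620) is roughly 17.7, and unregistering 1000 interfaces would take roughly 28.6 seconds at 35 per second against roughly 1.6 seconds at 620 per second.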
* Re: [PATCH net-next 0/4] batch calls to fib_flush and arp_ifdown
2016-01-05 1:12 ` Salam Noureddine
@ 2016-01-12 1:53 ` Salam Noureddine
0 siblings, 0 replies; 9+ messages in thread
From: Salam Noureddine @ 2016-01-12 1:53 UTC (permalink / raw)
To: Eric W. Biederman
Cc: David S. Miller, Eric Dumazet, Jiri Pirko, Alexei Starovoitov,
Daniel Borkmann, Network Development
Any thoughts on the correctness of the patchset? Or is there a different
way to solve the long rtnl_lock hold time this issue leads to?
Thanks,
Salam
On Mon, Jan 4, 2016 at 5:12 PM, Salam Noureddine <noureddine@arista.com> wrote:
> On Mon, Jan 4, 2016 at 4:35 PM, Eric W. Biederman <ebiederm@xmission.com> wrote:
>
>> Two things would be very valuable with this patchset.
>>
>> Some numbers on how much your changes have improved the code in the case
>> you care about. I suspect the improvements are not subtle so this
>> should not be hard.
>>
>> Can you please provide a justification for event_list. Just skimming
>> through it appears that event_list becomes a duplicate of the list of
>> batched network devices that are passed to dev_close_many and friends.
>> If the list is actually a duplicate it appears foolish to create it.
>>
>> Eric
>
> The performance test I ran tries to unregister 1000 dummy interfaces
> with 512K routes in the fib. Without the patch I could unregister 35
> interfaces per second and with the patch it jumped to 620 interfaces
> per second. 512K is a lot of routes but I am assuming we would get a
> good improvement even with 100K routes in the fib.
>
> I am using event_list to put all the net namespaces in the current
> net_device batch on a list and only call the NETDEV_UNREGISTER_BATCH
> on those namespaces. It would be possible to just call the notifier
> for NETDEV_DOWN/UNREGISTER_BATCH for all the devices on the list and
> rely on the needs_fib_flush flag to only call fib_flush once per
> namespace but it seems like a waste to me.
>
> Salam
* [PATCH net-next 0/4] batch calls to fib_flush and arp_ifdown
@ 2016-02-01 22:32 Salam Noureddine
0 siblings, 0 replies; 9+ messages in thread
From: Salam Noureddine @ 2016-02-01 22:32 UTC (permalink / raw)
To: David S. Miller, Eric Dumazet, Jiri Pirko, Alexei Starovoitov,
Daniel Borkmann, Eric W. Biederman, netdev
Cc: Salam Noureddine
fib_flush walks the whole fib in a net_namespace and is called for
each net_device being closed or unregistered. This can be very expensive
when dealing with 100k or more routes in the fib and removal of a lot
of interfaces. These four patches deal with this issue by calling fib_flush
just once for each net namespace and introduce a new function arp_ifdown_all
that does a similar optimization for the neighbour table.
I got the following benchmark results on one of our switches.
Without this patch, deleting 1k interfaces with 100k routes in the fib held
the rtnl_lock for 13 seconds. With the patch, rtnl_lock hold time went down
to 5 seconds. The gain is even more pronounced with 512k routes in the FIB.
In this case, without the patch, rtnl_lock was held for 36 seconds and with
the patch it was held for 5.5 seconds.
Salam Noureddine (4):
net: add event_list to struct net and provide utility functions
net: dev: add batching to net_device notifiers
net: core: introduce neigh_ifdown_all for all down interfaces
net: fib: avoid calling fib_flush for each device when doing batch
close and unregister
include/linux/netdevice.h | 2 ++
include/net/arp.h | 1 +
include/net/neighbour.h | 1 +
include/net/net_namespace.h | 22 ++++++++++++++++++++++
include/net/netns/ipv4.h | 1 +
net/core/dev.c | 39 ++++++++++++++++++++++++++++++++++++---
net/core/neighbour.c | 38 +++++++++++++++++++++++++++++++-------
net/ipv4/arp.c | 4 ++++
net/ipv4/fib_frontend.c | 16 ++++++++++++++--
9 files changed, 112 insertions(+), 12 deletions(-)
--
1.8.1.4