From: Ido Schimmel <idosch@idosch.org>
To: Lahav Schlesinger <lschlesinger@drivenets.com>
Cc: netdev@vger.kernel.org, kuba@kernel.org, dsahern@gmail.com
Subject: Re: [PATCH net-next v4] rtnetlink: Support fine-grained netdevice bulk deletion
Date: Sun, 5 Dec 2021 11:53:03 +0200
Message-ID: <YayL/7d/hm3TYjtV@shredder>
In-Reply-To: <20211202174502.28903-1-lschlesinger@drivenets.com>

On Thu, Dec 02, 2021 at 07:45:02PM +0200, Lahav Schlesinger wrote:
> At large scale, some routers are required to support tens of thousands
> of devices at once, both physical and virtual (e.g. loopbacks, tunnels,
> VRFs, etc).
> At times such routers are required to delete massive numbers of devices
> at once, such as when a factory reset is performed on the router
> (causing a deletion of all devices), when a configuration is restored
> after an upgrade, or at the request of an operator.
> 
> Currently there are two means of deleting devices using Netlink:
> 1. Deleting a single device (either by ifindex using ifinfomsg::ifi_index,
> or by name using IFLA_IFNAME)
> 2. Deleting all devices that belong to a group (using IFLA_GROUP)
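
For reference, a minimal sketch of both existing paths with iproute2
(device and group names are examples): deleting a single device by name,
then deleting every device that belongs to group 10:

# ip link del dev dummy1
# ip link del group 10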
> 
> Deleting devices one by one performs poorly at a large scale of
> devices compared to "group deletion":
> After all devices are handled, netdev_run_todo() is called, which
> calls rcu_barrier() to finish any outstanding RCU callbacks that were
> registered during the deletion of the devices, waits until the
> refcount of all the devices reaches 0, and then performs the final cleanups.
> 
> However, calling rcu_barrier() is a very costly operation, each call
> taking on the order of tens of milliseconds.
> 
> When deleting a large number of devices one by one, rcu_barrier()
> is called once for each device being deleted.
> As an example, the following benchmark deletes 10K loopback devices,
> all of which are UP and have only an IPv6 LLA configured:
> 
> 1. Deleting one-by-one using 1 thread  : 243 seconds
> 2. Deleting one-by-one using 10 threads: 70 seconds
> 3. Deleting one-by-one using 50 threads: 54 seconds
> 4. Deleting all using "group deletion" : 30 seconds
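
Not shown in the patch, but a comparable setup can be sketched with
dummy devices standing in for the loopbacks above (device names and the
group number are made up):

# create 10k UP dummy devices (each gets an IPv6 LLA) in group 10
for i in $(seq 1 10000); do
        echo "link add name d$i group 10 type dummy"
        echo "link set dev d$i up"
done > add.batch
ip -b add.batch

# one-by-one deletion: one RTM_DELLINK per device
for i in $(seq 1 10000); do echo "link del dev d$i"; done > del.batch
time -p ip -b del.batch

# group deletion: a single RTM_DELLINK carrying IFLA_GROUP
# (recreate the devices before timing this second variant)
time -p ip link del group 10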
> 
> Note that even though the deletion logic takes place under the rtnl
> lock, the call to rcu_barrier() happens outside the lock, so multiple
> threads still yield some improvement.
> 
> But while "group deletion" is the fastest, it is not suited for
> deleting a large number of arbitrary devices that are unknown ahead of
> time. Furthermore, moving a large number of devices to a group is
> itself a costly operation.

These are the numbers I get in a VM running on my laptop.

Moving 16k dummy netdevs to a group:

# time -p ip -b group.batch 
real 1.91
user 0.04
sys 0.27

Deleting the group:

# time -p ip link del group 10
real 6.15
user 0.00
sys 3.02
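
(The contents of group.batch are not shown; a plausible way to generate
such a batch file is one "link set ... group 10" line per netdev, with
hypothetical device names:

for i in $(seq 1 16000); do
        echo "link set dev dummy$i group 10"
done > group.batch
time -p ip -b group.batch
)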

IMO, these numbers do not justify a new API. Also, your user space can
be taught to create all the netdevs in the same group to begin with:

# ip link add name dummy1 group 10 type dummy
# ip link show dev dummy1
10: dummy1: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN mode DEFAULT group 10 qlen 1000
    link/ether 12:b6:7d:ff:48:99 brd ff:ff:ff:ff:ff:ff

Moreover, unlike the list API that is specific to deletion, the group
API also lets you batch set operations:

# ip link set group 10 mtu 2000
# ip link show dev dummy1
10: dummy1: <BROADCAST,NOARP> mtu 2000 qdisc noop state DOWN mode DEFAULT group 10 qlen 1000
    link/ether 12:b6:7d:ff:48:99 brd ff:ff:ff:ff:ff:ff

If you are using namespaces, then during "factory reset" you can delete
the namespace which should trigger batch deletion of the netdevs inside
it.
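
A sketch of that flow, with hypothetical names (note that physical
netdevs are moved back to the initial namespace rather than deleted
when theirs goes away):

# ip netns add runtime
# ip netns exec runtime ip link add name dummy1 type dummy
[... create the remaining run-time netdevs inside "runtime" ...]
# ip netns del runtime

The last command unregisters all virtual netdevs inside the namespace
in one batch.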

> 
> This patch adds support for passing an arbitrary list of ifindices of
> devices to delete with a new IFLA_IFINDEX attribute (a single message
> may contain multiple instances of this attribute).
> This gives more fine-grained control over which devices to delete,
> while still resulting in rcu_barrier() being called only once.
> Indeed, the timings of using this new API to delete 10K devices are
> the same as using the existing "group" deletion.
> 
> Signed-off-by: Lahav Schlesinger <lschlesinger@drivenets.com>
