From: Stephen Hemminger <stephen@networkplumber.org>
To: William Tu <witu@nvidia.com>
Cc: <netdev@vger.kernel.org>, <jiri@nvidia.com>, <bodong@nvidia.com>,
<kuba@kernel.org>
Subject: Re: [PATCH RFC net-next] net: cache the __dev_alloc_name()
Date: Tue, 7 May 2024 21:24:36 -0700
Message-ID: <20240507212436.75c799ad@hermes.local>
In-Reply-To: <20240506203207.1307971-1-witu@nvidia.com>
On Mon, 6 May 2024 20:32:07 +0000
William Tu <witu@nvidia.com> wrote:
> When a system has around 1000 netdevs, adding the 1001st device becomes
> very slow. The devlink command to create an SF
> $ devlink port add pci/0000:03:00.0 flavour pcisf \
> pfnum 0 sfnum 1001
> takes around 5 seconds, and Linux perf and flamegraph show 19% of time
> spent on __dev_alloc_name() [1].
>
> The reason is that devlink first requests the next available "eth%d".
> And __dev_alloc_name will scan all existing netdevs to match on "ethN",
> set bit N in an 'inuse' bitmap, and find/return the next available number,
> in our case eth0.
>
> And later on based on udev rule, we renamed it from eth0 to
> "en3f0pf0sf1001" and with altname below
> 14: en3f0pf0sf1001: <BROADCAST,MULTICAST,UP,LOWER_UP> ...
> altname enp3s0f0npf0sf1001
>
> So eth0 is actually never being used, but as we have 1k "en3f0pf0sfN"
> devices + 1k altnames, the __dev_alloc_name spends lots of time going
> through all existing netdevs and trying to build the 'inuse' bitmap of
> pattern 'eth%d'. And the bitmap barely has any bit set, and it rescans
> every time.
>
> I want to see if it makes sense to save/cache the result, or is there
> any way to not go through the 'eth%d' pattern search. The RFC patch
> adds name_pat (name pattern) hlist and saves the 'inuse' bitmap. It saves
> patterns, e.g. "eth%d", "veth%d", with the bitmap, and looks them up before
> scanning all existing netdevs.
>
> Note: code is working just for quick performance benchmark, and still
> missing lots of stuff. Using hlist seems to be overkill, as I think
> we only have a few patterns
> $ git grep alloc_netdev drivers/ net/ | grep %d
>
> 1. https://github.com/williamtu/net-next/issues/1
>
> Signed-off-by: William Tu <witu@nvidia.com>
The actual patch is a bit of a mess, with commented-out code, leftover printks,
and random whitespace changes. Please fix that.
The issue is that the bitmap gets to be large and adds bloat to embedded devices.
Perhaps you could force devlink to use the same device each time (eth0)
if it is going to be renamed anyway.
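
For readers following along, the scan described in the quoted message can be
sketched in userspace roughly as below. This is not the kernel's
__dev_alloc_name(); the names MAX_IDS and first_free_index() and the flat
array of device names are assumptions made purely for illustration. It only
shows why every allocation walks all existing names, even when almost none
of them match the pattern.

```c
#include <limits.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical cap on the index space; chosen only for this sketch. */
#define MAX_IDS 4096
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/*
 * Return the lowest N for which the name produced by 'pattern'
 * (e.g. "eth%d") is not already present in 'names', or -1 if all
 * MAX_IDS slots are taken. Every call rescans the whole list.
 */
static int first_free_index(const char *pattern,
                            const char *const *names, size_t count)
{
    unsigned long inuse[MAX_IDS / BITS_PER_WORD] = {0};
    char buf[64];

    for (size_t k = 0; k < count; k++) {
        int i;

        /* Skip names that do not parse against the pattern at all. */
        if (sscanf(names[k], pattern, &i) != 1)
            continue;
        if (i < 0 || i >= MAX_IDS)
            continue;

        /* Re-format and compare to reject partial matches like "eth1x". */
        snprintf(buf, sizeof(buf), pattern, i);
        if (strcmp(buf, names[k]) == 0)
            inuse[i / BITS_PER_WORD] |= 1UL << (i % BITS_PER_WORD);
    }

    /* Linear search for the first clear bit. */
    for (int i = 0; i < MAX_IDS; i++)
        if (!(inuse[i / BITS_PER_WORD] & (1UL << (i % BITS_PER_WORD))))
            return i;

    return -1;
}
```

With a list made up only of names such as "en3f0pf0sf1", the bitmap stays
empty and the function returns 0 after touching every entry, which matches
the "eth0 every time" behaviour described in the quoted message.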