From: Paolo Abeni <pabeni@redhat.com>
To: Jakub Kicinski <kuba@kernel.org>, davem@davemloft.net
Cc: netdev@vger.kernel.org, edumazet@google.com, mkubecek@suse.cz,
	lorenzo@kernel.org
Subject: Re: [PATCH net-next 1/2] net: store netdevs in an xarray
Date: Mon, 24 Jul 2023 10:18:04 +0200
Message-ID: <20788d4df9bbcdce9453be3fd047fdf8e0465714.camel@redhat.com>
In-Reply-To: <20230722014237.4078962-2-kuba@kernel.org>

On Fri, 2023-07-21 at 18:42 -0700, Jakub Kicinski wrote:
> Iterating over the netdev hash table for netlink dumps is hard.
> Dumps are done in "chunks" so we need to save the position
> after each chunk, so we know where to restart from. Because
> netdevs are stored in a hash table we remember which bucket
> we were in and how many devices we dumped.
>
> Since we don't hold any locks across the "chunks" - devices may
> come and go while we're dumping. If that happens we may miss
> a device (if device is deleted from the bucket we were in).
> We indicate to user space that this may have happened by setting
> NLM_F_DUMP_INTR. User space is supposed to dump again (I think)
> if it sees that. Somehow I doubt most user space gets this right..
>
> To illustrate let's look at an example:
>
> System state:
> start: # [A, B, C]
> del: B # [A, C]
>
> with the hash table we may dump [A, B], missing C completely even
> tho it existed both before and after the "del B".
>
> Add an xarray and use it to allocate ifindexes. This way we
> can iterate ifindexes in order, without the worry that we'll
> skip one. We may still generate a dump of a state which "never
> existed", for example for a set of values and sequence of ops:
>
> System state:
> start: # [A, B]
> add: C # [A, C, B]
> del: B # [A, C]
>
> we may generate a dump of [A], if C got an index between A and B.
> System has never been in such state. But I'm 90% sure that's perfectly
> fine, important part is that we can't _miss_ devices which exist before
> and after. User space which wants to mirror kernel's state subscribes
> to notifications and does periodic dumps so it will know that C exists
> from the notification about its creation or from the next dump
> (next dump is _guaranteed_ to include C, if it doesn't get removed).
>
> To avoid any perf regressions keep the hash table for now. Most
> net namespaces have very few devices and microbenchmarking 1M lookups
> on Skylake I get the following results (not counting loopback
> to number of devs):

A possibly dumb question: why use an xarray rather than a plain list? It
looks like the idea is to also use the xarray for device lookup, not
just for dumping?

WRT the above, have you considered instead replacing dev_name_head with
an rhashtable (and adding the mentioned list)?
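
For reference, the two restart schemes described in the quoted commit
message can be modelled in userspace roughly as below. This is only a
sketch under stated assumptions, not kernel code: dump_hash(),
dump_ordered() and the chunk sizes are hypothetical names and values
chosen for illustration, a single Python list stands in for one hash
bucket, and a dict keyed by ifindex stands in for the xarray.

```python
def dump_hash(bucket, cursor, chunk):
    """Resume a bucket dump by skipping the `cursor` entries already dumped."""
    out = bucket[cursor:cursor + chunk]
    return out, cursor + len(out)


def dump_ordered(devs, cursor, chunk):
    """Resume an ifindex-ordered dump at the first ifindex >= cursor."""
    out = []
    for ifindex in sorted(i for i in devs if i >= cursor):
        if len(out) == chunk:
            # This entry did not fit in the chunk: restart from it next time.
            return out, ifindex
        out.append(devs[ifindex])
        cursor = ifindex + 1
    return out, cursor


# Scenario 1, hash bucket: start [A, B, C], "del B" between two chunks.
bucket = ["A", "B", "C"]
first, cur = dump_hash(bucket, 0, 2)
bucket.remove("B")
second, cur = dump_hash(bucket, cur, 2)
hash_dump = first + second              # ['A', 'B'], C is missed

# Scenario 1, ordered by ifindex: same operations.
devs = {1: "A", 2: "B", 3: "C"}
first, cur = dump_ordered(devs, 0, 2)
del devs[2]
second, cur = dump_ordered(devs, cur, 2)
ordered_dump = first + second           # ['A', 'B', 'C'], C is not missed

# Scenario 2, ordered by ifindex: start [A, B], C is added with an
# ifindex between A and B, then B is deleted, all between two chunks.
devs2 = {1: "A", 3: "B"}
first, cur = dump_ordered(devs2, 0, 1)
devs2[2] = "C"
del devs2[3]
second, cur = dump_ordered(devs2, cur, 1)
never_existed = first + second          # ['A'], a state that never existed

# A fresh dump afterwards does include C.
next_dump, _ = dump_ordered(devs2, 0, 10)

print(hash_dump, ordered_dump, never_existed, next_dump)
```

The model only shows why a cursor expressed as an ifindex survives
deletions, while a "bucket plus count" cursor does not. It says nothing
about lookup cost, which is what the question above is about.
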
Cheers,
Paolo