From: Jakub Kicinski <kuba@kernel.org>
To: I Viswanath <viswanathiyyappan@gmail.com>
Cc: davem@davemloft.net, edumazet@google.com, pabeni@redhat.com,
horms@kernel.org, sdf@fomichev.me, kuniyu@google.com,
ahmed.zaki@intel.com, aleksander.lobakin@intel.com,
jacob.e.keller@intel.com, netdev@vger.kernel.org,
linux-kernel@vger.kernel.org, skhan@linuxfoundation.org,
linux-kernel-mentees@lists.linux.dev,
david.hunter.linux@gmail.com, khalid@kernel.org
Subject: Re: [RFC/RFT PATCH net-next v3 1/2] net: Add ndo_write_rx_config and helper structs and functions:
Date: Tue, 4 Nov 2025 16:46:25 -0800 [thread overview]
Message-ID: <20251104164625.5a18db43@kernel.org> (raw)
In-Reply-To: <CAPrAcgPD0ZPNPOivpX=69qC88-AAeW+Jy=oy-6+PP8jDxzNabA@mail.gmail.com>
On Tue, 4 Nov 2025 22:13:49 +0530 I Viswanath wrote:
> On Fri, 31 Oct 2025 at 07:50, Jakub Kicinski <kuba@kernel.org> wrote:
> > The driver you picked is relatively trivial, advanced drivers need
> > to sync longer lists of mcast / ucast addresses. Bulk of the complexity
> > is in keeping those lists. Simple
> >
> > *rx_config = *(config_ptr);
> >
> > assignment is not enough.
>
> Apologies, I had the wrong mental model of the snapshot.
>
> From what I understand, the snapshot should look something like
>
> struct netif_rx_config {
> char *uc_addrs; // of size uc_count * dev->addr_len
> char *mc_addrs; // of size mc_count * dev->addr_len
> int uc_count;
> int mc_count;
> bool multi_en, promisc_en, vlan_en;
> void *device_specific_config;
> }
> Correct me if I have missed anything
>
> Does the following pseudocode/skeleton make sense?
>
> update_config() will be called at end of set_rx_mode()
>
> read_config() is execute_write_rx_config() and do_io() is
> dev->netdev_ops->ndo_write_rx_config() named that way
> for consistency (since read/update)
>
> atomic_t cfg_in_use = ATOMIC_INIT(false);
> atomic_t cfg_update_pending = ATOMIC_INIT(false);
>
> struct netif_rx_config *active, *staged;
>
> void update_config()
> {
> int was_config_pending = atomic_xchg(&cfg_update_pending, false);
>
> // If prepare_config fails, it leaves staged untouched
> // So, we check for and apply if pending update
> int rc = prepare_config(&staged);
> if (rc && !was_config_pending)
> return;
>
> if (atomic_read(&cfg_in_use)) {
> atomic_set(&cfg_update_pending, true);
> return;
> }
> swap(active, staged);
> }
>
> void read_config()
> {
> atomic_set(&cfg_in_use, true);
> do_io(active);
> atomic_set(&cfg_in_use, false);
>
> // To account for the edge case where update_config() is called
> // during the execution of read_config() and there are no subsequent
> // calls to update_config()
> if (atomic_xchg(&cfg_update_pending, false))
> swap(active, staged);
> }
I wouldn't use atomic flags. IIRC ndo_set_rx_mode is called under
netif_addr_lock_bh(), so we can reuse that lock, have update_config()
assume ownership of the pending config and update it directly.
And read_config() (which IIUC runs from a wq) can take that lock
briefly, and swap which config is pending.
> >The driver needs to know old and new entries
> > and send ADD/DEL commands to FW. Converting virtio_net would be better,
> > but it does one huge dump which is also not representative of most
> > advanced NICs.
>
> We can definitely do this in prepare_config()
> Speaking of which, How big can uc_count and mc_count be?
>
> Would krealloc(buffer, uc_count * dev->addr_len, GFP_ATOMIC) be a good idea?
Not sure about the max value but I'd think low thousands is probably
a good target. IOW yes, I think one linear buffer may be a concern.
I'd think order 1 allocation may be fine tho..
> Well, virtio-net does kmalloc((uc_count + mc_count) * ETH_ALEN) + ...,
> GFP_ATOMIC),
> so this shouldn't introduce any new failures for virtio-net
Right but IDK if virtio is used on systems with the same sort of scale
as a large physical function driver..
> > Let's only allocate any extra state if driver has the NDO
> > We need to shut down sooner, some time between ndo_stop and ndo_uninit
>
> Would it make sense to move init (if ndo exists) and cleanup to
> __dev_open and __dev_close?
Yes, indeed.
next prev parent reply other threads:[~2025-11-05 0:46 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-10-28 17:42 [RFC/RFT PATCH net-next v3 0/2] net: Split ndo_set_rx_mode into snapshot and deferred write I Viswanath
2025-10-28 17:42 ` [RFC/RFT PATCH net-next v3 1/2] net: Add ndo_write_rx_config and helper structs and functions: I Viswanath
2025-10-31 2:20 ` Jakub Kicinski
2025-11-04 16:43 ` I Viswanath
2025-11-05 0:46 ` Jakub Kicinski [this message]
2025-11-06 16:08 ` I Viswanath
2025-11-06 22:12 ` Jakub Kicinski
2025-11-06 20:03 ` I Viswanath
2025-10-28 17:42 ` [RFC/RFT PATCH net-next v3 2/2] net: ethernet: Implement ndo_write_rx_config callback for the 8139cp driver I Viswanath
2025-10-29 4:48 ` [RFC/RFT PATCH net-next v3 0/2] net: Split ndo_set_rx_mode into snapshot and deferred write I Viswanath
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20251104164625.5a18db43@kernel.org \
--to=kuba@kernel.org \
--cc=ahmed.zaki@intel.com \
--cc=aleksander.lobakin@intel.com \
--cc=davem@davemloft.net \
--cc=david.hunter.linux@gmail.com \
--cc=edumazet@google.com \
--cc=horms@kernel.org \
--cc=jacob.e.keller@intel.com \
--cc=khalid@kernel.org \
--cc=kuniyu@google.com \
--cc=linux-kernel-mentees@lists.linux.dev \
--cc=linux-kernel@vger.kernel.org \
--cc=netdev@vger.kernel.org \
--cc=pabeni@redhat.com \
--cc=sdf@fomichev.me \
--cc=skhan@linuxfoundation.org \
--cc=viswanathiyyappan@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).