From: Jakub Kicinski <kuba@kernel.org>
To: I Viswanath <viswanathiyyappan@gmail.com>
Cc: davem@davemloft.net, edumazet@google.com, pabeni@redhat.com,
horms@kernel.org, sdf@fomichev.me, kuniyu@google.com,
ahmed.zaki@intel.com, aleksander.lobakin@intel.com,
jacob.e.keller@intel.com, netdev@vger.kernel.org,
linux-kernel@vger.kernel.org, skhan@linuxfoundation.org,
linux-kernel-mentees@lists.linux.dev,
david.hunter.linux@gmail.com, khalid@kernel.org
Subject: Re: [RFC/RFT PATCH net-next v3 1/2] net: Add ndo_write_rx_config and helper structs and functions:
Date: Tue, 4 Nov 2025 16:46:25 -0800 [thread overview]
Message-ID: <20251104164625.5a18db43@kernel.org> (raw)
In-Reply-To: <CAPrAcgPD0ZPNPOivpX=69qC88-AAeW+Jy=oy-6+PP8jDxzNabA@mail.gmail.com>
On Tue, 4 Nov 2025 22:13:49 +0530 I Viswanath wrote:
> On Fri, 31 Oct 2025 at 07:50, Jakub Kicinski <kuba@kernel.org> wrote:
> > The driver you picked is relatively trivial, advanced drivers need
> > to sync longer lists of mcast / ucast addresses. Bulk of the complexity
> > is in keeping those lists. Simple
> >
> > *rx_config = *(config_ptr);
> >
> > assignment is not enough.
>
> Apologies, I had the wrong mental model of the snapshot.
>
> From what I understand, the snapshot should look something like
>
> struct netif_rx_config {
> char *uc_addrs; // of size uc_count * dev->addr_len
> char *mc_addrs; // of size mc_count * dev->addr_len
> int uc_count;
> int mc_count;
> bool multi_en, promisc_en, vlan_en;
> void *device_specific_config;
> }
> Correct me if I have missed anything
>
> Does the following pseudocode/skeleton make sense?
>
> update_config() will be called at end of set_rx_mode()
>
> read_config() is execute_write_rx_config() and do_io() is
> dev->netdev_ops->ndo_write_rx_config() named that way
> for consistency (since read/update)
>
> atomic_t cfg_in_use = ATOMIC_INIT(false);
> atomic_t cfg_update_pending = ATOMIC_INIT(false);
>
> struct netif_rx_config *active, *staged;
>
> void update_config()
> {
> int was_config_pending = atomic_xchg(&cfg_update_pending, false);
>
> // If prepare_config fails, it leaves staged untouched
> // So, we check for and apply if pending update
> int rc = prepare_config(&staged);
> if (rc && !was_config_pending)
> return;
>
> if (atomic_read(&cfg_in_use)) {
> atomic_set(&cfg_update_pending, true);
> return;
> }
> swap(active, staged);
> }
>
> void read_config()
> {
> atomic_set(&cfg_in_use, true);
> do_io(active);
> atomic_set(&cfg_in_use, false);
>
> // To account for the edge case where update_config() is called
> // during the execution of read_config() and there are no subsequent
> // calls to update_config()
> if (atomic_xchg(&cfg_update_pending, false))
> swap(active, staged);
> }
I wouldn't use atomic flags. IIRC ndo_set_rx_mode is called under
netif_addr_lock_bh(), so we can reuse that lock, have update_config()
assume ownership of the pending config and update it directly.
And read_config() (which IIUC runs from a wq) can take that lock
briefly, and swap which config is pending.
> >The driver needs to know old and new entries
> > and send ADD/DEL commands to FW. Converting virtio_net would be better,
> > but it does one huge dump which is also not representative of most
> > advanced NICs.
>
> We can definitely do this in prepare_config()
> Speaking of which, How big can uc_count and mc_count be?
>
> Would krealloc(buffer, uc_count * dev->addr_len, GFP_ATOMIC) be a good idea?
Not sure about the max value but I'd think low thousands is probably
a good target. IOW yes, I think one linear buffer may be a concern.
I'd think order 1 allocation may be fine tho..
> Well, virtio-net does kmalloc((uc_count + mc_count) * ETH_ALEN) + ...,
> GFP_ATOMIC),
> so this shouldn't introduce any new failures for virtio-net
Right but IDK if virtio is used on systems with the same sort of scale
as a large physical function driver..
> > Let's only allocate any extra state if driver has the NDO
> > We need to shut down sooner, some time between ndo_stop and ndo_uninit
>
> Would it make sense to move init (if ndo exists) and cleanup to
> __dev_open and __dev_close?
Yes, indeed.
next prev parent reply other threads:[~2025-11-05 0:46 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-10-28 17:42 [RFC/RFT PATCH net-next v3 0/2] net: Split ndo_set_rx_mode into snapshot and deferred write I Viswanath
2025-10-28 17:42 ` [RFC/RFT PATCH net-next v3 1/2] net: Add ndo_write_rx_config and helper structs and functions: I Viswanath
2025-10-31 2:20 ` Jakub Kicinski
2025-11-04 16:43 ` I Viswanath
2025-11-05 0:46 ` Jakub Kicinski [this message]
2025-11-06 16:08 ` I Viswanath
2025-11-06 22:12 ` Jakub Kicinski
2025-11-06 20:03 ` I Viswanath
2025-10-28 17:42 ` [RFC/RFT PATCH net-next v3 2/2] net: ethernet: Implement ndo_write_rx_config callback for the 8139cp driver I Viswanath
2025-10-29 4:48 ` [RFC/RFT PATCH net-next v3 0/2] net: Split ndo_set_rx_mode into snapshot and deferred write I Viswanath
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20251104164625.5a18db43@kernel.org \
--to=kuba@kernel.org \
--cc=ahmed.zaki@intel.com \
--cc=aleksander.lobakin@intel.com \
--cc=davem@davemloft.net \
--cc=david.hunter.linux@gmail.com \
--cc=edumazet@google.com \
--cc=horms@kernel.org \
--cc=jacob.e.keller@intel.com \
--cc=khalid@kernel.org \
--cc=kuniyu@google.com \
--cc=linux-kernel-mentees@lists.linux.dev \
--cc=linux-kernel@vger.kernel.org \
--cc=netdev@vger.kernel.org \
--cc=pabeni@redhat.com \
--cc=sdf@fomichev.me \
--cc=skhan@linuxfoundation.org \
--cc=viswanathiyyappan@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.