From: Cosmin Ratiu <cratiu@nvidia.com>
To: "jacob.e.keller@intel.com" <jacob.e.keller@intel.com>,
Tariq Toukan <tariqt@nvidia.com>,
"kuba@kernel.org" <kuba@kernel.org>
Cc: "allison.henderson@oracle.com" <allison.henderson@oracle.com>,
"jiri@resnulli.us" <jiri@resnulli.us>,
Moshe Shemesh <moshe@nvidia.com>,
"davem@davemloft.net" <davem@davemloft.net>,
"daniel.zahka@gmail.com" <daniel.zahka@gmail.com>,
"donald.hunter@gmail.com" <donald.hunter@gmail.com>,
"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
"matttbe@kernel.org" <matttbe@kernel.org>,
"pabeni@redhat.com" <pabeni@redhat.com>,
"horms@kernel.org" <horms@kernel.org>,
Parav Pandit <parav@nvidia.com>,
"corbet@lwn.net" <corbet@lwn.net>,
"razor@blackwall.org" <razor@blackwall.org>,
Dragos Tatulea <dtatulea@nvidia.com>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"willemb@google.com" <willemb@google.com>,
Jiri Pirko <jiri@nvidia.com>,
Adithya Jayachandran <ajayachandra@nvidia.com>,
Dan Jurgens <danielj@nvidia.com>,
"leon@kernel.org" <leon@kernel.org>,
"kees@kernel.org" <kees@kernel.org>,
"vadim.fedorenko@linux.dev" <vadim.fedorenko@linux.dev>,
Saeed Mahameed <saeedm@nvidia.com>,
"shuah@kernel.org" <shuah@kernel.org>,
"andrew+netdev@lunn.ch" <andrew+netdev@lunn.ch>,
Mark Bloch <mbloch@nvidia.com>,
Shahar Shitrit <shshitrit@nvidia.com>,
Carolina Jubran <cjubran@nvidia.com>,
Nimrod Oren <noren@nvidia.com>,
"daniel@iogearbox.net" <daniel@iogearbox.net>,
"minhquangbui99@gmail.com" <minhquangbui99@gmail.com>,
"dw@davidwei.uk" <dw@davidwei.uk>,
"skhan@linuxfoundation.org" <skhan@linuxfoundation.org>,
Petr Machata <petrm@nvidia.com>,
"edumazet@google.com" <edumazet@google.com>,
"antonio@openvpn.net" <antonio@openvpn.net>,
"mst@redhat.com" <mst@redhat.com>,
"linux-kselftest@vger.kernel.org"
<linux-kselftest@vger.kernel.org>,
"linux-rdma@vger.kernel.org" <linux-rdma@vger.kernel.org>,
Shay Drori <shayd@nvidia.com>,
"sdf@fomichev.me" <sdf@fomichev.me>,
"chuck.lever@oracle.com" <chuck.lever@oracle.com>,
Gal Pressman <gal@nvidia.com>, "joe@dama.to" <joe@dama.to>,
"linux-doc@vger.kernel.org" <linux-doc@vger.kernel.org>
Subject: Re: [PATCH net-next V9 02/14] devlink: Add helpers to lock nested-in instances
Date: Wed, 1 Apr 2026 10:22:53 +0000
Message-ID: <4fcbf36799b5bfd5c0b68b0127f4f67aef00fdde.camel@nvidia.com>
In-Reply-To: <c547be19-adaf-4442-be2b-debcbafa4191@intel.com>
On Tue, 2026-03-31 at 16:55 -0700, Jacob Keller wrote:
> On 3/31/2026 5:20 AM, Cosmin Ratiu wrote:
> > On Mon, 2026-03-30 at 19:08 -0700, Jakub Kicinski wrote:
> > > This is an AI-generated review of your patch. The human sending this
> > > email has considered the AI review valid, or at least plausible.
> > > ---
> > > devlink: Add helpers to lock nested-in instances
> > >
> > > Upcoming code will need to obtain a reference to locked nested-in
> > > devlink instances. Add helpers to lock, obtain an already locked
> > > reference and unlock/unref the nested-in instance.
> > >
> > > > diff --git a/net/devlink/core.c b/net/devlink/core.c
> > > > index eeb6a71f5f56e..db11248df7120 100644
> > > > --- a/net/devlink/core.c
> > > > +++ b/net/devlink/core.c
> > >
> > > [ ... ]
> > >
> > > > +/* Returns the nested in devlink object and validates its lock is held. */
> > > > +struct devlink *devlink_nested_in_get_locked(struct devlink_rel *rel)
> > > > +{
> > > > +	struct devlink *devlink;
> > > > +	unsigned long index;
> > > > +
> > > > +	if (!rel)
> > > > +		return NULL;
> > > > +	index = rel->nested_in.devlink_index;
> > > > +	devlink = xa_find(&devlinks, &index, index, DEVLINK_REGISTERED);
> > > > +	if (devlink)
> > > > +		devl_assert_locked(devlink);
> > >
> > > Can this dereference a freed pointer if the lock is not actually held?
> > >
> > > xa_find() acquires and releases the RCU read lock internally. When it
> > > returns, the devlink pointer is no longer protected by RCU or a
> > > reference count.
> > >
> > > If a caller invokes this helper without holding the devlink lock
> > > (which is the exact scenario the assertion is designed to detect),
> > > another thread could concurrently unregister and free the devlink.
> > >
> > > If the RCU grace period expires before the assertion executes,
> > > devl_assert_locked() would dereference freed memory. Should the lookup
> > > and assertion be enclosed within an explicit rcu_read_lock() block to
> > > safely assert the lock?
> >
> > This function is meant to be invoked by callers already holding all
> > locks from a given instance up the nesting tree. Calling it outside
> > this context could indeed lead to a race as described, where another
> > entity unregisters a devlink about-to-be-asserted on.
> >
>
> Hmm. I'm struggling to follow this. If you already expect the parent to
> hold the nested devlink's lock, it must have a pointer to this devlink
> instance. In that case, why would you even need
> devlink_nested_in_get_locked in the first place?
After some more intense staring, I realized that intermediate instances
don't actually need to be locked, only the ancestor does. With that in
mind, the code gets simplified:
- devlink_nested_in_get_locked and devlink_nested_in_put_unlock can be
  removed.
- recursive unlocking in devl_rate_unlock is gone.
>
> > All current callers hold the locks, but to make it more robust, I
> > will add the rcu_read_lock/unlock section around the find+assert.
> >
>
> You should at least do that. It is the correct pattern when dealing
> with reference counting data structures from the xarray. That's
> already how devlinks_xa_get() is implemented.
>
> > >
> > > > + return devlink;
> > > > +}
> > > > +
> > > > +void devlink_nested_in_put_unlock(struct devlink_rel *rel)
> > > > +{
> > > > +	struct devlink *devlink = devlink_nested_in_get_locked(rel);
> > >
> > > Could this corrupt the mutex state and cause a refcount underflow?
> > >
> > > This helper takes a devlink_rel pointer instead of the devlink
> > > pointer acquired by devlink_nested_in_get_lock(), and performs a
> > > secondary global lookup to find the devlink.
> > >
> > > If a caller mistakenly calls this in an error cleanup path where they
> > > did not actually acquire the lock, the global xa_find() will still
> > > locate the registered devlink. This would execute devl_unlock() and
> > > devlink_put() on a devlink the current thread does not own.
> > >
>
> If the caller already held the lock, why is devlink_nested_in_put_unlock
> calling devl_unlock instead of the caller anyway? That seems confusing.
> Wouldn't the normal pattern be to devlink_nested_in_get_lock()? Oh, that
> is a separate function. OK, I see.
>
> > > Would it be safer for unlock/put helpers to take the exact pointer
> > > returned by the lock/get helper to ensure safe resource cleanup?
> >
> > 2 issues here:
> > 1) Mistakenly calling this without having acquired the lock. This is
> > akin to saying mutex_unlock is dangerous if the lock isn't held.
> > Technically true, but moot.
> > 2) The rel argument: it is intentional, so that all 3 functions are
> > symmetrical.
> >
>
> IMO it would make more sense for the put version to be a put on the
> returned devlink pointer. I guess it's not symmetrical, but it removes
> the need to perform the second lookup and makes it easier to reason
> about the pointer you're releasing being the same one.
>
> Having put take different arguments from get is the usual pattern for
> such a behavior.
>
> Also, devlink_nested_in_get_locked() doesn't increase the ref count, so
> it is sort of "relying" on the caller already having a reference, which
> makes me think it's not very useful. The only way to safely call this
> function as it exists now is to already hold a reference to the object,
> which also already requires you to have a valid pointer, making me
> wonder why you'd ever need to call it in the first place.
>
> The only example you have is to make devlink_nested_in_put_unlock()
> take a devlink_rel pointer as its argument instead of just calling it
> on the pointer returned by devlink_nested_in_get_lock().
>
> This implementation seems confusing and likely to lead to errors.
I hope the next version will be more suitable.
Thank you for the comments and suggestions.
>
> Thanks,
> Jake