public inbox for linux-doc@vger.kernel.org
 help / color / mirror / Atom feed
From: Cosmin Ratiu <cratiu@nvidia.com>
To: "jacob.e.keller@intel.com" <jacob.e.keller@intel.com>,
	Tariq Toukan <tariqt@nvidia.com>,
	"kuba@kernel.org" <kuba@kernel.org>
Cc: "allison.henderson@oracle.com" <allison.henderson@oracle.com>,
	"jiri@resnulli.us" <jiri@resnulli.us>,
	Moshe Shemesh <moshe@nvidia.com>,
	"davem@davemloft.net" <davem@davemloft.net>,
	"daniel.zahka@gmail.com" <daniel.zahka@gmail.com>,
	"donald.hunter@gmail.com" <donald.hunter@gmail.com>,
	"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
	"matttbe@kernel.org" <matttbe@kernel.org>,
	"pabeni@redhat.com" <pabeni@redhat.com>,
	"horms@kernel.org" <horms@kernel.org>,
	Parav Pandit <parav@nvidia.com>,
	"corbet@lwn.net" <corbet@lwn.net>,
	"razor@blackwall.org" <razor@blackwall.org>,
	Dragos Tatulea <dtatulea@nvidia.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"willemb@google.com" <willemb@google.com>,
	Jiri Pirko <jiri@nvidia.com>,
	Adithya Jayachandran <ajayachandra@nvidia.com>,
	Dan Jurgens <danielj@nvidia.com>,
	"leon@kernel.org" <leon@kernel.org>,
	"kees@kernel.org" <kees@kernel.org>,
	"vadim.fedorenko@linux.dev" <vadim.fedorenko@linux.dev>,
	Saeed Mahameed <saeedm@nvidia.com>,
	"shuah@kernel.org" <shuah@kernel.org>,
	"andrew+netdev@lunn.ch" <andrew+netdev@lunn.ch>,
	Mark Bloch <mbloch@nvidia.com>,
	Shahar Shitrit <shshitrit@nvidia.com>,
	Carolina Jubran <cjubran@nvidia.com>,
	Nimrod Oren <noren@nvidia.com>,
	"daniel@iogearbox.net" <daniel@iogearbox.net>,
	"minhquangbui99@gmail.com" <minhquangbui99@gmail.com>,
	"dw@davidwei.uk" <dw@davidwei.uk>,
	"skhan@linuxfoundation.org" <skhan@linuxfoundation.org>,
	Petr Machata <petrm@nvidia.com>,
	"edumazet@google.com" <edumazet@google.com>,
	"antonio@openvpn.net" <antonio@openvpn.net>,
	"mst@redhat.com" <mst@redhat.com>,
	"linux-kselftest@vger.kernel.org"
	<linux-kselftest@vger.kernel.org>,
	"linux-rdma@vger.kernel.org" <linux-rdma@vger.kernel.org>,
	Shay Drori <shayd@nvidia.com>,
	"sdf@fomichev.me" <sdf@fomichev.me>,
	"chuck.lever@oracle.com" <chuck.lever@oracle.com>,
	Gal Pressman <gal@nvidia.com>, "joe@dama.to" <joe@dama.to>,
	"linux-doc@vger.kernel.org" <linux-doc@vger.kernel.org>
Subject: Re: [PATCH net-next V9 02/14] devlink: Add helpers to lock nested-in instances
Date: Wed, 1 Apr 2026 10:22:53 +0000	[thread overview]
Message-ID: <4fcbf36799b5bfd5c0b68b0127f4f67aef00fdde.camel@nvidia.com> (raw)
In-Reply-To: <c547be19-adaf-4442-be2b-debcbafa4191@intel.com>

On Tue, 2026-03-31 at 16:55 -0700, Jacob Keller wrote:
> On 3/31/2026 5:20 AM, Cosmin Ratiu wrote:
> > On Mon, 2026-03-30 at 19:08 -0700, Jakub Kicinski wrote:
> > > This is an AI-generated review of your patch. The human sending
> > > this
> > > email has considered the AI review valid, or at least plausible.
> > > ---
> > > devlink: Add helpers to lock nested-in instances
> > > 
> > > Upcoming code will need to obtain a reference to locked nested-in
> > > devlink instances. Add helpers to lock, obtain an already locked
> > > reference and unlock/unref the nested-in instance.
> > > 
> > > > diff --git a/net/devlink/core.c b/net/devlink/core.c
> > > > index eeb6a71f5f56e..db11248df7120 100644
> > > > --- a/net/devlink/core.c
> > > > +++ b/net/devlink/core.c
> > > 
> > > [ ... ]
> > > 
> > > > +/* Returns the nested in devlink object and validates its lock
> > > > is
> > > > held. */
> > > > +struct devlink *devlink_nested_in_get_locked(struct
> > > > devlink_rel
> > > > *rel)
> > > > +{
> > > > +	struct devlink *devlink;
> > > > +	unsigned long index;
> > > > +
> > > > +	if (!rel)
> > > > +		return NULL;
> > > > +	index = rel->nested_in.devlink_index;
> > > > +	devlink = xa_find(&devlinks, &index, index,
> > > > DEVLINK_REGISTERED);
> > > > +	if (devlink)
> > > > +		devl_assert_locked(devlink);
> > > 
> > > Can this dereference a freed pointer if the lock is not actually
> > > held?
> > > 
> > > xa_find() acquires and releases the RCU read lock internally.
> > > When it
> > > returns,
> > > the devlink pointer is no longer protected by RCU or a reference
> > > count.
> > > 
> > > If a caller invokes this helper without holding the devlink lock
> > > (which is the
> > > exact scenario the assertion is designed to detect), another
> > > thread
> > > could
> > > concurrently unregister and free the devlink.
> > > 
> > > If the RCU grace period expires before the assertion executes,
> > > devl_assert_locked() would dereference freed memory. Should the
> > > lookup and
> > > assertion be enclosed within an explicit rcu_read_lock() block to
> > > safely
> > > assert the lock?
> > 
> > This function is meant to be invoked by callers already holding all
> > locks from a given instance up the nesting tree. Calling it outside
> > this context could indeed lead to a race as described, where
> > another
> > entity unregisters a devlink about-to-be-asserted on.
> > 
> 
> Hmm. I'm struggling to follow this. If you already expect the parent
> to
> hold the nested devlink's lock, it must have a pointer to this
> devlink
> instance. In that case, why would you even need
> devlink_nested_in_get_locked in the first place?

After some more intense staring, I realized that intermediate instances
don't actually need to be locked, only the ancestor needs to. With that
in mind, the code get simplified:
- devlink_nested_in_get_locked and devlink_nested_in_put_unlock can be
removed.
- recursive unlocking in devl_rate_unlock is gone.

> 
> > All current callers hold the locks, but to make it more robust, I
> > will
> > add the rcu_read_lock/unlock section around the find+assert.
> > 
> 
> You should at least do that. It is the correct pattern when dealing
> with
> reference counting data structures from the xarray. That's already
> how
> devlinks_xa_get() is implemented.
> 
> > > 
> > > > +	return devlink;
> > > > +}
> > > > +
> > > > +void devlink_nested_in_put_unlock(struct devlink_rel *rel)
> > > > +{
> > > > +	struct devlink *devlink =
> > > > devlink_nested_in_get_locked(rel);
> > > 
> > > Could this corrupt the mutex state and cause a refcount
> > > underflow?
> > > 
> > > This helper takes a devlink_rel pointer instead of the devlink
> > > pointer
> > > acquired by devlink_nested_in_get_lock(), and performs a
> > > secondary
> > > global
> > > lookup to find the devlink.
> > > 
> > > If a caller mistakenly calls this in an error cleanup path where
> > > they
> > > did not
> > > actually acquire the lock, the global xa_find() will still locate
> > > the
> > > registered devlink. This would execute devl_unlock() and
> > > devlink_put() on a
> > > devlink the current thread does not own.
> > > 
> 
> If the caller already held the lock, why is
> devlink_nested_in_put_unlock
> calling the devl_unlock instead of the caller anyways? That seems
> confusing. Wouldn't the normal pattern be to
> devlink_nested_in_get_lock()? Oh, that is a separate function. Ok I
> see.
> 
> > > Would it be safer for unlock/put helpers to take the exact
> > > pointer
> > > returned by
> > > the lock/get helper to ensure safe resource cleanup?
> > 
> > 2 issues here:
> > 1) Mistakenly calling this without having acquired the lock. This
> > is
> > akin to saying mutex_unlock is dangerous if the lock isn't held.
> > Technically true, but moot.
> > 2) The rel argument: It is intentional, so that all 3 functions are
> > symmetrical.
> > 
> 
> IMO it would make more sense for the put version to be a put on the
> returned devlink pointer. I guess its not symmetrical, but it removes
> the need to perform the second lookup and it makes it easier to
> reason
> about the pointer you're releasing being the same one.
> 
> Having put take different arguments from get is the usual pattern for
> such a behavior.
> 
> Also devlink_nested_in_get_locked() doesn't increase the ref count so
> it
> is sort of "relying" on the caller already having a reference to it,
> which makes me think its not very useful. The only valid way to call
> this function as it exists now safely is to already hold a reference
> to
> the object, which also already requires you to have a valid pointer
> making me wonder why you'd ever need to call it in the first place.
> 
> The only example you have is to make devlink_nested_in_put_unlock()
> take
> a devlink_rel pointer as its argument instead of just calling it on
> the
> pointer returned by devlink_nested_in_get_lock().
> 
> This implementation seems confusing and likely to lead to errors.

I hope the next version will be more suitable.
Thank you for the comments and suggestions.

> 
> Thanks,
> Jake


  reply	other threads:[~2026-04-01 10:23 UTC|newest]

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-03-26  6:59 [PATCH net-next V9 00/14] devlink and mlx5: Support cross-function rate scheduling Tariq Toukan
2026-03-26  6:59 ` [PATCH net-next V9 01/14] devlink: Update nested instance locking comment Tariq Toukan
2026-03-26  6:59 ` [PATCH net-next V9 02/14] devlink: Add helpers to lock nested-in instances Tariq Toukan
2026-03-31  2:08   ` Jakub Kicinski
2026-03-31 12:20     ` Cosmin Ratiu
2026-03-31 23:55       ` Jacob Keller
2026-04-01 10:22         ` Cosmin Ratiu [this message]
2026-04-01 20:18           ` Jacob Keller
2026-03-26  6:59 ` [PATCH net-next V9 03/14] devlink: Migrate from info->user_ptr to info->ctx Tariq Toukan
2026-03-26  6:59 ` [PATCH net-next V9 04/14] devlink: Decouple rate storage from associated devlink object Tariq Toukan
2026-03-31  2:08   ` Jakub Kicinski
2026-03-31 12:28     ` Cosmin Ratiu
2026-03-26  6:59 ` [PATCH net-next V9 05/14] devlink: Add parent dev to devlink API Tariq Toukan
2026-03-26  6:59 ` [PATCH net-next V9 06/14] devlink: Allow parent dev for rate-set and rate-new Tariq Toukan
2026-03-26  6:59 ` [PATCH net-next V9 07/14] devlink: Allow rate node parents from other devlinks Tariq Toukan
2026-03-31  2:08   ` Jakub Kicinski
2026-03-31 12:44     ` Cosmin Ratiu
2026-03-26  6:59 ` [PATCH net-next V9 08/14] net/mlx5: qos: Use mlx5_lag_query_bond_speed to query LAG speed Tariq Toukan
2026-03-26  6:59 ` [PATCH net-next V9 09/14] net/mlx5: qos: Expose a function to clear a vport's parent Tariq Toukan
2026-03-26  6:59 ` [PATCH net-next V9 10/14] net/mlx5: qos: Model the root node in the scheduling hierarchy Tariq Toukan
2026-03-26  6:59 ` [PATCH net-next V9 11/14] net/mlx5: qos: Remove qos domains and use shd lock Tariq Toukan
2026-03-31  2:08   ` Jakub Kicinski
2026-03-31 12:53     ` Cosmin Ratiu
2026-03-31 16:37       ` Cosmin Ratiu
2026-03-26  6:59 ` [PATCH net-next V9 12/14] net/mlx5: qos: Support cross-device tx scheduling Tariq Toukan
2026-03-31  2:08   ` Jakub Kicinski
2026-03-31 12:57     ` Cosmin Ratiu
2026-03-26  6:59 ` [PATCH net-next V9 13/14] selftests: drv-net: Add test for cross-esw rate scheduling Tariq Toukan
2026-03-26  6:59 ` [PATCH net-next V9 14/14] net/mlx5: Document devlink rates Tariq Toukan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4fcbf36799b5bfd5c0b68b0127f4f67aef00fdde.camel@nvidia.com \
    --to=cratiu@nvidia.com \
    --cc=ajayachandra@nvidia.com \
    --cc=allison.henderson@oracle.com \
    --cc=andrew+netdev@lunn.ch \
    --cc=antonio@openvpn.net \
    --cc=chuck.lever@oracle.com \
    --cc=cjubran@nvidia.com \
    --cc=corbet@lwn.net \
    --cc=daniel.zahka@gmail.com \
    --cc=daniel@iogearbox.net \
    --cc=danielj@nvidia.com \
    --cc=davem@davemloft.net \
    --cc=donald.hunter@gmail.com \
    --cc=dtatulea@nvidia.com \
    --cc=dw@davidwei.uk \
    --cc=edumazet@google.com \
    --cc=gal@nvidia.com \
    --cc=horms@kernel.org \
    --cc=jacob.e.keller@intel.com \
    --cc=jiri@nvidia.com \
    --cc=jiri@resnulli.us \
    --cc=joe@dama.to \
    --cc=kees@kernel.org \
    --cc=kuba@kernel.org \
    --cc=leon@kernel.org \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-kselftest@vger.kernel.org \
    --cc=linux-rdma@vger.kernel.org \
    --cc=matttbe@kernel.org \
    --cc=mbloch@nvidia.com \
    --cc=minhquangbui99@gmail.com \
    --cc=moshe@nvidia.com \
    --cc=mst@redhat.com \
    --cc=netdev@vger.kernel.org \
    --cc=noren@nvidia.com \
    --cc=pabeni@redhat.com \
    --cc=parav@nvidia.com \
    --cc=petrm@nvidia.com \
    --cc=razor@blackwall.org \
    --cc=saeedm@nvidia.com \
    --cc=sdf@fomichev.me \
    --cc=shayd@nvidia.com \
    --cc=shshitrit@nvidia.com \
    --cc=shuah@kernel.org \
    --cc=skhan@linuxfoundation.org \
    --cc=tariqt@nvidia.com \
    --cc=vadim.fedorenko@linux.dev \
    --cc=willemb@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox