public inbox for linux-doc@vger.kernel.org
From: Jacob Keller <jacob.e.keller@intel.com>
To: Cosmin Ratiu <cratiu@nvidia.com>,
	Tariq Toukan <tariqt@nvidia.com>,
	"kuba@kernel.org" <kuba@kernel.org>
Cc: "allison.henderson@oracle.com" <allison.henderson@oracle.com>,
	"Moshe Shemesh" <moshe@nvidia.com>,
	"jiri@resnulli.us" <jiri@resnulli.us>,
	"davem@davemloft.net" <davem@davemloft.net>,
	"daniel.zahka@gmail.com" <daniel.zahka@gmail.com>,
	"donald.hunter@gmail.com" <donald.hunter@gmail.com>,
	"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
	"matttbe@kernel.org" <matttbe@kernel.org>,
	"pabeni@redhat.com" <pabeni@redhat.com>,
	"horms@kernel.org" <horms@kernel.org>,
	Parav Pandit <parav@nvidia.com>,
	"corbet@lwn.net" <corbet@lwn.net>,
	"kees@kernel.org" <kees@kernel.org>,
	"willemb@google.com" <willemb@google.com>,
	Dragos Tatulea <dtatulea@nvidia.com>,
	"razor@blackwall.org" <razor@blackwall.org>,
	"Adithya Jayachandran" <ajayachandra@nvidia.com>,
	Dan Jurgens <danielj@nvidia.com>,
	"leon@kernel.org" <leon@kernel.org>,
	"vadim.fedorenko@linux.dev" <vadim.fedorenko@linux.dev>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	Saeed Mahameed <saeedm@nvidia.com>,
	"shuah@kernel.org" <shuah@kernel.org>,
	"andrew+netdev@lunn.ch" <andrew+netdev@lunn.ch>,
	Mark Bloch <mbloch@nvidia.com>, Jiri Pirko <jiri@nvidia.com>,
	Carolina Jubran <cjubran@nvidia.com>,
	Shahar Shitrit <shshitrit@nvidia.com>,
	"daniel@iogearbox.net" <daniel@iogearbox.net>,
	"minhquangbui99@gmail.com" <minhquangbui99@gmail.com>,
	Nimrod Oren <noren@nvidia.com>, "dw@davidwei.uk" <dw@davidwei.uk>,
	"skhan@linuxfoundation.org" <skhan@linuxfoundation.org>,
	Petr Machata <petrm@nvidia.com>,
	"edumazet@google.com" <edumazet@google.com>,
	"antonio@openvpn.net" <antonio@openvpn.net>,
	"mst@redhat.com" <mst@redhat.com>,
	"linux-kselftest@vger.kernel.org"
	<linux-kselftest@vger.kernel.org>,
	"linux-rdma@vger.kernel.org" <linux-rdma@vger.kernel.org>,
	Shay Drori <shayd@nvidia.com>,
	"sdf@fomichev.me" <sdf@fomichev.me>,
	"chuck.lever@oracle.com" <chuck.lever@oracle.com>,
	Gal Pressman <gal@nvidia.com>, "joe@dama.to" <joe@dama.to>,
	"linux-doc@vger.kernel.org" <linux-doc@vger.kernel.org>
Subject: Re: [PATCH net-next V9 02/14] devlink: Add helpers to lock nested-in instances
Date: Tue, 31 Mar 2026 16:55:20 -0700	[thread overview]
Message-ID: <c547be19-adaf-4442-be2b-debcbafa4191@intel.com> (raw)
In-Reply-To: <ff5b2ec46d6cb639872318bdde429c46cac77f5b.camel@nvidia.com>

On 3/31/2026 5:20 AM, Cosmin Ratiu wrote:
> On Mon, 2026-03-30 at 19:08 -0700, Jakub Kicinski wrote:
>> This is an AI-generated review of your patch. The human sending this
>> email has considered the AI review valid, or at least plausible.
>> ---
>> devlink: Add helpers to lock nested-in instances
>>
>> Upcoming code will need to obtain a reference to locked nested-in
>> devlink instances. Add helpers to lock, obtain an already locked
>> reference and unlock/unref the nested-in instance.
>>
>>> diff --git a/net/devlink/core.c b/net/devlink/core.c
>>> index eeb6a71f5f56e..db11248df7120 100644
>>> --- a/net/devlink/core.c
>>> +++ b/net/devlink/core.c
>>
>> [ ... ]
>>
>>> +/* Returns the nested in devlink object and validates its lock is held. */
>>> +struct devlink *devlink_nested_in_get_locked(struct devlink_rel *rel)
>>> +{
>>> +	struct devlink *devlink;
>>> +	unsigned long index;
>>> +
>>> +	if (!rel)
>>> +		return NULL;
>>> +	index = rel->nested_in.devlink_index;
>>> +	devlink = xa_find(&devlinks, &index, index, DEVLINK_REGISTERED);
>>> +	if (devlink)
>>> +		devl_assert_locked(devlink);
>>
>> Can this dereference a freed pointer if the lock is not actually
>> held?
>>
>> xa_find() acquires and releases the RCU read lock internally. When it
>> returns, the devlink pointer is no longer protected by RCU or a
>> reference count.
>>
>> If a caller invokes this helper without holding the devlink lock
>> (which is the exact scenario the assertion is designed to detect),
>> another thread could concurrently unregister and free the devlink.
>>
>> If the RCU grace period expires before the assertion executes,
>> devl_assert_locked() would dereference freed memory. Should the
>> lookup and assertion be enclosed within an explicit rcu_read_lock()
>> block to safely assert the lock?
> 
> This function is meant to be invoked by callers already holding all
> locks from a given instance up the nesting tree. Calling it outside
> this context could indeed lead to a race as described, where another
> entity unregisters a devlink about-to-be-asserted on.
> 

Hmm. I'm struggling to follow this. If you already expect the parent to
hold the nested-in devlink's lock, it must have a pointer to that
devlink instance. In that case, why would you even need
devlink_nested_in_get_locked() in the first place?

> All current callers hold the locks, but to make it more robust, I will
> add the rcu_read_lock/unlock section around the find+assert.
> 

You should at least do that. It is the correct pattern when dealing with
reference-counted data structures stored in the xarray, and it's already
how devlinks_xa_get() is implemented.
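For what it's worth, a sketch of that change (the posted code plus the
rcu_read_lock/unlock section Cosmin described; untested, kernel-internal
API):

```c
/* Sketch: hold the RCU read lock across the lookup and the assertion,
 * so the devlink cannot be freed between xa_find() and
 * devl_assert_locked(). Note the returned pointer still relies on the
 * caller already holding a reference.
 */
struct devlink *devlink_nested_in_get_locked(struct devlink_rel *rel)
{
	struct devlink *devlink;
	unsigned long index;

	if (!rel)
		return NULL;
	index = rel->nested_in.devlink_index;
	rcu_read_lock();
	devlink = xa_find(&devlinks, &index, index, DEVLINK_REGISTERED);
	if (devlink)
		devl_assert_locked(devlink);
	rcu_read_unlock();
	return devlink;
}
```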

>>
>>> +	return devlink;
>>> +}
>>> +
>>> +void devlink_nested_in_put_unlock(struct devlink_rel *rel)
>>> +{
>>> +	struct devlink *devlink = devlink_nested_in_get_locked(rel);
>>
>> Could this corrupt the mutex state and cause a refcount underflow?
>>
>> This helper takes a devlink_rel pointer instead of the devlink
>> pointer acquired by devlink_nested_in_get_lock(), and performs a
>> secondary global lookup to find the devlink.
>>
>> If a caller mistakenly calls this in an error cleanup path where they
>> did not actually acquire the lock, the global xa_find() will still
>> locate the registered devlink. This would execute devl_unlock() and
>> devlink_put() on a devlink the current thread does not own.
>>

If the caller already holds the lock, why does
devlink_nested_in_put_unlock() call devl_unlock() itself instead of
leaving that to the caller? That seems confusing. Wouldn't the normal
pattern be devlink_nested_in_get_lock()? Oh, that is a separate
function. OK, I see.

>> Would it be safer for unlock/put helpers to take the exact pointer
>> returned by
>> the lock/get helper to ensure safe resource cleanup?
> 
> 2 issues here:
> 1) Mistakenly calling this without having acquired the lock. This is
> akin to saying mutex_unlock is dangerous if the lock isn't held.
> Technically true, but moot.
> 2) The rel argument: It is intentional, so that all 3 functions are
> symmetrical.
> 

IMO it would make more sense for the put version to take the returned
devlink pointer. I guess it's not symmetrical, but it removes the need
to perform the second lookup, and it makes it easier to reason about the
pointer you're releasing being the same one you acquired.

Having put take a different argument from get is the usual pattern for
this kind of behavior.
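To illustrate, the variant I have in mind would look roughly like this
(hypothetical signature, not part of the posted series; devl_unlock()
and devlink_put() as in existing devlink code):

```c
/* Hypothetical alternative: release exactly the instance returned by
 * devlink_nested_in_get_lock(), with no second xarray lookup.
 */
void devlink_nested_in_put_unlock(struct devlink *devlink)
{
	if (!devlink)
		return;
	devl_unlock(devlink);
	devlink_put(devlink);
}
```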

Also, devlink_nested_in_get_locked() doesn't increase the refcount, so
it is sort of "relying" on the caller already holding a reference, which
makes me think it's not very useful. The only way to call this function
safely, as it exists now, is to already hold a reference to the object,
which in turn requires you to already have a valid pointer, making me
wonder why you'd ever need to call it in the first place.

The only use you have for it is making devlink_nested_in_put_unlock()
take a devlink_rel pointer as its argument instead of just calling it on
the pointer returned by devlink_nested_in_get_lock().

This implementation seems confusing and likely to lead to errors.

Thanks,
Jake

Thread overview: 29+ messages
2026-03-26  6:59 [PATCH net-next V9 00/14] devlink and mlx5: Support cross-function rate scheduling Tariq Toukan
2026-03-26  6:59 ` [PATCH net-next V9 01/14] devlink: Update nested instance locking comment Tariq Toukan
2026-03-26  6:59 ` [PATCH net-next V9 02/14] devlink: Add helpers to lock nested-in instances Tariq Toukan
2026-03-31  2:08   ` Jakub Kicinski
2026-03-31 12:20     ` Cosmin Ratiu
2026-03-31 23:55       ` Jacob Keller [this message]
2026-04-01 10:22         ` Cosmin Ratiu
2026-04-01 20:18           ` Jacob Keller
2026-03-26  6:59 ` [PATCH net-next V9 03/14] devlink: Migrate from info->user_ptr to info->ctx Tariq Toukan
2026-03-26  6:59 ` [PATCH net-next V9 04/14] devlink: Decouple rate storage from associated devlink object Tariq Toukan
2026-03-31  2:08   ` Jakub Kicinski
2026-03-31 12:28     ` Cosmin Ratiu
2026-03-26  6:59 ` [PATCH net-next V9 05/14] devlink: Add parent dev to devlink API Tariq Toukan
2026-03-26  6:59 ` [PATCH net-next V9 06/14] devlink: Allow parent dev for rate-set and rate-new Tariq Toukan
2026-03-26  6:59 ` [PATCH net-next V9 07/14] devlink: Allow rate node parents from other devlinks Tariq Toukan
2026-03-31  2:08   ` Jakub Kicinski
2026-03-31 12:44     ` Cosmin Ratiu
2026-03-26  6:59 ` [PATCH net-next V9 08/14] net/mlx5: qos: Use mlx5_lag_query_bond_speed to query LAG speed Tariq Toukan
2026-03-26  6:59 ` [PATCH net-next V9 09/14] net/mlx5: qos: Expose a function to clear a vport's parent Tariq Toukan
2026-03-26  6:59 ` [PATCH net-next V9 10/14] net/mlx5: qos: Model the root node in the scheduling hierarchy Tariq Toukan
2026-03-26  6:59 ` [PATCH net-next V9 11/14] net/mlx5: qos: Remove qos domains and use shd lock Tariq Toukan
2026-03-31  2:08   ` Jakub Kicinski
2026-03-31 12:53     ` Cosmin Ratiu
2026-03-31 16:37       ` Cosmin Ratiu
2026-03-26  6:59 ` [PATCH net-next V9 12/14] net/mlx5: qos: Support cross-device tx scheduling Tariq Toukan
2026-03-31  2:08   ` Jakub Kicinski
2026-03-31 12:57     ` Cosmin Ratiu
2026-03-26  6:59 ` [PATCH net-next V9 13/14] selftests: drv-net: Add test for cross-esw rate scheduling Tariq Toukan
2026-03-26  6:59 ` [PATCH net-next V9 14/14] net/mlx5: Document devlink rates Tariq Toukan
