Re: [PATCH net 1/2] dpll: fix NULL deref in dpll_device_ops() during teardown race

Netdev List
 help / color / mirror / Atom feed

From: Jakub Kicinski <kuba@kernel.org>
To: poros@redhat.com
Cc: Jakub Kicinski <kuba@kernel.org>,
	netdev@vger.kernel.org, ivecera@redhat.com,
	vadim.fedorenko@linux.dev, arkadiusz.kubalewski@intel.com,
	jiri@resnulli.us, milena.olech@intel.com, davem@davemloft.net,
	michal.michalik@intel.com, linux-kernel@vger.kernel.org
Subject: Re: [PATCH net 1/2] dpll: fix NULL deref in dpll_device_ops() during teardown race
Date: Wed, 20 May 2026 19:19:53 -0700	[thread overview]
Message-ID: <20260521021953.688132-1-kuba@kernel.org> (raw)
In-Reply-To: <20260516191317.1005612-2-poros@redhat.com>

This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.
---
dpll: fix NULL deref in dpll_device_ops() during teardown race

This patch tolerates an empty registration_list on a still-indexed dpll
by dropping a WARN_ON in dpll_device_registration_first(), making
dpll_priv() and dpll_device_ops() return NULL in that state, and adding
an early return in dpll_cmd_pin_get_one() when the first dpll-ref of a
pin is already torn down.

> diff --git a/drivers/dpll/dpll_core.c b/drivers/dpll/dpll_core.c
> index cbb635db4321..4a058b46c69d 100644
> --- a/drivers/dpll/dpll_core.c
> +++ b/drivers/dpll/dpll_core.c
> @@ -1060,12 +1060,8 @@ EXPORT_SYMBOL_GPL(dpll_pin_ref_sync_pair_add);
>  static struct dpll_device_registration *
>  dpll_device_registration_first(struct dpll_device *dpll)
>  {
> -	struct dpll_device_registration *reg;
> -
> -	reg = list_first_entry_or_null((struct list_head *)&dpll->registration_list,
> -				       struct dpll_device_registration, list);
> -	WARN_ON(!reg);
> -	return reg;
> +	return list_first_entry_or_null((struct list_head *)&dpll->registration_list,
> +					struct dpll_device_registration, list);
>  }
>  
>  void *dpll_priv(struct dpll_device *dpll)
> @@ -1073,6 +1069,8 @@ void *dpll_priv(struct dpll_device *dpll)
>  	struct dpll_device_registration *reg;
>  
>  	reg = dpll_device_registration_first(dpll);
> +	if (!reg)
> +		return NULL;
>  	return reg->priv;
>  }

[High]
With dpll_priv() now silently returning NULL for any half-dead dpll,
the contract for the ~28 in-tree callers of dpll_priv() changes.  The
commit message asserts that the only path which can observe the new
NULL return is dpll_cmd_pin_get_one() reached from
dpll_pin_event_send(), and that this path is "gated here" by the new
check in dpll_cmd_pin_get_one().  Is that gate actually coextensive
with the set of dplls walked by this code path?

> diff --git a/drivers/dpll/dpll_netlink.c b/drivers/dpll/dpll_netlink.c
> index ea6360263786..b73d718e8a56 100644
> --- a/drivers/dpll/dpll_netlink.c
> +++ b/drivers/dpll/dpll_netlink.c
> @@ -709,6 +709,12 @@ dpll_cmd_pin_get_one(struct sk_buff *msg, struct dpll_pin *pin,
>  	ref = dpll_xa_ref_dpll_first(&pin->dpll_refs);
>  	ASSERT_NOT_NULL(ref);
>  
> +	/* The first dpll the pin references may be torn down while still
> +	 * pinned by foreign-driver refs; drop the notification cleanly.
> +	 */
> +	if (!dpll_device_ops(ref->dpll))
> +		return -ENODEV;
> +

[High]
The check only validates the dpll returned by
dpll_xa_ref_dpll_first(&pin->dpll_refs), i.e. the first ref.  Later in
the same function, when pin->parent_refs is empty, control falls
through to dpll_msg_add_pin_dplls(), which walks every entry of
pin->dpll_refs:

    xa_for_each(&pin->dpll_refs, index, ref) {
        ...
        ret = dpll_msg_add_pin_on_dpll_state(msg, pin, ref, extack);
        ...
        ret = dpll_msg_add_pin_operstate(msg, pin, ref, extack);
        ...
        ret = dpll_msg_add_pin_prio(msg, pin, ref, extack);
        ...
        ret = dpll_msg_add_pin_direction(msg, pin, ref, extack);
        ...
        ret = dpll_msg_add_phase_offset(msg, pin, ref, extack);
        ...
    }

Each of those helpers calls dpll_priv(ref->dpll) and passes the
result as the dpll_priv argument to a driver ops callback.  Can a
non-first ref->dpll in this loop be in the same half-dead state (entry
still indexed in dpll_device_xa, but registration_list empty), so that
dpll_priv() now hands a NULL back to a driver callback that expects a
non-NULL private pointer (e.g. ice_dpll_pin_state_get does
struct ice_dpll *d = dpll_priv; struct ice_pf *pf = d->pf;)?

The teardown path the commit message describes
(zl3073x_dev_dpll_fini iterating list_for_each_entry over zldev->dplls
and calling zl3073x_dpll_unregister on each) drops dpll_lock between
per-dpll unregisters.  If a pin from a foreign driver references two
dplls of the unloading provider, can a notify work observe one
already torn down and the other still healthy, with the new gate
inspecting only the first?

Should the same NULL-tolerant treatment also be applied to the
xa_for_each loop in dpll_msg_add_pin_dplls(), or should the gate skip
half-dead refs there as well, so the commit message claim "those
callers cannot see the new NULL return ... only path that can is ...
gated here" actually holds?

[Medium]
Separately, the new -ENODEV return interacts with dpll_nl_pin_get_dumpit():

    ret = dpll_cmd_pin_get_one(skb, pin, cb->extack);
    if (ret) {
        genlmsg_cancel(skb, hdr);
        break;
    }

dpll_pin_available(pin) returns true as long as any of the pin's
parent_refs or dpll_refs is still DPLL_REGISTERED, so a pin can be
"available" while its first dpll_ref is the torn-down one.  In that
case the new -ENODEV breaks out of the dump loop and propagates to
userspace, hiding every later pin from a single dump call.  Should
this dumpit caller treat -ENODEV from dpll_cmd_pin_get_one() as a
"continue" rather than a fatal error, so the dump skips the transient
pin instead of failing wholesale?

next prev parent reply	other threads:[~2026-05-21  2:20 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-05-16 19:13 [PATCH net 0/2] dpll: fix two teardown races on E825-C Petr Oros
2026-05-16 19:13 ` [PATCH net 1/2] dpll: fix NULL deref in dpll_device_ops() during teardown race Petr Oros
2026-05-21  2:19   ` Jakub Kicinski
2026-05-21  2:19   ` Jakub Kicinski [this message]
2026-05-16 19:13 ` [PATCH net 2/2] dpll: filter pin->dpll_refs by cookie in dpll_pin_on_pin_unregister Petr Oros
2026-05-21  2:19   ` Jakub Kicinski

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260521021953.688132-1-kuba@kernel.org \
    --to=kuba@kernel.org \
    --cc=arkadiusz.kubalewski@intel.com \
    --cc=davem@davemloft.net \
    --cc=ivecera@redhat.com \
    --cc=jiri@resnulli.us \
    --cc=linux-kernel@vger.kernel.org \
    --cc=michal.michalik@intel.com \
    --cc=milena.olech@intel.com \
    --cc=netdev@vger.kernel.org \
    --cc=poros@redhat.com \
    --cc=vadim.fedorenko@linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox