Re: [PATCH iwl-next v11] ice: add support for unmanaged DPLL on E830 NIC

public inbox for netdev@vger.kernel.org
 help / color / mirror / Atom feed

From: Jacob Keller <jacob.e.keller@intel.com>
To: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com>,
	<intel-wired-lan@lists.osuosl.org>
Cc: <netdev@vger.kernel.org>, <anthony.l.nguyen@intel.com>,
	<przemyslaw.kitszel@intel.com>, <linux-doc@vger.kernel.org>,
	<linux-kernel@vger.kernel.org>, <pmenzel@molgen.mpg.de>,
	<aleksandr.loktionov@intel.com>, <horms@kernel.org>,
	<grzegorz.nitka@intel.com>, <vgrinber@redhat.com>,
	<zoltan.fodor@intel.com>
Subject: Re: [PATCH iwl-next v11] ice: add support for unmanaged DPLL on E830 NIC
Date: Mon, 4 May 2026 15:31:23 -0700	[thread overview]
Message-ID: <ca256097-1538-4c08-ba01-777bf646fc33@intel.com> (raw)
In-Reply-To: <20260217155808.1209194-1-arkadiusz.kubalewski@intel.com>

On 2/17/2026 7:58 AM, Arkadiusz Kubalewski wrote:
> Hardware variants of E830 may support an unmanaged DPLL where the
> configuration is hardcoded within the hardware and firmware, meaning
> users cannot modify settings. However, users are able to check the DPLL
> lock status and obtain configuration information through the Linux DPLL
> and devlink health subsystem.
> 
> Availability of 'loss of lock' health status code determines if such
> support is available, if true, register single DPLL device with 1 input
> and 1 output and provide hardcoded/read only properties of a pin and
> DPLL device. User is only allowed to check DPLL device status and receive
> notifications on DPLL lock status change.
> 
> When present, the DPLL device locks to an external signal provided
> through the PCIe/OCP pin. The expected input signal is 1PPS
> (1 Pulse Per Second) embedded on a 10MHz reference clock.
> The DPLL produces output:
> - for MAC (Media Access Control) & PHY (Physical Layer) clocks,
> - 1PPS for synchronization of onboard PHC (Precision Hardware Clock) timer.
> 
> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
> Signed-off-by: Grzegorz Nitka <grzegorz.nitka@intel.com>
> Signed-off-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com>
> ---
> v11:
> - rebase and fix conflicts


Jakub rightfully pointed out the mistaken inclusion of HAVE_DPLL_ESYNC.

> diff --git a/drivers/net/ethernet/intel/ice/ice_dpll.c b/drivers/net/ethernet/intel/ice/ice_dpll.c
> index 73a4e28ed75f..64518202dfff 100644
> --- a/drivers/net/ethernet/intel/ice/ice_dpll.c
> +++ b/drivers/net/ethernet/intel/ice/ice_dpll.c
> @@ -2588,6 +2601,21 @@ static const struct dpll_pin_ops ice_dpll_output_ops = {
>  	.esync_get = ice_dpll_output_esync_get,
>  };
>  
> +static const struct dpll_pin_ops ice_dpll_input_unmanaged_ops = {
> +	.frequency_get = ice_dpll_input_frequency_get,
> +	.direction_get = ice_dpll_input_direction,
> +	.state_on_dpll_get = ice_dpll_input_state_get,
> +#if defined(HAVE_DPLL_ESYNC)
> +	.esync_get = ice_dpll_input_esync_get,
> +#endif /* HAVE_DPLL_ESYNC */
> +};
> +
Please fix this, and remove any other remaining kcompat cruft as well.

Also could you please review the following Sashiko reports and make sure
there is no issues?

The netdev sashiko.dev bot says this:

> commit 7aae4a54d3ef18d8694bc535b6e5959839c9adfd
> Author: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com>
> ice: add support for unmanaged DPLL on E830 NIC
> Hardware variants of E830 may support an unmanaged DPLL where the
> configuration is hardcoded. This patch allows users to check the DPLL
> lock status and obtain configuration information through the Linux DPLL
> and devlink health subsystem.
>> diff --git a/drivers/net/ethernet/intel/ice/devlink/health.c b/drivers/net/ethernet/intel/ice/devlink/health.c
>> --- a/drivers/net/ethernet/intel/ice/devlink/health.c
>> +++ b/drivers/net/ethernet/intel/ice/devlink/health.c
> [ ... ]
>> @@ -242,6 +244,8 @@ void ice_process_health_status_event(struct ice_pf *pf, struct ice_rq_event_info
>>  				pf->health_reporters.fw_status = *health_info;
>>  				devlink_health_report(pf->health_reporters.fw,
>>  						      "FW syndrome reported", NULL);
>> +				if (status_code == ICE_AQC_HEALTH_STATUS_INFO_LOSS_OF_LOCK)
>> +					ice_dpll_lock_state_set_unmanaged(pf, health_info, true);
> Is it possible for this health event to arrive when the DPLL subsystem is not
> fully initialized or already torn down?
> Looking at ice_dpll_lock_state_set_unmanaged(), it unconditionally acquires
> pf->dplls.lock:
> void ice_dpll_lock_state_set_unmanaged(...)
> {
>     ...
> 	mutex_lock(&pf->dplls.lock);
>     ...
> }
> If the hardware doesn't support DPLL, ice_dpll_init() is never called, leaving
> the mutex uninitialized. Additionally, if ice_dpll_init_unmanaged() fails,
> it destroys the mutex and exits.
> During driver unload, ice_dpll_deinit() destroys the mutex and frees the
> DPLL structures, but the service task processing these admin queue events
> might not be stopped until later in the teardown sequence.
> Could this lead to a kernel panic from locking an uninitialized or destroyed
> mutex, or a use-after-free when passing the d->dpll pointer to
> dpll_device_change_ntf()?
> Should there be a check ensuring DPLL is active (e.g. by checking
> test_bit(ICE_FLAG_DPLL, pf->flags)) before handling the event?


My own execution using claude had this to say:

> Now I have everything. Let me compose the report.
> commit d9b481c1d5d8be9dbbca36f40be64db91eb12caa
> Author: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com>
> ice: add support for unmanaged DPLL on E830 NIC
> This patch adds support for an unmanaged DPLL on E830 NIC variants,
> where the DPLL configuration is hardcoded in hardware/firmware.  It
> registers a single PPS DPLL device with one input and one output,
> reporting lock status via health status events and the DPLL subsystem.
> A few things stood out during review.
>> diff --git a/drivers/net/ethernet/intel/ice/devlink/health.c b/drivers/net/ethernet/intel/ice/devlink/health.c
>> index 8e9a8a8178d4..31e6c5107c97 100644
>> --- a/drivers/net/ethernet/intel/ice/devlink/health.c
>> +++ b/drivers/net/ethernet/intel/ice/devlink/health.c
> [ ... ]
>> @@ -242,6 +244,8 @@ void ice_process_health_status_event(struct ice_pf *pf, struct ice_rq_event_info
>> 				pf->health_reporters.fw_status = *health_info;
>> 				devlink_health_report(pf->health_reporters.fw,
>> 						      "FW syndrome reported", NULL);
>>+				if (status_code == ICE_AQC_HEALTH_STATUS_INFO_LOSS_OF_LOCK)
>>+					ice_dpll_lock_state_set_unmanaged(pf, health_info, true);
>> 				break;
> This calls ice_dpll_lock_state_set_unmanaged() for every PF that
> receives a 0x601 health event, with no check for pf->dplls.unmanaged
> or test_bit(ICE_FLAG_DPLL, pf->flags).
> The only guard inside ice_dpll_lock_state_set_unmanaged() is
> ice_pf_src_tmr_owned(), which checks PTP timer ownership -- not
> whether the DPLL subsystem was initialized.
> Consider a device where the firmware supports health code 0x601 but
> the ZL80640 CGU is not in the netlist.  In ice_init_features():
>>+	err = ice_is_health_status_code_supported(&pf->hw, code,
>>+						  &pf->dplls.unmanaged);
>>+	if (err || !ice_is_unmanaged_cgu_in_netlist(&pf->hw))
>>+		pf->dplls.unmanaged = false;
> This sets pf->dplls.unmanaged = false, so ice_dpll_init() is never
> called and pf->dplls.lock is never initialized via mutex_init().
> But the firmware can still send the health event, and then:
>>+void ice_dpll_lock_state_set_unmanaged(struct ice_pf *pf,
>>+				       const struct ice_aqc_health_status_elem *buff,
>>+				       bool notify)
>>+{
> [ ... ]
>>+	mutex_lock(&pf->dplls.lock);
> Can this mutex_lock() on an uninitialized mutex cause undefined
> behavior?  With CONFIG_DEBUG_MUTEXES this would trigger a BUG().
> Also, on managed DPLL devices (e.g. E810 with CGU), the DPLL worker
> thread maintains pf->dplls.pps.dpll_state via ice_dpll_update_state().
> Would this unconditional call overwrite the managed state machine?
>> diff --git a/drivers/net/ethernet/intel/ice/ice_dpll.c b/drivers/net/ethernet/intel/ice/ice_dpll.c
>> index 3eea83b0cdc9..00704dfeb3b2 100644
>> --- a/drivers/net/ethernet/intel/ice/ice_dpll.c
>> +++ b/drivers/net/ethernet/intel/ice/ice_dpll.c
> [ ... ]
>>+static const struct dpll_pin_ops ice_dpll_input_unmanaged_ops = {
>>+	.frequency_get = ice_dpll_input_frequency_get,
>>+	.direction_get = ice_dpll_input_direction,
>>+	.state_on_dpll_get = ice_dpll_input_state_get,
>>+#if defined(HAVE_DPLL_ESYNC)
>>+	.esync_get = ice_dpll_input_esync_get,
>>+#endif /* HAVE_DPLL_ESYNC */
>>+};
> HAVE_DPLL_ESYNC does not appear to be defined anywhere in the upstream
> kernel tree -- it looks like an out-of-tree compatibility macro.  In
> upstream builds this means .esync_get is always NULL for unmanaged
> input pins.
> The managed ice_dpll_input_ops sets .esync_get unconditionally, and
> ice_dpll_input_esync_get() already has explicit unmanaged-specific
> logic added by this patch.  The init code also unconditionally sets
> ICE_AQC_GET_CGU_IN_CFG_STATUS_ESYNC_CAP and ICE_DPLL_IN_ESYNC_ENABLED
> on unmanaged input pins.
> The documentation example at the end of the patch shows
> esync-frequency and esync-pulse in the pin dump, but those fields
> will never appear in an upstream build since the callback is compiled
> out.  Should the ifdef be removed so this matches ice_dpll_input_ops?
> [ ... ]
>> @@ -3153,12 +3181,15 @@ ice_dpll_init_direct_pins(struct ice_pf *pf, bool cgu,
>>  	int ret;
>>  
>>  	ret = ice_dpll_get_pins(pf, pins, start_idx, count, pf->dplls.clock_id);
>>- 	if (ret)
>>+	if (!cgu || ret)
>>  		return ret;
>>- 	if (cgu) {
>>+
>>+	if (first) {
>>  		ret = ice_dpll_register_pins(first, pins, ops, count);
>>  		if (ret)
>>  			goto release_pins;
>>+	}
>>+	if (second) {
>>  		ret = ice_dpll_register_pins(second, pins, ops, count);
>>  		if (ret)
>>  			goto unregister_first;
> ice_dpll_init_direct_pins() was updated with NULL guards for first
> and second.  However, ice_dpll_deinit_direct_pins() was not:
> ice_dpll_deinit_direct_pins() {
>     if (cgu) {
>         ice_dpll_unregister_pins(first, pins, ops, count);
>         ice_dpll_unregister_pins(second, pins, ops, count);
>     }
>     ...
> }
> For the unmanaged case, ice_dpll_init_unmanaged() only creates the
> PPS dpll -- eec.dpll stays NULL.  If output pin init fails inside
> ice_dpll_init_pins(pf, true), the error path falls through to
> deinit_inputs which calls:
>     ice_dpll_deinit_direct_pins(pf, cgu=true, ...,
>                                 pf->dplls.pps.dpll,
>                                 pf->dplls.eec.dpll);
>                                               ^^^^
>                                               NULL
> This passes NULL as second to ice_dpll_unregister_pins(), which
> calls dpll_pin_unregister(NULL, ...), dereferencing the NULL dpll
> pointer.  Would it make sense to add the same NULL guards to
> ice_dpll_deinit_direct_pins() that were added to the init path?
> [ ... ]
>> @@ -3923,6 +4001,15 @@ ice_dpll_init_info_direct_pins(struct ice_pf *pf,
>>  		pins[i].prop.board_label = ice_cgu_get_pin_name(hw, i, input);
>>  		pins[i].prop.type = ice_cgu_get_pin_type(hw, i, input);
>>  		if (input) {
>>+			if (pf->dplls.unmanaged) {
>>+				pins[i].freq = ice_dpll_pin_freq_info(hw, i,
>>+								      input);
>>+				pins[i].state[0] = DPLL_PIN_STATE_CONNECTED;
>>+				pins[i].status =
>>+					ICE_AQC_GET_CGU_IN_CFG_STATUS_ESYNC_CAP;
>>+				pins[i].flags[0] = ICE_DPLL_IN_ESYNC_ENABLED;
>>+				continue;
>>+			}
> [ ... ]
>>  		} else {
>>+			if (pf->dplls.unmanaged) {
>>+				pins[i].freq = ice_dpll_pin_freq_info(hw, i,
>>+								      input);
>>+				pins[i].state[0] = DPLL_PIN_STATE_CONNECTED;
>>+				continue;
>>+			}
> [ ... ]
>>  		pins[i].prop.freq_supported =
>>  			ice_cgu_get_pin_freq_supp(hw, i, input, &freq_supp_num);
>>  		pins[i].prop.freq_supported_num = freq_supp_num;
>>  		pins[i].pf = pf;
>>  	}
> Both unmanaged branches use continue, which skips the assignments of
> pins[i].prop.freq_supported, pins[i].prop.freq_supported_num, and
> pins[i].pf at the bottom of the loop.
> The missing freq_supported / freq_supported_num means the DPLL netlink
> code cannot report supported frequency ranges to userspace for
> unmanaged pins, even though the data is available from
> ice_cgu_get_pin_freq_supp().
> The missing pins[i].pf = pf assignment leaves the pointer NULL (from
> kcalloc zeroing).  Current unmanaged callbacks use d->pf from
> dpll_priv so there is no immediate crash, but if any callback that
> dereferences p->pf were added to the unmanaged ops in the future it
> would hit a NULL pointer dereference.  Was the continue intended to
> skip all of these assignments?

     prev parent reply	other threads:[~2026-05-04 22:31 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-02-17 15:58 [PATCH iwl-next v11] ice: add support for unmanaged DPLL on E830 NIC Arkadiusz Kubalewski
2026-03-17 16:58 ` [Intel-wired-lan] " Mekala, SunithaX D
2026-05-04 22:31 ` Jacob Keller [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ca256097-1538-4c08-ba01-777bf646fc33@intel.com \
    --to=jacob.e.keller@intel.com \
    --cc=aleksandr.loktionov@intel.com \
    --cc=anthony.l.nguyen@intel.com \
    --cc=arkadiusz.kubalewski@intel.com \
    --cc=grzegorz.nitka@intel.com \
    --cc=horms@kernel.org \
    --cc=intel-wired-lan@lists.osuosl.org \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=pmenzel@molgen.mpg.de \
    --cc=przemyslaw.kitszel@intel.com \
    --cc=vgrinber@redhat.com \
    --cc=zoltan.fodor@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox