From: Tony Nguyen <anthony.l.nguyen@intel.com>
To: Aaron Ma <aaron.ma@canonical.com>, <przemyslaw.kitszel@intel.com>,
	<andrew+netdev@lunn.ch>, <davem@davemloft.net>,
	<edumazet@google.com>, <kuba@kernel.org>, <pabeni@redhat.com>,
	<intel-wired-lan@lists.osuosl.org>, <netdev@vger.kernel.org>,
	<linux-kernel@vger.kernel.org>
Subject: Re: [PATCH v2 2/2] ice: Initialize RDMA after rebuild
Date: Thu, 11 Dec 2025 15:59:07 -0800
Message-ID: <7bcc0f14-0005-4d72-9bc3-a32304499630@intel.com>
In-Reply-To: <20251205082459.1586143-2-aaron.ma@canonical.com>



On 12/5/2025 12:24 AM, Aaron Ma wrote:
> After wakeup from suspend, IRDMA is initialized with error:
> 
> kernel: ice 0000:60:00.0: IRDMA hardware initialization FAILED init_state=4 status=-110
> kernel: ice 0000:60:00.1: IRDMA hardware initialization FAILED init_state=4 status=-110
> kernel: irdma.gen_2 ice.roce.1: probe with driver irdma.gen_2 failed with error -110
> kernel: irdma.gen_2 ice.roce.2: probe with driver irdma.gen_2 failed with error -110
> 
> IRDMA times out because it is initialized before the scheduled reset.
> The ice_init_rdma() function already calls ice_plug_aux_dev() internally,
> ensuring proper initialization order.
> 
> Fixes: bc69ad74867db ("ice: avoid IRQ collision to fix init failure on ACPI S3 resume")
> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
> Signed-off-by: Aaron Ma <aaron.ma@canonical.com>
> ---
> V1 -> V2: no changes.
> 
>   drivers/net/ethernet/intel/ice/ice_main.c | 12 ++++++------
>   1 file changed, 6 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c
> index 2533876f1a2fd..c6dd04d24ac09 100644
> --- a/drivers/net/ethernet/intel/ice/ice_main.c
> +++ b/drivers/net/ethernet/intel/ice/ice_main.c
> @@ -5677,11 +5677,6 @@ static int ice_resume(struct device *dev)
>   	if (ret)
>   		dev_err(dev, "Cannot restore interrupt scheme: %d\n", ret);
>   
> -	ret = ice_init_rdma(pf);
> -	if (ret)
> -		dev_err(dev, "Reinitialize RDMA during resume failed: %d\n",
> -			ret);
> -
>   	clear_bit(ICE_DOWN, pf->state);
>   	/* Now perform PF reset and rebuild */
>   	reset_type = ICE_RESET_PFR;
> @@ -7805,7 +7800,12 @@ static void ice_rebuild(struct ice_pf *pf, enum ice_reset_req reset_type)
>   
>   	ice_health_clear(pf);
>   
> -	ice_plug_aux_dev(pf);
> +	/* Initialize RDMA after control queues are ready */
> +	err = ice_init_rdma(pf);

ice_init_rdma() allocates a new pf->cdev_info on each call. While that
works for this particular flow, ice_rebuild() is called for all reset
paths, so this can leak cdev_info since RDMA is not de-initialized for
resets.
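
To illustrate the concern, here is a minimal user-space sketch. All
toy_* names are hypothetical stand-ins, not the driver's code; the only
thing modeled is an init that allocates unconditionally and is re-run
on every rebuild without a deinit in between.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-ins for illustration only; not the real ice structures. */
struct toy_cdev_info {
	int placeholder;
};

struct toy_pf {
	struct toy_cdev_info *cdev_info;
	int orphaned;	/* objects a plain pointer overwrite would have lost */
};

/* Models an init that allocates a fresh object on every call. */
static int toy_init_rdma(struct toy_pf *pf)
{
	struct toy_cdev_info *cdev = calloc(1, sizeof(*cdev));

	if (!cdev)
		return -1;

	if (pf->cdev_info) {
		/*
		 * Code that simply overwrote the pointer would leak the old
		 * object here. The toy counts it and frees it so the demo
		 * itself stays leak-free.
		 */
		pf->orphaned++;
		free(pf->cdev_info);
	}

	pf->cdev_info = cdev;
	return 0;
}

static void toy_deinit_rdma(struct toy_pf *pf)
{
	free(pf->cdev_info);
	pf->cdev_info = NULL;
}

/* One init at probe, then one more init per reset, with no deinit between. */
int toy_orphans_after_resets(int resets)
{
	struct toy_pf pf = { 0 };
	int i;

	assert(resets >= 0);

	if (toy_init_rdma(&pf))
		return -1;

	for (i = 0; i < resets; i++) {
		if (toy_init_rdma(&pf))
			break;
	}

	toy_deinit_rdma(&pf);
	return pf.orphaned;
}
```

In this model every reset after the first init strands one object, so
the count grows with the number of resets.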

Additionally, ice_init_rdma() seems to be well placed in ice_resume() to
mirror the deinit in ice_suspend(). As you mentioned, the problem is
caused by the plug occurring before a reset. I think the call to
ice_plug_aux_dev() should be removed from ice_init_rdma() to stop this
from happening. With that change, the plug won't occur before a reset
and, following reset, the plug will be called as part of rebuild when
everything is up and ready. As ice_init_rdma() is also called in one
other location (probe), ice_plug_aux_dev() should be added after the
RDMA init there to preserve the current flow.

Corresponding changes should be made to the cleanup path as well, i.e.
mirror the removal of ice_plug_aux_dev() from ice_init_rdma() by
removing ice_unplug_aux_dev() from ice_deinit_rdma() and preceding each
call of ice_deinit_rdma() with ice_unplug_aux_dev().
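
The ordering described above can be sketched as a small user-space
model. This is only a sketch under stated assumptions: the toy_* names
and the state flags are hypothetical, the rebuild runs synchronously
here rather than being scheduled, and "ready" simply stands for "reset
and rebuild have completed". The plug_inside flag switches between the
current behavior (plug inside init) and the suggested split.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy state for illustration only; not the real struct ice_pf. */
struct toy_pf {
	bool ready;		/* true once reset and rebuild have completed */
	bool rdma_inited;
	bool aux_plugged;
	int early_plugs;	/* plugs attempted while not ready */
};

static void toy_plug_aux_dev(struct toy_pf *pf)
{
	if (!pf->rdma_inited)
		return;
	if (!pf->ready)
		pf->early_plugs++;	/* stands in for the -110 probe failure */
	pf->aux_plugged = true;
}

static void toy_unplug_aux_dev(struct toy_pf *pf)
{
	pf->aux_plugged = false;
}

/* plug_inside == true models an init that plugs the aux device itself. */
static void toy_init_rdma(struct toy_pf *pf, bool plug_inside)
{
	pf->rdma_inited = true;
	if (plug_inside)
		toy_plug_aux_dev(pf);
}

/* Deinit no longer unplugs; callers must unplug first. */
static void toy_deinit_rdma(struct toy_pf *pf)
{
	assert(!pf->aux_plugged);
	pf->rdma_inited = false;
}

static void toy_rebuild(struct toy_pf *pf)
{
	pf->ready = true;
	toy_plug_aux_dev(pf);	/* plug once everything is up */
}

static void toy_probe(struct toy_pf *pf, bool plug_inside)
{
	pf->ready = true;
	toy_init_rdma(pf, plug_inside);
	if (!plug_inside)
		toy_plug_aux_dev(pf);	/* explicit plug preserves the probe flow */
}

static void toy_suspend(struct toy_pf *pf)
{
	toy_unplug_aux_dev(pf);
	toy_deinit_rdma(pf);
	pf->ready = false;
}

static void toy_resume(struct toy_pf *pf, bool plug_inside)
{
	toy_init_rdma(pf, plug_inside);	/* runs before the reset */
	toy_rebuild(pf);		/* reset and rebuild */
}

/* Runs probe, suspend, resume and reports how many plugs came too early. */
int toy_early_plugs_after_cycle(bool plug_inside)
{
	struct toy_pf pf = { 0 };

	toy_probe(&pf, plug_inside);
	toy_suspend(&pf);
	toy_resume(&pf, plug_inside);
	return pf.early_plugs;
}
```

In this model, plugging from inside the init produces one early plug
during resume, while the split ordering produces none and still leaves
the aux device plugged after rebuild.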

Thanks,
Tony


> +	if (err)
> +		dev_err(dev, "Reinitialize RDMA after rebuild failed: %d\n",
> +			err);
> +
>   	if (ice_is_feature_supported(pf, ICE_F_SRIOV_LAG))
>   		ice_lag_rebuild(pf);
>   


