* Re: [Intel-wired-lan] [PATCH] ice: irdma hardware init failed after suspend/resume
2024-05-28 10:03 [PATCH] ice: irdma hardware init failed after suspend/resume Ricky Wu
@ 2024-05-28 10:43 ` Michal Swiatkowski
2024-05-28 10:43 ` Wojciech Drewek
2024-05-28 11:12 ` [Intel-wired-lan] " Paul Menzel
2 siblings, 0 replies; 9+ messages in thread
From: Michal Swiatkowski @ 2024-05-28 10:43 UTC (permalink / raw)
To: Ricky Wu
Cc: jesse.brandeburg, intel-wired-lan, rickywu0421, linux-kernel,
edumazet, anthony.l.nguyen, netdev, kuba, pabeni, davem
On Tue, May 28, 2024 at 06:03:15PM +0800, Ricky Wu wrote:
> The bug at https://bugzilla.kernel.org/show_bug.cgi?id=218906 describes
> irdma breaking and reporting a hardware initialization failure after
> suspend/resume with an Intel E810 NIC (tested on 6.9.0-rc5).
>
> The problem is caused by a collision between the irq numbers
> requested by irdma and the irq numbers requested by other drivers
> after suspend/resume.
>
> The irq numbers used by irdma are derived from ice's ice_pf->msix_entries,
> which stores mappings between MSI-X indices and Linux interrupt numbers.
> This table is supposed to be cleaned up on suspend and rebuilt on resume,
> but it is not, so on resume irdma passes the stale irq numbers from the
> old ice_pf->msix_entries to request_irq() and eventually collides with
> other drivers.
>
> This patch fixes this problem. On suspend, we call ice_deinit_rdma() to
> clean up the ice_pf->msix_entries (and free the MSI-X vectors used by
> irdma if we've dynamically allocated them). On Resume, we call
> ice_init_rdma() to rebuild the ice_pf->msix_entries (and allocate the
> MSI-X vectors if we would like to dynamically allocate them).
>
> Signed-off-by: Ricky Wu <en-wei.wu@canonical.com>
> ---
> drivers/net/ethernet/intel/ice/ice_main.c | 6 +++++-
> 1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c
> index f60c022f7960..ec3cbadaa162 100644
> --- a/drivers/net/ethernet/intel/ice/ice_main.c
> +++ b/drivers/net/ethernet/intel/ice/ice_main.c
> @@ -5544,7 +5544,7 @@ static int ice_suspend(struct device *dev)
> */
> disabled = ice_service_task_stop(pf);
>
> - ice_unplug_aux_dev(pf);
> + ice_deinit_rdma(pf);
>
> /* Already suspended?, then there is nothing to do */
> if (test_and_set_bit(ICE_SUSPENDED, pf->state)) {
> @@ -5624,6 +5624,10 @@ static int ice_resume(struct device *dev)
> if (ret)
> dev_err(dev, "Cannot restore interrupt scheme: %d\n", ret);
>
> + ret = ice_init_rdma(pf);
> + if (ret)
> + dev_err(dev, "Reinitialize RDMA during resume failed: %d\n", ret);
> +
> clear_bit(ICE_DOWN, pf->state);
> /* Now perform PF reset and rebuild */
> reset_type = ICE_RESET_PFR;
Looks fine, thanks for the fix. You can add a Fixes: tag and target it
at the net tree.
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
> --
> 2.43.0
>
* Re: [PATCH] ice: irdma hardware init failed after suspend/resume
From: Wojciech Drewek @ 2024-05-28 10:43 UTC (permalink / raw)
To: Ricky Wu, jesse.brandeburg
Cc: anthony.l.nguyen, davem, edumazet, kuba, pabeni, intel-wired-lan,
netdev, linux-kernel, rickywu0421, Michal Swiatkowski
On 28.05.2024 12:03, Ricky Wu wrote:
> The bug at https://bugzilla.kernel.org/show_bug.cgi?id=218906 describes
> irdma breaking and reporting a hardware initialization failure after
> suspend/resume with an Intel E810 NIC (tested on 6.9.0-rc5).
>
> The problem is caused by a collision between the irq numbers
> requested by irdma and the irq numbers requested by other drivers
> after suspend/resume.
>
> The irq numbers used by irdma are derived from ice's ice_pf->msix_entries,
> which stores mappings between MSI-X indices and Linux interrupt numbers.
> This table is supposed to be cleaned up on suspend and rebuilt on resume,
> but it is not, so on resume irdma passes the stale irq numbers from the
> old ice_pf->msix_entries to request_irq() and eventually collides with
> other drivers.
>
> This patch fixes this problem. On suspend, we call ice_deinit_rdma() to
> clean up the ice_pf->msix_entries (and free the MSI-X vectors used by
> irdma if we've dynamically allocated them). On Resume, we call
> ice_init_rdma() to rebuild the ice_pf->msix_entries (and allocate the
> MSI-X vectors if we would like to dynamically allocate them).
>
> Signed-off-by: Ricky Wu <en-wei.wu@canonical.com>
> ---
Thanks for the patch!
> drivers/net/ethernet/intel/ice/ice_main.c | 6 +++++-
> 1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c
> index f60c022f7960..ec3cbadaa162 100644
> --- a/drivers/net/ethernet/intel/ice/ice_main.c
> +++ b/drivers/net/ethernet/intel/ice/ice_main.c
> @@ -5544,7 +5544,7 @@ static int ice_suspend(struct device *dev)
> */
> disabled = ice_service_task_stop(pf);
>
> - ice_unplug_aux_dev(pf);
> + ice_deinit_rdma(pf);
I think this should be called later in the reset path IMO.
You should call ice_deinit_rdma in ice_prepare_for_reset (replace ice_unplug_aux_dev),
>
> /* Already suspended?, then there is nothing to do */
> if (test_and_set_bit(ICE_SUSPENDED, pf->state)) {
> @@ -5624,6 +5624,10 @@ static int ice_resume(struct device *dev)
> if (ret)
> dev_err(dev, "Cannot restore interrupt scheme: %d\n", ret);
>
> + ret = ice_init_rdma(pf);
> + if (ret)
> + dev_err(dev, "Reinitialize RDMA during resume failed: %d\n", ret);
And ice_init_rdma should be moved to ice_rebuild (replace ice_plug_aux_dev)
> +
> clear_bit(ICE_DOWN, pf->state);
> /* Now perform PF reset and rebuild */
> reset_type = ICE_RESET_PFR;
* Re: [Intel-wired-lan] [PATCH] ice: irdma hardware init failed after suspend/resume
From: Paul Menzel @ 2024-05-28 11:12 UTC (permalink / raw)
To: Ricky Wu
Cc: jesse.brandeburg, intel-wired-lan, rickywu0421, linux-kernel,
edumazet, anthony.l.nguyen, netdev, kuba, pabeni, davem
Dear Ricky,
Thank you for your patch. Some minor nits. It’d be great if you made the
summary about the action and not an issue description. Maybe:
Avoid IRQ collision to fix init failure on ACPI S3 resume
Am 28.05.24 um 12:03 schrieb Ricky Wu:
> The bug at https://bugzilla.kernel.org/show_bug.cgi?id=218906 describes
> irdma breaking and reporting a hardware initialization failure after
> suspend/resume with an Intel E810 NIC (tested on 6.9.0-rc5).
>
> The problem is caused by a collision between the irq numbers
> requested by irdma and the irq numbers requested by other drivers
> after suspend/resume.
>
> The irq numbers used by irdma are derived from ice's ice_pf->msix_entries,
> which stores mappings between MSI-X indices and Linux interrupt numbers.
> This table is supposed to be cleaned up on suspend and rebuilt on resume,
> but it is not, so on resume irdma passes the stale irq numbers from the
> old ice_pf->msix_entries to request_irq() and eventually collides with
> other drivers.
>
> This patch fixes this problem. On suspend, we call ice_deinit_rdma() to
> clean up the ice_pf->msix_entries (and free the MSI-X vectors used by
> irdma if we've dynamically allocated them). On Resume, we call
resume
> ice_init_rdma() to rebuild the ice_pf->msix_entries (and allocate the
> MSI-X vectors if we would like to dynamically allocate them).
>
> Signed-off-by: Ricky Wu <en-wei.wu@canonical.com>
Please add a Link: tag.
If this was tested by somebody else, please also add a Tested-by: line.
> ---
> drivers/net/ethernet/intel/ice/ice_main.c | 6 +++++-
> 1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c
> index f60c022f7960..ec3cbadaa162 100644
> --- a/drivers/net/ethernet/intel/ice/ice_main.c
> +++ b/drivers/net/ethernet/intel/ice/ice_main.c
> @@ -5544,7 +5544,7 @@ static int ice_suspend(struct device *dev)
> */
> disabled = ice_service_task_stop(pf);
>
> - ice_unplug_aux_dev(pf);
> + ice_deinit_rdma(pf);
>
> /* Already suspended?, then there is nothing to do */
> if (test_and_set_bit(ICE_SUSPENDED, pf->state)) {
> @@ -5624,6 +5624,10 @@ static int ice_resume(struct device *dev)
> if (ret)
> dev_err(dev, "Cannot restore interrupt scheme: %d\n", ret);
>
> + ret = ice_init_rdma(pf);
> + if (ret)
> + dev_err(dev, "Reinitialize RDMA during resume failed: %d\n", ret);
> +
> clear_bit(ICE_DOWN, pf->state);
> /* Now perform PF reset and rebuild */
> reset_type = ICE_RESET_PFR;
What effect does this have on resume time?
Kind regards,
Paul
* Re: [Intel-wired-lan] [PATCH] ice: irdma hardware init failed after suspend/resume
From: En-Wei WU @ 2024-05-29 3:17 UTC (permalink / raw)
To: Paul Menzel
Cc: jesse.brandeburg, intel-wired-lan, rickywu0421, linux-kernel,
edumazet, anthony.l.nguyen, netdev, kuba, pabeni, davem
Thanks for your kind and quick reply.
> I think this should be called later in the reset path IMO.
> You should call ice_deinit_rdma in ice_prepare_for_reset (replace ice_unplug_aux_dev),
I'm afraid this would break the existing code: ice_deinit_rdma()
removes entries from pf->irq_tracker.entries, and some of those
entries have already been allocated for other irq usage by
ice_reinit_interrupt_scheme(), which is called before
ice_prepare_for_reset.
> What effect does this have on resume time?
When we call ice_init_rdma() at resume time, it allocates entries
in pf->irq_tracker.entries and updates pf->msix_entries for the
later request_irq() calls by irdma.
* Re: [Intel-wired-lan] [PATCH] ice: irdma hardware init failed after suspend/resume
From: Paul Menzel @ 2024-05-29 20:18 UTC (permalink / raw)
To: En-Wei WU
Cc: jesse.brandeburg, intel-wired-lan, rickywu0421, linux-kernel,
edumazet, anthony.l.nguyen, netdev, kuba, pabeni, davem
Dear En-Wei,
Thank you for responding so quickly.
Am 29.05.24 um 05:17 schrieb En-Wei WU:
[…]
>> What effect does this have on resume time?
>
> When we call ice_init_rdma() at resume time, it will allocate entries
> at pf->irq_tracker.entries and update pf->msix_entries for later use
> (request_irq) by irdma.
Sorry for being unclear. I meant, does resuming the system take longer
now? (initcall_debug might give a clue.)
Kind regards,
Paul
* Re: [Intel-wired-lan] [PATCH] ice: irdma hardware init failed after suspend/resume
From: En-Wei WU @ 2024-05-30 7:08 UTC (permalink / raw)
To: Paul Menzel
Cc: jesse.brandeburg, intel-wired-lan, rickywu0421, linux-kernel,
edumazet, anthony.l.nguyen, netdev, kuba, pabeni, davem,
wojciech.drewek
Thank you for your reply.
> Sorry for being unclear. I meant, does resuming the system take longer
> now? (initcall_debug might give a clue.)
I've tested S3 suspend/resume with the initcall_debug kernel
command-line option, and it shows no clear difference with or
without the ice_init_rdma() call in ice_resume():
Without ice_init_rdma:
```
[ 104.241129] ice 0000:86:00.0: PM: pci_pm_resume+0x0/0x110 returned 0 after 9415 usecs
[ 104.241206] ice 0000:86:00.1: PM: pci_pm_resume+0x0/0x110 returned 0 after 9443 usecs
```
With ice_init_rdma:
```
[ 122.749022] ice 0000:86:00.1: PM: pci_pm_resume+0x0/0x110 returned 0 after 9485 usecs
[ 122.749068] ice 0000:86:00.0: PM: pci_pm_resume+0x0/0x110 returned 0 after 9532 usecs
```
> And ice_init_rdma should be moved to ice_rebuild (replace ice_plug_aux_dev)
Adopting this would defer ice_init_rdma() to the later service task.
> You should call ice_deinit_rdma in ice_prepare_for_reset (replace ice_unplug_aux_dev),
It seems we must call ice_deinit_rdma() in ice_suspend(). If we call
it in the later service task, it will:
1. break some existing setup done by ice_resume()
2. have no way to release the MSI-X vectors used by rdma if they were
dynamically allocated, since the MSI-X vector table is flushed at the
end of ice_suspend()
The second point is important because the original ice_suspend() never
released the dynamically allocated MSI-X vectors for rdma, which is
effectively a leak in the original code.
Best regards,
Ricky.
* Re: [Intel-wired-lan] [PATCH] ice: irdma hardware init failed after suspend/resume
From: Paul Menzel @ 2024-05-30 7:55 UTC (permalink / raw)
To: En-Wei WU
Cc: jesse.brandeburg, intel-wired-lan, rickywu0421, linux-kernel,
edumazet, anthony.l.nguyen, netdev, kuba, pabeni, davem,
wojciech.drewek
Dear En-Wei,
Thank you for your quick reply.
Am 30.05.24 um 09:08 schrieb En-Wei WU:
>> Sorry for being unclear. I meant, does resuming the system take longer
>> now? (initcall_debug might give a clue.)
> I've tested the S3 suspend/resume with the initcall_debug kernel
> command option, and it shows no clear difference between having or not
> having the ice_init_rdma in ice_resume:
> Without ice_init_rdma:
> ```
> [ 104.241129] ice 0000:86:00.0: PM: pci_pm_resume+0x0/0x110 returned 0 after 9415 usecs
> [ 104.241206] ice 0000:86:00.1: PM: pci_pm_resume+0x0/0x110 returned 0 after 9443 usecs
> ```
> With ice_init_rdma:
> ```
> [ 122.749022] ice 0000:86:00.1: PM: pci_pm_resume+0x0/0x110 returned 0 after 9485 usecs
> [ 122.749068] ice 0000:86:00.0: PM: pci_pm_resume+0x0/0x110 returned 0 after 9532 usecs
> ```
Thank you for verifying this. Nice to hear that it has no impact.
[…]
Kind regards,
Paul
* Re: [Intel-wired-lan] [PATCH] ice: irdma hardware init failed after suspend/resume
From: Wojciech Drewek @ 2024-05-31 9:37 UTC (permalink / raw)
To: En-Wei WU, Paul Menzel
Cc: netdev, rickywu0421, linux-kernel, edumazet, anthony.l.nguyen,
kuba, intel-wired-lan, pabeni, davem
On 30.05.2024 09:08, En-Wei WU wrote:
> Thank you for your reply.
>
>> Sorry for being unclear. I meant, does resuming the system take longer
>> now? (initcall_debug might give a clue.)
> I've tested the S3 suspend/resume with the initcall_debug kernel
> command option, and it shows no clear difference between having or not
> having the ice_init_rdma in ice_resume:
> Without ice_init_rdma:
> ```
> [ 104.241129] ice 0000:86:00.0: PM: pci_pm_resume+0x0/0x110 returned 0 after 9415 usecs
> [ 104.241206] ice 0000:86:00.1: PM: pci_pm_resume+0x0/0x110 returned 0 after 9443 usecs
> ```
> With ice_init_rdma:
> ```
> [ 122.749022] ice 0000:86:00.1: PM: pci_pm_resume+0x0/0x110 returned 0 after 9485 usecs
> [ 122.749068] ice 0000:86:00.0: PM: pci_pm_resume+0x0/0x110 returned 0 after 9532 usecs
> ```
>
>> And ice_init_rdma should be moved to ice_rebuild (replace ice_plug_aux_dev)
> Adopting this would defer ice_init_rdma() to the later service task.
>
>> You should call ice_deinit_rdma in ice_prepare_for_reset (replace ice_unplug_aux_dev),
> It seems we must call ice_deinit_rdma() in ice_suspend(). If we call
> it in the later service task, it will:
> 1. break some existing setup done by ice_resume()
> 2. have no way to release the MSI-X vectors used by rdma if they were
> dynamically allocated, since the MSI-X vector table is flushed at the
> end of ice_suspend()
> The second point is important because the original ice_suspend() never
> released the dynamically allocated MSI-X vectors for rdma, which is
> effectively a leak in the original code.
>
> Best regards,
> Ricky.
Makes sense, let's keep ice_deinit_rdma in ice_suspend.