linux-pci.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH iwl-net v2] ice: reset first in crash dump kernels
@ 2023-10-02 20:02 Jesse Brandeburg
  2023-10-02 23:49 ` Jay Vosburgh
  2023-10-03 22:41 ` Tony Nguyen
  0 siblings, 2 replies; 6+ messages in thread
From: Jesse Brandeburg @ 2023-10-02 20:02 UTC (permalink / raw)
  To: intel-wired-lan
  Cc: Jesse Brandeburg, linux-pci, pmenzel, netdev, jkc, Vishal Agrawal,
	Przemek Kitszel

When the system boots into the crash dump kernel after a panic, the ice
networking device may still have pending transactions that can cause errors
or machine checks when the device is re-enabled. This can prevent the crash
dump kernel from loading the driver or collecting the crash data.

To avoid this issue, perform a function level reset (FLR) on the ice device
via PCIe config space before enabling it on the crash kernel. This will
clear any outstanding transactions and stop all queues and interrupts.
Restore the config space after the FLR, otherwise it was found in testing
that the driver wouldn't load successfully.

The following sequence causes the original issue:
- Load the ice driver with modprobe ice
- Enable SR-IOV with 2 VFs: echo 2 > /sys/class/net/eth0/device/sriov_num_vfs
- Trigger a crash with echo c > /proc/sysrq-trigger
- Load the ice driver again (or let it load automatically) with modprobe ice
- The system crashes again during pcim_enable_device()

Reported-by: Vishal Agrawal <vagrawal@redhat.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
---
v2: respond to list comments and update commit message
v1: initial version
---
 drivers/net/ethernet/intel/ice/ice_main.c | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c
index c8286adae946..6550c46e4e36 100644
--- a/drivers/net/ethernet/intel/ice/ice_main.c
+++ b/drivers/net/ethernet/intel/ice/ice_main.c
@@ -6,6 +6,7 @@
 #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
 
 #include <generated/utsrelease.h>
+#include <linux/crash_dump.h>
 #include "ice.h"
 #include "ice_base.h"
 #include "ice_lib.h"
@@ -5014,6 +5015,20 @@ ice_probe(struct pci_dev *pdev, const struct pci_device_id __always_unused *ent)
 		return -EINVAL;
 	}
 
+	/* when under a kdump kernel initiate a reset before enabling the
+	 * device in order to clear out any pending DMA transactions. These
+	 * transactions can cause some systems to machine check when doing
+	 * the pcim_enable_device() below.
+	 */
+	if (is_kdump_kernel()) {
+		pci_save_state(pdev);
+		pci_clear_master(pdev);
+		err = pcie_flr(pdev);
+		if (err)
+			return err;
+		pci_restore_state(pdev);
+	}
+
 	/* this driver uses devres, see
 	 * Documentation/driver-api/driver-model/devres.rst
 	 */

base-commit: 6a70e5cbedaf8ad10528ac9ac114f3ec20f422df
-- 
2.39.3


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH iwl-net v2] ice: reset first in crash dump kernels
  2023-10-02 20:02 [PATCH iwl-net v2] ice: reset first in crash dump kernels Jesse Brandeburg
@ 2023-10-02 23:49 ` Jay Vosburgh
  2023-10-03  5:50   ` Jesse Brandeburg
  2023-10-03 22:41 ` Tony Nguyen
  1 sibling, 1 reply; 6+ messages in thread
From: Jay Vosburgh @ 2023-10-02 23:49 UTC (permalink / raw)
  To: Jesse Brandeburg
  Cc: intel-wired-lan, linux-pci, pmenzel, netdev, jkc, Vishal Agrawal,
	Przemek Kitszel

Jesse Brandeburg <jesse.brandeburg@intel.com> wrote:

>When the system boots into the crash dump kernel after a panic, the ice
>networking device may still have pending transactions that can cause errors
>or machine checks when the device is re-enabled. This can prevent the crash
>dump kernel from loading the driver or collecting the crash data.
>
>To avoid this issue, perform a function level reset (FLR) on the ice device
>via PCIe config space before enabling it on the crash kernel. This will
>clear any outstanding transactions and stop all queues and interrupts.
>Restore the config space after the FLR, otherwise it was found in testing
>that the driver wouldn't load successfully.

	How does this differ from ading "reset_devices" to the crash
kernel command line, per Documentation/admin-guide/kdump/kdump.rst?

	-J

>The following sequence causes the original issue:
>- Load the ice driver with modprobe ice
>- Enable SR-IOV with 2 VFs: echo 2 > /sys/class/net/eth0/device/sriov_num_vfs
>- Trigger a crash with echo c > /proc/sysrq-trigger
>- Load the ice driver again (or let it load automatically) with modprobe ice
>- The system crashes again during pcim_enable_device()
>
>Reported-by: Vishal Agrawal <vagrawal@redhat.com>
>Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
>Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
>---
>v2: respond to list comments and update commit message
>v1: initial version
>---
> drivers/net/ethernet/intel/ice/ice_main.c | 15 +++++++++++++++
> 1 file changed, 15 insertions(+)
>
>diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c
>index c8286adae946..6550c46e4e36 100644
>--- a/drivers/net/ethernet/intel/ice/ice_main.c
>+++ b/drivers/net/ethernet/intel/ice/ice_main.c
>@@ -6,6 +6,7 @@
> #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
> 
> #include <generated/utsrelease.h>
>+#include <linux/crash_dump.h>
> #include "ice.h"
> #include "ice_base.h"
> #include "ice_lib.h"
>@@ -5014,6 +5015,20 @@ ice_probe(struct pci_dev *pdev, const struct pci_device_id __always_unused *ent)
> 		return -EINVAL;
> 	}
> 
>+	/* when under a kdump kernel initiate a reset before enabling the
>+	 * device in order to clear out any pending DMA transactions. These
>+	 * transactions can cause some systems to machine check when doing
>+	 * the pcim_enable_device() below.
>+	 */
>+	if (is_kdump_kernel()) {
>+		pci_save_state(pdev);
>+		pci_clear_master(pdev);
>+		err = pcie_flr(pdev);
>+		if (err)
>+			return err;
>+		pci_restore_state(pdev);
>+	}
>+
> 	/* this driver uses devres, see
> 	 * Documentation/driver-api/driver-model/devres.rst
> 	 */
>
>base-commit: 6a70e5cbedaf8ad10528ac9ac114f3ec20f422df
>-- 
>2.39.3
>

---
	-Jay Vosburgh, jay.vosburgh@canonical.com

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH iwl-net v2] ice: reset first in crash dump kernels
  2023-10-02 23:49 ` Jay Vosburgh
@ 2023-10-03  5:50   ` Jesse Brandeburg
  2023-10-03 17:43     ` Jay Vosburgh
  0 siblings, 1 reply; 6+ messages in thread
From: Jesse Brandeburg @ 2023-10-03  5:50 UTC (permalink / raw)
  To: Jay Vosburgh
  Cc: intel-wired-lan, linux-pci, pmenzel, netdev, jkc, Vishal Agrawal,
	Przemek Kitszel

On 10/2/2023 4:49 PM, Jay Vosburgh wrote:
> Jesse Brandeburg <jesse.brandeburg@intel.com> wrote:
> 
>> When the system boots into the crash dump kernel after a panic, the ice
>> networking device may still have pending transactions that can cause errors
>> or machine checks when the device is re-enabled. This can prevent the crash
>> dump kernel from loading the driver or collecting the crash data.
>>
>> To avoid this issue, perform a function level reset (FLR) on the ice device
>> via PCIe config space before enabling it on the crash kernel. This will
>> clear any outstanding transactions and stop all queues and interrupts.
>> Restore the config space after the FLR, otherwise it was found in testing
>> that the driver wouldn't load successfully.
> 
> 	How does this differ from ading "reset_devices" to the crash
> kernel command line, per Documentation/admin-guide/kdump/kdump.rst?
> 
> 	-J
> 

Hi Jay, thanks for the question.

That parameter is new to me, and upon looking into the parameter, it
doesn't seem well documented. It also seems to only be used by storage
controllers, and would basically result in the same code I already have.
I suspect since it's a driver opt-in to the parameter, the difference
would be 1) requiring the user to give the reset_devices parameter on
the kdump kernel line (which is a big "if") and 2) less readable code
than the current which does:

if (is_kdump_kernel())
...

and the reset_devices way would be:

if (reset_devices)
...

There are several other examples in the networking tree using the method
I ended up with in this change. I'd argue the preferred way in the
networking tree is to use is_kdump_kernel(), which I like better because
it doesn't require user input and shouldn't have any bad side effects
from doing an extra reset in kdump.

Also, this issue has already been tested to be fixed by this patch.

I'd prefer to keep the patch as is, if that's ok with you.

Thanks,
Jesse




^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH iwl-net v2] ice: reset first in crash dump kernels
  2023-10-03  5:50   ` Jesse Brandeburg
@ 2023-10-03 17:43     ` Jay Vosburgh
  0 siblings, 0 replies; 6+ messages in thread
From: Jay Vosburgh @ 2023-10-03 17:43 UTC (permalink / raw)
  To: Jesse Brandeburg
  Cc: intel-wired-lan, linux-pci, pmenzel, netdev, jkc, Vishal Agrawal,
	Przemek Kitszel

Jesse Brandeburg <jesse.brandeburg@intel.com> wrote:

>On 10/2/2023 4:49 PM, Jay Vosburgh wrote:
>> Jesse Brandeburg <jesse.brandeburg@intel.com> wrote:
>> 
>>> When the system boots into the crash dump kernel after a panic, the ice
>>> networking device may still have pending transactions that can cause errors
>>> or machine checks when the device is re-enabled. This can prevent the crash
>>> dump kernel from loading the driver or collecting the crash data.
>>>
>>> To avoid this issue, perform a function level reset (FLR) on the ice device
>>> via PCIe config space before enabling it on the crash kernel. This will
>>> clear any outstanding transactions and stop all queues and interrupts.
>>> Restore the config space after the FLR, otherwise it was found in testing
>>> that the driver wouldn't load successfully.
>> 
>> 	How does this differ from ading "reset_devices" to the crash
>> kernel command line, per Documentation/admin-guide/kdump/kdump.rst?
>> 
>> 	-J
>> 
>
>Hi Jay, thanks for the question.
>
>That parameter is new to me, and upon looking into the parameter, it
>doesn't seem well documented. It also seems to only be used by storage
>controllers, and would basically result in the same code I already have.
>I suspect since it's a driver opt-in to the parameter, the difference
>would be 1) requiring the user to give the reset_devices parameter on
>the kdump kernel line (which is a big "if") and 2) less readable code
>than the current which does:
>
>if (is_kdump_kernel())
>...
>
>and the reset_devices way would be:
>
>if (reset_devices)
>...
>
>There are several other examples in the networking tree using the method
>I ended up with in this change. I'd argue the preferred way in the
>networking tree is to use is_kdump_kernel(), which I like better because
>it doesn't require user input and shouldn't have any bad side effects
>from doing an extra reset in kdump.
>
>Also, this issue has already been tested to be fixed by this patch.
>
>I'd prefer to keep the patch as is, if that's ok with you.

	Thanks for the explanation; I was wondering if this methodology
would conflict or compete with reset_devices in some way, or if there's
a risk that the FLR would in some cases make things worse.

	Since many device drivers have this sort of logic in them, would
it make sense to put this in the PCI core somewhere to FLR at probe time
if is_kdump_kernel()?  The manifestation of the issue that I'm familiar
with is that DMA requests from the device arrive after the IOMMU DMA
remapping tables have been reset during kexec, leading to failures.

	Regardless, the patch looks fine to me given the current state
of kdump / kexec / reset_devices.

Reviewed-by: Jay Vosburgh <jay.vosburgh@canonical.com>

	-J

---
	-Jay Vosburgh, jay.vosburgh@canonical.com

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH iwl-net v2] ice: reset first in crash dump kernels
  2023-10-02 20:02 [PATCH iwl-net v2] ice: reset first in crash dump kernels Jesse Brandeburg
  2023-10-02 23:49 ` Jay Vosburgh
@ 2023-10-03 22:41 ` Tony Nguyen
  2023-10-04 18:59   ` Jesse Brandeburg
  1 sibling, 1 reply; 6+ messages in thread
From: Tony Nguyen @ 2023-10-03 22:41 UTC (permalink / raw)
  To: Jesse Brandeburg, intel-wired-lan
  Cc: linux-pci, pmenzel, netdev, jkc, Vishal Agrawal, Przemek Kitszel



On 10/2/2023 1:02 PM, Jesse Brandeburg wrote:
> When the system boots into the crash dump kernel after a panic, the ice
> networking device may still have pending transactions that can cause errors
> or machine checks when the device is re-enabled. This can prevent the crash
> dump kernel from loading the driver or collecting the crash data.
> 
> To avoid this issue, perform a function level reset (FLR) on the ice device
> via PCIe config space before enabling it on the crash kernel. This will
> clear any outstanding transactions and stop all queues and interrupts.
> Restore the config space after the FLR, otherwise it was found in testing
> that the driver wouldn't load successfully.
> 
> The following sequence causes the original issue:
> - Load the ice driver with modprobe ice
> - Enable SR-IOV with 2 VFs: echo 2 > /sys/class/net/eth0/device/sriov_num_vfs
> - Trigger a crash with echo c > /proc/sysrq-trigger
> - Load the ice driver again (or let it load automatically) with modprobe ice
> - The system crashes again during pcim_enable_device()
> 

This is missing a Fixes:

> Reported-by: Vishal Agrawal <vagrawal@redhat.com>
> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
> Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
> ---
> v2: respond to list comments and update commit message
> v1: initial version
> ---
>   drivers/net/ethernet/intel/ice/ice_main.c | 15 +++++++++++++++
>   1 file changed, 15 insertions(+)
> 
> diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c
> index c8286adae946..6550c46e4e36 100644
> --- a/drivers/net/ethernet/intel/ice/ice_main.c
> +++ b/drivers/net/ethernet/intel/ice/ice_main.c
> @@ -6,6 +6,7 @@
>   #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
>   
>   #include <generated/utsrelease.h>
> +#include <linux/crash_dump.h>
>   #include "ice.h"
>   #include "ice_base.h"
>   #include "ice_lib.h"
> @@ -5014,6 +5015,20 @@ ice_probe(struct pci_dev *pdev, const struct pci_device_id __always_unused *ent)
>   		return -EINVAL;
>   	}
>   
> +	/* when under a kdump kernel initiate a reset before enabling the
> +	 * device in order to clear out any pending DMA transactions. These
> +	 * transactions can cause some systems to machine check when doing
> +	 * the pcim_enable_device() below.
> +	 */
> +	if (is_kdump_kernel()) {
> +		pci_save_state(pdev);
> +		pci_clear_master(pdev);
> +		err = pcie_flr(pdev);
> +		if (err)
> +			return err;
> +		pci_restore_state(pdev);
> +	}
> +
>   	/* this driver uses devres, see
>   	 * Documentation/driver-api/driver-model/devres.rst
>   	 */
> 
> base-commit: 6a70e5cbedaf8ad10528ac9ac114f3ec20f422df

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH iwl-net v2] ice: reset first in crash dump kernels
  2023-10-03 22:41 ` Tony Nguyen
@ 2023-10-04 18:59   ` Jesse Brandeburg
  0 siblings, 0 replies; 6+ messages in thread
From: Jesse Brandeburg @ 2023-10-04 18:59 UTC (permalink / raw)
  To: Tony Nguyen, intel-wired-lan
  Cc: linux-pci, pmenzel, netdev, jkc, Vishal Agrawal, Przemek Kitszel

On 10/3/2023 3:41 PM, Tony Nguyen wrote:

> This is missing a Fixes:

I'm not sure it is, as I hadn't put it in thinking that there was 
nothing to "fix". But I guess I can put *something* in here in order to 
have the auto-backports work nicely for stable. v3 sent!

> 
>> Reported-by: Vishal Agrawal <vagrawal@redhat.com>
>> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
>> Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>



^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2023-10-04 19:10 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-10-02 20:02 [PATCH iwl-net v2] ice: reset first in crash dump kernels Jesse Brandeburg
2023-10-02 23:49 ` Jay Vosburgh
2023-10-03  5:50   ` Jesse Brandeburg
2023-10-03 17:43     ` Jay Vosburgh
2023-10-03 22:41 ` Tony Nguyen
2023-10-04 18:59   ` Jesse Brandeburg

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).