From: Jesse Brandeburg <jesse.brandeburg@intel.com>
To: Paul Menzel <pmenzel@molgen.mpg.de>
Cc: Przemek Kitszel <przemyslaw.kitszel@intel.com>,
jkc@redhat.com, intel-wired-lan@lists.osuosl.org
Subject: Re: [Intel-wired-lan] [PATCH iwl-net v1] ice: reset first in crash dump kernels
Date: Mon, 2 Oct 2023 11:16:08 -0700
Message-ID: <de02a6f4-b0e7-3ce0-9928-7eedbbc810fb@intel.com>
In-Reply-To: <1cc59274-4555-409a-9f9b-16707f832b52@molgen.mpg.de>

On 9/19/2023 10:18 PM, Paul Menzel wrote:
> Dear Jesse,
>
>
> Thank you for your patch.
>
> Am 19.09.23 um 23:29 schrieb Jesse Brandeburg:
>> When booting into the crash dump kernels there are cases where upon
>> enabling the device, the system under test will panic or machine check.
>>
>> One such test is to
>> - load ice driver
>> $ modprobe ice
>> - enable SR-IOV (2 VFs)
>> $ echo 2 > /sys/class/net/eth0/device/sriov_num_vfs
>> - crash
>> echo c > /proc/sysrq-trigger
>
> Above you prepended a $.
Fixed in v2.
>
>> - load ice driver (or happens automatically)
>> modprobe ice
>> - crash during pcim_enable_device()
>>
>> Avoid this problem by issuing a FLR to the device via PCIe config space
>> on the crash kernel, to clear out any outstanding transactions and stop
>> all queues and interrupts. Restore config space afterword because the
>
> afterw*a*rd
Fixed in v2.
>
>> driver won't load successfully otherwise.
>
> Excuse my ignorance, could you please add, what the crashdump kernel
> does differently from the “normal” kernel, so this special handling is
> needed?
I added more description to the v2 commit message; I hope that helps.
In summary: the crashdump kernel starts up with the hardware in a "dirty"
state, because the previously running kernel crashed unexpectedly while
its devices were still active.
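
For anyone who wants to confirm from user space that they are in the
crash kernel before testing, here is a rough sketch. It assumes the
crash kernel was booted with an elfcorehdr= parameter on its command
line, which is what is_kdump_kernel() keys off on most architectures
(some set it up differently). The helper name is_kdump_cmdline is made
up for this example and is not part of the patch.

```shell
#!/bin/sh
# Hypothetical helper (not part of the patch): succeeds when the given
# kernel command line contains an elfcorehdr= parameter.
# Assumption: the crash kernel received elfcorehdr= from kexec/kdump.
is_kdump_cmdline() {
    case " $1 " in
        *" elfcorehdr="*) return 0 ;;
        *) return 1 ;;
    esac
}

# Check the running kernel; an unreadable /proc/cmdline counts as "normal".
if is_kdump_cmdline "$(cat /proc/cmdline 2>/dev/null)"; then
    echo "running in a kdump (crash) kernel"
else
    echo "running in a normal kernel"
fi
```

Checking for /proc/vmcore is another common indicator, but the command
line check above has the advantage of being easy to test offline.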
>
>> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
>> Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
>> ---
>> drivers/net/ethernet/intel/ice/ice_main.c | 15 +++++++++++++++
>> 1 file changed, 15 insertions(+)
>>
>> diff --git a/drivers/net/ethernet/intel/ice/ice_main.c
>> b/drivers/net/ethernet/intel/ice/ice_main.c
>> index c8286adae946..6550c46e4e36 100644
>> --- a/drivers/net/ethernet/intel/ice/ice_main.c
>> +++ b/drivers/net/ethernet/intel/ice/ice_main.c
>> @@ -6,6 +6,7 @@
>> #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
>> #include <generated/utsrelease.h>
>> +#include <linux/crash_dump.h>
>> #include "ice.h"
>> #include "ice_base.h"
>> #include "ice_lib.h"
>> @@ -5014,6 +5015,20 @@ ice_probe(struct pci_dev *pdev, const struct
>> pci_device_id __always_unused *ent)
>> return -EINVAL;
>> }
>> + /* when under a kdump kernel initiate a reset before enabling the
>> + * device in order to clear out any pending DMA transactions. These
>> + * transactions can cause some systems to machine check when doing
>> + * the pcim_enable_device() below.
>> + */
>> + if (is_kdump_kernel()) {
>> + pci_save_state(pdev);
>> + pci_clear_master(pdev);
>> + err = pcie_flr(pdev);
>> + if (err)
>> + return err;
>> + pci_restore_state(pdev);
>> + }
>> +
>
> Should this be added to the common PCI code? Maybe loop the PCI
> subsystem folks in?
Ok, I'll cc: linux-pci when I send v2.
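
As a side note for that discussion, whether the PCI core considers FLR
usable for a given device can be checked from user space. This is only
a sketch: it assumes a kernel new enough to expose the reset_method
sysfs attribute (v5.15 or later), the helper name supports_flr is
invented for this example, and eth0 is just the interface name from the
reproduction steps.

```shell
#!/bin/sh
# Hypothetical helper (not part of the patch): succeeds when the PCI
# device directory given as $1 lists "flr" in its reset_method attribute.
# Assumption: the kernel exposes reset_method (v5.15+).
supports_flr() {
    methods_file="$1/reset_method"
    [ -r "$methods_file" ] || return 1
    case " $(cat "$methods_file") " in
        *" flr "*) return 0 ;;
        *) return 1 ;;
    esac
}

# eth0 is the interface used in the reproduction steps; adjust as needed.
dev=/sys/class/net/eth0/device
if supports_flr "$dev"; then
    echo "FLR is listed as a reset method for $dev"
else
    echo "FLR is not listed (or reset_method is unavailable) for $dev"
fi
```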
>
>> /* this driver uses devres, see
>> * Documentation/driver-api/driver-model/devres.rst
>> */
>
>
> Kind regards,
>
> Paul