Intel-Wired-Lan Archive on lore.kernel.org
 help / color / mirror / Atom feed
From: Jesse Brandeburg <jesse.brandeburg@intel.com>
To: Paul Menzel <pmenzel@molgen.mpg.de>
Cc: Przemek Kitszel <przemyslaw.kitszel@intel.com>,
	jkc@redhat.com, intel-wired-lan@lists.osuosl.org
Subject: Re: [Intel-wired-lan] [PATCH iwl-net v1] ice: reset first in crash dump kernels
Date: Mon, 2 Oct 2023 11:16:08 -0700	[thread overview]
Message-ID: <de02a6f4-b0e7-3ce0-9928-7eedbbc810fb@intel.com> (raw)
In-Reply-To: <1cc59274-4555-409a-9f9b-16707f832b52@molgen.mpg.de>

On 9/19/2023 10:18 PM, Paul Menzel wrote:
> Dear Jesse,
> 
> 
> Thank you for your patch.
> 
> Am 19.09.23 um 23:29 schrieb Jesse Brandeburg:
>> When booting into the crash dump kernels there are cases where upon
>> enabling the device, the system under test will panic or machine check.
>>
>> One such test is to
>> - load ice driver
>> $ modprobe ice
>> - enable SR-IOV (2 VFs)
>> $ echo 2 > /sys/class/net/eth0/device/sriov_num_vfs
>> - crash
>> echo c > /proc/sysrq-trigger
> 
> Above you prepended a $.

Fixed in v2.

> 
>> - load ice driver (or happens automatically)
>> modprobe ice
>> - crash during pcim_enable_device()
>>
>> Avoid this problem by issuing a FLR to the device via PCIe config space
>> on the crash kernel, to clear out any outstanding transactions and stop
>> all queues and interrupts. Restore config space afterword because the
> 
> afterw*a*rd

Fixed in v2.


> 
>> driver won't load successfully otherwise.
> 
> Excuse my ignorance, could you please add, what the crashdump kernel
> does differently from the “normal” kernel, so this special handling is
> needed?

I added more description in the v2 commit message, I hope that helps.

In summary: the crashdump kernel is starting up on "dirty" state of
hardware, due to the surprise crash of the previously running kernel
that had running devices when it "panicked"

> 
>> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
>> Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
>> ---
>>   drivers/net/ethernet/intel/ice/ice_main.c | 15 +++++++++++++++
>>   1 file changed, 15 insertions(+)
>>
>> diff --git a/drivers/net/ethernet/intel/ice/ice_main.c
>> b/drivers/net/ethernet/intel/ice/ice_main.c
>> index c8286adae946..6550c46e4e36 100644
>> --- a/drivers/net/ethernet/intel/ice/ice_main.c
>> +++ b/drivers/net/ethernet/intel/ice/ice_main.c
>> @@ -6,6 +6,7 @@
>>   #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
>>     #include <generated/utsrelease.h>
>> +#include <linux/crash_dump.h>
>>   #include "ice.h"
>>   #include "ice_base.h"
>>   #include "ice_lib.h"
>> @@ -5014,6 +5015,20 @@ ice_probe(struct pci_dev *pdev, const struct
>> pci_device_id __always_unused *ent)
>>           return -EINVAL;
>>       }
>>   +    /* when under a kdump kernel initiate a reset before enabling the
>> +     * device in order to clear out any pending DMA transactions. These
>> +     * transactions can cause some systems to machine check when doing
>> +     * the pcim_enable_device() below.
>> +     */
>> +    if (is_kdump_kernel()) {
>> +        pci_save_state(pdev);
>> +        pci_clear_master(pdev);
>> +        err = pcie_flr(pdev);
>> +        if (err)
>> +            return err;
>> +        pci_restore_state(pdev);
>> +    }
>> +
> 
> Should this be added to the common PCI code? Maybe loop the PCI
> subsystem folks in?

Ok, I'll cc: linux-pci when I send v2.


> 
>>       /* this driver uses devres, see
>>        * Documentation/driver-api/driver-model/devres.rst
>>        */
> 
> 
> Kind regards,
> 
> Paul

_______________________________________________
Intel-wired-lan mailing list
Intel-wired-lan@osuosl.org
https://lists.osuosl.org/mailman/listinfo/intel-wired-lan

      reply	other threads:[~2023-10-02 19:06 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-09-19 21:29 [Intel-wired-lan] [PATCH iwl-net v1] ice: reset first in crash dump kernels Jesse Brandeburg
2023-09-20  5:18 ` Paul Menzel
2023-10-02 18:16   ` Jesse Brandeburg [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=de02a6f4-b0e7-3ce0-9928-7eedbbc810fb@intel.com \
    --to=jesse.brandeburg@intel.com \
    --cc=intel-wired-lan@lists.osuosl.org \
    --cc=jkc@redhat.com \
    --cc=pmenzel@molgen.mpg.de \
    --cc=przemyslaw.kitszel@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox