Intel-Wired-Lan Archive on lore.kernel.org
 help / color / mirror / Atom feed
From: Vitaly Lifshits <vitaly.lifshits@intel.com>
To: <intel-wired-lan@osuosl.org>
Subject: Re: [Intel-wired-lan] ACPI IRQ storm with 6.10
Date: Wed, 21 Aug 2024 16:50:47 +0300	[thread overview]
Message-ID: <2d9a325c-e7bd-7df8-4575-3f32c85eabd2@intel.com> (raw)
In-Reply-To: <f0552842-7824-424e-af21-ac9eb3c5c14d@kernel.org>

[-- Attachment #1: Type: text/plain, Size: 7362 bytes --]


On 8/21/2024 8:09 AM, Jiri Slaby wrote:
> On 20. 08. 24, 23:30, Bjorn Helgaas wrote:
>> On Tue, Aug 20, 2024 at 11:13:54PM +0200, Petr Valenta wrote:
>>> Dne 20. 08. 24 v 20:09 Bjorn Helgaas napsal(a):
>>>> On Mon, Aug 19, 2024 at 07:23:42AM +0200, Jiri Slaby wrote:
>>>>> On 19. 08. 24, 6:50, Jiri Slaby wrote:
>>>>>> CC e1000e guys + Jesse (due to 75a3f93b5383) + Bjorn (due to 
>>>>>> b2c289415b2b)
>>>>>
>>>>> Bjorn,
>>>>>
>>>>> I am confused by these changes:
>>>>> ==========================================
>>>>> @@ -291,16 +288,13 @@ static int e1000_set_link_ksettings(struct 
>>>>> net_device
>>>>> *net
>>>>> dev,
>>>>>            * duplex is forced.
>>>>>            */
>>>>>           if (cmd->base.eth_tp_mdix_ctrl) {
>>>>> -               if (hw->phy.media_type != e1000_media_type_copper) {
>>>>> -                       ret_val = -EOPNOTSUPP;
>>>>> -                       goto out;
>>>>> -               }
>>>>> +               if (hw->phy.media_type != e1000_media_type_copper)
>>>>> +                       return -EOPNOTSUPP;
>>>>>
>>>>>                   if ((cmd->base.eth_tp_mdix_ctrl != 
>>>>> ETH_TP_MDI_AUTO) &&
>>>>>                       (cmd->base.autoneg != AUTONEG_ENABLE)) {
>>>>>                           e_err("forcing MDI/MDI-X state is not 
>>>>> supported when
>>>>> lin
>>>>> k speed and/or duplex are forced\n");
>>>>> -                       ret_val = -EINVAL;
>>>>> -                       goto out;
>>>>> +                       return -EINVAL;
>>>>>                   }
>>>>>           }
>>>>>
>>>>> @@ -347,7 +341,6 @@ static int e1000_set_link_ksettings(struct 
>>>>> net_device
>>>>> *netde
>>>>> v,
>>>>>           }
>>>>>
>>>>>    out:
>>>>> -       pm_runtime_put_sync(netdev->dev.parent);
>>>>>           clear_bit(__E1000_RESETTING, &adapter->state);
>>>>>           return ret_val;
>>>>>    }
>>>>> ==========================================
>>>>>
>>>>> So no more clear_bit(__E1000_RESETTING in the above fail paths. Is 
>>>>> that
>>>>> intentional?
>>>>
>>>> Not intentional.  Petr, do you have the ability to test the patch
>>>> below?  I'm not sure it's the correct fix, but it reverts the pieces
>>>> of b2c289415b2b that Jiri pointed out.
>>>
>>> I tested the patch below but it didn't help. After the first boot 
>>> with new
>>> kernel it looked promising as the irq storm only appeared for a few 
>>> seconds,
>>> but with subsequent reboots it was the same as without the patch.
>>
>> Thank you very much for testing that!
>
>
>>> To be sure, I also send the md5sum of ethtool.c after applying the 
>>> patch:
>>>
>>> a25c003257538f16994b4d7c4260e894 ethtool.c
>>
>> Thanks, that matches what I get when applying the patch on v6.10.
>>
>> I'm at a loss.  You could try reverting the entire b2c289415b2b commit
>> (patch for that is below).
>
> FWIW he already tested with b2c289415b2b reverted (I provided him with 
> a built kernel). It behaves the same. So you are not the breaker.
>
>> If that doesn't help, I guess you could try reverting the other
>> commits Jiri mentioned:
>>
>>    76a0a3f9cc2f e1000e: fix force smbus during suspend flow
>>    c93a6f62cb1b e1000e: Fix S0ix residency on corporate systems
>>    bfd546a552e1 e1000e: move force SMBUS near the end of enable_ulp 
>> function
>>    6918107e2540 net: e1000e & ixgbe: Remove PCI_HEADER_TYPE_MFD 
>> duplicates
>>    1eb2cded45b3 net: annotate writes on dev->mtu from ndo_change_mtu()
>>    b2c289415b2b e1000e: Remove redundant runtime resume for ethtool_ops
>>    75a3f93b5383 net: intel: implement modern PM ops declarations
>>
>> If you do this, I would revert 76a0a3f9cc2f, test, then revert
>> c93a6f62cb1b in addition, test, then revert bfd546a552e1 in addition,
>> etc.
>
> Or perhaps easier to do:
>   git bisect v6.10 v6.9 -- drivers/net/ethernet/intel/e1000e/
> directly. But that assumes one of the above commits broke it. If they 
> did not, as a last resort, you can still do full bisect (without the 
> "-- drivers" part).
>
> I would take v6.10 suses config.
> Would boot 6.10.
> do lsmod > /tmp/lsmod
> make LSMOD=/tmp/lsmod localyesconfig
> make bzImage
> and use that bzImage.
>
> Note that graphics, wireless and other stuff will be defunct unless 
> you build in firmwares for them (EXTRA_FIRMWARE config). Alternatively 
> use localmodconfig and build and install also modules (now limited to 
> your machine).
>
> thanks,

 From all the data in the Suse Bugzilla I understood that the issue 
happens with a cable disconnected.

Does it reproduce with a connected cable?

Also, I suspect that it might be related to wake up interrupts that the 
I219 device gets which might cause the IRQ storm.

I would like to ask you to enable the debug prints in the e1000e driver 
and share the dmesg log. I would like to see if we can get the values of 
WUFC and WUS on on resume.

For getting these values please comment out the following in 
__e1000_resume function in netdev.c file:

         /* report the system wakeup cause from S3/S4 */
//      if (adapter->flags2 & FLAG2_HAS_PHY_WAKEUP) {
                 u16 phy_data;

                 e1e_rphy(&adapter->hw, BM_WUS, &phy_data);
                 if (phy_data) {
                         e_info("PHY Wakeup cause - %s\n",
                                phy_data & E1000_WUS_EX ? "Unicast Packet" :
                                phy_data & E1000_WUS_MC ? "Multicast 
Packet" :
                                phy_data & E1000_WUS_BC ? "Broadcast 
Packet" :
                                phy_data & E1000_WUS_MAG ? "Magic Packet" :
                                phy_data & E1000_WUS_LNKC ?
                                "Link Status Change" : "other");
                 }
                 e1e_wphy(&adapter->hw, BM_WUS, ~0);
//      } else {
                 u32 wus = er32(WUS);

                 if (wus) {
                         e_info("MAC Wakeup cause - %s\n",
                                wus & E1000_WUS_EX ? "Unicast Packet" :
                                wus & E1000_WUS_MC ? "Multicast Packet" :
                                wus & E1000_WUS_BC ? "Broadcast Packet" :
                                wus & E1000_WUS_MAG ? "Magic Packet" :
                                wus & E1000_WUS_LNKC ? "Link Status 
Change" :
                                "other");
//              }
                 ew32(WUS, ~0);

To enable the debug prints you can run:

echo "module e1000e +p" | sudo tee /sys/kernel/debug/dynamic_debug/control

[-- Attachment #2: Type: text/html, Size: 15171 bytes --]

  reply	other threads:[~2024-08-21 13:51 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <60ac8988-ace4-4cf0-8c44-028ca741c0a1@kernel.org>
     [not found] ` <bd149809-f670-4b5d-a18a-f1f63624265f@kernel.org>
     [not found]   ` <CAJZ5v0jrgC+cvjDfw+vQ1jyVz1oj28uTf9k+-Kk2EAPJj3VmdQ@mail.gmail.com>
     [not found]     ` <6eac5c37-a5a8-4ccf-aef6-62a4a0bfcea0@jevklidu.cz>
2024-08-19  4:50       ` [Intel-wired-lan] ACPI IRQ storm with 6.10 Jiri Slaby
2024-08-19  5:23         ` Jiri Slaby
2024-08-19 16:47           ` Bjorn Helgaas
2024-08-20 18:09           ` Bjorn Helgaas
2024-08-20 21:13             ` Petr Valenta
2024-08-20 21:30               ` Bjorn Helgaas
2024-08-21  5:09                 ` Jiri Slaby
2024-08-21 13:50                   ` Vitaly Lifshits [this message]
2024-08-21 11:39                 ` Petr Valenta
2024-08-21 14:59                   ` Bjorn Helgaas
2024-08-21 15:17                     ` Vitaly Lifshits
2024-08-22  7:44                       ` Petr Valenta
2024-08-22  8:33                         ` Petr Valenta
2024-08-22  9:18                           ` Vitaly Lifshits
2024-08-22 10:24                             ` Petr Valenta
2024-08-20 18:44           ` Bjorn Helgaas

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=2d9a325c-e7bd-7df8-4575-3f32c85eabd2@intel.com \
    --to=vitaly.lifshits@intel.com \
    --cc=intel-wired-lan@osuosl.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox