From: Vitaly Lifshits <vitaly.lifshits@intel.com>
To: <intel-wired-lan@osuosl.org>
Subject: Re: [Intel-wired-lan] ACPI IRQ storm with 6.10
Date: Wed, 21 Aug 2024 16:50:47 +0300 [thread overview]
Message-ID: <2d9a325c-e7bd-7df8-4575-3f32c85eabd2@intel.com> (raw)
In-Reply-To: <f0552842-7824-424e-af21-ac9eb3c5c14d@kernel.org>
[-- Attachment #1: Type: text/plain, Size: 7362 bytes --]
On 8/21/2024 8:09 AM, Jiri Slaby wrote:
> On 20. 08. 24, 23:30, Bjorn Helgaas wrote:
>> On Tue, Aug 20, 2024 at 11:13:54PM +0200, Petr Valenta wrote:
>>> Dne 20. 08. 24 v 20:09 Bjorn Helgaas napsal(a):
>>>> On Mon, Aug 19, 2024 at 07:23:42AM +0200, Jiri Slaby wrote:
>>>>> On 19. 08. 24, 6:50, Jiri Slaby wrote:
>>>>>> CC e1000e guys + Jesse (due to 75a3f93b5383) + Bjorn (due to
>>>>>> b2c289415b2b)
>>>>>
>>>>> Bjorn,
>>>>>
>>>>> I am confused by these changes:
>>>>> ==========================================
>>>>> @@ -291,16 +288,13 @@ static int e1000_set_link_ksettings(struct
>>>>> net_device
>>>>> *net
>>>>> dev,
>>>>> * duplex is forced.
>>>>> */
>>>>> if (cmd->base.eth_tp_mdix_ctrl) {
>>>>> - if (hw->phy.media_type != e1000_media_type_copper) {
>>>>> - ret_val = -EOPNOTSUPP;
>>>>> - goto out;
>>>>> - }
>>>>> + if (hw->phy.media_type != e1000_media_type_copper)
>>>>> + return -EOPNOTSUPP;
>>>>>
>>>>> if ((cmd->base.eth_tp_mdix_ctrl !=
>>>>> ETH_TP_MDI_AUTO) &&
>>>>> (cmd->base.autoneg != AUTONEG_ENABLE)) {
>>>>> e_err("forcing MDI/MDI-X state is not
>>>>> supported when
>>>>> lin
>>>>> k speed and/or duplex are forced\n");
>>>>> - ret_val = -EINVAL;
>>>>> - goto out;
>>>>> + return -EINVAL;
>>>>> }
>>>>> }
>>>>>
>>>>> @@ -347,7 +341,6 @@ static int e1000_set_link_ksettings(struct
>>>>> net_device
>>>>> *netde
>>>>> v,
>>>>> }
>>>>>
>>>>> out:
>>>>> - pm_runtime_put_sync(netdev->dev.parent);
>>>>> clear_bit(__E1000_RESETTING, &adapter->state);
>>>>> return ret_val;
>>>>> }
>>>>> ==========================================
>>>>>
>>>>> So no more clear_bit(__E1000_RESETTING in the above fail paths. Is
>>>>> that
>>>>> intentional?
>>>>
>>>> Not intentional. Petr, do you have the ability to test the patch
>>>> below? I'm not sure it's the correct fix, but it reverts the pieces
>>>> of b2c289415b2b that Jiri pointed out.
>>>
>>> I tested the patch below but it didn't help. After the first boot
>>> with new
>>> kernel it looked promising as the irq storm only appeared for a few
>>> seconds,
>>> but with subsequent reboots it was the same as without the patch.
>>
>> Thank you very much for testing that!
>
>
>>> To be sure, I also send the md5sum of ethtool.c after applying the
>>> patch:
>>>
>>> a25c003257538f16994b4d7c4260e894 ethtool.c
>>
>> Thanks, that matches what I get when applying the patch on v6.10.
>>
>> I'm at a loss. You could try reverting the entire b2c289415b2b commit
>> (patch for that is below).
>
> FWIW he already tested with b2c289415b2b reverted (I provided him with
> a built kernel). It behaves the same. So you are not the breaker.
>
>> If that doesn't help, I guess you could try reverting the other
>> commits Jiri mentioned:
>>
>> 76a0a3f9cc2f e1000e: fix force smbus during suspend flow
>> c93a6f62cb1b e1000e: Fix S0ix residency on corporate systems
>> bfd546a552e1 e1000e: move force SMBUS near the end of enable_ulp
>> function
>> 6918107e2540 net: e1000e & ixgbe: Remove PCI_HEADER_TYPE_MFD
>> duplicates
>> 1eb2cded45b3 net: annotate writes on dev->mtu from ndo_change_mtu()
>> b2c289415b2b e1000e: Remove redundant runtime resume for ethtool_ops
>> 75a3f93b5383 net: intel: implement modern PM ops declarations
>>
>> If you do this, I would revert 76a0a3f9cc2f, test, then revert
>> c93a6f62cb1b in addition, test, then revert bfd546a552e1 in addition,
>> etc.
>
> Or perhaps easier to do:
> git bisect v6.10 v6.9 -- drivers/net/ethernet/intel/e1000e/
> directly. But that assumes one of the above commits broke it. If they
> did not, as a last resort, you can still do full bisect (without the
> "-- drivers" part).
>
> I would take v6.10 suses config.
> Would boot 6.10.
> do lsmod > /tmp/lsmod
> make LSMOD=/tmp/lsmod localyesconfig
> make bzImage
> and use that bzImage.
>
> Note that graphics, wireless and other stuff will be defunct unless
> you build in firmwares for them (EXTRA_FIRMWARE config). Alternatively
> use localmodconfig and build and install also modules (now limited to
> your machine).
>
> thanks,
From all the data in the Suse Bugzilla I understood that the issue
happens with a cable disconnected.
Does it reproduce with a connected cable?
Also, I suspect that it might be related to wake up interrupts that the
I219 device gets which might cause the IRQ storm.
I would like to ask you to enable the debug prints in the e1000e driver
and share the dmesg log. I would like to see if we can get the values of
WUFC and WUS on on resume.
For getting these values please comment out the following in
__e1000_resume function in netdev.c file:
/* report the system wakeup cause from S3/S4 */
// if (adapter->flags2 & FLAG2_HAS_PHY_WAKEUP) {
u16 phy_data;
e1e_rphy(&adapter->hw, BM_WUS, &phy_data);
if (phy_data) {
e_info("PHY Wakeup cause - %s\n",
phy_data & E1000_WUS_EX ? "Unicast Packet" :
phy_data & E1000_WUS_MC ? "Multicast
Packet" :
phy_data & E1000_WUS_BC ? "Broadcast
Packet" :
phy_data & E1000_WUS_MAG ? "Magic Packet" :
phy_data & E1000_WUS_LNKC ?
"Link Status Change" : "other");
}
e1e_wphy(&adapter->hw, BM_WUS, ~0);
// } else {
u32 wus = er32(WUS);
if (wus) {
e_info("MAC Wakeup cause - %s\n",
wus & E1000_WUS_EX ? "Unicast Packet" :
wus & E1000_WUS_MC ? "Multicast Packet" :
wus & E1000_WUS_BC ? "Broadcast Packet" :
wus & E1000_WUS_MAG ? "Magic Packet" :
wus & E1000_WUS_LNKC ? "Link Status
Change" :
"other");
// }
ew32(WUS, ~0);
To enable the debug prints you can run:
echo "module e1000e +p" | sudo tee /sys/kernel/debug/dynamic_debug/control
[-- Attachment #2: Type: text/html, Size: 15171 bytes --]
next prev parent reply other threads:[~2024-08-21 13:51 UTC|newest]
Thread overview: 16+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <60ac8988-ace4-4cf0-8c44-028ca741c0a1@kernel.org>
[not found] ` <bd149809-f670-4b5d-a18a-f1f63624265f@kernel.org>
[not found] ` <CAJZ5v0jrgC+cvjDfw+vQ1jyVz1oj28uTf9k+-Kk2EAPJj3VmdQ@mail.gmail.com>
[not found] ` <6eac5c37-a5a8-4ccf-aef6-62a4a0bfcea0@jevklidu.cz>
2024-08-19 4:50 ` [Intel-wired-lan] ACPI IRQ storm with 6.10 Jiri Slaby
2024-08-19 5:23 ` Jiri Slaby
2024-08-19 16:47 ` Bjorn Helgaas
2024-08-20 18:09 ` Bjorn Helgaas
2024-08-20 21:13 ` Petr Valenta
2024-08-20 21:30 ` Bjorn Helgaas
2024-08-21 5:09 ` Jiri Slaby
2024-08-21 13:50 ` Vitaly Lifshits [this message]
2024-08-21 11:39 ` Petr Valenta
2024-08-21 14:59 ` Bjorn Helgaas
2024-08-21 15:17 ` Vitaly Lifshits
2024-08-22 7:44 ` Petr Valenta
2024-08-22 8:33 ` Petr Valenta
2024-08-22 9:18 ` Vitaly Lifshits
2024-08-22 10:24 ` Petr Valenta
2024-08-20 18:44 ` Bjorn Helgaas
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=2d9a325c-e7bd-7df8-4575-3f32c85eabd2@intel.com \
--to=vitaly.lifshits@intel.com \
--cc=intel-wired-lan@osuosl.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox