* [Intel-wired-lan] suspend/resume broken of igc driver broken on 6.12
@ 2025-01-30 17:11 Stephen Hemminger
  2025-01-30 19:17 ` Lifshits, Vitaly
  0 siblings, 1 reply; 11+ messages in thread
From: Stephen Hemminger @ 2025-01-30 17:11 UTC (permalink / raw)
  To: anthony.l.nguyen, jesse.brandeburg; +Cc: intel-wired-lan

I am using:

5a:00.0 Ethernet controller: Intel Corporation Ethernet Controller I226-LM (rev 04)
	Subsystem: Intel Corporation Device 0000
	Flags: bus master, fast devsel, latency 0, IRQ 19, IOMMU group 20
	Memory at 6c500000 (32-bit, non-prefetchable) [size=1M]
	Memory at 6c600000 (32-bit, non-prefetchable) [size=16K]
	Capabilities: [40] Power Management version 3
	Capabilities: [50] MSI: Enable- Count=1/1 Maskable+ 64bit+
	Capabilities: [70] MSI-X: Enable+ Count=5 Masked-
	Capabilities: [a0] Express Endpoint, IntMsgNum 0
	Capabilities: [100] Advanced Error Reporting
	Capabilities: [140] Device Serial Number 58-47-ca-ff-ff-7a-98-3d
	Capabilities: [1c0] Latency Tolerance Reporting
	Capabilities: [1f0] Precision Time Measurement
	Capabilities: [1e0] L1 PM Substates
	Kernel driver in use: igc
	Kernel modules: igc

Using both Debian testing and my own kernel built from 6.12, the igc
driver appears broken after resume.

After resuming, the device is down and no address is present.
Attempts to set link up manually fail.

If I do rmmod/modprobe of igc it comes back.

I am doing a bit of bisecting, but it is slow going.

^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [Intel-wired-lan] suspend/resume broken of igc driver broken on 6.12 2025-01-30 17:11 [Intel-wired-lan] suspend/resume broken of igc driver broken on 6.12 Stephen Hemminger @ 2025-01-30 19:17 ` Lifshits, Vitaly 2025-01-30 21:08 ` Stephen Hemminger 2025-01-31 1:21 ` Stephen Hemminger 0 siblings, 2 replies; 11+ messages in thread From: Lifshits, Vitaly @ 2025-01-30 19:17 UTC (permalink / raw) To: Stephen Hemminger, anthony.l.nguyen, jesse.brandeburg; +Cc: intel-wired-lan On 1/30/2025 7:11 PM, Stephen Hemminger wrote: > I am using: > > 5a:00.0 Ethernet controller: Intel Corporation Ethernet Controller I226-LM (rev 04) > Subsystem: Intel Corporation Device 0000 > Flags: bus master, fast devsel, latency 0, IRQ 19, IOMMU group 20 > Memory at 6c500000 (32-bit, non-prefetchable) [size=1M] > Memory at 6c600000 (32-bit, non-prefetchable) [size=16K] > Capabilities: [40] Power Management version 3 > Capabilities: [50] MSI: Enable- Count=1/1 Maskable+ 64bit+ > Capabilities: [70] MSI-X: Enable+ Count=5 Masked- > Capabilities: [a0] Express Endpoint, IntMsgNum 0 > Capabilities: [100] Advanced Error Reporting > Capabilities: [140] Device Serial Number 58-47-ca-ff-ff-7a-98-3d > Capabilities: [1c0] Latency Tolerance Reporting > Capabilities: [1f0] Precision Time Measurement > Capabilities: [1e0] L1 PM Substates > Kernel driver in use: igc > Kernel modules: igc > > > Using both Debian testing and my own kernel built from 6.12, the igc > driver appears broken after resume. From which system state are you resuming? > > After resuming the device is down and no address present. > Attempts to set link up manually fail. Did you get any errors in the dmesg log? What is the firmware version on your device (you can get it by running ethtool -i)? > If I do rmmod/modprobe of igc it comes back. > > Doing a bit of bisectting but it is slow going. Meanwhile, we'll also try to reproduce this issue in our lab. Could you share more details about your system so we can create a similar setup? 
* Re: [Intel-wired-lan] suspend/resume broken of igc driver broken on 6.12 2025-01-30 19:17 ` Lifshits, Vitaly @ 2025-01-30 21:08 ` Stephen Hemminger 2025-01-31 1:21 ` Stephen Hemminger 1 sibling, 0 replies; 11+ messages in thread From: Stephen Hemminger @ 2025-01-30 21:08 UTC (permalink / raw) To: Lifshits, Vitaly; +Cc: anthony.l.nguyen, jesse.brandeburg, intel-wired-lan On Thu, 30 Jan 2025 21:17:30 +0200 "Lifshits, Vitaly" <vitaly.lifshits@intel.com> wrote: > On 1/30/2025 7:11 PM, Stephen Hemminger wrote: > > I am using: > > > > 5a:00.0 Ethernet controller: Intel Corporation Ethernet Controller I226-LM (rev 04) > > Subsystem: Intel Corporation Device 0000 > > Flags: bus master, fast devsel, latency 0, IRQ 19, IOMMU group 20 > > Memory at 6c500000 (32-bit, non-prefetchable) [size=1M] > > Memory at 6c600000 (32-bit, non-prefetchable) [size=16K] > > Capabilities: [40] Power Management version 3 > > Capabilities: [50] MSI: Enable- Count=1/1 Maskable+ 64bit+ > > Capabilities: [70] MSI-X: Enable+ Count=5 Masked- > > Capabilities: [a0] Express Endpoint, IntMsgNum 0 > > Capabilities: [100] Advanced Error Reporting > > Capabilities: [140] Device Serial Number 58-47-ca-ff-ff-7a-98-3d > > Capabilities: [1c0] Latency Tolerance Reporting > > Capabilities: [1f0] Precision Time Measurement > > Capabilities: [1e0] L1 PM Substates > > Kernel driver in use: igc > > Kernel modules: igc > > > > > > Using both Debian testing and my own kernel built from 6.12, the igc > > driver appears broken after resume. 

Before suspend:

$ sudo ethtool enp90s0
Settings for enp90s0:
	Supported ports: [ TP ]
	Supported link modes:   10baseT/Half 10baseT/Full
	                        100baseT/Half 100baseT/Full
	                        1000baseT/Full
	                        2500baseT/Full
	Supported pause frame use: Symmetric
	Supports auto-negotiation: Yes
	Supported FEC modes: Not reported
	Advertised link modes:  10baseT/Half 10baseT/Full
	                        100baseT/Half 100baseT/Full
	                        1000baseT/Full
	                        2500baseT/Full
	Advertised pause frame use: Symmetric
	Advertised auto-negotiation: Yes
	Advertised FEC modes: Not reported
	Speed: 1000Mb/s
	Duplex: Full
	Auto-negotiation: on
	Port: Twisted Pair
	PHYAD: 0
	Transceiver: internal
	MDI-X: Unknown
	Supports Wake-on: pumbg
	Wake-on: d
	Current message level: 0x00000007 (7)
	                       drv probe link
	Link detected: yes

> From which system state are you resuming?

Suspend to RAM.

>
> >
> > After resuming the device is down and no address present.
> > Attempts to set link up manually fail.
>
> Did you get any errors in the dmesg log?

See below.

> What is the firmware version on your device (you can get it by running
> ethtool -i)?

$ sudo ethtool -i enp90s0
driver: igc
version: 6.12.9-amd64
firmware-version: 2017:888d
expansion-rom-version:
bus-info: 0000:5a:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes
supports-priv-flags: yes

The error after resume is:

$ ip -br a
lo               UNKNOWN        127.0.0.1/8 ::1/128
enp87s0          DOWN
enp90s0          DOWN
enp2s0f0np0      UP
enp2s0f1np1      UP
wlp91s0          DOWN

$ sudo ip li set enp90s0 up
RTNETLINK answers: No such device

> > If I do rmmod/modprobe of igc it comes back.
> >
> > Doing a bit of bisecting but it is slow going.
>
> Meanwhile, we'll also try to reproduce this issue in our lab. Could you
> share more details about your system so we can create a similar setup?

Dmesg, starting from the suspend:

[14229.851637] Lockdown: systemd-logind: hibernation is restricted; see man kernel_lockdown.7
[14230.089271] PM: suspend entry (deep)
[14230.093900] Filesystems sync: 0.004 seconds
[14230.095179] Freezing user space processes
[14230.096559] Freezing user space processes completed (elapsed 0.001 seconds)
[14230.096561] OOM killer disabled.
[14230.096562] Freezing remaining freezable tasks
[14230.097744] Freezing remaining freezable tasks completed (elapsed 0.001 seconds)
[14230.097773] printk: Suspending console(s) (use no_console_suspend to debug)
[14230.134978] serial 00:01: disabled
[14230.607766] ACPI: PM: Preparing to enter system sleep state S3
[14230.618973] ACPI: PM: Saving platform NVS memory
[14230.619096] Disabling non-boot CPUs ...
[14230.621589] smpboot: CPU 19 is now offline
[14230.627525] smpboot: CPU 18 is now offline
[14230.630805] smpboot: CPU 17 is now offline
[14230.635371] smpboot: CPU 16 is now offline
[14230.641840] smpboot: CPU 15 is now offline
[14230.649528] smpboot: CPU 14 is now offline
[14230.658873] smpboot: CPU 13 is now offline
[14230.666231] smpboot: CPU 12 is now offline
[14230.672531] smpboot: CPU 11 is now offline
[14230.684986] smpboot: CPU 10 is now offline
[14230.689311] smpboot: CPU 9 is now offline
[14230.695249] smpboot: CPU 8 is now offline
[14230.698769] smpboot: CPU 7 is now offline
[14230.704500] smpboot: CPU 6 is now offline
[14230.707715] smpboot: CPU 5 is now offline
[14230.714217] smpboot: CPU 4 is now offline
[14230.717362] smpboot: CPU 3 is now offline
[14230.723696] smpboot: CPU 2 is now offline
[14230.730325] smpboot: CPU 1 is now offline
[14230.743949] ACPI: PM: Low-level resume complete
[14230.744013] ACPI: PM: Restoring platform NVS memory
[14230.745033] Enabling non-boot CPUs ...
[14230.745051] smpboot: Booting Node 0 Processor 1 APIC 0x1
[14230.747051] CPU1 is up
[14230.747063] smpboot: Booting Node 0 Processor 2 APIC 0x8
[14230.754733] CPU2 is up
[14230.754744] smpboot: Booting Node 0 Processor 3 APIC 0x9
[14230.758406] CPU3 is up
[14230.758417] smpboot: Booting Node 0 Processor 4 APIC 0x10
[14230.765655] CPU4 is up
[14230.765665] smpboot: Booting Node 0 Processor 5 APIC 0x11
[14230.768770] CPU5 is up
[14230.768811] smpboot: Booting Node 0 Processor 6 APIC 0x18
[14230.776704] CPU6 is up
[14230.776715] smpboot: Booting Node 0 Processor 7 APIC 0x19
[14230.780617] CPU7 is up
[14230.780630] smpboot: Booting Node 0 Processor 8 APIC 0x20
[14230.795282] CPU8 is up
[14230.795321] smpboot: Booting Node 0 Processor 9 APIC 0x21
[14230.801205] CPU9 is up
[14230.801222] smpboot: Booting Node 0 Processor 10 APIC 0x28
[14230.823488] CPU10 is up
[14230.823518] smpboot: Booting Node 0 Processor 11 APIC 0x29
[14230.829138] CPU11 is up
[14230.829151] smpboot: Booting Node 0 Processor 12 APIC 0x30
[14230.838271] core: cpu_atom PMU driver: PEBS-via-PT
[14230.838276] ... version: 5
[14230.838278] ... bit width: 48
[14230.838279] ... generic registers: 6
[14230.838279] ... value mask: 0000ffffffffffff
[14230.838280] ... max period: 00007fffffffffff
[14230.838281] ... fixed-purpose events: 3
[14230.838281] ... event mask: 000000070000003f
[14230.839284] CPU12 is up
[14230.839327] smpboot: Booting Node 0 Processor 13 APIC 0x32
[14230.849421] CPU13 is up
[14230.849433] smpboot: Booting Node 0 Processor 14 APIC 0x34
[14230.859509] CPU14 is up
[14230.859526] smpboot: Booting Node 0 Processor 15 APIC 0x36
[14230.867307] CPU15 is up
[14230.867320] smpboot: Booting Node 0 Processor 16 APIC 0x38
[14230.879578] CPU16 is up
[14230.879604] smpboot: Booting Node 0 Processor 17 APIC 0x3a
[14230.888018] CPU17 is up
[14230.888068] smpboot: Booting Node 0 Processor 18 APIC 0x3c
[14230.898765] CPU18 is up
[14230.898778] smpboot: Booting Node 0 Processor 19 APIC 0x3e
[14230.907338] CPU19 is up
[14230.915217] ACPI: PM: Waking up from system sleep state S3
[14231.077999] spd5118 0-0050: Failed to write b = 0: -6
[14231.078021] spd5118 0-0050: PM: dpm_run_callback(): spd5118_resume [spd5118] returns -6
[14231.078162] spd5118 0-0050: PM: failed to resume async: error -6
[14231.096445] nvme nvme0: D3 entry latency set to 10 seconds
[14231.100118] nvme nvme0: 20/0/0 default/read/poll queues
[14231.107107] i40e 0000:02:00.0: FW LLDP is disabled, attempting SW DCB
[14231.109039] serial 00:01: activated
[14231.109521] nvme nvme1: 8/0/0 default/read/poll queues
[14231.114757] i40e 0000:02:00.0: SW DCB initialization succeeded.
[14231.182024] i40e 0000:02:00.1: FW LLDP is disabled, attempting SW DCB
[14231.189703] i40e 0000:02:00.1: SW DCB initialization succeeded.
[14231.260752] usb 3-2.2: reset high-speed USB device number 6 using xhci_hcd
[14231.596571] OOM killer enabled.
[14231.596573] Restarting tasks ...
[14231.597134] mei_hdcp 0000:00:16.0-b638ab7e-94e2-4ea2-a552-d1c54b627f04: bound 0000:00:02.0 (ops i915_hdcp_ops [i915])
[14231.597539] done.
[14231.597547] random: crng reseeded on system resumption
[14231.599560] PM: suspend exit
[14234.740539] usb 3-2.2: reset high-speed USB device number 6 using xhci_hcd
[14238.192310] usb 3-2.2: reset high-speed USB device number 6 using xhci_hcd

Note: I blacklisted i40e but that seems to act only at boot time, not on resume...
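
For anyone scanning a similar log, here is a minimal sketch in Python (not part of the original report) that pulls out the devices whose resume callback failed. It assumes the "PM: failed to resume ... error N" message format seen in the dmesg above; the names PM_FAILURE and resume_failures are made up for this illustration. On the excerpt above, the only match is the spd5118 device, and no igc line appears at all.

```python
import re

# Assumed format, taken from the log above, e.g.:
# [14231.078162] spd5118 0-0050: PM: failed to resume async: error -6
PM_FAILURE = re.compile(
    r"^\[\s*(?P<ts>\d+\.\d+)\]\s+(?P<dev>.+?): "
    r"PM: failed to resume(?: async)?: error (?P<err>-?\d+)$"
)


def resume_failures(dmesg_text):
    """Return (timestamp, device, error code) for each failed resume callback."""
    failures = []
    for line in dmesg_text.splitlines():
        match = PM_FAILURE.match(line.strip())
        if match:
            failures.append(
                (float(match.group("ts")), match.group("dev"), int(match.group("err")))
            )
    return failures


# Four lines copied from the dmesg in this message.
SAMPLE = """\
[14231.077999] spd5118 0-0050: Failed to write b = 0: -6
[14231.078021] spd5118 0-0050: PM: dpm_run_callback(): spd5118_resume [spd5118] returns -6
[14231.078162] spd5118 0-0050: PM: failed to resume async: error -6
[14231.096445] nvme nvme0: D3 entry latency set to 10 seconds
"""

for ts, dev, err in resume_failures(SAMPLE):
    print(f"{ts:.6f} {dev}: error {err}")
```

To use it on a live system, the text of `dmesg` (or a saved copy) can be passed to resume_failures() instead of SAMPLE.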
* Re: [Intel-wired-lan] suspend/resume broken of igc driver broken on 6.12 2025-01-30 19:17 ` Lifshits, Vitaly 2025-01-30 21:08 ` Stephen Hemminger @ 2025-01-31 1:21 ` Stephen Hemminger 2025-02-05 10:36 ` Lifshits, Vitaly 1 sibling, 1 reply; 11+ messages in thread From: Stephen Hemminger @ 2025-01-31 1:21 UTC (permalink / raw) To: Lifshits, Vitaly; +Cc: anthony.l.nguyen, jesse.brandeburg, intel-wired-lan On Thu, 30 Jan 2025 21:17:30 +0200 "Lifshits, Vitaly" <vitaly.lifshits@intel.com> wrote: > On 1/30/2025 7:11 PM, Stephen Hemminger wrote: > > I am using: > > > > 5a:00.0 Ethernet controller: Intel Corporation Ethernet Controller I226-LM (rev 04) > > Subsystem: Intel Corporation Device 0000 > > Flags: bus master, fast devsel, latency 0, IRQ 19, IOMMU group 20 > > Memory at 6c500000 (32-bit, non-prefetchable) [size=1M] > > Memory at 6c600000 (32-bit, non-prefetchable) [size=16K] > > Capabilities: [40] Power Management version 3 > > Capabilities: [50] MSI: Enable- Count=1/1 Maskable+ 64bit+ > > Capabilities: [70] MSI-X: Enable+ Count=5 Masked- > > Capabilities: [a0] Express Endpoint, IntMsgNum 0 > > Capabilities: [100] Advanced Error Reporting > > Capabilities: [140] Device Serial Number 58-47-ca-ff-ff-7a-98-3d > > Capabilities: [1c0] Latency Tolerance Reporting > > Capabilities: [1f0] Precision Time Measurement > > Capabilities: [1e0] L1 PM Substates > > Kernel driver in use: igc > > Kernel modules: igc > > > > > > Using both Debian testing and my own kernel built from 6.12, the igc > > driver appears broken after resume. > > From which system state are you resuming? > > > > > After resuming the device is down and no address present. > > Attempts to set link up manually fail. > > Did you get any errors in the dmesg log? > What is the firmware version on your device (you can get it by running > ethtool -i)? > > > If I do rmmod/modprobe of igc it comes back. > > > > Doing a bit of bisectting but it is slow going. 
>
> Meanwhile, we'll also try to reproduce this issue in our lab. Could you
> share more details about your system so we can create a similar setup?

Given that the error reported is -ENODEV, this might be a generic netdev
problem, not just an igc one.
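
As a side note for readers, a small Python sketch (an illustration, not something posted in the thread) shows how the two error codes seen so far decode. It assumes Linux errno numbering; the helper name describe is hypothetical. "RTNETLINK answers: No such device" is the message text for ENODEV, while the -6 returned by spd5118_resume in the dmesg is ENXIO, a different code.

```python
import errno
import os


def describe(code):
    """Map a kernel-style negative error code to (symbolic name, message)."""
    number = abs(code)
    return errno.errorcode.get(number, "unknown"), os.strerror(number)


# Error implied by "RTNETLINK answers: No such device" on `ip li set enp90s0 up`.
print(describe(-errno.ENODEV))

# Error returned by spd5118_resume in the dmesg log.
print(describe(-6))
```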
* Re: [Intel-wired-lan] suspend/resume broken of igc driver broken on 6.12 2025-01-31 1:21 ` Stephen Hemminger @ 2025-02-05 10:36 ` Lifshits, Vitaly 2025-02-06 4:13 ` Stephen Hemminger 0 siblings, 1 reply; 11+ messages in thread From: Lifshits, Vitaly @ 2025-02-05 10:36 UTC (permalink / raw) To: Stephen Hemminger; +Cc: anthony.l.nguyen, jesse.brandeburg, intel-wired-lan On 1/31/2025 3:21 AM, Stephen Hemminger wrote: > On Thu, 30 Jan 2025 21:17:30 +0200 > "Lifshits, Vitaly" <vitaly.lifshits@intel.com> wrote: > >> On 1/30/2025 7:11 PM, Stephen Hemminger wrote: >>> I am using: >>> >>> 5a:00.0 Ethernet controller: Intel Corporation Ethernet Controller I226-LM (rev 04) >>> Subsystem: Intel Corporation Device 0000 >>> Flags: bus master, fast devsel, latency 0, IRQ 19, IOMMU group 20 >>> Memory at 6c500000 (32-bit, non-prefetchable) [size=1M] >>> Memory at 6c600000 (32-bit, non-prefetchable) [size=16K] >>> Capabilities: [40] Power Management version 3 >>> Capabilities: [50] MSI: Enable- Count=1/1 Maskable+ 64bit+ >>> Capabilities: [70] MSI-X: Enable+ Count=5 Masked- >>> Capabilities: [a0] Express Endpoint, IntMsgNum 0 >>> Capabilities: [100] Advanced Error Reporting >>> Capabilities: [140] Device Serial Number 58-47-ca-ff-ff-7a-98-3d >>> Capabilities: [1c0] Latency Tolerance Reporting >>> Capabilities: [1f0] Precision Time Measurement >>> Capabilities: [1e0] L1 PM Substates >>> Kernel driver in use: igc >>> Kernel modules: igc >>> >>> >>> Using both Debian testing and my own kernel built from 6.12, the igc >>> driver appears broken after resume. >> >> From which system state are you resuming? >> >>> >>> After resuming the device is down and no address present. >>> Attempts to set link up manually fail. >> >> Did you get any errors in the dmesg log? >> What is the firmware version on your device (you can get it by running >> ethtool -i)? >> >>> If I do rmmod/modprobe of igc it comes back. >>> >>> Doing a bit of bisectting but it is slow going. 
>>
>> Meanwhile, we'll also try to reproduce this issue in our lab. Could you
>> share more details about your system so we can create a similar setup?
>
> Given that error reported is -ENODEV, might be a generic netdev problem not
> just for igc device.
>

We weren't able to reproduce this issue on our systems, even though we
tried several suspend-resume cycles on different kernels and different
systems.

However, a few days ago we received a comment in a BZ about an issue
similar to yours. In that report, adding a short delay in the igc_resume
function was proposed:
https://bugzilla.kernel.org/show_bug.cgi?id=219143
https://bugzilla.kernel.org/show_bug.cgi?id=219143#c123

Can you try it and see if it fixes your issue as well?
* Re: [Intel-wired-lan] suspend/resume broken of igc driver broken on 6.12 2025-02-05 10:36 ` Lifshits, Vitaly @ 2025-02-06 4:13 ` Stephen Hemminger 2025-02-06 13:17 ` Lifshits, Vitaly 0 siblings, 1 reply; 11+ messages in thread From: Stephen Hemminger @ 2025-02-06 4:13 UTC (permalink / raw) To: Lifshits, Vitaly; +Cc: anthony.l.nguyen, jesse.brandeburg, intel-wired-lan On Wed, 5 Feb 2025 12:36:31 +0200 "Lifshits, Vitaly" <vitaly.lifshits@intel.com> wrote: > On 1/31/2025 3:21 AM, Stephen Hemminger wrote: > > On Thu, 30 Jan 2025 21:17:30 +0200 > > "Lifshits, Vitaly" <vitaly.lifshits@intel.com> wrote: > > > >> On 1/30/2025 7:11 PM, Stephen Hemminger wrote: > >>> I am using: > >>> > >>> 5a:00.0 Ethernet controller: Intel Corporation Ethernet Controller I226-LM (rev 04) > >>> Subsystem: Intel Corporation Device 0000 > >>> Flags: bus master, fast devsel, latency 0, IRQ 19, IOMMU group 20 > >>> Memory at 6c500000 (32-bit, non-prefetchable) [size=1M] > >>> Memory at 6c600000 (32-bit, non-prefetchable) [size=16K] > >>> Capabilities: [40] Power Management version 3 > >>> Capabilities: [50] MSI: Enable- Count=1/1 Maskable+ 64bit+ > >>> Capabilities: [70] MSI-X: Enable+ Count=5 Masked- > >>> Capabilities: [a0] Express Endpoint, IntMsgNum 0 > >>> Capabilities: [100] Advanced Error Reporting > >>> Capabilities: [140] Device Serial Number 58-47-ca-ff-ff-7a-98-3d > >>> Capabilities: [1c0] Latency Tolerance Reporting > >>> Capabilities: [1f0] Precision Time Measurement > >>> Capabilities: [1e0] L1 PM Substates > >>> Kernel driver in use: igc > >>> Kernel modules: igc > >>> > >>> > >>> Using both Debian testing and my own kernel built from 6.12, the igc > >>> driver appears broken after resume. > >> > >> From which system state are you resuming? > >> > >>> > >>> After resuming the device is down and no address present. > >>> Attempts to set link up manually fail. > >> > >> Did you get any errors in the dmesg log? 
> >> What is the firmware version on your device (you can get it by running > >> ethtool -i)? > >> > >>> If I do rmmod/modprobe of igc it comes back. > >>> > >>> Doing a bit of bisectting but it is slow going. > >> > >> Meanwhile, we'll also try to reproduce this issue in our lab. Could you > >> share more details about your system so we can create a similar setup? > > > > Given that error reported is -ENODEV, might be a generic netdev problem not > > just for igc device. > > > > We weren't able to reproduce this issue on our systems, even though we > tried several suspend-resume cycles on different kernels and different > systems. > > However, a few days ago we received a comment in a BZ about an issue > similar to yours. In there adding a short delay in igc_resume function > https://bugzilla.kernel.org/show_bug.cgi?id=219143 > https://bugzilla.kernel.org/show_bug.cgi?id=219143#c123 > > > > Can you try to see if it fixes your issue as well? I tried the proposed delay and it had no impact. Any idea of other things to instrument? ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [Intel-wired-lan] suspend/resume broken of igc driver broken on 6.12 2025-02-06 4:13 ` Stephen Hemminger @ 2025-02-06 13:17 ` Lifshits, Vitaly 2025-02-06 20:09 ` Stephen Hemminger 0 siblings, 1 reply; 11+ messages in thread From: Lifshits, Vitaly @ 2025-02-06 13:17 UTC (permalink / raw) To: Stephen Hemminger; +Cc: anthony.l.nguyen, jesse.brandeburg, intel-wired-lan On 2/6/2025 6:13 AM, Stephen Hemminger wrote: > On Wed, 5 Feb 2025 12:36:31 +0200 > "Lifshits, Vitaly" <vitaly.lifshits@intel.com> wrote: > >> On 1/31/2025 3:21 AM, Stephen Hemminger wrote: >>> On Thu, 30 Jan 2025 21:17:30 +0200 >>> "Lifshits, Vitaly" <vitaly.lifshits@intel.com> wrote: >>> >>>> On 1/30/2025 7:11 PM, Stephen Hemminger wrote: >>>>> I am using: >>>>> >>>>> 5a:00.0 Ethernet controller: Intel Corporation Ethernet Controller I226-LM (rev 04) >>>>> Subsystem: Intel Corporation Device 0000 >>>>> Flags: bus master, fast devsel, latency 0, IRQ 19, IOMMU group 20 >>>>> Memory at 6c500000 (32-bit, non-prefetchable) [size=1M] >>>>> Memory at 6c600000 (32-bit, non-prefetchable) [size=16K] >>>>> Capabilities: [40] Power Management version 3 >>>>> Capabilities: [50] MSI: Enable- Count=1/1 Maskable+ 64bit+ >>>>> Capabilities: [70] MSI-X: Enable+ Count=5 Masked- >>>>> Capabilities: [a0] Express Endpoint, IntMsgNum 0 >>>>> Capabilities: [100] Advanced Error Reporting >>>>> Capabilities: [140] Device Serial Number 58-47-ca-ff-ff-7a-98-3d >>>>> Capabilities: [1c0] Latency Tolerance Reporting >>>>> Capabilities: [1f0] Precision Time Measurement >>>>> Capabilities: [1e0] L1 PM Substates >>>>> Kernel driver in use: igc >>>>> Kernel modules: igc >>>>> >>>>> >>>>> Using both Debian testing and my own kernel built from 6.12, the igc >>>>> driver appears broken after resume. >>>> >>>> From which system state are you resuming? >>>> >>>>> >>>>> After resuming the device is down and no address present. >>>>> Attempts to set link up manually fail. >>>> >>>> Did you get any errors in the dmesg log? 
>>>> What is the firmware version on your device (you can get it by running
>>>> ethtool -i)?
>>>>
>>>>> If I do rmmod/modprobe of igc it comes back.
>>>>>
>>>>> Doing a bit of bisecting but it is slow going.
>>>>
>>>> Meanwhile, we'll also try to reproduce this issue in our lab. Could you
>>>> share more details about your system so we can create a similar setup?
>>>
>>> Given that error reported is -ENODEV, might be a generic netdev problem not
>>> just for igc device.
>>>
>>
>> We weren't able to reproduce this issue on our systems, even though we
>> tried several suspend-resume cycles on different kernels and different
>> systems.
>>
>> However, a few days ago we received a comment in a BZ about an issue
>> similar to yours. In there adding a short delay in igc_resume function
>> https://bugzilla.kernel.org/show_bug.cgi?id=219143
>> https://bugzilla.kernel.org/show_bug.cgi?id=219143#c123
>>
>>
>>
>> Can you try to see if it fixes your issue as well?
>
> I tried the proposed delay and it had no impact.
> Any idea of other things to instrument?
>

Has the adapter worked with a different kernel? Can you try to reproduce
the issue on kernel 6.9?

Is the LAN cable connected to the igc adapter? Does it maintain link
during suspend?

Also, I saw that your board has three more adapters. I assume that
enp2s0f0np0 and enp2s0f1np1 are i40e adapters. Does this issue also
happen on enp87s0?
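
To compare the other adapters before and after a suspend cycle, a short Python sketch along these lines could help. It is only an illustration: it assumes the whitespace-separated `ip -br a` layout shown earlier in the thread, and parse_ip_brief is a made-up helper name. Note that DOWN in this listing does not by itself say whether an interface is affected; the reported symptom on enp90s0 is the "No such device" answer when bringing it up.

```python
def parse_ip_brief(text):
    """Parse `ip -br a` output into {name: (state, [addresses])}."""
    links = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        links[fields[0]] = (fields[1], fields[2:])
    return links


# Listing reported earlier in the thread, taken after resume.
AFTER_RESUME = """\
lo UNKNOWN 127.0.0.1/8 ::1/128
enp87s0 DOWN
enp90s0 DOWN
enp2s0f0np0 UP
enp2s0f1np1 UP
wlp91s0 DOWN
"""

links = parse_ip_brief(AFTER_RESUME)
down = sorted(name for name, (state, _) in links.items() if state == "DOWN")
print(down)
```

Running the same parser on a listing captured before suspend and diffing the two dictionaries would show which interfaces changed state or lost addresses.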
* Re: [Intel-wired-lan] suspend/resume broken of igc driver broken on 6.12 2025-02-06 13:17 ` Lifshits, Vitaly @ 2025-02-06 20:09 ` Stephen Hemminger 2025-02-11 18:20 ` Lifshits, Vitaly 0 siblings, 1 reply; 11+ messages in thread From: Stephen Hemminger @ 2025-02-06 20:09 UTC (permalink / raw) To: Lifshits, Vitaly; +Cc: anthony.l.nguyen, jesse.brandeburg, intel-wired-lan On Thu, 6 Feb 2025 15:17:00 +0200 "Lifshits, Vitaly" <vitaly.lifshits@intel.com> wrote: > On 2/6/2025 6:13 AM, Stephen Hemminger wrote: > > On Wed, 5 Feb 2025 12:36:31 +0200 > > "Lifshits, Vitaly" <vitaly.lifshits@intel.com> wrote: > > > >> On 1/31/2025 3:21 AM, Stephen Hemminger wrote: > >>> On Thu, 30 Jan 2025 21:17:30 +0200 > >>> "Lifshits, Vitaly" <vitaly.lifshits@intel.com> wrote: > >>> > >>>> On 1/30/2025 7:11 PM, Stephen Hemminger wrote: > >>>>> I am using: > >>>>> > >>>>> 5a:00.0 Ethernet controller: Intel Corporation Ethernet Controller I226-LM (rev 04) > >>>>> Subsystem: Intel Corporation Device 0000 > >>>>> Flags: bus master, fast devsel, latency 0, IRQ 19, IOMMU group 20 > >>>>> Memory at 6c500000 (32-bit, non-prefetchable) [size=1M] > >>>>> Memory at 6c600000 (32-bit, non-prefetchable) [size=16K] > >>>>> Capabilities: [40] Power Management version 3 > >>>>> Capabilities: [50] MSI: Enable- Count=1/1 Maskable+ 64bit+ > >>>>> Capabilities: [70] MSI-X: Enable+ Count=5 Masked- > >>>>> Capabilities: [a0] Express Endpoint, IntMsgNum 0 > >>>>> Capabilities: [100] Advanced Error Reporting > >>>>> Capabilities: [140] Device Serial Number 58-47-ca-ff-ff-7a-98-3d > >>>>> Capabilities: [1c0] Latency Tolerance Reporting > >>>>> Capabilities: [1f0] Precision Time Measurement > >>>>> Capabilities: [1e0] L1 PM Substates > >>>>> Kernel driver in use: igc > >>>>> Kernel modules: igc > >>>>> > >>>>> > >>>>> Using both Debian testing and my own kernel built from 6.12, the igc > >>>>> driver appears broken after resume. > >>>> > >>>> From which system state are you resuming? 
> >>>> > >>>>> > >>>>> After resuming the device is down and no address present. > >>>>> Attempts to set link up manually fail. > >>>> > >>>> Did you get any errors in the dmesg log? > >>>> What is the firmware version on your device (you can get it by running > >>>> ethtool -i)? > >>>> > >>>>> If I do rmmod/modprobe of igc it comes back. > >>>>> > >>>>> Doing a bit of bisectting but it is slow going. > >>>> > >>>> Meanwhile, we'll also try to reproduce this issue in our lab. Could you > >>>> share more details about your system so we can create a similar setup? > >>> > >>> Given that error reported is -ENODEV, might be a generic netdev problem not > >>> just for igc device. > >>> > >> > >> We weren't able to reproduce this issue on our systems, even though we > >> tried several suspend-resume cycles on different kernels and different > >> systems. > >> > >> However, a few days ago we received a comment in a BZ about an issue > >> similar to yours. In there adding a short delay in igc_resume function > >> https://bugzilla.kernel.org/show_bug.cgi?id=219143 > >> https://bugzilla.kernel.org/show_bug.cgi?id=219143#c123 > >> > >> > >> > >> Can you try to see if it fixes your issue as well? > > > > I tried the proposed delay and it had no impact. > > Any idea of other things to instrument? > > > > > Has the adapter worked with a different kernel? Can you try to reproduce > the issue over kernel 6.9? > > Is the LAN cable connected to the igc adapter? Does it maintain link > during suspend? > > Also, I saw that on your board you have three more adapters, I assume > that enp2s0f0np0 and enp2s0f0np1 are i40e adapters. Does this issue also > happen to enp87s0? This is a new machine, and not sure if it ever worked. I can boot some older distro via USB if that helps. The LAN cable is always connected (it is a desktop box), and the 10G NIC's are not used; they are connected by a loopback cable and used for DPDK testing occasionally. It does work in Windows... 
* Re: [Intel-wired-lan] suspend/resume broken of igc driver broken on 6.12 2025-02-06 20:09 ` Stephen Hemminger @ 2025-02-11 18:20 ` Lifshits, Vitaly 2025-02-11 19:05 ` Stephen Hemminger 2025-02-11 19:09 ` Stephen Hemminger 0 siblings, 2 replies; 11+ messages in thread From: Lifshits, Vitaly @ 2025-02-11 18:20 UTC (permalink / raw) To: Stephen Hemminger; +Cc: anthony.l.nguyen, jesse.brandeburg, intel-wired-lan On 2/6/2025 10:09 PM, Stephen Hemminger wrote: > On Thu, 6 Feb 2025 15:17:00 +0200 > "Lifshits, Vitaly" <vitaly.lifshits@intel.com> wrote: > >> On 2/6/2025 6:13 AM, Stephen Hemminger wrote: >>> On Wed, 5 Feb 2025 12:36:31 +0200 >>> "Lifshits, Vitaly" <vitaly.lifshits@intel.com> wrote: >>> >>>> On 1/31/2025 3:21 AM, Stephen Hemminger wrote: >>>>> On Thu, 30 Jan 2025 21:17:30 +0200 >>>>> "Lifshits, Vitaly" <vitaly.lifshits@intel.com> wrote: >>>>> >>>>>> On 1/30/2025 7:11 PM, Stephen Hemminger wrote: >>>>>>> I am using: >>>>>>> >>>>>>> 5a:00.0 Ethernet controller: Intel Corporation Ethernet Controller I226-LM (rev 04) >>>>>>> Subsystem: Intel Corporation Device 0000 >>>>>>> Flags: bus master, fast devsel, latency 0, IRQ 19, IOMMU group 20 >>>>>>> Memory at 6c500000 (32-bit, non-prefetchable) [size=1M] >>>>>>> Memory at 6c600000 (32-bit, non-prefetchable) [size=16K] >>>>>>> Capabilities: [40] Power Management version 3 >>>>>>> Capabilities: [50] MSI: Enable- Count=1/1 Maskable+ 64bit+ >>>>>>> Capabilities: [70] MSI-X: Enable+ Count=5 Masked- >>>>>>> Capabilities: [a0] Express Endpoint, IntMsgNum 0 >>>>>>> Capabilities: [100] Advanced Error Reporting >>>>>>> Capabilities: [140] Device Serial Number 58-47-ca-ff-ff-7a-98-3d >>>>>>> Capabilities: [1c0] Latency Tolerance Reporting >>>>>>> Capabilities: [1f0] Precision Time Measurement >>>>>>> Capabilities: [1e0] L1 PM Substates >>>>>>> Kernel driver in use: igc >>>>>>> Kernel modules: igc >>>>>>> >>>>>>> >>>>>>> Using both Debian testing and my own kernel built from 6.12, the igc >>>>>>> driver appears broken after 
resume. >>>>>> >>>>>> From which system state are you resuming? >>>>>> >>>>>>> >>>>>>> After resuming the device is down and no address present. >>>>>>> Attempts to set link up manually fail. >>>>>> >>>>>> Did you get any errors in the dmesg log? >>>>>> What is the firmware version on your device (you can get it by running >>>>>> ethtool -i)? >>>>>> >>>>>>> If I do rmmod/modprobe of igc it comes back. >>>>>>> >>>>>>> Doing a bit of bisectting but it is slow going. >>>>>> >>>>>> Meanwhile, we'll also try to reproduce this issue in our lab. Could you >>>>>> share more details about your system so we can create a similar setup? >>>>> >>>>> Given that error reported is -ENODEV, might be a generic netdev problem not >>>>> just for igc device. >>>>> >>>> >>>> We weren't able to reproduce this issue on our systems, even though we >>>> tried several suspend-resume cycles on different kernels and different >>>> systems. >>>> >>>> However, a few days ago we received a comment in a BZ about an issue >>>> similar to yours. In there adding a short delay in igc_resume function >>>> https://bugzilla.kernel.org/show_bug.cgi?id=219143 >>>> https://bugzilla.kernel.org/show_bug.cgi?id=219143#c123 >>>> >>>> >>>> >>>> Can you try to see if it fixes your issue as well? >>> >>> I tried the proposed delay and it had no impact. >>> Any idea of other things to instrument? >>> >> >> >> Has the adapter worked with a different kernel? Can you try to reproduce >> the issue over kernel 6.9? >> >> Is the LAN cable connected to the igc adapter? Does it maintain link >> during suspend? >> >> Also, I saw that on your board you have three more adapters, I assume >> that enp2s0f0np0 and enp2s0f0np1 are i40e adapters. Does this issue also >> happen to enp87s0? > > This is a new machine, and not sure if it ever worked. > I can boot some older distro via USB if that helps. Yes, please. It might help us in narrowing down the issue. 
> > The LAN cable is always connected (it is a desktop box), and the > 10G NIC's are not used; they are connected by a loopback cable and > used for DPDK testing occasionally. > > It does work in Windows... Do you work with Network Manager? If so, is it possible to see if the issue can be reproduced with it disabled? ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [Intel-wired-lan] suspend/resume broken of igc driver broken on 6.12
  2025-02-11 18:20                 ` Lifshits, Vitaly
@ 2025-02-11 19:05                   ` Stephen Hemminger
  2025-02-11 19:09                   ` Stephen Hemminger
  1 sibling, 0 replies; 11+ messages in thread
From: Stephen Hemminger @ 2025-02-11 19:05 UTC
  To: Lifshits, Vitaly; +Cc: anthony.l.nguyen, jesse.brandeburg, intel-wired-lan

On Tue, 11 Feb 2025 20:20:03 +0200
"Lifshits, Vitaly" <vitaly.lifshits@intel.com> wrote:

> On 2/6/2025 10:09 PM, Stephen Hemminger wrote:
> > On Thu, 6 Feb 2025 15:17:00 +0200
> > "Lifshits, Vitaly" <vitaly.lifshits@intel.com> wrote:
> >
> >> On 2/6/2025 6:13 AM, Stephen Hemminger wrote:
> >>> On Wed, 5 Feb 2025 12:36:31 +0200
> >>> "Lifshits, Vitaly" <vitaly.lifshits@intel.com> wrote:
> >>>
> >>>> On 1/31/2025 3:21 AM, Stephen Hemminger wrote:
> >>>>> On Thu, 30 Jan 2025 21:17:30 +0200
> >>>>> "Lifshits, Vitaly" <vitaly.lifshits@intel.com> wrote:
> >>>>>
> >>>>>> On 1/30/2025 7:11 PM, Stephen Hemminger wrote:
> >>>>>>> I am using:
> >>>>>>>
> >>>>>>> 5a:00.0 Ethernet controller: Intel Corporation Ethernet Controller I226-LM (rev 04)
> >>>>>>>     Subsystem: Intel Corporation Device 0000
> >>>>>>>     Flags: bus master, fast devsel, latency 0, IRQ 19, IOMMU group 20
> >>>>>>>     Memory at 6c500000 (32-bit, non-prefetchable) [size=1M]
> >>>>>>>     Memory at 6c600000 (32-bit, non-prefetchable) [size=16K]
> >>>>>>>     Capabilities: [40] Power Management version 3
> >>>>>>>     Capabilities: [50] MSI: Enable- Count=1/1 Maskable+ 64bit+
> >>>>>>>     Capabilities: [70] MSI-X: Enable+ Count=5 Masked-
> >>>>>>>     Capabilities: [a0] Express Endpoint, IntMsgNum 0
> >>>>>>>     Capabilities: [100] Advanced Error Reporting
> >>>>>>>     Capabilities: [140] Device Serial Number 58-47-ca-ff-ff-7a-98-3d
> >>>>>>>     Capabilities: [1c0] Latency Tolerance Reporting
> >>>>>>>     Capabilities: [1f0] Precision Time Measurement
> >>>>>>>     Capabilities: [1e0] L1 PM Substates
> >>>>>>>     Kernel driver in use: igc
> >>>>>>>     Kernel modules: igc
> >>>>>>>
> >>>>>>>
> >>>>>>> Using both Debian testing and my own kernel built from 6.12, the igc
> >>>>>>> driver appears broken after resume.
> >>>>>>
> >>>>>> From which system state are you resuming?
> >>>>>>
> >>>>>>>
> >>>>>>> After resuming the device is down and no address present.
> >>>>>>> Attempts to set link up manually fail.
> >>>>>>
> >>>>>> Did you get any errors in the dmesg log?
> >>>>>> What is the firmware version on your device (you can get it by running
> >>>>>> ethtool -i)?
> >>>>>>
> >>>>>>> If I do rmmod/modprobe of igc it comes back.
> >>>>>>>
> >>>>>>> Doing a bit of bisectting but it is slow going.
> >>>>>>
> >>>>>> Meanwhile, we'll also try to reproduce this issue in our lab. Could you
> >>>>>> share more details about your system so we can create a similar setup?
> >>>>>
> >>>>> Given that error reported is -ENODEV, might be a generic netdev problem not
> >>>>> just for igc device.
> >>>>>
> >>>>
> >>>> We weren't able to reproduce this issue on our systems, even though we
> >>>> tried several suspend-resume cycles on different kernels and different
> >>>> systems.
> >>>>
> >>>> However, a few days ago we received a comment in a BZ about an issue
> >>>> similar to yours. In there adding a short delay in igc_resume function
> >>>> https://bugzilla.kernel.org/show_bug.cgi?id=219143
> >>>> https://bugzilla.kernel.org/show_bug.cgi?id=219143#c123
> >>>>
> >>>>
> >>>>
> >>>> Can you try to see if it fixes your issue as well?
> >>>
> >>> I tried the proposed delay and it had no impact.
> >>> Any idea of other things to instrument?
> >>>
> >>
> >>
> >> Has the adapter worked with a different kernel? Can you try to reproduce
> >> the issue over kernel 6.9?
> >>
> >> Is the LAN cable connected to the igc adapter? Does it maintain link
> >> during suspend?
> >>
> >> Also, I saw that on your board you have three more adapters, I assume
> >> that enp2s0f0np0 and enp2s0f0np1 are i40e adapters. Does this issue also
> >> happen to enp87s0?
> >
> > This is a new machine, and not sure if it ever worked.
> > I can boot some older distro via USB if that helps.
>
> Yes, please.
> It might help us in narrowing down the issue.
>
> >
> > The LAN cable is always connected (it is a desktop box), and the
> > 10G NIC's are not used; they are connected by a loopback cable and
> > used for DPDK testing occasionally.
> >
> > It does work in Windows...
>
> Do you work with Network Manager? If so, is it possible to see if the
> issue can be reproduced with it disabled?
>

Yes, Debian uses Network Manager, but disabling it might not be possible.
* Re: [Intel-wired-lan] suspend/resume broken of igc driver broken on 6.12
  2025-02-11 18:20                 ` Lifshits, Vitaly
  2025-02-11 19:05                   ` Stephen Hemminger
@ 2025-02-11 19:09                   ` Stephen Hemminger
  1 sibling, 0 replies; 11+ messages in thread
From: Stephen Hemminger @ 2025-02-11 19:09 UTC
  To: Lifshits, Vitaly; +Cc: anthony.l.nguyen, jesse.brandeburg, intel-wired-lan

On Tue, 11 Feb 2025 20:20:03 +0200
"Lifshits, Vitaly" <vitaly.lifshits@intel.com> wrote:

> Do you work with Network Manager? If so, is it possible to see if the
> issue can be reproduced with it disabled?
>

If network manager is disabled with:
  # systemctl stop NetworkManager.service
  # systemctl disable NetworkManager.service

Then device persists across suspend/resume.
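
The comparison described above (NetworkManager running versus stopped) could be recorded with a small script run once before suspend and once after resume. This is only a sketch under assumptions: the helper names are hypothetical, the interface name has to be supplied by the user because the thread does not say which netdev is the igc port, and it relies only on the standard sysfs attribute /sys/class/net/<iface>/operstate and on `systemctl is-active`.

```python
#!/usr/bin/env python3
"""Print link state and NetworkManager state for one interface."""
import subprocess
import sys
from pathlib import Path


def read_operstate(iface, sysfs_root="/sys/class/net"):
    """Return the kernel's operstate string, or None if the netdev is absent."""
    path = Path(sysfs_root) / iface / "operstate"
    try:
        return path.read_text().strip()
    except OSError:
        # No such netdev (or unreadable attribute): report it as missing.
        return None


def service_is_active(unit="NetworkManager.service"):
    """Return True if `systemctl is-active` reports the unit as active."""
    try:
        result = subprocess.run(
            ["systemctl", "is-active", "--quiet", unit], check=False
        )
    except FileNotFoundError:
        # systemctl is not installed on this system.
        return False
    return result.returncode == 0


def summarize(iface, operstate, nm_active):
    """Format one line suitable for pasting into a bug report."""
    state = operstate if operstate is not None else "missing"
    nm = "active" if nm_active else "inactive"
    return f"{iface}: operstate={state} NetworkManager={nm}"


def main(argv):
    if len(argv) != 2:
        print("usage: linkstate.py IFACE", file=sys.stderr)
        return 2
    iface = argv[1]
    print(summarize(iface, read_operstate(iface), service_is_active()))
    return 0


if __name__ == "__main__":
    main(sys.argv)
```

Running it as `python3 linkstate.py IFACE` before suspend and again after resume, once with NetworkManager active and once with it stopped, gives two pairs of lines that can be compared directly.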
end of thread, other threads:[~2025-02-11 19:09 UTC | newest]

Thread overview: 11+ messages
2025-01-30 17:11 [Intel-wired-lan] suspend/resume broken of igc driver broken on 6.12 Stephen Hemminger
2025-01-30 19:17 ` Lifshits, Vitaly
2025-01-30 21:08   ` Stephen Hemminger
2025-01-31  1:21   ` Stephen Hemminger
2025-02-05 10:36     ` Lifshits, Vitaly
2025-02-06  4:13       ` Stephen Hemminger
2025-02-06 13:17         ` Lifshits, Vitaly
2025-02-06 20:09           ` Stephen Hemminger
2025-02-11 18:20             ` Lifshits, Vitaly
2025-02-11 19:05               ` Stephen Hemminger
2025-02-11 19:09               ` Stephen Hemminger