* [PATCH v1 iwl-net] igc: unregister netdev when igc_led_setup() fails in igc_probe()
@ 2025-09-06 5:51 Kohei Enju
2025-09-08 6:26 ` Kurt Kanzenbach
2025-09-08 6:32 ` Loktionov, Aleksandr
0 siblings, 2 replies; 9+ messages in thread
From: Kohei Enju @ 2025-09-06 5:51 UTC (permalink / raw)
To: intel-wired-lan, netdev
Cc: Tony Nguyen, Przemek Kitszel, Andrew Lunn, David S. Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni, Kurt Kanzenbach,
kohei.enju, Kohei Enju
Currently igc_probe() doesn't unregister netdev when igc_led_setup()
fails, causing BUG_ON() in free_netdev() and then kernel panics. [1]
This behavior can be tested using fault-injection framework. I used the
failslab feature to test the issue. [2]
Call unregister_netdev() when igc_led_setup() fails to avoid the kernel
panic.
[1]
kernel BUG at net/core/dev.c:12047!
Oops: invalid opcode: 0000 [#1] SMP NOPTI
CPU: 0 UID: 0 PID: 937 Comm: repro-igc-led-e Not tainted 6.17.0-rc4-enjuk-tnguy-00865-gc4940196ab02 #64 PREEMPT(voluntary)
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
RIP: 0010:free_netdev+0x278/0x2b0
[...]
Call Trace:
<TASK>
igc_probe+0x370/0x910
local_pci_probe+0x3a/0x80
pci_device_probe+0xd1/0x200
[...]
[2]
#!/bin/bash -ex
FAILSLAB_PATH=/sys/kernel/debug/failslab/
DEVICE=0000:00:05.0
START_ADDR=$(grep " igc_led_setup" /proc/kallsyms \
| awk '{printf("0x%s", $1)}')
END_ADDR=$(printf "0x%x" $((START_ADDR + 0x100)))
echo $START_ADDR > $FAILSLAB_PATH/require-start
echo $END_ADDR > $FAILSLAB_PATH/require-end
echo 1 > $FAILSLAB_PATH/times
echo 100 > $FAILSLAB_PATH/probability
echo N > $FAILSLAB_PATH/ignore-gfp-wait
echo $DEVICE > /sys/bus/pci/drivers/igc/bind
Fixes: ea578703b03d ("igc: Add support for LEDs on i225/i226")
Signed-off-by: Kohei Enju <enjuk@amazon.com>
---
drivers/net/ethernet/intel/igc/igc_main.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/intel/igc/igc_main.c b/drivers/net/ethernet/intel/igc/igc_main.c
index e79b14d50b24..95c415d0917d 100644
--- a/drivers/net/ethernet/intel/igc/igc_main.c
+++ b/drivers/net/ethernet/intel/igc/igc_main.c
@@ -7336,11 +7336,13 @@ static int igc_probe(struct pci_dev *pdev,
if (IS_ENABLED(CONFIG_IGC_LEDS)) {
err = igc_led_setup(adapter);
if (err)
- goto err_register;
+ goto err_led_setup;
}
return 0;
+err_led_setup:
+ unregister_netdev(netdev);
err_register:
igc_release_hw_control(adapter);
igc_ptp_stop(adapter);
--
2.48.1
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH v1 iwl-net] igc: unregister netdev when igc_led_setup() fails in igc_probe()
2025-09-06 5:51 [PATCH v1 iwl-net] igc: unregister netdev when igc_led_setup() fails in igc_probe() Kohei Enju
@ 2025-09-08 6:26 ` Kurt Kanzenbach
2025-09-10 7:28 ` [Intel-wired-lan] " Lifshits, Vitaly
2025-09-08 6:32 ` Loktionov, Aleksandr
1 sibling, 1 reply; 9+ messages in thread
From: Kurt Kanzenbach @ 2025-09-08 6:26 UTC (permalink / raw)
To: Kohei Enju, intel-wired-lan, netdev
Cc: Tony Nguyen, Przemek Kitszel, Andrew Lunn, David S. Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni, kohei.enju, Kohei Enju
[-- Attachment #1: Type: text/plain, Size: 1542 bytes --]
On Sat Sep 06 2025, Kohei Enju wrote:
> Currently igc_probe() doesn't unregister netdev when igc_led_setup()
> fails, causing BUG_ON() in free_netdev() and then kernel panics. [1]
>
> This behavior can be tested using fault-injection framework. I used the
> failslab feature to test the issue. [2]
>
> Call unregister_netdev() when igc_led_setup() fails to avoid the kernel
> panic.
>
> [1]
> kernel BUG at net/core/dev.c:12047!
> Oops: invalid opcode: 0000 [#1] SMP NOPTI
> CPU: 0 UID: 0 PID: 937 Comm: repro-igc-led-e Not tainted 6.17.0-rc4-enjuk-tnguy-00865-gc4940196ab02 #64 PREEMPT(voluntary)
> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
> RIP: 0010:free_netdev+0x278/0x2b0
> [...]
> Call Trace:
> <TASK>
> igc_probe+0x370/0x910
> local_pci_probe+0x3a/0x80
> pci_device_probe+0xd1/0x200
> [...]
>
> [2]
> #!/bin/bash -ex
>
> FAILSLAB_PATH=/sys/kernel/debug/failslab/
> DEVICE=0000:00:05.0
> START_ADDR=$(grep " igc_led_setup" /proc/kallsyms \
> | awk '{printf("0x%s", $1)}')
> END_ADDR=$(printf "0x%x" $((START_ADDR + 0x100)))
>
> echo $START_ADDR > $FAILSLAB_PATH/require-start
> echo $END_ADDR > $FAILSLAB_PATH/require-end
> echo 1 > $FAILSLAB_PATH/times
> echo 100 > $FAILSLAB_PATH/probability
> echo N > $FAILSLAB_PATH/ignore-gfp-wait
>
> echo $DEVICE > /sys/bus/pci/drivers/igc/bind
>
> Fixes: ea578703b03d ("igc: Add support for LEDs on i225/i226")
> Signed-off-by: Kohei Enju <enjuk@amazon.com>
Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de>
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 861 bytes --]
^ permalink raw reply [flat|nested] 9+ messages in thread
* RE: [Intel-wired-lan] [PATCH v1 iwl-net] igc: unregister netdev when igc_led_setup() fails in igc_probe()
2025-09-06 5:51 [PATCH v1 iwl-net] igc: unregister netdev when igc_led_setup() fails in igc_probe() Kohei Enju
2025-09-08 6:26 ` Kurt Kanzenbach
@ 2025-09-08 6:32 ` Loktionov, Aleksandr
1 sibling, 0 replies; 9+ messages in thread
From: Loktionov, Aleksandr @ 2025-09-08 6:32 UTC (permalink / raw)
To: Kohei Enju, intel-wired-lan@lists.osuosl.org,
netdev@vger.kernel.org
Cc: Nguyen, Anthony L, Kitszel, Przemyslaw, Andrew Lunn,
David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
Kurt Kanzenbach, kohei.enju@gmail.com
> -----Original Message-----
> From: Intel-wired-lan <intel-wired-lan-bounces@osuosl.org> On Behalf
> Of Kohei Enju
> Sent: Saturday, September 6, 2025 7:52 AM
> To: intel-wired-lan@lists.osuosl.org; netdev@vger.kernel.org
> Cc: Nguyen, Anthony L <anthony.l.nguyen@intel.com>; Kitszel,
> Przemyslaw <przemyslaw.kitszel@intel.com>; Andrew Lunn
> <andrew+netdev@lunn.ch>; David S. Miller <davem@davemloft.net>; Eric
> Dumazet <edumazet@google.com>; Jakub Kicinski <kuba@kernel.org>; Paolo
> Abeni <pabeni@redhat.com>; Kurt Kanzenbach <kurt@linutronix.de>;
> kohei.enju@gmail.com; Kohei Enju <enjuk@amazon.com>
> Subject: [Intel-wired-lan] [PATCH v1 iwl-net] igc: unregister netdev
> when igc_led_setup() fails in igc_probe()
>
> Currently igc_probe() doesn't unregister netdev when igc_led_setup()
> fails, causing BUG_ON() in free_netdev() and then kernel panics. [1]
>
> This behavior can be tested using fault-injection framework. I used
> the failslab feature to test the issue. [2]
>
> Call unregister_netdev() when igc_led_setup() fails to avoid the
> kernel panic.
>
> [1]
> kernel BUG at net/core/dev.c:12047!
> Oops: invalid opcode: 0000 [#1] SMP NOPTI
> CPU: 0 UID: 0 PID: 937 Comm: repro-igc-led-e Not tainted 6.17.0-rc4-
> enjuk-tnguy-00865-gc4940196ab02 #64 PREEMPT(voluntary) Hardware name:
> QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2
> 04/01/2014
> RIP: 0010:free_netdev+0x278/0x2b0
> [...]
> Call Trace:
> <TASK>
> igc_probe+0x370/0x910
> local_pci_probe+0x3a/0x80
> pci_device_probe+0xd1/0x200
> [...]
>
> [2]
> #!/bin/bash -ex
>
> FAILSLAB_PATH=/sys/kernel/debug/failslab/
> DEVICE=0000:00:05.0
> START_ADDR=$(grep " igc_led_setup" /proc/kallsyms \
> | awk '{printf("0x%s", $1)}')
> END_ADDR=$(printf "0x%x" $((START_ADDR + 0x100)))
>
> echo $START_ADDR > $FAILSLAB_PATH/require-start echo $END_ADDR >
> $FAILSLAB_PATH/require-end echo 1 > $FAILSLAB_PATH/times echo 100 >
> $FAILSLAB_PATH/probability echo N > $FAILSLAB_PATH/ignore-gfp-wait
>
> echo $DEVICE > /sys/bus/pci/drivers/igc/bind
>
> Fixes: ea578703b03d ("igc: Add support for LEDs on i225/i226")
> Signed-off-by: Kohei Enju <enjuk@amazon.com>
> ---
> drivers/net/ethernet/intel/igc/igc_main.c | 4 +++-
> 1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/net/ethernet/intel/igc/igc_main.c
> b/drivers/net/ethernet/intel/igc/igc_main.c
> index e79b14d50b24..95c415d0917d 100644
> --- a/drivers/net/ethernet/intel/igc/igc_main.c
> +++ b/drivers/net/ethernet/intel/igc/igc_main.c
> @@ -7336,11 +7336,13 @@ static int igc_probe(struct pci_dev *pdev,
> if (IS_ENABLED(CONFIG_IGC_LEDS)) {
> err = igc_led_setup(adapter);
> if (err)
> - goto err_register;
> + goto err_led_setup;
> }
>
> return 0;
>
> +err_led_setup:
> + unregister_netdev(netdev);
> err_register:
> igc_release_hw_control(adapter);
> igc_ptp_stop(adapter);
> --
> 2.48.1
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [Intel-wired-lan] [PATCH v1 iwl-net] igc: unregister netdev when igc_led_setup() fails in igc_probe()
2025-09-08 6:26 ` Kurt Kanzenbach
@ 2025-09-10 7:28 ` Lifshits, Vitaly
2025-09-10 7:52 ` Kohei Enju
0 siblings, 1 reply; 9+ messages in thread
From: Lifshits, Vitaly @ 2025-09-10 7:28 UTC (permalink / raw)
To: Kurt Kanzenbach, Kohei Enju, intel-wired-lan, netdev
Cc: Tony Nguyen, Przemek Kitszel, Andrew Lunn, David S. Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni, kohei.enju
On 9/8/2025 9:26 AM, Kurt Kanzenbach wrote:
> On Sat Sep 06 2025, Kohei Enju wrote:
>> Currently igc_probe() doesn't unregister netdev when igc_led_setup()
>> fails, causing BUG_ON() in free_netdev() and then kernel panics. [1]
>>
>> This behavior can be tested using fault-injection framework. I used the
>> failslab feature to test the issue. [2]
>>
>> Call unregister_netdev() when igc_led_setup() fails to avoid the kernel
>> panic.
>>
>> [1]
>> kernel BUG at net/core/dev.c:12047!
>> Oops: invalid opcode: 0000 [#1] SMP NOPTI
>> CPU: 0 UID: 0 PID: 937 Comm: repro-igc-led-e Not tainted 6.17.0-rc4-enjuk-tnguy-00865-gc4940196ab02 #64 PREEMPT(voluntary)
>> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
>> RIP: 0010:free_netdev+0x278/0x2b0
>> [...]
>> Call Trace:
>> <TASK>
>> igc_probe+0x370/0x910
>> local_pci_probe+0x3a/0x80
>> pci_device_probe+0xd1/0x200
>> [...]
>>
>> [2]
>> #!/bin/bash -ex
>>
>> FAILSLAB_PATH=/sys/kernel/debug/failslab/
>> DEVICE=0000:00:05.0
>> START_ADDR=$(grep " igc_led_setup" /proc/kallsyms \
>> | awk '{printf("0x%s", $1)}')
>> END_ADDR=$(printf "0x%x" $((START_ADDR + 0x100)))
>>
>> echo $START_ADDR > $FAILSLAB_PATH/require-start
>> echo $END_ADDR > $FAILSLAB_PATH/require-end
>> echo 1 > $FAILSLAB_PATH/times
>> echo 100 > $FAILSLAB_PATH/probability
>> echo N > $FAILSLAB_PATH/ignore-gfp-wait
>>
>> echo $DEVICE > /sys/bus/pci/drivers/igc/bind
>>
>> Fixes: ea578703b03d ("igc: Add support for LEDs on i225/i226")
>> Signed-off-by: Kohei Enju <enjuk@amazon.com>
>
> Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de>
Thank you for the patch and for identifying this issue!
I was wondering whether we could avoid failing the probe in cases where
igc_led_setup fails. It seems to me that a failure in the LED class
functionality shouldn't prevent the device's core functionality from
working properly.
From what I understand, errors in this function are not due to hardware
malfunctions. Therefore, I suggest we remove the error propagation.
Alternatively, if feasible, we could consider reordering the function
calls so that the LED class setup occurs before the netdev registration.
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [Intel-wired-lan] [PATCH v1 iwl-net] igc: unregister netdev when igc_led_setup() fails in igc_probe()
2025-09-10 7:28 ` [Intel-wired-lan] " Lifshits, Vitaly
@ 2025-09-10 7:52 ` Kohei Enju
2025-09-10 8:57 ` Kurt Kanzenbach
2025-09-10 9:02 ` Loktionov, Aleksandr
0 siblings, 2 replies; 9+ messages in thread
From: Kohei Enju @ 2025-09-10 7:52 UTC (permalink / raw)
To: vitaly.lifshits
Cc: andrew+netdev, anthony.l.nguyen, davem, edumazet, enjuk,
intel-wired-lan, kohei.enju, kuba, kurt, netdev, pabeni,
przemyslaw.kitszel, aleksandr.loktionov
+ Aleksandr
On Wed, 10 Sep 2025 10:28:17 +0300, Lifshits, Vitaly wrote:
>On 9/8/2025 9:26 AM, Kurt Kanzenbach wrote:
>> On Sat Sep 06 2025, Kohei Enju wrote:
>>> Currently igc_probe() doesn't unregister netdev when igc_led_setup()
>>> fails, causing BUG_ON() in free_netdev() and then kernel panics. [1]
>>>
>>> This behavior can be tested using fault-injection framework. I used the
>>> failslab feature to test the issue. [2]
>>>
>>> Call unregister_netdev() when igc_led_setup() fails to avoid the kernel
>>> panic.
>>>
>>> [1]
>>> kernel BUG at net/core/dev.c:12047!
>>> Oops: invalid opcode: 0000 [#1] SMP NOPTI
>>> CPU: 0 UID: 0 PID: 937 Comm: repro-igc-led-e Not tainted 6.17.0-rc4-enjuk-tnguy-00865-gc4940196ab02 #64 PREEMPT(voluntary)
>>> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
>>> RIP: 0010:free_netdev+0x278/0x2b0
>>> [...]
>>> Call Trace:
>>> <TASK>
>>> igc_probe+0x370/0x910
>>> local_pci_probe+0x3a/0x80
>>> pci_device_probe+0xd1/0x200
>>> [...]
>>>
>>> [2]
>>> #!/bin/bash -ex
>>>
>>> FAILSLAB_PATH=/sys/kernel/debug/failslab/
>>> DEVICE=0000:00:05.0
>>> START_ADDR=$(grep " igc_led_setup" /proc/kallsyms \
>>> | awk '{printf("0x%s", $1)}')
>>> END_ADDR=$(printf "0x%x" $((START_ADDR + 0x100)))
>>>
>>> echo $START_ADDR > $FAILSLAB_PATH/require-start
>>> echo $END_ADDR > $FAILSLAB_PATH/require-end
>>> echo 1 > $FAILSLAB_PATH/times
>>> echo 100 > $FAILSLAB_PATH/probability
>>> echo N > $FAILSLAB_PATH/ignore-gfp-wait
>>>
>>> echo $DEVICE > /sys/bus/pci/drivers/igc/bind
>>>
>>> Fixes: ea578703b03d ("igc: Add support for LEDs on i225/i226")
>>> Signed-off-by: Kohei Enju <enjuk@amazon.com>
>>
>> Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de>
>
>Thank you for the patch and for identifying this issue!
>
>I was wondering whether we could avoid failing the probe in cases where
>igc_led_setup fails. It seems to me that a failure in the LED class
>functionality shouldn't prevent the device's core functionality from
>working properly.
Indeed, that also makes sense.
The behavior that igc_probe() succeeds even if igc_led_setup() fails
also seems good to me, as long as notifying users that igc's led
functionality is not available.
>
> From what I understand, errors in this function are not due to hardware
>malfunctions. Therefore, I suggest we remove the error propagation.
>
>Alternatively, if feasible, we could consider reordering the function
>calls so that the LED class setup occurs before the netdev registration.
>
I don't disagree with you, but I would like to hear Kurt and Aleksandr's
opinion. Do you have any preference or suggestions?
I'll revise and work on v2 if needed.
Thanks!
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [Intel-wired-lan] [PATCH v1 iwl-net] igc: unregister netdev when igc_led_setup() fails in igc_probe()
2025-09-10 7:52 ` Kohei Enju
@ 2025-09-10 8:57 ` Kurt Kanzenbach
2025-09-10 9:15 ` Kohei Enju
2025-09-10 9:02 ` Loktionov, Aleksandr
1 sibling, 1 reply; 9+ messages in thread
From: Kurt Kanzenbach @ 2025-09-10 8:57 UTC (permalink / raw)
To: Kohei Enju, vitaly.lifshits
Cc: andrew+netdev, anthony.l.nguyen, davem, edumazet, enjuk,
intel-wired-lan, kohei.enju, kuba, netdev, pabeni,
przemyslaw.kitszel, aleksandr.loktionov
[-- Attachment #1: Type: text/plain, Size: 3023 bytes --]
On Wed Sep 10 2025, Kohei Enju wrote:
> + Aleksandr
>
> On Wed, 10 Sep 2025 10:28:17 +0300, Lifshits, Vitaly wrote:
>
>>On 9/8/2025 9:26 AM, Kurt Kanzenbach wrote:
>>> On Sat Sep 06 2025, Kohei Enju wrote:
>>>> Currently igc_probe() doesn't unregister netdev when igc_led_setup()
>>>> fails, causing BUG_ON() in free_netdev() and then kernel panics. [1]
>>>>
>>>> This behavior can be tested using fault-injection framework. I used the
>>>> failslab feature to test the issue. [2]
>>>>
>>>> Call unregister_netdev() when igc_led_setup() fails to avoid the kernel
>>>> panic.
>>>>
>>>> [1]
>>>> kernel BUG at net/core/dev.c:12047!
>>>> Oops: invalid opcode: 0000 [#1] SMP NOPTI
>>>> CPU: 0 UID: 0 PID: 937 Comm: repro-igc-led-e Not tainted 6.17.0-rc4-enjuk-tnguy-00865-gc4940196ab02 #64 PREEMPT(voluntary)
>>>> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
>>>> RIP: 0010:free_netdev+0x278/0x2b0
>>>> [...]
>>>> Call Trace:
>>>> <TASK>
>>>> igc_probe+0x370/0x910
>>>> local_pci_probe+0x3a/0x80
>>>> pci_device_probe+0xd1/0x200
>>>> [...]
>>>>
>>>> [2]
>>>> #!/bin/bash -ex
>>>>
>>>> FAILSLAB_PATH=/sys/kernel/debug/failslab/
>>>> DEVICE=0000:00:05.0
>>>> START_ADDR=$(grep " igc_led_setup" /proc/kallsyms \
>>>> | awk '{printf("0x%s", $1)}')
>>>> END_ADDR=$(printf "0x%x" $((START_ADDR + 0x100)))
>>>>
>>>> echo $START_ADDR > $FAILSLAB_PATH/require-start
>>>> echo $END_ADDR > $FAILSLAB_PATH/require-end
>>>> echo 1 > $FAILSLAB_PATH/times
>>>> echo 100 > $FAILSLAB_PATH/probability
>>>> echo N > $FAILSLAB_PATH/ignore-gfp-wait
>>>>
>>>> echo $DEVICE > /sys/bus/pci/drivers/igc/bind
>>>>
>>>> Fixes: ea578703b03d ("igc: Add support for LEDs on i225/i226")
>>>> Signed-off-by: Kohei Enju <enjuk@amazon.com>
>>>
>>> Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de>
>>
>>Thank you for the patch and for identifying this issue!
>>
>>I was wondering whether we could avoid failing the probe in cases where
>>igc_led_setup fails. It seems to me that a failure in the LED class
>>functionality shouldn't prevent the device's core functionality from
>>working properly.
>
> Indeed, that also makes sense.
>
> The behavior that igc_probe() succeeds even if igc_led_setup() fails
> also seems good to me, as long as notifying users that igc's led
> functionality is not available.
SGTM. The LED code is nice to have, but not mandatory at all. The device
has sane LED defaults.
>
>>
>> From what I understand, errors in this function are not due to hardware
>>malfunctions. Therefore, I suggest we remove the error propagation.
>>
>>Alternatively, if feasible, we could consider reordering the function
>>calls so that the LED class setup occurs before the netdev registration.
>>
>
> I don't disagree with you, but I would like to hear Kurt and Aleksandr's
> opinion. Do you have any preference or suggestions?
See above.
Thanks,
Kurt
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 861 bytes --]
^ permalink raw reply [flat|nested] 9+ messages in thread
* RE: [Intel-wired-lan] [PATCH v1 iwl-net] igc: unregister netdev when igc_led_setup() fails in igc_probe()
2025-09-10 7:52 ` Kohei Enju
2025-09-10 8:57 ` Kurt Kanzenbach
@ 2025-09-10 9:02 ` Loktionov, Aleksandr
2025-09-10 9:25 ` Kohei Enju
1 sibling, 1 reply; 9+ messages in thread
From: Loktionov, Aleksandr @ 2025-09-10 9:02 UTC (permalink / raw)
To: Kohei Enju, Lifshits, Vitaly
Cc: andrew+netdev@lunn.ch, Nguyen, Anthony L, davem@davemloft.net,
edumazet@google.com, intel-wired-lan@lists.osuosl.org,
kohei.enju@gmail.com, kuba@kernel.org, kurt@linutronix.de,
netdev@vger.kernel.org, pabeni@redhat.com, Kitszel, Przemyslaw
> -----Original Message-----
> From: Kohei Enju <enjuk@amazon.com>
> Sent: Wednesday, September 10, 2025 9:52 AM
> To: Lifshits, Vitaly <vitaly.lifshits@intel.com>
> Cc: andrew+netdev@lunn.ch; Nguyen, Anthony L
> <anthony.l.nguyen@intel.com>; davem@davemloft.net;
> edumazet@google.com; enjuk@amazon.com; intel-wired-
> lan@lists.osuosl.org; kohei.enju@gmail.com; kuba@kernel.org;
> kurt@linutronix.de; netdev@vger.kernel.org; pabeni@redhat.com;
> Kitszel, Przemyslaw <przemyslaw.kitszel@intel.com>; Loktionov,
> Aleksandr <aleksandr.loktionov@intel.com>
> Subject: Re: [Intel-wired-lan] [PATCH v1 iwl-net] igc: unregister
> netdev when igc_led_setup() fails in igc_probe()
>
> + Aleksandr
>
> On Wed, 10 Sep 2025 10:28:17 +0300, Lifshits, Vitaly wrote:
>
> >On 9/8/2025 9:26 AM, Kurt Kanzenbach wrote:
> >> On Sat Sep 06 2025, Kohei Enju wrote:
> >>> Currently igc_probe() doesn't unregister netdev when
> igc_led_setup()
> >>> fails, causing BUG_ON() in free_netdev() and then kernel panics.
> [1]
> >>>
> >>> This behavior can be tested using fault-injection framework. I
> used the
> >>> failslab feature to test the issue. [2]
> >>>
> >>> Call unregister_netdev() when igc_led_setup() fails to avoid the
> kernel
> >>> panic.
> >>>
> >>> [1]
> >>> kernel BUG at net/core/dev.c:12047!
> >>> Oops: invalid opcode: 0000 [#1] SMP NOPTI
> >>> CPU: 0 UID: 0 PID: 937 Comm: repro-igc-led-e Not tainted 6.17.0-
> rc4-enjuk-tnguy-00865-gc4940196ab02 #64 PREEMPT(voluntary)
> >>> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-
> debian-1.16.3-2 04/01/2014
> >>> RIP: 0010:free_netdev+0x278/0x2b0
> >>> [...]
> >>> Call Trace:
> >>> <TASK>
> >>> igc_probe+0x370/0x910
> >>> local_pci_probe+0x3a/0x80
> >>> pci_device_probe+0xd1/0x200
> >>> [...]
> >>>
> >>> [2]
> >>> #!/bin/bash -ex
> >>>
> >>> FAILSLAB_PATH=/sys/kernel/debug/failslab/
> >>> DEVICE=0000:00:05.0
> >>> START_ADDR=$(grep " igc_led_setup" /proc/kallsyms \
> >>> | awk '{printf("0x%s", $1)}')
> >>> END_ADDR=$(printf "0x%x" $((START_ADDR + 0x100)))
> >>>
> >>> echo $START_ADDR > $FAILSLAB_PATH/require-start
> >>> echo $END_ADDR > $FAILSLAB_PATH/require-end
> >>> echo 1 > $FAILSLAB_PATH/times
> >>> echo 100 > $FAILSLAB_PATH/probability
> >>> echo N > $FAILSLAB_PATH/ignore-gfp-wait
> >>>
> >>> echo $DEVICE > /sys/bus/pci/drivers/igc/bind
> >>>
> >>> Fixes: ea578703b03d ("igc: Add support for LEDs on i225/i226")
> >>> Signed-off-by: Kohei Enju <enjuk@amazon.com>
> >>
> >> Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de>
> >
> >Thank you for the patch and for identifying this issue!
> >
> >I was wondering whether we could avoid failing the probe in cases
> where
> >igc_led_setup fails. It seems to me that a failure in the LED class
> >functionality shouldn't prevent the device's core functionality from
> >working properly.
>
> Indeed, that also makes sense.
>
> The behavior that igc_probe() succeeds even if igc_led_setup() fails
> also seems good to me, as long as notifying users that igc's led
> functionality is not available.
>
> >
> > From what I understand, errors in this function are not due to
> hardware
> >malfunctions. Therefore, I suggest we remove the error propagation.
> >
> >Alternatively, if feasible, we could consider reordering the function
> >calls so that the LED class setup occurs before the netdev
> registration.
> >
>
> I don't disagree with you, but I would like to hear Kurt and
> Aleksandr's
> opinion. Do you have any preference or suggestions?
>
> I'll revise and work on v2 if needed.
> Thanks!
Just in case /*I'm Alex*/ here are my 2cents:
I’d treat LED setup as best‑effort and not fail probe if it errors.
Warn once, mark LEDs unavailable, and continue. That keeps datapath
up and avoids tricky probe unwind. If we still want to fail on LED errors,
then either (a) fix the unwind (unregister_netdev et al.) or (b) move LED setup before register_netdev().
If LED labels depend on the netdev name, it’s fine to run LED setup after register_netdev().
Since errors are non‑fatal, there’s no unwind complexity.
Keep igc_led_setup() returning an error for internal visibility, but don’t propagate it as probe failure:
err = igc_led_setup(adapter);
if (err) {
netdev_warn_once(netdev,
"LED init failed (%d); continuing without LED support\n",
err);
adapter->leds_available = false;
} else {
adapter->leds_available = true;
}
In remove()/error paths, guard teardown:
if (adapter->leds_available)
igc_led_teardown(adapter);
Keep current order but fully unwind on error:
err = igc_led_setup(adapter);
if (err) {
unregister_netdev(netdev);
/* del NAPI, free queues, etc. in reverse order */
err = -E...;
goto err_free;
}
With the best regards
Alex
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [Intel-wired-lan] [PATCH v1 iwl-net] igc: unregister netdev when igc_led_setup() fails in igc_probe()
2025-09-10 8:57 ` Kurt Kanzenbach
@ 2025-09-10 9:15 ` Kohei Enju
0 siblings, 0 replies; 9+ messages in thread
From: Kohei Enju @ 2025-09-10 9:15 UTC (permalink / raw)
To: kurt
Cc: aleksandr.loktionov, andrew+netdev, anthony.l.nguyen, davem,
edumazet, enjuk, intel-wired-lan, kohei.enju, kuba, netdev,
pabeni, przemyslaw.kitszel, vitaly.lifshits
On Wed, 10 Sep 2025 10:57:17 +0200, Kurt Kanzenbach wrote:
>On Wed Sep 10 2025, Kohei Enju wrote:
>> + Aleksandr
>>
>> On Wed, 10 Sep 2025 10:28:17 +0300, Lifshits, Vitaly wrote:
>>
>>>On 9/8/2025 9:26 AM, Kurt Kanzenbach wrote:
>>>> On Sat Sep 06 2025, Kohei Enju wrote:
>>>>> Currently igc_probe() doesn't unregister netdev when igc_led_setup()
>>>>> fails, causing BUG_ON() in free_netdev() and then kernel panics. [1]
>>>>>
>>>>> This behavior can be tested using fault-injection framework. I used the
>>>>> failslab feature to test the issue. [2]
>>>>>
>>>>> Call unregister_netdev() when igc_led_setup() fails to avoid the kernel
>>>>> panic.
>>>>>
>>>>> [1]
>>>>> kernel BUG at net/core/dev.c:12047!
>>>>> Oops: invalid opcode: 0000 [#1] SMP NOPTI
>>>>> CPU: 0 UID: 0 PID: 937 Comm: repro-igc-led-e Not tainted 6.17.0-rc4-enjuk-tnguy-00865-gc4940196ab02 #64 PREEMPT(voluntary)
>>>>> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
>>>>> RIP: 0010:free_netdev+0x278/0x2b0
>>>>> [...]
>>>>> Call Trace:
>>>>> <TASK>
>>>>> igc_probe+0x370/0x910
>>>>> local_pci_probe+0x3a/0x80
>>>>> pci_device_probe+0xd1/0x200
>>>>> [...]
>>>>>
>>>>> [2]
>>>>> #!/bin/bash -ex
>>>>>
>>>>> FAILSLAB_PATH=/sys/kernel/debug/failslab/
>>>>> DEVICE=0000:00:05.0
>>>>> START_ADDR=$(grep " igc_led_setup" /proc/kallsyms \
>>>>> | awk '{printf("0x%s", $1)}')
>>>>> END_ADDR=$(printf "0x%x" $((START_ADDR + 0x100)))
>>>>>
>>>>> echo $START_ADDR > $FAILSLAB_PATH/require-start
>>>>> echo $END_ADDR > $FAILSLAB_PATH/require-end
>>>>> echo 1 > $FAILSLAB_PATH/times
>>>>> echo 100 > $FAILSLAB_PATH/probability
>>>>> echo N > $FAILSLAB_PATH/ignore-gfp-wait
>>>>>
>>>>> echo $DEVICE > /sys/bus/pci/drivers/igc/bind
>>>>>
>>>>> Fixes: ea578703b03d ("igc: Add support for LEDs on i225/i226")
>>>>> Signed-off-by: Kohei Enju <enjuk@amazon.com>
>>>>
>>>> Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de>
>>>
>>>Thank you for the patch and for identifying this issue!
>>>
>>>I was wondering whether we could avoid failing the probe in cases where
>>>igc_led_setup fails. It seems to me that a failure in the LED class
>>>functionality shouldn't prevent the device's core functionality from
>>>working properly.
>>
>> Indeed, that also makes sense.
>>
>> The behavior that igc_probe() succeeds even if igc_led_setup() fails
>> also seems good to me, as long as notifying users that igc's led
>> functionality is not available.
>
>SGTM. The LED code is nice to have, but not mandatory at all. The device
>has sane LED defaults.
Thank you for clarification.
I'll do like that in v2.
>
>>
>>>
>>> From what I understand, errors in this function are not due to hardware
>>>malfunctions. Therefore, I suggest we remove the error propagation.
>>>
>>>Alternatively, if feasible, we could consider reordering the function
>>>calls so that the LED class setup occurs before the netdev registration.
>>>
>>
>> I don't disagree with you, but I would like to hear Kurt and Aleksandr's
>> opinion. Do you have any preference or suggestions?
>
>See above.
Got it :)
>
>Thanks,
>Kurt
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: RE: [Intel-wired-lan] [PATCH v1 iwl-net] igc: unregister netdev when igc_led_setup() fails in igc_probe()
2025-09-10 9:02 ` Loktionov, Aleksandr
@ 2025-09-10 9:25 ` Kohei Enju
0 siblings, 0 replies; 9+ messages in thread
From: Kohei Enju @ 2025-09-10 9:25 UTC (permalink / raw)
To: aleksandr.loktionov
Cc: andrew+netdev, anthony.l.nguyen, davem, edumazet, enjuk,
intel-wired-lan, kohei.enju, kuba, kurt, netdev, pabeni,
przemyslaw.kitszel, vitaly.lifshits
On Wed, 10 Sep 2025 09:02:51 +0000, Loktionov, Aleksandr wrote:
[...]
>> >>
>> >> Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de>
>> >
>> >Thank you for the patch and for identifying this issue!
>> >
>> >I was wondering whether we could avoid failing the probe in cases
>> where
>> >igc_led_setup fails. It seems to me that a failure in the LED class
>> >functionality shouldn't prevent the device's core functionality from
>> >working properly.
>>
>> Indeed, that also makes sense.
>>
>> The behavior that igc_probe() succeeds even if igc_led_setup() fails
>> also seems good to me, as long as notifying users that igc's led
>> functionality is not available.
>>
>> >
>> > From what I understand, errors in this function are not due to
>> hardware
>> >malfunctions. Therefore, I suggest we remove the error propagation.
>> >
>> >Alternatively, if feasible, we could consider reordering the function
>> >calls so that the LED class setup occurs before the netdev
>> registration.
>> >
>>
>> I don't disagree with you, but I would like to hear Kurt and
>> Aleksandr's
>> opinion. Do you have any preference or suggestions?
>>
>> I'll revise and work on v2 if needed.
>> Thanks!
>
>Just in case /*I'm Alex*/ here are my 2cents:
> I’d treat LED setup as best‑effort and not fail probe if it errors.
>Warn once, mark LEDs unavailable, and continue. That keeps datapath
>up and avoids tricky probe unwind. If we still want to fail on LED errors,
>then either (a) fix the unwind (unregister_netdev et al.) or (b) move LED setup before register_netdev().
Got it, thank you for your opinion :)
>
> If LED labels depend on the netdev name, it’s fine to run LED setup after register_netdev().
>Since errors are non‑fatal, there’s no unwind complexity.
>
>Keep igc_led_setup() returning an error for internal visibility, but don’t propagate it as probe failure:
>err = igc_led_setup(adapter);
>if (err) {
> netdev_warn_once(netdev,
> "LED init failed (%d); continuing without LED support\n",
> err);
> adapter->leds_available = false;
>} else {
> adapter->leds_available = true;
>}
>
>In remove()/error paths, guard teardown:
>if (adapter->leds_available)
> igc_led_teardown(adapter);
I would like to adopt this approach, where we don't propagate led's
failure and manage its status using flag like leds_available.
I'll send v2 shortly.
>
>Keep current order but fully unwind on error:
>err = igc_led_setup(adapter);
>if (err) {
> unregister_netdev(netdev);
> /* del NAPI, free queues, etc. in reverse order */
> err = -E...;
> goto err_free;
>}
>
>With the best regards
>Alex
^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2025-09-10 9:25 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-09-06 5:51 [PATCH v1 iwl-net] igc: unregister netdev when igc_led_setup() fails in igc_probe() Kohei Enju
2025-09-08 6:26 ` Kurt Kanzenbach
2025-09-10 7:28 ` [Intel-wired-lan] " Lifshits, Vitaly
2025-09-10 7:52 ` Kohei Enju
2025-09-10 8:57 ` Kurt Kanzenbach
2025-09-10 9:15 ` Kohei Enju
2025-09-10 9:02 ` Loktionov, Aleksandr
2025-09-10 9:25 ` Kohei Enju
2025-09-08 6:32 ` Loktionov, Aleksandr
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).