linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Re: [PATCH] wifi: carl9170: do not ping device which has failed to load firmware
       [not found] <y4ufvifcearf75qds5hlro3rfiadwfwlixz5xg3w6jjozk5sdg@7yyfsdvyehon>
@ 2025-06-14 16:33 ` Christian Lamparter
  0 siblings, 0 replies; 7+ messages in thread
From: Christian Lamparter @ 2025-06-14 16:33 UTC (permalink / raw)
  To: Fedor Pchelkin, Dmitry Antipov
  Cc: Christian Lamparter, linux-wireless, open list

Hi,

On 6/13/25 10:19 PM, Fedor Pchelkin wrote:
> Dmitry Antipov wrote:
>> Syzkaller reports [1, 2] crashes caused by an attempts to ping
>> the device which has failed to load firmware. Since such a device
>> doesn't pass 'ieee80211_register_hw()', an internal workqueue
>> managed by 'ieee80211_queue_work()' is not yet created and an
>> attempt to queue work on it causes null-ptr-deref.
>>
>> [1] https://syzkaller.appspot.com/bug?extid=9a4aec827829942045ff
>> [2] https://syzkaller.appspot.com/bug?extid=0d8afba53e8fb2633217
>> Fixes: e4a668c59080 ("carl9170: fix spurious restart due to high latency")
>> Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru>
>> ---
>>   drivers/net/wireless/ath/carl9170/usb.c | 15 ++++++++++-----
>>   1 file changed, 10 insertions(+), 5 deletions(-)
>>
>> diff --git a/drivers/net/wireless/ath/carl9170/usb.c b/drivers/net/wireless/ath/carl9170/usb.c
>> index a3e03580cd9f..a0bfa0c477ee 100644
>> --- a/drivers/net/wireless/ath/carl9170/usb.c
>> +++ b/drivers/net/wireless/ath/carl9170/usb.c
>> @@ -438,13 +438,18 @@ static void carl9170_usb_rx_complete(struct urb *urb)
>>   
>>   		if (atomic_read(&ar->rx_anch_urbs) == 0) {
>>   			/*
>> -			 * The system is too slow to cope with
>> -			 * the enormous workload. We have simply
>> -			 * run out of active rx urbs and this
>> -			 * unfortunately leads to an unpredictable
>> -			 * device.
>> +			 * At this point, either the system is too slow to
>> +			 * cope with the enormous workload (so we have simply
>> +			 * run out of active rx urbs and this unfortunately
>> +			 * leads to an unpredictable device), or the device
>> +			 * is not fully functional after an unsuccessful
>> +			 * firmware loading attempts (so it doesn't pass
>> +			 * ieee80211_register_hw() and there is no internal
>> +			 * workqueue at all).
>>   			 */
>>   
>> +			if (WARN_ON_ONCE(!ar->registered))
>> +				return;
> 
> Is WARN justifiable here if it concerns handling a predefined error
> condition?

The driver has many more WARN_ON_(ONCE). Most of them are from "back in the day". I think
carl9170 predates Syzkaller by something like 5 years or less.

In this case, it would be good to know if this only happens with syzkaller, or with some
dogy device (be it the hci, or maybe the ar9170 device itself - they are getting old by now).
I mean Garbage In => Garbage Out. But yes, it shouldn't crash.

> I mean, yeah, it avoids a crash in the completion handler but kernels
> with panic_on_warn - the ones which Syzkaller runs - will still stumble
> here for no reason.

I bet there is already a precedence for this, if someone knows it or has previous decisions:
Please join in!

Regards,
Christian

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] wifi: carl9170: do not ping device which has failed to load firmware
       [not found] <12719f87-ce87-4614-a34e-5f05efd55121@gmail.com>
@ 2025-06-15 19:54 ` Fedor Pchelkin
  2025-06-16 10:01   ` Dmitry Antipov
  0 siblings, 1 reply; 7+ messages in thread
From: Fedor Pchelkin @ 2025-06-15 19:54 UTC (permalink / raw)
  To: Christian Lamparter
  Cc: Dmitry Antipov, linux-wireless, Christian Lamparter, linux-kernel

It's better to be back to the original thread.  Sorry, I misclicked and
forgot to put In-Reply-To firstly.

That part of the broken thread:
https://lore.kernel.org/linux-wireless/y4ufvifcearf75qds5hlro3rfiadwfwlixz5xg3w6jjozk5sdg@7yyfsdvyehon/T/#u

Quoting your last reply there..

On Sat, 14 Jun 2025 18:33:28 +0200, Christian Lamparter wrote:
> Hi,
> 
> On 6/13/25 10:19 PM, Fedor Pchelkin wrote:
> > Dmitry Antipov wrote:
> >> Syzkaller reports [1, 2] crashes caused by an attempts to ping
> >> the device which has failed to load firmware. Since such a device
> >> doesn't pass 'ieee80211_register_hw()', an internal workqueue
> >> managed by 'ieee80211_queue_work()' is not yet created and an
> >> attempt to queue work on it causes null-ptr-deref.
> >>
> >> [1] https://syzkaller.appspot.com/bug?extid=9a4aec827829942045ff
> >> [2] https://syzkaller.appspot.com/bug?extid=0d8afba53e8fb2633217
> >> Fixes: e4a668c59080 ("carl9170: fix spurious restart due to high latency")
> >> Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru>
> >> ---
> >>   drivers/net/wireless/ath/carl9170/usb.c | 15 ++++++++++-----
> >>   1 file changed, 10 insertions(+), 5 deletions(-)
> >>
> >> diff --git a/drivers/net/wireless/ath/carl9170/usb.c b/drivers/net/wireless/ath/carl9170/usb.c
> >> index a3e03580cd9f..a0bfa0c477ee 100644
> >> --- a/drivers/net/wireless/ath/carl9170/usb.c
> >> +++ b/drivers/net/wireless/ath/carl9170/usb.c
> >> @@ -438,13 +438,18 @@ static void carl9170_usb_rx_complete(struct urb *urb)
> >>   
> >>   		if (atomic_read(&ar->rx_anch_urbs) == 0) {
> >>   			/*
> >> -			 * The system is too slow to cope with
> >> -			 * the enormous workload. We have simply
> >> -			 * run out of active rx urbs and this
> >> -			 * unfortunately leads to an unpredictable
> >> -			 * device.
> >> +			 * At this point, either the system is too slow to
> >> +			 * cope with the enormous workload (so we have simply
> >> +			 * run out of active rx urbs and this unfortunately
> >> +			 * leads to an unpredictable device), or the device
> >> +			 * is not fully functional after an unsuccessful
> >> +			 * firmware loading attempts (so it doesn't pass
> >> +			 * ieee80211_register_hw() and there is no internal
> >> +			 * workqueue at all).
> >>   			 */
> >>   
> >> +			if (WARN_ON_ONCE(!ar->registered))
> >> +				return;
> > 
> > Is WARN justifiable here if it concerns handling a predefined error
> > condition?
> 
> The driver has many more WARN_ON_(ONCE). Most of them are from "back in the day". I think
> carl9170 predates Syzkaller by something like 5 years or less.

Just the fact of presence of other WARNs in the driver is not usually a
pretext to add the new ones.  WARNs should be used to point out that
something is going wrong on the kernel level, i.e. be indicators of
kernel bugs (BUG and BUG_ONs are discouraged for the new code, if I
remember correctly).

> 
> In this case, it would be good to know if this only happens with syzkaller, or with some
> dogy device (be it the hci, or maybe the ar9170 device itself - they are getting old by now).
> I mean Garbage In => Garbage Out. But yes, it shouldn't crash.

The patch description lacks details on why Syzkaller does ever hit such a
situation, even taking into account that fuzzers love to exaggerate the
likelihood of hitting some weird and impossible-to-hit-in-practice stuff.

Dmitry, if you'd like to, please give some comments on the following..

The path in question is executed when carl9170_usb_submit_rx_urb() fails.
I didn't check the repro, only judging by driver code and crash report,
but presumably it fails due to usb_submit_urb() returning some -ENODEV
because the device is disconnected concurrently.  usb_submit_urb() errors
lead to &ar->rx_anch_urbs still remaining to have a zero value.

I think the key reason of the crash is that the device disconnection and
the URB completion callback - carl9170_usb_rx_complete() where the crash
occurs - have a race.

The logs say it's disconnected just before the crash:
[   87.100458][ T2990] usb 4-1: USB disconnect, device number 11
[   87.101369][    C1] Oops: general protection fault, probably for non-canonical address 0xdffffc0000000038: 0000 [#1] SMP KASAN PTI
[   87.101407][    C1] KASAN: null-ptr-deref in range [0x00000000000001c0-0x00000000000001c7]

So it looks like ar->registered being false here is a "correct" failure
condition, i.e. it can be expected when the certain phase of the driver
initialization fails and should be handled without any WARNs.

> 
> > I mean, yeah, it avoids a crash in the completion handler but kernels
> > with panic_on_warn - the ones which Syzkaller runs - will still stumble
> > here for no reason.
> 
> I bet there is already a precedence for this, if someone knows it or has previous decisions:
> Please join in!
> 
> Regards,
> Christian

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] wifi: carl9170: do not ping device which has failed to load firmware
  2025-06-15 19:54 ` [PATCH] wifi: carl9170: do not ping device which has failed to load firmware Fedor Pchelkin
@ 2025-06-16 10:01   ` Dmitry Antipov
  2025-06-16 16:55     ` Christian Lamparter
  0 siblings, 1 reply; 7+ messages in thread
From: Dmitry Antipov @ 2025-06-16 10:01 UTC (permalink / raw)
  To: Fedor Pchelkin, Christian Lamparter
  Cc: linux-wireless, Christian Lamparter, linux-kernel

On 6/15/25 10:54 PM, Fedor Pchelkin wrote:

> So it looks like ar->registered being false here is a "correct" failure
> condition, i.e. it can be expected when the certain phase of the driver
> initialization fails and should be handled without any WARNs.

Looking through Documentation/process/coding-style.rst, it may be
better to use pr_warn_once() instead; anyway I would prefer leave
the final decision to the maintainer.

Dmitry


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] wifi: carl9170: do not ping device which has failed to load firmware
  2025-06-16 10:01   ` Dmitry Antipov
@ 2025-06-16 16:55     ` Christian Lamparter
  2025-06-16 18:12       ` [PATCH v2] " Dmitry Antipov
  0 siblings, 1 reply; 7+ messages in thread
From: Christian Lamparter @ 2025-06-16 16:55 UTC (permalink / raw)
  To: Dmitry Antipov, Fedor Pchelkin; +Cc: linux-wireless, linux-kernel

Hi,

On 6/16/25 12:01 PM, Dmitry Antipov wrote:
> On 6/15/25 10:54 PM, Fedor Pchelkin wrote:
> 
>> So it looks like ar->registered being false here is a "correct" failure
>> condition, i.e. it can be expected when the certain phase of the driver
>> initialization fails and should be handled without any WARNs.
> 
> Looking through Documentation/process/coding-style.rst, it may be
> better to use pr_warn_once() instead; anyway I would prefer leave
> the final decision to the maintainer.

Sure. I think you made a fine point. Grepping through drivers/net/wireless
in the (wireless-testing wt-2025-06-08-24) it seems that this could be the
first  pr_warn_once in there.

@Dmitry, would you please respin the patch? Thank you!

Regards,
Christian


^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH v2] wifi: carl9170: do not ping device which has failed to load firmware
  2025-06-16 16:55     ` Christian Lamparter
@ 2025-06-16 18:12       ` Dmitry Antipov
  2025-06-17  6:18         ` Christian Lamparter
  2025-06-17 23:32         ` Jeff Johnson
  0 siblings, 2 replies; 7+ messages in thread
From: Dmitry Antipov @ 2025-06-16 18:12 UTC (permalink / raw)
  To: Christian Lamparter
  Cc: Fedor Pchelkin, linux-wireless, linux-kernel, Dmitry Antipov

Syzkaller reports [1, 2] crashes caused by an attempts to ping
the device which has failed to load firmware. Since such a device
doesn't pass 'ieee80211_register_hw()', an internal workqueue
managed by 'ieee80211_queue_work()' is not yet created and an
attempt to queue work on it causes null-ptr-deref.

[1] https://syzkaller.appspot.com/bug?extid=9a4aec827829942045ff
[2] https://syzkaller.appspot.com/bug?extid=0d8afba53e8fb2633217
Fixes: e4a668c59080 ("carl9170: fix spurious restart due to high latency")
Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru>
---
v2: prefer pr_warn_once() over WARN_ON_ONCE()
---
 drivers/net/wireless/ath/carl9170/usb.c | 19 +++++++++++++------
 1 file changed, 13 insertions(+), 6 deletions(-)

diff --git a/drivers/net/wireless/ath/carl9170/usb.c b/drivers/net/wireless/ath/carl9170/usb.c
index a3e03580cd9f..564ca6a61985 100644
--- a/drivers/net/wireless/ath/carl9170/usb.c
+++ b/drivers/net/wireless/ath/carl9170/usb.c
@@ -438,14 +438,21 @@ static void carl9170_usb_rx_complete(struct urb *urb)
 
 		if (atomic_read(&ar->rx_anch_urbs) == 0) {
 			/*
-			 * The system is too slow to cope with
-			 * the enormous workload. We have simply
-			 * run out of active rx urbs and this
-			 * unfortunately leads to an unpredictable
-			 * device.
+			 * At this point, either the system is too slow to
+			 * cope with the enormous workload (so we have simply
+			 * run out of active rx urbs and this unfortunately
+			 * leads to an unpredictable device), or the device
+			 * is not fully functional after an unsuccessful
+			 * firmware loading attempts (so it doesn't pass
+			 * ieee80211_register_hw() and there is no internal
+			 * workqueue at all).
 			 */
 
-			ieee80211_queue_work(ar->hw, &ar->ping_work);
+			if (ar->registered)
+				ieee80211_queue_work(ar->hw, &ar->ping_work);
+			else
+				pr_warn_once("device %s is not registered\n",
+					     dev_name(&ar->udev->dev));
 		}
 	} else {
 		/*
-- 
2.49.0


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH v2] wifi: carl9170: do not ping device which has failed to load firmware
  2025-06-16 18:12       ` [PATCH v2] " Dmitry Antipov
@ 2025-06-17  6:18         ` Christian Lamparter
  2025-06-17 23:32         ` Jeff Johnson
  1 sibling, 0 replies; 7+ messages in thread
From: Christian Lamparter @ 2025-06-17  6:18 UTC (permalink / raw)
  To: Dmitry Antipov; +Cc: Fedor Pchelkin, linux-wireless, linux-kernel, Jeff Johnson

On 6/16/25 8:12 PM, Dmitry Antipov wrote:
> Syzkaller reports [1, 2] crashes caused by an attempts to ping
> the device which has failed to load firmware. Since such a device
> doesn't pass 'ieee80211_register_hw()', an internal workqueue
> managed by 'ieee80211_queue_work()' is not yet created and an
> attempt to queue work on it causes null-ptr-deref.
> 
> [1] https://syzkaller.appspot.com/bug?extid=9a4aec827829942045ff
> [2] https://syzkaller.appspot.com/bug?extid=0d8afba53e8fb2633217
> Fixes: e4a668c59080 ("carl9170: fix spurious restart due to high latency")
> Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru>
Acked-by: Christian Lamparter <chunkeey@gmail.com>
Cc: <stable@vger.kernel.org>

> ---
> v2: prefer pr_warn_once() over WARN_ON_ONCE()
> ---
>   drivers/net/wireless/ath/carl9170/usb.c | 19 +++++++++++++------
>   1 file changed, 13 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/net/wireless/ath/carl9170/usb.c b/drivers/net/wireless/ath/carl9170/usb.c
> index a3e03580cd9f..564ca6a61985 100644
> --- a/drivers/net/wireless/ath/carl9170/usb.c
> +++ b/drivers/net/wireless/ath/carl9170/usb.c
> @@ -438,14 +438,21 @@ static void carl9170_usb_rx_complete(struct urb *urb)
>   
>   		if (atomic_read(&ar->rx_anch_urbs) == 0) {
>   			/*
> -			 * The system is too slow to cope with
> -			 * the enormous workload. We have simply
> -			 * run out of active rx urbs and this
> -			 * unfortunately leads to an unpredictable
> -			 * device.
> +			 * At this point, either the system is too slow to
> +			 * cope with the enormous workload (so we have simply
> +			 * run out of active rx urbs and this unfortunately
> +			 * leads to an unpredictable device), or the device
> +			 * is not fully functional after an unsuccessful
> +			 * firmware loading attempts (so it doesn't pass
> +			 * ieee80211_register_hw() and there is no internal
> +			 * workqueue at all).
>   			 */
>   
> -			ieee80211_queue_work(ar->hw, &ar->ping_work);
> +			if (ar->registered)
> +				ieee80211_queue_work(ar->hw, &ar->ping_work);
> +			else
> +				pr_warn_once("device %s is not registered\n",
> +					     dev_name(&ar->udev->dev));
>   		}
>   	} else {
>   		/*


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v2] wifi: carl9170: do not ping device which has failed to load firmware
  2025-06-16 18:12       ` [PATCH v2] " Dmitry Antipov
  2025-06-17  6:18         ` Christian Lamparter
@ 2025-06-17 23:32         ` Jeff Johnson
  1 sibling, 0 replies; 7+ messages in thread
From: Jeff Johnson @ 2025-06-17 23:32 UTC (permalink / raw)
  To: Christian Lamparter, Dmitry Antipov
  Cc: Fedor Pchelkin, linux-wireless, linux-kernel


On Mon, 16 Jun 2025 21:12:05 +0300, Dmitry Antipov wrote:
> Syzkaller reports [1, 2] crashes caused by an attempts to ping
> the device which has failed to load firmware. Since such a device
> doesn't pass 'ieee80211_register_hw()', an internal workqueue
> managed by 'ieee80211_queue_work()' is not yet created and an
> attempt to queue work on it causes null-ptr-deref.
> 
> [1] https://syzkaller.appspot.com/bug?extid=9a4aec827829942045ff
> [2] https://syzkaller.appspot.com/bug?extid=0d8afba53e8fb2633217
> 
> [...]

Applied, thanks!

[1/1] wifi: carl9170: do not ping device which has failed to load firmware
      commit: 15d25307692312cec4b57052da73387f91a2e870

Best regards,
-- 
Jeff Johnson <jeff.johnson@oss.qualcomm.com>


^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2025-06-17 23:32 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <12719f87-ce87-4614-a34e-5f05efd55121@gmail.com>
2025-06-15 19:54 ` [PATCH] wifi: carl9170: do not ping device which has failed to load firmware Fedor Pchelkin
2025-06-16 10:01   ` Dmitry Antipov
2025-06-16 16:55     ` Christian Lamparter
2025-06-16 18:12       ` [PATCH v2] " Dmitry Antipov
2025-06-17  6:18         ` Christian Lamparter
2025-06-17 23:32         ` Jeff Johnson
     [not found] <y4ufvifcearf75qds5hlro3rfiadwfwlixz5xg3w6jjozk5sdg@7yyfsdvyehon>
2025-06-14 16:33 ` [PATCH] " Christian Lamparter

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).