linux-bluetooth.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] USB: class: cdc-wdm: Fix CPU lockup caused by excessive log messages
@ 2024-06-14  1:30 Alan Stern
  2024-06-14  2:20 ` bluez.test.bot
  2024-06-28 18:23 ` [PATCH] " patchwork-bot+bluetooth
  0 siblings, 2 replies; 3+ messages in thread
From: Alan Stern @ 2024-06-14  1:30 UTC (permalink / raw)
  To: Greg KH
  Cc: syzbot, syzbot, johan.hedberg, linux-bluetooth, linux-kernel,
	linux-usb, luiz.dentz, marcel, syzkaller-bugs

The syzbot fuzzer found that the interrupt-URB completion callback in
the cdc-wdm driver was taking too long, and the driver's immediate
resubmission of interrupt URBs with -EPROTO status combined with the
dummy-hcd emulation to cause a CPU lockup:


cdc_wdm 1-1:1.0: nonzero urb status received: -71
cdc_wdm 1-1:1.0: wdm_int_callback - 0 bytes
watchdog: BUG: soft lockup - CPU#0 stuck for 26s! [syz-executor782:6625]
CPU#0 Utilization every 4s during lockup:
	#1:  98% system,	  0% softirq,	  3% hardirq,	  0% idle
	#2:  98% system,	  0% softirq,	  3% hardirq,	  0% idle
	#3:  98% system,	  0% softirq,	  3% hardirq,	  0% idle
	#4:  98% system,	  0% softirq,	  3% hardirq,	  0% idle
	#5:  98% system,	  1% softirq,	  3% hardirq,	  0% idle
Modules linked in:
irq event stamp: 73096
hardirqs last  enabled at (73095): [<ffff80008037bc00>] console_emit_next_record kernel/printk/printk.c:2935 [inline]
hardirqs last  enabled at (73095): [<ffff80008037bc00>] console_flush_all+0x650/0xb74 kernel/printk/printk.c:2994
hardirqs last disabled at (73096): [<ffff80008af10b00>] __el1_irq arch/arm64/kernel/entry-common.c:533 [inline]
hardirqs last disabled at (73096): [<ffff80008af10b00>] el1_interrupt+0x24/0x68 arch/arm64/kernel/entry-common.c:551
softirqs last  enabled at (73048): [<ffff8000801ea530>] softirq_handle_end kernel/softirq.c:400 [inline]
softirqs last  enabled at (73048): [<ffff8000801ea530>] handle_softirqs+0xa60/0xc34 kernel/softirq.c:582
softirqs last disabled at (73043): [<ffff800080020de8>] __do_softirq+0x14/0x20 kernel/softirq.c:588
CPU: 0 PID: 6625 Comm: syz-executor782 Tainted: G        W          6.10.0-rc2-syzkaller-g8867bbd4a056 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 04/02/2024


Testing showed that the problem did not occur if the two error
messages -- the first two lines above -- were removed; apparently adding
material to the kernel log takes a surprisingly large amount of time.

In any case, the best approach for preventing these lockups and to
avoid spamming the log with thousands of error messages per second is
to ratelimit the two dev_err() calls.  Therefore we replace them with
dev_err_ratelimited().

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Suggested-by: Greg KH <gregkh@linuxfoundation.org>
Reported-and-tested-by: syzbot+5f996b83575ef4058638@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/linux-usb/00000000000073d54b061a6a1c65@google.com/
Reported-and-tested-by: syzbot+1b2abad17596ad03dcff@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/linux-usb/000000000000f45085061aa9b37e@google.com/
Fixes: 9908a32e94de ("USB: remove err() macro from usb class drivers")
Link: https://lore.kernel.org/linux-usb/40dfa45b-5f21-4eef-a8c1-51a2f320e267@rowland.harvard.edu/
Cc: stable@vger.kernel.org

---

 drivers/usb/class/cdc-wdm.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

Index: usb-devel/drivers/usb/class/cdc-wdm.c
===================================================================
--- usb-devel.orig/drivers/usb/class/cdc-wdm.c
+++ usb-devel/drivers/usb/class/cdc-wdm.c
@@ -266,14 +266,14 @@ static void wdm_int_callback(struct urb
 			dev_err(&desc->intf->dev, "Stall on int endpoint\n");
 			goto sw; /* halt is cleared in work */
 		default:
-			dev_err(&desc->intf->dev,
+			dev_err_ratelimited(&desc->intf->dev,
 				"nonzero urb status received: %d\n", status);
 			break;
 		}
 	}
 
 	if (urb->actual_length < sizeof(struct usb_cdc_notification)) {
-		dev_err(&desc->intf->dev, "wdm_int_callback - %d bytes\n",
+		dev_err_ratelimited(&desc->intf->dev, "wdm_int_callback - %d bytes\n",
 			urb->actual_length);
 		goto exit;
 	}

^ permalink raw reply	[flat|nested] 3+ messages in thread

* RE: USB: class: cdc-wdm: Fix CPU lockup caused by excessive log messages
  2024-06-14  1:30 [PATCH] USB: class: cdc-wdm: Fix CPU lockup caused by excessive log messages Alan Stern
@ 2024-06-14  2:20 ` bluez.test.bot
  2024-06-28 18:23 ` [PATCH] " patchwork-bot+bluetooth
  1 sibling, 0 replies; 3+ messages in thread
From: bluez.test.bot @ 2024-06-14  2:20 UTC (permalink / raw)
  To: linux-bluetooth, stern

[-- Attachment #1: Type: text/plain, Size: 5818 bytes --]

This is automated email and please do not reply to this email!

Dear submitter,

Thank you for submitting the patches to the linux bluetooth mailing list.
This is a CI test results with your patch series:
PW Link:https://patchwork.kernel.org/project/bluetooth/list/?series=861868

---Test result---

Test Summary:
CheckPatch                    FAIL      0.86 seconds
GitLint                       FAIL      0.47 seconds
SubjectPrefix                 FAIL      0.33 seconds
BuildKernel                   PASS      30.26 seconds
CheckAllWarning               PASS      32.87 seconds
CheckSparse                   PASS      38.68 seconds
CheckSmatch                   PASS      104.83 seconds
BuildKernel32                 PASS      29.43 seconds
TestRunnerSetup               PASS      532.86 seconds
TestRunner_l2cap-tester       PASS      20.45 seconds
TestRunner_iso-tester         PASS      32.76 seconds
TestRunner_bnep-tester        PASS      4.87 seconds
TestRunner_mgmt-tester        FAIL      113.51 seconds
TestRunner_rfcomm-tester      PASS      7.43 seconds
TestRunner_sco-tester         PASS      13.04 seconds
TestRunner_ioctl-tester       PASS      7.91 seconds
TestRunner_mesh-tester        PASS      6.00 seconds
TestRunner_smp-tester         PASS      6.99 seconds
TestRunner_userchan-tester    PASS      5.07 seconds
IncrementalBuild              PASS      28.37 seconds

Details
##############################
Test: CheckPatch - FAIL
Desc: Run checkpatch.pl script
Output:
USB: class: cdc-wdm: Fix CPU lockup caused by excessive log messages
WARNING: Prefer a maximum 75 chars per line (possible unwrapped commit description?)
#66: 
hardirqs last  enabled at (73095): [<ffff80008037bc00>] console_emit_next_record kernel/printk/printk.c:2935 [inline]

WARNING: Possible repeated word: 'Google'
#74: 
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 04/02/2024

WARNING: Prefer using '"%s...", __func__' to using 'wdm_int_callback', this function's name, in a string
#117: FILE: drivers/usb/class/cdc-wdm.c:276:
+		dev_err_ratelimited(&desc->intf->dev, "wdm_int_callback - %d bytes\n",

total: 0 errors, 3 warnings, 16 lines checked

NOTE: For some of the reported defects, checkpatch may be able to
      mechanically convert to the typical style using --fix or --fix-inplace.

/github/workspace/src/src/13697702.patch has style problems, please review.

NOTE: Ignored message types: UNKNOWN_COMMIT_ID

NOTE: If any of the errors are false positives, please report
      them to the maintainer, see CHECKPATCH in MAINTAINERS.


##############################
Test: GitLint - FAIL
Desc: Run gitlint
Output:
USB: class: cdc-wdm: Fix CPU lockup caused by excessive log messages

WARNING: I3 - ignore-body-lines: gitlint will be switching from using Python regex 'match' (match beginning) to 'search' (match anywhere) semantics. Please review your ignore-body-lines.regex option accordingly. To remove this warning, set general.regex-style-search=True. More details: https://jorisroovers.github.io/gitlint/configuration/#regex-style-search
13: B3 Line contains hard tab characters (\t): "	#1:  98% system,	  0% softirq,	  3% hardirq,	  0% idle"
14: B3 Line contains hard tab characters (\t): "	#2:  98% system,	  0% softirq,	  3% hardirq,	  0% idle"
15: B3 Line contains hard tab characters (\t): "	#3:  98% system,	  0% softirq,	  3% hardirq,	  0% idle"
16: B3 Line contains hard tab characters (\t): "	#4:  98% system,	  0% softirq,	  3% hardirq,	  0% idle"
17: B3 Line contains hard tab characters (\t): "	#5:  98% system,	  1% softirq,	  3% hardirq,	  0% idle"
20: B1 Line exceeds max length (117>80): "hardirqs last  enabled at (73095): [<ffff80008037bc00>] console_emit_next_record kernel/printk/printk.c:2935 [inline]"
21: B1 Line exceeds max length (113>80): "hardirqs last  enabled at (73095): [<ffff80008037bc00>] console_flush_all+0x650/0xb74 kernel/printk/printk.c:2994"
22: B1 Line exceeds max length (111>80): "hardirqs last disabled at (73096): [<ffff80008af10b00>] __el1_irq arch/arm64/kernel/entry-common.c:533 [inline]"
23: B1 Line exceeds max length (116>80): "hardirqs last disabled at (73096): [<ffff80008af10b00>] el1_interrupt+0x24/0x68 arch/arm64/kernel/entry-common.c:551"
24: B1 Line exceeds max length (104>80): "softirqs last  enabled at (73048): [<ffff8000801ea530>] softirq_handle_end kernel/softirq.c:400 [inline]"
25: B1 Line exceeds max length (104>80): "softirqs last  enabled at (73048): [<ffff8000801ea530>] handle_softirqs+0xa60/0xc34 kernel/softirq.c:582"
26: B1 Line exceeds max length (99>80): "softirqs last disabled at (73043): [<ffff800080020de8>] __do_softirq+0x14/0x20 kernel/softirq.c:588"
27: B1 Line exceeds max length (105>80): "CPU: 0 PID: 6625 Comm: syz-executor782 Tainted: G        W          6.10.0-rc2-syzkaller-g8867bbd4a056 #0"
28: B1 Line exceeds max length (89>80): "Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 04/02/2024"
43: B1 Line exceeds max length (82>80): "Closes: https://lore.kernel.org/linux-usb/00000000000073d54b061a6a1c65@google.com/"
45: B1 Line exceeds max length (82>80): "Closes: https://lore.kernel.org/linux-usb/000000000000f45085061aa9b37e@google.com/"
46: B1 Line exceeds max length (97>80): "Link: https://lore.kernel.org/linux-usb/40dfa45b-5f21-4eef-a8c1-51a2f320e267@rowland.harvard.edu/"
##############################
Test: SubjectPrefix - FAIL
Desc: Check subject contains "Bluetooth" prefix
Output:
"Bluetooth: " prefix is not specified in the subject
##############################
Test: TestRunner_mgmt-tester - FAIL
Desc: Run mgmt-tester with test-runner
Output:
Total: 492, Passed: 489 (99.4%), Failed: 1, Not Run: 2

Failed Test Cases
LL Privacy - Add Device 6 (RL is full)               Failed       0.194 seconds


---
Regards,
Linux Bluetooth


^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH] USB: class: cdc-wdm: Fix CPU lockup caused by excessive log messages
  2024-06-14  1:30 [PATCH] USB: class: cdc-wdm: Fix CPU lockup caused by excessive log messages Alan Stern
  2024-06-14  2:20 ` bluez.test.bot
@ 2024-06-28 18:23 ` patchwork-bot+bluetooth
  1 sibling, 0 replies; 3+ messages in thread
From: patchwork-bot+bluetooth @ 2024-06-28 18:23 UTC (permalink / raw)
  To: Alan Stern
  Cc: gregkh, syzbot+1b2abad17596ad03dcff, syzbot+5f996b83575ef4058638,
	johan.hedberg, linux-bluetooth, linux-kernel, linux-usb,
	luiz.dentz, marcel, syzkaller-bugs

Hello:

This patch was applied to bluetooth/bluetooth-next.git (master)
by Greg Kroah-Hartman <gregkh@linuxfoundation.org>:

On Thu, 13 Jun 2024 21:30:43 -0400 you wrote:
> The syzbot fuzzer found that the interrupt-URB completion callback in
> the cdc-wdm driver was taking too long, and the driver's immediate
> resubmission of interrupt URBs with -EPROTO status combined with the
> dummy-hcd emulation to cause a CPU lockup:
> 
> 
> cdc_wdm 1-1:1.0: nonzero urb status received: -71
> cdc_wdm 1-1:1.0: wdm_int_callback - 0 bytes
> watchdog: BUG: soft lockup - CPU#0 stuck for 26s! [syz-executor782:6625]
> CPU#0 Utilization every 4s during lockup:
> 	#1:  98% system,	  0% softirq,	  3% hardirq,	  0% idle
> 	#2:  98% system,	  0% softirq,	  3% hardirq,	  0% idle
> 	#3:  98% system,	  0% softirq,	  3% hardirq,	  0% idle
> 	#4:  98% system,	  0% softirq,	  3% hardirq,	  0% idle
> 	#5:  98% system,	  1% softirq,	  3% hardirq,	  0% idle
> Modules linked in:
> irq event stamp: 73096
> hardirqs last  enabled at (73095): [<ffff80008037bc00>] console_emit_next_record kernel/printk/printk.c:2935 [inline]
> hardirqs last  enabled at (73095): [<ffff80008037bc00>] console_flush_all+0x650/0xb74 kernel/printk/printk.c:2994
> hardirqs last disabled at (73096): [<ffff80008af10b00>] __el1_irq arch/arm64/kernel/entry-common.c:533 [inline]
> hardirqs last disabled at (73096): [<ffff80008af10b00>] el1_interrupt+0x24/0x68 arch/arm64/kernel/entry-common.c:551
> softirqs last  enabled at (73048): [<ffff8000801ea530>] softirq_handle_end kernel/softirq.c:400 [inline]
> softirqs last  enabled at (73048): [<ffff8000801ea530>] handle_softirqs+0xa60/0xc34 kernel/softirq.c:582
> softirqs last disabled at (73043): [<ffff800080020de8>] __do_softirq+0x14/0x20 kernel/softirq.c:588
> CPU: 0 PID: 6625 Comm: syz-executor782 Tainted: G        W          6.10.0-rc2-syzkaller-g8867bbd4a056 #0
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 04/02/2024
> 
> [...]

Here is the summary with links:
  - USB: class: cdc-wdm: Fix CPU lockup caused by excessive log messages
    https://git.kernel.org/bluetooth/bluetooth-next/c/22f008128625

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2024-06-28 18:23 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-06-14  1:30 [PATCH] USB: class: cdc-wdm: Fix CPU lockup caused by excessive log messages Alan Stern
2024-06-14  2:20 ` bluez.test.bot
2024-06-28 18:23 ` [PATCH] " patchwork-bot+bluetooth

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).