* [bvanassche:block-for-next] [sbitmap] e992c326a3: BUG:workqueue_lockup-pool
From: kernel test robot @ 2024-07-24 8:15 UTC (permalink / raw)
To: Yang Yang
Cc: oe-lkp, lkp, Bart Van Assche, linux-kernel, linux-block,
oliver.sang
Hello,
kernel test robot noticed "BUG:workqueue_lockup-pool" on:
commit: e992c326a36a35afe13a4c16094e2a76a90ed5eb ("sbitmap: fix io hung due to race on sbitmap_word::cleared")
https://github.com/bvanassche/linux block-for-next
in testcase: boot
compiler: clang-18
test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 16G
(please refer to attached dmesg/kmsg for entire log/backtrace)
+---------------------------------------------+------------+------------+
| | b0c61a9e6a | e992c326a3 |
+---------------------------------------------+------------+------------+
| BUG:workqueue_lockup-pool | 0 | 10 |
+---------------------------------------------+------------+------------+
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <oliver.sang@intel.com>
| Closes: https://lore.kernel.org/oe-lkp/202407241556.b0171c94-lkp@intel.com
[ 64.765231][ C0] BUG: workqueue lockup - pool cpus=1 node=0 flags=0x0 nice=0 stuck for 43s!
[ 64.766333][ C0] BUG: workqueue lockup - pool cpus=1 node=0 flags=0x0 nice=-20 stuck for 43s!
[ 64.767306][ C0] Showing busy workqueues and worker pools:
[ 64.767861][ C0] workqueue events: flags=0x0
[ 64.768319][ C0] pwq 6: cpus=1 node=0 flags=0x0 nice=0 active=2 refcnt=3
[ 64.768335][ C0] pending: e1000_watchdog, kfree_rcu_monitor
[ 64.768392][ C0] workqueue events_power_efficient: flags=0x80
[ 64.770225][ C0] pwq 2: cpus=0 node=0 flags=0x0 nice=0 active=1 refcnt=2
[ 64.770228][ C0] pending: do_cache_clean
[ 64.770249][ C0] workqueue events_freezable_pwr_efficient: flags=0x84
[ 64.771967][ C0] pwq 6: cpus=1 node=0 flags=0x0 nice=0 active=1 refcnt=2
[ 64.771976][ C0] in-flight: 26:disk_events_workfn
[ 64.772005][ C0] workqueue mm_percpu_wq: flags=0x8
[ 64.773657][ C0] pwq 6: cpus=1 node=0 flags=0x0 nice=0 active=1 refcnt=2
[ 64.773660][ C0] pending: vmstat_update
[ 64.773697][ C0] workqueue kblockd: flags=0x18
[ 64.775275][ C0] pwq 7: cpus=1 node=0 flags=0x0 nice=-20 active=2 refcnt=3
[ 64.775278][ C0] in-flight: 27:blk_mq_timeout_work
[ 64.775293][ C0] pending: blk_mq_timeout_work
[ 64.775376][ C0] pool 6: cpus=1 node=0 flags=0x0 nice=0 hung=43s workers=3 idle: 40 1001
[ 64.775391][ C0] pool 7: cpus=1 node=0 flags=0x0 nice=-20 hung=43s workers=2 idle: 859
[ 64.775400][ C0] Showing backtraces of running workers in stalled CPU-bound worker pools:
[ 64.779459][ C0] pool 7:
[ 64.779465][ C0] task:kworker/1:0H state:R running task stack:0 pid:27 tgid:27 ppid:2 flags:0x00004000
[ 64.779480][ C0] Workqueue: kblockd blk_mq_timeout_work
[ 64.779493][ C0] Call Trace:
[ 64.779504][ C0] <TASK>
[ 64.779541][ C0] __schedule (kernel/sched/core.c:5411)
[ 64.779563][ C0] ? __pfx_schedule_timeout (kernel/time/timer.c:2543)
[ 64.779571][ C0] schedule (arch/x86/include/asm/preempt.h:84 kernel/sched/core.c:6823 kernel/sched/core.c:6837)
[ 64.779573][ C0] schedule_timeout (kernel/time/timer.c:?)
[ 64.779580][ C0] ? get_page_from_freelist (mm/page_alloc.c:3431)
[ 64.779588][ C0] __wait_for_common (kernel/sched/completion.c:95 kernel/sched/completion.c:116)
[ 64.779591][ C0] ? __pfx_schedule_timeout (kernel/time/timer.c:2543)
[ 64.779593][ C0] wait_for_completion_state (kernel/sched/completion.c:266)
[ 64.779595][ C0] __wait_rcu_gp (kernel/rcu/update.c:435)
[ 64.779607][ C0] synchronize_rcu_normal (kernel/rcu/tree.c:3935)
[ 64.779614][ C0] ? __pfx_call_rcu_hurry (include/linux/rcupdate.h:113)
[ 64.779617][ C0] ? rcu_blocking_is_gp (include/linux/kernel.h:? kernel/rcu/tree.c:3894)
[ 64.779618][ C0] ? synchronize_rcu (kernel/rcu/tree.c:3985)
[ 64.779620][ C0] blk_mq_timeout_work (block/blk-mq.c:?)
[ 64.779629][ C0] process_scheduled_works (kernel/workqueue.c:3253)
[ 64.779647][ C0] worker_thread (include/linux/list.h:373 kernel/workqueue.c:947 kernel/workqueue.c:3410)
[ 64.779652][ C0] ? __pfx_worker_thread (kernel/workqueue.c:3356)
[ 64.779655][ C0] kthread (kernel/kthread.c:391)
[ 64.779668][ C0] ? __pfx_kthread (kernel/kthread.c:342)
[ 64.779671][ C0] ret_from_fork (arch/x86/kernel/process.c:153)
[ 64.779688][ C0] ? __pfx_kthread (kernel/kthread.c:342)
[ 64.779691][ C0] ret_from_fork_asm (arch/x86/entry/entry_64.S:257)
[ 64.779704][ C0] </TASK>
[ 95.485253][ C0] BUG: workqueue lockup - pool cpus=1 node=0 flags=0x0 nice=0 stuck for 74s!
[ 95.486737][ C0] BUG: workqueue lockup - pool cpus=1 node=0 flags=0x0 nice=-20 stuck for 73s!
[ 95.487606][ C0] Showing busy workqueues and worker pools:
[ 95.488179][ C0] workqueue events: flags=0x0
[ 95.488650][ C0] pwq 6: cpus=1 node=0 flags=0x0 nice=0 active=2 refcnt=3
[ 95.488679][ C0] pending: e1000_watchdog, kfree_rcu_monitor
[ 95.488820][ C0] workqueue events_power_efficient: flags=0x80
[ 95.490632][ C0] pwq 2: cpus=0 node=0 flags=0x0 nice=0 active=1 refcnt=2
[ 95.490635][ C0] pending: do_cache_clean
[ 95.490669][ C0] workqueue events_freezable_pwr_efficient: flags=0x84
[ 95.492426][ C0] pwq 6: cpus=1 node=0 flags=0x0 nice=0 active=1 refcnt=2
[ 95.492429][ C0] in-flight: 26:disk_events_workfn
[ 95.492527][ C0] workqueue mm_percpu_wq: flags=0x8
[ 95.494193][ C0] pwq 6: cpus=1 node=0 flags=0x0 nice=0 active=1 refcnt=2
[ 95.494196][ C0] pending: vmstat_update
[ 95.494265][ C0] workqueue kblockd: flags=0x18
[ 95.495840][ C0] pwq 7: cpus=1 node=0 flags=0x0 nice=-20 active=2 refcnt=3
[ 95.495843][ C0] in-flight: 27:blk_mq_timeout_work
[ 95.495858][ C0] pending: blk_mq_timeout_work
[ 95.495950][ C0] pool 6: cpus=1 node=0 flags=0x0 nice=0 hung=74s workers=3 idle: 40 1001
[ 95.495977][ C0] pool 7: cpus=1 node=0 flags=0x0 nice=-20 hung=73s workers=2 idle: 859
[ 95.495983][ C0] Showing backtraces of running workers in stalled CPU-bound worker pools:
[ 95.500089][ C0] pool 7:
[ 95.500106][ C0] task:kworker/1:0H state:R running task stack:0 pid:27 tgid:27 ppid:2 flags:0x00004000
[ 95.500132][ C0] Workqueue: kblockd blk_mq_timeout_work
[ 95.500169][ C0] Call Trace:
[ 95.500195][ C0] <TASK>
[ 95.500259][ C0] __schedule (kernel/sched/core.c:5411)
[ 95.500304][ C0] ? __pfx_schedule_timeout (kernel/time/timer.c:2543)
[ 95.500320][ C0] schedule (arch/x86/include/asm/preempt.h:84 kernel/sched/core.c:6823 kernel/sched/core.c:6837)
[ 95.500322][ C0] schedule_timeout (kernel/time/timer.c:?)
[ 95.500341][ C0] ? get_page_from_freelist (mm/page_alloc.c:3431)
[ 95.500363][ C0] __wait_for_common (kernel/sched/completion.c:95 kernel/sched/completion.c:116)
[ 95.500365][ C0] ? __pfx_schedule_timeout (kernel/time/timer.c:2543)
[ 95.500367][ C0] wait_for_completion_state (kernel/sched/completion.c:266)
[ 95.500369][ C0] __wait_rcu_gp (kernel/rcu/update.c:435)
[ 95.500399][ C0] synchronize_rcu_normal (kernel/rcu/tree.c:3935)
[ 95.500420][ C0] ? __pfx_call_rcu_hurry (include/linux/rcupdate.h:113)
[ 95.500432][ C0] ? rcu_blocking_is_gp (include/linux/kernel.h:? kernel/rcu/tree.c:3894)
[ 95.500434][ C0] ? synchronize_rcu (kernel/rcu/tree.c:3985)
[ 95.500435][ C0] blk_mq_timeout_work (block/blk-mq.c:?)
[ 95.500464][ C0] process_scheduled_works (kernel/workqueue.c:3253)
[ 95.500516][ C0] worker_thread (include/linux/list.h:373 kernel/workqueue.c:947 kernel/workqueue.c:3410)
[ 95.500527][ C0] ? __pfx_worker_thread (kernel/workqueue.c:3356)
[ 95.500530][ C0] kthread (kernel/kthread.c:391)
[ 95.500585][ C0] ? __pfx_kthread (kernel/kthread.c:342)
[ 95.500589][ C0] ret_from_fork (arch/x86/kernel/process.c:153)
[ 95.500636][ C0] ? __pfx_kthread (kernel/kthread.c:342)
[ 95.500640][ C0] ret_from_fork_asm (arch/x86/entry/entry_64.S:257)
[ 95.500679][ C0] </TASK>
[ 120.705227][ C1] rcu: INFO: rcu_preempt self-detected stall on CPU
[ 120.706866][ C1] rcu: 1-....: (25000 ticks this GP) idle=71dc/1/0x4000000000000000 softirq=2935/2935 fqs=12477
[ 120.712272][ C1] rcu: (t=25002 jiffies g=2261 q=805 ncpus=2)
[ 120.713520][ C1] CPU: 1 PID: 1601 Comm: (udev-worker) Not tainted 6.10.0-rc6-00303-ge992c326a36a #1
[ 120.715344][ C1] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
[ 120.717321][ C1] RIP: 0010:_raw_spin_unlock_irqrestore (include/linux/spinlock_api_smp.h:152)
[ 120.718629][ C1] Code: 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 0f 1f 44 00 00 c6 07 00 0f ba e6 09 73 01 fb 65 ff 0d ce bc 10 7e <74> 06 c3 cc cc cc cc cc 0f 1f 44 00 00 c3 cc cc cc cc cc 0f 1f 00
All code
========
0: 90 nop
1: 90 nop
2: 90 nop
3: 90 nop
4: 90 nop
5: 90 nop
6: 90 nop
7: 90 nop
8: 90 nop
9: 90 nop
a: 90 nop
b: 90 nop
c: 90 nop
d: 90 nop
e: 90 nop
f: 90 nop
10: f3 0f 1e fa endbr64
14: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1)
19: c6 07 00 movb $0x0,(%rdi)
1c: 0f ba e6 09 bt $0x9,%esi
20: 73 01 jae 0x23
22: fb sti
23: 65 ff 0d ce bc 10 7e decl %gs:0x7e10bcce(%rip) # 0x7e10bcf8
2a:* 74 06 je 0x32 <-- trapping instruction
2c: c3 ret
2d: cc int3
2e: cc int3
2f: cc int3
30: cc int3
31: cc int3
32: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1)
37: c3 ret
38: cc int3
39: cc int3
3a: cc int3
3b: cc int3
3c: cc int3
3d: 0f 1f 00 nopl (%rax)
Code starting with the faulting instruction
===========================================
0: 74 06 je 0x8
2: c3 ret
3: cc int3
4: cc int3
5: cc int3
6: cc int3
7: cc int3
8: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1)
d: c3 ret
e: cc int3
f: cc int3
10: cc int3
11: cc int3
12: cc int3
13: 0f 1f 00 nopl (%rax)
[ 120.722091][ C1] RSP: 0018:ffffc9000027fa60 EFLAGS: 00000247
[ 120.723180][ C1] RAX: 0000000000000286 RBX: ffff8881335dc4c8 RCX: 0000000000000000
[ 120.724696][ C1] RDX: 0000000000000000 RSI: 0000000000000286 RDI: ffff8881335dc4c8
[ 120.726162][ C1] RBP: ffff8881335dc480 R08: 0000000000000001 R09: ffffffffffffffff
[ 120.727782][ C1] R10: 0000000000000000 R11: ffffffff817cf120 R12: 0000000000000000
[ 120.729356][ C1] R13: 0000000000000001 R14: 0000000000000000 R15: fffffffffffffffe
[ 120.730936][ C1] FS: 00007f213215b8c0(0000) GS:ffff88842fd00000(0000) knlGS:0000000000000000
[ 120.732680][ C1] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 120.733960][ C1] CR2: 000055c76ff83708 CR3: 00000001482fe000 CR4: 00000000000406f0
[ 120.735559][ C1] Call Trace:
[ 120.736328][ C1] <IRQ>
[ 120.737025][ C1] ? rcu_dump_cpu_stacks (include/linux/cpumask.h:231 kernel/rcu/tree_stall.h:374)
[ 120.738036][ C1] ? print_cpu_stall (kernel/rcu/tree_stall.h:702)
[ 120.739012][ C1] ? rcu_sched_clock_irq (kernel/rcu/tree_stall.h:?)
[ 120.740040][ C1] ? update_process_times (arch/x86/include/asm/preempt.h:26 kernel/time/timer.c:2487)
[ 120.741048][ C1] ? tick_nohz_handler (kernel/time/tick-sched.c:187 kernel/time/tick-sched.c:306)
[ 120.742044][ C1] ? __pfx_tick_nohz_handler (kernel/time/tick-sched.c:285)
[ 120.743092][ C1] ? __hrtimer_run_queues (kernel/time/hrtimer.c:1689)
[ 120.744101][ C1] ? hrtimer_interrupt (kernel/time/hrtimer.c:1818)
[ 120.745084][ C1
The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20240724/202407241556.b0171c94-lkp@intel.com
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
* Re: [bvanassche:block-for-next] [sbitmap] e992c326a3: BUG:workqueue_lockup-pool
From: YangYang @ 2024-07-24 9:29 UTC (permalink / raw)
To: kernel test robot; +Cc: oe-lkp, lkp, Bart Van Assche, linux-kernel, linux-block
On 2024/7/24 16:15, kernel test robot wrote:
>
>
> Hello,
>
> kernel test robot noticed "BUG:workqueue_lockup-pool" on:
>
> commit: e992c326a36a35afe13a4c16094e2a76a90ed5eb ("sbitmap: fix io hung due to race on sbitmap_word::cleared")
> https://github.com/bvanassche/linux block-for-next
The patch in the above branch is different from:
https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git/commit/?h=for-next&id=72d04bdcf3f7d7e07d82f9757946f68802a7270a
return (READ_ONCE(map->word) & word_mask) == word_mask;
should be
return (READ_ONCE(map->word) & word_mask) != word_mask;
Thanks.
* Re: [bvanassche:block-for-next] [sbitmap] e992c326a3: BUG:workqueue_lockup-pool
From: Bart Van Assche @ 2024-07-24 14:50 UTC (permalink / raw)
To: YangYang, kernel test robot; +Cc: oe-lkp, lkp, linux-kernel, linux-block
On 7/24/24 2:29 AM, YangYang wrote:
> The patch in above branch is different from:
> https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git/commit/?h=for-next&id=72d04bdcf3f7d7e07d82f9757946f68802a7270a
>
> return (READ_ONCE(map->word) & word_mask) == word_mask;
> should be
> return (READ_ONCE(map->word) & word_mask) != word_mask;
Hi Yang,
Thanks for having taken a look. That branch should not have been
published so I deleted it.
Thanks,
Bart.