* [BUG] Sleeping inside a RCU reader
@ 2012-01-03 1:51 Vinicius Costa Gomes
2012-01-03 9:19 ` Daniel Wagner
2012-01-03 12:03 ` Andrei Emeltchenko
0 siblings, 2 replies; 4+ messages in thread
From: Vinicius Costa Gomes @ 2012-01-03 1:51 UTC (permalink / raw)
To: linux-bluetooth
[-- Attachment #1: Type: text/plain, Size: 640 bytes --]
Hi,
I found this bug (trace attached) and I don't think I will be able to
start working on it in the next 11 hours, so this email is to make
the life of anyone who wants to work on it a little easier. (If anyone
wishes to work on this, please respond this email).
I was able to reproduce it consistenly doing a "hciconfig <dev> down"
with an established connection. I was using LE devices, but it seems that
the types of the devices don't matter.
The problem is that cancel_delayed_work_sync() called inside
hci_conn_del() is blocking, because of and called in the context of an RCU
(inside hci_conn_hash_flush()).
Cheers,
--
Vinicius
[-- Attachment #2: locking-bug-rcu.txt --]
[-- Type: text/plain, Size: 7118 bytes --]
[ 220.604374] general protection fault: 0000 [#1] SMP
[ 220.605032] CPU 0
[ 220.605032] Modules linked in: bnep hci_uart rfcomm btusb bluetooth ftdi_sio usbserial [last unloaded: bluetooth]
[ 220.605032]
[ 220.605032] Pid: 2046, comm: hciconfig Not tainted 3.2.0-rc2+ #1 Bochs Bochs
[ 220.605032] RIP: 0010:[<ffffffff81056e0b>] [<ffffffff81056e0b>] __lock_acquire+0x9e/0xd60
[ 220.605032] RSP: 0018:ffff88003c6dfae8 EFLAGS: 00010046
[ 220.605032] RAX: 0000000000000046 RBX: ffff88003e1b0f40 RCX: 0000000000000000
[ 220.605032] RDX: 0000000000000000 RSI: 0000000000000000 RDI: dead4ead00000920
[ 220.605032] RBP: ffff88003c6dfb98 R08: 0000000000000002 R09: 0000000000000001
[ 220.605032] R10: ffff8800bc6dfb9d R11: 0000000000000096 R12: 0000000000000046
[ 220.605032] R13: 0000000000000002 R14: 0000000000000000 R15: dead4ead00000920
[ 220.605032] FS: 00007f7842c6f700(0000) GS:ffff88003fc00000(0000) knlGS:0000000000000000
[ 220.605032] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 220.605032] CR2: 00007f7842c95c11 CR3: 000000003d00f000 CR4: 00000000000006b0
[ 220.605032] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 220.605032] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 220.605032] Process hciconfig (pid: 2046, threadinfo ffff88003c6de000, task ffff88003e1b0f40)
[ 220.605032] Stack:
[ 220.605032] ffff880000000000 ffffffff00000001 0000000000000000 ffffffff81427a97
[ 220.605032] 0000000000037c23 0000000000037c23 ffff88003c6dfb68 ffffffff8103061b
[ 220.605032] 0000000000000049 ffffffff00037c23 00000000ffffffff 0000000000000001
[ 220.605032] Call Trace:
[ 220.605032] [<ffffffff81427a97>] ? _raw_spin_unlock_irqrestore+0x39/0x5c
[ 220.605032] [<ffffffff8103061b>] ? console_unlock+0x1f7/0x206
[ 220.605032] [<ffffffff81057ee3>] lock_acquire+0x8a/0xa7
[ 220.605032] [<ffffffff81039c57>] ? lock_timer_base.isra.17+0x26/0x4b
[ 220.605032] [<ffffffff814274b9>] _raw_spin_lock_irqsave+0x49/0x5c
[ 220.605032] [<ffffffff81039c57>] ? lock_timer_base.isra.17+0x26/0x4b
[ 220.605032] [<ffffffff81039c57>] lock_timer_base.isra.17+0x26/0x4b
[ 220.605032] [<ffffffff81039ddf>] del_timer+0x26/0x71
[ 220.605032] [<ffffffff811cdd8d>] ? kobject_put+0x45/0x49
[ 220.605032] [<ffffffff81042606>] __cancel_work_timer+0x30/0xfa
[ 220.605032] [<ffffffff810426dd>] cancel_delayed_work_sync+0xd/0xf
[ 220.605032] [<ffffffffa0124d76>] hci_conn_del+0x5a/0x19d [bluetooth]
[ 220.605032] [<ffffffffa0124f6a>] hci_conn_hash_flush+0xb1/0xc9 [bluetooth]
[ 220.605032] [<ffffffffa0120e91>] hci_dev_do_close+0x163/0x28b [bluetooth]
[ 220.605032] [<ffffffffa0121b61>] hci_dev_close+0x26/0x47 [bluetooth]
[ 220.605032] [<ffffffffa012e419>] hci_sock_ioctl+0x123/0x348 [bluetooth]
[ 220.605032] [<ffffffff812fc285>] sock_do_ioctl+0x22/0x40
[ 220.605032] [<ffffffff812fc6a0>] sock_ioctl+0x20e/0x21d
[ 220.605032] [<ffffffff810c5916>] do_vfs_ioctl+0x469/0x4aa
[ 220.605032] [<ffffffff81427add>] ? _raw_spin_unlock+0x23/0x36
[ 220.605032] [<ffffffff8105464b>] ? trace_hardirqs_off_caller+0xa3/0x10b
[ 220.605032] [<ffffffff81427e59>] ? retint_swapgs+0x13/0x1b
[ 220.605032] [<ffffffff810c59a8>] sys_ioctl+0x51/0x75
[ 220.605032] [<ffffffff81428652>] system_call_fastpath+0x16/0x1b
[ 220.605032] Code: 8d 58 ff ff ff 74 24 e8 28 15 18 00 85 c0 0f 84 76 0c 00 00 83 3d c5 0b 13 01 00 be cf 0b 00 00 0f 85 64 0c 00 00 e9 e9 0b 00 00
[ 220.605032] 81 3f 60 bf 96 81 b8 01 00 00 00 44 0f 44 e8 83 fe 01 77 0c
[ 220.605032] RIP [<ffffffff81056e0b>] __lock_acquire+0x9e/0xd60
[ 220.605032] RSP <ffff88003c6dfae8>
[ 220.605032] ---[ end trace b1b0a7fd049894c1 ]---
[ 220.605032] note: hciconfig[2046] exited with preempt_count 1
[ 220.605032] BUG: sleeping function called from invalid context at kernel/rwsem.c:21
[ 220.605032] in_atomic(): 1, irqs_disabled(): 1, pid: 2046, name: hciconfig
[ 220.605032] INFO: lockdep is turned off.
[ 220.605032] irq event stamp: 3118
[ 220.605032] hardirqs last enabled at (3117): [<ffffffff81427a9e>] _raw_spin_unlock_irqrestore+0x40/0x5c
[ 220.605032] hardirqs last disabled at (3118): [<ffffffff8142748a>] _raw_spin_lock_irqsave+0x1a/0x5c
[ 220.605032] softirqs last enabled at (2924): [<ffffffff81301746>] release_sock+0x148/0x151
[ 220.605032] softirqs last disabled at (2922): [<ffffffff814275f0>] _raw_spin_lock_bh+0x11/0x4f
[ 220.605032] Pid: 2046, comm: hciconfig Tainted: G D 3.2.0-rc2+ #1
[ 220.605032] Call Trace:
[ 220.605032] [<ffffffff810559e1>] ? print_irqtrace_events+0x9d/0xa1
[ 220.605032] [<ffffffff810277fd>] __might_sleep+0xfe/0x103
[ 220.605032] [<ffffffff81426bfa>] down_read+0x1f/0x56
[ 220.605032] [<ffffffff81049ddd>] ? hrtimer_try_to_cancel+0x7f/0x8d
[ 220.605032] [<ffffffff81062bd6>] acct_collect+0x49/0x185
[ 220.605032] [<ffffffff81032e57>] do_exit+0x1f0/0x6ee
[ 220.605032] [<ffffffff810313c7>] ? kmsg_dump+0x84/0x12d
[ 220.605032] [<ffffffff81030611>] ? console_unlock+0x1ed/0x206
[ 220.605032] [<ffffffff810045ac>] oops_end+0x97/0x9c
[ 220.605032] [<ffffffff810046d8>] die+0x55/0x61
[ 220.605032] [<ffffffff8100255c>] do_general_protection+0x12c/0x134
[ 220.605032] [<ffffffff81428055>] general_protection+0x25/0x30
[ 220.605032] [<ffffffff81056e0b>] ? __lock_acquire+0x9e/0xd60
[ 220.605032] [<ffffffff81056dd5>] ? __lock_acquire+0x68/0xd60
[ 220.605032] [<ffffffff81427a97>] ? _raw_spin_unlock_irqrestore+0x39/0x5c
[ 220.605032] [<ffffffff8103061b>] ? console_unlock+0x1f7/0x206
[ 220.605032] [<ffffffff81057ee3>] lock_acquire+0x8a/0xa7
[ 220.605032] [<ffffffff81039c57>] ? lock_timer_base.isra.17+0x26/0x4b
[ 220.605032] [<ffffffff814274b9>] _raw_spin_lock_irqsave+0x49/0x5c
[ 220.605032] [<ffffffff81039c57>] ? lock_timer_base.isra.17+0x26/0x4b
[ 220.605032] [<ffffffff81039c57>] lock_timer_base.isra.17+0x26/0x4b
[ 220.605032] [<ffffffff81039ddf>] del_timer+0x26/0x71
[ 220.605032] [<ffffffff811cdd8d>] ? kobject_put+0x45/0x49
[ 220.605032] [<ffffffff81042606>] __cancel_work_timer+0x30/0xfa
[ 220.605032] [<ffffffff810426dd>] cancel_delayed_work_sync+0xd/0xf
[ 220.605032] [<ffffffffa0124d76>] hci_conn_del+0x5a/0x19d [bluetooth]
[ 220.605032] [<ffffffffa0124f6a>] hci_conn_hash_flush+0xb1/0xc9 [bluetooth]
[ 220.605032] [<ffffffffa0120e91>] hci_dev_do_close+0x163/0x28b [bluetooth]
[ 220.605032] [<ffffffffa0121b61>] hci_dev_close+0x26/0x47 [bluetooth]
[ 220.605032] [<ffffffffa012e419>] hci_sock_ioctl+0x123/0x348 [bluetooth]
[ 220.605032] [<ffffffff812fc285>] sock_do_ioctl+0x22/0x40
[ 220.605032] [<ffffffff812fc6a0>] sock_ioctl+0x20e/0x21d
[ 220.605032] [<ffffffff810c5916>] do_vfs_ioctl+0x469/0x4aa
[ 220.605032] [<ffffffff81427add>] ? _raw_spin_unlock+0x23/0x36
[ 220.605032] [<ffffffff8105464b>] ? trace_hardirqs_off_caller+0xa3/0x10b
[ 220.605032] [<ffffffff81427e59>] ? retint_swapgs+0x13/0x1b
[ 220.605032] [<ffffffff810c59a8>] sys_ioctl+0x51/0x75
[ 220.605032] [<ffffffff81428652>] system_call_fastpath+0x16/0x1b
[ 220.695701] BUG: scheduling while atomic: hciconfig/2046/0x10000002
[ 220.696929] INFO: lockdep is turned off.
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [BUG] Sleeping inside a RCU reader
2012-01-03 1:51 [BUG] Sleeping inside a RCU reader Vinicius Costa Gomes
@ 2012-01-03 9:19 ` Daniel Wagner
2012-01-03 12:03 ` Andrei Emeltchenko
1 sibling, 0 replies; 4+ messages in thread
From: Daniel Wagner @ 2012-01-03 9:19 UTC (permalink / raw)
To: Vinicius Costa Gomes; +Cc: linux-bluetooth
Hi Vinicius,
not sure if the bt is about the same issue. I was able to trigger this
after a pairing failed (the reason for this is still unknown to me, I am
debugging this. See the urb failures) and then stopping bluetoothd.
BTW, this is bluetooth-next from yesterday.
cheers,
daniel
Jan 3 10:08:24 candlejack kernel: [66006.248288] usb 5-2: BOGUS urb flags, 2 --> 0
Jan 3 10:08:24 candlejack kernel: [66006.248293] Bluetooth: hci0 urb ffff8801fcfe83c0 submission failed (22)
Jan 3 10:08:28 candlejack kernel: [66010.256231] usb 5-2: BOGUS urb flags, 2 --> 0
Jan 3 10:08:28 candlejack kernel: [66010.256237] Bluetooth: hci0 urb ffff8800375693c0 submission failed (22)
Jan 3 10:09:00 candlejack abrt: detected unhandled Python exception in './test-device'
Jan 3 10:09:02 candlejack abrt: detected unhandled Python exception in './test-device'
Jan 3 10:09:04 candlejack bluetoothd[4913]: Terminating
Jan 3 10:09:05 candlejack kernel: [66046.284261] Bluetooth: unknown link type 97
Jan 3 10:09:05 candlejack kernel: [66046.284281] BUG: unable to handle kernel paging request at 0000000000050002
Jan 3 10:09:05 candlejack kernel: [66046.284330] IP: [<ffffffff817ea271>] _raw_spin_lock_irqsave+0x11/0x30
Jan 3 10:09:05 candlejack kernel: [66046.284368] PGD 0
Jan 3 10:09:05 candlejack kernel: [66046.284386] Oops: 0002 [#1] SMP
Jan 3 10:09:05 candlejack kernel: [66046.284414] CPU 1
Jan 3 10:09:05 candlejack kernel: [66046.284427] Modules linked in: rfcomm bnep btusb bluetooth [last unloaded: scsi_wait_scan]
Jan 3 10:09:05 candlejack kernel: [66046.284498]
Jan 3 10:09:05 candlejack kernel: [66046.284510] Pid: 4913, comm: bluetoothd Tainted: G I 3.2.0-rc2+ #5 Hewlett-Packard HP xw4600 Workstation/0AA0h
Jan 3 10:09:05 candlejack kernel: [66046.284567] RIP: 0010:[<ffffffff817ea271>] [<ffffffff817ea271>] _raw_spin_lock_irqsave+0x11/0x30
Jan 3 10:09:05 candlejack kernel: [66046.284607] RSP: 0018:ffff8801a6141c98 EFLAGS: 00010096
Jan 3 10:09:05 candlejack kernel: [66046.284629] RAX: 0000000000000296 RBX: 0000000000050002 RCX: 0000000000000047
Jan 3 10:09:05 candlejack kernel: [66046.284656] RDX: 0000000000000100 RSI: ffff8801a6141ce0 RDI: 0000000000050002
Jan 3 10:09:05 candlejack kernel: [66046.284689] RBP: ffff8801a6141c98 R08: 0000000000000000 R09: 0000000000000000
Jan 3 10:09:05 candlejack kernel: [66046.284716] R10: 0000000000000000 R11: 0000000000000001 R12: ffff880165a9b4f0
Jan 3 10:09:05 candlejack kernel: [66046.284745] R13: ffff8801a6141ce0 R14: 0000000000050003 R15: 000000000000000f
Jan 3 10:09:05 candlejack kernel: [66046.284772] FS: 00007f8ff6ff1740(0000) GS:ffff88023fc80000(0000) knlGS:0000000000000000
Jan 3 10:09:05 candlejack kernel: [66046.284802] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jan 3 10:09:05 candlejack kernel: [66046.284825] CR2: 0000000000050002 CR3: 0000000194066000 CR4: 00000000000406e0
Jan 3 10:09:05 candlejack kernel: [66046.284852] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Jan 3 10:09:05 candlejack kernel: [66046.284880] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Jan 3 10:09:05 candlejack kernel: [66046.284907] Process bluetoothd (pid: 4913, threadinfo ffff8801a6140000, task ffff880199d6cb00)
Jan 3 10:09:05 candlejack kernel: [66046.284938] Stack:
Jan 3 10:09:05 candlejack kernel: [66046.284951] ffff8801a6141cc8 ffffffff81052c89 ffff880165a9b4f0 0000000000000000
Jan 3 10:09:05 candlejack kernel: [66046.285006] ffff8802361e15d0 ffff8800512a6330 ffff8801a6141d18 ffffffff81053f4e
Jan 3 10:09:05 candlejack kernel: [66046.285062] ffffffffa002a7c2 ffff8801a6141cc0 0000000000000000 ffff880165a9b400
Jan 3 10:09:05 candlejack kernel: [66046.285113] Call Trace:
Jan 3 10:09:05 candlejack kernel: [66046.285131] [<ffffffff81052c89>] lock_timer_base+0x39/0x70
Jan 3 10:09:05 candlejack kernel: [66046.285157] [<ffffffff81053f4e>] del_timer+0x3e/0x110
Jan 3 10:09:05 candlejack kernel: [66046.285187] [<ffffffffa00080d4>] hci_conn_del+0x44/0x290 [bluetooth]
Jan 3 10:09:05 candlejack kernel: [66046.285217] [<ffffffffa00083a0>] hci_conn_hash_flush+0x80/0xf0 [bluetooth]
Jan 3 10:09:05 candlejack kernel: [66046.285249] [<ffffffffa0002c1b>] hci_dev_do_close+0xbb/0x320 [bluetooth]
Jan 3 10:09:05 candlejack kernel: [66046.285249] [<ffffffffa00047dd>] hci_dev_close+0x2d/0x70 [bluetooth]
Jan 3 10:09:05 candlejack kernel: [66046.285249] [<ffffffffa001552a>] hci_sock_ioctl+0x1aa/0x3f0 [bluetooth]
Jan 3 10:09:05 candlejack kernel: [66046.285249] [<ffffffff817f1a96>] ? ftrace_call+0x5/0x2b
Jan 3 10:09:05 candlejack kernel: [66046.285249] [<ffffffff8162b3c0>] sock_do_ioctl+0x30/0x70
Jan 3 10:09:05 candlejack kernel: [66046.285249] [<ffffffff8162c449>] sock_ioctl+0x79/0x2f0
Jan 3 10:09:05 candlejack kernel: [66046.285249] [<ffffffff81133a88>] do_vfs_ioctl+0x98/0x550
Jan 3 10:09:05 candlejack kernel: [66046.285249] [<ffffffff810b0586>] ? ftrace_test_stop_func+0x16/0x20
Jan 3 10:09:05 candlejack kernel: [66046.285249] [<ffffffff817f1a96>] ? ftrace_call+0x5/0x2b
Jan 3 10:09:05 candlejack kernel: [66046.285249] [<ffffffff81133fd1>] sys_ioctl+0x91/0xa0
Jan 3 10:09:05 candlejack kernel: [66046.285249] [<ffffffff817f1ceb>] system_call_fastpath+0x16/0x1b
Jan 3 10:09:05 candlejack kernel: [66046.285249] Code: 40 00 8d 8a 00 01 00 00 89 d0 f0 66 0f b1 0f 66 39 d0 0f 94 c0 0f b6 c0 5d c3 55 48 89 e5 e8 e7 77 00 00 9c 58 fa ba 00 01 00 00 <f0> 66 0f c1 17 0f b6 ce 38 d1 74 0c 0f 1f 00 f3 90 0f b6 17 38
Jan 3 10:09:05 candlejack kernel: [66046.285249] RIP [<ffffffff817ea271>] _raw_spin_lock_irqsave+0x11/0x30
Jan 3 10:09:05 candlejack kernel: [66046.285249] RSP <ffff8801a6141c98>
Jan 3 10:09:05 candlejack kernel: [66046.285249] CR2: 0000000000050002
Jan 3 10:09:05 candlejack kernel: [66046.287210] ---[ end trace 4eaa2a86a8e2da24 ]---
Jan 3 10:09:05 candlejack kernel: [66047.254031] Bluetooth: hci0 command tx timeout
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [BUG] Sleeping inside a RCU reader
2012-01-03 1:51 [BUG] Sleeping inside a RCU reader Vinicius Costa Gomes
2012-01-03 9:19 ` Daniel Wagner
@ 2012-01-03 12:03 ` Andrei Emeltchenko
2012-01-11 19:54 ` Ulisses Furquim
1 sibling, 1 reply; 4+ messages in thread
From: Andrei Emeltchenko @ 2012-01-03 12:03 UTC (permalink / raw)
To: Vinicius Costa Gomes; +Cc: linux-bluetooth
Hi Vinicius,
On Mon, Jan 02, 2012 at 10:51:31PM -0300, Vinicius Costa Gomes wrote:
> Hi,
>
> I found this bug (trace attached) and I don't think I will be able to
> start working on it in the next 11 hours, so this email is to make
> the life of anyone who wants to work on it a little easier. (If anyone
> wishes to work on this, please respond this email).
>
> I was able to reproduce it consistenly doing a "hciconfig <dev> down"
> with an established connection. I was using LE devices, but it seems that
> the types of the devices don't matter.
>
> The problem is that cancel_delayed_work_sync() called inside
> hci_conn_del() is blocking, because of and called in the context of an RCU
> (inside hci_conn_hash_flush()).
I think that if the RCU code is not protected by rcu_read_{,un}lock then
it is OK to sleep (and this is updater anyway).
In that function "hci_conn_hash_flush" maybe it is better to use
list_for_each_entry instead of list_for_each_entry_rcu for better
readability?
Best regards
Andrei Emeltchenko
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [BUG] Sleeping inside a RCU reader
2012-01-03 12:03 ` Andrei Emeltchenko
@ 2012-01-11 19:54 ` Ulisses Furquim
0 siblings, 0 replies; 4+ messages in thread
From: Ulisses Furquim @ 2012-01-11 19:54 UTC (permalink / raw)
To: Andrei Emeltchenko, Vinicius Costa Gomes, linux-bluetooth
Hi,
Is this bug still present? I haven't seen any more updates on this thread.
Thanks,
Regards,
-- Ulisses
On Tue, Jan 3, 2012 at 10:03 AM, Andrei Emeltchenko
<andrei.emeltchenko.news@gmail.com> wrote:
> Hi Vinicius,
>
> On Mon, Jan 02, 2012 at 10:51:31PM -0300, Vinicius Costa Gomes wrote:
>> Hi,
>>
>> I found this bug (trace attached) and I don't think I will be able to
>> start working on it in the next 11 hours, so this email is to make
>> the life of anyone who wants to work on it a little easier. (If anyone
>> wishes to work on this, please respond this email).
>>
>> I was able to reproduce it consistenly doing a "hciconfig <dev> down"
>> with an established connection. I was using LE devices, but it seems that
>> the types of the devices don't matter.
>>
>> The problem is that cancel_delayed_work_sync() called inside
>> hci_conn_del() is blocking, because of and called in the context of an RCU
>> (inside hci_conn_hash_flush()).
>
> I think that if the RCU code is not protected by rcu_read_{,un}lock then
> it is OK to sleep (and this is updater anyway).
>
> In that function "hci_conn_hash_flush" maybe it is better to use
> list_for_each_entry instead of list_for_each_entry_rcu for better
> readability?
>
> Best regards
> Andrei Emeltchenko
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-bluetooth" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
--
Ulisses Furquim
ProFUSION embedded systems
http://profusion.mobi
Mobile: +55 19 9250 0942
Skype: ulissesffs
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2012-01-11 19:54 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-01-03 1:51 [BUG] Sleeping inside a RCU reader Vinicius Costa Gomes
2012-01-03 9:19 ` Daniel Wagner
2012-01-03 12:03 ` Andrei Emeltchenko
2012-01-11 19:54 ` Ulisses Furquim
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).