netdev.vger.kernel.org archive mirror
* [PATCH 0/2] bluetooth: fix NULL-pointer dereferences
@ 2012-03-07 16:01 Johan Hovold
  2012-03-07 16:01 ` [PATCH 1/2] bluetooth: hci_ldisc: fix NULL-pointer dereference on tty_close Johan Hovold
  2012-03-07 16:02 ` [PATCH 2/2] bluetooth: hci_core: fix NULL-pointer dereference at unregister Johan Hovold
  0 siblings, 2 replies; 21+ messages in thread
From: Johan Hovold @ 2012-03-07 16:01 UTC (permalink / raw)
  To: Marcel Holtmann, Gustavo F. Padovan
  Cc: David S. Miller, linux-bluetooth, linux-kernel, netdev,
	Johan Hovold

Hi, 

These patches fix two races in hci_ldisc and hci_core which can lead to
NULL-pointer dereferences.

The first one is 100% reproducible on 3.2 as well as 3.3-rc6 and needs to be
backported to all stable kernels as the offending code has been around for
quite some time.

The second one is 100% reproducible on 3.3-rc6. I haven't seen it on 3.2 or
earlier, but as far as I can see it could possibly be triggered on 3.0 and
later as well.


Thanks,
Johan

Johan Hovold (2):
  bluetooth: hci_ldisc: fix NULL-pointer dereference on tty_close
  bluetooth: hci_core: fix NULL-pointer dereference at unregister

 drivers/bluetooth/hci_ldisc.c |    2 +-
 include/net/bluetooth/hci.h   |    1 +
 net/bluetooth/hci_core.c      |    7 +++++++
 3 files changed, 9 insertions(+), 1 deletions(-)

-- 
1.7.8.4


* [PATCH 1/2] bluetooth: hci_ldisc: fix NULL-pointer dereference on tty_close
  2012-03-07 16:01 [PATCH 0/2] bluetooth: fix NULL-pointer dereferences Johan Hovold
@ 2012-03-07 16:01 ` Johan Hovold
  2012-03-07 19:33   ` Marcel Holtmann
  2012-03-09 13:44   ` David Herrmann
  2012-03-07 16:02 ` [PATCH 2/2] bluetooth: hci_core: fix NULL-pointer dereference at unregister Johan Hovold
  1 sibling, 2 replies; 21+ messages in thread
From: Johan Hovold @ 2012-03-07 16:01 UTC (permalink / raw)
  To: Marcel Holtmann, Gustavo F. Padovan
  Cc: David S. Miller, linux-bluetooth, linux-kernel, netdev,
	Johan Hovold, stable

Do not close protocol driver until device has been unregistered.

This fixes a race between tty_close and hci_dev_open which can result in
a NULL-pointer dereference.

The line discipline closes the protocol driver while we may still have
hci_dev_open sleeping on the req_lock mutex resulting in a NULL-pointer
dereference when lock is acquired and hci_init_req called.

Bug is 100% reproducible using hciattach and a disconnected serial port:

0. # hciattach -n ttyO1 any noflow

1. hci_dev_open called from hci_power_on grabs req lock
2. hci_init_req executes but device fails to initialise (times out
   eventually)
3. hci_dev_open is called from hci_sock_ioctl and sleeps on req lock
4. hci_uart_tty_close detaches protocol driver and cancels init req
5. hci_dev_open (1) releases req lock
6. hci_dev_open (3) grabs req lock, calls hci_init_req, which triggers oops
   when request is prepared in hci_uart_send_frame

[  137.201263] Unable to handle kernel NULL pointer dereference at virtual address 00000028
[  137.209838] pgd = c0004000
[  137.212677] [00000028] *pgd=00000000
[  137.216430] Internal error: Oops: 17 [#1]
[  137.220642] Modules linked in:
[  137.223846] CPU: 0    Tainted: G        W     (3.3.0-rc6-dirty #406)
[  137.230529] PC is at __lock_acquire+0x5c/0x1ab0
[  137.235290] LR is at lock_acquire+0x9c/0x128
[  137.239776] pc : [<c0071490>]    lr : [<c00733f8>]    psr: 20000093
[  137.239776] sp : cf869dd8  ip : c0529554  fp : c051c730
[  137.251800] r10: 00000000  r9 : cf8673c0  r8 : 00000080
[  137.257293] r7 : 00000028  r6 : 00000002  r5 : 00000000  r4 : c053fd70
[  137.264129] r3 : 00000000  r2 : 00000000  r1 : 00000000  r0 : 00000001
[  137.270965] Flags: nzCv  IRQs off  FIQs on  Mode SVC_32  ISA ARM  Segment kernel
[  137.278717] Control: 10c5387d  Table: 8f0f4019  DAC: 00000015
[  137.284729] Process kworker/u:1 (pid: 7, stack limit = 0xcf8682e8)
[  137.291229] Stack: (0xcf869dd8 to 0xcf86a000)
[  137.295776] 9dc0:                                                       c0529554 00000000
[  137.304351] 9de0: cf8673c0 cf868000 d03ea1ef cf868000 000001ef 00000470 00000000 00000002
[  137.312927] 9e00: cf8673c0 00000001 c051c730 c00716ec 0000000c 00000440 c0529554 00000001
[  137.321533] 9e20: c051c730 cf868000 d03ea1f3 00000000 c053b978 00000000 00000028 cf868000
[  137.330078] 9e40: 00000000 00000000 00000002 00000000 00000000 c00733f8 00000002 00000080
[  137.338684] 9e60: 00000000 c02a1d50 00000000 00000001 60000013 c0969a1c 60000093 c053b96c
[  137.347259] 9e80: 00000002 00000018 20000013 c02a1d50 cf0ac000 00000000 00000002 cf868000
[  137.355834] 9ea0: 00000089 c0374130 00000002 00000000 c02a1d50 cf0ac000 0000000c cf0fc540
[  137.364410] 9ec0: 00000018 c02a1d50 cf0fc540 00000000 cf0fc540 c0282238 c028220c cf178d80
[  137.372985] 9ee0: 127525d8 c02821cc 9a1fa451 c032727c 9a1fa451 127525d8 cf0fc540 cf0ac4ec
[  137.381561] 9f00: cf0ac000 cf0fc540 cf0ac584 c03285f4 c0328580 cf0ac4ec cf85c740 c05510cc
[  137.390136] 9f20: ce825400 c004c914 00000002 00000000 c004c884 ce8254f5 cf869f48 00000000
[  137.398712] 9f40: c0328580 ce825415 c0a7f914 c061af64 00000000 c048cf3c cf8673c0 cf85c740
[  137.407287] 9f60: c05510cc c051a66c c05510ec c05510c4 cf85c750 cf868000 00000089 c004d6ac
[  137.415863] 9f80: 00000000 c0073d14 00000001 cf853ed8 cf85c740 c004d558 00000013 00000000
[  137.424438] 9fa0: 00000000 00000000 00000000 c00516b0 00000000 00000000 cf85c740 00000000
[  137.433013] 9fc0: 00000001 dead4ead ffffffff ffffffff c0551674 00000000 00000000 c0450aa4
[  137.441589] 9fe0: cf869fe0 cf869fe0 cf853ed8 c005162c c0013b30 c0013b30 00ffff00 00ffff00
[  137.450164] [<c0071490>] (__lock_acquire+0x5c/0x1ab0) from [<c00733f8>] (lock_acquire+0x9c/0x128)
[  137.459503] [<c00733f8>] (lock_acquire+0x9c/0x128) from [<c0374130>] (_raw_spin_lock_irqsave+0x44/0x58)
[  137.469360] [<c0374130>] (_raw_spin_lock_irqsave+0x44/0x58) from [<c02a1d50>] (skb_queue_tail+0x18/0x48)
[  137.479339] [<c02a1d50>] (skb_queue_tail+0x18/0x48) from [<c0282238>] (h4_enqueue+0x2c/0x34)
[  137.488189] [<c0282238>] (h4_enqueue+0x2c/0x34) from [<c02821cc>] (hci_uart_send_frame+0x34/0x68)
[  137.497497] [<c02821cc>] (hci_uart_send_frame+0x34/0x68) from [<c032727c>] (hci_send_frame+0x50/0x88)
[  137.507171] [<c032727c>] (hci_send_frame+0x50/0x88) from [<c03285f4>] (hci_cmd_work+0x74/0xd4)
[  137.516204] [<c03285f4>] (hci_cmd_work+0x74/0xd4) from [<c004c914>] (process_one_work+0x1a0/0x4ec)
[  137.525604] [<c004c914>] (process_one_work+0x1a0/0x4ec) from [<c004d6ac>] (worker_thread+0x154/0x344)
[  137.535278] [<c004d6ac>] (worker_thread+0x154/0x344) from [<c00516b0>] (kthread+0x84/0x90)
[  137.543975] [<c00516b0>] (kthread+0x84/0x90) from [<c0013b30>] (kernel_thread_exit+0x0/0x8)
[  137.552734] Code: e59f4e5c e5941000 e3510000 0a000031 (e5971000)
[  137.559234] ---[ end trace 1b75b31a2719ed1e ]---

Cc: stable <stable@vger.kernel.org>
Signed-off-by: Johan Hovold <jhovold@gmail.com>
---
 drivers/bluetooth/hci_ldisc.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/bluetooth/hci_ldisc.c b/drivers/bluetooth/hci_ldisc.c
index 0711448..6946081 100644
--- a/drivers/bluetooth/hci_ldisc.c
+++ b/drivers/bluetooth/hci_ldisc.c
@@ -310,11 +310,11 @@ static void hci_uart_tty_close(struct tty_struct *tty)
 			hci_uart_close(hdev);
 
 		if (test_and_clear_bit(HCI_UART_PROTO_SET, &hu->flags)) {
-			hu->proto->close(hu);
 			if (hdev) {
 				hci_unregister_dev(hdev);
 				hci_free_dev(hdev);
 			}
+			hu->proto->close(hu);
 		}
 	}
 }
-- 
1.7.8.4
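
The ordering problem described in the commit message above can be sketched in plain userspace C. This is only an illustration under stated assumptions: the struct and function names below are hypothetical stand-ins, not kernel APIs, and the "oops" is modelled as a return code instead of a real NULL-pointer dereference.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel objects; none of these are kernel APIs. */
struct proto_state {
	int queued;                   /* plays the role of the per-protocol tx queue */
};

struct uart {
	struct proto_state *priv;     /* NULL once the protocol driver is closed */
	bool registered;              /* true while the core may still send frames */
};

/* Send path: only safe if the protocol state still exists. */
static int send_frame(struct uart *hu)
{
	if (!hu->registered)
		return -1;            /* core no longer calls into the driver */
	if (hu->priv == NULL)
		return -2;            /* the real code would oops here */
	hu->priv->queued++;
	return 0;
}

static void proto_close(struct uart *hu)    { hu->priv = NULL; }
static void unregister_dev(struct uart *hu) { hu->registered = false; }

/*
 * Run the two teardown steps in the given order and report what a send
 * attempted between them would return.
 */
static int send_during_teardown(bool close_first)
{
	struct proto_state st = { 0 };
	struct uart hu = { .priv = &st, .registered = true };
	int ret;

	if (close_first) {            /* order before the patch */
		proto_close(&hu);
		ret = send_frame(&hu);
		unregister_dev(&hu);
	} else {                      /* order after the patch */
		unregister_dev(&hu);
		ret = send_frame(&hu);
		proto_close(&hu);
	}
	return ret;
}
```

In this model, closing the protocol first leaves a window where the device is still registered but its protocol state is gone (return value -2), while unregistering first closes that window (return value -1).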


* [PATCH 2/2] bluetooth: hci_core: fix NULL-pointer dereference at unregister
  2012-03-07 16:01 [PATCH 0/2] bluetooth: fix NULL-pointer dereferences Johan Hovold
  2012-03-07 16:01 ` [PATCH 1/2] bluetooth: hci_ldisc: fix NULL-pointer dereference on tty_close Johan Hovold
@ 2012-03-07 16:02 ` Johan Hovold
       [not found]   ` <1331136120-27075-3-git-send-email-jhovold-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
  1 sibling, 1 reply; 21+ messages in thread
From: Johan Hovold @ 2012-03-07 16:02 UTC (permalink / raw)
  To: Marcel Holtmann, Gustavo F. Padovan
  Cc: David S. Miller, linux-bluetooth, linux-kernel, netdev,
	Johan Hovold, stable

Make sure hci_dev_open returns immediately if hci_unregister_dev has
been called.

This fixes a race between hci_dev_open and hci_unregister_dev which can
lead to a NULL-pointer dereference.

Bug is 100% reproducible using hciattach and a disconnected serial port:

0. # hciattach -n /dev/ttyO1 any noflow

1. hci_dev_open called from hci_power_on grabs req lock
2. hci_init_req executes but device fails to initialise (times out
   eventually)
3. hci_dev_open is called from hci_sock_ioctl and sleeps on req lock
4. hci_uart_tty_close calls hci_unregister_dev and sleeps on req lock in
   hci_dev_do_close
5. hci_dev_open (1) releases req lock
6. hci_dev_do_close grabs req lock and returns as device is not up
7. hci_unregister_dev sleeps in destroy_workqueue
8. hci_dev_open (3) grabs req lock, calls hci_init_req and eventually sleeps
9. hci_unregister_dev finishes, while hci_dev_open is still running...

[   79.627136] INFO: trying to register non-static key.
[   79.632354] the code is fine but needs lockdep annotation.
[   79.638122] turning off the locking correctness validator.
[   79.643920] [<c00188bc>] (unwind_backtrace+0x0/0xf8) from [<c00729c4>] (__lock_acquire+0x1590/0x1ab0)
[   79.653594] [<c00729c4>] (__lock_acquire+0x1590/0x1ab0) from [<c00733f8>] (lock_acquire+0x9c/0x128)
[   79.663085] [<c00733f8>] (lock_acquire+0x9c/0x128) from [<c0040a88>] (run_timer_softirq+0x150/0x3ac)
[   79.672668] [<c0040a88>] (run_timer_softirq+0x150/0x3ac) from [<c003a3b8>] (__do_softirq+0xd4/0x22c)
[   79.682281] [<c003a3b8>] (__do_softirq+0xd4/0x22c) from [<c003a924>] (irq_exit+0x8c/0x94)
[   79.690856] [<c003a924>] (irq_exit+0x8c/0x94) from [<c0013a50>] (handle_IRQ+0x34/0x84)
[   79.699157] [<c0013a50>] (handle_IRQ+0x34/0x84) from [<c0008530>] (omap3_intc_handle_irq+0x48/0x4c)
[   79.708648] [<c0008530>] (omap3_intc_handle_irq+0x48/0x4c) from [<c037499c>] (__irq_usr+0x3c/0x60)
[   79.718048] Exception stack(0xcf281fb0 to 0xcf281ff8)
[   79.723358] 1fa0:                                     0001e6a0 be8dab00 0001e698 00036698
[   79.731933] 1fc0: 0002df98 0002df38 0000001f 00000000 b6f234d0 00000000 00000004 00000000
[   79.740509] 1fe0: 0001e6f8 be8d6aa0 be8dac50 0000aab8 80000010 ffffffff
[   79.747497] Unable to handle kernel NULL pointer dereference at virtual address 00000000
[   79.756011] pgd = cf3b4000
[   79.758850] [00000000] *pgd=8f0c7831, *pte=00000000, *ppte=00000000
[   79.765502] Internal error: Oops: 80000007 [#1]
[   79.770294] Modules linked in:
[   79.773529] CPU: 0    Tainted: G        W     (3.3.0-rc6-00002-gb5d5c87 #421)
[   79.781066] PC is at 0x0
[   79.783721] LR is at run_timer_softirq+0x16c/0x3ac
[   79.788787] pc : [<00000000>]    lr : [<c0040aa4>]    psr: 60000113
[   79.788787] sp : cf281ee0  ip : 00000000  fp : cf280000
[   79.800903] r10: 00000004  r9 : 00000100  r8 : b6f234d0
[   79.806427] r7 : c0519c28  r6 : cf093488  r5 : c0561a00  r4 : 00000000
[   79.813323] r3 : 00000000  r2 : c054eee0  r1 : 00000001  r0 : 00000000
[   79.820190] Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
[   79.827728] Control: 10c5387d  Table: 8f3b4019  DAC: 00000015
[   79.833801] Process gpsd (pid: 1265, stack limit = 0xcf2802e8)
[   79.839965] Stack: (0xcf281ee0 to 0xcf282000)
[   79.844573] 1ee0: 00000002 00000000 c0040a24 00000000 00000002 cf281f08 00200200 00000000
[   79.853210] 1f00: 00000000 cf281f18 cf281f08 00000000 00000000 00000000 cf281f18 cf281f18
[   79.861816] 1f20: 00000000 00000001 c056184c 00000000 00000001 b6f234d0 c0561848 00000004
[   79.870452] 1f40: cf280000 c003a3b8 c051e79c 00000001 00000000 00000100 3fa9e7b8 0000000a
[   79.879089] 1f60: 00000025 cf280000 00000025 00000000 00000000 b6f234d0 00000000 00000004
[   79.887756] 1f80: 00000000 c003a924 c053ad38 c0013a50 fa200000 cf281fb0 ffffffff c0008530
[   79.896362] 1fa0: 0001e6a0 0000aab8 80000010 c037499c 0001e6a0 be8dab00 0001e698 00036698
[   79.904998] 1fc0: 0002df98 0002df38 0000001f 00000000 b6f234d0 00000000 00000004 00000000
[   79.913665] 1fe0: 0001e6f8 be8d6aa0 be8dac50 0000aab8 80000010 ffffffff 00fbf700 04ffff00
[   79.922302] [<c0040aa4>] (run_timer_softirq+0x16c/0x3ac) from [<c003a3b8>] (__do_softirq+0xd4/0x22c)
[   79.931945] [<c003a3b8>] (__do_softirq+0xd4/0x22c) from [<c003a924>] (irq_exit+0x8c/0x94)
[   79.940582] [<c003a924>] (irq_exit+0x8c/0x94) from [<c0013a50>] (handle_IRQ+0x34/0x84)
[   79.948913] [<c0013a50>] (handle_IRQ+0x34/0x84) from [<c0008530>] (omap3_intc_handle_irq+0x48/0x4c)
[   79.958404] [<c0008530>] (omap3_intc_handle_irq+0x48/0x4c) from [<c037499c>] (__irq_usr+0x3c/0x60)
[   79.967773] Exception stack(0xcf281fb0 to 0xcf281ff8)
[   79.973083] 1fa0:                                     0001e6a0 be8dab00 0001e698 00036698
[   79.981658] 1fc0: 0002df98 0002df38 0000001f 00000000 b6f234d0 00000000 00000004 00000000
[   79.990234] 1fe0: 0001e6f8 be8d6aa0 be8dac50 0000aab8 80000010 ffffffff
[   79.997161] Code: bad PC value
[   80.000396] ---[ end trace 6f6739840475f9ee ]---
[   80.005279] Kernel panic - not syncing: Fatal exception in interrupt

Cc: stable <stable@vger.kernel.org>
Signed-off-by: Johan Hovold <jhovold@gmail.com>
---
 include/net/bluetooth/hci.h |    1 +
 net/bluetooth/hci_core.c    |    7 +++++++
 2 files changed, 8 insertions(+), 0 deletions(-)

diff --git a/include/net/bluetooth/hci.h b/include/net/bluetooth/hci.h
index 00596e8..f626f44 100644
--- a/include/net/bluetooth/hci.h
+++ b/include/net/bluetooth/hci.h
@@ -86,6 +86,7 @@ enum {
 	HCI_DEBUG_KEYS,
 
 	HCI_RESET,
+	HCI_UNREGISTER,
 };
 
 /*
diff --git a/net/bluetooth/hci_core.c b/net/bluetooth/hci_core.c
index 5aeb624..3937ce3 100644
--- a/net/bluetooth/hci_core.c
+++ b/net/bluetooth/hci_core.c
@@ -525,6 +525,11 @@ int hci_dev_open(__u16 dev)
 
 	hci_req_lock(hdev);
 
+	if (test_bit(HCI_UNREGISTER, &hdev->flags)) {
+		ret = -ENODEV;
+		goto done;
+	}
+
 	if (hdev->rfkill && rfkill_blocked(hdev->rfkill)) {
 		ret = -ERFKILL;
 		goto done;
@@ -1577,6 +1582,8 @@ void hci_unregister_dev(struct hci_dev *hdev)
 
 	BT_DBG("%p name %s bus %d", hdev, hdev->name, hdev->bus);
 
+	set_bit(HCI_UNREGISTER, &hdev->flags);
+
 	write_lock(&hci_dev_list_lock);
 	list_del(&hdev->list);
 	write_unlock(&hci_dev_list_lock);
-- 
1.7.8.4
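
The guard added by this patch can be sketched outside the kernel as follows. This is a minimal illustration, assuming C11 atomics as a stand-in for the kernel's set_bit()/test_bit(); struct dev, DEV_UNREGISTER_BIT and the dev_* helpers are hypothetical names, and the req lock is only indicated by a comment.

```c
#include <errno.h>
#include <stdatomic.h>

enum { DEV_UNREGISTER_BIT = 0 };      /* hypothetical bit number */

struct dev {
	atomic_ulong flags;           /* stand-in for hdev->flags */
	int init_runs;                /* how often the init request was started */
};

static void dev_init(struct dev *d)
{
	atomic_init(&d->flags, 0UL);
	d->init_runs = 0;
}

/* Userspace analogues of set_bit()/test_bit(). */
static void dev_set_bit(struct dev *d, int nr)
{
	atomic_fetch_or(&d->flags, 1UL << nr);
}

static int dev_test_bit(struct dev *d, int nr)
{
	return (int)((atomic_load(&d->flags) >> nr) & 1UL);
}

static int dev_open(struct dev *d)
{
	int ret = 0;

	/* In the patch this check runs with the req lock held. */
	if (dev_test_bit(d, DEV_UNREGISTER_BIT)) {
		ret = -ENODEV;
		goto done;
	}
	d->init_runs++;               /* stands in for the init request */
done:
	return ret;
}

static void dev_unregister(struct dev *d)
{
	/* Set the flag first, before anything that may sleep. */
	dev_set_bit(d, DEV_UNREGISTER_BIT);
}
```

Once dev_unregister() has run, a later dev_open() bails out with -ENODEV without starting another init request.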


* Re: [PATCH 2/2] bluetooth: hci_core: fix NULL-pointer dereference at unregister
       [not found]   ` <1331136120-27075-3-git-send-email-jhovold-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
@ 2012-03-07 19:29     ` Marcel Holtmann
  2012-03-08 11:56       ` Johan Hovold
  0 siblings, 1 reply; 21+ messages in thread
From: Marcel Holtmann @ 2012-03-07 19:29 UTC (permalink / raw)
  To: Johan Hovold
  Cc: Gustavo F. Padovan, David S. Miller, linux-bluetooth,
	linux-kernel, netdev, stable

Hi Johan,

> Make sure hci_dev_open returns immediately if hci_unregister_dev has
> been called.
> 
> This fixes a race between hci_dev_open and hci_unregister_dev which can
> lead to a NULL-pointer dereference.
> 
> Bug is 100% reproducible using hciattach and a disconnected serial port:
> 
> 0. # hciattach -n /dev/ttyO1 any noflow
> 
> 1. hci_dev_open called from hci_power_on grabs req lock
> 2. hci_init_req executes but device fails to initialise (times out
>    eventually)
> 3. hci_dev_open is called from hci_sock_ioctl and sleeps on req lock
> 4. hci_uart_tty_close calls hci_unregister_dev and sleeps on req lock in
>    hci_dev_do_close
> 5. hci_dev_open (1) releases req lock
> 6. hci_dev_do_close grabs req lock and returns as device is not up
> 7. hci_unregister_dev sleeps in destroy_workqueue
> 8. hci_dev_open (3) grabs req lock, calls hci_init_req and eventually sleeps
> 9. hci_unregister_dev finishes, while hci_dev_open is still running...
> 
> [...]
> 
> Cc: stable <stable@vger.kernel.org>
> Signed-off-by: Johan Hovold <jhovold@gmail.com>
> ---
>  include/net/bluetooth/hci.h |    1 +
>  net/bluetooth/hci_core.c    |    7 +++++++
>  2 files changed, 8 insertions(+), 0 deletions(-)
> 
> diff --git a/include/net/bluetooth/hci.h b/include/net/bluetooth/hci.h
> index 00596e8..f626f44 100644
> --- a/include/net/bluetooth/hci.h
> +++ b/include/net/bluetooth/hci.h
> @@ -86,6 +86,7 @@ enum {
>  	HCI_DEBUG_KEYS,
>  
>  	HCI_RESET,
> +	HCI_UNREGISTER,
>  };

What version of the kernel is this patch against? We cleaned up the
flags in the bluetooth-next tree. Also, you cannot just add flags
here.

>  
>  /*
> diff --git a/net/bluetooth/hci_core.c b/net/bluetooth/hci_core.c
> index 5aeb624..3937ce3 100644
> --- a/net/bluetooth/hci_core.c
> +++ b/net/bluetooth/hci_core.c
> @@ -525,6 +525,11 @@ int hci_dev_open(__u16 dev)
>  
>  	hci_req_lock(hdev);
>  
> +	if (test_bit(HCI_UNREGISTER, &hdev->flags)) {
> +		ret = -ENODEV;
> +		goto done;
> +	}
> +
>  	if (hdev->rfkill && rfkill_blocked(hdev->rfkill)) {
>  		ret = -ERFKILL;
>  		goto done;
> @@ -1577,6 +1582,8 @@ void hci_unregister_dev(struct hci_dev *hdev)
>  
>  	BT_DBG("%p name %s bus %d", hdev, hdev->name, hdev->bus);
>  
> +	set_bit(HCI_UNREGISTER, &hdev->flags);
> +
>  	write_lock(&hci_dev_list_lock);
>  	list_del(&hdev->list);
>  	write_unlock(&hci_dev_list_lock);

Is this really enough to protect against this race condition?

Regards

Marcel


* Re: [PATCH 1/2] bluetooth: hci_ldisc: fix NULL-pointer dereference on tty_close
  2012-03-07 16:01 ` [PATCH 1/2] bluetooth: hci_ldisc: fix NULL-pointer dereference on tty_close Johan Hovold
@ 2012-03-07 19:33   ` Marcel Holtmann
  2012-03-08 11:57     ` Johan Hovold
  2012-03-09 13:44   ` David Herrmann
  1 sibling, 1 reply; 21+ messages in thread
From: Marcel Holtmann @ 2012-03-07 19:33 UTC (permalink / raw)
  To: Johan Hovold
  Cc: Gustavo F. Padovan, David S. Miller, linux-bluetooth,
	linux-kernel, netdev, stable

Hi Johan,

> Do not close protocol driver until device has been unregistered.
> 
> This fixes a race between tty_close and hci_dev_open which can result in
> a NULL-pointer dereference.
> 
> The line discipline closes the protocol driver while we may still have
> hci_dev_open sleeping on the req_lock mutex resulting in a NULL-pointer
> dereference when lock is acquired and hci_init_req called.
> 
> Bug is 100% reproducible using hciattach and a disconnected serial port:
> 
> 0. # hciattach -n ttyO1 any noflow
> 
> 1. hci_dev_open called from hci_power_on grabs req lock
> 2. hci_init_req executes but device fails to initialise (times out
>    eventually)
> 3. hci_dev_open is called from hci_sock_ioctl and sleeps on req lock
> 4. hci_uart_tty_close detaches protocol driver and cancels init req
> 5. hci_dev_open (1) releases req lock
> 6. hci_dev_open (3) grabs req lock, calls hci_init_req, which triggers oops
>    when request is prepared in hci_uart_send_frame
> 
> [...]
> 
> Cc: stable <stable@vger.kernel.org>
> Signed-off-by: Johan Hovold <jhovold@gmail.com>
> ---
>  drivers/bluetooth/hci_ldisc.c |    2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
> 
> diff --git a/drivers/bluetooth/hci_ldisc.c b/drivers/bluetooth/hci_ldisc.c
> index 0711448..6946081 100644
> --- a/drivers/bluetooth/hci_ldisc.c
> +++ b/drivers/bluetooth/hci_ldisc.c
> @@ -310,11 +310,11 @@ static void hci_uart_tty_close(struct tty_struct *tty)
>  			hci_uart_close(hdev);
>  
>  		if (test_and_clear_bit(HCI_UART_PROTO_SET, &hu->flags)) {
> -			hu->proto->close(hu);
>  			if (hdev) {
>  				hci_unregister_dev(hdev);
>  				hci_free_dev(hdev);
>  			}
> +			hu->proto->close(hu);
>  		}
>  	}
>  }

What kernel version is this against? Our changes in bluetooth-next fixed
some of the destruct handling.

Also, hci_unregister_dev should be calling the destruct handler, so your
change now accesses hu after it has already been freed.
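
The use-after-free concern can be illustrated with a small userspace sketch. It rests on the assumption stated above, namely that unregistering invokes a destruct handler which frees the UART state; struct hdev, struct uart and all function names here are hypothetical, and the sketch only detects the dangling pointer instead of dereferencing it.

```c
#include <stdbool.h>
#include <stdlib.h>

struct uart {
	int proto_open;               /* state that a later proto close would touch */
};

struct hdev {
	struct uart *driver_data;
	void (*destruct)(struct hdev *);  /* optional, may be NULL */
};

/* Destruct handler that owns and frees the UART state. */
static void uart_destruct(struct hdev *h)
{
	free(h->driver_data);
	h->driver_data = NULL;
}

static void unregister_dev(struct hdev *h)
{
	if (h->destruct)
		h->destruct(h);
}

/*
 * Returns true if closing the protocol after unregistering would touch
 * memory that the destruct handler has already freed.
 */
static bool close_after_unregister_is_uaf(bool has_destruct)
{
	struct uart *hu = calloc(1, sizeof(*hu));
	struct hdev h = {
		.driver_data = hu,
		.destruct = has_destruct ? uart_destruct : NULL,
	};
	bool uaf;

	if (hu == NULL)
		abort();

	unregister_dev(&h);
	uaf = (h.driver_data == NULL);    /* hu is dangling; do not dereference it */
	if (!uaf) {
		hu->proto_open = 0;       /* safe: hu is still owned by the caller */
		free(hu);
	}
	return uaf;
}
```

With a destruct handler that frees the state, the reordered close is unsafe; without one, the caller still owns hu and the reordering is harmless in this model.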

Regards

Marcel


* Re: [PATCH 2/2] bluetooth: hci_core: fix NULL-pointer dereference at unregister
  2012-03-07 19:29     ` Marcel Holtmann
@ 2012-03-08 11:56       ` Johan Hovold
  2012-03-08 17:43         ` Marcel Holtmann
  0 siblings, 1 reply; 21+ messages in thread
From: Johan Hovold @ 2012-03-08 11:56 UTC (permalink / raw)
  To: Marcel Holtmann
  Cc: Gustavo F. Padovan, David S. Miller, linux-bluetooth,
	linux-kernel, netdev, stable

Hi Marcel,

On Wed, Mar 07, 2012 at 11:29:20AM -0800, Marcel Holtmann wrote:
> Hi Johan,
> 
> > Make sure hci_dev_open returns immediately if hci_unregister_dev has
> > been called.
> > 
> > This fixes a race between hci_dev_open and hci_unregister_dev which can
> > lead to a NULL-pointer dereference.

[...]

> What version of the kernel is this patch against? We cleaned up the
> flags in the bluetooth-next tree. Also, you cannot just add flags
> here.

As this needs to be fixed in 3.3, it is against 3.3-rc6.

> >  /*
> > diff --git a/net/bluetooth/hci_core.c b/net/bluetooth/hci_core.c
> > index 5aeb624..3937ce3 100644
> > --- a/net/bluetooth/hci_core.c
> > +++ b/net/bluetooth/hci_core.c
> > @@ -525,6 +525,11 @@ int hci_dev_open(__u16 dev)
> >  
> >  	hci_req_lock(hdev);
> >  
> > +	if (test_bit(HCI_UNREGISTER, &hdev->flags)) {
> > +		ret = -ENODEV;
> > +		goto done;
> > +	}
> > +
> >  	if (hdev->rfkill && rfkill_blocked(hdev->rfkill)) {
> >  		ret = -ERFKILL;
> >  		goto done;
> > @@ -1577,6 +1582,8 @@ void hci_unregister_dev(struct hci_dev *hdev)
> >  
> >  	BT_DBG("%p name %s bus %d", hdev, hdev->name, hdev->bus);
> >  
> > +	set_bit(HCI_UNREGISTER, &hdev->flags);
> > +
> >  	write_lock(&hci_dev_list_lock);
> >  	list_del(&hdev->list);
> >  	write_unlock(&hci_dev_list_lock);
> 
> Is this really enough to protect against this race condition?

1. first hci_dev_open grabs req lock
2. second hci_dev_open sleeps on req lock
3. hci_unregister_dev sleeps on req lock (in hci_dev_do_close)
4. first hci_dev_open times out and releases req lock

Now either a) the second open grabs the lock or b) close does. 

a) The second open will time out eventually as well and setting a flag
   at unregister will only speed things up (at least when the first
   patch in my series is applied -- otherwise this leads to a
   NULL-pointer exception as well).

b) If close grabs the lock while open is still sleeping on it, close
   returns immediately (the device is not up) and unregister goes on to
   finish while open is still running. This is the case this patch
   intends to fix.

As far as I can see, a flag set at unregister (before acquiring the lock)
will suffice to fix this race, but perhaps I'm missing something?
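
The two cases above can be written down as a small deterministic model. This is a sketch under simplifying assumptions: the names are hypothetical, no real locking is involved, and "unregister has run" is collapsed into a single boolean even though in the real sequence unregister finishes while open is still running.

```c
#include <stdbool.h>

/* Who gets the req lock once the first hci_dev_open has released it. */
enum next_owner { SECOND_OPEN, CLOSE };

struct outcome {
	bool init_ran;            /* second open went on to run the init request */
	bool init_on_dying_dev;   /* ...and did so after close had already returned */
};

static struct outcome handoff(enum next_owner who, bool use_flag)
{
	/* The flag, if used, is set at unregister before sleeping on the lock. */
	bool unregister_flag = use_flag;
	bool close_done = false;
	struct outcome o = { false, false };

	if (who == CLOSE) {
		/* close gets the lock, sees the device is not up and returns,
		 * so unregister no longer waits for the pending open */
		close_done = true;
	}

	/* Now the second open gets the lock. */
	if (!unregister_flag) {
		o.init_ran = true;
		o.init_on_dying_dev = close_done;
	}
	return o;
}
```

In this model only the combination "close first, no flag" runs the init request on a device that is being torn down; with the flag set, the second open never starts the init request in either order.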

Where should such an internal flag be added, given that hdev->flags
cannot be used? hdev->dev_flags?

Thanks,
Johan


* Re: [PATCH 1/2] bluetooth: hci_ldisc: fix NULL-pointer dereference on tty_close
  2012-03-07 19:33   ` Marcel Holtmann
@ 2012-03-08 11:57     ` Johan Hovold
  2012-03-08 17:45       ` Marcel Holtmann
  0 siblings, 1 reply; 21+ messages in thread
From: Johan Hovold @ 2012-03-08 11:57 UTC (permalink / raw)
  To: Marcel Holtmann
  Cc: Gustavo F. Padovan, David S. Miller, linux-bluetooth,
	linux-kernel, netdev, stable

Hi Marcel,

On Wed, Mar 07, 2012 at 11:33:17AM -0800, Marcel Holtmann wrote:
> Hi Johan,
>
> > Do not close protocol driver until device has been unregistered.
> > 
> > This fixes a race between tty_close and hci_dev_open which can result in
> > a NULL-pointer dereference.
> > 
> > The line discipline closes the protocol driver while we may still have
> > hci_dev_open sleeping on the req_lock mutex resulting in a NULL-pointer
> > dereference when lock is acquired and hci_init_req called.

[...]

> what kernel version is this against? Our changes in bluetooth-next fixed
> some of the destruct handling.

This is against the latest rc as it needs to be fixed in 3.3, but I
missed a dependency to bluetooth-next as you point out below.

> Also hci_unregister_dev should be calling the destruct handler and thus
> your change is now accessing hu but it got freed already.

You're right, my patch depends on 010666a126fc ("Bluetooth: Make
hci-destruct callback optional") and 797fe796c4 ("Bluetooth: uart-ldisc:
Fix memory leak and remove destruct cb") from bluetooth-next. 

But since the latter one fixes a memory leak it should have been marked
for stable as well as pushed to Linus for 3.3, right?

Thanks,
Johan

* Re: [PATCH 2/2] bluetooth: hci_core: fix NULL-pointer dereference at unregister
  2012-03-08 11:56       ` Johan Hovold
@ 2012-03-08 17:43         ` Marcel Holtmann
  2012-03-09 12:53           ` [PATCH 2/2 v2] " Johan Hovold
  0 siblings, 1 reply; 21+ messages in thread
From: Marcel Holtmann @ 2012-03-08 17:43 UTC (permalink / raw)
  To: Johan Hovold
  Cc: Gustavo F. Padovan, David S. Miller, linux-bluetooth,
	linux-kernel, netdev, stable

Hi Johan,

> > > Make sure hci_dev_open returns immediately if hci_dev_unregister has
> > > been called.
> > > 
> > > This fixes a race between hci_dev_open and hci_dev_unregister which can
> > > lead to a NULL-pointer dereference.
> 
> [...]
> 
> > what version of the kernel is this patch against? Since we cleaned up
> > the flags in bluetooth-next tree. Also in addition you can not just add
> > flags here.
> 
> > As this needs to be fixed in 3.3, it is against 3.3-rc6.
> 
> > >  /*
> > > diff --git a/net/bluetooth/hci_core.c b/net/bluetooth/hci_core.c
> > > index 5aeb624..3937ce3 100644
> > > --- a/net/bluetooth/hci_core.c
> > > +++ b/net/bluetooth/hci_core.c
> > > @@ -525,6 +525,11 @@ int hci_dev_open(__u16 dev)
> > >  
> > >  	hci_req_lock(hdev);
> > >  
> > > +	if (test_bit(HCI_UNREGISTER, &hdev->flags)) {
> > > +		ret = -ENODEV;
> > > +		goto done;
> > > +	}
> > > +
> > >  	if (hdev->rfkill && rfkill_blocked(hdev->rfkill)) {
> > >  		ret = -ERFKILL;
> > >  		goto done;
> > > @@ -1577,6 +1582,8 @@ void hci_unregister_dev(struct hci_dev *hdev)
> > >  
> > >  	BT_DBG("%p name %s bus %d", hdev, hdev->name, hdev->bus);
> > >  
> > > +	set_bit(HCI_UNREGISTER, &hdev->flags);
> > > +
> > >  	write_lock(&hci_dev_list_lock);
> > >  	list_del(&hdev->list);
> > >  	write_unlock(&hci_dev_list_lock);
> > 
> > Is this really enough to protect this race condition?
> 
> 1. first hci_dev_open grabs req lock
> 2. second hci_dev_open sleeps on req lock
> 3. hci_dev_unregister sleeps on req lock (in do_close)
> 4. first hci_dev_open times out and releases req lock
> 
> Now either a) the second open grabs the lock or b) close does. 
> 
> a) The second open will time out eventually as well and setting a flag
>    at unregister will only speed things up (at least when the first
>    patch in my series is applied -- otherwise this leads to a
>    NULL-pointer exception as well).
> 
> b) If close grabs the lock while we have open sleeping on it things go
>    really bad and this is the case this patch intends to fix.
> 
> As far as I can see, a flag set at unregister (before acquiring the lock)
> will suffice to fix this race, but perhaps I'm missing something?
> 
> Where should such an internal flag be added as hdev->flags can not be
> used? hdev->dev_flags?

Please add it to hdev->dev_flags as an internal flag.
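
A rough userspace sketch of the split between the two flag words, assuming
(this is not stated in the thread) that hdev->flags is the word reported to
userspace while hdev->dev_flags is only used inside the kernel. All MODEL_*
and model_* names are illustrative, and C11 atomics stand in for the kernel
bit helpers.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Bit numbers; the MODEL_* names are illustrative, not kernel definitions. */
enum { MODEL_UP, MODEL_RUNNING };		/* bits in the exported word */
enum { MODEL_UNREGISTER, MODEL_LE_SCAN };	/* bits in the internal word */

/* Hypothetical stand-in for struct hci_dev. */
struct model_dev {
	atomic_ulong flags;	/* models hdev->flags (assumed exported) */
	atomic_ulong dev_flags;	/* models hdev->dev_flags (internal only) */
};

/* Userspace stand-in for set_bit(): atomically set bit nr in *word. */
void model_set_bit(unsigned int nr, atomic_ulong *word)
{
	atomic_fetch_or(word, 1UL << nr);
}

/* Userspace stand-in for test_bit(): read bit nr of *word. */
bool model_test_bit(unsigned int nr, atomic_ulong *word)
{
	return (atomic_load(word) >> nr) & 1UL;
}

/* What a "get device info" style call would report: the exported word only. */
unsigned long model_exported_flags(struct model_dev *d)
{
	return atomic_load(&d->flags);
}
```

With this layout, setting MODEL_UNREGISTER in dev_flags leaves the value
returned by model_exported_flags() unchanged, so an internal state bit does
not alter what userspace sees.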

Regards

Marcel

* Re: [PATCH 1/2] bluetooth: hci_ldisc: fix NULL-pointer dereference on tty_close
  2012-03-08 11:57     ` Johan Hovold
@ 2012-03-08 17:45       ` Marcel Holtmann
  2012-03-09 13:04         ` Johan Hovold
  0 siblings, 1 reply; 21+ messages in thread
From: Marcel Holtmann @ 2012-03-08 17:45 UTC (permalink / raw)
  To: Johan Hovold
  Cc: Gustavo F. Padovan, David S. Miller, linux-bluetooth,
	linux-kernel, netdev, stable

Hi Johan,

> > > Do not close protocol driver until device has been unregistered.
> > > 
> > > This fixes a race between tty_close and hci_dev_open which can result in
> > > a NULL-pointer dereference.
> > > 
> > > The line discipline closes the protocol driver while we may still have
> > > hci_dev_open sleeping on the req_lock mutex resulting in a NULL-pointer
> > > dereference when lock is acquired and hci_init_req called.
> 
> [...]
> 
> > what kernel version is this against? Our changes in bluetooth-next fixed
> > some of the destruct handling.
> 
> This is against the latest rc as it needs to be fixed in 3.3, but I
> missed a dependency to bluetooth-next as you point out below.
> 
> > Also hci_unregister_dev should be calling the destruct handler and thus
> > your change is now accessing hu but it got freed already.
> 
> You're right, my patch depends on 010666a126fc ("Bluetooth: Make
> hci-destruct callback optional") and 797fe796c4 ("Bluetooth: uart-ldisc:
> Fix memory leak and remove destruct cb") from bluetooth-next. 
> 
> But since the latter one fixes a memory leak it should have been marked
> for stable as well as pushed to Linus for 3.3, right?

we need to look into this and propose patches for -stable. Is your
problem still present with bluetooth-next or not?

Regards

Marcel

* [PATCH 2/2 v2] bluetooth: hci_core: fix NULL-pointer dereference at unregister
  2012-03-08 17:43         ` Marcel Holtmann
@ 2012-03-09 12:53           ` Johan Hovold
  2012-03-09 14:04             ` David Herrmann
  0 siblings, 1 reply; 21+ messages in thread
From: Johan Hovold @ 2012-03-09 12:53 UTC (permalink / raw)
  To: Marcel Holtmann, Gustavo F. Padovan
  Cc: David S. Miller, linux-bluetooth, linux-kernel, netdev,
	Johan Hovold, stable

Make sure hci_dev_open returns immediately if hci_dev_unregister has
been called.

This fixes a race between hci_dev_open and hci_dev_unregister which can
lead to a NULL-pointer dereference.

Bug is 100% reproducible using hciattach and a disconnected serial port:

0. # hciattach -n /dev/ttyO1 any noflow

1. hci_dev_open called from hci_power_on grabs req lock
2. hci_init_req executes but device fails to initialise (times out
   eventually)
3. hci_dev_open is called from hci_sock_ioctl and sleeps on req lock
4. hci_uart_tty_close calls hci_dev_unregister and sleeps on req lock in
   hci_dev_do_close
5. hci_dev_open (1) releases req lock
6. hci_dev_do_close grabs req lock and returns as device is not up
7. hci_dev_unregister sleeps in destroy_workqueue
8. hci_dev_open (3) grabs req lock, calls hci_init_req and eventually sleeps
9. hci_dev_unregister finishes, while hci_dev_open is still running...

[   79.627136] INFO: trying to register non-static key.
[   79.632354] the code is fine but needs lockdep annotation.
[   79.638122] turning off the locking correctness validator.
[   79.643920] [<c00188bc>] (unwind_backtrace+0x0/0xf8) from [<c00729c4>] (__lock_acquire+0x1590/0x1ab0)
[   79.653594] [<c00729c4>] (__lock_acquire+0x1590/0x1ab0) from [<c00733f8>] (lock_acquire+0x9c/0x128)
[   79.663085] [<c00733f8>] (lock_acquire+0x9c/0x128) from [<c0040a88>] (run_timer_softirq+0x150/0x3ac)
[   79.672668] [<c0040a88>] (run_timer_softirq+0x150/0x3ac) from [<c003a3b8>] (__do_softirq+0xd4/0x22c)
[   79.682281] [<c003a3b8>] (__do_softirq+0xd4/0x22c) from [<c003a924>] (irq_exit+0x8c/0x94)
[   79.690856] [<c003a924>] (irq_exit+0x8c/0x94) from [<c0013a50>] (handle_IRQ+0x34/0x84)
[   79.699157] [<c0013a50>] (handle_IRQ+0x34/0x84) from [<c0008530>] (omap3_intc_handle_irq+0x48/0x4c)
[   79.708648] [<c0008530>] (omap3_intc_handle_irq+0x48/0x4c) from [<c037499c>] (__irq_usr+0x3c/0x60)
[   79.718048] Exception stack(0xcf281fb0 to 0xcf281ff8)
[   79.723358] 1fa0:                                     0001e6a0 be8dab00 0001e698 00036698
[   79.731933] 1fc0: 0002df98 0002df38 0000001f 00000000 b6f234d0 00000000 00000004 00000000
[   79.740509] 1fe0: 0001e6f8 be8d6aa0 be8dac50 0000aab8 80000010 ffffffff
[   79.747497] Unable to handle kernel NULL pointer dereference at virtual address 00000000
[   79.756011] pgd = cf3b4000
[   79.758850] [00000000] *pgd=8f0c7831, *pte=00000000, *ppte=00000000
[   79.765502] Internal error: Oops: 80000007 [#1]
[   79.770294] Modules linked in:
[   79.773529] CPU: 0    Tainted: G        W     (3.3.0-rc6-00002-gb5d5c87 #421)
[   79.781066] PC is at 0x0
[   79.783721] LR is at run_timer_softirq+0x16c/0x3ac
[   79.788787] pc : [<00000000>]    lr : [<c0040aa4>]    psr: 60000113
[   79.788787] sp : cf281ee0  ip : 00000000  fp : cf280000
[   79.800903] r10: 00000004  r9 : 00000100  r8 : b6f234d0
[   79.806427] r7 : c0519c28  r6 : cf093488  r5 : c0561a00  r4 : 00000000
[   79.813323] r3 : 00000000  r2 : c054eee0  r1 : 00000001  r0 : 00000000
[   79.820190] Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
[   79.827728] Control: 10c5387d  Table: 8f3b4019  DAC: 00000015
[   79.833801] Process gpsd (pid: 1265, stack limit = 0xcf2802e8)
[   79.839965] Stack: (0xcf281ee0 to 0xcf282000)
[   79.844573] 1ee0: 00000002 00000000 c0040a24 00000000 00000002 cf281f08 00200200 00000000
[   79.853210] 1f00: 00000000 cf281f18 cf281f08 00000000 00000000 00000000 cf281f18 cf281f18
[   79.861816] 1f20: 00000000 00000001 c056184c 00000000 00000001 b6f234d0 c0561848 00000004
[   79.870452] 1f40: cf280000 c003a3b8 c051e79c 00000001 00000000 00000100 3fa9e7b8 0000000a
[   79.879089] 1f60: 00000025 cf280000 00000025 00000000 00000000 b6f234d0 00000000 00000004
[   79.887756] 1f80: 00000000 c003a924 c053ad38 c0013a50 fa200000 cf281fb0 ffffffff c0008530
[   79.896362] 1fa0: 0001e6a0 0000aab8 80000010 c037499c 0001e6a0 be8dab00 0001e698 00036698
[   79.904998] 1fc0: 0002df98 0002df38 0000001f 00000000 b6f234d0 00000000 00000004 00000000
[   79.913665] 1fe0: 0001e6f8 be8d6aa0 be8dac50 0000aab8 80000010 ffffffff 00fbf700 04ffff00
[   79.922302] [<c0040aa4>] (run_timer_softirq+0x16c/0x3ac) from [<c003a3b8>] (__do_softirq+0xd4/0x22c)
[   79.931945] [<c003a3b8>] (__do_softirq+0xd4/0x22c) from [<c003a924>] (irq_exit+0x8c/0x94)
[   79.940582] [<c003a924>] (irq_exit+0x8c/0x94) from [<c0013a50>] (handle_IRQ+0x34/0x84)
[   79.948913] [<c0013a50>] (handle_IRQ+0x34/0x84) from [<c0008530>] (omap3_intc_handle_irq+0x48/0x4c)
[   79.958404] [<c0008530>] (omap3_intc_handle_irq+0x48/0x4c) from [<c037499c>] (__irq_usr+0x3c/0x60)
[   79.967773] Exception stack(0xcf281fb0 to 0xcf281ff8)
[   79.973083] 1fa0:                                     0001e6a0 be8dab00 0001e698 00036698
[   79.981658] 1fc0: 0002df98 0002df38 0000001f 00000000 b6f234d0 00000000 00000004 00000000
[   79.990234] 1fe0: 0001e6f8 be8d6aa0 be8dac50 0000aab8 80000010 ffffffff
[   79.997161] Code: bad PC value
[   80.000396] ---[ end trace 6f6739840475f9ee ]---
[   80.005279] Kernel panic - not syncing: Fatal exception in interrupt

Cc: stable <stable@vger.kernel.org>
Signed-off-by: Johan Hovold <jhovold@gmail.com>
---

v2: use hdev->dev_flags for internal unregister flag 


 include/net/bluetooth/hci.h |    2 ++
 net/bluetooth/hci_core.c    |    7 +++++++
 2 files changed, 9 insertions(+), 0 deletions(-)

diff --git a/include/net/bluetooth/hci.h b/include/net/bluetooth/hci.h
index 00596e8..e8879b9 100644
--- a/include/net/bluetooth/hci.h
+++ b/include/net/bluetooth/hci.h
@@ -93,6 +93,8 @@ enum {
  * states from the controller.
  */
 enum {
+	HCI_UNREGISTER,
+
 	HCI_LE_SCAN,
 };
 
diff --git a/net/bluetooth/hci_core.c b/net/bluetooth/hci_core.c
index d6448f0..22b6781 100644
--- a/net/bluetooth/hci_core.c
+++ b/net/bluetooth/hci_core.c
@@ -525,6 +525,11 @@ int hci_dev_open(__u16 dev)
 
 	hci_req_lock(hdev);
 
+	if (test_bit(HCI_UNREGISTER, &hdev->dev_flags)) {
+		ret = -ENODEV;
+		goto done;
+	}
+
 	if (hdev->rfkill && rfkill_blocked(hdev->rfkill)) {
 		ret = -ERFKILL;
 		goto done;
@@ -1577,6 +1582,8 @@ void hci_unregister_dev(struct hci_dev *hdev)
 
 	BT_DBG("%p name %s bus %d", hdev, hdev->name, hdev->bus);
 
+	set_bit(HCI_UNREGISTER, &hdev->dev_flags);
+
 	write_lock(&hci_dev_list_lock);
 	list_del(&hdev->list);
 	write_unlock(&hci_dev_list_lock);
-- 
1.7.8.4

* Re: [PATCH 1/2] bluetooth: hci_ldisc: fix NULL-pointer dereference on tty_close
  2012-03-08 17:45       ` Marcel Holtmann
@ 2012-03-09 13:04         ` Johan Hovold
  2012-03-09 13:52           ` David Herrmann
  0 siblings, 1 reply; 21+ messages in thread
From: Johan Hovold @ 2012-03-09 13:04 UTC (permalink / raw)
  To: Marcel Holtmann
  Cc: Gustavo F. Padovan, David S. Miller, linux-bluetooth,
	linux-kernel, netdev, stable

On Thu, Mar 08, 2012 at 09:45:22AM -0800, Marcel Holtmann wrote:
> Hi Johan,
> 
> > > > Do not close protocol driver until device has been unregistered.
> > > > 
> > > > This fixes a race between tty_close and hci_dev_open which can result in
> > > > a NULL-pointer dereference.
> > > > 
> > > > The line discipline closes the protocol driver while we may still have
> > > > hci_dev_open sleeping on the req_lock mutex resulting in a NULL-pointer
> > > > dereference when lock is acquired and hci_init_req called.
> > 
> > [...]
> > 
> > > what kernel version is this against? Our changes in bluetooth-next fixed
> > > some of the destruct handling.
> > 
> > This is against the latest rc as it needs to be fixed in 3.3, but I
> > missed a dependency to bluetooth-next as you point out below.
> > 
> > > Also hci_unregister_dev should be calling the destruct handler and thus
> > > your change is now accessing hu but it got freed already.
> > 
> > You're right, my patch depends on 010666a126fc ("Bluetooth: Make
> > hci-destruct callback optional") and 797fe796c4 ("Bluetooth: uart-ldisc:
> > Fix memory leak and remove destruct cb") from bluetooth-next. 
> > 
> > But since the latter one fixes a memory leak it should have been marked
> > for stable as well as pushed to Linus for 3.3, right?
> 
> we need to look into this and propose patches for -stable. Is your
> problem still present with bluetooth-next or not?

Yes, both races are present in bluetooth-next of today (b8622cbd58f34)
and only take an additional manual step to trigger (as the core no
longer tries to open the device twice automatically).

My two patches on top of either the two patches by David Herrmann
mentioned above or the following minimal fix of the same memory leak
would be sufficient to fix both races in 3.3:

diff --git a/drivers/bluetooth/hci_ldisc.c b/drivers/bluetooth/hci_ldisc.c
index 0711448..97c5faa 100644
--- a/drivers/bluetooth/hci_ldisc.c
+++ b/drivers/bluetooth/hci_ldisc.c
@@ -237,7 +237,6 @@ static void hci_uart_destruct(struct hci_dev *hdev)
                return;
 
        BT_DBG("%s", hdev->name);
-       kfree(hdev->driver_data);
 }
 
 /* ------ LDISC part ------ */
@@ -316,6 +315,7 @@ static void hci_uart_tty_close(struct tty_struct *tty)
                                hci_free_dev(hdev);
                        }
                }
+               kfree(hu);
        }
 }


Thanks,
Johan

* Re: [PATCH 1/2] bluetooth: hci_ldisc: fix NULL-pointer dereference on tty_close
  2012-03-07 16:01 ` [PATCH 1/2] bluetooth: hci_ldisc: fix NULL-pointer dereference on tty_close Johan Hovold
  2012-03-07 19:33   ` Marcel Holtmann
@ 2012-03-09 13:44   ` David Herrmann
  2012-03-09 14:29     ` Johan Hovold
  1 sibling, 1 reply; 21+ messages in thread
From: David Herrmann @ 2012-03-09 13:44 UTC (permalink / raw)
  To: Johan Hovold
  Cc: Marcel Holtmann, Gustavo F. Padovan, David S. Miller,
	linux-bluetooth, linux-kernel, netdev, stable

Hi Johan

On Wed, Mar 7, 2012 at 5:01 PM, Johan Hovold <jhovold@gmail.com> wrote:
> Do not close protocol driver until device has been unregistered.
>
> This fixes a race between tty_close and hci_dev_open which can result in
> a NULL-pointer dereference.
>
> The line discipline closes the protocol driver while we may still have
> hci_dev_open sleeping on the req_lock mutex resulting in a NULL-pointer
> dereference when lock is acquired and hci_init_req called.
>
> Bug is 100% reproducible using hciattach and a disconnected serial port:
>
> 0. # hciattach -n ttyO1 any noflow
>
> 1. hci_dev_open called from hci_power_on grabs req lock
> 2. hci_init_req executes but device fails to initialise (times out
>   eventually)
> 3. hci_dev_open is called from hci_sock_ioctl and sleeps on req lock
> 4. hci_uart_tty_close detaches protocol driver and cancels init req
> 5. hci_dev_open (1) releases req lock
> 6. hci_dev_open (3) grabs req lock, calls hci_init_req, which triggers oops
>   when request is prepared in hci_uart_send_frame
>
> [  137.201263] Unable to handle kernel NULL pointer dereference at virtual address 00000028
> [  137.209838] pgd = c0004000
> [  137.212677] [00000028] *pgd=00000000
> [  137.216430] Internal error: Oops: 17 [#1]
> [  137.220642] Modules linked in:
> [  137.223846] CPU: 0    Tainted: G        W     (3.3.0-rc6-dirty #406)
> [  137.230529] PC is at __lock_acquire+0x5c/0x1ab0
> [  137.235290] LR is at lock_acquire+0x9c/0x128
> [  137.239776] pc : [<c0071490>]    lr : [<c00733f8>]    psr: 20000093
> [  137.239776] sp : cf869dd8  ip : c0529554  fp : c051c730
> [  137.251800] r10: 00000000  r9 : cf8673c0  r8 : 00000080
> [  137.257293] r7 : 00000028  r6 : 00000002  r5 : 00000000  r4 : c053fd70
> [  137.264129] r3 : 00000000  r2 : 00000000  r1 : 00000000  r0 : 00000001
> [  137.270965] Flags: nzCv  IRQs off  FIQs on  Mode SVC_32  ISA ARM  Segment kernel
> [  137.278717] Control: 10c5387d  Table: 8f0f4019  DAC: 00000015
> [  137.284729] Process kworker/u:1 (pid: 7, stack limit = 0xcf8682e8)
> [  137.291229] Stack: (0xcf869dd8 to 0xcf86a000)
> [  137.295776] 9dc0:                                                       c0529554 00000000
> [  137.304351] 9de0: cf8673c0 cf868000 d03ea1ef cf868000 000001ef 00000470 00000000 00000002
> [  137.312927] 9e00: cf8673c0 00000001 c051c730 c00716ec 0000000c 00000440 c0529554 00000001
> [  137.321533] 9e20: c051c730 cf868000 d03ea1f3 00000000 c053b978 00000000 00000028 cf868000
> [  137.330078] 9e40: 00000000 00000000 00000002 00000000 00000000 c00733f8 00000002 00000080
> [  137.338684] 9e60: 00000000 c02a1d50 00000000 00000001 60000013 c0969a1c 60000093 c053b96c
> [  137.347259] 9e80: 00000002 00000018 20000013 c02a1d50 cf0ac000 00000000 00000002 cf868000
> [  137.355834] 9ea0: 00000089 c0374130 00000002 00000000 c02a1d50 cf0ac000 0000000c cf0fc540
> [  137.364410] 9ec0: 00000018 c02a1d50 cf0fc540 00000000 cf0fc540 c0282238 c028220c cf178d80
> [  137.372985] 9ee0: 127525d8 c02821cc 9a1fa451 c032727c 9a1fa451 127525d8 cf0fc540 cf0ac4ec
> [  137.381561] 9f00: cf0ac000 cf0fc540 cf0ac584 c03285f4 c0328580 cf0ac4ec cf85c740 c05510cc
> [  137.390136] 9f20: ce825400 c004c914 00000002 00000000 c004c884 ce8254f5 cf869f48 00000000
> [  137.398712] 9f40: c0328580 ce825415 c0a7f914 c061af64 00000000 c048cf3c cf8673c0 cf85c740
> [  137.407287] 9f60: c05510cc c051a66c c05510ec c05510c4 cf85c750 cf868000 00000089 c004d6ac
> [  137.415863] 9f80: 00000000 c0073d14 00000001 cf853ed8 cf85c740 c004d558 00000013 00000000
> [  137.424438] 9fa0: 00000000 00000000 00000000 c00516b0 00000000 00000000 cf85c740 00000000
> [  137.433013] 9fc0: 00000001 dead4ead ffffffff ffffffff c0551674 00000000 00000000 c0450aa4
> [  137.441589] 9fe0: cf869fe0 cf869fe0 cf853ed8 c005162c c0013b30 c0013b30 00ffff00 00ffff00
> [  137.450164] [<c0071490>] (__lock_acquire+0x5c/0x1ab0) from [<c00733f8>] (lock_acquire+0x9c/0x128)
> [  137.459503] [<c00733f8>] (lock_acquire+0x9c/0x128) from [<c0374130>] (_raw_spin_lock_irqsave+0x44/0x58)
> [  137.469360] [<c0374130>] (_raw_spin_lock_irqsave+0x44/0x58) from [<c02a1d50>] (skb_queue_tail+0x18/0x48)
> [  137.479339] [<c02a1d50>] (skb_queue_tail+0x18/0x48) from [<c0282238>] (h4_enqueue+0x2c/0x34)
> [  137.488189] [<c0282238>] (h4_enqueue+0x2c/0x34) from [<c02821cc>] (hci_uart_send_frame+0x34/0x68)
> [  137.497497] [<c02821cc>] (hci_uart_send_frame+0x34/0x68) from [<c032727c>] (hci_send_frame+0x50/0x88)
> [  137.507171] [<c032727c>] (hci_send_frame+0x50/0x88) from [<c03285f4>] (hci_cmd_work+0x74/0xd4)
> [  137.516204] [<c03285f4>] (hci_cmd_work+0x74/0xd4) from [<c004c914>] (process_one_work+0x1a0/0x4ec)
> [  137.525604] [<c004c914>] (process_one_work+0x1a0/0x4ec) from [<c004d6ac>] (worker_thread+0x154/0x344)
> [  137.535278] [<c004d6ac>] (worker_thread+0x154/0x344) from [<c00516b0>] (kthread+0x84/0x90)
> [  137.543975] [<c00516b0>] (kthread+0x84/0x90) from [<c0013b30>] (kernel_thread_exit+0x0/0x8)
> [  137.552734] Code: e59f4e5c e5941000 e3510000 0a000031 (e5971000)
> [  137.559234] ---[ end trace 1b75b31a2719ed1e ]---
>
> Cc: stable <stable@vger.kernel.org>
> Signed-off-by: Johan Hovold <jhovold@gmail.com>
> ---
>  drivers/bluetooth/hci_ldisc.c |    2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
>
> diff --git a/drivers/bluetooth/hci_ldisc.c b/drivers/bluetooth/hci_ldisc.c
> index 0711448..6946081 100644
> --- a/drivers/bluetooth/hci_ldisc.c
> +++ b/drivers/bluetooth/hci_ldisc.c
> @@ -310,11 +310,11 @@ static void hci_uart_tty_close(struct tty_struct *tty)
>                        hci_uart_close(hdev);
>
>                if (test_and_clear_bit(HCI_UART_PROTO_SET, &hu->flags)) {
> -                       hu->proto->close(hu);
>                        if (hdev) {
>                                hci_unregister_dev(hdev);
>                                hci_free_dev(hdev);
>                        }
> +                       hu->proto->close(hu);
>                }
>        }
>  }

I can confirm this. hci_uart_set_proto() opens the proto before it
registers the hci device. Hence, we should also unregister the hci
device before closing the proto. I also checked whether this introduces
other race conditions, but no proto-callback can be called here as they
are all protected by the tty-layer, which synchronizes all
tty-callbacks. Therefore, I think this is the correct fix.
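
The ordering argument can be sketched in userspace as follows. This is only
a model: struct model_uart and the model_* functions are hypothetical names,
and two booleans stand in for "protocol data is valid" and "the hci core can
still call into the driver".

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the ldisc state; not the kernel's struct hci_uart. */
struct model_uart {
	bool proto_open;	/* protocol private data is valid */
	bool registered;	/* hci core can still call into the driver */
};

/* Setup order as described above: open the protocol, then register. */
void model_set_proto(struct model_uart *hu)
{
	hu->proto_open = true;
	hu->registered = true;
}

/* Stands in for hci_unregister_dev(): no further calls from the core. */
void model_unregister_dev(struct model_uart *hu)
{
	hu->registered = false;
}

/* Stands in for hu->proto->close(hu): protocol data goes away. */
void model_proto_close(struct model_uart *hu)
{
	hu->proto_open = false;
}

/*
 * A send from the core is unsafe only while the device is still
 * registered but the protocol has already been closed.
 */
bool model_send_is_safe(const struct model_uart *hu)
{
	return !hu->registered || hu->proto_open;
}
```

Tearing down in the reverse order of setup (unregister, then close) keeps
model_send_is_safe() true at every step, whereas closing the protocol first
opens a window in which it is false.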

We can apply this to stable even without the "destruct"-fixes from me
as hu->proto->$cb$() doesn't care whether hdev is valid or not. I
don't think the destruct-fixes are important enough to send them to
stable.

Reviewed-by: David Herrmann <dh.herrmann@googlemail.com>

Regards
David

> --
> 1.7.8.4

* Re: [PATCH 1/2] bluetooth: hci_ldisc: fix NULL-pointer dereference on tty_close
  2012-03-09 13:04         ` Johan Hovold
@ 2012-03-09 13:52           ` David Herrmann
  2012-03-09 14:40             ` Johan Hovold
  0 siblings, 1 reply; 21+ messages in thread
From: David Herrmann @ 2012-03-09 13:52 UTC (permalink / raw)
  To: Johan Hovold
  Cc: Marcel Holtmann, Gustavo F. Padovan, David S. Miller,
	linux-bluetooth, linux-kernel, netdev, stable

On Fri, Mar 9, 2012 at 2:04 PM, Johan Hovold <jhovold@gmail.com> wrote:
> On Thu, Mar 08, 2012 at 09:45:22AM -0800, Marcel Holtmann wrote:
>> Hi Johan,
>>
>> > > > Do not close protocol driver until device has been unregistered.
>> > > >
>> > > > This fixes a race between tty_close and hci_dev_open which can result in
>> > > > a NULL-pointer dereference.
>> > > >
>> > > > The line discipline closes the protocol driver while we may still have
>> > > > hci_dev_open sleeping on the req_lock mutex resulting in a NULL-pointer
>> > > > dereference when lock is acquired and hci_init_req called.
>> >
>> > [...]
>> >
>> > > what kernel version is this against? Our changes in bluetooth-next fixed
>> > > some of the destruct handling.
>> >
>> > This is against the latest rc as it needs to be fixed in 3.3, but I
>> > missed a dependency to bluetooth-next as you point out below.
>> >
>> > > Also hci_unregister_dev should be calling the destruct handler and thus
>> > > your change is now accessing hu but it got freed already.
>> >
>> > You're right, my patch depends on 010666a126fc ("Bluetooth: Make
>> > hci-destruct callback optional") and 797fe796c4 ("Bluetooth: uart-ldisc:
>> > Fix memory leak and remove destruct cb") from bluetooth-next.
>> >
>> > But since the latter one fixes a memory leak it should have been marked
>> > for stable as well as pushed to Linus for 3.3, right?
>>
>> we need to look into this and propose patches for -stable. Is your
>> problem still present with bluetooth-next or not?
>
> Yes, both races are present in bluetooth-next of today (b8622cbd58f34)
> and only take an additional manual step to trigger (as the core no
> longer tries to open the device twice automatically).
>
> My two patches on top of either the two patches by David Herrmann
> mentioned above or the following minimal fix of the same memory leak
> would be sufficient to fix both races in 3.3:
>
> diff --git a/drivers/bluetooth/hci_ldisc.c b/drivers/bluetooth/hci_ldisc.c
> index 0711448..97c5faa 100644
> --- a/drivers/bluetooth/hci_ldisc.c
> +++ b/drivers/bluetooth/hci_ldisc.c
> @@ -237,7 +237,6 @@ static void hci_uart_destruct(struct hci_dev *hdev)
>                return;
>
>        BT_DBG("%s", hdev->name);
> -       kfree(hdev->driver_data);
>  }
>
>  /* ------ LDISC part ------ */
> @@ -316,6 +315,7 @@ static void hci_uart_tty_close(struct tty_struct *tty)
>                                hci_free_dev(hdev);
>                        }
>                }
> +               kfree(hu);
>        }
>  }

The "destruct"-callback was broken in many ways but working around it
without removing it seems wrong. This memory-leak occurs only if a
tty-device uses the uart-ldisc without a protocol bound to it.
Therefore, I didn't consider it important enough for stable. However,
if you want to fix this, leave the kfree() inside the destruct
callback but add another kfree() to hci_uart_tty_close() in an
"else"-clause like this:

if (test_and_clear_bit(...)) {
} else {
+   kfree(...);
}

This will still keep the bogus ref-counts inside hci_dev with the
destruct() callback but will also free the ldisc if no protocol is
set.
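
As a rough userspace model of who frees what under this suggestion (all
model_* names are hypothetical, and the destruct callback is called directly
here rather than through the hci device's ref-counting):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct hci_uart. */
struct model_uart {
	bool proto_set;		/* models HCI_UART_PROTO_SET */
};

int model_frees;	/* counts releases so both paths can be compared */

void model_free_uart(struct model_uart *hu)
{
	free(hu);
	model_frees++;
}

/* Models the destruct callback, which keeps its kfree(). */
void model_destruct(struct model_uart *hu)
{
	model_free_uart(hu);
}

/* Models the tty close path with the suggested "else" branch added. */
void model_tty_close(struct model_uart *hu)
{
	if (hu->proto_set) {
		hu->proto_set = false;
		model_destruct(hu);	/* protocol was set: destruct frees */
	} else {
		model_free_uart(hu);	/* no protocol: free here instead */
	}
}

/* Allocates the ldisc state, optionally with a protocol already bound. */
struct model_uart *model_tty_open(bool with_proto)
{
	struct model_uart *hu = calloc(1, sizeof(*hu));

	if (hu)
		hu->proto_set = with_proto;
	return hu;
}
```

In this model each ldisc instance is freed exactly once, whether or not a
protocol was ever bound to it.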

Regards
David

>
> Thanks,
> Johan

* Re: [PATCH 2/2 v2] bluetooth: hci_core: fix NULL-pointer dereference at unregister
  2012-03-09 12:53           ` [PATCH 2/2 v2] " Johan Hovold
@ 2012-03-09 14:04             ` David Herrmann
       [not found]               ` <CANq1E4Rt0ctZ5cpXipJE--YmkR4OjKBXLBQkeTKWP3+Q-q37Yw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 21+ messages in thread
From: David Herrmann @ 2012-03-09 14:04 UTC (permalink / raw)
  To: Johan Hovold
  Cc: Marcel Holtmann, Gustavo F. Padovan, David S. Miller,
	linux-bluetooth, linux-kernel, netdev, stable

Hi Johan

On Fri, Mar 9, 2012 at 1:53 PM, Johan Hovold <jhovold@gmail.com> wrote:
> Make sure hci_dev_open returns immediately if hci_dev_unregister has
> been called.
>
> This fixes a race between hci_dev_open and hci_dev_unregister which can
> lead to a NULL-pointer dereference.
>
> Bug is 100% reproducible using hciattach and a disconnected serial port:
>
> 0. # hciattach -n /dev/ttyO1 any noflow
>
> 1. hci_dev_open called from hci_power_on grabs req lock
> 2. hci_init_req executes but device fails to initialise (times out
>   eventually)
> 3. hci_dev_open is called from hci_sock_ioctl and sleeps on req lock
> 4. hci_uart_tty_close calls hci_dev_unregister and sleeps on req lock in
>   hci_dev_do_close
> 5. hci_dev_open (1) releases req lock
> 6. hci_dev_do_close grabs req lock and returns as device is not up
> 7. hci_dev_unregister sleeps in destroy_workqueue
> 8. hci_dev_open (3) grabs req lock, calls hci_init_req and eventually sleeps
> 9. hci_dev_unregister finishes, while hci_dev_open is still running...
>
> [   79.627136] INFO: trying to register non-static key.
> [   79.632354] the code is fine but needs lockdep annotation.
> [   79.638122] turning off the locking correctness validator.
> [   79.643920] [<c00188bc>] (unwind_backtrace+0x0/0xf8) from [<c00729c4>] (__lock_acquire+0x1590/0x1ab0)
> [   79.653594] [<c00729c4>] (__lock_acquire+0x1590/0x1ab0) from [<c00733f8>] (lock_acquire+0x9c/0x128)
> [   79.663085] [<c00733f8>] (lock_acquire+0x9c/0x128) from [<c0040a88>] (run_timer_softirq+0x150/0x3ac)
> [   79.672668] [<c0040a88>] (run_timer_softirq+0x150/0x3ac) from [<c003a3b8>] (__do_softirq+0xd4/0x22c)
> [   79.682281] [<c003a3b8>] (__do_softirq+0xd4/0x22c) from [<c003a924>] (irq_exit+0x8c/0x94)
> [   79.690856] [<c003a924>] (irq_exit+0x8c/0x94) from [<c0013a50>] (handle_IRQ+0x34/0x84)
> [   79.699157] [<c0013a50>] (handle_IRQ+0x34/0x84) from [<c0008530>] (omap3_intc_handle_irq+0x48/0x4c)
> [   79.708648] [<c0008530>] (omap3_intc_handle_irq+0x48/0x4c) from [<c037499c>] (__irq_usr+0x3c/0x60)
> [   79.718048] Exception stack(0xcf281fb0 to 0xcf281ff8)
> [   79.723358] 1fa0:                                     0001e6a0 be8dab00 0001e698 00036698
> [   79.731933] 1fc0: 0002df98 0002df38 0000001f 00000000 b6f234d0 00000000 00000004 00000000
> [   79.740509] 1fe0: 0001e6f8 be8d6aa0 be8dac50 0000aab8 80000010 ffffffff
> [   79.747497] Unable to handle kernel NULL pointer dereference at virtual address 00000000
> [   79.756011] pgd = cf3b4000
> [   79.758850] [00000000] *pgd=8f0c7831, *pte=00000000, *ppte=00000000
> [   79.765502] Internal error: Oops: 80000007 [#1]
> [   79.770294] Modules linked in:
> [   79.773529] CPU: 0    Tainted: G        W     (3.3.0-rc6-00002-gb5d5c87 #421)
> [   79.781066] PC is at 0x0
> [   79.783721] LR is at run_timer_softirq+0x16c/0x3ac
> [   79.788787] pc : [<00000000>]    lr : [<c0040aa4>]    psr: 60000113
> [   79.788787] sp : cf281ee0  ip : 00000000  fp : cf280000
> [   79.800903] r10: 00000004  r9 : 00000100  r8 : b6f234d0
> [   79.806427] r7 : c0519c28  r6 : cf093488  r5 : c0561a00  r4 : 00000000
> [   79.813323] r3 : 00000000  r2 : c054eee0  r1 : 00000001  r0 : 00000000
> [   79.820190] Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
> [   79.827728] Control: 10c5387d  Table: 8f3b4019  DAC: 00000015
> [   79.833801] Process gpsd (pid: 1265, stack limit = 0xcf2802e8)
> [   79.839965] Stack: (0xcf281ee0 to 0xcf282000)
> [   79.844573] 1ee0: 00000002 00000000 c0040a24 00000000 00000002 cf281f08 00200200 00000000
> [   79.853210] 1f00: 00000000 cf281f18 cf281f08 00000000 00000000 00000000 cf281f18 cf281f18
> [   79.861816] 1f20: 00000000 00000001 c056184c 00000000 00000001 b6f234d0 c0561848 00000004
> [   79.870452] 1f40: cf280000 c003a3b8 c051e79c 00000001 00000000 00000100 3fa9e7b8 0000000a
> [   79.879089] 1f60: 00000025 cf280000 00000025 00000000 00000000 b6f234d0 00000000 00000004
> [   79.887756] 1f80: 00000000 c003a924 c053ad38 c0013a50 fa200000 cf281fb0 ffffffff c0008530
> [   79.896362] 1fa0: 0001e6a0 0000aab8 80000010 c037499c 0001e6a0 be8dab00 0001e698 00036698
> [   79.904998] 1fc0: 0002df98 0002df38 0000001f 00000000 b6f234d0 00000000 00000004 00000000
> [   79.913665] 1fe0: 0001e6f8 be8d6aa0 be8dac50 0000aab8 80000010 ffffffff 00fbf700 04ffff00
> [   79.922302] [<c0040aa4>] (run_timer_softirq+0x16c/0x3ac) from [<c003a3b8>] (__do_softirq+0xd4/0x22c)
> [   79.931945] [<c003a3b8>] (__do_softirq+0xd4/0x22c) from [<c003a924>] (irq_exit+0x8c/0x94)
> [   79.940582] [<c003a924>] (irq_exit+0x8c/0x94) from [<c0013a50>] (handle_IRQ+0x34/0x84)
> [   79.948913] [<c0013a50>] (handle_IRQ+0x34/0x84) from [<c0008530>] (omap3_intc_handle_irq+0x48/0x4c)
> [   79.958404] [<c0008530>] (omap3_intc_handle_irq+0x48/0x4c) from [<c037499c>] (__irq_usr+0x3c/0x60)
> [   79.967773] Exception stack(0xcf281fb0 to 0xcf281ff8)
> [   79.973083] 1fa0:                                     0001e6a0 be8dab00 0001e698 00036698
> [   79.981658] 1fc0: 0002df98 0002df38 0000001f 00000000 b6f234d0 00000000 00000004 00000000
> [   79.990234] 1fe0: 0001e6f8 be8d6aa0 be8dac50 0000aab8 80000010 ffffffff
> [   79.997161] Code: bad PC value
> [   80.000396] ---[ end trace 6f6739840475f9ee ]---
> [   80.005279] Kernel panic - not syncing: Fatal exception in interrupt
>
> Cc: stable <stable@vger.kernel.org>
> Signed-off-by: Johan Hovold <jhovold@gmail.com>
> ---
>
> v2: use hdev->dev_flags for internal unregister flag
>
>
>  include/net/bluetooth/hci.h |    2 ++
>  net/bluetooth/hci_core.c    |    7 +++++++
>  2 files changed, 9 insertions(+), 0 deletions(-)
>
> diff --git a/include/net/bluetooth/hci.h b/include/net/bluetooth/hci.h
> index 00596e8..e8879b9 100644
> --- a/include/net/bluetooth/hci.h
> +++ b/include/net/bluetooth/hci.h
> @@ -93,6 +93,8 @@ enum {
>  * states from the controller.
>  */
>  enum {
> +       HCI_UNREGISTER,
> +
>        HCI_LE_SCAN,
>  };
>
> diff --git a/net/bluetooth/hci_core.c b/net/bluetooth/hci_core.c
> index d6448f0..22b6781 100644
> --- a/net/bluetooth/hci_core.c
> +++ b/net/bluetooth/hci_core.c
> @@ -525,6 +525,11 @@ int hci_dev_open(__u16 dev)
>
>        hci_req_lock(hdev);
>
> +       if (test_bit(HCI_UNREGISTER, &hdev->dev_flags)) {
> +               ret = -ENODEV;
> +               goto done;
> +       }
> +

Isn't it enough to check for HCI_RUNNING here? We obviously have a
race here as we take the device with hci_dev_get(), then sleep and
then we do not check whether the device is still alive. However,
drivers are required to reset HCI_RUNNING before calling
hci_unregister_dev() (which is bogus anyway, but it's the way we
handled it in the past) therefore it should be enough for us to check
for HCI_RUNNING.

Regards
David

>        if (hdev->rfkill && rfkill_blocked(hdev->rfkill)) {
>                ret = -ERFKILL;
>                goto done;
> @@ -1577,6 +1582,8 @@ void hci_unregister_dev(struct hci_dev *hdev)
>
>        BT_DBG("%p name %s bus %d", hdev, hdev->name, hdev->bus);
>
> +       set_bit(HCI_UNREGISTER, &hdev->dev_flags);
> +
>        write_lock(&hci_dev_list_lock);
>        list_del(&hdev->list);
>        write_unlock(&hci_dev_list_lock);
> --
> 1.7.8.4
>


* Re: [PATCH 1/2] bluetooth: hci_ldisc: fix NULL-pointer dereference on tty_close
  2012-03-09 13:44   ` David Herrmann
@ 2012-03-09 14:29     ` Johan Hovold
  2012-03-09 14:35       ` David Herrmann
  0 siblings, 1 reply; 21+ messages in thread
From: Johan Hovold @ 2012-03-09 14:29 UTC (permalink / raw)
  To: David Herrmann
  Cc: Marcel Holtmann, Gustavo F. Padovan, David S. Miller,
	linux-bluetooth, linux-kernel, netdev, stable

Hi David,

On Fri, Mar 09, 2012 at 02:44:30PM +0100, David Herrmann wrote:
> On Wed, Mar 7, 2012 at 5:01 PM, Johan Hovold <jhovold@gmail.com> wrote:
> > Do not close protocol driver until device has been unregistered.
> >
> > This fixes a race between tty_close and hci_dev_open which can result in
> > a NULL-pointer dereference.
> >
> > The line discipline closes the protocol driver while we may still have
> > hci_dev_open sleeping on the req_lock mutex resulting in a NULL-pointer
> > dereference when lock is acquired and hci_init_req called.

[...]

> > diff --git a/drivers/bluetooth/hci_ldisc.c b/drivers/bluetooth/hci_ldisc.c
> > index 0711448..6946081 100644
> > --- a/drivers/bluetooth/hci_ldisc.c
> > +++ b/drivers/bluetooth/hci_ldisc.c
> > @@ -310,11 +310,11 @@ static void hci_uart_tty_close(struct tty_struct *tty)
> >                        hci_uart_close(hdev);
> >
> >                if (test_and_clear_bit(HCI_UART_PROTO_SET, &hu->flags)) {
> > -                       hu->proto->close(hu);
> >                        if (hdev) {
> >                                hci_unregister_dev(hdev);
> >                                hci_free_dev(hdev);
> >                        }
> > +                       hu->proto->close(hu);
> >                }
> >        }
> >  }
> 
> I can confirm this. hci_uart_set_proto() opens the proto before it
> registers the hci device. Hence, we should also unregister the hci
> device before closing the proto. I also looked whether this introduces
> other race conditions but no proto-callback can be called here as they
> are all protected by the tty-layer which synchronizes all
> tty-callbacks. Therefore, I think this is the correct fix.
> 
> We can apply this to stable even without the "destruct"-fixes from me
> as hu->proto->$cb$() doesn't care whether hdev is valid or not. I
> don't think the destruct-fixes are important enough to send them to
> stable.

Unfortunately hu is not valid once hci_unregister_dev returns as it will
call the destruct callback. So my patch depends on changing this
behaviour first. (I could also store a pointer to the protocol before
calling unregister in my patch.)
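
For illustration only, the ordering argument can be modelled in userspace. This is a rough sketch, not kernel code: it assumes the destruct callback no longer frees hu (the dependency mentioned above), and struct model_uart, record() and the model_* functions are hypothetical stand-ins rather than anything from hci_ldisc.c.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; none of these are kernel types. */
enum event { EV_PROTO_OPEN, EV_REGISTER, EV_UNREGISTER, EV_PROTO_CLOSE };

struct model_uart {
	int registered;      /* device is visible to the core */
	int proto_open;      /* protocol driver state is valid */
	enum event log[8];   /* order in which the steps ran */
	size_t nlog;
};

static void record(struct model_uart *hu, enum event ev)
{
	assert(hu->nlog < sizeof(hu->log) / sizeof(hu->log[0]));
	hu->log[hu->nlog++] = ev;
}

/* Setup: open the protocol first, then make the device visible. */
static void model_set_proto(struct model_uart *hu)
{
	hu->proto_open = 1;
	record(hu, EV_PROTO_OPEN);
	hu->registered = 1;
	record(hu, EV_REGISTER);
}

/* Teardown mirrors setup: hide the device, then close the protocol. */
static void model_tty_close(struct model_uart *hu)
{
	if (hu->registered) {
		hu->registered = 0;
		record(hu, EV_UNREGISTER);
	}
	hu->proto_open = 0;
	record(hu, EV_PROTO_CLOSE);
}

/* Invariant: a registered device always has an open protocol. */
static int model_consistent(const struct model_uart *hu)
{
	return !hu->registered || hu->proto_open;
}
```

With the steps in this order, model_consistent() holds after every step. Closing the protocol before unregistering would leave a window where the device is still registered but its protocol state is already gone.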

Secondly, I must disagree with you regarding whether the memory leak you
found is critical enough to be added to the stable trees. We're leaking
kernel memory in a deterministic and easily triggered way which could be
exploited by a malicious user.

> Reviewed-by: David Herrmann <dh.herrmann@googlemail.com>

Thanks,
Johan


* Re: [PATCH 1/2] bluetooth: hci_ldisc: fix NULL-pointer dereference on tty_close
  2012-03-09 14:29     ` Johan Hovold
@ 2012-03-09 14:35       ` David Herrmann
  2012-03-09 15:15         ` Johan Hovold
  0 siblings, 1 reply; 21+ messages in thread
From: David Herrmann @ 2012-03-09 14:35 UTC (permalink / raw)
  To: Johan Hovold
  Cc: Marcel Holtmann, Gustavo F. Padovan, David S. Miller,
	linux-bluetooth, linux-kernel, netdev, stable

On Fri, Mar 9, 2012 at 3:29 PM, Johan Hovold <jhovold@gmail.com> wrote:
> Hi David,
>
> On Fri, Mar 09, 2012 at 02:44:30PM +0100, David Herrmann wrote:
>> On Wed, Mar 7, 2012 at 5:01 PM, Johan Hovold <jhovold@gmail.com> wrote:
>> > Do not close protocol driver until device has been unregistered.
>> >
>> > This fixes a race between tty_close and hci_dev_open which can result in
>> > a NULL-pointer dereference.
>> >
>> > The line discipline closes the protocol driver while we may still have
>> > hci_dev_open sleeping on the req_lock mutex resulting in a NULL-pointer
>> > dereference when lock is acquired and hci_init_req called.
>
> [...]
>
>> > diff --git a/drivers/bluetooth/hci_ldisc.c b/drivers/bluetooth/hci_ldisc.c
>> > index 0711448..6946081 100644
>> > --- a/drivers/bluetooth/hci_ldisc.c
>> > +++ b/drivers/bluetooth/hci_ldisc.c
>> > @@ -310,11 +310,11 @@ static void hci_uart_tty_close(struct tty_struct *tty)
>> >                        hci_uart_close(hdev);
>> >
>> >                if (test_and_clear_bit(HCI_UART_PROTO_SET, &hu->flags)) {
>> > -                       hu->proto->close(hu);
>> >                        if (hdev) {
>> >                                hci_unregister_dev(hdev);
>> >                                hci_free_dev(hdev);
>> >                        }
>> > +                       hu->proto->close(hu);
>> >                }
>> >        }
>> >  }
>>
>> I can confirm this. hci_uart_set_proto() opens the proto before it
>> registers the hci device. Hence, we should also unregister the hci
>> device before closing the proto. I also looked whether this introduces
>> other race conditions but no proto-callback can be called here as they
>> are all protected by the tty-layer which synchronizes all
>> tty-callbacks. Therefore, I think this is the correct fix.
>>
>> We can apply this to stable even without the "destruct"-fixes from me
>> as hu->proto->$cb$() doesn't care whether hdev is valid or not. I
>> don't think the destruct-fixes are important enough to send them to
>> stable.
>
> Unfortunately hu is not valid once hci_unregister_dev returns as it will
> call the destruct callback. So my patch depends on changing this
> behaviour first. (I could also store a pointer to the protocol before
> calling unregister in my patch.)

Right, I missed that, sorry.

> Secondly, I must disagree with you regarding whether the memory leak you
> found is critical enough to be added to the stable trees. We're leaking
> kernel memory in a deterministic and easily triggered way which could be
> exploited by a malicious user.

Are you planning on sending a patch to stable-ML or should I do so? How about
my proposal in the other mail? Could you include this fix when resending this?

>> Reviewed-by: David Herrmann <dh.herrmann@googlemail.com>
>
> Thanks,
> Johan

Regards
David


* Re: [PATCH 1/2] bluetooth: hci_ldisc: fix NULL-pointer dereference on tty_close
  2012-03-09 13:52           ` David Herrmann
@ 2012-03-09 14:40             ` Johan Hovold
  2012-03-09 15:02               ` David Herrmann
  0 siblings, 1 reply; 21+ messages in thread
From: Johan Hovold @ 2012-03-09 14:40 UTC (permalink / raw)
  To: David Herrmann
  Cc: Marcel Holtmann, Gustavo F. Padovan, David S. Miller,
	linux-bluetooth, linux-kernel, netdev, stable

On Fri, Mar 09, 2012 at 02:52:00PM +0100, David Herrmann wrote:
> On Fri, Mar 9, 2012 at 2:04 PM, Johan Hovold <jhovold@gmail.com> wrote:
> > On Thu, Mar 08, 2012 at 09:45:22AM -0800, Marcel Holtmann wrote:
> >> > > > Do not close protocol driver until device has been unregistered.
> >> > > >
> >> > > > This fixes a race between tty_close and hci_dev_open which can result in
> >> > > > a NULL-pointer dereference.
> >> > > >
> >> > > > The line discipline closes the protocol driver while we may still have
> >> > > > hci_dev_open sleeping on the req_lock mutex resulting in a NULL-pointer
> >> > > > dereference when lock is acquired and hci_init_req called.
> >> >
> >> > [...]
> >> >
> >> > > what kernel version is this against? Our changes in bluetooth-next fixed
> >> > > some of the destruct handling.
> >> >
> >> > This is against the latest rc as it needs to be fixed in 3.3, but I
> >> > missed a dependency to bluetooth-next as you point out below.
> >> >
> >> > > Also hci_unregister_dev should be calling the destruct handler and thus
> >> > > your change is now accessing hu but it got freed already.
> >> >
> >> > You're right, my patch depends on 010666a126fc ("Bluetooth: Make
> >> > hci-destruct callback optional") and 797fe796c4 ("Bluetooth: uart-ldisc:
> >> > Fix memory leak and remove destruct cb") from bluetooth-next.
> >> >
> >> > But since the latter one fixes a memory leak it should have been marked
> >> > for stable as well as pushed to Linus for 3.3, right?
> >>
> >> we need to look into this and propose patches for -stable. Is your
> >> problem still present with bluetooth-next or not?
> >
> > Yes, both races are present in bluetooth-next of today (b8622cbd58f34)
> > and only takes an additional manual step to trigger (as the core no
> > longer tries to open the device twice automatically).
> >
> > My two patches on top of either the two patches by David Herrmann
> > mentioned above or the following minimal fix of the same memory leak
> > would be sufficient to fix both races in 3.3:
> >
> > diff --git a/drivers/bluetooth/hci_ldisc.c b/drivers/bluetooth/hci_ldisc.c
> > index 0711448..97c5faa 100644
> > --- a/drivers/bluetooth/hci_ldisc.c
> > +++ b/drivers/bluetooth/hci_ldisc.c
> > @@ -237,7 +237,6 @@ static void hci_uart_destruct(struct hci_dev *hdev)
> >                return;
> >
> >        BT_DBG("%s", hdev->name);
> > -       kfree(hdev->driver_data);
> >  }
> >
> >  /* ------ LDISC part ------ */
> > @@ -316,6 +315,7 @@ static void hci_uart_tty_close(struct tty_struct *tty)
> >                                hci_free_dev(hdev);
> >                        }
> >                }
> > +               kfree(hu);
> >        }
> >  }
> 
> The "destruct"-callback was broken in many ways but working around it
> without removing it seems wrong.

The reason for not doing so would be to keep the fixes minimal and thus
more appropriate for the stable trees.

Furthermore, according to your own patch description "Several drivers
already provide an empty callback" so I didn't consider it to be
a problem.

> This memory-leak occurs only if a
> tty-device uses the uart-ldisc without a protocol bound to it.
> Therefore, I didn't consider it important enough for stable.

See my answer to your previous mail regarding this.

> However,
> if you want to fix this, leave the kfree() inside the destruct
> callback but add another kfree() into the hci_uart_close() in an
> "else"-clause like this:
> 
> if (test_and_clear_bit(...)) {
> } else {
> +   kfree(...);
> }

You really don't want to free the hci_uart in its own close method...

The hci_uart is allocated in tty_open and should be freed in tty_close.
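
As a rough userspace sketch of that ownership rule (not the actual driver code): the object is allocated in exactly one place and freed in exactly one place, whether or not a protocol was ever set. struct model_uart, live_allocations and the model_* functions are hypothetical names used only for this illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct hci_uart; only what the sketch needs. */
struct model_uart {
	int proto_set;   /* models the HCI_UART_PROTO_SET bit */
};

static int live_allocations;   /* model_uart objects not yet freed */

/* Models tty_open: the only place the object is allocated. */
static struct model_uart *model_tty_open(void)
{
	struct model_uart *hu = calloc(1, sizeof(*hu));

	if (hu)
		live_allocations++;
	return hu;
}

/* Models tty_close: the only place the object is freed. */
static void model_tty_close(struct model_uart *hu)
{
	assert(hu != NULL);

	if (hu->proto_set) {
		/* protocol and device teardown would happen here */
		hu->proto_set = 0;
	}

	/* Freed unconditionally, whether or not a protocol was ever set. */
	free(hu);
	live_allocations--;
}
```

Because the free does not depend on proto_set, the open-then-close path without a protocol no longer leaks, and no other callback has to know when the object goes away.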

> This will still keep the bogus ref-counts inside hci_dev with the
> destruct() callback but will also free the ldisc if no protocol is
> set.

Thanks,
Johan


* Re: [PATCH 2/2 v2] bluetooth: hci_core: fix NULL-pointer dereference at unregister
       [not found]               ` <CANq1E4Rt0ctZ5cpXipJE--YmkR4OjKBXLBQkeTKWP3+Q-q37Yw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2012-03-09 14:48                 ` Johan Hovold
  0 siblings, 0 replies; 21+ messages in thread
From: Johan Hovold @ 2012-03-09 14:48 UTC (permalink / raw)
  To: David Herrmann
  Cc: Marcel Holtmann, Gustavo F. Padovan, David S. Miller,
	linux-bluetooth, linux-kernel, netdev, stable

Hi David,

On Fri, Mar 09, 2012 at 03:04:11PM +0100, David Herrmann wrote:
> On Fri, Mar 9, 2012 at 1:53 PM, Johan Hovold <jhovold@gmail.com> wrote:
> > Make sure hci_dev_open returns immediately if hci_dev_unregister has
> > been called.
> >
> > This fixes a race between hci_dev_open and hci_dev_unregister which can
> > lead to a NULL-pointer dereference.
> >
> > Bug is 100% reproducible using hciattach and a disconnected serial port:
> >
> > 0. # hciattach -n /dev/ttyO1 any noflow
> >
> > 1. hci_dev_open called from hci_power_on grabs req lock
> > 2. hci_init_req executes but device fails to initialise (times out
> >   eventually)
> > 3. hci_dev_open is called from hci_sock_ioctl and sleeps on req lock
> > 4. hci_uart_tty_close calls hci_dev_unregister and sleeps on req lock in
> >   hci_dev_do_close
> > 5. hci_dev_open (1) releases req lock
> > 6. hci_dev_do_close grabs req lock and returns as device is not up
> > 7. hci_dev_unregister sleeps in destroy_workqueue
> > 8. hci_dev_open (3) grabs req lock, calls hci_init_req and eventually sleeps
> > 9. hci_dev_unregister finishes, while hci_dev_open is still running...

[...]

> > diff --git a/include/net/bluetooth/hci.h b/include/net/bluetooth/hci.h
> > index 00596e8..e8879b9 100644
> > --- a/include/net/bluetooth/hci.h
> > +++ b/include/net/bluetooth/hci.h
> > @@ -93,6 +93,8 @@ enum {
> >  * states from the controller.
> >  */
> >  enum {
> > +       HCI_UNREGISTER,
> > +
> >        HCI_LE_SCAN,
> >  };
> >
> > diff --git a/net/bluetooth/hci_core.c b/net/bluetooth/hci_core.c
> > index d6448f0..22b6781 100644
> > --- a/net/bluetooth/hci_core.c
> > +++ b/net/bluetooth/hci_core.c
> > @@ -525,6 +525,11 @@ int hci_dev_open(__u16 dev)
> >
> >        hci_req_lock(hdev);
> >
> > +       if (test_bit(HCI_UNREGISTER, &hdev->dev_flags)) {
> > +               ret = -ENODEV;
> > +               goto done;
> > +       }
> > +
> 
> Isn't it enough to check for HCI_RUNNING here? We obviously have a
> race here as we take the device with hci_dev_get(), then sleep and
> then we do not check whether the device is still alive. However,
> drivers are required to reset HCI_RUNNING before calling
> hci_unregister_dev() (which is bogus anyway, but it's the way we
> handled it in the past) therefore it should be enough for us to check
> for HCI_RUNNING.

I'm afraid this won't work as hci_dev_open is responsible for setting
HCI_RUNNING in the first place (set in hdev->open(hdev) called from
hci_dev_open).
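
To make the difference concrete, here is a simplified userspace model of the two checks. It is only a sketch under stated assumptions: struct model_dev, the MODEL_* bits and the model_* functions are hypothetical, and the real code uses test_bit()/set_bit() under the req lock rather than plain integer flags.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical userspace model; only the bit names echo the thread. */
enum { MODEL_RUNNING = 1 << 0, MODEL_UNREGISTER = 1 << 1 };

struct model_dev {
	unsigned int flags;
};

/* Models hci_unregister_dev marking the device as going away. */
static void model_unregister(struct model_dev *d)
{
	assert(d != 0);
	d->flags |= MODEL_UNREGISTER;
}

/* Variant from the patch: refuse to open once unregistration has begun. */
static int model_open_check_unregister(struct model_dev *d)
{
	if (d->flags & MODEL_UNREGISTER)
		return -ENODEV;
	d->flags |= MODEL_RUNNING;	/* set by open itself */
	return 0;
}

/*
 * Variant that looks only at RUNNING: on entry the bit is clear for a
 * fresh device and for a dying one alike, so the check cannot tell
 * them apart and the open proceeds in both cases.
 */
static int model_open_check_running(struct model_dev *d)
{
	if (d->flags & MODEL_RUNNING)
		return -EALREADY;
	d->flags |= MODEL_RUNNING;
	return 0;
}
```

In this model a device that has already been marked by model_unregister() is rejected by the first variant with -ENODEV, while the second variant still opens it, since RUNNING is something open sets rather than something it can rely on beforehand.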

Thanks,
Johan


* Re: [PATCH 1/2] bluetooth: hci_ldisc: fix NULL-pointer dereference on tty_close
  2012-03-09 14:40             ` Johan Hovold
@ 2012-03-09 15:02               ` David Herrmann
       [not found]                 ` <CANq1E4TcUKKXetitjWJZgP9550gnB43rncnAcwwdz_6HpZf_Ug-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 21+ messages in thread
From: David Herrmann @ 2012-03-09 15:02 UTC (permalink / raw)
  To: Johan Hovold
  Cc: Marcel Holtmann, Gustavo F. Padovan, David S. Miller,
	linux-bluetooth, linux-kernel, netdev, stable

On Fri, Mar 9, 2012 at 3:40 PM, Johan Hovold <jhovold@gmail.com> wrote:
> On Fri, Mar 09, 2012 at 02:52:00PM +0100, David Herrmann wrote:
>> On Fri, Mar 9, 2012 at 2:04 PM, Johan Hovold <jhovold@gmail.com> wrote:
>> > On Thu, Mar 08, 2012 at 09:45:22AM -0800, Marcel Holtmann wrote:
>> >> > > > Do not close protocol driver until device has been unregistered.
>> >> > > >
>> >> > > > This fixes a race between tty_close and hci_dev_open which can result in
>> >> > > > a NULL-pointer dereference.
>> >> > > >
>> >> > > > The line discipline closes the protocol driver while we may still have
>> >> > > > hci_dev_open sleeping on the req_lock mutex resulting in a NULL-pointer
>> >> > > > dereference when lock is acquired and hci_init_req called.
>> >> >
>> >> > [...]
>> >> >
>> >> > > what kernel version is this against? Our changes in bluetooth-next fixed
>> >> > > some of the destruct handling.
>> >> >
>> >> > This is against the latest rc as it needs to be fixed in 3.3, but I
>> >> > missed a dependency to bluetooth-next as you point out below.
>> >> >
>> >> > > Also hci_unregister_dev should be calling the destruct handler and thus
>> >> > > your change is now accessing hu but it got freed already.
>> >> >
>> >> > You're right, my patch depends on 010666a126fc ("Bluetooth: Make
>> >> > hci-destruct callback optional") and 797fe796c4 ("Bluetooth: uart-ldisc:
>> >> > Fix memory leak and remove destruct cb") from bluetooth-next.
>> >> >
>> >> > But since the latter one fixes a memory leak it should have been marked
>> >> > for stable as well as pushed to Linus for 3.3, right?
>> >>
>> >> we need to look into this and propose patches for -stable. Is your
>> >> problem still present with bluetooth-next or not?
>> >
>> > Yes, both races are present in bluetooth-next of today (b8622cbd58f34)
>> > and only takes an additional manual step to trigger (as the core no
>> > longer tries to open the device twice automatically).
>> >
>> > My two patches on top of either the two patches by David Herrmann
>> > mentioned above or the following minimal fix of the same memory leak
>> > would be sufficient to fix both races in 3.3:
>> >
>> > diff --git a/drivers/bluetooth/hci_ldisc.c b/drivers/bluetooth/hci_ldisc.c
>> > index 0711448..97c5faa 100644
>> > --- a/drivers/bluetooth/hci_ldisc.c
>> > +++ b/drivers/bluetooth/hci_ldisc.c
>> > @@ -237,7 +237,6 @@ static void hci_uart_destruct(struct hci_dev *hdev)
>> >                return;
>> >
>> >        BT_DBG("%s", hdev->name);
>> > -       kfree(hdev->driver_data);
>> >  }
>> >
>> >  /* ------ LDISC part ------ */
>> > @@ -316,6 +315,7 @@ static void hci_uart_tty_close(struct tty_struct *tty)
>> >                                hci_free_dev(hdev);
>> >                        }
>> >                }
>> > +               kfree(hu);
>> >        }
>> >  }
>>
>> The "destruct"-callback was broken in many ways but working around it
>> without removing it seems wrong.
>
> The reason for not doing so would be to keep the fixes minimal and thus
> more appropriate for the stable trees.
>
> Furthermore, according to your own patch description "Several drivers
> already provide an empty callback" so I didn't consider it to be
> a problem.

It's just a proposal, feel free to keep your patch. But please include
a comment in your commit-message that you explicitly avoid using the
destruct-callback as it is, and always was, broken. Otherwise, it looks
wrong seeing such a commit.
Or simply link to the patches that remove the destruct callback in the
-next tree.

>> This memory-leak occurs only if a
>> tty-device uses the uart-ldisc without a protocol bound to it.
>> Therefore, I didn't consider it important enough for stable.
>
> See my answer to your previous mail regarding this.
>
>> However,
>> if you want to fix this, leave the kfree() inside the destruct
>> callback but add another kfree() into the hci_uart_close() in an
>> "else"-clause like this:
>>
>> if (test_and_clear_bit(...)) {
>> } else {
>> +   kfree(...);
>> }
>
> You really don't want to free the hci_uart in its own close method...
>
> The hci_uart is allocated in tty_open and should be freed in tty_close.

Oops, I obviously meant hci_uart_tty_close(), sorry.

>> This will still keep the bogus ref-counts inside hci_dev with the
>> destruct() callback but will also free the ldisc if no protocol is
>> set.
>
> Thanks,
> Johan


* Re: [PATCH 1/2] bluetooth: hci_ldisc: fix NULL-pointer dereference on tty_close
       [not found]                 ` <CANq1E4TcUKKXetitjWJZgP9550gnB43rncnAcwwdz_6HpZf_Ug-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2012-03-09 15:08                   ` Johan Hovold
  0 siblings, 0 replies; 21+ messages in thread
From: Johan Hovold @ 2012-03-09 15:08 UTC (permalink / raw)
  To: David Herrmann
  Cc: Marcel Holtmann, Gustavo F. Padovan, David S. Miller,
	linux-bluetooth, linux-kernel, netdev, stable

On Fri, Mar 09, 2012 at 04:02:21PM +0100, David Herrmann wrote:
> On Fri, Mar 9, 2012 at 3:40 PM, Johan Hovold <jhovold@gmail.com> wrote:
> > On Fri, Mar 09, 2012 at 02:52:00PM +0100, David Herrmann wrote:
> >> On Fri, Mar 9, 2012 at 2:04 PM, Johan Hovold <jhovold@gmail.com> wrote:
> >> > On Thu, Mar 08, 2012 at 09:45:22AM -0800, Marcel Holtmann wrote:
> >> >> > > > Do not close protocol driver until device has been unregistered.
> >> >> > > >
> >> >> > > > This fixes a race between tty_close and hci_dev_open which can result in
> >> >> > > > a NULL-pointer dereference.
> >> >> > > >
> >> >> > > > The line discipline closes the protocol driver while we may still have
> >> >> > > > hci_dev_open sleeping on the req_lock mutex resulting in a NULL-pointer
> >> >> > > > dereference when lock is acquired and hci_init_req called.
> >> >> >
> >> >> > [...]
> >> >> >
> >> >> > > what kernel version is this against? Our changes in bluetooth-next fixed
> >> >> > > some of the destruct handling.
> >> >> >
> >> >> > This is against the latest rc as it needs to be fixed in 3.3, but I
> >> >> > missed a dependency to bluetooth-next as you point out below.
> >> >> >
> >> >> > > Also hci_unregister_dev should be calling the destruct handler and thus
> >> >> > > your change is now accessing hu but it got freed already.
> >> >> >
> >> >> > You're right, my patch depends on 010666a126fc ("Bluetooth: Make
> >> >> > hci-destruct callback optional") and 797fe796c4 ("Bluetooth: uart-ldisc:
> >> >> > Fix memory leak and remove destruct cb") from bluetooth-next.
> >> >> >
> >> >> > But since the latter one fixes a memory leak it should have been marked
> >> >> > for stable as well as pushed to Linus for 3.3, right?
> >> >>
> >> >> we need to look into this and propose patches for -stable. Is your
> >> >> problem still present with bluetooth-next or not?
> >> >
> >> > Yes, both races are present in bluetooth-next of today (b8622cbd58f34)
> >> > and only takes an additional manual step to trigger (as the core no
> >> > longer tries to open the device twice automatically).
> >> >
> >> > My two patches on top of either the two patches by David Herrmann
> >> > mentioned above or the following minimal fix of the same memory leak
> >> > would be sufficient to fix both races in 3.3:
> >> >
> >> > diff --git a/drivers/bluetooth/hci_ldisc.c b/drivers/bluetooth/hci_ldisc.c
> >> > index 0711448..97c5faa 100644
> >> > --- a/drivers/bluetooth/hci_ldisc.c
> >> > +++ b/drivers/bluetooth/hci_ldisc.c
> >> > @@ -237,7 +237,6 @@ static void hci_uart_destruct(struct hci_dev *hdev)
> >> >                return;
> >> >
> >> >        BT_DBG("%s", hdev->name);
> >> > -       kfree(hdev->driver_data);
> >> >  }
> >> >
> >> >  /* ------ LDISC part ------ */
> >> > @@ -316,6 +315,7 @@ static void hci_uart_tty_close(struct tty_struct *tty)
> >> >                                hci_free_dev(hdev);
> >> >                        }
> >> >                }
> >> > +               kfree(hu);
> >> >        }
> >> >  }
> >>
> >> The "destruct"-callback was broken in many ways but working around it
> >> without removing it seems wrong.
> >
> > The reason for not doing so would be to keep the fixes minimal and thus
> > more appropriate for the stable trees.
> >
> > Furthermore, according to your own patch description "Several drivers
> > already provide an empty callback" so I didn't consider it to be
> > a problem.
> 
> It's just a proposal, feel free to keep your patch. But please include
> a comment in your commit-message that you explicitly avoid using the
> destruct-callback as it is, and always was, broken. Otherwise, it looks
> wrong seeing such a commit.

Agreed.

> Or simply link to the patches that remove the destruct callback in the
> -next tree.

Yes, I would definitely mention those patches.

> >> This memory-leak occurs only if a
> >> tty-device uses the uart-ldisc without a protocol bound to it.
> >> Therefore, I didn't consider it important enough for stable.
> >
> > See my answer to your previous mail regarding this.
> >
> >> However,
> >> if you want to fix this, leave the kfree() inside the destruct
> >> callback but add another kfree() into the hci_uart_close() in an
> >> "else"-clause like this:
> >>
> >> if (test_and_clear_bit(...)) {
> >> } else {
> >> +   kfree(...);
> >> }
> >
> > You really don't want to free the hci_uart in its own close method...
> >
> > The hci_uart is allocated in tty_open and should be freed in tty_close.
> 
> Oops, I obviously meant hci_uart_tty_close(), sorry.

Ouch. I should have realised it was a typo, sorry.

Thanks,
Johan


* Re: [PATCH 1/2] bluetooth: hci_ldisc: fix NULL-pointer dereference on tty_close
  2012-03-09 14:35       ` David Herrmann
@ 2012-03-09 15:15         ` Johan Hovold
  0 siblings, 0 replies; 21+ messages in thread
From: Johan Hovold @ 2012-03-09 15:15 UTC (permalink / raw)
  To: David Herrmann
  Cc: Marcel Holtmann, Gustavo F. Padovan, David S. Miller,
	linux-bluetooth, linux-kernel, netdev, stable

On Fri, Mar 09, 2012 at 03:35:46PM +0100, David Herrmann wrote:
> On Fri, Mar 9, 2012 at 3:29 PM, Johan Hovold <jhovold@gmail.com> wrote:
> > On Fri, Mar 09, 2012 at 02:44:30PM +0100, David Herrmann wrote:
> >> On Wed, Mar 7, 2012 at 5:01 PM, Johan Hovold <jhovold@gmail.com> wrote:
> >> > Do not close protocol driver until device has been unregistered.
> >> >
> >> > This fixes a race between tty_close and hci_dev_open which can result in
> >> > a NULL-pointer dereference.
> >> >
> >> > The line discipline closes the protocol driver while we may still have
> >> > hci_dev_open sleeping on the req_lock mutex resulting in a NULL-pointer
> >> > dereference when lock is acquired and hci_init_req called.
> >
> > [...]
> >
> >> > diff --git a/drivers/bluetooth/hci_ldisc.c b/drivers/bluetooth/hci_ldisc.c
> >> > index 0711448..6946081 100644
> >> > --- a/drivers/bluetooth/hci_ldisc.c
> >> > +++ b/drivers/bluetooth/hci_ldisc.c
> >> > @@ -310,11 +310,11 @@ static void hci_uart_tty_close(struct tty_struct *tty)
> >> >                        hci_uart_close(hdev);
> >> >
> >> >                if (test_and_clear_bit(HCI_UART_PROTO_SET, &hu->flags)) {
> >> > -                       hu->proto->close(hu);
> >> >                        if (hdev) {
> >> >                                hci_unregister_dev(hdev);
> >> >                                hci_free_dev(hdev);
> >> >                        }
> >> > +                       hu->proto->close(hu);
> >> >                }
> >> >        }
> >> >  }
> >>
> >> I can confirm this. hci_uart_set_proto() opens the proto before it
> >> registers the hci device. Hence, we should also unregister the hci
> >> device before closing the proto. I also looked whether this introduces
> >> other race conditions but no proto-callback can be called here as they
> >> are all protected by the tty-layer which synchronizes all
> >> tty-callbacks. Therefore, I think this is the correct fix.
> >>
> >> We can apply this to stable even without the "destruct"-fixes from me
> >> as hu->proto->$cb$() doesn't care whether hdev is valid or not. I
> >> don't think the destruct-fixes are important enough to send them to
> >> stable.
> >
> > Unfortunately hu is not valid once hci_unregister_dev returns as it will
> > call the destruct callback. So my patch depends on changing this
> > behaviour first. (I could also store a pointer to the protocol before
> > calling unregister in my patch.)
> 
> Right, I missed that, sorry.
> 
> > Secondly, I must disagree with you regarding whether the memory leak you
> > found is critical enough to be added to the stable trees. We're leaking
> > kernel memory in a deterministic and easily triggered way which could be
> > exploited by a malicious user.
> 
> Are you planning on sending a patch to stable-ML or should I do so? How about
> my proposal in the other mail? Could you include this fix when resending this?

This needs to go in through the bluetooth/networking trees (or their
maintainers at least) so that it gets into 3.3, otherwise stable will
not pick it up for earlier trees.

I'll post a revised series which includes the minimal fix to the memory
leak so that all three patches can go to Linus and hopefully make it in
before 3.3 is out. 

Sounds good?

Thanks,
Johan


end of thread, other threads:[~2012-03-09 15:15 UTC | newest]

Thread overview: 21+ messages
2012-03-07 16:01 [PATCH 0/2] bluetooth: fix NULL-pointer dereferences Johan Hovold
2012-03-07 16:01 ` [PATCH 1/2] bluetooth: hci_ldisc: fix NULL-pointer dereference on tty_close Johan Hovold
2012-03-07 19:33   ` Marcel Holtmann
2012-03-08 11:57     ` Johan Hovold
2012-03-08 17:45       ` Marcel Holtmann
2012-03-09 13:04         ` Johan Hovold
2012-03-09 13:52           ` David Herrmann
2012-03-09 14:40             ` Johan Hovold
2012-03-09 15:02               ` David Herrmann
     [not found]                 ` <CANq1E4TcUKKXetitjWJZgP9550gnB43rncnAcwwdz_6HpZf_Ug-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2012-03-09 15:08                   ` Johan Hovold
2012-03-09 13:44   ` David Herrmann
2012-03-09 14:29     ` Johan Hovold
2012-03-09 14:35       ` David Herrmann
2012-03-09 15:15         ` Johan Hovold
2012-03-07 16:02 ` [PATCH 2/2] bluetooth: hci_core: fix NULL-pointer dereference at unregister Johan Hovold
     [not found]   ` <1331136120-27075-3-git-send-email-jhovold-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2012-03-07 19:29     ` Marcel Holtmann
2012-03-08 11:56       ` Johan Hovold
2012-03-08 17:43         ` Marcel Holtmann
2012-03-09 12:53           ` [PATCH 2/2 v2] " Johan Hovold
2012-03-09 14:04             ` David Herrmann
     [not found]               ` <CANq1E4Rt0ctZ5cpXipJE--YmkR4OjKBXLBQkeTKWP3+Q-q37Yw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2012-03-09 14:48                 ` Johan Hovold
