[RFC] Bluetooth ERTM L2CAP deadlocks (hung tasks) due to l2cap_sock

linux-bluetooth.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [RFC] Bluetooth ERTM L2CAP deadlocks (hung tasks) due to l2cap_sock_shutdown()
@ 2015-05-07 15:26 Dean Jenkins
  2015-05-29 10:10 ` Dean Jenkins
  0 siblings, 1 reply; 3+ messages in thread
From: Dean Jenkins @ 2015-05-07 15:26 UTC (permalink / raw)
  To: BlueZ development
  Cc: Marcel Holtmann, Gustavo Padovan, Johan Hedberg, Frkuska, Joshua

Hi Marcel, Gustavo and Johan,

Please see the end of the E-mail for a workaround, do you agree with the 
workaround, any comments ?

Description of L2CAP ERTM deadlock
======================

Using kernel 3.8 on an ARM i.MX6 based platform, we have identified some 
deadlock situations involving L2CAP ERTM. We believe kernel v4.0.1 can 
deadlock as well (not tested it) as the affected code is similar to 
kernel 3.8.

With a L2CAP ERTM channel connected, and l2cap_sock_shutdown() is 
called, then __l2cap_wait_ack() is called which is supposed to wait for 
all sent I frames to be acknowledged.

l2cap_sock_shutdown() takes all 3 locks namely:

mutex_lock(&conn->chan_lock);
l2cap_chan_lock(chan);
lock_sock(sk);

__l2cap_wait_ack() is implemented as follows:

int __l2cap_wait_ack(struct sock *sk)
{
     struct l2cap_chan *chan = l2cap_pi(sk)->chan;
     DECLARE_WAITQUEUE(wait, current);
     int err = 0;
     int timeo = HZ/5;

     add_wait_queue(sk_sleep(sk), &wait);
     set_current_state(TASK_INTERRUPTIBLE);
     while (chan->unacked_frames > 0 && chan->conn) {
         if (!timeo)
             timeo = HZ/5;

         if (signal_pending(current)) {
             err = sock_intr_errno(timeo);
             break;
         }

         release_sock(sk);
         timeo = schedule_timeout(timeo);
         lock_sock(sk);
         set_current_state(TASK_INTERRUPTIBLE);

         err = sock_error(sk);
         if (err)
             break;
     }
     set_current_state(TASK_RUNNING);
     remove_wait_queue(sk_sleep(sk), &wait);
     return err;
}

Analysis shows that the while loop checks chan->unacked_frames > 0 and 
chan->conn is !NULL. So when both are true the loop waits (sleeps) for 
200ms using schedule_timeout() and then rechecks the while loop 
conditions before waiting again.

Unfortunately, during this waiting (sleeping) period the following locks 
are still held:
mutex_lock(&conn->chan_lock);
l2cap_chan_lock(chan);

These held locks prevent l2cap_recv_frame() from processing the received 
ACKs for the sent I frames in ERTM. This means chan->unacked_frames > 0 
remains true.

chan->conn cannot be set to NULL by the channel being closed by other 
means as the held locks prevent the state being updated.

Therefore a deadlock occurs because __l2cap_wait_ack() gets stuck in an 
infinite loop which periodically waits (sleeps). The kernel's hung task 
detector is not able to detect this "hung" thread because the thread 
periodically runs. This means l2cap_sock_shutdown() does not return. 
With debug, we have proven that the infinite loop occurs.

Here is an example hung task log and LOCKDEP output of a L2CAP ERTM 
deadlock scenario: (we do have some other different traces as well so 
more than 1 failure scenario exists)

[  360.497912] INFO: task kworker/u:0H:7 blocked for more than 180 seconds.
[  360.497931] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" 
disables this message.
[  360.497944] kworker/u:0H    D 8046bb28     0     7      2 0x00000000
[  360.498051] Backtrace:
[  360.498145] [<8046b624>] (__schedule+0x0/0x790) from [<8046be3c>] 
(schedule+0x88/0x8c)
[  360.498231] [<8046bdb4>] (schedule+0x0/0x8c) from [<8046c1c8>] 
(schedule_preempt_disabled+0x18/0x24)
[  360.498318] [<8046c1b0>] (schedule_preempt_disabled+0x0/0x24) from 
[<80469b70>] (mutex_lock_nested+0x250/0x3e8)
[  360.498495] [<80469920>] (mutex_lock_nested+0x0/0x3e8) from 
[<7f44d2a8>] (l2cap_get_chan_by_scid+0x24/0x54 [bluetooth])
[  360.498696] [<7f44d284>] (l2cap_get_chan_by_scid+0x0/0x54 
[bluetooth]) from [<7f453f68>] (l2cap_recv_frame+0x1918/0x1cc8 [bluetooth])
[  360.498779]  r6:0000006b r5:9773ef00 r4:96bfdc00 r3:0000003f
[  360.498919] [<7f452650>] (l2cap_recv_frame+0x0/0x1cc8 [bluetooth]) 
from [<7f454e3c>] (l2cap_recv_acldata+0xc0/0x1f8 [bluetooth])
[  360.499088] [<7f454d7c>] (l2cap_recv_acldata+0x0/0x1f8 [bluetooth]) 
from [<7f437c74>] (hci_rx_work+0x1b4/0x2ec [bluetooth])
[  360.499176]  r6:9759a000 r5:9759a77c r4:9773ef00 r3:00000000
[  360.499259] [<7f437ac0>] (hci_rx_work+0x0/0x2ec [bluetooth]) from 
[<8003ee1c>] (process_one_work+0x344/0x540)
[  360.499352] [<8003ead8>] (process_one_work+0x0/0x540) from 
[<8003f2e4>] (worker_thread+0x298/0x3ec)
[  360.499449] [<8003f04c>] (worker_thread+0x0/0x3ec) from [<80044a40>] 
(kthread+0xb4/0xc0)
[  360.499543] [<8004498c>] (kthread+0x0/0xc0) from [<8000da30>] 
(ret_from_fork+0x14/0x24)
[  360.499622]  r7:00000000 r6:00000000 r5:8004498c r4:a40a3e34
[  360.499880]
Showing all locks held in the system:
[  360.499909] 3 locks held by kworker/u:0H/7:
[  360.499921]  #0:  (hdev->name){++++..}, at: [<8003ed28>] 
process_one_work+0x250/0x540
[  360.499968]  #1:  ((&hdev->rx_work)){+.+...}, at: [<8003ed28>] 
process_one_work+0x250/0x540
[  360.500009]  #2:  (&conn->chan_lock){+.+...}, at: [<7f44d2a8>] 
l2cap_get_chan_by_scid+0x24/0x54 [bluetooth]
<snip>
[  360.500256] 3 locks held by kworker/1:2/64:
[  360.500266]  #0:  (events){.+.+..}, at: [<8003ed28>] 
process_one_work+0x250/0x540
[  360.500307]  #1: ((&(&chan->retrans_timer)->work)){+.+...}, at: 
[<8003ed28>] process_one_work+0x250/0x540
[  360.500346]  #2:  (&chan->lock){+.+...}, at: [<7f450414>] 
l2cap_retrans_timeout+0x24/0x58 [bluetooth]
<snip>
[  360.502066] 2 locks held by <userland BT daemon>/1433:
[  360.502076]  #0:  (&conn->chan_lock){+.+...}, at: [<7f455378>] 
l2cap_sock_shutdown+0x34/0x130 [bluetooth]
[  360.502179]  #1:  (&chan->lock){+.+...}, at: [<7f455388>] 
l2cap_sock_shutdown+0x44/0x130 [bluetooth]
<snip>

It can be seen that l2cap_get_chan_by_scid() is blocked so that the ACK 
cannot be processed and l2cap_retrans_timeout() has timed out in an 
attempt to retransmit the I frame but is also blocked. The Bluetooth 
stack is now paralysed.

This scenario will fail every time. The locking seems to be flawed in 
l2cap_sock_shutdown() for L2CAP ERTM. Did it ever work ?


Workaround
========

Here is our workaround patch (based on kernel 3.8) which releases all 
the locks before waiting (sleeping):

diff --git a/net/bluetooth/l2cap_core.c b/net/bluetooth/l2cap_core.c
index a53f47a..bf49066 100644
--- a/net/bluetooth/l2cap_core.c
+++ b/net/bluetooth/l2cap_core.c
@@ -1766,7 +1766,13 @@ int __l2cap_wait_ack(struct sock *sk)
                 }

                 release_sock(sk);
+               l2cap_chan_unlock(chan);
+               if (chan->conn)
+ mutex_unlock(&chan->conn->chan_lock);
                 timeo = schedule_timeout(timeo);
+               if (chan->conn)
+ mutex_lock(&chan->conn->chan_lock);
+               l2cap_chan_lock(chan);
                 lock_sock(sk);
                 set_current_state(TASK_INTERRUPTIBLE);

diff --git a/net/bluetooth/l2cap_sock.c b/net/bluetooth/l2cap_sock.c
index 657b3a4..bd0d1ae 100644
--- a/net/bluetooth/l2cap_sock.c
+++ b/net/bluetooth/l2cap_sock.c
@@ -867,8 +867,10 @@ static int l2cap_sock_shutdown(struct socket *sock, 
int how)
         lock_sock(sk);

         if (!sk->sk_shutdown) {
-               if (chan->mode == L2CAP_MODE_ERTM)
+               if (chan->mode == L2CAP_MODE_ERTM) {
                         err = __l2cap_wait_ack(sk);
+                       conn = chan->conn;
+               }

                 sk->sk_shutdown = SHUTDOWN_MASK;

If you think this is the right approach then we will formally create a 
patch and send it to the mailing list for review.

Our initial tests show that the deadlock no longer occurs with the patch 
applied, but we are worried about side effects.

We have some doubts:

a) What are the locks protecting ? Do all 3 locks need to be taken ?

b) Can a better architectural solution be created that avoids the locks ?

Perhaps arm the channel to disconnect, then after all ACKs have been 
received, call a callback to wake up l2cap_sock_shutdown ?

c) What stops new I frames being sent during the waiting period so 
chan->unacked_frames increases ?

Thanks for your time.

Regards,
Dean

-- 
Dean Jenkins
Embedded Software Engineer
Linux Transportation Solutions
Mentor Embedded Software Division
Mentor Graphics (UK) Ltd.

^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [RFC] Bluetooth ERTM L2CAP deadlocks (hung tasks) due to l2cap_sock_shutdown()
  2015-05-07 15:26 [RFC] Bluetooth ERTM L2CAP deadlocks (hung tasks) due to l2cap_sock_shutdown() Dean Jenkins
@ 2015-05-29 10:10 ` Dean Jenkins
  2015-06-05  8:35   ` Dean Jenkins
  0 siblings, 1 reply; 3+ messages in thread
From: Dean Jenkins @ 2015-05-29 10:10 UTC (permalink / raw)
  To: BlueZ development
  Cc: Marcel Holtmann, Gustavo Padovan, Johan Hedberg, Frkuska, Joshua

Hi Marcel, Gustavo and Johan,

My original E-mail is below for completeness so that you understand the 
context of this new E-mail.

We have been successful in demonstrating a L2CAP ERTM shutdown deadlock 
by using the Bluez l2test utility. We think the latest 4.x kernel can 
also deadlock (not tested).

Details of the simple test case are as follows:

Apparatus
======

Raspberry Pi (RPi) original B variant (single ARM core) running Linux 
Raspian with kernel 3.18.11+ and a USB BT v4.0 dongle. bluez is v5.15 
(locally built)
Dell laptop running Linux Mageia 4 with kernel 3.14.32 and an on-board 
USB BT dongle. bluez is v5.11

Method
=====

INITIAL SETTINGS:

Ensure bluetoothd is running on the laptop and RPi so the D-Bus services 
are available for the Bluez utilities to use
Run bluetoothctl on the laptop and RPi and ensure that an adaptor is shown

Type in the following commands on both systems:
agent on
default-agent
power on
discoverable on
scan on

Both systems should now detect each other

Make a note of each device's BT address. Call them <lap_BT_ADDR> and 
<RPi_BT_ADDR>

On the RPi pair with the laptop (visa versa will also work):

On the RPi do command
pair <lap_BT_ADDR>
Accept the pin code

On the laptop do
Accept the pin code

Pairing should now be successful. bluetoothd must continue to run so 
that the linkkeys can be serviced by the agent.

bluetoothctl can be quit on both systems

USING L2TEST

l2test can be used in various modes of operation controlled via command 
line options. l2test can be used in a client-server arrangement whereby 
1 BT device listens for a connection (server) and the other BT device 
connects (client) to the listening device.

For this testcase:

RPi is the l2test client (unit under test)
laptop is the l2test server

On the laptop use the command:
l2test -i hci0 -X ertm -d
This command listens for connections with optional L2CAP ERTM supported 
and dumps out the received data.

TEST 1: NON-ERTM CONNECTION

On the RPi use the command:
l2test -s -b 1024 -B /dev/urandom -P 4113 <lap_BT_ADDR>
This command sends 1024 bytes to the laptop via PSM 4113 used by l2test 
over a L2CAP basic mode channel.

TEST 2: ERTM CONNECTION

On the RPi use the command:
l2test -s -b 1024 -B /dev/urandom -X ertm -P 4113 <lap_BT_ADDR>
This command sends 1024 bytes to the laptop via PSM 4113 used by l2test 
over an ERTM channel

Results
=====

TEST 1

root@raspberrypi:~# l2test -s -b 1024 -B /dev/urandom -P 4113 
5C:F3:70:68:BA:DD
l2test[10346]: Connected to 5C:F3:70:68:BA:DD (bredr, psm 4113, scid 64)
l2test[10346]: Local device 5C:F3:70:68:BA:DA (bredr, psm 4113, scid 64)
l2test[10346]: Options [imtu 672, omtu 672, flush_to 65535, mode 0, 
handle 11, class 0x000000, priority 0, rcvbuf 163840]
l2test[10346]: Sending ...
l2test[10346]: Closing channel ...
l2test[10346]: Done
l2test[10346]: Exit

Using basic L2CAP mode, the channel sends and closes OK.
l2test terminates as expected.

TEST 2

root@raspberrypi:~# l2test -s -b 1024 -B /dev/urandom -X ertm -P 4113 
5C:F3:70:68:BA:DD
l2test[10366]: Connected to 5C:F3:70:68:BA:DD (bredr, psm 4113, scid 64)
l2test[10366]: Local device 5C:F3:70:68:BA:DA (bredr, psm 4113, scid 64)
l2test[10366]: Options [imtu 672, omtu 672, flush_to 65535, mode 3, 
handle 11, class 0x000000, priority 0, rcvbuf 163840]
l2test[10366]: Sending ...
l2test[10366]: Closing channel ...

Using L2CAP ERTM, the channel is instructed to close but hangs.
l2test fails to naturally terminate.

 From Bluez 5.15 tools/l2test.c:
static void send_mode(int sk)
{
     do_send(sk);

     syslog(LOG_INFO, "Closing channel ...");
     if (shutdown(sk, SHUT_RDWR) < 0)
         syslog(LOG_INFO, "Close failed: %m");
     else
         syslog(LOG_INFO, "Done");
}

The -s send l2test command line option calls send_mode() to send the 
data via do_send() and then closes the channel with the call to shutdown().
do_send() is probably not checking that the data was successfully sent 
before returning meaning potentially not all the I frames were ACK'ed.

The results shows that shutdown() fails to return as no "Done" textual 
message is seen.

When the RPi is left in this condition for up to 10 minutes (need to 
wait longer than 2 times the hung task detector period plus L2CAP timer 
expiry time), dmesg shows:

[  601.116119] INFO: task kworker/0:1:19 blocked for more than 120 seconds.
[  601.116156]       Not tainted 3.18.11+ #781
[  601.116169] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" 
disables this message.
[  601.116183] kworker/0:1     D c054cbcc     0    19      2 0x00000000
[  601.116537] Workqueue: events l2cap_retrans_timeout [bluetooth]
[  601.116685] [<c054cbcc>] (__schedule) from [<c054cfa4>] 
(schedule+0x40/0x8c)
[  601.116727] [<c054cfa4>] (schedule) from [<c054d3c8>] 
(schedule_preempt_disabled+0x30/0x40)
[  601.116812] [<c054d3c8>] (schedule_preempt_disabled) from 
[<c054ee14>] (__mutex_lock_slowpath+0xb8/0x18c)
[  601.116856] [<c054ee14>] (__mutex_lock_slowpath) from [<c054ef20>] 
(mutex_lock+0x38/0x3c)
[  601.117085] [<c054ef20>] (mutex_lock) from [<bf05173c>] 
(l2cap_retrans_timeout+0x24/0x5c [bluetooth])
[  601.117371] [<bf05173c>] (l2cap_retrans_timeout [bluetooth]) from 
[<c003aa68>] (process_one_work+0x13c/0x444)
[  601.117410] [<c003aa68>] (process_one_work) from [<c003b954>] 
(worker_thread+0x190/0x580)
[  601.117445] [<c003b954>] (worker_thread) from [<c003fdb4>] 
(kthread+0xd8/0xf4)
[  601.117488] [<c003fdb4>] (kthread) from [<c000e8c8>] 
(ret_from_fork+0x14/0x20)
[  601.117569] INFO: task kworker/u3:2:327 blocked for more than 120 
seconds.
[  601.117588]       Not tainted 3.18.11+ #781
[  601.117601] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" 
disables this message.
[  601.117614] kworker/u3:2    D c054cbcc     0   327      2 0x00000000
[  601.117806] Workqueue: hci0 hci_rx_work [bluetooth]
[  601.117941] [<c054cbcc>] (__schedule) from [<c054cfa4>] 
(schedule+0x40/0x8c)
[  601.117981] [<c054cfa4>] (schedule) from [<c054d3c8>] 
(schedule_preempt_disabled+0x30/0x40)
[  601.118056] [<c054d3c8>] (schedule_preempt_disabled) from 
[<c054ee14>] (__mutex_lock_slowpath+0xb8/0x18c)
[  601.118092] [<c054ee14>] (__mutex_lock_slowpath) from [<c054ef20>] 
(mutex_lock+0x38/0x3c)
[  601.118307] [<c054ef20>] (mutex_lock) from [<bf04d2c8>] 
(l2cap_get_chan_by_scid+0x28/0x98 [bluetooth])
[  601.118696] [<bf04d2c8>] (l2cap_get_chan_by_scid [bluetooth]) from 
[<bf054590>] (l2cap_recv_frame+0xa4/0x2cec [bluetooth])
[  601.119146] [<bf054590>] (l2cap_recv_frame [bluetooth]) from 
[<bf058194>] (l2cap_recv_acldata+0x214/0x22c [bluetooth])
[  601.119559] [<bf058194>] (l2cap_recv_acldata [bluetooth]) from 
[<bf02bf88>] (hci_rx_work+0x180/0x288 [bluetooth])
[  601.119801] [<bf02bf88>] (hci_rx_work [bluetooth]) from [<c003aa68>] 
(process_one_work+0x13c/0x444)
[  601.119839] [<c003aa68>] (process_one_work) from [<c003b954>] 
(worker_thread+0x190/0x580)
[  601.119883] [<c003b954>] (worker_thread) from [<c003fdb4>] 
(kthread+0xd8/0xf4)
[  601.119919] [<c003fdb4>] (kthread) from [<c000e8c8>] 
(ret_from_fork+0x14/0x20)

Conclusions
========

The hung task failure is easy to reproduce with the l2test utility. It 
could be argued that the do_send() function should only return when all 
the data has been successfully acknowledged but on the other hand, it is 
ideal to generate the right conditions for the failure to be triggered.

Probably, the low resources on the RPi help to make the failure more 
likely than using a PC for example.

hcidump shows that the S RR frame from the laptop comes 320ms after the 
last I frame from the RPi which generates a window of opportunity for 
the shutdown() call to execute l2cap_sock_shutdown() before the S RR 
frame has been processed.

It can be seen that the root cause function __l2cap_wait_ack() is not 
reported by the hung task detector because the function is running in an 
infinite loop but periodically waiting (sleeping) for 200ms so is not 
detectable by the hung task detector. The hung task detector is 
triggered when the L2CAP recovery timers expire (l2cap_retrans_timeout) 
and gets blocked, or when a new frame is received 
(l2cap_get_chan_by_scid) and processing gets blocked.

We looked at a range of kernels from 3.8 to 4.0 which we think can 
suffer the deadlock.

We managed to get a deadlock on Fedora using kernel 4.0.4 but it was a 
different test case with an iPhone.

Please can we work together to find a solution to the deadlock ?

Note that our interest is in fixing kernels 3.8 and 3.14 so hopefully 
any fix to the latest 4.x kernel can be back-ported to the kernels that 
we are using.

Many thanks for your time.

Regards,
Dean

On 07/05/15 16:26, Dean Jenkins wrote:
> Hi Marcel, Gustavo and Johan,
>
> Please see the end of the E-mail for a workaround, do you agree with 
> the workaround, any comments ?
>
> Description of L2CAP ERTM deadlock
> ======================
>
> Using kernel 3.8 on an ARM i.MX6 based platform, we have identified 
> some deadlock situations involving L2CAP ERTM. We believe kernel 
> v4.0.1 can deadlock as well (not tested it) as the affected code is 
> similar to kernel 3.8.
>
> With a L2CAP ERTM channel connected, and l2cap_sock_shutdown() is 
> called, then __l2cap_wait_ack() is called which is supposed to wait 
> for all sent I frames to be acknowledged.
>
> l2cap_sock_shutdown() takes all 3 locks namely:
>
> mutex_lock(&conn->chan_lock);
> l2cap_chan_lock(chan);
> lock_sock(sk);
>
> __l2cap_wait_ack() is implemented as follows:
>
> int __l2cap_wait_ack(struct sock *sk)
> {
>     struct l2cap_chan *chan = l2cap_pi(sk)->chan;
>     DECLARE_WAITQUEUE(wait, current);
>     int err = 0;
>     int timeo = HZ/5;
>
>     add_wait_queue(sk_sleep(sk), &wait);
>     set_current_state(TASK_INTERRUPTIBLE);
>     while (chan->unacked_frames > 0 && chan->conn) {
>         if (!timeo)
>             timeo = HZ/5;
>
>         if (signal_pending(current)) {
>             err = sock_intr_errno(timeo);
>             break;
>         }
>
>         release_sock(sk);
>         timeo = schedule_timeout(timeo);
>         lock_sock(sk);
>         set_current_state(TASK_INTERRUPTIBLE);
>
>         err = sock_error(sk);
>         if (err)
>             break;
>     }
>     set_current_state(TASK_RUNNING);
>     remove_wait_queue(sk_sleep(sk), &wait);
>     return err;
> }
>
> Analysis shows that the while loop checks chan->unacked_frames > 0 and 
> chan->conn is !NULL. So when both are true the loop waits (sleeps) for 
> 200ms using schedule_timeout() and then rechecks the while loop 
> conditions before waiting again.
>
> Unfortunately, during this waiting (sleeping) period the following 
> locks are still held:
> mutex_lock(&conn->chan_lock);
> l2cap_chan_lock(chan);
>
> These held locks prevent l2cap_recv_frame() from processing the 
> received ACKs for the sent I frames in ERTM. This means 
> chan->unacked_frames > 0 remains true.
>
> chan->conn cannot be set to NULL by the channel being closed by other 
> means as the held locks prevent the state being updated.
>
> Therefore a deadlock occurs because __l2cap_wait_ack() gets stuck in 
> an infinite loop which periodically waits (sleeps). The kernel's hung 
> task detector is not able to detect this "hung" thread because the 
> thread periodically runs. This means l2cap_sock_shutdown() does not 
> return. With debug, we have proven that the infinite loop occurs.
>
> Here is an example hung task log and LOCKDEP output of a L2CAP ERTM 
> deadlock scenario: (we do have some other different traces as well so 
> more than 1 failure scenario exists)
>
> [  360.497912] INFO: task kworker/u:0H:7 blocked for more than 180 
> seconds.
> [  360.497931] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" 
> disables this message.
> [  360.497944] kworker/u:0H    D 8046bb28     0     7      2 0x00000000
> [  360.498051] Backtrace:
> [  360.498145] [<8046b624>] (__schedule+0x0/0x790) from [<8046be3c>] 
> (schedule+0x88/0x8c)
> [  360.498231] [<8046bdb4>] (schedule+0x0/0x8c) from [<8046c1c8>] 
> (schedule_preempt_disabled+0x18/0x24)
> [  360.498318] [<8046c1b0>] (schedule_preempt_disabled+0x0/0x24) from 
> [<80469b70>] (mutex_lock_nested+0x250/0x3e8)
> [  360.498495] [<80469920>] (mutex_lock_nested+0x0/0x3e8) from 
> [<7f44d2a8>] (l2cap_get_chan_by_scid+0x24/0x54 [bluetooth])
> [  360.498696] [<7f44d284>] (l2cap_get_chan_by_scid+0x0/0x54 
> [bluetooth]) from [<7f453f68>] (l2cap_recv_frame+0x1918/0x1cc8 
> [bluetooth])
> [  360.498779]  r6:0000006b r5:9773ef00 r4:96bfdc00 r3:0000003f
> [  360.498919] [<7f452650>] (l2cap_recv_frame+0x0/0x1cc8 [bluetooth]) 
> from [<7f454e3c>] (l2cap_recv_acldata+0xc0/0x1f8 [bluetooth])
> [  360.499088] [<7f454d7c>] (l2cap_recv_acldata+0x0/0x1f8 [bluetooth]) 
> from [<7f437c74>] (hci_rx_work+0x1b4/0x2ec [bluetooth])
> [  360.499176]  r6:9759a000 r5:9759a77c r4:9773ef00 r3:00000000
> [  360.499259] [<7f437ac0>] (hci_rx_work+0x0/0x2ec [bluetooth]) from 
> [<8003ee1c>] (process_one_work+0x344/0x540)
> [  360.499352] [<8003ead8>] (process_one_work+0x0/0x540) from 
> [<8003f2e4>] (worker_thread+0x298/0x3ec)
> [  360.499449] [<8003f04c>] (worker_thread+0x0/0x3ec) from 
> [<80044a40>] (kthread+0xb4/0xc0)
> [  360.499543] [<8004498c>] (kthread+0x0/0xc0) from [<8000da30>] 
> (ret_from_fork+0x14/0x24)
> [  360.499622]  r7:00000000 r6:00000000 r5:8004498c r4:a40a3e34
> [  360.499880]
> Showing all locks held in the system:
> [  360.499909] 3 locks held by kworker/u:0H/7:
> [  360.499921]  #0:  (hdev->name){++++..}, at: [<8003ed28>] 
> process_one_work+0x250/0x540
> [  360.499968]  #1:  ((&hdev->rx_work)){+.+...}, at: [<8003ed28>] 
> process_one_work+0x250/0x540
> [  360.500009]  #2:  (&conn->chan_lock){+.+...}, at: [<7f44d2a8>] 
> l2cap_get_chan_by_scid+0x24/0x54 [bluetooth]
> <snip>
> [  360.500256] 3 locks held by kworker/1:2/64:
> [  360.500266]  #0:  (events){.+.+..}, at: [<8003ed28>] 
> process_one_work+0x250/0x540
> [  360.500307]  #1: ((&(&chan->retrans_timer)->work)){+.+...}, at: 
> [<8003ed28>] process_one_work+0x250/0x540
> [  360.500346]  #2:  (&chan->lock){+.+...}, at: [<7f450414>] 
> l2cap_retrans_timeout+0x24/0x58 [bluetooth]
> <snip>
> [  360.502066] 2 locks held by <userland BT daemon>/1433:
> [  360.502076]  #0:  (&conn->chan_lock){+.+...}, at: [<7f455378>] 
> l2cap_sock_shutdown+0x34/0x130 [bluetooth]
> [  360.502179]  #1:  (&chan->lock){+.+...}, at: [<7f455388>] 
> l2cap_sock_shutdown+0x44/0x130 [bluetooth]
> <snip>
>
> It can be seen that l2cap_get_chan_by_scid() is blocked so that the 
> ACK cannot be processed and l2cap_retrans_timeout() has timed out in 
> an attempt to retransmit the I frame but is also blocked. The 
> Bluetooth stack is now paralysed.
>
> This scenario will fail every time. The locking seems to be flawed in 
> l2cap_sock_shutdown() for L2CAP ERTM. Did it ever work ?
>
>
> Workaround
> ========
>
> Here is our workaround patch (based on kernel 3.8) which releases all 
> the locks before waiting (sleeping):
>
> diff --git a/net/bluetooth/l2cap_core.c b/net/bluetooth/l2cap_core.c
> index a53f47a..bf49066 100644
> --- a/net/bluetooth/l2cap_core.c
> +++ b/net/bluetooth/l2cap_core.c
> @@ -1766,7 +1766,13 @@ int __l2cap_wait_ack(struct sock *sk)
>                 }
>
>                 release_sock(sk);
> +               l2cap_chan_unlock(chan);
> +               if (chan->conn)
> + mutex_unlock(&chan->conn->chan_lock);
>                 timeo = schedule_timeout(timeo);
> +               if (chan->conn)
> + mutex_lock(&chan->conn->chan_lock);
> +               l2cap_chan_lock(chan);
>                 lock_sock(sk);
>                 set_current_state(TASK_INTERRUPTIBLE);
>
> diff --git a/net/bluetooth/l2cap_sock.c b/net/bluetooth/l2cap_sock.c
> index 657b3a4..bd0d1ae 100644
> --- a/net/bluetooth/l2cap_sock.c
> +++ b/net/bluetooth/l2cap_sock.c
> @@ -867,8 +867,10 @@ static int l2cap_sock_shutdown(struct socket 
> *sock, int how)
>         lock_sock(sk);
>
>         if (!sk->sk_shutdown) {
> -               if (chan->mode == L2CAP_MODE_ERTM)
> +               if (chan->mode == L2CAP_MODE_ERTM) {
>                         err = __l2cap_wait_ack(sk);
> +                       conn = chan->conn;
> +               }
>
>                 sk->sk_shutdown = SHUTDOWN_MASK;
>
> If you think this is the right approach then we will formally create a 
> patch and send it to the mailing list for review.
>
> Our initial tests show that the deadlock no longer occurs with the 
> patch applied, but we are worried about side effects.
>
> We have some doubts:
>
> a) What are the locks protecting ? Do all 3 locks need to be taken ?
>
> b) Can a better architectural solution be created that avoids the locks ?
>
> Perhaps arm the channel to disconnect, then after all ACKs have been 
> received, call a callback to wake up l2cap_sock_shutdown ?
>
> c) What stops new I frames being sent during the waiting period so 
> chan->unacked_frames increases ?
>
> Thanks for your time.
>
> Regards,
> Dean
>

-- 
Dean Jenkins
Embedded Software Engineer
Linux Transportation Solutions
Mentor Embedded Software Division
Mentor Graphics (UK) Ltd.

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [RFC] Bluetooth ERTM L2CAP deadlocks (hung tasks) due to l2cap_sock_shutdown()
  2015-05-29 10:10 ` Dean Jenkins
@ 2015-06-05  8:35   ` Dean Jenkins
  0 siblings, 0 replies; 3+ messages in thread
From: Dean Jenkins @ 2015-06-05  8:35 UTC (permalink / raw)
  To: BlueZ development
  Cc: Marcel Holtmann, Gustavo Padovan, Johan Hedberg, Frkuska, Joshua

Hi Marcel, Gustavo and Johan,

We are working on a fix for the Bluetooth ERTM L2CAP deadlock (hung 
task) issue. In particular to fix kernel 3.8, however, I have 
forward-ported the patches to bluetooth-next and I believe the fix is 
appropriate for bluetooth-next as sjanc on IRC was able to reproduce the 
deadlock on bluetooth-next.

I have uploaded our set of proposed patches in 
l2cap_ertm_shutdown_deadlock_patches.tgz to 
https://bugzilla.kernel.org/show_bug.cgi?id=99301

We are still testing so the patches may not be the final versions but 
the signs are good that the solution works.

Feel free to comment and collaborate.

I use the nick "djenkins" in the Bluetooth IRC chat room, please connect 
me if you wish to discuss the issue.

Thanks for your time.

Regards,
Dean

On 29/05/15 11:10, Dean Jenkins wrote:
> Hi Marcel, Gustavo and Johan,
>
> My original E-mail is below for completeness so that you understand 
> the context of this new E-mail.
>
> We have been successful in demonstrating a L2CAP ERTM shutdown 
> deadlock by using the Bluez l2test utility. We think the latest 4.x 
> kernel can also deadlock (not tested).
>
> Details of the simple test case are as follows:
>
> Apparatus
> ======
>
> Raspberry Pi (RPi) original B variant (single ARM core) running Linux 
> Raspian with kernel 3.18.11+ and a USB BT v4.0 dongle. bluez is v5.15 
> (locally built)
> Dell laptop running Linux Mageia 4 with kernel 3.14.32 and an on-board 
> USB BT dongle. bluez is v5.11
>
> Method
> =====
>
> INITIAL SETTINGS:
>
> Ensure bluetoothd is running on the laptop and RPi so the D-Bus 
> services are available for the Bluez utilities to use
> Run bluetoothctl on the laptop and RPi and ensure that an adaptor is 
> shown
>
> Type in the following commands on both systems:
> agent on
> default-agent
> power on
> discoverable on
> scan on
>
> Both systems should now detect each other
>
> Make a note of each device's BT address. Call them <lap_BT_ADDR> and 
> <RPi_BT_ADDR>
>
> On the RPi pair with the laptop (visa versa will also work):
>
> On the RPi do command
> pair <lap_BT_ADDR>
> Accept the pin code
>
> On the laptop do
> Accept the pin code
>
> Pairing should now be successful. bluetoothd must continue to run so 
> that the linkkeys can be serviced by the agent.
>
> bluetoothctl can be quit on both systems
>
> USING L2TEST
>
> l2test can be used in various modes of operation controlled via 
> command line options. l2test can be used in a client-server 
> arrangement whereby 1 BT device listens for a connection (server) and 
> the other BT device connects (client) to the listening device.
>
> For this testcase:
>
> RPi is the l2test client (unit under test)
> laptop is the l2test server
>
> On the laptop use the command:
> l2test -i hci0 -X ertm -d
> This command listens for connections with optional L2CAP ERTM 
> supported and dumps out the received data.
>
> TEST 1: NON-ERTM CONNECTION
>
> On the RPi use the command:
> l2test -s -b 1024 -B /dev/urandom -P 4113 <lap_BT_ADDR>
> This command sends 1024 bytes to the laptop via PSM 4113 used by 
> l2test over a L2CAP basic mode channel.
>
> TEST 2: ERTM CONNECTION
>
> On the RPi use the command:
> l2test -s -b 1024 -B /dev/urandom -X ertm -P 4113 <lap_BT_ADDR>
> This command sends 1024 bytes to the laptop via PSM 4113 used by 
> l2test over an ERTM channel
>
> Results
> =====
>
> TEST 1
>
> root@raspberrypi:~# l2test -s -b 1024 -B /dev/urandom -P 4113 
> 5C:F3:70:68:BA:DD
> l2test[10346]: Connected to 5C:F3:70:68:BA:DD (bredr, psm 4113, scid 64)
> l2test[10346]: Local device 5C:F3:70:68:BA:DA (bredr, psm 4113, scid 64)
> l2test[10346]: Options [imtu 672, omtu 672, flush_to 65535, mode 0, 
> handle 11, class 0x000000, priority 0, rcvbuf 163840]
> l2test[10346]: Sending ...
> l2test[10346]: Closing channel ...
> l2test[10346]: Done
> l2test[10346]: Exit
>
> Using basic L2CAP mode, the channel sends and closes OK.
> l2test terminates as expected.
>
> TEST 2
>
> root@raspberrypi:~# l2test -s -b 1024 -B /dev/urandom -X ertm -P 4113 
> 5C:F3:70:68:BA:DD
> l2test[10366]: Connected to 5C:F3:70:68:BA:DD (bredr, psm 4113, scid 64)
> l2test[10366]: Local device 5C:F3:70:68:BA:DA (bredr, psm 4113, scid 64)
> l2test[10366]: Options [imtu 672, omtu 672, flush_to 65535, mode 3, 
> handle 11, class 0x000000, priority 0, rcvbuf 163840]
> l2test[10366]: Sending ...
> l2test[10366]: Closing channel ...
>
> Using L2CAP ERTM, the channel is instructed to close but hangs.
> l2test fails to naturally terminate.
>
> From Bluez 5.15 tools/l2test.c:
> static void send_mode(int sk)
> {
>     do_send(sk);
>
>     syslog(LOG_INFO, "Closing channel ...");
>     if (shutdown(sk, SHUT_RDWR) < 0)
>         syslog(LOG_INFO, "Close failed: %m");
>     else
>         syslog(LOG_INFO, "Done");
> }
>
> The -s send l2test command line option calls send_mode() to send the 
> data via do_send() and then closes the channel with the call to 
> shutdown().
> do_send() is probably not checking that the data was successfully sent 
> before returning meaning potentially not all the I frames were ACK'ed.
>
> The results shows that shutdown() fails to return as no "Done" textual 
> message is seen.
>
> When the RPi is left in this condition for up to 10 minutes (need to 
> wait longer than 2 times the hung task detector period plus L2CAP 
> timer expiry time), dmesg shows:
>
> [  601.116119] INFO: task kworker/0:1:19 blocked for more than 120 
> seconds.
> [  601.116156]       Not tainted 3.18.11+ #781
> [  601.116169] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" 
> disables this message.
> [  601.116183] kworker/0:1     D c054cbcc     0    19      2 0x00000000
> [  601.116537] Workqueue: events l2cap_retrans_timeout [bluetooth]
> [  601.116685] [<c054cbcc>] (__schedule) from [<c054cfa4>] 
> (schedule+0x40/0x8c)
> [  601.116727] [<c054cfa4>] (schedule) from [<c054d3c8>] 
> (schedule_preempt_disabled+0x30/0x40)
> [  601.116812] [<c054d3c8>] (schedule_preempt_disabled) from 
> [<c054ee14>] (__mutex_lock_slowpath+0xb8/0x18c)
> [  601.116856] [<c054ee14>] (__mutex_lock_slowpath) from [<c054ef20>] 
> (mutex_lock+0x38/0x3c)
> [  601.117085] [<c054ef20>] (mutex_lock) from [<bf05173c>] 
> (l2cap_retrans_timeout+0x24/0x5c [bluetooth])
> [  601.117371] [<bf05173c>] (l2cap_retrans_timeout [bluetooth]) from 
> [<c003aa68>] (process_one_work+0x13c/0x444)
> [  601.117410] [<c003aa68>] (process_one_work) from [<c003b954>] 
> (worker_thread+0x190/0x580)
> [  601.117445] [<c003b954>] (worker_thread) from [<c003fdb4>] 
> (kthread+0xd8/0xf4)
> [  601.117488] [<c003fdb4>] (kthread) from [<c000e8c8>] 
> (ret_from_fork+0x14/0x20)
> [  601.117569] INFO: task kworker/u3:2:327 blocked for more than 120 
> seconds.
> [  601.117588]       Not tainted 3.18.11+ #781
> [  601.117601] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" 
> disables this message.
> [  601.117614] kworker/u3:2    D c054cbcc     0   327      2 0x00000000
> [  601.117806] Workqueue: hci0 hci_rx_work [bluetooth]
> [  601.117941] [<c054cbcc>] (__schedule) from [<c054cfa4>] 
> (schedule+0x40/0x8c)
> [  601.117981] [<c054cfa4>] (schedule) from [<c054d3c8>] 
> (schedule_preempt_disabled+0x30/0x40)
> [  601.118056] [<c054d3c8>] (schedule_preempt_disabled) from 
> [<c054ee14>] (__mutex_lock_slowpath+0xb8/0x18c)
> [  601.118092] [<c054ee14>] (__mutex_lock_slowpath) from [<c054ef20>] 
> (mutex_lock+0x38/0x3c)
> [  601.118307] [<c054ef20>] (mutex_lock) from [<bf04d2c8>] 
> (l2cap_get_chan_by_scid+0x28/0x98 [bluetooth])
> [  601.118696] [<bf04d2c8>] (l2cap_get_chan_by_scid [bluetooth]) from 
> [<bf054590>] (l2cap_recv_frame+0xa4/0x2cec [bluetooth])
> [  601.119146] [<bf054590>] (l2cap_recv_frame [bluetooth]) from 
> [<bf058194>] (l2cap_recv_acldata+0x214/0x22c [bluetooth])
> [  601.119559] [<bf058194>] (l2cap_recv_acldata [bluetooth]) from 
> [<bf02bf88>] (hci_rx_work+0x180/0x288 [bluetooth])
> [  601.119801] [<bf02bf88>] (hci_rx_work [bluetooth]) from 
> [<c003aa68>] (process_one_work+0x13c/0x444)
> [  601.119839] [<c003aa68>] (process_one_work) from [<c003b954>] 
> (worker_thread+0x190/0x580)
> [  601.119883] [<c003b954>] (worker_thread) from [<c003fdb4>] 
> (kthread+0xd8/0xf4)
> [  601.119919] [<c003fdb4>] (kthread) from [<c000e8c8>] 
> (ret_from_fork+0x14/0x20)
>
> Conclusions
> ========
>
> The hung task failure is easy to reproduce with the l2test utility. It 
> could be argued that the do_send() function should only return when 
> all the data has been successfully acknowledged but on the other hand, 
> it is ideal to generate the right conditions for the failure to be 
> triggered.
>
> Probably, the low resources on the RPi help to make the failure more 
> likely than using a PC for example.
>
> hcidump shows that the S RR frame from the laptop comes 320ms after 
> the last I frame from the RPi which generates a window of opportunity 
> for the shutdown() call to execute l2cap_sock_shutdown() before the S 
> RR frame has been processed.
>
> It can be seen that the root cause function __l2cap_wait_ack() is not 
> reported by the hung task detector because the function is running in 
> an infinite loop but periodically waiting (sleeping) for 200ms so is 
> not detectable by the hung task detector. The hung task detector is 
> triggered when the L2CAP recovery timers expire 
> (l2cap_retrans_timeout) and gets blocked, or when a new frame is 
> received (l2cap_get_chan_by_scid) and processing gets blocked.
>
> We looked at a range of kernels from 3.8 to 4.0 which we think can 
> suffer the deadlock.
>
> We managed to get a deadlock on Fedora using kernel 4.0.4 but it was a 
> different test case with an iPhone.
>
> Please can we work together to find a solution to the deadlock ?
>
> Note that our interest is in fixing kernels 3.8 and 3.14 so hopefully 
> any fix to the latest 4.x kernel can be back-ported to the kernels 
> that we are using.
>
> Many thanks for your time.
>
> Regards,
> Dean
>
> On 07/05/15 16:26, Dean Jenkins wrote:
>> Hi Marcel, Gustavo and Johan,
>>
>> Please see the end of the E-mail for a workaround, do you agree with 
>> the workaround, any comments ?
>>
>> Description of L2CAP ERTM deadlock
>> ======================
>>
>> Using kernel 3.8 on an ARM i.MX6 based platform, we have identified 
>> some deadlock situations involving L2CAP ERTM. We believe kernel 
>> v4.0.1 can deadlock as well (not tested it) as the affected code is 
>> similar to kernel 3.8.
>>
>> With a L2CAP ERTM channel connected, and l2cap_sock_shutdown() is 
>> called, then __l2cap_wait_ack() is called which is supposed to wait 
>> for all sent I frames to be acknowledged.
>>
>> l2cap_sock_shutdown() takes all 3 locks namely:
>>
>> mutex_lock(&conn->chan_lock);
>> l2cap_chan_lock(chan);
>> lock_sock(sk);
>>
>> __l2cap_wait_ack() is implemented as follows:
>>
>> int __l2cap_wait_ack(struct sock *sk)
>> {
>>     struct l2cap_chan *chan = l2cap_pi(sk)->chan;
>>     DECLARE_WAITQUEUE(wait, current);
>>     int err = 0;
>>     int timeo = HZ/5;
>>
>>     add_wait_queue(sk_sleep(sk), &wait);
>>     set_current_state(TASK_INTERRUPTIBLE);
>>     while (chan->unacked_frames > 0 && chan->conn) {
>>         if (!timeo)
>>             timeo = HZ/5;
>>
>>         if (signal_pending(current)) {
>>             err = sock_intr_errno(timeo);
>>             break;
>>         }
>>
>>         release_sock(sk);
>>         timeo = schedule_timeout(timeo);
>>         lock_sock(sk);
>>         set_current_state(TASK_INTERRUPTIBLE);
>>
>>         err = sock_error(sk);
>>         if (err)
>>             break;
>>     }
>>     set_current_state(TASK_RUNNING);
>>     remove_wait_queue(sk_sleep(sk), &wait);
>>     return err;
>> }
>>
>> Analysis shows that the while loop checks chan->unacked_frames > 0 
>> and chan->conn is !NULL. So when both are true the loop waits 
>> (sleeps) for 200ms using schedule_timeout() and then rechecks the 
>> while loop conditions before waiting again.
>>
>> Unfortunately, during this waiting (sleeping) period the following 
>> locks are still held:
>> mutex_lock(&conn->chan_lock);
>> l2cap_chan_lock(chan);
>>
>> These held locks prevent l2cap_recv_frame() from processing the 
>> received ACKs for the sent I frames in ERTM. This means 
>> chan->unacked_frames > 0 remains true.
>>
>> chan->conn cannot be set to NULL by the channel being closed by other 
>> means as the held locks prevent the state being updated.
>>
>> Therefore a deadlock occurs because __l2cap_wait_ack() gets stuck in 
>> an infinite loop which periodically waits (sleeps). The kernel's hung 
>> task detector is not able to detect this "hung" thread because the 
>> thread periodically runs. This means l2cap_sock_shutdown() does not 
>> return. With debug, we have proven that the infinite loop occurs.
>>
>> Here is an example hung task log and LOCKDEP output of a L2CAP ERTM 
>> deadlock scenario: (we do have some other different traces as well so 
>> more than 1 failure scenario exists)
>>
>> [  360.497912] INFO: task kworker/u:0H:7 blocked for more than 180 
>> seconds.
>> [  360.497931] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" 
>> disables this message.
>> [  360.497944] kworker/u:0H    D 8046bb28     0     7      2 0x00000000
>> [  360.498051] Backtrace:
>> [  360.498145] [<8046b624>] (__schedule+0x0/0x790) from [<8046be3c>] 
>> (schedule+0x88/0x8c)
>> [  360.498231] [<8046bdb4>] (schedule+0x0/0x8c) from [<8046c1c8>] 
>> (schedule_preempt_disabled+0x18/0x24)
>> [  360.498318] [<8046c1b0>] (schedule_preempt_disabled+0x0/0x24) from 
>> [<80469b70>] (mutex_lock_nested+0x250/0x3e8)
>> [  360.498495] [<80469920>] (mutex_lock_nested+0x0/0x3e8) from 
>> [<7f44d2a8>] (l2cap_get_chan_by_scid+0x24/0x54 [bluetooth])
>> [  360.498696] [<7f44d284>] (l2cap_get_chan_by_scid+0x0/0x54 
>> [bluetooth]) from [<7f453f68>] (l2cap_recv_frame+0x1918/0x1cc8 
>> [bluetooth])
>> [  360.498779]  r6:0000006b r5:9773ef00 r4:96bfdc00 r3:0000003f
>> [  360.498919] [<7f452650>] (l2cap_recv_frame+0x0/0x1cc8 [bluetooth]) 
>> from [<7f454e3c>] (l2cap_recv_acldata+0xc0/0x1f8 [bluetooth])
>> [  360.499088] [<7f454d7c>] (l2cap_recv_acldata+0x0/0x1f8 
>> [bluetooth]) from [<7f437c74>] (hci_rx_work+0x1b4/0x2ec [bluetooth])
>> [  360.499176]  r6:9759a000 r5:9759a77c r4:9773ef00 r3:00000000
>> [  360.499259] [<7f437ac0>] (hci_rx_work+0x0/0x2ec [bluetooth]) from 
>> [<8003ee1c>] (process_one_work+0x344/0x540)
>> [  360.499352] [<8003ead8>] (process_one_work+0x0/0x540) from 
>> [<8003f2e4>] (worker_thread+0x298/0x3ec)
>> [  360.499449] [<8003f04c>] (worker_thread+0x0/0x3ec) from 
>> [<80044a40>] (kthread+0xb4/0xc0)
>> [  360.499543] [<8004498c>] (kthread+0x0/0xc0) from [<8000da30>] 
>> (ret_from_fork+0x14/0x24)
>> [  360.499622]  r7:00000000 r6:00000000 r5:8004498c r4:a40a3e34
>> [  360.499880]
>> Showing all locks held in the system:
>> [  360.499909] 3 locks held by kworker/u:0H/7:
>> [  360.499921]  #0:  (hdev->name){++++..}, at: [<8003ed28>] 
>> process_one_work+0x250/0x540
>> [  360.499968]  #1:  ((&hdev->rx_work)){+.+...}, at: [<8003ed28>] 
>> process_one_work+0x250/0x540
>> [  360.500009]  #2:  (&conn->chan_lock){+.+...}, at: [<7f44d2a8>] 
>> l2cap_get_chan_by_scid+0x24/0x54 [bluetooth]
>> <snip>
>> [  360.500256] 3 locks held by kworker/1:2/64:
>> [  360.500266]  #0:  (events){.+.+..}, at: [<8003ed28>] 
>> process_one_work+0x250/0x540
>> [  360.500307]  #1: ((&(&chan->retrans_timer)->work)){+.+...}, at: 
>> [<8003ed28>] process_one_work+0x250/0x540
>> [  360.500346]  #2:  (&chan->lock){+.+...}, at: [<7f450414>] 
>> l2cap_retrans_timeout+0x24/0x58 [bluetooth]
>> <snip>
>> [  360.502066] 2 locks held by <userland BT daemon>/1433:
>> [  360.502076]  #0:  (&conn->chan_lock){+.+...}, at: [<7f455378>] 
>> l2cap_sock_shutdown+0x34/0x130 [bluetooth]
>> [  360.502179]  #1:  (&chan->lock){+.+...}, at: [<7f455388>] 
>> l2cap_sock_shutdown+0x44/0x130 [bluetooth]
>> <snip>
>>
>> It can be seen that l2cap_get_chan_by_scid() is blocked so that the 
>> ACK cannot be processed and l2cap_retrans_timeout() has timed out in 
>> an attempt to retransmit the I frame but is also blocked. The 
>> Bluetooth stack is now paralysed.
>>
>> This scenario will fail every time. The locking seems to be flawed in 
>> l2cap_sock_shutdown() for L2CAP ERTM. Did it ever work ?
>>
>>
>> Workaround
>> ========
>>
>> Here is our workaround patch (based on kernel 3.8) which releases all 
>> the locks before waiting (sleeping):
>>
>> diff --git a/net/bluetooth/l2cap_core.c b/net/bluetooth/l2cap_core.c
>> index a53f47a..bf49066 100644
>> --- a/net/bluetooth/l2cap_core.c
>> +++ b/net/bluetooth/l2cap_core.c
>> @@ -1766,7 +1766,13 @@ int __l2cap_wait_ack(struct sock *sk)
>>                 }
>>
>>                 release_sock(sk);
>> +               l2cap_chan_unlock(chan);
>> +               if (chan->conn)
>> + mutex_unlock(&chan->conn->chan_lock);
>>                 timeo = schedule_timeout(timeo);
>> +               if (chan->conn)
>> + mutex_lock(&chan->conn->chan_lock);
>> +               l2cap_chan_lock(chan);
>>                 lock_sock(sk);
>>                 set_current_state(TASK_INTERRUPTIBLE);
>>
>> diff --git a/net/bluetooth/l2cap_sock.c b/net/bluetooth/l2cap_sock.c
>> index 657b3a4..bd0d1ae 100644
>> --- a/net/bluetooth/l2cap_sock.c
>> +++ b/net/bluetooth/l2cap_sock.c
>> @@ -867,8 +867,10 @@ static int l2cap_sock_shutdown(struct socket 
>> *sock, int how)
>>         lock_sock(sk);
>>
>>         if (!sk->sk_shutdown) {
>> -               if (chan->mode == L2CAP_MODE_ERTM)
>> +               if (chan->mode == L2CAP_MODE_ERTM) {
>>                         err = __l2cap_wait_ack(sk);
>> +                       conn = chan->conn;
>> +               }
>>
>>                 sk->sk_shutdown = SHUTDOWN_MASK;
>>
>> If you think this is the right approach then we will formally create 
>> a patch and send it to the mailing list for review.
>>
>> Our initial tests show that the deadlock no longer occurs with the 
>> patch applied, but we are worried about side effects.
>>
>> We have some doubts:
>>
>> a) What are the locks protecting ? Do all 3 locks need to be taken ?
>>
>> b) Can a better architectural solution be created that avoids the 
>> locks ?
>>
>> Perhaps arm the channel to disconnect, then after all ACKs have been 
>> received, call a callback to wake up l2cap_sock_shutdown ?
>>
>> c) What stops new I frames being sent during the waiting period so 
>> chan->unacked_frames increases ?
>>
>> Thanks for your time.
>>
>> Regards,
>> Dean
>>
>


-- 
Dean Jenkins
Embedded Software Engineer
Linux Transportation Solutions
Mentor Embedded Software Division
Mentor Graphics (UK) Ltd.

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2015-06-05  8:35 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2015-05-07 15:26 [RFC] Bluetooth ERTM L2CAP deadlocks (hung tasks) due to l2cap_sock_shutdown() Dean Jenkins
2015-05-29 10:10 ` Dean Jenkins
2015-06-05  8:35   ` Dean Jenkins

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).