* rctest -c "Can't connect: Device or resource busy (16)" @ 2014-03-11 21:44 Scott James Remnant 2014-03-11 23:03 ` Marcel Holtmann 0 siblings, 1 reply; 14+ messages in thread From: Scott James Remnant @ 2014-03-11 21:44 UTC (permalink / raw) To: linux-bluetooth@vger.kernel.org Is this expected behavior for RFCOMM? When using rctest -r on one side, and rctest -c (connect/disconnect/connect...) on the other, the client side fails to connect a significant percentage of the time: rctest[23705]: Can't connect: Device or resource busy (16) rctest[23705]: Can't connect: Device or resource busy (16) rctest[23705]: Can't connect: Device or resource busy (16) rctest[23705]: Can't connect: Device or resource busy (16) rctest[23705]: Can't connect: Device or resource busy (16) rctest[23705]: Can't connect: Device or resource busy (16) rctest[23705]: Connected [handle 34817, class 0xff8801, priority 0] rctest[23705]: Can't connect: Device or resource busy (16) rctest[23705]: Can't connect: Device or resource busy (16) rctest[23705]: Can't connect: Device or resource busy (16) rctest[23705]: Can't connect: Device or resource busy (16) rctest[23705]: Can't connect: Device or resource busy (16) rctest[23705]: Can't connect: Device or resource busy (16) Scott ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: rctest -c "Can't connect: Device or resource busy (16)" 2014-03-11 21:44 rctest -c "Can't connect: Device or resource busy (16)" Scott James Remnant @ 2014-03-11 23:03 ` Marcel Holtmann 2014-03-11 23:37 ` Scott James Remnant 0 siblings, 1 reply; 14+ messages in thread From: Marcel Holtmann @ 2014-03-11 23:03 UTC (permalink / raw) To: Scott James Remnant; +Cc: linux-bluetooth@vger.kernel.org Hi Scott, > Is this expected behavior for RFCOMM? When using rctest -r on one > side, and rctest -c (connect/disconnect/connect...) on the other, the > client side fails to connect a significant percentage of the time: > > rctest[23705]: Can't connect: Device or resource busy (16) > rctest[23705]: Can't connect: Device or resource busy (16) > rctest[23705]: Can't connect: Device or resource busy (16) > rctest[23705]: Can't connect: Device or resource busy (16) > rctest[23705]: Can't connect: Device or resource busy (16) > rctest[23705]: Can't connect: Device or resource busy (16) > rctest[23705]: Connected [handle 34817, class 0xff8801, priority 0] > rctest[23705]: Can't connect: Device or resource busy (16) > rctest[23705]: Can't connect: Device or resource busy (16) > rctest[23705]: Can't connect: Device or resource busy (16) > rctest[23705]: Can't connect: Device or resource busy (16) > rctest[23705]: Can't connect: Device or resource busy (16) > rctest[23705]: Can't connect: Device or resource busy (16) it is not an expected behavior. Can you create a btmon trace for this and see what error code we get on the HCI level. Without any kind of analysis this sound like a race condition in the disconnect handling where the connection is not fully terminated and you try to re-connect too quickly. What kernel version is this? And as usual the first question, do you still see this behavior with a bluetooth-next kernel? Regards Marcel ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: rctest -c "Can't connect: Device or resource busy (16)" 2014-03-11 23:03 ` Marcel Holtmann @ 2014-03-11 23:37 ` Scott James Remnant 2014-03-12 2:14 ` Marcel Holtmann 0 siblings, 1 reply; 14+ messages in thread From: Scott James Remnant @ 2014-03-11 23:37 UTC (permalink / raw) To: Marcel Holtmann; +Cc: linux-bluetooth@vger.kernel.org [-- Attachment #1: Type: text/plain, Size: 1639 bytes --] 3.10 with bluetooth-next applied - btmon of both sides attached On Tue, Mar 11, 2014 at 4:03 PM, Marcel Holtmann <marcel@holtmann.org> wrote: > Hi Scott, > >> Is this expected behavior for RFCOMM? When using rctest -r on one >> side, and rctest -c (connect/disconnect/connect...) on the other, the >> client side fails to connect a significant percentage of the time: >> >> rctest[23705]: Can't connect: Device or resource busy (16) >> rctest[23705]: Can't connect: Device or resource busy (16) >> rctest[23705]: Can't connect: Device or resource busy (16) >> rctest[23705]: Can't connect: Device or resource busy (16) >> rctest[23705]: Can't connect: Device or resource busy (16) >> rctest[23705]: Can't connect: Device or resource busy (16) >> rctest[23705]: Connected [handle 34817, class 0xff8801, priority 0] >> rctest[23705]: Can't connect: Device or resource busy (16) >> rctest[23705]: Can't connect: Device or resource busy (16) >> rctest[23705]: Can't connect: Device or resource busy (16) >> rctest[23705]: Can't connect: Device or resource busy (16) >> rctest[23705]: Can't connect: Device or resource busy (16) >> rctest[23705]: Can't connect: Device or resource busy (16) > > it is not an expected behavior. Can you create a btmon trace for this and see what error code we get on the HCI level. > > Without any kind of analysis this sound like a race condition in the disconnect handling where the connection is not fully terminated and you try to re-connect too quickly. > > What kernel version is this? And as usual the first question, do you still see this behavior with a bluetooth-next kernel? > > Regards > > Marcel > [-- Attachment #2: connect.btsnoop --] [-- Type: application/octet-stream, Size: 13642 bytes --] [-- Attachment #3: listen.btsnoop --] [-- Type: application/octet-stream, Size: 14060 bytes --] ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: rctest -c "Can't connect: Device or resource busy (16)" 2014-03-11 23:37 ` Scott James Remnant @ 2014-03-12 2:14 ` Marcel Holtmann 2014-03-12 3:01 ` Peter Hurley 2014-03-13 16:43 ` [PATCH 0/3] Fix RFCOMM connect/disconn races Peter Hurley 0 siblings, 2 replies; 14+ messages in thread From: Marcel Holtmann @ 2014-03-12 2:14 UTC (permalink / raw) To: Scott James Remnant; +Cc: linux-bluetooth@vger.kernel.org Hi Scott, > 3.10 with bluetooth-next applied - btmon of both sides attached I am seeing the same with bluetooth-next on a 3.13 kernel. It even has a weird pattern for me. rctest[52844]: Connected [handle 11, class 0x000000, priority 0] rctest[52844]: Can't connect: Device or resource busy (16) rctest[52844]: Can't connect: Device or resource busy (16) rctest[52844]: Can't connect: Device or resource busy (16) rctest[52844]: Can't connect: Device or resource busy (16) rctest[52844]: Can't connect: Device or resource busy (16) rctest[52844]: Can't connect: Device or resource busy (16) rctest[52844]: Can't connect: Device or resource busy (16) rctest[52844]: Can't connect: Device or resource busy (16) rctest[52844]: Can't connect: Device or resource busy (16) rctest[52844]: Can't connect: Device or resource busy (16) rctest[52844]: Can't connect: Device or resource busy (16) rctest[52844]: Can't connect: Device or resource busy (16) rctest[52844]: Can't connect: Connection refused (111) rctest[52844]: Can't connect: Connection reset by peer (104) rctest[52844]: Can't connect: Connection reset by peer (104) rctest[52844]: Connected [handle 11, class 0x000000, priority 0] Unfortunately I currently have no idea on why this happens. When using SO_LINGER of 60 seconds with rctest -L 60 -c <bdaddr> I am getting some really odd tracing behavior. > ACL Data RX: Handle 11 flags 0x02 dlen 14 L2CAP: Command Reject (0x01) ident 7 len 6 Reason: Invalid CID in request (0x0002) Destination CID: 64 Source CID: 0 This should never happen and makes me wonder if something is even screwed up one down in L2CAP. Do you happen to have time to bisect this and figure out which patch introduced this behavior. Regards Marcel ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: rctest -c "Can't connect: Device or resource busy (16)" 2014-03-12 2:14 ` Marcel Holtmann @ 2014-03-12 3:01 ` Peter Hurley 2014-03-12 3:27 ` Peter Hurley ` (2 more replies) 2014-03-13 16:43 ` [PATCH 0/3] Fix RFCOMM connect/disconn races Peter Hurley 1 sibling, 3 replies; 14+ messages in thread From: Peter Hurley @ 2014-03-12 3:01 UTC (permalink / raw) To: Marcel Holtmann, Scott James Remnant; +Cc: linux-bluetooth@vger.kernel.org On 03/11/2014 10:14 PM, Marcel Holtmann wrote: > Hi Scott, > >> 3.10 with bluetooth-next applied - btmon of both sides attached > > I am seeing the same with bluetooth-next on a 3.13 kernel. It even has a weird pattern for me. > > rctest[52844]: Connected [handle 11, class 0x000000, priority 0] > rctest[52844]: Can't connect: Device or resource busy (16) > rctest[52844]: Can't connect: Device or resource busy (16) > rctest[52844]: Can't connect: Device or resource busy (16) > rctest[52844]: Can't connect: Device or resource busy (16) > rctest[52844]: Can't connect: Device or resource busy (16) > rctest[52844]: Can't connect: Device or resource busy (16) > rctest[52844]: Can't connect: Device or resource busy (16) > rctest[52844]: Can't connect: Device or resource busy (16) > rctest[52844]: Can't connect: Device or resource busy (16) > rctest[52844]: Can't connect: Device or resource busy (16) > rctest[52844]: Can't connect: Device or resource busy (16) > rctest[52844]: Can't connect: Device or resource busy (16) > rctest[52844]: Can't connect: Connection refused (111) > rctest[52844]: Can't connect: Connection reset by peer (104) > rctest[52844]: Can't connect: Connection reset by peer (104) > rctest[52844]: Connected [handle 11, class 0x000000, priority 0] > > Unfortunately I currently have no idea on why this happens. When using > SO_LINGER of 60 seconds with rctest -L 60 -c <bdaddr> I am getting some > really odd tracing behavior. > >> ACL Data RX: Handle 11 flags 0x02 dlen 14 > L2CAP: Command Reject (0x01) ident 7 len 6 > Reason: Invalid CID in request (0x0002) > Destination CID: 64 > Source CID: 0 > > This should never happen and makes me wonder if something is > even screwed up one down in L2CAP. Do you happen to have time > to bisect this and figure out which patch introduced this behavior. Marcel, I'm looking into this on the assumption I broke something in the refactor of __rfcomm_dlc_close. I can reproduce this and looking at the kernel trace shows that that UA reply of the dlc DISC is not being processed, because the rfcomm thread is not getting scheduled, but I'm not sure where that was supposed to happen. IOW, rctest -r | rctest -c xx:xx:.... | ... | rfcomm_sock_shutdown | __rfcomm_sock_close | rfcomm_dlc_close dlci 20 | __rfcomm_dlc_close | rfcomm_send_disc dlci 20 rfcomm_recv_disc state 1 dlci 20 | rfcomm_send_ua dlci 20 | __rfcomm_dlc_close state 9 | rfcomm_dlc_unlink | ... | UA never shows up here because | rfcomm_process_sessions isn't run | so rfcomm_recv_ua never happens Regards, Peter Hurley ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: rctest -c "Can't connect: Device or resource busy (16)" 2014-03-12 3:01 ` Peter Hurley @ 2014-03-12 3:27 ` Peter Hurley 2014-03-12 4:27 ` Peter Hurley 2014-03-12 17:18 ` Peter Hurley 2 siblings, 0 replies; 14+ messages in thread From: Peter Hurley @ 2014-03-12 3:27 UTC (permalink / raw) To: Marcel Holtmann, Scott James Remnant; +Cc: linux-bluetooth@vger.kernel.org On 03/11/2014 11:01 PM, Peter Hurley wrote: > On 03/11/2014 10:14 PM, Marcel Holtmann wrote: >> Hi Scott, >> >>> 3.10 with bluetooth-next applied - btmon of both sides attached >> >> I am seeing the same with bluetooth-next on a 3.13 kernel. It even has a weird pattern for me. The pattern occurs because this the dlc is not getting unlinked on this end until the watchdog timer goes off and closes it. [The listener is unlinking the dlc properly. ] >> rctest[52844]: Connected [handle 11, class 0x000000, priority 0] >> rctest[52844]: Can't connect: Device or resource busy (16) >> rctest[52844]: Can't connect: Device or resource busy (16) >> rctest[52844]: Can't connect: Device or resource busy (16) >> rctest[52844]: Can't connect: Device or resource busy (16) >> rctest[52844]: Can't connect: Device or resource busy (16) >> rctest[52844]: Can't connect: Device or resource busy (16) >> rctest[52844]: Can't connect: Device or resource busy (16) >> rctest[52844]: Can't connect: Device or resource busy (16) >> rctest[52844]: Can't connect: Device or resource busy (16) >> rctest[52844]: Can't connect: Device or resource busy (16) >> rctest[52844]: Can't connect: Device or resource busy (16) >> rctest[52844]: Can't connect: Device or resource busy (16) >> rctest[52844]: Can't connect: Connection refused (111) >> rctest[52844]: Can't connect: Connection reset by peer (104) >> rctest[52844]: Can't connect: Connection reset by peer (104) >> rctest[52844]: Connected [handle 11, class 0x000000, priority 0] >> >> Unfortunately I currently have no idea on why this happens. When using >> SO_LINGER of 60 seconds with rctest -L 60 -c <bdaddr> I am getting some >> really odd tracing behavior. >> >>> ACL Data RX: Handle 11 flags 0x02 dlen 14 >> L2CAP: Command Reject (0x01) ident 7 len 6 >> Reason: Invalid CID in request (0x0002) >> Destination CID: 64 >> Source CID: 0 >> >> This should never happen and makes me wonder if something is >> even screwed up one down in L2CAP. Do you happen to have time >> to bisect this and figure out which patch introduced this behavior. > > Marcel, > > I'm looking into this on the assumption I broke something in > the refactor of __rfcomm_dlc_close. Ok, it's nothing I broke. This is repeatable on 3.12 (sender and receiver). I'll keep looking this anyway since at least I know what's broken. > I can reproduce this and looking at the kernel trace shows > that that UA reply of the dlc DISC is not being processed, because > the rfcomm thread is not getting scheduled, but I'm not sure where > that was supposed to happen. > > IOW, > > rctest -r | rctest -c xx:xx:.... > | ... > | rfcomm_sock_shutdown > | __rfcomm_sock_close > | rfcomm_dlc_close dlci 20 > | __rfcomm_dlc_close > | rfcomm_send_disc dlci 20 > rfcomm_recv_disc state 1 dlci 20 | > rfcomm_send_ua dlci 20 | > __rfcomm_dlc_close state 9 | > rfcomm_dlc_unlink | > ... | UA never shows up here because > | rfcomm_process_sessions isn't run > | so rfcomm_recv_ua never happens Regards, Peter Hurley ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: rctest -c "Can't connect: Device or resource busy (16)" 2014-03-12 3:01 ` Peter Hurley 2014-03-12 3:27 ` Peter Hurley @ 2014-03-12 4:27 ` Peter Hurley 2014-03-12 17:18 ` Peter Hurley 2 siblings, 0 replies; 14+ messages in thread From: Peter Hurley @ 2014-03-12 4:27 UTC (permalink / raw) To: Marcel Holtmann, Scott James Remnant; +Cc: linux-bluetooth@vger.kernel.org On 03/11/2014 11:01 PM, Peter Hurley wrote: > On 03/11/2014 10:14 PM, Marcel Holtmann wrote: >> Hi Scott, >> >>> 3.10 with bluetooth-next applied - btmon of both sides attached >> >> I am seeing the same with bluetooth-next on a 3.13 kernel. It even has a weird pattern for me. >> >> rctest[52844]: Connected [handle 11, class 0x000000, priority 0] >> rctest[52844]: Can't connect: Device or resource busy (16) >> rctest[52844]: Can't connect: Device or resource busy (16) >> rctest[52844]: Can't connect: Device or resource busy (16) >> rctest[52844]: Can't connect: Device or resource busy (16) >> rctest[52844]: Can't connect: Device or resource busy (16) >> rctest[52844]: Can't connect: Device or resource busy (16) >> rctest[52844]: Can't connect: Device or resource busy (16) >> rctest[52844]: Can't connect: Device or resource busy (16) >> rctest[52844]: Can't connect: Device or resource busy (16) >> rctest[52844]: Can't connect: Device or resource busy (16) >> rctest[52844]: Can't connect: Device or resource busy (16) >> rctest[52844]: Can't connect: Device or resource busy (16) >> rctest[52844]: Can't connect: Connection refused (111) >> rctest[52844]: Can't connect: Connection reset by peer (104) >> rctest[52844]: Can't connect: Connection reset by peer (104) >> rctest[52844]: Connected [handle 11, class 0x000000, priority 0] >> >> Unfortunately I currently have no idea on why this happens. When using >> SO_LINGER of 60 seconds with rctest -L 60 -c <bdaddr> I am getting some >> really odd tracing behavior. >> >>> ACL Data RX: Handle 11 flags 0x02 dlen 14 >> L2CAP: Command Reject (0x01) ident 7 len 6 >> Reason: Invalid CID in request (0x0002) >> Destination CID: 64 >> Source CID: 0 >> >> This should never happen and makes me wonder if something is >> even screwed up one down in L2CAP. Do you happen to have time >> to bisect this and figure out which patch introduced this behavior. Sorry, I was wrong about the UA non-receipt; found it much further down in the kernel trace. There's quite a lag between sending the DISC and receiving the UA reply; about 12ms. ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: rctest -c "Can't connect: Device or resource busy (16)" 2014-03-12 3:01 ` Peter Hurley 2014-03-12 3:27 ` Peter Hurley 2014-03-12 4:27 ` Peter Hurley @ 2014-03-12 17:18 ` Peter Hurley 2014-03-28 17:49 ` Scott James Remnant 2 siblings, 1 reply; 14+ messages in thread From: Peter Hurley @ 2014-03-12 17:18 UTC (permalink / raw) To: Marcel Holtmann, Scott James Remnant; +Cc: linux-bluetooth@vger.kernel.org > On 03/11/2014 10:14 PM, Marcel Holtmann wrote: >> Hi Scott, >> >>> 3.10 with bluetooth-next applied - btmon of both sides attached >> >> I am seeing the same with bluetooth-next on a 3.13 kernel. It even has a weird pattern for me. >> >> rctest[52844]: Connected [handle 11, class 0x000000, priority 0] >> rctest[52844]: Can't connect: Device or resource busy (16) >> rctest[52844]: Can't connect: Device or resource busy (16) >> rctest[52844]: Can't connect: Device or resource busy (16) >> rctest[52844]: Can't connect: Device or resource busy (16) >> rctest[52844]: Can't connect: Device or resource busy (16) >> rctest[52844]: Can't connect: Device or resource busy (16) >> rctest[52844]: Can't connect: Device or resource busy (16) >> rctest[52844]: Can't connect: Device or resource busy (16) >> rctest[52844]: Can't connect: Device or resource busy (16) >> rctest[52844]: Can't connect: Device or resource busy (16) >> rctest[52844]: Can't connect: Device or resource busy (16) >> rctest[52844]: Can't connect: Device or resource busy (16) >> rctest[52844]: Can't connect: Connection refused (111) >> rctest[52844]: Can't connect: Connection reset by peer (104) >> rctest[52844]: Can't connect: Connection reset by peer (104) >> rctest[52844]: Connected [handle 11, class 0x000000, priority 0] >> >> Unfortunately I currently have no idea on why this happens. When using >> SO_LINGER of 60 seconds with rctest -L 60 -c <bdaddr> I am getting some >> really odd tracing behavior. >> >>> ACL Data RX: Handle 11 flags 0x02 dlen 14 >> L2CAP: Command Reject (0x01) ident 7 len 6 >> Reason: Invalid CID in request (0x0002) >> Destination CID: 64 >> Source CID: 0 >> >> This should never happen and makes me wonder if something is >> even screwed up one down in L2CAP. Do you happen to have time >> to bisect this and figure out which patch introduced this behavior. There are 2 separate problems going on here. The failure to reconnect is happening because after sending DISC, the sender still replies to stale MCC packets, which the receiver (whose state now reflects the DISC) replies with a DM packet which gets received on the re-opened dlc, causing it to disconnect with error. This is trivially fixable, and AFAICT, is not a regression. The 2nd problem of not being able to reconnect in a reasonable number of times is due to the huge round-trip delays after receiving DISC and replying with UA. This is a result of the excessive number of execution contexts being used on the BT stack. 1) from USB interrupt -> btusb_bulk_complete -> queue_work(rx_work) 2) hci_rx_work -> l2cap_data_channel -> l2cap_sock_recv_cb -> sock_queue_rcv_skb 3) -> rfcomm_l2data_ready -> rfcomm_schedule 4) rfcomm_process_rx -> rfcomm_recv_disc -> rfcomm_send_ua -> l2cap_do_send -> hci_queue_acl 5) hci_tx_work -> btusb_send_frame On a 8-core preempt workstation, this takes ~200us, which is a lot for receiving a 12-byte packet and replying with a 12-byte packet. Plus, it doesn't help that USB tacks on another 918us to actually complete the URB. On a 1-core non-preempt laptop, this takes 51ms! Regards, Peter Hurley ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: rctest -c "Can't connect: Device or resource busy (16)" 2014-03-12 17:18 ` Peter Hurley @ 2014-03-28 17:49 ` Scott James Remnant 0 siblings, 0 replies; 14+ messages in thread From: Scott James Remnant @ 2014-03-28 17:49 UTC (permalink / raw) To: Peter Hurley; +Cc: Marcel Holtmann, linux-bluetooth@vger.kernel.org [-- Attachment #1: Type: text/plain, Size: 1374 bytes --] On Wed, Mar 12, 2014 at 10:18 AM, Peter Hurley <peter@hurleysoftware.com> wrote: >> On 03/11/2014 10:14 PM, Marcel Holtmann wrote: > There are 2 separate problems going on here. > > The failure to reconnect is happening because after sending DISC, > the sender still replies to stale MCC packets, which the receiver > (whose state now reflects the DISC) replies with a DM packet which > gets received on the re-opened dlc, causing it to disconnect with > error. This is trivially fixable, and AFAICT, is not a regression. > > The 2nd problem of not being able to reconnect in a reasonable > number of times is due to the huge round-trip delays after receiving > DISC and replying with UA. > Not sure whether this is related, or a different problem: Using BlueZ's Profile APIs to make an RFCOMM connection to a paired device, write data, receive reply, then disconnect after a while (within 10 mins usually) we end up in a state where further connections to that device fail. On 3.8 they seem to fail with org.bluez.Error.InProgress while on bluetooth-next they fail with org.bluez.Error.Failed. "connect ADDR" in bluetoothctl returns the same error, and there isn't any btmon activity for the attempt. Restarting bluetoothd, resetting the radio, and even rmmod/modprobe of btusb do not fix the issue - only a full reboot. (btmon of the entire session attached) Scott [-- Attachment #2: 355651.btsnoop.bz2 --] [-- Type: application/x-bzip2, Size: 228250 bytes --] ^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH 0/3] Fix RFCOMM connect/disconn races 2014-03-12 2:14 ` Marcel Holtmann 2014-03-12 3:01 ` Peter Hurley @ 2014-03-13 16:43 ` Peter Hurley 2014-03-13 16:43 ` [PATCH 1/3] bluetooth: rfcomm: Reply with DM after dlc disconnect Peter Hurley ` (3 more replies) 1 sibling, 4 replies; 14+ messages in thread From: Peter Hurley @ 2014-03-13 16:43 UTC (permalink / raw) To: Marcel Holtmann, Scott James Remnant; +Cc: linux-bluetooth, Peter Hurley Marcel, This patch series fixes the observed errors when running rctest -r on one station and rctest -c on the other. Please note that patch 2/3 is somewhat risky in that it allows multiple RFCOMM sessions between the same endpoints to co-exist, provided that all but one of the sessions is closing or is closed. I did test this, but there may be some subtlety I've overlooked. [I note this because you may prefer the simpler and more bulletproof solution of returning -EBUSY from __rfcomm_dlc_open() in this situation; I can understand if you have lost patience for RFCOMM breakage.] This series does not address the excessive latency from sending the DISC command to receiving the UA reply. I think a partial solution would be to skip the rfcomm thread and run rfcomm_process_sessions() directly from rfcomm_l2data_ready(); this would require some rework where sk->sk_data_ready() is called directly from contexts that aren't suitable for running rfcomm_process_sessions(). And this wouldn't address the USB polling latency at all. Regards, Peter Hurley (3): bluetooth: rfcomm: Reply with DM after dlc disconnect bluetooth: rfcomm: Create new session if closing old session bluetooth: rfcomm: Defer session teardown after last dlc net/bluetooth/rfcomm/core.c | 32 ++++++++++++++++++++++---------- 1 file changed, 22 insertions(+), 10 deletions(-) -- 1.8.1.2 ^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH 1/3] bluetooth: rfcomm: Reply with DM after dlc disconnect 2014-03-13 16:43 ` [PATCH 0/3] Fix RFCOMM connect/disconn races Peter Hurley @ 2014-03-13 16:43 ` Peter Hurley 2014-03-13 16:43 ` [PATCH 2/3] bluetooth: rfcomm: Create new session if closing old session Peter Hurley ` (2 subsequent siblings) 3 siblings, 0 replies; 14+ messages in thread From: Peter Hurley @ 2014-03-13 16:43 UTC (permalink / raw) To: Marcel Holtmann, Scott James Remnant; +Cc: linux-bluetooth, Peter Hurley Stale commands and data may be received after DISC has already been sent for a dlc; specifically the MSC, RLS, RPN and DISC commands must reply with DM for a dlc already closing. [The PN command receive already handles this case and other TS 0710 commands are not dlc-specific.] Fixes when a stale reply to a stale command causes a DM response on a newly reopened dlc. For example, Station A Station B MSC --->| | | |<--- DISC | MSC ----> | | <---- DISC |---> MSC DISC <---| |<--- MSC UA --->| <---- MSC | The dlc is now closed on Station A. | UA ----> | MSC <---| |---> UA The dlc is now closed on Station B. DM --->| | Open new dlc @ same dlci | DM ----> | | |---> DM | | Prematurely closes new dlc. Reported-by: Scott James Remnant <keybuk@chromium.org> Signed-off-by: Peter Hurley <peter@hurleysoftware.com> --- net/bluetooth/rfcomm/core.c | 22 +++++++++++++++++++--- 1 file changed, 19 insertions(+), 3 deletions(-) diff --git a/net/bluetooth/rfcomm/core.c b/net/bluetooth/rfcomm/core.c index 112749c..6aa90c0 100644 --- a/net/bluetooth/rfcomm/core.c +++ b/net/bluetooth/rfcomm/core.c @@ -1253,7 +1253,7 @@ static struct rfcomm_session *rfcomm_recv_disc(struct rfcomm_session *s, if (dlci) { struct rfcomm_dlc *d = rfcomm_dlc_get(s, dlci); - if (d) { + if (d && d->state != BT_DISCONN) { rfcomm_send_ua(s, dlci); if (d->state == BT_CONNECT || d->state == BT_CONFIG) @@ -1445,6 +1445,7 @@ static int rfcomm_recv_pn(struct rfcomm_session *s, int cr, struct sk_buff *skb) static int rfcomm_recv_rpn(struct rfcomm_session *s, int cr, int len, struct sk_buff *skb) { + struct rfcomm_dlc *d; struct rfcomm_rpn *rpn = (void *) skb->data; u8 dlci = __get_dlci(rpn->dlci); @@ -1461,6 +1462,12 @@ static int rfcomm_recv_rpn(struct rfcomm_session *s, int cr, int len, struct sk_ dlci, cr, len, rpn->bit_rate, rpn->line_settings, rpn->flow_ctrl, rpn->xon_char, rpn->xoff_char, rpn->param_mask); + d = rfcomm_dlc_get(s, dlci); + if (!d || d->state == BT_DISCONN) { + rfcomm_send_dm(s, dlci); + return 0; + } + if (!cr) return 0; @@ -1551,11 +1558,18 @@ rpn_out: static int rfcomm_recv_rls(struct rfcomm_session *s, int cr, struct sk_buff *skb) { + struct rfcomm_dlc *d; struct rfcomm_rls *rls = (void *) skb->data; u8 dlci = __get_dlci(rls->dlci); BT_DBG("dlci %d cr %d status 0x%x", dlci, cr, rls->status); + d = rfcomm_dlc_get(s, dlci); + if (!d || d->state == BT_DISCONN) { + rfcomm_send_dm(s, dlci); + return 0; + } + if (!cr) return 0; @@ -1577,8 +1591,10 @@ static int rfcomm_recv_msc(struct rfcomm_session *s, int cr, struct sk_buff *skb BT_DBG("dlci %d cr %d v24 0x%x", dlci, cr, msc->v24_sig); d = rfcomm_dlc_get(s, dlci); - if (!d) + if (!d || d->state == BT_DISCONN) { + rfcomm_send_dm(s, dlci); return 0; + } if (cr) { if (msc->v24_sig & RFCOMM_V24_FC && !d->cfc) @@ -1671,7 +1687,7 @@ static int rfcomm_recv_data(struct rfcomm_session *s, u8 dlci, int pf, struct sk BT_DBG("session %p state %ld dlci %d pf %d", s, s->state, dlci, pf); d = rfcomm_dlc_get(s, dlci); - if (!d) { + if (!d || d->state == BT_DISCONN) { rfcomm_send_dm(s, dlci); goto drop; } -- 1.8.1.2 ^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH 2/3] bluetooth: rfcomm: Create new session if closing old session 2014-03-13 16:43 ` [PATCH 0/3] Fix RFCOMM connect/disconn races Peter Hurley 2014-03-13 16:43 ` [PATCH 1/3] bluetooth: rfcomm: Reply with DM after dlc disconnect Peter Hurley @ 2014-03-13 16:43 ` Peter Hurley 2014-03-13 16:43 ` [PATCH 3/3] bluetooth: rfcomm: Defer session teardown after last dlc Peter Hurley 2014-04-02 20:04 ` [PATCH 0/3] Fix RFCOMM connect/disconn races Marcel Holtmann 3 siblings, 0 replies; 14+ messages in thread From: Peter Hurley @ 2014-03-13 16:43 UTC (permalink / raw) To: Marcel Holtmann, Scott James Remnant; +Cc: linux-bluetooth, Peter Hurley If the rfcomm session has already sent DISC (thus in BT_DISCONN state), allow dlc opens to create a new session instead of trying to open a dlc in a closing session (and instead of returning an error). Allow multiple rfcomm sessions with the same endpoints as long as the existing sessions are either in BT_DISCONN or BT_CLOSED states. Fixes multi-second delays when running rctest -c on one station and rctest -r on the other. Reported-by: Scott James Remnant <keybuk@chromium.org> Signed-off-by: Peter Hurley <peter@hurleysoftware.com> --- net/bluetooth/rfcomm/core.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/net/bluetooth/rfcomm/core.c b/net/bluetooth/rfcomm/core.c index 6aa90c0..292322c 100644 --- a/net/bluetooth/rfcomm/core.c +++ b/net/bluetooth/rfcomm/core.c @@ -695,6 +695,9 @@ static struct rfcomm_session *rfcomm_session_get(bdaddr_t *src, bdaddr_t *dst) s = list_entry(p, struct rfcomm_session, list); chan = l2cap_pi(s->sock->sk)->chan; + if (s->state == BT_DISCONN || s->state == BT_CLOSED) + continue; + if ((!bacmp(src, BDADDR_ANY) || !bacmp(&chan->src, src)) && !bacmp(&chan->dst, dst)) return s; -- 1.8.1.2 ^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH 3/3] bluetooth: rfcomm: Defer session teardown after last dlc 2014-03-13 16:43 ` [PATCH 0/3] Fix RFCOMM connect/disconn races Peter Hurley 2014-03-13 16:43 ` [PATCH 1/3] bluetooth: rfcomm: Reply with DM after dlc disconnect Peter Hurley 2014-03-13 16:43 ` [PATCH 2/3] bluetooth: rfcomm: Create new session if closing old session Peter Hurley @ 2014-03-13 16:43 ` Peter Hurley 2014-04-02 20:04 ` [PATCH 0/3] Fix RFCOMM connect/disconn races Marcel Holtmann 3 siblings, 0 replies; 14+ messages in thread From: Peter Hurley @ 2014-03-13 16:43 UTC (permalink / raw) To: Marcel Holtmann, Scott James Remnant; +Cc: linux-bluetooth, Peter Hurley Commit 9cf5b0ea3a7f1432c61029f7aaf4b8b338628884, [Bluetooth] Disconnect L2CAP connection after last RFCOMM DLC, immediately disconnects the session when closing the last dlc. This commit predates the later addition of a session timer [1] which sets up a timer to automatically close the session after the last dlc is closed. This automatic behavior was further noted noted in [2]: When the last RFCOMM data channel is closed, a timer is normally set up to disconnect the control channel at a later time. If the control channel disconnect command is sent with the timer pending, the timer needs to be cancelled. Rely on automatic session disconnect; this fulfills the responsibility of closing the multiplexer while allowing the last dlc to be closed and reopened without needing to handshake a new l2cap connection. [1] commit 9e726b17422bade75fba94e625cd35fd1353e682 Bluetooth: Fix rejected connection not disconnecting ACL link [2] commit 79e654787c67f6b05f73366ff8ccac72ba7249e6 Bluetooth: Clear RFCOMM session timer when disconnecting last channel Signed-off-by: Peter Hurley <peter@hurleysoftware.com> --- net/bluetooth/rfcomm/core.c | 7 ------- 1 file changed, 7 deletions(-) diff --git a/net/bluetooth/rfcomm/core.c b/net/bluetooth/rfcomm/core.c index 292322c..062a96c 100644 --- a/net/bluetooth/rfcomm/core.c +++ b/net/bluetooth/rfcomm/core.c @@ -1193,13 +1193,6 @@ static struct rfcomm_session *rfcomm_recv_ua(struct rfcomm_session *s, u8 dlci) case BT_DISCONN: d->state = BT_CLOSED; __rfcomm_dlc_close(d, 0); - - if (list_empty(&s->dlcs)) { - s->state = BT_DISCONN; - rfcomm_send_disc(s, 0); - rfcomm_session_clear_timer(s); - } - break; } } else { -- 1.8.1.2 ^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH 0/3] Fix RFCOMM connect/disconn races 2014-03-13 16:43 ` [PATCH 0/3] Fix RFCOMM connect/disconn races Peter Hurley ` (2 preceding siblings ...) 2014-03-13 16:43 ` [PATCH 3/3] bluetooth: rfcomm: Defer session teardown after last dlc Peter Hurley @ 2014-04-02 20:04 ` Marcel Holtmann 3 siblings, 0 replies; 14+ messages in thread From: Marcel Holtmann @ 2014-04-02 20:04 UTC (permalink / raw) To: Peter Hurley; +Cc: Scott James Remnant, linux-bluetooth Hi Peter, > This patch series fixes the observed errors when running > rctest -r on one station and rctest -c on the other. > > Please note that patch 2/3 is somewhat risky in that it allows > multiple RFCOMM sessions between the same endpoints to co-exist, > provided that all but one of the sessions is closing or is closed. > I did test this, but there may be some subtlety I've overlooked. > > [I note this because you may prefer the simpler and more bulletproof > solution of returning -EBUSY from __rfcomm_dlc_open() in this > situation; I can understand if you have lost patience for RFCOMM > breakage.] I do not know if we can allow tow sessions with the same endpoint and not causing any other error. Especially also not breaking existing Bluetooth qualification test cases. Would it be safe to take 1/3 and 3/3 and still think about 2/3 a bit more. > This series does not address the excessive latency from sending > the DISC command to receiving the UA reply. I think a partial > solution would be to skip the rfcomm thread and run > rfcomm_process_sessions() directly from rfcomm_l2data_ready(); > this would require some rework where sk->sk_data_ready() is > called directly from contexts that aren't suitable for running > rfcomm_process_sessions(). And this wouldn't address the > USB polling latency at all. I have no idea what is causing this at all. However it could be well that this comes from the time when we moved HCI processing from tasklets into workqueues. And now we have things messing with each other. The long term goal (which nobody seems to be working on) was to replace the L2CAP socket that RFCOMM uses and thus why the rfcommd thread is needed with a direct L2CAP ops handling. Regards Marcel ^ permalink raw reply [flat|nested] 14+ messages in thread
end of thread, other threads:[~2014-04-02 20:04 UTC | newest] Thread overview: 14+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2014-03-11 21:44 rctest -c "Can't connect: Device or resource busy (16)" Scott James Remnant 2014-03-11 23:03 ` Marcel Holtmann 2014-03-11 23:37 ` Scott James Remnant 2014-03-12 2:14 ` Marcel Holtmann 2014-03-12 3:01 ` Peter Hurley 2014-03-12 3:27 ` Peter Hurley 2014-03-12 4:27 ` Peter Hurley 2014-03-12 17:18 ` Peter Hurley 2014-03-28 17:49 ` Scott James Remnant 2014-03-13 16:43 ` [PATCH 0/3] Fix RFCOMM connect/disconn races Peter Hurley 2014-03-13 16:43 ` [PATCH 1/3] bluetooth: rfcomm: Reply with DM after dlc disconnect Peter Hurley 2014-03-13 16:43 ` [PATCH 2/3] bluetooth: rfcomm: Create new session if closing old session Peter Hurley 2014-03-13 16:43 ` [PATCH 3/3] bluetooth: rfcomm: Defer session teardown after last dlc Peter Hurley 2014-04-02 20:04 ` [PATCH 0/3] Fix RFCOMM connect/disconn races Marcel Holtmann
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.