public inbox for linux-bluetooth@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH v4] Bluetooth: Increase LE connection timeout for industrial sensors
@ 2026-03-02 23:37 Dajid MOREL
  2026-03-03  0:11 ` [v4] " bluez.test.bot
  2026-03-03 17:24 ` [PATCH v4] " Luiz Augusto von Dentz
  0 siblings, 2 replies; 22+ messages in thread
From: Dajid MOREL @ 2026-03-02 23:37 UTC (permalink / raw)
  To: linux-bluetooth; +Cc: luiz.dentz, Dajid MOREL

In an industrial IoT context at Volvo Group, we use TE Connectivity
BLE pressure sensors. These sensors exhibit high latency during
the initial LE connection handshake in noisy RF environments. The
connection systematically fails on Ubuntu Core 22 (BlueZ) because the
connection attempt is aborted too early.

In the v2 thread, it was suggested that userspace (via setsockopt
SO_SNDTIMEO) dictates the connection timeout (defaulting to 40s),
suspecting that userspace was cutting the connection at 2 seconds,
not the kernel.

To verify this, an empirical test was conducted using the following
Python/Bleak script to force the application timeout to 45.0 seconds:

  import asyncio
  from bleak import BleakClient, BleakScanner
  import time

  ADDRESS = "E8:C0:B1:D4:A3:3C"

  async def test_connection():
      device = await BleakScanner.find_device_by_address(ADDRESS, timeout=15.0)
      start_time = time.time()
      try:
          # Forcing 45s timeout in userspace
          async with BleakClient(device, timeout=45.0) as client:
              print(f"Connected in {time.time() - start_time:.2f}s")
      except Exception as e:
          print(f"Failed after {time.time() - start_time:.2f}s: {e}")

  asyncio.run(test_connection())

1. Result on UNMODIFIED Kernel: The userspace script patiently waited
   for the full 45 seconds before raising a TimeoutError. If the kernel
   had actually kept the radio connection attempt alive for those
   45 seconds, the connection would have succeeded around the
   12.5-second mark (as proven by the patched kernel test below).
   The fact that it did not proves that the underlying HCI connection
   attempt was aborted early by the kernel. Userspace was blind to this
   abort and kept waiting in a vacuum.

2. Result on MODIFIED Kernel (with this patch): Using the exact same
   userspace script (45.0s timeout), the connection successfully
   established at the 12.51-second mark.

Conclusion:
This proves that the underlying HCI LE Connection creation is bound by
a strict 2-second timeout derived from `conn_timeout` in `hci_conn.c`,
and that userspace socket options do not override this hardcoded HCI
abort in our stack. The sensor physically takes 12.5 seconds to
handshake, making the 2-second kernel limit a hard blocker.

This patch increases the hardcoded LE connection timeout to 20 seconds
to provide a comfortable margin for handshake retries.

Note: If the upstream preference is to not hardcode 20 seconds globally,
I would be happy to submit a v5 that exposes this as a configurable
module parameter (e.g., `le_conn_timeout`).
---
 net/bluetooth/hci_conn.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/bluetooth/hci_conn.c b/net/bluetooth/hci_conn.c
index a47f5daffdbf..7edce4126900 100644
--- a/net/bluetooth/hci_conn.c
+++ b/net/bluetooth/hci_conn.c
@@ -1436,7 +1436,7 @@ struct hci_conn *hci_connect_le(struct hci_dev *hdev, bdaddr_t *dst,
 	}
 
 	conn->sec_level = BT_SECURITY_LOW;
-	conn->conn_timeout = conn_timeout;
+	conn->conn_timeout = msecs_to_jiffies(20000);
 	conn->le_adv_phy = phy;
 	conn->le_adv_sec_phy = sec_phy;
 
@@ -1664,7 +1664,7 @@ struct hci_conn *hci_connect_le_scan(struct hci_dev *hdev, bdaddr_t *dst,
 	set_bit(HCI_CONN_SCANNING, &conn->flags);
 	conn->sec_level = BT_SECURITY_LOW;
 	conn->pending_sec_level = sec_level;
-	conn->conn_timeout = conn_timeout;
+	conn->conn_timeout = msecs_to_jiffies(20000);
 	conn->conn_reason = conn_reason;
 
 	hci_update_passive_scan(hdev);
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* RE: [v4] Bluetooth: Increase LE connection timeout for industrial sensors
  2026-03-02 23:37 [PATCH v4] Bluetooth: Increase LE connection timeout for industrial sensors Dajid MOREL
@ 2026-03-03  0:11 ` bluez.test.bot
  2026-03-03 17:24 ` [PATCH v4] " Luiz Augusto von Dentz
  1 sibling, 0 replies; 22+ messages in thread
From: bluez.test.bot @ 2026-03-03  0:11 UTC (permalink / raw)
  To: linux-bluetooth, dajidp.morel

[-- Attachment #1: Type: text/plain, Size: 2942 bytes --]

This is automated email and please do not reply to this email!

Dear submitter,

Thank you for submitting the patches to the linux bluetooth mailing list.
This is a CI test results with your patch series:
PW Link:https://patchwork.kernel.org/project/bluetooth/list/?series=1060320

---Test result---

Test Summary:
CheckPatch                    PENDING   0.25 seconds
GitLint                       PENDING   0.31 seconds
SubjectPrefix                 PASS      0.12 seconds
BuildKernel                   PASS      26.41 seconds
CheckAllWarning               PASS      28.62 seconds
CheckSparse                   PASS      32.32 seconds
BuildKernel32                 PASS      25.90 seconds
TestRunnerSetup               PASS      560.97 seconds
TestRunner_l2cap-tester       FAIL      32.64 seconds
TestRunner_iso-tester         PASS      104.97 seconds
TestRunner_bnep-tester        PASS      6.24 seconds
TestRunner_mgmt-tester        FAIL      124.87 seconds
TestRunner_rfcomm-tester      PASS      9.68 seconds
TestRunner_sco-tester         FAIL      14.60 seconds
TestRunner_ioctl-tester       PASS      11.50 seconds
TestRunner_mesh-tester        FAIL      12.42 seconds
TestRunner_smp-tester         PASS      8.67 seconds
TestRunner_userchan-tester    PASS      6.74 seconds
IncrementalBuild              PENDING   0.91 seconds

Details
##############################
Test: CheckPatch - PENDING
Desc: Run checkpatch.pl script
Output:

##############################
Test: GitLint - PENDING
Desc: Run gitlint
Output:

##############################
Test: TestRunner_l2cap-tester - FAIL
Desc: Run l2cap-tester with test-runner
Output:
Total: 96, Passed: 94 (97.9%), Failed: 2, Not Run: 0

Failed Test Cases
L2CAP LE Client - Read 32k Success                   Timed out    2.441 seconds
L2CAP LE Client - RX Timestamping 32k                Timed out    1.897 seconds
##############################
Test: TestRunner_mgmt-tester - FAIL
Desc: Run mgmt-tester with test-runner
Output:
Total: 494, Passed: 489 (99.0%), Failed: 1, Not Run: 4

Failed Test Cases
Read Exp Feature - Success                           Failed       0.116 seconds
##############################
Test: TestRunner_sco-tester - FAIL
Desc: Run sco-tester with test-runner
Output:
WARNING: possible circular locking dependency detected
BUG: sleeping function called from invalid context at net/core/sock.c:3782
Total: 30, Passed: 30 (100.0%), Failed: 0, Not Run: 0
##############################
Test: TestRunner_mesh-tester - FAIL
Desc: Run mesh-tester with test-runner
Output:
Total: 10, Passed: 8 (80.0%), Failed: 2, Not Run: 0

Failed Test Cases
Mesh - Send cancel - 1                               Timed out    2.776 seconds
Mesh - Send cancel - 2                               Timed out    1.994 seconds
##############################
Test: IncrementalBuild - PENDING
Desc: Incremental build with the patches in the series
Output:



---
Regards,
Linux Bluetooth


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v4] Bluetooth: Increase LE connection timeout for industrial sensors
  2026-03-02 23:37 [PATCH v4] Bluetooth: Increase LE connection timeout for industrial sensors Dajid MOREL
  2026-03-03  0:11 ` [v4] " bluez.test.bot
@ 2026-03-03 17:24 ` Luiz Augusto von Dentz
  2026-03-03 18:57   ` Dajid Morel
  1 sibling, 1 reply; 22+ messages in thread
From: Luiz Augusto von Dentz @ 2026-03-03 17:24 UTC (permalink / raw)
  To: Dajid MOREL; +Cc: linux-bluetooth, Dajid MOREL

Hi Dajid,

On Mon, Mar 2, 2026 at 6:43 PM Dajid MOREL <dajidp.morel@gmail.com> wrote:
>
> In an industrial IoT context at Volvo Group, we use TE Connectivity
> BLE pressure sensors. These sensors exhibit high latency during
> the initial LE connection handshake in noisy RF environments. The
> connection systematically fails on Ubuntu Core 22 (BlueZ) because the
> connection attempt is aborted too early.
>
> In the v2 thread, it was suggested that userspace (via setsockopt
> SO_SNDTIMEO) dictates the connection timeout (defaulting to 40s),
> suspecting that userspace was cutting the connection at 2 seconds,
> not the kernel.
>
> To verify this, an empirical test was conducted using the following
> Python/Bleak script to force the application timeout to 45.0 seconds:
>
>   import asyncio
>   from bleak import BleakClient, BleakScanner
>   import time
>
>   ADDRESS = "E8:C0:B1:D4:A3:3C"
>
>   async def test_connection():
>       device = await BleakScanner.find_device_by_address(ADDRESS, timeout=15.0)
>       start_time = time.time()
>       try:
>           # Forcing 45s timeout in userspace
>           async with BleakClient(device, timeout=45.0) as client:
>               print(f"Connected in {time.time() - start_time:.2f}s")
>       except Exception as e:
>           print(f"Failed after {time.time() - start_time:.2f}s: {e}")
>
>   asyncio.run(test_connection())
>
> 1. Result on UNMODIFIED Kernel: The userspace script patiently waited
>    for the full 45 seconds before raising a TimeoutError. If the kernel
>    had actually kept the radio connection attempt alive for those
>    45 seconds, the connection would have succeeded around the
>    12.5-second mark (as proven by the patched kernel test below).
>    The fact that it did not proves that the underlying HCI connection
>    attempt was aborted early by the kernel. Userspace was blind to this
>    abort and kept waiting in a vacuum.
>
> 2. Result on MODIFIED Kernel (with this patch): Using the exact same
>    userspace script (45.0s timeout), the connection successfully
>    established at the 12.51-second mark.
>
> Conclusion:
> This proves that the underlying HCI LE Connection creation is bound by
> a strict 2-second timeout derived from `conn_timeout` in `hci_conn.c`,
> and that userspace socket options do not override this hardcoded HCI
> abort in our stack. The sensor physically takes 12.5 seconds to
> handshake, making the 2-second kernel limit a hard blocker.

Well except if you can point us where the 2 second timeout is coming
from I don't see how this proves that there is a strict 2-second
timeout, in fact I already point you that in the previous thread, it
seems there is something programming the SO_SNDTIMEO to be 2 seconds
which is why you could only overcome it by hardcoding a 20 sec fixed
timeout, so you are actually introduce a strict timeout yourself with
this change, so application wouldn't be able to set their own timeout
when needed.

> This patch increases the hardcoded LE connection timeout to 20 seconds
> to provide a comfortable margin for handshake retries.
>
> Note: If the upstream preference is to not hardcode 20 seconds globally,
> I would be happy to submit a v5 that exposes this as a configurable
> module parameter (e.g., `le_conn_timeout`).
> ---
>  net/bluetooth/hci_conn.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/net/bluetooth/hci_conn.c b/net/bluetooth/hci_conn.c
> index a47f5daffdbf..7edce4126900 100644
> --- a/net/bluetooth/hci_conn.c
> +++ b/net/bluetooth/hci_conn.c
> @@ -1436,7 +1436,7 @@ struct hci_conn *hci_connect_le(struct hci_dev *hdev, bdaddr_t *dst,
>         }
>
>         conn->sec_level = BT_SECURITY_LOW;
> -       conn->conn_timeout = conn_timeout;
> +       conn->conn_timeout = msecs_to_jiffies(20000);
>         conn->le_adv_phy = phy;
>         conn->le_adv_sec_phy = sec_phy;
>
> @@ -1664,7 +1664,7 @@ struct hci_conn *hci_connect_le_scan(struct hci_dev *hdev, bdaddr_t *dst,
>         set_bit(HCI_CONN_SCANNING, &conn->flags);
>         conn->sec_level = BT_SECURITY_LOW;
>         conn->pending_sec_level = sec_level;
> -       conn->conn_timeout = conn_timeout;
> +       conn->conn_timeout = msecs_to_jiffies(20000);
>         conn->conn_reason = conn_reason;
>
>         hci_update_passive_scan(hdev);
> --
> 2.34.1
>


-- 
Luiz Augusto von Dentz

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v4] Bluetooth: Increase LE connection timeout for industrial sensors
  2026-03-03 17:24 ` [PATCH v4] " Luiz Augusto von Dentz
@ 2026-03-03 18:57   ` Dajid Morel
  2026-03-03 19:26     ` Luiz Augusto von Dentz
  0 siblings, 1 reply; 22+ messages in thread
From: Dajid Morel @ 2026-03-03 18:57 UTC (permalink / raw)
  To: Luiz Augusto von Dentz; +Cc: linux-bluetooth, Dajid MOREL

On Tue, Mar 3, 2026 at 6:24 PM Luiz Augusto von Dentz
<luiz.dentz@gmail.com> wrote:
>
> Hi Dajid,
> Well except if you can point us where the 2 second timeout is coming
> from I don't see how this proves that there is a strict 2-second
> timeout, in fact I already point you that in the previous thread, it
> seems there is something programming the SO_SNDTIMEO to be 2 seconds
> which is why you could only overcome it by hardcoding a 20 sec fixed
> timeout, so you are actually introduce a strict timeout yourself with
> this change, so application wouldn't be able to set their own timeout
> when needed.

Hi Luiz,

Thank you for your response. Following your suggestion that something
in userspace might be programming SO_SNDTIMEO to 2 seconds, I dug into
the entire stack to verify this.

I completely agree that my v4 patch (hardcoding 20s globally in the
kernel) is architecturally flawed because it breaks SO_SNDTIMEO for
testing tools that legitimately rely on shorter timeouts. I formally
withdraw the v4 patch.

However, regarding the origin of the 2-second timeout in standard use
cases, a deep dive into the stack reveals a gap between the API and
the socket creation:

1. Python/Bleak layer: When an application sets a 45s timeout, it only
sets an internal asyncio timer. The actual command sent via D-Bus
(org.bluez.Device1.Connect) takes no timeout parameter.
2. BlueZ (bluetoothd) layer: A `grep -rn "setsockopt" btio/` in the
BlueZ tree shows that while btio.c configures many socket options
(like L2CAP_LM, BT_SECURITY), it never sets SO_SNDTIMEO for standard
D-Bus clients. A global search confirms SO_SNDTIMEO is only used
within the tools/ directory (e.g., l2cap-tester.c).

This means that when an application requests a connection via D-Bus,
bluetoothd passes the request down and creates the L2CAP socket
without configuring SO_SNDTIMEO.

Because the socket is created "naked" regarding timeouts, it falls
entirely back to the kernel's default behavior, which is governed by
the hardcoded 2-second conn_timeout in hci_conn.c. Userspace is
bypassed and blindly waits for its own 45s timer to expire.

(Note: I have systematically reproduced this exact 2-second abort
issue across different hardware platforms and Bluetooth controllers,
including Raspberry Pi 4, BeagleY-AI, and Rock 4 C+, confirming it is
a core stack limitation, not a vendor-specific firmware quirk).

To fix this properly without touching the kernel, would you accept a
patch to BlueZ (bluetoothd / btio) instead? We could make bluetoothd
explicitly call setsockopt(sock, SOL_SOCKET, SO_SNDTIMEO, ...) when
establishing LE connections via D-Bus, drawing the value from a new
configurable parameter in main.conf (e.g., LEConnectionTimeout).

I would be happy to draft this BlueZ userspace patch if you agree this
is the correct architectural approach to unblock industrial D-Bus
clients.

Best regards,

Dajid Morel
Volvo Group


Le mar. 3 mars 2026 à 18:24, Luiz Augusto von Dentz
<luiz.dentz@gmail.com> a écrit :
>
> Hi Dajid,
>
> On Mon, Mar 2, 2026 at 6:43 PM Dajid MOREL <dajidp.morel@gmail.com> wrote:
> >
> > In an industrial IoT context at Volvo Group, we use TE Connectivity
> > BLE pressure sensors. These sensors exhibit high latency during
> > the initial LE connection handshake in noisy RF environments. The
> > connection systematically fails on Ubuntu Core 22 (BlueZ) because the
> > connection attempt is aborted too early.
> >
> > In the v2 thread, it was suggested that userspace (via setsockopt
> > SO_SNDTIMEO) dictates the connection timeout (defaulting to 40s),
> > suspecting that userspace was cutting the connection at 2 seconds,
> > not the kernel.
> >
> > To verify this, an empirical test was conducted using the following
> > Python/Bleak script to force the application timeout to 45.0 seconds:
> >
> >   import asyncio
> >   from bleak import BleakClient, BleakScanner
> >   import time
> >
> >   ADDRESS = "E8:C0:B1:D4:A3:3C"
> >
> >   async def test_connection():
> >       device = await BleakScanner.find_device_by_address(ADDRESS, timeout=15.0)
> >       start_time = time.time()
> >       try:
> >           # Forcing 45s timeout in userspace
> >           async with BleakClient(device, timeout=45.0) as client:
> >               print(f"Connected in {time.time() - start_time:.2f}s")
> >       except Exception as e:
> >           print(f"Failed after {time.time() - start_time:.2f}s: {e}")
> >
> >   asyncio.run(test_connection())
> >
> > 1. Result on UNMODIFIED Kernel: The userspace script patiently waited
> >    for the full 45 seconds before raising a TimeoutError. If the kernel
> >    had actually kept the radio connection attempt alive for those
> >    45 seconds, the connection would have succeeded around the
> >    12.5-second mark (as proven by the patched kernel test below).
> >    The fact that it did not proves that the underlying HCI connection
> >    attempt was aborted early by the kernel. Userspace was blind to this
> >    abort and kept waiting in a vacuum.
> >
> > 2. Result on MODIFIED Kernel (with this patch): Using the exact same
> >    userspace script (45.0s timeout), the connection successfully
> >    established at the 12.51-second mark.
> >
> > Conclusion:
> > This proves that the underlying HCI LE Connection creation is bound by
> > a strict 2-second timeout derived from `conn_timeout` in `hci_conn.c`,
> > and that userspace socket options do not override this hardcoded HCI
> > abort in our stack. The sensor physically takes 12.5 seconds to
> > handshake, making the 2-second kernel limit a hard blocker.
>
> Well except if you can point us where the 2 second timeout is coming
> from I don't see how this proves that there is a strict 2-second
> timeout, in fact I already point you that in the previous thread, it
> seems there is something programming the SO_SNDTIMEO to be 2 seconds
> which is why you could only overcome it by hardcoding a 20 sec fixed
> timeout, so you are actually introduce a strict timeout yourself with
> this change, so application wouldn't be able to set their own timeout
> when needed.
>
> > This patch increases the hardcoded LE connection timeout to 20 seconds
> > to provide a comfortable margin for handshake retries.
> >
> > Note: If the upstream preference is to not hardcode 20 seconds globally,
> > I would be happy to submit a v5 that exposes this as a configurable
> > module parameter (e.g., `le_conn_timeout`).
> > ---
> >  net/bluetooth/hci_conn.c | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/net/bluetooth/hci_conn.c b/net/bluetooth/hci_conn.c
> > index a47f5daffdbf..7edce4126900 100644
> > --- a/net/bluetooth/hci_conn.c
> > +++ b/net/bluetooth/hci_conn.c
> > @@ -1436,7 +1436,7 @@ struct hci_conn *hci_connect_le(struct hci_dev *hdev, bdaddr_t *dst,
> >         }
> >
> >         conn->sec_level = BT_SECURITY_LOW;
> > -       conn->conn_timeout = conn_timeout;
> > +       conn->conn_timeout = msecs_to_jiffies(20000);
> >         conn->le_adv_phy = phy;
> >         conn->le_adv_sec_phy = sec_phy;
> >
> > @@ -1664,7 +1664,7 @@ struct hci_conn *hci_connect_le_scan(struct hci_dev *hdev, bdaddr_t *dst,
> >         set_bit(HCI_CONN_SCANNING, &conn->flags);
> >         conn->sec_level = BT_SECURITY_LOW;
> >         conn->pending_sec_level = sec_level;
> > -       conn->conn_timeout = conn_timeout;
> > +       conn->conn_timeout = msecs_to_jiffies(20000);
> >         conn->conn_reason = conn_reason;
> >
> >         hci_update_passive_scan(hdev);
> > --
> > 2.34.1
> >
>
>
> --
> Luiz Augusto von Dentz

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v4] Bluetooth: Increase LE connection timeout for industrial sensors
  2026-03-03 18:57   ` Dajid Morel
@ 2026-03-03 19:26     ` Luiz Augusto von Dentz
  2026-03-03 20:30       ` Dajid Morel
  0 siblings, 1 reply; 22+ messages in thread
From: Luiz Augusto von Dentz @ 2026-03-03 19:26 UTC (permalink / raw)
  To: Dajid Morel; +Cc: linux-bluetooth, Dajid MOREL

Hi Dajid,

On Tue, Mar 3, 2026 at 1:57 PM Dajid Morel <dajidp.morel@gmail.com> wrote:
>
> On Tue, Mar 3, 2026 at 6:24 PM Luiz Augusto von Dentz
> <luiz.dentz@gmail.com> wrote:
> >
> > Hi Dajid,
> > Well except if you can point us where the 2 second timeout is coming
> > from I don't see how this proves that there is a strict 2-second
> > timeout, in fact I already point you that in the previous thread, it
> > seems there is something programming the SO_SNDTIMEO to be 2 seconds
> > which is why you could only overcome it by hardcoding a 20 sec fixed
> > timeout, so you are actually introduce a strict timeout yourself with
> > this change, so application wouldn't be able to set their own timeout
> > when needed.
>
> Hi Luiz,
>
> Thank you for your response. Following your suggestion that something
> in userspace might be programming SO_SNDTIMEO to 2 seconds, I dug into
> the entire stack to verify this.
>
> I completely agree that my v4 patch (hardcoding 20s globally in the
> kernel) is architecturally flawed because it breaks SO_SNDTIMEO for
> testing tools that legitimately rely on shorter timeouts. I formally
> withdraw the v4 patch.
>
> However, regarding the origin of the 2-second timeout in standard use
> cases, a deep dive into the stack reveals a gap between the API and
> the socket creation:
>
> 1. Python/Bleak layer: When an application sets a 45s timeout, it only
> sets an internal asyncio timer. The actual command sent via D-Bus
> (org.bluez.Device1.Connect) takes no timeout parameter.
> 2. BlueZ (bluetoothd) layer: A `grep -rn "setsockopt" btio/` in the
> BlueZ tree shows that while btio.c configures many socket options
> (like L2CAP_LM, BT_SECURITY), it never sets SO_SNDTIMEO for standard
> D-Bus clients. A global search confirms SO_SNDTIMEO is only used
> within the tools/ directory (e.g., l2cap-tester.c).
>
> This means that when an application requests a connection via D-Bus,
> bluetoothd passes the request down and creates the L2CAP socket
> without configuring SO_SNDTIMEO.
>
> Because the socket is created "naked" regarding timeouts, it falls
> entirely back to the kernel's default behavior, which is governed by
> the hardcoded 2-second conn_timeout in hci_conn.c. Userspace is
> bypassed and blindly waits for its own 45s timer to expire.
>
> (Note: I have systematically reproduced this exact 2-second abort
> issue across different hardware platforms and Bluetooth controllers,
> including Raspberry Pi 4, BeagleY-AI, and Rock 4 C+, confirming it is
> a core stack limitation, not a vendor-specific firmware quirk).
>
> To fix this properly without touching the kernel, would you accept a
> patch to BlueZ (bluetoothd / btio) instead? We could make bluetoothd
> explicitly call setsockopt(sock, SOL_SOCKET, SO_SNDTIMEO, ...) when
> establishing LE connections via D-Bus, drawing the value from a new
> configurable parameter in main.conf (e.g., LEConnectionTimeout).
>
> I would be happy to draft this BlueZ userspace patch if you agree this
> is the correct architectural approach to unblock industrial D-Bus
> clients.


memcheck-amd64-[97587]: = src/device.c:device_connect_le() Connection
attempt to: 70:5A:6F:63:B6:41


14:18:45.430382
< HCI Command: LE Set Extended Scan Enable (0x08|0x0042) plen 6


                                               #1 [hci0]
14:18:45.432666
        Extended scan: Disabled (0x00)
        Filter duplicates: Disabled (0x00)
        Duration: 0 msec (0x0000)
        Period: 0.00 sec (0x0000)
> HCI Event: Command Complete (0x0e) plen 4                                                                                                                                                                                                                         #2 [hci0] 14:18:45.533046
      LE Set Extended Scan Enable (0x08|0x0042) ncmd 1
        Status: Success (0x00)
< HCI Command: LE Set Address Resolution Enable (0x08|0x002d) plen 1


                                               #3 [hci0]
14:18:45.533072
        Address resolution: Disabled (0x00)
> HCI Event: Command Complete (0x0e) plen 4                                                                                                                                                                                                                         #4 [hci0] 14:18:45.534056
      LE Set Address Resolution Enable (0x08|0x002d) ncmd 1
        Status: Success (0x00)
< HCI Command: LE Add Device To Accept List (0x08|0x0011) plen 7


                                               #5 [hci0]
14:18:45.534065
        Address type: Public (0x00)
        Address: 70:5A:6F:63:B6:41 (OUI 70-5A-6F)
> HCI Event: Command Complete (0x0e) plen 4                                                                                                                                                                                                                         #6 [hci0] 14:18:45.535023
      LE Add Device To Accept List (0x08|0x0011) ncmd 1
        Status: Success (0x00)
< HCI Command: LE Set Address Resolution Enable (0x08|0x002d) plen 1


                                               #7 [hci0]
14:18:45.535030
        Address resolution: Enabled (0x01)
> HCI Event: Command Complete (0x0e) plen 4                                                                                                                                                                                                                         #8 [hci0] 14:18:45.536031
      LE Set Address Resolution Enable (0x08|0x002d) ncmd 1
        Status: Success (0x00)
< HCI Command: LE Set Extended Scan Parameters (0x08|0x0041) plen 13


                                               #9 [hci0]
14:18:45.536039
        Own address type: Public (0x00)
        Filter policy: Ignore not in accept list (0x01)
        PHYs: 0x05
        Entry 0: LE 1M
          Type: Passive (0x00)
          Interval: 60.000 msec (0x0060)
          Window: 60.000 msec (0x0060)
        Entry 1: LE Coded
          Type: Passive (0x00)
          Interval: 180.000 msec (0x0120)
          Window: 180.000 msec (0x0120)
> HCI Event: Command Complete (0x0e) plen 4                                                                                                                                                                                                                        #10 [hci0] 14:18:45.537026
      LE Set Extended Scan Parameters (0x08|0x0041) ncmd 1
        Status: Success (0x00)
< HCI Command: LE Set Extended Scan Enable (0x08|0x0042) plen 6


                                              #11 [hci0]
14:18:45.537040
        Extended scan: Enabled (0x01)
        Filter duplicates: Enabled (0x01)
        Duration: 0 msec (0x0000)
        Period: 0.00 sec (0x0000)
> HCI Event: Command Complete (0x0e) plen 4                                                                                                                                                                                                                        #12 [hci0] 14:18:45.537969
      LE Set Extended Scan Enable (0x08|0x0042) ncmd 1
        Status: Success (0x00)
@ MGMT Event: Connect Failed (0x000d) plen 8


                                         {0x0001} [hci0]
14:19:25.941624
        LE Address: 70:5A:6F:63:B6:41 (OUI 70-5A-6F)
        Status: Disconnected (0x0e)
memcheck-amd64-[97587]: = src/device.c:att_connect_cb() connect to
70:5A:6F:63:B6:41: Connection refused (111)


14:19:25.943909

That is waiting 40 seconds as expected, so I'm not sure what is
causing it to time out in 2 seconds but that is definitely the
expected behavior.

> Best regards,
>
> Dajid Morel
> Volvo Group
>
>
> Le mar. 3 mars 2026 à 18:24, Luiz Augusto von Dentz
> <luiz.dentz@gmail.com> a écrit :
> >
> > Hi Dajid,
> >
> > On Mon, Mar 2, 2026 at 6:43 PM Dajid MOREL <dajidp.morel@gmail.com> wrote:
> > >
> > > In an industrial IoT context at Volvo Group, we use TE Connectivity
> > > BLE pressure sensors. These sensors exhibit high latency during
> > > the initial LE connection handshake in noisy RF environments. The
> > > connection systematically fails on Ubuntu Core 22 (BlueZ) because the
> > > connection attempt is aborted too early.
> > >
> > > In the v2 thread, it was suggested that userspace (via setsockopt
> > > SO_SNDTIMEO) dictates the connection timeout (defaulting to 40s),
> > > suspecting that userspace was cutting the connection at 2 seconds,
> > > not the kernel.
> > >
> > > To verify this, an empirical test was conducted using the following
> > > Python/Bleak script to force the application timeout to 45.0 seconds:
> > >
> > >   import asyncio
> > >   from bleak import BleakClient, BleakScanner
> > >   import time
> > >
> > >   ADDRESS = "E8:C0:B1:D4:A3:3C"
> > >
> > >   async def test_connection():
> > >       device = await BleakScanner.find_device_by_address(ADDRESS, timeout=15.0)
> > >       start_time = time.time()
> > >       try:
> > >           # Forcing 45s timeout in userspace
> > >           async with BleakClient(device, timeout=45.0) as client:
> > >               print(f"Connected in {time.time() - start_time:.2f}s")
> > >       except Exception as e:
> > >           print(f"Failed after {time.time() - start_time:.2f}s: {e}")
> > >
> > >   asyncio.run(test_connection())
> > >
> > > 1. Result on UNMODIFIED Kernel: The userspace script patiently waited
> > >    for the full 45 seconds before raising a TimeoutError. If the kernel
> > >    had actually kept the radio connection attempt alive for those
> > >    45 seconds, the connection would have succeeded around the
> > >    12.5-second mark (as proven by the patched kernel test below).
> > >    The fact that it did not proves that the underlying HCI connection
> > >    attempt was aborted early by the kernel. Userspace was blind to this
> > >    abort and kept waiting in a vacuum.
> > >
> > > 2. Result on MODIFIED Kernel (with this patch): Using the exact same
> > >    userspace script (45.0s timeout), the connection successfully
> > >    established at the 12.51-second mark.
> > >
> > > Conclusion:
> > > This proves that the underlying HCI LE Connection creation is bound by
> > > a strict 2-second timeout derived from `conn_timeout` in `hci_conn.c`,
> > > and that userspace socket options do not override this hardcoded HCI
> > > abort in our stack. The sensor physically takes 12.5 seconds to
> > > handshake, making the 2-second kernel limit a hard blocker.
> >
> > Well except if you can point us where the 2 second timeout is coming
> > from I don't see how this proves that there is a strict 2-second
> > timeout, in fact I already point you that in the previous thread, it
> > seems there is something programming the SO_SNDTIMEO to be 2 seconds
> > which is why you could only overcome it by hardcoding a 20 sec fixed
> > timeout, so you are actually introduce a strict timeout yourself with
> > this change, so application wouldn't be able to set their own timeout
> > when needed.
> >
> > > This patch increases the hardcoded LE connection timeout to 20 seconds
> > > to provide a comfortable margin for handshake retries.
> > >
> > > Note: If the upstream preference is to not hardcode 20 seconds globally,
> > > I would be happy to submit a v5 that exposes this as a configurable
> > > module parameter (e.g., `le_conn_timeout`).
> > > ---
> > >  net/bluetooth/hci_conn.c | 4 ++--
> > >  1 file changed, 2 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/net/bluetooth/hci_conn.c b/net/bluetooth/hci_conn.c
> > > index a47f5daffdbf..7edce4126900 100644
> > > --- a/net/bluetooth/hci_conn.c
> > > +++ b/net/bluetooth/hci_conn.c
> > > @@ -1436,7 +1436,7 @@ struct hci_conn *hci_connect_le(struct hci_dev *hdev, bdaddr_t *dst,
> > >         }
> > >
> > >         conn->sec_level = BT_SECURITY_LOW;
> > > -       conn->conn_timeout = conn_timeout;
> > > +       conn->conn_timeout = msecs_to_jiffies(20000);
> > >         conn->le_adv_phy = phy;
> > >         conn->le_adv_sec_phy = sec_phy;
> > >
> > > @@ -1664,7 +1664,7 @@ struct hci_conn *hci_connect_le_scan(struct hci_dev *hdev, bdaddr_t *dst,
> > >         set_bit(HCI_CONN_SCANNING, &conn->flags);
> > >         conn->sec_level = BT_SECURITY_LOW;
> > >         conn->pending_sec_level = sec_level;
> > > -       conn->conn_timeout = conn_timeout;
> > > +       conn->conn_timeout = msecs_to_jiffies(20000);
> > >         conn->conn_reason = conn_reason;
> > >
> > >         hci_update_passive_scan(hdev);
> > > --
> > > 2.34.1
> > >
> >
> >
> > --
> > Luiz Augusto von Dentz



-- 
Luiz Augusto von Dentz

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v4] Bluetooth: Increase LE connection timeout for industrial sensors
  2026-03-03 19:26     ` Luiz Augusto von Dentz
@ 2026-03-03 20:30       ` Dajid Morel
  2026-03-03 21:12         ` Luiz Augusto von Dentz
  0 siblings, 1 reply; 22+ messages in thread
From: Dajid Morel @ 2026-03-03 20:30 UTC (permalink / raw)
  To: Luiz Augusto von Dentz; +Cc: linux-bluetooth, Dajid MOREL

On Tue, Mar 3, 2026 at 8:26 PM Luiz Augusto von Dentz
<luiz.dentz@gmail.com> wrote:
>
> That is waiting 40 seconds as expected, so I'm not sure what is
> causing it to time out in 2 seconds but that is definitely the
> expected behavior.

Hi Luiz,

Thank you for providing those logs. Seeing the 40.5-second delta in
your environment is very insightful and confirms that the standard
stack should wait much longer than what I am observing.

I have finally identified the root cause of the 2-second abort in my
setup. My environment uses industrial TE Connectivity M5600 sensors,
which are designed for ultra-low power consumption with a long
advertising interval of 5 seconds.

After auditing the kernel source, I found that HCI_CMD_TIMEOUT is
hardcoded to 2.0 seconds (#define HCI_CMD_TIMEOUT
msecs_to_jiffies(2000)).

When the kernel issues HCI_OP_LE_CREATE_CONN, the local controller
(Broadcom on RPi4 or Rockchip on Rock 4 C+) must wait for the next
advertisement from the sensor to proceed with the connection. Since
the M5600 only wakes up every 5s, the 2-second HCI_CMD_TIMEOUT
systematically triggers before the controller can receive the
advertisement and acknowledge the command completion. This leads to an
immediate abort, even if the sensor is physically next to a high-gain
antenna (9.4dBi).

This explains why my v4 patch (forcing conn_timeout to 20s) worked as
a side-effect: it kept the connection structure alive just long enough
to bypass the immediate impact of the HCI command timeout, but it was
architecturally the wrong target.

I officially withdraw this patch series.

However, this 2-second hardcoded limit for HCI_CMD_TIMEOUT seems
fundamentally incompatible with many industrial low-duty-cycle
sensors. Many developers on various forums resort to kernel hacks to
bypass this.

Would you consider a patch that either:
1. Increases HCI_CMD_TIMEOUT globally to 5 or 10 seconds?
2. Or makes the LE connection command timeout specifically
configurable via the Management API or main.conf?

I would like to work on a cleaner solution that accommodates these
low-power industrial sleep cycles without breaking existing tools.

Best regards,

Dajid Morel
Volvo Group


Le mar. 3 mars 2026 à 20:26, Luiz Augusto von Dentz
<luiz.dentz@gmail.com> a écrit :
>
> Hi Dajid,
>
> On Tue, Mar 3, 2026 at 1:57 PM Dajid Morel <dajidp.morel@gmail.com> wrote:
> >
> > On Tue, Mar 3, 2026 at 6:24 PM Luiz Augusto von Dentz
> > <luiz.dentz@gmail.com> wrote:
> > >
> > > Hi Dajid,
> > > Well except if you can point us where the 2 second timeout is coming
> > > from I don't see how this proves that there is a strict 2-second
> > > timeout, in fact I already point you that in the previous thread, it
> > > seems there is something programming the SO_SNDTIMEO to be 2 seconds
> > > which is why you could only overcome it by hardcoding a 20 sec fixed
> > > timeout, so you are actually introduce a strict timeout yourself with
> > > this change, so application wouldn't be able to set their own timeout
> > > when needed.
> >
> > Hi Luiz,
> >
> > Thank you for your response. Following your suggestion that something
> > in userspace might be programming SO_SNDTIMEO to 2 seconds, I dug into
> > the entire stack to verify this.
> >
> > I completely agree that my v4 patch (hardcoding 20s globally in the
> > kernel) is architecturally flawed because it breaks SO_SNDTIMEO for
> > testing tools that legitimately rely on shorter timeouts. I formally
> > withdraw the v4 patch.
> >
> > However, regarding the origin of the 2-second timeout in standard use
> > cases, a deep dive into the stack reveals a gap between the API and
> > the socket creation:
> >
> > 1. Python/Bleak layer: When an application sets a 45s timeout, it only
> > sets an internal asyncio timer. The actual command sent via D-Bus
> > (org.bluez.Device1.Connect) takes no timeout parameter.
> > 2. BlueZ (bluetoothd) layer: A `grep -rn "setsockopt" btio/` in the
> > BlueZ tree shows that while btio.c configures many socket options
> > (like L2CAP_LM, BT_SECURITY), it never sets SO_SNDTIMEO for standard
> > D-Bus clients. A global search confirms SO_SNDTIMEO is only used
> > within the tools/ directory (e.g., l2cap-tester.c).
> >
> > This means that when an application requests a connection via D-Bus,
> > bluetoothd passes the request down and creates the L2CAP socket
> > without configuring SO_SNDTIMEO.
> >
> > Because the socket is created "naked" regarding timeouts, it falls
> > entirely back to the kernel's default behavior, which is governed by
> > the hardcoded 2-second conn_timeout in hci_conn.c. Userspace is
> > bypassed and blindly waits for its own 45s timer to expire.
> >
> > (Note: I have systematically reproduced this exact 2-second abort
> > issue across different hardware platforms and Bluetooth controllers,
> > including Raspberry Pi 4, BeagleY-AI, and Rock 4 C+, confirming it is
> > a core stack limitation, not a vendor-specific firmware quirk).
> >
> > To fix this properly without touching the kernel, would you accept a
> > patch to BlueZ (bluetoothd / btio) instead? We could make bluetoothd
> > explicitly call setsockopt(sock, SOL_SOCKET, SO_SNDTIMEO, ...) when
> > establishing LE connections via D-Bus, drawing the value from a new
> > configurable parameter in main.conf (e.g., LEConnectionTimeout).
> >
> > I would be happy to draft this BlueZ userspace patch if you agree this
> > is the correct architectural approach to unblock industrial D-Bus
> > clients.
>
>
> memcheck-amd64-[97587]: = src/device.c:device_connect_le() Connection
> attempt to: 70:5A:6F:63:B6:41
>
>
> 14:18:45.430382
> < HCI Command: LE Set Extended Scan Enable (0x08|0x0042) plen 6
>
>
>                                                #1 [hci0]
> 14:18:45.432666
>         Extended scan: Disabled (0x00)
>         Filter duplicates: Disabled (0x00)
>         Duration: 0 msec (0x0000)
>         Period: 0.00 sec (0x0000)
> > HCI Event: Command Complete (0x0e) plen 4                                                                                                                                                                                                                         #2 [hci0] 14:18:45.533046
>       LE Set Extended Scan Enable (0x08|0x0042) ncmd 1
>         Status: Success (0x00)
> < HCI Command: LE Set Address Resolution Enable (0x08|0x002d) plen 1
>
>
>                                                #3 [hci0]
> 14:18:45.533072
>         Address resolution: Disabled (0x00)
> > HCI Event: Command Complete (0x0e) plen 4                                                                                                                                                                                                                         #4 [hci0] 14:18:45.534056
>       LE Set Address Resolution Enable (0x08|0x002d) ncmd 1
>         Status: Success (0x00)
> < HCI Command: LE Add Device To Accept List (0x08|0x0011) plen 7
>
>
>                                                #5 [hci0]
> 14:18:45.534065
>         Address type: Public (0x00)
>         Address: 70:5A:6F:63:B6:41 (OUI 70-5A-6F)
> > HCI Event: Command Complete (0x0e) plen 4                                                                                                                                                                                                                         #6 [hci0] 14:18:45.535023
>       LE Add Device To Accept List (0x08|0x0011) ncmd 1
>         Status: Success (0x00)
> < HCI Command: LE Set Address Resolution Enable (0x08|0x002d) plen 1
>
>
>                                                #7 [hci0]
> 14:18:45.535030
>         Address resolution: Enabled (0x01)
> > HCI Event: Command Complete (0x0e) plen 4                                                                                                                                                                                                                         #8 [hci0] 14:18:45.536031
>       LE Set Address Resolution Enable (0x08|0x002d) ncmd 1
>         Status: Success (0x00)
> < HCI Command: LE Set Extended Scan Parameters (0x08|0x0041) plen 13
>
>
>                                                #9 [hci0]
> 14:18:45.536039
>         Own address type: Public (0x00)
>         Filter policy: Ignore not in accept list (0x01)
>         PHYs: 0x05
>         Entry 0: LE 1M
>           Type: Passive (0x00)
>           Interval: 60.000 msec (0x0060)
>           Window: 60.000 msec (0x0060)
>         Entry 1: LE Coded
>           Type: Passive (0x00)
>           Interval: 180.000 msec (0x0120)
>           Window: 180.000 msec (0x0120)
> > HCI Event: Command Complete (0x0e) plen 4                                                                                                                                                                                                                        #10 [hci0] 14:18:45.537026
>       LE Set Extended Scan Parameters (0x08|0x0041) ncmd 1
>         Status: Success (0x00)
> < HCI Command: LE Set Extended Scan Enable (0x08|0x0042) plen 6
>
>
>                                               #11 [hci0]
> 14:18:45.537040
>         Extended scan: Enabled (0x01)
>         Filter duplicates: Enabled (0x01)
>         Duration: 0 msec (0x0000)
>         Period: 0.00 sec (0x0000)
> > HCI Event: Command Complete (0x0e) plen 4                                                                                                                                                                                                                        #12 [hci0] 14:18:45.537969
>       LE Set Extended Scan Enable (0x08|0x0042) ncmd 1
>         Status: Success (0x00)
> @ MGMT Event: Connect Failed (0x000d) plen 8
>
>
>                                          {0x0001} [hci0]
> 14:19:25.941624
>         LE Address: 70:5A:6F:63:B6:41 (OUI 70-5A-6F)
>         Status: Disconnected (0x0e)
> memcheck-amd64-[97587]: = src/device.c:att_connect_cb() connect to
> 70:5A:6F:63:B6:41: Connection refused (111)
>
>
> 14:19:25.943909
>
> That is waiting 40 seconds as expected, so I'm not sure what is
> causing it to time out in 2 seconds but that is definitely the
> expected behavior.
>
> > Best regards,
> >
> > Dajid Morel
> > Volvo Group
> >
> >
> > Le mar. 3 mars 2026 à 18:24, Luiz Augusto von Dentz
> > <luiz.dentz@gmail.com> a écrit :
> > >
> > > Hi Dajid,
> > >
> > > On Mon, Mar 2, 2026 at 6:43 PM Dajid MOREL <dajidp.morel@gmail.com> wrote:
> > > >
> > > > In an industrial IoT context at Volvo Group, we use TE Connectivity
> > > > BLE pressure sensors. These sensors exhibit high latency during
> > > > the initial LE connection handshake in noisy RF environments. The
> > > > connection systematically fails on Ubuntu Core 22 (BlueZ) because the
> > > > connection attempt is aborted too early.
> > > >
> > > > In the v2 thread, it was suggested that userspace (via setsockopt
> > > > SO_SNDTIMEO) dictates the connection timeout (defaulting to 40s),
> > > > suspecting that userspace was cutting the connection at 2 seconds,
> > > > not the kernel.
> > > >
> > > > To verify this, an empirical test was conducted using the following
> > > > Python/Bleak script to force the application timeout to 45.0 seconds:
> > > >
> > > >   import asyncio
> > > >   from bleak import BleakClient, BleakScanner
> > > >   import time
> > > >
> > > >   ADDRESS = "E8:C0:B1:D4:A3:3C"
> > > >
> > > >   async def test_connection():
> > > >       device = await BleakScanner.find_device_by_address(ADDRESS, timeout=15.0)
> > > >       start_time = time.time()
> > > >       try:
> > > >           # Forcing 45s timeout in userspace
> > > >           async with BleakClient(device, timeout=45.0) as client:
> > > >               print(f"Connected in {time.time() - start_time:.2f}s")
> > > >       except Exception as e:
> > > >           print(f"Failed after {time.time() - start_time:.2f}s: {e}")
> > > >
> > > >   asyncio.run(test_connection())
> > > >
> > > > 1. Result on UNMODIFIED Kernel: The userspace script patiently waited
> > > >    for the full 45 seconds before raising a TimeoutError. If the kernel
> > > >    had actually kept the radio connection attempt alive for those
> > > >    45 seconds, the connection would have succeeded around the
> > > >    12.5-second mark (as proven by the patched kernel test below).
> > > >    The fact that it did not proves that the underlying HCI connection
> > > >    attempt was aborted early by the kernel. Userspace was blind to this
> > > >    abort and kept waiting in a vacuum.
> > > >
> > > > 2. Result on MODIFIED Kernel (with this patch): Using the exact same
> > > >    userspace script (45.0s timeout), the connection successfully
> > > >    established at the 12.51-second mark.
> > > >
> > > > Conclusion:
> > > > This proves that the underlying HCI LE Connection creation is bound by
> > > > a strict 2-second timeout derived from `conn_timeout` in `hci_conn.c`,
> > > > and that userspace socket options do not override this hardcoded HCI
> > > > abort in our stack. The sensor physically takes 12.5 seconds to
> > > > handshake, making the 2-second kernel limit a hard blocker.
> > >
> > > Well except if you can point us where the 2 second timeout is coming
> > > from I don't see how this proves that there is a strict 2-second
> > > timeout, in fact I already point you that in the previous thread, it
> > > seems there is something programming the SO_SNDTIMEO to be 2 seconds
> > > which is why you could only overcome it by hardcoding a 20 sec fixed
> > > timeout, so you are actually introduce a strict timeout yourself with
> > > this change, so application wouldn't be able to set their own timeout
> > > when needed.
> > >
> > > > This patch increases the hardcoded LE connection timeout to 20 seconds
> > > > to provide a comfortable margin for handshake retries.
> > > >
> > > > Note: If the upstream preference is to not hardcode 20 seconds globally,
> > > > I would be happy to submit a v5 that exposes this as a configurable
> > > > module parameter (e.g., `le_conn_timeout`).
> > > > ---
> > > >  net/bluetooth/hci_conn.c | 4 ++--
> > > >  1 file changed, 2 insertions(+), 2 deletions(-)
> > > >
> > > > diff --git a/net/bluetooth/hci_conn.c b/net/bluetooth/hci_conn.c
> > > > index a47f5daffdbf..7edce4126900 100644
> > > > --- a/net/bluetooth/hci_conn.c
> > > > +++ b/net/bluetooth/hci_conn.c
> > > > @@ -1436,7 +1436,7 @@ struct hci_conn *hci_connect_le(struct hci_dev *hdev, bdaddr_t *dst,
> > > >         }
> > > >
> > > >         conn->sec_level = BT_SECURITY_LOW;
> > > > -       conn->conn_timeout = conn_timeout;
> > > > +       conn->conn_timeout = msecs_to_jiffies(20000);
> > > >         conn->le_adv_phy = phy;
> > > >         conn->le_adv_sec_phy = sec_phy;
> > > >
> > > > @@ -1664,7 +1664,7 @@ struct hci_conn *hci_connect_le_scan(struct hci_dev *hdev, bdaddr_t *dst,
> > > >         set_bit(HCI_CONN_SCANNING, &conn->flags);
> > > >         conn->sec_level = BT_SECURITY_LOW;
> > > >         conn->pending_sec_level = sec_level;
> > > > -       conn->conn_timeout = conn_timeout;
> > > > +       conn->conn_timeout = msecs_to_jiffies(20000);
> > > >         conn->conn_reason = conn_reason;
> > > >
> > > >         hci_update_passive_scan(hdev);
> > > > --
> > > > 2.34.1
> > > >
> > >
> > >
> > > --
> > > Luiz Augusto von Dentz
>
>
>
> --
> Luiz Augusto von Dentz

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v4] Bluetooth: Increase LE connection timeout for industrial sensors
  2026-03-03 20:30       ` Dajid Morel
@ 2026-03-03 21:12         ` Luiz Augusto von Dentz
  2026-03-06  7:15           ` Dajid Morel
  0 siblings, 1 reply; 22+ messages in thread
From: Luiz Augusto von Dentz @ 2026-03-03 21:12 UTC (permalink / raw)
  To: Dajid Morel; +Cc: linux-bluetooth, Dajid MOREL

Hi Dajid,

On Tue, Mar 3, 2026 at 3:31 PM Dajid Morel <dajidp.morel@gmail.com> wrote:
>
> On Tue, Mar 3, 2026 at 8:26 PM Luiz Augusto von Dentz
> <luiz.dentz@gmail.com> wrote:
> >
> > That is waiting 40 seconds as expected, so I'm not sure what is
> > causing it to time out in 2 seconds but that is definitely the
> > expected behavior.
>
> Hi Luiz,
>
> Thank you for providing those logs. Seeing the 40.5-second delta in
> your environment is very insightful and confirms that the standard
> stack should wait much longer than what I am observing.
>
> I have finally identified the root cause of the 2-second abort in my
> setup. My environment uses industrial TE Connectivity M5600 sensors,
> which are designed for ultra-low power consumption with a long
> advertising interval of 5 seconds.
>
> After auditing the kernel source, I found that HCI_CMD_TIMEOUT is
> hardcoded to 2.0 seconds (#define HCI_CMD_TIMEOUT
> msecs_to_jiffies(2000)).
>
> When the kernel issues HCI_OP_LE_CREATE_CONN, the local controller
> (Broadcom on RPi4 or Rockchip on Rock 4 C+) must wait for the next
> advertisement from the sensor to proceed with the connection. Since
> the M5600 only wakes up every 5s, the 2-second HCI_CMD_TIMEOUT
> systematically triggers before the controller can receive the
> advertisement and acknowledge the command completion. This leads to an
> immediate abort, even if the sensor is physically next to a high-gain
> antenna (9.4dBi).
>
> This explains why my v4 patch (forcing conn_timeout to 20s) worked as
> a side-effect: it kept the connection structure alive just long enough
> to bypass the immediate impact of the HCI command timeout, but it was
> architecturally the wrong target.
>
> I officially withdraw this patch series.
>
> However, this 2-second hardcoded limit for HCI_CMD_TIMEOUT seems
> fundamentally incompatible with many industrial low-duty-cycle
> sensors. Many developers on various forums resort to kernel hacks to
> bypass this.
>
> Would you consider a patch that either:
> 1. Increases HCI_CMD_TIMEOUT globally to 5 or 10 seconds?
> 2. Or makes the LE connection command timeout specifically
> configurable via the Management API or main.conf?
>
> I would like to work on a cleaner solution that accommodates these
> low-power industrial sleep cycles without breaking existing tools.

What kernel version are you seeing this behavior? We no longer use
HCI_CMD_TIMEOUT for HCI_OP_LE_CREATE_CONN:

https://github.com/bluez/bluetooth-next/blob/master/net/bluetooth/hci_sync.c#L6673

It was changed some 4 years back, so it quite an old change even for
stable kernel:

https://github.com/bluez/bluetooth-next/commit/a56a1138cbd85e4d565356199d60e1cb94e5a77a

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v4] Bluetooth: Increase LE connection timeout for industrial sensors
  2026-03-03 21:12         ` Luiz Augusto von Dentz
@ 2026-03-06  7:15           ` Dajid Morel
  2026-03-06 14:26             ` Luiz Augusto von Dentz
  0 siblings, 1 reply; 22+ messages in thread
From: Dajid Morel @ 2026-03-06  7:15 UTC (permalink / raw)
  To: Luiz Augusto von Dentz; +Cc: linux-bluetooth, Dajid MOREL

On Tue, Mar 3, 2026 at 10:12 PM Luiz Augusto von Dentz
<luiz.dentz@gmail.com> wrote:
> What kernel version are you seeing this behavior? We no longer use
> HCI_CMD_TIMEOUT for HCI_OP_LE_CREATE_CONN:
> https://github.com/bluez/bluetooth-next/blob/master/net/bluetooth/hci_sync.c#L6673
>
> It was changed some 4 years back, so it quite an old change even for
> stable kernel:
> https://github.com/bluez/bluetooth-next/commit/a56a1138cbd85e4d565356199d60e1cb94e5a77a

Hi Luiz,

To answer your question, our industrial environment runs on Ubuntu Core 22,
which uses the LTS kernel 5.15.0-1096-raspi (aarch64).

Thank you for the detailed logs and for pointing out commit
a56a1138cbd85e4d565356199d60e1cb94e5a77a.

I understand that HCI_OP_LE_CREATE_CONN itself has been decoupled from
HCI_CMD_TIMEOUT. However, I have conducted a deep code analysis on the
current bluetooth-next tree combined with an isolated empirical test on
our hardware, which strongly suggests that HCI_CMD_TIMEOUT is still the
root cause of the abort in our industrial use case.

Here are the facts and the methodology used to verify it.

1. Code analysis on bluetooth-next: HCI_CMD_TIMEOUT is still widely used

While the specific create connection command may use a different timeout,
the entire connection setup sequence relies on multiple synchronous HCI
commands. A simple grep on the current bluetooth-next tree shows extensive
usage of this 2-second limit:

$ grep -rn "HCI_CMD_TIMEOUT" net/bluetooth/ | wc -l
150

$ grep -rn "HCI_CMD_TIMEOUT" net/bluetooth/ | cut -d: -f1 | uniq -c | sort -nr
    124 net/bluetooth/hci_sync.c
      7 net/bluetooth/msft.c
      6 net/bluetooth/hci_core.c
      3 net/bluetooth/hci_conn.c

With 124 occurrences in hci_sync.c alone, many preparatory commands
(e.g. HCI_OP_LE_ADD_TO_ACCEPT_LIST, HCI_OP_LE_SET_EXT_SCAN_ENABLE,
which are visible in the memcheck logs you provided) rely on
__hci_cmd_sync_sk(), which falls back to the hardcoded
HCI_CMD_TIMEOUT (2000 ms).

2. Empirical test methodology

To verify that this global timeout is the limiting factor for our
5-second advertising interval sensors, we performed an isolated test
in our environment.

Environment:
Ubuntu Core 22 / Kernel LTS 5.15.0-1096-raspi (aarch64)

Sensor:
TE Connectivity M5600 (5 s advertising interval, ~12.5 s handshake time)

Action:
All previous patches were reverted, including the withdrawn v4 patch on
conn_timeout. We modified only the global definition in
include/net/bluetooth/hci.h:

--- a/include/net/bluetooth/hci.h
+++ b/include/net/bluetooth/hci.h
@@
-#define HCI_CMD_TIMEOUT msecs_to_jiffies(2000)
+#define HCI_CMD_TIMEOUT msecs_to_jiffies(15000)

Build process:
To avoid any userspace interference, we rebuilt the kernel natively as an
immutable Snap and generated a custom Ubuntu Core OS image using
snapcraft pack and ubuntu-image.

3. Test results

We modified only the global definition in include/net/bluetooth/hci.h and
observed the exact behavioral threshold.

Phase 1 (2000 ms – unmodified):
The connection attempt is aborted almost immediately and silently at the
HCI level. Userspace applications remain unaware and continue waiting,
which explains the ~45 s stall observed in our previous Python test.

Phase 2 (10000 ms):
The kernel allows the connection sequence to progress further, but the
sensor requires ~12.5 s to complete the handshake. The kernel timeout
therefore triggers right before completion. For the first time our
userspace daemon logged explicit "[BLE] Disconnected" events, showing
that the kernel actively aborted the handshake at the 10 s mark.

Phase 3 (15000 ms):
Once the kernel timeout exceeded the sensor response time, the connection
succeeded reliably. The full handshake consistently took ~12.5 seconds.

Conclusion

These observations suggest that even though HCI_OP_LE_CREATE_CONN itself
no longer relies on HCI_CMD_TIMEOUT, the overall connection sequence is
still constrained by synchronous preparatory commands in hci_sync.c that
use this timeout.

Because our sensors advertise only every 5 seconds, the state machine
appears to hit this limit before the full sequence can complete.

Since increasing HCI_CMD_TIMEOUT globally to ~15 seconds in the upstream
kernel may be too aggressive for other environments, what would be the
recommended approach from the BlueZ maintainers to support LE devices
with advertising intervals greater than 2 seconds?

Would it be acceptable to make this synchronization timeout configurable,
for example through sysfs or the mgmt API?

Best regards,

Dajid Morel
Volvo Group

Le mar. 3 mars 2026 à 22:12, Luiz Augusto von Dentz
<luiz.dentz@gmail.com> a écrit :
>
> Hi Dajid,
>
> On Tue, Mar 3, 2026 at 3:31 PM Dajid Morel <dajidp.morel@gmail.com> wrote:
> >
> > On Tue, Mar 3, 2026 at 8:26 PM Luiz Augusto von Dentz
> > <luiz.dentz@gmail.com> wrote:
> > >
> > > That is waiting 40 seconds as expected, so I'm not sure what is
> > > causing it to time out in 2 seconds but that is definitely the
> > > expected behavior.
> >
> > Hi Luiz,
> >
> > Thank you for providing those logs. Seeing the 40.5-second delta in
> > your environment is very insightful and confirms that the standard
> > stack should wait much longer than what I am observing.
> >
> > I have finally identified the root cause of the 2-second abort in my
> > setup. My environment uses industrial TE Connectivity M5600 sensors,
> > which are designed for ultra-low power consumption with a long
> > advertising interval of 5 seconds.
> >
> > After auditing the kernel source, I found that HCI_CMD_TIMEOUT is
> > hardcoded to 2.0 seconds (#define HCI_CMD_TIMEOUT
> > msecs_to_jiffies(2000)).
> >
> > When the kernel issues HCI_OP_LE_CREATE_CONN, the local controller
> > (Broadcom on RPi4 or Rockchip on Rock 4 C+) must wait for the next
> > advertisement from the sensor to proceed with the connection. Since
> > the M5600 only wakes up every 5s, the 2-second HCI_CMD_TIMEOUT
> > systematically triggers before the controller can receive the
> > advertisement and acknowledge the command completion. This leads to an
> > immediate abort, even if the sensor is physically next to a high-gain
> > antenna (9.4dBi).
> >
> > This explains why my v4 patch (forcing conn_timeout to 20s) worked as
> > a side-effect: it kept the connection structure alive just long enough
> > to bypass the immediate impact of the HCI command timeout, but it was
> > architecturally the wrong target.
> >
> > I officially withdraw this patch series.
> >
> > However, this 2-second hardcoded limit for HCI_CMD_TIMEOUT seems
> > fundamentally incompatible with many industrial low-duty-cycle
> > sensors. Many developers on various forums resort to kernel hacks to
> > bypass this.
> >
> > Would you consider a patch that either:
> > 1. Increases HCI_CMD_TIMEOUT globally to 5 or 10 seconds?
> > 2. Or makes the LE connection command timeout specifically
> > configurable via the Management API or main.conf?
> >
> > I would like to work on a cleaner solution that accommodates these
> > low-power industrial sleep cycles without breaking existing tools.
>
> What kernel version are you seeing this behavior? We no longer use
> HCI_CMD_TIMEOUT for HCI_OP_LE_CREATE_CONN:
>
> https://github.com/bluez/bluetooth-next/blob/master/net/bluetooth/hci_sync.c#L6673
>
> It was changed some 4 years back, so it quite an old change even for
> stable kernel:
>
> https://github.com/bluez/bluetooth-next/commit/a56a1138cbd85e4d565356199d60e1cb94e5a77a

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v4] Bluetooth: Increase LE connection timeout for industrial sensors
  2026-03-06  7:15           ` Dajid Morel
@ 2026-03-06 14:26             ` Luiz Augusto von Dentz
       [not found]               ` <CAM8DPm2z-6xUm3SyFJ9umn4=o9bBov6PhKV0TEDCBc14eMFSew@mail.gmail.com>
  0 siblings, 1 reply; 22+ messages in thread
From: Luiz Augusto von Dentz @ 2026-03-06 14:26 UTC (permalink / raw)
  To: Dajid Morel; +Cc: linux-bluetooth, Dajid MOREL

Hi Dajid,

On Fri, Mar 6, 2026 at 2:15 AM Dajid Morel <dajidp.morel@gmail.com> wrote:
>
> On Tue, Mar 3, 2026 at 10:12 PM Luiz Augusto von Dentz
> <luiz.dentz@gmail.com> wrote:
> > What kernel version are you seeing this behavior? We no longer use
> > HCI_CMD_TIMEOUT for HCI_OP_LE_CREATE_CONN:
> > https://github.com/bluez/bluetooth-next/blob/master/net/bluetooth/hci_sync.c#L6673
> >
> > It was changed some 4 years back, so it quite an old change even for
> > stable kernel:
> > https://github.com/bluez/bluetooth-next/commit/a56a1138cbd85e4d565356199d60e1cb94e5a77a
>
> Hi Luiz,
>
> To answer your question, our industrial environment runs on Ubuntu Core 22,
> which uses the LTS kernel 5.15.0-1096-raspi (aarch64).
>
> Thank you for the detailed logs and for pointing out commit
> a56a1138cbd85e4d565356199d60e1cb94e5a77a.
>
> I understand that HCI_OP_LE_CREATE_CONN itself has been decoupled from
> HCI_CMD_TIMEOUT. However, I have conducted a deep code analysis on the
> current bluetooth-next tree combined with an isolated empirical test on
> our hardware, which strongly suggests that HCI_CMD_TIMEOUT is still the
> root cause of the abort in our industrial use case.
>
> Here are the facts and the methodology used to verify it.
>
> 1. Code analysis on bluetooth-next: HCI_CMD_TIMEOUT is still widely used
>
> While the specific create connection command may use a different timeout,
> the entire connection setup sequence relies on multiple synchronous HCI
> commands. A simple grep on the current bluetooth-next tree shows extensive
> usage of this 2-second limit:
>
> $ grep -rn "HCI_CMD_TIMEOUT" net/bluetooth/ | wc -l
> 150
>
> $ grep -rn "HCI_CMD_TIMEOUT" net/bluetooth/ | cut -d: -f1 | uniq -c | sort -nr
>     124 net/bluetooth/hci_sync.c
>       7 net/bluetooth/msft.c
>       6 net/bluetooth/hci_core.c
>       3 net/bluetooth/hci_conn.c
>
> With 124 occurrences in hci_sync.c alone, many preparatory commands
> (e.g. HCI_OP_LE_ADD_TO_ACCEPT_LIST, HCI_OP_LE_SET_EXT_SCAN_ENABLE,
> which are visible in the memcheck logs you provided) rely on
> __hci_cmd_sync_sk(), which falls back to the hardcoded
> HCI_CMD_TIMEOUT (2000 ms).
>
> 2. Empirical test methodology
>
> To verify that this global timeout is the limiting factor for our
> 5-second advertising interval sensors, we performed an isolated test
> in our environment.
>
> Environment:
> Ubuntu Core 22 / Kernel LTS 5.15.0-1096-raspi (aarch64)
>
> Sensor:
> TE Connectivity M5600 (5 s advertising interval, ~12.5 s handshake time)
>
> Action:
> All previous patches were reverted, including the withdrawn v4 patch on
> conn_timeout. We modified only the global definition in
> include/net/bluetooth/hci.h:
>
> --- a/include/net/bluetooth/hci.h
> +++ b/include/net/bluetooth/hci.h
> @@
> -#define HCI_CMD_TIMEOUT msecs_to_jiffies(2000)
> +#define HCI_CMD_TIMEOUT msecs_to_jiffies(15000)
>
> Build process:
> To avoid any userspace interference, we rebuilt the kernel natively as an
> immutable Snap and generated a custom Ubuntu Core OS image using
> snapcraft pack and ubuntu-image.
>
> 3. Test results
>
> We modified only the global definition in include/net/bluetooth/hci.h and
> observed the exact behavioral threshold.
>
> Phase 1 (2000 ms – unmodified):
> The connection attempt is aborted almost immediately and silently at the
> HCI level. Userspace applications remain unaware and continue waiting,
> which explains the ~45 s stall observed in our previous Python test.
>
> Phase 2 (10000 ms):
> The kernel allows the connection sequence to progress further, but the
> sensor requires ~12.5 s to complete the handshake. The kernel timeout
> therefore triggers right before completion. For the first time our
> userspace daemon logged explicit "[BLE] Disconnected" events, showing
> that the kernel actively aborted the handshake at the 10 s mark.
>
> Phase 3 (15000 ms):
> Once the kernel timeout exceeded the sensor response time, the connection
> succeeded reliably. The full handshake consistently took ~12.5 seconds.
>
> Conclusion
>
> These observations suggest that even though HCI_OP_LE_CREATE_CONN itself
> no longer relies on HCI_CMD_TIMEOUT, the overall connection sequence is
> still constrained by synchronous preparatory commands in hci_sync.c that
> use this timeout.
>
> Because our sensors advertise only every 5 seconds, the state machine
> appears to hit this limit before the full sequence can complete.
>
> Since increasing HCI_CMD_TIMEOUT globally to ~15 seconds in the upstream
> kernel may be too aggressive for other environments, what would be the
> recommended approach from the BlueZ maintainers to support LE devices
> with advertising intervals greater than 2 seconds?
>
> Would it be acceptable to make this synchronization timeout configurable,
> for example through sysfs or the mgmt API?

Im talking to an AI model/agent? However, it does look like the above
was generated by an AI model that is only checking the timeout used in
the commands without knowing the command sequence performed when
attempting a connection. Specifically for commands that report status
the timeout is short because the controller only needs to confirm it
received and understood the command. In fact, most commands behave
this way since they really need to generate a command complete or
status as soon as possible; otherwise, the host wouldn't be able to
continue sending the next command. Therefore, the rambling about the
usage of HCI_CMD_TIMEOUT is nonsense.

Regarding the actual problem, try using something newer, 5.15 might
not actually contain the necessary changes to wait an arbitrary amount
of time for the connection to complete.

> Best regards,
>
> Dajid Morel
> Volvo Group
>
> Le mar. 3 mars 2026 à 22:12, Luiz Augusto von Dentz
> <luiz.dentz@gmail.com> a écrit :
> >
> > Hi Dajid,
> >
> > On Tue, Mar 3, 2026 at 3:31 PM Dajid Morel <dajidp.morel@gmail.com> wrote:
> > >
> > > On Tue, Mar 3, 2026 at 8:26 PM Luiz Augusto von Dentz
> > > <luiz.dentz@gmail.com> wrote:
> > > >
> > > > That is waiting 40 seconds as expected, so I'm not sure what is
> > > > causing it to time out in 2 seconds but that is definitely the
> > > > expected behavior.
> > >
> > > Hi Luiz,
> > >
> > > Thank you for providing those logs. Seeing the 40.5-second delta in
> > > your environment is very insightful and confirms that the standard
> > > stack should wait much longer than what I am observing.
> > >
> > > I have finally identified the root cause of the 2-second abort in my
> > > setup. My environment uses industrial TE Connectivity M5600 sensors,
> > > which are designed for ultra-low power consumption with a long
> > > advertising interval of 5 seconds.
> > >
> > > After auditing the kernel source, I found that HCI_CMD_TIMEOUT is
> > > hardcoded to 2.0 seconds (#define HCI_CMD_TIMEOUT
> > > msecs_to_jiffies(2000)).
> > >
> > > When the kernel issues HCI_OP_LE_CREATE_CONN, the local controller
> > > (Broadcom on RPi4 or Rockchip on Rock 4 C+) must wait for the next
> > > advertisement from the sensor to proceed with the connection. Since
> > > the M5600 only wakes up every 5s, the 2-second HCI_CMD_TIMEOUT
> > > systematically triggers before the controller can receive the
> > > advertisement and acknowledge the command completion. This leads to an
> > > immediate abort, even if the sensor is physically next to a high-gain
> > > antenna (9.4dBi).
> > >
> > > This explains why my v4 patch (forcing conn_timeout to 20s) worked as
> > > a side-effect: it kept the connection structure alive just long enough
> > > to bypass the immediate impact of the HCI command timeout, but it was
> > > architecturally the wrong target.
> > >
> > > I officially withdraw this patch series.
> > >
> > > However, this 2-second hardcoded limit for HCI_CMD_TIMEOUT seems
> > > fundamentally incompatible with many industrial low-duty-cycle
> > > sensors. Many developers on various forums resort to kernel hacks to
> > > bypass this.
> > >
> > > Would you consider a patch that either:
> > > 1. Increases HCI_CMD_TIMEOUT globally to 5 or 10 seconds?
> > > 2. Or makes the LE connection command timeout specifically
> > > configurable via the Management API or main.conf?
> > >
> > > I would like to work on a cleaner solution that accommodates these
> > > low-power industrial sleep cycles without breaking existing tools.
> >
> > What kernel version are you seeing this behavior? We no longer use
> > HCI_CMD_TIMEOUT for HCI_OP_LE_CREATE_CONN:
> >
> > https://github.com/bluez/bluetooth-next/blob/master/net/bluetooth/hci_sync.c#L6673
> >
> > It was changed some 4 years back, so it quite an old change even for
> > stable kernel:
> >
> > https://github.com/bluez/bluetooth-next/commit/a56a1138cbd85e4d565356199d60e1cb94e5a77a



-- 
Luiz Augusto von Dentz

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v4] Bluetooth: Increase LE connection timeout for industrial sensors
       [not found]               ` <CAM8DPm2z-6xUm3SyFJ9umn4=o9bBov6PhKV0TEDCBc14eMFSew@mail.gmail.com>
@ 2026-03-06 15:57                 ` Luiz Augusto von Dentz
  2026-03-06 17:54                   ` Dajid Morel
  0 siblings, 1 reply; 22+ messages in thread
From: Luiz Augusto von Dentz @ 2026-03-06 15:57 UTC (permalink / raw)
  To: Dajid Morel; +Cc: linux-bluetooth, Dajid MOREL

Hi Dajid,

On Fri, Mar 6, 2026 at 10:48 AM Dajid Morel <dajidp.morel@gmail.com> wrote:
>
> > Im talking to an AI model/agent? However, it does look like the above
> > was generated by an AI model that is only checking the timeout used in
> > the commands without knowing the command sequence performed when
> > attempting a connection. Specifically for commands that report status
> > the timeout is short because the controller only needs to confirm it
> > received and understood the command. In fact, most commands behave
> > this way since they really need to generate a command complete or
> > status as soon as possible; otherwise, the host wouldn't be able to
> > continue sending the next command. Therefore, the rambling about the
> > usage of HCI_CMD_TIMEOUT is nonsense.
>
> > Regarding the actual problem, try using something newer, 5.15 might
> > not actually contain the necessary changes to wait an arbitrary amount
> > of time for the connection to complete.
>
>
> Hi Luiz,
>
>
> English is not my primary language, that’s why i’m using AI to help rephrase my thoughts into proper English. However, I want to be very clear: the grep results, the logic analysis, and the hardware tests are 100% manual and were conducted by me on real industrial equipment.
>
> Regarding your point on 5.15 being old: you are correct. But as I mentioned, we are tied to Ubuntu Core 22 for this Volvo deployment.
>
> The "rambling" about the 2-second timeout comes from a very concrete observation. Today, I tested my Phase 3 (15s timeout) on the Raspberry Pi 4 with the TE M5600 sensor.
>
> The result is a success:
>
> For the first time, the sensor successfully associated:
>
> [BLE] Capteur associé : E8:C0:B1:D4:A3:3C
>
> The handshake systematically succeeds now, whereas it was a 100% failure rate with the stock 2-second HCI_CMD_TIMEOUT.
>
> I understand your concern about breaking the HCI bus logic with a global 15s timeout. However, if 5.15 is "lacking the necessary changes", could you point me to the specific upstream commits that implement the "arbitrary wait" for LE connections?

You mean https://github.com/bluez/bluetooth-next/commit/a56a1138cbd85e4d565356199d60e1cb94e5a77a
that was posted 3 days ago in this thread? That in theory should have
been backported.

> I would much rather backport the "correct" architectural fix to our 5.15 tree than keep this global hammer, but I need a path forward that supports these 5s-advertising interval sensors on our current LTS platform.

> Dajid Morel,
>
> Volvo Group
>
>
>
> On Fri, Mar 6, 2026 at 15:26 Luiz Augusto von Dentz <luiz.dentz@gmail.com> wrote:
>>
>> Hi Dajid,
>>
>> On Fri, Mar 6, 2026 at 2:15 AM Dajid Morel <dajidp.morel@gmail.com> wrote:
>> >
>> > On Tue, Mar 3, 2026 at 10:12 PM Luiz Augusto von Dentz
>> > <luiz.dentz@gmail.com> wrote:
>> > > What kernel version are you seeing this behavior? We no longer use
>> > > HCI_CMD_TIMEOUT for HCI_OP_LE_CREATE_CONN:
>> > > https://github.com/bluez/bluetooth-next/blob/master/net/bluetooth/hci_sync.c#L6673
>> > >
>> > > It was changed some 4 years back, so it quite an old change even for
>> > > stable kernel:
>> > > https://github.com/bluez/bluetooth-next/commit/a56a1138cbd85e4d565356199d60e1cb94e5a77a
>> >
>> > Hi Luiz,
>> >
>> > To answer your question, our industrial environment runs on Ubuntu Core 22,
>> > which uses the LTS kernel 5.15.0-1096-raspi (aarch64).
>> >
>> > Thank you for the detailed logs and for pointing out commit
>> > a56a1138cbd85e4d565356199d60e1cb94e5a77a.
>> >
>> > I understand that HCI_OP_LE_CREATE_CONN itself has been decoupled from
>> > HCI_CMD_TIMEOUT. However, I have conducted a deep code analysis on the
>> > current bluetooth-next tree combined with an isolated empirical test on
>> > our hardware, which strongly suggests that HCI_CMD_TIMEOUT is still the
>> > root cause of the abort in our industrial use case.
>> >
>> > Here are the facts and the methodology used to verify it.
>> >
>> > 1. Code analysis on bluetooth-next: HCI_CMD_TIMEOUT is still widely used
>> >
>> > While the specific create connection command may use a different timeout,
>> > the entire connection setup sequence relies on multiple synchronous HCI
>> > commands. A simple grep on the current bluetooth-next tree shows extensive
>> > usage of this 2-second limit:
>> >
>> > $ grep -rn "HCI_CMD_TIMEOUT" net/bluetooth/ | wc -l
>> > 150
>> >
>> > $ grep -rn "HCI_CMD_TIMEOUT" net/bluetooth/ | cut -d: -f1 | uniq -c | sort -nr
>> >     124 net/bluetooth/hci_sync.c
>> >       7 net/bluetooth/msft.c
>> >       6 net/bluetooth/hci_core.c
>> >       3 net/bluetooth/hci_conn.c
>> >
>> > With 124 occurrences in hci_sync.c alone, many preparatory commands
>> > (e.g. HCI_OP_LE_ADD_TO_ACCEPT_LIST, HCI_OP_LE_SET_EXT_SCAN_ENABLE,
>> > which are visible in the memcheck logs you provided) rely on
>> > __hci_cmd_sync_sk(), which falls back to the hardcoded
>> > HCI_CMD_TIMEOUT (2000 ms).
>> >
>> > 2. Empirical test methodology
>> >
>> > To verify that this global timeout is the limiting factor for our
>> > 5-second advertising interval sensors, we performed an isolated test
>> > in our environment.
>> >
>> > Environment:
>> > Ubuntu Core 22 / Kernel LTS 5.15.0-1096-raspi (aarch64)
>> >
>> > Sensor:
>> > TE Connectivity M5600 (5 s advertising interval, ~12.5 s handshake time)
>> >
>> > Action:
>> > All previous patches were reverted, including the withdrawn v4 patch on
>> > conn_timeout. We modified only the global definition in
>> > include/net/bluetooth/hci.h:
>> >
>> > --- a/include/net/bluetooth/hci.h
>> > +++ b/include/net/bluetooth/hci.h
>> > @@
>> > -#define HCI_CMD_TIMEOUT msecs_to_jiffies(2000)
>> > +#define HCI_CMD_TIMEOUT msecs_to_jiffies(15000)
>> >
>> > Build process:
>> > To avoid any userspace interference, we rebuilt the kernel natively as an
>> > immutable Snap and generated a custom Ubuntu Core OS image using
>> > snapcraft pack and ubuntu-image.
>> >
>> > 3. Test results
>> >
>> > We modified only the global definition in include/net/bluetooth/hci.h and
>> > observed the exact behavioral threshold.
>> >
>> > Phase 1 (2000 ms – unmodified):
>> > The connection attempt is aborted almost immediately and silently at the
>> > HCI level. Userspace applications remain unaware and continue waiting,
>> > which explains the ~45 s stall observed in our previous Python test.
>> >
>> > Phase 2 (10000 ms):
>> > The kernel allows the connection sequence to progress further, but the
>> > sensor requires ~12.5 s to complete the handshake. The kernel timeout
>> > therefore triggers right before completion. For the first time our
>> > userspace daemon logged explicit "[BLE] Disconnected" events, showing
>> > that the kernel actively aborted the handshake at the 10 s mark.
>> >
>> > Phase 3 (15000 ms):
>> > Once the kernel timeout exceeded the sensor response time, the connection
>> > succeeded reliably. The full handshake consistently took ~12.5 seconds.
>> >
>> > Conclusion
>> >
>> > These observations suggest that even though HCI_OP_LE_CREATE_CONN itself
>> > no longer relies on HCI_CMD_TIMEOUT, the overall connection sequence is
>> > still constrained by synchronous preparatory commands in hci_sync.c that
>> > use this timeout.
>> >
>> > Because our sensors advertise only every 5 seconds, the state machine
>> > appears to hit this limit before the full sequence can complete.
>> >
>> > Since increasing HCI_CMD_TIMEOUT globally to ~15 seconds in the upstream
>> > kernel may be too aggressive for other environments, what would be the
>> > recommended approach from the BlueZ maintainers to support LE devices
>> > with advertising intervals greater than 2 seconds?
>> >
>> > Would it be acceptable to make this synchronization timeout configurable,
>> > for example through sysfs or the mgmt API?
>>
>> Im talking to an AI model/agent? However, it does look like the above
>> was generated by an AI model that is only checking the timeout used in
>> the commands without knowing the command sequence performed when
>> attempting a connection. Specifically for commands that report status
>> the timeout is short because the controller only needs to confirm it
>> received and understood the command. In fact, most commands behave
>> this way since they really need to generate a command complete or
>> status as soon as possible; otherwise, the host wouldn't be able to
>> continue sending the next command. Therefore, the rambling about the
>> usage of HCI_CMD_TIMEOUT is nonsense.
>>
>> Regarding the actual problem, try using something newer, 5.15 might
>> not actually contain the necessary changes to wait an arbitrary amount
>> of time for the connection to complete.
>>
>> > Best regards,
>> >
>> > Dajid Morel
>> > Volvo Group
>> >
>> > Le mar. 3 mars 2026 à 22:12, Luiz Augusto von Dentz
>> > <luiz.dentz@gmail.com> a écrit :
>> > >
>> > > Hi Dajid,
>> > >
>> > > On Tue, Mar 3, 2026 at 3:31 PM Dajid Morel <dajidp.morel@gmail.com> wrote:
>> > > >
>> > > > On Tue, Mar 3, 2026 at 8:26 PM Luiz Augusto von Dentz
>> > > > <luiz.dentz@gmail.com> wrote:
>> > > > >
>> > > > > That is waiting 40 seconds as expected, so I'm not sure what is
>> > > > > causing it to time out in 2 seconds but that is definitely the
>> > > > > expected behavior.
>> > > >
>> > > > Hi Luiz,
>> > > >
>> > > > Thank you for providing those logs. Seeing the 40.5-second delta in
>> > > > your environment is very insightful and confirms that the standard
>> > > > stack should wait much longer than what I am observing.
>> > > >
>> > > > I have finally identified the root cause of the 2-second abort in my
>> > > > setup. My environment uses industrial TE Connectivity M5600 sensors,
>> > > > which are designed for ultra-low power consumption with a long
>> > > > advertising interval of 5 seconds.
>> > > >
>> > > > After auditing the kernel source, I found that HCI_CMD_TIMEOUT is
>> > > > hardcoded to 2.0 seconds (#define HCI_CMD_TIMEOUT
>> > > > msecs_to_jiffies(2000)).
>> > > >
>> > > > When the kernel issues HCI_OP_LE_CREATE_CONN, the local controller
>> > > > (Broadcom on RPi4 or Rockchip on Rock 4 C+) must wait for the next
>> > > > advertisement from the sensor to proceed with the connection. Since
>> > > > the M5600 only wakes up every 5s, the 2-second HCI_CMD_TIMEOUT
>> > > > systematically triggers before the controller can receive the
>> > > > advertisement and acknowledge the command completion. This leads to an
>> > > > immediate abort, even if the sensor is physically next to a high-gain
>> > > > antenna (9.4dBi).
>> > > >
>> > > > This explains why my v4 patch (forcing conn_timeout to 20s) worked as
>> > > > a side-effect: it kept the connection structure alive just long enough
>> > > > to bypass the immediate impact of the HCI command timeout, but it was
>> > > > architecturally the wrong target.
>> > > >
>> > > > I officially withdraw this patch series.
>> > > >
>> > > > However, this 2-second hardcoded limit for HCI_CMD_TIMEOUT seems
>> > > > fundamentally incompatible with many industrial low-duty-cycle
>> > > > sensors. Many developers on various forums resort to kernel hacks to
>> > > > bypass this.
>> > > >
>> > > > Would you consider a patch that either:
>> > > > 1. Increases HCI_CMD_TIMEOUT globally to 5 or 10 seconds?
>> > > > 2. Or makes the LE connection command timeout specifically
>> > > > configurable via the Management API or main.conf?
>> > > >
>> > > > I would like to work on a cleaner solution that accommodates these
>> > > > low-power industrial sleep cycles without breaking existing tools.
>> > >
>> > > What kernel version are you seeing this behavior? We no longer use
>> > > HCI_CMD_TIMEOUT for HCI_OP_LE_CREATE_CONN:
>> > >
>> > > https://github.com/bluez/bluetooth-next/blob/master/net/bluetooth/hci_sync.c#L6673
>> > >
>> > > It was changed some 4 years back, so it quite an old change even for
>> > > stable kernel:
>> > >
>> > > https://github.com/bluez/bluetooth-next/commit/a56a1138cbd85e4d565356199d60e1cb94e5a77a
>>
>>
>>
>> --
>> Luiz Augusto von Dentz



-- 
Luiz Augusto von Dentz

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v4] Bluetooth: Increase LE connection timeout for industrial sensors
  2026-03-06 15:57                 ` Luiz Augusto von Dentz
@ 2026-03-06 17:54                   ` Dajid Morel
  2026-03-06 18:20                     ` Dajid Morel
  2026-03-06 18:27                     ` Luiz Augusto von Dentz
  0 siblings, 2 replies; 22+ messages in thread
From: Dajid Morel @ 2026-03-06 17:54 UTC (permalink / raw)
  To: Luiz Augusto von Dentz; +Cc: linux-bluetooth, Dajid MOREL

On Tue, Mar 3, 2026 at 10:12 PM Luiz Augusto von Dentz
<luiz.dentz@gmail.com> wrote:
> You mean https://github.com/bluez/bluetooth-next/commit/a56a1138cbd85e4d565356199d60e1cb94e5a77a
> that was posted 3 days ago in this thread?
> That in theory should have been backported.

Hi Luiz,

ubuntu@builder:~/bluetooth-next$ grep -nB 2 "HCI_CMD_TIMEOUT"
net/bluetooth/hci_sync.c | grep "HCI_OP_LE_ADD_TO_ACCEPT_LIST"
2511:    err = __hci_cmd_sync_status(hdev, HCI_OP_LE_ADD_TO_ACCEPT_LIST,
2512:                                sizeof(cp), &cp, HCI_CMD_TIMEOUT);

As shown above, the preparatory command HCI_OP_LE_ADD_TO_ACCEPT_LIST is still
hardcoded to HCI_CMD_TIMEOUT (2s). In the 5.15 LTS kernel (and bluetooth-next),
this command is part of the mandatory sequence before the connection is even
attempted.

ubuntu@builder:~/bluetooth-next$ sed -n '2850,2855p' net/bluetooth/hci_sync.c
for (i = 0; i < n; ++i) {
err = hci_le_add_accept_list_sync(hdev, &params[i],
&num_entries);

Even if the final HCI_OP_LE_CREATE_CONN is decoupled (line 6673), the state
machine fails at line 2511 because our industrial sensors (TE M5600) have a
5-second advertising interval. The controller times out before the device is
even added to the accept list.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v4] Bluetooth: Increase LE connection timeout for industrial sensors
  2026-03-06 17:54                   ` Dajid Morel
@ 2026-03-06 18:20                     ` Dajid Morel
  2026-03-06 18:27                     ` Luiz Augusto von Dentz
  1 sibling, 0 replies; 22+ messages in thread
From: Dajid Morel @ 2026-03-06 18:20 UTC (permalink / raw)
  To: Luiz Augusto von Dentz; +Cc: linux-bluetooth, Dajid MOREL

On Tue, Mar 3, 2026 at 10:12 PM Luiz Augusto von Dentz
<luiz.dentz@gmail.com> wrote:
> You mean https://github.com/bluez/bluetooth-next/commit/a56a1138cbd85e4d565356199d60e1cb94e5a77a
> that was posted 3 days ago in this thread?
> That in theory should have been backported.

Yes I mean https://github.com/bluez/bluetooth-next/commit/a56a1138cbd85e4d565356199d60e1cb94e5a77a,
and the investigation above is conducted on the latest tree.

Le ven. 6 mars 2026 à 18:54, Dajid Morel <dajidp.morel@gmail.com> a écrit :
>
> On Tue, Mar 3, 2026 at 10:12 PM Luiz Augusto von Dentz
> <luiz.dentz@gmail.com> wrote:
> > You mean https://github.com/bluez/bluetooth-next/commit/a56a1138cbd85e4d565356199d60e1cb94e5a77a
> > that was posted 3 days ago in this thread?
> > That in theory should have been backported.
>
> Hi Luiz,
>
> ubuntu@builder:~/bluetooth-next$ grep -nB 2 "HCI_CMD_TIMEOUT"
> net/bluetooth/hci_sync.c | grep "HCI_OP_LE_ADD_TO_ACCEPT_LIST"
> 2511:    err = __hci_cmd_sync_status(hdev, HCI_OP_LE_ADD_TO_ACCEPT_LIST,
> 2512:                                sizeof(cp), &cp, HCI_CMD_TIMEOUT);
>
> As shown above, the preparatory command HCI_OP_LE_ADD_TO_ACCEPT_LIST is still
> hardcoded to HCI_CMD_TIMEOUT (2s). In the 5.15 LTS kernel (and bluetooth-next),
> this command is part of the mandatory sequence before the connection is even
> attempted.
>
> ubuntu@builder:~/bluetooth-next$ sed -n '2850,2855p' net/bluetooth/hci_sync.c
> for (i = 0; i < n; ++i) {
> err = hci_le_add_accept_list_sync(hdev, &params[i],
> &num_entries);
>
> Even if the final HCI_OP_LE_CREATE_CONN is decoupled (line 6673), the state
> machine fails at line 2511 because our industrial sensors (TE M5600) have a
> 5-second advertising interval. The controller times out before the device is
> even added to the accept list.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v4] Bluetooth: Increase LE connection timeout for industrial sensors
  2026-03-06 17:54                   ` Dajid Morel
  2026-03-06 18:20                     ` Dajid Morel
@ 2026-03-06 18:27                     ` Luiz Augusto von Dentz
  2026-03-09 10:02                       ` Dajid Morel
  1 sibling, 1 reply; 22+ messages in thread
From: Luiz Augusto von Dentz @ 2026-03-06 18:27 UTC (permalink / raw)
  To: Dajid Morel; +Cc: linux-bluetooth, Dajid MOREL

Hi Dajid,

On Fri, Mar 6, 2026 at 12:54 PM Dajid Morel <dajidp.morel@gmail.com> wrote:
>
> On Tue, Mar 3, 2026 at 10:12 PM Luiz Augusto von Dentz
> <luiz.dentz@gmail.com> wrote:
> > You mean https://github.com/bluez/bluetooth-next/commit/a56a1138cbd85e4d565356199d60e1cb94e5a77a
> > that was posted 3 days ago in this thread?
> > That in theory should have been backported.
>
> Hi Luiz,
>
> ubuntu@builder:~/bluetooth-next$ grep -nB 2 "HCI_CMD_TIMEOUT"
> net/bluetooth/hci_sync.c | grep "HCI_OP_LE_ADD_TO_ACCEPT_LIST"
> 2511:    err = __hci_cmd_sync_status(hdev, HCI_OP_LE_ADD_TO_ACCEPT_LIST,
> 2512:                                sizeof(cp), &cp, HCI_CMD_TIMEOUT);
>
> As shown above, the preparatory command HCI_OP_LE_ADD_TO_ACCEPT_LIST is still
> hardcoded to HCI_CMD_TIMEOUT (2s). In the 5.15 LTS kernel (and bluetooth-next),
> this command is part of the mandatory sequence before the connection is even
> attempted.

That doesn't incur in any traffic, sounds like you didn't read my
previous response where I explained why a short timeout is normally
required for HCI, since normally only one command can be outstanding,
I really don't know why you keep coming back to the same topic when
Ive already shown up to date distros wait 40 seconds to complete a
connection.

> ubuntu@builder:~/bluetooth-next$ sed -n '2850,2855p' net/bluetooth/hci_sync.c
> for (i = 0; i < n; ++i) {
> err = hci_le_add_accept_list_sync(hdev, &params[i],
> &num_entries);
>
> Even if the final HCI_OP_LE_CREATE_CONN is decoupled (line 6673), the state
> machine fails at line 2511 because our industrial sensors (TE M5600) have a
> 5-second advertising interval. The controller times out before the device is
> even added to the accept list.

Yeah, you really don't know what you are talking about, there is no
timeout on HCI_OP_LE_CREATE_CONN itself, the controller shall generate
a command complete immediately and the connection attempt is only
interrupted with HCI_OP_LE_CREATE_CONN_CANCEL, so lets say yo want to
increase HCI_CMD_TIMEOUT that means HCI_OP_LE_CREATE_CONN_CANCEL
cannot be send because it would be pending on HCI_EV_LE_CONN_COMPLETE,
anyway I fill like Im wasting my time here.

-- 
Luiz Augusto von Dentz

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v4] Bluetooth: Increase LE connection timeout for industrial sensors
  2026-03-06 18:27                     ` Luiz Augusto von Dentz
@ 2026-03-09 10:02                       ` Dajid Morel
  2026-03-09 16:02                         ` Paul Menzel
  0 siblings, 1 reply; 22+ messages in thread
From: Dajid Morel @ 2026-03-09 10:02 UTC (permalink / raw)
  To: Luiz Augusto von Dentz; +Cc: linux-bluetooth, Dajid MOREL

> Yeah, you really don't know what you are talking about, there is no
> timeout on HCI_OP_LE_CREATE_CONN itself, the controller shall generate
> a command complete immediately and the connection attempt is only
> interrupted with HCI_OP_LE_CREATE_CONN_CANCEL, so lets say yo want to
> increase HCI_CMD_TIMEOUT that means HCI_OP_LE_CREATE_CONN_CANCEL
> cannot be send because it would be pending on HCI_EV_LE_CONN_COMPLETE,
> anyway I fill like Im wasting my time here.

Hi Luiz,

I'll be brief and stick to the logs. Here is the output from a stock
5.15 kernel (2s timeout) on the Raspberry Pi 4:

[bluetooth]# connect E8:C0:B1:D4:A3:3C
Attempting to connect to E8:C0:B1:D4:A3:3C
Failed to connect: org.bluez.Error.Failed le-connection-abort-by-local

The "le-connection-abort-by-local" error is the smoking gun. It proves
the Host is aborting the sequence, not the peer.

When I apply my patch (15s timeout) on the exact same hardware:

- The "abort-by-local" error disappears completely.
- The connection succeeds 100% of the time.
- We can read the pressure data.

I noticed in dmesg that the Broadcom controller is missing its firmware patch:
[   16.357546] Bluetooth: hci0: BCM: chip id 63
[   16.360628] Bluetooth: hci0: BCM: features 0x07
[   16.378604] Bluetooth: hci0: BCM20702A
[   16.378639] Bluetooth: hci0: BCM20702A1 (001.002.014) build 0000
[   16.381695] Bluetooth: hci0: BCM: firmware Patch file not found, tried:
[   16.388662] Bluetooth: hci0: BCM: 'brcm/BCM20702A1-0a5c-21e8.hcd'
[   16.394969] Bluetooth: hci0: BCM: 'brcm/BCM-0a5c-21e8.hcd'

This likely makes the controller slower to process sync commands when
high-latency sensors are advertising nearby. However, the system is
100% stable with the 15s timeout patch, even without that firmware.

If the timeout is truly decoupled, then "le-connection-abort-by-local"
should not be triggered at exactly 2 seconds. The fact that it is
proves that the hardcoded limit in hci_sync.c is the blocker.

Since this is for a Volvo production line, we need a way to support
these sensors. If you refuse the global constant change, how can we
avoid this "local abort" in the sync sequence for slow controllers?

Le ven. 6 mars 2026 à 19:27, Luiz Augusto von Dentz
<luiz.dentz@gmail.com> a écrit :
>
> Hi Dajid,
>
> On Fri, Mar 6, 2026 at 12:54 PM Dajid Morel <dajidp.morel@gmail.com> wrote:
> >
> > On Tue, Mar 3, 2026 at 10:12 PM Luiz Augusto von Dentz
> > <luiz.dentz@gmail.com> wrote:
> > > You mean https://github.com/bluez/bluetooth-next/commit/a56a1138cbd85e4d565356199d60e1cb94e5a77a
> > > that was posted 3 days ago in this thread?
> > > That in theory should have been backported.
> >
> > Hi Luiz,
> >
> > ubuntu@builder:~/bluetooth-next$ grep -nB 2 "HCI_CMD_TIMEOUT"
> > net/bluetooth/hci_sync.c | grep "HCI_OP_LE_ADD_TO_ACCEPT_LIST"
> > 2511:    err = __hci_cmd_sync_status(hdev, HCI_OP_LE_ADD_TO_ACCEPT_LIST,
> > 2512:                                sizeof(cp), &cp, HCI_CMD_TIMEOUT);
> >
> > As shown above, the preparatory command HCI_OP_LE_ADD_TO_ACCEPT_LIST is still
> > hardcoded to HCI_CMD_TIMEOUT (2s). In the 5.15 LTS kernel (and bluetooth-next),
> > this command is part of the mandatory sequence before the connection is even
> > attempted.
>
> That doesn't incur in any traffic, sounds like you didn't read my
> previous response where I explained why a short timeout is normally
> required for HCI, since normally only one command can be outstanding,
> I really don't know why you keep coming back to the same topic when
> Ive already shown up to date distros wait 40 seconds to complete a
> connection.
>
> > ubuntu@builder:~/bluetooth-next$ sed -n '2850,2855p' net/bluetooth/hci_sync.c
> > for (i = 0; i < n; ++i) {
> > err = hci_le_add_accept_list_sync(hdev, &params[i],
> > &num_entries);
> >
> > Even if the final HCI_OP_LE_CREATE_CONN is decoupled (line 6673), the state
> > machine fails at line 2511 because our industrial sensors (TE M5600) have a
> > 5-second advertising interval. The controller times out before the device is
> > even added to the accept list.
>
> Yeah, you really don't know what you are talking about, there is no
> timeout on HCI_OP_LE_CREATE_CONN itself, the controller shall generate
> a command complete immediately and the connection attempt is only
> interrupted with HCI_OP_LE_CREATE_CONN_CANCEL, so lets say yo want to
> increase HCI_CMD_TIMEOUT that means HCI_OP_LE_CREATE_CONN_CANCEL
> cannot be send because it would be pending on HCI_EV_LE_CONN_COMPLETE,
> anyway I fill like Im wasting my time here.
>
> --
> Luiz Augusto von Dentz

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v4] Bluetooth: Increase LE connection timeout for industrial sensors
  2026-03-09 10:02                       ` Dajid Morel
@ 2026-03-09 16:02                         ` Paul Menzel
  2026-03-09 17:37                           ` Dajid Morel
  0 siblings, 1 reply; 22+ messages in thread
From: Paul Menzel @ 2026-03-09 16:02 UTC (permalink / raw)
  To: Dajid Morel, Dajid Morel; +Cc: Luiz Augusto von Dentz, linux-bluetooth

Dear Dajid,


Sorry for chiming in. Luiz is the most knowledgeable person in this.

Am 09.03.26 um 11:02 schrieb Dajid Morel:
>> Yeah, you really don't know what you are talking about, there is no
>> timeout on HCI_OP_LE_CREATE_CONN itself, the controller shall generate
>> a command complete immediately and the connection attempt is only
>> interrupted with HCI_OP_LE_CREATE_CONN_CANCEL, so lets say yo want to
>> increase HCI_CMD_TIMEOUT that means HCI_OP_LE_CREATE_CONN_CANCEL
>> cannot be send because it would be pending on HCI_EV_LE_CONN_COMPLETE,
>> anyway I fill like Im wasting my time here.
> 
> Hi Luiz,
> 
> I'll be brief and stick to the logs. Here is the output from a stock
> 5.15 kernel (2s timeout) on the Raspberry Pi 4:
> 
> [bluetooth]# connect E8:C0:B1:D4:A3:3C
> Attempting to connect to E8:C0:B1:D4:A3:3C
> Failed to connect: org.bluez.Error.Failed le-connection-abort-by-local
> 
> The "le-connection-abort-by-local" error is the smoking gun. It proves
> the Host is aborting the sequence, not the peer.
> 
> When I apply my patch (15s timeout) on the exact same hardware:
> 
> - The "abort-by-local" error disappears completely.
> - The connection succeeds 100% of the time.
> - We can read the pressure data.
> 
> I noticed in dmesg that the Broadcom controller is missing its firmware patch:
> [   16.357546] Bluetooth: hci0: BCM: chip id 63
> [   16.360628] Bluetooth: hci0: BCM: features 0x07
> [   16.378604] Bluetooth: hci0: BCM20702A
> [   16.378639] Bluetooth: hci0: BCM20702A1 (001.002.014) build 0000
> [   16.381695] Bluetooth: hci0: BCM: firmware Patch file not found, tried:
> [   16.388662] Bluetooth: hci0: BCM: 'brcm/BCM20702A1-0a5c-21e8.hcd'
> [   16.394969] Bluetooth: hci0: BCM: 'brcm/BCM-0a5c-21e8.hcd'
> 
> This likely makes the controller slower to process sync commands when
> high-latency sensors are advertising nearby. However, the system is
> 100% stable with the 15s timeout patch, even without that firmware.
> 
> If the timeout is truly decoupled, then "le-connection-abort-by-local"
> should not be triggered at exactly 2 seconds. The fact that it is
> proves that the hardcoded limit in hci_sync.c is the blocker.
> 
> Since this is for a Volvo production line, we need a way to support
> these sensors. If you refuse the global constant change, how can we
> avoid this "local abort" in the sync sequence for slow controllers?

As this is the upstream list, it’d really help if you could test with 
6.19, 7.0-rc3 or – best option – with the bluetooth-next tree, just to 
be sure.

It’s definitely great, that you are looking for an upstream solution, so 
please be patient, and I’d really be interested in your test results. 
Depending on these, a way forward can be derived.


Kind regards,

Paul

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v4] Bluetooth: Increase LE connection timeout for industrial sensors
  2026-03-09 16:02                         ` Paul Menzel
@ 2026-03-09 17:37                           ` Dajid Morel
  2026-03-09 18:03                             ` Luiz Augusto von Dentz
  0 siblings, 1 reply; 22+ messages in thread
From: Dajid Morel @ 2026-03-09 17:37 UTC (permalink / raw)
  To: Paul Menzel; +Cc: Dajid Morel, Luiz Augusto von Dentz, linux-bluetooth

Hi Paul,

Thank you for your feedback. I have analyzed the current bluetooth-next tree
(Commit: 19182348259c) as you suggested

As you pointed out, Luiz is the expert on this topic. For my part, I
am an apprentice student
majoring in Physics and Microelectronics Systems; I do not have
extensive expertise in the Linux kernel.
My analysis relies primarily on observing physical behavior and
experimentation on our production line.

Le lun. 9 mars 2026 à 17:02, Paul Menzel <pmenzel@molgen.mpg.de> a écrit :
>
> Dear Dajid,
>
>
> Sorry for chiming in. Luiz is the most knowledgeable person in this.
>
> Am 09.03.26 um 11:02 schrieb Dajid Morel:
> >> Yeah, you really don't know what you are talking about, there is no
> >> timeout on HCI_OP_LE_CREATE_CONN itself, the controller shall generate
> >> a command complete immediately and the connection attempt is only
> >> interrupted with HCI_OP_LE_CREATE_CONN_CANCEL, so lets say yo want to
> >> increase HCI_CMD_TIMEOUT that means HCI_OP_LE_CREATE_CONN_CANCEL
> >> cannot be send because it would be pending on HCI_EV_LE_CONN_COMPLETE,
> >> anyway I fill like Im wasting my time here.
> >
> > Hi Luiz,
> >
> > I'll be brief and stick to the logs. Here is the output from a stock
> > 5.15 kernel (2s timeout) on the Raspberry Pi 4:
> >
> > [bluetooth]# connect E8:C0:B1:D4:A3:3C
> > Attempting to connect to E8:C0:B1:D4:A3:3C
> > Failed to connect: org.bluez.Error.Failed le-connection-abort-by-local
> >
> > The "le-connection-abort-by-local" error is the smoking gun. It proves
> > the Host is aborting the sequence, not the peer.
> >
> > When I apply my patch (15s timeout) on the exact same hardware:
> >
> > - The "abort-by-local" error disappears completely.
> > - The connection succeeds 100% of the time.
> > - We can read the pressure data.
> >
> > I noticed in dmesg that the Broadcom controller is missing its firmware patch:
> > [   16.357546] Bluetooth: hci0: BCM: chip id 63
> > [   16.360628] Bluetooth: hci0: BCM: features 0x07
> > [   16.378604] Bluetooth: hci0: BCM20702A
> > [   16.378639] Bluetooth: hci0: BCM20702A1 (001.002.014) build 0000
> > [   16.381695] Bluetooth: hci0: BCM: firmware Patch file not found, tried:
> > [   16.388662] Bluetooth: hci0: BCM: 'brcm/BCM20702A1-0a5c-21e8.hcd'
> > [   16.394969] Bluetooth: hci0: BCM: 'brcm/BCM-0a5c-21e8.hcd'
> >
> > This likely makes the controller slower to process sync commands when
> > high-latency sensors are advertising nearby. However, the system is
> > 100% stable with the 15s timeout patch, even without that firmware.
> >
> > If the timeout is truly decoupled, then "le-connection-abort-by-local"
> > should not be triggered at exactly 2 seconds. The fact that it is
> > proves that the hardcoded limit in hci_sync.c is the blocker.
> >
> > Since this is for a Volvo production line, we need a way to support
> > these sensors. If you refuse the global constant change, how can we
> > avoid this "local abort" in the sync sequence for slow controllers?
>
> As this is the upstream list, it’d really help if you could test with
> 6.19, 7.0-rc3 or – best option – with the bluetooth-next tree, just to
> be sure.
>
> It’s definitely great, that you are looking for an upstream solution, so
> please be patient, and I’d really be interested in your test results.
> Depending on these, a way forward can be derived.
>
>
> Kind regards,
>
> Paul

The source code analysis of bluetooth-next confirms that the LE
connection preparatory
 phase (Accept List) is still limited by the hardcoded HCI_CMD_TIMEOUT
(2s), which is
identical to my production kernel (Jammy 5.15, Commit: 7824a77711ba):

ubuntu@builder:~/bluetooth-next$ sed -n '2511,2514p'
net/bluetooth/hci_sync.c err
= __hci_cmd_sync_status(hdev, HCI_OP_LE_ADD_TO_ACCEPT_LIST,
sizeof(cp), &cp, HCI_CMD_TIMEOUT);

While I cannot easily boot a 7.0-rc3 kernel on this specific
industrial hardware today, the
code at line 2511 in bluetooth-next is strictly identical to my 5.15
kernel. In our environment
with high-latency sensors (5s advertising interval), this 2s limit
systematically triggers a
"le-connection-abort-by-local" error before the final connection
command is even reached.

What architectural approach would you recommend to allow for more
latency during these preparatory
sync commands without modifying the global kernel constant?

Best regards,

Dajid Morel

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v4] Bluetooth: Increase LE connection timeout for industrial sensors
  2026-03-09 17:37                           ` Dajid Morel
@ 2026-03-09 18:03                             ` Luiz Augusto von Dentz
  2026-03-10 13:10                               ` Dajid Morel
  0 siblings, 1 reply; 22+ messages in thread
From: Luiz Augusto von Dentz @ 2026-03-09 18:03 UTC (permalink / raw)
  To: Dajid Morel; +Cc: Paul Menzel, Dajid Morel, linux-bluetooth

Hi Dajid,

On Mon, Mar 9, 2026 at 1:37 PM Dajid Morel <dajidp.morel@gmail.com> wrote:
>
> Hi Paul,
>
> Thank you for your feedback. I have analyzed the current bluetooth-next tree
> (Commit: 19182348259c) as you suggested
>
> As you pointed out, Luiz is the expert on this topic. For my part, I
> am an apprentice student
> majoring in Physics and Microelectronics Systems; I do not have
> extensive expertise in the Linux kernel.
> My analysis relies primarily on observing physical behavior and
> experimentation on our production line.

One thing is discussing, another is arguing. You aren't seeking a solution here.

> Le lun. 9 mars 2026 à 17:02, Paul Menzel <pmenzel@molgen.mpg.de> a écrit :
> >
> > Dear Dajid,
> >
> >
> > Sorry for chiming in. Luiz is the most knowledgeable person in this.
> >
> > Am 09.03.26 um 11:02 schrieb Dajid Morel:
> > >> Yeah, you really don't know what you are talking about, there is no
> > >> timeout on HCI_OP_LE_CREATE_CONN itself, the controller shall generate
> > >> a command complete immediately and the connection attempt is only
> > >> interrupted with HCI_OP_LE_CREATE_CONN_CANCEL, so lets say yo want to
> > >> increase HCI_CMD_TIMEOUT that means HCI_OP_LE_CREATE_CONN_CANCEL
> > >> cannot be send because it would be pending on HCI_EV_LE_CONN_COMPLETE,
> > >> anyway I fill like Im wasting my time here.
> > >
> > > Hi Luiz,
> > >
> > > I'll be brief and stick to the logs. Here is the output from a stock
> > > 5.15 kernel (2s timeout) on the Raspberry Pi 4:
> > >
> > > [bluetooth]# connect E8:C0:B1:D4:A3:3C
> > > Attempting to connect to E8:C0:B1:D4:A3:3C
> > > Failed to connect: org.bluez.Error.Failed le-connection-abort-by-local
> > >
> > > The "le-connection-abort-by-local" error is the smoking gun. It proves
> > > the Host is aborting the sequence, not the peer.
> > >
> > > When I apply my patch (15s timeout) on the exact same hardware:
> > >
> > > - The "abort-by-local" error disappears completely.
> > > - The connection succeeds 100% of the time.
> > > - We can read the pressure data.
> > >
> > > I noticed in dmesg that the Broadcom controller is missing its firmware patch:
> > > [   16.357546] Bluetooth: hci0: BCM: chip id 63
> > > [   16.360628] Bluetooth: hci0: BCM: features 0x07
> > > [   16.378604] Bluetooth: hci0: BCM20702A
> > > [   16.378639] Bluetooth: hci0: BCM20702A1 (001.002.014) build 0000
> > > [   16.381695] Bluetooth: hci0: BCM: firmware Patch file not found, tried:
> > > [   16.388662] Bluetooth: hci0: BCM: 'brcm/BCM20702A1-0a5c-21e8.hcd'
> > > [   16.394969] Bluetooth: hci0: BCM: 'brcm/BCM-0a5c-21e8.hcd'
> > >
> > > This likely makes the controller slower to process sync commands when
> > > high-latency sensors are advertising nearby. However, the system is
> > > 100% stable with the 15s timeout patch, even without that firmware.
> > >
> > > If the timeout is truly decoupled, then "le-connection-abort-by-local"
> > > should not be triggered at exactly 2 seconds. The fact that it is
> > > proves that the hardcoded limit in hci_sync.c is the blocker.
> > >
> > > Since this is for a Volvo production line, we need a way to support
> > > these sensors. If you refuse the global constant change, how can we
> > > avoid this "local abort" in the sync sequence for slow controllers?
> >
> > As this is the upstream list, it’d really help if you could test with
> > 6.19, 7.0-rc3 or – best option – with the bluetooth-next tree, just to
> > be sure.
> >
> > It’s definitely great, that you are looking for an upstream solution, so
> > please be patient, and I’d really be interested in your test results.
> > Depending on these, a way forward can be derived.
> >
> >
> > Kind regards,
> >
> > Paul
>
> The source code analysis of bluetooth-next confirms that the LE
> connection preparatory
>  phase (Accept List) is still limited by the hardcoded HCI_CMD_TIMEOUT
> (2s), which is
> identical to my production kernel (Jammy 5.15, Commit: 7824a77711ba):

This is a logical jump, there is no evidence that Aceept List, or any
other command done in preparation has timed out, just collect the
btmon traces so we can check which command is considerer to timeout.

> ubuntu@builder:~/bluetooth-next$ sed -n '2511,2514p'
> net/bluetooth/hci_sync.c err
> = __hci_cmd_sync_status(hdev, HCI_OP_LE_ADD_TO_ACCEPT_LIST,
> sizeof(cp), &cp, HCI_CMD_TIMEOUT);
>
> While I cannot easily boot a 7.0-rc3 kernel on this specific
> industrial hardware today, the
> code at line 2511 in bluetooth-next is strictly identical to my 5.15
> kernel. In our environment
> with high-latency sensors (5s advertising interval), this 2s limit
> systematically triggers a
> "le-connection-abort-by-local" error before the final connection
> command is even reached.

Derr, you can boot a regular laptop, it doesn't need to be the
specific industrial hardware. Since you don't claim the issue is
hardware-related, it actully _doesn't matter_, or maybe it does and we
don't know it yet because you never provided any traces.

> What architectural approach would you recommend to allow for more
> latency during these preparatory
> sync commands without modifying the global kernel constant?

On the recent kernels the timeout is configurable.

> Best regards,
>
> Dajid Morel



-- 
Luiz Augusto von Dentz

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v4] Bluetooth: Increase LE connection timeout for industrial sensors
  2026-03-09 18:03                             ` Luiz Augusto von Dentz
@ 2026-03-10 13:10                               ` Dajid Morel
  2026-03-10 13:47                                 ` Luiz Augusto von Dentz
  0 siblings, 1 reply; 22+ messages in thread
From: Dajid Morel @ 2026-03-10 13:10 UTC (permalink / raw)
  To: Luiz Augusto von Dentz; +Cc: Paul Menzel, Dajid Morel, linux-bluetooth

[-- Attachment #1: Type: text/plain, Size: 8036 bytes --]

Subject: Re: [PATCH v4] Bluetooth: Increase LE connection timeout for
industrial sensors

Hi Luiz, Paul,

Here are the requested traces and empirical tests.

Test 1: Recent Kernel

- OS/Kernel: Ubuntu Desktop 25.10 / Kernel 6.17
- Hardware: PC using the Broadcom BCM20702A1 controller.
- Target: TE M5600 industrial sensor (strict 5-second advertising interval).
- Setup: The controller initially missed its firmware patch (build 0000).
I manually injected the firmware (BCM20702A1-0a5c-21e8.hcd) to bring it
to build 1764. No kernel modifications were applied.

Result: Connection succeeds reliably. The unmodified 6.17 kernel correctly waits
and handles the sensor’s latency.

Test 2: LTS Kernel (Trace attached)

- OS/Kernel: Ubuntu Core 22 / Kernel 5.15.0-1091-raspi (aarch64)
- Hardware: Raspberry Pi 4 using the exact same Broadcom BCM20702A1 controller.
- Target: Same TE M5600 industrial sensor (5s advertising interval).
- Setup: To perfectly mirror Test 1, the missing firmware patch
(BCM20702A1-0a5c-21e8.hcd) was manually injected via bind mount.
Verified via dmesg that the controller correctly transitioned from build
0000 to build 1764 prior to testing.

Result: Connection consistently fails.

Le lun. 9 mars 2026 à 19:03, Luiz Augusto von Dentz
<luiz.dentz@gmail.com> a écrit :
>
> Hi Dajid,
>
> On Mon, Mar 9, 2026 at 1:37 PM Dajid Morel <dajidp.morel@gmail.com> wrote:
> >
> > Hi Paul,
> >
> > Thank you for your feedback. I have analyzed the current bluetooth-next tree
> > (Commit: 19182348259c) as you suggested
> >
> > As you pointed out, Luiz is the expert on this topic. For my part, I
> > am an apprentice student
> > majoring in Physics and Microelectronics Systems; I do not have
> > extensive expertise in the Linux kernel.
> > My analysis relies primarily on observing physical behavior and
> > experimentation on our production line.
>
> One thing is discussing, another is arguing. You aren't seeking a solution here.
>
> > Le lun. 9 mars 2026 à 17:02, Paul Menzel <pmenzel@molgen.mpg.de> a écrit :
> > >
> > > Dear Dajid,
> > >
> > >
> > > Sorry for chiming in. Luiz is the most knowledgeable person in this.
> > >
> > > Am 09.03.26 um 11:02 schrieb Dajid Morel:
> > > >> Yeah, you really don't know what you are talking about, there is no
> > > >> timeout on HCI_OP_LE_CREATE_CONN itself, the controller shall generate
> > > >> a command complete immediately and the connection attempt is only
> > > >> interrupted with HCI_OP_LE_CREATE_CONN_CANCEL, so lets say yo want to
> > > >> increase HCI_CMD_TIMEOUT that means HCI_OP_LE_CREATE_CONN_CANCEL
> > > >> cannot be send because it would be pending on HCI_EV_LE_CONN_COMPLETE,
> > > >> anyway I fill like Im wasting my time here.
> > > >
> > > > Hi Luiz,
> > > >
> > > > I'll be brief and stick to the logs. Here is the output from a stock
> > > > 5.15 kernel (2s timeout) on the Raspberry Pi 4:
> > > >
> > > > [bluetooth]# connect E8:C0:B1:D4:A3:3C
> > > > Attempting to connect to E8:C0:B1:D4:A3:3C
> > > > Failed to connect: org.bluez.Error.Failed le-connection-abort-by-local
> > > >
> > > > The "le-connection-abort-by-local" error is the smoking gun. It proves
> > > > the Host is aborting the sequence, not the peer.
> > > >
> > > > When I apply my patch (15s timeout) on the exact same hardware:
> > > >
> > > > - The "abort-by-local" error disappears completely.
> > > > - The connection succeeds 100% of the time.
> > > > - We can read the pressure data.
> > > >
> > > > I noticed in dmesg that the Broadcom controller is missing its firmware patch:
> > > > [   16.357546] Bluetooth: hci0: BCM: chip id 63
> > > > [   16.360628] Bluetooth: hci0: BCM: features 0x07
> > > > [   16.378604] Bluetooth: hci0: BCM20702A
> > > > [   16.378639] Bluetooth: hci0: BCM20702A1 (001.002.014) build 0000
> > > > [   16.381695] Bluetooth: hci0: BCM: firmware Patch file not found, tried:
> > > > [   16.388662] Bluetooth: hci0: BCM: 'brcm/BCM20702A1-0a5c-21e8.hcd'
> > > > [   16.394969] Bluetooth: hci0: BCM: 'brcm/BCM-0a5c-21e8.hcd'
> > > >
> > > > This likely makes the controller slower to process sync commands when
> > > > high-latency sensors are advertising nearby. However, the system is
> > > > 100% stable with the 15s timeout patch, even without that firmware.
> > > >
> > > > If the timeout is truly decoupled, then "le-connection-abort-by-local"
> > > > should not be triggered at exactly 2 seconds. The fact that it is
> > > > proves that the hardcoded limit in hci_sync.c is the blocker.
> > > >
> > > > Since this is for a Volvo production line, we need a way to support
> > > > these sensors. If you refuse the global constant change, how can we
> > > > avoid this "local abort" in the sync sequence for slow controllers?
> > >
> > > As this is the upstream list, it’d really help if you could test with
> > > 6.19, 7.0-rc3 or – best option – with the bluetooth-next tree, just to
> > > be sure.
> > >
> > > It’s definitely great, that you are looking for an upstream solution, so
> > > please be patient, and I’d really be interested in your test results.
> > > Depending on these, a way forward can be derived.
> > >
> > >
> > > Kind regards,
> > >
> > > Paul
> >
> > The source code analysis of bluetooth-next confirms that the LE
> > connection preparatory
> >  phase (Accept List) is still limited by the hardcoded HCI_CMD_TIMEOUT
> > (2s), which is
> > identical to my production kernel (Jammy 5.15, Commit: 7824a77711ba):
>
> This is a logical jump, there is no evidence that Aceept List, or any
> other command done in preparation has timed out, just collect the
> btmon traces so we can check which command is considerer to timeout.
>
> > ubuntu@builder:~/bluetooth-next$ sed -n '2511,2514p'
> > net/bluetooth/hci_sync.c err
> > = __hci_cmd_sync_status(hdev, HCI_OP_LE_ADD_TO_ACCEPT_LIST,
> > sizeof(cp), &cp, HCI_CMD_TIMEOUT);
> >
> > While I cannot easily boot a 7.0-rc3 kernel on this specific
> > industrial hardware today, the
> > code at line 2511 in bluetooth-next is strictly identical to my 5.15
> > kernel. In our environment
> > with high-latency sensors (5s advertising interval), this 2s limit
> > systematically triggers a
> > "le-connection-abort-by-local" error before the final connection
> > command is even reached.
>
> Derr, you can boot a regular laptop, it doesn't need to be the
> specific industrial hardware. Since you don't claim the issue is
> hardware-related, it actully _doesn't matter_, or maybe it does and we
> don't know it yet because you never provided any traces.
>
> > What architectural approach would you recommend to allow for more
> > latency during these preparatory
> > sync commands without modifying the global kernel constant?
>
> On the recent kernels the timeout is configurable.
>
> > Best regards,
> >
> > Dajid Morel
>
>
>
> --
> Luiz Augusto von Dentz

Trace Analysis (failure_with_firmware.log attached): Here is the
relevant excerpt showing the host aborting the sequence on the 5.15
kernel:

< HCI Command: LE Create Connection (0x08|0x000d) plen 25 #65 [hci0]
13.006556 > HCI Event: Command Status (0x0f) plen 4 #66 [hci0] 13.009473
LE Create Connection (0x08|0x000d) ncmd 1 Status: Success (0x00) […] <
HCI Command: LE Create Connectio.. (0x08|0x000e) plen 0 #88 [hci0]
17.048391 > HCI Event: Command Complete (0x0e) plen 4 #89 [hci0]
17.052037 LE Create Connection Cancel (0x08|0x000e) ncmd 1 Status:
Success (0x00) > HCI Event: LE Meta Event (0x3e) plen 19 #90 [hci0]
17.054039 LE Connection Complete (0x01) Status: Unknown Connection
Identifier (0x02)

The host systematically aborts the connection attempt after exactly 4.04
seconds (17.048s - 13.006s).

You mentioned: “On the recent kernels the timeout is configurable.”

Could you point me to the specific commit/patchset or API that
introduced this configurable timeout?

Best regards,

Dajid Morel

[-- Attachment #2: failure_with_firmware.log --]
[-- Type: application/octet-stream, Size: 16908 bytes --]

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v4] Bluetooth: Increase LE connection timeout for industrial sensors
  2026-03-10 13:10                               ` Dajid Morel
@ 2026-03-10 13:47                                 ` Luiz Augusto von Dentz
  2026-03-10 14:04                                   ` Paul Menzel
  0 siblings, 1 reply; 22+ messages in thread
From: Luiz Augusto von Dentz @ 2026-03-10 13:47 UTC (permalink / raw)
  To: Dajid Morel; +Cc: Paul Menzel, Dajid Morel, linux-bluetooth

Hi Dajid,

> Trace Analysis (failure_with_firmware.log attached): Here is the
> relevant excerpt showing the host aborting the sequence on the 5.15
> kernel:
>
> < HCI Command: LE Create Connection (0x08|0x000d) plen 25 #65 [hci0]
> 13.006556 > HCI Event: Command Status (0x0f) plen 4 #66 [hci0] 13.009473
> LE Create Connection (0x08|0x000d) ncmd 1 Status: Success (0x00) […] <
> HCI Command: LE Create Connectio.. (0x08|0x000e) plen 0 #88 [hci0]
> 17.048391 > HCI Event: Command Complete (0x0e) plen 4 #89 [hci0]
> 17.052037 LE Create Connection Cancel (0x08|0x000e) ncmd 1 Status:
> Success (0x00) > HCI Event: LE Meta Event (0x3e) plen 19 #90 [hci0]
> 17.054039 LE Connection Complete (0x01) Status: Unknown Connection
> Identifier (0x02)

Ok, so it is actually cancelling after 4 seconds, not 2 seconds as you
repeatedly blabbed about so in the end it doesn't correspond to
HCI_CMD_TIMEOUT but probably something else that sets 4 seconds.

> The host systematically aborts the connection attempt after exactly 4.04
> seconds (17.048s - 13.006s).
>
> You mentioned: “On the recent kernels the timeout is configurable.”
>
> Could you point me to the specific commit/patchset or API that
> introduced this configurable timeout?

It is the third time already:

https://github.com/bluez/bluetooth-next/commit/a56a1138cbd85e4d565356199d60e1cb94e5a77a

But you may also need the following:

https://github.com/bluez/bluetooth-next/commit/bf98feea5b65ced367a871cf35fc044dedbcfb85

> Best regards,
>
> Dajid Morel



-- 
Luiz Augusto von Dentz

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v4] Bluetooth: Increase LE connection timeout for industrial sensors
  2026-03-10 13:47                                 ` Luiz Augusto von Dentz
@ 2026-03-10 14:04                                   ` Paul Menzel
  2026-03-10 14:19                                     ` Luiz Augusto von Dentz
  0 siblings, 1 reply; 22+ messages in thread
From: Paul Menzel @ 2026-03-10 14:04 UTC (permalink / raw)
  To: Luiz Augusto von Dentz; +Cc: Dajid Morel, Dajid Morel, linux-bluetooth

Dear Dajid, dear Luiz,


Am 10.03.26 um 14:47 schrieb Luiz Augusto von Dentz:

>> Trace Analysis (failure_with_firmware.log attached): Here is the
>> relevant excerpt showing the host aborting the sequence on the 5.15
>> kernel:
>>
>> < HCI Command: LE Create Connection (0x08|0x000d) plen 25 #65 [hci0]
>> 13.006556 > HCI Event: Command Status (0x0f) plen 4 #66 [hci0] 13.009473
>> LE Create Connection (0x08|0x000d) ncmd 1 Status: Success (0x00) […] <
>> HCI Command: LE Create Connectio.. (0x08|0x000e) plen 0 #88 [hci0]
>> 17.048391 > HCI Event: Command Complete (0x0e) plen 4 #89 [hci0]
>> 17.052037 LE Create Connection Cancel (0x08|0x000e) ncmd 1 Status:
>> Success (0x00) > HCI Event: LE Meta Event (0x3e) plen 19 #90 [hci0]
>> 17.054039 LE Connection Complete (0x01) Status: Unknown Connection
>> Identifier (0x02)
> 
> Ok, so it is actually cancelling after 4 seconds, not 2 seconds as you
> repeatedly blabbed about so in the end it doesn't correspond to
> HCI_CMD_TIMEOUT but probably something else that sets 4 seconds.
> 
>> The host systematically aborts the connection attempt after exactly 4.04
>> seconds (17.048s - 13.006s).
>>
>> You mentioned: “On the recent kernels the timeout is configurable.”
>>
>> Could you point me to the specific commit/patchset or API that
>> introduced this configurable timeout?
> 
> It is the third time already:
> 
> https://github.com/bluez/bluetooth-next/commit/a56a1138cbd85e4d565356199d60e1cb94e5a77a

This entered v5.17-rc7.

> But you may also need the following:
> 
> https://github.com/bluez/bluetooth-next/commit/bf98feea5b65ced367a871cf35fc044dedbcfb85

This entered v6.10-rc1.

Luiz, would it make sense to backport both commits to the longterm 
stable support (LTS) series?


Kind regards,

Paul

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v4] Bluetooth: Increase LE connection timeout for industrial sensors
  2026-03-10 14:04                                   ` Paul Menzel
@ 2026-03-10 14:19                                     ` Luiz Augusto von Dentz
  2026-03-10 17:42                                       ` Dajid Morel
  0 siblings, 1 reply; 22+ messages in thread
From: Luiz Augusto von Dentz @ 2026-03-10 14:19 UTC (permalink / raw)
  To: Paul Menzel; +Cc: Dajid Morel, Dajid Morel, linux-bluetooth

Hi Paul,

On Tue, Mar 10, 2026 at 10:04 AM Paul Menzel <pmenzel@molgen.mpg.de> wrote:
>
> Dear Dajid, dear Luiz,
>
>
> Am 10.03.26 um 14:47 schrieb Luiz Augusto von Dentz:
>
> >> Trace Analysis (failure_with_firmware.log attached): Here is the
> >> relevant excerpt showing the host aborting the sequence on the 5.15
> >> kernel:
> >>
> >> < HCI Command: LE Create Connection (0x08|0x000d) plen 25 #65 [hci0]
> >> 13.006556 > HCI Event: Command Status (0x0f) plen 4 #66 [hci0] 13.009473
> >> LE Create Connection (0x08|0x000d) ncmd 1 Status: Success (0x00) […] <
> >> HCI Command: LE Create Connectio.. (0x08|0x000e) plen 0 #88 [hci0]
> >> 17.048391 > HCI Event: Command Complete (0x0e) plen 4 #89 [hci0]
> >> 17.052037 LE Create Connection Cancel (0x08|0x000e) ncmd 1 Status:
> >> Success (0x00) > HCI Event: LE Meta Event (0x3e) plen 19 #90 [hci0]
> >> 17.054039 LE Connection Complete (0x01) Status: Unknown Connection
> >> Identifier (0x02)
> >
> > Ok, so it is actually cancelling after 4 seconds, not 2 seconds as you
> > repeatedly blabbed about so in the end it doesn't correspond to
> > HCI_CMD_TIMEOUT but probably something else that sets 4 seconds.
> >
> >> The host systematically aborts the connection attempt after exactly 4.04
> >> seconds (17.048s - 13.006s).
> >>
> >> You mentioned: “On the recent kernels the timeout is configurable.”
> >>
> >> Could you point me to the specific commit/patchset or API that
> >> introduced this configurable timeout?
> >
> > It is the third time already:
> >
> > https://github.com/bluez/bluetooth-next/commit/a56a1138cbd85e4d565356199d60e1cb94e5a77a
>
> This entered v5.17-rc7.

This contains a Fixes tag so it was probably considered for
backporting, but I guess it depends if the patch applies cleanly, etc.

> > But you may also need the following:
> >
> > https://github.com/bluez/bluetooth-next/commit/bf98feea5b65ced367a871cf35fc044dedbcfb85
>
> This entered v6.10-rc1.
>
> Luiz, would it make sense to backport both commits to the longterm
> stable support (LTS) series?

It doesn't have any Fixes tag or Cc: stable, and if I recall correctly
this was done on purpose since this introduces a change in the
behavior as previously sockets timeout was not use as connection
timeout, but I can see it being valuable for stable trees to align
them both.

>
>
> Kind regards,
>
> Paul



-- 
Luiz Augusto von Dentz

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v4] Bluetooth: Increase LE connection timeout for industrial sensors
  2026-03-10 14:19                                     ` Luiz Augusto von Dentz
@ 2026-03-10 17:42                                       ` Dajid Morel
  0 siblings, 0 replies; 22+ messages in thread
From: Dajid Morel @ 2026-03-10 17:42 UTC (permalink / raw)
  To: Luiz Augusto von Dentz; +Cc: Paul Menzel, Dajid Morel, linux-bluetooth

Hi Luiz, Paul,

Thank you for considering the backport to the LTS series.

Regarding Luiz's note on whether the patch applies cleanly: I actually
tried to apply the first
commit (a56a1138cbd8) to our 5.15 LTS tree today.

As you suspected, it does not apply cleanly. The patch targets
net/bluetooth/hci_sync.c, but
the 5.15 LTS kernel predates this refactoring and still handles this
logic within net/bluetooth/hci_request.c.

Given this architectural difference in the 5.15 LTS tree, a direct
cherry-pick fails.

Thank you again for your time and guidance on this issue.

Best regards,

Dajid Morel

Le mar. 10 mars 2026 à 15:19, Luiz Augusto von Dentz
<luiz.dentz@gmail.com> a écrit :
>
> Hi Paul,
>
> On Tue, Mar 10, 2026 at 10:04 AM Paul Menzel <pmenzel@molgen.mpg.de> wrote:
> >
> > Dear Dajid, dear Luiz,
> >
> >
> > Am 10.03.26 um 14:47 schrieb Luiz Augusto von Dentz:
> >
> > >> Trace Analysis (failure_with_firmware.log attached): Here is the
> > >> relevant excerpt showing the host aborting the sequence on the 5.15
> > >> kernel:
> > >>
> > >> < HCI Command: LE Create Connection (0x08|0x000d) plen 25 #65 [hci0]
> > >> 13.006556 > HCI Event: Command Status (0x0f) plen 4 #66 [hci0] 13.009473
> > >> LE Create Connection (0x08|0x000d) ncmd 1 Status: Success (0x00) […] <
> > >> HCI Command: LE Create Connectio.. (0x08|0x000e) plen 0 #88 [hci0]
> > >> 17.048391 > HCI Event: Command Complete (0x0e) plen 4 #89 [hci0]
> > >> 17.052037 LE Create Connection Cancel (0x08|0x000e) ncmd 1 Status:
> > >> Success (0x00) > HCI Event: LE Meta Event (0x3e) plen 19 #90 [hci0]
> > >> 17.054039 LE Connection Complete (0x01) Status: Unknown Connection
> > >> Identifier (0x02)
> > >
> > > Ok, so it is actually cancelling after 4 seconds, not 2 seconds as you
> > > repeatedly blabbed about so in the end it doesn't correspond to
> > > HCI_CMD_TIMEOUT but probably something else that sets 4 seconds.
> > >
> > >> The host systematically aborts the connection attempt after exactly 4.04
> > >> seconds (17.048s - 13.006s).
> > >>
> > >> You mentioned: “On the recent kernels the timeout is configurable.”
> > >>
> > >> Could you point me to the specific commit/patchset or API that
> > >> introduced this configurable timeout?
> > >
> > > It is the third time already:
> > >
> > > https://github.com/bluez/bluetooth-next/commit/a56a1138cbd85e4d565356199d60e1cb94e5a77a
> >
> > This entered v5.17-rc7.
>
> This contains a Fixes tag so it was probably considered for
> backporting, but I guess it depends if the patch applies cleanly, etc.
>
> > > But you may also need the following:
> > >
> > > https://github.com/bluez/bluetooth-next/commit/bf98feea5b65ced367a871cf35fc044dedbcfb85
> >
> > This entered v6.10-rc1.
> >
> > Luiz, would it make sense to backport both commits to the longterm
> > stable support (LTS) series?
>
> It doesn't have any Fixes tag or Cc: stable, and if I recall correctly
> this was done on purpose since this introduces a change in the
> behavior as previously sockets timeout was not use as connection
> timeout, but I can see it being valuable for stable trees to align
> them both.
>
> >
> >
> > Kind regards,
> >
> > Paul
>
>
>
> --
> Luiz Augusto von Dentz

^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2026-03-10 17:43 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-03-02 23:37 [PATCH v4] Bluetooth: Increase LE connection timeout for industrial sensors Dajid MOREL
2026-03-03  0:11 ` [v4] " bluez.test.bot
2026-03-03 17:24 ` [PATCH v4] " Luiz Augusto von Dentz
2026-03-03 18:57   ` Dajid Morel
2026-03-03 19:26     ` Luiz Augusto von Dentz
2026-03-03 20:30       ` Dajid Morel
2026-03-03 21:12         ` Luiz Augusto von Dentz
2026-03-06  7:15           ` Dajid Morel
2026-03-06 14:26             ` Luiz Augusto von Dentz
     [not found]               ` <CAM8DPm2z-6xUm3SyFJ9umn4=o9bBov6PhKV0TEDCBc14eMFSew@mail.gmail.com>
2026-03-06 15:57                 ` Luiz Augusto von Dentz
2026-03-06 17:54                   ` Dajid Morel
2026-03-06 18:20                     ` Dajid Morel
2026-03-06 18:27                     ` Luiz Augusto von Dentz
2026-03-09 10:02                       ` Dajid Morel
2026-03-09 16:02                         ` Paul Menzel
2026-03-09 17:37                           ` Dajid Morel
2026-03-09 18:03                             ` Luiz Augusto von Dentz
2026-03-10 13:10                               ` Dajid Morel
2026-03-10 13:47                                 ` Luiz Augusto von Dentz
2026-03-10 14:04                                   ` Paul Menzel
2026-03-10 14:19                                     ` Luiz Augusto von Dentz
2026-03-10 17:42                                       ` Dajid Morel

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox