* BlueZ/mesh: RX not working after daemon restart (with workaround)
From: Aurelien Jarno @ 2019-11-10 20:08 UTC
To: linux-bluetooth

Hi all,

On my system (Raspberry Pi 3), the RX path no longer works after a
restart of the bluetooth-meshd daemon. I have tracked this down to the
receive callbacks being set up before the HCI is fully initialized. In
other words, BT_HCI_CMD_LE_SET_SCAN_PARAMETERS is issued before
BT_HCI_CMD_RESET, and the callback that issues
BT_HCI_CMD_LE_SET_SCAN_ENABLE is never called. This is timing dependent
and probably not reproducible on all hardware.

I have worked around the issue by adding a small delay between the HCI
initialization and the call to node_attach_io_all():

diff --git a/mesh/mesh.c b/mesh/mesh.c
index 9b2b2073b..1c06060f9 100644
--- a/mesh/mesh.c
+++ b/mesh/mesh.c
@@ -167,6 +167,10 @@ bool mesh_init(const char *config_dir, enum mesh_io_type type, void *opts)
 	mesh_io_get_caps(mesh.io, &caps);
 	mesh.max_filters = caps.max_num_filters;
 
+	for (int i = 0 ; i < 100 ; i++) {
+		l_main_iterate(10);
+	}
+
 	node_attach_io_all(mesh.io);
 
 	return true;

I guess there is a better way to do this, either by waiting for the HCI
to be fully initialized before calling node_attach_io_all() or by using
a callback instead. However, I do not know the codebase well enough to
fix it properly.

Aurelien

-- 
Aurelien Jarno                          GPG: 4096R/1DDD8C9B
aurelien@aurel32.net                 http://www.aurel32.net
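
The fixed delay in the diff above pumps the event loop 100 times whether or not the controller is ready. One step closer to "waiting for the HCI to be fully initialized" is to pump the loop only until a readiness flag is set, with an upper bound. The sketch below is a minimal illustration of that idea and is not BlueZ code: wait_until_ready(), iterate_func_t, struct fake_loop and fake_iterate() are hypothetical names, and the readiness flag is assumed to be set by whatever completes the HCI setup.

```c
#include <stdbool.h>

/* Hypothetical event-loop hook. In the daemon this role would be played
 * by l_main_iterate(), as used in the diff above. */
typedef void (*iterate_func_t)(int timeout_ms, void *user_data);

/* Toy loop state, used only to demonstrate the helper below. */
struct fake_loop {
	int calls;        /* iterations run so far */
	int ready_after;  /* iteration on which "HCI init" completes */
	bool ready;       /* set once initialization has completed */
};

/* Stand-in for one pass of the event loop. */
void fake_iterate(int timeout_ms, void *user_data)
{
	struct fake_loop *loop = user_data;

	(void) timeout_ms;
	if (++loop->calls >= loop->ready_after)
		loop->ready = true;
}

/* Run the loop until *ready becomes true or max_steps iterations have
 * been spent. Returns the final value of *ready. */
bool wait_until_ready(const bool *ready, iterate_func_t iterate,
				void *user_data, int step_ms, int max_steps)
{
	for (int i = 0; i < max_steps && !*ready; i++)
		iterate(step_ms, user_data);

	return *ready;
}
```

With step_ms = 10 and max_steps = 100 the worst case matches the original workaround (100 iterations of up to 10 ms each), but the loop exits as soon as the flag is set, and the caller can tell from the return value whether initialization completed in time.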

* Re: BlueZ/mesh: RX not working after daemon restart (with workaround)
From: Steve Brown @ 2019-11-10 20:59 UTC
To: Aurelien Jarno, linux-bluetooth

I've experienced something similar on my rpi3. I found that on restart,
discover-unprovisioned stopped working.

In my case, it appears that meshd assumes that if there are existing
nodes, scanning has been enabled. Thus, calls from mesh-cfgclient to
discover additional unprovisioned nodes are assumed not to need another
HCI scan enable at mesh/mesh-io-generic.c:736.

If meshd is restarted with preexisting nodes, scanning is still assumed
to already be enabled, but it's not. This breaks discover-unprovisioned
for me.

I suspect this is a symptom of a deeper problem where
mesh/mesh-config-json.c:load_node doesn't completely reestablish the
node state that existed when the node was originally added.

Steve

* Re: BlueZ/mesh: RX not working after daemon restart (with workaround)
From: Aurelien Jarno @ 2019-11-10 21:39 UTC
To: Steve Brown
Cc: linux-bluetooth

Hi,

On 2019-11-10 13:59, Steve Brown wrote:
> I've experienced something similar on my rpi3. I found that on restart,
> discover-unprovisioned stopped working.

I observe the same in my case.

> In my case, it appears that meshd assumes that if there are existing
> nodes, scanning has been enabled. Thus, calls from mesh-cfgclient to
> discover additional unprovisioned nodes do not need another hci scan
> enable at mesh/mesh-io-generic.c:736.
>
> If meshd is restarted with preexisting nodes, scanning is still assumed
> to already be enabled, but it's not. This breaks discover-unprovisioned
> for me.

Yes, I think this is exactly my problem. If there are existing nodes,
recv_register is called before the HCI is configured, and pvt->rx_regs
is filled at mesh/mesh-io-generic.c:738. This means that scanning is
later assumed to be enabled. However, the call to bt_hci_send with
BT_HCI_CMD_LE_SET_SCAN_PARAMETERS fails because the HCI is not yet
initialized, so the callback set_recv_scan_enable(), which is supposed
to enable scanning, is never called.

So when loading a node, scanning is assumed to be enabled, but in
practice it is not.

I believe my workaround should work on your system (maybe after
adjusting the number of iterations of the loop).

Aurelien

-- 
Aurelien Jarno                          GPG: 4096R/1DDD8C9B
aurelien@aurel32.net                 http://www.aurel32.net
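
The sequence described in this message can be modelled in a few lines of C. This is a toy simulation of the reported behaviour and not the code in mesh/mesh-io-generic.c: struct sim_io, sim_send_scan_params() and sim_recv_register() are hypothetical names, and the assumption that only the first registration triggers the scan setup is taken from the description in the thread, not from the source.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the I/O layer's private state. */
struct sim_io {
	bool hci_ready;     /* true once the controller reset has completed */
	bool scan_enabled;  /* what the controller is actually doing */
	int rx_regs;        /* number of registered RX filters */
};

/* Models sending "LE Set Scan Parameters". If the HCI is not ready the
 * command is dropped, so the completion callback that would enable
 * scanning never runs. */
bool sim_send_scan_params(struct sim_io *io)
{
	if (!io->hci_ready)
		return false;

	io->scan_enabled = true;  /* effect of the completion callback */
	return true;
}

/* Models RX registration: only the first registration starts scanning,
 * later ones assume it is already running (assumption from the thread). */
void sim_recv_register(struct sim_io *io)
{
	bool first = (io->rx_regs == 0);

	io->rx_regs++;

	if (first)
		sim_send_scan_params(io);  /* failure goes unnoticed */
}
```

Registering once while hci_ready is false and again after it becomes true leaves rx_regs at 2 and scan_enabled at false, which is the state both reports describe: the registration list says scanning should be on, but the controller was never told to scan.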

* Re: BlueZ/mesh: RX not working after daemon restart (with workaround)
From: Stotland, Inga @ 2019-11-12 6:44 UTC
To: aurelien@aurel32.net, Gix, Brian, sbrown@ewol.com
Cc: linux-bluetooth@vger.kernel.org

Hi Aurelien,

Thanks for the analysis. I think we should switch to a callback
approach, i.e. initialize io first and register the RX in the
successful init callback.

Best regards,
Inga
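
The callback approach suggested here could take roughly the following shape. This is a sketch under stated assumptions and not the fix that was eventually applied: cb_io_init(), cb_io_init_complete(), io_ready_func_t, struct cb_io and struct cb_mesh are hypothetical names, and the rx_attached flag stands in for the point where node_attach_io_all() would be called.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical "I/O is ready" notification. */
typedef void (*io_ready_func_t)(void *user_data, bool success);

struct cb_io {
	bool hci_ready;            /* true once the controller is set up */
	io_ready_func_t ready_cb;  /* invoked when initialization finishes */
	void *ready_data;
};

struct cb_mesh {
	struct cb_io io;
	bool rx_attached;  /* stands in for node_attach_io_all() having run */
};

/* Start initialization and remember whom to notify. Nothing is
 * registered for RX at this point. */
void cb_io_init(struct cb_io *io, io_ready_func_t cb, void *user_data)
{
	io->hci_ready = false;
	io->ready_cb = cb;
	io->ready_data = user_data;
}

/* Would be called from the event loop once the HCI setup sequence has
 * completed or failed. */
void cb_io_init_complete(struct cb_io *io, bool success)
{
	io->hci_ready = success;

	if (io->ready_cb != NULL)
		io->ready_cb(io->ready_data, success);
}

/* Ready callback: attach RX only after a successful init. */
void cb_mesh_io_ready(void *user_data, bool success)
{
	struct cb_mesh *mesh = user_data;

	if (!success)
		return;

	mesh->rx_attached = true;
}
```

The difference from the delay workaround is that RX registration is ordered after initialization by construction rather than by timing, so it does not depend on how long the controller takes to reset.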