[PATCH v2 1/2] scan: refactor start_next_scan_request to not send duplicate requests

Wireless Daemon for Linux
 help / color / mirror / Atom feed

* [PATCH v2 1/2] scan: refactor start_next_scan_request to not send duplicate requests
@ 2020-06-01 17:11 Alvin =?unknown-8bit?q?=C5=A0ipraga?=
  2020-06-01 17:11 ` [PATCH v2 2/2] scan: refactor scan_common to not scan while scanning is suspended Alvin =?unknown-8bit?q?=C5=A0ipraga?=
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Alvin =?unknown-8bit?q?=C5=A0ipraga?= @ 2020-06-01 17:11 UTC (permalink / raw)
  To: iwd

[-- Attachment #1: Type: text/plain, Size: 1173 bytes --]

If start_scan_next_request() is called while a scan request
(NL80211_CMD_TRIGGER_SCAN) is still running, the same scan request will
be sent again. Add a check in the function to avoid sending a request if
one is already in progress.

This also fixes a crash that occurs if the following conditions are met:
  - the duplicated request is the only request in the scan request
    queue, and
  - both scan requests fail with an error not EBUSY.

In this case, the first callback to scan_request_triggered() will delete
the request from the scan request queue. The second callback will find
an empty queue and consequently pass a NULL scan_request pointer to
scan_request_failed(), causing a segmentation fault.
---
 src/scan.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/scan.c b/src/scan.c
index 718f7497..106fa81c 100644
--- a/src/scan.c
+++ b/src/scan.c
@@ -839,6 +839,9 @@ static bool start_next_scan_request(struct scan_context *sc)
 	if (sc->state != SCAN_STATE_NOT_RUNNING)
 		return true;
 
+	if (sc->start_cmd_id)
+		return true;
+
 	while (sr) {
 		if (!scan_request_send_trigger(sc, sr))
 			return true;
-- 
2.26.2

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH v2 2/2] scan: refactor scan_common to not scan while scanning is suspended
  2020-06-01 17:11 [PATCH v2 1/2] scan: refactor start_next_scan_request to not send duplicate requests Alvin =?unknown-8bit?q?=C5=A0ipraga?=
@ 2020-06-01 17:11 ` Alvin =?unknown-8bit?q?=C5=A0ipraga?=
  2020-06-04 18:19   ` Denis Kenzior
  2020-06-02  8:45 ` [PATCH v2 1/2] scan: refactor start_next_scan_request to not send duplicate requests Andrew Zaborowski
  2020-06-04 18:31 ` Denis Kenzior
  2 siblings, 1 reply; 8+ messages in thread
From: Alvin =?unknown-8bit?q?=C5=A0ipraga?= @ 2020-06-01 17:11 UTC (permalink / raw)
  To: iwd

[-- Attachment #1: Type: text/plain, Size: 616 bytes --]

If scanning is suspended, have scan_common() queue its scan request
rather than issuing it immediately. This respects the assumption that
scans are not requested while sc->suspended is true.
---
 src/scan.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/scan.c b/src/scan.c
index 106fa81c..e10c5447 100644
--- a/src/scan.c
+++ b/src/scan.c
@@ -546,6 +546,9 @@ static uint32_t scan_common(uint64_t wdev_id, bool passive,
 	if (!l_queue_isempty(sc->requests))
 		goto done;
 
+	if (sc->suspended)
+		goto done;
+
 	if (sc->state != SCAN_STATE_NOT_RUNNING)
 		goto done;
 
-- 
2.26.2

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH v2 2/2] scan: refactor scan_common to not scan while scanning is suspended
  2020-06-01 17:11 ` [PATCH v2 2/2] scan: refactor scan_common to not scan while scanning is suspended Alvin =?unknown-8bit?q?=C5=A0ipraga?=
@ 2020-06-04 18:19   ` Denis Kenzior
  0 siblings, 0 replies; 8+ messages in thread
From: Denis Kenzior @ 2020-06-04 18:19 UTC (permalink / raw)
  To: iwd

[-- Attachment #1: Type: text/plain, Size: 410 bytes --]

Hi Alvin,

On 6/1/20 12:11 PM, Alvin Šipraga wrote:
> If scanning is suspended, have scan_common() queue its scan request
> rather than issuing it immediately. This respects the assumption that
> scans are not requested while sc->suspended is true.
> ---
>   src/scan.c | 3 +++
>   1 file changed, 3 insertions(+)
> 

I reworded the commit header slightly and applied, thanks.

Regards,
-Denis

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2 1/2] scan: refactor start_next_scan_request to not send duplicate requests
  2020-06-01 17:11 [PATCH v2 1/2] scan: refactor start_next_scan_request to not send duplicate requests Alvin =?unknown-8bit?q?=C5=A0ipraga?=
  2020-06-01 17:11 ` [PATCH v2 2/2] scan: refactor scan_common to not scan while scanning is suspended Alvin =?unknown-8bit?q?=C5=A0ipraga?=
@ 2020-06-02  8:45 ` Andrew Zaborowski
  2020-06-04 18:31 ` Denis Kenzior
  2 siblings, 0 replies; 8+ messages in thread
From: Andrew Zaborowski @ 2020-06-02  8:45 UTC (permalink / raw)
  To: iwd

[-- Attachment #1: Type: text/plain, Size: 1584 bytes --]

On Mon, 1 Jun 2020 at 19:11, Alvin Šipraga <alsi@bang-olufsen.dk> wrote:
> If start_scan_next_request() is called while a scan request
> (NL80211_CMD_TRIGGER_SCAN) is still running, the same scan request will
> be sent again. Add a check in the function to avoid sending a request if
> one is already in progress.
>
> This also fixes a crash that occurs if the following conditions are met:
>   - the duplicated request is the only request in the scan request
>     queue, and
>   - both scan requests fail with an error not EBUSY.
>
> In this case, the first callback to scan_request_triggered() will delete
> the request from the scan request queue. The second callback will find
> an empty queue and consequently pass a NULL scan_request pointer to
> scan_request_failed(), causing a segmentation fault.
> ---
>  src/scan.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/src/scan.c b/src/scan.c
> index 718f7497..106fa81c 100644
> --- a/src/scan.c
> +++ b/src/scan.c
> @@ -839,6 +839,9 @@ static bool start_next_scan_request(struct scan_context *sc)
>         if (sc->state != SCAN_STATE_NOT_RUNNING)
>                 return true;
>
> +       if (sc->start_cmd_id)
> +               return true;

How about checking sc->get_scan_cmd_id as well?  I guess I'd put these
two and the sc->state check in one if ().

Also you can now drop those checks in scan_notify in the
NL80211_CMD_SCAN_ABORTED case because as you mentioned in the other
thread you're moving the checks from the callers into
start_next_scan_request.

Best regards

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2 1/2] scan: refactor start_next_scan_request to not send duplicate requests
  2020-06-01 17:11 [PATCH v2 1/2] scan: refactor start_next_scan_request to not send duplicate requests Alvin =?unknown-8bit?q?=C5=A0ipraga?=
  2020-06-01 17:11 ` [PATCH v2 2/2] scan: refactor scan_common to not scan while scanning is suspended Alvin =?unknown-8bit?q?=C5=A0ipraga?=
  2020-06-02  8:45 ` [PATCH v2 1/2] scan: refactor start_next_scan_request to not send duplicate requests Andrew Zaborowski
@ 2020-06-04 18:31 ` Denis Kenzior
  2020-06-06 17:33   ` Alvin =?unknown-8bit?q?=C5=A0ipraga?=
  2 siblings, 1 reply; 8+ messages in thread
From: Denis Kenzior @ 2020-06-04 18:31 UTC (permalink / raw)
  To: iwd

[-- Attachment #1: Type: text/plain, Size: 1825 bytes --]

Hi Alvin,

On 6/1/20 12:11 PM, Alvin Šipraga wrote:
> If start_scan_next_request() is called while a scan request
> (NL80211_CMD_TRIGGER_SCAN) is still running, the same scan request will
> be sent again. Add a check in the function to avoid sending a request if
> one is already in progress.
> 
> This also fixes a crash that occurs if the following conditions are met:
>    - the duplicated request is the only request in the scan request
>      queue, and
>    - both scan requests fail with an error not EBUSY.

Nice catch on this.  Scanning code is some of the most complex in the 
project, so kudos.

> 
> In this case, the first callback to scan_request_triggered() will delete
> the request from the scan request queue. The second callback will find
> an empty queue and consequently pass a NULL scan_request pointer to
> scan_request_failed(), causing a segmentation fault.

So for my own education, can you elaborate a bit more on how this is 
actually triggered?  I'm guessing that there are pending scans (probably 
triggered by invocation of Scan() via D-Bus?).  But an exact sequence of 
what happens would be nice.

> ---
>   src/scan.c | 3 +++
>   1 file changed, 3 insertions(+)
> 
> diff --git a/src/scan.c b/src/scan.c
> index 718f7497..106fa81c 100644
> --- a/src/scan.c
> +++ b/src/scan.c
> @@ -839,6 +839,9 @@ static bool start_next_scan_request(struct scan_context *sc)
>   	if (sc->state != SCAN_STATE_NOT_RUNNING)
>   		return true;
>   
> +	if (sc->start_cmd_id)
> +		return true;
> +

So I think Andrew's suggestion of also checking for get_scan_id being 
not zero also makes sense.  It would keep things consistent with the 
comment in scan_notify where it says that we do not start the next 
request until GET_SCAN has been done.

Regards,
-Denis

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2 1/2] scan: refactor start_next_scan_request to not send duplicate requests
  2020-06-04 18:31 ` Denis Kenzior
@ 2020-06-06 17:33   ` Alvin =?unknown-8bit?q?=C5=A0ipraga?=
  2020-06-09 11:49     ` Alvin =?unknown-8bit?q?=C5=A0ipraga?=
  0 siblings, 1 reply; 8+ messages in thread
From: Alvin =?unknown-8bit?q?=C5=A0ipraga?= @ 2020-06-06 17:33 UTC (permalink / raw)
  To: iwd

[-- Attachment #1: Type: text/plain, Size: 1236 bytes --]

Hi Andrew and Denis,

Thanks for your comments.

On 6/4/20 8:31 PM, Denis Kenzior wrote:
>> In this case, the first callback to scan_request_triggered() will delete
>> the request from the scan request queue. The second callback will find
>> an empty queue and consequently pass a NULL scan_request pointer to
>> scan_request_failed(), causing a segmentation fault.
> 
> So for my own education, can you elaborate a bit more on how this is
> actually triggered?  I'm guessing that there are pending scans (probably
> triggered by invocation of Scan() via D-Bus?).  But an exact sequence of
> what happens would be nice.

The crash is a little hard to reproduce. I have only observed it while
looping a Wi-Fi integration test we have, where it happens maybe 10-20%
of the time. When I am back in the office on Monday I will try to
prepare a detailed answer to your question.

> So I think Andrew's suggestion of also checking for get_scan_id being
> not zero also makes sense.  It would keep things consistent with the
> comment in scan_notify where it says that we do not start the next
> request until GET_SCAN has been done.

Agreed. I'll prepare a v3 patch and send it to you soon.

Kind regards,
Alvin

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2 1/2] scan: refactor start_next_scan_request to not send duplicate requests
  2020-06-06 17:33   ` Alvin =?unknown-8bit?q?=C5=A0ipraga?=
@ 2020-06-09 11:49     ` Alvin =?unknown-8bit?q?=C5=A0ipraga?=
  2020-06-09 20:10       ` Denis Kenzior
  0 siblings, 1 reply; 8+ messages in thread
From: Alvin =?unknown-8bit?q?=C5=A0ipraga?= @ 2020-06-09 11:49 UTC (permalink / raw)
  To: iwd

[-- Attachment #1: Type: text/plain, Size: 4496 bytes --]

Hi Denis,

As promised, here is some more detail on how the duplicate scan request 
gets sent, and the subsequent crash.

On 6/6/20 7:33 PM, Alvin Šipraga wrote:
> On 6/4/20 8:31 PM, Denis Kenzior wrote:
>>> In this case, the first callback to scan_request_triggered() will delete
>>> the request from the scan request queue. The second callback will find
>>> an empty queue and consequently pass a NULL scan_request pointer to
>>> scan_request_failed(), causing a segmentation fault.
>>
>> So for my own education, can you elaborate a bit more on how this is
>> actually triggered?  I'm guessing that there are pending scans (probably
>> triggered by invocation of Scan() via D-Bus?).  But an exact sequence of
>> what happens would be nice.
> 
> The crash is a little hard to reproduce. I have only observed it while
> looping a Wi-Fi integration test we have, where it happens maybe 10-20%
> of the time. When I am back in the office on Monday I will try to
> prepare a detailed answer to your question.

I traced the following events leading to a duplicate scan request being 
sent. The scenario assumes that the scan request queue is empty in the 
beginning.


station_enter_state: disconnected -> autoconnect_quick
   -> station_quick_scan_trigger
     -> station_scan_trigger: notify cb = station_quick_scan_results
       -> scan_passive
         -> scan_common
           -> scan_request_send_trigger
           -> l_queue_push_tail: scan request queue size = 1
scan_notify: quick scan triggered
network_connect: requested over dbus
   -> network_connect_psk
     -> station_connect_network
       -> station_enter_state: autoconnect_quick -> connecting
         -> periodic_scan_stop
           -> scan_periodic_stop
             -> scan_cancel
scan_notify: quick scan aborted
   -> scan_finished
     -> l_queue_remove: scan request queue size = 0
     -> station_quick_scan_results: quick scan's notify callback
     . -> station_enter_state: connecting -> autoconnect_full
     .   -> scan_periodic_start
     .     -> scan_periodic_queue
     .       -> scan_{active_full,passive}: idk which; doesn't matter
     .         -> scan_common
     .           -> scan_request_send_trigger
     .           -> l_queue_push_tail: scan request queue size = 1
     -> start_next_scan_request: head of queue has already been sent
       -> scan_request_send_trigger: send it again


Now the two requests both fail because the driver is busy connecting:

kernel: brcmfmac: brcmf_cfg80211_escan: Connecting: status (3)
kernel: brcmfmac: brcmf_cfg80211_scan: scan error (-11)
kernel: brcmfmac: brcmf_cfg80211_escan: Connecting: status (3)
kernel: brcmfmac: brcmf_cfg80211_scan: scan error (-11)

The errors get reported via the scan_request_triggered callback. The 
first callback deletes the scan_request from the scan request queue. The 
second callback gets a NULL scan_request pointer due to the queue being 
empty, and subsequently dereferences that pointer. Here is the stack 
trace after the segmentation fault:

(gdb) bt
#0  scan_request_failed (sr=sr(a)entry=0x0, err=err(a)entry=-11, 
sc=<optimized out>)
     at /usr/src/debug/iwd/1.7-r0/git/src/scan.c:157
#1  0x0000000000424bf0 in scan_request_triggered (msg=<optimized out>, 
userdata=0x37f320e0)
     at /usr/src/debug/iwd/1.7-r0/git/src/scan.c:222
#2  0x0000ffffb770bc0c in process_unicast (nlmsg=0xffffc2ab2dc0, 
genl=0x37f16830)
     at /usr/src/debug/ell/0.31-r0/ell-0.31/ell/genl.c:979
#3  received_data (io=<optimized out>, user_data=0x37f16830)
     at /usr/src/debug/ell/0.31-r0/ell-0.31/ell/genl.c:1087
#4  0x0000ffffb7707e78 in io_callback (fd=<optimized out>, events=1, 
user_data=0x37f168d0)
     at /usr/src/debug/ell/0.31-r0/ell-0.31/ell/io.c:126
#5  0x0000ffffb77070d4 in l_main_iterate (timeout=<optimized out>)
     at /usr/src/debug/ell/0.31-r0/ell-0.31/ell/main.c:473
#6  0x0000ffffb7707194 in l_main_run () at 
/usr/src/debug/ell/0.31-r0/ell-0.31/ell/main.c:520
#7  0x0000ffffb77073ac in l_main_run_with_signal 
(callback=callback(a)entry=0x408ee0 <signal_handler>,
     user_data=user_data(a)entry=0x0) at 
/usr/src/debug/ell/0.31-r0/ell-0.31/ell/main.c:642
#8  0x0000000000408b8c in main (argc=<optimized out>, argv=<optimized out>)
     at /usr/src/debug/iwd/1.7-r0/git/src/main.c:512


Hope that helps - let me know if you have any questions. And thanks for 
accepting the patches.

Kind regards,
Alvin

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2 1/2] scan: refactor start_next_scan_request to not send duplicate requests
  2020-06-09 11:49     ` Alvin =?unknown-8bit?q?=C5=A0ipraga?=
@ 2020-06-09 20:10       ` Denis Kenzior
  0 siblings, 0 replies; 8+ messages in thread
From: Denis Kenzior @ 2020-06-09 20:10 UTC (permalink / raw)
  To: iwd

[-- Attachment #1: Type: text/plain, Size: 1997 bytes --]

Hi Alvin,

> I traced the following events leading to a duplicate scan request being sent. 
> The scenario assumes that the scan request queue is empty in the beginning.
> 
> 
> station_enter_state: disconnected -> autoconnect_quick
>    -> station_quick_scan_trigger
>      -> station_scan_trigger: notify cb = station_quick_scan_results
>        -> scan_passive
>          -> scan_common
>            -> scan_request_send_trigger
>            -> l_queue_push_tail: scan request queue size = 1
> scan_notify: quick scan triggered
> network_connect: requested over dbus
>    -> network_connect_psk
>      -> station_connect_network
>        -> station_enter_state: autoconnect_quick -> connecting
>          -> periodic_scan_stop
>            -> scan_periodic_stop
>              -> scan_cancel
> scan_notify: quick scan aborted
>    -> scan_finished
>      -> l_queue_remove: scan request queue size = 0
>      -> station_quick_scan_results: quick scan's notify callback
>      . -> station_enter_state: connecting -> autoconnect_full

Aha, so this is the real culprit.  I fixed this as well in commit 
532f6b154e03b5fa222502098b5688a1784cc4e3.

>      .   -> scan_periodic_start
>      .     -> scan_periodic_queue
>      .       -> scan_{active_full,passive}: idk which; doesn't matter
>      .         -> scan_common
>      .           -> scan_request_send_trigger
>      .           -> l_queue_push_tail: scan request queue size = 1

Okay, so we start a periodic scan and in scan_finished call 
start_next_scan_request again.

>      -> start_next_scan_request: head of queue has already been sent
>        -> scan_request_send_trigger: send it again
> 
> 

And since we don't check that one is already going, we issue the trigger again. 
Makes sense to me now.  Much appreciated.

Regards,
-Denis

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2020-06-09 20:10 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2020-06-01 17:11 [PATCH v2 1/2] scan: refactor start_next_scan_request to not send duplicate requests Alvin =?unknown-8bit?q?=C5=A0ipraga?=
2020-06-01 17:11 ` [PATCH v2 2/2] scan: refactor scan_common to not scan while scanning is suspended Alvin =?unknown-8bit?q?=C5=A0ipraga?=
2020-06-04 18:19   ` Denis Kenzior
2020-06-02  8:45 ` [PATCH v2 1/2] scan: refactor start_next_scan_request to not send duplicate requests Andrew Zaborowski
2020-06-04 18:31 ` Denis Kenzior
2020-06-06 17:33   ` Alvin =?unknown-8bit?q?=C5=A0ipraga?=
2020-06-09 11:49     ` Alvin =?unknown-8bit?q?=C5=A0ipraga?=
2020-06-09 20:10       ` Denis Kenzior

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox