iwd.lists.linux.dev archive mirror
 help / color / mirror / Atom feed
* [PATCH RFC 0/3] Improve roam scan strategy
@ 2025-04-15  8:21 Alexander Ganslandt
  2025-04-15  8:21 ` [PATCH RFC 1/3] util: add scan_freq_set_size function Alexander Ganslandt
                   ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: Alexander Ganslandt @ 2025-04-15  8:21 UTC (permalink / raw)
  To: iwd

These patches are WIP, I'm not submitting them for merging at this
point, but rather to start a discussion if this approach makes sense and
if it's a good addition to IWD.

My use case is livestreaming video while moving around a building, which
requires quickly finding a new BSS when the current one becomes weak.
The current behavior of scanning all frequencies at once takes too long
and is almost guaranteed to end up with a disconnection, resulting in
choppy livestreams.

The approach in these patches works a lot better for my use case, but
I'm unsure if it could cause issues for other use cases. I'd appreciate
any feedback or ideas!

Thanks,
Alexander

---
Alexander Ganslandt (3):
      util: add scan_freq_set_size function
      scan: add scan_freq_map
      station: improve roam scan strategy

 src/scan.c    |  36 +++++++++++
 src/scan.h    |   2 +
 src/station.c | 190 +++++++++++++++++++++++++++++++++++++++++++++-------------
 src/util.c    |  11 ++++
 src/util.h    |   1 +
 5 files changed, 199 insertions(+), 41 deletions(-)
---
base-commit: 7d5bcd738bb92cd33756985ffd2c4c67452a7f85
change-id: 20250414-roam-scan-improvements-02116c7c4631

Best regards,
-- 
Alexander Ganslandt <alexander.ganslandt@axis.com>


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH RFC 1/3] util: add scan_freq_set_size function
  2025-04-15  8:21 [PATCH RFC 0/3] Improve roam scan strategy Alexander Ganslandt
@ 2025-04-15  8:21 ` Alexander Ganslandt
  2025-04-15  8:21 ` [PATCH RFC 2/3] scan: add scan_freq_map Alexander Ganslandt
  2025-04-15  8:21 ` [PATCH RFC 3/3] station: improve roam scan strategy Alexander Ganslandt
  2 siblings, 0 replies; 10+ messages in thread
From: Alexander Ganslandt @ 2025-04-15  8:21 UTC (permalink / raw)
  To: iwd

Return the number of freqs in a scan_freq_set, useful for knowing how
big the set is.
---
 src/util.c | 11 +++++++++++
 src/util.h |  1 +
 2 files changed, 12 insertions(+)

diff --git a/src/util.c b/src/util.c
index 65b97c8e..1e1dfc15 100644
--- a/src/util.c
+++ b/src/util.c
@@ -637,6 +637,17 @@ struct scan_freq_set *scan_freq_set_clone(const struct scan_freq_set *set,
 	return new;
 }
 
+uint32_t scan_freq_set_size(struct scan_freq_set *freqs)
+{
+	uint32_t size = 0;
+
+	size += __builtin_popcount(freqs->channels_2ghz);
+	size += l_uintset_size(freqs->channels_5ghz);
+	size += l_uintset_size(freqs->channels_6ghz);
+
+	return size;
+}
+
 /* First 64 entries calculated by 1 / pow(n, 0.3) for n >= 1 */
 static const double rankmod_table[] = {
 	1.0000000000, 0.8122523964, 0.7192230933, 0.6597539554,
diff --git a/src/util.h b/src/util.h
index 8aef2985..7c024c79 100644
--- a/src/util.h
+++ b/src/util.h
@@ -138,6 +138,7 @@ uint32_t *scan_freq_set_to_fixed_array(const struct scan_freq_set *set,
 					size_t *len_out);
 struct scan_freq_set *scan_freq_set_clone(const struct scan_freq_set *set,
 						uint32_t band_mask);
+uint32_t scan_freq_set_size(struct scan_freq_set *freqs);
 
 DEFINE_CLEANUP_FUNC(scan_freq_set_free);
 

-- 
2.39.5


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH RFC 2/3] scan: add scan_freq_map
  2025-04-15  8:21 [PATCH RFC 0/3] Improve roam scan strategy Alexander Ganslandt
  2025-04-15  8:21 ` [PATCH RFC 1/3] util: add scan_freq_set_size function Alexander Ganslandt
@ 2025-04-15  8:21 ` Alexander Ganslandt
  2025-04-15  8:21 ` [PATCH RFC 3/3] station: improve roam scan strategy Alexander Ganslandt
  2 siblings, 0 replies; 10+ messages in thread
From: Alexander Ganslandt @ 2025-04-15  8:21 UTC (permalink / raw)
  To: iwd

This is a hashmap where the keys are freqs and the values are
timestamps. When a freq has been scanned, the timestamp for that freq is
updated to "now". The function scan_get_freq_age can be used for getting
the age of a freq, which is how long time ago it was last scanned.

This is useful for deciding if a certain freq should be scanned, or if
it's better to spend that time on scanning another freq.
---
 src/scan.c | 36 ++++++++++++++++++++++++++++++++++++
 src/scan.h |  2 ++
 2 files changed, 38 insertions(+)

diff --git a/src/scan.c b/src/scan.c
index aeab6516..f4c2e548 100644
--- a/src/scan.c
+++ b/src/scan.c
@@ -62,6 +62,7 @@ static uint32_t SCAN_MAX_INTERVAL;
 static uint32_t SCAN_INIT_INTERVAL;
 
 static struct l_queue *scan_contexts;
+static struct l_hashmap *scan_freq_map;
 
 static struct l_genl_family *nl80211;
 
@@ -2316,6 +2317,37 @@ static void scan_retry_pending(uint32_t wiphy_id)
 	}
 }
 
+static void scan_update_freq_map_entry(uint32_t freq, void *user_data)
+{
+	void *existing = l_hashmap_lookup(scan_freq_map, L_UINT_TO_PTR(freq));
+	uint64_t now = l_time_now();
+
+	if (existing) {
+		l_hashmap_remove(scan_freq_map, L_UINT_TO_PTR(freq));
+	}
+
+	l_hashmap_insert(scan_freq_map, L_UINT_TO_PTR(freq), L_UINT_TO_PTR(now));
+}
+
+static void scan_update_freq_map(struct scan_freq_set *freqs)
+{
+	scan_freq_set_foreach(freqs, scan_update_freq_map_entry, NULL);
+}
+
+uint64_t scan_get_freq_age(uint32_t freq)
+{
+	void *entry = l_hashmap_lookup(scan_freq_map, L_UINT_TO_PTR(freq));
+
+	if (entry) {
+		uint64_t timestamp = (uintptr_t) entry;
+		uint64_t now = l_time_now();
+
+		return (now - timestamp) / 1000000;
+	}
+
+	return UINT64_MAX;
+}
+
 static void scan_notify(struct l_genl_msg *msg, void *user_data)
 {
 	struct l_genl_attr attr;
@@ -2461,6 +2493,8 @@ static void scan_notify(struct l_genl_msg *msg, void *user_data)
 
 		scan_get_results(sc, sr, freqs);
 
+		scan_update_freq_map(freqs);
+
 		break;
 	}
 
@@ -2645,6 +2679,7 @@ static int scan_init(void)
 	const struct l_settings *config = iwd_get_config();
 
 	scan_contexts = l_queue_new();
+	scan_freq_map = l_hashmap_new();
 
 	RANK_2G_FACTOR = scan_get_band_rank_modifier(BAND_FREQ_2_4_GHZ);
 	RANK_5G_FACTOR = scan_get_band_rank_modifier(BAND_FREQ_5_GHZ);
@@ -2688,6 +2723,7 @@ static void scan_exit(void)
 	scan_contexts = NULL;
 	l_genl_family_free(nl80211);
 	nl80211 = NULL;
+	l_hashmap_destroy(scan_freq_map, NULL);
 }
 
 IWD_MODULE(scan, scan_init, scan_exit)
diff --git a/src/scan.h b/src/scan.h
index 4c1ebc21..135ffa5e 100644
--- a/src/scan.h
+++ b/src/scan.h
@@ -182,3 +182,5 @@ double scan_get_band_rank_modifier(enum band_freq band);
 
 bool scan_wdev_add(uint64_t wdev_id);
 bool scan_wdev_remove(uint64_t wdev_id);
+
+uint64_t scan_get_freq_age(uint32_t freq);

-- 
2.39.5


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH RFC 3/3] station: improve roam scan strategy
  2025-04-15  8:21 [PATCH RFC 0/3] Improve roam scan strategy Alexander Ganslandt
  2025-04-15  8:21 ` [PATCH RFC 1/3] util: add scan_freq_set_size function Alexander Ganslandt
  2025-04-15  8:21 ` [PATCH RFC 2/3] scan: add scan_freq_map Alexander Ganslandt
@ 2025-04-15  8:21 ` Alexander Ganslandt
  2025-04-16 16:46   ` James Prestwood
  2025-04-16 17:19   ` Denis Kenzior
  2 siblings, 2 replies; 10+ messages in thread
From: Alexander Ganslandt @ 2025-04-15  8:21 UTC (permalink / raw)
  To: iwd

When IWD decides to roam, it scans either neighbor freqs or known freqs
(if neighbors are not available). If it fails to roam after getting the
scan results, it scans ALL freqs. In my testing there's a high chance
that both neighbor and/or known scans fail to roam and we end up
scanning all freqs. This is very slow and if you're already moving away
from the current BSS, there's a high chance you will lose connection
completely before the scan is finished.

Instead of scanning all freqs at once, split them up into prioritized
subsets. Each subset contains a handful of freqs each, a lower index for
the subset means that its freqs are more common. So subset 0 has the
most common freqs and subset 3 has the least common freqs. The first two
subsets also contain no DFS channels, speeding up scanning even more. In
order to make this efficient, use the "scan_freq_map" to avoid scanning
freqs that were recently scanned.

When a roam scan is triggered, add the most prioritized freqs to the
list of freqs that should be scanned. The order of priority is:

1. Neighbor freqs
2. Known freqs
3. Subsets, starting with index 0 and incrementing if the subset is
   exhausted

For each freq candidate, check the scan_freq_map to see how long time
ago this freq was last scanned, this is denoted as its "age". If its age
is above a threshold, add the freq to the list, otherwise discard it.
Once the list has a certain size, start the scan. If no roaming occurs
after the scan is completed, run the same function again. It will now
pick new freqs since the previous freqs have been updated in
scan_freq_map.

This approach results in more scans, but fewer freqs per scan, leading
to shorter delays between scan results. It also avoids scanning the same
freqs back-to-back, which is generally not very useful. In combination
with the freq priority, this increases the chance of finding a good BSS
early.
---
 src/station.c | 190 +++++++++++++++++++++++++++++++++++++++++++++-------------
 1 file changed, 149 insertions(+), 41 deletions(-)

diff --git a/src/station.c b/src/station.c
index 9972ea76..87cf9df0 100644
--- a/src/station.c
+++ b/src/station.c
@@ -67,6 +67,8 @@
 
 #define STATION_RECENT_NETWORK_LIMIT	5
 #define STATION_RECENT_FREQS_LIMIT	5
+#define STATION_MAX_SCAN_FREQ_AGE	3
+#define STATION_MAX_SCAN_FREQS	10
 
 static struct l_queue *station_list;
 static uint32_t netdev_watch;
@@ -124,7 +126,7 @@ struct station {
 	struct l_queue *roam_bss_list;
 
 	/* Frequencies split into subsets by priority */
-	struct scan_freq_set *scan_freqs_order[3];
+	struct scan_freq_set *scan_freqs_order[4];
 	unsigned int dbus_scan_subset_idx;
 
 	uint32_t wiphy_watch;
@@ -2385,6 +2387,8 @@ static void station_roam_retry(struct station *station)
 		station_roam_timeout_rearm(station, roam_retry_interval);
 }
 
+static void station_start_roam(struct station *station);
+
 static void station_roam_failed(struct station *station)
 {
 	l_debug("%u", netdev_get_ifindex(station->netdev));
@@ -2414,10 +2418,9 @@ static void station_roam_failed(struct station *station)
 		goto delayed_retry;
 
 	/*
-	 * If we tried a limited scan, failed and the signal is still low,
-	 * repeat with a full scan right away
+	 * If the signal is still low, keep trying to roam
 	 */
-	if (station->signal_low && !station->roam_scan_full) {
+	if (station->signal_low) {
 		/*
 		 * Since we're re-using roam_scan_id, explicitly cancel
 		 * the scan here, so that the destroy callback is not called
@@ -2426,8 +2429,8 @@ static void station_roam_failed(struct station *station)
 		scan_cancel(netdev_get_wdev_id(station->netdev),
 						station->roam_scan_id);
 
-		if (!station_roam_scan(station, NULL))
-			return;
+		station_start_roam(station);
+		return;
 	}
 
 delayed_retry:
@@ -3102,12 +3105,85 @@ static void station_neighbor_report_cb(struct netdev *netdev, int err,
 		station_roam_failed(station);
 }
 
+static void station_filter_roam_scan_freq(uint32_t freq, void *user_data)
+{
+	struct scan_freq_set *freqs = user_data;
+	uint64_t age = scan_get_freq_age(freq);
+
+	if (scan_freq_set_size(freqs) >= STATION_MAX_SCAN_FREQS) {
+		return;
+	}
+
+	if (age < STATION_MAX_SCAN_FREQ_AGE) {
+		return;
+	}
+
+	scan_freq_set_add(freqs, freq);
+}
+
+static struct scan_freq_set *station_get_roam_scan_freqs(struct station *station)
+{
+	struct scan_freq_set *tmp;
+	struct scan_freq_set *scan_freqs;
+
+	scan_freqs = scan_freq_set_new();
+
+	/* Add current frequency, always scan this to get updated data for the
+	 * current BSS */
+	scan_freq_set_add(scan_freqs, station->connected_bss->frequency);
+
+	/* Add neighbor frequencies */
+	scan_freq_set_foreach(station->roam_freqs, station_filter_roam_scan_freq, scan_freqs);
+	if (scan_freq_set_size(scan_freqs) >= STATION_MAX_SCAN_FREQS) {
+		goto out;
+	}
+
+	/* Add known frequencies */
+	const struct network_info *info = network_get_info(
+						station->connected_network);
+	tmp = network_info_get_roam_frequencies(info,
+					station->connected_bss->frequency,
+					10);
+	scan_freq_set_foreach(tmp, station_filter_roam_scan_freq, scan_freqs);
+	scan_freq_set_free(tmp);
+	if (scan_freq_set_size(scan_freqs) >= STATION_MAX_SCAN_FREQS) {
+		goto out;
+	}
+
+	/* Add frequencies based on the prioritized subsets */
+	for (uint8_t i = 0; i < L_ARRAY_SIZE(station->scan_freqs_order); i++) {
+		scan_freq_set_foreach(station->scan_freqs_order[i], station_filter_roam_scan_freq, scan_freqs);
+		if (scan_freq_set_size(scan_freqs) >= STATION_MAX_SCAN_FREQS) {
+			goto out;
+		}
+	}
+
+out:
+	/* TODO: Arbitrary number to not have too small freq list */
+	if (scan_freq_set_size(scan_freqs) <= 5) {
+		/* Might as well add the neighbors */
+		scan_freq_set_merge(scan_freqs, station->roam_freqs);
+	}
+
+	return scan_freqs;
+}
+
 static void station_start_roam(struct station *station)
 {
 	int r;
+	struct scan_freq_set *freqs;
 
 	station->preparing_roam = true;
 
+	/* TODO: Need to request neighbor report here, like below */
+
+	freqs = station_get_roam_scan_freqs(station);
+	station_roam_scan(station, freqs);
+
+	scan_freq_set_free(freqs);
+
+	return;
+
 	/*
 	 * If current BSS supports Neighbor Reports, narrow the scan down
 	 * to channels occupied by known neighbors in the ESS. If no neighbor
@@ -5007,44 +5083,79 @@ static void station_add_2_4ghz_freq(uint32_t freq, void *user_data)
 
 static void station_fill_scan_freq_subsets(struct station *station)
 {
-	const struct scan_freq_set *supported =
-				wiphy_get_supported_freqs(station->wiphy);
 	unsigned int subset_idx = 0;
 
-	/*
-	 * Scan the 2.4GHz "social channels" first, 5GHz second, if supported,
-	 * all other 2.4GHz channels last.  To be refined as needed.
-	 */
+	station->scan_freqs_order[subset_idx] = scan_freq_set_new();
+
+	/* Subset 0: 2.4GHz "social channels" and lower 5GHz non-DFS channels */
 	if (allowed_bands & BAND_FREQ_2_4_GHZ) {
-		station->scan_freqs_order[subset_idx] = scan_freq_set_new();
-		scan_freq_set_add(station->scan_freqs_order[subset_idx], 2412);
-		scan_freq_set_add(station->scan_freqs_order[subset_idx], 2437);
-		scan_freq_set_add(station->scan_freqs_order[subset_idx], 2462);
-		subset_idx++;
+		scan_freq_set_add(station->scan_freqs_order[subset_idx], 2412); /* 1 */
+		scan_freq_set_add(station->scan_freqs_order[subset_idx], 2437); /* 6 */
+		scan_freq_set_add(station->scan_freqs_order[subset_idx], 2462); /* 11 */
 	}
 
-	/*
-	 * TODO: It may might sense to split up 5 and 6ghz into separate subsets
-	 *       since the channel set is so large.
-	 */
-	if (allowed_bands & (BAND_FREQ_5_GHZ | BAND_FREQ_6_GHZ)) {
-		uint32_t mask = allowed_bands &
-					(BAND_FREQ_5_GHZ | BAND_FREQ_6_GHZ);
-		struct scan_freq_set *set = scan_freq_set_clone(supported,
-								mask);
-
-		/* 5/6ghz didn't add any frequencies */
-		if (scan_freq_set_isempty(set)) {
-			scan_freq_set_free(set);
-		} else
-			station->scan_freqs_order[subset_idx++] = set;
+	if (allowed_bands & BAND_FREQ_5_GHZ) {
+		scan_freq_set_add(station->scan_freqs_order[subset_idx], 5180); /* 36 */
+		scan_freq_set_add(station->scan_freqs_order[subset_idx], 5200); /* 40 */
+		scan_freq_set_add(station->scan_freqs_order[subset_idx], 5220); /* 44 */
+		scan_freq_set_add(station->scan_freqs_order[subset_idx], 5240); /* 48 */
 	}
 
-	/* Add remaining 2.4ghz channels to subset */
+	station->scan_freqs_order[++subset_idx] = scan_freq_set_new();
+
+	/* Subset 1: 2.4GHz common "middle channels" and high 5GHz non-DFS channels */
 	if (allowed_bands & BAND_FREQ_2_4_GHZ) {
-		station->scan_freqs_order[subset_idx] = scan_freq_set_new();
-		scan_freq_set_foreach(supported, station_add_2_4ghz_freq,
-					station->scan_freqs_order[subset_idx]);
+		scan_freq_set_add(station->scan_freqs_order[subset_idx], 2422); /* 3 */
+		scan_freq_set_add(station->scan_freqs_order[subset_idx], 2427); /* 4 */
+		scan_freq_set_add(station->scan_freqs_order[subset_idx], 2447); /* 8 */
+		scan_freq_set_add(station->scan_freqs_order[subset_idx], 2452); /* 9 */
+	}
+
+	if (allowed_bands & BAND_FREQ_5_GHZ) {
+		scan_freq_set_add(station->scan_freqs_order[subset_idx], 5745); /* 149 */
+		scan_freq_set_add(station->scan_freqs_order[subset_idx], 5765); /* 153 */
+		scan_freq_set_add(station->scan_freqs_order[subset_idx], 5785); /* 157 */
+		scan_freq_set_add(station->scan_freqs_order[subset_idx], 5805); /* 161 */
+		scan_freq_set_add(station->scan_freqs_order[subset_idx], 5825); /* 165 */
+	}
+
+	station->scan_freqs_order[++subset_idx] = scan_freq_set_new();
+
+	/* TODO: Add 6GHz here, after more common 2.4 and 5GHz, but before DFS */
+
+	/* Subset 2: 2.4GHz remaining channels and 5GHz most common DFS channels */
+	if (allowed_bands & BAND_FREQ_2_4_GHZ) {
+		scan_freq_set_add(station->scan_freqs_order[subset_idx], 2417); /* 2 */
+		scan_freq_set_add(station->scan_freqs_order[subset_idx], 2432); /* 5 */
+		scan_freq_set_add(station->scan_freqs_order[subset_idx], 2442); /* 7 */
+		scan_freq_set_add(station->scan_freqs_order[subset_idx], 2457); /* 10 */
+	}
+
+	if (allowed_bands & BAND_FREQ_5_GHZ) {
+		scan_freq_set_add(station->scan_freqs_order[subset_idx], 5260); /* 52 */
+		scan_freq_set_add(station->scan_freqs_order[subset_idx], 5280); /* 56 */
+		scan_freq_set_add(station->scan_freqs_order[subset_idx], 5300); /* 60 */
+		scan_freq_set_add(station->scan_freqs_order[subset_idx], 5320); /* 64 */
+		scan_freq_set_add(station->scan_freqs_order[subset_idx], 5500); /* 100 */
+		scan_freq_set_add(station->scan_freqs_order[subset_idx], 5520); /* 104 */
+		scan_freq_set_add(station->scan_freqs_order[subset_idx], 5540); /* 108 */
+		scan_freq_set_add(station->scan_freqs_order[subset_idx], 5560); /* 112 */
+	}
+
+	station->scan_freqs_order[++subset_idx] = scan_freq_set_new();
+
+	/* Subset 3: Remaining 5GHz DFS channels */
+	if (allowed_bands & BAND_FREQ_5_GHZ) {
+		scan_freq_set_add(station->scan_freqs_order[subset_idx], 5340); /* 68 */
+		scan_freq_set_add(station->scan_freqs_order[subset_idx], 5480); /* 96 */
+		scan_freq_set_add(station->scan_freqs_order[subset_idx], 5580); /* 116 */
+		scan_freq_set_add(station->scan_freqs_order[subset_idx], 5600); /* 120 */
+		scan_freq_set_add(station->scan_freqs_order[subset_idx], 5620); /* 124 */
+		scan_freq_set_add(station->scan_freqs_order[subset_idx], 5640); /* 128 */
+		scan_freq_set_add(station->scan_freqs_order[subset_idx], 5660); /* 132 */
+		scan_freq_set_add(station->scan_freqs_order[subset_idx], 5680); /* 136 */
+		scan_freq_set_add(station->scan_freqs_order[subset_idx], 5700); /* 140 */
+		scan_freq_set_add(station->scan_freqs_order[subset_idx], 5720); /* 144 */
 	}
 
 	/*
@@ -5223,11 +5334,8 @@ static void station_free(struct station *station)
 
 	l_queue_destroy(station->anqp_pending, remove_anqp);
 
-	scan_freq_set_free(station->scan_freqs_order[0]);
-	scan_freq_set_free(station->scan_freqs_order[1]);
-
-	if (station->scan_freqs_order[2])
-		scan_freq_set_free(station->scan_freqs_order[2]);
+	for (uint8_t i = 0; i < 4; i++)
+		scan_freq_set_free(station->scan_freqs_order[i]);
 
 	wiphy_state_watch_remove(station->wiphy, station->wiphy_watch);
 

-- 
2.39.5


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH RFC 3/3] station: improve roam scan strategy
  2025-04-15  8:21 ` [PATCH RFC 3/3] station: improve roam scan strategy Alexander Ganslandt
@ 2025-04-16 16:46   ` James Prestwood
  2025-04-16 17:19   ` Denis Kenzior
  1 sibling, 0 replies; 10+ messages in thread
From: James Prestwood @ 2025-04-16 16:46 UTC (permalink / raw)
  To: Alexander Ganslandt, iwd

Hi Alexander,

On 4/15/25 1:21 AM, Alexander Ganslandt wrote:
> When IWD decides to roam, it scans either neighbor freqs or known freqs
> (if neighbors are not available). If it fails to roam after getting the
> scan results, it scans ALL freqs. In my testing there's a high chance
> that both neighbor and/or known scans fail to roam and we end up
> scanning all freqs. This is very slow and if you're already moving away
> from the current BSS, there's a high chance you will lose connection
> completely before the scan is finished.
>
> Instead of scanning all freqs at once, split them up into prioritized
> subsets. Each subset contains a handful of freqs each, a lower index for
> the subset means that its freqs are more common. So subset 0 has the
> most common freqs and subset 3 has the least common freqs. The first two
> subsets also contain no DFS channels, speeding up scanning even more. In
> order to make this efficient, use the "scan_freq_map" to avoid scanning
> freqs that were recently scanned.
>
> When a roam scan is triggered, add the most prioritized freqs to the
> list of freqs that should be scanned. The order of priority is:
>
> 1. Neighbor freqs
> 2. Known freqs
> 3. Subsets, starting with index 0 and incrementing if the subset is
>     exhausted
>
> For each freq candidate, check the scan_freq_map to see how long time
> ago this freq was last scanned, this is denoted as its "age". If its age
> is above a threshold, add the freq to the list, otherwise discard it.
> Once the list has a certain size, start the scan. If no roaming occurs
> after the scan is completed, run the same function again. It will now
> pick new freqs since the previous freqs have been updated in
> scan_freq_map.
>
> This approach results in more scans, but fewer freqs per scan, leading
> to shorter delays between scan results. It also avoids scanning the same
> freqs back-to-back, which is generally not very useful. In combination
> with the freq priority, this increases the chance of finding a good BSS
> early.
> ---
>   src/station.c | 190 +++++++++++++++++++++++++++++++++++++++++++++-------------
>   1 file changed, 149 insertions(+), 41 deletions(-)
>
> diff --git a/src/station.c b/src/station.c
> index 9972ea76..87cf9df0 100644
> --- a/src/station.c
> +++ b/src/station.c
> @@ -67,6 +67,8 @@
>   
>   #define STATION_RECENT_NETWORK_LIMIT	5
>   #define STATION_RECENT_FREQS_LIMIT	5
> +#define STATION_MAX_SCAN_FREQ_AGE	3
> +#define STATION_MAX_SCAN_FREQS	10
>   
>   static struct l_queue *station_list;
>   static uint32_t netdev_watch;
> @@ -124,7 +126,7 @@ struct station {
>   	struct l_queue *roam_bss_list;
>   
>   	/* Frequencies split into subsets by priority */
> -	struct scan_freq_set *scan_freqs_order[3];
> +	struct scan_freq_set *scan_freqs_order[4];
>   	unsigned int dbus_scan_subset_idx;
>   
>   	uint32_t wiphy_watch;
> @@ -2385,6 +2387,8 @@ static void station_roam_retry(struct station *station)
>   		station_roam_timeout_rearm(station, roam_retry_interval);
>   }
>   
> +static void station_start_roam(struct station *station);
> +
>   static void station_roam_failed(struct station *station)
>   {
>   	l_debug("%u", netdev_get_ifindex(station->netdev));
> @@ -2414,10 +2418,9 @@ static void station_roam_failed(struct station *station)
>   		goto delayed_retry;
>   
>   	/*
> -	 * If we tried a limited scan, failed and the signal is still low,
> -	 * repeat with a full scan right away
> +	 * If the signal is still low, keep trying to roam
>   	 */
> -	if (station->signal_low && !station->roam_scan_full) {
> +	if (station->signal_low) {
>   		/*
>   		 * Since we're re-using roam_scan_id, explicitly cancel
>   		 * the scan here, so that the destroy callback is not called
> @@ -2426,8 +2429,8 @@ static void station_roam_failed(struct station *station)
>   		scan_cancel(netdev_get_wdev_id(station->netdev),
>   						station->roam_scan_id);
>   
> -		if (!station_roam_scan(station, NULL))
> -			return;
> +		station_start_roam(station);
> +		return;
>   	}
>   
>   delayed_retry:
> @@ -3102,12 +3105,85 @@ static void station_neighbor_report_cb(struct netdev *netdev, int err,
>   		station_roam_failed(station);
>   }
>   
> +static void station_filter_roam_scan_freq(uint32_t freq, void *user_data)
> +{
> +	struct scan_freq_set *freqs = user_data;
> +	uint64_t age = scan_get_freq_age(freq);
> +
> +	if (scan_freq_set_size(freqs) >= STATION_MAX_SCAN_FREQS) {
> +		return;
> +	}
> +
> +	if (age < STATION_MAX_SCAN_FREQ_AGE) {

Won't this start reusing the same frequencies after a few scans? Say 
your first N scans take more than 3 seconds, you'd then begin using 
those frequencies again on subsequent scans since their ages are under 
the threshold?

Rather than a hard threshold simply sorting by age seems like the best 
way to do this:

  - Sort the frequencies by least recently used -> Take the first N 
frequencies -> Scan
  - No candidates found? -> Repeat ^^^
> +		return;
> +	}
> +
> +	scan_freq_set_add(freqs, freq);
> +}
> +
> +static struct scan_freq_set *station_get_roam_scan_freqs(struct station *station)
> +{
> +	struct scan_freq_set *tmp;
> +	struct scan_freq_set *scan_freqs;
> +
> +	scan_freqs = scan_freq_set_new();
> +
> +	/* Add current frequency, always scan this to get updated data for the
> +	 * current BSS */
> +	scan_freq_set_add(scan_freqs, station->connected_bss->frequency);
> +
> +	/* Add neighbor frequencies */
> +	scan_freq_set_foreach(station->roam_freqs, station_filter_roam_scan_freq, scan_freqs);
> +	if (scan_freq_set_size(scan_freqs) >= STATION_MAX_SCAN_FREQS) {
> +		goto out;
> +	}
> +
> +	/* Add known frequencies */
> +	const struct network_info *info = network_get_info(
> +						station->connected_network);
> +	tmp = network_info_get_roam_frequencies(info,
> +					station->connected_bss->frequency,
> +					10);
> +	scan_freq_set_foreach(tmp, station_filter_roam_scan_freq, scan_freqs);
> +	scan_freq_set_free(tmp);
> +	if (scan_freq_set_size(scan_freqs) >= STATION_MAX_SCAN_FREQS) {
> +		goto out;
> +	}
> +
> +	/* Add frequencies based on the prioritized subsets */
> +	for (uint8_t i = 0; i < L_ARRAY_SIZE(station->scan_freqs_order); i++) {
> +		scan_freq_set_foreach(station->scan_freqs_order[i], station_filter_roam_scan_freq, scan_freqs);
> +		if (scan_freq_set_size(scan_freqs) >= STATION_MAX_SCAN_FREQS) {
> +			goto out;
> +		}
> +	}
> +
> +out:
> +	/* TODO: Arbitrary number to not have too small freq list */
> +	if (scan_freq_set_size(scan_freqs) <= 5) {
> +		/* Might as well add the neighbors */
> +		scan_freq_set_merge(scan_freqs, station->roam_freqs);
> +	}
> +
> +	return scan_freqs;
> +}
> +
>   static void station_start_roam(struct station *station)
>   {
>   	int r;
> +	struct scan_freq_set *freqs;
>   
>   	station->preparing_roam = true;
>   
> +	/* TODO: Need to request neighbor report here, like below */
> +
> +	freqs = station_get_roam_scan_freqs(station);
> +	station_roam_scan(station, freqs);
> +
> +	scan_freq_set_free(freqs);
> +
> +	return;
> +
>   	/*
>   	 * If current BSS supports Neighbor Reports, narrow the scan down
>   	 * to channels occupied by known neighbors in the ESS. If no neighbor
> @@ -5007,44 +5083,79 @@ static void station_add_2_4ghz_freq(uint32_t freq, void *user_data)
>   
>   static void station_fill_scan_freq_subsets(struct station *station)
>   {
> -	const struct scan_freq_set *supported =
> -				wiphy_get_supported_freqs(station->wiphy);
>   	unsigned int subset_idx = 0;
>   
> -	/*
> -	 * Scan the 2.4GHz "social channels" first, 5GHz second, if supported,
> -	 * all other 2.4GHz channels last.  To be refined as needed.
> -	 */
> +	station->scan_freqs_order[subset_idx] = scan_freq_set_new();
> +
> +	/* Subset 0: 2.4GHz "social channels" and lower 5GHz non-DFS channels */
>   	if (allowed_bands & BAND_FREQ_2_4_GHZ) {
> -		station->scan_freqs_order[subset_idx] = scan_freq_set_new();
> -		scan_freq_set_add(station->scan_freqs_order[subset_idx], 2412);
> -		scan_freq_set_add(station->scan_freqs_order[subset_idx], 2437);
> -		scan_freq_set_add(station->scan_freqs_order[subset_idx], 2462);
> -		subset_idx++;
> +		scan_freq_set_add(station->scan_freqs_order[subset_idx], 2412); /* 1 */
> +		scan_freq_set_add(station->scan_freqs_order[subset_idx], 2437); /* 6 */
> +		scan_freq_set_add(station->scan_freqs_order[subset_idx], 2462); /* 11 */
>   	}
>   
> -	/*
> -	 * TODO: It may might sense to split up 5 and 6ghz into separate subsets
> -	 *       since the channel set is so large.
> -	 */
> -	if (allowed_bands & (BAND_FREQ_5_GHZ | BAND_FREQ_6_GHZ)) {
> -		uint32_t mask = allowed_bands &
> -					(BAND_FREQ_5_GHZ | BAND_FREQ_6_GHZ);
> -		struct scan_freq_set *set = scan_freq_set_clone(supported,
> -								mask);
> -
> -		/* 5/6ghz didn't add any frequencies */
> -		if (scan_freq_set_isempty(set)) {
> -			scan_freq_set_free(set);
> -		} else
> -			station->scan_freqs_order[subset_idx++] = set;
> +	if (allowed_bands & BAND_FREQ_5_GHZ) {
> +		scan_freq_set_add(station->scan_freqs_order[subset_idx], 5180); /* 36 */
> +		scan_freq_set_add(station->scan_freqs_order[subset_idx], 5200); /* 40 */
> +		scan_freq_set_add(station->scan_freqs_order[subset_idx], 5220); /* 44 */
> +		scan_freq_set_add(station->scan_freqs_order[subset_idx], 5240); /* 48 */
>   	}
>   
> -	/* Add remaining 2.4ghz channels to subset */
> +	station->scan_freqs_order[++subset_idx] = scan_freq_set_new();
> +
> +	/* Subset 1: 2.4GHz common "middle channels" and high 5GHz non-DFS channels */
>   	if (allowed_bands & BAND_FREQ_2_4_GHZ) {
> -		station->scan_freqs_order[subset_idx] = scan_freq_set_new();
> -		scan_freq_set_foreach(supported, station_add_2_4ghz_freq,
> -					station->scan_freqs_order[subset_idx]);
> +		scan_freq_set_add(station->scan_freqs_order[subset_idx], 2422); /* 3 */
> +		scan_freq_set_add(station->scan_freqs_order[subset_idx], 2427); /* 4 */
> +		scan_freq_set_add(station->scan_freqs_order[subset_idx], 2447); /* 8 */
> +		scan_freq_set_add(station->scan_freqs_order[subset_idx], 2452); /* 9 */
> +	}
> +
> +	if (allowed_bands & BAND_FREQ_5_GHZ) {
> +		scan_freq_set_add(station->scan_freqs_order[subset_idx], 5745); /* 149 */
> +		scan_freq_set_add(station->scan_freqs_order[subset_idx], 5765); /* 153 */
> +		scan_freq_set_add(station->scan_freqs_order[subset_idx], 5785); /* 157 */
> +		scan_freq_set_add(station->scan_freqs_order[subset_idx], 5805); /* 161 */
> +		scan_freq_set_add(station->scan_freqs_order[subset_idx], 5825); /* 165 */
> +	}
> +
> +	station->scan_freqs_order[++subset_idx] = scan_freq_set_new();
> +
> +	/* TODO: Add 6GHz here, after more common 2.4 and 5GHz, but before DFS */
> +
> +	/* Subset 2: 2.4GHz remaining channels and 5GHz most common DFS channels */
> +	if (allowed_bands & BAND_FREQ_2_4_GHZ) {
> +		scan_freq_set_add(station->scan_freqs_order[subset_idx], 2417); /* 2 */
> +		scan_freq_set_add(station->scan_freqs_order[subset_idx], 2432); /* 5 */
> +		scan_freq_set_add(station->scan_freqs_order[subset_idx], 2442); /* 7 */
> +		scan_freq_set_add(station->scan_freqs_order[subset_idx], 2457); /* 10 */
> +	}
> +
> +	if (allowed_bands & BAND_FREQ_5_GHZ) {
> +		scan_freq_set_add(station->scan_freqs_order[subset_idx], 5260); /* 52 */
> +		scan_freq_set_add(station->scan_freqs_order[subset_idx], 5280); /* 56 */
> +		scan_freq_set_add(station->scan_freqs_order[subset_idx], 5300); /* 60 */
> +		scan_freq_set_add(station->scan_freqs_order[subset_idx], 5320); /* 64 */
> +		scan_freq_set_add(station->scan_freqs_order[subset_idx], 5500); /* 100 */
> +		scan_freq_set_add(station->scan_freqs_order[subset_idx], 5520); /* 104 */
> +		scan_freq_set_add(station->scan_freqs_order[subset_idx], 5540); /* 108 */
> +		scan_freq_set_add(station->scan_freqs_order[subset_idx], 5560); /* 112 */
> +	}
> +
> +	station->scan_freqs_order[++subset_idx] = scan_freq_set_new();
> +
> +	/* Subset 3: Remaining 5GHz DFS channels */
> +	if (allowed_bands & BAND_FREQ_5_GHZ) {
> +		scan_freq_set_add(station->scan_freqs_order[subset_idx], 5340); /* 68 */
> +		scan_freq_set_add(station->scan_freqs_order[subset_idx], 5480); /* 96 */
> +		scan_freq_set_add(station->scan_freqs_order[subset_idx], 5580); /* 116 */
> +		scan_freq_set_add(station->scan_freqs_order[subset_idx], 5600); /* 120 */
> +		scan_freq_set_add(station->scan_freqs_order[subset_idx], 5620); /* 124 */
> +		scan_freq_set_add(station->scan_freqs_order[subset_idx], 5640); /* 128 */
> +		scan_freq_set_add(station->scan_freqs_order[subset_idx], 5660); /* 132 */
> +		scan_freq_set_add(station->scan_freqs_order[subset_idx], 5680); /* 136 */
> +		scan_freq_set_add(station->scan_freqs_order[subset_idx], 5700); /* 140 */
> +		scan_freq_set_add(station->scan_freqs_order[subset_idx], 5720); /* 144 */

This seems overly "manual" adding all these frequencies in the context 
of an initial scan to connect. What I fear is the cases of forgetting to 
add a channel, or new channels get added to the spec. We're now tracking 
a massive list of manual channels here that needs to be maintained.

At the very least it would be good to have a final scan subset to 
include anything we missed. If this final subset is always expected to 
be empty (i.e. we added all the channels in other subsets) then we could 
warn on that.

>   	}
>   
>   	/*
> @@ -5223,11 +5334,8 @@ static void station_free(struct station *station)
>   
>   	l_queue_destroy(station->anqp_pending, remove_anqp);
>   
> -	scan_freq_set_free(station->scan_freqs_order[0]);
> -	scan_freq_set_free(station->scan_freqs_order[1]);
> -
> -	if (station->scan_freqs_order[2])
> -		scan_freq_set_free(station->scan_freqs_order[2]);
> +	for (uint8_t i = 0; i < 4; i++)
> +		scan_freq_set_free(station->scan_freqs_order[i]);
>   
>   	wiphy_state_watch_remove(station->wiphy, station->wiphy_watch);
>   
>

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH RFC 3/3] station: improve roam scan strategy
  2025-04-15  8:21 ` [PATCH RFC 3/3] station: improve roam scan strategy Alexander Ganslandt
  2025-04-16 16:46   ` James Prestwood
@ 2025-04-16 17:19   ` Denis Kenzior
  2025-05-07 12:53     ` Alexander Ganslandt
  1 sibling, 1 reply; 10+ messages in thread
From: Denis Kenzior @ 2025-04-16 17:19 UTC (permalink / raw)
  To: Alexander Ganslandt, iwd

On 4/15/25 3:21 AM, Alexander Ganslandt wrote:
> When IWD decides to roam, it scans either neighbor freqs or known freqs
> (if neighbors are not available). If it fails to roam after getting the
> scan results, it scans ALL freqs. In my testing there's a high chance
> that both neighbor and/or known scans fail to roam and we end up
> scanning all freqs. This is very slow and if you're already moving away
> from the current BSS, there's a high chance you will lose connection
> completely before the scan is finished.
> 
> Instead of scanning all freqs at once, split them up into prioritized
> subsets. Each subset contains a handful of freqs each, a lower index for
> the subset means that its freqs are more common. So subset 0 has the
> most common freqs and subset 3 has the least common freqs. The first two
> subsets also contain no DFS channels, speeding up scanning even more. In
> order to make this efficient, use the "scan_freq_map" to avoid scanning
> freqs that were recently scanned.
> 
> When a roam scan is triggered, add the most prioritized freqs to the
> list of freqs that should be scanned. The order of priority is:
> 
> 1. Neighbor freqs
> 2. Known freqs
> 3. Subsets, starting with index 0 and incrementing if the subset is
>     exhausted
> 
> For each freq candidate, check the scan_freq_map to see how long time
> ago this freq was last scanned, this is denoted as its "age". If its age
> is above a threshold, add the freq to the list, otherwise discard it.
> Once the list has a certain size, start the scan. If no roaming occurs
> after the scan is completed, run the same function again. It will now
> pick new freqs since the previous freqs have been updated in
> scan_freq_map.
> 
> This approach results in more scans, but fewer freqs per scan, leading
> to shorter delays between scan results. It also avoids scanning the same
> freqs back-to-back, which is generally not very useful. In combination
> with the freq priority, this increases the chance of finding a good BSS
> early.
> ---
>   src/station.c | 190 +++++++++++++++++++++++++++++++++++++++++++++-------------
>   1 file changed, 149 insertions(+), 41 deletions(-)

<snip>

> @@ -2385,6 +2387,8 @@ static void station_roam_retry(struct station *station)
>   		station_roam_timeout_rearm(station, roam_retry_interval);
>   }
>   
> +static void station_start_roam(struct station *station);
> +

Just FYI, we don't like forward-declarations of static functions.  See 
doc/coding-style.txt item M17

> +static void station_filter_roam_scan_freq(uint32_t freq, void *user_data)
> +{
> +	struct scan_freq_set *freqs = user_data;
> +	uint64_t age = scan_get_freq_age(freq);

Have you considered maintaining a scan_freq_set of all the frequencies scanned 
by the neighbor and known-frequency stages instead of maintaining an hashtable 
based on age?
> +
> +	if (scan_freq_set_size(freqs) >= STATION_MAX_SCAN_FREQS) {
> +		return;
> +	}
> +
> +	if (age < STATION_MAX_SCAN_FREQ_AGE) {
> +		return;
> +	}

This is against the preferred Linux Kernel coding style.

> +
> +	scan_freq_set_add(freqs, freq);
> +}
> +
> +static struct scan_freq_set *station_get_roam_scan_freqs(struct station *station)
> +{
> +	struct scan_freq_set *tmp;
> +	struct scan_freq_set *scan_freqs;
> +
> +	scan_freqs = scan_freq_set_new();
> +
> +	/* Add current frequency, always scan this to get updated data for the
> +	 * current BSS */
> +	scan_freq_set_add(scan_freqs, station->connected_bss->frequency);
> +
> +	/* Add neighbor frequencies */
> +	scan_freq_set_foreach(station->roam_freqs, station_filter_roam_scan_freq, scan_freqs);
> +	if (scan_freq_set_size(scan_freqs) >= STATION_MAX_SCAN_FREQS) {
> +		goto out;
> +	}

So the idea is to add up to STATION_MAX_SCAN_FREQS to a new set, starting in 
order of preference with:
	1. neighbor frequency set
	2. known network set
	3. scan frequency ordering set

while filtering the frequencies already scanned.

How do you know when to give up the current attempt?

> +
> +	/* Add known frequencies */
> +	const struct network_info *info = network_get_info(
> +						station->connected_network);
> +	tmp = network_info_get_roam_frequencies(info,
> +					station->connected_bss->frequency,
> +					10);
> +	scan_freq_set_foreach(tmp, station_filter_roam_scan_freq, scan_freqs);
> +	scan_freq_set_free(tmp);
> +	if (scan_freq_set_size(scan_freqs) >= STATION_MAX_SCAN_FREQS) {
> +		goto out;
> +	}
> +
> +	/* Add frequencies based on the prioritized subsets */
> +	for (uint8_t i = 0; i < L_ARRAY_SIZE(station->scan_freqs_order); i++) {
> +		scan_freq_set_foreach(station->scan_freqs_order[i], station_filter_roam_scan_freq, scan_freqs);
> +		if (scan_freq_set_size(scan_freqs) >= STATION_MAX_SCAN_FREQS) {
> +			goto out;
> +		}
> +	}
> +
> +out:
> +	/* TODO: Arbitrary number to not have too small freq list */
> +	if (scan_freq_set_size(scan_freqs) <= 5) {

Why 5?  Why is this different from STATION_MAX_SCAN_FREQS?

> +		/* Might as well add the neighbors */
> +		scan_freq_set_merge(scan_freqs, station->roam_freqs);
> +	}
> +
> +	return scan_freqs;
> +}
> +

<snip>

> @@ -5007,44 +5083,79 @@ static void station_add_2_4ghz_freq(uint32_t freq, void *user_data)
>   
>   static void station_fill_scan_freq_subsets(struct station *station)
>   {
> -	const struct scan_freq_set *supported =
> -				wiphy_get_supported_freqs(station->wiphy);

For the most part the changes in this function make sense to me.  They can also 
be split out into a standalone commit since they improve on the already existing 
scan_freqs_order logic and will in theory speed up discovery via the Scan() api.

How do you deal with strange channels that might be supported only on some hardware?

>   	unsigned int subset_idx = 0;
>   
> -	/*
> -	 * Scan the 2.4GHz "social channels" first, 5GHz second, if supported,
> -	 * all other 2.4GHz channels last.  To be refined as needed.
> -	 */
> +	station->scan_freqs_order[subset_idx] = scan_freq_set_new();
> +
> +	/* Subset 0: 2.4GHz "social channels" and lower 5GHz non-DFS channels */
>   	if (allowed_bands & BAND_FREQ_2_4_GHZ) {
> -		station->scan_freqs_order[subset_idx] = scan_freq_set_new();
> -		scan_freq_set_add(station->scan_freqs_order[subset_idx], 2412);
> -		scan_freq_set_add(station->scan_freqs_order[subset_idx], 2437);
> -		scan_freq_set_add(station->scan_freqs_order[subset_idx], 2462);
> -		subset_idx++;
> +		scan_freq_set_add(station->scan_freqs_order[subset_idx], 2412); /* 1 */
> +		scan_freq_set_add(station->scan_freqs_order[subset_idx], 2437); /* 6 */
> +		scan_freq_set_add(station->scan_freqs_order[subset_idx], 2462); /* 11 */

Ensure that lines are <= 80 columns.  See doc/coding-style.txt

>   	}
>   
> -	/*
> -	 * TODO: It may might sense to split up 5 and 6ghz into separate subsets
> -	 *       since the channel set is so large.
> -	 */
> -	if (allowed_bands & (BAND_FREQ_5_GHZ | BAND_FREQ_6_GHZ)) {
> -		uint32_t mask = allowed_bands &
> -					(BAND_FREQ_5_GHZ | BAND_FREQ_6_GHZ);
> -		struct scan_freq_set *set = scan_freq_set_clone(supported,
> -								mask);
> -
> -		/* 5/6ghz didn't add any frequencies */
> -		if (scan_freq_set_isempty(set)) {
> -			scan_freq_set_free(set);
> -		} else
> -			station->scan_freqs_order[subset_idx++] = set;
> +	if (allowed_bands & BAND_FREQ_5_GHZ) {
> +		scan_freq_set_add(station->scan_freqs_order[subset_idx], 5180); /* 36 */
> +		scan_freq_set_add(station->scan_freqs_order[subset_idx], 5200); /* 40 */
> +		scan_freq_set_add(station->scan_freqs_order[subset_idx], 5220); /* 44 */
> +		scan_freq_set_add(station->scan_freqs_order[subset_idx], 5240); /* 48 */
>   	}
>   
> -	/* Add remaining 2.4ghz channels to subset */
> +	station->scan_freqs_order[++subset_idx] = scan_freq_set_new();
> +
> +	/* Subset 1: 2.4GHz common "middle channels" and high 5GHz non-DFS channels */
>   	if (allowed_bands & BAND_FREQ_2_4_GHZ) {
> -		station->scan_freqs_order[subset_idx] = scan_freq_set_new();
> -		scan_freq_set_foreach(supported, station_add_2_4ghz_freq,
> -					station->scan_freqs_order[subset_idx]);
> +		scan_freq_set_add(station->scan_freqs_order[subset_idx], 2422); /* 3 */
> +		scan_freq_set_add(station->scan_freqs_order[subset_idx], 2427); /* 4 */
> +		scan_freq_set_add(station->scan_freqs_order[subset_idx], 2447); /* 8 */
> +		scan_freq_set_add(station->scan_freqs_order[subset_idx], 2452); /* 9 */
> +	}
> +
> +	if (allowed_bands & BAND_FREQ_5_GHZ) {
> +		scan_freq_set_add(station->scan_freqs_order[subset_idx], 5745); /* 149 */
> +		scan_freq_set_add(station->scan_freqs_order[subset_idx], 5765); /* 153 */
> +		scan_freq_set_add(station->scan_freqs_order[subset_idx], 5785); /* 157 */
> +		scan_freq_set_add(station->scan_freqs_order[subset_idx], 5805); /* 161 */
> +		scan_freq_set_add(station->scan_freqs_order[subset_idx], 5825); /* 165 */
> +	}
> +
> +	station->scan_freqs_order[++subset_idx] = scan_freq_set_new();
> +
> +	/* TODO: Add 6GHz here, after more common 2.4 and 5GHz, but before DFS */
> +
> +	/* Subset 2: 2.4GHz remaining channels and 5GHz most common DFS channels */
> +	if (allowed_bands & BAND_FREQ_2_4_GHZ) {
> +		scan_freq_set_add(station->scan_freqs_order[subset_idx], 2417); /* 2 */
> +		scan_freq_set_add(station->scan_freqs_order[subset_idx], 2432); /* 5 */
> +		scan_freq_set_add(station->scan_freqs_order[subset_idx], 2442); /* 7 */
> +		scan_freq_set_add(station->scan_freqs_order[subset_idx], 2457); /* 10 */

For example, there may be the weird channel 13 in 2.4G band.

> +	}
> +
> +	if (allowed_bands & BAND_FREQ_5_GHZ) {
> +		scan_freq_set_add(station->scan_freqs_order[subset_idx], 5260); /* 52 */
> +		scan_freq_set_add(station->scan_freqs_order[subset_idx], 5280); /* 56 */
> +		scan_freq_set_add(station->scan_freqs_order[subset_idx], 5300); /* 60 */
> +		scan_freq_set_add(station->scan_freqs_order[subset_idx], 5320); /* 64 */
> +		scan_freq_set_add(station->scan_freqs_order[subset_idx], 5500); /* 100 */
> +		scan_freq_set_add(station->scan_freqs_order[subset_idx], 5520); /* 104 */
> +		scan_freq_set_add(station->scan_freqs_order[subset_idx], 5540); /* 108 */
> +		scan_freq_set_add(station->scan_freqs_order[subset_idx], 5560); /* 112 */
> +	}
> +
> +	station->scan_freqs_order[++subset_idx] = scan_freq_set_new();
> +
> +	/* Subset 3: Remaining 5GHz DFS channels */
> +	if (allowed_bands & BAND_FREQ_5_GHZ) {
> +		scan_freq_set_add(station->scan_freqs_order[subset_idx], 5340); /* 68 */
> +		scan_freq_set_add(station->scan_freqs_order[subset_idx], 5480); /* 96 */
> +		scan_freq_set_add(station->scan_freqs_order[subset_idx], 5580); /* 116 */
> +		scan_freq_set_add(station->scan_freqs_order[subset_idx], 5600); /* 120 */
> +		scan_freq_set_add(station->scan_freqs_order[subset_idx], 5620); /* 124 */
> +		scan_freq_set_add(station->scan_freqs_order[subset_idx], 5640); /* 128 */
> +		scan_freq_set_add(station->scan_freqs_order[subset_idx], 5660); /* 132 */
> +		scan_freq_set_add(station->scan_freqs_order[subset_idx], 5680); /* 136 */
> +		scan_freq_set_add(station->scan_freqs_order[subset_idx], 5700); /* 140 */
> +		scan_freq_set_add(station->scan_freqs_order[subset_idx], 5720); /* 144 */
>   	}

There is now a left-over if() here that cannot happen and should be taken out.

Does the logic now need to take into account the fact that any given frequency 
set might be empty?

>   
>   	/*
> @@ -5223,11 +5334,8 @@ static void station_free(struct station *station)
>   
>   	l_queue_destroy(station->anqp_pending, remove_anqp);
>   
> -	scan_freq_set_free(station->scan_freqs_order[0]);
> -	scan_freq_set_free(station->scan_freqs_order[1]);
> -
> -	if (station->scan_freqs_order[2])
> -		scan_freq_set_free(station->scan_freqs_order[2]);
> +	for (uint8_t i = 0; i < 4; i++)
> +		scan_freq_set_free(station->scan_freqs_order[i]);

We prefer somewhat old style C here.  Also magic number 4 should be changed to 
L_ARRAY_SIZE.

>   
>   	wiphy_state_watch_remove(station->wiphy, station->wiphy_watch);
>   
> 

Regards,
-Denis

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH RFC 3/3] station: improve roam scan strategy
  2025-04-16 17:19   ` Denis Kenzior
@ 2025-05-07 12:53     ` Alexander Ganslandt
  2025-05-07 15:40       ` James Prestwood
  2025-05-07 16:31       ` Denis Kenzior
  0 siblings, 2 replies; 10+ messages in thread
From: Alexander Ganslandt @ 2025-05-07 12:53 UTC (permalink / raw)
  To: iwd

Hello James and Denis,

Thank you both for the feedback! I want to focus on the overall idea 
here and will look at the code details later.

On 4/16/25 18:46, James Prestwood wrote:
 > Won't this start reusing the same frequencies after a few scans? Say
 > your first N scans take more than 3 seconds, you'd then begin using
 > those frequencies again on subsequent scans since their ages are under
 > the threshold?
 >
 > Rather than a hard threshold simply sorting by age seems like the best
 > way to do this:
 >
 >   - Sort the frequencies by least recently used -> Take the first N
 > frequencies -> Scan
 >   - No candidates found? -> Repeat ^^^

Yes, it will start reusing frequencies and might never reach the less 
common frequencies depending on how long time the scans take. That's 
definitely not ideal. Looking at the age as you suggest has the problem 
that it will eventually spend time scanning uncommon frequencies when 
that time would have been better spent scanning common frequencies that 
are getting old.

Maybe this is an impossible task to solve perfectly as there will always 
be a trade-off, maybe it's better left as configs for the user? In my 
livestream use case the APs should be configured such that most good 
frequencies are in "neighbor" or "known", so I would like to spend most 
time on these frequencies. However, for a use case where the network 
setup is unknown, having more exploration of uncommon frequencies might 
be better.

We could have configs instead where users can define the subsets and age 
threshold. If they don't set anything, the default is the current 
behavior. If they set subsets, the roaming algorithm will pick 
frequencies from those in the prioritized order, just like this patch 
does. It could still get stuck scanning the same frequencies if the age 
threshold is too low, but then it would be a user configuration error. 
We then also get rid of hard-coding frequencies in IWD and all problems 
that causes, but still give users the possibility to do so if needed. 
Does this make more sense?

On 4/16/25 19:19, Denis Kenzior wrote:
> Have you considered maintaining a scan_freq_set of all the frequencies 
> scanned
> by the neighbor and known-frequency stages instead of maintaining an 
> hashtable
> based on age?

I don't completely follow what you mean here. The reason for having the 
hashtable is to know how long ago a certain frequency was scanned in 
order to not scan it again too soon. So that info will have to be stored 
somehow, or did you have some other approach in mind?

> So the idea is to add up to STATION_MAX_SCAN_FREQS to a new set, 
> starting in
> order of preference with:
>         1. neighbor frequency set
>         2. known network set
>         3. scan frequency ordering set
> 
> while filtering the frequencies already scanned.
> 
> How do you know when to give up the current attempt?

Yes, correct. Do you mean when to give up on the roaming attempt 
completely? The current approach is to never give up, it will continue 
scanning frequencies in the subsets until it finds a better BSS, or 
until the current BSS signal goes above the threshold and roaming stops. 
I believe this is also what IWD does today, except it uses full scans 
continuously.

Regards,
Alexander

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH RFC 3/3] station: improve roam scan strategy
  2025-05-07 12:53     ` Alexander Ganslandt
@ 2025-05-07 15:40       ` James Prestwood
  2025-05-07 16:31       ` Denis Kenzior
  1 sibling, 0 replies; 10+ messages in thread
From: James Prestwood @ 2025-05-07 15:40 UTC (permalink / raw)
  To: Alexander Ganslandt, iwd

Hi Alexander,

On 5/7/25 5:53 AM, Alexander Ganslandt wrote:
> Hello James and Denis,
>
> Thank you both for the feedback! I want to focus on the overall idea 
> here and will look at the code details later.
>
> On 4/16/25 18:46, James Prestwood wrote:
> > Won't this start reusing the same frequencies after a few scans? Say
> > your first N scans take more than 3 seconds, you'd then begin using
> > those frequencies again on subsequent scans since their ages are under
> > the threshold?
> >
> > Rather than a hard threshold simply sorting by age seems like the best
> > way to do this:
> >
> >   - Sort the frequencies by least recently used -> Take the first N
> > frequencies -> Scan
> >   - No candidates found? -> Repeat ^^^
>
> Yes, it will start reusing frequencies and might never reach the less 
> common frequencies depending on how long time the scans take. That's 
> definitely not ideal. Looking at the age as you suggest has the 
> problem that it will eventually spend time scanning uncommon 
> frequencies when that time would have been better spent scanning 
> common frequencies that are getting old.

I think we can say the #1 goal is to avoid full spectrum scans. But 
entirely avoiding frequencies even if they are uncommon is definitely a 
show stopper. We do (eventually) need to scan all the frequencies even 
if they come after the 3rd, 4th, 5th scan subset.

>
> Maybe this is an impossible task to solve perfectly as there will 
> always be a trade-off, maybe it's better left as configs for the user? 
> In my livestream use case the APs should be configured such that most 
> good frequencies are in "neighbor" or "known", so I would like to 
> spend most time on these frequencies. However, for a use case where 
> the network setup is unknown, having more exploration of uncommon 
> frequencies might be better.

I see your point, and even for my use-case we generally know a subset of 
frequencies that the APs are on. But this really doesn't align well with 
a generalized use case where you may be on various public network, 
traveling between countries, etc. We should definitely prefer 
neighbor/known frequencies but we need to account for all frequencies at 
some point during the scanning process. Completely ignoring certain 
frequencies will end up biting us when some user comes and says their 
device refuses to roam to some AP in their network because its using an 
"uncommon" frequency.

>
> We could have configs instead where users can define the subsets and 
> age threshold. If they don't set anything, the default is the current 
> behavior. If they set subsets, the roaming algorithm will pick 
> frequencies from those in the prioritized order, just like this patch 
> does. It could still get stuck scanning the same frequencies if the 
> age threshold is too low, but then it would be a user configuration 
> error. We then also get rid of hard-coding frequencies in IWD and all 
> problems that causes, but still give users the possibility to do so if 
> needed. Does this make more sense?
I doubt your going to get such a configuration parameter past Denis :) 
We need to figure out a path for a dynamic/learned configuration based 
on prior frequencies seen/used, neighbor reports, while still 
(eventually) scanning the entire spectrum.
>
>
> On 4/16/25 19:19, Denis Kenzior wrote:
>> Have you considered maintaining a scan_freq_set of all the 
>> frequencies scanned
>> by the neighbor and known-frequency stages instead of maintaining an 
>> hashtable
>> based on age?
>
> I don't completely follow what you mean here. The reason for having 
> the hashtable is to know how long ago a certain frequency was scanned 
> in order to not scan it again too soon. So that info will have to be 
> stored somehow, or did you have some other approach in mind?
>
>> So the idea is to add up to STATION_MAX_SCAN_FREQS to a new set, 
>> starting in
>> order of preference with:
>>         1. neighbor frequency set
>>         2. known network set
>>         3. scan frequency ordering set
>>
>> while filtering the frequencies already scanned.
>>
>> How do you know when to give up the current attempt?
>
> Yes, correct. Do you mean when to give up on the roaming attempt 
> completely? The current approach is to never give up, it will continue 
> scanning frequencies in the subsets until it finds a better BSS, or 
> until the current BSS signal goes above the threshold and roaming 
> stops. I believe this is also what IWD does today, except it uses full 
> scans continuously.
>
> Regards,
> Alexander
>

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH RFC 3/3] station: improve roam scan strategy
  2025-05-07 12:53     ` Alexander Ganslandt
  2025-05-07 15:40       ` James Prestwood
@ 2025-05-07 16:31       ` Denis Kenzior
  2025-05-09  8:53         ` Alexander Ganslandt
  1 sibling, 1 reply; 10+ messages in thread
From: Denis Kenzior @ 2025-05-07 16:31 UTC (permalink / raw)
  To: Alexander Ganslandt, iwd

Hi Alexander,

 >
> Yes, it will start reusing frequencies and might never reach the less common 
> frequencies depending on how long time the scans take. That's definitely not 
> ideal. Looking at the age as you suggest has the problem that it will eventually 
> spend time scanning uncommon frequencies when that time would have been better 
> spent scanning common frequencies that are getting old.

DFS frequencies are valid and it is quite common for APs to operate on them. 
Not scanning them or delaying the DFS scan excessively can lead to really 
terrible user experience.  In fact, we're having this problem on some older 
FullMAC hardware.  The firmware doesn't initially scan DFS frequencies, even 
though it is being told to.  This has led to all sorts of fun issues and lots of 
$/time wasted.

The less 'preferred' frequencies can be scanned last, but I think we still need 
to scan every valid frequency before starting the next roaming attempt.

> 
> Maybe this is an impossible task to solve perfectly as there will always be a 
> trade-off, maybe it's better left as configs for the user? In my livestream use 

A typical user cannot be expected to optimize something like this.  Our 
principle is to not rely on the user for any configuration.  The default 
behavior should be good enough for typical uses.

> case the APs should be configured such that most good frequencies are in 
> "neighbor" or "known", so I would like to spend most time on these frequencies. 
> However, for a use case where the network setup is unknown, having more 
> exploration of uncommon frequencies might be better.
> 

+1

> We could have configs instead where users can define the subsets and age 
> threshold. If they don't set anything, the default is the current behavior. If 

No.  You should not be relying on the user for anything like this.

> they set subsets, the roaming algorithm will pick frequencies from those in the 
> prioritized order, just like this patch does. It could still get stuck scanning 
> the same frequencies if the age threshold is too low, but then it would be a 
> user configuration error. We then also get rid of hard-coding frequencies in IWD 

Strong no.  It is not the user's problem and never should be.  Also, I really 
question using 'age' in seconds.  I strongly suspect it is something that will 
likely fail in all kinds of unexpected ways.  Hardware has all sorts of strange 
behaviors outside of iwd's control.  Scan times vary wildly.  A more stable 
solution is needed.

> and all problems that causes, but still give users the possibility to do so if 
> needed. Does this make more sense?
> 
> On 4/16/25 19:19, Denis Kenzior wrote:
>> Have you considered maintaining a scan_freq_set of all the frequencies scanned
>> by the neighbor and known-frequency stages instead of maintaining an hashtable
>> based on age?
> 
> I don't completely follow what you mean here. The reason for having the 
> hashtable is to know how long ago a certain frequency was scanned in order to 
> not scan it again too soon. So that info will have to be stored somehow, or did 
> you have some other approach in mind?

See my comment above about using a time based age value.  I would push more 
towards a solution that scans every (enabled) frequency in some sort of 
preference order.  My opinion is that using a set of previously scanned 
frequencies, rather than a hashtable, would fit better into that sort of strategy.

Regards,
-Denis

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH RFC 3/3] station: improve roam scan strategy
  2025-05-07 16:31       ` Denis Kenzior
@ 2025-05-09  8:53         ` Alexander Ganslandt
  0 siblings, 0 replies; 10+ messages in thread
From: Alexander Ganslandt @ 2025-05-09  8:53 UTC (permalink / raw)
  To: iwd

Hi Denis and James,

On 5/7/25 18:31, Denis Kenzior wrote:
> Hi Alexander,
> 
> DFS frequencies are valid and it is quite common for APs to operate on 
> them.
> Not scanning them or delaying the DFS scan excessively can lead to really
> terrible user experience.  In fact, we're having this problem on some older
> FullMAC hardware.  The firmware doesn't initially scan DFS frequencies, 
> even
> though it is being told to.  This has led to all sorts of fun issues and 
> lots of
> $/time wasted.
> 
> The less 'preferred' frequencies can be scanned last, but I think we 
> still need
> to scan every valid frequency before starting the next roaming attempt.

Good point. What is the definition of a "roaming attempt" here? You mean 
that all valid frequencies should be scanned before any frequency is 
re-scanned?

> A typical user cannot be expected to optimize something like this.  Our
> principle is to not rely on the user for any configuration.  The default
> behavior should be good enough for typical uses.

I agree, default should be good enough but for some use cases I think 
this kind of configuration could give big benefits. Maybe having a 
general config that limits which frequencies iwd uses is more 
reasonable? For example wpa-supplicant has the "scan_freq" config for 
this, which provides a simple way to optimize scans if the user knows 
what they're doing.

> See my comment above about using a time based age value.  I would push more
> towards a solution that scans every (enabled) frequency in some sort of
> preference order.  My opinion is that using a set of previously scanned
> frequencies, rather than a hashtable, would fit better into that sort of 
> strategy.

Gotcha. I still think neighbors will have to be scanned more often 
though. Maybe something like this would make more sense:
1. Scan neighbors
2. Scan knowns
3. Scan X un-scanned frequencies (ordered in some nice way)
4. Scan neighbors
5. Goto 3

And then use a list to track which frequencies have been scanned, as you 
say. The time until neighbors are re-scanned would then be dependent on 
how long the hardware takes to scan the other frequencies, but that 
seems hard to avoid. I'm only working with one type of hardware at the 
moment, so I don't have the full picture of how other hardware behaves 
and how big the difference is between fast and slow hardware.

I will also look in to ordering the frequencies in a more sustainable 
way, rather than hard-coding values. Also 6 GHz will have to be added to 
this somehow, not sure how to handle that in terms of priority.

Do you still see problems with this approach? It should eventually scan 
all frequencies, but might take a bit longer due to extra neighbor scans.

Regards,
Alexander

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2025-05-09  8:53 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-04-15  8:21 [PATCH RFC 0/3] Improve roam scan strategy Alexander Ganslandt
2025-04-15  8:21 ` [PATCH RFC 1/3] util: add scan_freq_set_size function Alexander Ganslandt
2025-04-15  8:21 ` [PATCH RFC 2/3] scan: add scan_freq_map Alexander Ganslandt
2025-04-15  8:21 ` [PATCH RFC 3/3] station: improve roam scan strategy Alexander Ganslandt
2025-04-16 16:46   ` James Prestwood
2025-04-16 17:19   ` Denis Kenzior
2025-05-07 12:53     ` Alexander Ganslandt
2025-05-07 15:40       ` James Prestwood
2025-05-07 16:31       ` Denis Kenzior
2025-05-09  8:53         ` Alexander Ganslandt

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).