* [PATCH net-next 0/4] rps: misc changes
@ 2025-04-07 16:35 Eric Dumazet
2025-04-07 16:35 ` [PATCH net-next 1/4] net: rps: change skb_flow_limit() hash function Eric Dumazet
` (4 more replies)
0 siblings, 5 replies; 10+ messages in thread
From: Eric Dumazet @ 2025-04-07 16:35 UTC (permalink / raw)
To: David S . Miller, Jakub Kicinski, Paolo Abeni
Cc: Simon Horman, Willem de Bruijn, netdev, eric.dumazet,
Eric Dumazet
Minor changes in rps:
skb_flow_limit() is probably unused these days,
and data-races are quite theoretical.
Eric Dumazet (4):
net: rps: change skb_flow_limit() hash function
net: rps: annotate data-races around (struct sd_flow_limit)->count
net: add data-race annotations in softnet_seq_show()
net: rps: remove kfree_rcu_mightsleep() use
include/net/rps.h | 5 +++--
net/core/dev.c | 11 +++++++----
net/core/dev.h | 5 +++--
net/core/net-procfs.c | 9 +++++----
net/core/sysctl_net_core.c | 6 +++---
5 files changed, 21 insertions(+), 15 deletions(-)
--
2.49.0.504.g3bcea36a83-goog
^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH net-next 1/4] net: rps: change skb_flow_limit() hash function
2025-04-07 16:35 [PATCH net-next 0/4] rps: misc changes Eric Dumazet
@ 2025-04-07 16:35 ` Eric Dumazet
2025-04-07 18:53 ` Willem de Bruijn
2025-04-07 16:36 ` [PATCH net-next 2/4] net: rps: annotate data-races around (struct sd_flow_limit)->count Eric Dumazet
` (3 subsequent siblings)
4 siblings, 1 reply; 10+ messages in thread
From: Eric Dumazet @ 2025-04-07 16:35 UTC (permalink / raw)
To: David S . Miller, Jakub Kicinski, Paolo Abeni
Cc: Simon Horman, Willem de Bruijn, netdev, eric.dumazet,
Eric Dumazet
As explained in commit f3483c8e1da6 ("net: rfs: hash function change"),
masking low order bits of skb_get_hash(skb) has low entropy.
A NIC with 32 RX queues uses the 5 low order bits of rss key
to select a queue. This means all packets landing to a given
queue share the same 5 low order bits.
Switch to hash_32() to reduce hash collisions.
Signed-off-by: Eric Dumazet <edumazet@google.com>
---
net/core/dev.c | 2 +-
net/core/dev.h | 2 +-
net/core/sysctl_net_core.c | 2 +-
3 files changed, 3 insertions(+), 3 deletions(-)
diff --git a/net/core/dev.c b/net/core/dev.c
index 0608605cfc24..f674236f34be 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -5038,7 +5038,7 @@ static bool skb_flow_limit(struct sk_buff *skb, unsigned int qlen)
rcu_read_lock();
fl = rcu_dereference(sd->flow_limit);
if (fl) {
- new_flow = skb_get_hash(skb) & (fl->num_buckets - 1);
+ new_flow = hash_32(skb_get_hash(skb), fl->log_buckets);
old_flow = fl->history[fl->history_head];
fl->history[fl->history_head] = new_flow;
diff --git a/net/core/dev.h b/net/core/dev.h
index 7ee203395d8e..b1cd44b5f009 100644
--- a/net/core/dev.h
+++ b/net/core/dev.h
@@ -16,7 +16,7 @@ struct cpumask;
#define FLOW_LIMIT_HISTORY (1 << 7) /* must be ^2 and !overflow buckets */
struct sd_flow_limit {
u64 count;
- unsigned int num_buckets;
+ u8 log_buckets;
unsigned int history_head;
u16 history[FLOW_LIMIT_HISTORY];
u8 buckets[];
diff --git a/net/core/sysctl_net_core.c b/net/core/sysctl_net_core.c
index c7769ee0d9c5..5cfe76ede523 100644
--- a/net/core/sysctl_net_core.c
+++ b/net/core/sysctl_net_core.c
@@ -248,7 +248,7 @@ static int flow_limit_cpu_sysctl(const struct ctl_table *table, int write,
ret = -ENOMEM;
goto write_unlock;
}
- cur->num_buckets = netdev_flow_limit_table_len;
+ cur->log_buckets = ilog2(netdev_flow_limit_table_len);
rcu_assign_pointer(sd->flow_limit, cur);
}
}
--
2.49.0.504.g3bcea36a83-goog
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [PATCH net-next 2/4] net: rps: annotate data-races around (struct sd_flow_limit)->count
2025-04-07 16:35 [PATCH net-next 0/4] rps: misc changes Eric Dumazet
2025-04-07 16:35 ` [PATCH net-next 1/4] net: rps: change skb_flow_limit() hash function Eric Dumazet
@ 2025-04-07 16:36 ` Eric Dumazet
2025-04-07 18:55 ` Willem de Bruijn
2025-04-07 16:36 ` [PATCH net-next 3/4] net: add data-race annotations in softnet_seq_show() Eric Dumazet
` (2 subsequent siblings)
4 siblings, 1 reply; 10+ messages in thread
From: Eric Dumazet @ 2025-04-07 16:36 UTC (permalink / raw)
To: David S . Miller, Jakub Kicinski, Paolo Abeni
Cc: Simon Horman, Willem de Bruijn, netdev, eric.dumazet,
Eric Dumazet
softnet_seq_show() can read fl->count while another cpu
updates this field from skb_flow_limit().
Make this field an 'unsigned int', as its only consumer
only deals with 32 bit.
Signed-off-by: Eric Dumazet <edumazet@google.com>
---
net/core/dev.c | 3 ++-
net/core/dev.h | 2 +-
net/core/net-procfs.c | 3 ++-
3 files changed, 5 insertions(+), 3 deletions(-)
diff --git a/net/core/dev.c b/net/core/dev.c
index f674236f34be..969883173182 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -5049,7 +5049,8 @@ static bool skb_flow_limit(struct sk_buff *skb, unsigned int qlen)
fl->buckets[old_flow]--;
if (++fl->buckets[new_flow] > (FLOW_LIMIT_HISTORY >> 1)) {
- fl->count++;
+ /* Pairs with READ_ONCE() in softnet_seq_show() */
+ WRITE_ONCE(fl->count, fl->count + 1);
rcu_read_unlock();
return true;
}
diff --git a/net/core/dev.h b/net/core/dev.h
index b1cd44b5f009..e855e1cb43fd 100644
--- a/net/core/dev.h
+++ b/net/core/dev.h
@@ -15,7 +15,7 @@ struct cpumask;
/* Random bits of netdevice that don't need to be exposed */
#define FLOW_LIMIT_HISTORY (1 << 7) /* must be ^2 and !overflow buckets */
struct sd_flow_limit {
- u64 count;
+ unsigned int count;
u8 log_buckets;
unsigned int history_head;
u16 history[FLOW_LIMIT_HISTORY];
diff --git a/net/core/net-procfs.c b/net/core/net-procfs.c
index 3e92bf0f9060..69782d62fbe1 100644
--- a/net/core/net-procfs.c
+++ b/net/core/net-procfs.c
@@ -132,8 +132,9 @@ static int softnet_seq_show(struct seq_file *seq, void *v)
rcu_read_lock();
fl = rcu_dereference(sd->flow_limit);
+ /* Pairs with WRITE_ONCE() in skb_flow_limit() */
if (fl)
- flow_limit_count = fl->count;
+ flow_limit_count = READ_ONCE(fl->count);
rcu_read_unlock();
#endif
--
2.49.0.504.g3bcea36a83-goog
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [PATCH net-next 3/4] net: add data-race annotations in softnet_seq_show()
2025-04-07 16:35 [PATCH net-next 0/4] rps: misc changes Eric Dumazet
2025-04-07 16:35 ` [PATCH net-next 1/4] net: rps: change skb_flow_limit() hash function Eric Dumazet
2025-04-07 16:36 ` [PATCH net-next 2/4] net: rps: annotate data-races around (struct sd_flow_limit)->count Eric Dumazet
@ 2025-04-07 16:36 ` Eric Dumazet
2025-04-07 19:04 ` Willem de Bruijn
2025-04-07 16:36 ` [PATCH net-next 4/4] net: rps: remove kfree_rcu_mightsleep() use Eric Dumazet
2025-04-08 20:08 ` [PATCH net-next 0/4] rps: misc changes patchwork-bot+netdevbpf
4 siblings, 1 reply; 10+ messages in thread
From: Eric Dumazet @ 2025-04-07 16:36 UTC (permalink / raw)
To: David S . Miller, Jakub Kicinski, Paolo Abeni
Cc: Simon Horman, Willem de Bruijn, netdev, eric.dumazet,
Eric Dumazet
softnet_seq_show() reads several fields that might be updated
concurrently. Add READ_ONCE() and WRITE_ONCE() annotations.
Signed-off-by: Eric Dumazet <edumazet@google.com>
---
net/core/dev.c | 6 ++++--
net/core/net-procfs.c | 6 +++---
2 files changed, 7 insertions(+), 5 deletions(-)
diff --git a/net/core/dev.c b/net/core/dev.c
index 969883173182..4ccc6dc5303e 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -4953,7 +4953,8 @@ static void rps_trigger_softirq(void *data)
struct softnet_data *sd = data;
____napi_schedule(sd, &sd->backlog);
- sd->received_rps++;
+ /* Pairs with READ_ONCE() in softnet_seq_show() */
+ WRITE_ONCE(sd->received_rps, sd->received_rps + 1);
}
#endif /* CONFIG_RPS */
@@ -7523,7 +7524,8 @@ static __latent_entropy void net_rx_action(void)
*/
if (unlikely(budget <= 0 ||
time_after_eq(jiffies, time_limit))) {
- sd->time_squeeze++;
+ /* Pairs with READ_ONCE() in softnet_seq_show() */
+ WRITE_ONCE(sd->time_squeeze, sd->time_squeeze + 1);
break;
}
}
diff --git a/net/core/net-procfs.c b/net/core/net-procfs.c
index 69782d62fbe1..4f0f0709a1cb 100644
--- a/net/core/net-procfs.c
+++ b/net/core/net-procfs.c
@@ -145,11 +145,11 @@ static int softnet_seq_show(struct seq_file *seq, void *v)
seq_printf(seq,
"%08x %08x %08x %08x %08x %08x %08x %08x %08x %08x %08x %08x %08x "
"%08x %08x\n",
- sd->processed, atomic_read(&sd->dropped),
- sd->time_squeeze, 0,
+ READ_ONCE(sd->processed), atomic_read(&sd->dropped),
+ READ_ONCE(sd->time_squeeze), 0,
0, 0, 0, 0, /* was fastroute */
0, /* was cpu_collision */
- sd->received_rps, flow_limit_count,
+ READ_ONCE(sd->received_rps), flow_limit_count,
input_qlen + process_qlen, (int)seq->index,
input_qlen, process_qlen);
return 0;
--
2.49.0.504.g3bcea36a83-goog
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [PATCH net-next 4/4] net: rps: remove kfree_rcu_mightsleep() use
2025-04-07 16:35 [PATCH net-next 0/4] rps: misc changes Eric Dumazet
` (2 preceding siblings ...)
2025-04-07 16:36 ` [PATCH net-next 3/4] net: add data-race annotations in softnet_seq_show() Eric Dumazet
@ 2025-04-07 16:36 ` Eric Dumazet
2025-04-07 19:36 ` Willem de Bruijn
2025-04-08 20:08 ` [PATCH net-next 0/4] rps: misc changes patchwork-bot+netdevbpf
4 siblings, 1 reply; 10+ messages in thread
From: Eric Dumazet @ 2025-04-07 16:36 UTC (permalink / raw)
To: David S . Miller, Jakub Kicinski, Paolo Abeni
Cc: Simon Horman, Willem de Bruijn, netdev, eric.dumazet,
Eric Dumazet
Add an rcu_head to sd_flow_limit and rps_sock_flow_table structs
to use the more conventional and predictable k[v]free_rcu().
Signed-off-by: Eric Dumazet <edumazet@google.com>
---
include/net/rps.h | 5 +++--
net/core/dev.h | 1 +
net/core/sysctl_net_core.c | 4 ++--
3 files changed, 6 insertions(+), 4 deletions(-)
diff --git a/include/net/rps.h b/include/net/rps.h
index e358e9711f27..507f4aa5d39b 100644
--- a/include/net/rps.h
+++ b/include/net/rps.h
@@ -57,9 +57,10 @@ struct rps_dev_flow_table {
* meaning we use 32-6=26 bits for the hash.
*/
struct rps_sock_flow_table {
- u32 mask;
+ struct rcu_head rcu;
+ u32 mask;
- u32 ents[] ____cacheline_aligned_in_smp;
+ u32 ents[] ____cacheline_aligned_in_smp;
};
#define RPS_SOCK_FLOW_TABLE_SIZE(_num) (offsetof(struct rps_sock_flow_table, ents[_num]))
diff --git a/net/core/dev.h b/net/core/dev.h
index e855e1cb43fd..710abc05ebdb 100644
--- a/net/core/dev.h
+++ b/net/core/dev.h
@@ -15,6 +15,7 @@ struct cpumask;
/* Random bits of netdevice that don't need to be exposed */
#define FLOW_LIMIT_HISTORY (1 << 7) /* must be ^2 and !overflow buckets */
struct sd_flow_limit {
+ struct rcu_head rcu;
unsigned int count;
u8 log_buckets;
unsigned int history_head;
diff --git a/net/core/sysctl_net_core.c b/net/core/sysctl_net_core.c
index 5cfe76ede523..5dbb2c6f371d 100644
--- a/net/core/sysctl_net_core.c
+++ b/net/core/sysctl_net_core.c
@@ -201,7 +201,7 @@ static int rps_sock_flow_sysctl(const struct ctl_table *table, int write,
if (orig_sock_table) {
static_branch_dec(&rps_needed);
static_branch_dec(&rfs_needed);
- kvfree_rcu_mightsleep(orig_sock_table);
+ kvfree_rcu(orig_sock_table, rcu);
}
}
}
@@ -239,7 +239,7 @@ static int flow_limit_cpu_sysctl(const struct ctl_table *table, int write,
lockdep_is_held(&flow_limit_update_mutex));
if (cur && !cpumask_test_cpu(i, mask)) {
RCU_INIT_POINTER(sd->flow_limit, NULL);
- kfree_rcu_mightsleep(cur);
+ kfree_rcu(cur, rcu);
} else if (!cur && cpumask_test_cpu(i, mask)) {
cur = kzalloc_node(len, GFP_KERNEL,
cpu_to_node(i));
--
2.49.0.504.g3bcea36a83-goog
^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH net-next 1/4] net: rps: change skb_flow_limit() hash function
2025-04-07 16:35 ` [PATCH net-next 1/4] net: rps: change skb_flow_limit() hash function Eric Dumazet
@ 2025-04-07 18:53 ` Willem de Bruijn
0 siblings, 0 replies; 10+ messages in thread
From: Willem de Bruijn @ 2025-04-07 18:53 UTC (permalink / raw)
To: Eric Dumazet, David S . Miller, Jakub Kicinski, Paolo Abeni
Cc: Simon Horman, Willem de Bruijn, netdev, eric.dumazet,
Eric Dumazet
Eric Dumazet wrote:
> As explained in commit f3483c8e1da6 ("net: rfs: hash function change"),
> masking low order bits of skb_get_hash(skb) has low entropy.
>
> A NIC with 32 RX queues uses the 5 low order bits of rss key
> to select a queue. This means all packets landing to a given
> queue share the same 5 low order bits.
>
> Switch to hash_32() to reduce hash collisions.
>
> Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH net-next 2/4] net: rps: annotate data-races around (struct sd_flow_limit)->count
2025-04-07 16:36 ` [PATCH net-next 2/4] net: rps: annotate data-races around (struct sd_flow_limit)->count Eric Dumazet
@ 2025-04-07 18:55 ` Willem de Bruijn
0 siblings, 0 replies; 10+ messages in thread
From: Willem de Bruijn @ 2025-04-07 18:55 UTC (permalink / raw)
To: Eric Dumazet, David S . Miller, Jakub Kicinski, Paolo Abeni
Cc: Simon Horman, Willem de Bruijn, netdev, eric.dumazet,
Eric Dumazet
Eric Dumazet wrote:
> softnet_seq_show() can read fl->count while another cpu
> updates this field from skb_flow_limit().
>
> Make this field an 'unsigned int', as its only consumer
> only deals with 32 bit.
>
> Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH net-next 3/4] net: add data-race annotations in softnet_seq_show()
2025-04-07 16:36 ` [PATCH net-next 3/4] net: add data-race annotations in softnet_seq_show() Eric Dumazet
@ 2025-04-07 19:04 ` Willem de Bruijn
0 siblings, 0 replies; 10+ messages in thread
From: Willem de Bruijn @ 2025-04-07 19:04 UTC (permalink / raw)
To: Eric Dumazet, David S . Miller, Jakub Kicinski, Paolo Abeni
Cc: Simon Horman, Willem de Bruijn, netdev, eric.dumazet,
Eric Dumazet
Eric Dumazet wrote:
> softnet_seq_show() reads several fields that might be updated
> concurrently. Add READ_ONCE() and WRITE_ONCE() annotations.
>
> Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH net-next 4/4] net: rps: remove kfree_rcu_mightsleep() use
2025-04-07 16:36 ` [PATCH net-next 4/4] net: rps: remove kfree_rcu_mightsleep() use Eric Dumazet
@ 2025-04-07 19:36 ` Willem de Bruijn
0 siblings, 0 replies; 10+ messages in thread
From: Willem de Bruijn @ 2025-04-07 19:36 UTC (permalink / raw)
To: Eric Dumazet, David S . Miller, Jakub Kicinski, Paolo Abeni
Cc: Simon Horman, Willem de Bruijn, netdev, eric.dumazet,
Eric Dumazet
Eric Dumazet wrote:
> Add an rcu_head to sd_flow_limit and rps_sock_flow_table structs
> to use the more conventional and predictable k[v]free_rcu().
>
> Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH net-next 0/4] rps: misc changes
2025-04-07 16:35 [PATCH net-next 0/4] rps: misc changes Eric Dumazet
` (3 preceding siblings ...)
2025-04-07 16:36 ` [PATCH net-next 4/4] net: rps: remove kfree_rcu_mightsleep() use Eric Dumazet
@ 2025-04-08 20:08 ` patchwork-bot+netdevbpf
4 siblings, 0 replies; 10+ messages in thread
From: patchwork-bot+netdevbpf @ 2025-04-08 20:08 UTC (permalink / raw)
To: Eric Dumazet; +Cc: davem, kuba, pabeni, horms, willemb, netdev, eric.dumazet
Hello:
This series was applied to netdev/net-next.git (main)
by Jakub Kicinski <kuba@kernel.org>:
On Mon, 7 Apr 2025 16:35:58 +0000 you wrote:
> Minor changes in rps:
>
> skb_flow_limit() is probably unused these days,
> and data-races are quite theoretical.
>
> Eric Dumazet (4):
> net: rps: change skb_flow_limit() hash function
> net: rps: annotate data-races around (struct sd_flow_limit)->count
> net: add data-race annotations in softnet_seq_show()
> net: rps: remove kfree_rcu_mightsleep() use
>
> [...]
Here is the summary with links:
- [net-next,1/4] net: rps: change skb_flow_limit() hash function
https://git.kernel.org/netdev/net-next/c/c3025e94daa9
- [net-next,2/4] net: rps: annotate data-races around (struct sd_flow_limit)->count
https://git.kernel.org/netdev/net-next/c/7b6f0a852da3
- [net-next,3/4] net: add data-race annotations in softnet_seq_show()
https://git.kernel.org/netdev/net-next/c/22d046a778e4
- [net-next,4/4] net: rps: remove kfree_rcu_mightsleep() use
https://git.kernel.org/netdev/net-next/c/0a7de4a8f898
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html
^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2025-04-08 20:08 UTC | newest]
Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-04-07 16:35 [PATCH net-next 0/4] rps: misc changes Eric Dumazet
2025-04-07 16:35 ` [PATCH net-next 1/4] net: rps: change skb_flow_limit() hash function Eric Dumazet
2025-04-07 18:53 ` Willem de Bruijn
2025-04-07 16:36 ` [PATCH net-next 2/4] net: rps: annotate data-races around (struct sd_flow_limit)->count Eric Dumazet
2025-04-07 18:55 ` Willem de Bruijn
2025-04-07 16:36 ` [PATCH net-next 3/4] net: add data-race annotations in softnet_seq_show() Eric Dumazet
2025-04-07 19:04 ` Willem de Bruijn
2025-04-07 16:36 ` [PATCH net-next 4/4] net: rps: remove kfree_rcu_mightsleep() use Eric Dumazet
2025-04-07 19:36 ` Willem de Bruijn
2025-04-08 20:08 ` [PATCH net-next 0/4] rps: misc changes patchwork-bot+netdevbpf
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).