public inbox for netdev@vger.kernel.org
 help / color / mirror / Atom feed
From: Jakub Kicinski <kuba@kernel.org>
To: edumazet@google.com
Cc: Jakub Kicinski <kuba@kernel.org>,
	willemb@google.com, netdev@vger.kernel.org, davem@davemloft.net,
	pabeni@redhat.com, krikku@gmail.com, horms@kernel.org,
	kuniyu@google.com, eric.dumazet@gmail.com
Subject: Re: [net] net: fix potential out-of-bound access in set_rps_cpu()
Date: Fri, 20 Feb 2026 13:52:10 -0800	[thread overview]
Message-ID: <20260220215209.3209259-2-kuba@kernel.org> (raw)
In-Reply-To: <20260220144344.2548121-1-edumazet@google.com>

Hi Eric!

I'm attaching the AI review. It's a bit long, I stopped reading
half way thru. But looking at the code I think AI is onto something.
We have fetched the flow_table in get_rps_cpu() based on
skb_get_rx_queue(skb), but in set_rps_cpu() we are fetching the table
for the target / next_cpu, which we make sure is _not_ the same:

    if (rxq_index == skb_get_rx_queue(skb))

---
net: fix potential out-of-bound access in set_rps_cpu()

This commit addresses an out-of-bound access issue introduced when
set_rps_cpu() was reloading rxqueue->rps_flow_table. The fix passes the
flow_table pointer from get_rps_cpu() to set_rps_cpu() to follow RCU rules.

> diff --git a/net/core/dev.c b/net/core/dev.c
> index 096b3ff13f6b..7c5e69c771ca 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -4993,13 +4993,11 @@ static bool rps_flow_is_active(struct rps_dev_flow *rflow,
>  static struct rps_dev_flow *
>  set_rps_cpu(struct net_device *dev, struct sk_buff *skb,
>  	    struct rps_dev_flow *rflow, u16 next_cpu, u32 hash,
> -	    u32 flow_id)
> +	    u32 flow_id, struct rps_dev_flow_table *flow_table)
>  {
>  	if (next_cpu < nr_cpu_ids) {
>  		u32 head;
>  #ifdef CONFIG_RFS_ACCEL
> -		struct netdev_rx_queue *rxqueue;
> -		struct rps_dev_flow_table *flow_table;
>  		struct rps_dev_flow *old_rflow;
>  		struct rps_dev_flow *tmp_rflow;
>  		unsigned int tmp_cpu;
> @@ -5014,11 +5012,6 @@ set_rps_cpu(struct net_device *dev, struct sk_buff *skb,
>  		if (rxq_index == skb_get_rx_queue(skb))
>  			goto out;
>
> -		rxqueue = dev->_rx + rxq_index;
> -		flow_table = rcu_dereference(rxqueue->rps_flow_table);
> -		if (!flow_table)
> -			goto out;
> -
>  		tmp_rflow = &flow_table->flows[flow_id];
>  		tmp_cpu = READ_ONCE(tmp_rflow->cpu);
>

Does this change introduce an aliasing issue in the CONFIG_RFS_ACCEL path?

In get_rps_cpu(), the flow_table parameter is obtained from the receiving
queue (dev->_rx[skb_get_rx_queue(skb)]) and rflow points to
&flow_table->flows[flow_id].

When this rflow is passed into set_rps_cpu() with the same flow_table and
flow_id, tmp_rflow is computed as &flow_table->flows[flow_id]:

net/core/dev.c:set_rps_cpu() {
	tmp_rflow = &flow_table->flows[flow_id];
	tmp_cpu = READ_ONCE(tmp_rflow->cpu);
	rc = cpu_rmap_lookup_index(dev->rx_cpu_rmap, next_cpu);
	if (rc >= 0)
		next_cpu = rc;

	old_rflow = rflow;
	rflow = tmp_rflow;
	WRITE_ONCE(rflow->filter, rc);
	WRITE_ONCE(rflow->hash, hash);

	if (old_rflow->filter == rc)
		WRITE_ONCE(old_rflow->filter, RPS_NO_FILTER);
}

Since rflow and tmp_rflow are derived from the same flow_table and flow_id,
they point to the same memory location. After the assignments old_rflow =
rflow and rflow = tmp_rflow, all three pointers alias the same object.

The code stores rc into rflow->filter with WRITE_ONCE(rflow->filter, rc),
then checks if (old_rflow->filter == rc). Because old_rflow is rflow (which
is tmp_rflow), this condition is always true, causing WRITE_ONCE(old_rflow->
filter, RPS_NO_FILTER) to immediately clear the filter ID that was just set.

This means the hardware filter ID is never retained in the rps_dev_flow
entry. On subsequent packets for the same flow, the stale-flow check:

	if (READ_ONCE(tmp_rflow->filter) != RPS_NO_FILTER)

will always find RPS_NO_FILTER, causing ndo_rx_flow_steer() to be called
redundantly for every packet even when the flow is already steered.

Was the intent to use different flow tables for the receiving and target
queues so that the filter deduplication logic could work properly?

> @@ -5134,7 +5127,7 @@ static int get_rps_cpu(struct net_device *dev, struct sk_buff *skb,
>  		      rflow->last_qtail)) >= 0)) {
>  			tcpu = next_cpu;
>  			rflow = set_rps_cpu(dev, skb, rflow, next_cpu, hash,
> -					    flow_id);
> +					    flow_id, flow_table);
>  		}
>
>  		if (tcpu < nr_cpu_ids && cpu_online(tcpu)) {
-- 
pw-bot: cr

  reply	other threads:[~2026-02-20 21:55 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-02-20 14:43 [PATCH net] net: fix potential out-of-bound access in set_rps_cpu() Eric Dumazet
2026-02-20 21:52 ` Jakub Kicinski [this message]
2026-02-20 22:03   ` [net] " Eric Dumazet

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260220215209.3209259-2-kuba@kernel.org \
    --to=kuba@kernel.org \
    --cc=davem@davemloft.net \
    --cc=edumazet@google.com \
    --cc=eric.dumazet@gmail.com \
    --cc=horms@kernel.org \
    --cc=krikku@gmail.com \
    --cc=kuniyu@google.com \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    --cc=willemb@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox