From: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
To: Florian Westphal <fw@strlen.de>
Cc: "David S. Miller" <davem@davemloft.net>,
Eric Dumazet <edumazet@google.com>,
Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
kernel@openvz.org
Subject: Re: [PATCH] neighbour: purge nf_bridged skb from foreign device neigh
Date: Tue, 9 Jan 2024 12:57:36 +0800
Message-ID: <07490c75-86c3-4488-8adb-7740b14feb30@virtuozzo.com>
In-Reply-To: <a84b2797-2008-45d6-9ca3-c72666d3c419@virtuozzo.com>
On 08/01/2024 19:26, Pavel Tikhomirov wrote:
>
>
> On 08/01/2024 19:15, Florian Westphal wrote:
>> Pavel Tikhomirov <ptikhomirov@virtuozzo.com> wrote:
>>> An skb can be added to a neigh->arp_queue while waiting for an arp
>>> reply. Where original skb's skb->dev can be different to neigh's
>>> neigh->dev. For instance in case of bridging dnated skb from one veth to
>>> another, the skb would be added to a neigh->arp_queue of the bridge.
>>>
>>> There is no explicit mechanism that prevents the original skb->dev link
>>> of such skb from being freed under us. For instance neigh_flush_dev does
>>> not cleanup skbs from different device's neigh queue. But that original
>>> link can be used and lead to crash on e.g. this stack:
>>>
>>> arp_process
>>>   neigh_update
>>>     skb = __skb_dequeue(&neigh->arp_queue)
>>>       neigh_resolve_output(..., skb)
>>>         ...
>>>           br_nf_dev_xmit
>>>             br_nf_pre_routing_finish_bridge_slow
>>>               skb->dev = nf_bridge->physindev
>>>               br_handle_frame_finish
>>>
>>> So let's improve neigh_flush_dev to also purge skbs when device
>>> equal to their skb->nf_bridge->physindev gets destroyed.
>>
>> Can we fix this by replacing physindev pointer with plain
>> ifindex instead? There are not too many places that need to
>> peek into the original net_device struct, so I don't think
>> the additional dev_get_by_index_rcu() would be an issue.
>
> I will work on it, thanks for a good idea!
>
If we replace nf_bridge->physindev completely, we would need to do
something like this in every place physindev was used:
diff --git a/include/linux/netfilter_bridge.h b/include/linux/netfilter_bridge.h
index f980edfdd2783..105fbdb029261 100644
--- a/include/linux/netfilter_bridge.h
+++ b/include/linux/netfilter_bridge.h
@@ -56,11 +56,15 @@ static inline int nf_bridge_get_physoutif(const struct sk_buff *skb)
}
static inline struct net_device *
-nf_bridge_get_physindev(const struct sk_buff *skb)
+nf_bridge_get_physindev_rcu(const struct sk_buff *skb)
{
const struct nf_bridge_info *nf_bridge = nf_bridge_info_get(skb);
+ struct net_device *dev;
- return nf_bridge ? nf_bridge->physindev : NULL;
+ if (!nf_bridge || !skb->dev)
+ return NULL;
+
+ return dev_get_by_index_rcu(dev_net(skb->dev), nf_bridge->physindev_if);
}
static inline struct net_device *
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index a5ae952454c89..51e7cdf9b51c9 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -295,7 +295,7 @@ struct nf_bridge_info {
u8 bridged_dnat:1;
u8 sabotage_in_done:1;
__u16 frag_max_size;
- struct net_device *physindev;
+ int physindev_if;
/* always valid & non-NULL from FORWARD on, for physdev match */
struct net_device *physoutdev;
diff --git a/net/ipv4/netfilter/nf_reject_ipv4.c b/net/ipv4/netfilter/nf_reject_ipv4.c
index f01b038fc1cda..01b3eb169772e 100644
--- a/net/ipv4/netfilter/nf_reject_ipv4.c
+++ b/net/ipv4/netfilter/nf_reject_ipv4.c
@@ -289,7 +289,8 @@ void nf_send_reset(struct net *net, struct sock *sk, struct sk_buff *oldskb,
* build the eth header using the original destination's MAC as the
* source, and send the RST packet directly.
*/
- br_indev = nf_bridge_get_physindev(oldskb);
+ rcu_read_lock_bh();
+ br_indev = nf_bridge_get_physindev_rcu(oldskb);
if (br_indev) {
struct ethhdr *oeth = eth_hdr(oldskb);
@@ -297,12 +298,19 @@ void nf_send_reset(struct net *net, struct sock *sk, struct sk_buff *oldskb,
niph->tot_len = htons(nskb->len);
ip_send_check(niph);
if (dev_hard_header(nskb, nskb->dev, ntohs(nskb->protocol),
- oeth->h_source, oeth->h_dest, nskb->len) < 0)
+ oeth->h_source, oeth->h_dest, nskb->len) < 0) {
+ rcu_read_unlock_bh();
goto free_nskb;
+ }
dev_queue_xmit(nskb);
- } else
+ rcu_read_unlock_bh();
+ } else {
+ rcu_read_unlock_bh();
#endif
ip_local_out(net, nskb->sk, nskb);
+#if IS_ENABLED(CONFIG_BRIDGE_NETFILTER)
+ }
+#endif
return;
Does it sound good?
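
To make the intended behaviour concrete, here is a small userspace model
of the "store the index, look the device up at the point of use" idea.
It is only a sketch under my own assumptions: every model_* name is made
up for illustration, the fixed-size table stands in for the per-netns
device list, and no RCU or locking is modelled at all.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-ins only; none of these are kernel definitions. */
#define MODEL_MAX_DEV 8

struct model_dev {
	int ifindex;               /* 0 marks a free slot */
};

struct model_net {
	struct model_dev devs[MODEL_MAX_DEV];
};

struct model_bridge_info {
	int physindev_if;          /* ingress port index instead of a pointer */
};

/* Stand-in for an index lookup: returns NULL once the device is gone. */
static struct model_dev *model_dev_get_by_index(struct model_net *net, int ifindex)
{
	size_t i;

	if (ifindex <= 0)
		return NULL;
	for (i = 0; i < MODEL_MAX_DEV; i++)
		if (net->devs[i].ifindex == ifindex)
			return &net->devs[i];
	return NULL;
}

static struct model_dev *model_register(struct model_net *net, int ifindex)
{
	size_t i;

	assert(ifindex > 0);
	for (i = 0; i < MODEL_MAX_DEV; i++) {
		if (net->devs[i].ifindex == 0) {
			net->devs[i].ifindex = ifindex;
			return &net->devs[i];
		}
	}
	return NULL;               /* table full */
}

static void model_unregister(struct model_net *net, int ifindex)
{
	struct model_dev *dev = model_dev_get_by_index(net, ifindex);

	if (dev)
		dev->ifindex = 0;
}

/* Resolve the ingress device at the point of use. */
static struct model_dev *model_get_physindev(struct model_net *net,
					     const struct model_bridge_info *info)
{
	return info ? model_dev_get_by_index(net, info->physindev_if) : NULL;
}

/* Usage: the queued packet outlives its ingress device. Returns 0. */
static int model_demo(void)
{
	struct model_net net = { 0 };
	struct model_bridge_info info = { .physindev_if = 5 };

	assert(model_register(&net, 5) != NULL);
	assert(model_get_physindev(&net, &info) != NULL);
	model_unregister(&net, 5);
	assert(model_get_physindev(&net, &info) == NULL);
	return 0;
}
```

The point of the model is that a caller gets NULL after the device is
unregistered, instead of following a pointer that may already be freed,
so every user of the helper has to handle the NULL case.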
Or maybe, instead, we could add a physindev_if field alongside the
existing physindev, and call dev_get_by_index_rcu() only inside
br_nf_pre_routing_finish_bridge_slow() to double-check that the
->physindev link is still valid?
Sorry in advance if I'm missing anything obvious.
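
For the second variant, the check I have in mind could be modelled in
userspace roughly as below. Again this is a sketch and not kernel code:
the sketch_* names and the array indexed by ifindex are assumptions made
for illustration, and lifetime and locking are left out on purpose.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_IF 8

struct sketch_dev {
	int ifindex;
};

/* Table indexed by ifindex; a NULL entry means "no such device". */
struct sketch_net {
	struct sketch_dev *by_index[SKETCH_MAX_IF];
};

struct sketch_bridge_info {
	struct sketch_dev *physindev;  /* cached pointer, may be stale */
	int physindev_if;              /* index recorded with the pointer */
};

static struct sketch_dev *sketch_lookup(const struct sketch_net *net, int ifindex)
{
	if (ifindex <= 0 || ifindex >= SKETCH_MAX_IF)
		return NULL;
	return net->by_index[ifindex];
}

/* Hand out physindev only if the index still resolves to that same device. */
static struct sketch_dev *sketch_checked_physindev(const struct sketch_net *net,
						   const struct sketch_bridge_info *info)
{
	struct sketch_dev *cur;

	if (!info)
		return NULL;
	cur = sketch_lookup(net, info->physindev_if);
	return cur == info->physindev ? cur : NULL;
}

/* Usage: device present, then removed, then its index reused. Returns 0. */
static int sketch_demo(void)
{
	struct sketch_dev a = { .ifindex = 3 }, b = { .ifindex = 3 };
	struct sketch_net net = { 0 };
	struct sketch_bridge_info info = { .physindev = &a, .physindev_if = 3 };

	net.by_index[3] = &a;
	assert(sketch_checked_physindev(&net, &info) == &a);
	net.by_index[3] = NULL;        /* device went away */
	assert(sketch_checked_physindev(&net, &info) == NULL);
	net.by_index[3] = &b;          /* index reused by another device */
	assert(sketch_checked_physindev(&net, &info) == NULL);
	return 0;
}
```

One limitation of this variant is visible even in the model: the check
only compares addresses, so it cannot tell the original device from a
new one that happens to get both the same index and the same address.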
--
Best regards, Tikhomirov Pavel
Senior Software Developer, Virtuozzo.