* [PATCH net] net: loopback: Avoid sending IP packets without an Ethernet header
From: Ido Schimmel @ 2025-02-20 7:25 UTC (permalink / raw)
To: netdev
Cc: davem, kuba, pabeni, edumazet, andrew+netdev, maheshb, lucien.xin,
fmei, Ido Schimmel
After commit 22600596b675 ("ipv4: give an IPv4 dev to blackhole_netdev")
IPv4 neighbors can be constructed on the blackhole net device, but they
are constructed with an output function (neigh_direct_output()) that
simply calls dev_queue_xmit(). The latter will transmit packets via
'skb->dev' which might not be the blackhole net device if dst_dev_put()
switched 'dst->dev' to the blackhole net device while another CPU was
using the dst entry in ip_output(), but after it already initialized
'skb->dev' from 'dst->dev'.
Specifically, the following can happen:
CPU1                                    CPU2

udp_sendmsg(sk1)                        udp_sendmsg(sk2)
udp_send_skb()                          [...]
ip_output()
 skb->dev = skb_dst(skb)->dev
                                        dst_dev_put()
                                         dst->dev = blackhole_netdev
ip_finish_output2()
 resolves neigh on dst->dev
 neigh_output()
  neigh_direct_output()
   dev_queue_xmit()
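The interleaving above can be replayed deterministically in userspace. The sketch below is only an illustration under stated assumptions: the structs are hypothetical stand-ins carrying just the fields the description mentions, and direct_output() and replay_race() are made-up names that model neigh_direct_output() and the CPU1/CPU2 ordering. It is not kernel code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins for the kernel structures. */
struct net_device { const char *name; int frames_sent; };
struct dst_entry  { struct net_device *dev; };
struct sk_buff    { struct net_device *dev; struct dst_entry *dst; };

/* Models neigh_direct_output() -> dev_queue_xmit(): transmit on skb->dev
 * without adding any link-layer header. */
static struct net_device *direct_output(struct sk_buff *skb)
{
	assert(skb->dev != NULL);
	skb->dev->frames_sent++;
	return skb->dev;
}

/* Replays the interleaving in program order and returns the device that
 * ended up transmitting the packet. */
static struct net_device *replay_race(struct net_device *real,
				      struct net_device *blackhole)
{
	struct dst_entry dst = { .dev = real };
	struct sk_buff skb = { .dev = NULL, .dst = &dst };

	/* CPU1, ip_output(): snapshot the device into the skb. */
	skb.dev = skb.dst->dev;

	/* CPU2, dst_dev_put(): the dst now points at the blackhole device. */
	dst.dev = blackhole;

	/* CPU1, ip_finish_output2(): the neighbour is resolved on dst->dev,
	 * which is now the blackhole device, but the output function still
	 * transmits through the stale skb->dev. */
	assert(skb.dst->dev == blackhole);
	return direct_output(&skb);
}
```

Calling replay_race() with a "real" device and a "blackhole" device returns the real device: the packet leaves through the device captured in skb->dev even though the neighbour was resolved on the blackhole device.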
This will result in IPv4 packets being sent without an Ethernet header
via a valid net device:
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on enp9s0, link-type EN10MB (Ethernet), snapshot length 262144 bytes
22:07:02.329668 20:00:40:11:18:fb > 45:00:00:44:f4:94, ethertype Unknown
(0x58c6), length 68:
0x0000: 8dda 74ca f1ae ca6c ca6c 0098 969c 0400 ..t....l.l......
0x0010: 0000 4730 3f18 6800 0000 0000 0000 9971 ..G0?.h........q
0x0020: c4c9 9055 a157 0a70 9ead bf83 38ca ab38 ...U.W.p....8..8
0x0030: 8add ab96 e052 .....R
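The bytes that tcpdump labels as MAC addresses and ethertype in the capture above are consistent with the start of an IPv4 header. A minimal sketch to check this, assuming the usual Ethernet wire order (destination MAC, source MAC, ethertype); the struct and function names are illustrative only:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The IPv4 header fields that fit in the first 12 bytes. */
struct ipv4_prefix {
	unsigned version, ihl_words, total_len, id, ttl, protocol;
};

/* First 14 bytes on the wire, copied from the capture. */
static const uint8_t frame_start[14] = {
	0x45, 0x00, 0x00, 0x44, 0xf4, 0x94,	/* printed as destination MAC */
	0x20, 0x00, 0x40, 0x11, 0x18, 0xfb,	/* printed as source MAC */
	0x58, 0xc6,				/* printed as ethertype */
};

/* Interpret the buffer as the beginning of an IPv4 header. */
static struct ipv4_prefix parse_ipv4_prefix(const uint8_t *p, size_t len)
{
	struct ipv4_prefix h;

	assert(len >= 12);
	h.version   = p[0] >> 4;
	h.ihl_words = p[0] & 0x0f;
	h.total_len = ((unsigned)p[2] << 8) | p[3];
	h.id        = ((unsigned)p[4] << 8) | p[5];
	h.ttl       = p[8];
	h.protocol  = p[9];
	return h;
}
```

parse_ipv4_prefix(frame_start, sizeof(frame_start)) yields version 4, a header length of 5 words, total length 68 (the same as the frame length tcpdump reports), TTL 64 and protocol 17 (UDP).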
Fix by making sure that neighbors are constructed on top of the
blackhole net device with an output function that simply consumes the
packets, in a similar fashion to dst_discard_out() and
blackhole_netdev_xmit().
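The shape of the fix can be modelled in userspace as a per-device hook that overrides a neighbour's default output function. This is a sketch under assumptions: neigh_create() is a hypothetical simplification of neighbour creation (default output first, then the device hook if present), and all types are reduced stand-ins rather than the kernel's definitions.

```c
#include <assert.h>
#include <stddef.h>

struct sk_buff { int freed; int transmitted; };
struct neighbour;
typedef int (*neigh_output_fn)(struct neighbour *, struct sk_buff *);
struct neighbour { neigh_output_fn output; };

struct net_device;
struct net_device_ops {
	int (*neigh_construct)(struct net_device *, struct neighbour *);
};
struct net_device { const struct net_device_ops *ops; };

/* Models neigh_direct_output(): hand the packet to the transmit path. */
static int direct_output(struct neighbour *n, struct sk_buff *skb)
{
	(void)n;
	skb->transmitted = 1;
	return 0;
}

/* Models blackhole_neigh_output(): consume the packet. */
static int discard_output(struct neighbour *n, struct sk_buff *skb)
{
	(void)n;
	skb->freed = 1;
	return 0;
}

/* Models blackhole_neigh_construct(): install the discarding output. */
static int blackhole_construct(struct net_device *dev, struct neighbour *n)
{
	(void)dev;
	n->output = discard_output;
	return 0;
}

static const struct net_device_ops plain_ops = { .neigh_construct = NULL };
static const struct net_device_ops blackhole_ops = {
	.neigh_construct = blackhole_construct,
};

/* Hypothetical neighbour creation: default output, then device override. */
static int neigh_create(struct net_device *dev, struct neighbour *n)
{
	assert(dev && dev->ops && n);
	n->output = direct_output;
	if (dev->ops->neigh_construct)
		return dev->ops->neigh_construct(dev, n);
	return 0;
}
```

With this model, a neighbour created on a device using blackhole_ops frees every packet passed to n->output(), while a neighbour on a device using plain_ops still transmits. The stale skb->dev never reaches the transmit path for blackhole neighbours.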
Fixes: 8d7017fd621d ("blackhole_netdev: use blackhole_netdev to invalidate dst entries")
Fixes: 22600596b675 ("ipv4: give an IPv4 dev to blackhole_netdev")
Reported-by: Florian Meister <fmei@sfs.com>
Closes: https://lore.kernel.org/netdev/20250210084931.23a5c2e4@hermes.local/
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
---
drivers/net/loopback.c | 14 ++++++++++++++
1 file changed, 14 insertions(+)
diff --git a/drivers/net/loopback.c b/drivers/net/loopback.c
index c8840c3b9a1b..f1d68153987e 100644
--- a/drivers/net/loopback.c
+++ b/drivers/net/loopback.c
@@ -244,8 +244,22 @@ static netdev_tx_t blackhole_netdev_xmit(struct sk_buff *skb,
 	return NETDEV_TX_OK;
 }

+static int blackhole_neigh_output(struct neighbour *n, struct sk_buff *skb)
+{
+	kfree_skb(skb);
+	return 0;
+}
+
+static int blackhole_neigh_construct(struct net_device *dev,
+				     struct neighbour *n)
+{
+	n->output = blackhole_neigh_output;
+	return 0;
+}
+
 static const struct net_device_ops blackhole_netdev_ops = {
 	.ndo_start_xmit = blackhole_netdev_xmit,
+	.ndo_neigh_construct = blackhole_neigh_construct,
 };

 /* This is a dst-dummy device used specifically for invalidated
--
2.48.1
* Re: [PATCH net] net: loopback: Avoid sending IP packets without an Ethernet header
From: Eric Dumazet @ 2025-02-20 10:40 UTC (permalink / raw)
To: Ido Schimmel
Cc: netdev, davem, kuba, pabeni, andrew+netdev, maheshb, lucien.xin,
fmei
On Thu, Feb 20, 2025 at 8:26 AM Ido Schimmel <idosch@nvidia.com> wrote:
>
> After commit 22600596b675 ("ipv4: give an IPv4 dev to blackhole_netdev")
> IPv4 neighbors can be constructed on the blackhole net device, but they
> are constructed with an output function (neigh_direct_output()) that
> simply calls dev_queue_xmit(). The latter will transmit packets via
> 'skb->dev' which might not be the blackhole net device if dst_dev_put()
> switched 'dst->dev' to the blackhole net device while another CPU was
> using the dst entry in ip_output(), but after it already initialized
> 'skb->dev' from 'dst->dev'.
>
> Specifically, the following can happen:
>
> CPU1                                    CPU2
>
> udp_sendmsg(sk1)                        udp_sendmsg(sk2)
> udp_send_skb()                          [...]
> ip_output()
>  skb->dev = skb_dst(skb)->dev
>                                         dst_dev_put()
>                                          dst->dev = blackhole_netdev
> ip_finish_output2()
>  resolves neigh on dst->dev
>  neigh_output()
>   neigh_direct_output()
>    dev_queue_xmit()
>
> This will result in IPv4 packets being sent without an Ethernet header
> via a valid net device:
>
> tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
> listening on enp9s0, link-type EN10MB (Ethernet), snapshot length 262144 bytes
> 22:07:02.329668 20:00:40:11:18:fb > 45:00:00:44:f4:94, ethertype Unknown
> (0x58c6), length 68:
> 0x0000: 8dda 74ca f1ae ca6c ca6c 0098 969c 0400 ..t....l.l......
> 0x0010: 0000 4730 3f18 6800 0000 0000 0000 9971 ..G0?.h........q
> 0x0020: c4c9 9055 a157 0a70 9ead bf83 38ca ab38 ...U.W.p....8..8
> 0x0030: 8add ab96 e052 .....R
>
> Fix by making sure that neighbors are constructed on top of the
> blackhole net device with an output function that simply consumes the
> packets, in a similar fashion to dst_discard_out() and
> blackhole_netdev_xmit().
>
> Fixes: 8d7017fd621d ("blackhole_netdev: use blackhole_netdev to invalidate dst entries")
> Fixes: 22600596b675 ("ipv4: give an IPv4 dev to blackhole_netdev")
> Reported-by: Florian Meister <fmei@sfs.com>
> Closes: https://lore.kernel.org/netdev/20250210084931.23a5c2e4@hermes.local/
> Signed-off-by: Ido Schimmel <idosch@nvidia.com>
> ---
> drivers/net/loopback.c | 14 ++++++++++++++
> 1 file changed, 14 insertions(+)
>
> diff --git a/drivers/net/loopback.c b/drivers/net/loopback.c
> index c8840c3b9a1b..f1d68153987e 100644
> --- a/drivers/net/loopback.c
> +++ b/drivers/net/loopback.c
> @@ -244,8 +244,22 @@ static netdev_tx_t blackhole_netdev_xmit(struct sk_buff *skb,
>  	return NETDEV_TX_OK;
>  }
>
> +static int blackhole_neigh_output(struct neighbour *n, struct sk_buff *skb)
> +{
> +	kfree_skb(skb);
If there is any risk of this being hit often, I would probably use the
recent SKB_DROP_REASON_BLACKHOLE
(feel free to resubmit
https://lore.kernel.org/netdev/20250212164323.2183023-1-edumazet@google.com/T/#mbb8d4b0779cb8f0654a382772c943af5389606ea
?)
Otherwise, this looks good to me, thanks !
Reviewed-by: Eric Dumazet <edumazet@google.com>
* Re: [PATCH net] net: loopback: Avoid sending IP packets without an Ethernet header
From: Ido Schimmel @ 2025-02-20 15:38 UTC (permalink / raw)
To: Eric Dumazet
Cc: netdev, davem, kuba, pabeni, andrew+netdev, maheshb, lucien.xin,
fmei
On Thu, Feb 20, 2025 at 11:40:07AM +0100, Eric Dumazet wrote:
> On Thu, Feb 20, 2025 at 8:26 AM Ido Schimmel <idosch@nvidia.com> wrote:
> > diff --git a/drivers/net/loopback.c b/drivers/net/loopback.c
> > index c8840c3b9a1b..f1d68153987e 100644
> > --- a/drivers/net/loopback.c
> > +++ b/drivers/net/loopback.c
> > @@ -244,8 +244,22 @@ static netdev_tx_t blackhole_netdev_xmit(struct sk_buff *skb,
> >  	return NETDEV_TX_OK;
> >  }
> >
> > +static int blackhole_neigh_output(struct neighbour *n, struct sk_buff *skb)
> > +{
> > +	kfree_skb(skb);
>
> If there is any risk of this being hit often, I would probably use the
> recent SKB_DROP_REASON_BLACKHOLE
Not very often. About 10 times while running the reproducer I shared
here:
https://lore.kernel.org/netdev/Z7D9cR22BDPN7WSJ@shredder/
In line with the original report:
https://github.com/siderolabs/talos/issues/9837#issuecomment-2642116378
> (feel free to resubmit
> https://lore.kernel.org/netdev/20250212164323.2183023-1-edumazet@google.com/T/#mbb8d4b0779cb8f0654a382772c943af5389606ea
> ?)
Can we do it in net-next?
A few questions / suggestions regarding the patch:
1. Can we use it for IPv4 as well? I tested the following:
diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index 753704f75b2c..2aeab70c1cb5 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -966,6 +966,7 @@ static int ip_error(struct sk_buff *skb)

 	switch (rt->dst.error) {
 	case EINVAL:
+		SKB_DR_SET(reason, BLACKHOLE);
 	default:
 		goto out;
 	case EHOSTUNREACH:
2. Given that this reason is going to be triggered both for
user-installed blackhole routes and for dst entries being destroyed, how
about adjusting the comment? Otherwise I think it will be confusing for
users who didn't install a blackhole route. Something like:
diff --git a/drivers/net/loopback.c b/drivers/net/loopback.c
index f1d68153987e..cb269b3251d4 100644
--- a/drivers/net/loopback.c
+++ b/drivers/net/loopback.c
@@ -239,14 +239,13 @@ struct pernet_operations __net_initdata loopback_net_ops = {
 static netdev_tx_t blackhole_netdev_xmit(struct sk_buff *skb,
 					 struct net_device *dev)
 {
-	kfree_skb(skb);
-	net_warn_ratelimited("%s(): Dropping skb.\n", __func__);
+	kfree_skb_reason(skb, SKB_DROP_REASON_BLACKHOLE);
 	return NETDEV_TX_OK;
 }

 static int blackhole_neigh_output(struct neighbour *n, struct sk_buff *skb)
 {
-	kfree_skb(skb);
+	kfree_skb_reason(skb, SKB_DROP_REASON_BLACKHOLE);
 	return 0;
 }

diff --git a/include/net/dropreason-core.h b/include/net/dropreason-core.h
index de42577f16dd..0ef6869dbd1b 100644
--- a/include/net/dropreason-core.h
+++ b/include/net/dropreason-core.h
@@ -556,7 +556,8 @@ enum skb_drop_reason {
 	 */
 	SKB_DROP_REASON_BRIDGE_INGRESS_STP_STATE,
 	/**
-	 * @SKB_DROP_REASON_BLACKHOLE: blackhole route.
+	 * @SKB_DROP_REASON_BLACKHOLE: blackhole route or dst entry being
+	 * destroyed.
 	 */
 	SKB_DROP_REASON_BLACKHOLE,
 	/**
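The effect of the two suggestions above can be sketched in userspace: drops are accounted under a named reason, and the EINVAL case sets the reason before falling through to the common exit. Everything here is a hypothetical model. The enum, the counter array and both function names are made up for illustration, and the other cases of the real switch are omitted.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical, reduced stand-in for enum skb_drop_reason. */
enum drop_reason { DROP_NOT_SPECIFIED, DROP_BLACKHOLE, DROP_REASON_MAX };

static unsigned long drop_count[DROP_REASON_MAX];

/* Stand-in for kfree_skb_reason(): account the drop under its reason. */
static void drop_with_reason(enum drop_reason reason)
{
	assert(reason < DROP_REASON_MAX);
	drop_count[reason]++;
}

/* Mirrors the shape of the proposed ip_error() hunk: EINVAL sets the
 * reason and falls through to the default exit; errors not listed here
 * keep the unspecified reason. */
static enum drop_reason reason_for_route_error(int error)
{
	enum drop_reason reason = DROP_NOT_SPECIFIED;

	switch (error) {
	case EINVAL:
		reason = DROP_BLACKHOLE;
		/* fall through */
	default:
		break;
	}
	return reason;
}
```

In this model drop_with_reason(reason_for_route_error(EINVAL)) increments only the blackhole counter, so tooling that reads per-reason counts can tell these drops apart from unspecified ones.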
> Otherwise, this looks good to me, thanks !
>
> Reviewed-by: Eric Dumazet <edumazet@google.com>
Thanks!
* Re: [PATCH net] net: loopback: Avoid sending IP packets without an Ethernet header
From: patchwork-bot+netdevbpf @ 2025-02-22 0:40 UTC (permalink / raw)
To: Ido Schimmel
Cc: netdev, davem, kuba, pabeni, edumazet, andrew+netdev, maheshb,
lucien.xin, fmei
Hello:
This patch was applied to netdev/net.git (main)
by Jakub Kicinski <kuba@kernel.org>:
On Thu, 20 Feb 2025 09:25:59 +0200 you wrote:
> After commit 22600596b675 ("ipv4: give an IPv4 dev to blackhole_netdev")
> IPv4 neighbors can be constructed on the blackhole net device, but they
> are constructed with an output function (neigh_direct_output()) that
> simply calls dev_queue_xmit(). The latter will transmit packets via
> 'skb->dev' which might not be the blackhole net device if dst_dev_put()
> switched 'dst->dev' to the blackhole net device while another CPU was
> using the dst entry in ip_output(), but after it already initialized
> 'skb->dev' from 'dst->dev'.
>
> [...]
Here is the summary with links:
- [net] net: loopback: Avoid sending IP packets without an Ethernet header
https://git.kernel.org/netdev/net/c/0e4427f8f587
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html