netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net] net: loopback: Avoid sending IP packets without an Ethernet header
@ 2025-02-20  7:25 Ido Schimmel
  2025-02-20 10:40 ` Eric Dumazet
  2025-02-22  0:40 ` patchwork-bot+netdevbpf
  0 siblings, 2 replies; 4+ messages in thread
From: Ido Schimmel @ 2025-02-20  7:25 UTC (permalink / raw)
  To: netdev
  Cc: davem, kuba, pabeni, edumazet, andrew+netdev, maheshb, lucien.xin,
	fmei, Ido Schimmel

After commit 22600596b675 ("ipv4: give an IPv4 dev to blackhole_netdev")
IPv4 neighbors can be constructed on the blackhole net device, but they
are constructed with an output function (neigh_direct_output()) that
simply calls dev_queue_xmit(). The latter will transmit packets via
'skb->dev' which might not be the blackhole net device if dst_dev_put()
switched 'dst->dev' to the blackhole net device while another CPU was
using the dst entry in ip_output(), but after it already initialized
'skb->dev' from 'dst->dev'.

Specifically, the following can happen:

    CPU1                                      CPU2

udp_sendmsg(sk1)                          udp_sendmsg(sk2)
udp_send_skb()                            [...]
ip_output()
    skb->dev = skb_dst(skb)->dev
                                          dst_dev_put()
                                              dst->dev = blackhole_netdev
ip_finish_output2()
    resolves neigh on dst->dev
neigh_output()
neigh_direct_output()
dev_queue_xmit()

This will result in IPv4 packets being sent without an Ethernet header
via a valid net device:

tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on enp9s0, link-type EN10MB (Ethernet), snapshot length 262144 bytes
22:07:02.329668 20:00:40:11:18:fb > 45:00:00:44:f4:94, ethertype Unknown
(0x58c6), length 68:
        0x0000:  8dda 74ca f1ae ca6c ca6c 0098 969c 0400  ..t....l.l......
        0x0010:  0000 4730 3f18 6800 0000 0000 0000 9971  ..G0?.h........q
        0x0020:  c4c9 9055 a157 0a70 9ead bf83 38ca ab38  ...U.W.p....8..8
        0x0030:  8add ab96 e052                           .....R

Fix by making sure that neighbors are constructed on top of the
blackhole net device with an output function that simply consumes the
packets, in a similar fashion to dst_discard_out() and
blackhole_netdev_xmit().

Fixes: 8d7017fd621d ("blackhole_netdev: use blackhole_netdev to invalidate dst entries")
Fixes: 22600596b675 ("ipv4: give an IPv4 dev to blackhole_netdev")
Reported-by: Florian Meister <fmei@sfs.com>
Closes: https://lore.kernel.org/netdev/20250210084931.23a5c2e4@hermes.local/
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
---
 drivers/net/loopback.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/drivers/net/loopback.c b/drivers/net/loopback.c
index c8840c3b9a1b..f1d68153987e 100644
--- a/drivers/net/loopback.c
+++ b/drivers/net/loopback.c
@@ -244,8 +244,22 @@ static netdev_tx_t blackhole_netdev_xmit(struct sk_buff *skb,
 	return NETDEV_TX_OK;
 }
 
+static int blackhole_neigh_output(struct neighbour *n, struct sk_buff *skb)
+{
+	kfree_skb(skb);
+	return 0;
+}
+
+static int blackhole_neigh_construct(struct net_device *dev,
+				     struct neighbour *n)
+{
+	n->output = blackhole_neigh_output;
+	return 0;
+}
+
 static const struct net_device_ops blackhole_netdev_ops = {
 	.ndo_start_xmit = blackhole_netdev_xmit,
+	.ndo_neigh_construct = blackhole_neigh_construct,
 };
 
 /* This is a dst-dummy device used specifically for invalidated
-- 
2.48.1


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH net] net: loopback: Avoid sending IP packets without an Ethernet header
  2025-02-20  7:25 [PATCH net] net: loopback: Avoid sending IP packets without an Ethernet header Ido Schimmel
@ 2025-02-20 10:40 ` Eric Dumazet
  2025-02-20 15:38   ` Ido Schimmel
  2025-02-22  0:40 ` patchwork-bot+netdevbpf
  1 sibling, 1 reply; 4+ messages in thread
From: Eric Dumazet @ 2025-02-20 10:40 UTC (permalink / raw)
  To: Ido Schimmel
  Cc: netdev, davem, kuba, pabeni, andrew+netdev, maheshb, lucien.xin,
	fmei

On Thu, Feb 20, 2025 at 8:26 AM Ido Schimmel <idosch@nvidia.com> wrote:
>
> After commit 22600596b675 ("ipv4: give an IPv4 dev to blackhole_netdev")
> IPv4 neighbors can be constructed on the blackhole net device, but they
> are constructed with an output function (neigh_direct_output()) that
> simply calls dev_queue_xmit(). The latter will transmit packets via
> 'skb->dev' which might not be the blackhole net device if dst_dev_put()
> switched 'dst->dev' to the blackhole net device while another CPU was
> using the dst entry in ip_output(), but after it already initialized
> 'skb->dev' from 'dst->dev'.
>
> Specifically, the following can happen:
>
>     CPU1                                      CPU2
>
> udp_sendmsg(sk1)                          udp_sendmsg(sk2)
> udp_send_skb()                            [...]
> ip_output()
>     skb->dev = skb_dst(skb)->dev
>                                           dst_dev_put()
>                                               dst->dev = blackhole_netdev
> ip_finish_output2()
>     resolves neigh on dst->dev
> neigh_output()
> neigh_direct_output()
> dev_queue_xmit()
>
> This will result in IPv4 packets being sent without an Ethernet header
> via a valid net device:
>
> tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
> listening on enp9s0, link-type EN10MB (Ethernet), snapshot length 262144 bytes
> 22:07:02.329668 20:00:40:11:18:fb > 45:00:00:44:f4:94, ethertype Unknown
> (0x58c6), length 68:
>         0x0000:  8dda 74ca f1ae ca6c ca6c 0098 969c 0400  ..t....l.l......
>         0x0010:  0000 4730 3f18 6800 0000 0000 0000 9971  ..G0?.h........q
>         0x0020:  c4c9 9055 a157 0a70 9ead bf83 38ca ab38  ...U.W.p....8..8
>         0x0030:  8add ab96 e052                           .....R
>
> Fix by making sure that neighbors are constructed on top of the
> blackhole net device with an output function that simply consumes the
> packets, in a similar fashion to dst_discard_out() and
> blackhole_netdev_xmit().
>
> Fixes: 8d7017fd621d ("blackhole_netdev: use blackhole_netdev to invalidate dst entries")
> Fixes: 22600596b675 ("ipv4: give an IPv4 dev to blackhole_netdev")
> Reported-by: Florian Meister <fmei@sfs.com>
> Closes: https://lore.kernel.org/netdev/20250210084931.23a5c2e4@hermes.local/
> Signed-off-by: Ido Schimmel <idosch@nvidia.com>
> ---
>  drivers/net/loopback.c | 14 ++++++++++++++
>  1 file changed, 14 insertions(+)
>
> diff --git a/drivers/net/loopback.c b/drivers/net/loopback.c
> index c8840c3b9a1b..f1d68153987e 100644
> --- a/drivers/net/loopback.c
> +++ b/drivers/net/loopback.c
> @@ -244,8 +244,22 @@ static netdev_tx_t blackhole_netdev_xmit(struct sk_buff *skb,
>         return NETDEV_TX_OK;
>  }
>
> +static int blackhole_neigh_output(struct neighbour *n, struct sk_buff *skb)
> +{
> +       kfree_skb(skb);

If there is any risk of this being hit often, I would probably use the
recent SKB_DROP_REASON_BLACKHOLE

(feel free to resubmit
https://lore.kernel.org/netdev/20250212164323.2183023-1-edumazet@google.com/T/#mbb8d4b0779cb8f0654a382772c943af5389606ea
?)

Otherwise, this looks good to me, thanks !

Reviewed-by: Eric Dumazet <edumazet@google.com>

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH net] net: loopback: Avoid sending IP packets without an Ethernet header
  2025-02-20 10:40 ` Eric Dumazet
@ 2025-02-20 15:38   ` Ido Schimmel
  0 siblings, 0 replies; 4+ messages in thread
From: Ido Schimmel @ 2025-02-20 15:38 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: netdev, davem, kuba, pabeni, andrew+netdev, maheshb, lucien.xin,
	fmei

On Thu, Feb 20, 2025 at 11:40:07AM +0100, Eric Dumazet wrote:
> On Thu, Feb 20, 2025 at 8:26 AM Ido Schimmel <idosch@nvidia.com> wrote:
> > diff --git a/drivers/net/loopback.c b/drivers/net/loopback.c
> > index c8840c3b9a1b..f1d68153987e 100644
> > --- a/drivers/net/loopback.c
> > +++ b/drivers/net/loopback.c
> > @@ -244,8 +244,22 @@ static netdev_tx_t blackhole_netdev_xmit(struct sk_buff *skb,
> >         return NETDEV_TX_OK;
> >  }
> >
> > +static int blackhole_neigh_output(struct neighbour *n, struct sk_buff *skb)
> > +{
> > +       kfree_skb(skb);
> 
> If there is any risk of this being hit often, I would probably use the
> recent SKB_DROP_REASON_BLACKHOLE

Not very often. About 10 times while running the reproducer I shared
here:
https://lore.kernel.org/netdev/Z7D9cR22BDPN7WSJ@shredder/

In line with the original report:
https://github.com/siderolabs/talos/issues/9837#issuecomment-2642116378

> (feel free to resubmit
> https://lore.kernel.org/netdev/20250212164323.2183023-1-edumazet@google.com/T/#mbb8d4b0779cb8f0654a382772c943af5389606ea
> ?)

Can we do it in net-next?

A few questions / suggestions regarding the patch:

1. Can we use it for IPv4 as well? I tested the following:

diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index 753704f75b2c..2aeab70c1cb5 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -966,6 +966,7 @@ static int ip_error(struct sk_buff *skb)
 
 	switch (rt->dst.error) {
 	case EINVAL:
+		SKB_DR_SET(reason, BLACKHOLE);
 	default:
 		goto out;
 	case EHOSTUNREACH:

2. Given that this reason is going to be triggered both for
user-installed blackhole routes and dst entries being destroyed how
about adjusting the comment? Otherwise I think it will be confusing for
users who didn't install a blackhole route. Something like:

diff --git a/drivers/net/loopback.c b/drivers/net/loopback.c
index f1d68153987e..cb269b3251d4 100644
--- a/drivers/net/loopback.c
+++ b/drivers/net/loopback.c
@@ -239,14 +239,13 @@ struct pernet_operations __net_initdata loopback_net_ops = {
 static netdev_tx_t blackhole_netdev_xmit(struct sk_buff *skb,
 					 struct net_device *dev)
 {
-	kfree_skb(skb);
-	net_warn_ratelimited("%s(): Dropping skb.\n", __func__);
+	kfree_skb_reason(skb, SKB_DROP_REASON_BLACKHOLE);
 	return NETDEV_TX_OK;
 }
 
 static int blackhole_neigh_output(struct neighbour *n, struct sk_buff *skb)
 {
-	kfree_skb(skb);
+	kfree_skb_reason(skb, SKB_DROP_REASON_BLACKHOLE);
 	return 0;
 }
 
diff --git a/include/net/dropreason-core.h b/include/net/dropreason-core.h
index de42577f16dd..0ef6869dbd1b 100644
--- a/include/net/dropreason-core.h
+++ b/include/net/dropreason-core.h
@@ -556,7 +556,8 @@ enum skb_drop_reason {
 	 */
 	SKB_DROP_REASON_BRIDGE_INGRESS_STP_STATE,
 	/**
-	 * @SKB_DROP_REASON_BLACKHOLE: blackhole route.
+	 * @SKB_DROP_REASON_BLACKHOLE: blackhole route or dst entry being
+	 * destroyed.
 	 */
 	SKB_DROP_REASON_BLACKHOLE,
 	/**

> Otherwise, this looks good to me, thanks !
> 
> Reviewed-by: Eric Dumazet <edumazet@google.com>

Thanks!

^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH net] net: loopback: Avoid sending IP packets without an Ethernet header
  2025-02-20  7:25 [PATCH net] net: loopback: Avoid sending IP packets without an Ethernet header Ido Schimmel
  2025-02-20 10:40 ` Eric Dumazet
@ 2025-02-22  0:40 ` patchwork-bot+netdevbpf
  1 sibling, 0 replies; 4+ messages in thread
From: patchwork-bot+netdevbpf @ 2025-02-22  0:40 UTC (permalink / raw)
  To: Ido Schimmel
  Cc: netdev, davem, kuba, pabeni, edumazet, andrew+netdev, maheshb,
	lucien.xin, fmei

Hello:

This patch was applied to netdev/net.git (main)
by Jakub Kicinski <kuba@kernel.org>:

On Thu, 20 Feb 2025 09:25:59 +0200 you wrote:
> After commit 22600596b675 ("ipv4: give an IPv4 dev to blackhole_netdev")
> IPv4 neighbors can be constructed on the blackhole net device, but they
> are constructed with an output function (neigh_direct_output()) that
> simply calls dev_queue_xmit(). The latter will transmit packets via
> 'skb->dev' which might not be the blackhole net device if dst_dev_put()
> switched 'dst->dev' to the blackhole net device while another CPU was
> using the dst entry in ip_output(), but after it already initialized
> 'skb->dev' from 'dst->dev'.
> 
> [...]

Here is the summary with links:
  - [net] net: loopback: Avoid sending IP packets without an Ethernet header
    https://git.kernel.org/netdev/net/c/0e4427f8f587

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2025-02-22  0:40 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-02-20  7:25 [PATCH net] net: loopback: Avoid sending IP packets without an Ethernet header Ido Schimmel
2025-02-20 10:40 ` Eric Dumazet
2025-02-20 15:38   ` Ido Schimmel
2025-02-22  0:40 ` patchwork-bot+netdevbpf

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).