From: "Henrik Lindström" <lindstrom515@gmail.com>
To: Florian Westphal <fw@strlen.de>
Cc: davem@davemloft.net, edumazet@google.com, kuba@kernel.org,
	pabeni@redhat.com, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: Re: macvtap performs IP defragmentation, causing MTU problems for virtual machines
Date: Mon, 02 Oct 2023 20:49:36 +0200
Message-ID: <2197902.NgBsaNRSFp@pc>
In-Reply-To: <20231002092010.GA30843@breakpoint.cc>

Had to change "return 0" to "return vif" but other than that your changes
seem to work, even with macvlan defragmentation removed.

I tested it by sending 8K fragmented multicast packets, with 5 macvlans on
the receiving side. I consistently received 6 copies of the packet (1 from the
real interface and 1 per macvlan). While doing this I had my VM running with
a macvtap, and it was receiving fragmented packets as expected.

Here are the changes I was testing with. This is my first time sending a diff
over mail, so I hope it works :-)

diff --git a/drivers/net/macvlan.c b/drivers/net/macvlan.c
index 02bd201bc7e5..5f638433cef9 100644
--- a/drivers/net/macvlan.c
+++ b/drivers/net/macvlan.c
@@ -462,10 +462,6 @@ static rx_handler_result_t macvlan_handle_frame(struct sk_buff **pskb)
 	if (is_multicast_ether_addr(eth->h_dest)) {
 		unsigned int hash;
 
-		skb = ip_check_defrag(dev_net(skb->dev), skb, IP_DEFRAG_MACVLAN);
-		if (!skb)
-			return RX_HANDLER_CONSUMED;
-		*pskb = skb;
 		eth = eth_hdr(skb);
 		if (macvlan_forward_source(skb, port, eth->h_source)) {
 			kfree_skb(skb);
diff --git a/net/ipv4/ip_fragment.c b/net/ipv4/ip_fragment.c
index a4941f53b523..30b822dfa222 100644
--- a/net/ipv4/ip_fragment.c
+++ b/net/ipv4/ip_fragment.c
@@ -479,11 +479,29 @@ static int ip_frag_reasm(struct ipq *qp, struct sk_buff *skb,
 	return err;
 }
 
+static int ip_defrag_vif(const struct sk_buff *skb, const struct net_device *dev)
+{
+	int vif = l3mdev_master_ifindex_rcu(dev);
+
+	if (vif)
+		return vif;
+
+	/* some folks insist that receiving a fragmented mcast dgram on n devices shall
+	 * result in n defragmented packets.
+	 */
+	if (skb->pkt_type == PACKET_BROADCAST || skb->pkt_type == PACKET_MULTICAST) {
+		if (dev)
+			vif = dev->ifindex;
+	}
+
+	return vif;
+}
+
 /* Process an incoming IP datagram fragment. */
 int ip_defrag(struct net *net, struct sk_buff *skb, u32 user)
 {
 	struct net_device *dev = skb->dev ? : skb_dst(skb)->dev;
-	int vif = l3mdev_master_ifindex_rcu(dev);
+	int vif = ip_defrag_vif(skb, dev);
 	struct ipq *qp;
 
 	__IP_INC_STATS(net, IPSTATS_MIB_REASMREQDS);
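
To illustrate why the test above yields 6 copies, here is a minimal user-space
sketch of the vif selection in ip_defrag_vif(). It is not kernel code: the
model_* types and functions are hypothetical stand-ins, and the l3_master
field stands in for the result of l3mdev_master_ifindex_rcu(). It assumes that
all other fields of the reassembly key are identical, so that each distinct
vif value corresponds to one reassembly queue.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for skb->pkt_type values; not kernel API. */
enum model_pkt_type { MODEL_HOST, MODEL_BROADCAST, MODEL_MULTICAST };

/* Hypothetical stand-in for the parts of struct net_device used here. */
struct model_dev {
	int ifindex;	/* index of the receiving device */
	int l3_master;	/* ifindex of the L3 master device, 0 if none */
};

/* Mirrors the control flow of ip_defrag_vif() from the diff. */
static int model_defrag_vif(enum model_pkt_type type,
			    const struct model_dev *dev)
{
	int vif = dev ? dev->l3_master : 0;

	if (vif)
		return vif;

	/* Multicast and broadcast fragments are keyed per receiving device. */
	if (type == MODEL_BROADCAST || type == MODEL_MULTICAST) {
		if (dev)
			vif = dev->ifindex;
	}

	return vif;
}

/* Count distinct vif values over a set of receiving devices, i.e. how
 * many separate queues the same fragment would be sorted into under the
 * assumption stated above.
 */
static size_t model_count_queues(enum model_pkt_type type,
				 const struct model_dev *devs, size_t n)
{
	size_t distinct = 0;

	assert(devs != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		int vif = model_defrag_vif(type, &devs[i]);
		bool seen = false;

		for (size_t j = 0; j < i; j++) {
			if (model_defrag_vif(type, &devs[j]) == vif) {
				seen = true;
				break;
			}
		}
		if (!seen)
			distinct++;
	}

	return distinct;
}
```

With one real interface and 5 macvlans, each with its own ifindex and no L3
master, model_count_queues() returns 6 for MODEL_MULTICAST. That matches the 6
copies seen in the test. For MODEL_HOST it returns 1, because every device
maps to vif 0.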



