netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net-next 0/8] ipvlan: Implement learnable L2-bridge
@ 2025-10-21 14:44 Dmitry Skorodumov
  2025-10-21 14:44 ` [PATCH net-next 1/8] " Dmitry Skorodumov
                   ` (7 more replies)
  0 siblings, 8 replies; 16+ messages in thread
From: Dmitry Skorodumov @ 2025-10-21 14:44 UTC (permalink / raw)
  To: netdev; +Cc: andrey.bokhanko, Dmitry Skorodumov

Make it is possible to create link in L2E mode: learnable
bridge. The IPs will be learned from TX-packets of child interfaces.

Also, dev_add_pack() protocol is attached to the main port
to support communication from main to child interfaces.

This mode is intended for the desktop virtual machines, for
bridging to Wireless interfaces.

The mode should be specified while creating first child interface.
It is not possible to change it after this.

This functionality is quite often requested by users.

Dmitry Skorodumov (8):
  ipvlan: Implement learnable L2-bridge
  ipvlan: Send mcasts out directly in ipvlan_xmit_mode_l2()
  ipvlan: Handle rx mcast-ip and unicast eth
  ipvlan: Added some kind of MAC SNAT
  ipvlan: Forget all IP when device goes down
  ipvlan: Support GSO for port -> ipvlan
  ipvlan: Support IPv6 for learnable l2-bridge
  ipvlan: Don't learn child with host-ip

 Documentation/networking/ipvlan.rst |  11 +
 drivers/net/ipvlan/ipvlan.h         |  26 ++
 drivers/net/ipvlan/ipvlan_core.c    | 488 +++++++++++++++++++++++++---
 drivers/net/ipvlan/ipvlan_main.c    | 219 +++++++++++--
 include/uapi/linux/if_link.h        |   1 +
 5 files changed, 659 insertions(+), 86 deletions(-)

-- 
2.25.1


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH net-next 1/8] ipvlan: Implement learnable L2-bridge
  2025-10-21 14:44 [PATCH net-next 0/8] ipvlan: Implement learnable L2-bridge Dmitry Skorodumov
@ 2025-10-21 14:44 ` Dmitry Skorodumov
  2025-10-22 14:23   ` Simon Horman
  2025-10-21 14:44 ` [PATCH net-next 2/8] ipvlan: Send mcasts out directly in ipvlan_xmit_mode_l2() Dmitry Skorodumov
                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 16+ messages in thread
From: Dmitry Skorodumov @ 2025-10-21 14:44 UTC (permalink / raw)
  To: netdev, Simon Horman, linux-doc, linux-kernel
  Cc: andrey.bokhanko, Dmitry Skorodumov, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Jonathan Corbet, Andrew Lunn

Now it is possible to create link in L2E mode: learnable
bridge. The IPs will be learned from TX-packets of child interfaces.

Also, dev_add_pack() protocol is attached to the main port
to support communication from main to child interfaces.

This mode is intended for the desktop virtual machines, for
bridging to Wireless interfaces.

The mode should be specified while creating first child interface.
It is not possible to change it after this.

Signed-off-by: Dmitry Skorodumov <skorodumov.dmitry@huawei.com>
---
 Documentation/networking/ipvlan.rst |  11 ++
 drivers/net/ipvlan/ipvlan.h         |  21 ++++
 drivers/net/ipvlan/ipvlan_core.c    | 163 +++++++++++++++++++++++++---
 drivers/net/ipvlan/ipvlan_main.c    | 140 +++++++++++++++++++++---
 include/uapi/linux/if_link.h        |   1 +
 5 files changed, 301 insertions(+), 35 deletions(-)

diff --git a/Documentation/networking/ipvlan.rst b/Documentation/networking/ipvlan.rst
index 895d0ccfd596..9539e8ac99f4 100644
--- a/Documentation/networking/ipvlan.rst
+++ b/Documentation/networking/ipvlan.rst
@@ -90,6 +90,17 @@ works in this mode and hence it is L3-symmetric (L3s). This will have slightly l
 performance but that shouldn't matter since you are choosing this mode over plain-L3
 mode to make conn-tracking work.
 
+4.4 L2E mode:
+-------------
+
+This mode is an extension for the L2 mode. It is primarily intended for
+desktop virtual machines for bridging to Wireless interfaces. In plain L2
+mode you have to configure IPs on slave interface to make it possible
+mux-ing frames between slaves/master. In the L2E mode, ipvlan will
+learn itself IPv4/IPv6 address from outgoing packets. Moreover,
+the dev_add_pack() is configured on master interface to capture
+outgoing frames and mux-ing it to slave interfaces, if needed.
+
 5. Mode flags:
 ==============
 
diff --git a/drivers/net/ipvlan/ipvlan.h b/drivers/net/ipvlan/ipvlan.h
index 50de3ee204db..020e80df1e38 100644
--- a/drivers/net/ipvlan/ipvlan.h
+++ b/drivers/net/ipvlan/ipvlan.h
@@ -91,6 +91,7 @@ struct ipvl_port {
 	possible_net_t		pnet;
 	struct hlist_head	hlhead[IPVLAN_HASH_SIZE];
 	struct list_head	ipvlans;
+	struct packet_type	ipvl_ptype;
 	u16			mode;
 	u16			flags;
 	u16			dev_id_start;
@@ -103,6 +104,7 @@ struct ipvl_port {
 
 struct ipvl_skb_cb {
 	bool tx_pkt;
+	void *mark;
 };
 #define IPVL_SKB_CB(_skb) ((struct ipvl_skb_cb *)&((_skb)->cb[0]))
 
@@ -151,12 +153,31 @@ static inline void ipvlan_clear_vepa(struct ipvl_port *port)
 	port->flags &= ~IPVLAN_F_VEPA;
 }
 
+static inline bool ipvlan_is_learnable(struct ipvl_port *port)
+{
+	return port->mode == IPVLAN_MODE_L2E;
+}
+
+static inline void ipvlan_mark_skb(struct sk_buff *skb, struct net_device *dev)
+{
+	IPVL_SKB_CB(skb)->mark = dev;
+}
+
+static inline bool ipvlan_is_skb_marked(struct sk_buff *skb, struct net_device *dev)
+{
+	return (IPVL_SKB_CB(skb)->mark == dev);
+}
+
 void ipvlan_init_secret(void);
 unsigned int ipvlan_mac_hash(const unsigned char *addr);
 rx_handler_result_t ipvlan_handle_frame(struct sk_buff **pskb);
+void ipvlan_skb_crossing_ns(struct sk_buff *skb, struct net_device *dev);
 void ipvlan_process_multicast(struct work_struct *work);
+void ipvlan_multicast_enqueue(struct ipvl_port *port,
+			      struct sk_buff *skb, bool tx_pkt);
 int ipvlan_queue_xmit(struct sk_buff *skb, struct net_device *dev);
 void ipvlan_ht_addr_add(struct ipvl_dev *ipvlan, struct ipvl_addr *addr);
+int ipvlan_add_addr(struct ipvl_dev *ipvlan, void *iaddr, bool is_v6);
 struct ipvl_addr *ipvlan_find_addr(const struct ipvl_dev *ipvlan,
 				   const void *iaddr, bool is_v6);
 bool ipvlan_addr_busy(struct ipvl_port *port, void *iaddr, bool is_v6);
diff --git a/drivers/net/ipvlan/ipvlan_core.c b/drivers/net/ipvlan/ipvlan_core.c
index d7e3ddbcab6f..ffe8efd2f1aa 100644
--- a/drivers/net/ipvlan/ipvlan_core.c
+++ b/drivers/net/ipvlan/ipvlan_core.c
@@ -284,6 +284,18 @@ void ipvlan_process_multicast(struct work_struct *work)
 		rcu_read_unlock();
 
 		if (tx_pkt) {
+			if (ipvlan_is_learnable(port)) {
+				/* Inject packet to main dev */
+				nskb = skb_clone(skb, GFP_ATOMIC);
+				if (nskb) {
+					local_bh_disable();
+					nskb->pkt_type = pkt_type;
+					nskb->dev = port->dev;
+					dev_forward_skb(port->dev, nskb);
+					local_bh_enable();
+				}
+			}
+
 			/* If the packet originated here, send it out. */
 			skb->dev = port->dev;
 			skb->pkt_type = pkt_type;
@@ -299,7 +311,7 @@ void ipvlan_process_multicast(struct work_struct *work)
 	}
 }
 
-static void ipvlan_skb_crossing_ns(struct sk_buff *skb, struct net_device *dev)
+void ipvlan_skb_crossing_ns(struct sk_buff *skb, struct net_device *dev)
 {
 	bool xnet = true;
 
@@ -414,6 +426,77 @@ struct ipvl_addr *ipvlan_addr_lookup(struct ipvl_port *port, void *lyr3h,
 	return addr;
 }
 
+static inline bool is_ipv4_usable(__be32 addr)
+{
+	return !ipv4_is_lbcast(addr) && !ipv4_is_multicast(addr) &&
+	       !ipv4_is_zeronet(addr);
+}
+
+static inline bool is_ipv6_usable(const struct in6_addr *addr)
+{
+	return !ipv6_addr_is_multicast(addr) && !ipv6_addr_loopback(addr) &&
+	       !ipv6_addr_any(addr);
+}
+
+static void ipvlan_addr_learn(struct ipvl_dev *ipvlan, void *lyr3h,
+			      int addr_type)
+{
+	void *addr = NULL;
+	bool is_v6;
+
+	switch (addr_type) {
+#if IS_ENABLED(CONFIG_IPV6)
+	/* No need to handle IPVL_ICMPV6, since it never has valid src-address */
+	case IPVL_IPV6: {
+		struct ipv6hdr *ip6h;
+
+		ip6h = (struct ipv6hdr *)lyr3h;
+		if (!is_ipv6_usable(&ip6h->saddr))
+			return;
+		is_v6 = true;
+		addr = &ip6h->saddr;
+		break;
+	}
+#endif
+	case IPVL_IPV4: {
+		struct iphdr *ip4h;
+		__be32 *i4addr;
+
+		ip4h = (struct iphdr *)lyr3h;
+		i4addr = &ip4h->saddr;
+		if (!is_ipv4_usable(*i4addr))
+			return;
+		is_v6 = false;
+		addr = i4addr;
+		break;
+	}
+	case IPVL_ARP: {
+		struct arphdr *arph;
+		unsigned char *arp_ptr;
+		__be32 *i4addr;
+
+		arph = (struct arphdr *)lyr3h;
+		arp_ptr = (unsigned char *)(arph + 1);
+		arp_ptr += ipvlan->port->dev->addr_len;
+		i4addr = (__be32 *)arp_ptr;
+		if (!is_ipv4_usable(*i4addr))
+			return;
+		is_v6 = false;
+		addr = i4addr;
+		break;
+	}
+	default:
+		return;
+	}
+
+	if (!ipvlan_ht_addr_lookup(ipvlan->port, addr, is_v6)) {
+		spin_lock_bh(&ipvlan->addrs_lock);
+		if (!ipvlan_addr_busy(ipvlan->port, addr, is_v6))
+			ipvlan_add_addr(ipvlan, addr, is_v6);
+		spin_unlock_bh(&ipvlan->addrs_lock);
+	}
+}
+
 static noinline_for_stack int ipvlan_process_v4_outbound(struct sk_buff *skb)
 {
 	struct net_device *dev = skb->dev;
@@ -561,8 +644,8 @@ static int ipvlan_process_outbound(struct sk_buff *skb)
 	return ret;
 }
 
-static void ipvlan_multicast_enqueue(struct ipvl_port *port,
-				     struct sk_buff *skb, bool tx_pkt)
+void ipvlan_multicast_enqueue(struct ipvl_port *port,
+			      struct sk_buff *skb, bool tx_pkt)
 {
 	if (skb->protocol == htons(ETH_P_PAUSE)) {
 		kfree_skb(skb);
@@ -618,15 +701,56 @@ static int ipvlan_xmit_mode_l3(struct sk_buff *skb, struct net_device *dev)
 
 static int ipvlan_xmit_mode_l2(struct sk_buff *skb, struct net_device *dev)
 {
-	const struct ipvl_dev *ipvlan = netdev_priv(dev);
-	struct ethhdr *eth = skb_eth_hdr(skb);
-	struct ipvl_addr *addr;
 	void *lyr3h;
+	struct ipvl_addr *addr;
 	int addr_type;
+	bool same_mac_addr;
+	struct ipvl_dev *ipvlan = netdev_priv(dev);
+	struct ethhdr *eth = skb_eth_hdr(skb);
+
+	if (ipvlan_is_learnable(ipvlan->port) &&
+	    ether_addr_equal(eth->h_source, dev->dev_addr)) {
+		/* ignore tx-packets from host */
+		goto out_drop;
+	}
+
+	same_mac_addr = ether_addr_equal(eth->h_dest, eth->h_source);
+
+	lyr3h = ipvlan_get_L3_hdr(ipvlan->port, skb, &addr_type);
 
-	if (!ipvlan_is_vepa(ipvlan->port) &&
-	    ether_addr_equal(eth->h_dest, eth->h_source)) {
-		lyr3h = ipvlan_get_L3_hdr(ipvlan->port, skb, &addr_type);
+	if (ipvlan_is_learnable(ipvlan->port)) {
+		if (lyr3h)
+			ipvlan_addr_learn(ipvlan, lyr3h, addr_type);
+		/* Mark SKB in advance */
+		skb = skb_share_check(skb, GFP_ATOMIC);
+		if (!skb)
+			return NET_XMIT_DROP;
+		ipvlan_mark_skb(skb, ipvlan->phy_dev);
+	}
+
+	if (is_multicast_ether_addr(eth->h_dest)) {
+		skb_reset_mac_header(skb);
+		ipvlan_skb_crossing_ns(skb, NULL);
+		ipvlan_multicast_enqueue(ipvlan->port, skb, true);
+		return NET_XMIT_SUCCESS;
+	}
+
+	if (ipvlan_is_vepa(ipvlan->port))
+		goto tx_phy_dev;
+
+	if (!same_mac_addr &&
+	    ether_addr_equal(eth->h_dest, ipvlan->phy_dev->dev_addr)) {
+		/* It is a packet from child with destination to main port.
+		 * Pass it to main.
+		 */
+		skb = skb_share_check(skb, GFP_ATOMIC);
+		if (!skb)
+			return NET_XMIT_DROP;
+		skb->pkt_type = PACKET_HOST;
+		skb->dev = ipvlan->phy_dev;
+		dev_forward_skb(ipvlan->phy_dev, skb);
+		return NET_XMIT_SUCCESS;
+	} else if (same_mac_addr) {
 		if (lyr3h) {
 			addr = ipvlan_addr_lookup(ipvlan->port, lyr3h, addr_type, true);
 			if (addr) {
@@ -649,16 +773,14 @@ static int ipvlan_xmit_mode_l2(struct sk_buff *skb, struct net_device *dev)
 		 */
 		dev_forward_skb(ipvlan->phy_dev, skb);
 		return NET_XMIT_SUCCESS;
-
-	} else if (is_multicast_ether_addr(eth->h_dest)) {
-		skb_reset_mac_header(skb);
-		ipvlan_skb_crossing_ns(skb, NULL);
-		ipvlan_multicast_enqueue(ipvlan->port, skb, true);
-		return NET_XMIT_SUCCESS;
 	}
 
+tx_phy_dev:
 	skb->dev = ipvlan->phy_dev;
 	return dev_queue_xmit(skb);
+out_drop:
+	consume_skb(skb);
+	return NET_XMIT_DROP;
 }
 
 int ipvlan_queue_xmit(struct sk_buff *skb, struct net_device *dev)
@@ -674,6 +796,7 @@ int ipvlan_queue_xmit(struct sk_buff *skb, struct net_device *dev)
 
 	switch(port->mode) {
 	case IPVLAN_MODE_L2:
+	case IPVLAN_MODE_L2E:
 		return ipvlan_xmit_mode_l2(skb, dev);
 	case IPVLAN_MODE_L3:
 #ifdef CONFIG_IPVLAN_L3S
@@ -737,17 +860,22 @@ static rx_handler_result_t ipvlan_handle_mode_l2(struct sk_buff **pskb,
 	struct ethhdr *eth = eth_hdr(skb);
 	rx_handler_result_t ret = RX_HANDLER_PASS;
 
+	/* Ignore already seen packets. */
+	if (ipvlan_is_skb_marked(skb, port->dev))
+		return RX_HANDLER_PASS;
+
 	if (is_multicast_ether_addr(eth->h_dest)) {
 		if (ipvlan_external_frame(skb, port)) {
-			struct sk_buff *nskb = skb_clone(skb, GFP_ATOMIC);
-
 			/* External frames are queued for device local
 			 * distribution, but a copy is given to master
 			 * straight away to avoid sending duplicates later
 			 * when work-queue processes this frame. This is
 			 * achieved by returning RX_HANDLER_PASS.
 			 */
+			struct sk_buff *nskb = skb_clone(skb, GFP_ATOMIC);
+
 			if (nskb) {
+				ipvlan_mark_skb(skb, port->dev);
 				ipvlan_skb_crossing_ns(nskb, NULL);
 				ipvlan_multicast_enqueue(port, nskb, false);
 			}
@@ -770,6 +898,7 @@ rx_handler_result_t ipvlan_handle_frame(struct sk_buff **pskb)
 
 	switch (port->mode) {
 	case IPVLAN_MODE_L2:
+	case IPVLAN_MODE_L2E:
 		return ipvlan_handle_mode_l2(pskb, port);
 	case IPVLAN_MODE_L3:
 		return ipvlan_handle_mode_l3(pskb, port);
diff --git a/drivers/net/ipvlan/ipvlan_main.c b/drivers/net/ipvlan/ipvlan_main.c
index 660f3db11766..df5275bc30fc 100644
--- a/drivers/net/ipvlan/ipvlan_main.c
+++ b/drivers/net/ipvlan/ipvlan_main.c
@@ -7,6 +7,11 @@
 
 #include "ipvlan.h"
 
+static void ipvlan_set_learnable(struct ipvl_port *port)
+{
+	dev_add_pack(&port->ipvl_ptype);
+}
+
 static int ipvlan_set_port_mode(struct ipvl_port *port, u16 nval,
 				struct netlink_ext_ack *extack)
 {
@@ -16,6 +21,15 @@ static int ipvlan_set_port_mode(struct ipvl_port *port, u16 nval,
 
 	ASSERT_RTNL();
 	if (port->mode != nval) {
+		/* Don't allow switch off the learnable bridge mode.
+		 * Flags also must be set from the first port-link setup.
+		 */
+		if (port->mode == IPVLAN_MODE_L2E ||
+		    (nval == IPVLAN_MODE_L2E && port->count > 1)) {
+			netdev_err(port->dev, "L2E mode cannot be changed.\n");
+			return -EINVAL;
+		}
+
 		list_for_each_entry(ipvlan, &port->ipvlans, pnode) {
 			flags = ipvlan->dev->flags;
 			if (nval == IPVLAN_MODE_L3 || nval == IPVLAN_MODE_L3S) {
@@ -40,7 +54,10 @@ static int ipvlan_set_port_mode(struct ipvl_port *port, u16 nval,
 			ipvlan_l3s_unregister(port);
 		}
 		port->mode = nval;
+		if (port->mode == IPVLAN_MODE_L2E)
+			ipvlan_set_learnable(port);
 	}
+
 	return 0;
 
 fail:
@@ -59,6 +76,64 @@ static int ipvlan_set_port_mode(struct ipvl_port *port, u16 nval,
 	return err;
 }
 
+static int ipvlan_port_receive(struct sk_buff *skb, struct net_device *wdev,
+			       struct packet_type *pt, struct net_device *orig_wdev)
+{
+	struct ipvl_port *port;
+	struct ipvl_addr *addr;
+	struct ethhdr *eth;
+	void *lyr3h;
+	int addr_type;
+
+	port = container_of(pt, struct ipvl_port, ipvl_ptype);
+	/* We are interested only in outgoing packets.
+	 * rx-path is handled in rx_handler().
+	 */
+	if (skb->pkt_type != PACKET_OUTGOING || ipvlan_is_skb_marked(skb, port->dev))
+		goto out;
+
+	skb = skb_share_check(skb, GFP_ATOMIC);
+	if (!skb)
+		goto no_mem;
+
+	/* data should point to eth-header */
+	skb_push(skb, skb->data - skb_mac_header(skb));
+	skb->dev = port->dev;
+	eth = eth_hdr(skb);
+
+	if (is_multicast_ether_addr(eth->h_dest)) {
+		ipvlan_skb_crossing_ns(skb, NULL);
+		skb->protocol = eth_type_trans(skb, skb->dev);
+		skb->pkt_type = PACKET_HOST;
+		ipvlan_mark_skb(skb, port->dev);
+		ipvlan_multicast_enqueue(port, skb, false);
+		return 0;
+	}
+
+	lyr3h = ipvlan_get_L3_hdr(port, skb, &addr_type);
+	if (!lyr3h)
+		goto out;
+
+	addr = ipvlan_addr_lookup(port, lyr3h, addr_type, true);
+	if (addr) {
+		int ret, len;
+
+		ipvlan_skb_crossing_ns(skb, addr->master->dev);
+		skb->protocol = eth_type_trans(skb, skb->dev);
+		skb->pkt_type = PACKET_HOST;
+		ipvlan_mark_skb(skb, port->dev);
+		len = skb->len + ETH_HLEN;
+		ret = netif_rx(skb);
+		ipvlan_count_rx(ipvlan, len, ret == NET_RX_SUCCESS, false);
+		return 0;
+	}
+
+out:
+	dev_kfree_skb(skb);
+no_mem:
+	return 0; // actually, ret value is ignored
+}
+
 static int ipvlan_port_create(struct net_device *dev)
 {
 	struct ipvl_port *port;
@@ -84,6 +159,11 @@ static int ipvlan_port_create(struct net_device *dev)
 	if (err)
 		goto err;
 
+	port->ipvl_ptype.func = ipvlan_port_receive;
+	port->ipvl_ptype.type = htons(ETH_P_ALL);
+	port->ipvl_ptype.dev = dev;
+	port->ipvl_ptype.list.prev = LIST_POISON2;
+
 	netdev_hold(dev, &port->dev_tracker, GFP_KERNEL);
 	return 0;
 
@@ -100,6 +180,8 @@ static void ipvlan_port_destroy(struct net_device *dev)
 	netdev_put(dev, &port->dev_tracker);
 	if (port->mode == IPVLAN_MODE_L3S)
 		ipvlan_l3s_unregister(port);
+	if (port->ipvl_ptype.list.prev != LIST_POISON2)
+		dev_remove_pack(&port->ipvl_ptype);
 	netdev_rx_handler_unregister(dev);
 	cancel_work_sync(&port->wq);
 	while ((skb = __skb_dequeue(&port->backlog)) != NULL) {
@@ -189,10 +271,13 @@ static int ipvlan_open(struct net_device *dev)
 	else
 		dev->flags &= ~IFF_NOARP;
 
-	rcu_read_lock();
-	list_for_each_entry_rcu(addr, &ipvlan->addrs, anode)
-		ipvlan_ht_addr_add(ipvlan, addr);
-	rcu_read_unlock();
+	/* for learnable, addresses will be obtained from tx-packets. */
+	if (!ipvlan_is_learnable(ipvlan->port)) {
+		rcu_read_lock();
+		list_for_each_entry_rcu(addr, &ipvlan->addrs, anode)
+			ipvlan_ht_addr_add(ipvlan, addr);
+		rcu_read_unlock();
+	}
 
 	return 0;
 }
@@ -581,11 +666,21 @@ int ipvlan_link_new(struct net_device *dev, struct rtnl_newlink_params *params,
 	INIT_LIST_HEAD(&ipvlan->addrs);
 	spin_lock_init(&ipvlan->addrs_lock);
 
-	/* TODO Probably put random address here to be presented to the
-	 * world but keep using the physical-dev address for the outgoing
-	 * packets.
+	/* Flags are per port and latest update overrides. User has
+	 * to be consistent in setting it just like the mode attribute.
 	 */
-	eth_hw_addr_set(dev, phy_dev->dev_addr);
+	if (data && data[IFLA_IPVLAN_MODE])
+		mode = nla_get_u16(data[IFLA_IPVLAN_MODE]);
+
+	if (mode != IPVLAN_MODE_L2E) {
+		/* TODO Probably put random address here to be presented to the
+		 * world but keep using the physical-dev address for the outgoing
+		 * packets.
+		 */
+		eth_hw_addr_set(dev, phy_dev->dev_addr);
+	} else {
+		eth_hw_addr_random(dev);
+	}
 
 	dev->priv_flags |= IFF_NO_RX_HANDLER;
 
@@ -597,6 +692,9 @@ int ipvlan_link_new(struct net_device *dev, struct rtnl_newlink_params *params,
 	port = ipvlan_port_get_rtnl(phy_dev);
 	ipvlan->port = port;
 
+	if (data && data[IFLA_IPVLAN_FLAGS])
+		port->flags = nla_get_u16(data[IFLA_IPVLAN_FLAGS]);
+
 	/* If the port-id base is at the MAX value, then wrap it around and
 	 * begin from 0x1 again. This may be due to a busy system where lots
 	 * of slaves are getting created and deleted.
@@ -625,19 +723,13 @@ int ipvlan_link_new(struct net_device *dev, struct rtnl_newlink_params *params,
 	if (err)
 		goto remove_ida;
 
-	/* Flags are per port and latest update overrides. User has
-	 * to be consistent in setting it just like the mode attribute.
-	 */
-	if (data && data[IFLA_IPVLAN_FLAGS])
-		port->flags = nla_get_u16(data[IFLA_IPVLAN_FLAGS]);
-
-	if (data && data[IFLA_IPVLAN_MODE])
-		mode = nla_get_u16(data[IFLA_IPVLAN_MODE]);
-
 	err = ipvlan_set_port_mode(port, mode, extack);
 	if (err)
 		goto unlink_netdev;
 
+	if (ipvlan_is_learnable(port))
+		dev_set_allmulti(dev, 1);
+
 	list_add_tail_rcu(&ipvlan->pnode, &port->ipvlans);
 	netif_stacked_transfer_operstate(phy_dev, dev);
 	return 0;
@@ -657,6 +749,9 @@ void ipvlan_link_delete(struct net_device *dev, struct list_head *head)
 	struct ipvl_dev *ipvlan = netdev_priv(dev);
 	struct ipvl_addr *addr, *next;
 
+	if (ipvlan_is_learnable(ipvlan->port))
+		dev_set_allmulti(dev, -1);
+
 	spin_lock_bh(&ipvlan->addrs_lock);
 	list_for_each_entry_safe(addr, next, &ipvlan->addrs, anode) {
 		ipvlan_ht_addr_del(addr);
@@ -793,6 +888,9 @@ static int ipvlan_device_event(struct notifier_block *unused,
 		break;
 
 	case NETDEV_CHANGEADDR:
+		if (ipvlan_is_learnable(ipvlan->port))
+			break;
+
 		list_for_each_entry(ipvlan, &port->ipvlans, pnode) {
 			eth_hw_addr_set(ipvlan->dev, dev->dev_addr);
 			call_netdevice_notifiers(NETDEV_CHANGEADDR, ipvlan->dev);
@@ -813,7 +911,7 @@ static int ipvlan_device_event(struct notifier_block *unused,
 }
 
 /* the caller must held the addrs lock */
-static int ipvlan_add_addr(struct ipvl_dev *ipvlan, void *iaddr, bool is_v6)
+int ipvlan_add_addr(struct ipvl_dev *ipvlan, void *iaddr, bool is_v6)
 {
 	struct ipvl_addr *addr;
 
@@ -928,6 +1026,9 @@ static int ipvlan_addr6_validator_event(struct notifier_block *unused,
 	if (!ipvlan_is_valid_dev(dev))
 		return NOTIFY_DONE;
 
+	if (ipvlan_is_learnable(ipvlan->port))
+		return notifier_from_errno(-EADDRNOTAVAIL);
+
 	switch (event) {
 	case NETDEV_UP:
 		if (ipvlan_addr_busy(ipvlan->port, &i6vi->i6vi_addr, true)) {
@@ -999,6 +1100,9 @@ static int ipvlan_addr4_validator_event(struct notifier_block *unused,
 	if (!ipvlan_is_valid_dev(dev))
 		return NOTIFY_DONE;
 
+	if (ipvlan_is_learnable(ipvlan->port))
+		return notifier_from_errno(-EADDRNOTAVAIL);
+
 	switch (event) {
 	case NETDEV_UP:
 		if (ipvlan_addr_busy(ipvlan->port, &ivi->ivi_addr, false)) {
diff --git a/include/uapi/linux/if_link.h b/include/uapi/linux/if_link.h
index 3b491d96e52e..6b543c05392d 100644
--- a/include/uapi/linux/if_link.h
+++ b/include/uapi/linux/if_link.h
@@ -1269,6 +1269,7 @@ enum ipvlan_mode {
 	IPVLAN_MODE_L2 = 0,
 	IPVLAN_MODE_L3,
 	IPVLAN_MODE_L3S,
+	IPVLAN_MODE_L2E,
 	IPVLAN_MODE_MAX
 };
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH net-next 2/8] ipvlan: Send mcasts out directly in ipvlan_xmit_mode_l2()
  2025-10-21 14:44 [PATCH net-next 0/8] ipvlan: Implement learnable L2-bridge Dmitry Skorodumov
  2025-10-21 14:44 ` [PATCH net-next 1/8] " Dmitry Skorodumov
@ 2025-10-21 14:44 ` Dmitry Skorodumov
  2025-10-21 14:44 ` [PATCH net-next 3/8] ipvlan: Handle rx mcast-ip and unicast eth Dmitry Skorodumov
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 16+ messages in thread
From: Dmitry Skorodumov @ 2025-10-21 14:44 UTC (permalink / raw)
  To: netdev, linux-kernel
  Cc: andrey.bokhanko, Dmitry Skorodumov, Andrew Lunn, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni

Mcasts are sent to external net directly in
ipvlan_xmit_mode_l2(). The ipvlan_process_multicast()
for tx-packets just distributes them to local ifaces.

This makes life a bit easier for further patches. When
out-mcasts should be patched with proper MAC-address.

Signed-off-by: Dmitry Skorodumov <skorodumov.dmitry@huawei.com>
---
 drivers/net/ipvlan/ipvlan_core.c | 32 +++++++++++++++++---------------
 1 file changed, 17 insertions(+), 15 deletions(-)

diff --git a/drivers/net/ipvlan/ipvlan_core.c b/drivers/net/ipvlan/ipvlan_core.c
index ffe8efd2f1aa..9af0dcc307da 100644
--- a/drivers/net/ipvlan/ipvlan_core.c
+++ b/drivers/net/ipvlan/ipvlan_core.c
@@ -285,9 +285,10 @@ void ipvlan_process_multicast(struct work_struct *work)
 
 		if (tx_pkt) {
 			if (ipvlan_is_learnable(port)) {
-				/* Inject packet to main dev */
+				/* Inject as rx-packet to main dev */
 				nskb = skb_clone(skb, GFP_ATOMIC);
 				if (nskb) {
+					consumed = true;
 					local_bh_disable();
 					nskb->pkt_type = pkt_type;
 					nskb->dev = port->dev;
@@ -295,17 +296,13 @@ void ipvlan_process_multicast(struct work_struct *work)
 					local_bh_enable();
 				}
 			}
-
-			/* If the packet originated here, send it out. */
-			skb->dev = port->dev;
-			skb->pkt_type = pkt_type;
-			dev_queue_xmit(skb);
-		} else {
-			if (consumed)
-				consume_skb(skb);
-			else
-				kfree_skb(skb);
+			/* Packet was already tx out in ipvlan_xmit_mode_l2(). */
 		}
+		if (consumed)
+			consume_skb(skb);
+		else
+			kfree_skb(skb);
+
 		dev_put(dev);
 		cond_resched();
 	}
@@ -729,10 +726,15 @@ static int ipvlan_xmit_mode_l2(struct sk_buff *skb, struct net_device *dev)
 	}
 
 	if (is_multicast_ether_addr(eth->h_dest)) {
-		skb_reset_mac_header(skb);
-		ipvlan_skb_crossing_ns(skb, NULL);
-		ipvlan_multicast_enqueue(ipvlan->port, skb, true);
-		return NET_XMIT_SUCCESS;
+		struct sk_buff *nskb = skb_clone(skb, GFP_ATOMIC);
+
+		if (nskb) {
+			skb_reset_mac_header(nskb);
+			ipvlan_skb_crossing_ns(nskb, NULL);
+			ipvlan_multicast_enqueue(ipvlan->port, nskb, true);
+		}
+
+		goto tx_phy_dev;
 	}
 
 	if (ipvlan_is_vepa(ipvlan->port))
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH net-next 3/8] ipvlan: Handle rx mcast-ip and unicast eth
  2025-10-21 14:44 [PATCH net-next 0/8] ipvlan: Implement learnable L2-bridge Dmitry Skorodumov
  2025-10-21 14:44 ` [PATCH net-next 1/8] " Dmitry Skorodumov
  2025-10-21 14:44 ` [PATCH net-next 2/8] ipvlan: Send mcasts out directly in ipvlan_xmit_mode_l2() Dmitry Skorodumov
@ 2025-10-21 14:44 ` Dmitry Skorodumov
  2025-10-21 14:44 ` [PATCH net-next 4/8] ipvlan: Added some kind of MAC SNAT Dmitry Skorodumov
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 16+ messages in thread
From: Dmitry Skorodumov @ 2025-10-21 14:44 UTC (permalink / raw)
  To: netdev, linux-kernel
  Cc: andrey.bokhanko, Dmitry Skorodumov, Andrew Lunn, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni

Some WiFi enfironments sometimes send mcast packets
with unicast eth_dst. Forcibly replace eth_dst to be bcast in this case
if bridge is in L2E mode.

Signed-off-by: Dmitry Skorodumov <skorodumov.dmitry@huawei.com>
---
 drivers/net/ipvlan/ipvlan_core.c | 57 ++++++++++++++++++++++++++++++--
 1 file changed, 55 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ipvlan/ipvlan_core.c b/drivers/net/ipvlan/ipvlan_core.c
index 9af0dcc307da..41059639f307 100644
--- a/drivers/net/ipvlan/ipvlan_core.c
+++ b/drivers/net/ipvlan/ipvlan_core.c
@@ -855,18 +855,69 @@ static rx_handler_result_t ipvlan_handle_mode_l3(struct sk_buff **pskb,
 	return ret;
 }
 
+static bool ipvlan_is_mcast(struct ipvl_port *port, void *lyr3h, int addr_type)
+{
+	switch (addr_type) {
+#if IS_ENABLED(CONFIG_IPV6)
+	/* ToDo: handle  ICMPV6*/
+	case IPVL_IPV6:
+		return !is_ipv6_usable(&((struct ipv6hdr *)lyr3h)->daddr);
+#endif
+	case IPVL_IPV4: {
+		/* Treat mcast, bcast and zero as multicast. */
+		__be32 i4addr = ((struct iphdr *)lyr3h)->daddr;
+
+		return !is_ipv4_usable(i4addr);
+	}
+	case IPVL_ARP: {
+		struct arphdr *arph;
+		unsigned char *arp_ptr;
+		__be32 i4addr;
+
+		arph = (struct arphdr *)lyr3h;
+		arp_ptr = (unsigned char *)(arph + 1);
+		arp_ptr += (2 * port->dev->addr_len) + 4;
+		i4addr = *(__be32 *)arp_ptr;
+		return !is_ipv4_usable(i4addr);
+	}
+	}
+	return false;
+}
+
+static bool ipvlan_is_l2_mcast(struct ipvl_port *port, struct sk_buff *skb,
+			       bool *need_eth_fix)
+{
+	void *lyr3h;
+	int addr_type;
+
+	/* In some wifi environments unicast dest address means nothing.
+	 * IP still can be a mcast and frame should be treated as mcast.
+	 */
+	*need_eth_fix = false;
+	if (is_multicast_ether_addr(eth_hdr(skb)->h_dest))
+		return true;
+
+	if (!ipvlan_is_learnable(port))
+		return false;
+
+	lyr3h = ipvlan_get_L3_hdr(port, skb, &addr_type);
+	*need_eth_fix = lyr3h && ipvlan_is_mcast(port, lyr3h, addr_type);
+
+	return *need_eth_fix;
+}
+
 static rx_handler_result_t ipvlan_handle_mode_l2(struct sk_buff **pskb,
 						 struct ipvl_port *port)
 {
 	struct sk_buff *skb = *pskb;
-	struct ethhdr *eth = eth_hdr(skb);
 	rx_handler_result_t ret = RX_HANDLER_PASS;
+	bool need_eth_fix;
 
 	/* Ignore already seen packets. */
 	if (ipvlan_is_skb_marked(skb, port->dev))
 		return RX_HANDLER_PASS;
 
-	if (is_multicast_ether_addr(eth->h_dest)) {
+	if (ipvlan_is_l2_mcast(port, skb, &need_eth_fix)) {
 		if (ipvlan_external_frame(skb, port)) {
 			/* External frames are queued for device local
 			 * distribution, but a copy is given to master
@@ -877,6 +928,8 @@ static rx_handler_result_t ipvlan_handle_mode_l2(struct sk_buff **pskb,
 			struct sk_buff *nskb = skb_clone(skb, GFP_ATOMIC);
 
 			if (nskb) {
+				if (need_eth_fix)
+					memset(eth_hdr(nskb)->h_dest, 0xff, ETH_ALEN);
 				ipvlan_mark_skb(skb, port->dev);
 				ipvlan_skb_crossing_ns(nskb, NULL);
 				ipvlan_multicast_enqueue(port, nskb, false);
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH net-next 4/8] ipvlan: Added some kind of MAC SNAT
  2025-10-21 14:44 [PATCH net-next 0/8] ipvlan: Implement learnable L2-bridge Dmitry Skorodumov
                   ` (2 preceding siblings ...)
  2025-10-21 14:44 ` [PATCH net-next 3/8] ipvlan: Handle rx mcast-ip and unicast eth Dmitry Skorodumov
@ 2025-10-21 14:44 ` Dmitry Skorodumov
  2025-10-21 14:44 ` [PATCH net-next 5/8] ipvlan: Forget all IP when device goes down Dmitry Skorodumov
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 16+ messages in thread
From: Dmitry Skorodumov @ 2025-10-21 14:44 UTC (permalink / raw)
  To: netdev, linux-kernel
  Cc: andrey.bokhanko, Dmitry Skorodumov, Andrew Lunn, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni

We remember the SRC MAC address of outgoing packets
together with IP addresses.

While RX, we patch MAC address with remembered MAC.

We do patching for both eth_dst and ARPs.

ToDo: support IPv6 Neighbours Discovery.

Signed-off-by: Dmitry Skorodumov <skorodumov.dmitry@huawei.com>
---
 drivers/net/ipvlan/ipvlan.h      |   5 +-
 drivers/net/ipvlan/ipvlan_core.c | 144 +++++++++++++++++++++++--------
 drivers/net/ipvlan/ipvlan_main.c |  11 ++-
 3 files changed, 118 insertions(+), 42 deletions(-)

diff --git a/drivers/net/ipvlan/ipvlan.h b/drivers/net/ipvlan/ipvlan.h
index 020e80df1e38..02a705bf9d42 100644
--- a/drivers/net/ipvlan/ipvlan.h
+++ b/drivers/net/ipvlan/ipvlan.h
@@ -78,6 +78,7 @@ struct ipvl_addr {
 		struct in6_addr	ip6;	 /* IPv6 address on logical interface */
 		struct in_addr	ip4;	 /* IPv4 address on logical interface */
 	} ipu;
+	u8			hwaddr[ETH_ALEN];
 #define ip6addr	ipu.ip6
 #define ip4addr ipu.ip4
 	struct hlist_node	hlnode;  /* Hash-table linkage */
@@ -177,7 +178,9 @@ void ipvlan_multicast_enqueue(struct ipvl_port *port,
 			      struct sk_buff *skb, bool tx_pkt);
 int ipvlan_queue_xmit(struct sk_buff *skb, struct net_device *dev);
 void ipvlan_ht_addr_add(struct ipvl_dev *ipvlan, struct ipvl_addr *addr);
-int ipvlan_add_addr(struct ipvl_dev *ipvlan, void *iaddr, bool is_v6);
+int ipvlan_add_addr(struct ipvl_dev *ipvlan,
+		    void *iaddr, bool is_v6, const u8 *hwaddr);
+void ipvlan_del_addr(struct ipvl_dev *ipvlan, void *iaddr, bool is_v6);
 struct ipvl_addr *ipvlan_find_addr(const struct ipvl_dev *ipvlan,
 				   const void *iaddr, bool is_v6);
 bool ipvlan_addr_busy(struct ipvl_port *port, void *iaddr, bool is_v6);
diff --git a/drivers/net/ipvlan/ipvlan_core.c b/drivers/net/ipvlan/ipvlan_core.c
index 41059639f307..fe8e59066c46 100644
--- a/drivers/net/ipvlan/ipvlan_core.c
+++ b/drivers/net/ipvlan/ipvlan_core.c
@@ -320,8 +320,36 @@ void ipvlan_skb_crossing_ns(struct sk_buff *skb, struct net_device *dev)
 		skb->dev = dev;
 }
 
-static int ipvlan_rcv_frame(struct ipvl_addr *addr, struct sk_buff **pskb,
-			    bool local)
+static int ipvlan_snat_rx_skb(struct ipvl_addr *addr, int addr_type,
+			      struct sk_buff *skb)
+{
+	/* Here we have non-shared skb and free to modify it. */
+	struct ethhdr *eth = eth_hdr(skb);
+
+	if (addr_type == IPVL_ARP) {
+		struct arphdr *arph = arp_hdr(skb);
+		u8 *arp_ptr = (u8 *)(arph + 1);
+		u8 *dsthw = arp_ptr + addr->master->dev->addr_len + sizeof(u32);
+		const u8 *phy_addr = addr->master->phy_dev->dev_addr;
+
+		/* Some access points may do ARP-proxy and answers us back.
+		 * Client may treat this as address-conflict.
+		 */
+		if (ether_addr_equal(eth->h_source, phy_addr) &&
+		    ether_addr_equal(eth->h_dest, phy_addr) &&
+		    is_zero_ether_addr(dsthw)) {
+			return NET_RX_DROP;
+		}
+		if (ether_addr_equal(dsthw, phy_addr))
+			ether_addr_copy(dsthw, addr->hwaddr);
+	}
+
+	ether_addr_copy(eth->h_dest, addr->hwaddr);
+	return NET_RX_SUCCESS;
+}
+
+static int ipvlan_rcv_frame(struct ipvl_addr *addr, int addr_type,
+			    struct sk_buff **pskb, bool local)
 {
 	struct ipvl_dev *ipvlan = addr->master;
 	struct net_device *dev = ipvlan->dev;
@@ -331,10 +359,8 @@ static int ipvlan_rcv_frame(struct ipvl_addr *addr, struct sk_buff **pskb,
 	struct sk_buff *skb = *pskb;
 
 	len = skb->len + ETH_HLEN;
-	/* Only packets exchanged between two local slaves need to have
-	 * device-up check as well as skb-share check.
-	 */
-	if (local) {
+
+	if (local || ipvlan_is_learnable(ipvlan->port)) {
 		if (unlikely(!(dev->flags & IFF_UP))) {
 			kfree_skb(skb);
 			goto out;
@@ -345,6 +371,13 @@ static int ipvlan_rcv_frame(struct ipvl_addr *addr, struct sk_buff **pskb,
 			goto out;
 
 		*pskb = skb;
+		if (!local && ipvlan_is_learnable(ipvlan->port)) {
+			if (ipvlan_snat_rx_skb(addr, addr_type, skb) !=
+			    NET_RX_SUCCESS) {
+				kfree_skb(skb);
+				goto out;
+			}
+		}
 	}
 
 	if (local) {
@@ -436,8 +469,9 @@ static inline bool is_ipv6_usable(const struct in6_addr *addr)
 }
 
 static void ipvlan_addr_learn(struct ipvl_dev *ipvlan, void *lyr3h,
-			      int addr_type)
+			      int addr_type, const u8 *hwaddr)
 {
+	struct ipvl_addr *ipvladdr;
 	void *addr = NULL;
 	bool is_v6;
 
@@ -486,10 +520,18 @@ static void ipvlan_addr_learn(struct ipvl_dev *ipvlan, void *lyr3h,
 		return;
 	}
 
-	if (!ipvlan_ht_addr_lookup(ipvlan->port, addr, is_v6)) {
+	/* handle situation when MAC changed, but IP is the same. */
+	ipvladdr = ipvlan_ht_addr_lookup(ipvlan->port, addr, is_v6);
+	if (ipvladdr && !ether_addr_equal(ipvladdr->hwaddr, hwaddr)) {
+		/* del_addr is safe to call, because we are inside xmit*/
+		ipvlan_del_addr(ipvladdr->master, addr, is_v6);
+		ipvladdr = NULL;
+	}
+
+	if (!ipvladdr) {
 		spin_lock_bh(&ipvlan->addrs_lock);
 		if (!ipvlan_addr_busy(ipvlan->port, addr, is_v6))
-			ipvlan_add_addr(ipvlan, addr, is_v6);
+			ipvlan_add_addr(ipvlan, addr, is_v6, hwaddr);
 		spin_unlock_bh(&ipvlan->addrs_lock);
 	}
 }
@@ -687,7 +729,7 @@ static int ipvlan_xmit_mode_l3(struct sk_buff *skb, struct net_device *dev)
 				consume_skb(skb);
 				return NET_XMIT_DROP;
 			}
-			ipvlan_rcv_frame(addr, &skb, true);
+			ipvlan_rcv_frame(addr, addr_type, &skb, true);
 			return NET_XMIT_SUCCESS;
 		}
 	}
@@ -712,12 +754,14 @@ static int ipvlan_xmit_mode_l2(struct sk_buff *skb, struct net_device *dev)
 	}
 
 	same_mac_addr = ether_addr_equal(eth->h_dest, eth->h_source);
+	if (same_mac_addr && ipvlan_is_learnable(ipvlan->port))
+		goto out_drop;
 
 	lyr3h = ipvlan_get_L3_hdr(ipvlan->port, skb, &addr_type);
 
 	if (ipvlan_is_learnable(ipvlan->port)) {
 		if (lyr3h)
-			ipvlan_addr_learn(ipvlan, lyr3h, addr_type);
+			ipvlan_addr_learn(ipvlan, lyr3h, addr_type, eth->h_source);
 		/* Mark SKB in advance */
 		skb = skb_share_check(skb, GFP_ATOMIC);
 		if (!skb)
@@ -734,47 +778,74 @@ static int ipvlan_xmit_mode_l2(struct sk_buff *skb, struct net_device *dev)
 			ipvlan_multicast_enqueue(ipvlan->port, nskb, true);
 		}
 
-		goto tx_phy_dev;
+		goto tx_frame_out;
 	}
 
 	if (ipvlan_is_vepa(ipvlan->port))
 		goto tx_phy_dev;
 
-	if (!same_mac_addr &&
+	if (ipvlan_is_learnable(ipvlan->port) &&
 	    ether_addr_equal(eth->h_dest, ipvlan->phy_dev->dev_addr)) {
 		/* It is a packet from child with destination to main port.
 		 * Pass it to main.
 		 */
-		skb = skb_share_check(skb, GFP_ATOMIC);
-		if (!skb)
-			return NET_XMIT_DROP;
 		skb->pkt_type = PACKET_HOST;
 		skb->dev = ipvlan->phy_dev;
 		dev_forward_skb(ipvlan->phy_dev, skb);
 		return NET_XMIT_SUCCESS;
-	} else if (same_mac_addr) {
-		if (lyr3h) {
-			addr = ipvlan_addr_lookup(ipvlan->port, lyr3h, addr_type, true);
-			if (addr) {
-				if (ipvlan_is_private(ipvlan->port)) {
-					consume_skb(skb);
-					return NET_XMIT_DROP;
-				}
-				ipvlan_rcv_frame(addr, &skb, true);
-				return NET_XMIT_SUCCESS;
-			}
+	}
+
+	if (lyr3h) {
+		addr = ipvlan_addr_lookup(ipvlan->port, lyr3h, addr_type, true);
+		if (addr) {
+			if (ipvlan_is_private(ipvlan->port))
+				goto out_drop;
+
+			ipvlan_rcv_frame(addr, addr_type, &skb, true);
+			return NET_XMIT_SUCCESS;
 		}
+	}
+
+tx_frame_out:
+	/* We don't know destination. Now we have to handle case for
+	 * non-learnable bridge and learnable case.
+	 */
+	if (!ipvlan_is_learnable(ipvlan->port)) {
 		skb = skb_share_check(skb, GFP_ATOMIC);
 		if (!skb)
 			return NET_XMIT_DROP;
+		if (same_mac_addr) {
+			/* Packet definitely does not belong to any of the
+			 * virtual devices, but the dest is local. So forward
+			 * the skb for the main-dev. At the RX side we just return
+			 * RX_PASS for it to be processed further on the stack.
+			 */
+			dev_forward_skb(ipvlan->phy_dev, skb);
+			return NET_XMIT_SUCCESS;
+		}
+	} else {
+		/* Ok. It is a packet to outside on learnable. Fix source eth-address. */
+		struct sk_buff *orig_skb = skb;
 
-		/* Packet definitely does not belong to any of the
-		 * virtual devices, but the dest is local. So forward
-		 * the skb for the main-dev. At the RX side we just return
-		 * RX_PASS for it to be processed further on the stack.
-		 */
-		dev_forward_skb(ipvlan->phy_dev, skb);
-		return NET_XMIT_SUCCESS;
+		skb = skb_unshare(skb, GFP_ATOMIC);
+		if (!skb)
+			return NET_XMIT_DROP;
+
+		skb_reset_mac_header(skb);
+		ether_addr_copy(skb_eth_hdr(skb)->h_source,
+				ipvlan->phy_dev->dev_addr);
+
+		/* ToDo: Handle ICMPv6 for neighbours discovery.*/
+		if (lyr3h && addr_type == IPVL_ARP) {
+			struct arphdr *arph;
+			/* must reparse new skb */
+			if (skb != orig_skb && lyr3h && addr_type == IPVL_ARP)
+				lyr3h = ipvlan_get_L3_hdr(ipvlan->port, skb,
+							  &addr_type);
+			arph = (struct arphdr *)lyr3h;
+			ether_addr_copy((u8 *)(arph + 1),
+					ipvlan->phy_dev->dev_addr);
+		}
 	}
 
 tx_phy_dev:
@@ -849,8 +920,7 @@ static rx_handler_result_t ipvlan_handle_mode_l3(struct sk_buff **pskb,
 
 	addr = ipvlan_addr_lookup(port, lyr3h, addr_type, true);
 	if (addr)
-		ret = ipvlan_rcv_frame(addr, pskb, false);
-
+		ret = ipvlan_rcv_frame(addr, addr_type, pskb, false);
 out:
 	return ret;
 }
@@ -918,7 +988,7 @@ static rx_handler_result_t ipvlan_handle_mode_l2(struct sk_buff **pskb,
 		return RX_HANDLER_PASS;
 
 	if (ipvlan_is_l2_mcast(port, skb, &need_eth_fix)) {
-		if (ipvlan_external_frame(skb, port)) {
+		if (ipvlan_is_learnable(port) || ipvlan_external_frame(skb, port)) {
 			/* External frames are queued for device local
 			 * distribution, but a copy is given to master
 			 * straight away to avoid sending duplicates later
diff --git a/drivers/net/ipvlan/ipvlan_main.c b/drivers/net/ipvlan/ipvlan_main.c
index df5275bc30fc..6fdfeca6081d 100644
--- a/drivers/net/ipvlan/ipvlan_main.c
+++ b/drivers/net/ipvlan/ipvlan_main.c
@@ -911,7 +911,8 @@ static int ipvlan_device_event(struct notifier_block *unused,
 }
 
 /* the caller must held the addrs lock */
-int ipvlan_add_addr(struct ipvl_dev *ipvlan, void *iaddr, bool is_v6)
+int ipvlan_add_addr(struct ipvl_dev *ipvlan, void *iaddr, bool is_v6,
+		    const u8 *hwaddr)
 {
 	struct ipvl_addr *addr;
 
@@ -929,6 +930,8 @@ int ipvlan_add_addr(struct ipvl_dev *ipvlan, void *iaddr, bool is_v6)
 		addr->atype = IPVL_IPV6;
 #endif
 	}
+	if (hwaddr)
+		ether_addr_copy(addr->hwaddr, hwaddr);
 
 	list_add_tail_rcu(&addr->anode, &ipvlan->addrs);
 
@@ -941,7 +944,7 @@ int ipvlan_add_addr(struct ipvl_dev *ipvlan, void *iaddr, bool is_v6)
 	return 0;
 }
 
-static void ipvlan_del_addr(struct ipvl_dev *ipvlan, void *iaddr, bool is_v6)
+void ipvlan_del_addr(struct ipvl_dev *ipvlan, void *iaddr, bool is_v6)
 {
 	struct ipvl_addr *addr;
 
@@ -982,7 +985,7 @@ static int ipvlan_add_addr6(struct ipvl_dev *ipvlan, struct in6_addr *ip6_addr)
 			  "Failed to add IPv6=%pI6c addr for %s intf\n",
 			  ip6_addr, ipvlan->dev->name);
 	else
-		ret = ipvlan_add_addr(ipvlan, ip6_addr, true);
+		ret = ipvlan_add_addr(ipvlan, ip6_addr, true, NULL);
 	spin_unlock_bh(&ipvlan->addrs_lock);
 	return ret;
 }
@@ -1053,7 +1056,7 @@ static int ipvlan_add_addr4(struct ipvl_dev *ipvlan, struct in_addr *ip4_addr)
 			  "Failed to add IPv4=%pI4 on %s intf.\n",
 			  ip4_addr, ipvlan->dev->name);
 	else
-		ret = ipvlan_add_addr(ipvlan, ip4_addr, false);
+		ret = ipvlan_add_addr(ipvlan, ip4_addr, false, NULL);
 	spin_unlock_bh(&ipvlan->addrs_lock);
 	return ret;
 }
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH net-next 5/8] ipvlan: Forget all IP when device goes down
  2025-10-21 14:44 [PATCH net-next 0/8] ipvlan: Implement learnable L2-bridge Dmitry Skorodumov
                   ` (3 preceding siblings ...)
  2025-10-21 14:44 ` [PATCH net-next 4/8] ipvlan: Added some kind of MAC SNAT Dmitry Skorodumov
@ 2025-10-21 14:44 ` Dmitry Skorodumov
  2025-10-21 14:44 ` [PATCH net-next 6/8] ipvlan: Support GSO for port -> ipvlan Dmitry Skorodumov
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 16+ messages in thread
From: Dmitry Skorodumov @ 2025-10-21 14:44 UTC (permalink / raw)
  To: netdev, linux-kernel
  Cc: andrey.bokhanko, Dmitry Skorodumov, Andrew Lunn, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni

When ipvlan interface goes down, forget all learned addresses.

This is a way to cleanup addresses when master dev switches to
another network.

Signed-off-by: Dmitry Skorodumov <skorodumov.dmitry@huawei.com>
---
 drivers/net/ipvlan/ipvlan_main.c | 49 ++++++++++++++++++++------------
 1 file changed, 31 insertions(+), 18 deletions(-)

diff --git a/drivers/net/ipvlan/ipvlan_main.c b/drivers/net/ipvlan/ipvlan_main.c
index 6fdfeca6081d..28ce36669d39 100644
--- a/drivers/net/ipvlan/ipvlan_main.c
+++ b/drivers/net/ipvlan/ipvlan_main.c
@@ -744,14 +744,10 @@ int ipvlan_link_new(struct net_device *dev, struct rtnl_newlink_params *params,
 }
 EXPORT_SYMBOL_GPL(ipvlan_link_new);
 
-void ipvlan_link_delete(struct net_device *dev, struct list_head *head)
+static void ipvlan_addrs_forget_all(struct ipvl_dev *ipvlan)
 {
-	struct ipvl_dev *ipvlan = netdev_priv(dev);
 	struct ipvl_addr *addr, *next;
 
-	if (ipvlan_is_learnable(ipvlan->port))
-		dev_set_allmulti(dev, -1);
-
 	spin_lock_bh(&ipvlan->addrs_lock);
 	list_for_each_entry_safe(addr, next, &ipvlan->addrs, anode) {
 		ipvlan_ht_addr_del(addr);
@@ -759,6 +755,16 @@ void ipvlan_link_delete(struct net_device *dev, struct list_head *head)
 		kfree_rcu(addr, rcu);
 	}
 	spin_unlock_bh(&ipvlan->addrs_lock);
+}
+
+void ipvlan_link_delete(struct net_device *dev, struct list_head *head)
+{
+	struct ipvl_dev *ipvlan = netdev_priv(dev);
+
+	if (ipvlan_is_learnable(ipvlan->port))
+		dev_set_allmulti(dev, -1);
+
+	ipvlan_addrs_forget_all(ipvlan);
 
 	ida_free(&ipvlan->port->ida, dev->dev_id);
 	list_del_rcu(&ipvlan->pnode);
@@ -816,6 +822,19 @@ int ipvlan_link_register(struct rtnl_link_ops *ops)
 }
 EXPORT_SYMBOL_GPL(ipvlan_link_register);
 
+static bool ipvlan_is_valid_dev(const struct net_device *dev)
+{
+	struct ipvl_dev *ipvlan = netdev_priv(dev);
+
+	if (!netif_is_ipvlan(dev))
+		return false;
+
+	if (!ipvlan || !ipvlan->port)
+		return false;
+
+	return true;
+}
+
 static int ipvlan_device_event(struct notifier_block *unused,
 			       unsigned long event, void *ptr)
 {
@@ -827,6 +846,13 @@ static int ipvlan_device_event(struct notifier_block *unused,
 	LIST_HEAD(lst_kill);
 	int err;
 
+	if (event == NETDEV_DOWN && ipvlan_is_valid_dev(dev)) {
+		struct ipvl_dev *ipvlan = netdev_priv(dev);
+
+		ipvlan_addrs_forget_all(ipvlan);
+		return NOTIFY_DONE;
+	}
+
 	if (!netif_is_ipvlan_port(dev))
 		return NOTIFY_DONE;
 
@@ -961,19 +987,6 @@ void ipvlan_del_addr(struct ipvl_dev *ipvlan, void *iaddr, bool is_v6)
 	kfree_rcu(addr, rcu);
 }
 
-static bool ipvlan_is_valid_dev(const struct net_device *dev)
-{
-	struct ipvl_dev *ipvlan = netdev_priv(dev);
-
-	if (!netif_is_ipvlan(dev))
-		return false;
-
-	if (!ipvlan || !ipvlan->port)
-		return false;
-
-	return true;
-}
-
 #if IS_ENABLED(CONFIG_IPV6)
 static int ipvlan_add_addr6(struct ipvl_dev *ipvlan, struct in6_addr *ip6_addr)
 {
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH net-next 6/8] ipvlan: Support GSO for port -> ipvlan
  2025-10-21 14:44 [PATCH net-next 0/8] ipvlan: Implement learnable L2-bridge Dmitry Skorodumov
                   ` (4 preceding siblings ...)
  2025-10-21 14:44 ` [PATCH net-next 5/8] ipvlan: Forget all IP when device goes down Dmitry Skorodumov
@ 2025-10-21 14:44 ` Dmitry Skorodumov
  2025-10-22 20:55   ` kernel test robot
  2025-10-21 14:44 ` [PATCH net-next 7/8] ipvlan: Support IPv6 for learnable l2-bridge Dmitry Skorodumov
  2025-10-21 14:44 ` [PATCH net-next 8/8] ipvlan: Don't learn child with host-ip Dmitry Skorodumov
  7 siblings, 1 reply; 16+ messages in thread
From: Dmitry Skorodumov @ 2025-10-21 14:44 UTC (permalink / raw)
  To: netdev, linux-kernel
  Cc: andrey.bokhanko, Dmitry Skorodumov, Andrew Lunn, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni

If main port interface supports GSO, we need manually segment
the skb before forwarding it to ipvlan interface.

Signed-off-by: Dmitry Skorodumov <skorodumov.dmitry@huawei.com>
---
 drivers/net/ipvlan/ipvlan_main.c | 50 ++++++++++++++++++++++++--------
 1 file changed, 38 insertions(+), 12 deletions(-)

diff --git a/drivers/net/ipvlan/ipvlan_main.c b/drivers/net/ipvlan/ipvlan_main.c
index 28ce36669d39..f1b1f91f94c0 100644
--- a/drivers/net/ipvlan/ipvlan_main.c
+++ b/drivers/net/ipvlan/ipvlan_main.c
@@ -4,6 +4,7 @@
 
 #include <linux/ethtool.h>
 #include <net/netdev_lock.h>
+#include <net/gso.h>
 
 #include "ipvlan.h"
 
@@ -76,6 +77,41 @@ static int ipvlan_set_port_mode(struct ipvl_port *port, u16 nval,
 	return err;
 }
 
+static int ipvlan_receive(struct ipvl_dev *ipvlan, struct sk_buff *skb)
+{
+	struct sk_buff *segs;
+	struct sk_buff *nskb;
+	ssize_t mac_hdr_size;
+	int ret, len;
+
+	skb->pkt_type = PACKET_HOST;
+	skb->protocol = eth_type_trans(skb, skb->dev);
+	ipvlan_skb_crossing_ns(skb, ipvlan->dev);
+	ipvlan_mark_skb(skb, ipvlan->phy_dev);
+	if (skb_shinfo(skb)->gso_size == 0) {
+		len = skb->len + ETH_HLEN;
+		ret = netif_rx(skb);
+		ipvlan_count_rx(ipvlan, len, ret == NET_RX_SUCCESS, false);
+		return ret;
+	}
+
+	mac_hdr_size = skb->network_header - skb->mac_header;
+	__skb_push(skb, mac_hdr_size);
+	segs = skb_gso_segment(skb, 0);
+	dev_kfree_skb(skb);
+	if (IS_ERR(segs))
+		return 0;
+
+	skb_list_walk_safe(segs, segs, nskb) {
+		skb_mark_not_on_list(segs);
+		__skb_pull(segs, mac_hdr_size);
+		len = segs->len + ETH_HLEN;
+		ret = netif_rx(segs);
+		ipvlan_count_rx(ipvlan, len, ret == NET_RX_SUCCESS, false);
+	}
+	return ret;
+}
+
 static int ipvlan_port_receive(struct sk_buff *skb, struct net_device *wdev,
 			       struct packet_type *pt, struct net_device *orig_wdev)
 {
@@ -115,18 +151,8 @@ static int ipvlan_port_receive(struct sk_buff *skb, struct net_device *wdev,
 		goto out;
 
 	addr = ipvlan_addr_lookup(port, lyr3h, addr_type, true);
-	if (addr) {
-		int ret, len;
-
-		ipvlan_skb_crossing_ns(skb, addr->master->dev);
-		skb->protocol = eth_type_trans(skb, skb->dev);
-		skb->pkt_type = PACKET_HOST;
-		ipvlan_mark_skb(skb, port->dev);
-		len = skb->len + ETH_HLEN;
-		ret = netif_rx(skb);
-		ipvlan_count_rx(ipvlan, len, ret == NET_RX_SUCCESS, false);
-		return 0;
-	}
+	if (addr)
+		return ipvlan_receive(addr->master, skb);
 
 out:
 	dev_kfree_skb(skb);
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH net-next 7/8] ipvlan: Support IPv6 for learnable l2-bridge
  2025-10-21 14:44 [PATCH net-next 0/8] ipvlan: Implement learnable L2-bridge Dmitry Skorodumov
                   ` (5 preceding siblings ...)
  2025-10-21 14:44 ` [PATCH net-next 6/8] ipvlan: Support GSO for port -> ipvlan Dmitry Skorodumov
@ 2025-10-21 14:44 ` Dmitry Skorodumov
  2025-10-23  0:03   ` kernel test robot
                     ` (2 more replies)
  2025-10-21 14:44 ` [PATCH net-next 8/8] ipvlan: Don't learn child with host-ip Dmitry Skorodumov
  7 siblings, 3 replies; 16+ messages in thread
From: Dmitry Skorodumov @ 2025-10-21 14:44 UTC (permalink / raw)
  To: netdev, linux-kernel
  Cc: andrey.bokhanko, Dmitry Skorodumov, Andrew Lunn, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni

To make IPv6 work with learnable l2-bridge, need to
process the TX-path:
* Replace Source-ll-addr in Solicitation ndisc,
* Replace Target-ll-addr in Advertisement ndisc

No need to do anything in RX-path

Signed-off-by: Dmitry Skorodumov <skorodumov.dmitry@huawei.com>
---
 drivers/net/ipvlan/ipvlan_core.c | 125 +++++++++++++++++++++++++++----
 1 file changed, 111 insertions(+), 14 deletions(-)

diff --git a/drivers/net/ipvlan/ipvlan_core.c b/drivers/net/ipvlan/ipvlan_core.c
index fe8e59066c46..ce06a06d8a28 100644
--- a/drivers/net/ipvlan/ipvlan_core.c
+++ b/drivers/net/ipvlan/ipvlan_core.c
@@ -738,11 +738,117 @@ static int ipvlan_xmit_mode_l3(struct sk_buff *skb, struct net_device *dev)
 	return ipvlan_process_outbound(skb);
 }
 
+static void ipvlan_snat_patch_tx_arp(struct ipvl_dev *ipvlan,
+				     struct sk_buff *skb)
+{
+	int addr_type;
+	struct arphdr *arph;
+
+	arph = (struct arphdr *)ipvlan_get_L3_hdr(ipvlan->port, skb,
+						 &addr_type);
+	ether_addr_copy((u8 *)(arph + 1), ipvlan->phy_dev->dev_addr);
+}
+
+#if IS_ENABLED(CONFIG_IPV6)
+
+static u8 *ipvlan_search_icmp6_ll_addr(struct sk_buff *skb, u8 icmp_option)
+{
+	/* skb is ensured to pullable for all ipv6 payload_len by caller */
+	struct ipv6hdr *ip6h = ipv6_hdr(skb);
+	struct icmp6hdr *icmph = (struct icmp6hdr *)(ip6h + 1);
+	int curr_off = sizeof(*icmph);
+	int ndsize = htons(ip6h->payload_len);
+
+	if (icmph->icmp6_type != NDISC_ROUTER_SOLICITATION)
+		curr_off += sizeof(struct in6_addr);
+
+	while ((curr_off + 2) < ndsize) {
+		u8  *data = (u8 *)icmph + curr_off;
+		u32 opt_len = data[1] << 3;
+
+		if (unlikely(opt_len == 0))
+			return NULL;
+
+		if (data[0] != icmp_option) {
+			curr_off += opt_len;
+			continue;
+		}
+
+		if (unlikely(opt_len < ETH_ALEN + 2))
+			return NULL;
+
+		if (unlikely(curr_off + opt_len > ndsize))
+			return NULL;
+
+		return data + 2;
+	}
+
+	return NULL;
+}
+
+static void ipvlan_snat_patch_tx_ipv6(struct ipvl_dev *ipvlan,
+				      struct sk_buff *skb)
+{
+	struct ipv6hdr *ip6h;
+	struct icmp6hdr *icmph;
+	u8 icmp_option;
+	u8 *lladdr;
+	u16 ndsize;
+
+	if (unlikely(!pskb_may_pull(skb, sizeof(*ip6h))))
+		return;
+
+	if (ipv6_hdr(skb)->nexthdr != NEXTHDR_ICMP)
+		return;
+
+	if (unlikely(!pskb_may_pull(skb, sizeof(*ip6h) + sizeof(*icmph))))
+		return;
+
+	ip6h = ipv6_hdr(skb);
+	icmph = (struct icmp6hdr *)(ip6h + 1);
+
+	/* Patch Source-LL for solicitation, Target-LL for advertisement */
+	if (icmph->icmp6_type == NDISC_NEIGHBOUR_SOLICITATION ||
+	    icmph->icmp6_type == NDISC_ROUTER_SOLICITATION)
+		icmp_option = ND_OPT_SOURCE_LL_ADDR;
+	else if (icmph->icmp6_type == NDISC_NEIGHBOUR_ADVERTISEMENT)
+		icmp_option = ND_OPT_TARGET_LL_ADDR;
+	else
+		return;
+
+	ndsize = htons(ip6h->payload_len);
+	if (unlikely(!pskb_may_pull(skb, sizeof(*ip6h) + ndsize)))
+		return;
+
+	lladdr = ipvlan_search_icmp6_ll_addr(skb, icmp_option);
+	if (!lladdr)
+		return;
+
+	ether_addr_copy(lladdr, ipvlan->phy_dev->dev_addr);
+
+	ip6h = ipv6_hdr(skb);
+	icmph = (struct icmp6hdr *)(ip6h + 1);
+	icmph->icmp6_cksum = 0;
+	icmph->icmp6_cksum = csum_ipv6_magic(&ip6h->saddr, &ip6h->daddr,
+					     ndsize,
+					     IPPROTO_ICMPV6,
+					     csum_partial(icmph,
+							  ndsize,
+							  0));
+	skb->ip_summed = CHECKSUM_COMPLETE;
+}
+#else
+static void ipvlan_snat_patch_tx_ipv6(struct ipvl_dev *ipvlan,
+				      struct sk_buff *skb)
+{
+}
+#endif
+
 static int ipvlan_xmit_mode_l2(struct sk_buff *skb, struct net_device *dev)
 {
 	void *lyr3h;
 	struct ipvl_addr *addr;
-	int addr_type;
+	int addr_type = -1;
 	bool same_mac_addr;
 	struct ipvl_dev *ipvlan = netdev_priv(dev);
 	struct ethhdr *eth = skb_eth_hdr(skb);
@@ -825,8 +931,6 @@ static int ipvlan_xmit_mode_l2(struct sk_buff *skb, struct net_device *dev)
 		}
 	} else {
 		/* Ok. It is a packet to outside on learnable. Fix source eth-address. */
-		struct sk_buff *orig_skb = skb;
-
 		skb = skb_unshare(skb, GFP_ATOMIC);
 		if (!skb)
 			return NET_XMIT_DROP;
@@ -835,17 +939,10 @@ static int ipvlan_xmit_mode_l2(struct sk_buff *skb, struct net_device *dev)
 		ether_addr_copy(skb_eth_hdr(skb)->h_source,
 				ipvlan->phy_dev->dev_addr);
 
-		/* ToDo: Handle ICMPv6 for neighbours discovery.*/
-		if (lyr3h && addr_type == IPVL_ARP) {
-			struct arphdr *arph;
-			/* must reparse new skb */
-			if (skb != orig_skb && lyr3h && addr_type == IPVL_ARP)
-				lyr3h = ipvlan_get_L3_hdr(ipvlan->port, skb,
-							  &addr_type);
-			arph = (struct arphdr *)lyr3h;
-			ether_addr_copy((u8 *)(arph + 1),
-					ipvlan->phy_dev->dev_addr);
-		}
+		if (addr_type == IPVL_ARP)
+			ipvlan_snat_patch_tx_arp(ipvlan, skb);
+		else if (addr_type == IPVL_ICMPV6 || addr_type == IPVL_IPV6)
+			ipvlan_snat_patch_tx_ipv6(ipvlan, skb);
 	}
 
 tx_phy_dev:
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH net-next 8/8] ipvlan: Don't learn child with host-ip
  2025-10-21 14:44 [PATCH net-next 0/8] ipvlan: Implement learnable L2-bridge Dmitry Skorodumov
                   ` (6 preceding siblings ...)
  2025-10-21 14:44 ` [PATCH net-next 7/8] ipvlan: Support IPv6 for learnable l2-bridge Dmitry Skorodumov
@ 2025-10-21 14:44 ` Dmitry Skorodumov
  7 siblings, 0 replies; 16+ messages in thread
From: Dmitry Skorodumov @ 2025-10-21 14:44 UTC (permalink / raw)
  To: netdev, linux-kernel
  Cc: andrey.bokhanko, Dmitry Skorodumov, Andrew Lunn, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni

When some child attempts to send a packet with host ip,
remember host IP in the list of ipvlan-addrs with mark "blocked".

Don't send anything if child tries to send a packet with IP of main.

ToDo: track addresses on main port and mark them as blocked if bridge
already learned some of them from some of the children.

Signed-off-by: Dmitry Skorodumov <skorodumov.dmitry@huawei.com>
---
 drivers/net/ipvlan/ipvlan.h      |  4 ++-
 drivers/net/ipvlan/ipvlan_core.c | 61 +++++++++++++++++++++++++-------
 drivers/net/ipvlan/ipvlan_main.c |  9 ++---
 3 files changed, 57 insertions(+), 17 deletions(-)

diff --git a/drivers/net/ipvlan/ipvlan.h b/drivers/net/ipvlan/ipvlan.h
index 02a705bf9d42..7de9794efbda 100644
--- a/drivers/net/ipvlan/ipvlan.h
+++ b/drivers/net/ipvlan/ipvlan.h
@@ -74,6 +74,7 @@ struct ipvl_dev {
 
 struct ipvl_addr {
 	struct ipvl_dev		*master; /* Back pointer to master */
+	bool			is_blocked; /* Blocked. Addr from main iface */
 	union {
 		struct in6_addr	ip6;	 /* IPv6 address on logical interface */
 		struct in_addr	ip4;	 /* IPv4 address on logical interface */
@@ -179,7 +180,8 @@ void ipvlan_multicast_enqueue(struct ipvl_port *port,
 int ipvlan_queue_xmit(struct sk_buff *skb, struct net_device *dev);
 void ipvlan_ht_addr_add(struct ipvl_dev *ipvlan, struct ipvl_addr *addr);
 int ipvlan_add_addr(struct ipvl_dev *ipvlan,
-		    void *iaddr, bool is_v6, const u8 *hwaddr);
+		    void *iaddr, bool is_v6, const u8 *hwaddr,
+		    bool is_blocked);
 void ipvlan_del_addr(struct ipvl_dev *ipvlan, void *iaddr, bool is_v6);
 struct ipvl_addr *ipvlan_find_addr(const struct ipvl_dev *ipvlan,
 				   const void *iaddr, bool is_v6);
diff --git a/drivers/net/ipvlan/ipvlan_core.c b/drivers/net/ipvlan/ipvlan_core.c
index ce06a06d8a28..8b2c2d455ea5 100644
--- a/drivers/net/ipvlan/ipvlan_core.c
+++ b/drivers/net/ipvlan/ipvlan_core.c
@@ -468,8 +468,30 @@ static inline bool is_ipv6_usable(const struct in6_addr *addr)
 	       !ipv6_addr_any(addr);
 }
 
-static void ipvlan_addr_learn(struct ipvl_dev *ipvlan, void *lyr3h,
-			      int addr_type, const u8 *hwaddr)
+static bool ipvlan_is_portaddr_busy(struct ipvl_dev *ipvlan,
+				    void *addr, bool is_v6)
+{
+	const struct in_ifaddr *ifa;
+	struct in_device *in_dev;
+
+	if (is_v6)
+		return ipv6_chk_addr(dev_net(ipvlan->phy_dev), addr,
+				    ipvlan->phy_dev, 1);
+
+	in_dev = __in_dev_get_rcu(ipvlan->phy_dev);
+	if (!in_dev)
+		return false;
+
+	in_dev_for_each_ifa_rcu(ifa, in_dev)
+		if (ifa->ifa_local == *(__be32 *)addr)
+			return true;
+
+	return false;
+}
+
+/* return -1 if frame should be dropped. */
+static int ipvlan_addr_learn(struct ipvl_dev *ipvlan, void *lyr3h,
+			     int addr_type, const u8 *hwaddr)
 {
 	struct ipvl_addr *ipvladdr;
 	void *addr = NULL;
@@ -483,7 +505,7 @@ static void ipvlan_addr_learn(struct ipvl_dev *ipvlan, void *lyr3h,
 
 		ip6h = (struct ipv6hdr *)lyr3h;
 		if (!is_ipv6_usable(&ip6h->saddr))
-			return;
+			return 0;
 		is_v6 = true;
 		addr = &ip6h->saddr;
 		break;
@@ -496,7 +518,7 @@ static void ipvlan_addr_learn(struct ipvl_dev *ipvlan, void *lyr3h,
 		ip4h = (struct iphdr *)lyr3h;
 		i4addr = &ip4h->saddr;
 		if (!is_ipv4_usable(*i4addr))
-			return;
+			return 0;
 		is_v6 = false;
 		addr = i4addr;
 		break;
@@ -511,17 +533,20 @@ static void ipvlan_addr_learn(struct ipvl_dev *ipvlan, void *lyr3h,
 		arp_ptr += ipvlan->port->dev->addr_len;
 		i4addr = (__be32 *)arp_ptr;
 		if (!is_ipv4_usable(*i4addr))
-			return;
+			return 0;
 		is_v6 = false;
 		addr = i4addr;
 		break;
 	}
 	default:
-		return;
+		return 0;
 	}
 
 	/* handle situation when MAC changed, but IP is the same. */
 	ipvladdr = ipvlan_ht_addr_lookup(ipvlan->port, addr, is_v6);
+	if (ipvladdr && ipvladdr->is_blocked)
+		return -1;
+
 	if (ipvladdr && !ether_addr_equal(ipvladdr->hwaddr, hwaddr)) {
 		/* del_addr is safe to call, because we are inside xmit*/
 		ipvlan_del_addr(ipvladdr->master, addr, is_v6);
@@ -529,11 +554,17 @@ static void ipvlan_addr_learn(struct ipvl_dev *ipvlan, void *lyr3h,
 	}
 
 	if (!ipvladdr) {
+		bool is_port_ip = ipvlan_is_portaddr_busy(ipvlan, addr, is_v6);
+
 		spin_lock_bh(&ipvlan->addrs_lock);
 		if (!ipvlan_addr_busy(ipvlan->port, addr, is_v6))
-			ipvlan_add_addr(ipvlan, addr, is_v6, hwaddr);
+			ipvlan_add_addr(ipvlan, addr, is_v6, hwaddr, is_port_ip);
 		spin_unlock_bh(&ipvlan->addrs_lock);
+
+		return is_port_ip ? -1 : 0;
 	}
+
+	return 0;
 }
 
 static noinline_for_stack int ipvlan_process_v4_outbound(struct sk_buff *skb)
@@ -724,11 +755,12 @@ static int ipvlan_xmit_mode_l3(struct sk_buff *skb, struct net_device *dev)
 
 	if (!ipvlan_is_vepa(ipvlan->port)) {
 		addr = ipvlan_addr_lookup(ipvlan->port, lyr3h, addr_type, true);
-		if (addr) {
+		if (addr && !addr->is_blocked) {
 			if (ipvlan_is_private(ipvlan->port)) {
 				consume_skb(skb);
 				return NET_XMIT_DROP;
 			}
+
 			ipvlan_rcv_frame(addr, addr_type, &skb, true);
 			return NET_XMIT_SUCCESS;
 		}
@@ -866,8 +898,12 @@ static int ipvlan_xmit_mode_l2(struct sk_buff *skb, struct net_device *dev)
 	lyr3h = ipvlan_get_L3_hdr(ipvlan->port, skb, &addr_type);
 
 	if (ipvlan_is_learnable(ipvlan->port)) {
-		if (lyr3h)
-			ipvlan_addr_learn(ipvlan, lyr3h, addr_type, eth->h_source);
+		if (lyr3h) {
+			if (ipvlan_addr_learn(ipvlan, lyr3h, addr_type,
+					      eth->h_source) < 0)
+				goto out_drop;
+		}
+
 		/* Mark SKB in advance */
 		skb = skb_share_check(skb, GFP_ATOMIC);
 		if (!skb)
@@ -903,7 +939,7 @@ static int ipvlan_xmit_mode_l2(struct sk_buff *skb, struct net_device *dev)
 
 	if (lyr3h) {
 		addr = ipvlan_addr_lookup(ipvlan->port, lyr3h, addr_type, true);
-		if (addr) {
+		if (addr && !addr->is_blocked) {
 			if (ipvlan_is_private(ipvlan->port))
 				goto out_drop;
 
@@ -1016,8 +1052,9 @@ static rx_handler_result_t ipvlan_handle_mode_l3(struct sk_buff **pskb,
 		goto out;
 
 	addr = ipvlan_addr_lookup(port, lyr3h, addr_type, true);
-	if (addr)
+	if (addr && !addr->is_blocked)
 		ret = ipvlan_rcv_frame(addr, addr_type, pskb, false);
+
 out:
 	return ret;
 }
diff --git a/drivers/net/ipvlan/ipvlan_main.c b/drivers/net/ipvlan/ipvlan_main.c
index f1b1f91f94c0..5df6bdeadef5 100644
--- a/drivers/net/ipvlan/ipvlan_main.c
+++ b/drivers/net/ipvlan/ipvlan_main.c
@@ -151,7 +151,7 @@ static int ipvlan_port_receive(struct sk_buff *skb, struct net_device *wdev,
 		goto out;
 
 	addr = ipvlan_addr_lookup(port, lyr3h, addr_type, true);
-	if (addr)
+	if (addr && !addr->is_blocked)
 		return ipvlan_receive(addr->master, skb);
 
 out:
@@ -964,7 +964,7 @@ static int ipvlan_device_event(struct notifier_block *unused,
 
 /* the caller must held the addrs lock */
 int ipvlan_add_addr(struct ipvl_dev *ipvlan, void *iaddr, bool is_v6,
-		    const u8 *hwaddr)
+		    const u8 *hwaddr, bool is_blocked)
 {
 	struct ipvl_addr *addr;
 
@@ -973,6 +973,7 @@ int ipvlan_add_addr(struct ipvl_dev *ipvlan, void *iaddr, bool is_v6,
 		return -ENOMEM;
 
 	addr->master = ipvlan;
+	addr->is_blocked = is_blocked;
 	if (!is_v6) {
 		memcpy(&addr->ip4addr, iaddr, sizeof(struct in_addr));
 		addr->atype = IPVL_IPV4;
@@ -1024,7 +1025,7 @@ static int ipvlan_add_addr6(struct ipvl_dev *ipvlan, struct in6_addr *ip6_addr)
 			  "Failed to add IPv6=%pI6c addr for %s intf\n",
 			  ip6_addr, ipvlan->dev->name);
 	else
-		ret = ipvlan_add_addr(ipvlan, ip6_addr, true, NULL);
+		ret = ipvlan_add_addr(ipvlan, ip6_addr, true, NULL, false);
 	spin_unlock_bh(&ipvlan->addrs_lock);
 	return ret;
 }
@@ -1095,7 +1096,7 @@ static int ipvlan_add_addr4(struct ipvl_dev *ipvlan, struct in_addr *ip4_addr)
 			  "Failed to add IPv4=%pI4 on %s intf.\n",
 			  ip4_addr, ipvlan->dev->name);
 	else
-		ret = ipvlan_add_addr(ipvlan, ip4_addr, false, NULL);
+		ret = ipvlan_add_addr(ipvlan, ip4_addr, false, NULL, false);
 	spin_unlock_bh(&ipvlan->addrs_lock);
 	return ret;
 }
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next 1/8] ipvlan: Implement learnable L2-bridge
  2025-10-21 14:44 ` [PATCH net-next 1/8] " Dmitry Skorodumov
@ 2025-10-22 14:23   ` Simon Horman
  2025-10-23 10:21     ` Dmitry Skorodumov
  0 siblings, 1 reply; 16+ messages in thread
From: Simon Horman @ 2025-10-22 14:23 UTC (permalink / raw)
  To: Dmitry Skorodumov
  Cc: netdev, linux-doc, linux-kernel, andrey.bokhanko, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Jonathan Corbet,
	Andrew Lunn

On Tue, Oct 21, 2025 at 05:44:03PM +0300, Dmitry Skorodumov wrote:
> Now it is possible to create link in L2E mode: learnable
> bridge. The IPs will be learned from TX-packets of child interfaces.

Is there a standard for this approach - where does the L2E name come from?

> 
> Also, dev_add_pack() protocol is attached to the main port
> to support communication from main to child interfaces.
> 
> This mode is intended for the desktop virtual machines, for
> bridging to Wireless interfaces.
> 
> The mode should be specified while creating first child interface.
> It is not possible to change it after this.
> 
> Signed-off-by: Dmitry Skorodumov <skorodumov.dmitry@huawei.com>

...

> diff --git a/drivers/net/ipvlan/ipvlan.h b/drivers/net/ipvlan/ipvlan.h

...

It is still preferred in networking code to linewrap lines
so that they are not wider than 80 columns, where than can be done without
reducing readability. Which appears to be the case here.

Flagged by checkpatch.pl --max-line-length=80

...
> diff --git a/drivers/net/ipvlan/ipvlan_core.c b/drivers/net/ipvlan/ipvlan_core.c

...

> @@ -414,6 +426,77 @@ struct ipvl_addr *ipvlan_addr_lookup(struct ipvl_port *port, void *lyr3h,
>  	return addr;
>  }
>  
> +static inline bool is_ipv4_usable(__be32 addr)
> +{
> +	return !ipv4_is_lbcast(addr) && !ipv4_is_multicast(addr) &&
> +	       !ipv4_is_zeronet(addr);
> +}
> +
> +static inline bool is_ipv6_usable(const struct in6_addr *addr)
> +{
> +	return !ipv6_addr_is_multicast(addr) && !ipv6_addr_loopback(addr) &&
> +	       !ipv6_addr_any(addr);
> +}

Please don't use the inline keyword in .c files unless there
is a demonstrable reason to do so - usually performance.
Rather, please let the compiler inline functions as it sees fit.

> +
> +static void ipvlan_addr_learn(struct ipvl_dev *ipvlan, void *lyr3h,
> +			      int addr_type)
> +{
> +	void *addr = NULL;
> +	bool is_v6;
> +
> +	switch (addr_type) {
> +#if IS_ENABLED(CONFIG_IPV6)
> +	/* No need to handle IPVL_ICMPV6, since it never has valid src-address */
> +	case IPVL_IPV6: {
> +		struct ipv6hdr *ip6h;
> +
> +		ip6h = (struct ipv6hdr *)lyr3h;
> +		if (!is_ipv6_usable(&ip6h->saddr))

It is preferred to avoid #if / #ifdef in order to improve compile coverage
(and, I would argue, readability).

In this case I think that can be achieved by changing the line above to:

		if (!IS_ENABLED(CONFIG_IPV6) || !is_ipv6_usable(&ip6h->saddr))

I think it would be interesting to see if a similar approach can be used
to remove other #if CONFIG_IPV6 conditions in this file, and if successful
provide that as a clean-up as the opening patch in this series.

However, without that, I can see how one could argue for the approach
you have taken here on the basis of consistency.

> +			return;
> +		is_v6 = true;
> +		addr = &ip6h->saddr;
> +		break;
> +	}
> +#endif

...

> @@ -618,15 +701,56 @@ static int ipvlan_xmit_mode_l3(struct sk_buff *skb, struct net_device *dev)
>  
>  static int ipvlan_xmit_mode_l2(struct sk_buff *skb, struct net_device *dev)
>  {
> -	const struct ipvl_dev *ipvlan = netdev_priv(dev);
> -	struct ethhdr *eth = skb_eth_hdr(skb);
> -	struct ipvl_addr *addr;
>  	void *lyr3h;
> +	struct ipvl_addr *addr;
>  	int addr_type;
> +	bool same_mac_addr;
> +	struct ipvl_dev *ipvlan = netdev_priv(dev);
> +	struct ethhdr *eth = skb_eth_hdr(skb);

I realise that the convention is not followed in the existing code,
but please prefer to arrange local variables in reverse xmas tree order -
longest line to shortest.

In this case I think we can avoid moving things away
from that order like this (completely untested):

-	const struct ipvl_dev *ipvlan = netdev_priv(dev);
+	struct ipvl_dev *ipvlan = netdev_priv(dev);
 	struct ethhdr *eth = skb_eth_hdr(skb);
 	struct ipvl_addr *addr;
+	bool same_mac_addr;
 	void *lyr3h;
 	int addr_type;

Likewise elsewhere in this patch.

This too can be helpful in this area
github.com/ecree-solarflare/xmastree/commits/master/

> +
> +	if (ipvlan_is_learnable(ipvlan->port) &&
> +	    ether_addr_equal(eth->h_source, dev->dev_addr)) {
> +		/* ignore tx-packets from host */
> +		goto out_drop;
> +	}
> +
> +	same_mac_addr = ether_addr_equal(eth->h_dest, eth->h_source);
> +
> +	lyr3h = ipvlan_get_L3_hdr(ipvlan->port, skb, &addr_type);
>  
> -	if (!ipvlan_is_vepa(ipvlan->port) &&
> -	    ether_addr_equal(eth->h_dest, eth->h_source)) {
> -		lyr3h = ipvlan_get_L3_hdr(ipvlan->port, skb, &addr_type);
> +	if (ipvlan_is_learnable(ipvlan->port)) {
> +		if (lyr3h)
> +			ipvlan_addr_learn(ipvlan, lyr3h, addr_type);
> +		/* Mark SKB in advance */
> +		skb = skb_share_check(skb, GFP_ATOMIC);
> +		if (!skb)
> +			return NET_XMIT_DROP;

I think that when you drop packets a counter should be incremented.
Likewise elsewhere in this function.

> +		ipvlan_mark_skb(skb, ipvlan->phy_dev);
> +	}
> +
> +	if (is_multicast_ether_addr(eth->h_dest)) {
> +		skb_reset_mac_header(skb);
> +		ipvlan_skb_crossing_ns(skb, NULL);
> +		ipvlan_multicast_enqueue(ipvlan->port, skb, true);
> +		return NET_XMIT_SUCCESS;
> +	}
> +
> +	if (ipvlan_is_vepa(ipvlan->port))
> +		goto tx_phy_dev;
> +
> +	if (!same_mac_addr &&
> +	    ether_addr_equal(eth->h_dest, ipvlan->phy_dev->dev_addr)) {
> +		/* It is a packet from child with destination to main port.
> +		 * Pass it to main.
> +		 */
> +		skb = skb_share_check(skb, GFP_ATOMIC);
> +		if (!skb)
> +			return NET_XMIT_DROP;
> +		skb->pkt_type = PACKET_HOST;
> +		skb->dev = ipvlan->phy_dev;
> +		dev_forward_skb(ipvlan->phy_dev, skb);
> +		return NET_XMIT_SUCCESS;
> +	} else if (same_mac_addr) {
>  		if (lyr3h) {
>  			addr = ipvlan_addr_lookup(ipvlan->port, lyr3h, addr_type, true);
>  			if (addr) {
> @@ -649,16 +773,14 @@ static int ipvlan_xmit_mode_l2(struct sk_buff *skb, struct net_device *dev)
>  		 */
>  		dev_forward_skb(ipvlan->phy_dev, skb);
>  		return NET_XMIT_SUCCESS;
> -
> -	} else if (is_multicast_ether_addr(eth->h_dest)) {
> -		skb_reset_mac_header(skb);
> -		ipvlan_skb_crossing_ns(skb, NULL);
> -		ipvlan_multicast_enqueue(ipvlan->port, skb, true);
> -		return NET_XMIT_SUCCESS;
>  	}
>  
> +tx_phy_dev:
>  	skb->dev = ipvlan->phy_dev;
>  	return dev_queue_xmit(skb);
> +out_drop:
> +	consume_skb(skb);
> +	return NET_XMIT_DROP;
>  }
>  
>  int ipvlan_queue_xmit(struct sk_buff *skb, struct net_device *dev)

...

> diff --git a/drivers/net/ipvlan/ipvlan_main.c b/drivers/net/ipvlan/ipvlan_main.c

...

> +static int ipvlan_port_receive(struct sk_buff *skb, struct net_device *wdev,
> +			       struct packet_type *pt, struct net_device *orig_wdev)
> +{
> +	struct ipvl_port *port;
> +	struct ipvl_addr *addr;
> +	struct ethhdr *eth;
> +	void *lyr3h;
> +	int addr_type;
> +
> +	port = container_of(pt, struct ipvl_port, ipvl_ptype);
> +	/* We are interested only in outgoing packets.
> +	 * rx-path is handled in rx_handler().
> +	 */
> +	if (skb->pkt_type != PACKET_OUTGOING || ipvlan_is_skb_marked(skb, port->dev))
> +		goto out;
> +
> +	skb = skb_share_check(skb, GFP_ATOMIC);
> +	if (!skb)
> +		goto no_mem;
> +
> +	/* data should point to eth-header */
> +	skb_push(skb, skb->data - skb_mac_header(skb));
> +	skb->dev = port->dev;
> +	eth = eth_hdr(skb);
> +
> +	if (is_multicast_ether_addr(eth->h_dest)) {
> +		ipvlan_skb_crossing_ns(skb, NULL);
> +		skb->protocol = eth_type_trans(skb, skb->dev);
> +		skb->pkt_type = PACKET_HOST;
> +		ipvlan_mark_skb(skb, port->dev);
> +		ipvlan_multicast_enqueue(port, skb, false);
> +		return 0;
> +	}
> +
> +	lyr3h = ipvlan_get_L3_hdr(port, skb, &addr_type);
> +	if (!lyr3h)
> +		goto out;
> +
> +	addr = ipvlan_addr_lookup(port, lyr3h, addr_type, true);
> +	if (addr) {
> +		int ret, len;
> +
> +		ipvlan_skb_crossing_ns(skb, addr->master->dev);
> +		skb->protocol = eth_type_trans(skb, skb->dev);
> +		skb->pkt_type = PACKET_HOST;
> +		ipvlan_mark_skb(skb, port->dev);
> +		len = skb->len + ETH_HLEN;
> +		ret = netif_rx(skb);
> +		ipvlan_count_rx(ipvlan, len, ret == NET_RX_SUCCESS, false);

This fails to build because ipvlan is not declared in this scope.
Perhaps something got missed due to an edit?

> +		return 0;
> +	}
> +
> +out:
> +	dev_kfree_skb(skb);
> +no_mem:
> +	return 0; // actually, ret value is ignored

Maybe, but it seems to me that the return values
should follow that of netif_receive_skb_core().

> +}

...

-- 
pw-bot: changes-requested

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next 6/8] ipvlan: Support GSO for port -> ipvlan
  2025-10-21 14:44 ` [PATCH net-next 6/8] ipvlan: Support GSO for port -> ipvlan Dmitry Skorodumov
@ 2025-10-22 20:55   ` kernel test robot
  0 siblings, 0 replies; 16+ messages in thread
From: kernel test robot @ 2025-10-22 20:55 UTC (permalink / raw)
  To: Dmitry Skorodumov, netdev, linux-kernel
  Cc: llvm, oe-kbuild-all, andrey.bokhanko, Dmitry Skorodumov,
	Andrew Lunn, Eric Dumazet, Jakub Kicinski, Paolo Abeni

Hi Dmitry,

kernel test robot noticed the following build warnings:

[auto build test WARNING on net-next/main]

url:    https://github.com/intel-lab-lkp/linux/commits/Dmitry-Skorodumov/ipvlan-Implement-learnable-L2-bridge/20251021-224923
base:   net-next/main
patch link:    https://lore.kernel.org/r/20251021144410.257905-7-skorodumov.dmitry%40huawei.com
patch subject: [PATCH net-next 6/8] ipvlan: Support GSO for port -> ipvlan
config: i386-buildonly-randconfig-004-20251023 (https://download.01.org/0day-ci/archive/20251023/202510230401.r4e62ODH-lkp@intel.com/config)
compiler: clang version 20.1.8 (https://github.com/llvm/llvm-project 87f0227cb60147a26a1eeb4fb06e3b505e9c7261)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20251023/202510230401.r4e62ODH-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202510230401.r4e62ODH-lkp@intel.com/

All warnings (new ones prefixed by >>):

>> drivers/net/ipvlan/ipvlan_main.c:943:27: warning: variable 'ipvlan' is uninitialized when used here [-Wuninitialized]
     943 |                 if (ipvlan_is_learnable(ipvlan->port))
         |                                         ^~~~~~
   drivers/net/ipvlan/ipvlan_main.c:870:25: note: initialize the variable 'ipvlan' to silence this warning
     870 |         struct ipvl_dev *ipvlan, *next;
         |                                ^
         |                                 = NULL
   1 warning generated.


vim +/ipvlan +943 drivers/net/ipvlan/ipvlan_main.c

1fb81b882de575 Dmitry Skorodumov  2025-10-21  863  
2ad7bf3638411c Mahesh Bandewar    2014-11-23  864  static int ipvlan_device_event(struct notifier_block *unused,
2ad7bf3638411c Mahesh Bandewar    2014-11-23  865  			       unsigned long event, void *ptr)
2ad7bf3638411c Mahesh Bandewar    2014-11-23  866  {
61345fab484b97 Petr Machata       2018-12-13  867  	struct netlink_ext_ack *extack = netdev_notifier_info_to_extack(ptr);
61345fab484b97 Petr Machata       2018-12-13  868  	struct netdev_notifier_pre_changeaddr_info *prechaddr_info;
2ad7bf3638411c Mahesh Bandewar    2014-11-23  869  	struct net_device *dev = netdev_notifier_info_to_dev(ptr);
2ad7bf3638411c Mahesh Bandewar    2014-11-23  870  	struct ipvl_dev *ipvlan, *next;
2ad7bf3638411c Mahesh Bandewar    2014-11-23  871  	struct ipvl_port *port;
2ad7bf3638411c Mahesh Bandewar    2014-11-23  872  	LIST_HEAD(lst_kill);
61345fab484b97 Petr Machata       2018-12-13  873  	int err;
2ad7bf3638411c Mahesh Bandewar    2014-11-23  874  
1fb81b882de575 Dmitry Skorodumov  2025-10-21  875  	if (event == NETDEV_DOWN && ipvlan_is_valid_dev(dev)) {
1fb81b882de575 Dmitry Skorodumov  2025-10-21  876  		struct ipvl_dev *ipvlan = netdev_priv(dev);
1fb81b882de575 Dmitry Skorodumov  2025-10-21  877  
1fb81b882de575 Dmitry Skorodumov  2025-10-21  878  		ipvlan_addrs_forget_all(ipvlan);
1fb81b882de575 Dmitry Skorodumov  2025-10-21  879  		return NOTIFY_DONE;
1fb81b882de575 Dmitry Skorodumov  2025-10-21  880  	}
1fb81b882de575 Dmitry Skorodumov  2025-10-21  881  
5933fea7aa7237 Mahesh Bandewar    2014-12-06  882  	if (!netif_is_ipvlan_port(dev))
2ad7bf3638411c Mahesh Bandewar    2014-11-23  883  		return NOTIFY_DONE;
2ad7bf3638411c Mahesh Bandewar    2014-11-23  884  
2ad7bf3638411c Mahesh Bandewar    2014-11-23  885  	port = ipvlan_port_get_rtnl(dev);
2ad7bf3638411c Mahesh Bandewar    2014-11-23  886  
2ad7bf3638411c Mahesh Bandewar    2014-11-23  887  	switch (event) {
57fb346cc7d0fc Di Zhu             2021-07-29  888  	case NETDEV_UP:
22978397083888 Venkat Venkatsubra 2024-04-05  889  	case NETDEV_DOWN:
2ad7bf3638411c Mahesh Bandewar    2014-11-23  890  	case NETDEV_CHANGE:
2ad7bf3638411c Mahesh Bandewar    2014-11-23  891  		list_for_each_entry(ipvlan, &port->ipvlans, pnode)
2ad7bf3638411c Mahesh Bandewar    2014-11-23  892  			netif_stacked_transfer_operstate(ipvlan->phy_dev,
2ad7bf3638411c Mahesh Bandewar    2014-11-23  893  							 ipvlan->dev);
2ad7bf3638411c Mahesh Bandewar    2014-11-23  894  		break;
2ad7bf3638411c Mahesh Bandewar    2014-11-23  895  
3133822f5ac13b Florian Westphal   2017-04-20  896  	case NETDEV_REGISTER: {
3133822f5ac13b Florian Westphal   2017-04-20  897  		struct net *oldnet, *newnet = dev_net(dev);
3133822f5ac13b Florian Westphal   2017-04-20  898  
3133822f5ac13b Florian Westphal   2017-04-20  899  		oldnet = read_pnet(&port->pnet);
3133822f5ac13b Florian Westphal   2017-04-20  900  		if (net_eq(newnet, oldnet))
3133822f5ac13b Florian Westphal   2017-04-20  901  			break;
3133822f5ac13b Florian Westphal   2017-04-20  902  
3133822f5ac13b Florian Westphal   2017-04-20  903  		write_pnet(&port->pnet, newnet);
3133822f5ac13b Florian Westphal   2017-04-20  904  
043d5f68d0ccdd Lu Wei             2023-08-17  905  		if (port->mode == IPVLAN_MODE_L3S)
c675e06a98a474 Daniel Borkmann    2019-02-08  906  			ipvlan_migrate_l3s_hook(oldnet, newnet);
3133822f5ac13b Florian Westphal   2017-04-20  907  		break;
3133822f5ac13b Florian Westphal   2017-04-20  908  	}
2ad7bf3638411c Mahesh Bandewar    2014-11-23  909  	case NETDEV_UNREGISTER:
2ad7bf3638411c Mahesh Bandewar    2014-11-23  910  		if (dev->reg_state != NETREG_UNREGISTERING)
2ad7bf3638411c Mahesh Bandewar    2014-11-23  911  			break;
2ad7bf3638411c Mahesh Bandewar    2014-11-23  912  
8230819494b3bf Paolo Abeni        2018-02-28  913  		list_for_each_entry_safe(ipvlan, next, &port->ipvlans, pnode)
2ad7bf3638411c Mahesh Bandewar    2014-11-23  914  			ipvlan->dev->rtnl_link_ops->dellink(ipvlan->dev,
2ad7bf3638411c Mahesh Bandewar    2014-11-23  915  							    &lst_kill);
2ad7bf3638411c Mahesh Bandewar    2014-11-23  916  		unregister_netdevice_many(&lst_kill);
2ad7bf3638411c Mahesh Bandewar    2014-11-23  917  		break;
2ad7bf3638411c Mahesh Bandewar    2014-11-23  918  
2ad7bf3638411c Mahesh Bandewar    2014-11-23  919  	case NETDEV_FEAT_CHANGE:
2ad7bf3638411c Mahesh Bandewar    2014-11-23  920  		list_for_each_entry(ipvlan, &port->ipvlans, pnode) {
6df6398f7c8b48 Jakub Kicinski     2022-05-05  921  			netif_inherit_tso_max(ipvlan->dev, dev);
d0f5c7076e01fe Mahesh Bandewar    2020-08-14  922  			netdev_update_features(ipvlan->dev);
2ad7bf3638411c Mahesh Bandewar    2014-11-23  923  		}
2ad7bf3638411c Mahesh Bandewar    2014-11-23  924  		break;
2ad7bf3638411c Mahesh Bandewar    2014-11-23  925  
2ad7bf3638411c Mahesh Bandewar    2014-11-23  926  	case NETDEV_CHANGEMTU:
2ad7bf3638411c Mahesh Bandewar    2014-11-23  927  		list_for_each_entry(ipvlan, &port->ipvlans, pnode)
2ad7bf3638411c Mahesh Bandewar    2014-11-23  928  			ipvlan_adjust_mtu(ipvlan, dev);
2ad7bf3638411c Mahesh Bandewar    2014-11-23  929  		break;
2ad7bf3638411c Mahesh Bandewar    2014-11-23  930  
61345fab484b97 Petr Machata       2018-12-13  931  	case NETDEV_PRE_CHANGEADDR:
61345fab484b97 Petr Machata       2018-12-13  932  		prechaddr_info = ptr;
61345fab484b97 Petr Machata       2018-12-13  933  		list_for_each_entry(ipvlan, &port->ipvlans, pnode) {
0413a34ef678c3 Stanislav Fomichev 2025-07-17  934  			err = netif_pre_changeaddr_notify(ipvlan->dev,
61345fab484b97 Petr Machata       2018-12-13  935  							  prechaddr_info->dev_addr,
61345fab484b97 Petr Machata       2018-12-13  936  							  extack);
61345fab484b97 Petr Machata       2018-12-13  937  			if (err)
61345fab484b97 Petr Machata       2018-12-13  938  				return notifier_from_errno(err);
61345fab484b97 Petr Machata       2018-12-13  939  		}
61345fab484b97 Petr Machata       2018-12-13  940  		break;
61345fab484b97 Petr Machata       2018-12-13  941  
32c10bbfe914c7 Mahesh Bandewar    2017-10-11  942  	case NETDEV_CHANGEADDR:
711f25b2660608 Dmitry Skorodumov  2025-10-21 @943  		if (ipvlan_is_learnable(ipvlan->port))
711f25b2660608 Dmitry Skorodumov  2025-10-21  944  			break;
711f25b2660608 Dmitry Skorodumov  2025-10-21  945  
ab452c3ce7bacb Keefe Liu          2018-05-14  946  		list_for_each_entry(ipvlan, &port->ipvlans, pnode) {
e35b8d7dbb094c Jakub Kicinski     2021-10-01  947  			eth_hw_addr_set(ipvlan->dev, dev->dev_addr);
ab452c3ce7bacb Keefe Liu          2018-05-14  948  			call_netdevice_notifiers(NETDEV_CHANGEADDR, ipvlan->dev);
ab452c3ce7bacb Keefe Liu          2018-05-14  949  		}
32c10bbfe914c7 Mahesh Bandewar    2017-10-11  950  		break;
32c10bbfe914c7 Mahesh Bandewar    2017-10-11  951  
2ad7bf3638411c Mahesh Bandewar    2014-11-23  952  	case NETDEV_PRE_TYPE_CHANGE:
2ad7bf3638411c Mahesh Bandewar    2014-11-23  953  		/* Forbid underlying device to change its type. */
2ad7bf3638411c Mahesh Bandewar    2014-11-23  954  		return NOTIFY_BAD;
e79a98e68b96a9 Etienne Champetier 2025-01-08  955  
e79a98e68b96a9 Etienne Champetier 2025-01-08  956  	case NETDEV_NOTIFY_PEERS:
e79a98e68b96a9 Etienne Champetier 2025-01-08  957  	case NETDEV_BONDING_FAILOVER:
e79a98e68b96a9 Etienne Champetier 2025-01-08  958  	case NETDEV_RESEND_IGMP:
e79a98e68b96a9 Etienne Champetier 2025-01-08  959  		list_for_each_entry(ipvlan, &port->ipvlans, pnode)
e79a98e68b96a9 Etienne Champetier 2025-01-08  960  			call_netdevice_notifiers(event, ipvlan->dev);
2ad7bf3638411c Mahesh Bandewar    2014-11-23  961  	}
2ad7bf3638411c Mahesh Bandewar    2014-11-23  962  	return NOTIFY_DONE;
2ad7bf3638411c Mahesh Bandewar    2014-11-23  963  }
2ad7bf3638411c Mahesh Bandewar    2014-11-23  964  

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next 7/8] ipvlan: Support IPv6 for learnable l2-bridge
  2025-10-21 14:44 ` [PATCH net-next 7/8] ipvlan: Support IPv6 for learnable l2-bridge Dmitry Skorodumov
@ 2025-10-23  0:03   ` kernel test robot
  2025-10-23  0:24   ` kernel test robot
  2025-10-24  2:21   ` kernel test robot
  2 siblings, 0 replies; 16+ messages in thread
From: kernel test robot @ 2025-10-23  0:03 UTC (permalink / raw)
  To: Dmitry Skorodumov, netdev, linux-kernel
  Cc: llvm, oe-kbuild-all, andrey.bokhanko, Dmitry Skorodumov,
	Andrew Lunn, Eric Dumazet, Jakub Kicinski, Paolo Abeni

Hi Dmitry,

kernel test robot noticed the following build errors:

[auto build test ERROR on net-next/main]

url:    https://github.com/intel-lab-lkp/linux/commits/Dmitry-Skorodumov/ipvlan-Implement-learnable-L2-bridge/20251021-224923
base:   net-next/main
patch link:    https://lore.kernel.org/r/20251021144410.257905-8-skorodumov.dmitry%40huawei.com
patch subject: [PATCH net-next 7/8] ipvlan: Support IPv6 for learnable l2-bridge
config: hexagon-allmodconfig (https://download.01.org/0day-ci/archive/20251023/202510230706.1LUrP6NA-lkp@intel.com/config)
compiler: clang version 17.0.6 (https://github.com/llvm/llvm-project 6009708b4367171ccdbf4b5905cb6a803753fe18)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20251023/202510230706.1LUrP6NA-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202510230706.1LUrP6NA-lkp@intel.com/

All errors (new ones prefixed by >>):

>> drivers/net/ipvlan/ipvlan_core.c:832:23: error: call to undeclared function 'csum_ipv6_magic'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
     832 |         icmph->icmp6_cksum = csum_ipv6_magic(&ip6h->saddr, &ip6h->daddr,
         |                              ^
   drivers/net/ipvlan/ipvlan_core.c:832:23: note: did you mean 'csum_tcpudp_magic'?
   arch/hexagon/include/asm/checksum.h:21:9: note: 'csum_tcpudp_magic' declared here
      21 | __sum16 csum_tcpudp_magic(__be32 saddr, __be32 daddr,
         |         ^
   arch/hexagon/include/asm/checksum.h:20:27: note: expanded from macro 'csum_tcpudp_magic'
      20 | #define csum_tcpudp_magic csum_tcpudp_magic
         |                           ^
   1 error generated.


vim +/csum_ipv6_magic +832 drivers/net/ipvlan/ipvlan_core.c

   788	
   789	static void ipvlan_snat_patch_tx_ipv6(struct ipvl_dev *ipvlan,
   790					      struct sk_buff *skb)
   791	{
   792		struct ipv6hdr *ip6h;
   793		struct icmp6hdr *icmph;
   794		u8 icmp_option;
   795		u8 *lladdr;
   796		u16 ndsize;
   797	
   798		if (unlikely(!pskb_may_pull(skb, sizeof(*ip6h))))
   799			return;
   800	
   801		if (ipv6_hdr(skb)->nexthdr != NEXTHDR_ICMP)
   802			return;
   803	
   804		if (unlikely(!pskb_may_pull(skb, sizeof(*ip6h) + sizeof(*icmph))))
   805			return;
   806	
   807		ip6h = ipv6_hdr(skb);
   808		icmph = (struct icmp6hdr *)(ip6h + 1);
   809	
   810		/* Patch Source-LL for solicitation, Target-LL for advertisement */
   811		if (icmph->icmp6_type == NDISC_NEIGHBOUR_SOLICITATION ||
   812		    icmph->icmp6_type == NDISC_ROUTER_SOLICITATION)
   813			icmp_option = ND_OPT_SOURCE_LL_ADDR;
   814		else if (icmph->icmp6_type == NDISC_NEIGHBOUR_ADVERTISEMENT)
   815			icmp_option = ND_OPT_TARGET_LL_ADDR;
   816		else
   817			return;
   818	
   819		ndsize = htons(ip6h->payload_len);
   820		if (unlikely(!pskb_may_pull(skb, sizeof(*ip6h) + ndsize)))
   821			return;
   822	
   823		lladdr = ipvlan_search_icmp6_ll_addr(skb, icmp_option);
   824		if (!lladdr)
   825			return;
   826	
   827		ether_addr_copy(lladdr, ipvlan->phy_dev->dev_addr);
   828	
   829		ip6h = ipv6_hdr(skb);
   830		icmph = (struct icmp6hdr *)(ip6h + 1);
   831		icmph->icmp6_cksum = 0;
 > 832		icmph->icmp6_cksum = csum_ipv6_magic(&ip6h->saddr, &ip6h->daddr,
   833						     ndsize,
   834						     IPPROTO_ICMPV6,
   835						     csum_partial(icmph,
   836								  ndsize,
   837								  0));
   838		skb->ip_summed = CHECKSUM_COMPLETE;
   839	}
   840	#else
   841	static void ipvlan_snat_patch_tx_ipv6(struct ipvl_dev *ipvlan,
   842					      struct sk_buff *skb)
   843	{
   844	}
   845	#endif
   846	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next 7/8] ipvlan: Support IPv6 for learnable l2-bridge
  2025-10-21 14:44 ` [PATCH net-next 7/8] ipvlan: Support IPv6 for learnable l2-bridge Dmitry Skorodumov
  2025-10-23  0:03   ` kernel test robot
@ 2025-10-23  0:24   ` kernel test robot
  2025-10-24  2:21   ` kernel test robot
  2 siblings, 0 replies; 16+ messages in thread
From: kernel test robot @ 2025-10-23  0:24 UTC (permalink / raw)
  To: Dmitry Skorodumov, netdev, linux-kernel
  Cc: oe-kbuild-all, andrey.bokhanko, Dmitry Skorodumov, Andrew Lunn,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni

Hi Dmitry,

kernel test robot noticed the following build errors:

[auto build test ERROR on net-next/main]

url:    https://github.com/intel-lab-lkp/linux/commits/Dmitry-Skorodumov/ipvlan-Implement-learnable-L2-bridge/20251021-224923
base:   net-next/main
patch link:    https://lore.kernel.org/r/20251021144410.257905-8-skorodumov.dmitry%40huawei.com
patch subject: [PATCH net-next 7/8] ipvlan: Support IPv6 for learnable l2-bridge
config: x86_64-rhel-9.4-func (https://download.01.org/0day-ci/archive/20251023/202510230845.sa3eZvoK-lkp@intel.com/config)
compiler: gcc-14 (Debian 14.2.0-19) 14.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20251023/202510230845.sa3eZvoK-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202510230845.sa3eZvoK-lkp@intel.com/

All errors (new ones prefixed by >>):

   drivers/net/ipvlan/ipvlan_core.c: In function 'ipvlan_snat_patch_tx_ipv6':
>> drivers/net/ipvlan/ipvlan_core.c:832:30: error: implicit declaration of function 'csum_ipv6_magic'; did you mean 'csum_tcpudp_magic'? [-Wimplicit-function-declaration]
     832 |         icmph->icmp6_cksum = csum_ipv6_magic(&ip6h->saddr, &ip6h->daddr,
         |                              ^~~~~~~~~~~~~~~
         |                              csum_tcpudp_magic


vim +832 drivers/net/ipvlan/ipvlan_core.c

   788	
   789	static void ipvlan_snat_patch_tx_ipv6(struct ipvl_dev *ipvlan,
   790					      struct sk_buff *skb)
   791	{
   792		struct ipv6hdr *ip6h;
   793		struct icmp6hdr *icmph;
   794		u8 icmp_option;
   795		u8 *lladdr;
   796		u16 ndsize;
   797	
   798		if (unlikely(!pskb_may_pull(skb, sizeof(*ip6h))))
   799			return;
   800	
   801		if (ipv6_hdr(skb)->nexthdr != NEXTHDR_ICMP)
   802			return;
   803	
   804		if (unlikely(!pskb_may_pull(skb, sizeof(*ip6h) + sizeof(*icmph))))
   805			return;
   806	
   807		ip6h = ipv6_hdr(skb);
   808		icmph = (struct icmp6hdr *)(ip6h + 1);
   809	
   810		/* Patch Source-LL for solicitation, Target-LL for advertisement */
   811		if (icmph->icmp6_type == NDISC_NEIGHBOUR_SOLICITATION ||
   812		    icmph->icmp6_type == NDISC_ROUTER_SOLICITATION)
   813			icmp_option = ND_OPT_SOURCE_LL_ADDR;
   814		else if (icmph->icmp6_type == NDISC_NEIGHBOUR_ADVERTISEMENT)
   815			icmp_option = ND_OPT_TARGET_LL_ADDR;
   816		else
   817			return;
   818	
   819		ndsize = htons(ip6h->payload_len);
   820		if (unlikely(!pskb_may_pull(skb, sizeof(*ip6h) + ndsize)))
   821			return;
   822	
   823		lladdr = ipvlan_search_icmp6_ll_addr(skb, icmp_option);
   824		if (!lladdr)
   825			return;
   826	
   827		ether_addr_copy(lladdr, ipvlan->phy_dev->dev_addr);
   828	
   829		ip6h = ipv6_hdr(skb);
   830		icmph = (struct icmp6hdr *)(ip6h + 1);
   831		icmph->icmp6_cksum = 0;
 > 832		icmph->icmp6_cksum = csum_ipv6_magic(&ip6h->saddr, &ip6h->daddr,
   833						     ndsize,
   834						     IPPROTO_ICMPV6,
   835						     csum_partial(icmph,
   836								  ndsize,
   837								  0));
   838		skb->ip_summed = CHECKSUM_COMPLETE;
   839	}
   840	#else
   841	static void ipvlan_snat_patch_tx_ipv6(struct ipvl_dev *ipvlan,
   842					      struct sk_buff *skb)
   843	{
   844	}
   845	#endif
   846	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next 1/8] ipvlan: Implement learnable L2-bridge
  2025-10-22 14:23   ` Simon Horman
@ 2025-10-23 10:21     ` Dmitry Skorodumov
  2025-10-23 11:31       ` Simon Horman
  0 siblings, 1 reply; 16+ messages in thread
From: Dmitry Skorodumov @ 2025-10-23 10:21 UTC (permalink / raw)
  To: Simon Horman
  Cc: netdev, linux-doc, linux-kernel, andrey.bokhanko, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Jonathan Corbet,
	Andrew Lunn

On 22.10.2025 17:23, Simon Horman wrote:
> On Tue, Oct 21, 2025 at 05:44:03PM +0300, Dmitry Skorodumov wrote:
>> Now it is possible to create link in L2E mode: learnable
>> bridge. The IPs will be learned from TX-packets of child interfaces.
> Is there a standard for this approach - where does the L2E name come from?

Actually, I meant "E" here as "Extended". But more or less standard naming - is "MAC NAT" - "Mac network address translation". I discussed a bit naming with LLM, and it suggested name "macsnat".. looks like  it is a better name. Hope it is ok, but I don't mind to rename if anyone has better idea

> ...
>
> It is still preferred in networking code to linewrap lines
> so that they are not wider than 80 columns, where than can be done without
> reducing readability. Which appears to be the case here.
>
> Flagged by checkpatch.pl --max-line-length=80
...
> Please don't use the inline keyword in .c files

Thank you, this will be fixed

>> +static void ipvlan_addr_learn(struct ipvl_dev *ipvlan, void *lyr3h,
>> +			      int addr_type)
>> +{
>> +	void *addr = NULL;
>> +	bool is_v6;
>> +
>> +	switch (addr_type) {
>> +#if IS_ENABLED(CONFIG_IPV6)
>> +	/* No need to handle IPVL_ICMPV6, since it never has valid src-address */
>> +	case IPVL_IPV6: {
>> +		struct ipv6hdr *ip6h;
>> +
>> +		ip6h = (struct ipv6hdr *)lyr3h;
>> +		if (!is_ipv6_usable(&ip6h->saddr))
> It is preferred to avoid #if / #ifdef in order to improve compile coverage
> (and, I would argue, readability).
..
> In this case I think that can be achieved by changing the line above to:
>
> 		if (!IS_ENABLED(CONFIG_IPV6) || !is_ipv6_usable(&ip6h->saddr))
>
> I think it would be interesting to see if a similar approach can be used
> to remove other #if CONFIG_IPV6 conditions in this file, and if successful
> provide that as a clean-up as the opening patch in this series.
>
> However, without that, I can see how one could argue for the approach
> you have taken here on the basis of consistency.
>

Hmmmm.... this raises a complicated for me questions of testing this refactoring: 

- whether IPv6 specific functions (like csum_ipv6_magic(), register_inet6addr_notifier()) are available if kernel is compiled without CONFIG_IPV6

- ideally the code should be retested with kernel without CONFIG_IPV6

This looks like a separate work that requires more or less additional efforts...

> static int ipvlan_xmit_mode_l2(struct sk_buff *skb, struct net_device *dev)
>>  {
>> -	const struct ipvl_dev *ipvlan = netdev_priv(dev);
>> -	struct ethhdr *eth = skb_eth_hdr(skb);
>> -	struct ipvl_addr *addr;
>>  	void *lyr3h;
>> +	struct ipvl_addr *addr;
>>  	int addr_type;
>> +	bool same_mac_addr;
>> +	struct ipvl_dev *ipvlan = netdev_priv(dev);
>> +	struct ethhdr *eth = skb_eth_hdr(skb);
> I realise that the convention is not followed in the existing code,
> but please prefer to arrange local variables in reverse xmas tree order -
> longest line to shortest.
I fixed all my changes to follow this style, except one - where it seems a bit unnatural to to declare dependent variable before "parent" variable. Hope it is ok.
>> +	    ether_addr_equal(eth->h_source, dev->dev_addr)) {
>> +		/* ignore tx-packets from host */
>> +		goto out_drop;
>> +	}
>> +
>> +	same_mac_addr = ether_addr_equal(eth->h_dest, eth->h_source);
>> +
>> +	lyr3h = ipvlan_get_L3_hdr(ipvlan->port, skb, &addr_type);
>>  
>> -	if (!ipvlan_is_vepa(ipvlan->port) &&
>> -	    ether_addr_equal(eth->h_dest, eth->h_source)) {
>> -		lyr3h = ipvlan_get_L3_hdr(ipvlan->port, skb, &addr_type);
>> +	if (ipvlan_is_learnable(ipvlan->port)) {
>> +		if (lyr3h)
>> +			ipvlan_addr_learn(ipvlan, lyr3h, addr_type);
>> +		/* Mark SKB in advance */
>> +		skb = skb_share_check(skb, GFP_ATOMIC);
>> +		if (!skb)
>> +			return NET_XMIT_DROP;
> I think that when you drop packets a counter should be incremented.
> Likewise elsewhere in this function.
The counter appears to be handled in parent function - in ipvlan_start_xmit()
>> +	addr = ipvlan_addr_lookup(port, lyr3h, addr_type, true);
>> +	if (addr) {
>> +		int ret, len;
>> +
>> +		ipvlan_skb_crossing_ns(skb, addr->master->dev);
>> +		skb->protocol = eth_type_trans(skb, skb->dev);
>> +		skb->pkt_type = PACKET_HOST;
>> +		ipvlan_mark_skb(skb, port->dev);
>> +		len = skb->len + ETH_HLEN;
>> +		ret = netif_rx(skb);
>> +		ipvlan_count_rx(ipvlan, len, ret == NET_RX_SUCCESS, false);
>>
>> This fails to build because ipvlan is not declared in this scope.
>> Perhaps something got missed due to an edit?
Oops, really. Compilation was fixed in later patches.
>> +
>> +out:
>> +	dev_kfree_skb(skb);
>> +no_mem:
>> +	return 0; // actually, ret value is ignored
> Maybe, but it seems to me that the return values
> should follow that of netif_receive_skb_core().
Agree.. will be fixed.

Dmitru


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next 1/8] ipvlan: Implement learnable L2-bridge
  2025-10-23 10:21     ` Dmitry Skorodumov
@ 2025-10-23 11:31       ` Simon Horman
  0 siblings, 0 replies; 16+ messages in thread
From: Simon Horman @ 2025-10-23 11:31 UTC (permalink / raw)
  To: Dmitry Skorodumov
  Cc: netdev, linux-doc, linux-kernel, andrey.bokhanko, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Jonathan Corbet,
	Andrew Lunn

On Thu, Oct 23, 2025 at 01:21:20PM +0300, Dmitry Skorodumov wrote:
> On 22.10.2025 17:23, Simon Horman wrote:
> > On Tue, Oct 21, 2025 at 05:44:03PM +0300, Dmitry Skorodumov wrote:
> >> Now it is possible to create link in L2E mode: learnable
> >> bridge. The IPs will be learned from TX-packets of child interfaces.
> > Is there a standard for this approach - where does the L2E name come from?
> 
> Actually, I meant "E" here as "Extended". But more or less standard naming - is "MAC NAT" - "Mac network address translation". I discussed a bit naming with LLM, and it suggested name "macsnat".. looks like  it is a better name. Hope it is ok, but I don't mind to rename if anyone has better idea

I was more curious than anything else. But perhaps it would
be worth providing some explanation of the name in the
commit message.

...

> >> +static void ipvlan_addr_learn(struct ipvl_dev *ipvlan, void *lyr3h,
> >> +			      int addr_type)
> >> +{
> >> +	void *addr = NULL;
> >> +	bool is_v6;
> >> +
> >> +	switch (addr_type) {
> >> +#if IS_ENABLED(CONFIG_IPV6)
> >> +	/* No need to handle IPVL_ICMPV6, since it never has valid src-address */
> >> +	case IPVL_IPV6: {
> >> +		struct ipv6hdr *ip6h;
> >> +
> >> +		ip6h = (struct ipv6hdr *)lyr3h;
> >> +		if (!is_ipv6_usable(&ip6h->saddr))
> > It is preferred to avoid #if / #ifdef in order to improve compile coverage
> > (and, I would argue, readability).
> ..
> > In this case I think that can be achieved by changing the line above to:
> >
> > 		if (!IS_ENABLED(CONFIG_IPV6) || !is_ipv6_usable(&ip6h->saddr))
> >
> > I think it would be interesting to see if a similar approach can be used
> > to remove other #if CONFIG_IPV6 conditions in this file, and if successful
> > provide that as a clean-up as the opening patch in this series.
> >
> > However, without that, I can see how one could argue for the approach
> > you have taken here on the basis of consistency.
> >
> 
> Hmmmm.... this raises a complicated for me questions of testing this refactoring: 
> 
> - whether IPv6 specific functions (like csum_ipv6_magic(), register_inet6addr_notifier()) are available if kernel is compiled without CONFIG_IPV6
> 
> - ideally the code should be retested with kernel without CONFIG_IPV6
> 
> This looks like a separate work that requires more or less additional efforts...

Understood, I agree this can be left as future work.

> 
> > static int ipvlan_xmit_mode_l2(struct sk_buff *skb, struct net_device *dev)
> >>  {
> >> -	const struct ipvl_dev *ipvlan = netdev_priv(dev);
> >> -	struct ethhdr *eth = skb_eth_hdr(skb);
> >> -	struct ipvl_addr *addr;
> >>  	void *lyr3h;
> >> +	struct ipvl_addr *addr;
> >>  	int addr_type;
> >> +	bool same_mac_addr;
> >> +	struct ipvl_dev *ipvlan = netdev_priv(dev);
> >> +	struct ethhdr *eth = skb_eth_hdr(skb);
> > I realise that the convention is not followed in the existing code,
> > but please prefer to arrange local variables in reverse xmas tree order -
> > longest line to shortest.
> I fixed all my changes to follow this style, except one - where it seems a bit unnatural to to declare dependent variable before "parent" variable. Hope it is ok.

I would lean towards reverse xmas here too.
But I understand if you feel otherwise.
And given the current state of this file, I think that is ok.

> >> +	    ether_addr_equal(eth->h_source, dev->dev_addr)) {
> >> +		/* ignore tx-packets from host */
> >> +		goto out_drop;
> >> +	}
> >> +
> >> +	same_mac_addr = ether_addr_equal(eth->h_dest, eth->h_source);
> >> +
> >> +	lyr3h = ipvlan_get_L3_hdr(ipvlan->port, skb, &addr_type);
> >>  
> >> -	if (!ipvlan_is_vepa(ipvlan->port) &&
> >> -	    ether_addr_equal(eth->h_dest, eth->h_source)) {
> >> -		lyr3h = ipvlan_get_L3_hdr(ipvlan->port, skb, &addr_type);
> >> +	if (ipvlan_is_learnable(ipvlan->port)) {
> >> +		if (lyr3h)
> >> +			ipvlan_addr_learn(ipvlan, lyr3h, addr_type);
> >> +		/* Mark SKB in advance */
> >> +		skb = skb_share_check(skb, GFP_ATOMIC);
> >> +		if (!skb)
> >> +			return NET_XMIT_DROP;
> > I think that when you drop packets a counter should be incremented.
> > Likewise elsewhere in this function.
> The counter appears to be handled in parent function - in ipvlan_start_xmit()

Thanks, I see that now.

> >> +	addr = ipvlan_addr_lookup(port, lyr3h, addr_type, true);
> >> +	if (addr) {
> >> +		int ret, len;
> >> +
> >> +		ipvlan_skb_crossing_ns(skb, addr->master->dev);
> >> +		skb->protocol = eth_type_trans(skb, skb->dev);
> >> +		skb->pkt_type = PACKET_HOST;
> >> +		ipvlan_mark_skb(skb, port->dev);
> >> +		len = skb->len + ETH_HLEN;
> >> +		ret = netif_rx(skb);
> >> +		ipvlan_count_rx(ipvlan, len, ret == NET_RX_SUCCESS, false);
> >>
> >> This fails to build because ipvlan is not declared in this scope.
> >> Perhaps something got missed due to an edit?
> Oops, really. Compilation was fixed in later patches.

Stuff happens :)

...

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next 7/8] ipvlan: Support IPv6 for learnable l2-bridge
  2025-10-21 14:44 ` [PATCH net-next 7/8] ipvlan: Support IPv6 for learnable l2-bridge Dmitry Skorodumov
  2025-10-23  0:03   ` kernel test robot
  2025-10-23  0:24   ` kernel test robot
@ 2025-10-24  2:21   ` kernel test robot
  2 siblings, 0 replies; 16+ messages in thread
From: kernel test robot @ 2025-10-24  2:21 UTC (permalink / raw)
  To: Dmitry Skorodumov, netdev, linux-kernel
  Cc: oe-kbuild-all, andrey.bokhanko, Dmitry Skorodumov, Andrew Lunn,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni

Hi Dmitry,

kernel test robot noticed the following build warnings:

[auto build test WARNING on net-next/main]

url:    https://github.com/intel-lab-lkp/linux/commits/Dmitry-Skorodumov/ipvlan-Implement-learnable-L2-bridge/20251021-224923
base:   net-next/main
patch link:    https://lore.kernel.org/r/20251021144410.257905-8-skorodumov.dmitry%40huawei.com
patch subject: [PATCH net-next 7/8] ipvlan: Support IPv6 for learnable l2-bridge
config: sparc-randconfig-r132-20251023 (https://download.01.org/0day-ci/archive/20251024/202510241011.2cTY5v7o-lkp@intel.com/config)
compiler: sparc-linux-gcc (GCC) 12.5.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20251024/202510241011.2cTY5v7o-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202510241011.2cTY5v7o-lkp@intel.com/

sparse warnings: (new ones prefixed by >>)
   drivers/net/ipvlan/ipvlan_core.c:55:36: sparse: sparse: incorrect type in argument 1 (different base types) @@     expected unsigned int [usertype] a @@     got restricted __be32 const [usertype] s_addr @@
   drivers/net/ipvlan/ipvlan_core.c:55:36: sparse:     expected unsigned int [usertype] a
   drivers/net/ipvlan/ipvlan_core.c:55:36: sparse:     got restricted __be32 const [usertype] s_addr
>> drivers/net/ipvlan/ipvlan_core.c:760:22: sparse: sparse: cast from restricted __be16
>> drivers/net/ipvlan/ipvlan_core.c:760:22: sparse: sparse: incorrect type in initializer (different base types) @@     expected int ndsize @@     got restricted __be16 [usertype] @@
   drivers/net/ipvlan/ipvlan_core.c:760:22: sparse:     expected int ndsize
   drivers/net/ipvlan/ipvlan_core.c:760:22: sparse:     got restricted __be16 [usertype]
   drivers/net/ipvlan/ipvlan_core.c:819:18: sparse: sparse: cast from restricted __be16
>> drivers/net/ipvlan/ipvlan_core.c:819:16: sparse: sparse: incorrect type in assignment (different base types) @@     expected unsigned short [usertype] ndsize @@     got restricted __be16 [usertype] @@
   drivers/net/ipvlan/ipvlan_core.c:819:16: sparse:     expected unsigned short [usertype] ndsize
   drivers/net/ipvlan/ipvlan_core.c:819:16: sparse:     got restricted __be16 [usertype]

vim +760 drivers/net/ipvlan/ipvlan_core.c

   753	
   754	static u8 *ipvlan_search_icmp6_ll_addr(struct sk_buff *skb, u8 icmp_option)
   755	{
   756		/* skb is ensured to pullable for all ipv6 payload_len by caller */
   757		struct ipv6hdr *ip6h = ipv6_hdr(skb);
   758		struct icmp6hdr *icmph = (struct icmp6hdr *)(ip6h + 1);
   759		int curr_off = sizeof(*icmph);
 > 760		int ndsize = htons(ip6h->payload_len);
   761	
   762		if (icmph->icmp6_type != NDISC_ROUTER_SOLICITATION)
   763			curr_off += sizeof(struct in6_addr);
   764	
   765		while ((curr_off + 2) < ndsize) {
   766			u8  *data = (u8 *)icmph + curr_off;
   767			u32 opt_len = data[1] << 3;
   768	
   769			if (unlikely(opt_len == 0))
   770				return NULL;
   771	
   772			if (data[0] != icmp_option) {
   773				curr_off += opt_len;
   774				continue;
   775			}
   776	
   777			if (unlikely(opt_len < ETH_ALEN + 2))
   778				return NULL;
   779	
   780			if (unlikely(curr_off + opt_len > ndsize))
   781				return NULL;
   782	
   783			return data + 2;
   784		}
   785	
   786		return NULL;
   787	}
   788	
   789	static void ipvlan_snat_patch_tx_ipv6(struct ipvl_dev *ipvlan,
   790					      struct sk_buff *skb)
   791	{
   792		struct ipv6hdr *ip6h;
   793		struct icmp6hdr *icmph;
   794		u8 icmp_option;
   795		u8 *lladdr;
   796		u16 ndsize;
   797	
   798		if (unlikely(!pskb_may_pull(skb, sizeof(*ip6h))))
   799			return;
   800	
   801		if (ipv6_hdr(skb)->nexthdr != NEXTHDR_ICMP)
   802			return;
   803	
   804		if (unlikely(!pskb_may_pull(skb, sizeof(*ip6h) + sizeof(*icmph))))
   805			return;
   806	
   807		ip6h = ipv6_hdr(skb);
   808		icmph = (struct icmp6hdr *)(ip6h + 1);
   809	
   810		/* Patch Source-LL for solicitation, Target-LL for advertisement */
   811		if (icmph->icmp6_type == NDISC_NEIGHBOUR_SOLICITATION ||
   812		    icmph->icmp6_type == NDISC_ROUTER_SOLICITATION)
   813			icmp_option = ND_OPT_SOURCE_LL_ADDR;
   814		else if (icmph->icmp6_type == NDISC_NEIGHBOUR_ADVERTISEMENT)
   815			icmp_option = ND_OPT_TARGET_LL_ADDR;
   816		else
   817			return;
   818	
 > 819		ndsize = htons(ip6h->payload_len);
   820		if (unlikely(!pskb_may_pull(skb, sizeof(*ip6h) + ndsize)))
   821			return;
   822	
   823		lladdr = ipvlan_search_icmp6_ll_addr(skb, icmp_option);
   824		if (!lladdr)
   825			return;
   826	
   827		ether_addr_copy(lladdr, ipvlan->phy_dev->dev_addr);
   828	
   829		ip6h = ipv6_hdr(skb);
   830		icmph = (struct icmp6hdr *)(ip6h + 1);
   831		icmph->icmp6_cksum = 0;
   832		icmph->icmp6_cksum = csum_ipv6_magic(&ip6h->saddr, &ip6h->daddr,
   833						     ndsize,
   834						     IPPROTO_ICMPV6,
   835						     csum_partial(icmph,
   836								  ndsize,
   837								  0));
   838		skb->ip_summed = CHECKSUM_COMPLETE;
   839	}
   840	#else
   841	static void ipvlan_snat_patch_tx_ipv6(struct ipvl_dev *ipvlan,
   842					      struct sk_buff *skb)
   843	{
   844	}
   845	#endif
   846	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2025-10-24  2:22 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-10-21 14:44 [PATCH net-next 0/8] ipvlan: Implement learnable L2-bridge Dmitry Skorodumov
2025-10-21 14:44 ` [PATCH net-next 1/8] " Dmitry Skorodumov
2025-10-22 14:23   ` Simon Horman
2025-10-23 10:21     ` Dmitry Skorodumov
2025-10-23 11:31       ` Simon Horman
2025-10-21 14:44 ` [PATCH net-next 2/8] ipvlan: Send mcasts out directly in ipvlan_xmit_mode_l2() Dmitry Skorodumov
2025-10-21 14:44 ` [PATCH net-next 3/8] ipvlan: Handle rx mcast-ip and unicast eth Dmitry Skorodumov
2025-10-21 14:44 ` [PATCH net-next 4/8] ipvlan: Added some kind of MAC SNAT Dmitry Skorodumov
2025-10-21 14:44 ` [PATCH net-next 5/8] ipvlan: Forget all IP when device goes down Dmitry Skorodumov
2025-10-21 14:44 ` [PATCH net-next 6/8] ipvlan: Support GSO for port -> ipvlan Dmitry Skorodumov
2025-10-22 20:55   ` kernel test robot
2025-10-21 14:44 ` [PATCH net-next 7/8] ipvlan: Support IPv6 for learnable l2-bridge Dmitry Skorodumov
2025-10-23  0:03   ` kernel test robot
2025-10-23  0:24   ` kernel test robot
2025-10-24  2:21   ` kernel test robot
2025-10-21 14:44 ` [PATCH net-next 8/8] ipvlan: Don't learn child with host-ip Dmitry Skorodumov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).