netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [NET 00/05]: netdevice rx mode synchronization fixes + macvlan driver
@ 2007-07-12 18:13 Patrick McHardy
  2007-07-12 18:13 ` [NET 01/05]: Add net_device change_rx_mode callback Patrick McHardy
                   ` (4 more replies)
  0 siblings, 5 replies; 11+ messages in thread
From: Patrick McHardy @ 2007-07-12 18:13 UTC (permalink / raw)
  To: davem; +Cc: netdev, greearb, Patrick McHardy

These patches introduce a new net_device hook for synchronizing promiscous
and allmulti state to an underlying device (as done by 8021q, bonding,
maxvlan, ..) to fix some races that occur when doing this from
dev->set_multicast_list. They also introduce two helpers for multicast
list synchonization and change VLAN to make use of all of this (mainly to
show that its not only for macvlan, but it also fixes some real races).
The last patch adds the macvlan driver.

Please apply, thanks.


 MAINTAINERS                |    6 +
 drivers/net/Kconfig        |   10 +
 drivers/net/Makefile       |    1 +
 drivers/net/macvlan.c      |  496 ++++++++++++++++++++++++++++++++++++++++++++
 include/linux/if_macvlan.h |    9 +
 include/linux/if_vlan.h    |    7 -
 include/linux/netdevice.h  |    8 +
 net/8021q/vlan.c           |    1 +
 net/8021q/vlan.h           |    1 +
 net/8021q/vlan_dev.c       |  167 ++-------------
 net/core/dev.c             |   43 ++++-
 net/core/dev_mcast.c       |   75 +++++++
 12 files changed, 670 insertions(+), 154 deletions(-)
 create mode 100644 drivers/net/macvlan.c
 create mode 100644 include/linux/if_macvlan.h

Patrick McHardy (5):
      [NET]: Add net_device change_rx_mode callback
      [NET]: dev_mcast: add multicast list synchronization helpers
      [VLAN]: Fix promiscous/allmulti synchronization races
      [VLAN]: Use multicast list synchronization helpers
      [NET]: Add macvlan driver

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [NET 01/05]: Add net_device change_rx_mode callback
  2007-07-12 18:13 [NET 00/05]: netdevice rx mode synchronization fixes + macvlan driver Patrick McHardy
@ 2007-07-12 18:13 ` Patrick McHardy
  2007-07-15  1:51   ` David Miller
  2007-07-12 18:13 ` [NET 02/05]: dev_mcast: add multicast list synchronization helpers Patrick McHardy
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 11+ messages in thread
From: Patrick McHardy @ 2007-07-12 18:13 UTC (permalink / raw)
  To: davem; +Cc: netdev, greearb, Patrick McHardy

[NET]: Add net_device change_rx_mode callback

Currently the set_multicast_list (and set_rx_mode) callbacks are
responsible for configuring the device according to the IFF_PROMISC,
IFF_MULTICAST and IFF_ALLMULTI flags and the mc_list (and uc_list in
case of set_rx_mode).

These callbacks can be invoked from BH context without the rtnl_mutex
by dev_mc_add/dev_mc_delete, which makes reading the device flags and
promiscous/allmulti count racy. For real hardware drivers that just
commit all changes to the hardware this is not a real problem since
the stack guarantees to call them for every change, so at least the
final call will not race and commit the correct configuration to the
hardware.

For software devices that want to synchronize promiscous and multicast
state to an underlying device however this can cause corruption of the
underlying device's flags or promisc/allmulti counts.

When the software device is concurrently put in promiscous or allmulti
mode while set_multicast_list is invoked from bottem half context, the
device might synchronize the change to the underlying device without
holding the rtnl_mutex, which races with concurrent changes to the
underlying device.

Add a dev->change_rx_flags hook that is invoked when any of the flags
that affect rx filtering change (under the rtnl_mutex), which allows
drivers to perform synchronization immediately and only synchronize
the address lists in set_multicast_list/set_rx_mode.

Signed-off-by: Patrick McHardy <kaber@trash.net>

---
commit b977db07766ffb261d2c7ebe050cbc3b1d7d281a
tree 1497ca885659afdde8ea81ea7d1534de28a773d3
parent 15028aad00ddf241581fbe74a02ec89cbb28d35d
author Patrick McHardy <kaber@trash.net> Thu, 12 Jul 2007 19:53:55 +0200
committer Patrick McHardy <kaber@trash.net> Thu, 12 Jul 2007 19:53:55 +0200

 include/linux/netdevice.h |    3 +++
 net/core/dev.c            |   17 ++++++++++++++++-
 2 files changed, 19 insertions(+), 1 deletions(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 79cc3da..f193aba 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -516,6 +516,9 @@ struct net_device
 						void *saddr,
 						unsigned len);
 	int			(*rebuild_header)(struct sk_buff *skb);
+#define HAVE_CHANGE_RX_FLAGS
+	void			(*change_rx_flags)(struct net_device *dev,
+						   int flags);
 #define HAVE_SET_RX_MODE
 	void			(*set_rx_mode)(struct net_device *dev);
 #define HAVE_MULTICAST			 
diff --git a/net/core/dev.c b/net/core/dev.c
index 4221dcd..cb055e5 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -2507,6 +2507,8 @@ static void __dev_set_promiscuity(struct net_device *dev, int inc)
 {
 	unsigned short old_flags = dev->flags;
 
+	ASSERT_RTNL();
+
 	if ((dev->promiscuity += inc) == 0)
 		dev->flags &= ~IFF_PROMISC;
 	else
@@ -2521,6 +2523,9 @@ static void __dev_set_promiscuity(struct net_device *dev, int inc)
 			dev->name, (dev->flags & IFF_PROMISC),
 			(old_flags & IFF_PROMISC),
 			audit_get_loginuid(current->audit_context));
+
+		if (dev->change_rx_flags)
+			dev->change_rx_flags(dev, IFF_PROMISC);
 	}
 }
 
@@ -2559,11 +2564,16 @@ void dev_set_allmulti(struct net_device *dev, int inc)
 {
 	unsigned short old_flags = dev->flags;
 
+	ASSERT_RTNL();
+
 	dev->flags |= IFF_ALLMULTI;
 	if ((dev->allmulti += inc) == 0)
 		dev->flags &= ~IFF_ALLMULTI;
-	if (dev->flags ^ old_flags)
+	if (dev->flags ^ old_flags) {
+		if (dev->change_rx_flags)
+			dev->change_rx_flags(dev, IFF_ALLMULTI);
 		dev_set_rx_mode(dev);
+	}
 }
 
 /*
@@ -2764,6 +2774,8 @@ int dev_change_flags(struct net_device *dev, unsigned flags)
 	int ret, changes;
 	int old_flags = dev->flags;
 
+	ASSERT_RTNL();
+
 	/*
 	 *	Set the flags on our device.
 	 */
@@ -2778,6 +2790,9 @@ int dev_change_flags(struct net_device *dev, unsigned flags)
 	 *	Load in the correct multicast list now the flags have changed.
 	 */
 
+	if (dev->change_rx_flags && (dev->flags ^ flags) & IFF_MULTICAST)
+		dev->change_rx_flags(dev, IFF_MULTICAST);
+
 	dev_set_rx_mode(dev);
 
 	/*

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [NET 02/05]: dev_mcast: add multicast list synchronization helpers
  2007-07-12 18:13 [NET 00/05]: netdevice rx mode synchronization fixes + macvlan driver Patrick McHardy
  2007-07-12 18:13 ` [NET 01/05]: Add net_device change_rx_mode callback Patrick McHardy
@ 2007-07-12 18:13 ` Patrick McHardy
  2007-07-15  1:52   ` David Miller
  2007-07-12 18:13 ` [VLAN 03/05]: Fix promiscous/allmulti synchronization races Patrick McHardy
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 11+ messages in thread
From: Patrick McHardy @ 2007-07-12 18:13 UTC (permalink / raw)
  To: davem; +Cc: netdev, greearb, Patrick McHardy

[NET]: dev_mcast: add multicast list synchronization helpers

The method drivers currently use to synchronize multicast lists is not
very pretty:

- walk the multicast list
- search each entry on a copy of the previous list
- if new add to lower device
- walk the copy of the previous list
- search each entry on the current list
- if removed delete from lower device
- copy entire list

This patch adds a new field to struct dev_addr_list to store the
synchronization state and adds two helper functions for synchronization
and cleanup.

Signed-off-by: Patrick McHardy <kaber@trash.net>

---
commit 934ee4749ce12cb9b868a98fd3822cf5cdfa8208
tree ddb91cc1428e52f82712101518fccce65ed4597d
parent b977db07766ffb261d2c7ebe050cbc3b1d7d281a
author Patrick McHardy <kaber@trash.net> Thu, 12 Jul 2007 19:53:58 +0200
committer Patrick McHardy <kaber@trash.net> Thu, 12 Jul 2007 19:53:58 +0200

 include/linux/netdevice.h |    3 ++
 net/core/dev_mcast.c      |   75 +++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 78 insertions(+), 0 deletions(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index f193aba..e5af458 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -190,6 +190,7 @@ struct dev_addr_list
 	struct dev_addr_list	*next;
 	u8			da_addr[MAX_ADDR_LEN];
 	u8			da_addrlen;
+	u8			da_synced;
 	int			da_users;
 	int			da_gusers;
 };
@@ -1103,6 +1104,8 @@ extern int		dev_unicast_delete(struct net_device *dev, void *addr, int alen);
 extern int		dev_unicast_add(struct net_device *dev, void *addr, int alen);
 extern int 		dev_mc_delete(struct net_device *dev, void *addr, int alen, int all);
 extern int		dev_mc_add(struct net_device *dev, void *addr, int alen, int newonly);
+extern int		dev_mc_sync(struct net_device *to, struct net_device *from);
+extern void		dev_mc_unsync(struct net_device *to, struct net_device *from);
 extern void		dev_mc_discard(struct net_device *dev);
 extern int 		__dev_addr_delete(struct dev_addr_list **list, int *count, void *addr, int alen, int all);
 extern int		__dev_addr_add(struct dev_addr_list **list, int *count, void *addr, int alen, int newonly);
diff --git a/net/core/dev_mcast.c b/net/core/dev_mcast.c
index aa38100..235a2a8 100644
--- a/net/core/dev_mcast.c
+++ b/net/core/dev_mcast.c
@@ -102,6 +102,81 @@ int dev_mc_add(struct net_device *dev, void *addr, int alen, int glbl)
 	return err;
 }
 
+/**
+ *	dev_mc_sync	- Synchronize device's multicast list to another device
+ *	@to: destination device
+ *	@from: source device
+ *
+ * 	Add newly added addresses to the destination device and release
+ * 	addresses that have no users left. The source device must be
+ * 	locked by netif_tx_lock_bh.
+ *
+ *	This function is intended to be called from the dev->set_multicast_list
+ *	function of layered software devices.
+ */
+int dev_mc_sync(struct net_device *to, struct net_device *from)
+{
+	struct dev_addr_list *da;
+	int err = 0;
+
+	netif_tx_lock_bh(to);
+	for (da = from->mc_list; da != NULL; da = da->next) {
+		if (!da->da_synced) {
+			err = __dev_addr_add(&to->mc_list, &to->mc_count,
+					     da->da_addr, da->da_addrlen, 0);
+			if (err < 0)
+				break;
+			da->da_synced = 1;
+			da->da_users++;
+		} else if (da->da_users == 1) {
+			__dev_addr_delete(&to->mc_list, &to->mc_count,
+					  da->da_addr, da->da_addrlen, 0);
+			__dev_addr_delete(&from->mc_list, &from->mc_count,
+					  da->da_addr, da->da_addrlen, 0);
+		}
+	}
+	if (!err)
+		__dev_set_rx_mode(to);
+	netif_tx_unlock_bh(to);
+
+	return err;
+}
+EXPORT_SYMBOL(dev_mc_sync);
+
+
+/**
+ * 	dev_mc_unsync	- Remove synchronized addresses from the destination
+ * 			  device
+ *	@to: destination device
+ *	@from: source device
+ *
+ * 	Remove all addresses that were added to the destination device by
+ * 	dev_mc_sync(). This function is intended to be called from the
+ * 	dev->stop function of layered software devices.
+ */
+void dev_mc_unsync(struct net_device *to, struct net_device *from)
+{
+	struct dev_addr_list *da;
+
+	netif_tx_lock_bh(from);
+	netif_tx_lock_bh(to);
+
+	for (da = from->mc_list; da != NULL; da = da->next) {
+		if (!da->da_synced)
+			continue;
+		__dev_addr_delete(&to->mc_list, &to->mc_count,
+				  da->da_addr, da->da_addrlen, 0);
+		da->da_synced = 0;
+		__dev_addr_delete(&from->mc_list, &from->mc_count,
+				  da->da_addr, da->da_addrlen, 0);
+	}
+	__dev_set_rx_mode(to);
+
+	netif_tx_unlock_bh(to);
+	netif_tx_unlock_bh(from);
+}
+EXPORT_SYMBOL(dev_mc_unsync);
+
 /*
  *	Discard multicast list when a device is downed
  */

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [VLAN 03/05]: Fix promiscous/allmulti synchronization races
  2007-07-12 18:13 [NET 00/05]: netdevice rx mode synchronization fixes + macvlan driver Patrick McHardy
  2007-07-12 18:13 ` [NET 01/05]: Add net_device change_rx_mode callback Patrick McHardy
  2007-07-12 18:13 ` [NET 02/05]: dev_mcast: add multicast list synchronization helpers Patrick McHardy
@ 2007-07-12 18:13 ` Patrick McHardy
  2007-07-15  1:53   ` David Miller
  2007-07-12 18:13 ` [VLAN 04/05]: Use multicast list synchronization helpers Patrick McHardy
  2007-07-12 18:13 ` [NET 05/05]: Add macvlan driver Patrick McHardy
  4 siblings, 1 reply; 11+ messages in thread
From: Patrick McHardy @ 2007-07-12 18:13 UTC (permalink / raw)
  To: davem; +Cc: netdev, greearb, Patrick McHardy

[VLAN]: Fix promiscous/allmulti synchronization races

The set_multicast_list function may be called without holding the rtnl
mutex, resulting in races when changing the underlying device's promiscous
and allmulti state. Use the change_rx_mode hook, which is always invoked
under the rtnl.

Signed-off-by: Patrick McHardy <kaber@trash.net>

---
commit 86de1244f4e8736a7bdfc28b899cc2ce5e36cf5c
tree 1379e7ae0b054d4f1390b0d1960aecee9ece37e6
parent 934ee4749ce12cb9b868a98fd3822cf5cdfa8208
author Patrick McHardy <kaber@trash.net> Thu, 12 Jul 2007 19:57:34 +0200
committer Patrick McHardy <kaber@trash.net> Thu, 12 Jul 2007 19:57:34 +0200

 include/linux/if_vlan.h |    2 --
 net/8021q/vlan.c        |    1 +
 net/8021q/vlan.h        |    1 +
 net/8021q/vlan_dev.c    |   38 ++++++++++++++++++++------------------
 4 files changed, 22 insertions(+), 20 deletions(-)

diff --git a/include/linux/if_vlan.h b/include/linux/if_vlan.h
index 61a57dc..7f71df4 100644
--- a/include/linux/if_vlan.h
+++ b/include/linux/if_vlan.h
@@ -132,8 +132,6 @@ struct vlan_dev_info {
                                            * made, in order to feed the right changes down
                                            * to the real hardware...
                                            */
-	int old_allmulti;               /* similar to above. */
-	int old_promiscuity;            /* similar to above. */
 	struct net_device *real_dev;    /* the underlying device/interface */
 	unsigned char real_dev_addr[ETH_ALEN];
 	struct proc_dir_entry *dent;    /* Holds the proc data */
diff --git a/net/8021q/vlan.c b/net/8021q/vlan.c
index abb9900..39bdcc2 100644
--- a/net/8021q/vlan.c
+++ b/net/8021q/vlan.c
@@ -373,6 +373,7 @@ void vlan_setup(struct net_device *new_dev)
 	new_dev->open = vlan_dev_open;
 	new_dev->stop = vlan_dev_stop;
 	new_dev->set_multicast_list = vlan_dev_set_multicast_list;
+	new_dev->change_rx_flags = vlan_change_rx_flags;
 	new_dev->destructor = free_netdev;
 	new_dev->do_ioctl = vlan_dev_ioctl;
 
diff --git a/net/8021q/vlan.h b/net/8021q/vlan.h
index 62ce1c5..7df5b29 100644
--- a/net/8021q/vlan.h
+++ b/net/8021q/vlan.h
@@ -69,6 +69,7 @@ int vlan_dev_set_vlan_flag(const struct net_device *dev,
 			   u32 flag, short flag_val);
 void vlan_dev_get_realdev_name(const struct net_device *dev, char *result);
 void vlan_dev_get_vid(const struct net_device *dev, unsigned short *result);
+void vlan_change_rx_flags(struct net_device *dev, int change);
 void vlan_dev_set_multicast_list(struct net_device *vlan_dev);
 
 int vlan_check_real_dev(struct net_device *real_dev, unsigned short vlan_id);
diff --git a/net/8021q/vlan_dev.c b/net/8021q/vlan_dev.c
index d4a62d1..dec7e62 100644
--- a/net/8021q/vlan_dev.c
+++ b/net/8021q/vlan_dev.c
@@ -712,6 +712,11 @@ int vlan_dev_open(struct net_device *dev)
 	}
 	memcpy(vlan->real_dev_addr, real_dev->dev_addr, ETH_ALEN);
 
+	if (dev->flags & IFF_ALLMULTI)
+		dev_set_allmulti(real_dev, 1);
+	if (dev->flags & IFF_PROMISC)
+		dev_set_promiscuity(real_dev, 1);
+
 	return 0;
 }
 
@@ -721,6 +726,11 @@ int vlan_dev_stop(struct net_device *dev)
 
 	vlan_flush_mc_list(dev);
 
+	if (dev->flags & IFF_ALLMULTI)
+		dev_set_allmulti(real_dev, -1);
+	if (dev->flags & IFF_PROMISC)
+		dev_set_promiscuity(real_dev, -1);
+
 	if (compare_ether_addr(dev->dev_addr, real_dev->dev_addr))
 		dev_unicast_delete(real_dev, dev->dev_addr, dev->addr_len);
 
@@ -754,34 +764,26 @@ int vlan_dev_ioctl(struct net_device *dev, struct ifreq *ifr, int cmd)
 	return err;
 }
 
+void vlan_change_rx_flags(struct net_device *dev, int change)
+{
+	struct net_device *real_dev = VLAN_DEV_INFO(dev)->real_dev;
+
+	if (change & IFF_ALLMULTI)
+		dev_set_allmulti(real_dev, dev->flags & IFF_ALLMULTI ? 1 : -1);
+	if (change & IFF_PROMISC)
+		dev_set_promiscuity(real_dev, dev->flags & IFF_PROMISC ? 1 : -1);
+}
+
 /** Taken from Gleb + Lennert's VLAN code, and modified... */
 void vlan_dev_set_multicast_list(struct net_device *vlan_dev)
 {
 	struct dev_mc_list *dmi;
 	struct net_device *real_dev;
-	int inc;
 
 	if (vlan_dev && (vlan_dev->priv_flags & IFF_802_1Q_VLAN)) {
 		/* Then it's a real vlan device, as far as we can tell.. */
 		real_dev = VLAN_DEV_INFO(vlan_dev)->real_dev;
 
-		/* compare the current promiscuity to the last promisc we had.. */
-		inc = vlan_dev->promiscuity - VLAN_DEV_INFO(vlan_dev)->old_promiscuity;
-		if (inc) {
-			printk(KERN_INFO "%s: dev_set_promiscuity(master, %d)\n",
-			       vlan_dev->name, inc);
-			dev_set_promiscuity(real_dev, inc); /* found in dev.c */
-			VLAN_DEV_INFO(vlan_dev)->old_promiscuity = vlan_dev->promiscuity;
-		}
-
-		inc = vlan_dev->allmulti - VLAN_DEV_INFO(vlan_dev)->old_allmulti;
-		if (inc) {
-			printk(KERN_INFO "%s: dev_set_allmulti(master, %d)\n",
-			       vlan_dev->name, inc);
-			dev_set_allmulti(real_dev, inc); /* dev.c */
-			VLAN_DEV_INFO(vlan_dev)->old_allmulti = vlan_dev->allmulti;
-		}
-
 		/* looking for addresses to add to master's list */
 		for (dmi = vlan_dev->mc_list; dmi != NULL; dmi = dmi->next) {
 			if (vlan_should_add_mc(dmi, VLAN_DEV_INFO(vlan_dev)->old_mc_list)) {

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [VLAN 04/05]: Use multicast list synchronization helpers
  2007-07-12 18:13 [NET 00/05]: netdevice rx mode synchronization fixes + macvlan driver Patrick McHardy
                   ` (2 preceding siblings ...)
  2007-07-12 18:13 ` [VLAN 03/05]: Fix promiscous/allmulti synchronization races Patrick McHardy
@ 2007-07-12 18:13 ` Patrick McHardy
  2007-07-15  1:53   ` David Miller
  2007-07-12 18:13 ` [NET 05/05]: Add macvlan driver Patrick McHardy
  4 siblings, 1 reply; 11+ messages in thread
From: Patrick McHardy @ 2007-07-12 18:13 UTC (permalink / raw)
  To: davem; +Cc: netdev, greearb, Patrick McHardy

[VLAN]: Use multicast list synchronization helpers

Signed-off-by: Patrick McHardy <kaber@trash.net>

---
commit 30fed2136dc8c9baea5aada97fc6ce60e99fdbf9
tree 8919e11b22f621c4ab2a3cbb45e8d4431385a38f
parent 86de1244f4e8736a7bdfc28b899cc2ce5e36cf5c
author Patrick McHardy <kaber@trash.net> Thu, 12 Jul 2007 19:57:37 +0200
committer Patrick McHardy <kaber@trash.net> Thu, 12 Jul 2007 19:57:37 +0200

 include/linux/if_vlan.h |    5 --
 net/8021q/vlan_dev.c    |  131 +----------------------------------------------
 2 files changed, 2 insertions(+), 134 deletions(-)

diff --git a/include/linux/if_vlan.h b/include/linux/if_vlan.h
index 7f71df4..f8443fd 100644
--- a/include/linux/if_vlan.h
+++ b/include/linux/if_vlan.h
@@ -127,11 +127,6 @@ struct vlan_dev_info {
                                         *   like DHCP that use packet-filtering and don't understand
                                         *   802.1Q
                                         */
-	struct dev_mc_list *old_mc_list;  /* old multi-cast list for the VLAN interface..
-                                           * we save this so we can tell what changes were
-                                           * made, in order to feed the right changes down
-                                           * to the real hardware...
-                                           */
 	struct net_device *real_dev;    /* the underlying device/interface */
 	unsigned char real_dev_addr[ETH_ALEN];
 	struct proc_dir_entry *dent;    /* Holds the proc data */
diff --git a/net/8021q/vlan_dev.c b/net/8021q/vlan_dev.c
index dec7e62..4d2aa4d 100644
--- a/net/8021q/vlan_dev.c
+++ b/net/8021q/vlan_dev.c
@@ -612,90 +612,6 @@ void vlan_dev_get_vid(const struct net_device *dev, unsigned short *result)
 	*result = VLAN_DEV_INFO(dev)->vlan_id;
 }
 
-static inline int vlan_dmi_equals(struct dev_mc_list *dmi1,
-				  struct dev_mc_list *dmi2)
-{
-	return ((dmi1->dmi_addrlen == dmi2->dmi_addrlen) &&
-		(memcmp(dmi1->dmi_addr, dmi2->dmi_addr, dmi1->dmi_addrlen) == 0));
-}
-
-/** dmi is a single entry into a dev_mc_list, a single node.  mc_list is
- *  an entire list, and we'll iterate through it.
- */
-static int vlan_should_add_mc(struct dev_mc_list *dmi, struct dev_mc_list *mc_list)
-{
-	struct dev_mc_list *idmi;
-
-	for (idmi = mc_list; idmi != NULL; ) {
-		if (vlan_dmi_equals(dmi, idmi)) {
-			if (dmi->dmi_users > idmi->dmi_users)
-				return 1;
-			else
-				return 0;
-		} else {
-			idmi = idmi->next;
-		}
-	}
-
-	return 1;
-}
-
-static inline void vlan_destroy_mc_list(struct dev_mc_list *mc_list)
-{
-	struct dev_mc_list *dmi = mc_list;
-	struct dev_mc_list *next;
-
-	while(dmi) {
-		next = dmi->next;
-		kfree(dmi);
-		dmi = next;
-	}
-}
-
-static void vlan_copy_mc_list(struct dev_mc_list *mc_list, struct vlan_dev_info *vlan_info)
-{
-	struct dev_mc_list *dmi, *new_dmi;
-
-	vlan_destroy_mc_list(vlan_info->old_mc_list);
-	vlan_info->old_mc_list = NULL;
-
-	for (dmi = mc_list; dmi != NULL; dmi = dmi->next) {
-		new_dmi = kmalloc(sizeof(*new_dmi), GFP_ATOMIC);
-		if (new_dmi == NULL) {
-			printk(KERN_ERR "vlan: cannot allocate memory. "
-			       "Multicast may not work properly from now.\n");
-			return;
-		}
-
-		/* Copy whole structure, then make new 'next' pointer */
-		*new_dmi = *dmi;
-		new_dmi->next = vlan_info->old_mc_list;
-		vlan_info->old_mc_list = new_dmi;
-	}
-}
-
-static void vlan_flush_mc_list(struct net_device *dev)
-{
-	struct dev_mc_list *dmi = dev->mc_list;
-
-	while (dmi) {
-		printk(KERN_DEBUG "%s: del %.2x:%.2x:%.2x:%.2x:%.2x:%.2x mcast address from vlan interface\n",
-		       dev->name,
-		       dmi->dmi_addr[0],
-		       dmi->dmi_addr[1],
-		       dmi->dmi_addr[2],
-		       dmi->dmi_addr[3],
-		       dmi->dmi_addr[4],
-		       dmi->dmi_addr[5]);
-		dev_mc_delete(dev, dmi->dmi_addr, dmi->dmi_addrlen, 0);
-		dmi = dev->mc_list;
-	}
-
-	/* dev->mc_list is NULL by the time we get here. */
-	vlan_destroy_mc_list(VLAN_DEV_INFO(dev)->old_mc_list);
-	VLAN_DEV_INFO(dev)->old_mc_list = NULL;
-}
-
 int vlan_dev_open(struct net_device *dev)
 {
 	struct vlan_dev_info *vlan = VLAN_DEV_INFO(dev);
@@ -724,8 +640,7 @@ int vlan_dev_stop(struct net_device *dev)
 {
 	struct net_device *real_dev = VLAN_DEV_INFO(dev)->real_dev;
 
-	vlan_flush_mc_list(dev);
-
+	dev_mc_unsync(real_dev, dev);
 	if (dev->flags & IFF_ALLMULTI)
 		dev_set_allmulti(real_dev, -1);
 	if (dev->flags & IFF_PROMISC)
@@ -777,47 +692,5 @@ void vlan_change_rx_flags(struct net_device *dev, int change)
 /** Taken from Gleb + Lennert's VLAN code, and modified... */
 void vlan_dev_set_multicast_list(struct net_device *vlan_dev)
 {
-	struct dev_mc_list *dmi;
-	struct net_device *real_dev;
-
-	if (vlan_dev && (vlan_dev->priv_flags & IFF_802_1Q_VLAN)) {
-		/* Then it's a real vlan device, as far as we can tell.. */
-		real_dev = VLAN_DEV_INFO(vlan_dev)->real_dev;
-
-		/* looking for addresses to add to master's list */
-		for (dmi = vlan_dev->mc_list; dmi != NULL; dmi = dmi->next) {
-			if (vlan_should_add_mc(dmi, VLAN_DEV_INFO(vlan_dev)->old_mc_list)) {
-				dev_mc_add(real_dev, dmi->dmi_addr, dmi->dmi_addrlen, 0);
-				printk(KERN_DEBUG "%s: add %.2x:%.2x:%.2x:%.2x:%.2x:%.2x mcast address to master interface\n",
-				       vlan_dev->name,
-				       dmi->dmi_addr[0],
-				       dmi->dmi_addr[1],
-				       dmi->dmi_addr[2],
-				       dmi->dmi_addr[3],
-				       dmi->dmi_addr[4],
-				       dmi->dmi_addr[5]);
-			}
-		}
-
-		/* looking for addresses to delete from master's list */
-		for (dmi = VLAN_DEV_INFO(vlan_dev)->old_mc_list; dmi != NULL; dmi = dmi->next) {
-			if (vlan_should_add_mc(dmi, vlan_dev->mc_list)) {
-				/* if we think we should add it to the new list, then we should really
-				 * delete it from the real list on the underlying device.
-				 */
-				dev_mc_delete(real_dev, dmi->dmi_addr, dmi->dmi_addrlen, 0);
-				printk(KERN_DEBUG "%s: del %.2x:%.2x:%.2x:%.2x:%.2x:%.2x mcast address from master interface\n",
-				       vlan_dev->name,
-				       dmi->dmi_addr[0],
-				       dmi->dmi_addr[1],
-				       dmi->dmi_addr[2],
-				       dmi->dmi_addr[3],
-				       dmi->dmi_addr[4],
-				       dmi->dmi_addr[5]);
-			}
-		}
-
-		/* save multicast list */
-		vlan_copy_mc_list(vlan_dev->mc_list, VLAN_DEV_INFO(vlan_dev));
-	}
+	dev_mc_sync(VLAN_DEV_INFO(vlan_dev)->real_dev, vlan_dev);
 }

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [NET 05/05]: Add macvlan driver
  2007-07-12 18:13 [NET 00/05]: netdevice rx mode synchronization fixes + macvlan driver Patrick McHardy
                   ` (3 preceding siblings ...)
  2007-07-12 18:13 ` [VLAN 04/05]: Use multicast list synchronization helpers Patrick McHardy
@ 2007-07-12 18:13 ` Patrick McHardy
  2007-07-15  1:56   ` David Miller
  4 siblings, 1 reply; 11+ messages in thread
From: Patrick McHardy @ 2007-07-12 18:13 UTC (permalink / raw)
  To: davem; +Cc: netdev, greearb, Patrick McHardy

[NET]: Add macvlan driver

Add macvlan driver, which allows to create virtual ethernet devices
based on MAC address.

Signed-off-by: Patrick McHardy <kaber@trash.net>

---
commit 553029f4dd24c41322c3e6cb31458e7e094d21ee
tree 71791bde3cea46103a44389f7f07cf6eea86eba9
parent 30fed2136dc8c9baea5aada97fc6ce60e99fdbf9
author Patrick McHardy <kaber@trash.net> Thu, 12 Jul 2007 19:57:45 +0200
committer Patrick McHardy <kaber@trash.net> Thu, 12 Jul 2007 19:57:45 +0200

 MAINTAINERS                |    6 +
 drivers/net/Kconfig        |   10 +
 drivers/net/Makefile       |    1 
 drivers/net/macvlan.c      |  496 ++++++++++++++++++++++++++++++++++++++++++++
 include/linux/if_macvlan.h |    9 +
 include/linux/netdevice.h  |    2 
 net/core/dev.c             |   26 ++
 7 files changed, 550 insertions(+), 0 deletions(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index fcfe598..a368686 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -2330,6 +2330,12 @@ W:	http://linuxwireless.org/
 T:	git kernel.org:/pub/scm/linux/kernel/git/jbenc/mac80211.git
 S:	Maintained
 
+MACVLAN DRIVER
+P:	Patrick McHardy
+M:	kaber@trash.net
+L:	netdev@vger.kernel.org
+S:	Maintained
+
 MARVELL YUKON / SYSKONNECT DRIVER
 P:	Mirko Lindner
 M: 	mlindner@syskonnect.de
diff --git a/drivers/net/Kconfig b/drivers/net/Kconfig
index d4e39ff..51117d9 100644
--- a/drivers/net/Kconfig
+++ b/drivers/net/Kconfig
@@ -82,6 +82,16 @@ config BONDING
 	  To compile this driver as a module, choose M here: the module
 	  will be called bonding.
 
+config MACVLAN
+	tristate "MAC-VLAN support (EXPERIMENTAL)"
+	depends on EXPERIMENTAL
+	---help---
+	  This allows one to create virtual interfaces that map packets to
+	  or from specific MAC addresses to a particular interface.
+
+	  To compile this driver as a module, choose M here: the module
+	  will be called macvlan.
+
 config EQUALIZER
 	tristate "EQL (serial line load balancing) support"
 	---help---
diff --git a/drivers/net/Makefile b/drivers/net/Makefile
index a2241e6..c26b867 100644
--- a/drivers/net/Makefile
+++ b/drivers/net/Makefile
@@ -128,6 +128,7 @@ obj-$(CONFIG_SLHC) += slhc.o
 
 obj-$(CONFIG_DUMMY) += dummy.o
 obj-$(CONFIG_IFB) += ifb.o
+obj-$(CONFIG_MACVLAN) += macvlan.o
 obj-$(CONFIG_DE600) += de600.o
 obj-$(CONFIG_DE620) += de620.o
 obj-$(CONFIG_LANCE) += lance.o
diff --git a/drivers/net/macvlan.c b/drivers/net/macvlan.c
new file mode 100644
index 0000000..dc74d00
--- /dev/null
+++ b/drivers/net/macvlan.c
@@ -0,0 +1,496 @@
+/*
+ * Copyright (c) 2007 Patrick McHardy <kaber@trash.net>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation; either version 2 of
+ * the License, or (at your option) any later version.
+ *
+ * The code this is based on carried the following copyright notice:
+ * ---
+ * (C) Copyright 2001-2006
+ * Alex Zeffertt, Cambridge Broadband Ltd, ajz@cambridgebroadband.com
+ * Re-worked by Ben Greear <greearb@candelatech.com>
+ * ---
+ */
+#include <linux/kernel.h>
+#include <linux/types.h>
+#include <linux/module.h>
+#include <linux/init.h>
+#include <linux/errno.h>
+#include <linux/slab.h>
+#include <linux/string.h>
+#include <linux/list.h>
+#include <linux/notifier.h>
+#include <linux/netdevice.h>
+#include <linux/etherdevice.h>
+#include <linux/ethtool.h>
+#include <linux/if_arp.h>
+#include <linux/if_link.h>
+#include <linux/if_macvlan.h>
+#include <net/rtnetlink.h>
+
+#define MACVLAN_HASH_SIZE	(1 << BITS_PER_BYTE)
+
+struct macvlan_port {
+	struct net_device	*dev;
+	struct hlist_head	vlan_hash[MACVLAN_HASH_SIZE];
+	struct list_head	vlans;
+};
+
+struct macvlan_dev {
+	struct net_device	*dev;
+	struct list_head	list;
+	struct hlist_node	hlist;
+	struct macvlan_port	*port;
+	struct net_device	*lowerdev;
+};
+
+
+static struct macvlan_dev *macvlan_hash_lookup(const struct macvlan_port *port,
+					       const unsigned char *addr)
+{
+	struct macvlan_dev *vlan;
+	struct hlist_node *n;
+
+	hlist_for_each_entry_rcu(vlan, n, &port->vlan_hash[addr[5]], hlist) {
+		if (!compare_ether_addr(vlan->dev->dev_addr, addr))
+			return vlan;
+	}
+	return NULL;
+}
+
+static void macvlan_broadcast(struct sk_buff *skb,
+			      const struct macvlan_port *port)
+{
+	const struct ethhdr *eth = eth_hdr(skb);
+	const struct macvlan_dev *vlan;
+	struct hlist_node *n;
+	struct net_device *dev;
+	struct sk_buff *nskb;
+	unsigned int i;
+
+	for (i = 0; i < MACVLAN_HASH_SIZE; i++) {
+		hlist_for_each_entry_rcu(vlan, n, &port->vlan_hash[i], hlist) {
+			dev = vlan->dev;
+			if (unlikely(!(dev->flags & IFF_UP)))
+				continue;
+
+			nskb = skb_clone(skb, GFP_ATOMIC);
+			if (nskb == NULL) {
+				dev->stats.rx_errors++;
+				dev->stats.rx_dropped++;
+				continue;
+			}
+
+			dev->stats.rx_bytes += skb->len + ETH_HLEN;
+			dev->stats.rx_packets++;
+			dev->stats.multicast++;
+			dev->last_rx = jiffies;
+
+			nskb->dev = dev;
+			if (!compare_ether_addr(eth->h_dest, dev->broadcast))
+				nskb->pkt_type = PACKET_BROADCAST;
+			else
+				nskb->pkt_type = PACKET_MULTICAST;
+
+			netif_rx(nskb);
+		}
+	}
+}
+
+/* called under rcu_read_lock() from netif_receive_skb */
+static struct sk_buff *macvlan_handle_frame(struct sk_buff *skb)
+{
+	const struct ethhdr *eth = eth_hdr(skb);
+	const struct macvlan_port *port;
+	const struct macvlan_dev *vlan;
+	struct net_device *dev;
+
+	port = rcu_dereference(skb->dev->macvlan_port);
+	if (port == NULL)
+		return skb;
+
+	if (is_multicast_ether_addr(eth->h_dest)) {
+		macvlan_broadcast(skb, port);
+		return skb;
+	}
+
+	vlan = macvlan_hash_lookup(port, eth->h_dest);
+	if (vlan == NULL)
+		return skb;
+
+	dev = vlan->dev;
+	if (unlikely(!(dev->flags & IFF_UP))) {
+		kfree_skb(skb);
+		return NULL;
+	}
+
+	skb = skb_share_check(skb, GFP_ATOMIC);
+	if (skb == NULL) {
+		dev->stats.rx_errors++;
+		dev->stats.rx_dropped++;
+		return NULL;
+	}
+
+	dev->stats.rx_bytes += skb->len + ETH_HLEN;
+	dev->stats.rx_packets++;
+	dev->last_rx = jiffies;
+
+	skb->dev = dev;
+	skb->pkt_type = PACKET_HOST;
+
+	netif_rx(skb);
+	return NULL;
+}
+
+static int macvlan_hard_start_xmit(struct sk_buff *skb, struct net_device *dev)
+{
+	const struct macvlan_dev *vlan = netdev_priv(dev);
+	unsigned int len = skb->len;
+	int ret;
+
+	skb->dev = vlan->lowerdev;
+	ret = dev_queue_xmit(skb);
+
+	if (likely(ret == NET_XMIT_SUCCESS)) {
+		dev->stats.tx_packets++;
+		dev->stats.tx_bytes += len;
+	} else {
+		dev->stats.tx_errors++;
+		dev->stats.tx_aborted_errors++;
+	}
+	return NETDEV_TX_OK;
+}
+
+static int macvlan_hard_header(struct sk_buff *skb, struct net_device *dev,
+			       unsigned short type, void *daddr, void *saddr,
+			       unsigned len)
+{
+	const struct macvlan_dev *vlan = netdev_priv(dev);
+	struct net_device *lowerdev = vlan->lowerdev;
+
+	return lowerdev->hard_header(skb, lowerdev, type, daddr,
+				     saddr ? : dev->dev_addr, len);
+}
+
+static int macvlan_open(struct net_device *dev)
+{
+	struct macvlan_dev *vlan = netdev_priv(dev);
+	struct macvlan_port *port = vlan->port;
+	struct net_device *lowerdev = vlan->lowerdev;
+	int err;
+
+	err = dev_unicast_add(lowerdev, dev->dev_addr, ETH_ALEN);
+	if (err < 0)
+		return err;
+	if (dev->flags & IFF_ALLMULTI)
+		dev_set_allmulti(lowerdev, 1);
+
+	hlist_add_head_rcu(&vlan->hlist, &port->vlan_hash[dev->dev_addr[5]]);
+	return 0;
+}
+
+static int macvlan_stop(struct net_device *dev)
+{
+	struct macvlan_dev *vlan = netdev_priv(dev);
+	struct net_device *lowerdev = vlan->lowerdev;
+
+	dev_mc_unsync(lowerdev, dev);
+	if (dev->flags & IFF_ALLMULTI)
+		dev_set_allmulti(lowerdev, -1);
+
+	dev_unicast_delete(lowerdev, dev->dev_addr, ETH_ALEN);
+
+	hlist_del_rcu(&vlan->hlist);
+	synchronize_rcu();
+	return 0;
+}
+
+static void macvlan_change_rx_flags(struct net_device *dev, int change)
+{
+	struct macvlan_dev *vlan = netdev_priv(dev);
+	struct net_device *lowerdev = vlan->lowerdev;
+
+	if (change & IFF_ALLMULTI)
+		dev_set_allmulti(lowerdev, dev->flags & IFF_ALLMULTI ? 1 : -1);
+}
+
+static void macvlan_set_multicast_list(struct net_device *dev)
+{
+	struct macvlan_dev *vlan = netdev_priv(dev);
+
+	dev_mc_sync(vlan->lowerdev, dev);
+}
+
+static int macvlan_change_mtu(struct net_device *dev, int new_mtu)
+{
+	struct macvlan_dev *vlan = netdev_priv(dev);
+
+	if (new_mtu < 68 || vlan->lowerdev->mtu < new_mtu)
+		return -EINVAL;
+	dev->mtu = new_mtu;
+	return 0;
+}
+
+/*
+ * macvlan network devices have devices nesting below it and are a special
+ * "super class" of normal network devices; split their locks off into a
+ * separate class since they always nest.
+ */
+static struct lock_class_key macvlan_netdev_xmit_lock_key;
+
+#define MACVLAN_FEATURES \
+	(NETIF_F_SG | NETIF_F_ALL_CSUM | NETIF_F_HIGHDMA | NETIF_F_FRAGLIST | \
+	 NETIF_F_GSO | NETIF_F_TSO | NETIF_F_UFO | NETIF_F_GSO_ROBUST | \
+	 NETIF_F_TSO_ECN | NETIF_F_TSO6)
+
+#define MACVLAN_STATE_MASK \
+	((1<<__LINK_STATE_NOCARRIER) | (1<<__LINK_STATE_DORMANT))
+
+static int macvlan_init(struct net_device *dev)
+{
+	struct macvlan_dev *vlan = netdev_priv(dev);
+	const struct net_device *lowerdev = vlan->lowerdev;
+
+	dev->state		= (dev->state & ~MACVLAN_STATE_MASK) |
+				  (lowerdev->state & MACVLAN_STATE_MASK);
+	dev->features 		= lowerdev->features & MACVLAN_FEATURES;
+	dev->iflink		= lowerdev->ifindex;
+
+	lockdep_set_class(&dev->_xmit_lock, &macvlan_netdev_xmit_lock_key);
+	return 0;
+}
+
+static void macvlan_ethtool_get_drvinfo(struct net_device *dev,
+					struct ethtool_drvinfo *drvinfo)
+{
+	snprintf(drvinfo->driver, 32, "macvlan");
+	snprintf(drvinfo->version, 32, "0.1");
+}
+
+static u32 macvlan_ethtool_get_rx_csum(struct net_device *dev)
+{
+	const struct macvlan_dev *vlan = netdev_priv(dev);
+	struct net_device *lowerdev = vlan->lowerdev;
+
+	if (lowerdev->ethtool_ops->get_rx_csum == NULL)
+		return 0;
+	return lowerdev->ethtool_ops->get_rx_csum(lowerdev);
+}
+
+static const struct ethtool_ops macvlan_ethtool_ops = {
+	.get_link		= ethtool_op_get_link,
+	.get_rx_csum		= macvlan_ethtool_get_rx_csum,
+	.get_tx_csum		= ethtool_op_get_tx_csum,
+	.get_tso		= ethtool_op_get_tso,
+	.get_ufo		= ethtool_op_get_ufo,
+	.get_sg			= ethtool_op_get_sg,
+	.get_drvinfo		= macvlan_ethtool_get_drvinfo,
+};
+
+static void macvlan_setup(struct net_device *dev)
+{
+	ether_setup(dev);
+
+	dev->init		= macvlan_init;
+	dev->open		= macvlan_open;
+	dev->stop		= macvlan_stop;
+	dev->change_mtu		= macvlan_change_mtu;
+	dev->change_rx_flags	= macvlan_change_rx_flags;
+	dev->set_multicast_list	= macvlan_set_multicast_list;
+	dev->hard_header	= macvlan_hard_header;
+	dev->hard_start_xmit	= macvlan_hard_start_xmit;
+	dev->destructor		= free_netdev;
+	dev->ethtool_ops	= &macvlan_ethtool_ops;
+	dev->tx_queue_len	= 0;
+}
+
+static int macvlan_port_create(struct net_device *dev)
+{
+	struct macvlan_port *port;
+	unsigned int i;
+
+	if (dev->type != ARPHRD_ETHER || dev->flags & IFF_LOOPBACK)
+		return -EINVAL;
+
+	port = kzalloc(sizeof(*port), GFP_KERNEL);
+	if (port == NULL)
+		return -ENOMEM;
+
+	port->dev = dev;
+	INIT_LIST_HEAD(&port->vlans);
+	for (i = 0; i < MACVLAN_HASH_SIZE; i++)
+		INIT_HLIST_HEAD(&port->vlan_hash[i]);
+	rcu_assign_pointer(dev->macvlan_port, port);
+	return 0;
+}
+
+static void macvlan_port_destroy(struct net_device *dev)
+{
+	struct macvlan_port *port = dev->macvlan_port;
+
+	rcu_assign_pointer(dev->macvlan_port, NULL);
+	synchronize_rcu();
+	kfree(port);
+}
+
+static void macvlan_transfer_operstate(struct net_device *dev)
+{
+	struct macvlan_dev *vlan = netdev_priv(dev);
+	const struct net_device *lowerdev = vlan->lowerdev;
+
+	if (lowerdev->operstate == IF_OPER_DORMANT)
+		netif_dormant_on(dev);
+	else
+		netif_dormant_off(dev);
+
+	if (netif_carrier_ok(lowerdev)) {
+		if (!netif_carrier_ok(dev))
+			netif_carrier_on(dev);
+	} else {
+		if (netif_carrier_ok(lowerdev))
+			netif_carrier_off(dev);
+	}
+}
+
+static int macvlan_validate(struct nlattr *tb[], struct nlattr *data[])
+{
+	if (tb[IFLA_ADDRESS]) {
+		if (nla_len(tb[IFLA_ADDRESS]) != ETH_ALEN)
+			return -EINVAL;
+		if (!is_valid_ether_addr(nla_data(tb[IFLA_ADDRESS])))
+			return -EADDRNOTAVAIL;
+	}
+	return 0;
+}
+
+static int macvlan_newlink(struct net_device *dev,
+			   struct nlattr *tb[], struct nlattr *data[])
+{
+	struct macvlan_dev *vlan = netdev_priv(dev);
+	struct macvlan_port *port;
+	struct net_device *lowerdev;
+	int err;
+
+	if (!tb[IFLA_LINK])
+		return -EINVAL;
+
+	lowerdev = __dev_get_by_index(nla_get_u32(tb[IFLA_LINK]));
+	if (lowerdev == NULL)
+		return -ENODEV;
+
+	if (!tb[IFLA_MTU])
+		dev->mtu = lowerdev->mtu;
+	else if (dev->mtu > lowerdev->mtu)
+		return -EINVAL;
+
+	if (!tb[IFLA_ADDRESS])
+		random_ether_addr(dev->dev_addr);
+
+	if (lowerdev->macvlan_port == NULL) {
+		err = macvlan_port_create(lowerdev);
+		if (err < 0)
+			return err;
+	}
+	port = lowerdev->macvlan_port;
+
+	vlan->lowerdev = lowerdev;
+	vlan->dev      = dev;
+	vlan->port     = port;
+
+	err = register_netdevice(dev);
+	if (err < 0)
+		return err;
+
+	list_add_tail(&vlan->list, &port->vlans);
+	macvlan_transfer_operstate(dev);
+	return 0;
+}
+
+static void macvlan_dellink(struct net_device *dev)
+{
+	struct macvlan_dev *vlan = netdev_priv(dev);
+	struct macvlan_port *port = vlan->port;
+
+	list_del(&vlan->list);
+	unregister_netdevice(dev);
+
+	if (list_empty(&port->vlans))
+		macvlan_port_destroy(dev);
+}
+
+static struct rtnl_link_ops macvlan_link_ops __read_mostly = {
+	.kind		= "macvlan",
+	.priv_size	= sizeof(struct macvlan_dev),
+	.setup		= macvlan_setup,
+	.validate	= macvlan_validate,
+	.newlink	= macvlan_newlink,
+	.dellink	= macvlan_dellink,
+};
+
+static int macvlan_device_event(struct notifier_block *unused,
+				unsigned long event, void *ptr)
+{
+	struct net_device *dev = ptr;
+	struct macvlan_dev *vlan, *next;
+	struct macvlan_port *port;
+
+	port = dev->macvlan_port;
+	if (port == NULL)
+		return NOTIFY_DONE;
+
+	switch (event) {
+	case NETDEV_CHANGE:
+		list_for_each_entry(vlan, &port->vlans, list)
+			macvlan_transfer_operstate(vlan->dev);
+		break;
+	case NETDEV_FEAT_CHANGE:
+		list_for_each_entry(vlan, &port->vlans, list) {
+			vlan->dev->features = dev->features & MACVLAN_FEATURES;
+			netdev_features_change(vlan->dev);
+		}
+		break;
+	case NETDEV_UNREGISTER:
+		list_for_each_entry_safe(vlan, next, &port->vlans, list)
+			macvlan_dellink(vlan->dev);
+		break;
+	}
+	return NOTIFY_DONE;
+}
+
+static struct notifier_block macvlan_notifier_block __read_mostly = {
+	.notifier_call	= macvlan_device_event,
+};
+
+static int __init macvlan_init_module(void)
+{
+	int err;
+
+	register_netdevice_notifier(&macvlan_notifier_block);
+	macvlan_handle_frame_hook = macvlan_handle_frame;
+
+	err = rtnl_link_register(&macvlan_link_ops);
+	if (err < 0)
+		goto err1;
+	return 0;
+err1:
+	macvlan_handle_frame_hook = macvlan_handle_frame;
+	unregister_netdevice_notifier(&macvlan_notifier_block);
+	return err;
+}
+
+static void __exit macvlan_cleanup_module(void)
+{
+	rtnl_link_unregister(&macvlan_link_ops);
+	macvlan_handle_frame_hook = NULL;
+	unregister_netdevice_notifier(&macvlan_notifier_block);
+}
+
+module_init(macvlan_init_module);
+module_exit(macvlan_cleanup_module);
+
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("Patrick McHardy <kaber@trash.net>");
+MODULE_DESCRIPTION("Driver for MAC address based VLANs");
+MODULE_ALIAS_RTNL_LINK("macvlan");
diff --git a/include/linux/if_macvlan.h b/include/linux/if_macvlan.h
new file mode 100644
index 0000000..0d9d7ea
--- /dev/null
+++ b/include/linux/if_macvlan.h
@@ -0,0 +1,9 @@
+#ifndef _LINUX_IF_MACVLAN_H
+#define _LINUX_IF_MACVLAN_H
+
+#ifdef __KERNEL__
+
+extern struct sk_buff *(*macvlan_handle_frame_hook)(struct sk_buff *);
+
+#endif /* __KERNEL__ */
+#endif /* _LINUX_IF_MACVLAN_H */
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index e5af458..322b5ea 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -564,6 +564,8 @@ struct net_device
 
 	/* bridge stuff */
 	struct net_bridge_port	*br_port;
+	/* macvlan */
+	struct macvlan_port	*macvlan_port;
 
 	/* class/net/name entry */
 	struct device		dev;
diff --git a/net/core/dev.c b/net/core/dev.c
index cb055e5..f18687f 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -98,6 +98,7 @@
 #include <linux/seq_file.h>
 #include <linux/stat.h>
 #include <linux/if_bridge.h>
+#include <linux/if_macvlan.h>
 #include <net/dst.h>
 #include <net/pkt_sched.h>
 #include <net/checksum.h>
@@ -1800,6 +1801,28 @@ static inline struct sk_buff *handle_bridge(struct sk_buff *skb,
 #define handle_bridge(skb, pt_prev, ret, orig_dev)	(skb)
 #endif
 
+#if defined(CONFIG_MACVLAN) || defined(CONFIG_MACVLAN_MODULE)
+struct sk_buff *(*macvlan_handle_frame_hook)(struct sk_buff *skb) __read_mostly;
+EXPORT_SYMBOL_GPL(macvlan_handle_frame_hook);
+
+static inline struct sk_buff *handle_macvlan(struct sk_buff *skb,
+					     struct packet_type **pt_prev,
+					     int *ret,
+					     struct net_device *orig_dev)
+{
+	if (skb->dev->macvlan_port == NULL)
+		return skb;
+
+	if (*pt_prev) {
+		*ret = deliver_skb(skb, *pt_prev, orig_dev);
+		*pt_prev = NULL;
+	}
+	return macvlan_handle_frame_hook(skb);
+}
+#else
+#define handle_macvlan(skb, pt_prev, ret, orig_dev)	(skb)
+#endif
+
 #ifdef CONFIG_NET_CLS_ACT
 /* TODO: Maybe we should just force sch_ingress to be compiled in
  * when CONFIG_NET_CLS_ACT is? otherwise some useless instructions
@@ -1907,6 +1930,9 @@ ncls:
 	skb = handle_bridge(skb, &pt_prev, &ret, orig_dev);
 	if (!skb)
 		goto out;
+	skb = handle_macvlan(skb, &pt_prev, &ret, orig_dev);
+	if (!skb)
+		goto out;
 
 	type = skb->protocol;
 	list_for_each_entry_rcu(ptype, &ptype_base[ntohs(type)&15], list) {

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [NET 01/05]: Add net_device change_rx_mode callback
  2007-07-12 18:13 ` [NET 01/05]: Add net_device change_rx_mode callback Patrick McHardy
@ 2007-07-15  1:51   ` David Miller
  0 siblings, 0 replies; 11+ messages in thread
From: David Miller @ 2007-07-15  1:51 UTC (permalink / raw)
  To: kaber; +Cc: netdev, greearb

From: Patrick McHardy <kaber@trash.net>
Date: Thu, 12 Jul 2007 20:13:40 +0200 (MEST)

> [NET]: Add net_device change_rx_mode callback
> 
> Currently the set_multicast_list (and set_rx_mode) callbacks are
> responsible for configuring the device according to the IFF_PROMISC,
> IFF_MULTICAST and IFF_ALLMULTI flags and the mc_list (and uc_list in
> case of set_rx_mode).
> 
> These callbacks can be invoked from BH context without the rtnl_mutex
> by dev_mc_add/dev_mc_delete, which makes reading the device flags and
> promiscous/allmulti count racy. For real hardware drivers that just
> commit all changes to the hardware this is not a real problem since
> the stack guarantees to call them for every change, so at least the
> final call will not race and commit the correct configuration to the
> hardware.
> 
> For software devices that want to synchronize promiscous and multicast
> state to an underlying device however this can cause corruption of the
> underlying device's flags or promisc/allmulti counts.
> 
> When the software device is concurrently put in promiscous or allmulti
> mode while set_multicast_list is invoked from bottem half context, the
> device might synchronize the change to the underlying device without
> holding the rtnl_mutex, which races with concurrent changes to the
> underlying device.
> 
> Add a dev->change_rx_flags hook that is invoked when any of the flags
> that affect rx filtering change (under the rtnl_mutex), which allows
> drivers to perform synchronization immediately and only synchronize
> the address lists in set_multicast_list/set_rx_mode.
> 
> Signed-off-by: Patrick McHardy <kaber@trash.net>

Applied, thanks Patrick.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [NET 02/05]: dev_mcast: add multicast list synchronization helpers
  2007-07-12 18:13 ` [NET 02/05]: dev_mcast: add multicast list synchronization helpers Patrick McHardy
@ 2007-07-15  1:52   ` David Miller
  0 siblings, 0 replies; 11+ messages in thread
From: David Miller @ 2007-07-15  1:52 UTC (permalink / raw)
  To: kaber; +Cc: netdev, greearb

From: Patrick McHardy <kaber@trash.net>
Date: Thu, 12 Jul 2007 20:13:42 +0200 (MEST)

> [NET]: dev_mcast: add multicast list synchronization helpers
> 
> The method drivers currently use to synchronize multicast lists is not
> very pretty:
> 
> - walk the multicast list
> - search each entry on a copy of the previous list
> - if new add to lower device
> - walk the copy of the previous list
> - search each entry on the current list
> - if removed delete from lower device
> - copy entire list
> 
> This patch adds a new field to struct dev_addr_list to store the
> synchronization state and adds two helper functions for synchronization
> and cleanup.
> 
> Signed-off-by: Patrick McHardy <kaber@trash.net>

Applied.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [VLAN 03/05]: Fix promiscous/allmulti synchronization races
  2007-07-12 18:13 ` [VLAN 03/05]: Fix promiscous/allmulti synchronization races Patrick McHardy
@ 2007-07-15  1:53   ` David Miller
  0 siblings, 0 replies; 11+ messages in thread
From: David Miller @ 2007-07-15  1:53 UTC (permalink / raw)
  To: kaber; +Cc: netdev, greearb

From: Patrick McHardy <kaber@trash.net>
Date: Thu, 12 Jul 2007 20:13:44 +0200 (MEST)

> [VLAN]: Fix promiscous/allmulti synchronization races
> 
> The set_multicast_list function may be called without holding the rtnl
> mutex, resulting in races when changing the underlying device's promiscous
> and allmulti state. Use the change_rx_mode hook, which is always invoked
> under the rtnl.
> 
> Signed-off-by: Patrick McHardy <kaber@trash.net>

Applied.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [VLAN 04/05]: Use multicast list synchronization helpers
  2007-07-12 18:13 ` [VLAN 04/05]: Use multicast list synchronization helpers Patrick McHardy
@ 2007-07-15  1:53   ` David Miller
  0 siblings, 0 replies; 11+ messages in thread
From: David Miller @ 2007-07-15  1:53 UTC (permalink / raw)
  To: kaber; +Cc: netdev, greearb

From: Patrick McHardy <kaber@trash.net>
Date: Thu, 12 Jul 2007 20:13:46 +0200 (MEST)

> [VLAN]: Use multicast list synchronization helpers
> 
> Signed-off-by: Patrick McHardy <kaber@trash.net>

Applied.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [NET 05/05]: Add macvlan driver
  2007-07-12 18:13 ` [NET 05/05]: Add macvlan driver Patrick McHardy
@ 2007-07-15  1:56   ` David Miller
  0 siblings, 0 replies; 11+ messages in thread
From: David Miller @ 2007-07-15  1:56 UTC (permalink / raw)
  To: kaber; +Cc: netdev, greearb

From: Patrick McHardy <kaber@trash.net>
Date: Thu, 12 Jul 2007 20:13:48 +0200 (MEST)

> [NET]: Add macvlan driver
> 
> Add macvlan driver, which allows to create virtual ethernet devices
> based on MAC address.
> 
> Signed-off-by: Patrick McHardy <kaber@trash.net>

Looks good, applied.

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2007-07-15  1:56 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2007-07-12 18:13 [NET 00/05]: netdevice rx mode synchronization fixes + macvlan driver Patrick McHardy
2007-07-12 18:13 ` [NET 01/05]: Add net_device change_rx_mode callback Patrick McHardy
2007-07-15  1:51   ` David Miller
2007-07-12 18:13 ` [NET 02/05]: dev_mcast: add multicast list synchronization helpers Patrick McHardy
2007-07-15  1:52   ` David Miller
2007-07-12 18:13 ` [VLAN 03/05]: Fix promiscous/allmulti synchronization races Patrick McHardy
2007-07-15  1:53   ` David Miller
2007-07-12 18:13 ` [VLAN 04/05]: Use multicast list synchronization helpers Patrick McHardy
2007-07-15  1:53   ` David Miller
2007-07-12 18:13 ` [NET 05/05]: Add macvlan driver Patrick McHardy
2007-07-15  1:56   ` David Miller

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).