From: Arnd Bergmann <arnd@arndb.de>
To: Patrick McHardy <kaber@trash.net>
Cc: Herbert Xu <herbert@gondor.apana.org.au>,
Eric Dumazet <eric.dumazet@gmail.com>,
Anna Fischer <anna.fischer@hp.com>,
netdev@vger.kernel.org, bridge@lists.linux-foundation.org,
linux-kernel@vger.kernel.org,
virtualization@lists.linux-foundation.org,
Mark Smith <lk-netdev@lk-netdev.nosense.org>,
Gerhard Stenzel <gerhard.stenzel@de.ibm.com>,
"Eric W. Biederman" <ebiederm@xmission.com>,
Jens Osterkamp <jens@linux.vnet.ibm.com>,
Patrick Mullaney <pmullaney@novell.com>,
Stephen Hemminger <shemminger@vyatta.com>,
David Miller <davem@davemloft.net>
Subject: Re: [Bridge] [PATCH 1/4] veth: move loopback logic to common location
Date: Thu, 26 Nov 2009 18:44:59 +0100
Message-ID: <200911261844.59912.arnd@arndb.de>
In-Reply-To: <4B0E9FD0.4040107@trash.net>
On Thursday 26 November 2009, Patrick McHardy wrote:
> In addition to those already handled, I'd say
>
> - priority: affects qdisc classification, may refer to classes of the
> old namespace
> - ipvs_property: might cause packets to incorrectly skip netfilter hooks
> - nf_trace: might trigger packet tracing
> - nf_bridge: contains references to network devices in the old NS,
> also indicates packet was bridged
> - iif: index is only valid in the originating namespace
> - probably secmark.

ok

> - tc_index: classification result, should only be set in the namespace
> of the classifier
> - tc_verd: RTTL etc. should begin at zero again

Wouldn't that defeat the purpose of RTTL? If you create a loop
across two devices in different namespaces, it may no longer get
detected. Or is that a different problem again?

Arnd <><
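
To make the RTTL concern above concrete, here is a small userspace sketch. It is only a model under stated assumptions, not kernel code: model_pkt, model_redirect(), model_loop_hops() and the limit MODEL_MAX_LOOP are hypothetical names invented for illustration, and the counter merely imitates the idea of a per-packet redirect count that is checked against a fixed limit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical loop limit; the value is an assumption for this model. */
#define MODEL_MAX_LOOP 4

/* Hypothetical stand-in for the per-packet state discussed above. */
struct model_pkt {
	unsigned int rttl;	/* number of redirects seen so far */
	int ns;			/* namespace the packet is currently in */
};

/*
 * Redirect the packet to a device in namespace dst_ns.
 * Returns false when the loop guard drops the packet.
 * If reset_on_crossing is set, the counter restarts at zero
 * whenever the packet enters a different namespace.
 */
bool model_redirect(struct model_pkt *pkt, int dst_ns, bool reset_on_crossing)
{
	assert(pkt != NULL);

	if (reset_on_crossing && pkt->ns != dst_ns)
		pkt->rttl = 0;
	pkt->ns = dst_ns;

	if (pkt->rttl++ >= MODEL_MAX_LOOP)
		return false;
	return true;
}

/*
 * Bounce a packet between namespace 0 and namespace 1.
 * Returns the number of redirects that succeeded before the guard
 * fired, or max_hops if the guard never fired.
 */
unsigned int model_loop_hops(bool reset_on_crossing, unsigned int max_hops)
{
	struct model_pkt pkt = { 0, 0 };
	unsigned int hops;

	for (hops = 0; hops < max_hops; hops++) {
		if (!model_redirect(&pkt, !pkt.ns, reset_on_crossing))
			break;
	}
	return hops;
}
```

In this model, a loop that keeps the counter is cut off after MODEL_MAX_LOOP redirects, while a loop that resets the counter on every namespace crossing runs until the caller's max_hops bound. Whether the real redirect path behaves the same way depends on where the counter is checked, which is the open question above.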
---
net: maintain namespace isolation between vlan and real device

In the vlan and macvlan drivers, the start_xmit function forwards
data to the dev_queue_xmit function for another device, which may
belong to a different namespace.

To make sure that classification stays within a single namespace,
reset the skb fields that carry namespace-private state whenever
the buffer is assigned to a device in a different namespace.

Still needs testing, do not apply.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
---
drivers/net/macvlan.c | 2 +-
include/linux/netdevice.h | 9 +++++++++
net/8021q/vlan_dev.c | 2 +-
net/core/dev.c | 37 +++++++++++++++++++++++++++++++++----
4 files changed, 44 insertions(+), 6 deletions(-)
diff --git a/drivers/net/macvlan.c b/drivers/net/macvlan.c
index 322112c..edcebf1 100644
--- a/drivers/net/macvlan.c
+++ b/drivers/net/macvlan.c
@@ -269,7 +269,7 @@ static int macvlan_queue_xmit(struct sk_buff *skb, struct net_device *dev)
}
xmit_world:
- skb->dev = vlan->lowerdev;
+ skb_set_dev(skb, vlan->lowerdev);
return dev_queue_xmit(skb);
}
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 9428793..fdf4a1a 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -1009,6 +1009,15 @@ static inline bool netdev_uses_dsa_tags(struct net_device *dev)
return 0;
}
+#ifndef CONFIG_NET_NS
+static inline void skb_set_dev(struct sk_buff *skb, struct net_device *dev)
+{
+ skb->dev = dev;
+}
+#else /* CONFIG_NET_NS */
+void skb_set_dev(struct sk_buff *skb, struct net_device *dev);
+#endif
+
static inline bool netdev_uses_trailer_tags(struct net_device *dev)
{
#ifdef CONFIG_NET_DSA_TAG_TRAILER
diff --git a/net/8021q/vlan_dev.c b/net/8021q/vlan_dev.c
index de0dc6b..51fcfff 100644
--- a/net/8021q/vlan_dev.c
+++ b/net/8021q/vlan_dev.c
@@ -323,7 +323,7 @@ static netdev_tx_t vlan_dev_hard_start_xmit(struct sk_buff *skb,
}
- skb->dev = vlan_dev_info(dev)->real_dev;
+ skb_set_dev(skb, vlan_dev_info(dev)->real_dev);
len = skb->len;
ret = dev_queue_xmit(skb);
diff --git a/net/core/dev.c b/net/core/dev.c
index f8baa15..220d4e4 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -1448,13 +1448,10 @@ int dev_forward_skb(struct net_device *dev, struct sk_buff *skb)
if (skb->len > (dev->mtu + dev->hard_header_len))
return NET_RX_DROP;
- skb_dst_drop(skb);
+ skb_set_dev(skb, dev);
skb->tstamp.tv64 = 0;
skb->pkt_type = PACKET_HOST;
skb->protocol = eth_type_trans(skb, dev);
- skb->mark = 0;
- secpath_reset(skb);
- nf_reset(skb);
return netif_rx(skb);
}
EXPORT_SYMBOL_GPL(dev_forward_skb);
@@ -1614,6 +1611,39 @@ static bool dev_can_checksum(struct net_device *dev, struct sk_buff *skb)
return false;
}
+/**
+ * skb_set_dev -- assign a buffer to a new device
+ * @skb: buffer for the new device
+ * @dev: network device
+ *
+ * If an skb is owned by a device already, we have to reset
+ * all data private to the namespace a device belongs to
+ * before assigning it a new device.
+ */
+void skb_set_dev(struct sk_buff *skb, struct net_device *dev)
+{
+ if (skb->dev && !net_eq(dev_net(skb->dev), dev_net(dev))) {
+ secpath_reset(skb);
+ skb_dst_drop(skb);
+ nf_reset(skb);
+ skb_init_secmark(skb);
+ skb->mark = 0;
+ skb->priority = 0;
+ skb->nf_trace = 0;
+ skb->ipvs_property = 0;
+#ifdef CONFIG_NET_SCHED
+ skb->tc_index = 0;
+#ifdef CONFIG_NET_CLS_ACT
+ skb->tc_verd = SET_TC_VERD(skb->tc_verd, 0);
+ skb->tc_verd = SET_TC_RTTL(skb->tc_verd, 0);
+#endif
+#endif
+ }
+ skb->dev = dev;
+ skb->skb_iif = skb->dev->ifindex;
+}
+EXPORT_SYMBOL(skb_set_dev);
+
/*
* Invalidate hardware checksum when packet is to be mangled, and
* complete checksum manually on outgoing path.
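
The control flow of skb_set_dev() in the hunk above can be tried outside the kernel with a reduced model. This is a sketch under stated assumptions: model_net, model_dev, model_skb and model_skb_set_dev() are hypothetical stand-ins for struct net, struct net_device, struct sk_buff and the patched helper, and only three of the reset fields are modelled. Namespace equality is reduced to a pointer comparison in place of net_eq(dev_net(...), dev_net(...)).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct net. */
struct model_net {
	int id;
};

/* Hypothetical stand-in for struct net_device. */
struct model_dev {
	int ifindex;
	const struct model_net *net;
};

/* Hypothetical stand-in for struct sk_buff, reduced to a few fields. */
struct model_skb {
	const struct model_dev *dev;
	uint32_t mark;
	uint32_t priority;
	uint16_t tc_index;
	int iif;
};

/*
 * Assign skb to dev. Namespace-private state is scrubbed only when
 * the buffer already belongs to a device in a different namespace;
 * the device pointer and input interface index are always updated.
 */
void model_skb_set_dev(struct model_skb *skb, const struct model_dev *dev)
{
	assert(skb != NULL && dev != NULL);

	if (skb->dev && skb->dev->net != dev->net) {
		skb->mark = 0;
		skb->priority = 0;
		skb->tc_index = 0;
	}
	skb->dev = dev;
	skb->iif = dev->ifindex;
}
```

Moving a buffer between two devices in the same namespace leaves mark, priority and tc_index untouched in this model, while moving it to a device in another namespace zeroes them. In both cases the interface index follows the new device.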