* [0/4] gre: Ethernet over GRE
@ 2008-10-09 7:04 Herbert Xu
2008-10-09 7:05 ` [PATCH 1/4] gre: Use needed_headroom Herbert Xu
` (7 more replies)
0 siblings, 8 replies; 16+ messages in thread
From: Herbert Xu @ 2008-10-09 7:04 UTC (permalink / raw)
To: David S. Miller, netdev; +Cc: Patrick McHardy
Hi:
This series of patches adds Ethernet over GRE support. The user
space interface is done with the rtnl_link mechanism. I'll post
the patches for iproute too.
There has been some demand for a tunneling mechanism that
allows direct Ethernet connections over IP. In the past many
efforts have been made in this direction. However, inadequacies
with our user interface have foiled many of them.
Recently Patrick McHardy created the rtnl_link mechanism which
finally allows tunnel configuration to be brought into the 21st
century. Having tried it I must say that it has been an absolute
pleasure to use :)
This should be completely backwards compatible in that if you
don't create Ethernet over GRE tunnels then your GRE experience
should not differ one single bit from before. However, you can
manage your existing GRE interfaces using the new interface should
you choose to do so.
Thanks,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
* [PATCH 1/4] gre: Use needed_headroom
2008-10-09 7:04 [0/4] gre: Ethernet over GRE Herbert Xu
@ 2008-10-09 7:05 ` Herbert Xu
2008-10-09 7:05 ` [PATCH 2/4] gre: Move MTU setting out of ipgre_tunnel_bind_dev Herbert Xu
` (6 subsequent siblings)
7 siblings, 0 replies; 16+ messages in thread
From: Herbert Xu @ 2008-10-09 7:05 UTC (permalink / raw)
To: David S. Miller, netdev, Patrick McHardy
gre: Use needed_headroom
Now that we have dev->needed_headroom, we can use it instead of
having a bogus dev->hard_header_len. This also allows us to
include dev->hard_header_len in the MTU computation so that when
we do have a meaningful hard_header_len in future it is included
automatically in figuring out the MTU.
Incidentally, this fixes a bug where we ignored the needed_headroom
field of the underlying device in calculating our own hard_header_len.
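As a rough illustration of the arithmetic described above, here is a minimal user-space sketch. It is not the kernel code: the sketch_ names are made up for this example, the flag bits are written in host byte order (the kernel keeps them big-endian), and the 20-byte IPv4 header assumes no IP options.

```c
#include <stdint.h>

/* GRE flag bits, host byte order for this sketch only. */
#define SKETCH_GRE_CSUM 0x8000u
#define SKETCH_GRE_KEY  0x2000u
#define SKETCH_GRE_SEQ  0x1000u

/* Outer IPv4 header without options (assumption of this sketch). */
#define SKETCH_IPV4_HDR_LEN 20

/* Bytes the tunnel prepends to every packet: outer IPv4 header,
 * 4-byte base GRE header, plus 4 bytes per enabled optional field. */
static int sketch_gre_addend(uint16_t o_flags)
{
	int addend = SKETCH_IPV4_HDR_LEN + 4;

	if (o_flags & SKETCH_GRE_CSUM)
		addend += 4;
	if (o_flags & SKETCH_GRE_KEY)
		addend += 4;
	if (o_flags & SKETCH_GRE_SEQ)
		addend += 4;
	return addend;
}

/* Headroom to reserve: the lower device's link header and its own
 * headroom requirement, plus the tunnel overhead. */
static int sketch_needed_headroom(int tdev_hard_header_len,
				  int tdev_needed_headroom,
				  uint16_t o_flags)
{
	return tdev_hard_header_len + tdev_needed_headroom +
	       sketch_gre_addend(o_flags);
}

/* Tunnel MTU: lower device MTU minus the tunnel device's own link
 * header (zero for plain GRE) minus the tunnel overhead. */
static int sketch_tunnel_mtu(int tdev_mtu, int dev_hard_header_len,
			     uint16_t o_flags)
{
	return tdev_mtu - dev_hard_header_len - sketch_gre_addend(o_flags);
}
```

With a 1500-byte lower device and no optional GRE fields this gives 1476, the same value as the default ETH_DATA_LEN - sizeof(struct iphdr) - 4 in ipgre_tunnel_setup.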
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
---
net/ipv4/ip_gre.c | 15 ++++++++-------
1 file changed, 8 insertions(+), 7 deletions(-)
diff --git a/net/ipv4/ip_gre.c b/net/ipv4/ip_gre.c
index 2a61158..fd192d6 100644
--- a/net/ipv4/ip_gre.c
+++ b/net/ipv4/ip_gre.c
@@ -637,7 +637,7 @@ static int ipgre_tunnel_xmit(struct sk_buff *skb, struct net_device *dev)
df = tiph->frag_off;
if (df)
- mtu = dst_mtu(&rt->u.dst) - tunnel->hlen;
+ mtu = dst_mtu(&rt->u.dst) - dev->hard_header_len - tunnel->hlen;
else
mtu = skb->dst ? dst_mtu(skb->dst) : dev->mtu;
@@ -785,7 +785,7 @@ static void ipgre_tunnel_bind_dev(struct net_device *dev)
tunnel = netdev_priv(dev);
iph = &tunnel->parms.iph;
- /* Guess output device to choose reasonable mtu and hard_header_len */
+ /* Guess output device to choose reasonable mtu and needed_headroom */
if (iph->daddr) {
struct flowi fl = { .oif = tunnel->parms.link,
@@ -806,7 +806,7 @@ static void ipgre_tunnel_bind_dev(struct net_device *dev)
tdev = __dev_get_by_index(dev_net(dev), tunnel->parms.link);
if (tdev) {
- hlen = tdev->hard_header_len;
+ hlen = tdev->hard_header_len + tdev->needed_headroom;
mtu = tdev->mtu;
}
dev->iflink = tunnel->parms.link;
@@ -820,8 +820,8 @@ static void ipgre_tunnel_bind_dev(struct net_device *dev)
if (tunnel->parms.o_flags&GRE_SEQ)
addend += 4;
}
- dev->hard_header_len = hlen + addend;
- dev->mtu = mtu - addend;
+ dev->needed_headroom = addend + hlen;
+ dev->mtu = mtu - dev->hard_header_len - addend;
tunnel->hlen = addend;
}
@@ -959,7 +959,8 @@ done:
static int ipgre_tunnel_change_mtu(struct net_device *dev, int new_mtu)
{
struct ip_tunnel *tunnel = netdev_priv(dev);
- if (new_mtu < 68 || new_mtu > 0xFFF8 - tunnel->hlen)
+ if (new_mtu < 68 ||
+ new_mtu > 0xFFF8 - dev->hard_header_len - tunnel->hlen)
return -EINVAL;
dev->mtu = new_mtu;
return 0;
@@ -1085,7 +1086,7 @@ static void ipgre_tunnel_setup(struct net_device *dev)
dev->change_mtu = ipgre_tunnel_change_mtu;
dev->type = ARPHRD_IPGRE;
- dev->hard_header_len = LL_MAX_HEADER + sizeof(struct iphdr) + 4;
+ dev->needed_headroom = LL_MAX_HEADER + sizeof(struct iphdr) + 4;
dev->mtu = ETH_DATA_LEN - sizeof(struct iphdr) - 4;
dev->flags = IFF_NOARP;
dev->iflink = 0;
* [PATCH 2/4] gre: Move MTU setting out of ipgre_tunnel_bind_dev
2008-10-09 7:04 [0/4] gre: Ethernet over GRE Herbert Xu
2008-10-09 7:05 ` [PATCH 1/4] gre: Use needed_headroom Herbert Xu
@ 2008-10-09 7:05 ` Herbert Xu
2008-10-09 7:05 ` [PATCH 3/4] gre: Add netlink interface Herbert Xu
` (5 subsequent siblings)
7 siblings, 0 replies; 16+ messages in thread
From: Herbert Xu @ 2008-10-09 7:05 UTC (permalink / raw)
To: David S. Miller, netdev, Patrick McHardy
gre: Move MTU setting out of ipgre_tunnel_bind_dev
This patch moves the dev->mtu setting out of ipgre_tunnel_bind_dev.
This is in preparation for using rtnl_link where we'll need to make
the MTU setting conditional on whether the user has supplied an
MTU. This also requires the move of the ipgre_tunnel_bind_dev
call out of the dev->init function so that we can access the user
parameters later.
This patch also adds a check to prevent setting the MTU below
the minimum of 68.
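The intended split can be sketched roughly as follows: one helper computes and clamps the MTU, and the caller decides whether to apply it. This is a simplified stand-alone model, not the kernel functions; the sketch_ names and the single "overhead" parameter are assumptions made for illustration.

```c
#include <stdbool.h>

/* Minimum MTU an IPv4 link must support. */
#define SKETCH_MIN_IPV4_MTU 68

/* Compute the tunnel MTU from the lower device MTU and the total
 * per-packet overhead, never going below the IPv4 minimum. */
static int sketch_bind_mtu(int lower_mtu, int overhead)
{
	int mtu = lower_mtu - overhead;

	if (mtu < SKETCH_MIN_IPV4_MTU)
		mtu = SKETCH_MIN_IPV4_MTU;
	return mtu;
}

/* The caller keeps a user-supplied MTU and only falls back to the
 * computed value when the user gave none. The computation still runs
 * in both cases, mirroring the fact that binding has other effects. */
static int sketch_choose_mtu(bool user_supplied, int user_mtu,
			     int lower_mtu, int overhead)
{
	int computed = sketch_bind_mtu(lower_mtu, overhead);

	return user_supplied ? user_mtu : computed;
}
```

Returning the value instead of writing dev->mtu directly is what lets a later netlink caller skip the assignment when the request carries its own MTU.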
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
---
net/ipv4/ip_gre.c | 16 +++++++++++-----
1 file changed, 11 insertions(+), 5 deletions(-)
diff --git a/net/ipv4/ip_gre.c b/net/ipv4/ip_gre.c
index fd192d6..80622dd 100644
--- a/net/ipv4/ip_gre.c
+++ b/net/ipv4/ip_gre.c
@@ -119,6 +119,7 @@
static int ipgre_tunnel_init(struct net_device *dev);
static void ipgre_tunnel_setup(struct net_device *dev);
+static int ipgre_tunnel_bind_dev(struct net_device *dev);
/* Fallback tunnel: no source, no destination, no key, no options */
@@ -289,6 +290,8 @@ static struct ip_tunnel * ipgre_tunnel_locate(struct net *net,
nt = netdev_priv(dev);
nt->parms = *parms;
+ dev->mtu = ipgre_tunnel_bind_dev(dev);
+
if (register_netdevice(dev) < 0)
goto failed_free;
@@ -773,7 +776,7 @@ tx_error:
return 0;
}
-static void ipgre_tunnel_bind_dev(struct net_device *dev)
+static int ipgre_tunnel_bind_dev(struct net_device *dev)
{
struct net_device *tdev = NULL;
struct ip_tunnel *tunnel;
@@ -821,9 +824,14 @@ static void ipgre_tunnel_bind_dev(struct net_device *dev)
addend += 4;
}
dev->needed_headroom = addend + hlen;
- dev->mtu = mtu - dev->hard_header_len - addend;
+ mtu -= dev->hard_header_len + addend;
+
+ if (mtu < 68)
+ mtu = 68;
+
tunnel->hlen = addend;
+ return mtu;
}
static int
@@ -917,7 +925,7 @@ ipgre_tunnel_ioctl (struct net_device *dev, struct ifreq *ifr, int cmd)
t->parms.iph.frag_off = p.iph.frag_off;
if (t->parms.link != p.link) {
t->parms.link = p.link;
- ipgre_tunnel_bind_dev(dev);
+ dev->mtu = ipgre_tunnel_bind_dev(dev);
netdev_state_change(dev);
}
}
@@ -1108,8 +1116,6 @@ static int ipgre_tunnel_init(struct net_device *dev)
memcpy(dev->dev_addr, &tunnel->parms.iph.saddr, 4);
memcpy(dev->broadcast, &tunnel->parms.iph.daddr, 4);
- ipgre_tunnel_bind_dev(dev);
-
if (iph->daddr) {
#ifdef CONFIG_NET_IPGRE_BROADCAST
if (ipv4_is_multicast(iph->daddr)) {
* [PATCH 3/4] gre: Add netlink interface
2008-10-09 7:04 [0/4] gre: Ethernet over GRE Herbert Xu
2008-10-09 7:05 ` [PATCH 1/4] gre: Use needed_headroom Herbert Xu
2008-10-09 7:05 ` [PATCH 2/4] gre: Move MTU setting out of ipgre_tunnel_bind_dev Herbert Xu
@ 2008-10-09 7:05 ` Herbert Xu
2008-10-09 7:05 ` [PATCH 4/4] gre: Add Transparent Ethernet Bridging Herbert Xu
` (4 subsequent siblings)
7 siblings, 0 replies; 16+ messages in thread
From: Herbert Xu @ 2008-10-09 7:05 UTC (permalink / raw)
To: David S. Miller, netdev, Patrick McHardy
gre: Add netlink interface
This patch adds a netlink interface that will eventually displace
the existing ioctl interface. It utilises the elegant rtnl_link_ops
mechanism.
This also means that user-space no longer needs to rely on the
tunnel interface being of type GRE to identify GRE tunnels. The
identification can now occur using rtnl_link_ops.
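For readers unfamiliar with how such an interface turns attributes into tunnel parameters, here is a small user-space model of the pattern: start from zeroed parameters, let each attribute that is present override its field, and treat path MTU discovery as on unless it is explicitly disabled. The SK_ and sk_ names and the reduced attribute set are hypothetical; real code would use struct nlattr and the nla_get_*() helpers.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Reduced, made-up attribute set for this sketch. */
enum {
	SK_ATTR_UNSPEC,
	SK_ATTR_TTL,
	SK_ATTR_TOS,
	SK_ATTR_PMTUDISC,
	__SK_ATTR_MAX,
};
#define SK_ATTR_MAX (__SK_ATTR_MAX - 1)

struct sk_parms {
	uint8_t ttl;
	uint8_t tos;
	int df;		/* non-zero: set Don't Fragment on outer header */
};

/* attrs[i] is NULL when attribute i was absent from the request;
 * attrs itself is NULL when no type-specific data was sent at all. */
static void sk_parse(const uint8_t *attrs[], struct sk_parms *p)
{
	/* Size of the pointed-to structure, not of the pointer. */
	memset(p, 0, sizeof(*p));

	if (!attrs)
		return;

	if (attrs[SK_ATTR_TTL])
		p->ttl = *attrs[SK_ATTR_TTL];
	if (attrs[SK_ATTR_TOS])
		p->tos = *attrs[SK_ATTR_TOS];

	/* Absent means "on"; only an explicit zero turns it off. */
	if (!attrs[SK_ATTR_PMTUDISC] || *attrs[SK_ATTR_PMTUDISC])
		p->df = 1;
}
```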
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
---
include/linux/if_tunnel.h | 19 +++
net/ipv4/ip_gre.c | 247 +++++++++++++++++++++++++++++++++++++++++++++-
2 files changed, 262 insertions(+), 4 deletions(-)
diff --git a/include/linux/if_tunnel.h b/include/linux/if_tunnel.h
index d4efe40..aeab2cb 100644
--- a/include/linux/if_tunnel.h
+++ b/include/linux/if_tunnel.h
@@ -2,6 +2,7 @@
#define _IF_TUNNEL_H_
#include <linux/types.h>
+#include <linux/ip.h>
#define SIOCGETTUNNEL (SIOCDEVPRIVATE + 0)
#define SIOCADDTUNNEL (SIOCDEVPRIVATE + 1)
@@ -47,4 +48,22 @@ struct ip_tunnel_prl {
/* PRL flags */
#define PRL_DEFAULT 0x0001
+enum
+{
+ IFLA_GRE_UNSPEC,
+ IFLA_GRE_LINK,
+ IFLA_GRE_IFLAGS,
+ IFLA_GRE_OFLAGS,
+ IFLA_GRE_IKEY,
+ IFLA_GRE_OKEY,
+ IFLA_GRE_LOCAL,
+ IFLA_GRE_REMOTE,
+ IFLA_GRE_TTL,
+ IFLA_GRE_TOS,
+ IFLA_GRE_PMTUDISC,
+ __IFLA_GRE_MAX,
+};
+
+#define IFLA_GRE_MAX (__IFLA_GRE_MAX - 1)
+
#endif /* _IF_TUNNEL_H_ */
diff --git a/net/ipv4/ip_gre.c b/net/ipv4/ip_gre.c
index 80622dd..25d2c77 100644
--- a/net/ipv4/ip_gre.c
+++ b/net/ipv4/ip_gre.c
@@ -41,6 +41,7 @@
#include <net/xfrm.h>
#include <net/net_namespace.h>
#include <net/netns/generic.h>
+#include <net/rtnetlink.h>
#ifdef CONFIG_IPV6
#include <net/ipv6.h>
@@ -117,6 +118,7 @@
Alexey Kuznetsov.
*/
+static struct rtnl_link_ops ipgre_link_ops __read_mostly;
static int ipgre_tunnel_init(struct net_device *dev);
static void ipgre_tunnel_setup(struct net_device *dev);
static int ipgre_tunnel_bind_dev(struct net_device *dev);
@@ -286,9 +288,9 @@ static struct ip_tunnel * ipgre_tunnel_locate(struct net *net,
goto failed_free;
}
- dev->init = ipgre_tunnel_init;
nt = netdev_priv(dev);
nt->parms = *parms;
+ dev->rtnl_link_ops = &ipgre_link_ops;
dev->mtu = ipgre_tunnel_bind_dev(dev);
@@ -1087,6 +1089,7 @@ static int ipgre_close(struct net_device *dev)
static void ipgre_tunnel_setup(struct net_device *dev)
{
+ dev->init = ipgre_tunnel_init;
dev->uninit = ipgre_tunnel_uninit;
dev->destructor = free_netdev;
dev->hard_start_xmit = ipgre_tunnel_xmit;
@@ -1196,6 +1199,7 @@ static int ipgre_init_net(struct net *net)
ign->fb_tunnel_dev->init = ipgre_fb_tunnel_init;
dev_net_set(ign->fb_tunnel_dev, net);
+ ign->fb_tunnel_dev->rtnl_link_ops = &ipgre_link_ops;
if ((err = register_netdev(ign->fb_tunnel_dev)))
goto err_reg_dev;
@@ -1228,6 +1232,229 @@ static struct pernet_operations ipgre_net_ops = {
.exit = ipgre_exit_net,
};
+static int ipgre_tunnel_validate(struct nlattr *tb[], struct nlattr *data[])
+{
+ __be16 flags;
+
+ if (!data)
+ return 0;
+
+ flags = 0;
+ if (data[IFLA_GRE_IFLAGS])
+ flags |= nla_get_be16(data[IFLA_GRE_IFLAGS]);
+ if (data[IFLA_GRE_OFLAGS])
+ flags |= nla_get_be16(data[IFLA_GRE_OFLAGS]);
+ if (flags & (GRE_VERSION|GRE_ROUTING))
+ return -EINVAL;
+
+ return 0;
+}
+
+static void ipgre_netlink_parms(struct nlattr *data[],
+ struct ip_tunnel_parm *parms)
+{
+ memset(parms, 0, sizeof(*parms));
+
+ parms->iph.protocol = IPPROTO_GRE;
+
+ if (!data)
+ return;
+
+ if (data[IFLA_GRE_LINK])
+ parms->link = nla_get_u32(data[IFLA_GRE_LINK]);
+
+ if (data[IFLA_GRE_IFLAGS])
+ parms->i_flags = nla_get_be16(data[IFLA_GRE_IFLAGS]);
+
+ if (data[IFLA_GRE_OFLAGS])
+ parms->o_flags = nla_get_be16(data[IFLA_GRE_OFLAGS]);
+
+ if (data[IFLA_GRE_IKEY])
+ parms->i_key = nla_get_be32(data[IFLA_GRE_IKEY]);
+
+ if (data[IFLA_GRE_OKEY])
+ parms->o_key = nla_get_be32(data[IFLA_GRE_OKEY]);
+
+ if (data[IFLA_GRE_LOCAL])
+ memcpy(&parms->iph.saddr, nla_data(data[IFLA_GRE_LOCAL]), 4);
+
+ if (data[IFLA_GRE_REMOTE])
+ memcpy(&parms->iph.daddr, nla_data(data[IFLA_GRE_REMOTE]), 4);
+
+ if (data[IFLA_GRE_TTL])
+ parms->iph.ttl = nla_get_u8(data[IFLA_GRE_TTL]);
+
+ if (data[IFLA_GRE_TOS])
+ parms->iph.tos = nla_get_u8(data[IFLA_GRE_TOS]);
+
+ if (!data[IFLA_GRE_PMTUDISC] || nla_get_u8(data[IFLA_GRE_PMTUDISC]))
+ parms->iph.frag_off = htons(IP_DF);
+}
+
+static int ipgre_newlink(struct net_device *dev, struct nlattr *tb[],
+ struct nlattr *data[])
+{
+ struct ip_tunnel *nt;
+ struct net *net = dev_net(dev);
+ struct ipgre_net *ign = net_generic(net, ipgre_net_id);
+ int mtu;
+ int err;
+
+ nt = netdev_priv(dev);
+ ipgre_netlink_parms(data, &nt->parms);
+
+ if (ipgre_tunnel_locate(net, &nt->parms, 0))
+ return -EEXIST;
+
+ mtu = ipgre_tunnel_bind_dev(dev);
+ if (!tb[IFLA_MTU])
+ dev->mtu = mtu;
+
+ err = register_netdevice(dev);
+ if (err)
+ goto out;
+
+ dev_hold(dev);
+ ipgre_tunnel_link(ign, nt);
+
+out:
+ return err;
+}
+
+static int ipgre_changelink(struct net_device *dev, struct nlattr *tb[],
+ struct nlattr *data[])
+{
+ struct ip_tunnel *t, *nt;
+ struct net *net = dev_net(dev);
+ struct ipgre_net *ign = net_generic(net, ipgre_net_id);
+ struct ip_tunnel_parm p;
+ int mtu;
+
+ if (dev == ign->fb_tunnel_dev)
+ return -EINVAL;
+
+ nt = netdev_priv(dev);
+ ipgre_netlink_parms(data, &p);
+
+ t = ipgre_tunnel_locate(net, &p, 0);
+
+ if (t) {
+ if (t->dev != dev)
+ return -EEXIST;
+ } else {
+ unsigned nflags = 0;
+
+ t = nt;
+
+ if (ipv4_is_multicast(p.iph.daddr))
+ nflags = IFF_BROADCAST;
+ else if (p.iph.daddr)
+ nflags = IFF_POINTOPOINT;
+
+ if ((dev->flags ^ nflags) &
+ (IFF_POINTOPOINT | IFF_BROADCAST))
+ return -EINVAL;
+
+ ipgre_tunnel_unlink(ign, t);
+ t->parms.iph.saddr = p.iph.saddr;
+ t->parms.iph.daddr = p.iph.daddr;
+ t->parms.i_key = p.i_key;
+ memcpy(dev->dev_addr, &p.iph.saddr, 4);
+ memcpy(dev->broadcast, &p.iph.daddr, 4);
+ ipgre_tunnel_link(ign, t);
+ netdev_state_change(dev);
+ }
+
+ t->parms.o_key = p.o_key;
+ t->parms.iph.ttl = p.iph.ttl;
+ t->parms.iph.tos = p.iph.tos;
+ t->parms.iph.frag_off = p.iph.frag_off;
+
+ if (t->parms.link != p.link) {
+ t->parms.link = p.link;
+ mtu = ipgre_tunnel_bind_dev(dev);
+ if (!tb[IFLA_MTU])
+ dev->mtu = mtu;
+ netdev_state_change(dev);
+ }
+
+ return 0;
+}
+
+static size_t ipgre_get_size(const struct net_device *dev)
+{
+ return
+ /* IFLA_GRE_LINK */
+ nla_total_size(4) +
+ /* IFLA_GRE_IFLAGS */
+ nla_total_size(2) +
+ /* IFLA_GRE_OFLAGS */
+ nla_total_size(2) +
+ /* IFLA_GRE_IKEY */
+ nla_total_size(4) +
+ /* IFLA_GRE_OKEY */
+ nla_total_size(4) +
+ /* IFLA_GRE_LOCAL */
+ nla_total_size(4) +
+ /* IFLA_GRE_REMOTE */
+ nla_total_size(4) +
+ /* IFLA_GRE_TTL */
+ nla_total_size(1) +
+ /* IFLA_GRE_TOS */
+ nla_total_size(1) +
+ /* IFLA_GRE_PMTUDISC */
+ nla_total_size(1) +
+ 0;
+}
+
+static int ipgre_fill_info(struct sk_buff *skb, const struct net_device *dev)
+{
+ struct ip_tunnel *t = netdev_priv(dev);
+ struct ip_tunnel_parm *p = &t->parms;
+
+ NLA_PUT_U32(skb, IFLA_GRE_LINK, p->link);
+ NLA_PUT_BE16(skb, IFLA_GRE_IFLAGS, p->i_flags);
+ NLA_PUT_BE16(skb, IFLA_GRE_OFLAGS, p->o_flags);
+ NLA_PUT_BE32(skb, IFLA_GRE_IKEY, p->i_key);
+ NLA_PUT_BE32(skb, IFLA_GRE_OKEY, p->o_key);
+ NLA_PUT(skb, IFLA_GRE_LOCAL, 4, &p->iph.saddr);
+ NLA_PUT(skb, IFLA_GRE_REMOTE, 4, &p->iph.daddr);
+ NLA_PUT_U8(skb, IFLA_GRE_TTL, p->iph.ttl);
+ NLA_PUT_U8(skb, IFLA_GRE_TOS, p->iph.tos);
+ NLA_PUT_U8(skb, IFLA_GRE_PMTUDISC, !!(p->iph.frag_off & htons(IP_DF)));
+
+ return 0;
+
+nla_put_failure:
+ return -EMSGSIZE;
+}
+
+static const struct nla_policy ipgre_policy[IFLA_GRE_MAX + 1] = {
+ [IFLA_GRE_LINK] = { .type = NLA_U32 },
+ [IFLA_GRE_IFLAGS] = { .type = NLA_U16 },
+ [IFLA_GRE_OFLAGS] = { .type = NLA_U16 },
+ [IFLA_GRE_IKEY] = { .type = NLA_U32 },
+ [IFLA_GRE_OKEY] = { .type = NLA_U32 },
+ [IFLA_GRE_LOCAL] = { .len = 4 },
+ [IFLA_GRE_REMOTE] = { .len = 4 },
+ [IFLA_GRE_TTL] = { .type = NLA_U8 },
+ [IFLA_GRE_TOS] = { .type = NLA_U8 },
+ [IFLA_GRE_PMTUDISC] = { .type = NLA_U8 },
+};
+
+static struct rtnl_link_ops ipgre_link_ops __read_mostly = {
+ .kind = "gre",
+ .maxtype = IFLA_GRE_MAX,
+ .policy = ipgre_policy,
+ .priv_size = sizeof(struct ip_tunnel),
+ .setup = ipgre_tunnel_setup,
+ .validate = ipgre_tunnel_validate,
+ .newlink = ipgre_newlink,
+ .changelink = ipgre_changelink,
+ .get_size = ipgre_get_size,
+ .fill_info = ipgre_fill_info,
+};
+
/*
* And now the modules code and kernel interface.
*/
@@ -1245,19 +1472,31 @@ static int __init ipgre_init(void)
err = register_pernet_gen_device(&ipgre_net_id, &ipgre_net_ops);
if (err < 0)
- inet_del_protocol(&ipgre_protocol, IPPROTO_GRE);
+ goto gen_device_failed;
+ err = rtnl_link_register(&ipgre_link_ops);
+ if (err < 0)
+ goto rtnl_link_failed;
+
+out:
return err;
+
+rtnl_link_failed:
+ unregister_pernet_gen_device(ipgre_net_id, &ipgre_net_ops);
+gen_device_failed:
+ inet_del_protocol(&ipgre_protocol, IPPROTO_GRE);
+ goto out;
}
static void __exit ipgre_fini(void)
{
+ rtnl_link_unregister(&ipgre_link_ops);
+ unregister_pernet_gen_device(ipgre_net_id, &ipgre_net_ops);
if (inet_del_protocol(&ipgre_protocol, IPPROTO_GRE) < 0)
printk(KERN_INFO "ipgre close: can't remove protocol\n");
-
- unregister_pernet_gen_device(ipgre_net_id, &ipgre_net_ops);
}
module_init(ipgre_init);
module_exit(ipgre_fini);
MODULE_LICENSE("GPL");
+MODULE_ALIAS("rtnl-link-gre");
* [PATCH 4/4] gre: Add Transparent Ethernet Bridging
2008-10-09 7:04 [0/4] gre: Ethernet over GRE Herbert Xu
` (2 preceding siblings ...)
2008-10-09 7:05 ` [PATCH 3/4] gre: Add netlink interface Herbert Xu
@ 2008-10-09 7:05 ` Herbert Xu
2008-10-09 8:43 ` Philip Craig
2008-10-09 11:21 ` [PATCH 4/4] gre: Add Transparent Ethernet Bridging Herbert Xu
2008-10-09 7:08 ` ip: gre: Add GRE configuration support through rtnl_link Herbert Xu
` (3 subsequent siblings)
7 siblings, 2 replies; 16+ messages in thread
From: Herbert Xu @ 2008-10-09 7:05 UTC (permalink / raw)
To: David S. Miller, netdev, Patrick McHardy
gre: Add Transparent Ethernet Bridging
This patch adds support for Ethernet over GRE encapsulation.
This is exposed to user-space with a new link type of "gretap"
instead of "gre". It will create an ARPHRD_ETHER device in
lieu of the usual ARPHRD_IPGRE.
Note that to preserve backwards compatibility all Transparent
Ethernet Bridging packets are passed to an ARPHRD_IPGRE tunnel
if its key matches and there is no ARPHRD_ETHER device whose
key matches more closely.
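The fallback rule can be modelled in isolation like this. It is a simplified sketch: the real lookup walks four hash tables from most to least specific and finally falls back to the fallback tunnel device, while this version scans one flat array. The sk_ names are hypothetical and values are in host byte order.

```c
#include <stddef.h>
#include <stdint.h>

#define SK_ARPHRD_ETHER 1	/* Ethernet device type */
#define SK_ARPHRD_IPGRE 778	/* classic GRE tunnel device type */
#define SK_ETH_P_TEB    0x6558	/* Transparent Ethernet Bridging */

struct sk_tunnel {
	uint32_t key;
	int dev_type;
	int up;
};

/* Prefer a tunnel whose device type matches the GRE protocol field;
 * otherwise hand the packet to the first matching ARPHRD_IPGRE tunnel,
 * which is what older setups without gretap devices expect. */
static const struct sk_tunnel *sk_lookup(const struct sk_tunnel *tbl,
					 size_t n, uint32_t key,
					 uint16_t gre_proto)
{
	int want = (gre_proto == SK_ETH_P_TEB) ?
		   SK_ARPHRD_ETHER : SK_ARPHRD_IPGRE;
	const struct sk_tunnel *fallback = NULL;
	size_t i;

	for (i = 0; i < n; i++) {
		const struct sk_tunnel *t = &tbl[i];

		if (t->key != key || !t->up)
			continue;
		if (t->dev_type == want)
			return t;
		if (t->dev_type == SK_ARPHRD_IPGRE && !fallback)
			fallback = t;
	}
	return fallback;
}
```

Note the asymmetry: an Ethernet-type tunnel is never used as a fallback for non-TEB traffic, so a plain IP-in-GRE packet with only a gretap device matching its key is not delivered to that device.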
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
---
include/linux/if_ether.h | 1
net/ipv4/ip_gre.c | 195 +++++++++++++++++++++++++++++++++++++++--------
2 files changed, 164 insertions(+), 32 deletions(-)
diff --git a/include/linux/if_ether.h b/include/linux/if_ether.h
index e157c13..56f9039 100644
--- a/include/linux/if_ether.h
+++ b/include/linux/if_ether.h
@@ -56,6 +56,7 @@
#define ETH_P_DIAG 0x6005 /* DEC Diagnostics */
#define ETH_P_CUST 0x6006 /* DEC Customer use */
#define ETH_P_SCA 0x6007 /* DEC Systems Comms Arch */
+#define ETH_P_TEB 0x6558 /* Trans Ether Bridging */
#define ETH_P_RARP 0x8035 /* Reverse Addr Res packet */
#define ETH_P_ATALK 0x809B /* Appletalk DDP */
#define ETH_P_AARP 0x80F3 /* Appletalk AARP */
diff --git a/net/ipv4/ip_gre.c b/net/ipv4/ip_gre.c
index 25d2c77..bc462ee 100644
--- a/net/ipv4/ip_gre.c
+++ b/net/ipv4/ip_gre.c
@@ -27,6 +27,7 @@
#include <linux/inetdevice.h>
#include <linux/igmp.h>
#include <linux/netfilter_ipv4.h>
+#include <linux/etherdevice.h>
#include <linux/if_ether.h>
#include <net/sock.h>
@@ -166,38 +167,64 @@ static DEFINE_RWLOCK(ipgre_lock);
/* Given src, dst and key, find appropriate for input tunnel. */
static struct ip_tunnel * ipgre_tunnel_lookup(struct net *net,
- __be32 remote, __be32 local, __be32 key)
+ __be32 remote, __be32 local,
+ __be32 key, __be16 gre_proto)
{
unsigned h0 = HASH(remote);
unsigned h1 = HASH(key);
struct ip_tunnel *t;
+ struct ip_tunnel *t2 = NULL;
struct ipgre_net *ign = net_generic(net, ipgre_net_id);
+ int dev_type = (gre_proto == htons(ETH_P_TEB)) ?
+ ARPHRD_ETHER : ARPHRD_IPGRE;
for (t = ign->tunnels_r_l[h0^h1]; t; t = t->next) {
if (local == t->parms.iph.saddr && remote == t->parms.iph.daddr) {
- if (t->parms.i_key == key && (t->dev->flags&IFF_UP))
- return t;
+ if (t->parms.i_key == key && t->dev->flags & IFF_UP) {
+ if (t->dev->type == dev_type)
+ return t;
+ if (t->dev->type == ARPHRD_IPGRE && !t2)
+ t2 = t;
+ }
}
}
+
for (t = ign->tunnels_r[h0^h1]; t; t = t->next) {
if (remote == t->parms.iph.daddr) {
- if (t->parms.i_key == key && (t->dev->flags&IFF_UP))
- return t;
+ if (t->parms.i_key == key && t->dev->flags & IFF_UP) {
+ if (t->dev->type == dev_type)
+ return t;
+ if (t->dev->type == ARPHRD_IPGRE && !t2)
+ t2 = t;
+ }
}
}
+
for (t = ign->tunnels_l[h1]; t; t = t->next) {
if (local == t->parms.iph.saddr ||
(local == t->parms.iph.daddr &&
ipv4_is_multicast(local))) {
- if (t->parms.i_key == key && (t->dev->flags&IFF_UP))
- return t;
+ if (t->parms.i_key == key && t->dev->flags & IFF_UP) {
+ if (t->dev->type == dev_type)
+ return t;
+ if (t->dev->type == ARPHRD_IPGRE && !t2)
+ t2 = t;
+ }
}
}
+
for (t = ign->tunnels_wc[h1]; t; t = t->next) {
- if (t->parms.i_key == key && (t->dev->flags&IFF_UP))
- return t;
+ if (t->parms.i_key == key && t->dev->flags & IFF_UP) {
+ if (t->dev->type == dev_type)
+ return t;
+ if (t->dev->type == ARPHRD_IPGRE && !t2)
+ t2 = t;
+ }
}
+ if (t2)
+ return t2;
+
if (ign->fb_tunnel_dev->flags&IFF_UP)
return netdev_priv(ign->fb_tunnel_dev);
return NULL;
@@ -252,25 +279,37 @@ static void ipgre_tunnel_unlink(struct ipgre_net *ign, struct ip_tunnel *t)
}
}
-static struct ip_tunnel * ipgre_tunnel_locate(struct net *net,
- struct ip_tunnel_parm *parms, int create)
+static struct ip_tunnel *ipgre_tunnel_find(struct net *net,
+ struct ip_tunnel_parm *parms,
+ int type)
{
__be32 remote = parms->iph.daddr;
__be32 local = parms->iph.saddr;
__be32 key = parms->i_key;
- struct ip_tunnel *t, **tp, *nt;
+ struct ip_tunnel *t, **tp;
+ struct ipgre_net *ign = net_generic(net, ipgre_net_id);
+
+ for (tp = __ipgre_bucket(ign, parms); (t = *tp) != NULL; tp = &t->next)
+ if (local == t->parms.iph.saddr &&
+ remote == t->parms.iph.daddr &&
+ key == t->parms.i_key &&
+ type == t->dev->type)
+ break;
+
+ return t;
+}
+
+static struct ip_tunnel * ipgre_tunnel_locate(struct net *net,
+ struct ip_tunnel_parm *parms, int create)
+{
+ struct ip_tunnel *t, *nt;
struct net_device *dev;
char name[IFNAMSIZ];
struct ipgre_net *ign = net_generic(net, ipgre_net_id);
- for (tp = __ipgre_bucket(ign, parms); (t = *tp) != NULL; tp = &t->next) {
- if (local == t->parms.iph.saddr && remote == t->parms.iph.daddr) {
- if (key == t->parms.i_key)
- return t;
- }
- }
- if (!create)
- return NULL;
+ t = ipgre_tunnel_find(net, parms, ARPHRD_IPGRE);
+ if (t || !create)
+ return t;
if (parms->name[0])
strlcpy(name, parms->name, IFNAMSIZ);
@@ -385,8 +424,9 @@ static void ipgre_err(struct sk_buff *skb, u32 info)
read_lock(&ipgre_lock);
t = ipgre_tunnel_lookup(dev_net(skb->dev), iph->daddr, iph->saddr,
- (flags&GRE_KEY) ?
- *(((__be32*)p) + (grehlen>>2) - 1) : 0);
+ flags & GRE_KEY ?
+ *(((__be32 *)p) + (grehlen / 4) - 1) : 0,
+ p[1]);
if (t == NULL || t->parms.iph.daddr == 0 ||
ipv4_is_multicast(t->parms.iph.daddr))
goto out;
@@ -436,6 +476,7 @@ static int ipgre_rcv(struct sk_buff *skb)
u32 seqno = 0;
struct ip_tunnel *tunnel;
int offset = 4;
+ __be16 gre_proto;
if (!pskb_may_pull(skb, 16))
goto drop_nolock;
@@ -475,20 +516,22 @@ static int ipgre_rcv(struct sk_buff *skb)
}
}
+ gre_proto = *(__be16 *)(h + 2);
+
read_lock(&ipgre_lock);
if ((tunnel = ipgre_tunnel_lookup(dev_net(skb->dev),
- iph->saddr, iph->daddr, key)) != NULL) {
+ iph->saddr, iph->daddr, key,
+ gre_proto))) {
struct net_device_stats *stats = &tunnel->dev->stats;
secpath_reset(skb);
- skb->protocol = *(__be16*)(h + 2);
+ skb->protocol = gre_proto;
/* WCCP version 1 and 2 protocol decoding.
* - Change protocol to IP
* - When dealing with WCCPv2, Skip extra 4 bytes in GRE header
*/
- if (flags == 0 &&
- skb->protocol == htons(ETH_P_WCCP)) {
+ if (flags == 0 && gre_proto == htons(ETH_P_WCCP)) {
skb->protocol = htons(ETH_P_IP);
if ((*(h + offset) & 0xF0) != 0x40)
offset += 4;
@@ -496,7 +539,6 @@ static int ipgre_rcv(struct sk_buff *skb)
skb->mac_header = skb->network_header;
__pskb_pull(skb, offset);
- skb_reset_network_header(skb);
skb_postpull_rcsum(skb, skb_transport_header(skb), offset);
skb->pkt_type = PACKET_HOST;
#ifdef CONFIG_NET_IPGRE_BROADCAST
@@ -530,7 +572,13 @@ static int ipgre_rcv(struct sk_buff *skb)
dst_release(skb->dst);
skb->dst = NULL;
nf_reset(skb);
+
+ if (tunnel->dev->type == ARPHRD_ETHER)
+ skb->protocol = eth_type_trans(skb, skb->dev);
+
+ skb_reset_network_header(skb);
ipgre_ecn_decapsulate(iph, skb);
+
netif_rx(skb);
read_unlock(&ipgre_lock);
return(0);
@@ -565,7 +613,10 @@ static int ipgre_tunnel_xmit(struct sk_buff *skb, struct net_device *dev)
goto tx_error;
}
- if (dev->header_ops) {
+ if (dev->type == ARPHRD_ETHER)
+ IPCB(skb)->flags = 0;
+
+ if (dev->header_ops && dev->type == ARPHRD_IPGRE) {
gre_hlen = 0;
tiph = (struct iphdr*)skb->data;
} else {
@@ -741,8 +792,9 @@ static int ipgre_tunnel_xmit(struct sk_buff *skb, struct net_device *dev)
iph->ttl = dst_metric(&rt->u.dst, RTAX_HOPLIMIT);
}
- ((__be16*)(iph+1))[0] = tunnel->parms.o_flags;
- ((__be16*)(iph+1))[1] = skb->protocol;
+ ((__be16 *)(iph + 1))[0] = tunnel->parms.o_flags;
+ ((__be16 *)(iph + 1))[1] = (dev->type == ARPHRD_ETHER) ?
+ htons(ETH_P_TEB) : skb->protocol;
if (tunnel->parms.o_flags&(GRE_KEY|GRE_CSUM|GRE_SEQ)) {
__be32 *ptr = (__be32*)(((u8*)iph) + tunnel->hlen - 4);
@@ -804,7 +856,9 @@ static int ipgre_tunnel_bind_dev(struct net_device *dev)
tdev = rt->u.dst.dev;
ip_rt_put(rt);
}
- dev->flags |= IFF_POINTOPOINT;
+
+ if (dev->type != ARPHRD_ETHER)
+ dev->flags |= IFF_POINTOPOINT;
}
if (!tdev && tunnel->parms.link)
@@ -1250,6 +1304,30 @@ static int ipgre_tunnel_validate(struct nlattr *tb[], struct nlattr *data[])
return 0;
}
+static int ipgre_tap_validate(struct nlattr *tb[], struct nlattr *data[])
+{
+ __be32 daddr;
+
+ if (tb[IFLA_ADDRESS]) {
+ if (nla_len(tb[IFLA_ADDRESS]) != ETH_ALEN)
+ return -EINVAL;
+ if (!is_valid_ether_addr(nla_data(tb[IFLA_ADDRESS])))
+ return -EADDRNOTAVAIL;
+ }
+
+ if (!data)
+ goto out;
+
+ if (data[IFLA_GRE_REMOTE]) {
+ memcpy(&daddr, nla_data(data[IFLA_GRE_REMOTE]), 4);
+ if (!daddr)
+ return -EINVAL;
+ }
+
+out:
+ return ipgre_tunnel_validate(tb, data);
+}
+
static void ipgre_netlink_parms(struct nlattr *data[],
struct ip_tunnel_parm *parms)
{
@@ -1291,6 +1369,35 @@ static void ipgre_netlink_parms(struct nlattr *data[],
parms->iph.frag_off = htons(IP_DF);
}
+static int ipgre_tap_init(struct net_device *dev)
+{
+ struct ip_tunnel *tunnel;
+
+ tunnel = netdev_priv(dev);
+
+ tunnel->dev = dev;
+ strcpy(tunnel->parms.name, dev->name);
+
+ ipgre_tunnel_bind_dev(dev);
+
+ return 0;
+}
+
+static void ipgre_tap_setup(struct net_device *dev)
+{
+
+ ether_setup(dev);
+
+ dev->init = ipgre_tap_init;
+ dev->uninit = ipgre_tunnel_uninit;
+ dev->destructor = free_netdev;
+ dev->hard_start_xmit = ipgre_tunnel_xmit;
+ dev->change_mtu = ipgre_tunnel_change_mtu;
+
+ dev->iflink = 0;
+ dev->features |= NETIF_F_NETNS_LOCAL;
+}
+
static int ipgre_newlink(struct net_device *dev, struct nlattr *tb[],
struct nlattr *data[])
{
@@ -1303,9 +1410,12 @@ static int ipgre_newlink(struct net_device *dev, struct nlattr *tb[],
nt = netdev_priv(dev);
ipgre_netlink_parms(data, &nt->parms);
- if (ipgre_tunnel_locate(net, &nt->parms, 0))
+ if (ipgre_tunnel_find(net, &nt->parms, dev->type))
return -EEXIST;
+ if (dev->type == ARPHRD_ETHER && !tb[IFLA_ADDRESS])
+ random_ether_addr(dev->dev_addr);
+
mtu = ipgre_tunnel_bind_dev(dev);
if (!tb[IFLA_MTU])
dev->mtu = mtu;
@@ -1455,6 +1565,19 @@ static struct rtnl_link_ops ipgre_link_ops __read_mostly = {
.fill_info = ipgre_fill_info,
};
+static struct rtnl_link_ops ipgre_tap_ops __read_mostly = {
+ .kind = "gretap",
+ .maxtype = IFLA_GRE_MAX,
+ .policy = ipgre_policy,
+ .priv_size = sizeof(struct ip_tunnel),
+ .setup = ipgre_tap_setup,
+ .validate = ipgre_tap_validate,
+ .newlink = ipgre_newlink,
+ .changelink = ipgre_changelink,
+ .get_size = ipgre_get_size,
+ .fill_info = ipgre_fill_info,
+};
+
/*
* And now the modules code and kernel interface.
*/
@@ -1478,9 +1601,15 @@ static int __init ipgre_init(void)
if (err < 0)
goto rtnl_link_failed;
+ err = rtnl_link_register(&ipgre_tap_ops);
+ if (err < 0)
+ goto tap_ops_failed;
+
out:
return err;
+tap_ops_failed:
+ rtnl_link_unregister(&ipgre_link_ops);
rtnl_link_failed:
unregister_pernet_gen_device(ipgre_net_id, &ipgre_net_ops);
gen_device_failed:
@@ -1490,6 +1619,7 @@ gen_device_failed:
static void __exit ipgre_fini(void)
{
+ rtnl_link_unregister(&ipgre_tap_ops);
rtnl_link_unregister(&ipgre_link_ops);
unregister_pernet_gen_device(ipgre_net_id, &ipgre_net_ops);
if (inet_del_protocol(&ipgre_protocol, IPPROTO_GRE) < 0)
@@ -1500,3 +1630,4 @@ module_init(ipgre_init);
module_exit(ipgre_fini);
MODULE_LICENSE("GPL");
MODULE_ALIAS("rtnl-link-gre");
+MODULE_ALIAS("rtnl-link-gretap");
* ip: gre: Add GRE configuration support through rtnl_link
2008-10-09 7:04 [0/4] gre: Ethernet over GRE Herbert Xu
` (3 preceding siblings ...)
2008-10-09 7:05 ` [PATCH 4/4] gre: Add Transparent Ethernet Bridging Herbert Xu
@ 2008-10-09 7:08 ` Herbert Xu
2008-10-11 13:03 ` Herbert Xu
2008-10-09 11:23 ` [PATCH 5/4] inet: Make tunnel RX/TX byte counters more consistent Herbert Xu
` (2 subsequent siblings)
7 siblings, 1 reply; 16+ messages in thread
From: Herbert Xu @ 2008-10-09 7:08 UTC (permalink / raw)
To: David S. Miller, netdev, Stephen Hemminger; +Cc: Patrick McHardy
ip: gre: Add GRE configuration support through rtnl_link
This patch adds support for configuring GRE tunnels using the
new rtnl_link interface. This only works on kernels that have
the new GRE configuration interface.
This is accessed through the "ip link" command. The previous
tunnel configuration interface "ip tunnel" remains as it is
and should be retained for compatibility with old kernels.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
--
diff --git a/include/linux/if_tunnel.h b/include/linux/if_tunnel.h
index d4efe40..b9b8760 100644
--- a/include/linux/if_tunnel.h
+++ b/include/linux/if_tunnel.h
@@ -2,6 +2,7 @@
#define _IF_TUNNEL_H_
#include <linux/types.h>
+#include <linux/ip.h>
#define SIOCGETTUNNEL (SIOCDEVPRIVATE + 0)
#define SIOCADDTUNNEL (SIOCDEVPRIVATE + 1)
@@ -47,4 +48,26 @@ struct ip_tunnel_prl {
/* PRL flags */
#define PRL_DEFAULT 0x0001
+enum
+{
+ IFLA_GRE_UNSPEC,
+ IFLA_GRE_LINK,
+ IFLA_GRE_IFLAGS,
+ IFLA_GRE_OFLAGS,
+ IFLA_GRE_IKEY,
+ IFLA_GRE_OKEY,
+ IFLA_GRE_LOCAL,
+ IFLA_GRE_REMOTE,
+ IFLA_GRE_TTL,
+ IFLA_GRE_TOS,
+ IFLA_GRE_PMTUDISC,
+ IFLA_GRE_TYPE,
+ __IFLA_GRE_MAX,
+};
+
+#define IFLA_GRE_MAX (__IFLA_GRE_MAX - 1)
+
+#define GRE_TYPE_TUN 0x00000000
+#define GRE_TYPE_TAP 0x00000001
+
#endif /* _IF_TUNNEL_H_ */
diff --git a/ip/Makefile b/ip/Makefile
index 73978ff..98ba876 100644
--- a/ip/Makefile
+++ b/ip/Makefile
@@ -2,7 +2,7 @@ IPOBJ=ip.o ipaddress.o ipaddrlabel.o iproute.o iprule.o \
rtm_map.o iptunnel.o ip6tunnel.o tunnel.o ipneigh.o ipntable.o iplink.o \
ipmaddr.o ipmonitor.o ipmroute.o ipprefix.o \
ipxfrm.o xfrm_state.o xfrm_policy.o xfrm_monitor.o \
- iplink_vlan.o link_veth.o
+ iplink_vlan.o link_veth.o link_gre.o
RTMONOBJ=rtmon.o
diff --git a/ip/link_gre.c b/ip/link_gre.c
new file mode 100644
index 0000000..e92b0d3
--- /dev/null
+++ b/ip/link_gre.c
@@ -0,0 +1,294 @@
+/*
+ * link_gre.c gre driver module
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ *
+ * Authors: Herbert Xu <herbert@gondor.apana.org.au>
+ *
+ */
+
+#include <string.h>
+#include <net/if.h>
+#include <linux/if_tunnel.h>
+#include <sys/types.h>
+#include <sys/socket.h>
+#include <arpa/inet.h>
+
+#include "rt_names.h"
+#include "utils.h"
+#include "ip_common.h"
+#include "tunnel.h"
+
+static void usage(void) __attribute__((noreturn));
+static void usage(void)
+{
+ fprintf(stderr, "Usage: ip link { add | set | change | replace | del } NAME\n");
+ fprintf(stderr, " type { gre | gretap } [ remote ADDR ] [ local ADDR ]\n");
+ fprintf(stderr, " [ [i|o]seq ] [ [i|o]key KEY ] [ [i|o]csum ]\n");
+ fprintf(stderr, " [ ttl TTL ] [ tos TOS ] [ [no]pmtudisc ] [ dev PHYS_DEV ]\n");
+ fprintf(stderr, "\n");
+ fprintf(stderr, "Where: NAME := STRING\n");
+ fprintf(stderr, " ADDR := { IP_ADDRESS | any }\n");
+ fprintf(stderr, " TOS := { NUMBER | inherit }\n");
+ fprintf(stderr, " TTL := { 1..255 | inherit }\n");
+ fprintf(stderr, " KEY := { DOTTED_QUAD | NUMBER }\n");
+ exit(-1);
+}
+
+static int gre_parse_opt(struct link_util *lu, int argc, char **argv,
+ struct nlmsghdr *n)
+{
+ __u16 iflags = 0;
+ __u16 oflags = 0;
+ unsigned ikey = 0;
+ unsigned okey = 0;
+ unsigned saddr = 0;
+ unsigned daddr = 0;
+
+ while (argc > 0) {
+ if (!matches(*argv, "key")) {
+ unsigned uval;
+
+ NEXT_ARG();
+ iflags |= GRE_KEY;
+ oflags |= GRE_KEY;
+ if (strchr(*argv, '.'))
+ uval = get_addr32(*argv);
+ else {
+ if (get_unsigned(&uval, *argv, 0) < 0) {
+ fprintf(stderr,
+ "Invalid value for \"key\"\n");
+ exit(-1);
+ }
+ uval = htonl(uval);
+ }
+
+ ikey = okey = uval;
+ } else if (!matches(*argv, "ikey")) {
+ unsigned uval;
+
+ NEXT_ARG();
+ iflags |= GRE_KEY;
+ if (strchr(*argv, '.'))
+ uval = get_addr32(*argv);
+ else {
+ if (get_unsigned(&uval, *argv, 0)<0) {
+ fprintf(stderr, "invalid value of \"ikey\"\n");
+ exit(-1);
+ }
+ uval = htonl(uval);
+ }
+ ikey = uval;
+ } else if (!matches(*argv, "okey")) {
+ unsigned uval;
+
+ NEXT_ARG();
+ oflags |= GRE_KEY;
+ if (strchr(*argv, '.'))
+ uval = get_addr32(*argv);
+ else {
+ if (get_unsigned(&uval, *argv, 0)<0) {
+ fprintf(stderr, "invalid value of \"okey\"\n");
+ exit(-1);
+ }
+ uval = htonl(uval);
+ }
+ okey = uval;
+ } else if (!matches(*argv, "seq")) {
+ iflags |= GRE_SEQ;
+ oflags |= GRE_SEQ;
+ } else if (!matches(*argv, "iseq")) {
+ iflags |= GRE_SEQ;
+ } else if (!matches(*argv, "oseq")) {
+ oflags |= GRE_SEQ;
+ } else if (!matches(*argv, "csum")) {
+ iflags |= GRE_CSUM;
+ oflags |= GRE_CSUM;
+ } else if (!matches(*argv, "icsum")) {
+ iflags |= GRE_CSUM;
+ } else if (!matches(*argv, "ocsum")) {
+ oflags |= GRE_CSUM;
+ } else if (!matches(*argv, "nopmtudisc")) {
+ __u8 val = 0;
+
+ addattr_l(n, 1024, IFLA_GRE_PMTUDISC, &val, 1);
+ } else if (!matches(*argv, "pmtudisc")) {
+ __u8 val = 1;
+
+ addattr_l(n, 1024, IFLA_GRE_PMTUDISC, &val, 1);
+ } else if (!matches(*argv, "remote")) {
+ NEXT_ARG();
+ if (strcmp(*argv, "any"))
+ daddr = get_addr32(*argv);
+ } else if (!matches(*argv, "local")) {
+ NEXT_ARG();
+ if (strcmp(*argv, "any"))
+ saddr = get_addr32(*argv);
+ } else if (!matches(*argv, "dev")) {
+ unsigned link;
+
+ NEXT_ARG();
+ link = tnl_ioctl_get_ifindex(*argv);
+ if (link == 0)
+ exit(-1);
+
+ addattr32(n, 1024, IFLA_GRE_LINK, link);
+ } else if (!matches(*argv, "ttl") ||
+ !matches(*argv, "hoplimit")) {
+ unsigned uval;
+ __u8 ttl;
+
+ NEXT_ARG();
+ if (strcmp(*argv, "inherit") != 0) {
+ if (get_unsigned(&uval, *argv, 0))
+ invarg("invalid TTL\n", *argv);
+ if (uval > 255)
+ invarg("TTL must be <= 255\n", *argv);
+ ttl = uval;
+ addattr_l(n, 1024, IFLA_GRE_TTL, &ttl, 1);
+ }
+ } else if (!matches(*argv, "tos") ||
+ !matches(*argv, "tclass") ||
+ !matches(*argv, "dsfield")) {
+ __u32 uval;
+ __u8 tos;
+
+ NEXT_ARG();
+ if (strcmp(*argv, "inherit") != 0) {
+ if (rtnl_dsfield_a2n(&uval, *argv))
+ invarg("bad TOS value", *argv);
+ tos = uval;
+ } else
+ tos = 1;
+ addattr_l(n, 1024, IFLA_GRE_TOS, &tos, 1);
+ } else
+ usage();
+ argc--; argv++;
+ }
+
+ if (!ikey && IN_MULTICAST(ntohl(daddr))) {
+ ikey = daddr;
+ iflags |= GRE_KEY;
+ }
+ if (!okey && IN_MULTICAST(ntohl(daddr))) {
+ okey = daddr;
+ oflags |= GRE_KEY;
+ }
+ if (IN_MULTICAST(ntohl(daddr)) && !saddr) {
+ fprintf(stderr, "Broadcast tunnel requires a source address.\n");
+ return -1;
+ }
+
+ addattr32(n, 1024, IFLA_GRE_IKEY, ikey);
+ addattr32(n, 1024, IFLA_GRE_OKEY, okey);
+ addattr_l(n, 1024, IFLA_GRE_IFLAGS, &iflags, 2);
+ addattr_l(n, 1024, IFLA_GRE_OFLAGS, &oflags, 2);
+ addattr_l(n, 1024, IFLA_GRE_LOCAL, &saddr, 4);
+ addattr_l(n, 1024, IFLA_GRE_REMOTE, &daddr, 4);
+
+ return 0;
+}
+
+static void gre_print_opt(struct link_util *lu, FILE *f, struct rtattr *tb[])
+{
+ char s1[1024];
+ char s2[64];
+ const char *local = "any";
+ const char *remote = "any";
+ unsigned iflags = 0;
+ unsigned oflags = 0;
+
+ if (!tb)
+ return;
+
+ if (tb[IFLA_GRE_REMOTE]) {
+ unsigned addr = *(__u32 *)RTA_DATA(tb[IFLA_GRE_REMOTE]);
+
+ if (addr)
+ remote = format_host(AF_INET, 4, &addr, s1, sizeof(s1));
+ }
+
+ fprintf(f, "remote %s ", remote);
+
+ if (tb[IFLA_GRE_LOCAL]) {
+ unsigned addr = *(__u32 *)RTA_DATA(tb[IFLA_GRE_LOCAL]);
+
+ if (addr)
+ local = format_host(AF_INET, 4, &addr, s1, sizeof(s1));
+ }
+
+ fprintf(f, "local %s ", local);
+
+ if (tb[IFLA_GRE_LINK] && *(__u32 *)RTA_DATA(tb[IFLA_GRE_LINK])) {
+ unsigned link = *(__u32 *)RTA_DATA(tb[IFLA_GRE_LINK]);
+ char *n = tnl_ioctl_get_ifname(link);
+
+ if (n)
+ fprintf(f, "dev %s ", n);
+ else
+ fprintf(f, "dev %u ", link);
+ }
+
+ if (tb[IFLA_GRE_TTL] && *(__u8 *)RTA_DATA(tb[IFLA_GRE_TTL]))
+ fprintf(f, "ttl %d ", *(__u8 *)RTA_DATA(tb[IFLA_GRE_TTL]));
+ else
+ fprintf(f, "ttl inherit ");
+
+ if (tb[IFLA_GRE_TOS] && *(__u8 *)RTA_DATA(tb[IFLA_GRE_TOS])) {
+ int tos = *(__u8 *)RTA_DATA(tb[IFLA_GRE_TOS]);
+
+ fputs("tos ", f);
+ if (tos == 1)
+ fputs("inherit ", f);
+ else
+ fprintf(f, "0x%x ", tos);
+ }
+
+ if (tb[IFLA_GRE_PMTUDISC] &&
+ !*(__u8 *)RTA_DATA(tb[IFLA_GRE_PMTUDISC]))
+ fputs("nopmtudisc ", f);
+
+ if (tb[IFLA_GRE_IFLAGS])
+ iflags = *(__u16 *)RTA_DATA(tb[IFLA_GRE_IFLAGS]);
+
+ if (tb[IFLA_GRE_OFLAGS])
+ oflags = *(__u16 *)RTA_DATA(tb[IFLA_GRE_OFLAGS]);
+
+ if (iflags & GRE_KEY && tb[IFLA_GRE_IKEY] &&
+ *(__u32 *)RTA_DATA(tb[IFLA_GRE_IKEY])) {
+ inet_ntop(AF_INET, RTA_DATA(tb[IFLA_GRE_IKEY]), s2, sizeof(s2));
+ fprintf(f, "ikey %s ", s2);
+ }
+
+ if (oflags & GRE_KEY && tb[IFLA_GRE_OKEY] &&
+ *(__u32 *)RTA_DATA(tb[IFLA_GRE_OKEY])) {
+ inet_ntop(AF_INET, RTA_DATA(tb[IFLA_GRE_OKEY]), s2, sizeof(s2));
+ fprintf(f, "okey %s ", s2);
+ }
+
+ if (iflags & GRE_SEQ)
+ fputs("iseq ", f);
+ if (oflags & GRE_SEQ)
+ fputs("oseq ", f);
+ if (iflags & GRE_CSUM)
+ fputs("icsum ", f);
+ if (oflags & GRE_CSUM)
+ fputs("ocsum ", f);
+}
+
+struct link_util gre_link_util = {
+ .id = "gre",
+ .maxattr = IFLA_GRE_MAX,
+ .parse_opt = gre_parse_opt,
+ .print_opt = gre_print_opt,
+};
+
+struct link_util gretap_link_util = {
+ .id = "gretap",
+ .maxattr = IFLA_GRE_MAX,
+ .parse_opt = gre_parse_opt,
+ .print_opt = gre_print_opt,
+};
^ permalink raw reply related [flat|nested] 16+ messages in thread
* Re: [PATCH 4/4] gre: Add Transparent Ethernet Bridging
2008-10-09 7:05 ` [PATCH 4/4] gre: Add Transparent Ethernet Bridging Herbert Xu
@ 2008-10-09 8:43 ` Philip Craig
2008-10-09 10:39 ` Herbert Xu
2008-10-09 11:21 ` [PATCH 4/4] gre: Add Transparent Ethernet Bridging Herbert Xu
1 sibling, 1 reply; 16+ messages in thread
From: Philip Craig @ 2008-10-09 8:43 UTC (permalink / raw)
To: Herbert Xu; +Cc: David S. Miller, netdev, Patrick McHardy
[-- Attachment #1: Type: text/plain, Size: 719 bytes --]
Thanks for doing this work, much appreciated.
I haven't tested your patches yet, but when I was attempting this,
I needed the attached patch to get it to work with bridging and
netfilter, otherwise gre tries to call
skb->dst->ops->update_pmtu(skb->dst, mtu);
Herbert Xu wrote:
> @@ -530,7 +572,13 @@ static int ipgre_rcv(struct sk_buff *skb)
> dst_release(skb->dst);
> skb->dst = NULL;
> nf_reset(skb);
> +
> + if (tunnel->dev->type == ARPHRD_ETHER)
> + skb->protocol = eth_type_trans(skb, skb->dev);
Do you need to check pskb_may_pull(skb, ETH_HLEN)?
> +
> + skb_reset_network_header(skb);
> ipgre_ecn_decapsulate(iph, skb);
> +
> netif_rx(skb);
> read_unlock(&ipgre_lock);
> return(0);
[-- Attachment #2: x --]
[-- Type: text/plain, Size: 1028 bytes --]
--- linux-2.6.x/net/bridge/br_netfilter.c 18 Jun 2006 23:30:55 -0000 1.1.1.25
+++ linux-2.6.x/net/bridge/br_netfilter.c 11 Aug 2006 04:10:04 -0000
@@ -765,14 +765,28 @@ out:
return NF_STOLEN;
}
+/*
+ * We've finished passing through netfilter, so we can remove the fake dst.
+ * This is required by some lower layers, eg ip_gre
+ */
+static int br_nf_dev_queue_xmit_finish(struct sk_buff *skb)
+{
+ if (skb->dst == (struct dst_entry *)&__fake_rtable) {
+ dst_release(skb->dst);
+ skb->dst = NULL;
+ }
+
+ return br_dev_queue_push_xmit(skb);
+}
+
static int br_nf_dev_queue_xmit(struct sk_buff *skb)
{
if (skb->protocol == htons(ETH_P_IP) &&
skb->len > skb->dev->mtu &&
!(skb_shinfo(skb)->ufo_size || skb_shinfo(skb)->tso_size))
- return ip_fragment(skb, br_dev_queue_push_xmit);
+ return ip_fragment(skb, br_nf_dev_queue_xmit_finish);
else
- return br_dev_queue_push_xmit(skb);
+ return br_nf_dev_queue_xmit_finish(skb);
}
/* PF_BRIDGE/POST_ROUTING ********************************************/
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 4/4] gre: Add Transparent Ethernet Bridging
2008-10-09 8:43 ` Philip Craig
@ 2008-10-09 10:39 ` Herbert Xu
2008-11-24 6:49 ` bridge: Fix update_pmtu crash with GRE Herbert Xu
0 siblings, 1 reply; 16+ messages in thread
From: Herbert Xu @ 2008-10-09 10:39 UTC (permalink / raw)
To: Philip Craig; +Cc: David S. Miller, netdev, Patrick McHardy
On Thu, Oct 09, 2008 at 06:43:21PM +1000, Philip Craig wrote:
>
> Herbert Xu wrote:
> > @@ -530,7 +572,13 @@ static int ipgre_rcv(struct sk_buff *skb)
> > dst_release(skb->dst);
> > skb->dst = NULL;
> > nf_reset(skb);
> > +
> > + if (tunnel->dev->type == ARPHRD_ETHER)
> > + skb->protocol = eth_type_trans(skb, skb->dev);
>
> Do you need to check pskb_may_pull(skb, ETH_HLEN)?
Good point, I'll fix this up.
> +/*
> + * We've finished passing through netfilter, so we can remove the fake dst.
> + * This is required by some lower layers, eg ip_gre
> + */
> +static int br_nf_dev_queue_xmit_finish(struct sk_buff *skb)
> +{
> + if (skb->dst == (struct dst_entry *)&__fake_rtable) {
> + dst_release(skb->dst);
> + skb->dst = NULL;
> + }
> +
> + return br_dev_queue_push_xmit(skb);
> +}
Alternatively we could give fake_rtable an ops structure.
Thanks,
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 4/4] gre: Add Transparent Ethernet Bridging
2008-10-09 7:05 ` [PATCH 4/4] gre: Add Transparent Ethernet Bridging Herbert Xu
2008-10-09 8:43 ` Philip Craig
@ 2008-10-09 11:21 ` Herbert Xu
1 sibling, 0 replies; 16+ messages in thread
From: Herbert Xu @ 2008-10-09 11:21 UTC (permalink / raw)
To: David S. Miller, netdev, Patrick McHardy
On Thu, Oct 09, 2008 at 03:05:28PM +0800, Herbert Xu wrote:
> gre: Add Transparent Ethernet Bridging
This is an update of 4/4 incorporating Philip's comment.
It also fixes up any complete checksums in the packet.
gre: Add Transparent Ethernet Bridging
This patch adds support for Ethernet over GRE encapsulation.
This is exposed to user-space with a new link type of "gretap"
instead of "gre". It will create an ARPHRD_ETHER device in
lieu of the usual ARPHRD_IPGRE.
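As a rough userspace illustration of what changes on the wire, the
sketch below fills in the 4-byte base GRE header (flags, then protocol,
both in network byte order). The function name gre_base_header and the
is_tap parameter are made up for this example and are not part of the
patch; only the 0x6558 value is taken from ETH_P_TEB.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define GRE_PROTO_TEB 0x6558 /* same value as ETH_P_TEB in the patch */

/*
 * Hypothetical helper: write the 4-byte base GRE header into out.
 * flags and inner_proto are given in host byte order here.
 * For a tap-style (Ethernet) device the protocol is always TEB;
 * otherwise the inner packet's protocol is carried through.
 */
void gre_base_header(uint8_t *out, uint16_t flags, int is_tap,
		     uint16_t inner_proto)
{
	uint16_t proto = is_tap ? GRE_PROTO_TEB : inner_proto;

	assert(out != NULL);

	/* Big-endian (network order) encoding of both 16-bit fields. */
	out[0] = (uint8_t)(flags >> 8);
	out[1] = (uint8_t)(flags & 0xff);
	out[2] = (uint8_t)(proto >> 8);
	out[3] = (uint8_t)(proto & 0xff);
}
```

A receiver can therefore tell an Ethernet-over-GRE packet from an
ordinary one by looking at bytes 2 and 3 of the GRE header alone.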
Note that to preserve backwards compatibility all Transparent
Ethernet Bridging packets are passed to an ARPHRD_IPGRE tunnel
if its key matches and there is no ARPHRD_ETHER device whose
key matches more closely.
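The fallback rule can be sketched in plain userspace C as follows.
This is a simplified model, not the kernel code: struct tunnel,
enum dev_type and pick_tunnel are hypothetical names, a single array
stands in for the four hash tables, and the final fallback to the
default tunnel device is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for ARPHRD_IPGRE and ARPHRD_ETHER. */
enum dev_type { DEV_IPGRE, DEV_ETHER };

#define PROTO_TEB 0x6558 /* same value as ETH_P_TEB in the patch */

/* Hypothetical, simplified tunnel record; not struct ip_tunnel. */
struct tunnel {
	uint32_t key;
	enum dev_type type;
	int up;
};

/*
 * Pick a tunnel for an incoming packet. A tunnel whose device type
 * matches the GRE protocol wins; failing that, the first matching
 * IPGRE tunnel is used, so that old setups keep receiving TEB packets.
 * gre_proto is in host byte order in this sketch.
 */
const struct tunnel *pick_tunnel(const struct tunnel *tunnels, size_t n,
				 uint32_t key, uint16_t gre_proto)
{
	enum dev_type want = (gre_proto == PROTO_TEB) ? DEV_ETHER : DEV_IPGRE;
	const struct tunnel *fallback = NULL;
	size_t i;

	assert(tunnels != NULL || n == 0);

	for (i = 0; i < n; i++) {
		const struct tunnel *t = &tunnels[i];

		if (t->key != key || !t->up)
			continue;
		if (t->type == want)
			return t;	/* exact device type match */
		if (t->type == DEV_IPGRE && !fallback)
			fallback = t;	/* remember first IPGRE candidate */
	}
	return fallback;
}
```

Note that the fallback only goes one way: a non-TEB packet is never
handed to an Ethernet device.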
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
---
include/linux/if_ether.h | 1
net/ipv4/ip_gre.c | 206 +++++++++++++++++++++++++++++++++++++++--------
2 files changed, 175 insertions(+), 32 deletions(-)
diff --git a/include/linux/if_ether.h b/include/linux/if_ether.h
index e157c13..56f9039 100644
--- a/include/linux/if_ether.h
+++ b/include/linux/if_ether.h
@@ -56,6 +56,7 @@
#define ETH_P_DIAG 0x6005 /* DEC Diagnostics */
#define ETH_P_CUST 0x6006 /* DEC Customer use */
#define ETH_P_SCA 0x6007 /* DEC Systems Comms Arch */
+#define ETH_P_TEB 0x6558 /* Trans Ether Bridging */
#define ETH_P_RARP 0x8035 /* Reverse Addr Res packet */
#define ETH_P_ATALK 0x809B /* Appletalk DDP */
#define ETH_P_AARP 0x80F3 /* Appletalk AARP */
diff --git a/net/ipv4/ip_gre.c b/net/ipv4/ip_gre.c
index 25d2c77..44ed948 100644
--- a/net/ipv4/ip_gre.c
+++ b/net/ipv4/ip_gre.c
@@ -27,6 +27,7 @@
#include <linux/inetdevice.h>
#include <linux/igmp.h>
#include <linux/netfilter_ipv4.h>
+#include <linux/etherdevice.h>
#include <linux/if_ether.h>
#include <net/sock.h>
@@ -166,38 +167,64 @@ static DEFINE_RWLOCK(ipgre_lock);
/* Given src, dst and key, find appropriate for input tunnel. */
static struct ip_tunnel * ipgre_tunnel_lookup(struct net *net,
- __be32 remote, __be32 local, __be32 key)
+ __be32 remote, __be32 local,
+ __be32 key, __be16 gre_proto)
{
unsigned h0 = HASH(remote);
unsigned h1 = HASH(key);
struct ip_tunnel *t;
+ struct ip_tunnel *t2 = NULL;
struct ipgre_net *ign = net_generic(net, ipgre_net_id);
+ int dev_type = (gre_proto == htons(ETH_P_TEB)) ?
+ ARPHRD_ETHER : ARPHRD_IPGRE;
for (t = ign->tunnels_r_l[h0^h1]; t; t = t->next) {
if (local == t->parms.iph.saddr && remote == t->parms.iph.daddr) {
- if (t->parms.i_key == key && (t->dev->flags&IFF_UP))
- return t;
+ if (t->parms.i_key == key && t->dev->flags & IFF_UP) {
+ if (t->dev->type == dev_type)
+ return t;
+ if (t->dev->type == ARPHRD_IPGRE && !t2)
+ t2 = t;
+ }
}
}
+
for (t = ign->tunnels_r[h0^h1]; t; t = t->next) {
if (remote == t->parms.iph.daddr) {
- if (t->parms.i_key == key && (t->dev->flags&IFF_UP))
- return t;
+ if (t->parms.i_key == key && t->dev->flags & IFF_UP) {
+ if (t->dev->type == dev_type)
+ return t;
+ if (t->dev->type == ARPHRD_IPGRE && !t2)
+ t2 = t;
+ }
}
}
+
for (t = ign->tunnels_l[h1]; t; t = t->next) {
if (local == t->parms.iph.saddr ||
(local == t->parms.iph.daddr &&
ipv4_is_multicast(local))) {
- if (t->parms.i_key == key && (t->dev->flags&IFF_UP))
- return t;
+ if (t->parms.i_key == key && t->dev->flags & IFF_UP) {
+ if (t->dev->type == dev_type)
+ return t;
+ if (t->dev->type == ARPHRD_IPGRE && !t2)
+ t2 = t;
+ }
}
}
+
for (t = ign->tunnels_wc[h1]; t; t = t->next) {
- if (t->parms.i_key == key && (t->dev->flags&IFF_UP))
- return t;
+ if (t->parms.i_key == key && t->dev->flags & IFF_UP) {
+ if (t->dev->type == dev_type)
+ return t;
+ if (t->dev->type == ARPHRD_IPGRE && !t2)
+ t2 = t;
+ }
}
+ if (t2)
+ return t2;
+
if (ign->fb_tunnel_dev->flags&IFF_UP)
return netdev_priv(ign->fb_tunnel_dev);
return NULL;
@@ -252,25 +279,37 @@ static void ipgre_tunnel_unlink(struct ipgre_net *ign, struct ip_tunnel *t)
}
}
-static struct ip_tunnel * ipgre_tunnel_locate(struct net *net,
- struct ip_tunnel_parm *parms, int create)
+static struct ip_tunnel *ipgre_tunnel_find(struct net *net,
+ struct ip_tunnel_parm *parms,
+ int type)
{
__be32 remote = parms->iph.daddr;
__be32 local = parms->iph.saddr;
__be32 key = parms->i_key;
- struct ip_tunnel *t, **tp, *nt;
+ struct ip_tunnel *t, **tp;
+ struct ipgre_net *ign = net_generic(net, ipgre_net_id);
+
+ for (tp = __ipgre_bucket(ign, parms); (t = *tp) != NULL; tp = &t->next)
+ if (local == t->parms.iph.saddr &&
+ remote == t->parms.iph.daddr &&
+ key == t->parms.i_key &&
+ type == t->dev->type)
+ break;
+
+ return t;
+}
+
+static struct ip_tunnel * ipgre_tunnel_locate(struct net *net,
+ struct ip_tunnel_parm *parms, int create)
+{
+ struct ip_tunnel *t, *nt;
struct net_device *dev;
char name[IFNAMSIZ];
struct ipgre_net *ign = net_generic(net, ipgre_net_id);
- for (tp = __ipgre_bucket(ign, parms); (t = *tp) != NULL; tp = &t->next) {
- if (local == t->parms.iph.saddr && remote == t->parms.iph.daddr) {
- if (key == t->parms.i_key)
- return t;
- }
- }
- if (!create)
- return NULL;
+ t = ipgre_tunnel_find(net, parms, ARPHRD_IPGRE);
+ if (t || !create)
+ return t;
if (parms->name[0])
strlcpy(name, parms->name, IFNAMSIZ);
@@ -385,8 +424,9 @@ static void ipgre_err(struct sk_buff *skb, u32 info)
read_lock(&ipgre_lock);
t = ipgre_tunnel_lookup(dev_net(skb->dev), iph->daddr, iph->saddr,
- (flags&GRE_KEY) ?
- *(((__be32*)p) + (grehlen>>2) - 1) : 0);
+ flags & GRE_KEY ?
+ *(((__be32 *)p) + (grehlen / 4) - 1) : 0,
+ p[1]);
if (t == NULL || t->parms.iph.daddr == 0 ||
ipv4_is_multicast(t->parms.iph.daddr))
goto out;
@@ -436,6 +476,7 @@ static int ipgre_rcv(struct sk_buff *skb)
u32 seqno = 0;
struct ip_tunnel *tunnel;
int offset = 4;
+ __be16 gre_proto;
if (!pskb_may_pull(skb, 16))
goto drop_nolock;
@@ -475,20 +516,22 @@ static int ipgre_rcv(struct sk_buff *skb)
}
}
+ gre_proto = *(__be16 *)(h + 2);
+
read_lock(&ipgre_lock);
if ((tunnel = ipgre_tunnel_lookup(dev_net(skb->dev),
- iph->saddr, iph->daddr, key)) != NULL) {
+ iph->saddr, iph->daddr, key,
+ gre_proto))) {
struct net_device_stats *stats = &tunnel->dev->stats;
secpath_reset(skb);
- skb->protocol = *(__be16*)(h + 2);
+ skb->protocol = gre_proto;
/* WCCP version 1 and 2 protocol decoding.
* - Change protocol to IP
* - When dealing with WCCPv2, Skip extra 4 bytes in GRE header
*/
- if (flags == 0 &&
- skb->protocol == htons(ETH_P_WCCP)) {
+ if (flags == 0 && gre_proto == htons(ETH_P_WCCP)) {
skb->protocol = htons(ETH_P_IP);
if ((*(h + offset) & 0xF0) != 0x40)
offset += 4;
@@ -496,7 +539,6 @@ static int ipgre_rcv(struct sk_buff *skb)
skb->mac_header = skb->network_header;
__pskb_pull(skb, offset);
- skb_reset_network_header(skb);
skb_postpull_rcsum(skb, skb_transport_header(skb), offset);
skb->pkt_type = PACKET_HOST;
#ifdef CONFIG_NET_IPGRE_BROADCAST
@@ -524,13 +566,30 @@ static int ipgre_rcv(struct sk_buff *skb)
}
tunnel->i_seqno = seqno + 1;
}
+
+ /* Warning: All skb pointers will be invalidated! */
+ if (tunnel->dev->type == ARPHRD_ETHER) {
+ if (!pskb_may_pull(skb, ETH_HLEN)) {
+ stats->rx_length_errors++;
+ stats->rx_errors++;
+ goto drop;
+ }
+
+ iph = ip_hdr(skb);
+ skb->protocol = eth_type_trans(skb, tunnel->dev);
+ skb_postpull_rcsum(skb, eth_hdr(skb), ETH_HLEN);
+ }
+
stats->rx_packets++;
stats->rx_bytes += skb->len;
skb->dev = tunnel->dev;
dst_release(skb->dst);
skb->dst = NULL;
nf_reset(skb);
+
+ skb_reset_network_header(skb);
ipgre_ecn_decapsulate(iph, skb);
+
netif_rx(skb);
read_unlock(&ipgre_lock);
return(0);
@@ -565,7 +624,10 @@ static int ipgre_tunnel_xmit(struct sk_buff *skb, struct net_device *dev)
goto tx_error;
}
- if (dev->header_ops) {
+ if (dev->type == ARPHRD_ETHER)
+ IPCB(skb)->flags = 0;
+
+ if (dev->header_ops && dev->type == ARPHRD_IPGRE) {
gre_hlen = 0;
tiph = (struct iphdr*)skb->data;
} else {
@@ -741,8 +803,9 @@ static int ipgre_tunnel_xmit(struct sk_buff *skb, struct net_device *dev)
iph->ttl = dst_metric(&rt->u.dst, RTAX_HOPLIMIT);
}
- ((__be16*)(iph+1))[0] = tunnel->parms.o_flags;
- ((__be16*)(iph+1))[1] = skb->protocol;
+ ((__be16 *)(iph + 1))[0] = tunnel->parms.o_flags;
+ ((__be16 *)(iph + 1))[1] = (dev->type == ARPHRD_ETHER) ?
+ htons(ETH_P_TEB) : skb->protocol;
if (tunnel->parms.o_flags&(GRE_KEY|GRE_CSUM|GRE_SEQ)) {
__be32 *ptr = (__be32*)(((u8*)iph) + tunnel->hlen - 4);
@@ -804,7 +867,9 @@ static int ipgre_tunnel_bind_dev(struct net_device *dev)
tdev = rt->u.dst.dev;
ip_rt_put(rt);
}
- dev->flags |= IFF_POINTOPOINT;
+
+ if (dev->type != ARPHRD_ETHER)
+ dev->flags |= IFF_POINTOPOINT;
}
if (!tdev && tunnel->parms.link)
@@ -1250,6 +1315,30 @@ static int ipgre_tunnel_validate(struct nlattr *tb[], struct nlattr *data[])
return 0;
}
+static int ipgre_tap_validate(struct nlattr *tb[], struct nlattr *data[])
+{
+ __be32 daddr;
+
+ if (tb[IFLA_ADDRESS]) {
+ if (nla_len(tb[IFLA_ADDRESS]) != ETH_ALEN)
+ return -EINVAL;
+ if (!is_valid_ether_addr(nla_data(tb[IFLA_ADDRESS])))
+ return -EADDRNOTAVAIL;
+ }
+
+ if (!data)
+ goto out;
+
+ if (data[IFLA_GRE_REMOTE]) {
+ memcpy(&daddr, nla_data(data[IFLA_GRE_REMOTE]), 4);
+ if (!daddr)
+ return -EINVAL;
+ }
+
+out:
+ return ipgre_tunnel_validate(tb, data);
+}
+
static void ipgre_netlink_parms(struct nlattr *data[],
struct ip_tunnel_parm *parms)
{
@@ -1291,6 +1380,35 @@ static void ipgre_netlink_parms(struct nlattr *data[],
parms->iph.frag_off = htons(IP_DF);
}
+static int ipgre_tap_init(struct net_device *dev)
+{
+ struct ip_tunnel *tunnel;
+
+ tunnel = netdev_priv(dev);
+
+ tunnel->dev = dev;
+ strcpy(tunnel->parms.name, dev->name);
+
+ ipgre_tunnel_bind_dev(dev);
+
+ return 0;
+}
+
+static void ipgre_tap_setup(struct net_device *dev)
+{
+
+ ether_setup(dev);
+
+ dev->init = ipgre_tap_init;
+ dev->uninit = ipgre_tunnel_uninit;
+ dev->destructor = free_netdev;
+ dev->hard_start_xmit = ipgre_tunnel_xmit;
+ dev->change_mtu = ipgre_tunnel_change_mtu;
+
+ dev->iflink = 0;
+ dev->features |= NETIF_F_NETNS_LOCAL;
+}
+
static int ipgre_newlink(struct net_device *dev, struct nlattr *tb[],
struct nlattr *data[])
{
@@ -1303,9 +1421,12 @@ static int ipgre_newlink(struct net_device *dev, struct nlattr *tb[],
nt = netdev_priv(dev);
ipgre_netlink_parms(data, &nt->parms);
- if (ipgre_tunnel_locate(net, &nt->parms, 0))
+ if (ipgre_tunnel_find(net, &nt->parms, dev->type))
return -EEXIST;
+ if (dev->type == ARPHRD_ETHER && !tb[IFLA_ADDRESS])
+ random_ether_addr(dev->dev_addr);
+
mtu = ipgre_tunnel_bind_dev(dev);
if (!tb[IFLA_MTU])
dev->mtu = mtu;
@@ -1455,6 +1576,19 @@ static struct rtnl_link_ops ipgre_link_ops __read_mostly = {
.fill_info = ipgre_fill_info,
};
+static struct rtnl_link_ops ipgre_tap_ops __read_mostly = {
+ .kind = "gretap",
+ .maxtype = IFLA_GRE_MAX,
+ .policy = ipgre_policy,
+ .priv_size = sizeof(struct ip_tunnel),
+ .setup = ipgre_tap_setup,
+ .validate = ipgre_tap_validate,
+ .newlink = ipgre_newlink,
+ .changelink = ipgre_changelink,
+ .get_size = ipgre_get_size,
+ .fill_info = ipgre_fill_info,
+};
+
/*
* And now the modules code and kernel interface.
*/
@@ -1478,9 +1612,15 @@ static int __init ipgre_init(void)
if (err < 0)
goto rtnl_link_failed;
+ err = rtnl_link_register(&ipgre_tap_ops);
+ if (err < 0)
+ goto tap_ops_failed;
+
out:
return err;
+tap_ops_failed:
+ rtnl_link_unregister(&ipgre_link_ops);
rtnl_link_failed:
unregister_pernet_gen_device(ipgre_net_id, &ipgre_net_ops);
gen_device_failed:
@@ -1490,6 +1630,7 @@ gen_device_failed:
static void __exit ipgre_fini(void)
{
+ rtnl_link_unregister(&ipgre_tap_ops);
rtnl_link_unregister(&ipgre_link_ops);
unregister_pernet_gen_device(ipgre_net_id, &ipgre_net_ops);
if (inet_del_protocol(&ipgre_protocol, IPPROTO_GRE) < 0)
@@ -1500,3 +1641,4 @@ module_init(ipgre_init);
module_exit(ipgre_fini);
MODULE_LICENSE("GPL");
MODULE_ALIAS("rtnl-link-gre");
+MODULE_ALIAS("rtnl-link-gretap");
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [PATCH 5/4] inet: Make tunnel RX/TX byte counters more consistent
2008-10-09 7:04 [0/4] gre: Ethernet over GRE Herbert Xu
` (4 preceding siblings ...)
2008-10-09 7:08 ` ip: gre: Add GRE configuration support through rtnl_link Herbert Xu
@ 2008-10-09 11:23 ` Herbert Xu
2008-10-09 19:03 ` David Miller
2008-10-09 12:12 ` [0/4] gre: Ethernet over GRE Patrick McHardy
2008-10-09 19:01 ` David Miller
7 siblings, 1 reply; 16+ messages in thread
From: Herbert Xu @ 2008-10-09 11:23 UTC (permalink / raw)
To: David S. Miller, netdev; +Cc: Patrick McHardy, Philip Craig
inet: Make tunnel RX/TX byte counters more consistent
This patch makes the RX/TX byte counters for IPIP, GRE and SIT more
consistent. Previously we included the external IP headers on the
way out but not when the packet is inbound.
The new scheme is to count payload only in both directions. For
IPIP and SIT this simply means the exclusion of the external IP
header. For GRE this means that we exclude the GRE header as
well.
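To make the arithmetic concrete, here is a small userspace sketch of
the TX side. The struct and function names are invented for this
example; only the subtraction mirrors the IPTUNNEL_XMIT change below.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of an encapsulated packet on the TX path. */
struct tx_packet {
	size_t len;		 /* total length after encapsulation */
	size_t transport_offset; /* outer header bytes before the payload */
};

/* Old scheme: outer headers are included in the TX byte count. */
size_t tx_bytes_old(const struct tx_packet *p)
{
	assert(p != NULL);
	return p->len;
}

/* New scheme: only the payload is counted. */
size_t tx_bytes_new(const struct tx_packet *p)
{
	assert(p != NULL);
	assert(p->transport_offset <= p->len);
	return p->len - p->transport_offset;
}
```

For example, a 100-byte payload behind a 20-byte outer IP header and a
4-byte GRE header has len 124 and transport_offset 24, so the old
count is 124 bytes and the new count is 100.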
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
---
include/net/ipip.h | 2 +-
net/ipv4/ip_gre.c | 7 +++++--
2 files changed, 6 insertions(+), 3 deletions(-)
diff --git a/include/net/ipip.h b/include/net/ipip.h
index a85bda6..fdf9bd7 100644
--- a/include/net/ipip.h
+++ b/include/net/ipip.h
@@ -37,7 +37,7 @@ struct ip_tunnel_prl_entry
#define IPTUNNEL_XMIT() do { \
int err; \
- int pkt_len = skb->len; \
+ int pkt_len = skb->len - skb_transport_offset(skb); \
\
skb->ip_summed = CHECKSUM_NONE; \
ip_select_ident(iph, &rt->u.dst, NULL); \
diff --git a/net/ipv4/ip_gre.c b/net/ipv4/ip_gre.c
index 44ed948..0d5e35b 100644
--- a/net/ipv4/ip_gre.c
+++ b/net/ipv4/ip_gre.c
@@ -477,6 +477,7 @@ static int ipgre_rcv(struct sk_buff *skb)
struct ip_tunnel *tunnel;
int offset = 4;
__be16 gre_proto;
+ unsigned int len;
if (!pskb_may_pull(skb, 16))
goto drop_nolock;
@@ -567,6 +568,8 @@ static int ipgre_rcv(struct sk_buff *skb)
tunnel->i_seqno = seqno + 1;
}
+ len = skb->len;
+
/* Warning: All skb pointers will be invalidated! */
if (tunnel->dev->type == ARPHRD_ETHER) {
if (!pskb_may_pull(skb, ETH_HLEN)) {
@@ -581,7 +584,7 @@ static int ipgre_rcv(struct sk_buff *skb)
}
stats->rx_packets++;
- stats->rx_bytes += skb->len;
+ stats->rx_bytes += len;
skb->dev = tunnel->dev;
dst_release(skb->dst);
skb->dst = NULL;
@@ -770,7 +773,7 @@ static int ipgre_tunnel_xmit(struct sk_buff *skb, struct net_device *dev)
old_iph = ip_hdr(skb);
}
- skb->transport_header = skb->network_header;
+ skb_reset_transport_header(skb);
skb_push(skb, gre_hlen);
skb_reset_network_header(skb);
memset(&(IPCB(skb)->opt), 0, sizeof(IPCB(skb)->opt));
^ permalink raw reply related [flat|nested] 16+ messages in thread
* Re: [0/4] gre: Ethernet over GRE
2008-10-09 7:04 [0/4] gre: Ethernet over GRE Herbert Xu
` (5 preceding siblings ...)
2008-10-09 11:23 ` [PATCH 5/4] inet: Make tunnel RX/TX byte counters more consistent Herbert Xu
@ 2008-10-09 12:12 ` Patrick McHardy
2008-10-09 19:01 ` David Miller
7 siblings, 0 replies; 16+ messages in thread
From: Patrick McHardy @ 2008-10-09 12:12 UTC (permalink / raw)
To: Herbert Xu; +Cc: David S. Miller, netdev
Herbert Xu wrote:
> This series of patches add Ethernet over GRE support. The user
> space interface is done with the rtnl_link mechanism. I'll post
> the patches for iproute too.
>
> There has been a small demand of such a tunneling mechanism that
> allows direct Ethernet connections over IP. In the past many
> efforts have been made in this direction. However, inadequacies
> with our user interface have foiled many of them.
>
> Recently Patrick McHardy created the rtnl_link mechanism which
> finally allows tunnel configuration to be brought into the 21st
> century. Having tried it I must say that it has been an absolute
> pleasure to use :)
Glad you like it :)
> This should be completely backwards compatible in that if you
> don't create Ethernet over GRE tunnels then your GRE experience
> should not differ one single bit from before. However, you can
> manage your existing GRE interfaces using the new interface should
> you choose to do so.
These patches look good to me, thanks for doing this work.
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [0/4] gre: Ethernet over GRE
2008-10-09 7:04 [0/4] gre: Ethernet over GRE Herbert Xu
` (6 preceding siblings ...)
2008-10-09 12:12 ` [0/4] gre: Ethernet over GRE Patrick McHardy
@ 2008-10-09 19:01 ` David Miller
7 siblings, 0 replies; 16+ messages in thread
From: David Miller @ 2008-10-09 19:01 UTC (permalink / raw)
To: herbert; +Cc: netdev, kaber
From: Herbert Xu <herbert@gondor.apana.org.au>
Date: Thu, 9 Oct 2008 15:04:24 +0800
> This series of patches add Ethernet over GRE support. The user
> space interface is done with the rtnl_link mechanism. I'll post
> the patches for iproute too.
Nice.
All applied, and I made sure to use the updated version of
patch 4.
Thanks!
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 5/4] inet: Make tunnel RX/TX byte counters more consistent
2008-10-09 11:23 ` [PATCH 5/4] inet: Make tunnel RX/TX byte counters more consistent Herbert Xu
@ 2008-10-09 19:03 ` David Miller
0 siblings, 0 replies; 16+ messages in thread
From: David Miller @ 2008-10-09 19:03 UTC (permalink / raw)
To: herbert; +Cc: netdev, kaber, philipc
From: Herbert Xu <herbert@gondor.apana.org.au>
Date: Thu, 9 Oct 2008 19:23:08 +0800
> inet: Make tunnel RX/TX byte counters more consistent
>
> This patch makes the RX/TX byte counters for IPIP, GRE and SIT more
> consistent. Previously we included the external IP headers on the
> way out but not when the packet is inbound.
>
> The new scheme is to count payload only in both directions. For
> IPIP and SIT this simply means the exclusion of the external IP
> header. For GRE this means that we exclude the GRE header as
> well.
>
> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Applied.
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: ip: gre: Add GRE configuration support through rtnl_link
2008-10-09 7:08 ` ip: gre: Add GRE configuration support through rtnl_link Herbert Xu
@ 2008-10-11 13:03 ` Herbert Xu
0 siblings, 0 replies; 16+ messages in thread
From: Herbert Xu @ 2008-10-11 13:03 UTC (permalink / raw)
To: David S. Miller, netdev, Stephen Hemminger; +Cc: Patrick McHardy
On Thu, Oct 09, 2008 at 03:08:24PM +0800, Herbert Xu wrote:
> ip: gre: Add GRE configuration support through rtnl_link
Here's an updated version that preserves the existing ip tunnel
change semantics so that you can do piecemeal modifications.
ip: gre: Add GRE configuration support through rtnl_link
This patch adds support for configuring GRE tunnels using the
new rtnl_link interface. This only works on kernels that have
the new GRE configuration interface.
This is accessed through the "ip link" command. The previous
tunnel configuration interface "ip tunnel" remains as it is
and should be retained for compatibility with old kernels.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Thanks,
--
diff --git a/include/linux/if_tunnel.h b/include/linux/if_tunnel.h
index d4efe40..b9b8760 100644
--- a/include/linux/if_tunnel.h
+++ b/include/linux/if_tunnel.h
@@ -2,6 +2,7 @@
#define _IF_TUNNEL_H_
#include <linux/types.h>
+#include <linux/ip.h>
#define SIOCGETTUNNEL (SIOCDEVPRIVATE + 0)
#define SIOCADDTUNNEL (SIOCDEVPRIVATE + 1)
@@ -47,4 +48,26 @@ struct ip_tunnel_prl {
/* PRL flags */
#define PRL_DEFAULT 0x0001
+enum
+{
+ IFLA_GRE_UNSPEC,
+ IFLA_GRE_LINK,
+ IFLA_GRE_IFLAGS,
+ IFLA_GRE_OFLAGS,
+ IFLA_GRE_IKEY,
+ IFLA_GRE_OKEY,
+ IFLA_GRE_LOCAL,
+ IFLA_GRE_REMOTE,
+ IFLA_GRE_TTL,
+ IFLA_GRE_TOS,
+ IFLA_GRE_PMTUDISC,
+ IFLA_GRE_TYPE,
+ __IFLA_GRE_MAX,
+};
+
+#define IFLA_GRE_MAX (__IFLA_GRE_MAX - 1)
+
+#define GRE_TYPE_TUN 0x00000000
+#define GRE_TYPE_TAP 0x00000001
+
#endif /* _IF_TUNNEL_H_ */
diff --git a/ip/Makefile b/ip/Makefile
index 73978ff..98ba876 100644
--- a/ip/Makefile
+++ b/ip/Makefile
@@ -2,7 +2,7 @@ IPOBJ=ip.o ipaddress.o ipaddrlabel.o iproute.o iprule.o \
rtm_map.o iptunnel.o ip6tunnel.o tunnel.o ipneigh.o ipntable.o iplink.o \
ipmaddr.o ipmonitor.o ipmroute.o ipprefix.o \
ipxfrm.o xfrm_state.o xfrm_policy.o xfrm_monitor.o \
- iplink_vlan.o link_veth.o
+ iplink_vlan.o link_veth.o link_gre.o
RTMONOBJ=rtmon.o
diff --git a/ip/iplink.c b/ip/iplink.c
index f4cbeb3..a4d1187 100644
--- a/ip/iplink.c
+++ b/ip/iplink.c
@@ -310,32 +310,6 @@ static int iplink_modify(int cmd, unsigned int flags, int argc, char **argv)
argv += ret;
ll_init_map(&rth);
- if (type) {
- struct rtattr *linkinfo = NLMSG_TAIL(&req.n);
- addattr_l(&req.n, sizeof(req), IFLA_LINKINFO, NULL, 0);
- addattr_l(&req.n, sizeof(req), IFLA_INFO_KIND, type,
- strlen(type));
-
- lu = get_link_kind(type);
- if (lu && argc) {
- struct rtattr * data = NLMSG_TAIL(&req.n);
- addattr_l(&req.n, sizeof(req), IFLA_INFO_DATA, NULL, 0);
-
- if (lu->parse_opt &&
- lu->parse_opt(lu, argc, argv, &req.n))
- return -1;
-
- data->rta_len = (void *)NLMSG_TAIL(&req.n) - (void *)data;
- } else if (argc) {
- if (matches(*argv, "help") == 0)
- usage();
- fprintf(stderr, "Garbage instead of arguments \"%s ...\". "
- "Try \"ip link help\".\n", *argv);
- return -1;
- }
- linkinfo->rta_len = (void *)NLMSG_TAIL(&req.n) - (void *)linkinfo;
- }
-
if (!(flags & NLM_F_CREATE)) {
if (!dev) {
fprintf(stderr, "Not enough information: \"dev\" "
@@ -366,6 +340,33 @@ static int iplink_modify(int cmd, unsigned int flags, int argc, char **argv)
}
}
+ if (type) {
+ struct rtattr *linkinfo = NLMSG_TAIL(&req.n);
+
+ addattr_l(&req.n, sizeof(req), IFLA_LINKINFO, NULL, 0);
+ addattr_l(&req.n, sizeof(req), IFLA_INFO_KIND, type,
+ strlen(type));
+
+ lu = get_link_kind(type);
+ if (lu && argc) {
+ struct rtattr * data = NLMSG_TAIL(&req.n);
+ addattr_l(&req.n, sizeof(req), IFLA_INFO_DATA, NULL, 0);
+
+ if (lu->parse_opt &&
+ lu->parse_opt(lu, argc, argv, &req.n))
+ return -1;
+
+ data->rta_len = (void *)NLMSG_TAIL(&req.n) - (void *)data;
+ } else if (argc) {
+ if (matches(*argv, "help") == 0)
+ usage();
+ fprintf(stderr, "Garbage instead of arguments \"%s ...\". "
+ "Try \"ip link help\".\n", *argv);
+ return -1;
+ }
+ linkinfo->rta_len = (void *)NLMSG_TAIL(&req.n) - (void *)linkinfo;
+ }
+
if (name) {
len = strlen(name) + 1;
if (len == 1)
diff --git a/ip/link_gre.c b/ip/link_gre.c
new file mode 100644
index 0000000..9109312
--- /dev/null
+++ b/ip/link_gre.c
@@ -0,0 +1,367 @@
+/*
+ * link_gre.c gre driver module
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ *
+ * Authors: Herbert Xu <herbert@gondor.apana.org.au>
+ *
+ */
+
+#include <string.h>
+#include <net/if.h>
+#include <linux/if_tunnel.h>
+#include <sys/types.h>
+#include <sys/socket.h>
+#include <arpa/inet.h>
+
+#include "rt_names.h"
+#include "utils.h"
+#include "ip_common.h"
+#include "tunnel.h"
+
+static void usage(void) __attribute__((noreturn));
+static void usage(void)
+{
+ fprintf(stderr, "Usage: ip link { add | set | change | replace | del } NAME\n");
+ fprintf(stderr, " type { gre | gretap } [ remote ADDR ] [ local ADDR ]\n");
+ fprintf(stderr, " [ [i|o]seq ] [ [i|o]key KEY ] [ [i|o]csum ]\n");
+ fprintf(stderr, " [ ttl TTL ] [ tos TOS ] [ [no]pmtudisc ] [ dev PHYS_DEV ]\n");
+ fprintf(stderr, "\n");
+ fprintf(stderr, "Where: NAME := STRING\n");
+ fprintf(stderr, " ADDR := { IP_ADDRESS | any }\n");
+ fprintf(stderr, " TOS := { NUMBER | inherit }\n");
+ fprintf(stderr, " TTL := { 1..255 | inherit }\n");
+ fprintf(stderr, " KEY := { DOTTED_QUAD | NUMBER }\n");
+ exit(-1);
+}
+
+static int gre_parse_opt(struct link_util *lu, int argc, char **argv,
+ struct nlmsghdr *n)
+{
+ struct {
+ struct nlmsghdr n;
+ struct ifinfomsg i;
+ char buf[1024];
+ } req;
+ struct ifinfomsg *ifi = (struct ifinfomsg *)(n + 1);
+ struct rtattr *tb[IFLA_MAX + 1];
+ struct rtattr *linkinfo[IFLA_INFO_MAX+1];
+ struct rtattr *greinfo[IFLA_GRE_MAX + 1];
+ __u16 iflags = 0;
+ __u16 oflags = 0;
+ unsigned ikey = 0;
+ unsigned okey = 0;
+ unsigned saddr = 0;
+ unsigned daddr = 0;
+ unsigned link = 0;
+ __u8 pmtudisc = 1;
+ __u8 ttl = 0;
+ __u8 tos = 0;
+ int len;
+
+ if (!(n->nlmsg_flags & NLM_F_CREATE)) {
+ memset(&req, 0, sizeof(req));
+
+ req.n.nlmsg_len = NLMSG_LENGTH(sizeof(*ifi));
+ req.n.nlmsg_flags = NLM_F_REQUEST;
+ req.n.nlmsg_type = RTM_GETLINK;
+ req.i.ifi_family = preferred_family;
+ req.i.ifi_index = ifi->ifi_index;
+
+ if (rtnl_talk(&rth, &req.n, 0, 0, &req.n, NULL, NULL) < 0) {
+get_failed:
+ fprintf(stderr,
+ "Failed to get existing tunnel info.\n");
+ return -1;
+ }
+
+ len = req.n.nlmsg_len;
+ len -= NLMSG_LENGTH(sizeof(*ifi));
+ if (len < 0)
+ goto get_failed;
+
+ parse_rtattr(tb, IFLA_MAX, IFLA_RTA(&req.i), len);
+
+ if (!tb[IFLA_LINKINFO])
+ goto get_failed;
+
+ parse_rtattr_nested(linkinfo, IFLA_INFO_MAX, tb[IFLA_LINKINFO]);
+
+ if (!linkinfo[IFLA_INFO_DATA])
+ goto get_failed;
+
+ parse_rtattr_nested(greinfo, IFLA_GRE_MAX,
+ linkinfo[IFLA_INFO_DATA]);
+
+ if (greinfo[IFLA_GRE_IKEY])
+ ikey = *(__u32 *)RTA_DATA(greinfo[IFLA_GRE_IKEY]);
+
+ if (greinfo[IFLA_GRE_OKEY])
+ okey = *(__u32 *)RTA_DATA(greinfo[IFLA_GRE_OKEY]);
+
+ if (greinfo[IFLA_GRE_IFLAGS])
+ iflags = *(__u16 *)RTA_DATA(greinfo[IFLA_GRE_IFLAGS]);
+
+ if (greinfo[IFLA_GRE_OFLAGS])
+ oflags = *(__u16 *)RTA_DATA(greinfo[IFLA_GRE_OFLAGS]);
+
+ if (greinfo[IFLA_GRE_LOCAL])
+ saddr = *(__u32 *)RTA_DATA(greinfo[IFLA_GRE_LOCAL]);
+
+ if (greinfo[IFLA_GRE_REMOTE])
+ daddr = *(__u32 *)RTA_DATA(greinfo[IFLA_GRE_REMOTE]);
+
+ if (greinfo[IFLA_GRE_PMTUDISC])
+ pmtudisc = *(__u8 *)RTA_DATA(
+ greinfo[IFLA_GRE_PMTUDISC]);
+
+ if (greinfo[IFLA_GRE_TTL])
+ ttl = *(__u8 *)RTA_DATA(greinfo[IFLA_GRE_TTL]);
+
+ if (greinfo[IFLA_GRE_TOS])
+ tos = *(__u8 *)RTA_DATA(greinfo[IFLA_GRE_TOS]);
+
+ if (greinfo[IFLA_GRE_LINK])
+ link = *(__u32 *)RTA_DATA(greinfo[IFLA_GRE_LINK]);
+ }
+
+ while (argc > 0) {
+ if (!matches(*argv, "key")) {
+ unsigned uval;
+
+ NEXT_ARG();
+ iflags |= GRE_KEY;
+ oflags |= GRE_KEY;
+ if (strchr(*argv, '.'))
+ uval = get_addr32(*argv);
+ else {
+ if (get_unsigned(&uval, *argv, 0) < 0) {
+ fprintf(stderr,
+ "Invalid value for \"key\"\n");
+ exit(-1);
+ }
+ uval = htonl(uval);
+ }
+
+ ikey = okey = uval;
+ } else if (!matches(*argv, "ikey")) {
+ unsigned uval;
+
+ NEXT_ARG();
+ iflags |= GRE_KEY;
+ if (strchr(*argv, '.'))
+ uval = get_addr32(*argv);
+ else {
+ if (get_unsigned(&uval, *argv, 0)<0) {
+ fprintf(stderr, "invalid value of \"ikey\"\n");
+ exit(-1);
+ }
+ uval = htonl(uval);
+ }
+ ikey = uval;
+ } else if (!matches(*argv, "okey")) {
+ unsigned uval;
+
+ NEXT_ARG();
+ oflags |= GRE_KEY;
+ if (strchr(*argv, '.'))
+ uval = get_addr32(*argv);
+ else {
+ if (get_unsigned(&uval, *argv, 0)<0) {
+ fprintf(stderr, "invalid value of \"okey\"\n");
+ exit(-1);
+ }
+ uval = htonl(uval);
+ }
+ okey = uval;
+ } else if (!matches(*argv, "seq")) {
+ iflags |= GRE_SEQ;
+ oflags |= GRE_SEQ;
+ } else if (!matches(*argv, "iseq")) {
+ iflags |= GRE_SEQ;
+ } else if (!matches(*argv, "oseq")) {
+ oflags |= GRE_SEQ;
+ } else if (!matches(*argv, "csum")) {
+ iflags |= GRE_CSUM;
+ oflags |= GRE_CSUM;
+ } else if (!matches(*argv, "icsum")) {
+ iflags |= GRE_CSUM;
+ } else if (!matches(*argv, "ocsum")) {
+ oflags |= GRE_CSUM;
+ } else if (!matches(*argv, "nopmtudisc")) {
+ pmtudisc = 0;
+ } else if (!matches(*argv, "pmtudisc")) {
+ pmtudisc = 1;
+ } else if (!matches(*argv, "remote")) {
+ NEXT_ARG();
+ if (strcmp(*argv, "any"))
+ daddr = get_addr32(*argv);
+ } else if (!matches(*argv, "local")) {
+ NEXT_ARG();
+ if (strcmp(*argv, "any"))
+ saddr = get_addr32(*argv);
+ } else if (!matches(*argv, "dev")) {
+ NEXT_ARG();
+ link = tnl_ioctl_get_ifindex(*argv);
+ if (link == 0)
+ exit(-1);
+ } else if (!matches(*argv, "ttl") ||
+ !matches(*argv, "hoplimit")) {
+ unsigned uval;
+
+ NEXT_ARG();
+ if (strcmp(*argv, "inherit") != 0) {
+ if (get_unsigned(&uval, *argv, 0))
+ invarg("invalid TTL\n", *argv);
+ if (uval > 255)
+ invarg("TTL must be <= 255\n", *argv);
+ ttl = uval;
+ }
+ } else if (!matches(*argv, "tos") ||
+ !matches(*argv, "tclass") ||
+ !matches(*argv, "dsfield")) {
+ __u32 uval;
+
+ NEXT_ARG();
+ if (strcmp(*argv, "inherit") != 0) {
+ if (rtnl_dsfield_a2n(&uval, *argv))
+ invarg("bad TOS value", *argv);
+ tos = uval;
+ } else
+ tos = 1;
+ } else
+ usage();
+ argc--; argv++;
+ }
+
+ if (!ikey && IN_MULTICAST(ntohl(daddr))) {
+ ikey = daddr;
+ iflags |= GRE_KEY;
+ }
+ if (!okey && IN_MULTICAST(ntohl(daddr))) {
+ okey = daddr;
+ oflags |= GRE_KEY;
+ }
+ if (IN_MULTICAST(ntohl(daddr)) && !saddr) {
+ fprintf(stderr, "Broadcast tunnel requires a source address.\n");
+ return -1;
+ }
+
+ addattr32(n, 1024, IFLA_GRE_IKEY, ikey);
+ addattr32(n, 1024, IFLA_GRE_OKEY, okey);
+ addattr_l(n, 1024, IFLA_GRE_IFLAGS, &iflags, 2);
+ addattr_l(n, 1024, IFLA_GRE_OFLAGS, &oflags, 2);
+ addattr_l(n, 1024, IFLA_GRE_LOCAL, &saddr, 4);
+ addattr_l(n, 1024, IFLA_GRE_REMOTE, &daddr, 4);
+ addattr_l(n, 1024, IFLA_GRE_PMTUDISC, &pmtudisc, 1);
+ if (link)
+ addattr32(n, 1024, IFLA_GRE_LINK, link);
+ addattr_l(n, 1024, IFLA_GRE_TTL, &ttl, 1);
+ addattr_l(n, 1024, IFLA_GRE_TOS, &tos, 1);
+
+ return 0;
+}
+
+static void gre_print_opt(struct link_util *lu, FILE *f, struct rtattr *tb[])
+{
+ char s1[1024];
+ char s2[64];
+ const char *local = "any";
+ const char *remote = "any";
+ unsigned iflags = 0;
+ unsigned oflags = 0;
+
+ if (!tb)
+ return;
+
+ if (tb[IFLA_GRE_REMOTE]) {
+ unsigned addr = *(__u32 *)RTA_DATA(tb[IFLA_GRE_REMOTE]);
+
+ if (addr)
+ remote = format_host(AF_INET, 4, &addr, s1, sizeof(s1));
+ }
+
+ fprintf(f, "remote %s ", remote);
+
+ if (tb[IFLA_GRE_LOCAL]) {
+ unsigned addr = *(__u32 *)RTA_DATA(tb[IFLA_GRE_LOCAL]);
+
+ if (addr)
+ local = format_host(AF_INET, 4, &addr, s1, sizeof(s1));
+ }
+
+ fprintf(f, "local %s ", local);
+
+ if (tb[IFLA_GRE_LINK] && *(__u32 *)RTA_DATA(tb[IFLA_GRE_LINK])) {
+ unsigned link = *(__u32 *)RTA_DATA(tb[IFLA_GRE_LINK]);
+ char *n = tnl_ioctl_get_ifname(link);
+
+ if (n)
+ fprintf(f, "dev %s ", n);
+ else
+ fprintf(f, "dev %u ", link);
+ }
+
+ if (tb[IFLA_GRE_TTL] && *(__u8 *)RTA_DATA(tb[IFLA_GRE_TTL]))
+ fprintf(f, "ttl %d ", *(__u8 *)RTA_DATA(tb[IFLA_GRE_TTL]));
+ else
+ fprintf(f, "ttl inherit ");
+
+ if (tb[IFLA_GRE_TOS] && *(__u8 *)RTA_DATA(tb[IFLA_GRE_TOS])) {
+ int tos = *(__u8 *)RTA_DATA(tb[IFLA_GRE_TOS]);
+
+ fputs("tos ", f);
+ if (tos == 1)
+ fputs("inherit ", f);
+ else
+ fprintf(f, "0x%x ", tos);
+ }
+
+ if (tb[IFLA_GRE_PMTUDISC] &&
+ !*(__u8 *)RTA_DATA(tb[IFLA_GRE_PMTUDISC]))
+ fputs("nopmtudisc ", f);
+
+ if (tb[IFLA_GRE_IFLAGS])
+ iflags = *(__u16 *)RTA_DATA(tb[IFLA_GRE_IFLAGS]);
+
+ if (tb[IFLA_GRE_OFLAGS])
+ oflags = *(__u16 *)RTA_DATA(tb[IFLA_GRE_OFLAGS]);
+
+ if (iflags & GRE_KEY && tb[IFLA_GRE_IKEY] &&
+ *(__u32 *)RTA_DATA(tb[IFLA_GRE_IKEY])) {
+ inet_ntop(AF_INET, RTA_DATA(tb[IFLA_GRE_IKEY]), s2, sizeof(s2));
+ fprintf(f, "ikey %s ", s2);
+ }
+
+ if (oflags & GRE_KEY && tb[IFLA_GRE_OKEY] &&
+ *(__u32 *)RTA_DATA(tb[IFLA_GRE_OKEY])) {
+ inet_ntop(AF_INET, RTA_DATA(tb[IFLA_GRE_OKEY]), s2, sizeof(s2));
+ fprintf(f, "okey %s ", s2);
+ }
+
+ if (iflags & GRE_SEQ)
+ fputs("iseq ", f);
+ if (oflags & GRE_SEQ)
+ fputs("oseq ", f);
+ if (iflags & GRE_CSUM)
+ fputs("icsum ", f);
+ if (oflags & GRE_CSUM)
+ fputs("ocsum ", f);
+}
+
+struct link_util gre_link_util = {
+ .id = "gre",
+ .maxattr = IFLA_GRE_MAX,
+ .parse_opt = gre_parse_opt,
+ .print_opt = gre_print_opt,
+};
+
+struct link_util gretap_link_util = {
+ .id = "gretap",
+ .maxattr = IFLA_GRE_MAX,
+ .parse_opt = gre_parse_opt,
+ .print_opt = gre_print_opt,
+};
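
As a side note on the key handling in gre_parse_opt() above: a key
may be given either as a dotted quad or as a plain number, and in
both cases it ends up in network byte order. The following is a
minimal stand-alone sketch of that rule. parse_gre_key() is a
hypothetical helper, and inet_pton()/strtoul() are used here in
place of iproute2's get_addr32() and get_unsigned(), so error
handling differs from the patch.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: parse a GRE key into network byte order.
 * Returns 0 on success and -1 on malformed input.
 */
static int parse_gre_key(const char *arg, uint32_t *key_be)
{
	char *end;
	unsigned long v;

	assert(arg && key_be);

	if (strchr(arg, '.')) {
		struct in_addr a;

		/* Dotted quad: the result is already in network order. */
		if (inet_pton(AF_INET, arg, &a) != 1)
			return -1;
		*key_be = a.s_addr;
		return 0;
	}

	/* Plain number (base 0 accepts decimal, octal and hex). */
	errno = 0;
	v = strtoul(arg, &end, 0);
	if (errno || end == arg || *end != '\0' || v > UINT32_MAX)
		return -1;
	*key_be = htonl((uint32_t)v);
	return 0;
}
```

With this sketch "0.0.0.42", "42" and "0x2a" all produce the same
key, which is why gre_print_opt() can always show keys with
inet_ntop().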
* bridge: Fix update_pmtu crash with GRE
2008-10-09 10:39 ` Herbert Xu
@ 2008-11-24 6:49 ` Herbert Xu
2008-11-24 12:35 ` Patrick McHardy
0 siblings, 1 reply; 16+ messages in thread
From: Herbert Xu @ 2008-11-24 6:49 UTC (permalink / raw)
To: Philip Craig; +Cc: David S. Miller, netdev, Patrick McHardy
On Thu, Oct 09, 2008 at 06:39:56PM +0800, Herbert Xu wrote:
>
> > +/*
> > + * We've finished passing through netfilter, so we can remove the fake dst.
> > + * This is required by some lower layers, eg ip_gre
> > + */
> > +static int br_nf_dev_queue_xmit_finish(struct sk_buff *skb)
> > +{
> > + if (skb->dst == (struct dst_entry *)&__fake_rtable) {
> > + dst_release(skb->dst);
> > + skb->dst = NULL;
> > + }
> > +
> > + return br_dev_queue_push_xmit(skb);
> > +}
>
> Alternatively we could give fake_rtable an ops structure.
Looks like this fell through the cracks:
bridge: Fix update_pmtu crash with GRE
As GRE tries to call the update_pmtu function on skb->dst and
bridge supplies an skb->dst that has a NULL ops field, all is
not well.
This patch fixes this by giving the bridge device an ops field
with an update_pmtu function. For the moment I've left all
other fields blank but we can fill them in later should the
need arise.
Based on report and patch by Philip Craig.
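
To illustrate the failure mode, here is a small user-space model
(not kernel code). The structures are cut down to the two fields
that matter, and noop_update_pmtu(), noop_dst_ops and
tunnel_update_pmtu() are hypothetical names chosen for this sketch.
The real GRE path calls skb->dst->ops->update_pmtu() without any
NULL check; the check below exists only so the model can report the
problem instead of crashing.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins; the kernel structures have many more fields. */
struct dst_entry;

struct dst_ops {
	void (*update_pmtu)(struct dst_entry *dst, unsigned int mtu);
};

struct dst_entry {
	struct dst_ops *ops;
	unsigned int mtu;
};

/* No-op callback in the spirit of fake_update_pmtu(): MTU is left alone. */
static void noop_update_pmtu(struct dst_entry *dst, unsigned int mtu)
{
	(void)dst;
	(void)mtu;
}

static struct dst_ops noop_dst_ops = {
	.update_pmtu = noop_update_pmtu,
};

/*
 * Models a tunnel transmit path that wants to report a smaller path MTU.
 * Returns 0 on success, or -1 where an unguarded caller would have
 * dereferenced a NULL ops pointer.
 */
static int tunnel_update_pmtu(struct dst_entry *dst, unsigned int mtu)
{
	if (!dst->ops || !dst->ops->update_pmtu)
		return -1;
	dst->ops->update_pmtu(dst, mtu);
	return 0;
}
```

A dst_entry with ops left NULL, as the bridge's fake rtable had
before this patch, makes tunnel_update_pmtu() return -1. Pointing
ops at noop_dst_ops makes the call succeed while leaving the MTU
untouched, which is all the fix needs.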
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
diff --git a/net/bridge/br_netfilter.c b/net/bridge/br_netfilter.c
index fa5cda4..45f61c3 100644
--- a/net/bridge/br_netfilter.c
+++ b/net/bridge/br_netfilter.c
@@ -101,6 +101,18 @@ static inline __be16 pppoe_proto(const struct sk_buff *skb)
pppoe_proto(skb) == htons(PPP_IPV6) && \
brnf_filter_pppoe_tagged)
+static void fake_update_pmtu(struct dst_entry *dst, u32 mtu)
+{
+}
+
+static struct dst_ops fake_dst_ops = {
+ .family = AF_INET,
+ .protocol = __constant_htons(ETH_P_IP),
+ .update_pmtu = fake_update_pmtu,
+ .entry_size = sizeof(struct rtable),
+ .entries = ATOMIC_INIT(0),
+};
+
/*
* Initialize bogus route table used to keep netfilter happy.
* Currently, we fill in the PMTU entry because netfilter
@@ -117,6 +129,7 @@ void br_netfilter_rtable_init(struct net_bridge *br)
rt->u.dst.path = &rt->u.dst;
rt->u.dst.metrics[RTAX_MTU - 1] = 1500;
rt->u.dst.flags = DST_NOXFRM;
+ rt->u.dst.ops = &fake_dst_ops;
}
static inline struct rtable *bridge_parent_rtable(const struct net_device *dev)
Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
* Re: bridge: Fix update_pmtu crash with GRE
2008-11-24 6:49 ` bridge: Fix update_pmtu crash with GRE Herbert Xu
@ 2008-11-24 12:35 ` Patrick McHardy
0 siblings, 0 replies; 16+ messages in thread
From: Patrick McHardy @ 2008-11-24 12:35 UTC (permalink / raw)
To: Herbert Xu; +Cc: Philip Craig, David S. Miller, netdev
Herbert Xu wrote:
> Looks like this fell through the cracks:
>
> bridge: Fix update_pmtu crash with GRE
>
> As GRE tries to call the update_pmtu function on skb->dst and
> bridge supplies an skb->dst that has a NULL ops field, all is
> not well.
>
> This patch fixes this by giving the bridge device an ops field
> with an update_pmtu function. For the moment I've left all
> other fields blank but we can fill them in later should the
> need arise.
>
> Based on report and patch by Philip Craig.
Applied, thanks Herbert.
Thread overview: 16+ messages
2008-10-09 7:04 [0/4] gre: Ethernet over GRE Herbert Xu
2008-10-09 7:05 ` [PATCH 1/4] gre: Use needed_headroom Herbert Xu
2008-10-09 7:05 ` [PATCH 2/4] gre: Move MTU setting out of ipgre_tunnel_bind_dev Herbert Xu
2008-10-09 7:05 ` [PATCH 3/4] gre: Add netlink interface Herbert Xu
2008-10-09 7:05 ` [PATCH 4/4] gre: Add Transparent Ethernet Bridging Herbert Xu
2008-10-09 8:43 ` Philip Craig
2008-10-09 10:39 ` Herbert Xu
2008-11-24 6:49 ` bridge: Fix update_pmtu crash with GRE Herbert Xu
2008-11-24 12:35 ` Patrick McHardy
2008-10-09 11:21 ` [PATCH 4/4] gre: Add Transparent Ethernet Bridging Herbert Xu
2008-10-09 7:08 ` ip: gre: Add GRE configuration support through rtnl_link Herbert Xu
2008-10-11 13:03 ` Herbert Xu
2008-10-09 11:23 ` [PATCH 5/4] inet: Make tunnel RX/TX byte counters more consistent Herbert Xu
2008-10-09 19:03 ` David Miller
2008-10-09 12:12 ` [0/4] gre: Ethernet over GRE Patrick McHardy
2008-10-09 19:01 ` David Miller