* [PATCH][RFC] etherip: Ethernet-in-IPv4 tunneling
@ 2005-01-12 22:24 Lennert Buytenhek
2005-01-12 22:42 ` Ben Greear
` (2 more replies)
0 siblings, 3 replies; 30+ messages in thread
From: Lennert Buytenhek @ 2005-01-12 22:24 UTC (permalink / raw)
To: netdev; +Cc: shemminger, rhousley, shollenbeck
Hi,
After struggling with various userland VPN solutions for a while (and
failing to make IPSEC tunnel mode do what I want), I decided to just
implement ethernet-in-IP tunneling in the kernel and let IPSEC transport
mode handle the rest.
There appeared to be an RFC for ethernet-in-IP already, RFC 3378, so I
just implemented that. It's very simple -- slap a 16-bit header (0x3000,
which is 4 bits of etherip version number and 12 bits of padding) onto
the beginning of the ethernet packet, and then wrap it in an IP packet.
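
To make the header layout concrete, here is a minimal userspace sketch (not
part of the patch) of writing and checking those two bytes. The helper names
are made up for illustration; only the 0x3000 value and the split into a
4-bit version and 12 bits of padding come from the description above.

```c
#include <stddef.h>
#include <stdint.h>

/* Version 3 in the top 4 bits, 12 padding bits set to zero. */
#define ETHERIP_VERSION 0x3000
#define ETHERIP_HLEN    2

/* Hypothetical helper: write the 2-byte header in network byte order. */
static void etherip_put_header(uint8_t *buf)
{
	buf[0] = (uint8_t)(ETHERIP_VERSION >> 8);
	buf[1] = (uint8_t)(ETHERIP_VERSION & 0xff);
}

/* Hypothetical helper: extract the 4-bit version field. */
static unsigned int etherip_version(const uint8_t *buf)
{
	return buf[0] >> 4;
}

/* Hypothetical helper: return 1 if buf starts with the expected header,
 * 0 if it is too short or the 16 bits differ from 0x3000. */
static int etherip_header_ok(const uint8_t *buf, size_t len)
{
	uint16_t hdr;

	if (len < ETHERIP_HLEN)
		return 0;
	hdr = (uint16_t)((buf[0] << 8) | buf[1]);
	return hdr == ETHERIP_VERSION;
}
```

The receive path in the patch performs the same comparison against
htons(ETHERIP_VERSION) and drops the packet on a mismatch.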
Below is what I came up with, against the latest Fedora Core 3 kernel,
which is 2.6.10-something. It survives some fairly basic testing between
a number of different machines, UP and SMP. (Corresponding iproute2
patch is available from http://www.wantstofly.org/~buytenh/etherip/ )
Notes:
- daddr=0 tunnel mode is meaningless for generic ethernet tunneling,
so I didn't implement that. Packets are just dropped on the floor
if daddr==0 at the time of sending, which is the default mode for
the etherip0 device.
Issues and TODO:
- Implement MULTICAST(daddr) tunnel mode, seems useful to have.
- Perhaps we should always present a MTU=1500 device to the user and
deal with fragmentation issues ourselves.
- Don't take TTL of outer packet from inner packet.
- Figure out what to do with DF.
- Check whether ECN bits are correctly {en,de}capsulated.
- Check out iffy-looking '2 * sizeof(struct etheriphdr)' construct
(same problem in ip_gre.c?)
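
For the MTU item above, the arithmetic can be sketched as follows. This is
only an illustration under stated assumptions: an outer IPv4 header without
options (20 bytes), no IPSEC overhead, and an untagged 14-byte inner ethernet
header. The function names are hypothetical and do not appear in the patch.

```c
/* Assumed per-packet overhead of the encapsulation. */
#define OUTER_IP_HLEN   20	/* IPv4 header, no options */
#define ETHERIP_HLEN     2	/* 16-bit etherip header */
#define INNER_ETH_HLEN  14	/* dst MAC + src MAC + ethertype */

/* Largest inner payload that fits in one unfragmented outer packet. */
static int etherip_inner_mtu(int outer_mtu)
{
	int mtu = outer_mtu - OUTER_IP_HLEN - ETHERIP_HLEN - INNER_ETH_HLEN;

	return mtu > 0 ? mtu : 0;
}

/* Size of the outer IP packet needed to carry a given inner payload. */
static int etherip_outer_len(int inner_payload)
{
	return inner_payload + INNER_ETH_HLEN + ETHERIP_HLEN + OUTER_IP_HLEN;
}
```

Under these assumptions, a 1500-byte path leaves 1464 bytes for the inner
payload, and presenting MTU=1500 on the tunnel device means emitting
1536-byte outer packets, which would have to be fragmented on such a path.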
Comments? I would like to see this upstream when the remaining issues
have been sorted out.
cheers,
Lennert
diff -urN linux-2.6.10.orig/include/linux/in.h linux-2.6.10/include/linux/in.h
--- linux-2.6.10.orig/include/linux/in.h 2005-01-12 21:44:31.000000000 +0100
+++ linux-2.6.10/include/linux/in.h 2005-01-12 21:44:28.000000000 +0100
@@ -39,6 +39,7 @@
IPPROTO_ESP = 50, /* Encapsulation Security Payload protocol */
IPPROTO_AH = 51, /* Authentication Header protocol */
+ IPPROTO_ETHERIP = 97, /* Ethernet-in-IP tunneling (rfc 3378) */
IPPROTO_PIM = 103, /* Protocol Independent Multicast */
IPPROTO_COMP = 108, /* Compression Header protocol */
diff -urN linux-2.6.10.orig/net/ipv4/etherip.c linux-2.6.10/net/ipv4/etherip.c
--- linux-2.6.10.orig/net/ipv4/etherip.c 1970-01-01 01:00:00.000000000 +0100
+++ linux-2.6.10/net/ipv4/etherip.c 2005-01-12 22:09:51.699077165 +0100
@@ -0,0 +1,758 @@
+/*
+ * Linux NET3: Ethernet over IP protocol decoder.
+ *
+ * Authors: Alexey Kuznetsov (kuznet@ms2.inr.ac.ru)
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ *
+ */
+
+/*
+ This version of net/ipv4/etherip.c created by Lennert Buytenhek
+ by mashing net/ipv4/ip_gre.c and net/ipv4/ipip.c together.
+
+ For comments look at net/ipv4/ip_gre.c
+ */
+
+#include <linux/config.h>
+#include <linux/module.h>
+#include <linux/types.h>
+#include <linux/sched.h>
+#include <linux/kernel.h>
+#include <asm/uaccess.h>
+#include <linux/skbuff.h>
+#include <linux/netdevice.h>
+#include <linux/in.h>
+#include <linux/tcp.h>
+#include <linux/udp.h>
+#include <linux/if_arp.h>
+#include <linux/mroute.h>
+#include <linux/init.h>
+#include <linux/in6.h>
+#include <linux/inetdevice.h>
+#include <linux/igmp.h>
+#include <linux/netfilter_ipv4.h>
+#include <linux/etherdevice.h>
+
+#include <net/sock.h>
+#include <net/ip.h>
+#include <net/icmp.h>
+#include <net/protocol.h>
+#include <net/ipip.h>
+#include <net/arp.h>
+#include <net/checksum.h>
+#include <net/dsfield.h>
+#include <net/inet_ecn.h>
+#include <net/xfrm.h>
+
+#ifdef CONFIG_IPV6
+#include <net/ipv6.h>
+#include <net/ip6_fib.h>
+#include <net/ip6_route.h>
+#endif
+
+struct etheriphdr
+{
+ struct iphdr iph;
+ u16 version;
+} __attribute__((packed));
+
+#define HASH_SIZE 16
+#define HASH(addr) ((addr^(addr>>4))&0xF)
+
+#define ETHERIP_VERSION 0x3000
+
+static int etherip_fb_tunnel_init(struct net_device *dev);
+static int etherip_tunnel_init(struct net_device *dev);
+static void etherip_tunnel_setup(struct net_device *dev);
+
+static struct net_device *etherip_fb_tunnel_dev;
+
+static struct ip_tunnel *tunnels_r_l[HASH_SIZE];
+static struct ip_tunnel *tunnels_r[HASH_SIZE];
+static struct ip_tunnel *tunnels_l[HASH_SIZE];
+static struct ip_tunnel *tunnels_wc[1];
+
+static struct ip_tunnel **tunnels[4] = { tunnels_wc, tunnels_l, tunnels_r, tunnels_r_l };
+
+static rwlock_t etherip_lock = RW_LOCK_UNLOCKED;
+
+static struct ip_tunnel * etherip_tunnel_lookup(u32 remote, u32 local)
+{
+ unsigned h0 = HASH(remote);
+ unsigned h1 = HASH(local);
+ struct ip_tunnel *t;
+
+ for (t = tunnels_r_l[h0^h1]; t; t = t->next) {
+ if (local == t->parms.iph.saddr &&
+ remote == t->parms.iph.daddr && (t->dev->flags&IFF_UP))
+ return t;
+ }
+ for (t = tunnels_r[h0]; t; t = t->next) {
+ if (remote == t->parms.iph.daddr && (t->dev->flags&IFF_UP))
+ return t;
+ }
+ for (t = tunnels_l[h1]; t; t = t->next) {
+ if (local == t->parms.iph.saddr && (t->dev->flags&IFF_UP))
+ return t;
+ }
+ if ((t = tunnels_wc[0]) != NULL && (t->dev->flags&IFF_UP))
+ return t;
+ return NULL;
+}
+
+static struct ip_tunnel **etherip_bucket(struct ip_tunnel *t)
+{
+ u32 remote = t->parms.iph.daddr;
+ u32 local = t->parms.iph.saddr;
+ unsigned h = 0;
+ int prio = 0;
+
+ if (remote) {
+ prio |= 2;
+ h ^= HASH(remote);
+ }
+ if (local) {
+ prio |= 1;
+ h ^= HASH(local);
+ }
+ return &tunnels[prio][h];
+}
+
+static void etherip_tunnel_unlink(struct ip_tunnel *t)
+{
+ struct ip_tunnel **tp;
+
+ for (tp = etherip_bucket(t); *tp; tp = &(*tp)->next) {
+ if (t == *tp) {
+ write_lock_bh(&etherip_lock);
+ *tp = t->next;
+ write_unlock_bh(&etherip_lock);
+ break;
+ }
+ }
+}
+
+static void etherip_tunnel_link(struct ip_tunnel *t)
+{
+ struct ip_tunnel **tp = etherip_bucket(t);
+
+ t->next = *tp;
+ write_lock_bh(&etherip_lock);
+ *tp = t;
+ write_unlock_bh(&etherip_lock);
+}
+
+static struct ip_tunnel * etherip_tunnel_locate(struct ip_tunnel_parm *parms, int create)
+{
+ u32 remote = parms->iph.daddr;
+ u32 local = parms->iph.saddr;
+ struct ip_tunnel *t, **tp, *nt;
+ struct net_device *dev;
+ unsigned h = 0;
+ int prio = 0;
+ char name[IFNAMSIZ];
+
+ if (remote) {
+ prio |= 2;
+ h ^= HASH(remote);
+ }
+ if (local) {
+ prio |= 1;
+ h ^= HASH(local);
+ }
+ for (tp = &tunnels[prio][h]; (t = *tp) != NULL; tp = &t->next) {
+ if (local == t->parms.iph.saddr && remote == t->parms.iph.daddr)
+ return t;
+ }
+ if (!create)
+ return NULL;
+
+ if (parms->name[0])
+ strlcpy(name, parms->name, IFNAMSIZ);
+ else {
+ int i;
+ for (i=1; i<100; i++) {
+ sprintf(name, "etherip%d", i);
+ if (__dev_get_by_name(name) == NULL)
+ break;
+ }
+ if (i==100)
+ goto failed;
+ }
+
+ dev = alloc_netdev(sizeof(*t), name, etherip_tunnel_setup);
+ if (!dev)
+ return NULL;
+
+ nt = dev->priv;
+ SET_MODULE_OWNER(dev);
+ dev->init = etherip_tunnel_init;
+ nt->parms = *parms;
+
+ if (register_netdevice(dev) < 0) {
+ free_netdev(dev);
+ goto failed;
+ }
+
+ dev_hold(dev);
+ etherip_tunnel_link(nt);
+ /* Do not decrement MOD_USE_COUNT here. */
+ return nt;
+
+failed:
+ return NULL;
+}
+
+static void etherip_tunnel_uninit(struct net_device *dev)
+{
+ if (dev == etherip_fb_tunnel_dev) {
+ write_lock_bh(&etherip_lock);
+ tunnels_wc[0] = NULL;
+ write_unlock_bh(&etherip_lock);
+ } else
+ etherip_tunnel_unlink((struct ip_tunnel*)dev->priv);
+ dev_put(dev);
+}
+
+
+void etherip_err(struct sk_buff *skb, u32 info)
+{
+#ifndef I_WISH_WORLD_WERE_PERFECT
+/* It is not :-( All the routers (except for Linux) return only
+ 8 bytes of packet payload. It means, that precise relaying of
+ ICMP in the real Internet is absolutely infeasible.
+ */
+
+ struct iphdr *iph = (struct iphdr*)skb->data;
+ int type = skb->h.icmph->type;
+ int code = skb->h.icmph->code;
+ struct ip_tunnel *t;
+
+ switch (type) {
+ default:
+ case ICMP_PARAMETERPROB:
+ return;
+
+ case ICMP_DEST_UNREACH:
+ switch (code) {
+ case ICMP_SR_FAILED:
+ case ICMP_PORT_UNREACH:
+ /* Impossible event. */
+ return;
+ case ICMP_FRAG_NEEDED:
+ /* Soft state for pmtu is maintained by IP core. */
+ return;
+ default:
+ /* All others are translated to HOST_UNREACH.
+ rfc2003 contains "deep thoughts" about NET_UNREACH,
+ I believe they are just ether pollution. --ANK
+ */
+ break;
+ }
+ break;
+ case ICMP_TIME_EXCEEDED:
+ if (code != ICMP_EXC_TTL)
+ return;
+ break;
+ }
+
+ read_lock(&etherip_lock);
+ t = etherip_tunnel_lookup(iph->daddr, iph->saddr);
+ if (t == NULL || t->parms.iph.daddr == 0)
+ goto out;
+ if (t->parms.iph.ttl == 0 && type == ICMP_TIME_EXCEEDED)
+ goto out;
+
+ if (jiffies - t->err_time < IPTUNNEL_ERR_TIMEO)
+ t->err_count++;
+ else
+ t->err_count = 1;
+ t->err_time = jiffies;
+out:
+ read_unlock(&etherip_lock);
+ return;
+#endif
+}
+
+static inline void etherip_ecn_decapsulate(struct iphdr *iph, struct sk_buff *skb)
+{
+ if (INET_ECN_is_ce(iph->tos)) {
+ if (skb->protocol == htons(ETH_P_IP)) {
+ IP_ECN_set_ce(skb->nh.iph);
+ } else if (skb->protocol == htons(ETH_P_IPV6)) {
+ IP6_ECN_set_ce(skb->nh.ipv6h);
+ }
+ }
+}
+
+static inline u8
+etherip_ecn_encapsulate(u8 tos, struct iphdr *old_iph, struct sk_buff *skb)
+{
+ u8 inner = 0;
+ if (skb->protocol == htons(ETH_P_IP))
+ inner = old_iph->tos;
+ else if (skb->protocol == htons(ETH_P_IPV6))
+ inner = ipv6_get_dsfield((struct ipv6hdr *)old_iph);
+ return INET_ECN_encapsulate(tos, inner);
+}
+
+int etherip_rcv(struct sk_buff *skb)
+{
+ struct iphdr *iph;
+ struct ip_tunnel *tunnel;
+ struct etheriphdr *ethiph;
+
+ if (!pskb_may_pull(skb, sizeof(struct etheriphdr)))
+ goto out;
+
+ ethiph = (struct etheriphdr *)skb->nh.raw;
+ if (ethiph->version != htons(ETHERIP_VERSION)) {
+ kfree_skb(skb);
+ return 0;
+ }
+
+ iph = skb->nh.iph;
+
+ read_lock(&etherip_lock);
+ if ((tunnel = etherip_tunnel_lookup(iph->saddr, iph->daddr)) != NULL) {
+ secpath_reset(skb);
+
+ /* Pull etherip header. */
+ skb_pull(skb, 2);
+ skb->protocol = eth_type_trans(skb, tunnel->dev);
+
+ memset(&(IPCB(skb)->opt), 0, sizeof(struct ip_options));
+
+ tunnel->stat.rx_packets++;
+ tunnel->stat.rx_bytes += skb->len;
+ skb->dev = tunnel->dev;
+ dst_release(skb->dst);
+ skb->dst = NULL;
+ nf_reset(skb);
+ etherip_ecn_decapsulate(iph, skb);
+ netif_rx(skb);
+ read_unlock(&etherip_lock);
+ return 0;
+ }
+ read_unlock(&etherip_lock);
+
+out:
+ return -1;
+}
+
+static int etherip_tunnel_xmit(struct sk_buff *skb, struct net_device *dev)
+{
+ struct ip_tunnel *tunnel = (struct ip_tunnel*)dev->priv;
+ struct net_device_stats *stats = &tunnel->stat;
+ struct iphdr *old_iph = skb->nh.iph;
+ struct iphdr *tiph = &tunnel->parms.iph;
+ u8 tos;
+ u16 df;
+ struct rtable *rt; /* Route to the other host */
+ struct net_device *tdev; /* Device to other host */
+ struct etheriphdr *ethiph;
+ struct iphdr *iph; /* Our new IP header */
+ int max_headroom; /* The extra header space needed */
+ int mtu;
+
+ if (tunnel->recursion++) {
+ stats->collisions++;
+ goto tx_error;
+ }
+
+ /* Need valid non-multicast daddr. */
+ if (tiph->daddr == 0 || MULTICAST(tiph->daddr))
+ goto tx_error;
+
+ tos = tiph->tos;
+ if (tos&1) {
+ if (skb->protocol == htons(ETH_P_IP))
+ tos = old_iph->tos;
+ tos &= ~1;
+ }
+
+ {
+ struct flowi fl = { .oif = tunnel->parms.link,
+ .nl_u = { .ip4_u =
+ { .daddr = tiph->daddr,
+ .saddr = tiph->saddr,
+ .tos = RT_TOS(tos) } },
+ .proto = IPPROTO_ETHERIP };
+ if (ip_route_output_key(&rt, &fl)) {
+ stats->tx_carrier_errors++;
+ goto tx_error_icmp;
+ }
+ }
+ tdev = rt->u.dst.dev;
+
+ if (tdev == dev) {
+ ip_rt_put(rt);
+ stats->collisions++;
+ goto tx_error;
+ }
+
+ df = tiph->frag_off;
+ if (df)
+ mtu = dst_pmtu(&rt->u.dst) - sizeof(struct etheriphdr);
+ else
+ mtu = skb->dst ? dst_pmtu(skb->dst) : dev->mtu;
+
+ if (skb->dst)
+ skb->dst->ops->update_pmtu(skb->dst, mtu);
+
+ if (skb->protocol == htons(ETH_P_IP)) {
+ df |= (old_iph->frag_off&htons(IP_DF));
+
+ if ((old_iph->frag_off & htons(IP_DF)) &&
+ mtu < ntohs(old_iph->tot_len)) {
+ icmp_send(skb, ICMP_DEST_UNREACH, ICMP_FRAG_NEEDED, htonl(mtu));
+ ip_rt_put(rt);
+ goto tx_error;
+ }
+ }
+#ifdef CONFIG_IPV6
+ else if (skb->protocol == htons(ETH_P_IPV6)) {
+ struct rt6_info *rt6 = (struct rt6_info*)skb->dst;
+
+ if (rt6 && mtu < dst_pmtu(skb->dst) && mtu >= IPV6_MIN_MTU) {
+ if (tiph->daddr || rt6->rt6i_dst.plen == 128) {
+ rt6->rt6i_flags |= RTF_MODIFIED;
+ skb->dst->metrics[RTAX_MTU-1] = mtu;
+ }
+ }
+
+ /* @@@ Is this correct? */
+ if (mtu >= IPV6_MIN_MTU && mtu < skb->len - 2 * sizeof(struct etheriphdr)) {
+ icmpv6_send(skb, ICMPV6_PKT_TOOBIG, 0, mtu, dev);
+ ip_rt_put(rt);
+ goto tx_error;
+ }
+ }
+#endif
+
+ if (tunnel->err_count > 0) {
+ if (jiffies - tunnel->err_time < IPTUNNEL_ERR_TIMEO) {
+ tunnel->err_count--;
+ dst_link_failure(skb);
+ } else
+ tunnel->err_count = 0;
+ }
+
+ max_headroom = LL_RESERVED_SPACE(tdev) + sizeof(struct etheriphdr);
+
+ if (skb_headroom(skb) < max_headroom || skb_cloned(skb) || skb_shared(skb)) {
+ struct sk_buff *new_skb = skb_realloc_headroom(skb, max_headroom);
+ if (!new_skb) {
+ ip_rt_put(rt);
+ stats->tx_dropped++;
+ dev_kfree_skb(skb);
+ tunnel->recursion--;
+ return 0;
+ }
+ if (skb->sk)
+ skb_set_owner_w(new_skb, skb->sk);
+ dev_kfree_skb(skb);
+ skb = new_skb;
+ old_iph = skb->nh.iph;
+ }
+
+ skb->h.raw = skb->nh.raw;
+ skb->nh.raw = skb_push(skb, sizeof(struct etheriphdr));
+ memset(&(IPCB(skb)->opt), 0, sizeof(IPCB(skb)->opt));
+ dst_release(skb->dst);
+ skb->dst = &rt->u.dst;
+
+ /*
+ * Push down and install the etherip header.
+ */
+
+ ethiph = (struct etheriphdr *)skb->nh.iph;
+
+ iph = &ethiph->iph;
+ iph->version = 4;
+ iph->ihl = sizeof(struct iphdr) >> 2;
+ iph->frag_off = df;
+ iph->protocol = IPPROTO_ETHERIP;
+ iph->tos = etherip_ecn_encapsulate(tos, old_iph, skb);
+ iph->daddr = rt->rt_dst;
+ iph->saddr = rt->rt_src;
+
+ ethiph->version = htons(ETHERIP_VERSION);
+
+ if ((ethiph->iph.ttl = tiph->ttl) == 0) {
+ if (skb->protocol == htons(ETH_P_IP))
+ ethiph->iph.ttl = old_iph->ttl;
+#ifdef CONFIG_IPV6
+ else if (skb->protocol == htons(ETH_P_IPV6))
+ ethiph->iph.ttl = ((struct ipv6hdr*)old_iph)->hop_limit;
+#endif
+ else
+ ethiph->iph.ttl = dst_metric(&rt->u.dst, RTAX_HOPLIMIT);
+ }
+
+ nf_reset(skb);
+
+ IPTUNNEL_XMIT();
+ tunnel->recursion--;
+ return 0;
+
+tx_error_icmp:
+ dst_link_failure(skb);
+
+tx_error:
+ stats->tx_errors++;
+ dev_kfree_skb(skb);
+ tunnel->recursion--;
+ return 0;
+}
+
+static int
+etherip_tunnel_ioctl (struct net_device *dev, struct ifreq *ifr, int cmd)
+{
+ int err = 0;
+ struct ip_tunnel_parm p;
+ struct ip_tunnel *t;
+
+ switch (cmd) {
+ case SIOCGETTUNNEL:
+ t = NULL;
+ if (dev == etherip_fb_tunnel_dev) {
+ if (copy_from_user(&p, ifr->ifr_ifru.ifru_data, sizeof(p))) {
+ err = -EFAULT;
+ break;
+ }
+ t = etherip_tunnel_locate(&p, 0);
+ }
+ if (t == NULL)
+ t = (struct ip_tunnel*)dev->priv;
+ memcpy(&p, &t->parms, sizeof(p));
+ if (copy_to_user(ifr->ifr_ifru.ifru_data, &p, sizeof(p)))
+ err = -EFAULT;
+ break;
+
+ case SIOCADDTUNNEL:
+ case SIOCCHGTUNNEL:
+ err = -EPERM;
+ if (!capable(CAP_NET_ADMIN))
+ goto done;
+
+ err = -EFAULT;
+ if (copy_from_user(&p, ifr->ifr_ifru.ifru_data, sizeof(p)))
+ goto done;
+
+ err = -EINVAL;
+ if (p.iph.version != 4 || p.iph.protocol != IPPROTO_ETHERIP ||
+ p.iph.ihl != 5 || (p.iph.frag_off&htons(~IP_DF)))
+ goto done;
+ if (p.iph.ttl)
+ p.iph.frag_off |= htons(IP_DF);
+
+ t = etherip_tunnel_locate(&p, cmd == SIOCADDTUNNEL);
+
+ if (dev != etherip_fb_tunnel_dev && cmd == SIOCCHGTUNNEL) {
+ if (t != NULL) {
+ if (t->dev != dev) {
+ err = -EEXIST;
+ break;
+ }
+ } else {
+ if (!p.iph.daddr) {
+ err = -EINVAL;
+ break;
+ }
+
+ t = (struct ip_tunnel*)dev->priv;
+ etherip_tunnel_unlink(t);
+ t->parms.iph.saddr = p.iph.saddr;
+ t->parms.iph.daddr = p.iph.daddr;
+ etherip_tunnel_link(t);
+ netdev_state_change(dev);
+ }
+ }
+
+ if (t) {
+ err = 0;
+ if (cmd == SIOCCHGTUNNEL) {
+ t->parms.iph.ttl = p.iph.ttl;
+ t->parms.iph.tos = p.iph.tos;
+ t->parms.iph.frag_off = p.iph.frag_off;
+ }
+ if (copy_to_user(ifr->ifr_ifru.ifru_data, &t->parms, sizeof(p)))
+ err = -EFAULT;
+ } else
+ err = (cmd == SIOCADDTUNNEL ? -ENOBUFS : -ENOENT);
+ break;
+
+ case SIOCDELTUNNEL:
+ err = -EPERM;
+ if (!capable(CAP_NET_ADMIN))
+ goto done;
+
+ if (dev == etherip_fb_tunnel_dev) {
+ err = -EFAULT;
+ if (copy_from_user(&p, ifr->ifr_ifru.ifru_data, sizeof(p)))
+ goto done;
+ err = -ENOENT;
+ if ((t = etherip_tunnel_locate(&p, 0)) == NULL)
+ goto done;
+ err = -EPERM;
+ if (t->dev == etherip_fb_tunnel_dev)
+ goto done;
+ dev = t->dev;
+ }
+ err = unregister_netdevice(dev);
+ break;
+
+ default:
+ err = -EINVAL;
+ }
+
+done:
+ return err;
+}
+
+static struct net_device_stats *etherip_tunnel_get_stats(struct net_device *dev)
+{
+ return &(((struct ip_tunnel*)dev->priv)->stat);
+}
+
+static int etherip_tunnel_change_mtu(struct net_device *dev, int new_mtu)
+{
+ if (new_mtu < 68 || new_mtu > 0xFFF8 - sizeof(struct etheriphdr))
+ return -EINVAL;
+ dev->mtu = new_mtu;
+ return 0;
+}
+
+static void etherip_tunnel_setup(struct net_device *dev)
+{
+ SET_MODULE_OWNER(dev);
+ ether_setup(dev);
+
+ dev->uninit = etherip_tunnel_uninit;
+ dev->destructor = free_netdev;
+ dev->hard_start_xmit = etherip_tunnel_xmit;
+ dev->get_stats = etherip_tunnel_get_stats;
+ dev->do_ioctl = etherip_tunnel_ioctl;
+ dev->change_mtu = etherip_tunnel_change_mtu;
+
+ dev->hard_header_len = ETH_HLEN; // + sizeof(struct etheriphdr);
+ dev->tx_queue_len = 0;
+ random_ether_addr(dev->dev_addr);
+
+ dev->iflink = 0;
+}
+
+static int etherip_tunnel_init(struct net_device *dev)
+{
+ struct net_device *tdev = NULL;
+ struct ip_tunnel *tunnel;
+ struct iphdr *iph;
+
+ tunnel = (struct ip_tunnel*)dev->priv;
+ iph = &tunnel->parms.iph;
+
+ tunnel->dev = dev;
+ strcpy(tunnel->parms.name, dev->name);
+
+ /* Guess output device to choose reasonable mtu and hard_header_len */
+ if (iph->daddr) {
+ struct flowi fl = { .oif = tunnel->parms.link,
+ .nl_u = { .ip4_u =
+ { .daddr = iph->daddr,
+ .saddr = iph->saddr,
+ .tos = RT_TOS(iph->tos) } },
+ .proto = IPPROTO_ETHERIP };
+ struct rtable *rt;
+ if (!ip_route_output_key(&rt, &fl)) {
+ tdev = rt->u.dst.dev;
+ ip_rt_put(rt);
+ }
+ }
+
+ if (!tdev && tunnel->parms.link)
+ tdev = __dev_get_by_index(tunnel->parms.link);
+
+ if (tdev) {
+ dev->hard_header_len = tdev->hard_header_len + sizeof(struct etheriphdr);
+ dev->mtu = tdev->mtu - sizeof(struct etheriphdr);
+ }
+ dev->iflink = tunnel->parms.link;
+
+ return 0;
+}
+
+int __init etherip_fb_tunnel_init(struct net_device *dev)
+{
+ struct ip_tunnel *tunnel = (struct ip_tunnel*)dev->priv;
+ struct iphdr *iph = &tunnel->parms.iph;
+
+ tunnel->dev = dev;
+ strcpy(tunnel->parms.name, dev->name);
+
+ iph->version = 4;
+ iph->protocol = IPPROTO_ETHERIP;
+ iph->ihl = 5;
+
+ dev_hold(dev);
+ tunnels_wc[0] = tunnel;
+ return 0;
+}
+
+
+static struct net_protocol etherip_protocol = {
+ .handler = etherip_rcv,
+ .err_handler = etherip_err,
+};
+
+
+/*
+ * And now the modules code and kernel interface.
+ */
+
+static int __init etherip_init(void)
+{
+ int err;
+
+ printk(KERN_INFO "Ethernet over IPv4 tunneling driver\n");
+
+ if (inet_add_protocol(&etherip_protocol, IPPROTO_ETHERIP) < 0) {
+ printk(KERN_INFO "etherip init: can't add protocol\n");
+ return -EAGAIN;
+ }
+
+ etherip_fb_tunnel_dev = alloc_netdev(sizeof(struct ip_tunnel),
+ "etherip0", etherip_tunnel_setup);
+ if (!etherip_fb_tunnel_dev) {
+ err = -ENOMEM;
+ goto err1;
+ }
+
+ etherip_fb_tunnel_dev->init = etherip_fb_tunnel_init;
+
+ if ((err = register_netdev(etherip_fb_tunnel_dev)))
+ goto err2;
+out:
+ return err;
+err2:
+ free_netdev(etherip_fb_tunnel_dev);
+err1:
+ inet_del_protocol(&etherip_protocol, IPPROTO_ETHERIP);
+ goto out;
+}
+
+void etherip_fini(void)
+{
+ if (inet_del_protocol(&etherip_protocol, IPPROTO_ETHERIP) < 0)
+ printk(KERN_INFO "etherip close: can't remove protocol\n");
+
+ unregister_netdev(etherip_fb_tunnel_dev);
+}
+
+module_init(etherip_init);
+module_exit(etherip_fini);
+MODULE_LICENSE("GPL");
diff -urN linux-2.6.10.orig/net/ipv4/Kconfig linux-2.6.10/net/ipv4/Kconfig
--- linux-2.6.10.orig/net/ipv4/Kconfig 2005-01-12 21:48:11.000000000 +0100
+++ linux-2.6.10/net/ipv4/Kconfig 2005-01-12 21:53:51.063081266 +0100
@@ -202,6 +202,15 @@
Network), but can be distributed all over the Internet. If you want
to do that, say Y here and to "IP multicast routing" below.
+config NET_ETHERIP
+ tristate "IP: ethernet-in-IP tunneling"
+ depends on INET
+ help
+ Tunneling means encapsulating data of one protocol type within
+ another protocol and sending it over a channel that understands the
+ encapsulating protocol. This particular tunneling driver implements
+ the etherip Ethernet-in-IP protocol as described in RFC 3378.
+
config IP_MROUTE
bool "IP: multicast routing"
depends on IP_MULTICAST
diff -urN linux-2.6.10.orig/net/ipv4/Makefile linux-2.6.10/net/ipv4/Makefile
--- linux-2.6.10.orig/net/ipv4/Makefile 2004-12-24 22:34:26.000000000 +0100
+++ linux-2.6.10/net/ipv4/Makefile 2005-01-12 21:48:38.762364033 +0100
@@ -14,6 +14,7 @@
obj-$(CONFIG_IP_MROUTE) += ipmr.o
obj-$(CONFIG_NET_IPIP) += ipip.o
obj-$(CONFIG_NET_IPGRE) += ip_gre.o
+obj-$(CONFIG_NET_ETHERIP) += etherip.o
obj-$(CONFIG_SYN_COOKIES) += syncookies.o
obj-$(CONFIG_INET_AH) += ah4.o
obj-$(CONFIG_INET_ESP) += esp4.o
* Re: [PATCH][RFC] etherip: Ethernet-in-IPv4 tunneling
2005-01-12 22:24 [PATCH][RFC] etherip: Ethernet-in-IPv4 tunneling Lennert Buytenhek
@ 2005-01-12 22:42 ` Ben Greear
2005-01-12 22:48 ` Lennert Buytenhek
2005-01-13 0:04 ` Stephen Hemminger
2005-01-13 7:49 ` Pekka Savola
2 siblings, 1 reply; 30+ messages in thread
From: Ben Greear @ 2005-01-12 22:42 UTC (permalink / raw)
To: Lennert Buytenhek; +Cc: netdev, shemminger, rhousley, shollenbeck
Lennert Buytenhek wrote:
> Hi,
>
> After struggling with various userland VPN solutions for a while (and
> failing to make IPSEC tunnel mode do what I want), I decided to just
> implement ethernet-in-IP tunneling in the kernel and let IPSEC transport
> mode handle the rest.
>
> There appeared to be an RFC for ethernet-in-IP already, RFC 3378, so I
> just implemented that. It's very simple -- slap a 16-bit header (0x3000,
> which is 4 bits of etherip version number and 12 bits of padding) onto
> the beginning of the ethernet packet, and then wrap it in an IP packet.
>
> Below is what I came up with, against the latest Fedora Core 3 kernel,
> which is 2.6.10-something. It survives some fairly basic testing between
> a number of different machines, UP and SMP. (Corresponding iproute2
> patch is available from http://www.wantstofly.org/~buytenh/etherip/ )
>
> Notes:
> - daddr=0 tunnel mode is meaningless for generic ethernet tunneling,
> so I didn't implement that. Packets are just dropped on the floor
> if daddr==0 at the time of sending, which is the default mode for
> the etherip0 device.
>
> Issues and TODO:
> - Implement MULTICAST(daddr) tunnel mode, seems useful to have.
> - Perhaps we should always present a MTU=1500 device to the user and
> deal with fragmentation issues ourselves.
> - Don't take TTL of outer packet from inner packet.
> - Figure out what to do with DF.
> - Check whether ECN bits are correctly {en,de}capsulated.
> - Check out iffy-looking '2 * sizeof(struct etheriphdr)' construct
> (same problem in ip_gre.c?)
>
> Comments? I would like to see this upstream when the remaining issues
> have been sorted out.
Why do you add a single device when loading the module? Is this just
so you have something to hook the ioctl to?
My personal preference would be something where you do not automatically
create an etherip0, but would have an etherip-config tool or similar to
create/destroy interfaces.
Also, could you add an ioctl that allowed one to query whether or not
a particular device is an etherip device? I had always wished I had added
this earlier to the VLAN code :)
Thanks,
Ben
--
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc http://www.candelatech.com
* Re: [PATCH][RFC] etherip: Ethernet-in-IPv4 tunneling
2005-01-12 22:42 ` Ben Greear
@ 2005-01-12 22:48 ` Lennert Buytenhek
2005-01-12 23:11 ` Ben Greear
0 siblings, 1 reply; 30+ messages in thread
From: Lennert Buytenhek @ 2005-01-12 22:48 UTC (permalink / raw)
To: Ben Greear; +Cc: netdev, shemminger, rhousley, shollenbeck
On Wed, Jan 12, 2005 at 02:42:49PM -0800, Ben Greear wrote:
> Why do you add a single device when loading the module? Is this just
> so you have something to hook the ioctl to?
>
> My personal preference would be something where you do not automatically
> create an etherip0, but would have an etherip-config tool or similar to
> create/destroy interfaces.
I modelled this after the way ipip/gre/sit do things -- they create
tunl0/gre0/sit0, and have 'ip tunnel' send its ioctls to those devices.
I wouldn't mind changing this to another mechanism, but 1) we'd have to
agree on a mechanism, and 2) we'd have to change the other tunnel types
over to this mechanism as well.
> Also, could you add an ioctl that allowed one to query whether or not
> a particular device is an etherip device? I had always wished I had added
> this earlier to the VLAN code :)
Hmmm. Bridge devices don't have this either, do they? Can you name
an advantage of having this?
cheers,
Lennert
* Re: [PATCH][RFC] etherip: Ethernet-in-IPv4 tunneling
2005-01-12 22:48 ` Lennert Buytenhek
@ 2005-01-12 23:11 ` Ben Greear
2005-01-12 23:16 ` Lennert Buytenhek
0 siblings, 1 reply; 30+ messages in thread
From: Ben Greear @ 2005-01-12 23:11 UTC (permalink / raw)
To: Lennert Buytenhek; +Cc: netdev, shemminger, rhousley, shollenbeck
Lennert Buytenhek wrote:
> On Wed, Jan 12, 2005 at 02:42:49PM -0800, Ben Greear wrote:
>
>
>>Why do you add a single device when loading the module? Is this just
>>so you have something to hook the ioctl to?
>>
>>My personal preference would be something where you do not automatically
>>create an etherip0, but would have an etherip-config tool or similar to
>>create/destroy interfaces.
>
>
> I modelled this after the way ipip/gre/sit do things -- they create
> tunl0/gre0/sit0, and have 'ip tunnel' send its ioctls to those devices.
>
> I wouldn't mind changing this to another mechanism, but 1) we'd have to
> agree on a mechanism, and 2) we'd have to change the other tunnel types
> over to this mechanism as well.
Ok, it's not a big deal to me.
>>Also, could you add an ioctl that allowed one to query whether or not
>>a particular device is an etherip device? I had always wished I had added
>>this earlier to the VLAN code :)
>
>
> Hmmm. Bridge devices don't have this either, do they? Can you name
> an advantage of having this?
I got the request several times with regard to VLANs. Lots of people
(and applications) will want to know the interface type for various
reasons. If you don't give them a nice programmatic thing like an
IOCTL to call, they will undoubtedly start making assumptions based
off of the device name...
Ben
--
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc http://www.candelatech.com
* Re: [PATCH][RFC] etherip: Ethernet-in-IPv4 tunneling
2005-01-12 23:11 ` Ben Greear
@ 2005-01-12 23:16 ` Lennert Buytenhek
2005-01-12 23:43 ` Thomas Graf
2005-01-12 23:43 ` Ben Greear
0 siblings, 2 replies; 30+ messages in thread
From: Lennert Buytenhek @ 2005-01-12 23:16 UTC (permalink / raw)
To: Ben Greear; +Cc: netdev, shemminger, shollenbeck
On Wed, Jan 12, 2005 at 03:11:40PM -0800, Ben Greear wrote:
> >>Also, could you add an ioctl that allowed one to query whether or not
> >>a particular device is an etherip device? I had always wished I had added
> >>this earlier to the VLAN code :)
> >
> >Hmmm. Bridge devices don't have this either, do they? Can you name
> >an advantage of having this?
>
> I got the request several times with regard to VLANs. Lots of people
> (and applications) will want to know the interface type for various
> reasons. If you don't give them a nice programmatic thing like an
> IOCTL to call, they will undoubtedly start making assumptions based
> off of the device name...
Makes sense..
Unfortunately SIOCGETTUNNEL is (SIOCDEVPRIVATE + 3), otherwise we could
just say something like "If an ARPHRD_ETHER device supports SIOCGETTUNNEL,
it's an ether/ip tunnel."
Any better ideas? I hate adding more ioctls.
cheers,
Lennert
* Re: [PATCH][RFC] etherip: Ethernet-in-IPv4 tunneling
2005-01-12 23:16 ` Lennert Buytenhek
@ 2005-01-12 23:43 ` Thomas Graf
2005-01-13 0:18 ` Lennert Buytenhek
2005-01-12 23:43 ` Ben Greear
1 sibling, 1 reply; 30+ messages in thread
From: Thomas Graf @ 2005-01-12 23:43 UTC (permalink / raw)
To: Lennert Buytenhek; +Cc: Ben Greear, netdev, shemminger, shollenbeck
* Lennert Buytenhek <20050112231615.GF14280@xi.wantstofly.org> 2005-01-13 00:16
> On Wed, Jan 12, 2005 at 03:11:40PM -0800, Ben Greear wrote:
>
> > >>Also, could you add an ioctl that allowed one to query whether or not
> > >>a particular device is an etherip device? I had always wished I had added
> > >>this earlier to the VLAN code :)
> > >
> > >Hmmm. Bridge devices don't have this either, do they? Can you name
> > >an advantage of having this?
> >
> > I got the request several times with regard to VLANs. Lots of people
> > (and applications) will want to know the interface type for various
> > reasons. If you don't give them a nice programmatic thing like an
> > IOCTL to call, they will undoubtedly start making assumptions based
> > off of the device name...
>
> Makes sense..
>
> Unfortunately SIOCGETTUNNEL is (SIOCDEVPRIVATE + 3), otherwise we could
> just say something like "If an ARPHRD_ETHER device supports SIOCGETTUNNEL,
> it's an ether/ip tunnel."
>
> Any better ideas? I hate adding more ioctls.
I think it should go into ip_tunnel_parm, unfortunately there are no
unused fields. Maybe schedule this for 2.7 together with a clean up
of all the tunnels so they share redundant code? Lots of gre related
code and comments spread over non-gre tunnels that should go away.
* Re: [PATCH][RFC] etherip: Ethernet-in-IPv4 tunneling
2005-01-12 23:16 ` Lennert Buytenhek
2005-01-12 23:43 ` Thomas Graf
@ 2005-01-12 23:43 ` Ben Greear
1 sibling, 0 replies; 30+ messages in thread
From: Ben Greear @ 2005-01-12 23:43 UTC (permalink / raw)
To: Lennert Buytenhek; +Cc: netdev
Lennert Buytenhek wrote:
> On Wed, Jan 12, 2005 at 03:11:40PM -0800, Ben Greear wrote:
>
>
>>>>Also, could you add an ioctl that allowed one to query whether or not
>>>>a particular device is an etherip device? I had always wished I had added
>>>>this earlier to the VLAN code :)
>>>
>>>Hmmm. Bridge devices don't have this either, do they? Can you name
>>>an advantage of having this?
>>
>>I got the request several times with regard to VLANs. Lots of people
>>(and applications) will want to know the interface type for various
>>reasons. If you don't give them a nice programmatic thing like an
>>IOCTL to call, they will undoubtedly start making assumptions based
>>off of the device name...
>
>
> Makes sense..
>
> Unfortunately SIOCGETTUNNEL is (SIOCDEVPRIVATE + 3), otherwise we could
> just say something like "If an ARPHRD_ETHER device supports SIOCGETTUNNEL,
> it's an ether/ip tunnel."
>
> Any better ideas? I hate adding more ioctls.
How about adding one IOCTL that takes a small (naturally packed/aligned, fixed-size!)
struct that has within it an enumeration of specific commands and whatever fields
are needed for arguments. Then you only need to add a single IOCTL to the system,
and you can add more commands at will.
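
A rough sketch of what such a request struct and its dispatcher might look
like, written as plain userspace C so it can be compiled on its own. Every
name here (tunq_req, TUNQ_GET_TYPE, tunq_handle and so on) is hypothetical
and only illustrates the idea; nothing like this exists in the patch.

```c
#include <stdint.h>

/* Hypothetical sub-commands carried inside the single ioctl. */
enum tunq_cmd {
	TUNQ_GET_TYPE = 1,	/* ask what kind of device this is */
};

/* Hypothetical device types reported back to the caller. */
enum tunq_type {
	TUNQ_TYPE_NONE    = 0,
	TUNQ_TYPE_ETHERIP = 1,
};

/* Fixed-width fields only, ordered so there is no implicit padding and
 * the layout is the same for 32-bit and 64-bit callers. */
struct tunq_req {
	uint32_t cmd;		/* one of enum tunq_cmd */
	uint32_t result;	/* filled in by the handler */
	char     ifname[16];	/* interface name, NUL-terminated */
	uint64_t arg;		/* command-specific argument */
};

/* Compile-time check that the struct really is 32 bytes. */
typedef char tunq_req_size_check[(sizeof(struct tunq_req) == 32) ? 1 : -1];

/* Dispatch on the embedded command. Returns 0 on success and -1 for an
 * unknown command (a kernel handler would return an errno instead). */
static int tunq_handle(struct tunq_req *req, int dev_is_etherip)
{
	switch (req->cmd) {
	case TUNQ_GET_TYPE:
		req->result = dev_is_etherip ? TUNQ_TYPE_ETHERIP
					     : TUNQ_TYPE_NONE;
		return 0;
	default:
		return -1;
	}
}
```

New commands would then only need a new enum value and a new case in the
switch, not a new ioctl number.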
Ben
--
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc http://www.candelatech.com
* Re: [PATCH][RFC] etherip: Ethernet-in-IPv4 tunneling
2005-01-12 22:24 [PATCH][RFC] etherip: Ethernet-in-IPv4 tunneling Lennert Buytenhek
2005-01-12 22:42 ` Ben Greear
@ 2005-01-13 0:04 ` Stephen Hemminger
2005-01-13 0:29 ` Lennert Buytenhek
2005-01-13 7:49 ` Pekka Savola
2 siblings, 1 reply; 30+ messages in thread
From: Stephen Hemminger @ 2005-01-13 0:04 UTC (permalink / raw)
To: Lennert Buytenhek; +Cc: netdev, rhousley, shollenbeck
On Wed, 12 Jan 2005 23:24:37 +0100
Lennert Buytenhek <buytenh@wantstofly.org> wrote:
> Hi,
>
> After struggling with various userland VPN solutions for a while (and
> failing to make IPSEC tunnel mode do what I want), I decided to just
> implement ethernet-in-IP tunneling in the kernel and let IPSEC transport
> mode handle the rest.
>
> There appeared to be an RFC for ethernet-in-IP already, RFC 3378, so I
> just implemented that. It's very simple -- slap a 16-bit header (0x3000,
> which is 4 bits of etherip version number and 12 bits of padding) onto
> the beginning of the ethernet packet, and then wrap it in an IP packet.
>
> Below is what I came up with, against the latest Fedora Core 3 kernel,
> which is 2.6.10-something. It survives some fairly basic testing between
> a number of different machines, UP and SMP. (Corresponding iproute2
> patch is available from http://www.wantstofly.org/~buytenh/etherip/ )
>
Since it is an RFC, any chance of interoperability testing it with
something besides Linux on the other end?
--
Stephen Hemminger <shemminger@osdl.org>
* Re: [PATCH][RFC] etherip: Ethernet-in-IPv4 tunneling
2005-01-12 23:43 ` Thomas Graf
@ 2005-01-13 0:18 ` Lennert Buytenhek
2005-01-13 0:28 ` Thomas Graf
0 siblings, 1 reply; 30+ messages in thread
From: Lennert Buytenhek @ 2005-01-13 0:18 UTC (permalink / raw)
To: Thomas Graf; +Cc: Ben Greear, netdev, shemminger, shollenbeck
On Thu, Jan 13, 2005 at 12:43:44AM +0100, Thomas Graf wrote:
> > > >>Also, could you add an ioctl that allowed one to query whether or not
> > > >>a particular device is an etherip device? I had always wished I had added
> > > >>this earlier to the VLAN code :)
> > > >
> > > >Hmmm. Bridge devices don't have this either, do they? Can you name
> > > >an advantage of having this?
> > >
> > > I got the request several times with regard to VLANs. Lots of people
> > > (and applications) will want to know the interface type for various
> > > > reasons. If you don't give them a nice programmatic thing like an
> > > IOCTL to call, they will undoubtedly start making assumptions based
> > > off of the device name...
> >
> > Makes sense..
> >
> > Unfortunately SIOCGETTUNNEL is (SIOCDEVPRIVATE + 0), otherwise we could
> > just say something like "If an ARPHRD_ETHER device supports SIOCGETTUNNEL,
> > it's an ether/ip tunnel."
> >
> > Any better ideas? I hate adding more ioctls.
>
> I think it should go into ip_tunnel_param, unfortunately there are no
> unused fields.
What's that? I can't find it in my kernel tree nor on google.
> Maybe schedule this for 2.7 together with a clean up
> of all the tunnels so they share redundant code? Lots of gre related
> code and comments spread over non-gre tunnels that should go away.
Indeed, it would be good to have this cleaned up.
cheers,
Lennert
* Re: [PATCH][RFC] etherip: Ethernet-in-IPv4 tunneling
2005-01-13 0:18 ` Lennert Buytenhek
@ 2005-01-13 0:28 ` Thomas Graf
2005-01-13 0:36 ` Lennert Buytenhek
0 siblings, 1 reply; 30+ messages in thread
From: Thomas Graf @ 2005-01-13 0:28 UTC (permalink / raw)
To: Lennert Buytenhek; +Cc: Ben Greear, netdev, shemminger, shollenbeck
* Lennert Buytenhek <20050113001837.GH14280@xi.wantstofly.org> 2005-01-13 01:18
> What's that? I can't find it in my kernel tree nor on google.
Typo, sorry. I meant ip_tunnel_parm. Thinking of it, shouldn't protocol
in ip_tunnel_parm->iph->protocol be set to ETHER_IP so userspace could find out
this way?
* Re: [PATCH][RFC] etherip: Ethernet-in-IPv4 tunneling
2005-01-13 0:04 ` Stephen Hemminger
@ 2005-01-13 0:29 ` Lennert Buytenhek
0 siblings, 0 replies; 30+ messages in thread
From: Lennert Buytenhek @ 2005-01-13 0:29 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: netdev, shollenbeck
On Wed, Jan 12, 2005 at 04:04:55PM -0800, Stephen Hemminger wrote:
> > After struggling with various userland VPN solutions for a while (and
> > failing to make IPSEC tunnel mode do what I want), I decided to just
> > implement ethernet-in-IP tunneling in the kernel and let IPSEC transport
> > mode handle the rest.
> >
> > There appeared to be an RFC for ethernet-in-IP already, RFC 3378, so I
> > just implemented that. It's very simple -- slap a 16-bit header (0x3000,
> > which is 4 bits of etherip version number and 12 bits of padding) onto
> > the beginning of the ethernet packet, and then wrap it in an IP packet.
> >
> > Below is what I came up with, against the latest Fedora Core 3 kernel,
> > which is 2.6.10-something. It survives some fairly basic testing between
> > a number of different machines, UP and SMP. (Corresponding iproute2
> > patch is available from http://www.wantstofly.org/~buytenh/etherip/ )
>
> Since it is an RFC, any chance of interoperability testing it with
> something besides Linux on the other end?
Some googling suggests that OpenBSD implements this as well.
Anyone here with an OpenBSD box on the 'net that's willing to do
some tests?
cheers,
Lennert
* Re: [PATCH][RFC] etherip: Ethernet-in-IPv4 tunneling
2005-01-13 0:28 ` Thomas Graf
@ 2005-01-13 0:36 ` Lennert Buytenhek
2005-01-13 1:20 ` Thomas Graf
0 siblings, 1 reply; 30+ messages in thread
From: Lennert Buytenhek @ 2005-01-13 0:36 UTC (permalink / raw)
To: Thomas Graf; +Cc: Ben Greear, netdev, shemminger, shollenbeck
On Thu, Jan 13, 2005 at 01:28:06AM +0100, Thomas Graf wrote:
> > What's that? I can't find it in my kernel tree nor on google.
>
> Typo, sorry. I meant ip_tunnel_parm. Thinking of it, shouldn't protocol
> in ip_tunnel_parm->iph->protocol be set to ETHER_IP so userspace could
> find out this way?
ip_tunnel_parm->iph->protocol for ether/ip tunnels is IPPROTO_ETHERIP,
which is 97. So yeah, the info is in there. But this doesn't help you
much in determining whether an arbitrary network device is in fact an
ether/ip tunnel or not, since SIOCGETTUNNEL aliases with SIOCDEVPRIVATE.
The only way to make it stand out would be to give it its own ARPHRD_
type, but then it wouldn't look like an ethernet device anymore.
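
For reference, a rough userspace sketch of the query being discussed,
assuming the Linux uapi header <linux/if_tunnel.h> is available. The
helper names and the ETHERIP_PROTO macro are made up here. Note that it
has exactly the limitation described above: the answer only means
something if the caller already knows the device is a tunnel.

```c
#include <string.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <linux/if_tunnel.h>    /* SIOCGETTUNNEL, struct ip_tunnel_parm */

#define ETHERIP_PROTO 97        /* IP protocol number used for ether/ip */

/*
 * Ask 'ifname' for its tunnel parameters and return the outer IP
 * protocol number, or -1 if the request fails. On a non-tunnel driver
 * the same ioctl number may be interpreted as something else entirely.
 */
int tunnel_outer_protocol(const char *ifname)
{
    struct ip_tunnel_parm p;
    struct ifreq ifr;
    int fd, err;

    memset(&p, 0, sizeof(p));
    memset(&ifr, 0, sizeof(ifr));
    strncpy(ifr.ifr_name, ifname, IFNAMSIZ - 1);
    ifr.ifr_data = (void *)&p;

    fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1;
    err = ioctl(fd, SIOCGETTUNNEL, &ifr);
    close(fd);

    return err < 0 ? -1 : p.iph.protocol;
}

/* Non-zero if the device reports ether/ip as its outer protocol. */
int is_etherip_tunnel(const char *ifname)
{
    return tunnel_outer_protocol(ifname) == ETHERIP_PROTO;
}
```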
cheers,
Lennert
* Re: [PATCH][RFC] etherip: Ethernet-in-IPv4 tunneling
2005-01-13 0:36 ` Lennert Buytenhek
@ 2005-01-13 1:20 ` Thomas Graf
0 siblings, 0 replies; 30+ messages in thread
From: Thomas Graf @ 2005-01-13 1:20 UTC (permalink / raw)
To: Lennert Buytenhek; +Cc: Ben Greear, netdev, shemminger, shollenbeck
* Lennert Buytenhek <20050113003625.GJ14280@xi.wantstofly.org> 2005-01-13 01:36
> On Thu, Jan 13, 2005 at 01:28:06AM +0100, Thomas Graf wrote:
>
> > > What's that? I can't find it in my kernel tree nor on google.
> >
> > Typo, sorry. I meant ip_tunnel_parm. Thinking of it, shouldn't protocol
> > in ip_tunnel_parm->iph->protocol be set to ETHER_IP so userspace could
> > find out this way?
>
> ip_tunnel_parm->iph->protocol for ether/ip tunnels is IPPROTO_ETHERIP,
> which is 97. So yeah, the info is in there. But this doesn't help you
> much in determining whether an arbitrary network device is in fact an
> ether/ip tunnel or not, since SIOCGETTUNNEL aliases with SIOCDEVPRIVATE.
Ahh.. You want userspace to be able to tell even if it is unclear
whether it is a tunnel. I think the only way to avoid a new ioctl would
be to introduce IFF_TUNNEL or the like and have userspace call GETTUNNEL
if it is set. This also solves the problem for all other tunnels.
You can also reuse one of the unused flags such as IFF_NOTRAILERS
since it would be read-only anyway but I guess this would be too
much of a hack ;->
* Re: [PATCH][RFC] etherip: Ethernet-in-IPv4 tunneling
2005-01-12 22:24 [PATCH][RFC] etherip: Ethernet-in-IPv4 tunneling Lennert Buytenhek
2005-01-12 22:42 ` Ben Greear
2005-01-13 0:04 ` Stephen Hemminger
@ 2005-01-13 7:49 ` Pekka Savola
2005-01-13 9:23 ` Lennert Buytenhek
2 siblings, 1 reply; 30+ messages in thread
From: Pekka Savola @ 2005-01-13 7:49 UTC (permalink / raw)
To: Lennert Buytenhek; +Cc: netdev, shemminger, rhousley, shollenbeck
On Wed, 12 Jan 2005, Lennert Buytenhek wrote:
> After struggling with various userland VPN solutions for a while (and
> failing to make IPSEC tunnel mode do what I want), I decided to just
> implement ethernet-in-IP tunneling in the kernel and let IPSEC transport
> mode handle the rest.
>
> There appeared to be an RFC for ethernet-in-IP already, RFC 3378, so I
> just implemented that. It's very simple -- slap a 16-bit header (0x3000,
> which is 4 bits of etherip version number and 12 bits of padding) onto
> the beginning of the ethernet packet, and then wrap it in an IP packet.
EtherIP is not recommended for use; it's an Informational RFC.
Is there a particular reason why GRE tunnel is not sufficient?
AFAICS, it provides all the benefits of Ethernet-in-IP, is very
commonly deployed, and only has 4 bytes of overhead.
And if GRE is not sufficient, the next step would be L2TP.
--
Pekka Savola "You each name yourselves king, yet the
Netcore Oy kingdom bleeds."
Systems. Networks. Security. -- George R.R. Martin: A Clash of Kings
* Re: [PATCH][RFC] etherip: Ethernet-in-IPv4 tunneling
2005-01-13 7:49 ` Pekka Savola
@ 2005-01-13 9:23 ` Lennert Buytenhek
2005-01-16 17:37 ` jamal
0 siblings, 1 reply; 30+ messages in thread
From: Lennert Buytenhek @ 2005-01-13 9:23 UTC (permalink / raw)
To: Pekka Savola; +Cc: netdev, shemminger, shollenbeck
On Thu, Jan 13, 2005 at 09:49:55AM +0200, Pekka Savola wrote:
> Is there a particular reason why GRE tunnel is not sufficient?
No particular reason, apart from not being aware that GRE provides
this functionality.
cheers,
Lennert
* Re: [PATCH][RFC] etherip: Ethernet-in-IPv4 tunneling
2005-01-13 9:23 ` Lennert Buytenhek
@ 2005-01-16 17:37 ` jamal
2005-01-16 18:55 ` tunneling in linux (was: Re: [PATCH][RFC] etherip: Ethernet-in-IPv4 tunneling) Lennert Buytenhek
2005-01-16 19:02 ` [PATCH][RFC] etherip: Ethernet-in-IPv4 tunneling Lennert Buytenhek
0 siblings, 2 replies; 30+ messages in thread
From: jamal @ 2005-01-16 17:37 UTC (permalink / raw)
To: Lennert Buytenhek; +Cc: Pekka Savola, netdev, shemminger, shollenbeck
On Thu, 2005-01-13 at 04:23, Lennert Buytenhek wrote:
> On Thu, Jan 13, 2005 at 09:49:55AM +0200, Pekka Savola wrote:
>
> > Is there a particular reason why GRE tunnel is not sufficient?
>
> No particular reason, apart from not being aware that GRE provides
> this functionality.
True that GRE can do all this (and they have thought out well the
broadcasting etc) but i dont think it will harm to push this into the
kernel if some odd OS like openbsd supports it.
cheers,
jamal
* tunneling in linux (was: Re: [PATCH][RFC] etherip: Ethernet-in-IPv4 tunneling)
2005-01-16 17:37 ` jamal
@ 2005-01-16 18:55 ` Lennert Buytenhek
2005-01-16 19:51 ` Pekka Savola
2005-01-16 20:02 ` jamal
2005-01-16 19:02 ` [PATCH][RFC] etherip: Ethernet-in-IPv4 tunneling Lennert Buytenhek
1 sibling, 2 replies; 30+ messages in thread
From: Lennert Buytenhek @ 2005-01-16 18:55 UTC (permalink / raw)
To: netdev; +Cc: jamal, Pekka Savola, shemminger, shollenbeck
Summarising private thread on netdev.
The main issue under discussion is whether to implement ethernet-over-IP
tunneling by using etherip, or whether using something like GRE would be
better. The main argument against etherip is that its RFC has
'Informational' status (while the latest GRE RFC I could find, 2784, is
'Proposed Standard'.)
Another argument against etherip would be that OpenBSD apparently
mis-implemented etherip by putting the etherip version nibble in the
second nibble of the etherip header instead of the first, which would
probably prevent the linux and OpenBSD versions from interoperating,
negating the advantage of using etherip in the first place.
All I personally care about is that when I install a random linux distro
two years from now, ethernet-over-IP tunneling will simply work, using
whatever protocol -- I don't care about which.
Any opinions?
If we do end up using GRE for ethernet tunneling, there's some work that
needs to be done. For one, ip_gre in its current form would need a certain
amount of hacking for tunneling ethernet frames instead of IPv4/IPv6 as
it does now. We might as well rename it to plain 'gre' and move it out of
net/ipv4/ to net/core/ or something while we're at it.
The way we currently use (f.e. in iproute2) for finding out whether a
given netdevice is a tunnel or not is by looking at ARPHRD_*, but this
scheme breaks down for ethernet tunnels, since there is no other way of
distinguishing them from regular ethernet devices. We could issue
SIOCGETTUNNEL and see if that succeeds, but that unfortunately aliases
with SIOCDEVPRIVATE which aliases to BOND_ENSLAVE_OLD, SIOCGMSTATS,
EQL_ENSLAVE, FRAD_GET_CONF, SIOCDEVPLIP, SIOCGPPPSTATS and a million
others, so you never know if the netdevice really interpreted it as
SIOCGETTUNNEL or no.
Other things that suck about tunneling?
- If we're going to overhaul the way tunneling works, we should try to
remove the need for the gre0 interface as well.
- Tunneling over IPv6 should be implemented.
- How to share more code between sit/ipip/gre?
Ideas?
On Sun, Jan 16, 2005 at 12:37:00PM -0500, jamal wrote:
> > > Is there a particular reason why GRE tunnel is not sufficient?
> >
> > No particular reason, apart from not being aware that GRE provides
> > this functionality.
>
> True that GRE can do all this (and they have thought out well the
> broadcasting etc) but i dont think it will harm to push this into the
> kernel if some odd OS like openbsd supports it.
>
> cheers,
> jamal
>
>
* Re: [PATCH][RFC] etherip: Ethernet-in-IPv4 tunneling
2005-01-16 17:37 ` jamal
2005-01-16 18:55 ` tunneling in linux (was: Re: [PATCH][RFC] etherip: Ethernet-in-IPv4 tunneling) Lennert Buytenhek
@ 2005-01-16 19:02 ` Lennert Buytenhek
2005-01-16 20:05 ` jamal
1 sibling, 1 reply; 30+ messages in thread
From: Lennert Buytenhek @ 2005-01-16 19:02 UTC (permalink / raw)
To: jamal; +Cc: Pekka Savola, netdev, shemminger, shollenbeck
On Sun, Jan 16, 2005 at 12:37:00PM -0500, jamal wrote:
> > > Is there a particular reason why GRE tunnel is not sufficient?
> >
> > No particular reason, apart from not being aware that GRE provides
> > this functionality.
>
> True that GRE can do all this (and they have thought out well the
> broadcasting etc) but i dont think it will harm to push this into the
> kernel if some odd OS like openbsd supports it.
Apparently they mis-read the RFC and write the etherip header as 0x0300
instead of 0x3000 (they have the version nibble in the wrong place.) This
would likely prevent interoperability.
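
To make the difference concrete, here is a small sketch of writing and
strictly checking the 16-bit header as described at the top of this
thread (4 bits of version, 12 bits of padding). The function and macro
names are invented for this example and are not the ones from the patch:

```c
#include <stddef.h>
#include <stdint.h>

#define ETHERIP_VERSION 3   /* version carried in the top 4 bits */
#define ETHERIP_HLEN    2   /* the header is 16 bits in total */

/* Write the header in network byte order: 0x30 0x00. */
void etherip_put_header(uint8_t *buf)
{
    buf[0] = ETHERIP_VERSION << 4;  /* version nibble comes first */
    buf[1] = 0;                     /* remaining padding bits are zero */
}

/* Strict receive-side check: version 3 and all 12 padding bits zero. */
int etherip_header_ok(const uint8_t *buf, size_t len)
{
    if (len < ETHERIP_HLEN)
        return 0;
    return buf[0] == (ETHERIP_VERSION << 4) && buf[1] == 0;
}
```

A receiver that checks this strictly accepts the bytes 0x30 0x00 and
rejects 0x03 0x00, which is why the two variants would probably not
talk to each other.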
cheers,
Lennert
* Re: tunneling in linux (was: Re: [PATCH][RFC] etherip: Ethernet-in-IPv4 tunneling)
2005-01-16 18:55 ` tunneling in linux (was: Re: [PATCH][RFC] etherip: Ethernet-in-IPv4 tunneling) Lennert Buytenhek
@ 2005-01-16 19:51 ` Pekka Savola
2005-01-16 19:57 ` Lennert Buytenhek
2005-01-16 20:02 ` jamal
1 sibling, 1 reply; 30+ messages in thread
From: Pekka Savola @ 2005-01-16 19:51 UTC (permalink / raw)
To: Lennert Buytenhek; +Cc: netdev
On Sun, 16 Jan 2005, Lennert Buytenhek wrote:
> If we do end up using GRE for ethernet tunneling, there's some work that
> needs to be done. For one, ip_gre in its current form would need a certain
> amount of hacking for tunneling ethernet frames instead of IPv4/IPv6 as
> it does now. We might as well rename it to plain 'gre' and move it out of
> net/ipv4/ to net/core/ or something while we're at it.
Now that I think about this a bit more, there may be a potential
issue..
If the payload of GRE is an ethernet frame, which GRE 'Protocol Type'
(i.e., ethertype) would that be?
I doubt anyone has defined an ethertype which would be used to
transmit full ethernet frames inside an ethernet header, so this would
likely need to be some special value..
http://www.iana.org/assignments/ethernet-numbers
(Another solution for this problem space is obviously L2TP, which is
also rather widely supported..)
--
Pekka Savola "You each name yourselves king, yet the
Netcore Oy kingdom bleeds."
Systems. Networks. Security. -- George R.R. Martin: A Clash of Kings
* Re: tunneling in linux (was: Re: [PATCH][RFC] etherip: Ethernet-in-IPv4 tunneling)
2005-01-16 19:51 ` Pekka Savola
@ 2005-01-16 19:57 ` Lennert Buytenhek
2005-01-17 5:45 ` Pekka Savola
0 siblings, 1 reply; 30+ messages in thread
From: Lennert Buytenhek @ 2005-01-16 19:57 UTC (permalink / raw)
To: Pekka Savola; +Cc: netdev
On Sun, Jan 16, 2005 at 09:51:41PM +0200, Pekka Savola wrote:
> >If we do end up using GRE for ethernet tunneling, there's some work that
> >needs to be done. For one, ip_gre in its current form would need a certain
> >amount of hacking for tunneling ethernet frames instead of IPv4/IPv6 as
> >it does now. We might as well rename it to plain 'gre' and move it out of
> >net/ipv4/ to net/core/ or something while we're at it.
>
> Now that I think about this a bit more, there may be a potential
> issue..
>
> If the payload of GRE is an ethernet frame, which GRE 'Protocol Type'
> (i.e., ethertype) would that be?
I assumed that "Transparent Ethernet Bridging" would be used for that.
From ethernet-numbers:
  Ethernet          Exp. Ethernet    Description            References
  -------------     -------------    -----------            ----------
  decimal  Hex      decimal  octal
  [snip]
   25944   6558       -        -     Trans Ether Bridging   [RFC1701]
RFC1701 is the original GRE RFC, which mentions:
The following are currently assigned protocol types for GRE. Future
protocol types must be taken from DIX ethernet encoding. For
historical reasons, a number of other values have been used for some
protocols. The following table of values MUST be used to identify
the following protocols:
   Protocol Family                  PTYPE
   ---------------                  -----
   [snip]
   Transparent Ethernet Bridging    6558
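
As a sketch of what that would mean on the wire, assuming the basic
4-byte GRE header with no optional fields (all flag bits and the version
set to zero). The names below are invented for this example and are not
taken from ip_gre.c:

```c
#include <stddef.h>
#include <stdint.h>

#define GRE_PROTO_TEB   0x6558  /* Transparent Ethernet Bridging */
#define GRE_BASE_HLEN   4       /* flags/version (2) + protocol type (2) */

/*
 * Write a basic GRE header in front of an encapsulated ethernet frame.
 * Returns the number of header bytes written.
 */
size_t gre_put_teb_header(uint8_t *buf)
{
    buf[0] = 0;                     /* no checksum/key/sequence flags */
    buf[1] = 0;                     /* version 0 */
    buf[2] = GRE_PROTO_TEB >> 8;    /* protocol type, high byte */
    buf[3] = GRE_PROTO_TEB & 0xff;  /* protocol type, low byte */
    return GRE_BASE_HLEN;
}
```

The ethernet frame, starting with its destination MAC address, would
follow directly after these four bytes.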
--L
* Re: tunneling in linux (was: Re: [PATCH][RFC] etherip: Ethernet-in-IPv4 tunneling)
2005-01-16 18:55 ` tunneling in linux (was: Re: [PATCH][RFC] etherip: Ethernet-in-IPv4 tunneling) Lennert Buytenhek
2005-01-16 19:51 ` Pekka Savola
@ 2005-01-16 20:02 ` jamal
2005-01-16 20:20 ` Lennert Buytenhek
1 sibling, 1 reply; 30+ messages in thread
From: jamal @ 2005-01-16 20:02 UTC (permalink / raw)
To: Lennert Buytenhek; +Cc: netdev, Pekka Savola, shemminger, shollenbeck
On Sun, 2005-01-16 at 13:55, Lennert Buytenhek wrote:
[..]
> Another argument against etherip would be that OpenBSD apparently
> mis-implemented etherip by putting the etherip version nibble in the
> second nibble of the etherip header instead of the first, which would
> probably prevent the linux and OpenBSD versions from interoperating,
> negating the advantage of using etherip in the first place.
Should be pretty easy for them to fix, no?
> All I personally care about is that when I install a random linux distro
> two years from now, that ethernet-over-IP tunneling will simply work, using
> whatever protocol -- I don't care about which.
>
> Any opinions?
>
My opinion is it doesnt harm to have it in. BTW, in one of your emails i
noticed you cced the authors of that RFC - did they respond? Whats their
deployment experiences?
>
> If we do end up using GRE for ethernet tunneling, there's some work that
> needs to be done. For one, ip_gre in its current form would need a certain
> amount of hacking for tunneling ethernet frames instead of IPv4/IPv6 as
> it does now. We might as well rename it to plain 'gre' and move it out of
> net/ipv4/ to net/core/ or something while we're at it.
>
> The way we currently use (f.e. in iproute2) for finding out whether a
> given netdevice is a tunnel or not is by looking at ARPHRD_*, but this
> scheme breaks down for ethernet tunnels,
the dev->type is intended precisely for that. So if this needs a new
type then you should introduce a new ARPHRD type for it and set it at
device creation time.
> since there is no other way of
> distinguishing them from regular ethernet devices. We could issue
> SIOCGETTUNNEL and see if that succeeds, but that unfortunately aliases
> with SIOCDEVPRIVATE which aliases to BOND_ENSLAVE_OLD, SIOCGMSTATS,
> EQL_ENSLAVE, FRAD_GET_CONF, SIOCDEVPLIP, SIOCGPPPSTATS and a million
> others, so you never know if the netdevice really interpreted it as
> SIOCGETTUNNEL or no.
Introducing the new type should help. Also the iflink is typically set
to the mother netdevice. So that should go a long way to give you
details.
I am not sure about this ioctl stuff - but shouldn't there be a back way
via netlink for all these details?
> Other things that suck about tunneling?
> - If we're going to overhaul the way tunneling works, we should try to
> remove the need for the gre0 interface as well.
Why is this first instance needed? Its not like theres a bus that is
scanned at boot time and we need to create at that discovery time.
> - Tunneling over IPv6 should be implemented.
sit? or v6-v6?
> - How to share more code between sit/ipip/gre?
Lots of shareable stuff there.
BTW, have you looked at any of the L2VPN stuff? browse the ietf web
page. Some interesting stuff there.
cheers,
jamal
* Re: [PATCH][RFC] etherip: Ethernet-in-IPv4 tunneling
2005-01-16 19:02 ` [PATCH][RFC] etherip: Ethernet-in-IPv4 tunneling Lennert Buytenhek
@ 2005-01-16 20:05 ` jamal
2005-01-16 20:22 ` Lennert Buytenhek
0 siblings, 1 reply; 30+ messages in thread
From: jamal @ 2005-01-16 20:05 UTC (permalink / raw)
To: Lennert Buytenhek; +Cc: Pekka Savola, netdev, shemminger, shollenbeck
On Sun, 2005-01-16 at 14:02, Lennert Buytenhek wrote:
> Apparently they mis-read the RFC and write the etherip header as 0x0300
> instead of 0x3000 (they have the version nibble in the wrong place.) This
> would likely prevent interoperability.
Thinking about it a bit - you should be able to break yours to do 0x0300
to test with them ;->
cheers,
jamal
* Re: tunneling in linux (was: Re: [PATCH][RFC] etherip: Ethernet-in-IPv4 tunneling)
2005-01-16 20:02 ` jamal
@ 2005-01-16 20:20 ` Lennert Buytenhek
2005-01-16 20:37 ` Hasso Tepper
2005-01-16 23:09 ` jamal
0 siblings, 2 replies; 30+ messages in thread
From: Lennert Buytenhek @ 2005-01-16 20:20 UTC (permalink / raw)
To: jamal; +Cc: netdev, Pekka Savola, shemminger, shollenbeck
On Sun, Jan 16, 2005 at 03:02:26PM -0500, jamal wrote:
> > Another argument against etherip would be that OpenBSD apparently
> > mis-implemented etherip by putting the etherip version nibble in the
> > second nibble of the etherip header instead of the first, which would
> > probably prevent the linux and OpenBSD versions from interoperating,
> > negating the advantage of using etherip in the first place.
>
> Should be pretty easy for them to fix, no?
Sure, but, the argument in favor of etherip would be interoperability
with existing implementations. If at least OpenBSD gets it wrong (and
the proposed patch for NetBSD seems to also use the wrong value) then
this argument isn't all that strong anymore.
> > All I personally care about is that when I install a random linux distro
> > two years from now, that ethernet-over-IP tunneling will simply work, using
> > whatever protocol -- I don't care about which.
> >
> > Any opinions?
>
> My opinion is it doesnt harm to have it in.
We'd need to sort out the ARPHRD_* issue in any case -- see below.
> BTW, in one of your emails i
> noticed you cced the authors of that RFC - did they respond? Whats their
> deployment experiences?
The @rsa.. address bounces, the other address is still in the CC list
and I didn't hear from him yet :-)
> > If we do end up using GRE for ethernet tunneling, there's some work that
> > needs to be done. For one, ip_gre in its current form would need a certain
> > amount of hacking for tunneling ethernet frames instead of IPv4/IPv6 as
> > it does now. We might as well rename it to plain 'gre' and move it out of
> > net/ipv4/ to net/core/ or something while we're at it.
> >
> > The way we currently use (f.e. in iproute2) for finding out whether a
> > given netdevice is a tunnel or not is by looking at ARPHRD_*, but this
> > scheme breaks down for ethernet tunnels,
>
> the dev->type is intended precisely for that. So if this needs a new
> type then you should introduce a new ARPHRD type for it and set it at
> device creation time.
Bridges present themselves as ARPHRD_ETHER, even though they are not
'real' ethernet devices as such. Should they have their own type?
What about ethertap devices?
If we create ARPHRD_ETHERTUNNEL for ethernet tunnels instead of using
ARPHRD_ETHER, we'd have to make modifications all over the place to
teach other code that ARPHRD_ETHERTUNNEL is basically just another type
of ethernet device. For example, we'd have to modify net/bridge/*
because it will only enslave devices which are ARPHRD_ETHER. But even
more modifications would be needed in userland -- we'd have to adapt
ifconfig, ip route, etc.
The issue is "We can not blindly issue SIOCGETTUNNEL to ARPHRD_ETHER
devices like we do for ARPHRD_{TUNNEL,IPGRE,SIT} because on _ETHER, it
might alias with another ioctl (SIOCDEVPRIVATE)."
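
Roughly, the type-based check looks like the sketch below. This is an
illustration only, not the actual iproute2 code, and the function name
is made up. The ARPHRD_* constants come from <net/if_arp.h>:

```c
#include <net/if_arp.h>     /* ARPHRD_* */

/*
 * Device types for which SIOCGETTUNNEL is known to be understood.
 * ARPHRD_ETHER is deliberately absent: an ethernet-type tunnel cannot
 * be told apart from a NIC, a bridge or an ethertap device this way.
 */
int arphrd_is_ip_tunnel(int type)
{
    switch (type) {
    case ARPHRD_TUNNEL:     /* ipip */
    case ARPHRD_IPGRE:      /* gre */
    case ARPHRD_SIT:        /* sit */
        return 1;
    default:
        return 0;
    }
}
```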
> > since there is no other way of
> > distinguishing them from regular ethernet devices. We could issue
> > SIOCGETTUNNEL and see if that succeeds, but that unfortunately aliases
> > with SIOCDEVPRIVATE which aliases to BOND_ENSLAVE_OLD, SIOCGMSTATS,
> > EQL_ENSLAVE, FRAD_GET_CONF, SIOCDEVPLIP, SIOCGPPPSTATS and a million
> > others, so you never know if the netdevice really interpreted it as
> > SIOCGETTUNNEL or no.
>
> Introducing the new type should help. Also the iflink is typically set
> to the mother netdevice. So that should go a long way to give you
> details.
There is no 'mother' device for a tunnel -- there might be no route to
the remote host at creation time, and that route might change over time
due to routing changes too.
> > Other things that suck about tunneling?
> > - If we're going to overhaul the way tunneling works, we should try to
> > remove the need for the gre0 interface as well.
>
> Why is this first instance needed? Its not like theres a bus that is
> scanned at boot time and we need to create at that discovery time.
This first instance is used for sending ioctls to create subsequent
tunnel devices. Doing something netlink-based would be fine with me.
> > - Tunneling over IPv6 should be implemented.
>
> sit? or v6-v6?
Basically, 'ip tunnel' should accept v6 addresses for 'local' and
'remote'. Having GRE-over-v6 would then imply v6-over-v6, v4-over-v6,
ethernet-over-v6, etc.
> BTW, have you looked at any of the L2VPN stuff? browse the ietf web
> page. Some interesting stuff there.
The L2TP RFC (RFC 2661, is that the same thing?) is 170kB, which scared
me off somewhat.
At least the GRE RFC fits in 16kB.
--L
* Re: [PATCH][RFC] etherip: Ethernet-in-IPv4 tunneling
2005-01-16 20:05 ` jamal
@ 2005-01-16 20:22 ` Lennert Buytenhek
0 siblings, 0 replies; 30+ messages in thread
From: Lennert Buytenhek @ 2005-01-16 20:22 UTC (permalink / raw)
To: jamal; +Cc: Pekka Savola, netdev, shemminger, shollenbeck
On Sun, Jan 16, 2005 at 03:05:06PM -0500, jamal wrote:
> > Apparently they mis-read the RFC and write the etherip header as 0x0300
> > instead of 0x3000 (they have the version nibble in the wrong place.) This
> > would likely prevent interoperability.
>
> Thinking about it a bit - you should be able to break yours to do 0x0300
> to test with them ;->
Etherip is a really simple standard, the header is only 16 bits which
can only assume a single value (0x3000). Looking at encapsulated packets
with tcpdump makes me 99.99% confident that it's all okay on that level.
--L
* Re: tunneling in linux (was: Re: [PATCH][RFC] etherip: Ethernet-in-IPv4 tunneling)
2005-01-16 20:20 ` Lennert Buytenhek
@ 2005-01-16 20:37 ` Hasso Tepper
2005-01-16 21:21 ` Lennert Buytenhek
2005-01-16 23:09 ` jamal
1 sibling, 1 reply; 30+ messages in thread
From: Hasso Tepper @ 2005-01-16 20:37 UTC (permalink / raw)
To: Lennert Buytenhek; +Cc: jamal, netdev, Pekka Savola, shemminger, shollenbeck
Lennert Buytenhek wrote:
> On Sun, Jan 16, 2005 at 03:02:26PM -0500, jamal wrote:
> > BTW, have you looked at any of the L2VPN stuff? browse the ietf web
> > page. Some interesting stuff there.
>
> The L2TP RFC (RFC 2661, is that the same thing?) is 170kB, which scared
> me off somewhat.
If L2TP, then v3 probably - RFC3931. But that's far from just ethernet over
ip, it's anything over ip.
--
Hasso Tepper
Elion Enterprises Ltd.
WAN administrator
* Re: tunneling in linux (was: Re: [PATCH][RFC] etherip: Ethernet-in-IPv4 tunneling)
2005-01-16 20:37 ` Hasso Tepper
@ 2005-01-16 21:21 ` Lennert Buytenhek
2005-01-16 21:32 ` Hasso Tepper
0 siblings, 1 reply; 30+ messages in thread
From: Lennert Buytenhek @ 2005-01-16 21:21 UTC (permalink / raw)
To: Hasso Tepper; +Cc: jamal, netdev, Pekka Savola, shemminger, shollenbeck
On Sun, Jan 16, 2005 at 10:37:33PM +0200, Hasso Tepper wrote:
> > > BTW, have you looked at any of the L2VPN stuff? browse the ietf web
> > > page. Some interesting stuff there.
> >
> > The L2TP RFC (RFC 2661, is that the same thing?) is 170kB, which scared
> > me off somewhat.
>
> If L2TP, then v3 probably - RFC3931.
That RFC appears not to exist.
> But that's far from just ethernet over ip, it's anything over ip.
Yeah, I really doubt that L2TP is what I'm after.
--L
* Re: tunneling in linux (was: Re: [PATCH][RFC] etherip: Ethernet-in-IPv4 tunneling)
2005-01-16 21:21 ` Lennert Buytenhek
@ 2005-01-16 21:32 ` Hasso Tepper
2005-01-16 21:44 ` Lennert Buytenhek
0 siblings, 1 reply; 30+ messages in thread
From: Hasso Tepper @ 2005-01-16 21:32 UTC (permalink / raw)
To: Lennert Buytenhek; +Cc: jamal, netdev, Pekka Savola, shemminger, shollenbeck
Lennert Buytenhek wrote:
> On Sun, Jan 16, 2005 at 10:37:33PM +0200, Hasso Tepper wrote:
> > > > BTW, have you looked at any of the L2VPN stuff? browse the ietf web
> > > > page. Some interesting stuff there.
> > >
> > > The L2TP RFC (RFC 2661, is that the same thing?) is 170kB, which
> > > scared me off somewhat.
> >
> > If L2TP, then v3 probably - RFC3931.
>
> That RFC appears not to exist.
Hmmm. Yes, it seems that there was a decision for yet another draft before
3931 was made official. Some sites contain info about it though. So the
official one at the moment is:
http://www.ietf.org/internet-drafts/draft-ietf-l2tpext-l2tp-base-15.txt
--
Hasso Tepper
Elion Enterprises Ltd.
WAN administrator
* Re: tunneling in linux (was: Re: [PATCH][RFC] etherip: Ethernet-in-IPv4 tunneling)
2005-01-16 21:32 ` Hasso Tepper
@ 2005-01-16 21:44 ` Lennert Buytenhek
0 siblings, 0 replies; 30+ messages in thread
From: Lennert Buytenhek @ 2005-01-16 21:44 UTC (permalink / raw)
To: Hasso Tepper; +Cc: jamal, netdev, Pekka Savola, shemminger, shollenbeck
On Sun, Jan 16, 2005 at 11:32:38PM +0200, Hasso Tepper wrote:
> > > > > BTW, have you looked at any of the L2VPN stuff? browse the ietf web
> > > > > page. Some interesting stuff there.
> > > >
> > > > The L2TP RFC (RFC 2661, is that the same thing?) is 170kB, which
> > > > scared me off somewhat.
> > >
> > > If L2TP, then v3 probably - RFC3931.
> >
> > That RFC appears not to exist.
>
> Hmmm. Yes, seems that there was decision for yet another draft before 3931
> was made official. Some sites contain info about it though. So official one
> at the moment:
>
> http://www.ietf.org/internet-drafts/draft-ietf-l2tpext-l2tp-base-15.txt
Things like this make me shiver:
<quote>
Appendix A: Control Slow Start and Congestion Avoidance
Although each side has indicated the maximum size of its receive
window, it is recommended that a slow start and congestion avoidance
method be used to transmit control packets. The methods described
here are based upon the TCP congestion avoidance algorithm as
described in Section 21.6 of TCP/IP Illustrated, Volume I, by W.
Richard Stevens [STEVENS] (this algorithm is also described in
[RFC2581]).
[snip]
</quote>
For some reason, people really really like reinventing TCP.
I'm definitely not going to implement this; all I wanted from the very
beginning was a _simple_ way of tunneling ethernet packets. :-)
--L
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: tunneling in linux (was: Re: [PATCH][RFC] etherip: Ethernet-in-IPv4 tunneling)
2005-01-16 20:20 ` Lennert Buytenhek
2005-01-16 20:37 ` Hasso Tepper
@ 2005-01-16 23:09 ` jamal
1 sibling, 0 replies; 30+ messages in thread
From: jamal @ 2005-01-16 23:09 UTC (permalink / raw)
To: Lennert Buytenhek; +Cc: netdev, Pekka Savola, shemminger, shollenbeck
On Sun, 2005-01-16 at 15:20, Lennert Buytenhek wrote:
> On Sun, Jan 16, 2005 at 03:02:26PM -0500, jamal wrote:
> Sure, but, the argument in favor of etherip would be interoperability
> with existing implementations. If at least OpenBSD gets it wrong (and
> the proposed patch for NetBSD seems to also use the wrong value) then
> this argument isn't all that strong anymore.
>
Is 0x300 owned by anything? Maybe that's what it should be from now on,
with an update issued against the RFC ;->
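For reference, the two values being compared differ only in which nibble
carries the version number. What follows is a minimal sketch, assuming the
RFC 3378 layout described at the top of the thread (4 bits of version, 12
bits of padding, sent in network byte order) and assuming the "wrong" value
is the same version number landing in the low nibble of the first byte. All
function and macro names are made up for illustration and come from neither
the patch nor any BSD implementation.

```c
#include <stdint.h>

#define ETHERIP_VERSION 3   /* version number carried in the 4-bit field */

/* Build the 16-bit header as a host-order value: version in the top 4 bits,
 * the remaining 12 bits are padding and stay zero. */
static uint16_t etherip_hdr(void)
{
    return (uint16_t)(ETHERIP_VERSION << 12);   /* 0x3000 */
}

/* Write the header to the wire in network byte order (big endian). */
static void etherip_put_hdr(uint8_t out[2])
{
    uint16_t h = etherip_hdr();

    out[0] = (uint8_t)(h >> 8);
    out[1] = (uint8_t)(h & 0xff);
}

/* Strict check: version 3 in the high nibble, all padding bits zero. */
static int etherip_hdr_ok(const uint8_t in[2])
{
    return in[0] == 0x30 && in[1] == 0x00;
}

/* Hypothetical lenient check: also accept 0x0300, i.e. the version in the
 * low nibble of the first byte (assumed here to be the "wrong" value). */
static int etherip_hdr_ok_lenient(const uint8_t in[2])
{
    return etherip_hdr_ok(in) || (in[0] == 0x03 && in[1] == 0x00);
}
```

A receiver that wants to interoperate with both encodings could use the
lenient check on input while still sending 0x3000; whether that is
desirable is exactly the open question here.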
> > BTW, in one of your emails i
> > noticed you cced the authors of that RFC - did they respond? Whats their
> > deployment experiences?
>
> The @rsa.. address bounces, the other address is still in the CC list
> and I didn't hear from him yet :-)
Hopefully we'll hear from them
> > the dev->type is intended precisely for that. So if this needs a new
> > type then you should introduce a new ARPHRD type for it and set it at
> > device creation time.
>
> Bridges present themselves as ARPHRD_ETHER, even though they are not
> 'real' ethernet devices as such. Should they have their own type?
Some devices are "ethernet like", meaning they operate on ethernet
headers, so it's fine to use that. I would think bridges would fall under
that category.
> What about ethertap devices?
They are ethernet-like as well. OTOH, if you look at tuntap you may find
one of the modes is not ethernet-like. In fact it doesn't have a header at
all ;-> (ARPHRD_NONE). Rule of thumb: what's the header like?
> If we create ARPHRD_ETHERTUNNEL for ethernet tunnels instead of using
> ARPHRD_ETHER, we'd have to make modifications all over the place to
> teach other code that ARPHRD_ETHERTUNNEL is basically just another type
> of ethernet device.
IANA actually keeps track of these iftype numbers. They don't seem to
match with Linux; I wonder why we haven't heard some SNMP weenie
bitching about this:
http://www.iana.org/assignments/ianaiftype-mib
So, yes, you should be able to add ARPHRD_ETHERTUNNEL at some free number,
and I am pretty sure you'd hear from them someday ;->
If I have time I will sneak a look at net-snmp.
> For example, we'd have to modify net/bridge/*
> because it will only enslave devices which are ARPHRD_ETHER. But even
> more modifications would be needed in userland -- we'd have to adapt
> ifconfig, ip route, etc.
You probably want net/bridge to enslave _only_ ethernet-like
netdevices. Otherwise it gets confused trying to see where the fsck the dst
MAC is to be found.
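The "what's the header like?" rule and the bridge restriction can be
sketched together. This is a standalone illustration, not code from
net/bridge: the two type values are the ones Linux defines in
<linux/if_arp.h>, redefined here with an X_ prefix so the sketch compiles
anywhere, and the helper names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Same numeric values as ARPHRD_ETHER and ARPHRD_NONE in <linux/if_arp.h>. */
enum {
    X_ARPHRD_ETHER = 1,
    X_ARPHRD_NONE  = 0xFFFE,
};

#define X_ETH_HLEN 14   /* dst MAC (6) + src MAC (6) + ethertype (2) */

/* Only ethernet-like devices put an ethernet header in front of the payload. */
static int dev_is_ether_like(unsigned int type)
{
    return type == X_ARPHRD_ETHER;
}

/* Return a pointer to the destination MAC (the first 6 bytes of the frame),
 * or NULL when the device type has no ethernet header or the frame is too
 * short to contain one. */
static const uint8_t *frame_dst_mac(unsigned int type,
                                    const uint8_t *frame, size_t len)
{
    if (!dev_is_ether_like(type) || len < X_ETH_HLEN)
        return NULL;
    return frame;
}
```

With a header-less device (ARPHRD_NONE) the first bytes of the packet are
payload, so a bridge reading them as a MAC address would forward on
garbage; refusing such devices up front avoids that.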
> The issue is "We can not blindly issue SIOCGETTUNNEL to ARPHRD_ETHER
> devices like we do for ARPHRD_{TUNNEL,IPGRE,SIT} because on _ETHER, it
> might alias with another ioctl (SIOCDEVPRIVATE)."
I don't think it will be an issue with proper dev->type.
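The aliasing problem and the dev->type answer can be shown in a few lines.
A minimal sketch, assuming the numeric values from Linux's
<linux/sockios.h>, <linux/if_tunnel.h> and <linux/if_arp.h> (redefined
with an X_ prefix to keep it standalone); the helper name is hypothetical
and is not how iproute2 does it.

```c
/* SIOCGETTUNNEL is defined as the first device-private ioctl number, which
 * is why it can collide with whatever an arbitrary driver put there. */
#define X_SIOCDEVPRIVATE 0x89F0
#define X_SIOCGETTUNNEL  (X_SIOCDEVPRIVATE + 0)

/* Same numeric values as the ARPHRD_* constants in <linux/if_arp.h>. */
enum {
    X_ARPHRD_ETHER  = 1,
    X_ARPHRD_TUNNEL = 768,
    X_ARPHRD_SIT    = 776,
    X_ARPHRD_IPGRE  = 778,
};

/* Hypothetical userland guard: only send SIOCGETTUNNEL to device types that
 * are known to interpret it as a tunnel ioctl. */
static int tunnel_ioctl_is_safe(unsigned int type)
{
    switch (type) {
    case X_ARPHRD_TUNNEL:
    case X_ARPHRD_SIT:
    case X_ARPHRD_IPGRE:
        /* A dedicated type for ethernet tunnels would be listed here too. */
        return 1;
    default:
        /* Includes ARPHRD_ETHER: a real ethernet driver may use the same
         * private ioctl number for something unrelated. */
        return 0;
    }
}
```

This is the trade-off under discussion: a distinct type makes the guard
trivial, at the cost of teaching every consumer of ARPHRD_ETHER about the
new value.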
You know there's another way to do this:
write this as a simple action. If all you are doing is slapping headers
back and forth then you want:

ingress:
    match protocol 0x3000
    action: etherip decap, format --> netif_rx
egress (outgoing):
    match someIP
    action: etherip encap, format --> out we go

All very simple to control with netlink. And it would be an excuse to
finally get you to look at this stuff ;->
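The "slapping headers back and forth" part really is small. Below is a
userspace sketch of just the EtherIP shim handling, assuming the 0x3000
header from the top of the thread. It deliberately leaves out the outer IP
header, the match step and the netif_rx hand-off, and the function names
are hypothetical rather than an existing action API.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ETHERIP_HLEN 2   /* the 16-bit EtherIP header */

/* Egress: copy the ethernet frame into 'out' with the EtherIP header in
 * front. Returns the number of bytes written, or 0 if 'out' is too small. */
static size_t etherip_encap(const uint8_t *frame, size_t len,
                            uint8_t *out, size_t outlen)
{
    if (outlen < ETHERIP_HLEN || len > outlen - ETHERIP_HLEN)
        return 0;
    out[0] = 0x30;   /* version 3 in the high nibble */
    out[1] = 0x00;   /* padding */
    memcpy(out + ETHERIP_HLEN, frame, len);
    return len + ETHERIP_HLEN;
}

/* Ingress: validate the header and return a pointer to the inner ethernet
 * frame, storing its length in *len. Returns NULL for short packets or an
 * unexpected header value, in which case *len is left untouched. */
static const uint8_t *etherip_decap(const uint8_t *pkt, size_t pktlen,
                                    size_t *len)
{
    if (pktlen < ETHERIP_HLEN || pkt[0] != 0x30 || pkt[1] != 0x00)
        return NULL;
    *len = pktlen - ETHERIP_HLEN;
    return pkt + ETHERIP_HLEN;
}
```

In the action model sketched above, the decap result would be what gets
handed to netif_rx, and the encap output would be what gets wrapped in the
outer IP header on the way out.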
> > Introducing the new type should help. Also the iflink is typically set
> > to the mother netdevice. So that should go a long way to give you
> > details.
>
> There is no 'mother' device for a tunnel -- there might be no route to
> the remote host at creation time, and that route might change over time
> due to routing changes too.
>
I thought you should be able to specify _exactly_ what the physical
device is ("ip tunnel ... dev eth0" or something along those lines). I
could be wrong.
> > Why is this first instance needed? Its not like theres a bus that is
> > scanned at boot time and we need to create at that discovery time.
>
> This first instance is used for sending ioctls to to create subsequent
> tunnel devices. Doing something netlink-based would be fine with me.
>
Yes, that would kill the need for those devices. 2.6.x should handle all
of that fine with no need for ioctls, though backward compat
may be an issue.
>
> > > - Tunneling over IPv6 should be implemented.
> >
> > sit? or v6-v6?
>
> Basically, 'ip tunnel' should accept v6 addresses for 'local' and
> 'remote'. Having GRE-over-v6 would then imply v6-over-v6, v4-over-v6,
> ethernet-over-v6, etc.
>
At the moment you specify the mode; is that the only painful thing?
Didn't quite follow.
>
> > BTW, have you looked at any of the L2VPN stuff? browse the ietf web
> > page. Some interesting stuff there.
>
> The L2TP RFC (RFC 2661, is that the same thing?) is 170kB, which scared
> me off somewhat.
>
> At least the GRE RFC fits in 16kB.
No, I meant L2VPNs: purely layer 2, and then somewhere along the path
carried by MPLS.
start here: http://www.ietf.org/html.charters/l2vpn-charter.html
cheers,
jamal
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: tunneling in linux (was: Re: [PATCH][RFC] etherip: Ethernet-in-IPv4 tunneling)
2005-01-16 19:57 ` Lennert Buytenhek
@ 2005-01-17 5:45 ` Pekka Savola
0 siblings, 0 replies; 30+ messages in thread
From: Pekka Savola @ 2005-01-17 5:45 UTC (permalink / raw)
To: Lennert Buytenhek; +Cc: netdev
On Sun, 16 Jan 2005, Lennert Buytenhek wrote:
> I assumed that "Transparent Ethernet Bridging" would be used for that.
>
> 25944 6558 - - Trans Ethen Bridging [RFC1701]
Yes. Sorry I missed that.
--
Pekka Savola "You each name yourselves king, yet the
Netcore Oy kingdom bleeds."
Systems. Networks. Security. -- George R.R. Martin: A Clash of Kings
^ permalink raw reply [flat|nested] 30+ messages in thread
end of thread, other threads:[~2005-01-17 5:45 UTC | newest]
Thread overview: 30+ messages
2005-01-12 22:24 [PATCH][RFC] etherip: Ethernet-in-IPv4 tunneling Lennert Buytenhek
2005-01-12 22:42 ` Ben Greear
2005-01-12 22:48 ` Lennert Buytenhek
2005-01-12 23:11 ` Ben Greear
2005-01-12 23:16 ` Lennert Buytenhek
2005-01-12 23:43 ` Thomas Graf
2005-01-13 0:18 ` Lennert Buytenhek
2005-01-13 0:28 ` Thomas Graf
2005-01-13 0:36 ` Lennert Buytenhek
2005-01-13 1:20 ` Thomas Graf
2005-01-12 23:43 ` Ben Greear
2005-01-13 0:04 ` Stephen Hemminger
2005-01-13 0:29 ` Lennert Buytenhek
2005-01-13 7:49 ` Pekka Savola
2005-01-13 9:23 ` Lennert Buytenhek
2005-01-16 17:37 ` jamal
2005-01-16 18:55 ` tunneling in linux (was: Re: [PATCH][RFC] etherip: Ethernet-in-IPv4 tunneling) Lennert Buytenhek
2005-01-16 19:51 ` Pekka Savola
2005-01-16 19:57 ` Lennert Buytenhek
2005-01-17 5:45 ` Pekka Savola
2005-01-16 20:02 ` jamal
2005-01-16 20:20 ` Lennert Buytenhek
2005-01-16 20:37 ` Hasso Tepper
2005-01-16 21:21 ` Lennert Buytenhek
2005-01-16 21:32 ` Hasso Tepper
2005-01-16 21:44 ` Lennert Buytenhek
2005-01-16 23:09 ` jamal
2005-01-16 19:02 ` [PATCH][RFC] etherip: Ethernet-in-IPv4 tunneling Lennert Buytenhek
2005-01-16 20:05 ` jamal
2005-01-16 20:22 ` Lennert Buytenhek