netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC PATCH 00/04]: VLAN vs. packet socket inconsistencies
@ 2008-07-08 10:14 Patrick McHardy
  2008-07-08 10:15 ` [RFC PATCH 01/04]: vlan: Don't store VLAN tag in cb Patrick McHardy
                   ` (4 more replies)
  0 siblings, 5 replies; 11+ messages in thread
From: Patrick McHardy @ 2008-07-08 10:14 UTC (permalink / raw)
  To: Linux Netdev List

There are currently a number of inconsistencies when using
VLANs with VLAN acceleration and packet sockets:

1) With hardware tagging, outgoing packets are visible without
    the VLAN header, while with software tagging the full VLAN
    header is visible

2) With hardware stripping, incoming packets for locally configured
    VLANs appear on the VLAN device without being visible on the
    underlying device. Packets for unknown VLANs are not visible
    at all. Without hardware stripping, all VLAN packets are visible
    on the underlying device.

3) With hardware filtering, some drivers disable filtering in
    promiscous mode, some don't. The consensus of some discussions
    on netdev about this was that promiscous mode should disable
    all hardware filters, but keep HW acceleration enabled.

These patches fix point 1) and 2). Point 3) needs changes to the
drivers and will be fixed separetly.

The first patch moves the VLAN tag from the cb to a new skb
member, which is necessary to avoid clashes between VLANs and
qdiscs/packet sockets use of the cb. This means on TX, packet
sockets can get the tag from the skb.

The second patch uses the new skb field to store the VLAN TCI of
stripped VLAN packets on RX and deliver them to packet sockets.

The third patch adds the VLAN TCI to the packet auxdata and delivers
it to userspace, where it can be used to reconstruct the original
packet.

The fourth patch modifies libpcap to reconstruct the original
packet. There might be a better way to do this, but it was
useful for testing.

There are mainly two remaining problems with this approach:

- socket filters for VLAN headers don't work properly since they
   expect the VLAN header to be present. Since with the approach
   taken by these patches, userspace has to have knowledge about
   VLAN acceleration anyway, it seems OK to simply add a filter
   instruction to filter on skb->vlan_tci and have userspace
   construct its filters accordingly.

- packet socket auxdata is only available for non-mmaped sockets.
   For mmaped sockets the only place to store the data is in
   struct tpacket_hdr, but that would break compatibility. Not sure
   what to do about this case.

Comments welcome.


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [RFC PATCH 01/04]: vlan: Don't store VLAN tag in cb
  2008-07-08 10:14 [RFC PATCH 00/04]: VLAN vs. packet socket inconsistencies Patrick McHardy
@ 2008-07-08 10:15 ` Patrick McHardy
  2008-07-08 10:16 ` [RFC PATCH 02/04]: vlan: deliver packets received with VLAN acceleration to network taps Patrick McHardy
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 11+ messages in thread
From: Patrick McHardy @ 2008-07-08 10:15 UTC (permalink / raw)
  To: Linux Netdev List

[-- Attachment #1: Type: text/plain, Size: 0 bytes --]



[-- Attachment #2: 01.diff --]
[-- Type: text/x-diff, Size: 5020 bytes --]

commit 73b938d774bb28b2f6548de8abed598505d8f63f
Author: Patrick McHardy <kaber@trash.net>
Date:   Tue Jul 8 11:38:42 2008 +0200

    vlan: Don't store VLAN tag in cb
    
    Use a real skb member to store the skb to avoid clashes with qdiscs,
    which are allowed to use the cb area themselves. As currently only real
    devices that consume the skb set the NETIF_F_HW_VLAN_TX flag, no explicit
    invalidation is neccessary.
    
    The new member fills a hole on 64 bit, the skb layout changes from:
    
            __u32                      mark;                 /*   172     4 */
            sk_buff_data_t             transport_header;     /*   176     4 */
            sk_buff_data_t             network_header;       /*   180     4 */
            sk_buff_data_t             mac_header;           /*   184     4 */
            sk_buff_data_t             tail;                 /*   188     4 */
            /* --- cacheline 3 boundary (192 bytes) --- */
            sk_buff_data_t             end;                  /*   192     4 */
    
            /* XXX 4 bytes hole, try to pack */
    
    to
    
            __u32                      mark;                 /*   172     4 */
            __u16                      vlan_tci;             /*   176     2 */
    
            /* XXX 2 bytes hole, try to pack */
    
            sk_buff_data_t             transport_header;     /*   180     4 */
            sk_buff_data_t             network_header;       /*   184     4 */
    
    Signed-off-by: Patrick McHardy <kaber@trash.net>

diff --git a/include/linux/if_vlan.h b/include/linux/if_vlan.h
index d36515d..e8360a2 100644
--- a/include/linux/if_vlan.h
+++ b/include/linux/if_vlan.h
@@ -105,17 +105,8 @@ static inline void vlan_group_set_device(struct vlan_group *vg,
 	array[vlan_id % VLAN_GROUP_ARRAY_PART_LEN] = dev;
 }
 
-/* VLAN tx hw acceleration helpers. */
-struct vlan_skb_tx_cookie {
-	u32	magic;
-	u32	vlan_tag;
-};
-
-#define VLAN_TX_COOKIE_MAGIC	0x564c414e	/* "VLAN" in ascii. */
-#define VLAN_TX_SKB_CB(__skb)	((struct vlan_skb_tx_cookie *)&((__skb)->cb[0]))
-#define vlan_tx_tag_present(__skb) \
-	(VLAN_TX_SKB_CB(__skb)->magic == VLAN_TX_COOKIE_MAGIC)
-#define vlan_tx_tag_get(__skb)	(VLAN_TX_SKB_CB(__skb)->vlan_tag)
+#define vlan_tx_tag_present(__skb)	((__skb)->vlan_tci)
+#define vlan_tx_tag_get(__skb)		((__skb)->vlan_tci)
 
 #if defined(CONFIG_VLAN_8021Q) || defined(CONFIG_VLAN_8021Q_MODULE)
 extern struct net_device *vlan_dev_real_dev(const struct net_device *dev);
@@ -222,17 +213,12 @@ static inline struct sk_buff *__vlan_put_tag(struct sk_buff *skb, u16 vlan_tci)
  * @skb: skbuff to tag
  * @vlan_tci: VLAN TCI to insert
  *
- * Puts the VLAN TCI in @skb->cb[] and lets the device do the rest
+ * Puts the VLAN TCI in @skb->vlan_tci and lets the device do the rest
  */
 static inline struct sk_buff *__vlan_hwaccel_put_tag(struct sk_buff *skb,
 						     u16 vlan_tci)
 {
-	struct vlan_skb_tx_cookie *cookie;
-
-	cookie = VLAN_TX_SKB_CB(skb);
-	cookie->magic = VLAN_TX_COOKIE_MAGIC;
-	cookie->vlan_tag = vlan_tci;
-
+	skb->vlan_tci = vlan_tci;
 	return skb;
 }
 
@@ -279,16 +265,13 @@ static inline int __vlan_get_tag(const struct sk_buff *skb, u16 *vlan_tci)
  * @skb: skbuff to query
  * @vlan_tci: buffer to store vlaue
  *
- * Returns error if @skb->cb[] is not set correctly
+ * Returns error if @skb->vlan_tci is not set correctly
  */
 static inline int __vlan_hwaccel_get_tag(const struct sk_buff *skb,
 					 u16 *vlan_tci)
 {
-	struct vlan_skb_tx_cookie *cookie;
-
-	cookie = VLAN_TX_SKB_CB(skb);
-	if (cookie->magic == VLAN_TX_COOKIE_MAGIC) {
-		*vlan_tci = cookie->vlan_tag;
+	if (vlan_tx_tag_present(skb)) {
+		*vlan_tci = skb->vlan_tci;
 		return 0;
 	} else {
 		*vlan_tci = 0;
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 2220b9e..14a3273 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -246,6 +246,7 @@ typedef unsigned char *sk_buff_data_t;
  *	@dma_cookie: a cookie to one of several possible DMA operations
  *		done by skb DMA functions
  *	@secmark: security marking
+ *	@vlan_tci: vlan tag control information
  */
 
 struct sk_buff {
@@ -328,6 +329,8 @@ struct sk_buff {
 
 	__u32			mark;
 
+	__u16			vlan_tci;
+
 	sk_buff_data_t		transport_header;
 	sk_buff_data_t		network_header;
 	sk_buff_data_t		mac_header;
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 7c57156..50a853f 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -459,6 +459,8 @@ static void __copy_skb_header(struct sk_buff *new, const struct sk_buff *old)
 	new->tc_verd		= old->tc_verd;
 #endif
 #endif
+	new->vlan_tci		= old->vlan_tci;
+
 	skb_copy_secmark(new, old);
 }
 
@@ -2286,6 +2288,7 @@ struct sk_buff *skb_segment(struct sk_buff *skb, int features)
 		skb_copy_queue_mapping(nskb, skb);
 		nskb->priority = skb->priority;
 		nskb->protocol = skb->protocol;
+		nskb->vlan_tci = skb->vlan_tci;
 		nskb->dst = dst_clone(skb->dst);
 		memcpy(nskb->cb, skb->cb, sizeof(skb->cb));
 		nskb->pkt_type = skb->pkt_type;

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [RFC PATCH 02/04]: vlan: deliver packets received with VLAN acceleration to network taps
  2008-07-08 10:14 [RFC PATCH 00/04]: VLAN vs. packet socket inconsistencies Patrick McHardy
  2008-07-08 10:15 ` [RFC PATCH 01/04]: vlan: Don't store VLAN tag in cb Patrick McHardy
@ 2008-07-08 10:16 ` Patrick McHardy
  2008-07-08 10:16 ` [RFC PATCH 03/04]: packet: Store VLAN tag in auxillary data Patrick McHardy
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 11+ messages in thread
From: Patrick McHardy @ 2008-07-08 10:16 UTC (permalink / raw)
  To: Linux Netdev List

[-- Attachment #1: Type: text/plain, Size: 0 bytes --]



[-- Attachment #2: 02.diff --]
[-- Type: text/x-diff, Size: 2939 bytes --]

commit d6e43bdeb617ea0885d682f52dad1e4f05e1cc39
Author: Patrick McHardy <kaber@trash.net>
Date:   Tue Jul 8 11:38:43 2008 +0200

    vlan: deliver packets received with VLAN acceleration to network taps
    
    When VLAN header stripping is used, packets currently bypass packet
    sockets (and other network taps) completely. For locally existing
    VLANs, they appear directly on the VLAN device, for unknown VLANs
    they are silently dropped.
    
    Add a new function netif_nit_deliver() to deliver incoming packets
    to all network interface taps and use it in __vlan_hwaccel_rx() to
    make VLAN packets visible on the underlying device.
    
    Signed-off-by: Patrick McHardy <kaber@trash.net>

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index e009c6f..d0469eb 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -1145,6 +1145,7 @@ extern int		netif_rx(struct sk_buff *skb);
 extern int		netif_rx_ni(struct sk_buff *skb);
 #define HAVE_NETIF_RECEIVE_SKB 1
 extern int		netif_receive_skb(struct sk_buff *skb);
+extern void		netif_nit_deliver(struct sk_buff *skb);
 extern int		dev_valid_name(const char *name);
 extern int		dev_ioctl(struct net *net, unsigned int cmd, void __user *);
 extern int		dev_ethtool(struct net *net, struct ifreq *);
diff --git a/net/8021q/vlan_core.c b/net/8021q/vlan_core.c
index 2bec74c..9a66e80 100644
--- a/net/8021q/vlan_core.c
+++ b/net/8021q/vlan_core.c
@@ -14,6 +14,9 @@ int __vlan_hwaccel_rx(struct sk_buff *skb, struct vlan_group *grp,
 		return NET_RX_DROP;
 	}
 
+	skb->vlan_tci = vlan_tci;
+	netif_nit_deliver(skb);
+
 	skb->dev = vlan_group_get_device(grp, vlan_tci & VLAN_VID_MASK);
 	if (skb->dev == NULL) {
 		dev_kfree_skb_any(skb);
@@ -22,6 +25,7 @@ int __vlan_hwaccel_rx(struct sk_buff *skb, struct vlan_group *grp,
 		return NET_RX_SUCCESS;
 	}
 	skb->dev->last_rx = jiffies;
+	skb->vlan_tci = 0;
 
 	stats = &skb->dev->stats;
 	stats->rx_packets++;
diff --git a/net/core/dev.c b/net/core/dev.c
index 7593393..96f1528 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -2060,6 +2060,33 @@ out:
 }
 #endif
 
+/*
+ * 	netif_tap_deliver - deliver received packets to network taps
+ * 	@skb: buffer
+ *
+ * 	This function is used to deliver incoming packets to network
+ * 	taps. It should be used when the normal netif_receive_skb path
+ * 	is bypassed, for example because of VLAN acceleration.
+ */
+void netif_nit_deliver(struct sk_buff *skb)
+{
+	struct packet_type *ptype;
+
+	if (list_empty(&ptype_all))
+		return;
+
+	skb_reset_network_header(skb);
+	skb_reset_transport_header(skb);
+	skb->mac_len = skb->network_header - skb->mac_header;
+
+	rcu_read_lock();
+	list_for_each_entry_rcu(ptype, &ptype_all, list) {
+		if (!ptype->dev || ptype->dev == skb->dev)
+			deliver_skb(skb, ptype, skb->dev);
+	}
+	rcu_read_unlock();
+}
+
 /**
  *	netif_receive_skb - process receive buffer from network
  *	@skb: buffer to process

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [RFC PATCH 03/04]: packet: Store VLAN tag in auxillary data
  2008-07-08 10:14 [RFC PATCH 00/04]: VLAN vs. packet socket inconsistencies Patrick McHardy
  2008-07-08 10:15 ` [RFC PATCH 01/04]: vlan: Don't store VLAN tag in cb Patrick McHardy
  2008-07-08 10:16 ` [RFC PATCH 02/04]: vlan: deliver packets received with VLAN acceleration to network taps Patrick McHardy
@ 2008-07-08 10:16 ` Patrick McHardy
  2008-07-08 10:21   ` Patrick McHardy
  2008-07-08 10:17 ` [RFC PATCH 04/04]: libpcap: reconstruct VLAN header from auxdata Patrick McHardy
  2008-07-08 22:12 ` [RFC PATCH 00/04]: VLAN vs. packet socket inconsistencies David Miller
  4 siblings, 1 reply; 11+ messages in thread
From: Patrick McHardy @ 2008-07-08 10:16 UTC (permalink / raw)
  To: Linux Netdev List



^ permalink raw reply	[flat|nested] 11+ messages in thread

* [RFC PATCH 04/04]: libpcap: reconstruct VLAN header from auxdata
  2008-07-08 10:14 [RFC PATCH 00/04]: VLAN vs. packet socket inconsistencies Patrick McHardy
                   ` (2 preceding siblings ...)
  2008-07-08 10:16 ` [RFC PATCH 03/04]: packet: Store VLAN tag in auxillary data Patrick McHardy
@ 2008-07-08 10:17 ` Patrick McHardy
  2008-07-08 22:12 ` [RFC PATCH 00/04]: VLAN vs. packet socket inconsistencies David Miller
  4 siblings, 0 replies; 11+ messages in thread
From: Patrick McHardy @ 2008-07-08 10:17 UTC (permalink / raw)
  To: Linux Netdev List

[-- Attachment #1: Type: text/plain, Size: 1 bytes --]



[-- Attachment #2: libpcap.diff --]
[-- Type: text/x-diff, Size: 3176 bytes --]

diff --git a/pcap-linux.c b/pcap-linux.c
index e9db010..e877cd8 100644
--- a/pcap-linux.c
+++ b/pcap-linux.c
@@ -471,7 +471,13 @@ pcap_read_packet(pcap_t *handle, pcap_handler callback, u_char *userdata)
 	socklen_t		fromlen;
 	int			packet_len, caplen;
 	struct pcap_pkthdr	pcap_header;
-
+	struct iovec		iov;
+	struct msghdr		msg;
+	struct cmsghdr		*cmsg;
+	union {
+		struct cmsghdr	cmsg;
+		char		buf[CMSG_SPACE(sizeof(struct tpacket_auxdata))];
+	} cmsg_buf;
 #ifdef HAVE_PF_PACKET_SOCKETS
 	/*
 	 * If this is a cooked device, leave extra room for a
@@ -492,6 +498,15 @@ pcap_read_packet(pcap_t *handle, pcap_handler callback, u_char *userdata)
 	/* Receive a single packet from the kernel */
 
 	bp = handle->buffer + handle->offset;
+
+	msg.msg_name		= &from;
+	msg.msg_namelen		= sizeof(from);
+	msg.msg_iov		= &iov;
+	msg.msg_iovlen		= 1;
+	msg.msg_control		= &cmsg_buf;
+	msg.msg_controllen	= sizeof(cmsg_buf);
+	msg.msg_flags		= 0;
+
 	do {
 		/*
 		 * Has "pcap_breakloop()" been called?
@@ -505,11 +520,11 @@ pcap_read_packet(pcap_t *handle, pcap_handler callback, u_char *userdata)
 			handle->break_loop = 0;
 			return -2;
 		}
-		fromlen = sizeof(from);
-		packet_len = recvfrom(
-			handle->fd, bp + offset,
-			handle->bufsize - offset, MSG_TRUNC,
-			(struct sockaddr *) &from, &fromlen);
+
+		iov.iov_len	= handle->bufsize - offset;
+		iov.iov_base	= bp + offset;
+
+		packet_len = recvmsg(handle->fd, &msg, MSG_TRUNC);
 	} while (packet_len == -1 && errno == EINTR);
 
 	/* Check if an error occured */
@@ -524,6 +539,38 @@ pcap_read_packet(pcap_t *handle, pcap_handler callback, u_char *userdata)
 		}
 	}
 
+	for (cmsg = CMSG_FIRSTHDR(&msg); cmsg; cmsg = CMSG_NXTHDR(&msg, cmsg)) {
+		struct tpacket_auxdata *aux;
+		unsigned int len, copy;
+		unsigned short *ptr;
+
+		if (cmsg->cmsg_len < CMSG_LEN(sizeof(struct tpacket_auxdata)) ||
+		    cmsg->cmsg_level != SOL_PACKET ||
+		    cmsg->cmsg_type != PACKET_AUXDATA)
+			continue;
+
+		aux = (struct tpacket_auxdata *)CMSG_DATA(cmsg);
+		if (aux->tp_vlan_tci == 0)
+			continue;
+
+		len = packet_len > iov.iov_len ? iov.iov_len : packet_len;
+		if (len > 2 * ETH_ALEN + 4) {
+			copy = len - 2 * ETH_ALEN - 4;
+			if (copy > iov.iov_len - 2 * ETH_ALEN - 4)
+				copy = iov.iov_len - 2 * ETH_ALEN - 4;
+
+			memmove(iov.iov_base + 2 * ETH_ALEN + 4,
+				iov.iov_base + 2 * ETH_ALEN, copy);
+		}
+
+		ptr = (unsigned short *)(iov.iov_base + 2 * ETH_ALEN);
+		if (len >= 2 * ETH_ALEN + 2)
+			*(ptr++) = htons(ETH_P_8021Q);
+		if (len >= 2 * ETH_ALEN + 4)
+			*(ptr++) = htons(aux->tp_vlan_tci);
+		packet_len += 4;
+	}
+
 #ifdef HAVE_PF_PACKET_SOCKETS
 	if (!handle->md.sock_packet) {
 		/*
@@ -1631,6 +1678,7 @@ iface_bind(int fd, int ifindex, char *ebuf)
 	struct sockaddr_ll	sll;
 	int			err;
 	socklen_t		errlen = sizeof(err);
+	int			val;
 
 	memset(&sll, 0, sizeof(sll));
 	sll.sll_family		= AF_PACKET;
@@ -1657,6 +1705,12 @@ iface_bind(int fd, int ifindex, char *ebuf)
 		return -2;
 	}
 
+	val = 1;
+	if (setsockopt(fd, SOL_PACKET, PACKET_AUXDATA, &val, sizeof(val)) == -1) {
+		snprintf(ebuf, PCAP_ERRBUF_SIZE,
+			 "setsockopt: %s", pcap_strerror(errno));
+		return -3;
+	}
 	return 0;
 }
 

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [RFC PATCH 03/04]: packet: Store VLAN tag in auxillary data
  2008-07-08 10:16 ` [RFC PATCH 03/04]: packet: Store VLAN tag in auxillary data Patrick McHardy
@ 2008-07-08 10:21   ` Patrick McHardy
  0 siblings, 0 replies; 11+ messages in thread
From: Patrick McHardy @ 2008-07-08 10:21 UTC (permalink / raw)
  To: Linux Netdev List

[-- Attachment #1: Type: text/plain, Size: 25 bytes --]

Missing patch attached.


[-- Attachment #2: 03.diff --]
[-- Type: text/x-diff, Size: 1096 bytes --]

commit 97a314a172d9840f9c72ef12d7456b60891b4290
Author: Patrick McHardy <kaber@trash.net>
Date:   Tue Jul 8 11:38:43 2008 +0200

    packet: Store VLAN tag in auxillary data
    
    Store the VLAN tag in the auxillary data so userspace can properly deal with
    hardware VLAN tagging/stripping.
    
    Signed-off-by: Patrick McHardy <kaber@trash.net>

diff --git a/include/linux/if_packet.h b/include/linux/if_packet.h
index ad09609..4e1fc2a 100644
--- a/include/linux/if_packet.h
+++ b/include/linux/if_packet.h
@@ -57,6 +57,7 @@ struct tpacket_auxdata
 	__u32		tp_snaplen;
 	__u16		tp_mac;
 	__u16		tp_net;
+	__u16		tp_vlan_tci;
 };
 
 struct tpacket_hdr
diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index beca640..951e62e 100644
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -1107,6 +1107,7 @@ static int packet_recvmsg(struct kiocb *iocb, struct socket *sock,
 		aux.tp_snaplen = skb->len;
 		aux.tp_mac = 0;
 		aux.tp_net = skb_network_offset(skb);
+		aux.tp_vlan_tci = skb->vlan_tci;
 
 		put_cmsg(msg, SOL_PACKET, PACKET_AUXDATA, sizeof(aux), &aux);
 	}

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [RFC PATCH 00/04]: VLAN vs. packet socket inconsistencies
  2008-07-08 10:14 [RFC PATCH 00/04]: VLAN vs. packet socket inconsistencies Patrick McHardy
                   ` (3 preceding siblings ...)
  2008-07-08 10:17 ` [RFC PATCH 04/04]: libpcap: reconstruct VLAN header from auxdata Patrick McHardy
@ 2008-07-08 22:12 ` David Miller
  2008-07-08 22:30   ` Patrick McHardy
  4 siblings, 1 reply; 11+ messages in thread
From: David Miller @ 2008-07-08 22:12 UTC (permalink / raw)
  To: kaber; +Cc: netdev

From: Patrick McHardy <kaber@trash.net>
Date: Tue, 08 Jul 2008 12:14:51 +0200

> There are mainly two remaining problems with this approach:
> 
> - socket filters for VLAN headers don't work properly since they
>    expect the VLAN header to be present. Since with the approach
>    taken by these patches, userspace has to have knowledge about
>    VLAN acceleration anyway, it seems OK to simply add a filter
>    instruction to filter on skb->vlan_tci and have userspace
>    construct its filters accordingly.
> 
> - packet socket auxdata is only available for non-mmaped sockets.
>    For mmaped sockets the only place to store the data is in
>    struct tpacket_hdr, but that would break compatibility. Not sure
>    what to do about this case.
> 
> Comments welcome.

The mmap socket limitation is pretty serious.  It's the only
thing holding me back from just applying these patches right now.

I think we should make a new version of the tpacket_hdr structure
anyways, in order to make it more compat-friendly.  It currently
has "unsigned long" members and other crap.

We can design this thing to be extensible quite easily, by simply
giving it an "offset" field.  My suggestion is:

1) sed 's/tpacket_hdr/tpacket_hdr_old/'

2) Create packet socket option PACKET_NEW_TPHDR which enables
   use of a new tpacket_hdr layout.

3) Make new tpacket_hdr which has the VLAN tag stuff as well
   as an 'offset' field so that we can add more stuff later
   in a backwards compat way.

4) New new tpacket_hdr layout when PACKET_NEW_TPHDR has been
   set.


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFC PATCH 00/04]: VLAN vs. packet socket inconsistencies
  2008-07-08 22:12 ` [RFC PATCH 00/04]: VLAN vs. packet socket inconsistencies David Miller
@ 2008-07-08 22:30   ` Patrick McHardy
  2008-07-08 22:35     ` David Miller
  0 siblings, 1 reply; 11+ messages in thread
From: Patrick McHardy @ 2008-07-08 22:30 UTC (permalink / raw)
  To: David Miller; +Cc: netdev

David Miller wrote:
> From: Patrick McHardy <kaber@trash.net>
> Date: Tue, 08 Jul 2008 12:14:51 +0200
>>
>> - packet socket auxdata is only available for non-mmaped sockets.
>>    For mmaped sockets the only place to store the data is in
>>    struct tpacket_hdr, but that would break compatibility. Not sure
>>    what to do about this case.
>>     
> The mmap socket limitation is pretty serious.  It's the only
> thing holding me back from just applying these patches right now.
>
> I think we should make a new version of the tpacket_hdr structure
> anyways, in order to make it more compat-friendly.  It currently
> has "unsigned long" members and other crap.
>
> We can design this thing to be extensible quite easily, by simply
> giving it an "offset" field.  My suggestion is:
>
> 1) sed 's/tpacket_hdr/tpacket_hdr_old/'
>
> 2) Create packet socket option PACKET_NEW_TPHDR which enables
>    use of a new tpacket_hdr layout.
>
> 3) Make new tpacket_hdr which has the VLAN tag stuff as well
>    as an 'offset' field so that we can add more stuff later
>    in a backwards compat way.
>
> 4) New new tpacket_hdr layout when PACKET_NEW_TPHDR has been
>    set.

That sounds good. Userspace needs to know about the size of
the tpacket_hdr before setting the ring parameters so it can
size the ring frames appropriately for the largest packet size
it wants to receive. This means the offset field in the
tpacket_hdr is redundant, so I'll just add a getsockopt
option for getting the size. Unless we want to be able to
include only a partial tpacket_hdr, but I don't think that
would be very useful.



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFC PATCH 00/04]: VLAN vs. packet socket inconsistencies
  2008-07-08 22:30   ` Patrick McHardy
@ 2008-07-08 22:35     ` David Miller
  2008-07-08 22:38       ` David Miller
  0 siblings, 1 reply; 11+ messages in thread
From: David Miller @ 2008-07-08 22:35 UTC (permalink / raw)
  To: kaber; +Cc: netdev

From: Patrick McHardy <kaber@trash.net>
Date: Wed, 09 Jul 2008 00:30:04 +0200

> That sounds good. Userspace needs to know about the size of
> the tpacket_hdr before setting the ring parameters so it can
> size the ring frames appropriately for the largest packet size
> it wants to receive. This means the offset field in the
> tpacket_hdr is redundant, so I'll just add a getsockopt
> option for getting the size. Unless we want to be able to
> include only a partial tpacket_hdr, but I don't think that
> would be very useful.

Ok, in that case using a getsockopt() to query the size sounds great.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFC PATCH 00/04]: VLAN vs. packet socket inconsistencies
  2008-07-08 22:35     ` David Miller
@ 2008-07-08 22:38       ` David Miller
  2008-07-08 22:48         ` Patrick McHardy
  0 siblings, 1 reply; 11+ messages in thread
From: David Miller @ 2008-07-08 22:38 UTC (permalink / raw)
  To: kaber; +Cc: netdev

From: David Miller <davem@davemloft.net>
Date: Tue, 08 Jul 2008 15:35:10 -0700 (PDT)

> From: Patrick McHardy <kaber@trash.net>
> Date: Wed, 09 Jul 2008 00:30:04 +0200
> 
> > That sounds good. Userspace needs to know about the size of
> > the tpacket_hdr before setting the ring parameters so it can
> > size the ring frames appropriately for the largest packet size
> > it wants to receive. This means the offset field in the
> > tpacket_hdr is redundant, so I'll just add a getsockopt
> > option for getting the size. Unless we want to be able to
> > include only a partial tpacket_hdr, but I don't think that
> > would be very useful.
> 
> Ok, in that case using a getsockopt() to query the size sounds great.

BTW, what you might want to do is say that if the user
makes the new getsockopt() size query, he understands
and wants the new style tpacket_hdr layout.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFC PATCH 00/04]: VLAN vs. packet socket inconsistencies
  2008-07-08 22:38       ` David Miller
@ 2008-07-08 22:48         ` Patrick McHardy
  0 siblings, 0 replies; 11+ messages in thread
From: Patrick McHardy @ 2008-07-08 22:48 UTC (permalink / raw)
  To: David Miller; +Cc: netdev

David Miller wrote:
> From: David Miller <davem@davemloft.net>
> Date: Tue, 08 Jul 2008 15:35:10 -0700 (PDT)
>
>   
>> From: Patrick McHardy <kaber@trash.net>
>> Date: Wed, 09 Jul 2008 00:30:04 +0200
>>
>>     
>>> That sounds good. Userspace needs to know about the size of
>>> the tpacket_hdr before setting the ring parameters so it can
>>> size the ring frames appropriately for the largest packet size
>>> it wants to receive. This means the offset field in the
>>> tpacket_hdr is redundant, so I'll just add a getsockopt
>>> option for getting the size. Unless we want to be able to
>>> include only a partial tpacket_hdr, but I don't think that
>>> would be very useful.
>>>       
>> Ok, in that case using a getsockopt() to query the size sounds great.
>>     
>
> BTW, what you might want to do is say that if the user
> makes the new getsockopt() size query, he understands
> and wants the new style tpacket_hdr layout.
>   

I'm torn between "nice hack" and "doesn't belong there".
I'll see how I feel about this once I get to that point :)

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2008-07-08 22:48 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-07-08 10:14 [RFC PATCH 00/04]: VLAN vs. packet socket inconsistencies Patrick McHardy
2008-07-08 10:15 ` [RFC PATCH 01/04]: vlan: Don't store VLAN tag in cb Patrick McHardy
2008-07-08 10:16 ` [RFC PATCH 02/04]: vlan: deliver packets received with VLAN acceleration to network taps Patrick McHardy
2008-07-08 10:16 ` [RFC PATCH 03/04]: packet: Store VLAN tag in auxillary data Patrick McHardy
2008-07-08 10:21   ` Patrick McHardy
2008-07-08 10:17 ` [RFC PATCH 04/04]: libpcap: reconstruct VLAN header from auxdata Patrick McHardy
2008-07-08 22:12 ` [RFC PATCH 00/04]: VLAN vs. packet socket inconsistencies David Miller
2008-07-08 22:30   ` Patrick McHardy
2008-07-08 22:35     ` David Miller
2008-07-08 22:38       ` David Miller
2008-07-08 22:48         ` Patrick McHardy

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).