netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH rfc 0/4] timestamping updates
@ 2014-11-25 17:58 Willem de Bruijn
  2014-11-25 17:58 ` [PATCH rfc 1/4] net-timestamp: pull headers for SOCK_STREAM Willem de Bruijn
                   ` (3 more replies)
  0 siblings, 4 replies; 17+ messages in thread
From: Willem de Bruijn @ 2014-11-25 17:58 UTC (permalink / raw)
  To: netdev; +Cc: davem, luto, richardcochran, Willem de Bruijn

From: Willem de Bruijn <willemb@google.com>

The main goal for this patchset is to allow correlating timestamps
with the egress interface. That change requires a few others:

1/4: TCP sockets should not loop L2/L3/L4 headers
2/4: main feature
3/4: bugfix: TCP sockets should call the family specific .._recv_error
4/4: revise the test to verify these changes

Willem de Bruijn (4):
  net-timestamp: pull headers for SOCK_STREAM
  net-errqueue: add IP(V6)_PKTINFO support
  net-timestamp: tcp sockets return v6 errors on v6 sockets
  net-timestamp: expand txtimestamp test with payload and PKTINFO

 .../networking/timestamping/txtimestamp.c          | 89 +++++++++++++++++++---
 net/core/skbuff.c                                  |  6 +-
 net/ipv4/ip_sockglue.c                             | 15 ++++
 net/ipv4/tcp.c                                     |  8 +-
 net/ipv6/datagram.c                                | 22 ++++++
 5 files changed, 125 insertions(+), 15 deletions(-)

-- 
2.1.0.rc2.206.gedb03e5

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH rfc 1/4] net-timestamp: pull headers for SOCK_STREAM
  2014-11-25 17:58 [PATCH rfc 0/4] timestamping updates Willem de Bruijn
@ 2014-11-25 17:58 ` Willem de Bruijn
  2014-11-25 18:42   ` David Miller
  2014-11-25 17:58 ` [PATCH rfc 2/4] net-errqueue: add IP(V6)_PKTINFO support Willem de Bruijn
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 17+ messages in thread
From: Willem de Bruijn @ 2014-11-25 17:58 UTC (permalink / raw)
  To: netdev; +Cc: davem, luto, richardcochran, Willem de Bruijn

From: Willem de Bruijn <willemb@google.com>

When returning timestamped packets on the error queue, only return
the data that the application initially sent: not the protocol
headers.

This changes the ABI. The TCP interface is new enough that it should
be safe to do so. The UDP interface could be changed analogously with

+  else if (sk->sk_protocol == IPPROTO_UDP)
+    skb_pull(skb, skb_transport_offset(skb) + sizeof(struct udphdr));

Tested with Documentation/networking/timestamping/txtimestamp -l 60 -x

Signed-off-by: Willem de Bruijn <willemb@google.com>
---
 net/core/skbuff.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 92116df..70a8596 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -3580,6 +3580,7 @@ static void __skb_complete_tx_timestamp(struct sk_buff *skb,
 					int tstype)
 {
 	struct sock_exterr_skb *serr;
+	bool is_tcp = sk->sk_protocol == IPPROTO_TCP;
 	int err;
 
 	serr = SKB_EXT_ERR(skb);
@@ -3589,10 +3590,13 @@ static void __skb_complete_tx_timestamp(struct sk_buff *skb,
 	serr->ee.ee_info = tstype;
 	if (sk->sk_tsflags & SOF_TIMESTAMPING_OPT_ID) {
 		serr->ee.ee_data = skb_shinfo(skb)->tskey;
-		if (sk->sk_protocol == IPPROTO_TCP)
+		if (is_tcp)
 			serr->ee.ee_data -= sk->sk_tskey;
 	}
 
+	if (is_tcp)
+		skb_pull(skb, skb_transport_offset(skb) + tcp_hdrlen(skb));
+
 	err = sock_queue_err_skb(sk, skb);
 
 	if (err)
-- 
2.1.0.rc2.206.gedb03e5

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH rfc 2/4] net-errqueue: add IP(V6)_PKTINFO support
  2014-11-25 17:58 [PATCH rfc 0/4] timestamping updates Willem de Bruijn
  2014-11-25 17:58 ` [PATCH rfc 1/4] net-timestamp: pull headers for SOCK_STREAM Willem de Bruijn
@ 2014-11-25 17:58 ` Willem de Bruijn
  2014-11-25 18:04   ` Willem de Bruijn
                     ` (2 more replies)
  2014-11-25 17:58 ` [PATCH rfc 3/4] net-timestamp: tcp sockets return v6 errors on v6 sockets Willem de Bruijn
  2014-11-25 17:58 ` [PATCH rfc 4/4] net-timestamp: expand txtimestamp test with payload and PKTINFO Willem de Bruijn
  3 siblings, 3 replies; 17+ messages in thread
From: Willem de Bruijn @ 2014-11-25 17:58 UTC (permalink / raw)
  To: netdev; +Cc: davem, luto, richardcochran, Willem de Bruijn

From: Willem de Bruijn <willemb@google.com>

On INET and INET6 sockets with the IP_PKTINFO or IPV6_RECVPKTINFO
socket option set, return a matching cmsg also for packets queued
to the error queue.

These packets are transmitted packets looped back. The fields of
struct in[6]_pktinfo are changed to reflect that:

  ifindex:   index of the outgoing device, if configured
  addr:      destination address
  spec_dst:  source address  (absent in v6)

On IPv6, the mechanism currently returns two IPV6_PKTINFO when
the option is enabled, because the existing code already supports
the feature. The legacy data does not produce a correct ifindex,
however. Perhaps I can revise the patch to only send the legacy
patch, but use the correct ifindex source in case of ERRQUEUE.

Signed-off-by: Willem de Bruijn <willemb@google.com>
---
 net/ipv4/ip_sockglue.c | 15 +++++++++++++++
 net/ipv6/datagram.c    | 22 ++++++++++++++++++++++
 2 files changed, 37 insertions(+)

diff --git a/net/ipv4/ip_sockglue.c b/net/ipv4/ip_sockglue.c
index b782657..615f783 100644
--- a/net/ipv4/ip_sockglue.c
+++ b/net/ipv4/ip_sockglue.c
@@ -399,6 +399,20 @@ void ip_local_error(struct sock *sk, int err, __be32 daddr, __be16 port, u32 inf
 		kfree_skb(skb);
 }
 
+static void ip_recv_error_pktinfo(struct msghdr *msg, struct sock *sk,
+				  struct sk_buff *skb)
+{
+	if (inet_sk(sk)->cmsg_flags & IP_CMSG_PKTINFO && skb->dev) {
+		struct in_pktinfo info = {0};
+
+		info.ipi_spec_dst.s_addr = ip_hdr(skb)->saddr;
+		info.ipi_addr.s_addr = ip_hdr(skb)->daddr;
+		info.ipi_ifindex = skb->dev->ifindex;
+
+		put_cmsg(msg, SOL_IP, IP_PKTINFO, sizeof(info), &info);
+	}
+}
+
 /*
  *	Handle MSG_ERRQUEUE
  */
@@ -429,6 +443,7 @@ int ip_recv_error(struct sock *sk, struct msghdr *msg, int len, int *addr_len)
 		goto out_free_skb;
 
 	sock_recv_timestamp(msg, sk, skb);
+	ip_recv_error_pktinfo(msg, sk, skb);
 
 	serr = SKB_EXT_ERR(skb);
 
diff --git a/net/ipv6/datagram.c b/net/ipv6/datagram.c
index cc11396..7d2ef7c 100644
--- a/net/ipv6/datagram.c
+++ b/net/ipv6/datagram.c
@@ -325,6 +325,27 @@ void ipv6_local_rxpmtu(struct sock *sk, struct flowi6 *fl6, u32 mtu)
 	kfree_skb(skb);
 }
 
+static void ipv6_recv_error_pktinfo(struct msghdr *msg, struct sock *sk,
+				    struct sk_buff *skb)
+{
+	struct ipv6_pinfo *np = inet6_sk(sk);
+
+	if (np->rxopt.bits.rxinfo && skb->dev) {
+		struct in6_pktinfo info;
+
+		memset(&info, 0, sizeof(info));
+		if (skb->protocol == htons(ETH_P_IPV6))
+			info.ipi6_addr = ipv6_hdr(skb)->daddr;
+		else
+			ipv6_addr_set_v4mapped(ip_hdr(skb)->daddr,
+					       &info.ipi6_addr);
+
+		info.ipi6_ifindex = skb->dev->ifindex;
+		net_info_ratelimited("yes: ifindex=%d\n", info.ipi6_ifindex);
+		put_cmsg(msg, SOL_IPV6, IPV6_PKTINFO, sizeof(info), &info);
+	}
+}
+
 /*
  *	Handle MSG_ERRQUEUE
  */
@@ -356,6 +377,7 @@ int ipv6_recv_error(struct sock *sk, struct msghdr *msg, int len, int *addr_len)
 		goto out_free_skb;
 
 	sock_recv_timestamp(msg, sk, skb);
+	ipv6_recv_error_pktinfo(msg, sk, skb);
 
 	serr = SKB_EXT_ERR(skb);
 
-- 
2.1.0.rc2.206.gedb03e5

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH rfc 3/4] net-timestamp: tcp sockets return v6 errors on v6 sockets
  2014-11-25 17:58 [PATCH rfc 0/4] timestamping updates Willem de Bruijn
  2014-11-25 17:58 ` [PATCH rfc 1/4] net-timestamp: pull headers for SOCK_STREAM Willem de Bruijn
  2014-11-25 17:58 ` [PATCH rfc 2/4] net-errqueue: add IP(V6)_PKTINFO support Willem de Bruijn
@ 2014-11-25 17:58 ` Willem de Bruijn
  2014-11-25 18:41   ` David Miller
  2014-11-25 17:58 ` [PATCH rfc 4/4] net-timestamp: expand txtimestamp test with payload and PKTINFO Willem de Bruijn
  3 siblings, 1 reply; 17+ messages in thread
From: Willem de Bruijn @ 2014-11-25 17:58 UTC (permalink / raw)
  To: netdev; +Cc: davem, luto, richardcochran, Willem de Bruijn

From: Willem de Bruijn <willemb@google.com>

TCP timestamping introduced MSG_ERRQUEUE handling for TCP sockets.
It always passed errorqueue requests onto ip_recv_error, but the
same tcp_recvmsg code may also be called for IPv6 sockets. In that
case, pass to ipv6_recv_error.

Tested by asking for PKTINFO with

  Documentation/networking/timestamping/txtimestamp -I

Before this change, IPv6 sockets would return AF_INET/IP_PKTINFO
after the change, these sockets return AF_INET6/IPV6_PKTINFO

Signed-off-by: Willem de Bruijn <willemb@google.com>
---
 net/ipv4/tcp.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index c239f47..04ea9b2 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -1597,8 +1597,12 @@ int tcp_recvmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg,
 	struct sk_buff *skb;
 	u32 urg_hole = 0;
 
-	if (unlikely(flags & MSG_ERRQUEUE))
-		return ip_recv_error(sk, msg, len, addr_len);
+	if (unlikely(flags & MSG_ERRQUEUE)) {
+		if (sk->sk_family == AF_INET6)
+			return ipv6_recv_error(sk, msg, len, addr_len);
+		else
+			return ip_recv_error(sk, msg, len, addr_len);
+	}
 
 	if (sk_can_busy_loop(sk) && skb_queue_empty(&sk->sk_receive_queue) &&
 	    (sk->sk_state == TCP_ESTABLISHED))
-- 
2.1.0.rc2.206.gedb03e5

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH rfc 4/4] net-timestamp: expand txtimestamp test with payload and PKTINFO
  2014-11-25 17:58 [PATCH rfc 0/4] timestamping updates Willem de Bruijn
                   ` (2 preceding siblings ...)
  2014-11-25 17:58 ` [PATCH rfc 3/4] net-timestamp: tcp sockets return v6 errors on v6 sockets Willem de Bruijn
@ 2014-11-25 17:58 ` Willem de Bruijn
  3 siblings, 0 replies; 17+ messages in thread
From: Willem de Bruijn @ 2014-11-25 17:58 UTC (permalink / raw)
  To: netdev; +Cc: davem, luto, richardcochran, Willem de Bruijn

From: Willem de Bruijn <willemb@google.com>

Test:
  new: flag -I requests and prints PKTINFO
  new: flag -x prints payload (possibly truncated)
  fix: remove pretty print that breaks common flag '-l 1'

Signed-off-by: Willem de Bruijn <willemb@google.com>

----

I did not modify the documentation yet, to avoid conflicting with
the other queued ee_info -> ee_data patch
---
 .../networking/timestamping/txtimestamp.c          | 89 +++++++++++++++++++---
 1 file changed, 77 insertions(+), 12 deletions(-)

diff --git a/Documentation/networking/timestamping/txtimestamp.c b/Documentation/networking/timestamping/txtimestamp.c
index b32fc2a..8946143 100644
--- a/Documentation/networking/timestamping/txtimestamp.c
+++ b/Documentation/networking/timestamping/txtimestamp.c
@@ -46,6 +46,7 @@
 #include <netpacket/packet.h>
 #include <poll.h>
 #include <stdarg.h>
+#include <stdbool.h>
 #include <stdint.h>
 #include <stdio.h>
 #include <stdlib.h>
@@ -58,6 +59,14 @@
 #include <time.h>
 #include <unistd.h>
 
+/* ugly hack to work around netinet/in.h and linux/ipv6.h conflicts */
+#ifndef in6_pktinfo
+struct in6_pktinfo {
+	struct in6_addr	ipi6_addr;
+	int		ipi6_ifindex;
+};
+#endif
+
 /* command line parameters */
 static int cfg_proto = SOCK_STREAM;
 static int cfg_ipproto = IPPROTO_TCP;
@@ -65,6 +74,8 @@ static int cfg_num_pkts = 4;
 static int do_ipv4 = 1;
 static int do_ipv6 = 1;
 static int cfg_payload_len = 10;
+static bool cfg_show_payload;
+static bool cfg_do_pktinfo;
 static uint16_t dest_port = 9000;
 
 static struct sockaddr_in daddr;
@@ -131,6 +142,30 @@ static void print_timestamp(struct scm_timestamping *tss, int tstype,
 	__print_timestamp(tsname, &tss->ts[0], tskey, payload_len);
 }
 
+/* TODO: convert to check_and_print payload once API is stable */
+static void print_payload(char *data, int len)
+{
+	int i;
+
+	if (len > 70)
+		len = 70;
+
+	fprintf(stderr, "payload: ");
+	for (i = 0; i < len; i++)
+		fprintf(stderr, "%02hhx ", data[i]);
+	fprintf(stderr, "\n");
+}
+
+static void print_pktinfo(int family, int ifindex, void *saddr, void *daddr)
+{
+	char sa[INET6_ADDRSTRLEN], da[INET6_ADDRSTRLEN];
+
+	fprintf(stderr, "pktinfo: ifindex=%u src=%s dst=%s\n",
+		ifindex,
+		saddr ? inet_ntop(family, saddr, sa, sizeof(sa)) : "unknown",
+		daddr ? inet_ntop(family, daddr, da, sizeof(da)) : "unknown");
+}
+
 static void __poll(int fd)
 {
 	struct pollfd pollfd;
@@ -156,10 +191,9 @@ static void __recv_errmsg_cmsg(struct msghdr *msg, int payload_len)
 		    cm->cmsg_type == SCM_TIMESTAMPING) {
 			tss = (void *) CMSG_DATA(cm);
 		} else if ((cm->cmsg_level == SOL_IP &&
-		     cm->cmsg_type == IP_RECVERR) ||
-		    (cm->cmsg_level == SOL_IPV6 &&
-		     cm->cmsg_type == IPV6_RECVERR)) {
-
+			    cm->cmsg_type == IP_RECVERR) ||
+			   (cm->cmsg_level == SOL_IPV6 &&
+			    cm->cmsg_type == IPV6_RECVERR)) {
 			serr = (void *) CMSG_DATA(cm);
 			if (serr->ee_errno != ENOMSG ||
 			    serr->ee_origin != SO_EE_ORIGIN_TIMESTAMPING) {
@@ -168,6 +202,16 @@ static void __recv_errmsg_cmsg(struct msghdr *msg, int payload_len)
 						serr->ee_origin);
 				serr = NULL;
 			}
+		} else if (cm->cmsg_level == SOL_IP &&
+			   cm->cmsg_type == IP_PKTINFO) {
+			struct in_pktinfo *info = (void *) CMSG_DATA(cm);
+			print_pktinfo(AF_INET, info->ipi_ifindex,
+				      &info->ipi_spec_dst, &info->ipi_addr);
+		} else if (cm->cmsg_level == SOL_IPV6 &&
+			   cm->cmsg_type == IPV6_PKTINFO) {
+			struct in6_pktinfo *info6 = (void *) CMSG_DATA(cm);
+			print_pktinfo(AF_INET6, info6->ipi6_ifindex,
+				      NULL, &info6->ipi6_addr);
 		} else
 			fprintf(stderr, "unknown cmsg %d,%d\n",
 					cm->cmsg_level, cm->cmsg_type);
@@ -206,7 +250,11 @@ static int recv_errmsg(int fd)
 	if (ret == -1 && errno != EAGAIN)
 		error(1, errno, "recvmsg");
 
-	__recv_errmsg_cmsg(&msg, ret);
+	if (ret > 0) {
+		__recv_errmsg_cmsg(&msg, ret);
+		if (cfg_show_payload)
+			print_payload(data, cfg_payload_len);
+	}
 
 	free(data);
 	return ret == -1;
@@ -215,9 +263,9 @@ static int recv_errmsg(int fd)
 static void do_test(int family, unsigned int opt)
 {
 	char *buf;
-	int fd, i, val, total_len;
+	int fd, i, val = 1, total_len;
 
-	if (family == IPPROTO_IPV6 && cfg_proto != SOCK_STREAM) {
+	if (family == AF_INET6 && cfg_proto != SOCK_STREAM) {
 		/* due to lack of checksum generation code */
 		fprintf(stderr, "test: skipping datagram over IPv6\n");
 		return;
@@ -239,7 +287,6 @@ static void do_test(int family, unsigned int opt)
 		error(1, errno, "socket");
 
 	if (cfg_proto == SOCK_STREAM) {
-		val = 1;
 		if (setsockopt(fd, IPPROTO_TCP, TCP_NODELAY,
 			       (char*) &val, sizeof(val)))
 			error(1, 0, "setsockopt no nagle");
@@ -253,6 +300,18 @@ static void do_test(int family, unsigned int opt)
 		}
 	}
 
+	if (cfg_do_pktinfo) {
+		if (family == AF_INET6) {
+			if (setsockopt(fd, SOL_IPV6, IPV6_RECVPKTINFO,
+				       &val, sizeof(val)))
+				error(1, errno, "setsockopt pktinfo ipv6");
+		} else {
+			if (setsockopt(fd, SOL_IP, IP_PKTINFO,
+				       &val, sizeof(val)))
+				error(1, errno, "setsockopt pktinfo ipv4");
+		}
+	}
+
 	opt |= SOF_TIMESTAMPING_SOFTWARE |
 	       SOF_TIMESTAMPING_OPT_ID;
 	if (setsockopt(fd, SOL_SOCKET, SO_TIMESTAMPING,
@@ -262,8 +321,6 @@ static void do_test(int family, unsigned int opt)
 	for (i = 0; i < cfg_num_pkts; i++) {
 		memset(&ts_prev, 0, sizeof(ts_prev));
 		memset(buf, 'a' + i, total_len);
-		buf[total_len - 2] = '\n';
-		buf[total_len - 1] = '\0';
 
 		if (cfg_proto == SOCK_RAW) {
 			struct udphdr *udph;
@@ -324,11 +381,13 @@ static void __attribute__((noreturn)) usage(const char *filepath)
 			"  -4:   only IPv4\n"
 			"  -6:   only IPv6\n"
 			"  -h:   show this message\n"
+			"  -I:   request PKTINFO\n"
 			"  -l N: send N bytes at a time\n"
 			"  -r:   use raw\n"
 			"  -R:   use raw (IP_HDRINCL)\n"
 			"  -p N: connect to port N\n"
-			"  -u:   use udp\n",
+			"  -u:   use udp\n"
+			"  -x:   show payload (up to 70 bytes)\n",
 			filepath);
 	exit(1);
 }
@@ -338,7 +397,7 @@ static void parse_opt(int argc, char **argv)
 	int proto_count = 0;
 	char c;
 
-	while ((c = getopt(argc, argv, "46hl:p:rRu")) != -1) {
+	while ((c = getopt(argc, argv, "46hIl:p:rRux")) != -1) {
 		switch (c) {
 		case '4':
 			do_ipv6 = 0;
@@ -346,6 +405,9 @@ static void parse_opt(int argc, char **argv)
 		case '6':
 			do_ipv4 = 0;
 			break;
+		case 'I':
+			cfg_do_pktinfo = true;
+			break;
 		case 'r':
 			proto_count++;
 			cfg_proto = SOCK_RAW;
@@ -367,6 +429,9 @@ static void parse_opt(int argc, char **argv)
 		case 'p':
 			dest_port = strtoul(optarg, NULL, 10);
 			break;
+		case 'x':
+			cfg_show_payload = true;
+			break;
 		case 'h':
 		default:
 			usage(argv[0]);
-- 
2.1.0.rc2.206.gedb03e5

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH rfc 2/4] net-errqueue: add IP(V6)_PKTINFO support
  2014-11-25 17:58 ` [PATCH rfc 2/4] net-errqueue: add IP(V6)_PKTINFO support Willem de Bruijn
@ 2014-11-25 18:04   ` Willem de Bruijn
  2014-11-25 18:41   ` David Miller
  2014-11-25 21:16   ` Willem de Bruijn
  2 siblings, 0 replies; 17+ messages in thread
From: Willem de Bruijn @ 2014-11-25 18:04 UTC (permalink / raw)
  To: Network Development
  Cc: David Miller, Andy Lutomirski, Richard Cochran, Willem de Bruijn

>
> +static void ipv6_recv_error_pktinfo(struct msghdr *msg, struct sock *sk,
> +                                   struct sk_buff *skb)
> +{
> +       struct ipv6_pinfo *np = inet6_sk(sk);
> +
> +       if (np->rxopt.bits.rxinfo && skb->dev) {
> +               struct in6_pktinfo info;
> +
> +               memset(&info, 0, sizeof(info));
> +               if (skb->protocol == htons(ETH_P_IPV6))
> +                       info.ipi6_addr = ipv6_hdr(skb)->daddr;
> +               else
> +                       ipv6_addr_set_v4mapped(ip_hdr(skb)->daddr,
> +                                              &info.ipi6_addr);
> +
> +               info.ipi6_ifindex = skb->dev->ifindex;
> +               net_info_ratelimited("yes: ifindex=%d\n", info.ipi6_ifindex);

forgot to remove this debug message.. removing it from my local copy right away.

> +               put_cmsg(msg, SOL_IPV6, IPV6_PKTINFO, sizeof(info), &info);
> +       }
> +}
> +

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH rfc 2/4] net-errqueue: add IP(V6)_PKTINFO support
  2014-11-25 17:58 ` [PATCH rfc 2/4] net-errqueue: add IP(V6)_PKTINFO support Willem de Bruijn
  2014-11-25 18:04   ` Willem de Bruijn
@ 2014-11-25 18:41   ` David Miller
  2014-11-25 21:16   ` Willem de Bruijn
  2 siblings, 0 replies; 17+ messages in thread
From: David Miller @ 2014-11-25 18:41 UTC (permalink / raw)
  To: willemb; +Cc: netdev, luto, richardcochran

From: Willem de Bruijn <willemb@google.com>
Date: Tue, 25 Nov 2014 12:58:04 -0500

> +	if (inet_sk(sk)->cmsg_flags & IP_CMSG_PKTINFO && skb->dev) {
> +		struct in_pktinfo info = {0};

I think memset(&info... is cleaner, and:

> +		struct in6_pktinfo info;
> +
> +		memset(&info, 0, sizeof(info));

Would make the code consistent with the ipv6 side.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH rfc 3/4] net-timestamp: tcp sockets return v6 errors on v6 sockets
  2014-11-25 17:58 ` [PATCH rfc 3/4] net-timestamp: tcp sockets return v6 errors on v6 sockets Willem de Bruijn
@ 2014-11-25 18:41   ` David Miller
  2014-11-25 19:53     ` Willem de Bruijn
  0 siblings, 1 reply; 17+ messages in thread
From: David Miller @ 2014-11-25 18:41 UTC (permalink / raw)
  To: willemb; +Cc: netdev, luto, richardcochran

From: Willem de Bruijn <willemb@google.com>
Date: Tue, 25 Nov 2014 12:58:05 -0500

> From: Willem de Bruijn <willemb@google.com>
> 
> TCP timestamping introduced MSG_ERRQUEUE handling for TCP sockets.
> It always passed errorqueue requests onto ip_recv_error, but the
> same tcp_recvmsg code may also be called for IPv6 sockets. In that
> case, pass to ipv6_recv_error.
> 
> Tested by asking for PKTINFO with
> 
>   Documentation/networking/timestamping/txtimestamp -I
> 
> Before this change, IPv6 sockets would return AF_INET/IP_PKTINFO
> after the change, these sockets return AF_INET6/IPV6_PKTINFO
> 
> Signed-off-by: Willem de Bruijn <willemb@google.com>

This looks like a bug fix to me, and is therefore probably 'net'
material.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH rfc 1/4] net-timestamp: pull headers for SOCK_STREAM
  2014-11-25 17:58 ` [PATCH rfc 1/4] net-timestamp: pull headers for SOCK_STREAM Willem de Bruijn
@ 2014-11-25 18:42   ` David Miller
  2014-11-25 19:52     ` Willem de Bruijn
  0 siblings, 1 reply; 17+ messages in thread
From: David Miller @ 2014-11-25 18:42 UTC (permalink / raw)
  To: willemb; +Cc: netdev, luto, richardcochran

From: Willem de Bruijn <willemb@google.com>
Date: Tue, 25 Nov 2014 12:58:03 -0500

> From: Willem de Bruijn <willemb@google.com>
> 
> When returning timestamped packets on the error queue, only return
> the data that the application initially sent: not the protocol
> headers.
> 
> This changes the ABI. The TCP interface is new enough that it should
> be safe to do so. The UDP interface could be changed analogously with
> 
> +  else if (sk->sk_protocol == IPPROTO_UDP)
> +    skb_pull(skb, skb_transport_offset(skb) + sizeof(struct udphdr));
> 
> Tested with Documentation/networking/timestamping/txtimestamp -l 60 -x
> 
> Signed-off-by: Willem de Bruijn <willemb@google.com>

What's the harm in exposing the headers?  Either it's harmful, and
therefore doing so for UDP is bad too, or it's harmless and we should
probably leave it alone to not risk breaking anyone.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH rfc 1/4] net-timestamp: pull headers for SOCK_STREAM
  2014-11-25 18:42   ` David Miller
@ 2014-11-25 19:52     ` Willem de Bruijn
  2014-11-25 19:54       ` David Miller
  0 siblings, 1 reply; 17+ messages in thread
From: Willem de Bruijn @ 2014-11-25 19:52 UTC (permalink / raw)
  To: David Miller; +Cc: Network Development, Andy Lutomirski, Richard Cochran

On Tue, Nov 25, 2014 at 1:42 PM, David Miller <davem@davemloft.net> wrote:
> From: Willem de Bruijn <willemb@google.com>
> Date: Tue, 25 Nov 2014 12:58:03 -0500
>
>> From: Willem de Bruijn <willemb@google.com>
>>
>> When returning timestamped packets on the error queue, only return
>> the data that the application initially sent: not the protocol
>> headers.
>>
>> This changes the ABI. The TCP interface is new enough that it should
>> be safe to do so. The UDP interface could be changed analogously with
>>
>> +  else if (sk->sk_protocol == IPPROTO_UDP)
>> +    skb_pull(skb, skb_transport_offset(skb) + sizeof(struct udphdr));
>>
>> Tested with Documentation/networking/timestamping/txtimestamp -l 60 -x
>>
>> Signed-off-by: Willem de Bruijn <willemb@google.com>
>
> What's the harm in exposing the headers?  Either it's harmful, and
> therefore doing so for UDP is bad too, or it's harmless and

Headers may expose information not available otherwise. I don't
immediately see critical problems, but that does not mean that they
might not lurk there.

We so far avoid exposing the sequence number, though keeping it hidden
is more about third parties. More in general, unprivileged processes
may start requesting timestamps only to learn tcp state that they
should either get from tcpinfo or cannot currently get at all, likely
for good reason. A far-fetched example is identifying admin iptables
tos mangling rules by reading the tos bits at the driver layer. At least
on my machine, iptables -L is privileged.

> we should probably leave it alone to not risk breaking anyone.

That's fair. I sent it for rfc first for that reason. I won't resubmit
unless more serious concerns are raised.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH rfc 3/4] net-timestamp: tcp sockets return v6 errors on v6 sockets
  2014-11-25 18:41   ` David Miller
@ 2014-11-25 19:53     ` Willem de Bruijn
  0 siblings, 0 replies; 17+ messages in thread
From: Willem de Bruijn @ 2014-11-25 19:53 UTC (permalink / raw)
  To: David Miller; +Cc: Network Development, Andy Lutomirski, Richard Cochran

On Tue, Nov 25, 2014 at 1:41 PM, David Miller <davem@davemloft.net> wrote:
> From: Willem de Bruijn <willemb@google.com>
> Date: Tue, 25 Nov 2014 12:58:05 -0500
>
>> From: Willem de Bruijn <willemb@google.com>
>>
>> TCP timestamping introduced MSG_ERRQUEUE handling for TCP sockets.
>> It always passed errorqueue requests onto ip_recv_error, but the
>> same tcp_recvmsg code may also be called for IPv6 sockets. In that
>> case, pass to ipv6_recv_error.
>>
>> Tested by asking for PKTINFO with
>>
>>   Documentation/networking/timestamping/txtimestamp -I
>>
>> Before this change, IPv6 sockets would return AF_INET/IP_PKTINFO
>> after the change, these sockets return AF_INET6/IPV6_PKTINFO
>>
>> Signed-off-by: Willem de Bruijn <willemb@google.com>
>
> This looks like a bug fix to me, and is therefore probably 'net'
> material.

Okay. I'll submit it separately to net.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH rfc 1/4] net-timestamp: pull headers for SOCK_STREAM
  2014-11-25 19:52     ` Willem de Bruijn
@ 2014-11-25 19:54       ` David Miller
  2014-11-25 21:39         ` Andy Lutomirski
  0 siblings, 1 reply; 17+ messages in thread
From: David Miller @ 2014-11-25 19:54 UTC (permalink / raw)
  To: willemb; +Cc: netdev, luto, richardcochran

From: Willem de Bruijn <willemb@google.com>
Date: Tue, 25 Nov 2014 14:52:00 -0500

> On Tue, Nov 25, 2014 at 1:42 PM, David Miller <davem@davemloft.net> wrote:
>> From: Willem de Bruijn <willemb@google.com>
>> Date: Tue, 25 Nov 2014 12:58:03 -0500
>>
>> What's the harm in exposing the headers?  Either it's harmful, and
>> therefore doing so for UDP is bad too, or it's harmless and
> 
> Headers may expose information not available otherwise. I don't
> immediately see critical problems, but that does not mean that they
> might not lurk there.
> 
> We so far avoid exposing the sequence number, though keeping it hidden
> is more about third parties. More in general, unprivileged processes
> may start requesting timestamps only to learn tcp state that they
> should either get from tcpinfo or cannot currently get at all, likely
> for good reason. A far-fetched example is identifying admin iptables
> tos mangling rules by reading the tos bits at the driver layer. At least
> on my machine, iptables -L is privileged.
> 
>> we should probably leave it alone to not risk breaking anyone.
> 
> That's fair. I sent it for rfc first for that reason. I won't resubmit
> unless more serious concerns are raised.

I just worry about the potential breakage.

Your concerns are valid... I honestly don't know what we should do here.
Both choices have merit.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH rfc 2/4] net-errqueue: add IP(V6)_PKTINFO support
  2014-11-25 17:58 ` [PATCH rfc 2/4] net-errqueue: add IP(V6)_PKTINFO support Willem de Bruijn
  2014-11-25 18:04   ` Willem de Bruijn
  2014-11-25 18:41   ` David Miller
@ 2014-11-25 21:16   ` Willem de Bruijn
  2 siblings, 0 replies; 17+ messages in thread
From: Willem de Bruijn @ 2014-11-25 21:16 UTC (permalink / raw)
  To: Network Development
  Cc: David Miller, Andy Lutomirski, Richard Cochran, Willem de Bruijn

On Tue, Nov 25, 2014 at 12:58 PM, Willem de Bruijn <willemb@google.com> wrote:
> From: Willem de Bruijn <willemb@google.com>
>
> On INET and INET6 sockets with the IP_PKTINFO or IPV6_RECVPKTINFO
> socket option set, return a matching cmsg also for packets queued
> to the error queue.
>
> These packets are transmitted packets looped back. The fields of
> struct in[6]_pktinfo are changed to reflect that:
>
>   ifindex:   index of the outgoing device, if configured
>   addr:      destination address
>   spec_dst:  source address  (absent in v6)
>
> On IPv6, the mechanism currently returns two IPV6_PKTINFO when
> the option is enabled, because the existing code already supports
> the feature.

This points to a wider difference between when
ip_recv_error and ipv6_recv_error fill sockaddr and cmsg:

    if (serr->ee.ee_origin == SO_EE_ORIGIN_ICMP) {

vs

    if (serr->ee.ee_origin != SO_EE_ORIGIN_LOCAL) {

The two were introduced in the same patch, when they were equivalent
bar the difference between ICMP and ICMP6, which this neatly handles.

Now that there are SO_EE_ORIGIN_TXSTATUS and
SO_EE_ORIGIN_TIMESTAMPING, this is no longer the case. This
is another case where it's not clear whether the difference should be
preserved for legacy reasons, or the two should be harmonized. The
fix, if any, is to make ipv6 explicitly check for ICMP and IMCP6.

> The legacy data does not produce a correct ifindex,
> however. Perhaps I can revise the patch to only send the legacy
> patch, but use the correct ifindex source in case of ERRQUEUE.
>
> Signed-off-by: Willem de Bruijn <willemb@google.com>
> ---
>  net/ipv4/ip_sockglue.c | 15 +++++++++++++++
>  net/ipv6/datagram.c    | 22 ++++++++++++++++++++++
>  2 files changed, 37 insertions(+)
>
> diff --git a/net/ipv4/ip_sockglue.c b/net/ipv4/ip_sockglue.c
> index b782657..615f783 100644
> --- a/net/ipv4/ip_sockglue.c
> +++ b/net/ipv4/ip_sockglue.c
> @@ -399,6 +399,20 @@ void ip_local_error(struct sock *sk, int err, __be32 daddr, __be16 port, u32 inf
>                 kfree_skb(skb);
>  }
>
> +static void ip_recv_error_pktinfo(struct msghdr *msg, struct sock *sk,
> +                                 struct sk_buff *skb)
> +{
> +       if (inet_sk(sk)->cmsg_flags & IP_CMSG_PKTINFO && skb->dev) {
> +               struct in_pktinfo info = {0};
> +
> +               info.ipi_spec_dst.s_addr = ip_hdr(skb)->saddr;
> +               info.ipi_addr.s_addr = ip_hdr(skb)->daddr;
> +               info.ipi_ifindex = skb->dev->ifindex;
> +
> +               put_cmsg(msg, SOL_IP, IP_PKTINFO, sizeof(info), &info);
> +       }
> +}
> +
>  /*
>   *     Handle MSG_ERRQUEUE
>   */
> @@ -429,6 +443,7 @@ int ip_recv_error(struct sock *sk, struct msghdr *msg, int len, int *addr_len)
>                 goto out_free_skb;
>
>         sock_recv_timestamp(msg, sk, skb);
> +       ip_recv_error_pktinfo(msg, sk, skb);
>
>         serr = SKB_EXT_ERR(skb);
>
> diff --git a/net/ipv6/datagram.c b/net/ipv6/datagram.c
> index cc11396..7d2ef7c 100644
> --- a/net/ipv6/datagram.c
> +++ b/net/ipv6/datagram.c
> @@ -325,6 +325,27 @@ void ipv6_local_rxpmtu(struct sock *sk, struct flowi6 *fl6, u32 mtu)
>         kfree_skb(skb);
>  }
>
> +static void ipv6_recv_error_pktinfo(struct msghdr *msg, struct sock *sk,
> +                                   struct sk_buff *skb)
> +{
> +       struct ipv6_pinfo *np = inet6_sk(sk);
> +
> +       if (np->rxopt.bits.rxinfo && skb->dev) {
> +               struct in6_pktinfo info;
> +
> +               memset(&info, 0, sizeof(info));
> +               if (skb->protocol == htons(ETH_P_IPV6))
> +                       info.ipi6_addr = ipv6_hdr(skb)->daddr;
> +               else
> +                       ipv6_addr_set_v4mapped(ip_hdr(skb)->daddr,
> +                                              &info.ipi6_addr);
> +
> +               info.ipi6_ifindex = skb->dev->ifindex;
> +               net_info_ratelimited("yes: ifindex=%d\n", info.ipi6_ifindex);
> +               put_cmsg(msg, SOL_IPV6, IPV6_PKTINFO, sizeof(info), &info);
> +       }
> +}
> +
>  /*
>   *     Handle MSG_ERRQUEUE
>   */
> @@ -356,6 +377,7 @@ int ipv6_recv_error(struct sock *sk, struct msghdr *msg, int len, int *addr_len)
>                 goto out_free_skb;
>
>         sock_recv_timestamp(msg, sk, skb);
> +       ipv6_recv_error_pktinfo(msg, sk, skb);
>
>         serr = SKB_EXT_ERR(skb);
>
> --
> 2.1.0.rc2.206.gedb03e5
>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH rfc 1/4] net-timestamp: pull headers for SOCK_STREAM
  2014-11-25 19:54       ` David Miller
@ 2014-11-25 21:39         ` Andy Lutomirski
  2014-11-26 21:03           ` Willem de Bruijn
  0 siblings, 1 reply; 17+ messages in thread
From: Andy Lutomirski @ 2014-11-25 21:39 UTC (permalink / raw)
  To: David Miller; +Cc: Willem de Bruijn, Network Development, Richard Cochran

On Tue, Nov 25, 2014 at 11:54 AM, David Miller <davem@davemloft.net> wrote:
> From: Willem de Bruijn <willemb@google.com>
> Date: Tue, 25 Nov 2014 14:52:00 -0500
>
>> On Tue, Nov 25, 2014 at 1:42 PM, David Miller <davem@davemloft.net> wrote:
>>> From: Willem de Bruijn <willemb@google.com>
>>> Date: Tue, 25 Nov 2014 12:58:03 -0500
>>>
>>> What's the harm in exposing the headers?  Either it's harmful, and
>>> therefore doing so for UDP is bad too, or it's harmless and
>>
>> Headers may expose information not available otherwise. I don't
>> immediately see critical problems, but that does not mean that they
>> might not lurk there.
>>
>> We so far avoid exposing the sequence number, though keeping it hidden
>> is more about third parties. More in general, unprivileged processes
>> may start requesting timestamps only to learn tcp state that they
>> should either get from tcpinfo or cannot currently get at all, likely
>> for good reason. A far-fetched example is identifying admin iptables
>> tos mangling rules by reading the tos bits at the driver layer. At least
>> on my machine, iptables -L is privileged.
>>
>>> we should probably leave it alone to not risk breaking anyone.
>>
>> That's fair. I sent it for rfc first for that reason. I won't resubmit
>> unless more serious concerns are raised.
>
> I just worry about the potential breakage.
>
> Your concerns are valid... I honestly don't know what we should do here.
> Both choices have merit.

Here's a scenario in which giving the headers might be dangerous:

Suppose I create a network namespace that's designed to contain
something, e.g. a Tor or Tor-like client, that shouldn't know any of
its public addressing information.  I might assign something like a
tunnel interface to the namespace, but, if the contained code can get
lower-level headers, it might learn something that would identify the
*other* end of the tunnel, which wouldn't be so good.  Admittedly,
this would be just one of several things that would require care to
get this right.

Also, what happens if the output is transformed by ipsec?  Does the
timestamp message show the ciphertext?

TBH, I'd rather send no payload at all and have an scm message that
the sender provides that specifies a cookie identifying the particular
sent data.  But that ship mostly sailed awhile ago.

For bytestreams, though, isn't this all new in 3.18?  Or am I off by a release.

--Andy

-- 
Andy Lutomirski
AMA Capital Management, LLC

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH rfc 1/4] net-timestamp: pull headers for SOCK_STREAM
  2014-11-25 21:39         ` Andy Lutomirski
@ 2014-11-26 21:03           ` Willem de Bruijn
  2014-11-27  0:36             ` Andy Lutomirski
  0 siblings, 1 reply; 17+ messages in thread
From: Willem de Bruijn @ 2014-11-26 21:03 UTC (permalink / raw)
  To: Andy Lutomirski; +Cc: David Miller, Network Development, Richard Cochran

On Tue, Nov 25, 2014 at 4:39 PM, Andy Lutomirski <luto@amacapital.net> wrote:
> On Tue, Nov 25, 2014 at 11:54 AM, David Miller <davem@davemloft.net> wrote:
>> From: Willem de Bruijn <willemb@google.com>
>> Date: Tue, 25 Nov 2014 14:52:00 -0500
>>
>>> On Tue, Nov 25, 2014 at 1:42 PM, David Miller <davem@davemloft.net> wrote:
>>>> From: Willem de Bruijn <willemb@google.com>
>>>> Date: Tue, 25 Nov 2014 12:58:03 -0500
>>>>
>>>> What's the harm in exposing the headers?  Either it's harmful, and
>>>> therefore doing so for UDP is bad too, or it's harmless and
>>>
>>> Headers may expose information not available otherwise. I don't
>>> immediately see critical problems, but that does not mean that they
>>> might not lurk there.
>>>
>>> We so far avoid exposing the sequence number, though keeping it hidden
>>> is more about third parties. More in general, unprivileged processes
>>> may start requesting timestamps only to learn tcp state that they
>>> should either get from tcpinfo or cannot currently get at all, likely
>>> for good reason. A far-fetched example is identifying admin iptables
>>> tos mangling rules by reading the tos bits at the driver layer. At least
>>> on my machine, iptables -L is privileged.
>>>
>>>> we should probably leave it alone to not risk breaking anyone.
>>>
>>> That's fair. I sent it for rfc first for that reason. I won't resubmit
>>> unless more serious concerns are raised.
>>
>> I just worry about the potential breakage.
>>
>> Your concerns are valid... I honestly don't know what we should do here.
>> Both choices have merit.
>
> Here's a scenario in which giving the headers might be dangerous:
>
> Suppose I create a network namespace that's designed to contain
> something, e.g. a Tor or Tor-like client, that shouldn't know any of
> its public addressing information.  I might assign something like a
> tunnel interface to the namespace, but, if the contained code can get
> lower-level headers, it might learn something that would identify the
> *other* end of the tunnel, which wouldn't be so good.  Admittedly,
> this would be just one of several things that would require care to
> get this right.

network namespaces are an interesting case, indeed.

>
> Also, what happens if the output is transformed by ipsec?  Does the
> timestamp message show the ciphertext?
>
> TBH, I'd rather send no payload at all and have an scm message that
> the sender provides that specifies a cookie identifying the particular
> sent data.  But that ship mostly sailed awhile ago.
>
> For bytestreams, though, isn't this all new in 3.18?  Or am I off by a release.

It was added in 3.17. That is still very recent.

One third option, though hardly pretty, is to put display of headers
under administrator control. An application cannot easily infer whether
headers are stripped, and legacy applications do not even know to try.
So, this is a bit too crude:

+    if (sk->sk_protocol == IPPROTO_TCP && sysctl_net_blind_errqueue)
+        skb_pull(skb, skb_transport_offset(skb) + tcp_hdrlen(skb));
+    else if (sk->sk_protocol == IPPROTO_UDP && sysctl_net_blind_errqueue >= 2)
+        skb_pull(skb, skb_transport_offset(skb) + sizeof(struct udphdr));

An alternative is to add a timestamping option to skip headers (or
even full payload, basically
http://patchwork.ozlabs.org/patch/366967/) and give the administrator
a sysctl to drop all requests that do not pass this flag. The intent
is that future proof applications will start requesting the flag, and
relying on the ts counter. Hardened installations can set the sysctl
from the start, accepting possible breakage.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH rfc 1/4] net-timestamp: pull headers for SOCK_STREAM
  2014-11-26 21:03           ` Willem de Bruijn
@ 2014-11-27  0:36             ` Andy Lutomirski
  2014-11-27 12:30               ` Richard Cochran
  0 siblings, 1 reply; 17+ messages in thread
From: Andy Lutomirski @ 2014-11-27  0:36 UTC (permalink / raw)
  To: Willem de Bruijn; +Cc: David Miller, Network Development, Richard Cochran

On Wed, Nov 26, 2014 at 1:03 PM, Willem de Bruijn <willemb@google.com> wrote:
> On Tue, Nov 25, 2014 at 4:39 PM, Andy Lutomirski <luto@amacapital.net> wrote:
>> On Tue, Nov 25, 2014 at 11:54 AM, David Miller <davem@davemloft.net> wrote:
>>> From: Willem de Bruijn <willemb@google.com>
>>> Date: Tue, 25 Nov 2014 14:52:00 -0500
>>>
>>>> On Tue, Nov 25, 2014 at 1:42 PM, David Miller <davem@davemloft.net> wrote:
>>>>> From: Willem de Bruijn <willemb@google.com>
>>>>> Date: Tue, 25 Nov 2014 12:58:03 -0500
>>>>>
>>>>> What's the harm in exposing the headers?  Either it's harmful, and
>>>>> therefore doing so for UDP is bad too, or it's harmless and
>>>>
>>>> Headers may expose information not available otherwise. I don't
>>>> immediately see critical problems, but that does not mean that they
>>>> might not lurk there.
>>>>
>>>> We so far avoid exposing the sequence number, though keeping it hidden
>>>> is more about third parties. More in general, unprivileged processes
>>>> may start requesting timestamps only to learn tcp state that they
>>>> should either get from tcpinfo or cannot currently get at all, likely
>>>> for good reason. A far-fetched example is identifying admin iptables
>>>> tos mangling rules by reading the tos bits at the driver layer. At least
>>>> on my machine, iptables -L is privileged.
>>>>
>>>>> we should probably leave it alone to not risk breaking anyone.
>>>>
>>>> That's fair. I sent it for rfc first for that reason. I won't resubmit
>>>> unless more serious concerns are raised.
>>>
>>> I just worry about the potential breakage.
>>>
>>> Your concerns are valid... I honestly don't know what we should do here.
>>> Both choices have merit.
>>
>> Here's a scenario in which giving the headers might be dangerous:
>>
>> Suppose I create a network namespace that's designed to contain
>> something, e.g. a Tor or Tor-like client, that shouldn't know any of
>> its public addressing information.  I might assign something like a
>> tunnel interface to the namespace, but, if the contained code can get
>> lower-level headers, it might learn something that would identify the
>> *other* end of the tunnel, which wouldn't be so good.  Admittedly,
>> this would be just one of several things that would require care to
>> get this right.
>
> network namespaces are an interesting case, indeed.
>
>>
>> Also, what happens if the output is transformed by ipsec?  Does the
>> timestamp message show the ciphertext?
>>
>> TBH, I'd rather send no payload at all and have an scm message that
>> the sender provides that specifies a cookie identifying the particular
>> sent data.  But that ship mostly sailed awhile ago.
>>
>> For bytestreams, though, isn't this all new in 3.18?  Or am I off by a release.
>
> It was added in 3.17. That is still very recent.
>
> One third option, though hardly pretty, is to put display of headers
> under administrator control. An application cannot easily infer whether
> headers are stripped, and legacy applications do not even know to try.
> So, this is a bit too crude:
>
> +    if (sk->sk_protocol == IPPROTO_TCP && sysctl_net_blind_errqueue)
> +        skb_pull(skb, skb_transport_offset(skb) + tcp_hdrlen(skb));
> +    else if (sk->sk_protocol == IPPROTO_UDP && sysctl_net_blind_errqueue >= 2)
> +        skb_pull(skb, skb_transport_offset(skb) + sizeof(struct udphdr));
>
> An alternative is to add a timestamping option to skip headers (or
> even full payload, basically
> http://patchwork.ozlabs.org/patch/366967/) and give the administrator
> a sysctl to drop all requests that do not pass this flag. The intent
> is that future proof applications will start requesting the flag, and
> relying on the ts counter. Hardened installations can set the sysctl
> from the start, accepting possible breakage.

Is there any reason to believe that unconditionally dropping the
headers would break anything?  I find it a bit hard to believe that
anyone has actually implemented logic to figure out *what* L2 header
type should be decoded and decode it.

I can imagine that someone has hardcoded an assumption that the
underlying interface is Ethernet, but there's still the whole pile of
vlan, random datacenter encapsulation protocols and such to worry
about.

--Andy

-- 
Andy Lutomirski
AMA Capital Management, LLC

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH rfc 1/4] net-timestamp: pull headers for SOCK_STREAM
  2014-11-27  0:36             ` Andy Lutomirski
@ 2014-11-27 12:30               ` Richard Cochran
  0 siblings, 0 replies; 17+ messages in thread
From: Richard Cochran @ 2014-11-27 12:30 UTC (permalink / raw)
  To: Andy Lutomirski; +Cc: Willem de Bruijn, David Miller, Network Development

On Wed, Nov 26, 2014 at 04:36:39PM -0800, Andy Lutomirski wrote:
> Is there any reason to believe that unconditionally dropping the
> headers would break anything?  I find it a bit hard to believe that
> anyone has actually implemented logic to figure out *what* L2 header
> type should be decoded and decode it.

Documentation/networking/timestamping/timestamping.c

				else if (!memcmp(sync, data + res - sizeof(sync),
							sizeof(sync)))
					printf(" => GOT OUR DATA BACK (HURRAY!)");

The example program looks from the end of the buffer, ignoring the lower headers.

Thanks,
Richard

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2014-11-27 12:30 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-11-25 17:58 [PATCH rfc 0/4] timestamping updates Willem de Bruijn
2014-11-25 17:58 ` [PATCH rfc 1/4] net-timestamp: pull headers for SOCK_STREAM Willem de Bruijn
2014-11-25 18:42   ` David Miller
2014-11-25 19:52     ` Willem de Bruijn
2014-11-25 19:54       ` David Miller
2014-11-25 21:39         ` Andy Lutomirski
2014-11-26 21:03           ` Willem de Bruijn
2014-11-27  0:36             ` Andy Lutomirski
2014-11-27 12:30               ` Richard Cochran
2014-11-25 17:58 ` [PATCH rfc 2/4] net-errqueue: add IP(V6)_PKTINFO support Willem de Bruijn
2014-11-25 18:04   ` Willem de Bruijn
2014-11-25 18:41   ` David Miller
2014-11-25 21:16   ` Willem de Bruijn
2014-11-25 17:58 ` [PATCH rfc 3/4] net-timestamp: tcp sockets return v6 errors on v6 sockets Willem de Bruijn
2014-11-25 18:41   ` David Miller
2014-11-25 19:53     ` Willem de Bruijn
2014-11-25 17:58 ` [PATCH rfc 4/4] net-timestamp: expand txtimestamp test with payload and PKTINFO Willem de Bruijn

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).