[PATCH net v2 0/3] packet fixes

netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [PATCH net v2 0/3] packet fixes
@ 2015-11-10 22:03 Daniel Borkmann
  2015-11-10 22:03 ` [PATCH net v2 1/3] packet: do skb_probe_transport_header when we actually have data Daniel Borkmann
                   ` (2 more replies)
  0 siblings, 3 replies; 12+ messages in thread
From: Daniel Borkmann @ 2015-11-10 22:03 UTC (permalink / raw)
  To: davem; +Cc: edumazet, willemb, tklauser, netdev, Daniel Borkmann

Fixes a couple of issues in packet sockets, i.e. on TX ring side. See
individual patches for details.

v1 -> v2:
 - Added patch 2 as suggested by Dave
 - Rest is unchanged from previous submission

Daniel Borkmann (3):
  packet: do skb_probe_transport_header when we actually have data
  packet: always probe for transport header
  packet: fix tpacket_snd max frame and vlan handling

 net/packet/af_packet.c | 109 ++++++++++++++++++++++++++++++-------------------
 1 file changed, 67 insertions(+), 42 deletions(-)

-- 
1.9.3

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH net v2 1/3] packet: do skb_probe_transport_header when we actually have data
  2015-11-10 22:03 [PATCH net v2 0/3] packet fixes Daniel Borkmann
@ 2015-11-10 22:03 ` Daniel Borkmann
  2015-11-11  5:07   ` Jason Wang
  2015-11-10 22:03 ` [PATCH net v2 2/3] packet: always probe for transport header Daniel Borkmann
  2015-11-10 22:03 ` [PATCH net v2 3/3] packet: fix tpacket_snd max frame and vlan handling Daniel Borkmann
  2 siblings, 1 reply; 12+ messages in thread
From: Daniel Borkmann @ 2015-11-10 22:03 UTC (permalink / raw)
  To: davem; +Cc: edumazet, willemb, tklauser, netdev, Daniel Borkmann, Jason Wang

In tpacket_fill_skb() commit c1aad275b029 ("packet: set transport
header before doing xmit") and later on 40893fd0fd4e ("net: switch
to use skb_probe_transport_header()") was probing for a transport
header on the skb from a ring buffer slot, but at a time, where
the skb has _not even_ been filled with data yet. So that call into
the flow dissector is pretty useless. Lets do it after we've set
up the skb frags.

Fixes: c1aad275b029 ("packet: set transport header before doing xmit")
Reported-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jason Wang <jasowang@redhat.com>
---
 net/packet/af_packet.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index af399ca..80c36c0 100644
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -2368,8 +2368,6 @@ static int tpacket_fill_skb(struct packet_sock *po, struct sk_buff *skb,
 	skb_reserve(skb, hlen);
 	skb_reset_network_header(skb);
 
-	if (!packet_use_direct_xmit(po))
-		skb_probe_transport_header(skb, 0);
 	if (unlikely(po->tp_tx_has_off)) {
 		int off_min, off_max, off;
 		off_min = po->tp_hdrlen - sizeof(struct sockaddr_ll);
@@ -2449,6 +2447,9 @@ static int tpacket_fill_skb(struct packet_sock *po, struct sk_buff *skb,
 		len = ((to_write > len_max) ? len_max : to_write);
 	}
 
+	if (!packet_use_direct_xmit(po))
+		skb_probe_transport_header(skb, 0);
+
 	return tp_len;
 }
 
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH net v2 1/3] packet: do skb_probe_transport_header when we actually have data
  2015-11-10 22:03 ` [PATCH net v2 1/3] packet: do skb_probe_transport_header when we actually have data Daniel Borkmann
@ 2015-11-11  5:07   ` Jason Wang
  0 siblings, 0 replies; 12+ messages in thread
From: Jason Wang @ 2015-11-11  5:07 UTC (permalink / raw)
  To: Daniel Borkmann, davem; +Cc: edumazet, willemb, tklauser, netdev



On 11/11/2015 06:03 AM, Daniel Borkmann wrote:
> In tpacket_fill_skb() commit c1aad275b029 ("packet: set transport
> header before doing xmit") and later on 40893fd0fd4e ("net: switch
> to use skb_probe_transport_header()") was probing for a transport
> header on the skb from a ring buffer slot, but at a time, where
> the skb has _not even_ been filled with data yet. So that call into
> the flow dissector is pretty useless. Lets do it after we've set
> up the skb frags.
>
> Fixes: c1aad275b029 ("packet: set transport header before doing xmit")
> Reported-by: Eric Dumazet <edumazet@google.com>
> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
> Cc: Jason Wang <jasowang@redhat.com>
> ---
>  net/packet/af_packet.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
> index af399ca..80c36c0 100644
> --- a/net/packet/af_packet.c
> +++ b/net/packet/af_packet.c
> @@ -2368,8 +2368,6 @@ static int tpacket_fill_skb(struct packet_sock *po, struct sk_buff *skb,
>  	skb_reserve(skb, hlen);
>  	skb_reset_network_header(skb);
>  
> -	if (!packet_use_direct_xmit(po))
> -		skb_probe_transport_header(skb, 0);
>  	if (unlikely(po->tp_tx_has_off)) {
>  		int off_min, off_max, off;
>  		off_min = po->tp_hdrlen - sizeof(struct sockaddr_ll);
> @@ -2449,6 +2447,9 @@ static int tpacket_fill_skb(struct packet_sock *po, struct sk_buff *skb,
>  		len = ((to_write > len_max) ? len_max : to_write);
>  	}
>  
> +	if (!packet_use_direct_xmit(po))
> +		skb_probe_transport_header(skb, 0);
> +
>  	return tp_len;
>  }
>  

Acked-by: Jason Wang <jasowang@redhat.com>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH net v2 2/3] packet: always probe for transport header
  2015-11-10 22:03 [PATCH net v2 0/3] packet fixes Daniel Borkmann
  2015-11-10 22:03 ` [PATCH net v2 1/3] packet: do skb_probe_transport_header when we actually have data Daniel Borkmann
@ 2015-11-10 22:03 ` Daniel Borkmann
  2015-11-11  5:08   ` Jason Wang
  2015-11-10 22:03 ` [PATCH net v2 3/3] packet: fix tpacket_snd max frame and vlan handling Daniel Borkmann
  2 siblings, 1 reply; 12+ messages in thread
From: Daniel Borkmann @ 2015-11-10 22:03 UTC (permalink / raw)
  To: davem; +Cc: edumazet, willemb, tklauser, netdev, Daniel Borkmann, Jason Wang

We concluded that the skb_probe_transport_header() should better be
called unconditionally. Avoiding the call into the flow dissector has
also not really much to do with the direct xmit mode.

While it seems that only virtio_net code makes use of GSO from non
RX/TX ring packet socket paths, we should probe for a transport header
nevertheless before they hit devices.

Reference: http://thread.gmane.org/gmane.linux.network/386173/
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jason Wang <jasowang@redhat.com>
---
 net/packet/af_packet.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index 80c36c0..bdecf17 100644
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -2447,8 +2447,7 @@ static int tpacket_fill_skb(struct packet_sock *po, struct sk_buff *skb,
 		len = ((to_write > len_max) ? len_max : to_write);
 	}
 
-	if (!packet_use_direct_xmit(po))
-		skb_probe_transport_header(skb, 0);
+	skb_probe_transport_header(skb, 0);
 
 	return tp_len;
 }
@@ -2808,8 +2807,8 @@ static int packet_snd(struct socket *sock, struct msghdr *msg, size_t len)
 		len += vnet_hdr_len;
 	}
 
-	if (!packet_use_direct_xmit(po))
-		skb_probe_transport_header(skb, reserve);
+	skb_probe_transport_header(skb, reserve);
+
 	if (unlikely(extra_len == 4))
 		skb->no_fcs = 1;
 
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH net v2 2/3] packet: always probe for transport header
  2015-11-10 22:03 ` [PATCH net v2 2/3] packet: always probe for transport header Daniel Borkmann
@ 2015-11-11  5:08   ` Jason Wang
  0 siblings, 0 replies; 12+ messages in thread
From: Jason Wang @ 2015-11-11  5:08 UTC (permalink / raw)
  To: Daniel Borkmann, davem; +Cc: edumazet, willemb, tklauser, netdev



On 11/11/2015 06:03 AM, Daniel Borkmann wrote:
> We concluded that the skb_probe_transport_header() should better be
> called unconditionally. Avoiding the call into the flow dissector has
> also not really much to do with the direct xmit mode.
>
> While it seems that only virtio_net code makes use of GSO from non
> RX/TX ring packet socket paths, we should probe for a transport header
> nevertheless before they hit devices.
>
> Reference: http://thread.gmane.org/gmane.linux.network/386173/
> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
> Cc: Jason Wang <jasowang@redhat.com>
> ---
>  net/packet/af_packet.c | 7 +++----
>  1 file changed, 3 insertions(+), 4 deletions(-)
>
> diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
> index 80c36c0..bdecf17 100644
> --- a/net/packet/af_packet.c
> +++ b/net/packet/af_packet.c
> @@ -2447,8 +2447,7 @@ static int tpacket_fill_skb(struct packet_sock *po, struct sk_buff *skb,
>  		len = ((to_write > len_max) ? len_max : to_write);
>  	}
>  
> -	if (!packet_use_direct_xmit(po))
> -		skb_probe_transport_header(skb, 0);
> +	skb_probe_transport_header(skb, 0);
>  
>  	return tp_len;
>  }
> @@ -2808,8 +2807,8 @@ static int packet_snd(struct socket *sock, struct msghdr *msg, size_t len)
>  		len += vnet_hdr_len;
>  	}
>  
> -	if (!packet_use_direct_xmit(po))
> -		skb_probe_transport_header(skb, reserve);
> +	skb_probe_transport_header(skb, reserve);
> +
>  	if (unlikely(extra_len == 4))
>  		skb->no_fcs = 1;
>  

Acked-by: Jason Wang <jasowang@redhat.com>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH net v2 3/3] packet: fix tpacket_snd max frame and vlan handling
  2015-11-10 22:03 [PATCH net v2 0/3] packet fixes Daniel Borkmann
  2015-11-10 22:03 ` [PATCH net v2 1/3] packet: do skb_probe_transport_header when we actually have data Daniel Borkmann
  2015-11-10 22:03 ` [PATCH net v2 2/3] packet: always probe for transport header Daniel Borkmann
@ 2015-11-10 22:03 ` Daniel Borkmann
  2015-11-10 22:52   ` Willem de Bruijn
  2 siblings, 1 reply; 12+ messages in thread
From: Daniel Borkmann @ 2015-11-10 22:03 UTC (permalink / raw)
  To: davem; +Cc: edumazet, willemb, tklauser, netdev, Daniel Borkmann

There seem to be a couple of issues in tpacket_snd() path. Since it's
introduction in commit 69e3c75f4d54 ("net: TX_RING and packet mmap"),
TX_RING could be used from SOCK_DGRAM and SOCK_RAW side. When used
with SOCK_DGRAM only, the size_max > dev->mtu + reserve check should
have reserve as 0, but currently, this is unconditionally set (in it's
original form as dev->hard_header_len).

I think this is not correct since tpacket_fill_skb() would then take
dev->mtu and dev->hard_header_len into account for SOCK_DGRAM, the
extra VLAN_HLEN could be possible in both cases. Presumably, the
reserve code was copied from packet_snd(), but later on missed the
check.

Moreover, the check for ETH_P_8021Q seems buggy as we potentially
operate on non-linear header data. In sock_alloc_send_skb() we seem
to allocate enough linear head-room in most cases, so the test could
just yield false results. The sock_alloc_send_skb() has an extra room
of (for some reason) sizeof(struct sockaddr_ll), so we have enough
header room for at least ethernet header in case the device doesn't
ask for it. Note that in TX_RING's tpacket_fill_skb() we fetch the
frame data after the slot header (or at some provided offset), thus
from TX side (as opposed to RX_RING), it doesn't contain a sockaddr_ll
structure in between.

Anyway, the ETH_P_8021Q check would be better fixed in tpacket_fill_skb()
by making sure that in SOCK_RAW cases, where we deal with tp_len >
dev->mtu + dev->hard_header_len, to place at least the ethernet header
into the linear section (which is the case in SOCK_DGRAM already).
Doing this test on ring buffer data (either from set up skb or on data
directly) could race underneath us. The test is done at the end so we
can take both socket types into consideration. We check that
skb->protocol and also h_proto are correct in these cases. For packet
sockets that have proto of 0, guess the skb->protocol from the user
payload as suggested.

Fixes: 69e3c75f4d54 ("net: TX_RING and packet mmap")
Fixes: 52f1454f629f ("packet: allow to transmit +4 byte in TX_RING slot for VLAN case")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Eric Dumazet <edumazet@google.com>
---
 net/packet/af_packet.c | 101 ++++++++++++++++++++++++++++++-------------------
 1 file changed, 63 insertions(+), 38 deletions(-)

diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index bdecf17..5d40ad3 100644
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -2320,12 +2320,12 @@ static void tpacket_destruct_skb(struct sk_buff *skb)
 	sock_wfree(skb);
 }
 
-static bool ll_header_truncated(const struct net_device *dev, int len)
+static bool ll_header_truncated(int hard_header_len, int len)
 {
 	/* net device doesn't like empty head */
-	if (unlikely(len <= dev->hard_header_len)) {
+	if (unlikely(len <= hard_header_len)) {
 		net_warn_ratelimited("%s: packet size is too short (%d <= %d)\n",
-				     current->comm, len, dev->hard_header_len);
+				     current->comm, len, hard_header_len);
 		return true;
 	}
 
@@ -2333,8 +2333,9 @@ static bool ll_header_truncated(const struct net_device *dev, int len)
 }
 
 static int tpacket_fill_skb(struct packet_sock *po, struct sk_buff *skb,
-		void *frame, struct net_device *dev, int size_max,
-		__be16 proto, unsigned char *addr, int hlen)
+			    void *frame, struct net_device *dev, int size_max,
+			    __be16 proto, unsigned char *addr, int hlen,
+			    int reserve)
 {
 	union tpacket_uhdr ph;
 	int to_write, offset, len, tp_len, nr_frags, len_max;
@@ -2400,22 +2401,45 @@ static int tpacket_fill_skb(struct packet_sock *po, struct sk_buff *skb,
 	to_write = tp_len;
 
 	if (sock->type == SOCK_DGRAM) {
-		err = dev_hard_header(skb, dev, ntohs(proto), addr,
-				NULL, tp_len);
+		/* In DGRAM sockets, we expect struct sockaddr_ll was filled
+		 * via struct msghdr, so we have dest mac and skb->protocol.
+		 * Otherwise there's not too much useful things we can do in
+		 * this flush run.
+		 */
+		err = dev_hard_header(skb, dev, ntohs(skb->protocol), addr,
+				      NULL, tp_len);
 		if (unlikely(err < 0))
 			return -EINVAL;
-	} else if (dev->hard_header_len) {
-		if (ll_header_truncated(dev, tp_len))
-			return -EINVAL;
+	} else {
+		/* If skb->protocol is still 0, try to infer/guess it. Might
+		 * not be fully reliable in the sense that a user could still
+		 * change/race data afterwards, but on the other hand the proto
+		 * can be set arbitrarily anyways. We only need to take care
+		 * in case of extra large VLAN frames.
+		 */
+		if (!skb->protocol && tp_len >= ETH_HLEN)
+			skb->protocol = ((struct ethhdr *)data)->h_proto;
 
-		skb_push(skb, dev->hard_header_len);
-		err = skb_store_bits(skb, 0, data,
-				dev->hard_header_len);
-		if (unlikely(err))
-			return err;
+		/* Copy Ethernet header in case we need to deal with extra
+		 * VLAN space as otherwise data could change underneath us.
+		 * The caller already accomodated for enough linear room.
+		 */
+		if (dev->hard_header_len || tp_len > dev->mtu + reserve) {
+			int hdr_len = dev->hard_header_len;
+
+			if (hdr_len < ETH_HLEN)
+				hdr_len = ETH_HLEN;
+			if (unlikely(ll_header_truncated(hdr_len, tp_len)))
+				return -EINVAL;
+
+			skb_push(skb, hdr_len);
+			err = skb_store_bits(skb, 0, data, hdr_len);
+			if (unlikely(err))
+				return err;
 
-		data += dev->hard_header_len;
-		to_write -= dev->hard_header_len;
+			data     += hdr_len;
+			to_write -= hdr_len;
+		}
 	}
 
 	offset = offset_in_page(data);
@@ -2447,6 +2471,20 @@ static int tpacket_fill_skb(struct packet_sock *po, struct sk_buff *skb,
 		len = ((to_write > len_max) ? len_max : to_write);
 	}
 
+	/* Check for full frame with extra VLAN space. We can probe
+	 * here on the linear header, if necessary. Earlier code
+	 * assumed this would be a VLAN pkt, double-check this now
+	 * that we have the actual packet and protocol at hand.
+	 */
+	if (tp_len > dev->mtu + reserve) {
+		if (skb->protocol != htons(ETH_P_8021Q))
+			return -EMSGSIZE;
+
+		skb_reset_mac_header(skb);
+		if (eth_hdr(skb)->h_proto != htons(ETH_P_8021Q))
+			return -EMSGSIZE;
+	}
+
 	skb_probe_transport_header(skb, 0);
 
 	return tp_len;
@@ -2493,12 +2531,12 @@ static int tpacket_snd(struct packet_sock *po, struct msghdr *msg)
 	if (unlikely(!(dev->flags & IFF_UP)))
 		goto out_put;
 
-	reserve = dev->hard_header_len + VLAN_HLEN;
-	size_max = po->tx_ring.frame_size
-		- (po->tp_hdrlen - sizeof(struct sockaddr_ll));
-
-	if (size_max > dev->mtu + reserve)
-		size_max = dev->mtu + reserve;
+	if (po->sk.sk_socket->type == SOCK_RAW)
+		reserve = dev->hard_header_len;
+	size_max = po->tx_ring.frame_size - (po->tp_hdrlen -
+					     sizeof(struct sockaddr_ll));
+	if (size_max > dev->mtu + reserve + VLAN_HLEN)
+		size_max = dev->mtu + reserve + VLAN_HLEN;
 
 	do {
 		ph = packet_current_frame(po, &po->tx_ring,
@@ -2523,20 +2561,7 @@ static int tpacket_snd(struct packet_sock *po, struct msghdr *msg)
 			goto out_status;
 		}
 		tp_len = tpacket_fill_skb(po, skb, ph, dev, size_max, proto,
-					  addr, hlen);
-		if (likely(tp_len >= 0) &&
-		    tp_len > dev->mtu + dev->hard_header_len) {
-			struct ethhdr *ehdr;
-			/* Earlier code assumed this would be a VLAN pkt,
-			 * double-check this now that we have the actual
-			 * packet in hand.
-			 */
-
-			skb_reset_mac_header(skb);
-			ehdr = eth_hdr(skb);
-			if (ehdr->h_proto != htons(ETH_P_8021Q))
-				tp_len = -EMSGSIZE;
-		}
+					  addr, hlen, reserve);
 		if (unlikely(tp_len < 0)) {
 			if (po->tp_loss) {
 				__packet_set_status(po, ph,
@@ -2754,7 +2779,7 @@ static int packet_snd(struct socket *sock, struct msghdr *msg, size_t len)
 		if (unlikely(offset < 0))
 			goto out_free;
 	} else {
-		if (ll_header_truncated(dev, len))
+		if (unlikely(ll_header_truncated(dev->hard_header_len, len)))
 			goto out_free;
 	}
 
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH net v2 3/3] packet: fix tpacket_snd max frame and vlan handling
  2015-11-10 22:03 ` [PATCH net v2 3/3] packet: fix tpacket_snd max frame and vlan handling Daniel Borkmann
@ 2015-11-10 22:52   ` Willem de Bruijn
  2015-11-10 23:12     ` Daniel Borkmann
  0 siblings, 1 reply; 12+ messages in thread
From: Willem de Bruijn @ 2015-11-10 22:52 UTC (permalink / raw)
  To: Daniel Borkmann; +Cc: David Miller, Eric Dumazet, tklauser, Network Development

>         if (sock->type == SOCK_DGRAM) {
> -               err = dev_hard_header(skb, dev, ntohs(proto), addr,
> -                               NULL, tp_len);
> +               /* In DGRAM sockets, we expect struct sockaddr_ll was filled
> +                * via struct msghdr, so we have dest mac and skb->protocol.
> +                * Otherwise there's not too much useful things we can do in
> +                * this flush run.
> +                */
> +               err = dev_hard_header(skb, dev, ntohs(skb->protocol), addr,
> +                                     NULL, tp_len);

This change is not really necessary.

>                 if (unlikely(err < 0))
>                         return -EINVAL;
> -       } else if (dev->hard_header_len) {

Why remove the check on hard_header_len?

> -               if (ll_header_truncated(dev, tp_len))
> -                       return -EINVAL;
> +       } else {
> +               /* If skb->protocol is still 0, try to infer/guess it. Might
> +                * not be fully reliable in the sense that a user could still
> +                * change/race data afterwards, but on the other hand the proto

The race goes away when probing it after the copy in skb_store_bits.
Then it is also certain that tp_len is long enough to hold the entire
link layer header.

> +                * can be set arbitrarily anyways. We only need to take care
> +                * in case of extra large VLAN frames.
> +                */
> +               if (!skb->protocol && tp_len >= ETH_HLEN)
> +                       skb->protocol = ((struct ethhdr *)data)->h_proto;

Packet sockets are not restricted to link layer of type Ethernet.

There are a few other points in this file that also cast mac header
to eth_hdr(skb).

>
> -               skb_push(skb, dev->hard_header_len);
> -               err = skb_store_bits(skb, 0, data,
> -                               dev->hard_header_len);
> -               if (unlikely(err))
> -                       return err;
> +               /* Copy Ethernet header in case we need to deal with extra
> +                * VLAN space as otherwise data could change underneath us.
> +                * The caller already accomodated for enough linear room.
> +                */
> +               if (dev->hard_header_len || tp_len > dev->mtu + reserve) {
> +                       int hdr_len = dev->hard_header_len;
> +
> +                       if (hdr_len < ETH_HLEN)
> +                               hdr_len = ETH_HLEN;
> +                       if (unlikely(ll_header_truncated(hdr_len, tp_len)))
> +                               return -EINVAL;
> +
> +                       skb_push(skb, hdr_len);
> +                       err = skb_store_bits(skb, 0, data, hdr_len);
> +                       if (unlikely(err))
> +                               return err;
>
> -               data += dev->hard_header_len;
> -               to_write -= dev->hard_header_len;
> +                       data     += hdr_len;
> +                       to_write -= hdr_len;
> +               }
>         }
>
>         offset = offset_in_page(data);
> @@ -2447,6 +2471,20 @@ static int tpacket_fill_skb(struct packet_sock *po, struct sk_buff *skb,
>                 len = ((to_write > len_max) ? len_max : to_write);
>         }
>
> +       /* Check for full frame with extra VLAN space. We can probe
> +        * here on the linear header, if necessary. Earlier code
> +        * assumed this would be a VLAN pkt, double-check this now
> +        * that we have the actual packet and protocol at hand.
> +        */
> +       if (tp_len > dev->mtu + reserve) {
> +               if (skb->protocol != htons(ETH_P_8021Q))
> +                       return -EMSGSIZE;
> +
> +               skb_reset_mac_header(skb);
> +               if (eth_hdr(skb)->h_proto != htons(ETH_P_8021Q))
> +                       return -EMSGSIZE;
> +       }
> +
>         skb_probe_transport_header(skb, 0);
>
>         return tp_len;
> @@ -2493,12 +2531,12 @@ static int tpacket_snd(struct packet_sock *po, struct msghdr *msg)
>         if (unlikely(!(dev->flags & IFF_UP)))
>                 goto out_put;
>
> -       reserve = dev->hard_header_len + VLAN_HLEN;
> -       size_max = po->tx_ring.frame_size
> -               - (po->tp_hdrlen - sizeof(struct sockaddr_ll));
> -
> -       if (size_max > dev->mtu + reserve)
> -               size_max = dev->mtu + reserve;
> +       if (po->sk.sk_socket->type == SOCK_RAW)
> +               reserve = dev->hard_header_len;
> +       size_max = po->tx_ring.frame_size - (po->tp_hdrlen -
> +                                            sizeof(struct sockaddr_ll));
> +       if (size_max > dev->mtu + reserve + VLAN_HLEN)
> +               size_max = dev->mtu + reserve + VLAN_HLEN;
>
>         do {
>                 ph = packet_current_frame(po, &po->tx_ring,
> @@ -2523,20 +2561,7 @@ static int tpacket_snd(struct packet_sock *po, struct msghdr *msg)
>                         goto out_status;
>                 }
>                 tp_len = tpacket_fill_skb(po, skb, ph, dev, size_max, proto,
> -                                         addr, hlen);
> -               if (likely(tp_len >= 0) &&
> -                   tp_len > dev->mtu + dev->hard_header_len) {
> -                       struct ethhdr *ehdr;
> -                       /* Earlier code assumed this would be a VLAN pkt,
> -                        * double-check this now that we have the actual
> -                        * packet in hand.
> -                        */
> -
> -                       skb_reset_mac_header(skb);
> -                       ehdr = eth_hdr(skb);
> -                       if (ehdr->h_proto != htons(ETH_P_8021Q))
> -                               tp_len = -EMSGSIZE;
> -               }
> +                                         addr, hlen, reserve);
>                 if (unlikely(tp_len < 0)) {
>                         if (po->tp_loss) {
>                                 __packet_set_status(po, ph,
> @@ -2754,7 +2779,7 @@ static int packet_snd(struct socket *sock, struct msghdr *msg, size_t len)
>                 if (unlikely(offset < 0))
>                         goto out_free;
>         } else {
> -               if (ll_header_truncated(dev, len))
> +               if (unlikely(ll_header_truncated(dev->hard_header_len, len)))
>                         goto out_free;
>         }
>
> --
> 1.9.3
>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH net v2 3/3] packet: fix tpacket_snd max frame and vlan handling
  2015-11-10 22:52   ` Willem de Bruijn
@ 2015-11-10 23:12     ` Daniel Borkmann
  2015-11-10 23:24       ` Willem de Bruijn
  0 siblings, 1 reply; 12+ messages in thread
From: Daniel Borkmann @ 2015-11-10 23:12 UTC (permalink / raw)
  To: Willem de Bruijn
  Cc: David Miller, Eric Dumazet, tklauser, Network Development

On 11/10/2015 11:52 PM, Willem de Bruijn wrote:
>>          if (sock->type == SOCK_DGRAM) {
>> -               err = dev_hard_header(skb, dev, ntohs(proto), addr,
>> -                               NULL, tp_len);
>> +               /* In DGRAM sockets, we expect struct sockaddr_ll was filled
>> +                * via struct msghdr, so we have dest mac and skb->protocol.
>> +                * Otherwise there's not too much useful things we can do in
>> +                * this flush run.
>> +                */
>> +               err = dev_hard_header(skb, dev, ntohs(skb->protocol), addr,
>> +                                     NULL, tp_len);
>
> This change is not really necessary.

Sure agreed, I found it helpful though. Don't mind removing it.

>>                  if (unlikely(err < 0))
>>                          return -EINVAL;
>> -       } else if (dev->hard_header_len) {
>
> Why remove the check on hard_header_len?

Hmm, the patch doesn't remove the check (it's moved further below).

>> -               if (ll_header_truncated(dev, tp_len))
>> -                       return -EINVAL;
>> +       } else {
>> +               /* If skb->protocol is still 0, try to infer/guess it. Might
>> +                * not be fully reliable in the sense that a user could still
>> +                * change/race data afterwards, but on the other hand the proto
>
> The race goes away when probing it after the copy in skb_store_bits.
> Then it is also certain that tp_len is long enough to hold the entire
> link layer header.

The skb_store_bits() is only done in case we do have a dev->hard_header_len
or in case where we run into a possible situation where we have the additional
4 bytes on a full frame. In that case we need to check them properly, which
requires copying, otherwise we don't copy any header.

>> +                * can be set arbitrarily anyways. We only need to take care
>> +                * in case of extra large VLAN frames.
>> +                */
>> +               if (!skb->protocol && tp_len >= ETH_HLEN)
>> +                       skb->protocol = ((struct ethhdr *)data)->h_proto;
>
> Packet sockets are not restricted to link layer of type Ethernet.
>
> There are a few other points in this file that also cast mac header
> to eth_hdr(skb).

Ok, the set doesn't address this assumption which we have elsewhere, too.
Do you suggest to also check on dev->type for these cases?

Thanks,
Daniel

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH net v2 3/3] packet: fix tpacket_snd max frame and vlan handling
  2015-11-10 23:12     ` Daniel Borkmann
@ 2015-11-10 23:24       ` Willem de Bruijn
  2015-11-10 23:51         ` Daniel Borkmann
  0 siblings, 1 reply; 12+ messages in thread
From: Willem de Bruijn @ 2015-11-10 23:24 UTC (permalink / raw)
  To: Daniel Borkmann
  Cc: Willem de Bruijn, David Miller, Eric Dumazet, Tobias Klauser,
	Network Development

On Tue, Nov 10, 2015 at 6:12 PM, Daniel Borkmann <daniel@iogearbox.net> wrote:
> On 11/10/2015 11:52 PM, Willem de Bruijn wrote:
>>>
>>>          if (sock->type == SOCK_DGRAM) {
>>> -               err = dev_hard_header(skb, dev, ntohs(proto), addr,
>>> -                               NULL, tp_len);
>>> +               /* In DGRAM sockets, we expect struct sockaddr_ll was
>>> filled
>>> +                * via struct msghdr, so we have dest mac and
>>> skb->protocol.
>>> +                * Otherwise there's not too much useful things we can do
>>> in
>>> +                * this flush run.
>>> +                */
>>> +               err = dev_hard_header(skb, dev, ntohs(skb->protocol),
>>> addr,
>>> +                                     NULL, tp_len);
>>
>>
>> This change is not really necessary.
>
>
> Sure agreed, I found it helpful though. Don't mind removing it.
>
>>>                  if (unlikely(err < 0))
>>>                          return -EINVAL;
>>> -       } else if (dev->hard_header_len) {
>>
>>
>> Why remove the check on hard_header_len?
>
>
> Hmm, the patch doesn't remove the check (it's moved further below).
>
>>> -               if (ll_header_truncated(dev, tp_len))
>>> -                       return -EINVAL;
>>> +       } else {
>>> +               /* If skb->protocol is still 0, try to infer/guess it.
>>> Might
>>> +                * not be fully reliable in the sense that a user could
>>> still
>>> +                * change/race data afterwards, but on the other hand the
>>> proto
>>
>>
>> The race goes away when probing it after the copy in skb_store_bits.
>> Then it is also certain that tp_len is long enough to hold the entire
>> link layer header.
>
>
> The skb_store_bits() is only done in case we do have a dev->hard_header_len
> or in case where we run into a possible situation where we have the
> additional
> 4 bytes on a full frame. In that case we need to check them properly, which
> requires copying, otherwise we don't copy any header.

I assumed that hard_header_len has to be non-zero if there
is a link layer header to probe. If we only intend to implement
probing in the case of Ethernet, then it certainly holds.

>
>>> +                * can be set arbitrarily anyways. We only need to take
>>> care
>>> +                * in case of extra large VLAN frames.
>>> +                */
>>> +               if (!skb->protocol && tp_len >= ETH_HLEN)
>>> +                       skb->protocol = ((struct ethhdr *)data)->h_proto;
>>
>>
>> Packet sockets are not restricted to link layer of type Ethernet.
>>
>> There are a few other points in this file that also cast mac header
>> to eth_hdr(skb).
>
>
> Ok, the set doesn't address this assumption which we have elsewhere, too.
> Do you suggest to also check on dev->type for these cases?

Yes. If I'm right, then the other cases have to be fixed, too. One of the
eth_hdr(skb) calls was introduced by patch 09effa67a18d, where the
deleted code shows how it is safely restricted to ethernet:

  -       if (dev->type == ARPHRD_ETHER) {
  -               skb->protocol = eth_type_trans(skb, dev);
  -               if (skb->protocol == htons(ETH_P_8021Q))

>
> Thanks,
> Daniel
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH net v2 3/3] packet: fix tpacket_snd max frame and vlan handling
  2015-11-10 23:24       ` Willem de Bruijn
@ 2015-11-10 23:51         ` Daniel Borkmann
  2015-11-11 14:56           ` Willem de Bruijn
  0 siblings, 1 reply; 12+ messages in thread
From: Daniel Borkmann @ 2015-11-10 23:51 UTC (permalink / raw)
  To: Willem de Bruijn
  Cc: David Miller, Eric Dumazet, Tobias Klauser, Network Development

On 11/11/2015 12:24 AM, Willem de Bruijn wrote:
> On Tue, Nov 10, 2015 at 6:12 PM, Daniel Borkmann <daniel@iogearbox.net> wrote:
>> On 11/10/2015 11:52 PM, Willem de Bruijn wrote:
>>>>
>>>>           if (sock->type == SOCK_DGRAM) {
>>>> -               err = dev_hard_header(skb, dev, ntohs(proto), addr,
>>>> -                               NULL, tp_len);
>>>> +               /* In DGRAM sockets, we expect struct sockaddr_ll was
>>>> filled
>>>> +                * via struct msghdr, so we have dest mac and
>>>> skb->protocol.
>>>> +                * Otherwise there's not too much useful things we can do
>>>> in
>>>> +                * this flush run.
>>>> +                */
>>>> +               err = dev_hard_header(skb, dev, ntohs(skb->protocol),
>>>> addr,
>>>> +                                     NULL, tp_len);
>>>
>>>
>>> This change is not really necessary.
>>
>>
>> Sure agreed, I found it helpful though. Don't mind removing it.
>>
>>>>                   if (unlikely(err < 0))
>>>>                           return -EINVAL;
>>>> -       } else if (dev->hard_header_len) {
>>>
>>>
>>> Why remove the check on hard_header_len?
>>
>>
>> Hmm, the patch doesn't remove the check (it's moved further below).
>>
>>>> -               if (ll_header_truncated(dev, tp_len))
>>>> -                       return -EINVAL;
>>>> +       } else {
>>>> +               /* If skb->protocol is still 0, try to infer/guess it.
>>>> Might
>>>> +                * not be fully reliable in the sense that a user could
>>>> still
>>>> +                * change/race data afterwards, but on the other hand the
>>>> proto
>>>
>>>
>>> The race goes away when probing it after the copy in skb_store_bits.
>>> Then it is also certain that tp_len is long enough to hold the entire
>>> link layer header.
>>
>>
>> The skb_store_bits() is only done in case we do have a dev->hard_header_len
>> or in case where we run into a possible situation where we have the
>> additional
>> 4 bytes on a full frame. In that case we need to check them properly, which
>> requires copying, otherwise we don't copy any header.
>
> I assumed that hard_header_len has to be non-zero if there
> is a link layer header to probe. If we only intend to implement
> probing in the case of Ethernet, then it certainly holds.

Yeah, I guess we only care about ether_setup() alike devices, so we'd have
ETH_HLEN room as dev->hard_header_len. That will hold, yes.

>>>> +                * can be set arbitrarily anyways. We only need to take
>>>> care
>>>> +                * in case of extra large VLAN frames.
>>>> +                */
>>>> +               if (!skb->protocol && tp_len >= ETH_HLEN)
>>>> +                       skb->protocol = ((struct ethhdr *)data)->h_proto;
>>>
>>>
>>> Packet sockets are not restricted to link layer of type Ethernet.
>>>
>>> There are a few other points in this file that also cast mac header
>>> to eth_hdr(skb).
>>
>>
>> Ok, the set doesn't address this assumption which we have elsewhere, too.
>> Do you suggest to also check on dev->type for these cases?
>
> Yes. If I'm right, then the other cases have to be fixed, too. One of the
> eth_hdr(skb) calls was introduced by patch 09effa67a18d, where the
> deleted code shows how it is safely restricted to ethernet:
>
>    -       if (dev->type == ARPHRD_ETHER) {
>    -               skb->protocol = eth_type_trans(skb, dev);
>    -               if (skb->protocol == htons(ETH_P_8021Q))

Saw that, so we need the check as one more fix for 57f89bfa2140 ("network:
Allow af_packet to transmit +4 bytes for VLAN packets.") as well.

I'll see to respin a v3 tomorrow.

Thanks,
Daniel

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH net v2 3/3] packet: fix tpacket_snd max frame and vlan handling
  2015-11-10 23:51         ` Daniel Borkmann
@ 2015-11-11 14:56           ` Willem de Bruijn
  2015-11-11 22:39             ` Daniel Borkmann
  0 siblings, 1 reply; 12+ messages in thread
From: Willem de Bruijn @ 2015-11-11 14:56 UTC (permalink / raw)
  To: Daniel Borkmann
  Cc: Willem de Bruijn, David Miller, Eric Dumazet, Tobias Klauser,
	Network Development

On Tue, Nov 10, 2015 at 6:51 PM, Daniel Borkmann <daniel@iogearbox.net> wrote:
> On 11/11/2015 12:24 AM, Willem de Bruijn wrote:
>>
>> On Tue, Nov 10, 2015 at 6:12 PM, Daniel Borkmann <daniel@iogearbox.net>
>> wrote:
>>>
>>> On 11/10/2015 11:52 PM, Willem de Bruijn wrote:
>>>>>
>>>>>
>>>>>           if (sock->type == SOCK_DGRAM) {
>>>>> -               err = dev_hard_header(skb, dev, ntohs(proto), addr,
>>>>> -                               NULL, tp_len);
>>>>> +               /* In DGRAM sockets, we expect struct sockaddr_ll was
>>>>> filled
>>>>> +                * via struct msghdr, so we have dest mac and
>>>>> skb->protocol.
>>>>> +                * Otherwise there's not too much useful things we can
>>>>> do
>>>>> in
>>>>> +                * this flush run.
>>>>> +                */
>>>>> +               err = dev_hard_header(skb, dev, ntohs(skb->protocol),
>>>>> addr,
>>>>> +                                     NULL, tp_len);
>>>>
>>>>
>>>>
>>>> This change is not really necessary.
>>>
>>>
>>>
>>> Sure agreed, I found it helpful though. Don't mind removing it.
>>>
>>>>>                   if (unlikely(err < 0))
>>>>>                           return -EINVAL;
>>>>> -       } else if (dev->hard_header_len) {
>>>>
>>>>
>>>>
>>>> Why remove the check on hard_header_len?
>>>
>>>
>>>
>>> Hmm, the patch doesn't remove the check (it's moved further below).
>>>
>>>>> -               if (ll_header_truncated(dev, tp_len))
>>>>> -                       return -EINVAL;
>>>>> +       } else {
>>>>> +               /* If skb->protocol is still 0, try to infer/guess it.
>>>>> Might
>>>>> +                * not be fully reliable in the sense that a user could
>>>>> still
>>>>> +                * change/race data afterwards, but on the other hand
>>>>> the
>>>>> proto
>>>>
>>>>
>>>>
>>>> The race goes away when probing it after the copy in skb_store_bits.
>>>> Then it is also certain that tp_len is long enough to hold the entire
>>>> link layer header.
>>>
>>>
>>>
>>> The skb_store_bits() is only done in case we do have a
>>> dev->hard_header_len
>>> or in case where we run into a possible situation where we have the
>>> additional
>>> 4 bytes on a full frame. In that case we need to check them properly,
>>> which
>>> requires copying, otherwise we don't copy any header.
>>
>>
>> I assumed that hard_header_len has to be non-zero if there
>> is a link layer header to probe. If we only intend to implement
>> probing in the case of Ethernet, then it certainly holds.
>
>
> Yeah, I guess we only care about ether_setup() alike devices, so we'd have
> ETH_HLEN room as dev->hard_header_len. That will hold, yes.
>
>>>>> +                * can be set arbitrarily anyways. We only need to take
>>>>> care
>>>>> +                * in case of extra large VLAN frames.
>>>>> +                */
>>>>> +               if (!skb->protocol && tp_len >= ETH_HLEN)
>>>>> +                       skb->protocol = ((struct ethhdr
>>>>> *)data)->h_proto;
>>>>
>>>>
>>>>
>>>> Packet sockets are not restricted to link layer of type Ethernet.
>>>>
>>>> There are a few other points in this file that also cast mac header
>>>> to eth_hdr(skb).
>>>
>>>
>>>
>>> Ok, the set doesn't address this assumption which we have elsewhere, too.
>>> Do you suggest to also check on dev->type for these cases?
>>
>>
>> Yes. If I'm right, then the other cases have to be fixed, too. One of the
>> eth_hdr(skb) calls was introduced by patch 09effa67a18d, where the
>> deleted code shows how it is safely restricted to ethernet:
>>
>>    -       if (dev->type == ARPHRD_ETHER) {
>>    -               skb->protocol = eth_type_trans(skb, dev);
>>    -               if (skb->protocol == htons(ETH_P_8021Q))
>
>
> Saw that, so we need the check as one more fix for 57f89bfa2140 ("network:
> Allow af_packet to transmit +4 bytes for VLAN packets.") as well.
>
> I'll see to respin a v3 tomorrow.

Thanks, Daniel.

The three existing uses of eth_hdr were applied in 2.6.39, 3.12 and 3.15.
We probably need a patch per occurrence to allow backporting to the
relevant stable branches. I can create the fixes independent from your
patch set, if you prefer.

>
>
> Thanks,
> Daniel
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH net v2 3/3] packet: fix tpacket_snd max frame and vlan handling
  2015-11-11 14:56           ` Willem de Bruijn
@ 2015-11-11 22:39             ` Daniel Borkmann
  0 siblings, 0 replies; 12+ messages in thread
From: Daniel Borkmann @ 2015-11-11 22:39 UTC (permalink / raw)
  To: Willem de Bruijn
  Cc: David Miller, Eric Dumazet, Tobias Klauser, Network Development

On 11/11/2015 03:56 PM, Willem de Bruijn wrote:
> On Tue, Nov 10, 2015 at 6:51 PM, Daniel Borkmann <daniel@iogearbox.net> wrote:
...
>> Saw that, so we need the check as one more fix for 57f89bfa2140 ("network:
>> Allow af_packet to transmit +4 bytes for VLAN packets.") as well.
>>
>> I'll see to respin a v3 tomorrow.
>
> Thanks, Daniel.
>
> The three existing uses of eth_hdr were applied in 2.6.39, 3.12 and 3.15.
> We probably need a patch per occurrence to allow backporting to the
> relevant stable branches. I can create the fixes independent from your
> patch set, if you prefer.

Thanks, I've pushed the new set out for review with the 3rd patch simplified
quite a bit. I think we could let this linger a bit in -net first, and in
case backporting help would be needed, happy to jump in.

Best & thanks,
Daniel

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2015-11-11 22:39 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2015-11-10 22:03 [PATCH net v2 0/3] packet fixes Daniel Borkmann
2015-11-10 22:03 ` [PATCH net v2 1/3] packet: do skb_probe_transport_header when we actually have data Daniel Borkmann
2015-11-11  5:07   ` Jason Wang
2015-11-10 22:03 ` [PATCH net v2 2/3] packet: always probe for transport header Daniel Borkmann
2015-11-11  5:08   ` Jason Wang
2015-11-10 22:03 ` [PATCH net v2 3/3] packet: fix tpacket_snd max frame and vlan handling Daniel Borkmann
2015-11-10 22:52   ` Willem de Bruijn
2015-11-10 23:12     ` Daniel Borkmann
2015-11-10 23:24       ` Willem de Bruijn
2015-11-10 23:51         ` Daniel Borkmann
2015-11-11 14:56           ` Willem de Bruijn
2015-11-11 22:39             ` Daniel Borkmann

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).