[PATCH net v3 0/5] packet fixes

netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [PATCH net v3 0/5] packet fixes
@ 2015-11-11 22:25 Daniel Borkmann
  2015-11-11 22:25 ` [PATCH net v3 1/5] packet: do skb_probe_transport_header when we actually have data Daniel Borkmann
                   ` (5 more replies)
  0 siblings, 6 replies; 12+ messages in thread
From: Daniel Borkmann @ 2015-11-11 22:25 UTC (permalink / raw)
  To: davem; +Cc: edumazet, willemb, tklauser, netdev, Daniel Borkmann

Fixes a couple of issues in packet sockets, i.e. on TX ring side. See
individual patches for details.

v2 -> v3:
 - First two patches unchanged, kept Jason's Ack
 - Reworked 3rd patch and split into 3:
  - check for dev type as discussed with Willem
  - infer skb->protocol
  - fix max len for dgram
v1 -> v2:
 - Added patch 2 as suggested by Dave
 - Rest is unchanged from previous submission

Daniel Borkmann (5):
  packet: do skb_probe_transport_header when we actually have data
  packet: always probe for transport header
  packet: only allow extra vlan len on ethernet devices
  packet: infer protocol from ethernet header if unset
  packet: fix tpacket_snd max frame len

 net/packet/af_packet.c | 86 ++++++++++++++++++++++++++------------------------
 1 file changed, 44 insertions(+), 42 deletions(-)

-- 
1.9.3

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH net v3 1/5] packet: do skb_probe_transport_header when we actually have data
  2015-11-11 22:25 [PATCH net v3 0/5] packet fixes Daniel Borkmann
@ 2015-11-11 22:25 ` Daniel Borkmann
  2015-11-11 22:25 ` [PATCH net v3 2/5] packet: always probe for transport header Daniel Borkmann
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 12+ messages in thread
From: Daniel Borkmann @ 2015-11-11 22:25 UTC (permalink / raw)
  To: davem; +Cc: edumazet, willemb, tklauser, netdev, Daniel Borkmann

In tpacket_fill_skb() commit c1aad275b029 ("packet: set transport
header before doing xmit") and later on 40893fd0fd4e ("net: switch
to use skb_probe_transport_header()") was probing for a transport
header on the skb from a ring buffer slot, but at a time, where
the skb has _not even_ been filled with data yet. So that call into
the flow dissector is pretty useless. Lets do it after we've set
up the skb frags.

Fixes: c1aad275b029 ("packet: set transport header before doing xmit")
Reported-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Jason Wang <jasowang@redhat.com>
---
 net/packet/af_packet.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index af399ca..80c36c0 100644
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -2368,8 +2368,6 @@ static int tpacket_fill_skb(struct packet_sock *po, struct sk_buff *skb,
 	skb_reserve(skb, hlen);
 	skb_reset_network_header(skb);
 
-	if (!packet_use_direct_xmit(po))
-		skb_probe_transport_header(skb, 0);
 	if (unlikely(po->tp_tx_has_off)) {
 		int off_min, off_max, off;
 		off_min = po->tp_hdrlen - sizeof(struct sockaddr_ll);
@@ -2449,6 +2447,9 @@ static int tpacket_fill_skb(struct packet_sock *po, struct sk_buff *skb,
 		len = ((to_write > len_max) ? len_max : to_write);
 	}
 
+	if (!packet_use_direct_xmit(po))
+		skb_probe_transport_header(skb, 0);
+
 	return tp_len;
 }
 
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH net v3 2/5] packet: always probe for transport header
  2015-11-11 22:25 [PATCH net v3 0/5] packet fixes Daniel Borkmann
  2015-11-11 22:25 ` [PATCH net v3 1/5] packet: do skb_probe_transport_header when we actually have data Daniel Borkmann
@ 2015-11-11 22:25 ` Daniel Borkmann
  2015-11-11 22:25 ` [PATCH net v3 3/5] packet: only allow extra vlan len on ethernet devices Daniel Borkmann
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 12+ messages in thread
From: Daniel Borkmann @ 2015-11-11 22:25 UTC (permalink / raw)
  To: davem; +Cc: edumazet, willemb, tklauser, netdev, Daniel Borkmann

We concluded that the skb_probe_transport_header() should better be
called unconditionally. Avoiding the call into the flow dissector has
also not really much to do with the direct xmit mode.

While it seems that only virtio_net code makes use of GSO from non
RX/TX ring packet socket paths, we should probe for a transport header
nevertheless before they hit devices.

Reference: http://thread.gmane.org/gmane.linux.network/386173/
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Jason Wang <jasowang@redhat.com>
---
 net/packet/af_packet.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index 80c36c0..bdecf17 100644
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -2447,8 +2447,7 @@ static int tpacket_fill_skb(struct packet_sock *po, struct sk_buff *skb,
 		len = ((to_write > len_max) ? len_max : to_write);
 	}
 
-	if (!packet_use_direct_xmit(po))
-		skb_probe_transport_header(skb, 0);
+	skb_probe_transport_header(skb, 0);
 
 	return tp_len;
 }
@@ -2808,8 +2807,8 @@ static int packet_snd(struct socket *sock, struct msghdr *msg, size_t len)
 		len += vnet_hdr_len;
 	}
 
-	if (!packet_use_direct_xmit(po))
-		skb_probe_transport_header(skb, reserve);
+	skb_probe_transport_header(skb, reserve);
+
 	if (unlikely(extra_len == 4))
 		skb->no_fcs = 1;
 
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH net v3 3/5] packet: only allow extra vlan len on ethernet devices
  2015-11-11 22:25 [PATCH net v3 0/5] packet fixes Daniel Borkmann
  2015-11-11 22:25 ` [PATCH net v3 1/5] packet: do skb_probe_transport_header when we actually have data Daniel Borkmann
  2015-11-11 22:25 ` [PATCH net v3 2/5] packet: always probe for transport header Daniel Borkmann
@ 2015-11-11 22:25 ` Daniel Borkmann
  2015-11-11 22:55   ` Willem de Bruijn
  2015-11-11 22:25 ` [PATCH net v3 4/5] packet: infer protocol from ethernet header if unset Daniel Borkmann
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 12+ messages in thread
From: Daniel Borkmann @ 2015-11-11 22:25 UTC (permalink / raw)
  To: davem; +Cc: edumazet, willemb, tklauser, netdev, Daniel Borkmann

Packet sockets can be used by various net devices and are not
really restricted to ARPHRD_ETHER device types. However, when
currently checking for the extra 4 bytes that can be transmitted
in VLAN case, our assumption is that we generally probe on
ARPHRD_ETHER devices. Therefore, before looking into Ethernet
header, check the device type first.

This also fixes the issue where non-ARPHRD_ETHER devices could
have no dev->hard_header_len in TX_RING SOCK_RAW case, and thus
the check would test unfilled linear part of the skb (instead
of non-linear).

Fixes: 57f89bfa2140 ("network: Allow af_packet to transmit +4 bytes for VLAN packets.")
Fixes: 52f1454f629f ("packet: allow to transmit +4 byte in TX_RING slot for VLAN case")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
---
 net/packet/af_packet.c | 60 +++++++++++++++++++++-----------------------------
 1 file changed, 25 insertions(+), 35 deletions(-)

diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index bdecf17..8795b0f 100644
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -1741,6 +1741,20 @@ static void fanout_release(struct sock *sk)
 		kfree_rcu(po->rollover, rcu);
 }
 
+static bool packet_extra_vlan_len_allowed(const struct net_device *dev,
+					  struct sk_buff *skb)
+{
+	/* Earlier code assumed this would be a VLAN pkt, double-check
+	 * this now that we have the actual packet in hand. We can only
+	 * do this check on Ethernet devices.
+	 */
+	if (unlikely(dev->type != ARPHRD_ETHER))
+		return false;
+
+	skb_reset_mac_header(skb);
+	return likely(eth_hdr(skb)->h_proto == htons(ETH_P_8021Q));
+}
+
 static const struct proto_ops packet_ops;
 
 static const struct proto_ops packet_ops_spkt;
@@ -1902,18 +1916,10 @@ retry:
 		goto retry;
 	}
 
-	if (len > (dev->mtu + dev->hard_header_len + extra_len)) {
-		/* Earlier code assumed this would be a VLAN pkt,
-		 * double-check this now that we have the actual
-		 * packet in hand.
-		 */
-		struct ethhdr *ehdr;
-		skb_reset_mac_header(skb);
-		ehdr = eth_hdr(skb);
-		if (ehdr->h_proto != htons(ETH_P_8021Q)) {
-			err = -EMSGSIZE;
-			goto out_unlock;
-		}
+	if (len > (dev->mtu + dev->hard_header_len + extra_len) &&
+	    !packet_extra_vlan_len_allowed(dev, skb)) {
+		err = -EMSGSIZE;
+		goto out_unlock;
 	}
 
 	skb->protocol = proto;
@@ -2525,18 +2531,10 @@ static int tpacket_snd(struct packet_sock *po, struct msghdr *msg)
 		tp_len = tpacket_fill_skb(po, skb, ph, dev, size_max, proto,
 					  addr, hlen);
 		if (likely(tp_len >= 0) &&
-		    tp_len > dev->mtu + dev->hard_header_len) {
-			struct ethhdr *ehdr;
-			/* Earlier code assumed this would be a VLAN pkt,
-			 * double-check this now that we have the actual
-			 * packet in hand.
-			 */
+		    tp_len > dev->mtu + dev->hard_header_len &&
+		    !packet_extra_vlan_len_allowed(dev, skb))
+			tp_len = -EMSGSIZE;
 
-			skb_reset_mac_header(skb);
-			ehdr = eth_hdr(skb);
-			if (ehdr->h_proto != htons(ETH_P_8021Q))
-				tp_len = -EMSGSIZE;
-		}
 		if (unlikely(tp_len < 0)) {
 			if (po->tp_loss) {
 				__packet_set_status(po, ph,
@@ -2765,18 +2763,10 @@ static int packet_snd(struct socket *sock, struct msghdr *msg, size_t len)
 
 	sock_tx_timestamp(sk, &skb_shinfo(skb)->tx_flags);
 
-	if (!gso_type && (len > dev->mtu + reserve + extra_len)) {
-		/* Earlier code assumed this would be a VLAN pkt,
-		 * double-check this now that we have the actual
-		 * packet in hand.
-		 */
-		struct ethhdr *ehdr;
-		skb_reset_mac_header(skb);
-		ehdr = eth_hdr(skb);
-		if (ehdr->h_proto != htons(ETH_P_8021Q)) {
-			err = -EMSGSIZE;
-			goto out_free;
-		}
+	if (!gso_type && (len > dev->mtu + reserve + extra_len) &&
+	    !packet_extra_vlan_len_allowed(dev, skb)) {
+		err = -EMSGSIZE;
+		goto out_free;
 	}
 
 	skb->protocol = proto;
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH net v3 3/5] packet: only allow extra vlan len on ethernet devices
  2015-11-11 22:25 ` [PATCH net v3 3/5] packet: only allow extra vlan len on ethernet devices Daniel Borkmann
@ 2015-11-11 22:55   ` Willem de Bruijn
  2015-11-11 23:07     ` Daniel Borkmann
  0 siblings, 1 reply; 12+ messages in thread
From: Willem de Bruijn @ 2015-11-11 22:55 UTC (permalink / raw)
  To: Daniel Borkmann
  Cc: David Miller, Eric Dumazet, Tobias Klauser, Network Development

On Wed, Nov 11, 2015 at 5:25 PM, Daniel Borkmann <daniel@iogearbox.net> wrote:
> Packet sockets can be used by various net devices and are not
> really restricted to ARPHRD_ETHER device types. However, when
> currently checking for the extra 4 bytes that can be transmitted
> in VLAN case, our assumption is that we generally probe on
> ARPHRD_ETHER devices. Therefore, before looking into Ethernet
> header, check the device type first.
>
> This also fixes the issue where non-ARPHRD_ETHER devices could
> have no dev->hard_header_len in TX_RING SOCK_RAW case, and thus
> the check would test unfilled linear part of the skb (instead
> of non-linear).
>
> Fixes: 57f89bfa2140 ("network: Allow af_packet to transmit +4 bytes for VLAN packets.")
> Fixes: 52f1454f629f ("packet: allow to transmit +4 byte in TX_RING slot for VLAN case")

Fixes: 09effa67a18d ("packet: Revert recent header parsing changes.")

> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

Separate patches would probably be preferable, to be able to backport
to stable branches. But I won't press that point further. A minor
comment inline, but feel free to ignore. Otherwise,

Acked-by: Willem de Bruijn <willemb@google.com>

Thanks for fixing all occurrences, Daniel.

> ---
>  net/packet/af_packet.c | 60 +++++++++++++++++++++-----------------------------
>  1 file changed, 25 insertions(+), 35 deletions(-)
>
> diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
> index bdecf17..8795b0f 100644
> --- a/net/packet/af_packet.c
> +++ b/net/packet/af_packet.c
> @@ -1741,6 +1741,20 @@ static void fanout_release(struct sock *sk)
>                 kfree_rcu(po->rollover, rcu);
>  }
>
> +static bool packet_extra_vlan_len_allowed(const struct net_device *dev,
> +                                         struct sk_buff *skb)
> +{
> +       /* Earlier code assumed this would be a VLAN pkt, double-check
> +        * this now that we have the actual packet in hand. We can only
> +        * do this check on Ethernet devices.
> +        */
> +       if (unlikely(dev->type != ARPHRD_ETHER))
> +               return false;
> +
> +       skb_reset_mac_header(skb);
> +       return likely(eth_hdr(skb)->h_proto == htons(ETH_P_8021Q));
> +}
> +
>  static const struct proto_ops packet_ops;
>
>  static const struct proto_ops packet_ops_spkt;
> @@ -1902,18 +1916,10 @@ retry:
>                 goto retry;
>         }
>
> -       if (len > (dev->mtu + dev->hard_header_len + extra_len)) {
> -               /* Earlier code assumed this would be a VLAN pkt,
> -                * double-check this now that we have the actual
> -                * packet in hand.
> -                */
> -               struct ethhdr *ehdr;
> -               skb_reset_mac_header(skb);
> -               ehdr = eth_hdr(skb);
> -               if (ehdr->h_proto != htons(ETH_P_8021Q)) {
> -                       err = -EMSGSIZE;
> -                       goto out_unlock;
> -               }
> +       if (len > (dev->mtu + dev->hard_header_len + extra_len) &&
> +           !packet_extra_vlan_len_allowed(dev, skb)) {
> +               err = -EMSGSIZE;
> +               goto out_unlock;
>         }
>
>         skb->protocol = proto;
> @@ -2525,18 +2531,10 @@ static int tpacket_snd(struct packet_sock *po, struct msghdr *msg)
>                 tp_len = tpacket_fill_skb(po, skb, ph, dev, size_max, proto,
>                                           addr, hlen);
>                 if (likely(tp_len >= 0) &&
> -                   tp_len > dev->mtu + dev->hard_header_len) {
> -                       struct ethhdr *ehdr;
> -                       /* Earlier code assumed this would be a VLAN pkt,
> -                        * double-check this now that we have the actual
> -                        * packet in hand.
> -                        */
> +                   tp_len > dev->mtu + dev->hard_header_len &&
> +                   !packet_extra_vlan_len_allowed(dev, skb))
> +                       tp_len = -EMSGSIZE;
>
> -                       skb_reset_mac_header(skb);
> -                       ehdr = eth_hdr(skb);
> -                       if (ehdr->h_proto != htons(ETH_P_8021Q))
> -                               tp_len = -EMSGSIZE;
> -               }
>                 if (unlikely(tp_len < 0)) {
>                         if (po->tp_loss) {
>                                 __packet_set_status(po, ph,
> @@ -2765,18 +2763,10 @@ static int packet_snd(struct socket *sock, struct msghdr *msg, size_t len)
>
>         sock_tx_timestamp(sk, &skb_shinfo(skb)->tx_flags);
>
> -       if (!gso_type && (len > dev->mtu + reserve + extra_len)) {
> -               /* Earlier code assumed this would be a VLAN pkt,
> -                * double-check this now that we have the actual
> -                * packet in hand.
> -                */
> -               struct ethhdr *ehdr;
> -               skb_reset_mac_header(skb);
> -               ehdr = eth_hdr(skb);
> -               if (ehdr->h_proto != htons(ETH_P_8021Q)) {
> -                       err = -EMSGSIZE;
> -                       goto out_free;
> -               }
> +       if (!gso_type && (len > dev->mtu + reserve + extra_len) &&
> +           !packet_extra_vlan_len_allowed(dev, skb)) {
> +               err = -EMSGSIZE;
> +               goto out_free;

This nicely reuses the same code in three locations.

If you end up having to send a v4, it would be nice to also fold the
repeated len check into the shared code. Variable reserve here is
just dev->hard_header_len. No need to spin a patch just for that
cleanup, though.


>         }
>
>         skb->protocol = proto;
> --
> 1.9.3
>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH net v3 3/5] packet: only allow extra vlan len on ethernet devices
  2015-11-11 22:55   ` Willem de Bruijn
@ 2015-11-11 23:07     ` Daniel Borkmann
  2015-11-11 23:11       ` Willem de Bruijn
  0 siblings, 1 reply; 12+ messages in thread
From: Daniel Borkmann @ 2015-11-11 23:07 UTC (permalink / raw)
  To: Willem de Bruijn
  Cc: David Miller, Eric Dumazet, Tobias Klauser, Network Development

On 11/11/2015 11:55 PM, Willem de Bruijn wrote:
> On Wed, Nov 11, 2015 at 5:25 PM, Daniel Borkmann <daniel@iogearbox.net> wrote:
...
>> +       if (!gso_type && (len > dev->mtu + reserve + extra_len) &&
>> +           !packet_extra_vlan_len_allowed(dev, skb)) {
>> +               err = -EMSGSIZE;
>> +               goto out_free;
>
> This nicely reuses the same code in three locations.
>
> If you end up having to send a v4, it would be nice to also fold the
> repeated len check into the shared code. Variable reserve here is
> just dev->hard_header_len. No need to spin a patch just for that
> cleanup, though.

Noted, I was also thinking for net-next to move the tpacket_snd() check
into tpacket_fill_skb(), that would save us to do the likely(tp_len >= 0)
check there. Can take care of both when net-next opens, I feared there
would be critics that it's not ending up minimal enough otherwise. ;)

Thanks,
Daniel

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH net v3 3/5] packet: only allow extra vlan len on ethernet devices
  2015-11-11 23:07     ` Daniel Borkmann
@ 2015-11-11 23:11       ` Willem de Bruijn
  0 siblings, 0 replies; 12+ messages in thread
From: Willem de Bruijn @ 2015-11-11 23:11 UTC (permalink / raw)
  To: Daniel Borkmann
  Cc: Willem de Bruijn, David Miller, Eric Dumazet, Tobias Klauser,
	Network Development

On Wed, Nov 11, 2015 at 6:07 PM, Daniel Borkmann <daniel@iogearbox.net> wrote:
> On 11/11/2015 11:55 PM, Willem de Bruijn wrote:
>>
>> On Wed, Nov 11, 2015 at 5:25 PM, Daniel Borkmann <daniel@iogearbox.net>
>> wrote:
>
> ...
>>>
>>> +       if (!gso_type && (len > dev->mtu + reserve + extra_len) &&
>>> +           !packet_extra_vlan_len_allowed(dev, skb)) {
>>> +               err = -EMSGSIZE;
>>> +               goto out_free;
>>
>>
>> This nicely reuses the same code in three locations.
>>
>> If you end up having to send a v4, it would be nice to also fold the
>> repeated len check into the shared code. Variable reserve here is
>> just dev->hard_header_len. No need to spin a patch just for that
>> cleanup, though.
>
>
> Noted, I was also thinking for net-next to move the tpacket_snd() check
> into tpacket_fill_skb(), that would save us to do the likely(tp_len >= 0)
> check there. Can take care of both when net-next opens, I feared there
> would be critics that it's not ending up minimal enough otherwise. ;)

Fair point :)

>
> Thanks,
> Daniel
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH net v3 4/5] packet: infer protocol from ethernet header if unset
  2015-11-11 22:25 [PATCH net v3 0/5] packet fixes Daniel Borkmann
                   ` (2 preceding siblings ...)
  2015-11-11 22:25 ` [PATCH net v3 3/5] packet: only allow extra vlan len on ethernet devices Daniel Borkmann
@ 2015-11-11 22:25 ` Daniel Borkmann
  2015-11-11 23:10   ` Willem de Bruijn
  2015-11-11 22:25 ` [PATCH net v3 5/5] packet: fix tpacket_snd max frame len Daniel Borkmann
  2015-11-15 23:01 ` [PATCH net v3 0/5] packet fixes David Miller
  5 siblings, 1 reply; 12+ messages in thread
From: Daniel Borkmann @ 2015-11-11 22:25 UTC (permalink / raw)
  To: davem; +Cc: edumazet, willemb, tklauser, netdev, Daniel Borkmann

In case no struct sockaddr_ll has been passed to packet
socket's sendmsg() when doing a TX_RING flush run, then
skb->protocol is set to po->num instead, which is the protocol
passed via socket(2)/bind(2).

Applications only xmitting can go the path of allocating the
socket as socket(PF_PACKET, <mode>, 0) and do a bind(2) on the
TX_RING with sll_protocol of 0. That way, register_prot_hook()
is neither called on creation nor on bind time, which saves
cycles when there's no interest in capturing anyway.

That leaves us however with po->num 0 instead and therefore
the TX_RING flush run sets skb->protocol to 0 as well. Eric
reported that this leads to problems when using tools like
trafgen over bonding device. I.e. the bonding's hash function
could invoke the kernel's flow dissector, which depends on
skb->protocol being properly set. In the current situation, all
the traffic is then directed to a single slave.

Fix it up by inferring skb->protocol from the Ethernet header
when not set and we have ARPHRD_ETHER device type. This is only
done in case of SOCK_RAW and where we have a dev->hard_header_len
length. In case of ARPHRD_ETHER devices, this is guaranteed to
cover ETH_HLEN, and therefore being accessed on the skb after
the skb_store_bits().

Reported-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
---
 net/packet/af_packet.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index 8795b0f..0066da2 100644
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -2338,6 +2338,15 @@ static bool ll_header_truncated(const struct net_device *dev, int len)
 	return false;
 }
 
+static void tpacket_set_protocol(const struct net_device *dev,
+				 struct sk_buff *skb)
+{
+	if (dev->type == ARPHRD_ETHER) {
+		skb_reset_mac_header(skb);
+		skb->protocol = eth_hdr(skb)->h_proto;
+	}
+}
+
 static int tpacket_fill_skb(struct packet_sock *po, struct sk_buff *skb,
 		void *frame, struct net_device *dev, int size_max,
 		__be16 proto, unsigned char *addr, int hlen)
@@ -2419,6 +2428,8 @@ static int tpacket_fill_skb(struct packet_sock *po, struct sk_buff *skb,
 				dev->hard_header_len);
 		if (unlikely(err))
 			return err;
+		if (!skb->protocol)
+			tpacket_set_protocol(dev, skb);
 
 		data += dev->hard_header_len;
 		to_write -= dev->hard_header_len;
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH net v3 4/5] packet: infer protocol from ethernet header if unset
  2015-11-11 22:25 ` [PATCH net v3 4/5] packet: infer protocol from ethernet header if unset Daniel Borkmann
@ 2015-11-11 23:10   ` Willem de Bruijn
  0 siblings, 0 replies; 12+ messages in thread
From: Willem de Bruijn @ 2015-11-11 23:10 UTC (permalink / raw)
  To: Daniel Borkmann
  Cc: David Miller, Eric Dumazet, Tobias Klauser, Network Development

On Wed, Nov 11, 2015 at 5:25 PM, Daniel Borkmann <daniel@iogearbox.net> wrote:
> In case no struct sockaddr_ll has been passed to packet
> socket's sendmsg() when doing a TX_RING flush run, then
> skb->protocol is set to po->num instead, which is the protocol
> passed via socket(2)/bind(2).
>
> Applications only xmitting can go the path of allocating the
> socket as socket(PF_PACKET, <mode>, 0) and do a bind(2) on the
> TX_RING with sll_protocol of 0. That way, register_prot_hook()
> is neither called on creation nor on bind time, which saves
> cycles when there's no interest in capturing anyway.
>
> That leaves us however with po->num 0 instead and therefore
> the TX_RING flush run sets skb->protocol to 0 as well. Eric
> reported that this leads to problems when using tools like
> trafgen over bonding device.
> I.e. the bonding's hash function
> could invoke the kernel's flow dissector, which depends on
> skb->protocol being properly set. In the current situation, all
> the traffic is then directed to a single slave.
>
> Fix it up by inferring skb->protocol from the Ethernet header
> when not set and we have ARPHRD_ETHER device type. This is only
> done in case of SOCK_RAW and where we have a dev->hard_header_len
> length. In case of ARPHRD_ETHER devices, this is guaranteed to
> cover ETH_HLEN, and therefore being accessed on the skb after
> the skb_store_bits().
>
> Reported-by: Eric Dumazet <edumazet@google.com>
> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

Acked-by: Willem de Bruijn <willemb@google.com>
> ---
>  net/packet/af_packet.c | 11 +++++++++++
>  1 file changed, 11 insertions(+)
>
> diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
> index 8795b0f..0066da2 100644
> --- a/net/packet/af_packet.c
> +++ b/net/packet/af_packet.c
> @@ -2338,6 +2338,15 @@ static bool ll_header_truncated(const struct net_device *dev, int len)
>         return false;
>  }
>
> +static void tpacket_set_protocol(const struct net_device *dev,
> +                                struct sk_buff *skb)
> +{
> +       if (dev->type == ARPHRD_ETHER) {
> +               skb_reset_mac_header(skb);
> +               skb->protocol = eth_hdr(skb)->h_proto;
> +       }
> +}
> +
>  static int tpacket_fill_skb(struct packet_sock *po, struct sk_buff *skb,
>                 void *frame, struct net_device *dev, int size_max,
>                 __be16 proto, unsigned char *addr, int hlen)
> @@ -2419,6 +2428,8 @@ static int tpacket_fill_skb(struct packet_sock *po, struct sk_buff *skb,
>                                 dev->hard_header_len);
>                 if (unlikely(err))
>                         return err;
> +               if (!skb->protocol)
> +                       tpacket_set_protocol(dev, skb);
>
>                 data += dev->hard_header_len;
>                 to_write -= dev->hard_header_len;
> --
> 1.9.3
>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH net v3 5/5] packet: fix tpacket_snd max frame len
  2015-11-11 22:25 [PATCH net v3 0/5] packet fixes Daniel Borkmann
                   ` (3 preceding siblings ...)
  2015-11-11 22:25 ` [PATCH net v3 4/5] packet: infer protocol from ethernet header if unset Daniel Borkmann
@ 2015-11-11 22:25 ` Daniel Borkmann
  2015-11-11 22:56   ` Willem de Bruijn
  2015-11-15 23:01 ` [PATCH net v3 0/5] packet fixes David Miller
  5 siblings, 1 reply; 12+ messages in thread
From: Daniel Borkmann @ 2015-11-11 22:25 UTC (permalink / raw)
  To: davem; +Cc: edumazet, willemb, tklauser, netdev, Daniel Borkmann

Since it's introduction in commit 69e3c75f4d54 ("net: TX_RING and
packet mmap"), TX_RING could be used from SOCK_DGRAM and SOCK_RAW
side. When used with SOCK_DGRAM only, the size_max > dev->mtu +
reserve check should have reserve as 0, but currently, this is
unconditionally set (in it's original form as dev->hard_header_len).

I think this is not correct since tpacket_fill_skb() would then
take dev->mtu and dev->hard_header_len into account for SOCK_DGRAM,
the extra VLAN_HLEN could be possible in both cases. Presumably, the
reserve code was copied from packet_snd(), but later on missed the
check. Make it similar as we have it in packet_snd().

Fixes: 69e3c75f4d54 ("net: TX_RING and packet mmap")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
---
 net/packet/af_packet.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index 0066da2..242bce1 100644
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -2510,12 +2510,13 @@ static int tpacket_snd(struct packet_sock *po, struct msghdr *msg)
 	if (unlikely(!(dev->flags & IFF_UP)))
 		goto out_put;
 
-	reserve = dev->hard_header_len + VLAN_HLEN;
+	if (po->sk.sk_socket->type == SOCK_RAW)
+		reserve = dev->hard_header_len;
 	size_max = po->tx_ring.frame_size
 		- (po->tp_hdrlen - sizeof(struct sockaddr_ll));
 
-	if (size_max > dev->mtu + reserve)
-		size_max = dev->mtu + reserve;
+	if (size_max > dev->mtu + reserve + VLAN_HLEN)
+		size_max = dev->mtu + reserve + VLAN_HLEN;
 
 	do {
 		ph = packet_current_frame(po, &po->tx_ring,
@@ -2542,7 +2543,7 @@ static int tpacket_snd(struct packet_sock *po, struct msghdr *msg)
 		tp_len = tpacket_fill_skb(po, skb, ph, dev, size_max, proto,
 					  addr, hlen);
 		if (likely(tp_len >= 0) &&
-		    tp_len > dev->mtu + dev->hard_header_len &&
+		    tp_len > dev->mtu + reserve &&
 		    !packet_extra_vlan_len_allowed(dev, skb))
 			tp_len = -EMSGSIZE;
 
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH net v3 5/5] packet: fix tpacket_snd max frame len
  2015-11-11 22:25 ` [PATCH net v3 5/5] packet: fix tpacket_snd max frame len Daniel Borkmann
@ 2015-11-11 22:56   ` Willem de Bruijn
  0 siblings, 0 replies; 12+ messages in thread
From: Willem de Bruijn @ 2015-11-11 22:56 UTC (permalink / raw)
  To: Daniel Borkmann
  Cc: David Miller, Eric Dumazet, Tobias Klauser, Network Development

On Wed, Nov 11, 2015 at 5:25 PM, Daniel Borkmann <daniel@iogearbox.net> wrote:
> Since it's introduction in commit 69e3c75f4d54 ("net: TX_RING and
> packet mmap"), TX_RING could be used from SOCK_DGRAM and SOCK_RAW
> side. When used with SOCK_DGRAM only, the size_max > dev->mtu +
> reserve check should have reserve as 0, but currently, this is
> unconditionally set (in it's original form as dev->hard_header_len).
>
> I think this is not correct since tpacket_fill_skb() would then
> take dev->mtu and dev->hard_header_len into account for SOCK_DGRAM,
> the extra VLAN_HLEN could be possible in both cases. Presumably, the
> reserve code was copied from packet_snd(), but later on missed the
> check. Make it similar as we have it in packet_snd().
>
> Fixes: 69e3c75f4d54 ("net: TX_RING and packet mmap")
> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

Acked-by: Willem de Bruijn <willemb@google.com>
> ---
>  net/packet/af_packet.c | 9 +++++----
>  1 file changed, 5 insertions(+), 4 deletions(-)
>
> diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
> index 0066da2..242bce1 100644
> --- a/net/packet/af_packet.c
> +++ b/net/packet/af_packet.c
> @@ -2510,12 +2510,13 @@ static int tpacket_snd(struct packet_sock *po, struct msghdr *msg)
>         if (unlikely(!(dev->flags & IFF_UP)))
>                 goto out_put;
>
> -       reserve = dev->hard_header_len + VLAN_HLEN;
> +       if (po->sk.sk_socket->type == SOCK_RAW)
> +               reserve = dev->hard_header_len;
>         size_max = po->tx_ring.frame_size
>                 - (po->tp_hdrlen - sizeof(struct sockaddr_ll));
>
> -       if (size_max > dev->mtu + reserve)
> -               size_max = dev->mtu + reserve;
> +       if (size_max > dev->mtu + reserve + VLAN_HLEN)
> +               size_max = dev->mtu + reserve + VLAN_HLEN;
>
>         do {
>                 ph = packet_current_frame(po, &po->tx_ring,
> @@ -2542,7 +2543,7 @@ static int tpacket_snd(struct packet_sock *po, struct msghdr *msg)
>                 tp_len = tpacket_fill_skb(po, skb, ph, dev, size_max, proto,
>                                           addr, hlen);
>                 if (likely(tp_len >= 0) &&
> -                   tp_len > dev->mtu + dev->hard_header_len &&
> +                   tp_len > dev->mtu + reserve &&
>                     !packet_extra_vlan_len_allowed(dev, skb))
>                         tp_len = -EMSGSIZE;
>
> --
> 1.9.3
>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH net v3 0/5] packet fixes
  2015-11-11 22:25 [PATCH net v3 0/5] packet fixes Daniel Borkmann
                   ` (4 preceding siblings ...)
  2015-11-11 22:25 ` [PATCH net v3 5/5] packet: fix tpacket_snd max frame len Daniel Borkmann
@ 2015-11-15 23:01 ` David Miller
  5 siblings, 0 replies; 12+ messages in thread
From: David Miller @ 2015-11-15 23:01 UTC (permalink / raw)
  To: daniel; +Cc: edumazet, willemb, tklauser, netdev

From: Daniel Borkmann <daniel@iogearbox.net>
Date: Wed, 11 Nov 2015 23:25:39 +0100

> Fixes a couple of issues in packet sockets, i.e. on TX ring side. See
> individual patches for details.
> 
> v2 -> v3:
>  - First two patches unchanged, kept Jason's Ack
>  - Reworked 3rd patch and split into 3:
>   - check for dev type as discussed with Willem
>   - infer skb->protocol
>   - fix max len for dgram
> v1 -> v2:
>  - Added patch 2 as suggested by Dave
>  - Rest is unchanged from previous submission

Series applied and queued up for -stable, thanks.

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2015-11-15 23:01 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2015-11-11 22:25 [PATCH net v3 0/5] packet fixes Daniel Borkmann
2015-11-11 22:25 ` [PATCH net v3 1/5] packet: do skb_probe_transport_header when we actually have data Daniel Borkmann
2015-11-11 22:25 ` [PATCH net v3 2/5] packet: always probe for transport header Daniel Borkmann
2015-11-11 22:25 ` [PATCH net v3 3/5] packet: only allow extra vlan len on ethernet devices Daniel Borkmann
2015-11-11 22:55   ` Willem de Bruijn
2015-11-11 23:07     ` Daniel Borkmann
2015-11-11 23:11       ` Willem de Bruijn
2015-11-11 22:25 ` [PATCH net v3 4/5] packet: infer protocol from ethernet header if unset Daniel Borkmann
2015-11-11 23:10   ` Willem de Bruijn
2015-11-11 22:25 ` [PATCH net v3 5/5] packet: fix tpacket_snd max frame len Daniel Borkmann
2015-11-11 22:56   ` Willem de Bruijn
2015-11-15 23:01 ` [PATCH net v3 0/5] packet fixes David Miller

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).