Netdev List
 help / color / mirror / Atom feed
* Re: [PATCH net] ipv6: move stub initialization after ipv6 setup completion
From: David Miller @ 2017-04-25 15:43 UTC (permalink / raw)
  To: pabeni; +Cc: netdev, xiyou.wangcong, hannes
In-Reply-To: <fd7c85f650661466a782a71485f227f0b84454a7.1493035843.git.pabeni@redhat.com>

From: Paolo Abeni <pabeni@redhat.com>
Date: Mon, 24 Apr 2017 14:18:28 +0200

> The ipv6 stub pointer is currently initialized before the ipv6
> routing subsystem: a 3rd party can access and use such stub
> before the routing data is ready.
> Moreover, such pointer is not cleared in case of initialization
> error, possibly leading to dangling pointers usage.
> 
> This change addresses the above moving the stub initialization
> at the end of ipv6 init code.
> 
> Fixes: 5f81bd2e5d80 ("ipv6: export a stub for IPv6 symbols used by vxlan")
> Signed-off-by: Paolo Abeni <pabeni@redhat.com>

Applied, thanks.

^ permalink raw reply

* Re: [PATCH v2] net/packet: initialize val in packet_getsockopt()
From: David Miller @ 2017-04-25 15:44 UTC (permalink / raw)
  To: glider; +Cc: dvyukov, kcc, edumazet, kuznet, linux-kernel, netdev
In-Reply-To: <20170424125914.43270-1-glider@google.com>

From: Alexander Potapenko <glider@google.com>
Date: Mon, 24 Apr 2017 14:59:14 +0200

> In the case getsockopt() is called with PACKET_HDRLEN and optlen < 4
> |val| remains uninitialized and the syscall may behave differently
> depending on its value. This doesn't have security consequences (as the
> uninit bytes aren't copied back), but it's still cleaner to initialize
> |val| and ensure optlen is not less than sizeof(int).
> 
> This bug has been detected with KMSAN.
> 
> Signed-off-by: Alexander Potapenko <glider@google.com>
> ---
> v2: - if len < sizeof(int), make it 0

No, you should signal an error if the len is too small.

Returning zero bytes to userspace silently makes the user think that
he got the data he asked for.

^ permalink raw reply

* Re: [PATCH net v1 1/2] tipc: fix socket flow control accounting error at tipc_send_stream
From: David Miller @ 2017-04-25 15:45 UTC (permalink / raw)
  To: parthasarathy.bhuvaragan; +Cc: netdev, tipc-discussion
In-Reply-To: <1493038843-30621-1-git-send-email-parthasarathy.bhuvaragan@ericsson.com>

From: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com>
Date: Mon, 24 Apr 2017 15:00:42 +0200

> Until now in tipc_send_stream(), we return -1 when the socket
> encounters link congestion even if the socket had successfully
> sent partial data. This is incorrect as the application resends
> the same the partial data leading to data corruption at
> receiver's end.
> 
> In this commit, we return the partially sent bytes as the return
> value at link congestion.
> 
> Fixes: 10724cc7bb78 ("tipc: redesign connection-level flow control")
> Signed-off-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com>
> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>

Both patches #1 and #2 applied, thanks.

^ permalink raw reply

* Re: [PATCH net] macsec: avoid heap overflow in skb_to_sgvec on receive
From: Sabrina Dubroca @ 2017-04-25 15:51 UTC (permalink / raw)
  To: Jason A. Donenfeld; +Cc: Netdev
In-Reply-To: <CAHmME9rxdTEe1-5ApgUhdkqiJztSgJb0QuhV2uSemobeXm_oNg@mail.gmail.com>

2017-04-25, 17:39:09 +0200, Jason A. Donenfeld wrote:
> Hi Sabrina,
> 
> I think I may have beaten you to the punch here by a few minutes. :)

I said I was going to post a patch.
Mail headers seem to disagree with you ;)


> The difference between our two versions is that you don't re-add the
> FRAGLIST attribute, whereas my patch does, and then it does the
> dynamic allocation. I suspect this might be a bit more robust. It also
> ensures that skb_cow_data is called on both paths. So perhaps let's
> roll with mine?

I don't see the "more robust" argument.

Unless I missed something, encrypt was already handling fragments
correctly. An skb with ->frag_list should have no skb_tailroom, so it
will be linearized skb_copy_expand().

-- 
Sabrina

^ permalink raw reply

* Re: net: heap out-of-bounds in fib6_clean_node/rt6_fill_node/fib6_age/fib6_prune_clone
From: David Ahern @ 2017-04-25 15:51 UTC (permalink / raw)
  To: Andrey Konovalov, Dmitry Vyukov
  Cc: Eric Dumazet, Mahesh Bandewar, Eric Dumazet, David Miller,
	Alexey Kuznetsov, James Morris, Hideaki YOSHIFUJI,
	Patrick McHardy, netdev, LKML, Cong Wang, syzkaller
In-Reply-To: <CAAeHK+zx6W_yxnoEQ2Pc1AT4uLXqYZaoi5oQBeuCntdGS38TQg@mail.gmail.com>

On 4/18/17 2:43 PM, Andrey Konovalov wrote:
> kasan: CONFIG_KASAN_INLINE enabled
> kasan: GPF could be caused by NULL-ptr deref or user memory access
> general protection fault: 0000 [#1] SMP KASAN
> Modules linked in:
> CPU: 1 PID: 4035 Comm: a.out Not tainted 4.11.0-rc7+ #250
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
> task: ffff880069809600 task.stack: ffff880062dc8000
> RIP: 0010:ip6_rt_cache_alloc+0xa6/0x560 net/ipv6/route.c:975
> RSP: 0018:ffff880062dced30 EFLAGS: 00010206
> RAX: dffffc0000000000 RBX: ffff8800670561c0 RCX: 0000000000000006
> RDX: 0000000000000003 RSI: ffff880062dcfb28 RDI: 0000000000000018
> RBP: ffff880062dced68 R08: 0000000000000001 R09: 0000000000000000
> R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
> R13: ffff880062dcfb28 R14: dffffc0000000000 R15: 0000000000000000
> FS:  00007feebe37e7c0(0000) GS:ffff88006cb00000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00000000205a0fe4 CR3: 000000006b5c9000 CR4: 00000000000006e0
> Call Trace:
>  ip6_pol_route+0x1512/0x1f20 net/ipv6/route.c:1128

This one is fixed by:

commit 557c44be917c322860665be3d28376afa84aa936
Author: David Ahern <dsa@cumulusnetworks.com>
Date:   Wed Apr 19 14:19:43 2017 -0700

    net: ipv6: RTF_PCPU should not be settable from userspace

^ permalink raw reply

* [PATCH v5 1/5] skbuff: return -EMSGSIZE in skb_to_sgvec to prevent overflow
From: Jason A. Donenfeld @ 2017-04-25 15:52 UTC (permalink / raw)
  To: netdev, linux-kernel, David.Laight, kernel-hardening, davem
  Cc: Jason A. Donenfeld
In-Reply-To: <20170425.111717.1973239615715123840.davem@davemloft.net>

This is a defense-in-depth measure in response to bugs like
4d6fa57b4dab ("macsec: avoid heap overflow in skb_to_sgvec")

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
---
This is a resend of v4 with all the other child commits along with it.

 net/core/skbuff.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index f86bf69cfb8d..e75640006d78 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -3489,7 +3489,9 @@ void __init skb_init(void)
  *	@len: Length of buffer space to be mapped
  *
  *	Fill the specified scatter-gather list with mappings/pointers into a
- *	region of the buffer space attached to a socket buffer.
+ *	region of the buffer space attached to a socket buffer. Returns either
+ *	the number of scatterlist items used, or -EMSGSIZE if the contents
+ *	could not fit.
  */
 static int
 __skb_to_sgvec(struct sk_buff *skb, struct scatterlist *sg, int offset, int len)
@@ -3517,6 +3519,8 @@ __skb_to_sgvec(struct sk_buff *skb, struct scatterlist *sg, int offset, int len)
 		end = start + skb_frag_size(&skb_shinfo(skb)->frags[i]);
 		if ((copy = end - offset) > 0) {
 			skb_frag_t *frag = &skb_shinfo(skb)->frags[i];
+			if (elt && sg_is_last(&sg[elt - 1]))
+				return -EMSGSIZE;
 
 			if (copy > len)
 				copy = len;
@@ -3537,6 +3541,9 @@ __skb_to_sgvec(struct sk_buff *skb, struct scatterlist *sg, int offset, int len)
 
 		end = start + frag_iter->len;
 		if ((copy = end - offset) > 0) {
+			if (elt && sg_is_last(&sg[elt - 1]))
+				return -EMSGSIZE;
+
 			if (copy > len)
 				copy = len;
 			elt += __skb_to_sgvec(frag_iter, sg+elt, offset - start,
@@ -3581,6 +3588,9 @@ int skb_to_sgvec(struct sk_buff *skb, struct scatterlist *sg, int offset, int le
 {
 	int nsg = __skb_to_sgvec(skb, sg, offset, len);
 
+	if (nsg <= 0)
+		return nsg;
+
 	sg_mark_end(&sg[nsg - 1]);
 
 	return nsg;
-- 
2.12.2

^ permalink raw reply related

* [PATCH v5 3/5] rxrpc: check return value of skb_to_sgvec always
From: Jason A. Donenfeld @ 2017-04-25 15:52 UTC (permalink / raw)
  To: netdev, linux-kernel, David.Laight, kernel-hardening, davem
  Cc: Jason A. Donenfeld
In-Reply-To: <20170425155215.4835-1-Jason@zx2c4.com>

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
---
 net/rxrpc/rxkad.c | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/net/rxrpc/rxkad.c b/net/rxrpc/rxkad.c
index 4374e7b9c7bf..dcf46c9c3ece 100644
--- a/net/rxrpc/rxkad.c
+++ b/net/rxrpc/rxkad.c
@@ -229,7 +229,9 @@ static int rxkad_secure_packet_encrypt(const struct rxrpc_call *call,
 	len &= ~(call->conn->size_align - 1);
 
 	sg_init_table(sg, nsg);
-	skb_to_sgvec(skb, sg, 0, len);
+	err = skb_to_sgvec(skb, sg, 0, len);
+	if (unlikely(err < 0))
+		goto out;
 	skcipher_request_set_crypt(req, sg, sg, len, iv.x);
 	crypto_skcipher_encrypt(req);
 
@@ -342,7 +344,8 @@ static int rxkad_verify_packet_1(struct rxrpc_call *call, struct sk_buff *skb,
 		goto nomem;
 
 	sg_init_table(sg, nsg);
-	skb_to_sgvec(skb, sg, offset, 8);
+	if (unlikely(skb_to_sgvec(skb, sg, offset, 8) < 0))
+		goto nomem;
 
 	/* start the decryption afresh */
 	memset(&iv, 0, sizeof(iv));
@@ -429,7 +432,8 @@ static int rxkad_verify_packet_2(struct rxrpc_call *call, struct sk_buff *skb,
 	}
 
 	sg_init_table(sg, nsg);
-	skb_to_sgvec(skb, sg, offset, len);
+	if (unlikely(skb_to_sgvec(skb, sg, offset, len) < 0))
+		goto nomem;
 
 	/* decrypt from the session key */
 	token = call->conn->params.key->payload.data[0];
-- 
2.12.2

^ permalink raw reply related

* [PATCH v5 4/5] macsec: check return value of skb_to_sgvec always
From: Jason A. Donenfeld @ 2017-04-25 15:52 UTC (permalink / raw)
  To: netdev, linux-kernel, David.Laight, kernel-hardening, davem
  Cc: Jason A. Donenfeld
In-Reply-To: <20170425155215.4835-1-Jason@zx2c4.com>

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
---
 drivers/net/macsec.c | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/drivers/net/macsec.c b/drivers/net/macsec.c
index dbab05afcdbe..d846f42b99ec 100644
--- a/drivers/net/macsec.c
+++ b/drivers/net/macsec.c
@@ -733,7 +733,12 @@ static struct sk_buff *macsec_encrypt(struct sk_buff *skb,
 	macsec_fill_iv(iv, secy->sci, pn);
 
 	sg_init_table(sg, MAX_SKB_FRAGS + 1);
-	skb_to_sgvec(skb, sg, 0, skb->len);
+	ret = skb_to_sgvec(skb, sg, 0, skb->len);
+	if (unlikely(ret < 0)) {
+		macsec_txsa_put(tx_sa);
+		kfree_skb(skb);
+		return ERR_PTR(ret);
+	}
 
 	if (tx_sc->encrypt) {
 		int len = skb->len - macsec_hdr_len(sci_present) -
@@ -937,7 +942,11 @@ static struct sk_buff *macsec_decrypt(struct sk_buff *skb,
 	macsec_fill_iv(iv, sci, ntohl(hdr->packet_number));
 
 	sg_init_table(sg, MAX_SKB_FRAGS + 1);
-	skb_to_sgvec(skb, sg, 0, skb->len);
+	ret = skb_to_sgvec(skb, sg, 0, skb->len);
+	if (unlikely(ret < 0)) {
+		kfree_skb(skb);
+		return ERR_PTR(ret);
+	}
 
 	if (hdr->tci_an & MACSEC_TCI_E) {
 		/* confidentiality: ethernet + macsec header
-- 
2.12.2

^ permalink raw reply related

* [PATCH v5 5/5] virtio_net: check return value of skb_to_sgvec always
From: Jason A. Donenfeld @ 2017-04-25 15:52 UTC (permalink / raw)
  To: netdev, linux-kernel, David.Laight, kernel-hardening, davem
  Cc: Jason A. Donenfeld
In-Reply-To: <20170425155215.4835-1-Jason@zx2c4.com>

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
---
 drivers/net/virtio_net.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index f36584616e7d..1709fd0b4bf7 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -1081,7 +1081,7 @@ static int xmit_skb(struct send_queue *sq, struct sk_buff *skb)
 	struct virtio_net_hdr_mrg_rxbuf *hdr;
 	const unsigned char *dest = ((struct ethhdr *)skb->data)->h_dest;
 	struct virtnet_info *vi = sq->vq->vdev->priv;
-	unsigned num_sg;
+	int num_sg;
 	unsigned hdr_len = vi->hdr_len;
 	bool can_push;
 
@@ -1114,6 +1114,8 @@ static int xmit_skb(struct send_queue *sq, struct sk_buff *skb)
 		sg_set_buf(sq->sg, hdr, hdr_len);
 		num_sg = skb_to_sgvec(skb, sq->sg + 1, 0, skb->len) + 1;
 	}
+	if (unlikely(num_sg < 0))
+		return num_sg;
 	return virtqueue_add_outbuf(sq->vq, sq->sg, num_sg, skb, GFP_ATOMIC);
 }
 
-- 
2.12.2

^ permalink raw reply related

* Re: [PATCH net-next 0/6] qed/qede: VF tunnelling support
From: David Miller @ 2017-04-25 15:52 UTC (permalink / raw)
  To: manish.chopra; +Cc: netdev, Yuval.Mintz
In-Reply-To: <20170424170049.16027-1-manish.chopra@cavium.com>

From: Manish Chopra <manish.chopra@cavium.com>
Date: Mon, 24 Apr 2017 10:00:43 -0700

> With this series VFs can run vxlan/geneve/gre tunnels over it.
> Please consider applying this series to "net-next"

Series applied.

^ permalink raw reply

* [PATCH v5 2/5] ipsec: check return value of skb_to_sgvec always
From: Jason A. Donenfeld @ 2017-04-25 15:52 UTC (permalink / raw)
  To: netdev, linux-kernel, David.Laight, kernel-hardening, davem
  Cc: Jason A. Donenfeld
In-Reply-To: <20170425155215.4835-1-Jason@zx2c4.com>

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
---
 net/ipv4/ah4.c  |  8 ++++++--
 net/ipv4/esp4.c | 30 ++++++++++++++++++++----------
 net/ipv6/ah6.c  |  8 ++++++--
 net/ipv6/esp6.c | 31 +++++++++++++++++++++----------
 4 files changed, 53 insertions(+), 24 deletions(-)

diff --git a/net/ipv4/ah4.c b/net/ipv4/ah4.c
index 22377c8ff14b..e8f862358518 100644
--- a/net/ipv4/ah4.c
+++ b/net/ipv4/ah4.c
@@ -220,7 +220,9 @@ static int ah_output(struct xfrm_state *x, struct sk_buff *skb)
 	ah->seq_no = htonl(XFRM_SKB_CB(skb)->seq.output.low);
 
 	sg_init_table(sg, nfrags + sglists);
-	skb_to_sgvec_nomark(skb, sg, 0, skb->len);
+	err = skb_to_sgvec_nomark(skb, sg, 0, skb->len);
+	if (unlikely(err < 0))
+		goto out_free;
 
 	if (x->props.flags & XFRM_STATE_ESN) {
 		/* Attach seqhi sg right after packet payload */
@@ -393,7 +395,9 @@ static int ah_input(struct xfrm_state *x, struct sk_buff *skb)
 	skb_push(skb, ihl);
 
 	sg_init_table(sg, nfrags + sglists);
-	skb_to_sgvec_nomark(skb, sg, 0, skb->len);
+	err = skb_to_sgvec_nomark(skb, sg, 0, skb->len);
+	if (unlikely(err < 0))
+		goto out_free;
 
 	if (x->props.flags & XFRM_STATE_ESN) {
 		/* Attach seqhi sg right after packet payload */
diff --git a/net/ipv4/esp4.c b/net/ipv4/esp4.c
index b1e24446e297..42cb09cc8533 100644
--- a/net/ipv4/esp4.c
+++ b/net/ipv4/esp4.c
@@ -360,9 +360,13 @@ static int esp_output(struct xfrm_state *x, struct sk_buff *skb)
 			esph = esp_output_set_extra(skb, esph, extra);
 
 			sg_init_table(sg, nfrags);
-			skb_to_sgvec(skb, sg,
-				     (unsigned char *)esph - skb->data,
-				     assoclen + ivlen + clen + alen);
+			err = skb_to_sgvec(skb, sg,
+				           (unsigned char *)esph - skb->data,
+				           assoclen + ivlen + clen + alen);
+			if (unlikely(err < 0)) {
+				spin_unlock_bh(&x->lock);
+				goto error;
+			}
 
 			allocsize = ALIGN(skb->data_len, L1_CACHE_BYTES);
 
@@ -381,11 +385,13 @@ static int esp_output(struct xfrm_state *x, struct sk_buff *skb)
 			pfrag->offset = pfrag->offset + allocsize;
 
 			sg_init_table(dsg, skb_shinfo(skb)->nr_frags + 1);
-			skb_to_sgvec(skb, dsg,
-				     (unsigned char *)esph - skb->data,
-				     assoclen + ivlen + clen + alen);
+			err = skb_to_sgvec(skb, dsg,
+				           (unsigned char *)esph - skb->data,
+				           assoclen + ivlen + clen + alen);
 
 			spin_unlock_bh(&x->lock);
+			if (unlikely(err < 0))
+				goto error;
 
 			goto skip_cow2;
 		}
@@ -422,9 +428,11 @@ static int esp_output(struct xfrm_state *x, struct sk_buff *skb)
 	esph = esp_output_set_extra(skb, esph, extra);
 
 	sg_init_table(sg, nfrags);
-	skb_to_sgvec(skb, sg,
-		     (unsigned char *)esph - skb->data,
-		     assoclen + ivlen + clen + alen);
+	err = skb_to_sgvec(skb, sg,
+		           (unsigned char *)esph - skb->data,
+		           assoclen + ivlen + clen + alen);
+	if (unlikely(err < 0))
+		goto error;
 
 skip_cow2:
 	if ((x->props.flags & XFRM_STATE_ESN))
@@ -658,7 +666,9 @@ static int esp_input(struct xfrm_state *x, struct sk_buff *skb)
 	esp_input_set_header(skb, seqhi);
 
 	sg_init_table(sg, nfrags);
-	skb_to_sgvec(skb, sg, 0, skb->len);
+	err = skb_to_sgvec(skb, sg, 0, skb->len);
+	if (unlikely(err < 0))
+		goto out;
 
 	skb->ip_summed = CHECKSUM_NONE;
 
diff --git a/net/ipv6/ah6.c b/net/ipv6/ah6.c
index dda6035e3b84..755f38271dd5 100644
--- a/net/ipv6/ah6.c
+++ b/net/ipv6/ah6.c
@@ -423,7 +423,9 @@ static int ah6_output(struct xfrm_state *x, struct sk_buff *skb)
 	ah->seq_no = htonl(XFRM_SKB_CB(skb)->seq.output.low);
 
 	sg_init_table(sg, nfrags + sglists);
-	skb_to_sgvec_nomark(skb, sg, 0, skb->len);
+	err = skb_to_sgvec_nomark(skb, sg, 0, skb->len);
+	if (unlikely(err < 0))
+		goto out_free;
 
 	if (x->props.flags & XFRM_STATE_ESN) {
 		/* Attach seqhi sg right after packet payload */
@@ -606,7 +608,9 @@ static int ah6_input(struct xfrm_state *x, struct sk_buff *skb)
 	ip6h->hop_limit   = 0;
 
 	sg_init_table(sg, nfrags + sglists);
-	skb_to_sgvec_nomark(skb, sg, 0, skb->len);
+	err = skb_to_sgvec_nomark(skb, sg, 0, skb->len);
+	if (unlikely(err < 0))
+		goto out_free;
 
 	if (x->props.flags & XFRM_STATE_ESN) {
 		/* Attach seqhi sg right after packet payload */
diff --git a/net/ipv6/esp6.c b/net/ipv6/esp6.c
index ff54faa75631..017e2c2d36e1 100644
--- a/net/ipv6/esp6.c
+++ b/net/ipv6/esp6.c
@@ -339,9 +339,13 @@ static int esp6_output(struct xfrm_state *x, struct sk_buff *skb)
 			esph = esp_output_set_esn(skb, esph, seqhi);
 
 			sg_init_table(sg, nfrags);
-			skb_to_sgvec(skb, sg,
-				     (unsigned char *)esph - skb->data,
-				     assoclen + ivlen + clen + alen);
+			err = skb_to_sgvec(skb, sg,
+				           (unsigned char *)esph - skb->data,
+				           assoclen + ivlen + clen + alen);
+			if (unlikely(err < 0)) {
+				spin_unlock_bh(&x->lock);
+				goto error;
+			}
 
 			allocsize = ALIGN(skb->data_len, L1_CACHE_BYTES);
 
@@ -360,12 +364,15 @@ static int esp6_output(struct xfrm_state *x, struct sk_buff *skb)
 			pfrag->offset = pfrag->offset + allocsize;
 
 			sg_init_table(dsg, skb_shinfo(skb)->nr_frags + 1);
-			skb_to_sgvec(skb, dsg,
-				     (unsigned char *)esph - skb->data,
-				     assoclen + ivlen + clen + alen);
+			err = skb_to_sgvec(skb, dsg,
+				           (unsigned char *)esph - skb->data,
+				           assoclen + ivlen + clen + alen);
 
 			spin_unlock_bh(&x->lock);
 
+			if (unlikely(err < 0))
+				goto error;
+
 			goto skip_cow2;
 		}
 	}
@@ -403,9 +410,11 @@ static int esp6_output(struct xfrm_state *x, struct sk_buff *skb)
 	esph = esp_output_set_esn(skb, esph, seqhi);
 
 	sg_init_table(sg, nfrags);
-	skb_to_sgvec(skb, sg,
-		     (unsigned char *)esph - skb->data,
-		     assoclen + ivlen + clen + alen);
+	err = skb_to_sgvec(skb, sg,
+		           (unsigned char *)esph - skb->data,
+		           assoclen + ivlen + clen + alen);
+	if (unlikely(err < 0))
+		goto error;
 
 skip_cow2:
 	if ((x->props.flags & XFRM_STATE_ESN))
@@ -600,7 +609,9 @@ static int esp6_input(struct xfrm_state *x, struct sk_buff *skb)
 	esp_input_set_header(skb, seqhi);
 
 	sg_init_table(sg, nfrags);
-	skb_to_sgvec(skb, sg, 0, skb->len);
+	ret = skb_to_sgvec(skb, sg, 0, skb->len);
+	if (unlikely(ret < 0))
+		goto out;
 
 	skb->ip_summed = CHECKSUM_NONE;
 
-- 
2.12.2

^ permalink raw reply related

* Re: [PATCH] net: hso: fix module unloading
From: David Miller @ 2017-04-25 15:52 UTC (permalink / raw)
  To: andreas; +Cc: joe, peter, linux-usb, netdev, linux-kernel, letux-kernel
In-Reply-To: <1493061519-15785-1-git-send-email-andreas@kemnade.info>

From: Andreas Kemnade <andreas@kemnade.info>
Date: Mon, 24 Apr 2017 21:18:39 +0200

> keep tty driver until usb driver is unregistered
> rmmod hso
> produces traces like this without that:
 ...
> Signed-off-by: Andreas Kemnade <andreas@kemnade.info>

Applied, thank you.

^ permalink raw reply

* Re: [PATCH v3 net] net: ipv6: regenerate host route if moved to gc list
From: David Ahern @ 2017-04-25 15:54 UTC (permalink / raw)
  To: Andrey Konovalov; +Cc: netdev, Dmitry Vyukov, mmanning, Martin KaFai Lau
In-Reply-To: <CAAeHK+yLe1PoFMqmGK0B6wXqfc0AZzRAmMLxks-vSkOp5Eq3VQ@mail.gmail.com>

On 4/25/17 6:50 AM, Andrey Konovalov wrote:
> I've been running syzkaller with your patch and got another report
> from ip6_pol_route.

In general the existing patch cleans up all of the ipv6 fib kasan and
WARN_ON traces that were seen?


> 
> It happened only once so far and I couldn't reproduce it.

Some similarity in the sense of a ipv4 route in the ipv6 fib. That's
only going to happen if it hits the dst gc list and then back.

This duplicates what Dmitry reported on March 3rd.

^ permalink raw reply

* Re: [PATCH] net: ethernet: ti: netcp_core: remove unused compl queue mapping
From: David Miller @ 2017-04-25 15:55 UTC (permalink / raw)
  To: ivan.khoronzhuk
  Cc: netdev, w-kwok2, m-karicheri2, linux-kernel, grygorii.strashko
In-Reply-To: <1493067246-3247-1-git-send-email-ivan.khoronzhuk@linaro.org>

From: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Date: Mon, 24 Apr 2017 23:54:06 +0300

> This code is unused and probably was unintentionally left while
> moving completion queue mapping in submit function.
> 
> Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>

Applied.

^ permalink raw reply

* Re: [PATCH net-next] selftests/net: Fix broken test case in psock_fanout
From: David Miller @ 2017-04-25 15:56 UTC (permalink / raw)
  To: maloneykernel; +Cc: netdev, maloney
In-Reply-To: <20170425012911.903-1-maloneykernel@gmail.com>

From: Mike Maloney <maloneykernel@gmail.com>
Date: Mon, 24 Apr 2017 21:29:11 -0400

> From: Mike Maloney <maloney@google.com>
> 
> The error return falue form sock_fanout_open is -1, not zero.  One test
> case was checking for 0 instead of -1.
> 
> Tested: Built and tested in clean client.
> Signed-off-by: Mike Maloney <maloney@google.com>

Applied, thanks.

^ permalink raw reply

* Re: net: heap out-of-bounds in fib6_clean_node/rt6_fill_node/fib6_age/fib6_prune_clone
From: David Ahern @ 2017-04-25 15:56 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Mahesh Bandewar, Eric Dumazet, David Miller, Alexey Kuznetsov,
	James Morris, Hideaki YOSHIFUJI, Patrick McHardy, netdev, LKML,
	Cong Wang, syzkaller
In-Reply-To: <CACT4Y+YCPxKkaZBfAabxtAV71R0ghLT-vUKVewJpm9wb2GXaEQ@mail.gmail.com>

On 3/4/17 11:57 AM, Dmitry Vyukov wrote:
> ==================================================================
> BUG: KASAN: slab-out-of-bounds in rt6_dump_route+0x293/0x2f0
> net/ipv6/route.c:3551 at addr ffff88007e523694
> Read of size 4 by task syz-executor3/24426
> CPU: 2 PID: 24426 Comm: syz-executor3 Not tainted 4.10.0+ #293
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
> Call Trace:
>  __dump_stack lib/dump_stack.c:16 [inline]
>  dump_stack+0x2fb/0x3fd lib/dump_stack.c:52
>  kasan_object_err+0x1c/0x90 mm/kasan/report.c:166
>  print_address_description mm/kasan/report.c:208 [inline]
>  kasan_report_error mm/kasan/report.c:292 [inline]
>  kasan_report.part.2+0x1b0/0x460 mm/kasan/report.c:314
>  kasan_report mm/kasan/report.c:334 [inline]
>  __asan_report_load4_noabort+0x29/0x30 mm/kasan/report.c:334
>  rt6_dump_route+0x293/0x2f0 net/ipv6/route.c:3551
>  fib6_dump_node+0x101/0x1a0 net/ipv6/ip6_fib.c:315
>  fib6_walk_continue+0x4b3/0x620 net/ipv6/ip6_fib.c:1576
>  fib6_walk+0x91/0xf0 net/ipv6/ip6_fib.c:1621
>  fib6_dump_table net/ipv6/ip6_fib.c:374 [inline]
>  inet6_dump_fib+0x832/0xea0 net/ipv6/ip6_fib.c:447

My expectation is that this one is fixed with the ipv6 patch I have sent
(not yet committed). Are you seeing this backtrace with that patch?

^ permalink raw reply

* Re: [PATCH net] netvsc: fix calculation of available send sections
From: David Miller @ 2017-04-25 15:57 UTC (permalink / raw)
  To: stephen; +Cc: netdev, sthemmin
In-Reply-To: <20170425013338.2653-1-sthemmin@microsoft.com>

From: Stephen Hemminger <stephen@networkplumber.org>
Date: Mon, 24 Apr 2017 18:33:38 -0700

> My change (introduced in 4.11) to use find_first_clear_bit
> incorrectly assumed that the size argument was words, not bits.
> The effect was only a small limited number of the available send
> sections were being actually used. This can cause performance loss
> with some workloads.
> 
> Since map_words is now used only during initialization, it can
> be on stack instead of in per-device data.
> 
> Fixes: b58a185801da ("netvsc: simplify get next send section")
> Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>

Applied, thanks.

^ permalink raw reply

* Re: net: heap out-of-bounds in fib6_clean_node/rt6_fill_node/fib6_age/fib6_prune_clone
From: David Ahern @ 2017-04-25 15:57 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Eric Dumazet, Mahesh Bandewar, Eric Dumazet, David Miller,
	Alexey Kuznetsov, James Morris, Hideaki YOSHIFUJI,
	Patrick McHardy, netdev, LKML, Cong Wang, syzkaller
In-Reply-To: <CACT4Y+aDzAYs9mbNGeFEkYSB2wTwomZTGEKn4hDH0PAb4uiLqw@mail.gmail.com>

On 3/7/17 2:21 AM, Dmitry Vyukov wrote:
> ------------[ cut here ]------------
> WARNING: CPU: 2 PID: 3990 at net/ipv6/ip6_fib.c:991
> fib6_add+0x2e12/0x3290 net/ipv6/ip6_fib.c:991 net/ipv6/ip6_fib.c:991
> Kernel panic - not syncing: panic_on_warn set ...
> 
> CPU: 2 PID: 3990 Comm: kworker/2:4 Not tainted 4.11.0-rc1+ #311
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
> Workqueue: ipv6_addrconf addrconf_dad_work
> Call Trace:
>  __dump_stack lib/dump_stack.c:16 [inline]
>  __dump_stack lib/dump_stack.c:16 [inline] lib/dump_stack.c:52
>  dump_stack+0x2fb/0x3fd lib/dump_stack.c:52 lib/dump_stack.c:52
>  panic+0x20f/0x426 kernel/panic.c:180 kernel/panic.c:180
>  __warn+0x1c4/0x1e0 kernel/panic.c:541 kernel/panic.c:541
>  warn_slowpath_null+0x2c/0x40 kernel/panic.c:584 kernel/panic.c:584
>  fib6_add+0x2e12/0x3290 net/ipv6/ip6_fib.c:991 net/ipv6/ip6_fib.c:991
>  __ip6_ins_rt+0x60/0x80 net/ipv6/route.c:948 net/ipv6/route.c:948
>  ip6_ins_rt+0x19b/0x220 net/ipv6/route.c:959 net/ipv6/route.c:959
>  __ipv6_ifa_notify+0x62e/0x7a0 net/ipv6/addrconf.c:5485 net/ipv6/addrconf.c:5485
>  ipv6_ifa_notify+0xdf/0x1d0 net/ipv6/addrconf.c:5518 net/ipv6/addrconf.c:5518
>  addrconf_dad_completed+0xe6/0x950 net/ipv6/addrconf.c:3983
> net/ipv6/addrconf.c:3983
>  addrconf_dad_begin net/ipv6/addrconf.c:3797 [inline]

Similarly for this one.

^ permalink raw reply

* Re: [PATCH net-next] bpf: map_get_next_key to return first key on NULL
From: David Miller @ 2017-04-25 15:57 UTC (permalink / raw)
  To: ast; +Cc: daniel, qinteng, netdev, kernel-team
In-Reply-To: <20170425020037.1114047-1-ast@fb.com>

From: Alexei Starovoitov <ast@fb.com>
Date: Mon, 24 Apr 2017 19:00:37 -0700

> From: Teng Qin <qinteng@fb.com>
> 
> When iterating through a map, we need to find a key that does not exist
> in the map so map_get_next_key will give us the first key of the map.
> This often requires a lot of guessing in production systems.
> 
> This patch makes map_get_next_key return the first key when the key
> pointer in the parameter is NULL.
> 
> Signed-off-by: Teng Qin <qinteng@fb.com>
> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
> Acked-by: Daniel Borkmann <daniel@iogearbox.net>

Applied, thanks Alexei.

^ permalink raw reply

* Re: [PATCH net 1/1] qed: Fix error in the dcbx app meta data initialization.
From: David Miller @ 2017-04-25 15:58 UTC (permalink / raw)
  To: sudarsana.kalluru; +Cc: netdev, Yuval.Mintz
In-Reply-To: <20170425035910.29296-1-sudarsana.kalluru@cavium.com>

From: Sudarsana Reddy Kalluru <sudarsana.kalluru@cavium.com>
Date: Mon, 24 Apr 2017 20:59:10 -0700

> DCBX app_data array is initialized with the incorrect values for
> personality field. This would  prevent offloaded protocols from
> honoring the PFC.
> 
> Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>

Applied.

^ permalink raw reply

* Re: [PATCH net] macsec: avoid heap overflow in skb_to_sgvec on receive
From: Jason A. Donenfeld @ 2017-04-25 15:59 UTC (permalink / raw)
  To: Sabrina Dubroca; +Cc: Netdev
In-Reply-To: <20170425155140.GA13414@bistromath.localdomain>

On Tue, Apr 25, 2017 at 5:51 PM, Sabrina Dubroca <sd@queasysnail.net> wrote:
> 2017-04-25, 17:39:09 +0200, Jason A. Donenfeld wrote:
>> Hi Sabrina,
>>
>> I think I may have beaten you to the punch here by a few minutes. :)
>
> I said I was going to post a patch.
> Mail headers seem to disagree with you ;)

Oh, whoops, sorry -- thought you wanted me to. Misread, but happy to
have done it anyway.

> Unless I missed something, encrypt was already handling fragments
> correctly. An skb with ->frag_list should have no skb_tailroom, so it
> will be linearized skb_copy_expand().

You're right. In that case, you might as well add back the FRAGLIST
annotation to the FEATURES, so that packets with frag_lists aren't
copied twice -- once upon linearization into xmit and again when
calling copy_expand in case they need more head or tail space. In
general, too, using the precise number of frags saves memory. But
above all, it seems to me the important thing is that it's very
_clear_ there isn't going to be an overflow, that the code doesn't
rely on so many odd particulars that then get violated later in its
life. Using the explicit length makes things a lot more clear. So
maybe better to go with mine?

^ permalink raw reply

* Re: [PATCH net-next] qed: fix invalid use of sizeof in qed_alloc_qm_data()
From: David Miller @ 2017-04-25 16:00 UTC (permalink / raw)
  To: weiyj.lk; +Cc: Yuval.Mintz, Ariel.Elior, weiyongjun1, everest-linux-l2, netdev
In-Reply-To: <20170425070718.14790-1-weiyj.lk@gmail.com>

From: Wei Yongjun <weiyj.lk@gmail.com>
Date: Tue, 25 Apr 2017 07:07:18 +0000

> From: Wei Yongjun <weiyongjun1@huawei.com>
> 
> sizeof() when applied to a pointer typed expression gives the
> size of the pointer, not that of the pointed data.
> 
> Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>

Applied with Yuval's Fixes: tag added.

^ permalink raw reply

* Re: [PATCH net-next v8 2/3] net sched actions: dump more than TCA_ACT_MAX_PRIO actions per batch
From: Jiri Pirko @ 2017-04-25 16:04 UTC (permalink / raw)
  To: Jamal Hadi Salim; +Cc: davem, xiyou.wangcong, eric.dumazet, netdev
In-Reply-To: <5e54edd8-3943-6f09-490f-ff04b83077f6@mojatatu.com>

Tue, Apr 25, 2017 at 03:01:22PM CEST, jhs@mojatatu.com wrote:
>On 17-04-25 08:13 AM, Jiri Pirko wrote:
>> Tue, Apr 25, 2017 at 01:54:06PM CEST, jhs@mojatatu.com wrote:
>
>
>[..]
>
>> > -#define TCAA_MAX 1
>> > +/* tcamsg flags stored in attribute TCA_ROOT_FLAGS
>> > + *
>> > + * TCA_FLAG_LARGE_DUMP_ON user->kernel to request for larger than TCA_ACT_MAX_PRIO
>> > + * actions in a dump. All dump responses will contain the number of actions
>> > + * being dumped stored in for user app's consumption in TCA_ROOT_COUNT
>> > + *
>> > + */
>> > +#define TCA_FLAG_LARGE_DUMP_ON		(1 << 0)
>> 
>> BIT (I think I mentioned this before)
>> 
>
>You did - but i took it out about two submissions back (per cover
>letter) because it is no part of UAPI today. I noticed devlink was
>using it but they defined  their own variant.
>So if i added this, iproute2 doesnt compile. I could fix iproute2
>to move it somewhere to a common header then restore this.

So fix iproute2. It is always first kernel, then iproute2.


>
>> > +#define VALID_TCA_FLAGS TCA_FLAG_LARGE_DUMP_ON
>> 
>> Again, namespace please. "TCA_ROOT_FLAGS_VALID"
>
>Good point.
>
>> Why is this UAPI?
>> 
>
>Should not be. I will fix in next update.
>
>
>> 
>> > 
>> > /* New extended info filters for IFLA_EXT_MASK */
>> > #define RTEXT_FILTER_VF		(1 << 0)
>> > diff --git a/net/sched/act_api.c b/net/sched/act_api.c
>> > index 9ce22b7..2756213 100644
>> > --- a/net/sched/act_api.c
>> > +++ b/net/sched/act_api.c
>> > @@ -83,6 +83,7 @@ static int tcf_dump_walker(struct tcf_hashinfo *hinfo, struct sk_buff *skb,
>> > 			   struct netlink_callback *cb)
>> > {
>> > 	int err = 0, index = -1, i = 0, s_i = 0, n_i = 0;
>> > +	u32 act_flags = cb->args[2];
>> > 	struct nlattr *nest;
>> > 
>> > 	spin_lock_bh(&hinfo->lock);
>> > @@ -111,14 +112,18 @@ static int tcf_dump_walker(struct tcf_hashinfo *hinfo, struct sk_buff *skb,
>> > 			}
>> > 			nla_nest_end(skb, nest);
>> > 			n_i++;
>> > -			if (n_i >= TCA_ACT_MAX_PRIO)
>> > +			if (!(act_flags & TCA_FLAG_LARGE_DUMP_ON) &&
>> > +			    n_i >= TCA_ACT_MAX_PRIO)
>> > 				goto done;
>> > 		}
>> > 	}
>> > done:
>> > 	spin_unlock_bh(&hinfo->lock);
>> > -	if (n_i)
>> > +	if (n_i) {
>> > 		cb->args[0] += n_i;
>> > +		if (act_flags & TCA_FLAG_LARGE_DUMP_ON)
>> > +			cb->args[1] = n_i;
>> > +	}
>> > 	return n_i;
>> > 
>> > nla_put_failure:
>> > @@ -993,11 +998,15 @@ static int tcf_action_add(struct net *net, struct nlattr *nla,
>> > 	return tcf_add_notify(net, n, &actions, portid);
>> > }
>> > 
>> > +static const struct nla_policy tcaa_policy[TCA_ROOT_MAX + 1] = {
>> > +	[TCA_ROOT_FLAGS]      = { .type = NLA_U32 },
>> > +};
>> > +
>> > static int tc_ctl_action(struct sk_buff *skb, struct nlmsghdr *n,
>> > 			 struct netlink_ext_ack *extack)
>> > {
>> > 	struct net *net = sock_net(skb->sk);
>> > -	struct nlattr *tca[TCAA_MAX + 1];
>> > +	struct nlattr *tca[TCA_ROOT_MAX + 1];
>> > 	u32 portid = skb ? NETLINK_CB(skb).portid : 0;
>> > 	int ret = 0, ovr = 0;
>> > 
>> > @@ -1005,7 +1014,7 @@ static int tc_ctl_action(struct sk_buff *skb, struct nlmsghdr *n,
>> > 	    !netlink_capable(skb, CAP_NET_ADMIN))
>> > 		return -EPERM;
>> > 
>> > -	ret = nlmsg_parse(n, sizeof(struct tcamsg), tca, TCAA_MAX, NULL,
>> > +	ret = nlmsg_parse(n, sizeof(struct tcamsg), tca, TCA_ROOT_MAX, NULL,
>> > 			  extack);
>> > 	if (ret < 0)
>> > 		return ret;
>> > @@ -1046,16 +1055,12 @@ static int tc_ctl_action(struct sk_buff *skb, struct nlmsghdr *n,
>> > 	return ret;
>> > }
>> > 
>> > -static struct nlattr *find_dump_kind(const struct nlmsghdr *n)
>> > +static struct nlattr *find_dump_kind(struct nlattr **nla)
>> > {
>> > 	struct nlattr *tb1, *tb2[TCA_ACT_MAX + 1];
>> > 	struct nlattr *tb[TCA_ACT_MAX_PRIO + 1];
>> > -	struct nlattr *nla[TCAA_MAX + 1];
>> > 	struct nlattr *kind;
>> > 
>> > -	if (nlmsg_parse(n, sizeof(struct tcamsg), nla, TCAA_MAX,
>> > -			NULL, NULL) < 0)
>> > -		return NULL;
>> > 	tb1 = nla[TCA_ACT_TAB];
>> > 	if (tb1 == NULL)
>> > 		return NULL;
>> > @@ -1073,6 +1078,16 @@ static struct nlattr *find_dump_kind(const struct nlmsghdr *n)
>> > 	return kind;
>> > }
>> > 
>> > +static inline bool tca_flags_valid(u32 act_flags)
>> > +{
>> > +	u32 invalid_flags_mask  = ~VALID_TCA_FLAGS;
>> > +
>> > +	if (act_flags & invalid_flags_mask)
>> > +		return false;
>> 
>> I don't see how this resolves anything. VALID_TCA_FLAGS is set in stone
>> not going to change anytime in future, right?
>
>Every time a new bit gets added VALID_TCA_FLAGS changes.

You mean flag that user can set? If that is the case, you are breaking
UAPI for newer app running on older kernel.


>
>> Then the TCA_ROOT_FLAGS
>> attr will always contain only one flag, right?
>
>Not true - forever. Just in this patch discussion:
>I have added 2 other flags since removed. So I cant predict the
>future. You keep coming back to this assumption there will always
>ever be this one flag. I am not following how you reach this
>conclusion.

I read this paragraph 5 times, still I don't get what you say :/


>
>> Then, I don't see why do we need this dance... u8 flag attr resolves
>> this in elegant way. I know I sound like a broken record. This is the
>> last time I'm commenting on this. I give up.
>> 
>
>Why is a u8 better than u32 Jiri?
>The TLV space consumed is still 64 bits in both cases. If I define u8,
>u16, u32 - it is still 64 bits being alloced. If i use a u8 then i have
>24 bits which are pads - given current discussions I can never ever use
>again!

I don't care, use u8 or u32. Just 1 attr - 1 flag.


>
>If i keep 32 bits I am free to use those without ever changing the
>data structure (except for the restrictions to now make sure nothing
>gets set).

what datastructure? I'm confused.

>
>cheers,
>jamal

^ permalink raw reply

* Re: qed*: debug infrastructures
From: Florian Fainelli @ 2017-04-25 16:13 UTC (permalink / raw)
  To: Elior, Ariel, David Miller
  Cc: netdev@vger.kernel.org, Mintz, Yuval, Tayar, Tomer, Dupuis, Chad
In-Reply-To: <CY1PR0701MB1337FA26F1503FC9928B0340901F0@CY1PR0701MB1337.namprd07.prod.outlook.com>

On 04/24/2017 10:38 AM, Elior, Ariel wrote:
> Hi Dave,
> 
> According to the recent messages on the list indicating debugfs is not the way
> to go, I am looking for some guidance on what is. dpipe approach was
> mentioned as favorable, but I wanted to make sure molding our debug features to
> this infrastructure will result in something acceptable. A few points:
> 
> 1.
> One of our HW debug features is a signal recording feature. HW is configured to
> output specific signals, which are then continuously dumped into a cyclic
> buffer on host. There are ~8000 signals, which can be logically divided to ~100
> groups. I believe this can be modeled in dpipe (or similar tool) as a set of
> ~100 tables with ~80 entries each, and user would be able to see them all and
> choose what they like. The output data itself is binary, and meaningless to
> "the user". The amount of data is basically as large a contiguous buffer as
> driver can allocate, i.e. usually 4MB. When user selects the signals, and sets
> meta data regarding to mode of operations, some device configuration will have
> to take place. Does that sound reasonable?
> This debug feature already exists out of tree for bnx2x and qed* drivers and is
> *very* effective in field deployments. I would very much like to see this as an
> in-tree feature via some infrastructure or another.

By signals do you mean actual electrical signals coming from your chip's
pins/balls? FWIW, there is kind of something similar existing with ARM
Ltd's Embedded Trace Module which allows capturing a SoC's CPU activity
(loads/stores/pc progression etc.) using a high speed interface. You may
want to look at drivers/hwtracing/{coresight,stm}/ for examples on how
this interfaces with Linux' perf events subsystems. There are existing
trace buffers infrastructures and it sounds like your use case fits
nicely within the perf events framework here where you may want the
entire set or subset of these 8000 signals to be recorded in a buffer.


> 
> 2.
> Sometimes we want to debug the probe or removal flow of the driver. ethtool has
> the disadvantage of only being available once network device is available. If a
> bug stops the load flow before the ethtool debug paths are available, there is
> no way to collect a dump. Similarly, removal flows which hit a bug but do remove
> the network device, can't be debugged from ethtool. Does dpipe suffer from the
> same problem? qed* (like mlx*) has a common-functionality module. This allows
> creating debugfs nodes even before the network drivers are probed, providing a
> solution for this (debug nodes are also available after network driver removal).
> If dpipe does hold an answer here (e.g. provide preconfiguration which would
> kick in when network device registers) then we might want to port all of our
> register dump logic over there for this advantage. Does that sound reasonable?

Can you consider using tracing for that particular purpose with trace
filters that are specific to the device of interest? That should allow
you to get events down from the Linux device driver model all the way to
when your driver actually gets initialized. Is that an option?
-- 
Florian

^ permalink raw reply

* Re: IGMP on IPv6
From: Murali Karicheri @ 2017-04-25 16:16 UTC (permalink / raw)
  To: Cong Wang; +Cc: Hangbin Liu, open list:TI NETCP ETHERNET DRIVER, David Miller
In-Reply-To: <CAM_iQpWJ_oPSWV-PBeBGj5LoddMHTmrimLbViXLJ-VKR21e4mQ@mail.gmail.com>

On 04/18/2017 06:37 PM, Cong Wang wrote:
> On Tue, Apr 18, 2017 at 10:20 AM, Murali Karicheri <m-karicheri2@ti.com> wrote:
>> On 04/18/2017 01:12 PM, Murali Karicheri wrote:
>>> On 04/17/2017 05:38 PM, Cong Wang wrote:
>>>> Hello,
>>>>
>>>> On Thu, Apr 13, 2017 at 9:36 AM, Murali Karicheri <m-karicheri2@ti.com> wrote:
>>>>> On 03/22/2017 11:04 AM, Murali Karicheri wrote:
>>>>>> This is going directly to the slave Ethernet interface.
>>>>>>
>>>>>> When I put a WARN_ONCE, I found this is coming directly from
>>>>>> mld_ifc_timer_expire() -> mld_sendpack() -> ip6_output()
>>>>>>
>>>>>> Do you think this is fixed in latest kernel at master? If so, could
>>>>>> you point me to some commits.
>>>>>>
>>>>>>
>>>>> Ping... I see this behavior is also seen on v4.9.x Kernel. Any clue if
>>>>> this is fixed by some commit or I need to debug? I see IGMPv6 has some
>>>>> fixes on the list to make it similar to IGMPv4. So can someone clarify this is
>>>>> is a bug at IGMPv6 code or I need to look into the HSR driver code?
>>>>> Since IGMPv4 is going over the HSR interface I am assuming this is a
>>>>> bug in the IGMPv6 code. But since I have not experience with this code
>>>>> can some expert comment please?
>>>>>
>>>>
>>>> How did you configure your network interfaces and IPv4/IPv6 multicast?
>>>> IOW, how did you reproduce this? For example, did you change your
>>>> HSR setup when this happened since you mentioned
>>>> NETDEV_CHANGEUPPER?
>>>>
>>> Thanks for responding! Really appreciate.
>>>
>>> I didn't set up anything explicitly for IPv4/IPv6 multicast. As part of
>>> my testing, I dump the packets going through the slave interfaces attached
>>> to the hsr interface (for example my Ethernet interfaces eth2 and eth3
>>> are attached to the hsr interface and I dump the packets at the egress
>>> of eth2 and eth3 in my driver along with that at hsr xmit function). As
>>> soon as I create the hsr interface, I see a bunch of packets going directly
>>> through the lower interface, not through the upper one (i.e hsr interface)
>>> and these are of eth_type = 86 dd. Please ignore my reference to
>>> NETDEV_CHANGEUPPER for now as it was wild guess.
> 
> OK. Note: I know nothing about HSR, I assume it is similar to bonding
> in your case?
> 

Similar in the sense it glues two standard Ethernet interfaces and run
HSR protocol frames over it to support redundancy.

>>>
>>> I have not done any debugging, but the WARN_ONCE which I have placed
>>> in the lower level driver looking for eth_type = 86 dd provided the
>>> above trace.
>>>
>> Here is the command I have used to create the hsr interface...
>>
>> ip link add name hsr0 type hsr slave1 eth2 slave2 eth3 supervision 45 version 1
> 
> Did you assign IPv4 and IPv6 addresses to the HSR master device?

No. I just used IPv4. From the trace mld_ifc_timer_expire() -> mld_sendpack() -> ip6_output()
do you know what is it trying to do? Is it some neighbor discovery message or something
going over the lower interface instead of the hsr interface?

Murali

> 


-- 
Murali Karicheri
Linux Kernel, Keystone

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox