Netdev List
 help / color / mirror / Atom feed
* [PATCH 07/16] irda: Fix missing msg_namelen update in irda_recvmsg_dgram()
From: Mathias Krause @ 2013-04-07 11:51 UTC (permalink / raw)
  To: David S. Miller
  Cc: netdev, Allan Stephens, Aloisio Almeida Jr, Andy King,
	Arnaldo Carvalho de Melo, Dmitry Torokhov, George Zhang,
	Gustavo Padovan, Johan Hedberg, Jon Maloy, Lauro Ramos Venancio,
	Marcel Holtmann, Ralf Baechle, Samuel Ortiz, Samuel Ortiz,
	Sjur Braendeland, Ursula Braun, Brad Spengler, Mathias Krause
In-Reply-To: <1365335522-29931-1-git-send-email-minipli@googlemail.com>

The current code does not fill the msg_name member in case it is set.
It also does not set the msg_namelen member to 0 and therefore makes
net/socket.c leak the local, uninitialized sockaddr_storage variable
to userland -- 128 bytes of kernel stack memory.

Fix that by simply setting msg_namelen to 0 as obviously nobody cared
about irda_recvmsg_dgram() not filling the msg_name in case it was
set.

Cc: Samuel Ortiz <samuel@sortiz.org>
Signed-off-by: Mathias Krause <minipli@googlemail.com>
---
 net/irda/af_irda.c |    2 ++
 1 file changed, 2 insertions(+)

diff --git a/net/irda/af_irda.c b/net/irda/af_irda.c
index d28e7f0..e493b33 100644
--- a/net/irda/af_irda.c
+++ b/net/irda/af_irda.c
@@ -1386,6 +1386,8 @@ static int irda_recvmsg_dgram(struct kiocb *iocb, struct socket *sock,
 
 	IRDA_DEBUG(4, "%s()\n", __func__);
 
+	msg->msg_namelen = 0;
+
 	skb = skb_recv_datagram(sk, flags & ~MSG_DONTWAIT,
 				flags & MSG_DONTWAIT, &err);
 	if (!skb)
-- 
1.7.10.4

^ permalink raw reply related

* [PATCH 08/16] iucv: Fix missing msg_namelen update in iucv_sock_recvmsg()
From: Mathias Krause @ 2013-04-07 11:51 UTC (permalink / raw)
  To: David S. Miller
  Cc: netdev, Allan Stephens, Aloisio Almeida Jr, Andy King,
	Arnaldo Carvalho de Melo, Dmitry Torokhov, George Zhang,
	Gustavo Padovan, Johan Hedberg, Jon Maloy, Lauro Ramos Venancio,
	Marcel Holtmann, Ralf Baechle, Samuel Ortiz, Samuel Ortiz,
	Sjur Braendeland, Ursula Braun, Brad Spengler, Mathias Krause
In-Reply-To: <1365335522-29931-1-git-send-email-minipli@googlemail.com>

The current code does not fill the msg_name member in case it is set.
It also does not set the msg_namelen member to 0 and therefore makes
net/socket.c leak the local, uninitialized sockaddr_storage variable
to userland -- 128 bytes of kernel stack memory.

Fix that by simply setting msg_namelen to 0 as obviously nobody cared
about iucv_sock_recvmsg() not filling the msg_name in case it was set.

Cc: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: Mathias Krause <minipli@googlemail.com>
---
Remark: I failed to test that one as I've no access to a S390 system.

 net/iucv/af_iucv.c |    2 ++
 1 file changed, 2 insertions(+)

diff --git a/net/iucv/af_iucv.c b/net/iucv/af_iucv.c
index a7d11ffe..bf69358 100644
--- a/net/iucv/af_iucv.c
+++ b/net/iucv/af_iucv.c
@@ -1328,6 +1328,8 @@ static int iucv_sock_recvmsg(struct kiocb *iocb, struct socket *sock,
 	struct sk_buff *skb, *rskb, *cskb;
 	int err = 0;
 
+	msg->msg_namelen = 0;
+
 	if ((sk->sk_state == IUCV_DISCONN) &&
 	    skb_queue_empty(&iucv->backlog_skb_q) &&
 	    skb_queue_empty(&sk->sk_receive_queue) &&
-- 
1.7.10.4

^ permalink raw reply related

* [PATCH 09/16] l2tp: fix info leak in l2tp_ip6_recvmsg()
From: Mathias Krause @ 2013-04-07 11:51 UTC (permalink / raw)
  To: David S. Miller
  Cc: netdev, Allan Stephens, Aloisio Almeida Jr, Andy King,
	Arnaldo Carvalho de Melo, Dmitry Torokhov, George Zhang,
	Gustavo Padovan, Johan Hedberg, Jon Maloy, Lauro Ramos Venancio,
	Marcel Holtmann, Ralf Baechle, Samuel Ortiz, Samuel Ortiz,
	Sjur Braendeland, Ursula Braun, Brad Spengler, Mathias Krause
In-Reply-To: <1365335522-29931-1-git-send-email-minipli@googlemail.com>

The L2TP code for IPv6 fails to initialize the l2tp_conn_id member of
struct sockaddr_l2tpip6 and therefore leaks four bytes kernel stack
in l2tp_ip6_recvmsg() in case msg_name is set.

Initialize l2tp_conn_id with 0 to avoid the info leak.

Signed-off-by: Mathias Krause <minipli@googlemail.com>
---
 net/l2tp/l2tp_ip6.c |    1 +
 1 file changed, 1 insertion(+)

diff --git a/net/l2tp/l2tp_ip6.c b/net/l2tp/l2tp_ip6.c
index c74f5a9..b8a6039 100644
--- a/net/l2tp/l2tp_ip6.c
+++ b/net/l2tp/l2tp_ip6.c
@@ -690,6 +690,7 @@ static int l2tp_ip6_recvmsg(struct kiocb *iocb, struct sock *sk,
 		lsa->l2tp_addr = ipv6_hdr(skb)->saddr;
 		lsa->l2tp_flowinfo = 0;
 		lsa->l2tp_scope_id = 0;
+		lsa->l2tp_conn_id = 0;
 		if (ipv6_addr_type(&lsa->l2tp_addr) & IPV6_ADDR_LINKLOCAL)
 			lsa->l2tp_scope_id = IP6CB(skb)->iif;
 	}
-- 
1.7.10.4

^ permalink raw reply related

* [PATCH 10/16] llc: Fix missing msg_namelen update in llc_ui_recvmsg()
From: Mathias Krause @ 2013-04-07 11:51 UTC (permalink / raw)
  To: David S. Miller
  Cc: netdev, Allan Stephens, Aloisio Almeida Jr, Andy King,
	Arnaldo Carvalho de Melo, Dmitry Torokhov, George Zhang,
	Gustavo Padovan, Johan Hedberg, Jon Maloy, Lauro Ramos Venancio,
	Marcel Holtmann, Ralf Baechle, Samuel Ortiz, Samuel Ortiz,
	Sjur Braendeland, Ursula Braun, Brad Spengler, Mathias Krause
In-Reply-To: <1365335522-29931-1-git-send-email-minipli@googlemail.com>

For stream sockets the code misses to update the msg_namelen member
to 0 and therefore makes net/socket.c leak the local, uninitialized
sockaddr_storage variable to userland -- 128 bytes of kernel stack
memory. The msg_namelen update is also missing for datagram sockets
in case the socket is shutting down during receive.

Fix both issues by setting msg_namelen to 0 early. It will be
updated later if we're going to fill the msg_name member.

Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Signed-off-by: Mathias Krause <minipli@googlemail.com>
---
 net/llc/af_llc.c |    2 ++
 1 file changed, 2 insertions(+)

diff --git a/net/llc/af_llc.c b/net/llc/af_llc.c
index 8870988..48aaa89 100644
--- a/net/llc/af_llc.c
+++ b/net/llc/af_llc.c
@@ -720,6 +720,8 @@ static int llc_ui_recvmsg(struct kiocb *iocb, struct socket *sock,
 	int target;	/* Read at least this many bytes */
 	long timeo;
 
+	msg->msg_namelen = 0;
+
 	lock_sock(sk);
 	copied = -ENOTCONN;
 	if (unlikely(sk->sk_type == SOCK_STREAM && sk->sk_state == TCP_LISTEN))
-- 
1.7.10.4

^ permalink raw reply related

* [PATCH 11/16] netrom: fix info leak via msg_name in nr_recvmsg()
From: Mathias Krause @ 2013-04-07 11:51 UTC (permalink / raw)
  To: David S. Miller
  Cc: netdev, Allan Stephens, Aloisio Almeida Jr, Andy King,
	Arnaldo Carvalho de Melo, Dmitry Torokhov, George Zhang,
	Gustavo Padovan, Johan Hedberg, Jon Maloy, Lauro Ramos Venancio,
	Marcel Holtmann, Ralf Baechle, Samuel Ortiz, Samuel Ortiz,
	Sjur Braendeland, Ursula Braun, Brad Spengler, Mathias Krause
In-Reply-To: <1365335522-29931-1-git-send-email-minipli@googlemail.com>

In case msg_name is set the sockaddr info gets filled out, as
requested, but the code fails to initialize the padding bytes of
struct sockaddr_ax25 inserted by the compiler for alignment. Also
the sax25_ndigis member does not get assigned, leaking four more
bytes.

Both issues lead to the fact that the code will leak uninitialized
kernel stack bytes in net/socket.c.

Fix both issues by initializing the memory with memset(0).

Cc: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Mathias Krause <minipli@googlemail.com>
---
 net/netrom/af_netrom.c |    1 +
 1 file changed, 1 insertion(+)

diff --git a/net/netrom/af_netrom.c b/net/netrom/af_netrom.c
index d1fa1d9..7fcb307 100644
--- a/net/netrom/af_netrom.c
+++ b/net/netrom/af_netrom.c
@@ -1173,6 +1173,7 @@ static int nr_recvmsg(struct kiocb *iocb, struct socket *sock,
 	}
 
 	if (sax != NULL) {
+		memset(sax, 0, sizeof(sax));
 		sax->sax25_family = AF_NETROM;
 		skb_copy_from_linear_data_offset(skb, 7, sax->sax25_call.ax25_call,
 			      AX25_ADDR_LEN);
-- 
1.7.10.4

^ permalink raw reply related

* [PATCH 12/16] NFC: llcp: fix info leaks via msg_name in llcp_sock_recvmsg()
From: Mathias Krause @ 2013-04-07 11:51 UTC (permalink / raw)
  To: David S. Miller
  Cc: netdev, Allan Stephens, Aloisio Almeida Jr, Andy King,
	Arnaldo Carvalho de Melo, Dmitry Torokhov, George Zhang,
	Gustavo Padovan, Johan Hedberg, Jon Maloy, Lauro Ramos Venancio,
	Marcel Holtmann, Ralf Baechle, Samuel Ortiz, Samuel Ortiz,
	Sjur Braendeland, Ursula Braun, Brad Spengler, Mathias Krause
In-Reply-To: <1365335522-29931-1-git-send-email-minipli@googlemail.com>

The code in llcp_sock_recvmsg() does not initialize all the members of
struct sockaddr_nfc_llcp when filling the sockaddr info. Nor does it
initialize the padding bytes of the structure inserted by the compiler
for alignment.

Also, if the socket is in state LLCP_CLOSED or is shutting down during
receive the msg_namelen member is not updated to 0 while otherwise
returning with 0, i.e. "success". The msg_namelen update is also
missing for stream and seqpacket sockets which don't fill the sockaddr
info.

Both issues lead to the fact that the code will leak uninitialized
kernel stack bytes in net/socket.c.

Fix the first issue by initializing the memory used for sockaddr info
with memset(0). Fix the second one by setting msg_namelen to 0 early.
It will be updated later if we're going to fill the msg_name member.

Cc: Lauro Ramos Venancio <lauro.venancio@openbossa.org>
Cc: Aloisio Almeida Jr <aloisio.almeida@openbossa.org>
Cc: Samuel Ortiz <sameo@linux.intel.com>
Signed-off-by: Mathias Krause <minipli@googlemail.com>
---
 net/nfc/llcp/sock.c |    3 +++
 1 file changed, 3 insertions(+)

diff --git a/net/nfc/llcp/sock.c b/net/nfc/llcp/sock.c
index 5c7cdf3..4741adc 100644
--- a/net/nfc/llcp/sock.c
+++ b/net/nfc/llcp/sock.c
@@ -646,6 +646,8 @@ static int llcp_sock_recvmsg(struct kiocb *iocb, struct socket *sock,
 
 	pr_debug("%p %zu\n", sk, len);
 
+	msg->msg_namelen = 0;
+
 	lock_sock(sk);
 
 	if (sk->sk_state == LLCP_CLOSED &&
@@ -691,6 +693,7 @@ static int llcp_sock_recvmsg(struct kiocb *iocb, struct socket *sock,
 
 		pr_debug("Datagram socket %d %d\n", ui_cb->dsap, ui_cb->ssap);
 
+		memset(sockaddr, 0, sizeof(*sockaddr));
 		sockaddr->sa_family = AF_NFC;
 		sockaddr->nfc_protocol = NFC_PROTO_NFC_DEP;
 		sockaddr->dsap = ui_cb->dsap;
-- 
1.7.10.4

^ permalink raw reply related

* [PATCH 13/16] rose: fix info leak via msg_name in rose_recvmsg()
From: Mathias Krause @ 2013-04-07 11:51 UTC (permalink / raw)
  To: David S. Miller
  Cc: netdev, Allan Stephens, Aloisio Almeida Jr, Andy King,
	Arnaldo Carvalho de Melo, Dmitry Torokhov, George Zhang,
	Gustavo Padovan, Johan Hedberg, Jon Maloy, Lauro Ramos Venancio,
	Marcel Holtmann, Ralf Baechle, Samuel Ortiz, Samuel Ortiz,
	Sjur Braendeland, Ursula Braun, Brad Spengler, Mathias Krause
In-Reply-To: <1365335522-29931-1-git-send-email-minipli@googlemail.com>

The code in rose_recvmsg() does not initialize all of the members of
struct sockaddr_rose/full_sockaddr_rose when filling the sockaddr info.
Nor does it initialize the padding bytes of the structure inserted by
the compiler for alignment. This will lead to leaking uninitialized
kernel stack bytes in net/socket.c.

Fix the issue by initializing the memory used for sockaddr info with
memset(0).

Cc: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Mathias Krause <minipli@googlemail.com>
---
 net/rose/af_rose.c |    1 +
 1 file changed, 1 insertion(+)

diff --git a/net/rose/af_rose.c b/net/rose/af_rose.c
index cf68e6e..9c83474 100644
--- a/net/rose/af_rose.c
+++ b/net/rose/af_rose.c
@@ -1253,6 +1253,7 @@ static int rose_recvmsg(struct kiocb *iocb, struct socket *sock,
 	skb_copy_datagram_iovec(skb, 0, msg->msg_iov, copied);
 
 	if (srose != NULL) {
+		memset(srose, 0, msg->msg_namelen);
 		srose->srose_family = AF_ROSE;
 		srose->srose_addr   = rose->dest_addr;
 		srose->srose_call   = rose->dest_call;
-- 
1.7.10.4

^ permalink raw reply related

* [PATCH 14/16] tipc: fix info leaks via msg_name in recv_msg/recv_stream
From: Mathias Krause @ 2013-04-07 11:52 UTC (permalink / raw)
  To: David S. Miller
  Cc: netdev, Allan Stephens, Aloisio Almeida Jr, Andy King,
	Arnaldo Carvalho de Melo, Dmitry Torokhov, George Zhang,
	Gustavo Padovan, Johan Hedberg, Jon Maloy, Lauro Ramos Venancio,
	Marcel Holtmann, Ralf Baechle, Samuel Ortiz, Samuel Ortiz,
	Sjur Braendeland, Ursula Braun, Brad Spengler, Mathias Krause
In-Reply-To: <1365335522-29931-1-git-send-email-minipli@googlemail.com>

The code in set_orig_addr() does not initialize all of the members of
struct sockaddr_tipc when filling the sockaddr info -- namely the union
is only partly filled. This will make recv_msg() and recv_stream() --
the only users of this function -- leak kernel stack memory as the
msg_name member is a local variable in net/socket.c.

Additionally to that both recv_msg() and recv_stream() fail to update
the msg_namelen member to 0 while otherwise returning with 0, i.e.
"success". This is the case for, e.g., non-blocking sockets. This will
lead to a 128 byte kernel stack leak in net/socket.c.

Fix the first issue by initializing the memory of the union with
memset(0). Fix the second one by setting msg_namelen to 0 early as it
will be updated later if we're going to fill the msg_name member.

Cc: Jon Maloy <jon.maloy@ericsson.com>
Cc: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Mathias Krause <minipli@googlemail.com>
---
 net/tipc/socket.c |    7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/net/tipc/socket.c b/net/tipc/socket.c
index a9622b6..515ce38 100644
--- a/net/tipc/socket.c
+++ b/net/tipc/socket.c
@@ -790,6 +790,7 @@ static void set_orig_addr(struct msghdr *m, struct tipc_msg *msg)
 	if (addr) {
 		addr->family = AF_TIPC;
 		addr->addrtype = TIPC_ADDR_ID;
+		memset(&addr->addr, 0, sizeof(addr->addr));
 		addr->addr.id.ref = msg_origport(msg);
 		addr->addr.id.node = msg_orignode(msg);
 		addr->addr.name.domain = 0;	/* could leave uninitialized */
@@ -904,6 +905,9 @@ static int recv_msg(struct kiocb *iocb, struct socket *sock,
 		goto exit;
 	}
 
+	/* will be updated in set_orig_addr() if needed */
+	m->msg_namelen = 0;
+
 	timeout = sock_rcvtimeo(sk, flags & MSG_DONTWAIT);
 restart:
 
@@ -1013,6 +1017,9 @@ static int recv_stream(struct kiocb *iocb, struct socket *sock,
 		goto exit;
 	}
 
+	/* will be updated in set_orig_addr() if needed */
+	m->msg_namelen = 0;
+
 	target = sock_rcvlowat(sk, flags & MSG_WAITALL, buf_len);
 	timeout = sock_rcvtimeo(sk, flags & MSG_DONTWAIT);
 
-- 
1.7.10.4

^ permalink raw reply related

* [PATCH 15/16] VSOCK: vmci - fix possible info leak in vmci_transport_dgram_dequeue()
From: Mathias Krause @ 2013-04-07 11:52 UTC (permalink / raw)
  To: David S. Miller
  Cc: netdev, Allan Stephens, Aloisio Almeida Jr, Andy King,
	Arnaldo Carvalho de Melo, Dmitry Torokhov, George Zhang,
	Gustavo Padovan, Johan Hedberg, Jon Maloy, Lauro Ramos Venancio,
	Marcel Holtmann, Ralf Baechle, Samuel Ortiz, Samuel Ortiz,
	Sjur Braendeland, Ursula Braun, Brad Spengler, Mathias Krause
In-Reply-To: <1365335522-29931-1-git-send-email-minipli@googlemail.com>

In case we received no data on the call to skb_recv_datagram(), i.e.
skb->data is NULL, vmci_transport_dgram_dequeue() will return with 0
without updating msg_namelen leading to net/socket.c leaking the local,
uninitialized sockaddr_storage variable to userland -- 128 bytes of
kernel stack memory.

Fix this by moving the already existing msg_namelen assignment a few
lines above.

Cc: Andy King <acking@vmware.com>
Cc: Dmitry Torokhov <dtor@vmware.com>
Cc: George Zhang <georgezhang@vmware.com>
Signed-off-by: Mathias Krause <minipli@googlemail.com>
---
 net/vmw_vsock/vmci_transport.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/vmw_vsock/vmci_transport.c b/net/vmw_vsock/vmci_transport.c
index a70ace8..a8e4e70 100644
--- a/net/vmw_vsock/vmci_transport.c
+++ b/net/vmw_vsock/vmci_transport.c
@@ -1727,6 +1727,8 @@ static int vmci_transport_dgram_dequeue(struct kiocb *kiocb,
 	if (flags & MSG_OOB || flags & MSG_ERRQUEUE)
 		return -EOPNOTSUPP;
 
+	msg->msg_namelen = 0;
+
 	/* Retrieve the head sk_buff from the socket's receive queue. */
 	err = 0;
 	skb = skb_recv_datagram(&vsk->sk, flags, noblock, &err);
@@ -1759,7 +1761,6 @@ static int vmci_transport_dgram_dequeue(struct kiocb *kiocb,
 	if (err)
 		goto out;
 
-	msg->msg_namelen = 0;
 	if (msg->msg_name) {
 		struct sockaddr_vm *vm_addr;
 
-- 
1.7.10.4

^ permalink raw reply related

* [PATCH 16/16] VSOCK: Fix missing msg_namelen update in vsock_stream_recvmsg()
From: Mathias Krause @ 2013-04-07 11:52 UTC (permalink / raw)
  To: David S. Miller
  Cc: netdev, Allan Stephens, Aloisio Almeida Jr, Andy King,
	Arnaldo Carvalho de Melo, Dmitry Torokhov, George Zhang,
	Gustavo Padovan, Johan Hedberg, Jon Maloy, Lauro Ramos Venancio,
	Marcel Holtmann, Ralf Baechle, Samuel Ortiz, Samuel Ortiz,
	Sjur Braendeland, Ursula Braun, Brad Spengler, Mathias Krause
In-Reply-To: <1365335522-29931-1-git-send-email-minipli@googlemail.com>

The code misses to update the msg_namelen member to 0 and therefore
makes net/socket.c leak the local, uninitialized sockaddr_storage
variable to userland -- 128 bytes of kernel stack memory.

Cc: Andy King <acking@vmware.com>
Cc: Dmitry Torokhov <dtor@vmware.com>
Cc: George Zhang <georgezhang@vmware.com>
Signed-off-by: Mathias Krause <minipli@googlemail.com>
---
 net/vmw_vsock/af_vsock.c |    2 ++
 1 file changed, 2 insertions(+)

diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
index ca511c4..08a228d 100644
--- a/net/vmw_vsock/af_vsock.c
+++ b/net/vmw_vsock/af_vsock.c
@@ -1670,6 +1670,8 @@ vsock_stream_recvmsg(struct kiocb *kiocb,
 	vsk = vsock_sk(sk);
 	err = 0;
 
+	msg->msg_namelen = 0;
+
 	lock_sock(sk);
 
 	if (sk->sk_state != SS_CONNECTED) {
-- 
1.7.10.4

^ permalink raw reply related

* Re: [PATCH iproute2 2/2] ip: ipv6: add tokenized interface identifier support
From: Daniel Borkmann @ 2013-04-07 11:58 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: davem, netdev, Hannes Frederic Sowa, YOSHIFUJI Hideaki
In-Reply-To: <20130405093914.55c9114c@nehalam.linuxnetplumber.net>

On 04/05/2013 06:39 PM, Stephen Hemminger wrote:
> On Thu,  4 Apr 2013 16:37:38 +0200
> Daniel Borkmann <dborkman@redhat.com> wrote:
>
>> This is experimental support for tokenized IIDs, that enable
>> administrators to assign well-known host-part addresses to nodes
>> whilst still obtaining global network prefix from Router
>> Advertisements. It is currently in IETF RFC draft status [1].
>>
>> Example commands with iproute2:
>>
>> Setting a device token:
>>    # ip token set ::1a:2b:3c:4d/64 dev eth1
>>
>> Getting a device token:
>>    # ip token get dev eth1
>>    token ::1a:2b:3c:4d dev eth1
>>
>> Listing all tokens:
>>    # ip token list  (or: ip token)
>>    token :: dev eth0
>>    token ::1a:2b:3c:4d dev eth1
>>
>>   [1] http://tools.ietf.org/html/draft-chown-6man-tokenised-ipv6-identifiers-02
>>
>> Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
>> Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
>> Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
>
> Deferred.
> Please resubmit during 3.10 merge window.

Thanks, I'll do.

^ permalink raw reply

* Re: [PATCH 4/5] tipc: add InfiniBand media type
From: Patrick McHardy @ 2013-04-07 12:04 UTC (permalink / raw)
  To: Ying Xue
  Cc: jon.maloy-IzeFyvvaP7pWk0Htik3J/w,
	allan.stephens-CWA4WttNNZF54TAoqtyWWQ,
	netdev-u79uwXL29TY76Z2rM5mHXA, roland-DgEjT+Ai2ygdnm+yROfE0A,
	sean.hefty-ral2JQCrhuEAvxtiuMwx3w,
	hal.rosenstock-Re5JQEeQqe8AvxtiuMwx3w,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <51613740.6050501-CWA4WttNNZF54TAoqtyWWQ@public.gmane.org>

On Sun, Apr 07, 2013 at 05:07:12PM +0800, Ying Xue wrote:
> 
> > --- a/net/tipc/core.c
> > +++ b/net/tipc/core.c
> > @@ -82,6 +82,7 @@ static void tipc_core_stop_net(void)
> >  {
> >  	tipc_net_stop();
> >  	tipc_eth_media_stop();
> > +	tipc_ib_media_stop();
> >  }
> >  
> >  /**
> > @@ -93,8 +94,17 @@ int tipc_core_start_net(unsigned long addr)
> >  
> >  	tipc_net_start(addr);
> >  	res = tipc_eth_media_start();
> > -	if (res)
> > -		tipc_core_stop_net();
> > +	if (res < 0)
> > +		goto err1;
> > +	res = tipc_ib_media_start();
> > +	if (res < 0)
> > +		goto err2;
> > +	return res;
> > +
> > +err2:
> > +	tipc_eth_media_stop();
> 
> Why do we need to call tipc_eth_media_stop() separately?
> In any failed case, we will finally invoke tipc_core_stop_net() which
> already places tipc_eth_media_stop().

Right, that's not necessary, although I think its cleaner to do have error
handling just be the opposite of initialization, IOW calling tipc_net_stop()
instead of tipc_core_stop_net().

But I don't care much either way, will fix this up for the next submission,
thanks.

> 
> > +err1:
> > +	tipc_core_stop_net();
> >  	return res;
> >  }
> >  
> 
> Regards,
> Ying
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* AMD Vi error and lost networking with r8169
From: David R @ 2013-04-07 12:29 UTC (permalink / raw)
  To: Linux Kernel Mailing List, netdev

[-- Attachment #1: Type: text/plain, Size: 4151 bytes --]

I'm been seeing some problems with my new ish AMD motherboard/processor
combo and networking (r8169). I see the following page fault :-

Apr  7 12:25:14 david kernel: [156421.436545] AMD-Vi: Event logged
[IO_PAGE_FAULT device=02:00.0 domain=0x0015 address=0x0000000000003000
flags=0x0050]

Followed by the transmit queue timing out. This seems to hit randomly,
sometimes it can take a day or so. A hard reset is the only way to
recover networking.

(Userspace is Ubuntu 10.04, kernel 3.9.0 rc5+)

Cheers
David

Apr  7 12:26:09 david kernel: [156475.568257] ------------[ cut here
]------------
Apr  7 12:26:09 david kernel: [156475.568273] WARNING: at
net/sched/sch_generic.c:255 dev_watchdog+0x250/0x260()
Apr  7 12:26:09 david kernel: [156475.568278] Hardware name: To be
filled by O.E.M.
Apr  7 12:26:09 david kernel: [156475.568282] NETDEV WATCHDOG: eth2
(r8169): transmit queue 0 timed out
Apr  7 12:26:09 david kernel: [156475.568285] Modules linked in: xfs
exportfs libcrc32c nls_iso8859_1 nls_cp437 vfat fat ecryptfs
encrypted_keys nfsv3 nfs_acl nfs lockd sunrpc binfmt_misc ppdev dm_crypt
hid_logitech uvcvideo ff_memless videobuf2_core videodev snd_usb_audio
videobuf2_vmalloc snd_usbmidi_lib videobuf2_memops usbhid
snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_intel snd_hda_codec
snd_seq_dummy snd_hwdep snd_seq_oss fbcon ttm tileblit font bitblit
snd_pcm_oss softcursor snd_seq_midi drm_kms_helper snd_rawmidi
snd_mixer_oss drm snd_seq_midi_event crc32_pclmul serio_raw snd_pcm
snd_seq i2c_piix4 r8169 snd_timer snd_seq_device mii snd i2c_algo_bit
soundcore snd_page_alloc lp parport
Apr  7 12:26:09 david kernel: [156475.568348] Pid: 0, comm: swapper/0
Not tainted 3.9.0-rc5+ #17
Apr  7 12:26:09 david kernel: [156475.568351] Call Trace:
Apr  7 12:26:09 david kernel: [156475.568354]  <IRQ> 
[<ffffffff81039c2a>] warn_slowpath_common+0x7a/0xc0
Apr  7 12:26:09 david kernel: [156475.568366]  [<ffffffff81039d11>]
warn_slowpath_fmt+0x41/0x50
Apr  7 12:26:09 david kernel: [156475.568374]  [<ffffffff81544630>]
dev_watchdog+0x250/0x260
Apr  7 12:26:09 david kernel: [156475.568380]  [<ffffffff815443e0>] ?
__netdev_watchdog_up+0x80/0x80
Apr  7 12:26:09 david kernel: [156475.568386]  [<ffffffff810497a4>]
call_timer_fn+0x44/0x120
Apr  7 12:26:09 david kernel: [156475.568391]  [<ffffffff815443e0>] ?
__netdev_watchdog_up+0x80/0x80
Apr  7 12:26:09 david kernel: [156475.568396]  [<ffffffff81049d83>]
run_timer_softirq+0x213/0x280
Apr  7 12:26:09 david kernel: [156475.568402]  [<ffffffff81041dbf>]
__do_softirq+0xdf/0x260
Apr  7 12:26:09 david kernel: [156475.568408]  [<ffffffff81042025>]
irq_exit+0xb5/0xc0
Apr  7 12:26:09 david kernel: [156475.568413]  [<ffffffff81023fc9>]
smp_apic_timer_interrupt+0x69/0xa0
Apr  7 12:26:09 david kernel: [156475.568418]  [<ffffffff81638e8a>]
apic_timer_interrupt+0x6a/0x70
Apr  7 12:26:09 david kernel: [156475.568420]  <EOI> 
[<ffffffff8106e9a5>] ? sched_clock_cpu+0xc5/0x100
Apr  7 12:26:09 david kernel: [156475.568432]  [<ffffffff814e5542>] ?
cpuidle_wrap_enter+0x42/0x80
Apr  7 12:26:09 david kernel: [156475.568437]  [<ffffffff814e553e>] ?
cpuidle_wrap_enter+0x3e/0x80
Apr  7 12:26:09 david kernel: [156475.568443]  [<ffffffff814e5590>]
cpuidle_enter_tk+0x10/0x20
Apr  7 12:26:09 david kernel: [156475.568448]  [<ffffffff814e4fc2>]
cpuidle_enter_state+0x12/0x50
Apr  7 12:26:09 david kernel: [156475.568453]  [<ffffffff814e57d2>]
cpuidle_idle_call+0xa2/0x100
Apr  7 12:26:09 david kernel: [156475.568459]  [<ffffffff8100a3b7>]
cpu_idle+0xc7/0x120
Apr  7 12:26:09 david kernel: [156475.568463]  [<ffffffff8161fddd>]
rest_init+0x6d/0x80
Apr  7 12:26:09 david kernel: [156475.568470]  [<ffffffff81cc7fe3>]
start_kernel+0x3b6/0x3c3
Apr  7 12:26:09 david kernel: [156475.568475]  [<ffffffff81cc7a4d>] ?
repair_env_string+0x5b/0x5b
Apr  7 12:26:09 david kernel: [156475.568481]  [<ffffffff81cc75a1>]
x86_64_start_reservations+0x2a/0x2c
Apr  7 12:26:09 david kernel: [156475.568486]  [<ffffffff81cc76cc>]
x86_64_start_kernel+0x129/0x130
Apr  7 12:26:09 david kernel: [156475.568489] ---[ end trace
31688db2ca49b077 ]---
Apr  7 12:26:09 david kernel: [156475.720834] r8169 0000:02:00.0 eth2:
link up


[-- Attachment #2: dmesg.log.bz2 --]
[-- Type: application/x-bzip, Size: 14201 bytes --]

[-- Attachment #3: config.bz2 --]
[-- Type: application/x-bzip, Size: 22803 bytes --]

^ permalink raw reply

* [PATCH net 1/1] bnx2x: Fix KR2 rapid link flap
From: Yaniv Rosner @ 2013-04-07 15:36 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, Eilon Greenstein, Yaniv Rosner

Check KR2 recovery time at the beginning of the work-around function.

Signed-off-by: Yaniv Rosner <yanivr@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
---
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_link.c |   18 +++++++++++-------
 1 files changed, 11 insertions(+), 7 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_link.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_link.c
index 77ebae0..0283f34 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_link.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_link.c
@@ -13437,13 +13437,7 @@ static void bnx2x_check_kr2_wa(struct link_params *params,
 {
 	struct bnx2x *bp = params->bp;
 	u16 base_page, next_page, not_kr2_device, lane;
-	int sigdet = bnx2x_warpcore_get_sigdet(phy, params);
-
-	if (!sigdet) {
-		if (!(vars->link_attr_sync & LINK_ATTR_SYNC_KR2_ENABLE))
-			bnx2x_kr2_recovery(params, vars, phy);
-		return;
-	}
+	int sigdet;
 
 	/* Once KR2 was disabled, wait 5 seconds before checking KR2 recovery
 	 * since some switches tend to reinit the AN process and clear the
@@ -13454,6 +13448,16 @@ static void bnx2x_check_kr2_wa(struct link_params *params,
 		vars->check_kr2_recovery_cnt--;
 		return;
 	}
+
+	sigdet = bnx2x_warpcore_get_sigdet(phy, params);
+	if (!sigdet) {
+		if (!(vars->link_attr_sync & LINK_ATTR_SYNC_KR2_ENABLE)) {
+			bnx2x_kr2_recovery(params, vars, phy);
+			DP(NETIF_MSG_LINK, "No sigdet\n");
+		}
+		return;
+	}
+
 	lane = bnx2x_get_warpcore_lane(phy, params);
 	CL22_WR_OVER_CL45(bp, phy, MDIO_REG_BANK_AER_BLOCK,
 			  MDIO_AER_BLOCK_AER_REG, lane);
-- 
1.7.7.1

^ permalink raw reply related

* [PATCH V1 net-next 1/3] net/mlx4_core: Added proper description for two device capabilities
From: Or Gerlitz @ 2013-04-07 13:44 UTC (permalink / raw)
  To: davem; +Cc: netdev, amirv, sagig, john.r.fastabend, Or Gerlitz
In-Reply-To: <1365342248-31901-1-git-send-email-ogerlitz@mellanox.com>

Added readable description for the DPDP and port sensing device capabilities.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx4/fw.c |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/fw.c b/drivers/net/ethernet/mellanox/mlx4/fw.c
index f624557..8764397 100644
--- a/drivers/net/ethernet/mellanox/mlx4/fw.c
+++ b/drivers/net/ethernet/mellanox/mlx4/fw.c
@@ -91,7 +91,7 @@ static void dump_dev_cap_flags(struct mlx4_dev *dev, u64 flags)
 		[ 8] = "P_Key violation counter",
 		[ 9] = "Q_Key violation counter",
 		[10] = "VMM",
-		[12] = "DPDP",
+		[12] = "Dual Port Different Protocol (DPDP) support",
 		[15] = "Big LSO headers",
 		[16] = "MW support",
 		[17] = "APM support",
@@ -109,6 +109,7 @@ static void dump_dev_cap_flags(struct mlx4_dev *dev, u64 flags)
 		[41] = "Unicast VEP steering support",
 		[42] = "Multicast VEP steering support",
 		[48] = "Counters support",
+		[55] = "Port link type sensing support",
 		[59] = "Port management change event support",
 		[61] = "64 byte EQE support",
 		[62] = "64 byte CQE support",
-- 
1.7.1

^ permalink raw reply related

* [PATCH V1 net-next 0/3] Mellanox Core and Ethernet driver updates 2013-04-07
From: Or Gerlitz @ 2013-04-07 13:44 UTC (permalink / raw)
  To: davem; +Cc: netdev, amirv, sagig, john.r.fastabend, Or Gerlitz

Hi Dave,

Here's V1 of this batch of mlx4 driver updates for 3.10, mostly DCB related. 

Series done against the net-next tree as of commit a210576c "Merge 
git://git.kernel.org/pub/scm/linux/kernel/git/davem/net"

changes from V0: 

  - in patch #2 advertize getdcbx/setdcbx callbacks also for devices that support only PFC

  - per feedback from John, dropeed patch #3 which turned to be work-around for user space 
    bug in open-lldp. John posted a user space patch to fix the bug

  - per feedback from John, added small a patch under which we advertize DCB_CAP_DCBX_HOST in getdcbx

Or.


Or Gerlitz (3):
  net/mlx4_core: Added proper description for two device capabilities
  net/mlx4_en: Enable DCB ETS ops only when supported by the firmware
  net/mlx4_en: Advertize DCB_CAP_DCBX_HOST in getdcbx

 drivers/net/ethernet/mellanox/mlx4/en_dcb_nl.c |   10 +++++++++-
 drivers/net/ethernet/mellanox/mlx4/en_netdev.c |   10 ++++++++--
 drivers/net/ethernet/mellanox/mlx4/fw.c        |    4 +++-
 drivers/net/ethernet/mellanox/mlx4/mlx4_en.h   |    1 +
 include/linux/mlx4/device.h                    |    1 +
 5 files changed, 22 insertions(+), 4 deletions(-)

^ permalink raw reply

* [PATCH V1 net-next 2/3] net/mlx4_en: Enable DCB ETS ops only when supported by the firmware
From: Or Gerlitz @ 2013-04-07 13:44 UTC (permalink / raw)
  To: davem; +Cc: netdev, amirv, sagig, john.r.fastabend, Or Gerlitz,
	Eugenia Emantayev
In-Reply-To: <1365342248-31901-1-git-send-email-ogerlitz@mellanox.com>

Enable the DCB ETS ops only when supported by the firmware. For older firmware/cards
which don't support ETS, advertize only PFC DCB ops.

Signed-off-by: Eugenia Emantayev <eugenia@mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx4/en_dcb_nl.c |    8 ++++++++
 drivers/net/ethernet/mellanox/mlx4/en_netdev.c |   10 ++++++++--
 drivers/net/ethernet/mellanox/mlx4/fw.c        |    1 +
 drivers/net/ethernet/mellanox/mlx4/mlx4_en.h   |    1 +
 include/linux/mlx4/device.h                    |    1 +
 5 files changed, 19 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/en_dcb_nl.c b/drivers/net/ethernet/mellanox/mlx4/en_dcb_nl.c
index b799ab1..321553f 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_dcb_nl.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_dcb_nl.c
@@ -253,3 +253,11 @@ const struct dcbnl_rtnl_ops mlx4_en_dcbnl_ops = {
 	.getdcbx	= mlx4_en_dcbnl_getdcbx,
 	.setdcbx	= mlx4_en_dcbnl_setdcbx,
 };
+
+const struct dcbnl_rtnl_ops mlx4_en_dcbnl_pfc_ops = {
+	.ieee_getpfc	= mlx4_en_dcbnl_ieee_getpfc,
+	.ieee_setpfc	= mlx4_en_dcbnl_ieee_setpfc,
+
+	.getdcbx	= mlx4_en_dcbnl_getdcbx,
+	.setdcbx	= mlx4_en_dcbnl_setdcbx,
+};
diff --git a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
index 61b5678..62795b5 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
@@ -2013,8 +2013,14 @@ int mlx4_en_init_netdev(struct mlx4_en_dev *mdev, int port,
 	INIT_WORK(&priv->linkstate_task, mlx4_en_linkstate);
 	INIT_DELAYED_WORK(&priv->stats_task, mlx4_en_do_get_stats);
 #ifdef CONFIG_MLX4_EN_DCB
-	if (!mlx4_is_slave(priv->mdev->dev))
-		dev->dcbnl_ops = &mlx4_en_dcbnl_ops;
+	if (!mlx4_is_slave(priv->mdev->dev)) {
+		if (mdev->dev->caps.flags & MLX4_DEV_CAP_FLAG_SET_ETH_SCHED) {
+			dev->dcbnl_ops = &mlx4_en_dcbnl_ops;
+		} else {
+			en_info(priv, "enabling only PFC DCB ops\n");
+			dev->dcbnl_ops = &mlx4_en_dcbnl_pfc_ops;
+		}
+	}
 #endif
 
 	for (i = 0; i < MLX4_EN_MAC_HASH_SIZE; ++i)
diff --git a/drivers/net/ethernet/mellanox/mlx4/fw.c b/drivers/net/ethernet/mellanox/mlx4/fw.c
index 8764397..ab470d9 100644
--- a/drivers/net/ethernet/mellanox/mlx4/fw.c
+++ b/drivers/net/ethernet/mellanox/mlx4/fw.c
@@ -109,6 +109,7 @@ static void dump_dev_cap_flags(struct mlx4_dev *dev, u64 flags)
 		[41] = "Unicast VEP steering support",
 		[42] = "Multicast VEP steering support",
 		[48] = "Counters support",
+		[53] = "Port ETS Scheduler support",
 		[55] = "Port link type sensing support",
 		[59] = "Port management change event support",
 		[61] = "64 byte EQE support",
diff --git a/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h b/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
index f710b7c..d4cb5d3 100644
--- a/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
+++ b/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
@@ -624,6 +624,7 @@ int mlx4_en_QUERY_PORT(struct mlx4_en_dev *mdev, u8 port);
 
 #ifdef CONFIG_MLX4_EN_DCB
 extern const struct dcbnl_rtnl_ops mlx4_en_dcbnl_ops;
+extern const struct dcbnl_rtnl_ops mlx4_en_dcbnl_pfc_ops;
 #endif
 
 int mlx4_en_setup_tc(struct net_device *dev, u8 up);
diff --git a/include/linux/mlx4/device.h b/include/linux/mlx4/device.h
index 811f91c..1bc5a75 100644
--- a/include/linux/mlx4/device.h
+++ b/include/linux/mlx4/device.h
@@ -140,6 +140,7 @@ enum {
 	MLX4_DEV_CAP_FLAG_VEP_UC_STEER	= 1LL << 41,
 	MLX4_DEV_CAP_FLAG_VEP_MC_STEER	= 1LL << 42,
 	MLX4_DEV_CAP_FLAG_COUNTERS	= 1LL << 48,
+	MLX4_DEV_CAP_FLAG_SET_ETH_SCHED = 1LL << 53,
 	MLX4_DEV_CAP_FLAG_SENSE_SUPPORT	= 1LL << 55,
 	MLX4_DEV_CAP_FLAG_PORT_MNG_CHG_EV = 1LL << 59,
 	MLX4_DEV_CAP_FLAG_64B_EQE	= 1LL << 61,
-- 
1.7.1

^ permalink raw reply related

* [PATCH V1 net-next 3/3] net/mlx4_en: Advertize DCB_CAP_DCBX_HOST in getdcbx
From: Or Gerlitz @ 2013-04-07 13:44 UTC (permalink / raw)
  To: davem; +Cc: netdev, amirv, sagig, john.r.fastabend, Or Gerlitz
In-Reply-To: <1365342248-31901-1-git-send-email-ogerlitz@mellanox.com>

When our getdcbx entry is called, DCB_CAP_DCBX_HOST should be advertized too.

Cc: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx4/en_dcb_nl.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/en_dcb_nl.c b/drivers/net/ethernet/mellanox/mlx4/en_dcb_nl.c
index 321553f..9199aed 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_dcb_nl.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_dcb_nl.c
@@ -186,7 +186,7 @@ static int mlx4_en_dcbnl_ieee_setpfc(struct net_device *dev,
 
 static u8 mlx4_en_dcbnl_getdcbx(struct net_device *dev)
 {
-	return DCB_CAP_DCBX_VER_IEEE;
+	return (DCB_CAP_DCBX_HOST | DCB_CAP_DCBX_VER_IEEE);
 }
 
 static u8 mlx4_en_dcbnl_setdcbx(struct net_device *dev, u8 mode)
-- 
1.7.1

^ permalink raw reply related

* Re: PROBLEM: IPv6 TCP-Connections resetting
From: Christoph Paasch @ 2013-04-07 14:37 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Hannes Frederic Sowa, Tetja Rediske, djduanjiong, netdev,
	steffen.klassert, Neal Cardwell, David Miller
In-Reply-To: <1365270879.3887.4.camel@edumazet-glaptop>

On Saturday 06 April 2013 10:54:39 Eric Dumazet wrote:
> OK, it seems bug was added in commit
> ec18d9a2691d69cd14b48f9b919fddcef28b7f5c
> (ipv6: Add redirect support to all protocol icmp error handlers.)
> 
> Not sure why Tetja Rediske  bisected to
> 093d04d42fa094f6740bb188f0ad0c215ff61e2c

I made a setup to trigger the ICMPv6 Redirect.
Yes, the bug was added by ec18d9a26 (ipv6: Add redirect support to all 
protocol icmp error handlers.), but prior to 093d04d4 (ipv6: Change skb->data 
before using icmpv6_notify() to propagate redirect) the stack did not enter 
tcp_v6_err upon a redirect message.

This, because inside icmpv6_notify, skb->data did not point to the inner IP 
header. So, when icmpv6_notify looks up the protocol-handler, it will not 
match on tcp_v6_err.

> Could you send a patch with this single line change (no cleanup), and
> a more detailed changelog, once the bug origin is clearly identified ?

Yes, will resend.


Cheers,
Christoph

-- 
IP Networking Lab --- http://inl.info.ucl.ac.be
MultiPath TCP in the Linux Kernel --- http://multipath-tcp.org
UCLouvain
--

^ permalink raw reply

* [PATCH v2] ipv6/tcp: Stop processing ICMPv6 redirect messages
From: Christoph Paasch @ 2013-04-07 14:53 UTC (permalink / raw)
  To: netdev
  Cc: David Miller, Eric Dumazet, djduanjiong, steffen.klassert,
	Neal Cardwell, Hannes Frederic Sowa

Tetja Rediske found that if the host receives an ICMPv6 redirect message
after sending a SYN+ACK, the connection will be reset.

He bisected it down to 093d04d (ipv6: Change skb->data before using
icmpv6_notify() to propagate redirect), but the origin of the bug comes
from ec18d9a26 (ipv6: Add redirect support to all protocol icmp error
handlers.). The bug simply did not trigger prior to 093d04d, because
skb->data did not point to the inner IP header and thus icmpv6_notify
did not call the correct err_handler.

This patch adds the missing "goto out;" in tcp_v6_err. After receiving
an ICMPv6 Redirect, we should not continue processing the ICMP in
tcp_v6_err, as this may trigger the removal of request-socks or setting
sk_err(_soft).

Reported-by: Tetja Rediske <tetja@tetja.de>
Signed-off-by: Christoph Paasch <christoph.paasch@uclouvain.be>
---
v2: Removed code-cleanup from the patch and improved the commit-message upon 
suggestion from Eric Dumazet.

 net/ipv6/tcp_ipv6.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index 1033d2b..e51bd1a 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -386,6 +386,7 @@ static void tcp_v6_err(struct sk_buff *skb, struct inet6_skb_parm *opt,
 
 		if (dst)
 			dst->ops->redirect(dst, sk, skb);
+		goto out;
 	}
 
 	if (type == ICMPV6_PKT_TOOBIG) {
-- 
1.8.1.227.g44fe835

^ permalink raw reply related

* Re: [Patch net-next v2 3/4] vxlan: add ipv6 support
From: Cong Wang @ 2013-04-07 14:54 UTC (permalink / raw)
  To: David Stevens; +Cc: David S. Miller, netdev, Stephen Hemminger
In-Reply-To: <OFEB6AEA10.78B1A4D7-ON85257B44.004548C2-85257B44.004A7DDE@us.ibm.com>

On Fri, 2013-04-05 at 09:33 -0400, David Stevens wrote:
> Cong Wang <amwang@redhat.com> wrote on 04/05/2013 08:16:24 AM:
>  
> 
> > +struct vxlan_ip {
> > +   union {
> > +      struct sockaddr_in   sin;
> > +#if IS_ENABLED(CONFIG_IPV6)
> > +      struct sockaddr_in6   sin6;
> > +#endif
> > +      struct sockaddr      sa;
> > +   } u;
> > +#define ip4 u.sin.sin_addr.s_addr
> > +#define ip6 u.sin6.sin6_addr
> > +#define proto u.sa.sa_family
> > +};
> > +
>         I'd prefer:
> 
> 1) all #defines at the same level in the union/struct -- you
>         have a) a field within an in6_addr struct, b) a in6_addr
>         struct and c) a field within a sockaddr -- all at a different
>         level in the union/struct

Why? They are aliases of the fields, so should stay inside the
union/struct.

> 2) union tags the odd names and the defines indicating the type 
>         and also not something likely to appear as a variable name
>         (ie, maybe sun_sin6 for the union and vip_sin6 for the #define)

Why "sun_" prefix?

> 3) a family (e.g. AF_INET6) is not a proto (e.g. IPPROTO_IPV6)

Agreed, s/proto/family/.

> 4) "vxlan_ip" sounds like an ipv4 type to me -- maybe "vxlan_addr"?

Agreed.

> 5) I'd leave out the #ifdef; I believe sockaddr_in6 is defined without
>         CONFIG_IPV6. I may me wrong on that, and it makes it bigger for
>         v4-only configs, but fewer #ifdefs makes maintenance easier and
>         there's an advantage in it always being the same size; this one
>         is more a matter of taste.
> 

Yeah, you are right.

> So:
> 
> struct vxlan_addr {
>         union {
>                 struct sockaddr_in      sun_sin
>                 struct sockaddr_in6     sun_sin6;
>                 struct sockaddr         sun_sa;
>         } sun;
> }
> #define vip_sin sun.sin
> #define vip_sin6        sun.sin6
> #define vip_sa  sun.sa
> 
> As it is, you're trying to hide that it's a sockaddr, which
> makes the code more difficult to read in context. "vip_sa.sa_family"
> is obviously the sa_family field of a sockaddr, "proto" could be
> lots of things.
>  

No, I just want to keep them short. I don't think "vip_sin6" is better
than "ip6".

> > +
> > +static int vxlan_nla_get_addr(struct vxlan_ip *ip, struct nlattr *nla)
> > +{
> > +   if (nla_len(nla) == sizeof(__be32)) {
> > +      ip->ip4 = nla_get_be32(nla);
> > +      ip->proto = AF_INET;
> > +   } else if (nla_len(nla) == sizeof(struct in6_addr)) {
> > +      nla_memcpy(&ip->ip6, nla, sizeof(struct in6_addr));
> > +      ip->proto = AF_INET6;
> > +   } else
> > +      return -EAFNOSUPPORT;
> > +   return 0;
> > +}
> 
>         netlink messages use padding -- I'm not sure nla_len() will
> be correct in all cases here. I think using a sockaddr here too
> would be better (then the family tells you the address type), as
> long as that doesn't create include issues Dave mentioned within
> the "ip" and "bridge" commands. Otherwise, maybe use a different
> NL type for v6.

At least Thomas doesn't object this, although he pointed out the same
thing.

>         Another reason to do this is that link-local v6 addresses
> require a scope_id and matching addresses are only equal if the
> interfaces are equal. You need to get that from user level and
> compare it in address matching, too, but it is in the sockaddr_in6
> so if you set it there, memcmp() will cover that as well.
> 

It is not convenient to zero the whole struct just for memcmp(), just
compare the part we need is sufficient.


All the rest you point out are all fixed in v3.

Thanks for review!

^ permalink raw reply

* Re: [PATCH v2] ipv6/tcp: Stop processing ICMPv6 redirect messages
From: Eric Dumazet @ 2013-04-07 15:02 UTC (permalink / raw)
  To: Christoph Paasch
  Cc: netdev, David Miller, djduanjiong, steffen.klassert,
	Neal Cardwell, Hannes Frederic Sowa
In-Reply-To: <1365346395-18016-1-git-send-email-christoph.paasch@uclouvain.be>

On Sun, 2013-04-07 at 16:53 +0200, Christoph Paasch wrote:
> Tetja Rediske found that if the host receives an ICMPv6 redirect message
> after sending a SYN+ACK, the connection will be reset.
> 
> He bisected it down to 093d04d (ipv6: Change skb->data before using
> icmpv6_notify() to propagate redirect), but the origin of the bug comes
> from ec18d9a26 (ipv6: Add redirect support to all protocol icmp error
> handlers.). The bug simply did not trigger prior to 093d04d, because
> skb->data did not point to the inner IP header and thus icmpv6_notify
> did not call the correct err_handler.
> 
> This patch adds the missing "goto out;" in tcp_v6_err. After receiving
> an ICMPv6 Redirect, we should not continue processing the ICMP in
> tcp_v6_err, as this may trigger the removal of request-socks or setting
> sk_err(_soft).
> 
> Reported-by: Tetja Rediske <tetja@tetja.de>
> Signed-off-by: Christoph Paasch <christoph.paasch@uclouvain.be>
> ---
> v2: Removed code-cleanup from the patch and improved the commit-message upon 
> suggestion from Eric Dumazet.
> 
>  net/ipv6/tcp_ipv6.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
> index 1033d2b..e51bd1a 100644
> --- a/net/ipv6/tcp_ipv6.c
> +++ b/net/ipv6/tcp_ipv6.c
> @@ -386,6 +386,7 @@ static void tcp_v6_err(struct sk_buff *skb, struct inet6_skb_parm *opt,
>  
>  		if (dst)
>  			dst->ops->redirect(dst, sk, skb);
> +		goto out;
>  	}
>  
>  	if (type == ICMPV6_PKT_TOOBIG) {


Thanks for sorting this out.

Acked-by: Eric Dumazet <edumazet@google.com>

^ permalink raw reply

* Re: [PATCH V1 net-next 3/3] net/mlx4_en: Advertize DCB_CAP_DCBX_HOST in getdcbx
From: Sergei Shtylyov @ 2013-04-07 15:31 UTC (permalink / raw)
  To: Or Gerlitz; +Cc: davem, netdev, amirv, sagig, john.r.fastabend
In-Reply-To: <1365342248-31901-4-git-send-email-ogerlitz@mellanox.com>

Hello.

On 07-04-2013 17:44, Or Gerlitz wrote:

> When our getdcbx entry is called, DCB_CAP_DCBX_HOST should be advertized too.

> Cc: John Fastabend <john.r.fastabend@intel.com>
> Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
[...]

> diff --git a/drivers/net/ethernet/mellanox/mlx4/en_dcb_nl.c b/drivers/net/ethernet/mellanox/mlx4/en_dcb_nl.c
> index 321553f..9199aed 100644
> --- a/drivers/net/ethernet/mellanox/mlx4/en_dcb_nl.c
> +++ b/drivers/net/ethernet/mellanox/mlx4/en_dcb_nl.c
> @@ -186,7 +186,7 @@ static int mlx4_en_dcbnl_ieee_setpfc(struct net_device *dev,
>
>   static u8 mlx4_en_dcbnl_getdcbx(struct net_device *dev)
>   {
> -	return DCB_CAP_DCBX_VER_IEEE;
> +	return (DCB_CAP_DCBX_HOST | DCB_CAP_DCBX_VER_IEEE);

    Nit: parens not needed here.

WBR, Sergei

^ permalink raw reply

* Re: [PATCH 00/51] netfilter updates for net-next
From: David Miller @ 2013-04-07 16:27 UTC (permalink / raw)
  To: pablo; +Cc: netfilter-devel, netdev
In-Reply-To: <1365250670-14993-1-git-send-email-pablo@netfilter.org>

From: Pablo Neira Ayuso <pablo@netfilter.org>
Date: Sat,  6 Apr 2013 14:16:59 +0200

> You can pull these changes from:
> 
> git://1984.lsi.us.es/nf-next master

Not a very good pull request, I don't know how you were able to
successfully create it with all the inaccuracies.

The dst.c change wasn't mentioned.

and the last three patches (the error propagation fixes from Patrick)
weren't even in this tree.

I pulled, but I cannot believe that GIT didn't complain about this
or provide some kind of feedback indicating that what you're asking
me to pull doesn't match up with what's actually in that tree.

I mean, if anyone had sent you a pull request like this, you would
have rejected it.

^ permalink raw reply

* Re: [PATCH v2] ipv6/tcp: Stop processing ICMPv6 redirect messages
From: David Miller @ 2013-04-07 16:36 UTC (permalink / raw)
  To: eric.dumazet
  Cc: christoph.paasch, netdev, djduanjiong, steffen.klassert,
	ncardwell, hannes
In-Reply-To: <1365346942.3887.6.camel@edumazet-glaptop>

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Sun, 07 Apr 2013 08:02:22 -0700

> On Sun, 2013-04-07 at 16:53 +0200, Christoph Paasch wrote:
>> Tetja Rediske found that if the host receives an ICMPv6 redirect message
>> after sending a SYN+ACK, the connection will be reset.
>> 
>> He bisected it down to 093d04d (ipv6: Change skb->data before using
>> icmpv6_notify() to propagate redirect), but the origin of the bug comes
>> from ec18d9a26 (ipv6: Add redirect support to all protocol icmp error
>> handlers.). The bug simply did not trigger prior to 093d04d, because
>> skb->data did not point to the inner IP header and thus icmpv6_notify
>> did not call the correct err_handler.
>> 
>> This patch adds the missing "goto out;" in tcp_v6_err. After receiving
>> an ICMPv6 Redirect, we should not continue processing the ICMP in
>> tcp_v6_err, as this may trigger the removal of request-socks or setting
>> sk_err(_soft).
>> 
>> Reported-by: Tetja Rediske <tetja@tetja.de>
>> Signed-off-by: Christoph Paasch <christoph.paasch@uclouvain.be>
>> ---
>> v2: Removed code-cleanup from the patch and improved the commit-message upon 
>> suggestion from Eric Dumazet.
>> 
>>  net/ipv6/tcp_ipv6.c | 1 +
>>  1 file changed, 1 insertion(+)
>> 
>> diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
>> index 1033d2b..e51bd1a 100644
>> --- a/net/ipv6/tcp_ipv6.c
>> +++ b/net/ipv6/tcp_ipv6.c
>> @@ -386,6 +386,7 @@ static void tcp_v6_err(struct sk_buff *skb, struct inet6_skb_parm *opt,
>>  
>>  		if (dst)
>>  			dst->ops->redirect(dst, sk, skb);
>> +		goto out;
>>  	}
>>  
>>  	if (type == ICMPV6_PKT_TOOBIG) {
> 
> 
> Thanks for sorting this out.
> 
> Acked-by: Eric Dumazet <edumazet@google.com>

Applied and queued up for -stable, thanks everyone!

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox