Netdev List


* Re: [GIT] Networking
From: Linus Torvalds @ 2014-10-20  0:32 UTC (permalink / raw)
  To: David Miller, Pablo Neira Ayuso
  Cc: Andrew Morton, Network Development, Linux Kernel Mailing List
In-Reply-To: <20141019.132314.264021750039480460.davem@davemloft.net>

On Sun, Oct 19, 2014 at 10:23 AM, David Miller <davem@davemloft.net> wrote:
>
> A quick batch of bug fixes:

Ho humm.. Here's another networking issue with the current kernel:

  nf_reject_ipv6: module license 'unspecified' taints kernel.
  Disabling lock debugging due to kernel taint
  nf_reject_ipv6: Unknown symbol ip6_local_out (err 0)

Hmm? I'm not sure this is new, but I hadn't noticed it before. The
"unknown symbol" seems to be simply because ip6_local_out is GPL-only,
and without a proper license, 'nf_reject_ipv6' not only taints the
kernel but doesn't link properly..

Looks like the module license issue was just overlooked when moving
the code out in commit c8d7b98bec43 ("netfilter: move nf_send_resetX()
code to nf_reject_ipvX modules").

                     Linus

^ permalink raw reply

* Re: [PATCH 2/4] net: make skb_gso_segment error handling more robust
From: David Miller @ 2014-10-20  0:39 UTC (permalink / raw)
  To: fw; +Cc: netdev, edumazet
In-Reply-To: <1413751340-19621-3-git-send-email-fw@strlen.de>

From: Florian Westphal <fw@strlen.de>
Date: Sun, 19 Oct 2014 22:42:19 +0200

> skb_gso_segment has three possible return values:
> 1. a pointer to the first segmented skb
> 2. an errno value (IS_ERR())
> 3. NULL.  This can happen when GSO is used for header verification.
> 
> However, several callers currently test IS_ERR instead of IS_ERR_OR_NULL
> and would oops when NULL is returned.
> 
> Note that these call sites should never actually see such a NULL return
> value; all callers mask out the GSO bits in the feature argument.
> 
> However, in the past, there have been issues with some protocol handlers
> erroneously not respecting the specified feature mask in some cases.
> 
> Signed-off-by: Florian Westphal <fw@strlen.de>

I don't think it makes sense to return PTR_ERR(p) when
p is NULL.
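
The point about PTR_ERR() on a NULL pointer can be illustrated with a small
userspace sketch. The helpers below are re-implemented here for illustration
only (the kernel's own versions live in include/linux/err.h), the assert in
ERR_PTR() is a sketch-only sanity check, and seg_result_to_errno() together
with its choice of -EINVAL for the NULL case is a hypothetical helper, not
code from the patch under review:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-ins for the kernel's error-pointer helpers. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error)
{
	/* Sketch-only check: only negative errno values are encoded. */
	assert(error < 0 && error >= -MAX_ERRNO);
	return (void *)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)ptr;
}

static inline int IS_ERR(const void *ptr)
{
	/* Error pointers occupy the top MAX_ERRNO values of the address space. */
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

static inline int IS_ERR_OR_NULL(const void *ptr)
{
	return !ptr || IS_ERR(ptr);
}

/*
 * Hypothetical caller-side helper: turn a segmentation result into an
 * errno.  PTR_ERR(NULL) is 0, which would read as "success", so the
 * NULL case needs an explicit error code (-EINVAL is an assumption).
 */
static int seg_result_to_errno(const void *segs)
{
	if (IS_ERR(segs))
		return (int)PTR_ERR(segs);
	if (!segs)
		return -EINVAL;
	return 0;
}
```

A caller that tests IS_ERR_OR_NULL() and then returns PTR_ERR() would return
0 for NULL, which is why the two cases are handled separately in this sketch.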

^ permalink raw reply

* Re: [GIT] Networking
From: David Miller @ 2014-10-20  1:03 UTC (permalink / raw)
  To: torvalds; +Cc: pablo, akpm, netdev, linux-kernel
In-Reply-To: <CA+55aFz4UdVbZKPKwkesj0=1Yu7jLDP3Q3-jwe45b7gRU+0rog@mail.gmail.com>

From: Linus Torvalds <torvalds@linux-foundation.org>
Date: Sun, 19 Oct 2014 17:32:15 -0700

> Looks like the module license issue was just overlooked when moving
> the code out in commit c8d7b98bec43 ("netfilter: move nf_send_resetX()
> code to nf_reject_ipvX modules").

I think Pablo has a patch pending to address this, and indeed he does:

	http://marc.info/?l=linux-netdev&m=141293491712312&w=2

Pablo please push this to me soon, thanks.

^ permalink raw reply

* RE: [PATCH net] r8152: use cancel_delayed_work for runtime suspend
From: Hayes Wang @ 2014-10-20  2:19 UTC (permalink / raw)
  To: Oliver Neukum
  Cc: netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, nic_swsd,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	linux-usb-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
In-Reply-To: <1413661699.19391.2.camel-AfvqVibwNMkMNNZnWhT/Jw@public.gmane.org>

 Oliver Neukum [mailto:oliver-GvhC2dPhHPQdnm+yROfE0A@public.gmane.org] 
> Sent: Sunday, October 19, 2014 3:48 AM
[...]
> The diagnosis is good, the fix is not good. It opens a race
> during which the queued work can touch a suspended device.

The delayed work would wake up the device by
calling usb_autopm_get_interface() before
accessing the device. Besides, there is a mutex
to avoid the race.
 
Best Regards,
Hayes

^ permalink raw reply

* Re: [PATCH stable v3.2 v3.4] ipv4: disable bh while doing route gc
From: Ben Hutchings @ 2014-10-20  3:09 UTC (permalink / raw)
  To: David Miller; +Cc: mleitner, stable, netdev, hannes
In-Reply-To: <20141013.135127.1915115817707962111.davem@davemloft.net>

On Mon, 2014-10-13 at 13:51 -0400, David Miller wrote:
> From: Marcelo Ricardo Leitner <mleitner@redhat.com>
> Date: Mon, 13 Oct 2014 14:03:30 -0300
> 
> > Further tests revealed that moving the garbage collector to a work
> > queue and protecting it with a spinlock may leave the system prone to
> > soft lockups if the bottom half gets very busy.
> > 
> > It was reproduced with a set of firewall rules that REJECTed packets. If
> > the NIC bottom half handler ends up running on the same CPU that is
> > running the garbage collector on a very large cache, the garbage
> > collector will not be able to do its job due to the amount of work
> > needed for handling the REJECTs and also won't reschedule.
> > 
> > The fix is to disable bottom half during the garbage collecting, as it
> > already was in the first place (most calls to it came from softirqs).
> > 
> > Signed-off-by: Marcelo Ricardo Leitner <mleitner@redhat.com>
> > Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
> > Acked-by: David S. Miller <davem@davemloft.net>
> > Cc: stable@vger.kernel.org
> 
> -stable folks, please integrate this directly, thanks!

I've applied this and the previous two patches mentioned ('ipv4: move
route garbage collector to work queue' and 'ipv4: avoid parallel route
cache gc executions').  But I didn't get the other two from you.  The
last batch of networking fixes I received and applied was dated
2014-08-07, and the next one I've seen is dated 2014-10-11 and has
nothing for 3.2 or 3.4.  Did I miss one between these?

Ben.

-- 
Ben Hutchings
[W]e found...that it wasn't as easy to get programs right as we had thought.
... I realized that a large part of my life from then on was going to be spent
in finding mistakes in my own programs. - Maurice Wilkes, 1949


^ permalink raw reply

* Re: [PATCH 3.4-stable v2] ipv6: reallocate addrconf router for ipv6 address when lo device up
From: Ben Hutchings @ 2014-10-20  3:14 UTC (permalink / raw)
  To: chenweilong
  Cc: David Miller, netdev, Greg Kroah-Hartman, stable, Sabrina Dubroca,
	Hannes Frederic Sowa, Gao feng, Li Zefan
In-Reply-To: <53E1E4C9.4010006@huawei.com>

On Wed, 2014-08-06 at 16:18 +0800, chenweilong wrote:
> It fixes bug 67951 on bugzilla
> https://bugzilla.kernel.org/show_bug.cgi?id=67951
> 
> The patch can't be applied directly, as it uses the function
> ip6_rt_put() introduced by commit 94e187c0, and that patch can't be
> applied directly either.
> 
> ====================
> 
> From: Gao feng <gaofeng@cn.fujitsu.com>
> 
> commit 33d99113b1102c2d2f8603b9ba72d89d915c13f5 upstream.
> 
> This commit doesn't have a stable tag, but it fixes the bug where
> there is no reply after a loopback down-up. It is well worth
> applying to stable 3.4 kernels.
> 
> The bug is 67951 on bugzilla
> https://bugzilla.kernel.org/show_bug.cgi?id=67951
[...]

It looks like this is needed for 3.2.y as well, so I've queued it up.
Thanks.

Ben.

-- 
Ben Hutchings
[W]e found...that it wasn't as easy to get programs right as we had thought.
... I realized that a large part of my life from then on was going to be spent
in finding mistakes in my own programs. - Maurice Wilkes, 1949


^ permalink raw reply

* Re: [PATCH stable v3.2 v3.4] ipv4: disable bh while doing route gc
From: David Miller @ 2014-10-20  4:23 UTC (permalink / raw)
  To: ben; +Cc: mleitner, stable, netdev, hannes
In-Reply-To: <1413774581.31953.12.camel@decadent.org.uk>

From: Ben Hutchings <ben@decadent.org.uk>
Date: Mon, 20 Oct 2014 04:09:41 +0100

> I've applied this and the previous two patches mentioned ('ipv4: move
> route garbage collector to work queue' and 'ipv4: avoid parallel route
> cache gc executions').  But I didn't get the other two from you.  The
> last batch of networking fixes I received and applied was dated
> 2014-08-07, and the next one I've seen is dated 2014-10-11 and has
> nothing for 3.2 or 3.4.  Did I miss one between these?

I'm at the point where I'm personally not going to go back more than
four releases; anything more than that is ridiculous.

And this time that was 3.17, 3.16, 3.14, and 3.10

^ permalink raw reply

* Re: [PATCH net] ax88179_178a: fix bonding failure
From: David Miller @ 2014-10-20  4:55 UTC (permalink / raw)
  To: imorgan; +Cc: netdev
In-Reply-To: <alpine.LFD.2.11.1410190801050.5282@solo.int.primordial.ca>

From: Ian Morgan <imorgan@primordial.ca>
Date: Sun, 19 Oct 2014 08:05:13 -0400 (EDT)

> The following patch fixes a bug which causes the ax88179_178a driver to be
> incapable of being added to a bond.
> 
> When I brought up the issue with the bonding maintainers, they indicated
> that the real problem was with the NIC driver which must return zero for
> success (of setting the MAC address). I see that several other NIC drivers
> follow that pattern by either simply always returning zero, or by passing
> through a negative (error) result while rewriting any positive return code
> to zero. With that same philosophy applied to the ax88179_178a driver, it
> allows it to work correctly with the bonding driver.
> 
> I believe this is suitable for queuing in -stable, as it's a small, simple,
> and obvious fix that corrects a defect with no other known workaround.
> 
> This patch is against vanilla 3.17(.0).
> 
> Signed-off-by: Ian Morgan <imorgan@primordial.ca>

Applied and queued up for -stable, thanks.
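
The return-code convention described in the quoted text can be sketched in
plain C. This is a userspace illustration, not the driver's code:
fake_write_mac() is a hypothetical stand-in for a register write that is
assumed to return a positive count on success or a negative errno on
failure, and set_mac_address_sketch() is likewise a made-up name:

```c
#include <assert.h>
#include <errno.h>

/*
 * Hypothetical stand-in for a device write.  The caller supplies the
 * result to simulate: a positive count on success (an assumption about
 * the underlying helper) or a negative errno on failure.
 */
static int fake_write_mac(int simulated_result)
{
	return simulated_result;
}

/*
 * Sketch of a set-MAC-address style callback: pass negative errors
 * through unchanged and rewrite any non-negative result to 0, since
 * the caller treats every non-zero return as a failure.
 */
static int set_mac_address_sketch(int simulated_result)
{
	int ret = fake_write_mac(simulated_result);

	if (ret < 0)
		return ret;

	return 0;
}
```

With this shape, a caller that checks the result with a plain "if (ret)"
no longer mistakes a positive success value for an error.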

^ permalink raw reply

* [PATCH net] tipc: fix a potential deadlock
From: Ying Xue @ 2014-10-20  6:44 UTC (permalink / raw)
  To: davem; +Cc: jon.maloy, erik.hugne, netdev, tipc-discussion

A locking dependency check detected the possible unsafe locking scenario below:

           CPU0                          CPU1
T0:  tipc_named_rcv()                tipc_rcv()
T1:  [grab nametbl write lock]*      [grab node lock]*
T2:  tipc_update_nametbl()           tipc_node_link_up()
T3:  tipc_nodesub_subscribe()        tipc_nametbl_publish()
T4:  [grab node lock]*               [grab nametbl write lock]*

Holding the nametbl write lock and the node lock in opposite order on
the two paths above may result in a deadlock. If we move the update of
the link state name in the name table out of the node lock, the reverse
lock ordering is eliminated, and with it the deadlock risk.

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
---
 net/tipc/node.c   |   46 ++++++++++++++++++++++++++++------------------
 net/tipc/node.h   |    7 ++++++-
 net/tipc/socket.c |    2 +-
 3 files changed, 35 insertions(+), 20 deletions(-)

diff --git a/net/tipc/node.c b/net/tipc/node.c
index 90cee4a..5781634 100644
--- a/net/tipc/node.c
+++ b/net/tipc/node.c
@@ -219,11 +219,11 @@ void tipc_node_abort_sock_conns(struct list_head *conns)
 void tipc_node_link_up(struct tipc_node *n_ptr, struct tipc_link *l_ptr)
 {
 	struct tipc_link **active = &n_ptr->active_links[0];
-	u32 addr = n_ptr->addr;
 
 	n_ptr->working_links++;
-	tipc_nametbl_publish(TIPC_LINK_STATE, addr, addr, TIPC_NODE_SCOPE,
-			     l_ptr->bearer_id, addr);
+	n_ptr->action_flags |= TIPC_NOTIFY_LINK_UP;
+	n_ptr->link_id = l_ptr->peer_bearer_id << 16 | l_ptr->bearer_id;
+
 	pr_info("Established link <%s> on network plane %c\n",
 		l_ptr->name, l_ptr->net_plane);
 
@@ -284,10 +284,10 @@ static void node_select_active_links(struct tipc_node *n_ptr)
 void tipc_node_link_down(struct tipc_node *n_ptr, struct tipc_link *l_ptr)
 {
 	struct tipc_link **active;
-	u32 addr = n_ptr->addr;
 
 	n_ptr->working_links--;
-	tipc_nametbl_withdraw(TIPC_LINK_STATE, addr, l_ptr->bearer_id, addr);
+	n_ptr->action_flags |= TIPC_NOTIFY_LINK_DOWN;
+	n_ptr->link_id = l_ptr->peer_bearer_id << 16 | l_ptr->bearer_id;
 
 	if (!tipc_link_is_active(l_ptr)) {
 		pr_info("Lost standby link <%s> on network plane %c\n",
@@ -552,28 +552,30 @@ void tipc_node_unlock(struct tipc_node *node)
 	LIST_HEAD(conn_sks);
 	struct sk_buff_head waiting_sks;
 	u32 addr = 0;
-	unsigned int flags = node->action_flags;
+	int flags = node->action_flags;
+	u32 link_id = 0;
 
-	if (likely(!node->action_flags)) {
+	if (likely(!flags)) {
 		spin_unlock_bh(&node->lock);
 		return;
 	}
 
+	addr = node->addr;
+	link_id = node->link_id;
 	__skb_queue_head_init(&waiting_sks);
-	if (node->action_flags & TIPC_WAKEUP_USERS) {
+
+	if (flags & TIPC_WAKEUP_USERS)
 		skb_queue_splice_init(&node->waiting_sks, &waiting_sks);
-		node->action_flags &= ~TIPC_WAKEUP_USERS;
-	}
-	if (node->action_flags & TIPC_NOTIFY_NODE_DOWN) {
+
+	if (flags & TIPC_NOTIFY_NODE_DOWN) {
 		list_replace_init(&node->nsub, &nsub_list);
 		list_replace_init(&node->conn_sks, &conn_sks);
-		node->action_flags &= ~TIPC_NOTIFY_NODE_DOWN;
 	}
-	if (node->action_flags & TIPC_NOTIFY_NODE_UP) {
-		node->action_flags &= ~TIPC_NOTIFY_NODE_UP;
-		addr = node->addr;
-	}
-	node->action_flags &= ~TIPC_WAKEUP_BCAST_USERS;
+	node->action_flags &= ~(TIPC_WAKEUP_USERS | TIPC_NOTIFY_NODE_DOWN |
+				TIPC_NOTIFY_NODE_UP | TIPC_NOTIFY_LINK_UP |
+				TIPC_NOTIFY_LINK_DOWN |
+				TIPC_WAKEUP_BCAST_USERS);
+
 	spin_unlock_bh(&node->lock);
 
 	while (!skb_queue_empty(&waiting_sks))
@@ -588,6 +590,14 @@ void tipc_node_unlock(struct tipc_node *node)
 	if (flags & TIPC_WAKEUP_BCAST_USERS)
 		tipc_bclink_wakeup_users();
 
-	if (addr)
+	if (flags & TIPC_NOTIFY_NODE_UP)
 		tipc_named_node_up(addr);
+
+	if (flags & TIPC_NOTIFY_LINK_UP)
+		tipc_nametbl_publish(TIPC_LINK_STATE, addr, addr,
+				     TIPC_NODE_SCOPE, link_id, addr);
+
+	if (flags & TIPC_NOTIFY_LINK_DOWN)
+		tipc_nametbl_withdraw(TIPC_LINK_STATE, addr,
+				      link_id, addr);
 }
diff --git a/net/tipc/node.h b/net/tipc/node.h
index 67513c3..04e9145 100644
--- a/net/tipc/node.h
+++ b/net/tipc/node.h
@@ -53,6 +53,7 @@
  * TIPC_WAIT_OWN_LINKS_DOWN: wait until peer node is declared down
  * TIPC_NOTIFY_NODE_DOWN: notify node is down
  * TIPC_NOTIFY_NODE_UP: notify node is up
+ * TIPC_DISTRIBUTE_NAME: publish or withdraw link state name type
  */
 enum {
 	TIPC_WAIT_PEER_LINKS_DOWN	= (1 << 1),
@@ -60,7 +61,9 @@ enum {
 	TIPC_NOTIFY_NODE_DOWN		= (1 << 3),
 	TIPC_NOTIFY_NODE_UP		= (1 << 4),
 	TIPC_WAKEUP_USERS		= (1 << 5),
-	TIPC_WAKEUP_BCAST_USERS		= (1 << 6)
+	TIPC_WAKEUP_BCAST_USERS		= (1 << 6),
+	TIPC_NOTIFY_LINK_UP		= (1 << 7),
+	TIPC_NOTIFY_LINK_DOWN		= (1 << 8)
 };
 
 /**
@@ -100,6 +103,7 @@ struct tipc_node_bclink {
  * @working_links: number of working links to node (both active and standby)
  * @link_cnt: number of links to node
  * @signature: node instance identifier
+ * @link_id: local and remote bearer ids of changing link, if any
  * @nsub: list of "node down" subscriptions monitoring node
  * @rcu: rcu struct for tipc_node
  */
@@ -116,6 +120,7 @@ struct tipc_node {
 	int link_cnt;
 	int working_links;
 	u32 signature;
+	u32 link_id;
 	struct list_head nsub;
 	struct sk_buff_head waiting_sks;
 	struct list_head conn_sks;
diff --git a/net/tipc/socket.c b/net/tipc/socket.c
index 75275c5..3043f10 100644
--- a/net/tipc/socket.c
+++ b/net/tipc/socket.c
@@ -2673,7 +2673,7 @@ static int tipc_ioctl(struct socket *sk, unsigned int cmd, unsigned long arg)
 	case SIOCGETLINKNAME:
 		if (copy_from_user(&lnr, argp, sizeof(lnr)))
 			return -EFAULT;
-		if (!tipc_node_get_linkname(lnr.bearer_id, lnr.peer,
+		if (!tipc_node_get_linkname(lnr.bearer_id & 0xffff, lnr.peer,
 					    lnr.linkname, TIPC_MAX_LINK_NAME)) {
 			if (copy_to_user(argp, &lnr, sizeof(lnr)))
 				return -EFAULT;
-- 
1.7.9.5
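
The deferral pattern used by the patch above (record the pending action in a
flag while the node lock is held, then act on it only after the lock has been
dropped) can be sketched in single-threaded userspace C. Everything here is
an illustrative model under stated assumptions: struct node_sketch, the
lock_held integer that stands in for the spinlock, and the counter-based
nametbl_* functions are hypothetical and only borrow names from the patch:

```c
#include <assert.h>
#include <stdint.h>

enum {
	NOTIFY_LINK_UP   = 1 << 7,
	NOTIFY_LINK_DOWN = 1 << 8,
};

struct node_sketch {
	int lock_held;       /* models the node spinlock */
	unsigned int flags;  /* pending actions, set while locked */
	uint32_t link_id;    /* peer bearer id << 16 | local bearer id */
	int published;       /* counts name table publications */
	int withdrawn;       /* counts name table withdrawals */
};

static void node_lock(struct node_sketch *n)
{
	assert(!n->lock_held);
	n->lock_held = 1;
}

/* Name table updates take their own lock, so the node lock must be free. */
static void nametbl_publish(struct node_sketch *n, uint32_t link_id)
{
	assert(!n->lock_held);
	(void)link_id;
	n->published++;
}

static void nametbl_withdraw(struct node_sketch *n, uint32_t link_id)
{
	assert(!n->lock_held);
	(void)link_id;
	n->withdrawn++;
}

/* Called with the node lock held: only record what has to happen. */
static void link_up(struct node_sketch *n, uint32_t peer_bearer, uint32_t bearer)
{
	assert(n->lock_held);
	n->flags |= NOTIFY_LINK_UP;
	n->link_id = peer_bearer << 16 | bearer;
}

static void link_down(struct node_sketch *n, uint32_t peer_bearer, uint32_t bearer)
{
	assert(n->lock_held);
	n->flags |= NOTIFY_LINK_DOWN;
	n->link_id = peer_bearer << 16 | bearer;
}

/* Snapshot and clear the flags under the lock, then act after unlocking. */
static void node_unlock(struct node_sketch *n)
{
	unsigned int flags = n->flags;
	uint32_t link_id = n->link_id;

	n->flags = 0;
	n->lock_held = 0;

	if (flags & NOTIFY_LINK_UP)
		nametbl_publish(n, link_id);
	if (flags & NOTIFY_LINK_DOWN)
		nametbl_withdraw(n, link_id);
}
```

Because the name table is only touched after lock_held has been cleared, the
node lock and the name table lock are never nested in this path, which is the
ordering property the commit message relies on. The low 16 bits of link_id
carry the local bearer id, matching the "& 0xffff" mask in the socket.c hunk.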

^ permalink raw reply related

* [PATCH net] tipc: fix lockdep warning when intra-node messages are delivered
From: Ying Xue @ 2014-10-20  6:46 UTC (permalink / raw)
  To: davem; +Cc: jon.maloy, erik.hugne, netdev, tipc-discussion

When running the tipcTC&tipcTS test suite, the lockdep unsafe locking
scenario below is reported:

[ 1109.997854]
[ 1109.997988] =================================
[ 1109.998290] [ INFO: inconsistent lock state ]
[ 1109.998575] 3.17.0-rc1+ #113 Not tainted
[ 1109.998762] ---------------------------------
[ 1109.998762] inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
[ 1109.998762] swapper/7/0 [HC0[0]:SC1[1]:HE1:SE0] takes:
[ 1109.998762]  (slock-AF_TIPC){+.?...}, at: [<ffffffffa0011969>] tipc_sk_rcv+0x49/0x2b0 [tipc]
[ 1109.998762] {SOFTIRQ-ON-W} state was registered at:
[ 1109.998762]   [<ffffffff810a4770>] __lock_acquire+0x6a0/0x1d80
[ 1109.998762]   [<ffffffff810a6555>] lock_acquire+0x95/0x1e0
[ 1109.998762]   [<ffffffff81a2d1ce>] _raw_spin_lock+0x3e/0x80
[ 1109.998762]   [<ffffffffa0011969>] tipc_sk_rcv+0x49/0x2b0 [tipc]
[ 1109.998762]   [<ffffffffa0004fe8>] tipc_link_xmit+0xa8/0xc0 [tipc]
[ 1109.998762]   [<ffffffffa000ec6f>] tipc_sendmsg+0x15f/0x550 [tipc]
[ 1109.998762]   [<ffffffffa000f165>] tipc_connect+0x105/0x140 [tipc]
[ 1109.998762]   [<ffffffff817676ee>] SYSC_connect+0xae/0xc0
[ 1109.998762]   [<ffffffff81767b7e>] SyS_connect+0xe/0x10
[ 1109.998762]   [<ffffffff817a9788>] compat_SyS_socketcall+0xb8/0x200
[ 1109.998762]   [<ffffffff81a306e5>] sysenter_dispatch+0x7/0x1f
[ 1109.998762] irq event stamp: 241060
[ 1109.998762] hardirqs last  enabled at (241060): [<ffffffff8105a4ad>] __local_bh_enable_ip+0x6d/0xd0
[ 1109.998762] hardirqs last disabled at (241059): [<ffffffff8105a46f>] __local_bh_enable_ip+0x2f/0xd0
[ 1109.998762] softirqs last  enabled at (241020): [<ffffffff81059a52>] _local_bh_enable+0x22/0x50
[ 1109.998762] softirqs last disabled at (241021): [<ffffffff8105a626>] irq_exit+0x96/0xc0
[ 1109.998762]
[ 1109.998762] other info that might help us debug this:
[ 1109.998762]  Possible unsafe locking scenario:
[ 1109.998762]
[ 1109.998762]        CPU0
[ 1109.998762]        ----
[ 1109.998762]   lock(slock-AF_TIPC);
[ 1109.998762]   <Interrupt>
[ 1109.998762]     lock(slock-AF_TIPC);
[ 1109.998762]
[ 1109.998762]  *** DEADLOCK ***
[ 1109.998762]
[ 1109.998762] 2 locks held by swapper/7/0:
[ 1109.998762]  #0:  (rcu_read_lock){......}, at: [<ffffffff81782dc9>] __netif_receive_skb_core+0x69/0xb70
[ 1109.998762]  #1:  (rcu_read_lock){......}, at: [<ffffffffa0001c90>] tipc_l2_rcv_msg+0x40/0x260 [tipc]
[ 1109.998762]
[ 1109.998762] stack backtrace:
[ 1109.998762] CPU: 7 PID: 0 Comm: swapper/7 Not tainted 3.17.0-rc1+ #113
[ 1109.998762] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
[ 1109.998762]  ffffffff82745830 ffff880016c03828 ffffffff81a209eb 0000000000000007
[ 1109.998762]  ffff880017b3cac0 ffff880016c03888 ffffffff81a1c5ef 0000000000000001
[ 1109.998762]  ffff880000000001 ffff880000000000 ffffffff81012d4f 0000000000000000
[ 1109.998762] Call Trace:
[ 1109.998762]  <IRQ>  [<ffffffff81a209eb>] dump_stack+0x4e/0x68
[ 1109.998762]  [<ffffffff81a1c5ef>] print_usage_bug+0x1f1/0x202
[ 1109.998762]  [<ffffffff81012d4f>] ? save_stack_trace+0x2f/0x50
[ 1109.998762]  [<ffffffff810a406c>] mark_lock+0x28c/0x2f0
[ 1109.998762]  [<ffffffff810a3440>] ? print_irq_inversion_bug.part.46+0x1f0/0x1f0
[ 1109.998762]  [<ffffffff810a467d>] __lock_acquire+0x5ad/0x1d80
[ 1109.998762]  [<ffffffff810a70dd>] ? trace_hardirqs_on+0xd/0x10
[ 1109.998762]  [<ffffffff8108ace8>] ? sched_clock_cpu+0x98/0xc0
[ 1109.998762]  [<ffffffff8108ad2b>] ? local_clock+0x1b/0x30
[ 1109.998762]  [<ffffffff810a10dc>] ? lock_release_holdtime.part.29+0x1c/0x1a0
[ 1109.998762]  [<ffffffff8108aa05>] ? sched_clock_local+0x25/0x90
[ 1109.998762]  [<ffffffffa000dec0>] ? tipc_sk_get+0x60/0x80 [tipc]
[ 1109.998762]  [<ffffffff810a6555>] lock_acquire+0x95/0x1e0
[ 1109.998762]  [<ffffffffa0011969>] ? tipc_sk_rcv+0x49/0x2b0 [tipc]
[ 1109.998762]  [<ffffffff810a6fb6>] ? trace_hardirqs_on_caller+0xa6/0x1c0
[ 1109.998762]  [<ffffffff81a2d1ce>] _raw_spin_lock+0x3e/0x80
[ 1109.998762]  [<ffffffffa0011969>] ? tipc_sk_rcv+0x49/0x2b0 [tipc]
[ 1109.998762]  [<ffffffffa000dec0>] ? tipc_sk_get+0x60/0x80 [tipc]
[ 1109.998762]  [<ffffffffa0011969>] tipc_sk_rcv+0x49/0x2b0 [tipc]
[ 1109.998762]  [<ffffffffa00076bd>] tipc_rcv+0x5ed/0x960 [tipc]
[ 1109.998762]  [<ffffffffa0001d1c>] tipc_l2_rcv_msg+0xcc/0x260 [tipc]
[ 1109.998762]  [<ffffffffa0001c90>] ? tipc_l2_rcv_msg+0x40/0x260 [tipc]
[ 1109.998762]  [<ffffffff81783345>] __netif_receive_skb_core+0x5e5/0xb70
[ 1109.998762]  [<ffffffff81782dc9>] ? __netif_receive_skb_core+0x69/0xb70
[ 1109.998762]  [<ffffffff81784eb9>] ? dev_gro_receive+0x259/0x4e0
[ 1109.998762]  [<ffffffff817838f6>] __netif_receive_skb+0x26/0x70
[ 1109.998762]  [<ffffffff81783acd>] netif_receive_skb_internal+0x2d/0x1f0
[ 1109.998762]  [<ffffffff81785518>] napi_gro_receive+0xd8/0x240
[ 1109.998762]  [<ffffffff815bf854>] e1000_clean_rx_irq+0x2c4/0x530
[ 1109.998762]  [<ffffffff815c1a46>] e1000_clean+0x266/0x9c0
[ 1109.998762]  [<ffffffff8108ad2b>] ? local_clock+0x1b/0x30
[ 1109.998762]  [<ffffffff8108aa05>] ? sched_clock_local+0x25/0x90
[ 1109.998762]  [<ffffffff817842b1>] net_rx_action+0x141/0x310
[ 1109.998762]  [<ffffffff810bd710>] ? handle_fasteoi_irq+0xe0/0x150
[ 1109.998762]  [<ffffffff81059fa6>] __do_softirq+0x116/0x4d0
[ 1109.998762]  [<ffffffff8105a626>] irq_exit+0x96/0xc0
[ 1109.998762]  [<ffffffff81a30d07>] do_IRQ+0x67/0x110
[ 1109.998762]  [<ffffffff81a2ee2f>] common_interrupt+0x6f/0x6f
[ 1109.998762]  <EOI>  [<ffffffff8100d2b7>] ? default_idle+0x37/0x250
[ 1109.998762]  [<ffffffff8100d2b5>] ? default_idle+0x35/0x250
[ 1109.998762]  [<ffffffff8100dd1f>] arch_cpu_idle+0xf/0x20
[ 1109.998762]  [<ffffffff810999fd>] cpu_startup_entry+0x27d/0x4d0
[ 1109.998762]  [<ffffffff81034c78>] start_secondary+0x188/0x1f0

When intra-node messages are delivered from one process to another
process, tipc_link_xmit() doesn't disable BH before it directly calls
tipc_sk_rcv() in process context to forward messages to the destination
socket. Meanwhile, if messages delivered by a remote node arrive at the
node and their destination is the same socket, tipc_sk_rcv() running
in process context might be preempted by tipc_sk_rcv() running in BH
context. As a result, the latter cannot obtain the socket lock because
it is held by the former, while the former has no chance to run because
the latter now owns the CPU, so a deadlock happens. To avoid it, BH
should always be disabled in tipc_sk_rcv().

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
---
 net/tipc/socket.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/tipc/socket.c b/net/tipc/socket.c
index 3043f10..51bddc2 100644
--- a/net/tipc/socket.c
+++ b/net/tipc/socket.c
@@ -1776,7 +1776,7 @@ int tipc_sk_rcv(struct sk_buff *buf)
 	sk = &tsk->sk;
 
 	/* Queue message */
-	bh_lock_sock(sk);
+	spin_lock_bh(&sk->sk_lock.slock);
 
 	if (!sock_owned_by_user(sk)) {
 		rc = filter_rcv(sk, buf);
@@ -1787,7 +1787,7 @@ int tipc_sk_rcv(struct sk_buff *buf)
 		if (sk_add_backlog(sk, buf, limit))
 			rc = -TIPC_ERR_OVERLOAD;
 	}
-	bh_unlock_sock(sk);
+	spin_unlock_bh(&sk->sk_lock.slock);
 	tipc_sk_put(tsk);
 	if (likely(!rc))
 		return 0;
-- 
1.7.9.5

^ permalink raw reply related

* [PATCH RFC v3 3/3] virtio-net: optimize free_old_xmit_skbs stats
From: Michael S. Tsirkin @ 2014-10-20  6:52 UTC (permalink / raw)
  To: linux-kernel, netdev; +Cc: Jason Wang, Rusty Russell, virtualization, netdev
In-Reply-To: <1413787824-16130-1-git-send-email-mst@redhat.com>

From: Jason Wang <jasowang@redhat.com>

We already have counters for sent packets and sent bytes.
Use them to reduce the number of u64_stats_update_begin/end().

Take care not to bother with stats update when called
speculatively.

Based on a patch by Jason Wang.

Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
 drivers/net/virtio_net.c | 16 +++++++++++-----
 1 file changed, 11 insertions(+), 5 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index b83d39d..c2b69f8 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -233,16 +233,22 @@ static unsigned int free_old_xmit_skbs(struct netdev_queue *txq,
 	       (skb = virtqueue_get_buf(sq->vq, &len)) != NULL) {
 		pr_debug("Sent skb %p\n", skb);
 
-		u64_stats_update_begin(&stats->tx_syncp);
-		stats->tx_bytes += skb->len;
 		bytes += skb->len;
-		stats->tx_packets++;
-		u64_stats_update_end(&stats->tx_syncp);
+		packets++;
 
 		dev_kfree_skb_any(skb);
-		packets++;
 	}
 
+	/* Avoid overhead when no packets have been processed
+	 * happens when called speculatively from start_xmit. */
+	if (!packets)
+		return 0;
+
+	u64_stats_update_begin(&stats->tx_syncp);
+	stats->tx_bytes += bytes;
+	stats->tx_packets += packets;
+	u64_stats_update_end(&stats->tx_syncp);
+
 	netdev_tx_completed_queue(txq, packets, bytes);
 
 	if (sq->vq->num_free >= 2+MAX_SKB_FRAGS)
-- 
MST
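
The batching idea in the patch above can be sketched in userspace C: sum the
bytes and packets in local variables, then enter the statistics update
section once, and skip it entirely when nothing was reclaimed. The names
below are hypothetical; the sync_sections counter merely stands in for a
u64_stats_update_begin()/u64_stats_update_end() pair so that the number of
update sections can be observed:

```c
#include <assert.h>
#include <stdint.h>

struct tx_stats_sketch {
	uint64_t tx_bytes;
	uint64_t tx_packets;
	unsigned int sync_sections; /* stands in for begin/end pairs */
};

/*
 * Reclaim up to `budget` of the `n` completed packets whose lengths are
 * given in `lens`, and account for them with a single stats update.
 */
static unsigned int reclaim_sketch(struct tx_stats_sketch *stats,
				   const unsigned int *lens,
				   unsigned int n, unsigned int budget)
{
	unsigned int packets = 0;
	uint64_t bytes = 0;

	assert(stats);

	while (packets < budget && packets < n) {
		bytes += lens[packets];
		packets++;
	}

	/* Nothing reclaimed (e.g. a speculative call): leave stats alone. */
	if (!packets)
		return 0;

	stats->sync_sections++;          /* one update section per batch */
	stats->tx_bytes += bytes;
	stats->tx_packets += packets;

	return packets;
}
```

Compared with updating the counters inside the loop, the number of update
sections drops from one per packet to at most one per call.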

^ permalink raw reply related

* [PATCH RFC v3 0/3] virtio_net: enabling tx interrupts
From: Michael S. Tsirkin @ 2014-10-20  6:52 UTC (permalink / raw)
  To: linux-kernel, netdev; +Cc: Jason Wang

RFC patches to enable tx interrupts.
This is to demonstrate how this can be done without
core virtio changes, and to make sure I understand
the new APIs correctly.

Testing TBD, I was asked for a version for early testing.

Applies on top of patch: "virtio_net: fix use after free"
that I recently sent.

Changes from v3:
	clean up code, address issues raised by Jason
Changes from v1:
        address comments by Jason Wang, use delayed cb everywhere
        rebased Jason's patch on top of mine and include it (with some tweaks)

Jason Wang (1):
  virtio-net: optimize free_old_xmit_skbs stats

Michael S. Tsirkin (2):
  virtio_net: enable tx interrupt
  virtio_net: bql

 drivers/net/virtio_net.c | 144 +++++++++++++++++++++++++++++++++--------------
 1 file changed, 101 insertions(+), 43 deletions(-)

-- 
MST

^ permalink raw reply

* [PATCH RFC v3 1/3] virtio_net: enable tx interrupt
From: Michael S. Tsirkin @ 2014-10-20  6:52 UTC (permalink / raw)
  To: linux-kernel, netdev; +Cc: netdev, virtualization
In-Reply-To: <1413787824-16130-1-git-send-email-mst@redhat.com>

On newer hosts that support delayed tx interrupts,
we probably don't have much to gain from orphaning
packets early.

Based on patch by Jason Wang.

Note: this might degrade performance for
hosts without event idx support.
Should be addressed by the next patch.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
 drivers/net/virtio_net.c | 133 +++++++++++++++++++++++++++++++----------------
 1 file changed, 89 insertions(+), 44 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 13d0a8b..14f4cda 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -72,6 +72,8 @@ struct send_queue {
 
 	/* Name of the send queue: output.$index */
 	char name[40];
+
+	struct napi_struct napi;
 };
 
 /* Internal representation of a receive virtqueue */
@@ -217,15 +219,41 @@ static struct page *get_a_page(struct receive_queue *rq, gfp_t gfp_mask)
 	return p;
 }
 
+static unsigned int free_old_xmit_skbs(struct netdev_queue *txq,
+				       struct send_queue *sq, int budget)
+{
+	struct sk_buff *skb;
+	unsigned int len;
+	struct virtnet_info *vi = sq->vq->vdev->priv;
+	struct virtnet_stats *stats = this_cpu_ptr(vi->stats);
+	unsigned int packets = 0;
+
+	while (packets < budget &&
+	       (skb = virtqueue_get_buf(sq->vq, &len)) != NULL) {
+		pr_debug("Sent skb %p\n", skb);
+
+		u64_stats_update_begin(&stats->tx_syncp);
+		stats->tx_bytes += skb->len;
+		stats->tx_packets++;
+		u64_stats_update_end(&stats->tx_syncp);
+
+		dev_kfree_skb_any(skb);
+		packets++;
+	}
+
+	if (sq->vq->num_free >= 2+MAX_SKB_FRAGS)
+		netif_tx_start_queue(txq);
+
+	return packets;
+}
+
 static void skb_xmit_done(struct virtqueue *vq)
 {
 	struct virtnet_info *vi = vq->vdev->priv;
+	struct send_queue *sq = &vi->sq[vq2txq(vq)];
 
-	/* Suppress further interrupts. */
-	virtqueue_disable_cb(vq);
-
-	/* We were probably waiting for more output buffers. */
-	netif_wake_subqueue(vi->dev, vq2txq(vq));
+	virtqueue_disable_cb(sq->vq);
+	napi_schedule(&sq->napi);
 }
 
 static unsigned int mergeable_ctx_to_buf_truesize(unsigned long mrg_ctx)
@@ -774,6 +802,31 @@ again:
 	return received;
 }
 
+static int virtnet_poll_tx(struct napi_struct *napi, int budget)
+{
+	struct send_queue *sq =
+		container_of(napi, struct send_queue, napi);
+	struct virtnet_info *vi = sq->vq->vdev->priv;
+	struct netdev_queue *txq = netdev_get_tx_queue(vi->dev, vq2txq(sq->vq));
+	unsigned int sent;
+
+	__netif_tx_lock(txq, smp_processor_id());
+	sent = free_old_xmit_skbs(txq, sq, budget);
+	if (sent < budget) {
+		napi_complete(napi);
+		/* Note: we must enable cb *after* napi_complete, because
+		 * napi_schedule calls from callbacks that trigger before
+		 * napi_complete are ignored.
+		 */
+		if (unlikely(!virtqueue_enable_cb_delayed(sq->vq))) {
+			virtqueue_disable_cb(sq->vq);
+			napi_schedule(&sq->napi);
+		}
+	}
+	__netif_tx_unlock(txq);
+	return sent;
+}
+
 #ifdef CONFIG_NET_RX_BUSY_POLL
 /* must be called with local_bh_disable()d */
 static int virtnet_busy_poll(struct napi_struct *napi)
@@ -822,30 +875,12 @@ static int virtnet_open(struct net_device *dev)
 			if (!try_fill_recv(&vi->rq[i], GFP_KERNEL))
 				schedule_delayed_work(&vi->refill, 0);
 		virtnet_napi_enable(&vi->rq[i]);
+		napi_enable(&vi->sq[i].napi);
 	}
 
 	return 0;
 }
 
-static void free_old_xmit_skbs(struct send_queue *sq)
-{
-	struct sk_buff *skb;
-	unsigned int len;
-	struct virtnet_info *vi = sq->vq->vdev->priv;
-	struct virtnet_stats *stats = this_cpu_ptr(vi->stats);
-
-	while ((skb = virtqueue_get_buf(sq->vq, &len)) != NULL) {
-		pr_debug("Sent skb %p\n", skb);
-
-		u64_stats_update_begin(&stats->tx_syncp);
-		stats->tx_bytes += skb->len;
-		stats->tx_packets++;
-		u64_stats_update_end(&stats->tx_syncp);
-
-		dev_kfree_skb_any(skb);
-	}
-}
-
 static int xmit_skb(struct send_queue *sq, struct sk_buff *skb)
 {
 	struct skb_vnet_hdr *hdr;
@@ -911,7 +946,9 @@ static int xmit_skb(struct send_queue *sq, struct sk_buff *skb)
 		sg_set_buf(sq->sg, hdr, hdr_len);
 		num_sg = skb_to_sgvec(skb, sq->sg + 1, 0, skb->len) + 1;
 	}
-	return virtqueue_add_outbuf(sq->vq, sq->sg, num_sg, skb, GFP_ATOMIC);
+
+	return virtqueue_add_outbuf(sq->vq, sq->sg, num_sg, skb,
+				    GFP_ATOMIC);
 }
 
 static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev)
@@ -923,8 +960,7 @@ static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev)
 	struct netdev_queue *txq = netdev_get_tx_queue(dev, qnum);
 	bool kick = !skb->xmit_more;
 
-	/* Free up any pending old buffers before queueing new ones. */
-	free_old_xmit_skbs(sq);
+	virtqueue_disable_cb(sq->vq);
 
 	/* Try to transmit */
 	err = xmit_skb(sq, skb);
@@ -940,27 +976,26 @@ static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev)
 		return NETDEV_TX_OK;
 	}
 
-	/* Don't wait up for transmitted skbs to be freed. */
-	skb_orphan(skb);
-	nf_reset(skb);
-
 	/* Apparently nice girls don't return TX_BUSY; stop the queue
 	 * before it gets out of hand.  Naturally, this wastes entries. */
-	if (sq->vq->num_free < 2+MAX_SKB_FRAGS) {
+	if (sq->vq->num_free < 2+MAX_SKB_FRAGS)
 		netif_stop_subqueue(dev, qnum);
-		if (unlikely(!virtqueue_enable_cb_delayed(sq->vq))) {
-			/* More just got used, free them then recheck. */
-			free_old_xmit_skbs(sq);
-			if (sq->vq->num_free >= 2+MAX_SKB_FRAGS) {
-				netif_start_subqueue(dev, qnum);
-				virtqueue_disable_cb(sq->vq);
-			}
-		}
-	}
 
 	if (kick || netif_xmit_stopped(txq))
 		virtqueue_kick(sq->vq);
 
+	/* Try to pop off some buffers before we re-enable callbacks.
+	 * It makes sense to do it after kick, since that causes
+	 * device to process packets.
+	 */
+	if (free_old_xmit_skbs(txq, sq, NAPI_POLL_WEIGHT) < NAPI_POLL_WEIGHT) {
+		if (unlikely(!virtqueue_enable_cb_delayed(sq->vq))) {
+			virtqueue_disable_cb(sq->vq);
+			napi_schedule(&sq->napi);
+		}
+	} else {
+		napi_schedule(&sq->napi);
+	}
 	return NETDEV_TX_OK;
 }
 
@@ -1137,8 +1172,10 @@ static int virtnet_close(struct net_device *dev)
 	/* Make sure refill_work doesn't re-enable napi! */
 	cancel_delayed_work_sync(&vi->refill);
 
-	for (i = 0; i < vi->max_queue_pairs; i++)
+	for (i = 0; i < vi->max_queue_pairs; i++) {
 		napi_disable(&vi->rq[i].napi);
+		napi_disable(&vi->sq[i].napi);
+	}
 
 	return 0;
 }
@@ -1457,8 +1494,10 @@ static void virtnet_free_queues(struct virtnet_info *vi)
 {
 	int i;
 
-	for (i = 0; i < vi->max_queue_pairs; i++)
+	for (i = 0; i < vi->max_queue_pairs; i++) {
 		netif_napi_del(&vi->rq[i].napi);
+		netif_napi_del(&vi->sq[i].napi);
+	}
 
 	kfree(vi->rq);
 	kfree(vi->sq);
@@ -1612,6 +1651,8 @@ static int virtnet_alloc_queues(struct virtnet_info *vi)
 		netif_napi_add(vi->dev, &vi->rq[i].napi, virtnet_poll,
 			       napi_weight);
 		napi_hash_add(&vi->rq[i].napi);
+		netif_napi_add(vi->dev, &vi->sq[i].napi, virtnet_poll_tx,
+			       napi_weight);
 
 		sg_init_table(vi->rq[i].sg, ARRAY_SIZE(vi->rq[i].sg));
 		ewma_init(&vi->rq[i].mrg_avg_pkt_len, 1, RECEIVE_AVG_WEIGHT);
@@ -1916,8 +1957,10 @@ static int virtnet_freeze(struct virtio_device *vdev)
 	if (netif_running(vi->dev)) {
 		for (i = 0; i < vi->max_queue_pairs; i++) {
 			napi_disable(&vi->rq[i].napi);
+			napi_disable(&vi->sq[i].napi);
 			napi_hash_del(&vi->rq[i].napi);
 			netif_napi_del(&vi->rq[i].napi);
+			netif_napi_del(&vi->sq[i].napi);
 		}
 	}
 
@@ -1942,8 +1985,10 @@ static int virtnet_restore(struct virtio_device *vdev)
 			if (!try_fill_recv(&vi->rq[i], GFP_KERNEL))
 				schedule_delayed_work(&vi->refill, 0);
 
-		for (i = 0; i < vi->max_queue_pairs; i++)
+		for (i = 0; i < vi->max_queue_pairs; i++) {
 			virtnet_napi_enable(&vi->rq[i]);
+			napi_enable(&vi->sq[i].napi);
+		}
 	}
 
 	netif_device_attach(vi->dev);
-- 
MST

^ permalink raw reply related

* [PATCH RFC v3 2/3] virtio_net: bql
From: Michael S. Tsirkin @ 2014-10-20  6:52 UTC (permalink / raw)
  To: linux-kernel, netdev; +Cc: netdev, virtualization
In-Reply-To: <1413787824-16130-1-git-send-email-mst@redhat.com>

Improve tx batching using byte queue limits.
Should be especially effective for MQ.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
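For readers unfamiliar with byte queue limits, the accounting this patch
hooks into can be sketched in user space. This is only a toy model under
stated assumptions: toy_txq, toy_sent(), toy_completed() and the fixed
limit are hypothetical names, and the real BQL code adapts its limit at
run time rather than using a constant.

```c
#include <stdbool.h>

/* Toy model of per-queue byte accounting; all names are hypothetical. */
struct toy_txq {
	unsigned int inflight;	/* bytes handed to the device, not yet completed */
	unsigned int limit;	/* fixed here; the kernel adjusts it dynamically */
	bool stopped;		/* true while the stack should hold back packets */
};

/* Rough analogue of netdev_tx_sent_queue(): a packet was queued. */
static void toy_sent(struct toy_txq *q, unsigned int bytes)
{
	q->inflight += bytes;
	if (q->inflight >= q->limit)
		q->stopped = true;
}

/* Rough analogue of netdev_tx_completed_queue(): the device finished bytes. */
static void toy_completed(struct toy_txq *q, unsigned int bytes)
{
	q->inflight -= bytes;
	if (q->stopped && q->inflight < q->limit)
		q->stopped = false;
}
```

In the patch, the "sent" side sits in start_xmit() and the "completed"
side in free_old_xmit_skbs(), which is where the per-call byte total is
accumulated.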
 drivers/net/virtio_net.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 14f4cda..b83d39d 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -227,6 +227,7 @@ static unsigned int free_old_xmit_skbs(struct netdev_queue *txq,
 	struct virtnet_info *vi = sq->vq->vdev->priv;
 	struct virtnet_stats *stats = this_cpu_ptr(vi->stats);
 	unsigned int packets = 0;
+	unsigned int bytes = 0;
 
 	while (packets < budget &&
 	       (skb = virtqueue_get_buf(sq->vq, &len)) != NULL) {
@@ -234,6 +235,7 @@ static unsigned int free_old_xmit_skbs(struct netdev_queue *txq,
 
 		u64_stats_update_begin(&stats->tx_syncp);
 		stats->tx_bytes += skb->len;
+		bytes += skb->len;
 		stats->tx_packets++;
 		u64_stats_update_end(&stats->tx_syncp);
 
@@ -241,6 +243,8 @@ static unsigned int free_old_xmit_skbs(struct netdev_queue *txq,
 		packets++;
 	}
 
+	netdev_tx_completed_queue(txq, packets, bytes);
+
 	if (sq->vq->num_free >= 2+MAX_SKB_FRAGS)
 		netif_tx_start_queue(txq);
 
@@ -959,6 +963,7 @@ static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev)
 	int err;
 	struct netdev_queue *txq = netdev_get_tx_queue(dev, qnum);
 	bool kick = !skb->xmit_more;
+	unsigned int bytes = skb->len;
 
 	virtqueue_disable_cb(sq->vq);
 
@@ -976,6 +981,8 @@ static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev)
 		return NETDEV_TX_OK;
 	}
 
+	netdev_tx_sent_queue(txq, bytes);
+
 	/* Apparently nice girls don't return TX_BUSY; stop the queue
 	 * before it gets out of hand.  Naturally, this wastes entries. */
 	if (sq->vq->num_free < 2+MAX_SKB_FRAGS)
-- 
MST

^ permalink raw reply related

* Re: [PATCH 2/4] net: make skb_gso_segment error handling more robust
From: Florian Westphal @ 2014-10-20  7:05 UTC (permalink / raw)
  To: David Miller; +Cc: fw, netdev, edumazet
In-Reply-To: <20141019.203943.579204096575757665.davem@davemloft.net>

David Miller <davem@davemloft.net> wrote:
> From: Florian Westphal <fw@strlen.de>
> Date: Sun, 19 Oct 2014 22:42:19 +0200
> 
> > skb_gso_segment has three possible return values:
> > 1. a pointer to the first segmented skb
> > 2. an errno value (IS_ERR())
> > 3. NULL.  This can happen when GSO is used for header verification.
> > 
> > However, several callers currently test IS_ERR instead of IS_ERR_OR_NULL
> > and would oops when NULL is returned.
> > 
> > Note that these call sites should never actually see such a NULL return
> > value; all callers mask out the GSO bits in the feature argument.
> > 
> > However, in the past, there have been issues with some protocol handlers
> > erroneously not respecting the specified feature mask in some cases.
> > 
> > Signed-off-by: Florian Westphal <fw@strlen.de>
> 
> I don't think it makes sense to return PTR_ERR(p) when
> p is NULL.

Good point. Will respin.
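
The issue is easy to see in a user-space sketch. The helpers below only
imitate the kernel's ERR_PTR() family in simplified form; the toy_ names
and TOY_MAX_ERRNO are stand-ins, not the kernel definitions. With them,
the PTR_ERR() equivalent of NULL evaluates to 0, so a caller that returns
it for a NULL result would report success.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_ERRNO 4095

/* Encode a negative errno value in a pointer. */
static inline void *toy_err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

/* Decode the value again; for NULL this yields 0. */
static inline long toy_ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/* True only for pointers in the top TOY_MAX_ERRNO addresses. */
static inline bool toy_is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-TOY_MAX_ERRNO;
}

/* True for error pointers and for NULL. */
static inline bool toy_is_err_or_null(const void *ptr)
{
	return !ptr || toy_is_err(ptr);
}
```

A caller that wants NULL treated as a failure therefore has to pick its
own error code for that case instead of decoding the pointer.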

^ permalink raw reply

* Re: [GIT] Networking
From: Pablo Neira Ayuso @ 2014-10-20  7:53 UTC (permalink / raw)
  To: David Miller; +Cc: torvalds, akpm, netdev, linux-kernel
In-Reply-To: <20141019.210314.1388340101577924200.davem@davemloft.net>

On Sun, Oct 19, 2014 at 09:03:14PM -0400, David Miller wrote:
> From: Linus Torvalds <torvalds@linux-foundation.org>
> Date: Sun, 19 Oct 2014 17:32:15 -0700
> 
> > Looks like the module license issue was just overlooked when moving
> > the code out in commit c8d7b98bec43 ("netfilter: move nf_send_resetX()
> > code to nf_reject_ipvX modules").
> 
> I think Pablo has a patch pending to address this, and indeed he does:
> 
> 	http://marc.info/?l=linux-netdev&m=141293491712312&w=2
> 
> Pablo please push this to me soon, thanks.

I'll send you this batch today. Thanks.

^ permalink raw reply

* [PATCH] drivers: net: xgene: Add missing initialization in xgene_enet_ecc_init()
From: Geert Uytterhoeven @ 2014-10-20  8:08 UTC (permalink / raw)
  To: David S. Miller, Iyappan Subramanian, Keyur Chudgar
  Cc: netdev, linux-kernel, Geert Uytterhoeven

drivers/net/ethernet/apm/xgene/xgene_enet_sgmac.c: In function ‘xgene_enet_ecc_init’:
drivers/net/ethernet/apm/xgene/xgene_enet_sgmac.c:126: warning: ‘data’ may be used uninitialized in this function

Depending on the arbitrary value on the stack, the loop may terminate
too early, and cause a bogus -ENODEV failure.

Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
---
 drivers/net/ethernet/apm/xgene/xgene_enet_sgmac.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/apm/xgene/xgene_enet_sgmac.c b/drivers/net/ethernet/apm/xgene/xgene_enet_sgmac.c
index e6d24c2101982444..19e13583b4259cd4 100644
--- a/drivers/net/ethernet/apm/xgene/xgene_enet_sgmac.c
+++ b/drivers/net/ethernet/apm/xgene/xgene_enet_sgmac.c
@@ -123,7 +123,7 @@ static u32 xgene_enet_rd_mac(struct xgene_enet_pdata *p, u32 rd_addr)
 static int xgene_enet_ecc_init(struct xgene_enet_pdata *p)
 {
 	struct net_device *ndev = p->ndev;
-	u32 data;
+	u32 data = 0;
 	int i;
 
 	xgene_enet_wr_diag_csr(p, ENET_CFG_MEM_RAM_SHUTDOWN_ADDR, 0);
-- 
1.9.1

^ permalink raw reply related

* [PATCH 0/7] netfilter fixes for net
From: Pablo Neira Ayuso @ 2014-10-20  8:10 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev

Hi David,

The following patchset contains netfilter fixes for your net tree,
they are:

1) Fix missing MODULE_LICENSE() in the new nf_reject_ipv{4,6} modules.

2) Restrict nat and masq expressions to the nat chain type. Otherwise,
   users may crash their kernel if they attach a nat/masq rule to a
   non-nat chain.

3) Fix hook validation in nft_compat when non-base chains are used.
   Basically, initialize hook_mask to zero.

4) Make sure you use match/targets in nft_compat from the right chain
   type. The existing validation relies on the table name, which
   userspace can circumvent by using the appropriate table name with a
   different chain type.

5) Better netlink attribute validation in nft_nat. This expression has
   to reject the configuration when no address and proto configurations
   are specified.

6) Interpret NFTA_NAT_REG_*_MAX only if NFTA_NAT_REG_*_MIN is set.
   Yet another sanity check to reject incorrect configurations from
   userspace.

7) Conditional NAT attribute dumping depending on the existing
   configuration.

You can pull these changes from:

  git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf.git

Thanks!

----------------------------------------------------------------

The following changes since commit 01d2d484e49e9bc0ed9b5fdaf345a0e2bf35ffed:

  Merge branch 'bcmgenet_systemport' (2014-10-10 15:39:22 -0400)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf.git master

for you to fetch changes up to 1e2d56a5d33a7e1fcd21ed3859f52596d02708b0:

  netfilter: nft_nat: dump attributes if they are set (2014-10-18 14:16:13 +0200)

----------------------------------------------------------------
Pablo Neira Ayuso (7):
      netfilter: missing module license in the nf_reject_ipvX modules
      netfilter: nf_tables: restrict nat/masq expressions to nat chain type
      netfilter: nft_compat: fix hook validation for non-base chains
      netfilter: nft_compat: validate chain type in match/target
      netfilter: nft_nat: insufficient attribute validation
      netfilter: nft_nat: NFTA_NAT_REG_ADDR_MAX depends on NFTA_NAT_REG_ADDR_MIN
      netfilter: nft_nat: dump attributes if they are set

 include/net/netfilter/nf_tables.h   |    3 ++
 include/net/netfilter/nft_masq.h    |    3 ++
 net/ipv4/netfilter/nf_reject_ipv4.c |    3 ++
 net/ipv4/netfilter/nft_masq_ipv4.c  |    1 +
 net/ipv6/netfilter/nf_reject_ipv6.c |    4 ++
 net/ipv6/netfilter/nft_masq_ipv6.c  |    1 +
 net/netfilter/nf_tables_api.c       |   14 ++++++
 net/netfilter/nft_compat.c          |   79 ++++++++++++++++++++++++++++----
 net/netfilter/nft_masq.c            |   12 +++++
 net/netfilter/nft_nat.c             |   86 ++++++++++++++++++++++-------------
 10 files changed, 165 insertions(+), 41 deletions(-)

^ permalink raw reply

* [PATCH 3/7] netfilter: nft_compat: fix hook validation for non-base chains
From: Pablo Neira Ayuso @ 2014-10-20  8:10 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev
In-Reply-To: <1413792639-3954-1-git-send-email-pablo@netfilter.org>

Set hook_mask to zero for non-base chains, otherwise people may hit
bogus errors from xt_check_target() and xt_check_match() when they
validate the uninitialized hook_mask.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
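To illustrate why a stale hook_mask matters, here is a minimal user-space
sketch. It assumes the x_tables check rejects any hook bit that the
extension does not list in its hooks field; toy_check_hooks() is a
hypothetical stand-in, not the kernel function.

```c
#include <errno.h>

/*
 * Simplified stand-in for the hook test assumed to be done by
 * xt_check_target()/xt_check_match().
 */
static int toy_check_hooks(unsigned int allowed_hooks, unsigned int hook_mask)
{
	/* Any bit set in hook_mask that is not an allowed hook is rejected. */
	if (allowed_hooks && (hook_mask & ~allowed_hooks) != 0)
		return -EINVAL;
	return 0;
}
```

With hook_mask set to zero the test cannot fail, whereas leftover stack
contents will almost always contain a bit outside allowed_hooks.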
 net/netfilter/nft_compat.c |    4 ++++
 1 file changed, 4 insertions(+)

diff --git a/net/netfilter/nft_compat.c b/net/netfilter/nft_compat.c
index 7e2683c..44ae273 100644
--- a/net/netfilter/nft_compat.c
+++ b/net/netfilter/nft_compat.c
@@ -95,6 +95,8 @@ nft_target_set_tgchk_param(struct xt_tgchk_param *par,
 		const struct nf_hook_ops *ops = &basechain->ops[0];
 
 		par->hook_mask = 1 << ops->hooknum;
+	} else {
+		par->hook_mask = 0;
 	}
 	par->family	= ctx->afi->family;
 }
@@ -293,6 +295,8 @@ nft_match_set_mtchk_param(struct xt_mtchk_param *par, const struct nft_ctx *ctx,
 		const struct nf_hook_ops *ops = &basechain->ops[0];
 
 		par->hook_mask = 1 << ops->hooknum;
+	} else {
+		par->hook_mask = 0;
 	}
 	par->family	= ctx->afi->family;
 }
-- 
1.7.10.4

^ permalink raw reply related

* [PATCH 1/7] netfilter: missing module license in the nf_reject_ipvX modules
From: Pablo Neira Ayuso @ 2014-10-20  8:10 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev
In-Reply-To: <1413792639-3954-1-git-send-email-pablo@netfilter.org>

[   23.545204] nf_reject_ipv4: module license 'unspecified' taints kernel.

Fixes: c8d7b98 ("netfilter: move nf_send_resetX() code to nf_reject_ipvX modules")
Reported-by: Dave Young <dyoung@redhat.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 net/ipv4/netfilter/nf_reject_ipv4.c |    3 +++
 net/ipv6/netfilter/nf_reject_ipv6.c |    4 ++++
 2 files changed, 7 insertions(+)

diff --git a/net/ipv4/netfilter/nf_reject_ipv4.c b/net/ipv4/netfilter/nf_reject_ipv4.c
index b023b4e..92b303d 100644
--- a/net/ipv4/netfilter/nf_reject_ipv4.c
+++ b/net/ipv4/netfilter/nf_reject_ipv4.c
@@ -6,6 +6,7 @@
  * published by the Free Software Foundation.
  */
 
+#include <linux/module.h>
 #include <net/ip.h>
 #include <net/tcp.h>
 #include <net/route.h>
@@ -125,3 +126,5 @@ void nf_send_reset(struct sk_buff *oldskb, int hook)
 	kfree_skb(nskb);
 }
 EXPORT_SYMBOL_GPL(nf_send_reset);
+
+MODULE_LICENSE("GPL");
diff --git a/net/ipv6/netfilter/nf_reject_ipv6.c b/net/ipv6/netfilter/nf_reject_ipv6.c
index 5f5f043..20d9def 100644
--- a/net/ipv6/netfilter/nf_reject_ipv6.c
+++ b/net/ipv6/netfilter/nf_reject_ipv6.c
@@ -5,6 +5,8 @@
  * it under the terms of the GNU General Public License version 2 as
  * published by the Free Software Foundation.
  */
+
+#include <linux/module.h>
 #include <net/ipv6.h>
 #include <net/ip6_route.h>
 #include <net/ip6_fib.h>
@@ -161,3 +163,5 @@ void nf_send_reset6(struct net *net, struct sk_buff *oldskb, int hook)
 		ip6_local_out(nskb);
 }
 EXPORT_SYMBOL_GPL(nf_send_reset6);
+
+MODULE_LICENSE("GPL");
-- 
1.7.10.4

^ permalink raw reply related

* [PATCH 4/7] netfilter: nft_compat: validate chain type in match/target
From: Pablo Neira Ayuso @ 2014-10-20  8:10 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev
In-Reply-To: <1413792639-3954-1-git-send-email-pablo@netfilter.org>

We have to validate the real chain type to ensure that matches/targets
are not used outside their scope (e.g. MASQUERADE in the nat chain type).
The existing validation relies on the table name, but this is not
sufficient since userspace can fool us by using the appropriate table
name with a different chain type.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
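The idea of the check can be sketched in user space as follows. The toy_
names and the flattened is_base_chain/chain_type parameters are
assumptions made for illustration; the patch itself works on struct
nft_chain. The lookup result is kept in a plain int here so that the
negative "unknown table" value can be tested.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the NFT_CHAIN_T_* values. */
enum toy_chain_type {
	TOY_CHAIN_T_DEFAULT,
	TOY_CHAIN_T_ROUTE,
	TOY_CHAIN_T_NAT,
};

static const struct {
	const char *name;
	int type;
} toy_table_to_chaintype[] = {
	{ "filter",   TOY_CHAIN_T_DEFAULT },
	{ "raw",      TOY_CHAIN_T_DEFAULT },
	{ "security", TOY_CHAIN_T_DEFAULT },
	{ "mangle",   TOY_CHAIN_T_ROUTE },
	{ "nat",      TOY_CHAIN_T_NAT },
	{ NULL,       0 },
};

/* Returns the chain type a table name implies, or -1 if unknown. */
static int toy_table_to_type(const char *table)
{
	size_t i;

	for (i = 0; toy_table_to_chaintype[i].name != NULL; i++) {
		if (strcmp(toy_table_to_chaintype[i].name, table) == 0)
			return toy_table_to_chaintype[i].type;
	}
	return -1;
}

/* is_base_chain and chain_type describe the chain the rule is added to. */
static int toy_validate_dependency(const char *tablename, int is_base_chain,
				   int chain_type)
{
	int type;

	if (!tablename || !is_base_chain)
		return 0;	/* nothing to check yet */

	type = toy_table_to_type(tablename);
	if (type < 0 || type != chain_type)
		return -EINVAL;

	return 0;
}
```

An extension restricted to the "nat" table is thus accepted on a base
chain only when that chain really is of nat type, whatever the table
itself is called.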
 net/netfilter/nft_compat.c |   75 ++++++++++++++++++++++++++++++++++++++------
 1 file changed, 66 insertions(+), 9 deletions(-)

diff --git a/net/netfilter/nft_compat.c b/net/netfilter/nft_compat.c
index 44ae273..0480f57 100644
--- a/net/netfilter/nft_compat.c
+++ b/net/netfilter/nft_compat.c
@@ -19,9 +19,52 @@
 #include <linux/netfilter/x_tables.h>
 #include <linux/netfilter_ipv4/ip_tables.h>
 #include <linux/netfilter_ipv6/ip6_tables.h>
-#include <asm/uaccess.h> /* for set_fs */
 #include <net/netfilter/nf_tables.h>
 
+static const struct {
+       const char	*name;
+       u8		type;
+} table_to_chaintype[] = {
+       { "filter",     NFT_CHAIN_T_DEFAULT },
+       { "raw",        NFT_CHAIN_T_DEFAULT },
+       { "security",   NFT_CHAIN_T_DEFAULT },
+       { "mangle",     NFT_CHAIN_T_ROUTE },
+       { "nat",        NFT_CHAIN_T_NAT },
+       { },
+};
+
+static int nft_compat_table_to_chaintype(const char *table)
+{
+	int i;
+
+	for (i = 0; table_to_chaintype[i].name != NULL; i++) {
+		if (strcmp(table_to_chaintype[i].name, table) == 0)
+			return table_to_chaintype[i].type;
+	}
+
+	return -1;
+}
+
+static int nft_compat_chain_validate_dependency(const char *tablename,
+						const struct nft_chain *chain)
+{
+	enum nft_chain_type type;
+	const struct nft_base_chain *basechain;
+
+	if (!tablename || !(chain->flags & NFT_BASE_CHAIN))
+		return 0;
+
+	type = nft_compat_table_to_chaintype(tablename);
+	if (type < 0)
+		return -EINVAL;
+
+	basechain = nft_base_chain(chain);
+	if (basechain->type->type != type)
+		return -EINVAL;
+
+	return 0;
+}
+
 union nft_entry {
 	struct ipt_entry e4;
 	struct ip6t_entry e6;
@@ -153,6 +196,10 @@ nft_target_init(const struct nft_ctx *ctx, const struct nft_expr *expr,
 	union nft_entry e = {};
 	int ret;
 
+	ret = nft_compat_chain_validate_dependency(target->table, ctx->chain);
+	if (ret < 0)
+		goto err;
+
 	target_compat_from_user(target, nla_data(tb[NFTA_TARGET_INFO]), info);
 
 	if (ctx->nla[NFTA_RULE_COMPAT]) {
@@ -218,6 +265,7 @@ static int nft_target_validate(const struct nft_ctx *ctx,
 {
 	struct xt_target *target = expr->ops->data;
 	unsigned int hook_mask = 0;
+	int ret;
 
 	if (ctx->chain->flags & NFT_BASE_CHAIN) {
 		const struct nft_base_chain *basechain =
@@ -225,11 +273,13 @@ static int nft_target_validate(const struct nft_ctx *ctx,
 		const struct nf_hook_ops *ops = &basechain->ops[0];
 
 		hook_mask = 1 << ops->hooknum;
-		if (hook_mask & target->hooks)
-			return 0;
+		if (!(hook_mask & target->hooks))
+			return -EINVAL;
 
-		/* This target is being called from an invalid chain */
-		return -EINVAL;
+		ret = nft_compat_chain_validate_dependency(target->table,
+							   ctx->chain);
+		if (ret < 0)
+			return ret;
 	}
 	return 0;
 }
@@ -324,6 +374,10 @@ nft_match_init(const struct nft_ctx *ctx, const struct nft_expr *expr,
 	union nft_entry e = {};
 	int ret;
 
+	ret = nft_compat_chain_validate_dependency(match->table, ctx->chain);
+	if (ret < 0)
+		goto err;
+
 	match_compat_from_user(match, nla_data(tb[NFTA_MATCH_INFO]), info);
 
 	if (ctx->nla[NFTA_RULE_COMPAT]) {
@@ -383,6 +437,7 @@ static int nft_match_validate(const struct nft_ctx *ctx,
 {
 	struct xt_match *match = expr->ops->data;
 	unsigned int hook_mask = 0;
+	int ret;
 
 	if (ctx->chain->flags & NFT_BASE_CHAIN) {
 		const struct nft_base_chain *basechain =
@@ -390,11 +445,13 @@ static int nft_match_validate(const struct nft_ctx *ctx,
 		const struct nf_hook_ops *ops = &basechain->ops[0];
 
 		hook_mask = 1 << ops->hooknum;
-		if (hook_mask & match->hooks)
-			return 0;
+		if (!(hook_mask & match->hooks))
+			return -EINVAL;
 
-		/* This match is being called from an invalid chain */
-		return -EINVAL;
+		ret = nft_compat_chain_validate_dependency(match->table,
+							   ctx->chain);
+		if (ret < 0)
+			return ret;
 	}
 	return 0;
 }
-- 
1.7.10.4

^ permalink raw reply related

* [PATCH 2/7] netfilter: nf_tables: restrict nat/masq expressions to nat chain type
From: Pablo Neira Ayuso @ 2014-10-20  8:10 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev
In-Reply-To: <1413792639-3954-1-git-send-email-pablo@netfilter.org>

This adds the missing validation code to avoid the use of nat/masq from
non-nat chains. The validation assumes two possible configuration
scenarios:

1) Use of nat from base chain that is not of nat type. Reject this
   configuration from the nft_*_init() path of the expression.

2) Use of nat from non-base chain. In this case, we have to wait until
   the non-base chain is referenced by at least one base chain via
   jump/goto. This is resolved from the nft_*_validate() path which is
   called from nf_tables_check_loops().

The user gets an -EOPNOTSUPP in both cases.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
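The two scenarios above can be sketched in user space like this. The
demo_ types are hypothetical stand-ins for struct nft_chain and enum
nft_chain_type; only the decision logic mirrors the helper added below.

```c
#include <errno.h>

#define DEMO_BASE_CHAIN 0x1	/* stand-in for NFT_BASE_CHAIN */

enum demo_chain_type {
	DEMO_CHAIN_T_DEFAULT,
	DEMO_CHAIN_T_ROUTE,
	DEMO_CHAIN_T_NAT,
};

struct demo_chain {
	unsigned int flags;
	enum demo_chain_type type;	/* only meaningful for base chains */
};

/*
 * Base chains are checked right away; non-base chains pass here and
 * are checked again once a base chain reaches them via jump/goto.
 */
static int demo_chain_validate_dependency(const struct demo_chain *chain,
					  enum demo_chain_type type)
{
	if ((chain->flags & DEMO_BASE_CHAIN) && chain->type != type)
		return -EOPNOTSUPP;
	return 0;
}
```

Scenario 1 corresponds to a base chain of the wrong type failing
immediately; scenario 2 corresponds to the non-base chain returning 0
until the validate path runs.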
 include/net/netfilter/nf_tables.h  |    3 +++
 include/net/netfilter/nft_masq.h   |    3 +++
 net/ipv4/netfilter/nft_masq_ipv4.c |    1 +
 net/ipv6/netfilter/nft_masq_ipv6.c |    1 +
 net/netfilter/nf_tables_api.c      |   14 ++++++++++++++
 net/netfilter/nft_masq.c           |   12 ++++++++++++
 net/netfilter/nft_nat.c            |   12 ++++++++++++
 7 files changed, 46 insertions(+)

diff --git a/include/net/netfilter/nf_tables.h b/include/net/netfilter/nf_tables.h
index 3d72923..845c596 100644
--- a/include/net/netfilter/nf_tables.h
+++ b/include/net/netfilter/nf_tables.h
@@ -530,6 +530,9 @@ enum nft_chain_type {
 	NFT_CHAIN_T_MAX
 };
 
+int nft_chain_validate_dependency(const struct nft_chain *chain,
+				  enum nft_chain_type type);
+
 struct nft_stats {
 	u64			bytes;
 	u64			pkts;
diff --git a/include/net/netfilter/nft_masq.h b/include/net/netfilter/nft_masq.h
index c72729f..e2a518b 100644
--- a/include/net/netfilter/nft_masq.h
+++ b/include/net/netfilter/nft_masq.h
@@ -13,4 +13,7 @@ int nft_masq_init(const struct nft_ctx *ctx,
 
 int nft_masq_dump(struct sk_buff *skb, const struct nft_expr *expr);
 
+int nft_masq_validate(const struct nft_ctx *ctx, const struct nft_expr *expr,
+		      const struct nft_data **data);
+
 #endif /* _NFT_MASQ_H_ */
diff --git a/net/ipv4/netfilter/nft_masq_ipv4.c b/net/ipv4/netfilter/nft_masq_ipv4.c
index 1c636d6..c1023c4 100644
--- a/net/ipv4/netfilter/nft_masq_ipv4.c
+++ b/net/ipv4/netfilter/nft_masq_ipv4.c
@@ -39,6 +39,7 @@ static const struct nft_expr_ops nft_masq_ipv4_ops = {
 	.eval		= nft_masq_ipv4_eval,
 	.init		= nft_masq_init,
 	.dump		= nft_masq_dump,
+	.validate	= nft_masq_validate,
 };
 
 static struct nft_expr_type nft_masq_ipv4_type __read_mostly = {
diff --git a/net/ipv6/netfilter/nft_masq_ipv6.c b/net/ipv6/netfilter/nft_masq_ipv6.c
index 556262f..8a7ac68 100644
--- a/net/ipv6/netfilter/nft_masq_ipv6.c
+++ b/net/ipv6/netfilter/nft_masq_ipv6.c
@@ -39,6 +39,7 @@ static const struct nft_expr_ops nft_masq_ipv6_ops = {
 	.eval		= nft_masq_ipv6_eval,
 	.init		= nft_masq_init,
 	.dump		= nft_masq_dump,
+	.validate	= nft_masq_validate,
 };
 
 static struct nft_expr_type nft_masq_ipv6_type __read_mostly = {
diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
index 556a0df..65eb2a1 100644
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -3744,6 +3744,20 @@ static const struct nfnetlink_subsystem nf_tables_subsys = {
 	.abort		= nf_tables_abort,
 };
 
+int nft_chain_validate_dependency(const struct nft_chain *chain,
+				  enum nft_chain_type type)
+{
+	const struct nft_base_chain *basechain;
+
+	if (chain->flags & NFT_BASE_CHAIN) {
+		basechain = nft_base_chain(chain);
+		if (basechain->type->type != type)
+			return -EOPNOTSUPP;
+	}
+	return 0;
+}
+EXPORT_SYMBOL_GPL(nft_chain_validate_dependency);
+
 /*
  * Loop detection - walk through the ruleset beginning at the destination chain
  * of a new jump until either the source chain is reached (loop) or all
diff --git a/net/netfilter/nft_masq.c b/net/netfilter/nft_masq.c
index 6637bab..d1ffd5e 100644
--- a/net/netfilter/nft_masq.c
+++ b/net/netfilter/nft_masq.c
@@ -26,6 +26,11 @@ int nft_masq_init(const struct nft_ctx *ctx,
 		  const struct nlattr * const tb[])
 {
 	struct nft_masq *priv = nft_expr_priv(expr);
+	int err;
+
+	err = nft_chain_validate_dependency(ctx->chain, NFT_CHAIN_T_NAT);
+	if (err < 0)
+		return err;
 
 	if (tb[NFTA_MASQ_FLAGS] == NULL)
 		return 0;
@@ -55,5 +60,12 @@ nla_put_failure:
 }
 EXPORT_SYMBOL_GPL(nft_masq_dump);
 
+int nft_masq_validate(const struct nft_ctx *ctx, const struct nft_expr *expr,
+		      const struct nft_data **data)
+{
+	return nft_chain_validate_dependency(ctx->chain, NFT_CHAIN_T_NAT);
+}
+EXPORT_SYMBOL_GPL(nft_masq_validate);
+
 MODULE_LICENSE("GPL");
 MODULE_AUTHOR("Arturo Borrero Gonzalez <arturo.borrero.glez@gmail.com>");
diff --git a/net/netfilter/nft_nat.c b/net/netfilter/nft_nat.c
index 799550b..0f0af6e 100644
--- a/net/netfilter/nft_nat.c
+++ b/net/netfilter/nft_nat.c
@@ -95,6 +95,10 @@ static int nft_nat_init(const struct nft_ctx *ctx, const struct nft_expr *expr,
 	u32 family;
 	int err;
 
+	err = nft_chain_validate_dependency(ctx->chain, NFT_CHAIN_T_NAT);
+	if (err < 0)
+		return err;
+
 	if (tb[NFTA_NAT_TYPE] == NULL)
 		return -EINVAL;
 
@@ -205,6 +209,13 @@ nla_put_failure:
 	return -1;
 }
 
+static int nft_nat_validate(const struct nft_ctx *ctx,
+			    const struct nft_expr *expr,
+			    const struct nft_data **data)
+{
+	return nft_chain_validate_dependency(ctx->chain, NFT_CHAIN_T_NAT);
+}
+
 static struct nft_expr_type nft_nat_type;
 static const struct nft_expr_ops nft_nat_ops = {
 	.type           = &nft_nat_type,
@@ -212,6 +223,7 @@ static const struct nft_expr_ops nft_nat_ops = {
 	.eval           = nft_nat_eval,
 	.init           = nft_nat_init,
 	.dump           = nft_nat_dump,
+	.validate	= nft_nat_validate,
 };
 
 static struct nft_expr_type nft_nat_type __read_mostly = {
-- 
1.7.10.4


^ permalink raw reply related

* [PATCH 5/7] netfilter: nft_nat: insufficient attribute validation
From: Pablo Neira Ayuso @ 2014-10-20  8:10 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev
In-Reply-To: <1413792639-3954-1-git-send-email-pablo@netfilter.org>

We have to validate that we get at least an NFTA_NAT_REG_ADDR_MIN or
NFTA_NAT_REG_PROTO_MIN attribute. Reject the configuration if neither
of them is present.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
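As a rough user-space illustration of the rule: the TOY_ attribute
indexes and toy_nat_check_attrs() are hypothetical, and the table of
pointers merely stands in for the parsed netlink attribute array.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical attribute indexes; not the real NFTA_NAT_* values. */
enum {
	TOY_NAT_TYPE,
	TOY_NAT_REG_ADDR_MIN,
	TOY_NAT_REG_PROTO_MIN,
	TOY_NAT_ATTR_MAX,
};

/* A NULL slot means the attribute was not sent by userspace. */
static int toy_nat_check_attrs(const void *const *tb)
{
	if (tb[TOY_NAT_TYPE] == NULL ||
	    (tb[TOY_NAT_REG_ADDR_MIN] == NULL &&
	     tb[TOY_NAT_REG_PROTO_MIN] == NULL))
		return -EINVAL;
	return 0;
}
```

A request carrying only the type is rejected, while type plus either
register attribute is accepted.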
 net/netfilter/nft_nat.c |    4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/net/netfilter/nft_nat.c b/net/netfilter/nft_nat.c
index 0f0af6e..5078f1f 100644
--- a/net/netfilter/nft_nat.c
+++ b/net/netfilter/nft_nat.c
@@ -99,7 +99,9 @@ static int nft_nat_init(const struct nft_ctx *ctx, const struct nft_expr *expr,
 	if (err < 0)
 		return err;
 
-	if (tb[NFTA_NAT_TYPE] == NULL)
+	if (tb[NFTA_NAT_TYPE] == NULL ||
+	    (tb[NFTA_NAT_REG_ADDR_MIN] == NULL &&
+	     tb[NFTA_NAT_REG_PROTO_MIN] == NULL))
 		return -EINVAL;
 
 	switch (ntohl(nla_get_be32(tb[NFTA_NAT_TYPE]))) {
-- 
1.7.10.4


^ permalink raw reply related

* [PATCH 6/7] netfilter: nft_nat: NFTA_NAT_REG_ADDR_MAX depends on NFTA_NAT_REG_ADDR_MIN
From: Pablo Neira Ayuso @ 2014-10-20  8:10 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev
In-Reply-To: <1413792639-3954-1-git-send-email-pablo@netfilter.org>

Interpret NFTA_NAT_REG_ADDR_MAX only if NFTA_NAT_REG_ADDR_MIN is present;
otherwise, skip it. Same thing with NFTA_NAT_REG_PROTO_MAX.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
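The resulting parse order can be sketched in user space as below. The
toy_ names and the has_min/has_max flags, which model attribute
presence, are assumptions for illustration only.

```c
/* Hypothetical stand-in for the register pair kept in the private data. */
struct toy_nat_range {
	unsigned int reg_min;
	unsigned int reg_max;
};

/* MAX is looked at only when MIN is present; it defaults to MIN. */
static void toy_parse_range(struct toy_nat_range *r,
			    int has_min, unsigned int min,
			    int has_max, unsigned int max)
{
	if (!has_min)
		return;		/* a lone MAX attribute is ignored */

	r->reg_min = min;
	r->reg_max = has_max ? max : min;
}
```

The same shape applies to both the address and the proto register pair.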
 net/netfilter/nft_nat.c |   50 ++++++++++++++++++++++++++---------------------
 1 file changed, 28 insertions(+), 22 deletions(-)

diff --git a/net/netfilter/nft_nat.c b/net/netfilter/nft_nat.c
index 5078f1f..a95e0c1 100644
--- a/net/netfilter/nft_nat.c
+++ b/net/netfilter/nft_nat.c
@@ -126,38 +126,44 @@ static int nft_nat_init(const struct nft_ctx *ctx, const struct nft_expr *expr,
 	priv->family = family;
 
 	if (tb[NFTA_NAT_REG_ADDR_MIN]) {
-		priv->sreg_addr_min = ntohl(nla_get_be32(
-						tb[NFTA_NAT_REG_ADDR_MIN]));
+		priv->sreg_addr_min =
+			ntohl(nla_get_be32(tb[NFTA_NAT_REG_ADDR_MIN]));
+
 		err = nft_validate_input_register(priv->sreg_addr_min);
 		if (err < 0)
 			return err;
-	}
 
-	if (tb[NFTA_NAT_REG_ADDR_MAX]) {
-		priv->sreg_addr_max = ntohl(nla_get_be32(
-						tb[NFTA_NAT_REG_ADDR_MAX]));
-		err = nft_validate_input_register(priv->sreg_addr_max);
-		if (err < 0)
-			return err;
-	} else
-		priv->sreg_addr_max = priv->sreg_addr_min;
+		if (tb[NFTA_NAT_REG_ADDR_MAX]) {
+			priv->sreg_addr_max =
+				ntohl(nla_get_be32(tb[NFTA_NAT_REG_ADDR_MAX]));
+
+			err = nft_validate_input_register(priv->sreg_addr_max);
+			if (err < 0)
+				return err;
+		} else {
+			priv->sreg_addr_max = priv->sreg_addr_min;
+		}
+	}
 
 	if (tb[NFTA_NAT_REG_PROTO_MIN]) {
-		priv->sreg_proto_min = ntohl(nla_get_be32(
-						tb[NFTA_NAT_REG_PROTO_MIN]));
+		priv->sreg_proto_min =
+			ntohl(nla_get_be32(tb[NFTA_NAT_REG_PROTO_MIN]));
+
 		err = nft_validate_input_register(priv->sreg_proto_min);
 		if (err < 0)
 			return err;
-	}
 
-	if (tb[NFTA_NAT_REG_PROTO_MAX]) {
-		priv->sreg_proto_max = ntohl(nla_get_be32(
-						tb[NFTA_NAT_REG_PROTO_MAX]));
-		err = nft_validate_input_register(priv->sreg_proto_max);
-		if (err < 0)
-			return err;
-	} else
-		priv->sreg_proto_max = priv->sreg_proto_min;
+		if (tb[NFTA_NAT_REG_PROTO_MAX]) {
+			priv->sreg_proto_max =
+				ntohl(nla_get_be32(tb[NFTA_NAT_REG_PROTO_MAX]));
+
+			err = nft_validate_input_register(priv->sreg_proto_max);
+			if (err < 0)
+				return err;
+		} else {
+			priv->sreg_proto_max = priv->sreg_proto_min;
+		}
+	}
 
 	if (tb[NFTA_NAT_FLAGS]) {
 		priv->flags = ntohl(nla_get_be32(tb[NFTA_NAT_FLAGS]));
-- 
1.7.10.4


^ permalink raw reply related

* [PATCH 7/7] netfilter: nft_nat: dump attributes if they are set
From: Pablo Neira Ayuso @ 2014-10-20  8:10 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev
In-Reply-To: <1413792639-3954-1-git-send-email-pablo@netfilter.org>

Dump NFTA_NAT_REG_ADDR_MIN and NFTA_NAT_REG_ADDR_MAX only if the address
register is set (non-zero). Same thing with NFTA_NAT_REG_PROTO_MIN and
NFTA_NAT_REG_PROTO_MAX.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 net/netfilter/nft_nat.c |   20 +++++++++++---------
 1 file changed, 11 insertions(+), 9 deletions(-)

diff --git a/net/netfilter/nft_nat.c b/net/netfilter/nft_nat.c
index a95e0c1..afe2b0b 100644
--- a/net/netfilter/nft_nat.c
+++ b/net/netfilter/nft_nat.c
@@ -191,17 +191,19 @@ static int nft_nat_dump(struct sk_buff *skb, const struct nft_expr *expr)
 
 	if (nla_put_be32(skb, NFTA_NAT_FAMILY, htonl(priv->family)))
 		goto nla_put_failure;
-	if (nla_put_be32(skb,
-			 NFTA_NAT_REG_ADDR_MIN, htonl(priv->sreg_addr_min)))
-		goto nla_put_failure;
-	if (nla_put_be32(skb,
-			 NFTA_NAT_REG_ADDR_MAX, htonl(priv->sreg_addr_max)))
-		goto nla_put_failure;
+
+	if (priv->sreg_addr_min) {
+		if (nla_put_be32(skb, NFTA_NAT_REG_ADDR_MIN,
+				 htonl(priv->sreg_addr_min)) ||
+		    nla_put_be32(skb, NFTA_NAT_REG_ADDR_MAX,
+				 htonl(priv->sreg_addr_max)))
+			goto nla_put_failure;
+	}
+
 	if (priv->sreg_proto_min) {
 		if (nla_put_be32(skb, NFTA_NAT_REG_PROTO_MIN,
-				 htonl(priv->sreg_proto_min)))
-			goto nla_put_failure;
-		if (nla_put_be32(skb, NFTA_NAT_REG_PROTO_MAX,
+				 htonl(priv->sreg_proto_min)) ||
+		    nla_put_be32(skb, NFTA_NAT_REG_PROTO_MAX,
 				 htonl(priv->sreg_proto_max)))
 			goto nla_put_failure;
 	}
-- 
1.7.10.4


^ permalink raw reply related
