Netdev List


* [PATCH net-next v3 2/2] bonding: update bonding.txt for primary description
From: Ding Tianhong @ 2014-01-18  8:28 UTC (permalink / raw)
  To: Jay Vosburgh, Veaceslav Falico, David S. Miller, Netdev

Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
---
 Documentation/networking/bonding.txt | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/Documentation/networking/bonding.txt b/Documentation/networking/bonding.txt
index a4d925e..5cdb229 100644
--- a/Documentation/networking/bonding.txt
+++ b/Documentation/networking/bonding.txt
@@ -657,7 +657,8 @@ primary
 	one slave is preferred over another, e.g., when one slave has
 	higher throughput than another.
 
-	The primary option is only valid for active-backup mode.
+	The primary option is only valid for active-backup (1),
+	balance-tlb (5) and balance-alb (6) modes.
 
 primary_reselect
 
-- 
1.8.0


* [PATCH net-next] bonding: move the netdev_add_tso_features() to bonding module
From: Ding Tianhong @ 2014-01-18  8:31 UTC (permalink / raw)
  To: Jay Vosburgh, Veaceslav Falico, Eric Dumazet, David S. Miller,
	Netdev

The function netdev_add_tso_features() is only used by bonding,
so there is no need to export it in netdevice.h; move it to the bonding module.

Cc: Eric Dumazet <edumazet@google.com>
Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
---
 drivers/net/bonding/bond_main.c | 12 +++++++++++-
 include/linux/netdevice.h       | 10 ----------
 2 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index e06c445..4cfe14e 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -1045,6 +1045,16 @@ static void bond_netpoll_cleanup(struct net_device *bond_dev)
 
 /*---------------------------------- IOCTL ----------------------------------*/
 
+/* Allow TSO being used on stacked device:
+ * Performing the GSO segmentation before last device
+ * is a performance improvement.
+ */
+static netdev_features_t bond_add_tso_features(netdev_features_t features,
+					       netdev_features_t mask)
+{
+	return netdev_increment_features(features, NETIF_F_ALL_TSO, mask);
+}
+
 static netdev_features_t bond_fix_features(struct net_device *dev,
 					   netdev_features_t features)
 {
@@ -1068,7 +1078,7 @@ static netdev_features_t bond_fix_features(struct net_device *dev,
 						     slave->dev->features,
 						     mask);
 	}
-	features = netdev_add_tso_features(features, mask);
+	features = bond_add_tso_features(features, mask);
 
 	return features;
 }
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index a2a70cc..1be74ea 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -3010,16 +3010,6 @@ static inline netdev_features_t netdev_get_wanted_features(
 netdev_features_t netdev_increment_features(netdev_features_t all,
 	netdev_features_t one, netdev_features_t mask);
 
-/* Allow TSO being used on stacked device :
- * Performing the GSO segmentation before last device
- * is a performance improvement.
- */
-static inline netdev_features_t netdev_add_tso_features(netdev_features_t features,
-							netdev_features_t mask)
-{
-	return netdev_increment_features(features, NETIF_F_ALL_TSO, mask);
-}
-
 int __netdev_update_features(struct net_device *dev);
 void netdev_update_features(struct net_device *dev);
 void netdev_change_features(struct net_device *dev);
-- 
1.8.0


* kmem_cache_alloc panic in 3.10+
From: dormando @ 2014-01-18  8:44 UTC (permalink / raw)
  To: netdev, linux-kernel

Hello again!

We've had a rare crash that's existed between 3.10.0 and 3.10.15 at least
(trying newer stables now, but I can't tell if it was fixed, and it takes
weeks to reproduce).

Unfortunately I can only get 8k back from pstore. The panic looks a bit
longer than what is caught in the log, but the bottom part is almost
always the same trace as this one:

Panic#6 Part1
<4>[1197485.199166]  [<ffffffff81611e8c>] tcp_push+0x6c/0x90
<4>[1197485.199171]  [<ffffffff816160a9>] tcp_sendmsg+0x109/0xd40
<4>[1197485.199179]  [<ffffffff81114b65>] ? put_page+0x35/0x40
<4>[1197485.199185]  [<ffffffff8163bf75>] inet_sendmsg+0x45/0xb0
<4>[1197485.199191]  [<ffffffff8159da7e>] sock_aio_write+0x11e/0x130
<4>[1197485.199196]  [<ffffffff8163b83f>] ? inet_recvmsg+0x4f/0x80
<4>[1197485.199203]  [<ffffffff811558ad>] do_sync_readv_writev+0x6d/0xa0
<4>[1197485.199209]  [<ffffffff8115722b>] do_readv_writev+0xfb/0x2f0
<4>[1197485.199215]  [<ffffffff8110fda5>] ? __free_pages+0x35/0x40
<4>[1197485.199220]  [<ffffffff8110fe56>] ? free_pages+0x46/0x50
<4>[1197485.199226]  [<ffffffff8112f9e2>] ? SyS_mincore+0x152/0x690
<4>[1197485.199231]  [<ffffffff81157468>] vfs_writev+0x48/0x60
<4>[1197485.199236]  [<ffffffff811575af>] SyS_writev+0x5f/0xd0
<4>[1197485.199243]  [<ffffffff816cf942>] system_call_fastpath+0x16/0x1b
<4>[1197485.199247] Code: 65 4c 03 04 25 c8 cb 00 00 49 8b 50 08 4d 8b 28 49 8b 40 10 4d 85 ed 0f 84 84 00 00 00 48 85 c0 74 7f 49 63 44 24 20 49 8b 3c 24 <49> 8b 5c 05 00 48 8d 4a 01 4c 89 e8 65 48 0f c7 0f 0f 94 c0 3c
<1>[1197485.199290] RIP  [<ffffffff811476da>] kmem_cache_alloc+0x5a/0x130
<4>[1197485.199296]  RSP <ffff883171211868>
<4>[1197485.199299] CR2: 0000000100000000
<4>[1197485.199343] ---[ end trace 90fee06aa40b7304 ]---
<1>[1197485.263911] BUG: unable to handle kernel paging request at 0000000100000000
<1>[1197485.263923] IP: [<ffffffff811476da>] kmem_cache_alloc+0x5a/0x130
<4>[1197485.263932] PGD 3f43e5c067 PUD 0
<4>[1197485.263937] Oops: 0000 [#5] SMP
<4>[1197485.263941] Modules linked in: ntfs vfat msdos fat macvlan bridge coretemp crc32_pclmul ghash_clmulni_intel gpio_ich microcode sb_edac edac_core lpc_ich mfd_core ixgbe igb i2c_algo_bit mdio ptp pps_core
<4>[1197485.263966] CPU: 0 PID: 233846 Comm: cache-worker Tainted: G      D      3.10.15 #1
<4>[1197485.263972] Hardware name: Supermicro X9DR3-F/X9DR3-F, BIOS 2.0a 03/07/2013
<4>[1197485.263976] task: ffff883427f9dc00 ti: ffff8830d4312000 task.ti: ffff8830d4312000
<4>[1197485.263982] RIP: 0010:[<ffffffff811476da>]  [<ffffffff811476da>] kmem_cache_alloc+0x5a/0x130
<4>[1197485.263990] RSP: 0018:ffff881fffc038c8  EFLAGS: 00010286
<4>[1197485.263994] RAX: 0000000000000000 RBX: ffffffff81c8c740 RCX: 00000000ffffffff
<4>[1197485.263999] RDX: 0000000029273024 RSI: 0000000000000020 RDI: 0000000000015680
<4>[1197485.264004] RBP: ffff881fffc03908 R08: ffff881fffc15680 R09: ffffffff815bdd4b
<4>[1197485.264009] R10: ffff881c65d21800 R11: 0000000000000000 R12: ffff881fff803800
<4>[1197485.264014] R13: 0000000100000000 R14: 00000000ffffffff R15: 0000000000000000
<4>[1197485.264019] FS:  00007f8d855eb700(0000) GS:ffff881fffc00000(0000) knlGS:0000000000000000
<4>[1197485.264024] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4>[1197485.264028] CR2: 0000000100000000 CR3: 000000308f258000 CR4: 00000000000407f0
<4>[1197485.264032] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
<4>[1197485.264037] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
<4>[1197485.264041] Stack:
<4>[1197485.264044]  ffff881fffc03928 00000020815d0d95 ffff881fffc03938 ffffffff81c8c740
<4>[1197485.264050]  ffff881fce210000 0000000000000001 00000000ffffffff 0000000000000000
<4>[1197485.264056]  ffff881fffc03958 ffffffff815bdd4b ffff881fffc039a8 0000000000000000
<4>[1197485.264063] Call Trace:
<4>[1197485.264066]  <IRQ>
<4>[1197485.264069]  [<ffffffff815bdd4b>] dst_alloc+0x5b/0x190
<4>[1197485.264080]  [<ffffffff8160068c>] rt_dst_alloc+0x4c/0x50
<4>[1197485.264085]  [<ffffffff81602a30>] __ip_route_output_key+0x270/0x880
<4>[1197485.264092]  [<ffffffff8107ee7e>] ? try_to_wake_up+0x23e/0x2b0
<4>[1197485.264097]  [<ffffffff81603067>] ip_route_output_flow+0x27/0x60
<4>[1197485.264102]  [<ffffffff8160ab8a>] ip_queue_xmit+0x36a/0x390
<4>[1197485.264108]  [<ffffffff816207c5>] tcp_transmit_skb+0x485/0x890
<4>[1197485.264113]  [<ffffffff81621aa1>] tcp_send_ack+0xf1/0x130
<4>[1197485.264118]  [<ffffffff81618d7e>] __tcp_ack_snd_check+0x5e/0xa0
<4>[1197485.264123]  [<ffffffff8161f2c2>] tcp_rcv_state_process+0x8b2/0xb20
<4>[1197485.264128]  [<ffffffff81627e61>] tcp_v4_do_rcv+0x191/0x4f0
<4>[1197485.264133]  [<ffffffff8162984c>] tcp_v4_rcv+0x5fc/0x750
<4>[1197485.264138]  [<ffffffff81604c80>] ? ip_rcv+0x350/0x350
<4>[1197485.264143]  [<ffffffff815e45cd>] ? nf_hook_slow+0x7d/0x160
<4>[1197485.264147]  [<ffffffff81604c80>] ? ip_rcv+0x350/0x350
<4>[1197485.264152]  [<ffffffff81604d4e>] ip_local_deliver_finish+0xce/0x250
<4>[1197485.264156]  [<ffffffff81604f1c>] ip_local_deliver+0x4c/0x80
<4>[1197485.264161]  [<ffffffff816045a9>] ip_rcv_finish+0x119/0x360
<4>[1197485.264165]  [<ffffffff81604b60>] ip_rcv+0x230/0x350
<4>[1197485.264170]  [<ffffffff815b89f7>] __netif_receive_skb_core+0x477/0x600
<4>[1197485.264175]  [<ffffffff815b8ba7>] __netif_receive_skb+0x27/0x70
<4>[1197485.264180]  [<ffffffff815b8ce4>] process_backlog+0xf4/0x1e0
<4>[1197485.264184]  [<ffffffff815b94e5>] net_rx_action+0xf5/0x250
<4>[1197485.264190]  [<ffffffff81053b7f>] __do_softirq+0xef/0x270
<4>[1197485.264196]  [<ffffffff816d0b7c>] call_softirq+0x1c/0x30
<4>[1197485.264199]  <EOI>
<4>[1197485.264201]  [<ffffffff81004495>] do_softirq+0x55/0x90
<4>[1197485.264209]  [<ffffffff81053a84>] local_bh_enable+0x94/0xa0
<4>[1197485.264215]  [<ffffffff8165567a>] ipt_do_table+0x22a/0x680
<4>[1197485.264221]  [<ffffffff815d39c1>] ? skb_clone_tx_timestamp+0x31/0x110
<4>[1197485.264231]  [<ffffffffa00ae840>] ? ixgbe_xmit_frame_ring+0x4c0/0xd40 [ixgbe]
<4>[1197485.264239]  [<ffffffffa00af103>] ? ixgbe_xmit_frame+0x43/0x90 [ixgbe]
<4>[1197485.264245]  [<ffffffff81657a23>] iptable_raw_hook+0x33/0x70
<4>[1197485.264252]  [<ffffffff815e43a7>] nf_iterate+0x87/0xb0
<4>[1197485.264256]  [<ffffffff81607e20>] ? ip_options_echo+0x420/0x420
<4>[1197485.264261]  [<ffffffff815e45cd>] nf_hook_slow+0x7d/0x160
<4>[1197485.264266]  [<ffffffff81607e20>] ? ip_options_echo+0x420/0x420
<4>[1197485.264270]  [<ffffffff8160a430>] __ip_local_out+0xa0/0xb0
<4>[1197485.264275]  [<ffffffff8160a456>] ip_local_out+0x16/0x30
<4>[1197485.264280]  [<ffffffff8160a97a>] ip_queue_xmit+0x15a/0x390
<4>[1197485.264286]  [<ffffffff81625e73>] ? tcp_v4_md5_lookup+0x13/0x20
<4>[1197485.264290]  [<ffffffff816207c5>] tcp_transmit_skb+0x485/0x890
<4>[1197485.264295]  [<ffffffff81622e08>] tcp_write_xmit+0x1b8/0xa50
<4>[1197485.264300]  [<ffffffff815a7e28>] ? __alloc_skb+0xa8/0x1f0
<4>[1197485.264304]  [<ffffffff816236d0>] tcp_push_one+0x30/0x40
<4>[1197485.264309]  [<ffffffff81616b84>] tcp_sendmsg+0xbe4/0xd40
<4>[1197485.264315]  [<ffffffff81114b65>] ? put_page+0x35/0x40
<4>[1197485.264321]  [<ffffffff8163bf75>] inet_sendmsg+0x45/0xb0
<4>[1197485.264326]  [<ffffffff8159da7e>] sock_aio_write+0x11e/0x130
<4>[1197485.264331]  [<ffffffff8163b83f>] ? inet_recvmsg+0x4f/0x80
<4>[1197485.264337]  [<ffffffff811558ad>] do_sync_readv_writev+0x6d/0xa0
<4>[1197485.264343]  [<ffffffff8115722b>] do_readv_writev+0xfb/0x2f0
<4>[1197485.264347]  [<ffffffff8110fda5>] ? __free_pages+0x35/0x40
<4>[1197485.264352]  [<ffffffff8110fe56>] ? free_pages+0x46/0x50
<4>[1197485.264357]  [<ffffffff8112f9e2>] ? SyS_mincore+0x152/0x690
<4>[1197485.264363]  [<ffffffff81157468>] vfs_writev+0x48/0x60
<4>[1197485.264367]  [<ffffffff811575af>] SyS_writev+0x5f/0xd0
<4>[1197485.264373]  [<ffffffff816cf942>] system_call_fastpath+0x16/0x1b
<4>[1197485.264377] Code: 65 4c 03 04 25 c8 cb 00 00 49 8b 50 08 4d 8b 28 49 8b 40 10 4d 85 ed 0f 84 84 00 00 00 48 85 c0 74 7f 49 63 44 24 20 49 8b 3c 24 <49> 8b 5c 05 00 48 8d 4a 01 4c 89 e8 65 48 0f c7 0f 0f 94 c0 3c
<1>[1197485.264417] RIP  [<ffffffff811476da>] kmem_cache_alloc+0x5a/0x130
<4>[1197485.264424]  RSP <ffff881fffc038c8>
<4>[1197485.264427] CR2: 0000000100000000
<4>[1197485.264431] ---[ end trace 90fee06aa40b7305 ]---
<0>[1197485.325141] Kernel panic - not syncing: Fatal exception in interrupt

... way down in the tcp code.

Any help would be appreciated :) I'll do what I can to help, but iterating
this particular crash is very hard due to the amount of time it takes to
reproduce. Since we have a large number of machines they're always
crashing here and there, but once they do it's not going to happen again
for a while.

Thanks!
-Dormando


* Re: [PATCH v2 net] bpf: do not use reciprocal divide
From: Heiko Carstens @ 2014-01-18 10:12 UTC (permalink / raw)
  To: David Miller
  Cc: eric.dumazet, schwidefsky, hannes, netdev, dborkman, darkjames-ws,
	mgherzan, rmk+kernel, matt
In-Reply-To: <20140117.185600.1405505573912550580.davem@davemloft.net>

On Fri, Jan 17, 2014 at 06:56:00PM -0800, David Miller wrote:
> From: Heiko Carstens <heiko.carstens@de.ibm.com>
> Date: Fri, 17 Jan 2014 09:59:16 +0100
> 
> > Could you please also apply the patch below to your tree? It would only
> > generate a merge conflict, that would need fixing, if it would sit in the
> > s390 tree.
> 
> Applied and I queued it up for -stable so I can combine it with
> Eric's original change when I submit it to -stable.

Great, thank you!


* [PATCH net-next] sch_netem: replace magic numbers with enumerate
From: Yang Yingliang @ 2014-01-18 10:13 UTC (permalink / raw)
  To: netdev; +Cc: stephen, davem

Replace the magic numbers that describe the states of the 4-state
loss generator model with an enum.

Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
---
 net/sched/sch_netem.c | 47 ++++++++++++++++++++++++++++-------------------
 1 file changed, 28 insertions(+), 19 deletions(-)

diff --git a/net/sched/sch_netem.c b/net/sched/sch_netem.c
index 3019c10..a2bfc37 100644
--- a/net/sched/sch_netem.c
+++ b/net/sched/sch_netem.c
@@ -110,6 +110,13 @@ struct netem_sched_data {
 		CLG_GILB_ELL,
 	} loss_model;
 
+	enum {
+		TX_IN_GAP_PERIOD = 1,
+		TX_IN_BURST_PERIOD,
+		LOST_IN_GAP_PERIOD,
+		LOST_IN_BURST_PERIOD,
+	} _4_state_model;
+
 	/* Correlated Loss Generation models */
 	struct clgstate {
 		/* state of the Markov chain */
@@ -205,43 +212,45 @@ static bool loss_4state(struct netem_sched_data *q)
 	 * probabilities outgoing from the current state, then decides the
 	 * next state and if the next packet has to be transmitted or lost.
 	 * The four states correspond to:
-	 *   1 => successfully transmitted packets within a gap period
-	 *   4 => isolated losses within a gap period
-	 *   3 => lost packets within a burst period
-	 *   2 => successfully transmitted packets within a burst period
+	 *   TX_IN_GAP_PERIOD => successfully transmitted packets within a gap period
+	 *   LOST_IN_BURST_PERIOD => isolated losses within a gap period
+	 *   LOST_IN_GAP_PERIOD => lost packets within a burst period
+	 *   TX_IN_BURST_PERIOD => successfully transmitted packets within a burst period
 	 */
 	switch (clg->state) {
-	case 1:
+	case TX_IN_GAP_PERIOD:
 		if (rnd < clg->a4) {
-			clg->state = 4;
+			clg->state = LOST_IN_BURST_PERIOD;
 			return true;
 		} else if (clg->a4 < rnd && rnd < clg->a1 + clg->a4) {
-			clg->state = 3;
+			clg->state = LOST_IN_GAP_PERIOD;
 			return true;
-		} else if (clg->a1 + clg->a4 < rnd)
-			clg->state = 1;
+		} else if (clg->a1 + clg->a4 < rnd) {
+			clg->state = TX_IN_GAP_PERIOD;
+		}
 
 		break;
-	case 2:
+	case TX_IN_BURST_PERIOD:
 		if (rnd < clg->a5) {
-			clg->state = 3;
+			clg->state = LOST_IN_GAP_PERIOD;
 			return true;
-		} else
-			clg->state = 2;
+		} else {
+			clg->state = TX_IN_BURST_PERIOD;
+		}
 
 		break;
-	case 3:
+	case LOST_IN_GAP_PERIOD:
 		if (rnd < clg->a3)
-			clg->state = 2;
+			clg->state = TX_IN_BURST_PERIOD;
 		else if (clg->a3 < rnd && rnd < clg->a2 + clg->a3) {
-			clg->state = 1;
+			clg->state = TX_IN_GAP_PERIOD;
 		} else if (clg->a2 + clg->a3 < rnd) {
-			clg->state = 3;
+			clg->state = LOST_IN_GAP_PERIOD;
 			return true;
 		}
 		break;
-	case 4:
-		clg->state = 1;
+	case LOST_IN_BURST_PERIOD:
+		clg->state = TX_IN_GAP_PERIOD;
 		break;
 	}
 
-- 
1.8.0
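
For readers following the state machine, here is a minimal user-space sketch of the transition logic in the hunk above. It reuses the enum names and values exactly as posted; struct clg_sketch and loss_4state_step() are hypothetical names, and the random draw is passed in as a parameter (an assumption made for testability) rather than taken from the kernel's random source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* State names and values as posted in the patch. */
enum clg_4state {
	TX_IN_GAP_PERIOD = 1,
	TX_IN_BURST_PERIOD,
	LOST_IN_GAP_PERIOD,
	LOST_IN_BURST_PERIOD,
};

/* Hypothetical stand-in for the kernel's struct clgstate. */
struct clg_sketch {
	enum clg_4state state;
	uint32_t a1, a2, a3, a4, a5;	/* transition thresholds */
};

/* One step of the chain; returns true when the packet is lost. */
static bool loss_4state_step(struct clg_sketch *clg, uint32_t rnd)
{
	assert(clg != NULL);

	switch (clg->state) {
	case TX_IN_GAP_PERIOD:
		if (rnd < clg->a4) {
			clg->state = LOST_IN_BURST_PERIOD;
			return true;
		} else if (clg->a4 < rnd && rnd < clg->a1 + clg->a4) {
			clg->state = LOST_IN_GAP_PERIOD;
			return true;
		} else if (clg->a1 + clg->a4 < rnd) {
			clg->state = TX_IN_GAP_PERIOD;
		}
		break;
	case TX_IN_BURST_PERIOD:
		if (rnd < clg->a5) {
			clg->state = LOST_IN_GAP_PERIOD;
			return true;
		}
		clg->state = TX_IN_BURST_PERIOD;
		break;
	case LOST_IN_GAP_PERIOD:
		if (rnd < clg->a3) {
			clg->state = TX_IN_BURST_PERIOD;
		} else if (clg->a3 < rnd && rnd < clg->a2 + clg->a3) {
			clg->state = TX_IN_GAP_PERIOD;
		} else if (clg->a2 + clg->a3 < rnd) {
			clg->state = LOST_IN_GAP_PERIOD;
			return true;
		}
		break;
	case LOST_IN_BURST_PERIOD:
		clg->state = TX_IN_GAP_PERIOD;
		break;
	}

	return false;
}
```

Passing rnd explicitly makes each transition reproducible, which is convenient when checking that the named states behave the same as the numeric ones they replace.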


* Re: [PATCH 1/6] cgroup: make CONFIG_NET_CLS_CGROUP and CONFIG_NETPRIO_CGROUP bool instead of tristate
From: Daniel Borkmann @ 2014-01-18 11:25 UTC (permalink / raw)
  To: Li Zefan
  Cc: Neil Horman, netdev,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, Thomas Graf, Tejun Heo,
	cgroups-u79uwXL29TY76Z2rM5mHXA, David S. Miller
In-Reply-To: <52D9D421.6040608-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>

On 01/18/2014 02:08 AM, Li Zefan wrote:
> Cc: Daniel Borkmann <dborkman-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
> 
> On 2014/1/18 2:11, Tejun Heo wrote:
>> net_cls and net_prio are the only cgroups which are allowed to be
>> built as modules.  The savings from allowing the two controllers to be
>> built as modules are tiny especially given that cgroup module support
>> itself adds quite a bit of complexity.
>>
>> The following are the sizes of vmlinux with both built as module and
>> both built as part of the kernel image with cgroup module support
>> removed.
>>
>> 	text		data		bss		dec
>> 	20292207	2411496		10784768	33488471
>> 	20293421	2412568		10784768	33490757
>>
>> The total difference is 2286 bytes.  Given that none of other
>> controllers has much chance of being made a module and that we're
>> unlikely to add new modular controllers, the added complexity is
>> simply not justifiable.
>>
>> As a first step to drop cgroup module support, this patch changes the
>> two config options to bool from tristate and drops module related code
>> from the two controllers.
>>
> 
> I suggested that Daniel do this for net_cls, and the change is already in
> net-next.
> 
> https://git.kernel.org/cgit/linux/kernel/git/davem/net-next.git/commit/?id=fe1217c4f3f7d7cbf8efdd8dd5fdc7204a1d65a8
> 
> I was planning to remove module support after that change goes into
> upstream. :)

I am fine with that, thanks Li.

>> Signed-off-by: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
>> Cc: Neil Horman <nhorman-2XuSBdqkA4R54TAoqtyWWQ@public.gmane.org>
>> Cc: Thomas Graf <tgraf-G/eBtMaohhA@public.gmane.org>
>> Cc: "David S. Miller" <davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
>> ---
>>   net/Kconfig               |  2 +-
>>   net/core/netprio_cgroup.c | 32 ++------------------------------
>>   net/sched/Kconfig         |  2 +-
>>   net/sched/cls_cgroup.c    | 23 ++---------------------
>>   4 files changed, 6 insertions(+), 53 deletions(-)
>>
> 
> The modular version of task_netprioidx() in include/net/netprio_cgroup.h
> can be removed.
> 


* [PATCH linux-next] net: batman-adv: use "__packed __aligned(2)" for each structure instead of "__packed(2)" region
From: Chen Gang @ 2014-01-18 11:31 UTC (permalink / raw)
  To: mareklindner-rVWd3aGhH2z5bpWLKbzFeg,
	sw-2YrNx6rUIHYiY0qSoAWiAoQuADTiUCJX,
	antonio-x4xJYDvStAgysxA8WJXlww
  Cc: David Miller, b.a.t.m.a.n-ZwoEplunGu2X36UT3dwllkB+6BGkLq7r,
	netdev, linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	linux-metag-u79uwXL29TY76Z2rM5mHXA, James Hogan

Unfortunately, not all compilers assume that the structures within a pack
region also need to be packed (e.g. metag), so we need to pack each
structure explicitly to satisfy all compilers.

The related error (under metag with allmodconfig):

    MODPOST 2952 modules
  ERROR: "__compiletime_assert_431" [net/batman-adv/batman-adv.ko] undefined!
  ERROR: "__compiletime_assert_432" [net/batman-adv/batman-adv.ko] undefined!
  ERROR: "__compiletime_assert_429" [net/batman-adv/batman-adv.ko] undefined!
  ERROR: "__compiletime_assert_428" [net/batman-adv/batman-adv.ko] undefined!
  ERROR: "__compiletime_assert_423" [net/batman-adv/batman-adv.ko] undefined!


Signed-off-by: Chen Gang <gang.chen.5i5j-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
---
 net/batman-adv/packet.h | 16 ++++++----------
 1 file changed, 6 insertions(+), 10 deletions(-)

diff --git a/net/batman-adv/packet.h b/net/batman-adv/packet.h
index 0a381d1..9206b48 100644
--- a/net/batman-adv/packet.h
+++ b/net/batman-adv/packet.h
@@ -154,7 +154,6 @@ enum batadv_tvlv_type {
 	BATADV_TVLV_ROAM	= 0x05,
 };
 
-#pragma pack(2)
 /* the destination hardware field in the ARP frame is used to
  * transport the claim type and the group id
  */
@@ -162,8 +161,7 @@ struct batadv_bla_claim_dst {
 	uint8_t magic[3];	/* FF:43:05 */
 	uint8_t type;		/* bla_claimframe */
 	__be16 group;		/* group id */
-};
-#pragma pack()
+} __packed __aligned(2);
 
 /**
  * struct batadv_ogm_packet - ogm (routing protocol) packet
@@ -281,7 +279,6 @@ struct batadv_icmp_packet_rr {
  * misalignment of the payload after the ethernet header. It may also lead to
  * leakage of information when the padding it not initialized before sending.
  */
-#pragma pack(2)
 
 /**
  * struct batadv_unicast_packet - unicast packet for network payload
@@ -300,7 +297,7 @@ struct batadv_unicast_packet {
 	/* "4 bytes boundary + 2 bytes" long to make the payload after the
 	 * following ethernet header again 4 bytes boundary aligned
 	 */
-};
+}  __packed __aligned(2);
 
 /**
  * struct batadv_unicast_4addr_packet - extended unicast packet
@@ -316,7 +313,7 @@ struct batadv_unicast_4addr_packet {
 	/* "4 bytes boundary + 2 bytes" long to make the payload after the
 	 * following ethernet header again 4 bytes boundary aligned
 	 */
-};
+}  __packed __aligned(2);
 
 /**
  * struct batadv_frag_packet - fragmented packet
@@ -347,7 +344,7 @@ struct batadv_frag_packet {
 	uint8_t orig[ETH_ALEN];
 	__be16  seqno;
 	__be16  total_size;
-};
+}  __packed __aligned(2);
 
 /**
  * struct batadv_bcast_packet - broadcast packet for network payload
@@ -368,7 +365,7 @@ struct batadv_bcast_packet {
 	/* "4 bytes boundary + 2 bytes" long to make the payload after the
 	 * following ethernet header again 4 bytes boundary aligned
 	 */
-};
+}  __packed __aligned(2);
 
 /**
  * struct batadv_coded_packet - network coded packet
@@ -404,9 +401,8 @@ struct batadv_coded_packet {
 	uint8_t  second_orig_dest[ETH_ALEN];
 	__be32   second_crc;
 	__be16   coded_len;
-};
+}  __packed __aligned(2);
 
-#pragma pack()
 
 /**
  * struct batadv_unicast_tvlv - generic unicast packet with tvlv payload
-- 
1.7.11.7


* Re: [PATCH 3/7] staging,spear_adc: Add dependency on HAS_IOMEM
From: Jonathan Cameron @ 2014-01-18 11:41 UTC (permalink / raw)
  To: Richard Weinberger, kishon-l0cyMroinI0,
	anton-9xeibp6oKSgdnm+yROfE0A, dwmw2-wEGCiKHe2LqWVfeAwA7xHQ,
	richardcochran-Re5JQEeQqe8AvxtiuMwx3w,
	lidza.louina-Re5JQEeQqe8AvxtiuMwx3w,
	gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r,
	davem-fT/PcQaiUtIeIZ0/mPfg9Q
  Cc: sebastian.hesselbarth-Re5JQEeQqe8AvxtiuMwx3w,
	florian-p3rKhJxN3npAfugRpC6u6w,
	thomas.petazzoni-wi1+55ScJUtKEb57/3fJTNBPR1lH4CV8,
	lars-Qo5EllUWu/uELgA04lAiVw, marex-ynQEQJNshbs,
	acourbot-DDmLM1+adcrQT0dZR+AlfA, netdev-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	driverdev-devel-tBiZLqfeLfOHmIFyCCdPziST3g8Odh+X,
	devel-gWbeCf7V1WCQmaza687I9mD2FQJk+8+b,
	linux-iio-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <1389714345-20165-3-git-send-email-richard-/L3Ra7n9ekc@public.gmane.org>

On 14/01/14 15:45, Richard Weinberger wrote:
> On archs like S390 or um this driver cannot build nor work.
> Make it depend on HAS_IOMEM to bypass build failures.
>
> drivers/staging/iio/adc/spear_adc.c: In function ‘spear_adc_probe’:
> drivers/staging/iio/adc/spear_adc.c:393:2: error: implicit declaration of function ‘iounmap’ [-Werror=implicit-function-declaration
>
> Signed-off-by: Richard Weinberger <richard-/L3Ra7n9ekc@public.gmane.org>
Applied to the fixes-togreg branch of iio.git

Thanks,
> ---
>   drivers/staging/iio/adc/Kconfig | 1 +
>   1 file changed, 1 insertion(+)
>
> diff --git a/drivers/staging/iio/adc/Kconfig b/drivers/staging/iio/adc/Kconfig
> index e3d6430..7d5d675 100644
> --- a/drivers/staging/iio/adc/Kconfig
> +++ b/drivers/staging/iio/adc/Kconfig
> @@ -128,6 +128,7 @@ config MXS_LRADC
>   config SPEAR_ADC
>   	tristate "ST SPEAr ADC"
>   	depends on PLAT_SPEAR || COMPILE_TEST
> +	depends on HAS_IOMEM
>   	help
>   	  Say yes here to build support for the integrated ADC inside the
>   	  ST SPEAr SoC. Provides direct access via sysfs.
>


* Re: [PATCH 7/7] staging,lpc32xx_adc: Add dependency on HAS_IOMEM
From: Jonathan Cameron @ 2014-01-18 11:50 UTC (permalink / raw)
  To: Richard Weinberger, kishon, anton, dwmw2, richardcochran,
	lidza.louina, gregkh, davem
  Cc: marex, lars, linux-iio, netdev, driverdev-devel, linux-kernel,
	acourbot, devel, florian, sebastian.hesselbarth
In-Reply-To: <1389714345-20165-7-git-send-email-richard@nod.at>



On 14/01/14 15:45, Richard Weinberger wrote:
> On archs like S390 or um this driver cannot build nor work.
> Make it depend on HAS_IOMEM to bypass build failures.
>
> drivers/built-in.o: In function `lpc32xx_adc_probe':
> drivers/staging/iio/adc/lpc32xx_adc.c:149: undefined reference to `devm_ioremap'
>
> Signed-off-by: Richard Weinberger <richard@nod.at>
applied to the fixes-togreg branch of iio.git

Thanks,
> ---
>   drivers/staging/iio/adc/Kconfig | 1 +
>   1 file changed, 1 insertion(+)
>
> diff --git a/drivers/staging/iio/adc/Kconfig b/drivers/staging/iio/adc/Kconfig
> index 7d5d675..3633298 100644
> --- a/drivers/staging/iio/adc/Kconfig
> +++ b/drivers/staging/iio/adc/Kconfig
> @@ -103,6 +103,7 @@ config AD7280
>   config LPC32XX_ADC
>   	tristate "NXP LPC32XX ADC"
>   	depends on ARCH_LPC32XX || COMPILE_TEST
> +	depends on HAS_IOMEM
>   	help
>   	  Say yes here to build support for the integrated ADC inside the
>   	  LPC32XX SoC. Note that this feature uses the same hardware as the
>


* Re: [PATCH net-next] bonding: move the netdev_add_tso_features() to bonding module
From: Veaceslav Falico @ 2014-01-18 11:48 UTC (permalink / raw)
  To: Ding Tianhong; +Cc: Jay Vosburgh, Eric Dumazet, David S. Miller, Netdev
In-Reply-To: <52DA3BE5.5020500@huawei.com>

On Sat, Jan 18, 2014 at 04:31:33PM +0800, Ding Tianhong wrote:
>The function netdev_add_tso_features() is only used by bonding,
>so there is no need to export it in netdevice.h; move it to the bonding module.

Eric added it for a reason - like, other drivers might use it. Do you know
if team, bridge, vlan etc. might use it?

Thanks.

>
>Cc: Eric Dumazet <edumazet@google.com>
>Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
>---
> drivers/net/bonding/bond_main.c | 12 +++++++++++-
> include/linux/netdevice.h       | 10 ----------
> 2 files changed, 11 insertions(+), 11 deletions(-)
>
>diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
>index e06c445..4cfe14e 100644
>--- a/drivers/net/bonding/bond_main.c
>+++ b/drivers/net/bonding/bond_main.c
>@@ -1045,6 +1045,16 @@ static void bond_netpoll_cleanup(struct net_device *bond_dev)
>
> /*---------------------------------- IOCTL ----------------------------------*/
>
>+/* Allow TSO being used on stacked device:
>+ * Performing the GSO segmentation before last device
>+ * is a performance improvement.
>+ */
>+static netdev_features_t bond_add_tso_features(netdev_features_t features,
>+					       netdev_features_t mask)
>+{
>+	return netdev_increment_features(features, NETIF_F_ALL_TSO, mask);
>+}
>+
> static netdev_features_t bond_fix_features(struct net_device *dev,
> 					   netdev_features_t features)
> {
>@@ -1068,7 +1078,7 @@ static netdev_features_t bond_fix_features(struct net_device *dev,
> 						     slave->dev->features,
> 						     mask);
> 	}
>-	features = netdev_add_tso_features(features, mask);
>+	features = bond_add_tso_features(features, mask);
>
> 	return features;
> }
>diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
>index a2a70cc..1be74ea 100644
>--- a/include/linux/netdevice.h
>+++ b/include/linux/netdevice.h
>@@ -3010,16 +3010,6 @@ static inline netdev_features_t netdev_get_wanted_features(
> netdev_features_t netdev_increment_features(netdev_features_t all,
> 	netdev_features_t one, netdev_features_t mask);
>
>-/* Allow TSO being used on stacked device :
>- * Performing the GSO segmentation before last device
>- * is a performance improvement.
>- */
>-static inline netdev_features_t netdev_add_tso_features(netdev_features_t features,
>-							netdev_features_t mask)
>-{
>-	return netdev_increment_features(features, NETIF_F_ALL_TSO, mask);
>-}
>-
> int __netdev_update_features(struct net_device *dev);
> void netdev_update_features(struct net_device *dev);
> void netdev_change_features(struct net_device *dev);
>-- 
>1.8.0
>
>


* Re: [PATCH net-next] net: add build-time checks for msg->msg_name size
From: Hannes Frederic Sowa @ 2014-01-18 12:05 UTC (permalink / raw)
  To: Steffen Hurrle; +Cc: netdev
In-Reply-To: <20140117215314.GC7562@noise.didjital.de>

On Fri, Jan 17, 2014 at 10:53:15PM +0100, Steffen Hurrle wrote:
> This is a follow-up patch to f3d3342602f8bc ("net: rework recvmsg
> handler msg_name and msg_namelen logic").
> 
> DECLARE_SOCKADDR validates that the structure we use for writing the
> name information to is not larger than the buffer which is reserved
> for msg->msg_name (which is 128 bytes). Also use DECLARE_SOCKADDR
> consistently in sendmsg code paths.
> 
> Signed-off-by: Steffen Hurrle <steffen@hurrle.net>
> Suggested-by: Hannes Frederic Sowa <hannes@stressinduktion.org>

Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>

Thanks!
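
The build-time check described in the quoted text can be sketched roughly as follows. This is a user-space illustration only: DECLARE_SOCKADDR_SKETCH, MSG_NAME_BUF_SIZE and port_from_name() are hypothetical names, the kernel's actual DECLARE_SOCKADDR definition is not reproduced here, and the 128-byte limit is taken from the patch description.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Size reserved for msg->msg_name, per the patch description. */
#define MSG_NAME_BUF_SIZE 128

/*
 * Declare `dst` as a typed pointer to `src`, and fail the build if the
 * pointed-to address structure could not fit in the name buffer.
 */
#define DECLARE_SOCKADDR_SKETCH(type, dst, src)				\
	_Static_assert(sizeof(*(type)0) <= MSG_NAME_BUF_SIZE,		\
		       "address type larger than msg_name buffer");	\
	type dst = (type)(src)

/* Example user: read the port from a name buffer holding a sockaddr_in. */
static unsigned short port_from_name(void *msg_name)
{
	DECLARE_SOCKADDR_SKETCH(struct sockaddr_in *, sin, msg_name);

	return ntohs(sin->sin_port);
}
```

With this shape, declaring a pointer to an address structure larger than 128 bytes stops the compile instead of overrunning the buffer at run time.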


* Re: [PATCH net-next] ipcomp: Convert struct xt_ipcomp spis into 16bits
From: Pablo Neira Ayuso @ 2014-01-18 12:24 UTC (permalink / raw)
  To: Fan Du; +Cc: steffen.klassert, davem, netdev, netfilter-devel
In-Reply-To: <1390011374-21760-1-git-send-email-fan.du@windriver.com>

On Sat, Jan 18, 2014 at 10:16:14AM +0800, Fan Du wrote:
> sparse warnings: (new ones prefixed by >>)
> 
> >> >> net/netfilter/xt_ipcomp.c:63:26: sparse: restricted __be16 degrades to integer
> >> >> net/netfilter/xt_ipcomp.c:63:26: sparse: cast to restricted __be32
> 
> Fix this by using 16bits long spi, as IPcomp CPI is only valid for 16bits.
> 
> Signed-off-by: Fan Du <fan.du@windriver.com>
> ---
>  include/uapi/linux/netfilter/xt_ipcomp.h |    2 +-
>  net/netfilter/xt_ipcomp.c                |    4 ++--
>  2 files changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/include/uapi/linux/netfilter/xt_ipcomp.h b/include/uapi/linux/netfilter/xt_ipcomp.h
> index 45c7e40..ca82ebb 100644
> --- a/include/uapi/linux/netfilter/xt_ipcomp.h
> +++ b/include/uapi/linux/netfilter/xt_ipcomp.h
> @@ -4,7 +4,7 @@
>  #include <linux/types.h>
>  
>  struct xt_ipcomp {
> -	__u32 spis[2];	/* Security Parameter Index */
> +	__u16 spis[2];	/* Security Parameter Index */

This changes the binary interface, so it breaks userspace (iptables
needs to be recompiled). We're still in time to make such a change, as
this is net-next material, but what I understand from the patch
description is that this aims to fix a sparse warning, which makes it a
rather intrusive change.

Didn't you find any way to fix this without changing the layout of
xt_ipcomp?
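
One possible direction, sketched in user-space C rather than kernel code: keep the 32-bit spis[] layout and convert the 16-bit CPI with ntohs() before comparing. The body of spi_match() below is a plain inclusive range check, assumed from the "Returns 1 if the spi is matched by the range" comment; cpi_match() and its parameters are hypothetical names used only for illustration, and nothing here is claimed to be the fix that should be applied.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Inclusive range check, optionally inverted (assumed semantics). */
static bool spi_match(uint32_t min, uint32_t max, uint32_t spi, bool invert)
{
	bool in_range = spi >= min && spi <= max;

	return in_range != invert;
}

/*
 * cpi_be is the 16-bit CPI in network byte order, as read from the
 * header. ntohs() yields a host-order 16-bit value that widens to
 * 32 bits, so the 32-bit bounds in spis[] can stay as they are.
 */
static bool cpi_match(uint16_t cpi_be, const uint32_t spis[2], bool invert)
{
	return spi_match(spis[0], spis[1], ntohs(cpi_be), invert);
}
```

Because the conversion happens on the 16-bit value itself, the result does not depend on host endianness, and the userspace-visible structure is untouched.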

>  	__u8 invflags;	/* Inverse flags */
>  	__u8 hdrres;	/* Test of the Reserved Filed */
>  };
> diff --git a/net/netfilter/xt_ipcomp.c b/net/netfilter/xt_ipcomp.c
> index a4c7561..5542cb2 100644
> --- a/net/netfilter/xt_ipcomp.c
> +++ b/net/netfilter/xt_ipcomp.c
> @@ -29,7 +29,7 @@ MODULE_DESCRIPTION("Xtables: IPv4/6 IPsec-IPComp SPI match");
>  
>  /* Returns 1 if the spi is matched by the range, 0 otherwise */
>  static inline bool
> -spi_match(u_int32_t min, u_int32_t max, u_int32_t spi, bool invert)
> +spi_match(u_int16_t min, u_int16_t max, u_int16_t spi, bool invert)
>  {
>  	bool r;
>  	pr_debug("spi_match:%c 0x%x <= 0x%x <= 0x%x\n",
> @@ -60,7 +60,7 @@ static bool comp_mt(const struct sk_buff *skb, struct xt_action_param *par)
>  	}
>  
>  	return spi_match(compinfo->spis[0], compinfo->spis[1],
> -			 ntohl(chdr->cpi << 16),
> +			 ntohl(chdr->cpi),
>  			 !!(compinfo->invflags & XT_IPCOMP_INV_SPI));
>  }
>  
> -- 
> 1.7.9.5
> 

* Re: [PATCH linux-next] net: batman-adv: use "__packed __aligned(2)" for each structure instead of "__packed(2)" region
From: Antonio Quartulli @ 2014-01-18 13:03 UTC (permalink / raw)
  To: Chen Gang, David Miller
  Cc: James Hogan, mareklindner-rVWd3aGhH2z5bpWLKbzFeg, netdev,
	b.a.t.m.a.n-ZwoEplunGu2X36UT3dwllkB+6BGkLq7r,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	linux-metag-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <52DA65F4.5070501-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>

On 18/01/14 12:31, Chen Gang wrote:
> Unfortunately, not all compilers assumes the structures within a pack
> region also need be packed (e.g. metag), so need add a pack explicitly
> to satisfy all compilers.
> 
> The related error (under metag with allmodconfig):
> 
>     MODPOST 2952 modules
>   ERROR: "__compiletime_assert_431" [net/batman-adv/batman-adv.ko] undefined!
>   ERROR: "__compiletime_assert_432" [net/batman-adv/batman-adv.ko] undefined!
>   ERROR: "__compiletime_assert_429" [net/batman-adv/batman-adv.ko] undefined!
>   ERROR: "__compiletime_assert_428" [net/batman-adv/batman-adv.ko] undefined!
>   ERROR: "__compiletime_assert_423" [net/batman-adv/batman-adv.ko] undefined!
> 
> 
> Signed-off-by: Chen Gang <gang.chen.5i5j-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>

David, what do you think about this change?


Can "__packed __aligned(2)" generate a different structure padding than
"#pragma pack(2)" ?

I am not really sure about the difference between the two. But if we
have the possibility that the padding may change then this patch should
go into net, otherwise we will have a protocol compatibility problem
between 3.13 and 3.14.
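
A quick way to probe this question, as a sketch under the assumption of
GCC or Clang on a common target (the metag toolchain may behave
differently): build both spellings side by side and compare sizes and
offsets. The claim_dst_* structs mirror batadv_bla_claim_dst from the
patch; the odd_* structs are hypothetical and exist only to show a case
where the two spellings disagree.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Layout 1: the pack region that the patch removes. */
#pragma pack(2)
struct claim_dst_pragma {
	uint8_t  magic[3];
	uint8_t  type;
	uint16_t group;
};
struct odd_pragma {		/* hypothetical, not a batman-adv struct */
	uint8_t  a;
	uint32_t b;
	uint16_t c;
};
#pragma pack()

/* Layout 2: per-structure attributes, i.e. __packed __aligned(2). */
struct claim_dst_attr {
	uint8_t  magic[3];
	uint8_t  type;
	uint16_t group;
} __attribute__((packed, aligned(2)));
struct odd_attr {		/* hypothetical, not a batman-adv struct */
	uint8_t  a;
	uint32_t b;
	uint16_t c;
} __attribute__((packed, aligned(2)));

/* Every field already sits on a 2-byte boundary: both spellings agree. */
_Static_assert(sizeof(struct claim_dst_pragma) == sizeof(struct claim_dst_attr),
	       "claim_dst size differs");
_Static_assert(offsetof(struct claim_dst_pragma, group) ==
	       offsetof(struct claim_dst_attr, group),
	       "claim_dst group offset differs");

/* pack(2) caps member alignment at 2; packed lowers it to 1. */
_Static_assert(offsetof(struct odd_pragma, b) == 2, "pragma: b at offset 2");
_Static_assert(offsetof(struct odd_attr, b) == 1, "attribute: b at offset 1");
```

So the two can differ in general, but only when a member wider than one
byte would land on an odd offset. Whether any structure in packet.h is
in that situation would still need to be checked field by field.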


Cheers,

> ---
>  net/batman-adv/packet.h | 16 ++++++----------
>  1 file changed, 6 insertions(+), 10 deletions(-)
> 
> diff --git a/net/batman-adv/packet.h b/net/batman-adv/packet.h
> index 0a381d1..9206b48 100644
> --- a/net/batman-adv/packet.h
> +++ b/net/batman-adv/packet.h
> @@ -154,7 +154,6 @@ enum batadv_tvlv_type {
>  	BATADV_TVLV_ROAM	= 0x05,
>  };
>  
> -#pragma pack(2)
>  /* the destination hardware field in the ARP frame is used to
>   * transport the claim type and the group id
>   */
> @@ -162,8 +161,7 @@ struct batadv_bla_claim_dst {
>  	uint8_t magic[3];	/* FF:43:05 */
>  	uint8_t type;		/* bla_claimframe */
>  	__be16 group;		/* group id */
> -};
> -#pragma pack()
> +} __packed __aligned(2);
>  
>  /**
>   * struct batadv_ogm_packet - ogm (routing protocol) packet
> @@ -281,7 +279,6 @@ struct batadv_icmp_packet_rr {
>   * misalignment of the payload after the ethernet header. It may also lead to
>   * leakage of information when the padding it not initialized before sending.
>   */
> -#pragma pack(2)
>  
>  /**
>   * struct batadv_unicast_packet - unicast packet for network payload
> @@ -300,7 +297,7 @@ struct batadv_unicast_packet {
>  	/* "4 bytes boundary + 2 bytes" long to make the payload after the
>  	 * following ethernet header again 4 bytes boundary aligned
>  	 */
> -};
> +}  __packed __aligned(2);
>  
>  /**
>   * struct batadv_unicast_4addr_packet - extended unicast packet
> @@ -316,7 +313,7 @@ struct batadv_unicast_4addr_packet {
>  	/* "4 bytes boundary + 2 bytes" long to make the payload after the
>  	 * following ethernet header again 4 bytes boundary aligned
>  	 */
> -};
> +}  __packed __aligned(2);
>  
>  /**
>   * struct batadv_frag_packet - fragmented packet
> @@ -347,7 +344,7 @@ struct batadv_frag_packet {
>  	uint8_t orig[ETH_ALEN];
>  	__be16  seqno;
>  	__be16  total_size;
> -};
> +}  __packed __aligned(2);
>  
>  /**
>   * struct batadv_bcast_packet - broadcast packet for network payload
> @@ -368,7 +365,7 @@ struct batadv_bcast_packet {
>  	/* "4 bytes boundary + 2 bytes" long to make the payload after the
>  	 * following ethernet header again 4 bytes boundary aligned
>  	 */
> -};
> +}  __packed __aligned(2);
>  
>  /**
>   * struct batadv_coded_packet - network coded packet
> @@ -404,9 +401,8 @@ struct batadv_coded_packet {
>  	uint8_t  second_orig_dest[ETH_ALEN];
>  	__be32   second_crc;
>  	__be16   coded_len;
> -};
> +}  __packed __aligned(2);
>  
> -#pragma pack()
>  
>  /**
>   * struct batadv_unicast_tvlv - generic unicast packet with tvlv payload
> 


-- 
Antonio Quartulli


* Re: [RFC PATCH net-next 3/3] virtio-net: Add accelerated RFS support
From: Ben Hutchings @ 2014-01-18 14:19 UTC (permalink / raw)
  To: Tom Herbert
  Cc: Zhi Yong Wu, Stefan Hajnoczi, Linux Netdev List, Eric Dumazet,
	David S. Miller, Zhi Yong Wu
In-Reply-To: <CA+mtBx8eRuWpYkYoPbuCaO1h0Y+g96zJB96zP17ZixOwZ1_gmQ@mail.gmail.com>

On Fri, 2014-01-17 at 20:59 -0800, Tom Herbert wrote:
> Ben,
> 
> I've never quite understood why flow management in aRFS has to be done
> with separate messages, and if I recall this seems to mitigate
> performance gains to a large extent. It seems like we should be able
> to piggyback on a TX descriptor for a connection information about the
> RX side for that connection, namely the rxhash and queue mapping.
> State creation should be implicit by just seeing a new rxhash value,
> tear down might be accomplished with a separate flag on the final TX
> packet on the connection (this would need some additional logic in the
> stack). Is this method not feasible in either NICs or virtio-net?

Well that's roughly how Flow Director works, isn't it?  So it is
feasible on at least one NIC!  It might be possible to implement
something like that in firmware on the SFC9100 (with the filter based on
the following packet headers, not a hash), but I don't know.  As for
other vendors - I have no idea.

Inserting filters from the receive path seemed like a natural extension
of the software RFS implementation.  And it means that the hardware
filters are inserted a little earlier (no need to transmit another
packet), but maybe that doesn't matter much.

Ben.

-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.

* [PATCH] net: remove unnecessary initializations in net_dev_init
From: Sabrina Dubroca @ 2014-01-18 15:04 UTC (permalink / raw)
  To: davem; +Cc: netdev, Sabrina Dubroca

softnet_data is set to 0 by memset, no need to initialize specific
fields to 0 or NULL afterwards.

Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
---
 net/core/dev.c | 9 ---------
 1 file changed, 9 deletions(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index 288df62..b57b44a2 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -7000,25 +7000,16 @@ static int __init net_dev_init(void)
 		memset(sd, 0, sizeof(*sd));
 		skb_queue_head_init(&sd->input_pkt_queue);
 		skb_queue_head_init(&sd->process_queue);
-		sd->completion_queue = NULL;
 		INIT_LIST_HEAD(&sd->poll_list);
-		sd->output_queue = NULL;
 		sd->output_queue_tailp = &sd->output_queue;
 #ifdef CONFIG_RPS
 		sd->csd.func = rps_trigger_softirq;
 		sd->csd.info = sd;
-		sd->csd.flags = 0;
 		sd->cpu = i;
 #endif
 
 		sd->backlog.poll = process_backlog;
 		sd->backlog.weight = weight_p;
-		sd->backlog.gro_list = NULL;
-		sd->backlog.gro_count = 0;
-
-#ifdef CONFIG_NET_FLOW_LIMIT
-		sd->flow_limit = NULL;
-#endif
 	}
 
 	dev_boot_phase = 0;
-- 
1.8.5.3

* Re: kmem_cache_alloc panic in 3.10+
From: Eric Dumazet @ 2014-01-18 16:29 UTC (permalink / raw)
  To: dormando; +Cc: netdev, linux-kernel, Alexei Starovoitov
In-Reply-To: <alpine.DEB.2.10.1401180036020.18419@dinf>

On Sat, 2014-01-18 at 00:44 -0800, dormando wrote:
> Hello again!
> 
> We've had a rare crash that's existed between 3.10.0 and 3.10.15 at least
> (trying newer stables now, but I can't tell if it was fixed, and it takes
> weeks to reproduce).
> 
> Unfortunately I can only get 8k back from pstore. The panic looks a bit
> longer than that is caught in the log, but the bottom part is almost
> always this same trace as this one:
> 
> Panic#6 Part1
> <4>[1197485.199166]  [<ffffffff81611e8c>] tcp_push+0x6c/0x90
> <4>[1197485.199171]  [<ffffffff816160a9>] tcp_sendmsg+0x109/0xd40
> <4>[1197485.199179]  [<ffffffff81114b65>] ? put_page+0x35/0x40
> <4>[1197485.199185]  [<ffffffff8163bf75>] inet_sendmsg+0x45/0xb0
> <4>[1197485.199191]  [<ffffffff8159da7e>] sock_aio_write+0x11e/0x130
> <4>[1197485.199196]  [<ffffffff8163b83f>] ? inet_recvmsg+0x4f/0x80
> <4>[1197485.199203]  [<ffffffff811558ad>] do_sync_readv_writev+0x6d/0xa0
> <4>[1197485.199209]  [<ffffffff8115722b>] do_readv_writev+0xfb/0x2f0
> <4>[1197485.199215]  [<ffffffff8110fda5>] ? __free_pages+0x35/0x40
> <4>[1197485.199220]  [<ffffffff8110fe56>] ? free_pages+0x46/0x50
> <4>[1197485.199226]  [<ffffffff8112f9e2>] ? SyS_mincore+0x152/0x690
> <4>[1197485.199231]  [<ffffffff81157468>] vfs_writev+0x48/0x60
> <4>[1197485.199236]  [<ffffffff811575af>] SyS_writev+0x5f/0xd0
> <4>[1197485.199243]  [<ffffffff816cf942>] system_call_fastpath+0x16/0x1b
> <4>[1197485.199247] Code: 65 4c 03 04 25 c8 cb 00 00 49 8b 50 08 4d 8b 28 49 8b 40 10 4d 85 ed 0f 84 84 00 00 00 48 85 c0 74 7f 49 63 44 24 20 49 8b 3c 24 <49> 8b 5c 05 00 48 8d 4a 01 4c 89 e8 65 48 0f c7 0f 0f 94 c0 3c
> <1>[1197485.199290] RIP  [<ffffffff811476da>] kmem_cache_alloc+0x5a/0x130
> <4>[1197485.199296]  RSP <ffff883171211868>
> <4>[1197485.199299] CR2: 0000000100000000
> <4>[1197485.199343] ---[ end trace 90fee06aa40b7304 ]---
> <1>[1197485.263911] BUG: unable to handle kernel paging request at 0000000100000000
> <1>[1197485.263923] IP: [<ffffffff811476da>] kmem_cache_alloc+0x5a/0x130
> <4>[1197485.263932] PGD 3f43e5c067 PUD 0
> <4>[1197485.263937] Oops: 0000 [#5] SMP
> <4>[1197485.263941] Modules linked in: ntfs vfat msdos fat macvlan bridge coretemp crc32_pclmul ghash_clmulni_intel gpio_ich microcode sb_edac edac_core lpc_ich mfd_core ixgbe igb i2c_algo_bit mdio ptp pps_core
> <4>[1197485.263966] CPU: 0 PID: 233846 Comm: cache-worker Tainted: G      D      3.10.15 #1
> <4>[1197485.263972] Hardware name: Supermicro X9DR3-F/X9DR3-F, BIOS 2.0a 03/07/2013
> <4>[1197485.263976] task: ffff883427f9dc00 ti: ffff8830d4312000 task.ti: ffff8830d4312000
> <4>[1197485.263982] RIP: 0010:[<ffffffff811476da>]  [<ffffffff811476da>] kmem_cache_alloc+0x5a/0x130
> <4>[1197485.263990] RSP: 0018:ffff881fffc038c8  EFLAGS: 00010286
> <4>[1197485.263994] RAX: 0000000000000000 RBX: ffffffff81c8c740 RCX: 00000000ffffffff
> <4>[1197485.263999] RDX: 0000000029273024 RSI: 0000000000000020 RDI: 0000000000015680
> <4>[1197485.264004] RBP: ffff881fffc03908 R08: ffff881fffc15680 R09: ffffffff815bdd4b
> <4>[1197485.264009] R10: ffff881c65d21800 R11: 0000000000000000 R12: ffff881fff803800
> <4>[1197485.264014] R13: 0000000100000000 R14: 00000000ffffffff R15: 0000000000000000
> <4>[1197485.264019] FS:  00007f8d855eb700(0000) GS:ffff881fffc00000(0000) knlGS:0000000000000000
> <4>[1197485.264024] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> <4>[1197485.264028] CR2: 0000000100000000 CR3: 000000308f258000 CR4: 00000000000407f0
> <4>[1197485.264032] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> <4>[1197485.264037] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> <4>[1197485.264041] Stack:
> <4>[1197485.264044]  ffff881fffc03928 00000020815d0d95 ffff881fffc03938 ffffffff81c8c740
> <4>[1197485.264050]  ffff881fce210000 0000000000000001 00000000ffffffff 0000000000000000
> <4>[1197485.264056]  ffff881fffc03958 ffffffff815bdd4b ffff881fffc039a8 0000000000000000
> <4>[1197485.264063] Call Trace:
> <4>[1197485.264066]  <IRQ>
> <4>[1197485.264069]  [<ffffffff815bdd4b>] dst_alloc+0x5b/0x190
> <4>[1197485.264080]  [<ffffffff8160068c>] rt_dst_alloc+0x4c/0x50
> <4>[1197485.264085]  [<ffffffff81602a30>] __ip_route_output_key+0x270/0x880
> <4>[1197485.264092]  [<ffffffff8107ee7e>] ? try_to_wake_up+0x23e/0x2b0
> <4>[1197485.264097]  [<ffffffff81603067>] ip_route_output_flow+0x27/0x60
> <4>[1197485.264102]  [<ffffffff8160ab8a>] ip_queue_xmit+0x36a/0x390
> <4>[1197485.264108]  [<ffffffff816207c5>] tcp_transmit_skb+0x485/0x890
> <4>[1197485.264113]  [<ffffffff81621aa1>] tcp_send_ack+0xf1/0x130
> <4>[1197485.264118]  [<ffffffff81618d7e>] __tcp_ack_snd_check+0x5e/0xa0
> <4>[1197485.264123]  [<ffffffff8161f2c2>] tcp_rcv_state_process+0x8b2/0xb20
> <4>[1197485.264128]  [<ffffffff81627e61>] tcp_v4_do_rcv+0x191/0x4f0
> <4>[1197485.264133]  [<ffffffff8162984c>] tcp_v4_rcv+0x5fc/0x750
> <4>[1197485.264138]  [<ffffffff81604c80>] ? ip_rcv+0x350/0x350
> <4>[1197485.264143]  [<ffffffff815e45cd>] ? nf_hook_slow+0x7d/0x160
> <4>[1197485.264147]  [<ffffffff81604c80>] ? ip_rcv+0x350/0x350
> <4>[1197485.264152]  [<ffffffff81604d4e>] ip_local_deliver_finish+0xce/0x250
> <4>[1197485.264156]  [<ffffffff81604f1c>] ip_local_deliver+0x4c/0x80
> <4>[1197485.264161]  [<ffffffff816045a9>] ip_rcv_finish+0x119/0x360
> <4>[1197485.264165]  [<ffffffff81604b60>] ip_rcv+0x230/0x350
> <4>[1197485.264170]  [<ffffffff815b89f7>] __netif_receive_skb_core+0x477/0x600
> <4>[1197485.264175]  [<ffffffff815b8ba7>] __netif_receive_skb+0x27/0x70
> <4>[1197485.264180]  [<ffffffff815b8ce4>] process_backlog+0xf4/0x1e0
> <4>[1197485.264184]  [<ffffffff815b94e5>] net_rx_action+0xf5/0x250
> <4>[1197485.264190]  [<ffffffff81053b7f>] __do_softirq+0xef/0x270
> <4>[1197485.264196]  [<ffffffff816d0b7c>] call_softirq+0x1c/0x30
> <4>[1197485.264199]  <EOI>
> <4>[1197485.264201]  [<ffffffff81004495>] do_softirq+0x55/0x90
> <4>[1197485.264209]  [<ffffffff81053a84>] local_bh_enable+0x94/0xa0
> <4>[1197485.264215]  [<ffffffff8165567a>] ipt_do_table+0x22a/0x680
> <4>[1197485.264221]  [<ffffffff815d39c1>] ? skb_clone_tx_timestamp+0x31/0x110
> <4>[1197485.264231]  [<ffffffffa00ae840>] ? ixgbe_xmit_frame_ring+0x4c0/0xd40 [ixgbe]
> <4>[1197485.264239]  [<ffffffffa00af103>] ? ixgbe_xmit_frame+0x43/0x90 [ixgbe]
> <4>[1197485.264245]  [<ffffffff81657a23>] iptable_raw_hook+0x33/0x70
> <4>[1197485.264252]  [<ffffffff815e43a7>] nf_iterate+0x87/0xb0
> <4>[1197485.264256]  [<ffffffff81607e20>] ? ip_options_echo+0x420/0x420
> <4>[1197485.264261]  [<ffffffff815e45cd>] nf_hook_slow+0x7d/0x160
> <4>[1197485.264266]  [<ffffffff81607e20>] ? ip_options_echo+0x420/0x420
> <4>[1197485.264270]  [<ffffffff8160a430>] __ip_local_out+0xa0/0xb0
> <4>[1197485.264275]  [<ffffffff8160a456>] ip_local_out+0x16/0x30
> <4>[1197485.264280]  [<ffffffff8160a97a>] ip_queue_xmit+0x15a/0x390
> <4>[1197485.264286]  [<ffffffff81625e73>] ? tcp_v4_md5_lookup+0x13/0x20
> <4>[1197485.264290]  [<ffffffff816207c5>] tcp_transmit_skb+0x485/0x890
> <4>[1197485.264295]  [<ffffffff81622e08>] tcp_write_xmit+0x1b8/0xa50
> <4>[1197485.264300]  [<ffffffff815a7e28>] ? __alloc_skb+0xa8/0x1f0
> <4>[1197485.264304]  [<ffffffff816236d0>] tcp_push_one+0x30/0x40
> <4>[1197485.264309]  [<ffffffff81616b84>] tcp_sendmsg+0xbe4/0xd40
> <4>[1197485.264315]  [<ffffffff81114b65>] ? put_page+0x35/0x40
> <4>[1197485.264321]  [<ffffffff8163bf75>] inet_sendmsg+0x45/0xb0
> <4>[1197485.264326]  [<ffffffff8159da7e>] sock_aio_write+0x11e/0x130
> <4>[1197485.264331]  [<ffffffff8163b83f>] ? inet_recvmsg+0x4f/0x80
> <4>[1197485.264337]  [<ffffffff811558ad>] do_sync_readv_writev+0x6d/0xa0
> <4>[1197485.264343]  [<ffffffff8115722b>] do_readv_writev+0xfb/0x2f0
> <4>[1197485.264347]  [<ffffffff8110fda5>] ? __free_pages+0x35/0x40
> <4>[1197485.264352]  [<ffffffff8110fe56>] ? free_pages+0x46/0x50
> <4>[1197485.264357]  [<ffffffff8112f9e2>] ? SyS_mincore+0x152/0x690
> <4>[1197485.264363]  [<ffffffff81157468>] vfs_writev+0x48/0x60
> <4>[1197485.264367]  [<ffffffff811575af>] SyS_writev+0x5f/0xd0
> <4>[1197485.264373]  [<ffffffff816cf942>] system_call_fastpath+0x16/0x1b
> <4>[1197485.264377] Code: 65 4c 03 04 25 c8 cb 00 00 49 8b 50 08 4d 8b 28 49 8b 40 10 4d 85 ed 0f 84 84 00 00 00 48 85 c0 74 7f 49 63 44 24 20 49 8b 3c 24 <49> 8b 5c 05 00 48 8d 4a 01 4c 89 e8 65 48 0f c7 0f 0f 94 c0 3c
> <1>[1197485.264417] RIP  [<ffffffff811476da>] kmem_cache_alloc+0x5a/0x130
> <4>[1197485.264424]  RSP <ffff881fffc038c8>
> <4>[1197485.264427] CR2: 0000000100000000
> <4>[1197485.264431] ---[ end trace 90fee06aa40b7305 ]---
> <0>[1197485.325141] Kernel panic - not syncing: Fatal exception in interrupt
> 
> ... way down in the tcp code.
> 
> Any help would be appreciated :) I'll do what I can to help, but iterating
> this particular crash is very hard due to the amount of time it takes to
> reproduce. Since we have a large number of machines they're always
> crashing here and there, but once they do it's not going to happen again
> for a while.
> 
> Thanks!
> -Dormando
> --

Hmm...

Some dst seems to be destroyed twice. This likely screws slab allocator.

Please try following untested patch :
diff --git a/include/net/route.h b/include/net/route.h
index 9d1f423d5944..bb96e0873eb5 100644
--- a/include/net/route.h
+++ b/include/net/route.h
@@ -314,4 +314,9 @@ static inline int ip4_dst_hoplimit(const struct dst_entry *dst)
 	return hoplimit;
 }
 
+static inline void rt_free(struct rtable *rt)
+{
+	call_rcu(&rt->dst.rcu_head, dst_rcu_free);
+}
+
 #endif	/* _ROUTE_H */
diff --git a/net/ipv4/fib_semantics.c b/net/ipv4/fib_semantics.c
index b53f0bf84dca..97b43b09e037 100644
--- a/net/ipv4/fib_semantics.c
+++ b/net/ipv4/fib_semantics.c
@@ -152,7 +152,7 @@ static void rt_fibinfo_free(struct rtable __rcu **rtp)
 	 * free_fib_info_rcu()
 	 */
 
-	dst_free(&rt->dst);
+	rt_free(rt);
 }
 
 static void free_nh_exceptions(struct fib_nh *nh)
@@ -192,7 +192,7 @@ static void rt_fibinfo_free_cpus(struct rtable __rcu * __percpu *rtp)
 
 		rt = rcu_dereference_protected(*per_cpu_ptr(rtp, cpu), 1);
 		if (rt)
-			dst_free(&rt->dst);
+			rt_free(rt);
 	}
 	free_percpu(rtp);
 }
diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index 25071b48921c..06f79225b7ac 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -556,11 +556,6 @@ static void ip_rt_build_flow_key(struct flowi4 *fl4, const struct sock *sk,
 		build_sk_flow_key(fl4, sk);
 }
 
-static inline void rt_free(struct rtable *rt)
-{
-	call_rcu(&rt->dst.rcu_head, dst_rcu_free);
-}
-
 static DEFINE_SPINLOCK(fnhe_lock);
 
 static void fnhe_flush_routes(struct fib_nh_exception *fnhe)

* Re: kmem_cache_alloc panic in 3.10+
From: Eric Dumazet @ 2014-01-18 16:57 UTC (permalink / raw)
  To: dormando; +Cc: netdev, linux-kernel, Alexei Starovoitov
In-Reply-To: <1390062576.31367.519.camel@edumazet-glaptop2.roam.corp.google.com>

On Sat, 2014-01-18 at 08:29 -0800, Eric Dumazet wrote:

> Hmm...
> 
> Some dst seems to be destroyed twice. This likely screws slab allocator.
> 
> Please try following untested patch :


Forget it, after some coffee it no longer makes sense ;)

* Re: [PATCH net-next] bonding: move the netdev_add_tso_features() to bonding module
From: Eric Dumazet @ 2014-01-18 17:08 UTC (permalink / raw)
  To: Veaceslav Falico
  Cc: Ding Tianhong, Jay Vosburgh, Eric Dumazet, David S. Miller,
	Netdev
In-Reply-To: <20140118114801.GA30549@redhat.com>

On Sat, 2014-01-18 at 12:48 +0100, Veaceslav Falico wrote:
> On Sat, Jan 18, 2014 at 04:31:33PM +0800, Ding Tianhong wrote:
> >The function netdev_add_tso_features() was only be used for bonding,
> >so no need to export it in netdevice.h, move it to bonding module.
> 
> Eric added it for a reason - like, other drivers might use it. Do you know
> if team, bridge, vlan etc. might use it?

A helper can be used once, this is fine. A car can have 4 seats, and can
even be used with no passenger.

I am quite bored by patches that break clean layering for wrong reasons.


static inline netdev_features_t netdev_add_tso_features(netdev_features_t features,
                                                      netdev_features_t mask)
{
      return netdev_increment_features(features, NETIF_F_ALL_TSO, mask);
}

There is _nothing_ in this helper that implies it should be private to bonding.

* Re: [PATCH net-next] net: vxlan: do not use vxlan_net before checking event type
From: Eric Dumazet @ 2014-01-18 17:18 UTC (permalink / raw)
  To: Cong Wang
  Cc: Daniel Borkmann, David Miller, Linux Kernel Network Developers,
	Eric W. Biederman, Jesse Brandeburg
In-Reply-To: <CAM_iQpUoQcHpQJn-nYp9mO+XXMmXSjFxy3ASwchAH-qECoz9OA@mail.gmail.com>

On Fri, 2014-01-17 at 19:50 -0800, Cong Wang wrote:
> On Fri, Jan 17, 2014 at 10:32 AM, Daniel Borkmann <dborkman@redhat.com> wrote:
> >
> >
> > If you want to do cleanups, whatever, I really don't care.
> > You had your chance to complain about that when you reviewed
> > the initial version ... it has nothing to do with the fix.
> 
> This is not for stable, as long as it doesn't harm the readability
> we are free to do any cleanup's.
> 
> If unsure, check Eric's patch for tunnel dst cache.
> 
> BTW, I am the original author of the patch, you just updated
> it *trivially* and set yourself as the author. :) I don't mind, but
> remember that this may be not appropriate for others. At
> very least I didn't and don't do this myself.


Hmm... Daniel mentioned in the changelog you wrote the initial patch,
and you are credited as the author of the patch, since he kept your
"Signed-off-by: ..." as the first one.

Quite frankly, keeping vxlan_handle_lowerdev_unregister() was the right
choice.

Stop thinking that a function needs to be used more than once to have
the right to exist. Splitting code into small parts eases readability
and code reuse/refactoring; this should be obvious to you.

* Re: ipv4_dst_destroy panic regression after 3.10.15
From: Alexei Starovoitov @ 2014-01-18 17:18 UTC (permalink / raw)
  To: dormando
  Cc: Eric Dumazet, netdev, linux-kernel@vger.kernel.org,
	Alexei Starovoitov
In-Reply-To: <alpine.DEB.2.10.1401172313530.18419@dinf>

On Fri, Jan 17, 2014 at 11:16 PM, dormando <dormando@rydia.net> wrote:
>> On Fri, 2014-01-17 at 22:49 -0800, Eric Dumazet wrote:
>> > On Fri, 2014-01-17 at 17:25 -0800, dormando wrote:
>> > > Hi,
>> > >
>> > > Upgraded a few kernels to the latest 3.10 stable tree while tracking down
>> > > a rare kernel panic, seems to have introduced a much more frequent kernel
>> > > panic. Takes anywhere from 4 hours to 2 days to trigger:
>> > >
>> > > <4>[196727.311203] general protection fault: 0000 [#1] SMP
>> > > <4>[196727.311224] Modules linked in: xt_TEE xt_dscp xt_DSCP macvlan bridge coretemp crc32_pclmul ghash_clmulni_intel gpio_ich microcode ipmi_watchdog ipmi_devintf sb_edac edac_core lpc_ich mfd_core tpm_tis tpm tpm_bios ipmi_si ipmi_msghandler isci igb libsas i2c_algo_bit ixgbe ptp pps_core mdio
>> > > <4>[196727.311333] CPU: 17 PID: 0 Comm: swapper/17 Not tainted 3.10.26 #1
>> > > <4>[196727.311344] Hardware name: Supermicro X9DRi-LN4+/X9DR3-LN4+/X9DRi-LN4+/X9DR3-LN4+, BIOS 3.0 07/05/2013
>> > > <4>[196727.311364] task: ffff885e6f069700 ti: ffff885e6f072000 task.ti: ffff885e6f072000
>> > > <4>[196727.311377] RIP: 0010:[<ffffffff815f8c7f>]  [<ffffffff815f8c7f>] ipv4_dst_destroy+0x4f/0x80
>> > > <4>[196727.311399] RSP: 0018:ffff885effd23a70  EFLAGS: 00010282
>> > > <4>[196727.311409] RAX: dead000000200200 RBX: ffff8854c398ecc0 RCX: 0000000000000040
>> > > <4>[196727.311423] RDX: dead000000100100 RSI: dead000000100100 RDI: dead000000200200
>> > > <4>[196727.311437] RBP: ffff885effd23a80 R08: ffffffff815fd9e0 R09: ffff885d5a590800
>> > > <4>[196727.311451] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
>> > > <4>[196727.311464] R13: ffffffff81c8c280 R14: 0000000000000000 R15: ffff880e85ee16ce
>> > > <4>[196727.311510] FS:  0000000000000000(0000) GS:ffff885effd20000(0000) knlGS:0000000000000000
>> > > <4>[196727.311554] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> > > <4>[196727.311581] CR2: 00007a46751eb000 CR3: 0000005e65688000 CR4: 00000000000407e0
>> > > <4>[196727.311625] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>> > > <4>[196727.311669] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
>> > > <4>[196727.311713] Stack:
>> > > <4>[196727.311733]  ffff8854c398ecc0 ffff8854c398ecc0 ffff885effd23ab0 ffffffff815b7f42
>> > > <4>[196727.311784]  ffff88be6595bc00 ffff8854c398ecc0 0000000000000000 ffff8854c398ecc0
>> > > <4>[196727.311834]  ffff885effd23ad0 ffffffff815b86c6 ffff885d5a590800 ffff8816827821c0
>> > > <4>[196727.311885] Call Trace:
>> > > <4>[196727.311907]  <IRQ>
>> > > <4>[196727.311912]  [<ffffffff815b7f42>] dst_destroy+0x32/0xe0
>> > > <4>[196727.311959]  [<ffffffff815b86c6>] dst_release+0x56/0x80
>> > > <4>[196727.311986]  [<ffffffff81620bd5>] tcp_v4_do_rcv+0x2a5/0x4a0
>> > > <4>[196727.312013]  [<ffffffff81622b5a>] tcp_v4_rcv+0x7da/0x820
>> > > <4>[196727.312041]  [<ffffffff815fd9e0>] ? ip_rcv_finish+0x360/0x360
>> > > <4>[196727.312070]  [<ffffffff815de02d>] ? nf_hook_slow+0x7d/0x150
>> > > <4>[196727.312097]  [<ffffffff815fd9e0>] ? ip_rcv_finish+0x360/0x360
>> > > <4>[196727.312125]  [<ffffffff815fda92>] ip_local_deliver_finish+0xb2/0x230
>> > > <4>[196727.312154]  [<ffffffff815fdd9a>] ip_local_deliver+0x4a/0x90
>> > > <4>[196727.312183]  [<ffffffff815fd799>] ip_rcv_finish+0x119/0x360
>> > > <4>[196727.312212]  [<ffffffff815fe00b>] ip_rcv+0x22b/0x340
>> > > <4>[196727.312242]  [<ffffffffa0339680>] ? macvlan_broadcast+0x160/0x160 [macvlan]
>> > > <4>[196727.312275]  [<ffffffff815b0c62>] __netif_receive_skb_core+0x512/0x640
>> > > <4>[196727.312308]  [<ffffffff811427fb>] ? kmem_cache_alloc+0x13b/0x150
>> > > <4>[196727.312338]  [<ffffffff815b0db1>] __netif_receive_skb+0x21/0x70
>> > > <4>[196727.312368]  [<ffffffff815b0fa1>] netif_receive_skb+0x31/0xa0
>> > > <4>[196727.312397]  [<ffffffff815b1ae8>] napi_gro_receive+0xe8/0x140
>> > > <4>[196727.312433]  [<ffffffffa00274f1>] ixgbe_poll+0x551/0x11f0 [ixgbe]
>> > > <4>[196727.312463]  [<ffffffff815fe00b>] ? ip_rcv+0x22b/0x340
>> > > <4>[196727.312491]  [<ffffffff815b1691>] net_rx_action+0x111/0x210
>> > > <4>[196727.312521]  [<ffffffff815b0db1>] ? __netif_receive_skb+0x21/0x70
>> > > <4>[196727.312552]  [<ffffffff810519d0>] __do_softirq+0xd0/0x270
>> > > <4>[196727.312583]  [<ffffffff816cef3c>] call_softirq+0x1c/0x30
>> > > <4>[196727.312613]  [<ffffffff81004205>] do_softirq+0x55/0x90
>> > > <4>[196727.312640]  [<ffffffff81051c85>] irq_exit+0x55/0x60
>> > > <4>[196727.312668]  [<ffffffff816cf5c3>] do_IRQ+0x63/0xe0
>> > > <4>[196727.312696]  [<ffffffff816c5aaa>] common_interrupt+0x6a/0x6a
>> > > <4>[196727.312722]  <EOI>
>> > > <4>[196727.312727]  [<ffffffff8100a150>] ? default_idle+0x20/0xe0
>> > > <4>[196727.312775]  [<ffffffff8100a8ff>] arch_cpu_idle+0xf/0x20
>> > > <4>[196727.312803]  [<ffffffff8108d330>] cpu_startup_entry+0xc0/0x270
>> > > <4>[196727.312833]  [<ffffffff816b276e>] start_secondary+0x1f9/0x200
>> > > <4>[196727.312860] Code: 4a 9f e9 81 e8 13 cb 0c 00 48 8b 93 b0 00 00 00 48 bf 00 02 20 00 00 00 ad de 48 8b 83 b8 00 00 00 48 be 00 01 10 00 00 00 ad de <48> 89 42 08 48 89 10 48 89 bb b8 00 00 00 48 c7 c7 4a 9f e9 81
>> > > <1>[196727.313071] RIP  [<ffffffff815f8c7f>] ipv4_dst_destroy+0x4f/0x80
>> > > <4>[196727.313100]  RSP <ffff885effd23a70>
>> > > <4>[196727.313377] ---[ end trace 64b3f14fae0f2e29 ]---
>> > > <0>[196727.380908] Kernel panic - not syncing: Fatal exception in interrupt
>> > >
>> > >
>> > > ... bisecting it's going to be a pain... I tried eyeballing the diffs and
>> > > am trying a revert or two.
>> > >
>> > > We've hit it in .25, .26 so far. I have .27 running but not sure if it
>> > > crashed, so the change exists between .15 and .25.
>> >
>> > Please try following fix, thanks for the report !
>> >
>> > diff --git a/net/ipv4/route.c b/net/ipv4/route.c
>> > index 25071b48921c..78a50a22298a 100644
>> > --- a/net/ipv4/route.c
>> > +++ b/net/ipv4/route.c
>> > @@ -1333,7 +1333,7 @@ static void ipv4_dst_destroy(struct dst_entry
>> > *dst)
>> >
>> >     if (!list_empty(&rt->rt_uncached)) {
>> >             spin_lock_bh(&rt_uncached_lock);
>> > -           list_del(&rt->rt_uncached);
>> > +           list_del_init(&rt->rt_uncached);
>> >             spin_unlock_bh(&rt_uncached_lock);
>> >     }
>> >  }
>> >
>>
>> Problem could come from this commit, in linux 3.10.23,
>> you also could try to revert it
>>
>> commit 62713c4b6bc10c2d082ee1540e11b01a2b2162ab
>> Author: Alexei Starovoitov <ast@plumgrid.com>
>> Date:   Tue Nov 19 19:12:34 2013 -0800
>>
>>     ipv4: fix race in concurrent ip_route_input_slow()
>>
>>     [ Upstream commit dcdfdf56b4a6c9437fc37dbc9cee94a788f9b0c4 ]
>>
>>     CPUs can ask for local route via ip_route_input_noref() concurrently.
>>     if nh_rth_input is not cached yet, CPUs will proceed to allocate
>>     equivalent DSTs on 'lo' and then will try to cache them in nh_rth_input
>>     via rt_cache_route()
>>     Most of the time they succeed, but on occasion the following two lines:
>>         orig = *p;
>>         prev = cmpxchg(p, orig, rt);
>>     in rt_cache_route() do race and one of the cpus fails to complete cmpxchg.
>>     But ip_route_input_slow() doesn't check the return code of rt_cache_route(),
>>     so dst is leaking. dst_destroy() is never called and 'lo' device
>>     refcnt doesn't go to zero, which can be seen in the logs as:
>>         unregister_netdevice: waiting for lo to become free. Usage count = 1
>>     Adding mdelay() between above two lines makes it easily reproducible.
>>     Fix it similar to nh_pcpu_rth_output case.
>>
>>     Fixes: d2d68ba9fe8b ("ipv4: Cache input routes in fib_info nexthops.")
>>     Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
>>     Signed-off-by: David S. Miller <davem@davemloft.net>
>>     Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
>>
>
> Heh. I spent an hour squinting at the difflog from .15 to .25 and this was
> my best guess. I have a kernel running in production with only this
> reverted as of ~5 hours ago. Won't know if it helps for a day or two.
>
> I'm building a kernel now with your route patch, but 62713c4 *not*
> reverted, which I can throw on a different machine. Does this sound like a
> good idea?

The traces of your crash don't look similar to the dst leak that was
fixed by commit 62713c4...
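
For context on the list_del_init() change quoted above: RDX and RAX in
the oops hold dead000000100100 and dead000000200200, which match the
list poison values on x86_64. The thread does not establish that a
second deletion is what happened here, so take this as a userspace
sketch of the mechanism only. The helpers are simplified copies of the
list.h ones, written out for illustration.

```c
#include <assert.h>
#include <stdint.h>

struct list_head {
	struct list_head *next, *prev;
};

/* Poison values as they appear in RDX/RAX of the oops (x86_64). */
#define LIST_POISON1 ((struct list_head *)(uintptr_t)0xdead000000100100ULL)
#define LIST_POISON2 ((struct list_head *)(uintptr_t)0xdead000000200200ULL)

static void INIT_LIST_HEAD(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

static void list_add(struct list_head *entry, struct list_head *head)
{
	entry->next = head->next;
	entry->prev = head;
	head->next->prev = entry;
	head->next = entry;
}

static int list_empty(const struct list_head *h)
{
	return h->next == h;
}

static void __list_del_entry(struct list_head *e)
{
	e->next->prev = e->prev;
	e->prev->next = e->next;
}

/* Leaves the entry poisoned: list_empty(e) stays false afterwards. */
static void list_del(struct list_head *e)
{
	__list_del_entry(e);
	e->next = LIST_POISON1;
	e->prev = LIST_POISON2;
}

/* Leaves the entry self-linked: list_empty(e) becomes true. */
static void list_del_init(struct list_head *e)
{
	__list_del_entry(e);
	INIT_LIST_HEAD(e);
}
```

After list_del(), a guard of the form if (!list_empty(&rt->rt_uncached))
still passes, so a second pass through the same destroy path would
dereference the poison pointers. After list_del_init() the guard fails
and the second pass is a no-op.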

To trigger the 'fix' logic of 62713c4, add the following diff:
diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index f6c6ab1..8972676 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -1259,7 +1259,7 @@ static bool rt_cache_route(struct fib_nh *nh, struct rtable *rt)
                p = (struct rtable **)__this_cpu_ptr(nh->nh_pcpu_rth_output);
        }
        orig = *p;
-
+       mdelay(100);
        prev = cmpxchg(p, orig, rt);
        if (prev == orig) {
                if (orig)

I've been running with it for a day without issues.
Note that it will stress both the 'input' and 'output' ways of adding dsts
to the rt_uncached list, and the 'output' path has been there for ages.
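
The window that the mdelay() widens can be modelled in userspace with
C11 atomics. This is only a sketch of the pattern, not the code in
route.c: struct route and cache_route() are hypothetical, and the stale
snapshot is passed in explicitly so that the race can be replayed
deterministically.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct route {
	int id;
};

/*
 * Try to install rt in the cache slot. orig is the value this CPU read
 * from the slot earlier; the mdelay(100) in the diff sits between that
 * read and the compare-and-exchange below. Returns false when another
 * CPU changed the slot in the meantime.
 */
static bool cache_route(_Atomic(struct route *) *slot,
			struct route *orig, struct route *rt)
{
	return atomic_compare_exchange_strong(slot, &orig, rt);
}
```

If two callers snapshot the same empty slot, the first cache_route()
succeeds and the second returns false. The caller that loses has to deal
with its uncached dst (the commit message describes fixing this like the
nh_pcpu_rth_output case), otherwise that dst is never released.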

If mdelay() helps to reproduce it in minutes instead of days
then we're on the right path.
Could you share details of your workload?
In our case it was a lot of namespaces with a ton of processes
talking to each other, forcefully killed and restarted.
Do you see the same crash in the latest tree?

PS: sorry for the double posts; netdev email bounced a few times for me...

* Re: [PATCH] net: remove unnecessary initializations in net_dev_init
From: Eric Dumazet @ 2014-01-18 17:40 UTC (permalink / raw)
  To: Sabrina Dubroca; +Cc: davem, netdev
In-Reply-To: <1390057451-30807-1-git-send-email-sd@queasysnail.net>

On Sat, 2014-01-18 at 16:04 +0100, Sabrina Dubroca wrote:
> softnet_data is set to 0 by memset, no need to initialize specific
> fields to 0 or NULL afterwards.
> 
> Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
> ---
>  net/core/dev.c | 9 ---------
>  1 file changed, 9 deletions(-)
> 
> diff --git a/net/core/dev.c b/net/core/dev.c
> index 288df62..b57b44a2 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -7000,25 +7000,16 @@ static int __init net_dev_init(void)
>  		memset(sd, 0, sizeof(*sd));

Hi Sabrina

Well, if you really want, you can also remove this memset(): percpu data
defined as:

DEFINE_PER_CPU_ALIGNED(struct softnet_data, softnet_data);

must also be zero at boot time.

Thanks !
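
A small user-space analogy of the point above, as a sketch only: in C,
objects with static storage duration start out zeroed, so only the non-zero
fields need explicit setup. The names fake_softnet, fake_percpu and
fake_net_dev_init() are invented for illustration, and a plain static array
stands in for real percpu data (that percpu data is zero at boot is taken
from the message above, not shown here).

```c
#include <assert.h>
#include <stddef.h>

#define NR_FAKE_CPUS 4

/* Hypothetical, cut-down stand-in for struct softnet_data. */
struct fake_softnet {
	void *completion_queue;
	void *output_queue;
	void **output_queue_tailp;
	unsigned int flags;
	int cpu;
};

/* Static storage duration: every member starts as 0 or NULL. */
static struct fake_softnet fake_percpu[NR_FAKE_CPUS];

/* Only fields that need a non-zero value are written; no memset(). */
static void fake_net_dev_init(void)
{
	for (int i = 0; i < NR_FAKE_CPUS; i++) {
		struct fake_softnet *sd = &fake_percpu[i];

		/* These hold without any explicit initialization. */
		assert(sd->completion_queue == NULL);
		assert(sd->output_queue == NULL);

		sd->output_queue_tailp = &sd->output_queue;
		sd->cpu = i;
	}
}
```

After fake_net_dev_init() runs, each entry has its cpu and tail pointer set
while the untouched fields are still zero.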

^ permalink raw reply

* Re: [RFC PATCH] tuntap: Fix for a race in accessing numqueue
From: Sergei Shtylyov @ 2014-01-18 17:43 UTC (permalink / raw)
  To: Dominic Curran, netdev; +Cc: Jason Wang, Maxim Krasnyansky
In-Reply-To: <1390004815-7052-1-git-send-email-dominic.curran@citrix.com>

Hello.

On 18-01-2014 4:26, Dominic Curran wrote:

> A patch for fixing a race between queue selection and changing queues
> was introduced in commit 92bb73ea2c434618a68a5.

    Please also specify that commit's summary line in parens.

> The fix was to prevent the driver from re-reading the tun->numqueues
> more than once within tun_select_queue().

> We have been experiencing 'Divide-by-zero' errors in
> tun_net_xmit() since we moved from 3.6 to 3.10, and believe that they
> come from a similar source where the value of tun->numqueues changes
> to zero between the first and second read of tun->numqueues.

> Signed-off-by: Dominic Curran <dominic.curran@citrix.com>
> Cc: Jason Wang <jasowang@redhat.com>
> Cc: Maxim Krasnyansky <maxk@qualcomm.com>

WBR, Sergei
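
The race described in the quoted changelog can be sketched in user space as
follows. This is an illustration under assumptions, not the proposed patch:
struct fake_tun and pick_queue() are made-up names, and a C11 atomic load
into a local plays the role that a single ACCESS_ONCE()-style read would
play in the driver. The point is that the zero check and the modulo use the
same snapshot, so a concurrent change to zero cannot cause a division by
zero.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for the part of struct tun_struct we care about. */
struct fake_tun {
	_Atomic unsigned int numqueues;
};

/* Pick a queue index for a flow hash, or return -1 if there are no
 * queues. The counter is read exactly once; both the check and the
 * modulo operate on that local copy.
 */
static int pick_queue(struct fake_tun *tun, uint32_t hash)
{
	unsigned int n = atomic_load_explicit(&tun->numqueues,
					      memory_order_relaxed);

	if (n == 0)
		return -1;
	return (int)(hash % n);
}
```

Reading tun->numqueues a second time for the modulo, instead of reusing n,
would reopen the window the changelog describes.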

^ permalink raw reply

* Re: [PATCH net-next] net: vxlan: do not use vxlan_net before checking event type
From: Cong Wang @ 2014-01-18 17:57 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Daniel Borkmann, David Miller, Linux Kernel Network Developers,
	Eric W. Biederman, Jesse Brandeburg
In-Reply-To: <1390065511.31367.535.camel@edumazet-glaptop2.roam.corp.google.com>

On Sat, Jan 18, 2014 at 9:18 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Fri, 2014-01-17 at 19:50 -0800, Cong Wang wrote:
>> On Fri, Jan 17, 2014 at 10:32 AM, Daniel Borkmann <dborkman@redhat.com> wrote:
>> >
>> >
>> > If you want to do cleanups, whatever, I really don't care.
>> > You had your chance to complain about that when you reviewed
>> > the initial version ... it has nothing to do with the fix.
>>
>> This is not for stable, as long as it doesn't harm the readability
>> we are free to do any cleanups.
>>
>> If unsure, check Eric's patch for tunnel dst cache.
>>
>> BTW, I am the original author of the patch, you just updated
>> it *trivially* and set yourself as the author. :) I don't mind, but
>> remember that this may not be appropriate for others. At the
>> very least I didn't and don't do this myself.
>
>
> Hmm... Daniel mentioned in the changelog you wrote the initial patch,
> and you are credited as the author of the patch, since he kept your
> "Signed-off-by: ..." as the first one.

Author == 'From: ...', you knew that, right?

But it was done WITHOUT even asking for my permission. I am sure this is
not how we usually work. At the least, why not ask me before doing
anything? Why not give me a chance to respond?

>
> Quite frankly, keeping vxlan_handle_lowerdev_unregister() was the right
> choice.
>
> Stop thinking that a function needs to be used more than once to have
> the right to exist. Splitting code into small parts eases readability and
> code reuse/refactor, this should be obvious to you.
>

When did I say it was because it is only used once? Please stop guessing
what is on my mind.

^ permalink raw reply

* Re: [PATCH] net: remove unnecessary initializations in net_dev_init
From: Sabrina Dubroca @ 2014-01-18 18:04 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: davem, netdev
In-Reply-To: <1390066846.31367.538.camel@edumazet-glaptop2.roam.corp.google.com>

2014-01-18, 09:40:46 -0800, Eric Dumazet wrote:
> On Sat, 2014-01-18 at 16:04 +0100, Sabrina Dubroca wrote:
> > softnet_data is set to 0 by memset, no need to initialize specific
> > fields to 0 or NULL afterwards.
> > 
> > Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
> > ---
> >  net/core/dev.c | 9 ---------
> >  1 file changed, 9 deletions(-)
> > 
> > diff --git a/net/core/dev.c b/net/core/dev.c
> > index 288df62..b57b44a2 100644
> > --- a/net/core/dev.c
> > +++ b/net/core/dev.c
> > @@ -7000,25 +7000,16 @@ static int __init net_dev_init(void)
> >  		memset(sd, 0, sizeof(*sd));
> 
> Hi Sabrina
> 
> Well, if you really want, you can also remove this memset(): percpu data
> defined as:
> 
> DEFINE_PER_CPU_ALIGNED(struct softnet_data, softnet_data);
> 
> must also be zero at boot time.
> 
> Thanks !

Okay, I'll send a v2.
Thanks!

-- 
Sabrina

^ permalink raw reply

* [PATCH net-next v2] net: remove unnecessary initializations in net_dev_init
From: Sabrina Dubroca @ 2014-01-18 18:19 UTC (permalink / raw)
  To: davem; +Cc: netdev, Sabrina Dubroca

softnet_data is already set to 0, no need to use memset or initialize
specific fields to 0 or NULL afterwards.

Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
---

v2: remove memset as well, since as Eric said, percpu data is 0 at
    boot time

 net/core/dev.c | 10 ----------
 1 file changed, 10 deletions(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index 288df62..36c0cc69 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -6997,28 +6997,18 @@ static int __init net_dev_init(void)
 	for_each_possible_cpu(i) {
 		struct softnet_data *sd = &per_cpu(softnet_data, i);
 
-		memset(sd, 0, sizeof(*sd));
 		skb_queue_head_init(&sd->input_pkt_queue);
 		skb_queue_head_init(&sd->process_queue);
-		sd->completion_queue = NULL;
 		INIT_LIST_HEAD(&sd->poll_list);
-		sd->output_queue = NULL;
 		sd->output_queue_tailp = &sd->output_queue;
 #ifdef CONFIG_RPS
 		sd->csd.func = rps_trigger_softirq;
 		sd->csd.info = sd;
-		sd->csd.flags = 0;
 		sd->cpu = i;
 #endif
 
 		sd->backlog.poll = process_backlog;
 		sd->backlog.weight = weight_p;
-		sd->backlog.gro_list = NULL;
-		sd->backlog.gro_count = 0;
-
-#ifdef CONFIG_NET_FLOW_LIMIT
-		sd->flow_limit = NULL;
-#endif
 	}
 
 	dev_boot_phase = 0;
-- 
1.8.5.3

^ permalink raw reply related
