Netdev List


* Re: [PATCH 1/2] net: Add layer 2 hardware acceleration operations for macvlan devices
From: Neil Horman @ 2013-10-07 21:20 UTC (permalink / raw)
  To: davem; +Cc: netdev
In-Reply-To: <20131007.155214.2232375975382665567.davem@davemloft.net>

Forgive the poor reply format, Dave.  I deleted your email (too fast on the
trigger, apparently), so I have to reconstruct it.

>> @@ -426,9 +426,12 @@ struct sk_buff {
>>  	char			cb[48] __aligned(8);
>>  
>>  	unsigned long		_skb_refdst;
>> -#ifdef CONFIG_XFRM
>> -	struct	sec_path	*sp;
>> -#endif
>> +
>> +	union {
>> +		struct	sec_path	*sp;
>> +		void 			*accel_priv;
>> +	};
>> +
>
>I'm not 100% sure these two things are really mutually exclusive.
>
>What if bridging ebtables does an input route lookup?  That can
>populate the security path.
>
You are most likely right; that's why this is an RFC.  I haven't really thought
through that bit fully yet, to be perfectly honest.  I wanted a place for a
pointer to the accelerated data path data to live, and that looked like a
reasonably safe place at the time, but as you point out, it's not.  I'll need to
find a better place for it.

>Also, why have you not added this to the usual netdev_ops and
>hw_features?

That's me experimenting.  I was originally thinking that this functionality
might be grouped separately, so that we could handle it independently of the
standard network device operations (you might have noticed in v1 of my patch I
had a size_t variable in there, so I thought the separation might be
organizationally nice).  It was also something I was tinkering with for
potential future work to support other data plane accelerators (like the FM6000
switch chip from Intel) in a manner that didn't pollute the more typical host network
devices.  Like I said though, just experimenting at the moment....

Regards
Neil


* [PATCH] net: vlan: fix nlmsg size calculation in vlan_get_size()
From: Marc Kleine-Budde @ 2013-10-07 21:19 UTC (permalink / raw)
  To: netdev; +Cc: kernel, Marc Kleine-Budde, Patrick McHardy

This patch fixes the calculation of the nlmsg size, by adding the missing
nla_total_size().

Cc: Patrick McHardy <kaber@trash.net>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
---
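For reviewers, a userspace sketch of the arithmetic behind the fix. The
*_sketch names are made up for illustration; they only mirror the shape of the
kernel's NLA_ALIGN() and nla_total_size() helpers and are not the kernel
definitions themselves. A netlink attribute carries a 4-byte header and is
padded to a 4-byte boundary, so the 8-byte struct ifla_vlan_flags payload
needs 12 bytes rather than 8:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative mirror of the attribute header: u16 length, u16 type. */
struct nlattr_sketch {
	uint16_t nla_len;
	uint16_t nla_type;
};

#define NLA_ALIGNTO_SKETCH	4
#define NLA_ALIGN_SKETCH(len) \
	(((len) + NLA_ALIGNTO_SKETCH - 1) & ~(size_t)(NLA_ALIGNTO_SKETCH - 1))
#define NLA_HDRLEN_SKETCH	NLA_ALIGN_SKETCH(sizeof(struct nlattr_sketch))

/* Same shape as struct ifla_vlan_flags: two 32-bit words. */
struct vlan_flags_sketch {
	uint32_t flags;
	uint32_t mask;
};

/* Header plus payload, rounded up to the attribute alignment. */
static size_t nla_total_size_sketch(size_t payload)
{
	return NLA_ALIGN_SKETCH(NLA_HDRLEN_SKETCH + payload);
}
```

With these definitions, nla_total_size_sketch(sizeof(struct vlan_flags_sketch))
comes to 12, where the old code reserved only sizeof(), i.e. 8.
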
 net/8021q/vlan_netlink.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/8021q/vlan_netlink.c b/net/8021q/vlan_netlink.c
index 3091297..c7e634a 100644
--- a/net/8021q/vlan_netlink.c
+++ b/net/8021q/vlan_netlink.c
@@ -171,7 +171,7 @@ static size_t vlan_get_size(const struct net_device *dev)
 
 	return nla_total_size(2) +	/* IFLA_VLAN_PROTOCOL */
 	       nla_total_size(2) +	/* IFLA_VLAN_ID */
-	       sizeof(struct ifla_vlan_flags) + /* IFLA_VLAN_FLAGS */
+	       nla_total_size(sizeof(struct ifla_vlan_flags)) + /* IFLA_VLAN_FLAGS */
 	       vlan_qos_map_size(vlan->nr_ingress_mappings) +
 	       vlan_qos_map_size(vlan->nr_egress_mappings);
 }
-- 
1.8.4.rc3


* Re: bug in passing file descriptors
From: David Miller @ 2013-10-07 21:32 UTC (permalink / raw)
  To: sar; +Cc: luto, netdev, mtk.manpages, ebiederm
In-Reply-To: <5253199B.3000109@nec-labs.com>

From: Steve Rago <sar@nec-labs.com>
Date: Mon, 7 Oct 2013 16:29:15 -0400

> On 10/07/2013 03:42 PM, David Miller wrote:
>> There is no compatibility issue.
>>
>> 32-bit tasks will always see the 4-byte align/length.
>> 64-bit tasks will always see the 8-byte align/length.
>>
> 
> Really?  So when I compile my application on a 32-bit Linux box and
> then try to run it on a 64-bit Linux box, you're not going to overrun
> my buffer when CMSG_SPACE led me to allocate an insufficient amount of
> memory needed to account for padding on the 64-bit platform?

We have a compatibility layer that gives 32-bit applications the
same behavior as if they had run on a 32-bit machine.

Search around for the MSG_CMSG_COMPAT flag and how that is used in
net/socket.c.
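
To make the size difference concrete, here is a rough sketch of the
CMSG_SPACE arithmetic under the two ABIs.  The helper names are invented for
illustration, and the header sizes are assumptions about the usual layouts
(a 12-byte cmsghdr aligned to 4 for a 32-bit task, a 16-byte cmsghdr aligned
to 8 for a 64-bit task); real code should keep using the CMSG_* macros from
<sys/socket.h>.

```c
#include <assert.h>
#include <stddef.h>

/* Round len up to a multiple of align (align must be a power of two). */
static size_t cmsg_align_sketch(size_t len, size_t align)
{
	return (len + align - 1) & ~(align - 1);
}

/* Space for one control message: aligned header plus aligned payload. */
static size_t cmsg_space_sketch(size_t hdr_size, size_t align, size_t payload)
{
	return cmsg_align_sketch(hdr_size, align) +
	       cmsg_align_sketch(payload, align);
}

/* Assumed 32-bit ABI: 12-byte header, 4-byte alignment. */
static size_t cmsg_space_32(size_t payload)
{
	return cmsg_space_sketch(12, 4, payload);
}

/* Assumed 64-bit ABI: 16-byte header, 8-byte alignment. */
static size_t cmsg_space_64(size_t payload)
{
	return cmsg_space_sketch(16, 8, payload);
}
```

For a single passed descriptor (4 bytes of payload) this gives 16 bytes for
the 32-bit task and 24 for the 64-bit one; the compat path is what lets the
32-bit binary keep seeing the 16-byte layout on a 64-bit kernel.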


* Re: [PATCH 1/2] net: Add layer 2 hardware acceleration operations for macvlan devices
From: David Miller @ 2013-10-07 21:34 UTC (permalink / raw)
  To: nhorman; +Cc: netdev
In-Reply-To: <20131007212000.GA1596@hmsreliant.think-freely.org>

From: Neil Horman <nhorman@tuxdriver.com>
Date: Mon, 7 Oct 2013 17:20:00 -0400

> That's me experimenting.  I was originally thinking that this functionality
> might be grouped separately, so that we could handle it independently of the
> standard network device operations (you might have noticed in v1 of my patch I
> had a size_t variable in there, so I thought the separation might be
> organizationally nice).  It was also something I was tinkering with for
> potential future work to support other data plane accelerators (like the FM6000
> switch chip from Intel) in a manner that didn't pollute the more typical host network
> devices.  Like I said though, just experimenting at the moment....

Can these dataplane devices still act like a normal networking port and
send and receive packets at the host level?

If yes, that would be an extremely strong argument for netdev_ops.


* Re: [PATCH v2.42 1/5] odp: Allow VLAN actions after MPLS actions
From: Ben Pfaff @ 2013-10-07 21:41 UTC (permalink / raw)
  To: Simon Horman
  Cc: dev-yBygre7rU0TnMu66kgdUjQ, Ravi K, netdev-u79uwXL29TY76Z2rM5mHXA,
	Isaku Yamahata
In-Reply-To: <20131007063447.GF19926-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org>

On Mon, Oct 07, 2013 at 03:34:47PM +0900, Simon Horman wrote:
> What I have done is to make an incremental patch which:
> 
> 1. Moves the 'vlan_tci' member of struct xlate_in to
>    be the 'final_vlan_tci' member of struct xlate_ctx.
> 
> 2. Moves the 'vlan_tci' local variable of do_xlate_actions()
>    to be the 'next_vlan_tci' member of struct xlate_ctx.
> 
> 3. Restructures the comments surrounding the logic of the vlan_tci
>    code that this patch adds mostly as comments for the new
>    members of struct xlate_ctx. I hope things are (still?) clear.
> 
> For reference, the incremental patch I have so far is as follows.
> I will squash it into this patch before reposting this series.

Thanks a lot, I'll take another look at the series now.


* [PATCH] pkt_sched: fq: fix non TCP flows pacing
From: Eric Dumazet @ 2013-10-07 21:44 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, Steinar H. Gunderson

From: Eric Dumazet <edumazet@google.com>

Steinar reported FQ pacing was not working for UDP flows.

It looks like the initial sk->sk_pacing_rate value of 0 was
the wrong choice. We should initialize it to ~0U, like sk_max_pacing_rate.

TCA_FQ_FLOW_DEFAULT_RATE should then be removed because it makes
no real sense (the default rate is really ~0U).

While debugging this issue, I realized sk_pacing_rate is shared between
the transport and the packet scheduler without locking or barriers.

We should use ACCESS_ONCE() to make sure the compiler won't perform
multiple loads or stores.

Reported-by: Steinar H. Gunderson <sesse@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
---
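A userspace sketch of the rate selection after this change, for anyone
skimming.  It is deliberately simplified and the *_sketch names are
hypothetical: the TCP_TIME_WAIT check is left out, plen is taken as given
(the patch uses the larger of the packet length and the quantum), and
ACCESS_ONCE is spelled out here with the GCC/Clang __typeof__ extension.

```c
#include <assert.h>
#include <stdint.h>

/* Force a single load or store of x instead of letting the compiler
 * re-read or split the access. */
#define ACCESS_ONCE(x) (*(volatile __typeof__(x) *)&(x))

#define NSEC_PER_SEC	1000000000ULL
#define NSEC_PER_MSEC	1000000ULL

/* Hypothetical stand-in for the one socket field that matters here. */
struct sock_sketch {
	uint32_t sk_pacing_rate;	/* bytes per second, ~0U = unlimited */
};

/* Delay in nanoseconds before the flow may send again. */
static uint64_t fq_delay_ns_sketch(struct sock_sketch *sk,
				   uint32_t flow_max_rate, uint32_t plen)
{
	uint32_t rate = flow_max_rate;
	uint64_t len;

	if (sk) {
		/* One snapshot: the transport may rewrite the field at any time. */
		uint32_t sk_rate = ACCESS_ONCE(sk->sk_pacing_rate);

		if (sk_rate < rate)
			rate = sk_rate;
	}
	if (rate == ~0U)
		return 0;		/* unlimited: no pacing delay */

	len = (uint64_t)plen * NSEC_PER_SEC;
	if (rate)
		len /= rate;
	if (len > 125 * NSEC_PER_MSEC)	/* clamp the delay to 125 ms */
		len = 125 * NSEC_PER_MSEC;
	return len;
}
```

A socket left at the new ~0U default with an unlimited flow_max_rate gets no
delay at all, which is the non-TCP case this patch is about.
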
 net/core/sock.c      |    1 +
 net/ipv4/tcp_input.c |    7 ++++++-
 net/sched/sch_fq.c   |   20 +++++++++-----------
 3 files changed, 16 insertions(+), 12 deletions(-)

diff --git a/net/core/sock.c b/net/core/sock.c
index 2bd9b3f..fd6afa2 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -2331,6 +2331,7 @@ void sock_init_data(struct socket *sock, struct sock *sk)
 #endif
 
 	sk->sk_max_pacing_rate = ~0U;
+	sk->sk_pacing_rate = ~0U;
 	/*
 	 * Before updating sk_refcnt, we must commit prior changes to memory
 	 * (Documentation/RCU/rculist_nulls.txt for details)
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index fa6cf1f..d95f875 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -755,7 +755,12 @@ static void tcp_update_pacing_rate(struct sock *sk)
 	if (tp->srtt > 8 + 2)
 		do_div(rate, tp->srtt);
 
-	sk->sk_pacing_rate = min_t(u64, rate, sk->sk_max_pacing_rate);
+	/* ACCESS_ONCE() is needed because FQ fetches sk_pacing_rate without
+	 * any lock. We want to make sure compiler won't use sk_pacing_rate
+	 * with intermediate values.
+	 */
+	ACCESS_ONCE(sk->sk_pacing_rate) = min_t(u64, rate,
+						sk->sk_max_pacing_rate);
 }
 
 /* Calculate rto without backoff.  This is the second half of Van Jacobson's
diff --git a/net/sched/sch_fq.c b/net/sched/sch_fq.c
index a2fef8b..46b2adb 100644
--- a/net/sched/sch_fq.c
+++ b/net/sched/sch_fq.c
@@ -472,20 +472,16 @@ begin:
 	if (f->credit > 0 || !q->rate_enable)
 		goto out;
 
-	if (skb->sk && skb->sk->sk_state != TCP_TIME_WAIT) {
-		rate = skb->sk->sk_pacing_rate ?: q->flow_default_rate;
+	rate = q->flow_max_rate;
+	if (skb->sk && skb->sk->sk_state != TCP_TIME_WAIT)
+		rate = min(ACCESS_ONCE(skb->sk->sk_pacing_rate), rate);
 
-		rate = min(rate, q->flow_max_rate);
-	} else {
-		rate = q->flow_max_rate;
-		if (rate == ~0U)
-			goto out;
-	}
-	if (rate) {
+	if (rate != ~0U) {
 		u32 plen = max(qdisc_pkt_len(skb), q->quantum);
 		u64 len = (u64)plen * NSEC_PER_SEC;
 
-		do_div(len, rate);
+		if (likely(rate))
+			do_div(len, rate);
 		/* Since socket rate can change later,
 		 * clamp the delay to 125 ms.
 		 * TODO: maybe segment the too big skb, as in commit
@@ -735,12 +731,14 @@ static int fq_dump(struct Qdisc *sch, struct sk_buff *skb)
 	if (opts == NULL)
 		goto nla_put_failure;
 
+	/* TCA_FQ_FLOW_DEFAULT_RATE is not used anymore,
+	 * do not bother giving its value
+	 */
 	if (nla_put_u32(skb, TCA_FQ_PLIMIT, sch->limit) ||
 	    nla_put_u32(skb, TCA_FQ_FLOW_PLIMIT, q->flow_plimit) ||
 	    nla_put_u32(skb, TCA_FQ_QUANTUM, q->quantum) ||
 	    nla_put_u32(skb, TCA_FQ_INITIAL_QUANTUM, q->initial_quantum) ||
 	    nla_put_u32(skb, TCA_FQ_RATE_ENABLE, q->rate_enable) ||
-	    nla_put_u32(skb, TCA_FQ_FLOW_DEFAULT_RATE, q->flow_default_rate) ||
 	    nla_put_u32(skb, TCA_FQ_FLOW_MAX_RATE, q->flow_max_rate) ||
 	    nla_put_u32(skb, TCA_FQ_BUCKETS_LOG, q->fq_trees_log))
 		goto nla_put_failure;


* Re: [RFC PATCH 0/2 v2] net: alternate proposal for using macvlans with forwarding acceleration
From: John Fastabend @ 2013-10-07 22:09 UTC (permalink / raw)
  To: Neil Horman; +Cc: netdev, John Fastabend, Andy Gospodarek, David Miller
In-Reply-To: <1380917405-23801-1-git-send-email-nhorman@tuxdriver.com>

On 10/04/2013 01:10 PM, Neil Horman wrote:
> Hey all-
>       here's the next, updated version of the vsi/macvlan integration that we've
> been discussing.
>
> Some change notes:
>
> * Changes to the forwarding ops structure - Removed the priv_size field, and
> added a flags field.  Removal of the priv_size field was accomplished by just
> having the add method return a void * and using ERR_PTR and PTR_ERR checks,
> which also allows us to allocate memory for the acceleration path in the driver,
> which I like.  I'm not super happy still with how I'm using the flags (currently
> only used to indicate support for feature sets), but at least we have the flags
> now, and they can be exposed to user space via iproute2 or ethtool if need be.
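
As an aside on the ERR_PTR/PTR_ERR convention mentioned in the change note
above, a small userspace imitation of the idea.  The *_sketch helpers and
the fwd_add_sketch() function are hypothetical and only echo the kernel
idiom of encoding a negative errno in the top 4095 pointer values; they are
not the code from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_ERRNO_SKETCH	4095

/* Encode a negative errno value as a pointer. */
static void *err_ptr_sketch(long error)
{
	return (void *)(intptr_t)error;
}

/* Recover the errno value from an error pointer. */
static long ptr_err_sketch(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/* True when ptr falls in the range reserved for encoded errors. */
static int is_err_sketch(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO_SKETCH;
}

/* Hypothetical "add" op: returns driver-allocated private data,
 * or an encoded error instead of a separate status code. */
static void *fwd_add_sketch(int hw_supported)
{
	void *priv;

	if (!hw_supported)
		return err_ptr_sketch(-EOPNOTSUPP);
	priv = calloc(1, 64);
	if (!priv)
		return err_ptr_sketch(-ENOMEM);
	return priv;
}
```

The caller checks is_err_sketch() on the returned pointer, so the ops
structure no longer needs to advertise a private-data size up front.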

For the flag why not use the existing feature flag namespace? Adding
NETIF_F_HW_L2FW_DOFFLOAD or something equivalent to netdev_features.h
and also doing the string updates would get the userspace support for
free.

The one downside of a global feature flag approach like this is that if
the user wants to create some offloaded macvlan devices and some SW-only
macvlans, the control sequence is a bit clumsy. But unless we add
a new flag to the macvlan netlink message, I'm not sure how to avoid
this. Further, if you have multiple control applications
creating/deleting these macvlans, the flag mechanism starts to break
down. One app sets the flag, the other deletes; you get the idea.

It might be worth adding a netlink flag to change the state of
the HW offload. I know this goes against the 'it just gets offloaded'
line of thought, but if you make the default to offload then it should
be OK.

>
> * Changes to the Transmit path - Specifically I'm using dev_queue_xmit to send
> frames now, which I like as it makes the macvlan subject to the lowerdev's qdisc
> configuration.

This creates perhaps another oddity where on TX the packets will be
visible to the lowerdev. Think tcpdump and egress qdisc. But on ingress
the packets will be delivered directly to macvlan device. Also I was
thinking there might be good reasons to skip the lowerdev qdisc, for
performance reasons or QOS setups.

So it might be best the way you had it in the previous revision even if
it was not functionally equivalent to the SW path. If you skip the lower
dev and submit directly to the hardware then you also don't need a
sk_buff change. The driver can do a lookup via the skb->dev field and
additional hints from the stack are not needed.

If there is a perceived problem with not having the HW and SW completely
equivalent functionally we could always default the feature flag to off.
Then enabling the feature would be a single 'ethtool' command. This
seems like a nice compromise.

I like where this is going though!

Thanks,
John

-- 
John Fastabend         Intel Corporation


* Re: [E1000-devel] [PATCH RFC 00/77] Re-design MSI/MSI-X interrupts enablement pattern
From: Waskiewicz Jr, Peter P @ 2013-10-07 22:21 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Tejun Heo, linux-mips@linux-mips.org, VMware, Inc.,
	linux-pci@vger.kernel.org, linux-nvme@lists.infradead.org,
	linux-ide@vger.kernel.org, linux-s390@vger.kernel.org, King,
	linux-scsi@vger.kernel.org, linux-rdma@vger.kernel.org,
	x86@kernel.org, Ben Hutchings, Alexander Gordeev, Matt Porter,
	iss_storagedev@hp.com, Michael Ellerman, "lin
In-Reply-To: <1381176656.645.171.camel@pasglop>

On Tue, 2013-10-08 at 07:10 +1100, Benjamin Herrenschmidt wrote:
> On Mon, 2013-10-07 at 14:01 -0400, Tejun Heo wrote:
> > I don't think the same race condition would happen with the loop.  The
> > problem case is where multiple msi(x) allocation fails completely
> > because the global limit went down before inquiry and allocation.  In
> > the loop based interface, it'd retry with the lower number.
> > 
> > As long as the number of drivers which need this sort of adaptive
> > allocation isn't too high and the common cases can be made simple, I
> > don't think the "complex" part of interface is all that important.
> > Maybe we can have reserve / cancel type interface or just keep the
> > loop with more explicit function names (ie. try_enable or something
> > like that).
> 
> We want to be able to request an MSI-X at runtime anyway ... if I want
> to dynamically add a queue to my network interface, I want it to be able
> to pop a new arbitrary MSI-X.

If you want to dynamically allocate another queue, you'd either need to
have them all pre-allocated at alloc_etherdev_mqs(), or add a new API to
netdev that allows adding new queues on the fly.

As things are done today, the Tx queues are all tacked onto the end of
the netdev struct.  That would probably have to change to a linked list
of queues that could be grown or shrunk on the fly.
netif_alloc_netdev_queues() would need to change the kzalloc() to a list
allocation.
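
Very roughly, and purely as a userspace illustration of the list idea, it
could look something like the sketch below.  Every name here is hypothetical
and nothing like this exists in the netdev core today; a real version would
also need locking against the datapath.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical per-queue state, chained instead of stored in an array. */
struct txq_sketch {
	unsigned int index;
	struct txq_sketch *next;
};

struct netdev_sketch {
	struct txq_sketch *queues;	/* most recently added queue first */
	unsigned int num_tx_queues;
};

/* Grow by one queue; returns 0 on success, -1 if allocation fails. */
static int netdev_add_queue_sketch(struct netdev_sketch *dev)
{
	struct txq_sketch *q = calloc(1, sizeof(*q));

	if (!q)
		return -1;
	q->index = dev->num_tx_queues++;
	q->next = dev->queues;
	dev->queues = q;
	return 0;
}

/* Shrink by one queue, dropping the most recently added one. */
static void netdev_del_queue_sketch(struct netdev_sketch *dev)
{
	struct txq_sketch *q = dev->queues;

	if (!q)
		return;
	dev->queues = q->next;
	dev->num_tx_queues--;
	free(q);
}
```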

Cheers,
-PJ


* Re: [PATCH 1/2] net: Add layer 2 hardware acceleration operations for macvlan devices
From: John Fastabend @ 2013-10-07 22:39 UTC (permalink / raw)
  To: David Miller; +Cc: nhorman, netdev
In-Reply-To: <20131007.173433.163556658910279518.davem@davemloft.net>

On 10/07/2013 02:34 PM, David Miller wrote:
> From: Neil Horman <nhorman@tuxdriver.com>
> Date: Mon, 7 Oct 2013 17:20:00 -0400
>
>> That's me experimenting.  I was originally thinking that this functionality
>> might be grouped separately, so that we could handle it independently of the
>> standard network device operations (you might have noticed in v1 of my patch I
>> had a size_t variable in there, so I thought the separation might be
>> organizationally nice).  It was also something I was tinkering with for
>> potential future work to support other data plane accelerators (like the FM6000
>> switch chip from Intel) in a manner that didn't pollute the more typical host network
>> devices.  Like I said though, just experimenting at the moment....
>

We can do something like the dcbnl ops and add another pointer off
the net device structure and then use the skb->dev field to find the
correct set of ops? This seems like the simplest option to me and
isolates the ops structure.

Is there some information loss from hanging it off the netdevice
structure vs the skb? I can't see any.
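
Something along these lines is what I have in mind, as a userspace sketch
only.  All of the *_sketch types and the fwd_ops member are hypothetical
stand-ins, not existing kernel structures; the point is just that the ops
are reachable from skb->dev without adding anything to the skb.

```c
#include <assert.h>
#include <stddef.h>

struct net_device_sketch;

/* Hypothetical isolated ops structure, in the spirit of the dcbnl ops. */
struct l2_fwd_ops_sketch {
	int (*xmit)(struct net_device_sketch *dev, int len);
};

struct net_device_sketch {
	const char *name;
	const struct l2_fwd_ops_sketch *fwd_ops;	/* NULL: no offload */
};

struct sk_buff_sketch {
	struct net_device_sketch *dev;
	int len;
};

/* Find the ops through skb->dev; -1 means "use the software path". */
static int fwd_xmit_sketch(struct sk_buff_sketch *skb)
{
	const struct l2_fwd_ops_sketch *ops = skb->dev->fwd_ops;

	if (!ops || !ops->xmit)
		return -1;
	return ops->xmit(skb->dev, skb->len);
}

/* Demo "driver" op: pretend the hardware accepted len bytes. */
static int demo_xmit_sketch(struct net_device_sketch *dev, int len)
{
	(void)dev;
	return len;
}

static const struct l2_fwd_ops_sketch demo_ops_sketch = {
	.xmit = demo_xmit_sketch,
};
```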

> Can these dataplane devices still act like a normal networking port and
> send and receive packets at the host level?
>

Yes, they act like normal networking ports, except that there is a
switching component in the hardware. These patches are not looking at
virtual or multiple physical functions at the moment.

> If yes, that would be an extremely strong argument for netdev_ops.

I agree.


-- 
John Fastabend         Intel Corporation


* Re: [PATCH] net: Separate the close_list and the unreg_list v2
From: Eric W. Biederman @ 2013-10-07 22:45 UTC (permalink / raw)
  To: David Miller; +Cc: fruggeri, netdev
In-Reply-To: <20131007.152238.779958484281422820.davem@davemloft.net>

David Miller <davem@davemloft.net> writes:

> From: ebiederm@xmission.com (Eric W. Biederman)
> Date: Sat, 05 Oct 2013 19:26:05 -0700
>
>> 
>> Separate the unreg_list and the close_list in dev_close_many preventing
>> dev_close_many from permuting the unreg_list.  The permutations of the
>> unreg_list have resulted in cases where the loopback device is accessed
>> after it has been freed in code such as dst_ifdown, resulting in subtle
>> memory corruption.
>> 
>> This is the second bug from sharing the storage between the close_list
>> and the unreg_list.  The issues that crop up with sharing are
>> apparently too subtle to show up in normal testing or usage, so let's
>> forget about being clever and use two separate lists.
>> 
>> v2: Make all callers pass in a close_list to dev_close_many
>> 
>> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
>> ---
>> 
>> Sending the complete diff because this version is actually more
>> readable and more obviously correct.
>
> I'll apply this, thanks Eric.

Thanks.  It is good to see this getting sorted out.

Eric
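
For anyone following along, a minimal userspace sketch of why two separate
list nodes help.  The types and helpers below are illustrative only (a toy
intrusive list, not the kernel's list.h): with one node per list, queueing a
device for close cannot reorder it on the unregister list.

```c
#include <assert.h>
#include <stddef.h>

/* Toy circular doubly linked list node. */
struct list_node {
	struct list_node *prev, *next;
};

static void list_init_sketch(struct list_node *head)
{
	head->prev = head->next = head;
}

static void list_add_tail_sketch(struct list_node *n, struct list_node *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

/* One node per list, so membership in one never touches the other. */
struct dev_sketch {
	int id;
	struct list_node unreg_list;
	struct list_node close_list;
};

/* Map a list node back to the device that embeds it. */
#define dev_from(node, member) \
	((struct dev_sketch *)((char *)(node) - \
			       offsetof(struct dev_sketch, member)))
```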


* [PATCH 1/4] [RFC] net: Explicitly initialize u64_stats_sync structures for lockdep
From: John Stultz @ 2013-10-07 22:51 UTC (permalink / raw)
  To: LKML
  Cc: John Stultz, Eric Dumazet, Thomas Petazzoni, Mirko Lindner,
	Stephen Hemminger, Roger Luethi, Patrick McHardy, Rusty Russell,
	Michael S. Tsirkin, Alexey Kuznetsov, James Morris,
	Hideaki YOSHIFUJI, Wensong Zhang, Simon Horman, Julian Anastasov,
	Jesse Gross, Mathieu Desnoyers, Steven Rostedt, Peter Zijlstra,
	Ingo Molnar, Thomas Gleixner, David S. Miller, netdev
In-Reply-To: <1381186321-4906-1-git-send-email-john.stultz@linaro.org>

In order to enable lockdep on seqcount/seqlock structures, we
must explicitly initialize any locks.

The u64_stats_sync structure uses a seqcount, and thus we need
to introduce a u64_stats_init() function and use it to initialize
the structure.

This unfortunately adds a lot of fairly trivial initialization code
to a number of drivers, but the benefit of ensuring correctness makes
this worthwhile.

Because these changes are required for lockdep to be enabled, and the
changes are quite trivial, I've not yet split this patch out into 30-some
separate patches, as I figured it would be better to get the various
maintainers' thoughts on how best to merge this change along with
the seqcount lockdep enablement.

Feedback would be appreciated!

Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Cc: Mirko Lindner <mlindner@marvell.com>
Cc: Stephen Hemminger <stephen@networkplumber.org>
Cc: Roger Luethi <rl@hellgate.ch>
Cc: Patrick McHardy <kaber@trash.net>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Cc: James Morris <jmorris@namei.org>
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Cc: Wensong Zhang <wensong@linux-vs.org>
Cc: Simon Horman <horms@verge.net.au>
Cc: Julian Anastasov <ja@ssi.bg>
Cc: Jesse Gross <jesse@nicira.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: netdev@vger.kernel.org
Cc: netfilter-devel@vger.kernel.org
Signed-off-by: John Stultz <john.stultz@linaro.org>
---
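For readers less familiar with the pattern, a userspace sketch of a
seqcount-protected pair of 64-bit counters with an explicit init step.  This
is only an analogue built on C11 atomics with a single writer assumed; the
*_sketch names are hypothetical and it is not how u64_stats_sync is
implemented in the kernel.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical analogue of u64_stats_sync: a bare sequence count. */
struct stats_sync_sketch {
	atomic_uint seq;	/* even: stable, odd: writer in progress */
};

struct pcpu_stats_sketch {
	_Atomic uint64_t packets;
	_Atomic uint64_t bytes;
	struct stats_sync_sketch syncp;
};

/* The explicit init step that every user has to gain. */
static void stats_init_sketch(struct stats_sync_sketch *s)
{
	atomic_init(&s->seq, 0);
}

/* Writer: bump seq to odd, update the counters, bump seq back to even. */
static void stats_add_sketch(struct pcpu_stats_sketch *st, uint64_t len)
{
	unsigned int seq = atomic_load_explicit(&st->syncp.seq,
						memory_order_relaxed);

	atomic_store_explicit(&st->syncp.seq, seq + 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_release);
	atomic_fetch_add_explicit(&st->packets, 1, memory_order_relaxed);
	atomic_fetch_add_explicit(&st->bytes, len, memory_order_relaxed);
	atomic_store_explicit(&st->syncp.seq, seq + 2, memory_order_release);
}

/* Reader: retry until an even, unchanged sequence brackets the loads. */
static void stats_read_sketch(struct pcpu_stats_sketch *st,
			      uint64_t *packets, uint64_t *bytes)
{
	unsigned int start;

	do {
		start = atomic_load_explicit(&st->syncp.seq,
					     memory_order_acquire);
		*packets = atomic_load_explicit(&st->packets,
						memory_order_relaxed);
		*bytes = atomic_load_explicit(&st->bytes,
					      memory_order_relaxed);
		atomic_thread_fence(memory_order_acquire);
	} while ((start & 1) ||
		 start != atomic_load_explicit(&st->syncp.seq,
					       memory_order_relaxed));
}
```

The per-cpu loops in the diff below do the equivalent of stats_init_sketch()
once for each possible CPU's copy of the stats structure.
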
 drivers/net/dummy.c                            |  6 ++++++
 drivers/net/ethernet/emulex/benet/be_main.c    |  4 ++++
 drivers/net/ethernet/intel/igb/igb_main.c      |  5 +++++
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c  |  4 ++++
 drivers/net/ethernet/marvell/mvneta.c          |  3 +++
 drivers/net/ethernet/marvell/sky2.c            |  3 +++
 drivers/net/ethernet/neterion/vxge/vxge-main.c |  4 ++++
 drivers/net/ethernet/nvidia/forcedeth.c        |  2 ++
 drivers/net/ethernet/realtek/8139too.c         |  3 +++
 drivers/net/ethernet/tile/tilepro.c            |  2 ++
 drivers/net/ethernet/via/via-rhine.c           |  3 +++
 drivers/net/ifb.c                              |  5 +++++
 drivers/net/loopback.c                         |  6 ++++++
 drivers/net/macvlan.c                          |  7 +++++++
 drivers/net/nlmon.c                            |  8 ++++++++
 drivers/net/team/team.c                        |  6 ++++++
 drivers/net/team/team_mode_loadbalance.c       |  9 ++++++++-
 drivers/net/veth.c                             |  8 ++++++++
 drivers/net/virtio_net.c                       |  8 ++++++++
 drivers/net/vxlan.c                            |  8 ++++++++
 drivers/net/xen-netfront.c                     |  6 ++++++
 include/linux/u64_stats_sync.h                 |  7 +++++++
 net/8021q/vlan_dev.c                           |  9 ++++++++-
 net/bridge/br_device.c                         |  7 +++++++
 net/ipv4/af_inet.c                             | 14 ++++++++++++++
 net/ipv4/ip_tunnel.c                           |  8 +++++++-
 net/ipv6/addrconf.c                            | 14 ++++++++++++++
 net/ipv6/af_inet6.c                            | 14 ++++++++++++++
 net/ipv6/ip6_gre.c                             | 15 +++++++++++++++
 net/ipv6/ip6_tunnel.c                          |  7 +++++++
 net/ipv6/sit.c                                 | 15 +++++++++++++++
 net/netfilter/ipvs/ip_vs_ctl.c                 | 25 ++++++++++++++++++++++---
 net/openvswitch/datapath.c                     |  6 ++++++
 net/openvswitch/vport.c                        |  8 ++++++++
 34 files changed, 253 insertions(+), 6 deletions(-)

diff --git a/drivers/net/dummy.c b/drivers/net/dummy.c
index b710c6b..bd8f84b 100644
--- a/drivers/net/dummy.c
+++ b/drivers/net/dummy.c
@@ -88,10 +88,16 @@ static netdev_tx_t dummy_xmit(struct sk_buff *skb, struct net_device *dev)
 
 static int dummy_dev_init(struct net_device *dev)
 {
+	int i;
 	dev->dstats = alloc_percpu(struct pcpu_dstats);
 	if (!dev->dstats)
 		return -ENOMEM;
 
+	for_each_possible_cpu(i) {
+		struct pcpu_dstats *dstats;
+		dstats = per_cpu_ptr(dev->dstats, i);
+		u64_stats_init(&dstats->syncp);
+	}
 	return 0;
 }
 
diff --git a/drivers/net/ethernet/emulex/benet/be_main.c b/drivers/net/ethernet/emulex/benet/be_main.c
index 100b528..d2dcf2e 100644
--- a/drivers/net/ethernet/emulex/benet/be_main.c
+++ b/drivers/net/ethernet/emulex/benet/be_main.c
@@ -2033,6 +2033,9 @@ static int be_tx_qs_create(struct be_adapter *adapter)
 		if (status)
 			return status;
 
+		u64_stats_init(&txo->stats.sync);
+		u64_stats_init(&txo->stats.sync_compl);
+
 		/* If num_evt_qs is less than num_tx_qs, then more than
 		 * one txq share an eq
 		 */
@@ -2094,6 +2097,7 @@ static int be_rx_cqs_create(struct be_adapter *adapter)
 		if (rc)
 			return rc;
 
+		u64_stats_init(&rxo->stats.sync);
 		eq = &adapter->eq_obj[i % adapter->num_evt_qs].q;
 		rc = be_cmd_cq_create(adapter, cq, eq, false, 3);
 		if (rc)
diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
index 8cf44f2..b6edb93 100644
--- a/drivers/net/ethernet/intel/igb/igb_main.c
+++ b/drivers/net/ethernet/intel/igb/igb_main.c
@@ -1223,6 +1223,9 @@ static int igb_alloc_q_vector(struct igb_adapter *adapter,
 		ring->count = adapter->tx_ring_count;
 		ring->queue_index = txr_idx;
 
+		u64_stats_init(&ring->tx_syncp);
+		u64_stats_init(&ring->tx_syncp2);
+
 		/* assign ring to adapter */
 		adapter->tx_ring[txr_idx] = ring;
 
@@ -1256,6 +1259,8 @@ static int igb_alloc_q_vector(struct igb_adapter *adapter,
 		ring->count = adapter->rx_ring_count;
 		ring->queue_index = rxr_idx;
 
+		u64_stats_init(&ring->rx_syncp);
+
 		/* assign ring to adapter */
 		adapter->rx_ring[rxr_idx] = ring;
 	}
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
index 0ade0cd..c175036 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
@@ -4867,6 +4867,8 @@ int ixgbe_setup_tx_resources(struct ixgbe_ring *tx_ring)
 	if (!tx_ring->tx_buffer_info)
 		goto err;
 
+	u64_stats_init(&tx_ring->syncp);
+
 	/* round up to nearest 4K */
 	tx_ring->size = tx_ring->count * sizeof(union ixgbe_adv_tx_desc);
 	tx_ring->size = ALIGN(tx_ring->size, 4096);
@@ -4949,6 +4951,8 @@ int ixgbe_setup_rx_resources(struct ixgbe_ring *rx_ring)
 	if (!rx_ring->rx_buffer_info)
 		goto err;
 
+	u64_stats_init(&rx_ring->syncp);
+
 	/* Round up to nearest 4K */
 	rx_ring->size = rx_ring->count * sizeof(union ixgbe_adv_rx_desc);
 	rx_ring->size = ALIGN(rx_ring->size, 4096);
diff --git a/drivers/net/ethernet/marvell/mvneta.c b/drivers/net/ethernet/marvell/mvneta.c
index e35bac7..cb4635c 100644
--- a/drivers/net/ethernet/marvell/mvneta.c
+++ b/drivers/net/ethernet/marvell/mvneta.c
@@ -2792,6 +2792,9 @@ static int mvneta_probe(struct platform_device *pdev)
 
 	pp = netdev_priv(dev);
 
+	u64_stats_init(&pp->tx_stats.syncp);
+	u64_stats_init(&pp->rx_stats.syncp);
+
 	pp->weight = MVNETA_RX_POLL_WEIGHT;
 	pp->phy_node = phy_node;
 	pp->phy_interface = phy_mode;
diff --git a/drivers/net/ethernet/marvell/sky2.c b/drivers/net/ethernet/marvell/sky2.c
index e09a8c6..339d841 100644
--- a/drivers/net/ethernet/marvell/sky2.c
+++ b/drivers/net/ethernet/marvell/sky2.c
@@ -4763,6 +4763,9 @@ static struct net_device *sky2_init_netdev(struct sky2_hw *hw, unsigned port,
 	sky2->hw = hw;
 	sky2->msg_enable = netif_msg_init(debug, default_msg);
 
+	u64_stats_init(&sky2->tx_stats.syncp);
+	u64_stats_init(&sky2->rx_stats.syncp);
+
 	/* Auto speed and flow control */
 	sky2->flags = SKY2_FLAG_AUTO_SPEED | SKY2_FLAG_AUTO_PAUSE;
 	if (hw->chip_id != CHIP_ID_YUKON_XL)
diff --git a/drivers/net/ethernet/neterion/vxge/vxge-main.c b/drivers/net/ethernet/neterion/vxge/vxge-main.c
index 5a20eaf..44626ec 100644
--- a/drivers/net/ethernet/neterion/vxge/vxge-main.c
+++ b/drivers/net/ethernet/neterion/vxge/vxge-main.c
@@ -2072,6 +2072,10 @@ static int vxge_open_vpaths(struct vxgedev *vdev)
 				vdev->config.tx_steering_type;
 			vpath->fifo.ndev = vdev->ndev;
 			vpath->fifo.pdev = vdev->pdev;
+
+			u64_stats_init(&vpath->fifo.stats.syncp);
+			u64_stats_init(&vpath->ring.stats.syncp);
+
 			if (vdev->config.tx_steering_type)
 				vpath->fifo.txq =
 					netdev_get_tx_queue(vdev->ndev, i);
diff --git a/drivers/net/ethernet/nvidia/forcedeth.c b/drivers/net/ethernet/nvidia/forcedeth.c
index 098b96d..2d045be 100644
--- a/drivers/net/ethernet/nvidia/forcedeth.c
+++ b/drivers/net/ethernet/nvidia/forcedeth.c
@@ -5619,6 +5619,8 @@ static int nv_probe(struct pci_dev *pci_dev, const struct pci_device_id *id)
 	spin_lock_init(&np->lock);
 	spin_lock_init(&np->hwstats_lock);
 	SET_NETDEV_DEV(dev, &pci_dev->dev);
+	u64_stats_init(&np->swstats_rx_syncp);
+	u64_stats_init(&np->swstats_tx_syncp);
 
 	init_timer(&np->oom_kick);
 	np->oom_kick.data = (unsigned long) dev;
diff --git a/drivers/net/ethernet/realtek/8139too.c b/drivers/net/ethernet/realtek/8139too.c
index 3ccedeb..c40e9848 100644
--- a/drivers/net/ethernet/realtek/8139too.c
+++ b/drivers/net/ethernet/realtek/8139too.c
@@ -791,6 +791,9 @@ static struct net_device *rtl8139_init_board(struct pci_dev *pdev)
 
 	pci_set_master (pdev);
 
+	u64_stats_init(&tp->rx_stats.syncp);
+	u64_stats_init(&tp->tx_stats.syncp);
+
 retry:
 	/* PIO bar register comes first. */
 	bar = !use_io;
diff --git a/drivers/net/ethernet/tile/tilepro.c b/drivers/net/ethernet/tile/tilepro.c
index 106be47..edb2e12 100644
--- a/drivers/net/ethernet/tile/tilepro.c
+++ b/drivers/net/ethernet/tile/tilepro.c
@@ -1008,6 +1008,8 @@ static void tile_net_register(void *dev_ptr)
 	info->egress_timer.data = (long)info;
 	info->egress_timer.function = tile_net_handle_egress_timer;
 
+	u64_stats_init(&info->stats.syncp);
+
 	priv->cpu[my_cpu] = info;
 
 	/*
diff --git a/drivers/net/ethernet/via/via-rhine.c b/drivers/net/ethernet/via/via-rhine.c
index c8f088a..13cade2 100644
--- a/drivers/net/ethernet/via/via-rhine.c
+++ b/drivers/net/ethernet/via/via-rhine.c
@@ -987,6 +987,9 @@ static int rhine_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
 
 	rp->base = ioaddr;
 
+	u64_stats_init(&rp->tx_stats.syncp);
+	u64_stats_init(&rp->rx_stats.syncp);
+
 	/* Get chip registers into a sane state */
 	rhine_power_init(dev);
 	rhine_hw_init(dev, pioaddr);
diff --git a/drivers/net/ifb.c b/drivers/net/ifb.c
index a3bed28..c14d39b 100644
--- a/drivers/net/ifb.c
+++ b/drivers/net/ifb.c
@@ -265,6 +265,7 @@ MODULE_PARM_DESC(numifbs, "Number of ifb devices");
 static int __init ifb_init_one(int index)
 {
 	struct net_device *dev_ifb;
+	struct ifb_private *dp;
 	int err;
 
 	dev_ifb = alloc_netdev(sizeof(struct ifb_private),
@@ -273,6 +274,10 @@ static int __init ifb_init_one(int index)
 	if (!dev_ifb)
 		return -ENOMEM;
 
+	dp = netdev_priv(dev_ifb);
+	u64_stats_init(&dp->rsync);
+	u64_stats_init(&dp->tsync);
+
 	dev_ifb->rtnl_link_ops = &ifb_link_ops;
 	err = register_netdevice(dev_ifb);
 	if (err < 0)
diff --git a/drivers/net/loopback.c b/drivers/net/loopback.c
index a17d85a..ac24c27 100644
--- a/drivers/net/loopback.c
+++ b/drivers/net/loopback.c
@@ -137,10 +137,16 @@ static const struct ethtool_ops loopback_ethtool_ops = {
 
 static int loopback_dev_init(struct net_device *dev)
 {
+	int i;
 	dev->lstats = alloc_percpu(struct pcpu_lstats);
 	if (!dev->lstats)
 		return -ENOMEM;
 
+	for_each_possible_cpu(i) {
+		struct pcpu_lstats *lb_stats;
+		lb_stats = per_cpu_ptr(dev->lstats, i);
+		u64_stats_init(&lb_stats->syncp);
+	}
 	return 0;
 }
 
diff --git a/drivers/net/macvlan.c b/drivers/net/macvlan.c
index 9bf46bd..0924e51b 100644
--- a/drivers/net/macvlan.c
+++ b/drivers/net/macvlan.c
@@ -501,6 +501,7 @@ static int macvlan_init(struct net_device *dev)
 {
 	struct macvlan_dev *vlan = netdev_priv(dev);
 	const struct net_device *lowerdev = vlan->lowerdev;
+	int i;
 
 	dev->state		= (dev->state & ~MACVLAN_STATE_MASK) |
 				  (lowerdev->state & MACVLAN_STATE_MASK);
@@ -516,6 +517,12 @@ static int macvlan_init(struct net_device *dev)
 	if (!vlan->pcpu_stats)
 		return -ENOMEM;
 
+	for_each_possible_cpu(i) {
+		struct macvlan_pcpu_stats *mvlstats;
+		mvlstats = per_cpu_ptr(vlan->pcpu_stats, i);
+		u64_stats_init(&mvlstats->syncp);
+	}
+
 	return 0;
 }
 
diff --git a/drivers/net/nlmon.c b/drivers/net/nlmon.c
index b57ce5f..d2bb12b 100644
--- a/drivers/net/nlmon.c
+++ b/drivers/net/nlmon.c
@@ -47,8 +47,16 @@ static int nlmon_change_mtu(struct net_device *dev, int new_mtu)
 
 static int nlmon_dev_init(struct net_device *dev)
 {
+	int i;
+
 	dev->lstats = alloc_percpu(struct pcpu_lstats);
 
+	for_each_possible_cpu(i) {
+		struct pcpu_lstats *nlmstats;
+		nlmstats = per_cpu_ptr(dev->lstats, i);
+		u64_stats_init(&nlmstats->syncp);
+	}
+
 	return dev->lstats == NULL ? -ENOMEM : 0;
 }
 
diff --git a/drivers/net/team/team.c b/drivers/net/team/team.c
index 50e43e6..6574eb8 100644
--- a/drivers/net/team/team.c
+++ b/drivers/net/team/team.c
@@ -1540,6 +1540,12 @@ static int team_init(struct net_device *dev)
 	if (!team->pcpu_stats)
 		return -ENOMEM;
 
+	for_each_possible_cpu(i) {
+		struct team_pcpu_stats *team_stats;
+		team_stats = per_cpu_ptr(team->pcpu_stats, i);
+		u64_stats_init(&team_stats->syncp);
+	}
+
 	for (i = 0; i < TEAM_PORT_HASHENTRIES; i++)
 		INIT_HLIST_HEAD(&team->en_port_hlist[i]);
 	INIT_LIST_HEAD(&team->port_list);
diff --git a/drivers/net/team/team_mode_loadbalance.c b/drivers/net/team/team_mode_loadbalance.c
index 829a9cd..d671fc3 100644
--- a/drivers/net/team/team_mode_loadbalance.c
+++ b/drivers/net/team/team_mode_loadbalance.c
@@ -570,7 +570,7 @@ static int lb_init(struct team *team)
 {
 	struct lb_priv *lb_priv = get_lb_priv(team);
 	lb_select_tx_port_func_t *func;
-	int err;
+	int i, err;
 
 	/* set default tx port selector */
 	func = lb_select_tx_port_get_func("hash");
@@ -588,6 +588,13 @@ static int lb_init(struct team *team)
 		goto err_alloc_pcpu_stats;
 	}
 
+	for_each_possible_cpu(i) {
+		struct lb_pcpu_stats *team_lb_stats;
+		team_lb_stats = per_cpu_ptr(lb_priv->pcpu_stats, i);
+		u64_stats_init(&team_lb_stats->syncp);
+	}
+
+
 	INIT_DELAYED_WORK(&lb_priv->ex->stats.refresh_dw, lb_stats_refresh);
 
 	err = team_options_register(team, lb_options, ARRAY_SIZE(lb_options));
diff --git a/drivers/net/veth.c b/drivers/net/veth.c
index eee1f19..46e83e3 100644
--- a/drivers/net/veth.c
+++ b/drivers/net/veth.c
@@ -230,10 +230,18 @@ static int veth_change_mtu(struct net_device *dev, int new_mtu)
 
 static int veth_dev_init(struct net_device *dev)
 {
+	int i;
+
 	dev->vstats = alloc_percpu(struct pcpu_vstats);
 	if (!dev->vstats)
 		return -ENOMEM;
 
+	for_each_possible_cpu(i) {
+		struct pcpu_vstats *veth_stats;
+		veth_stats = per_cpu_ptr(dev->vstats, i);
+		u64_stats_init(&veth_stats->syncp);
+	}
+
 	return 0;
 }
 
diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index defec2b..bd12772 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -1559,6 +1559,14 @@ static int virtnet_probe(struct virtio_device *vdev)
 	if (vi->stats == NULL)
 		goto free;
 
+	for_each_possible_cpu(i) {
+		struct virtnet_stats *virtnet_stats;
+		virtnet_stats = per_cpu_ptr(vi->stats, i);
+		u64_stats_init(&virtnet_stats->tx_syncp);
+		u64_stats_init(&virtnet_stats->rx_syncp);
+	}
+
+
 	vi->vq_index = alloc_percpu(int);
 	if (vi->vq_index == NULL)
 		goto free_stats;
diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c
index d1292fe..2e4cdc8 100644
--- a/drivers/net/vxlan.c
+++ b/drivers/net/vxlan.c
@@ -1886,11 +1886,19 @@ static int vxlan_init(struct net_device *dev)
 	struct vxlan_dev *vxlan = netdev_priv(dev);
 	struct vxlan_net *vn = net_generic(dev_net(dev), vxlan_net_id);
 	struct vxlan_sock *vs;
+	int i;
 
 	dev->tstats = alloc_percpu(struct pcpu_tstats);
 	if (!dev->tstats)
 		return -ENOMEM;
 
+	for_each_possible_cpu(i) {
+		struct pcpu_tstats *vxlan_stats;
+		vxlan_stats = per_cpu_ptr(dev->tstats, i);
+		u64_stats_init(&vxlan_stats->syncp);
+	}
+
+
 	spin_lock(&vn->sock_lock);
 	vs = vxlan_find_sock(dev_net(dev), vxlan->dst_port);
 	if (vs) {
diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c
index 36808bf..54223ac 100644
--- a/drivers/net/xen-netfront.c
+++ b/drivers/net/xen-netfront.c
@@ -1338,6 +1338,12 @@ static struct net_device *xennet_create_dev(struct xenbus_device *dev)
 	if (np->stats == NULL)
 		goto exit;
 
+	for_each_possible_cpu(i) {
+		struct netfront_stats *xen_nf_stats;
+		xen_nf_stats = per_cpu_ptr(np->stats, i);
+		u64_stats_init(&xen_nf_stats->syncp);
+	}
+
 	/* Initialise tx_skbs as a free chain containing every entry. */
 	np->tx_skb_freelist = 0;
 	for (i = 0; i < NET_TX_RING_SIZE; i++) {
diff --git a/include/linux/u64_stats_sync.h b/include/linux/u64_stats_sync.h
index 8da8c4e..7bfabd2 100644
--- a/include/linux/u64_stats_sync.h
+++ b/include/linux/u64_stats_sync.h
@@ -67,6 +67,13 @@ struct u64_stats_sync {
 #endif
 };
 
+
+#if BITS_PER_LONG == 32 && defined(CONFIG_SMP)
+# define u64_stats_init(syncp)	seqcount_init(&(syncp)->seq)
+#else
+# define u64_stats_init(syncp)	do { } while (0)
+#endif
+
 static inline void u64_stats_update_begin(struct u64_stats_sync *syncp)
 {
 #if BITS_PER_LONG==32 && defined(CONFIG_SMP)
diff --git a/net/8021q/vlan_dev.c b/net/8021q/vlan_dev.c
index 09bf1c3..4deff3e 100644
--- a/net/8021q/vlan_dev.c
+++ b/net/8021q/vlan_dev.c
@@ -558,7 +558,7 @@ static const struct net_device_ops vlan_netdev_ops;
 static int vlan_dev_init(struct net_device *dev)
 {
 	struct net_device *real_dev = vlan_dev_priv(dev)->real_dev;
-	int subclass = 0;
+	int subclass = 0, i;
 
 	netif_carrier_off(dev);
 
@@ -612,6 +612,13 @@ static int vlan_dev_init(struct net_device *dev)
 	if (!vlan_dev_priv(dev)->vlan_pcpu_stats)
 		return -ENOMEM;
 
+	for_each_possible_cpu(i) {
+		struct vlan_pcpu_stats *vlan_stat;
+		vlan_stat = per_cpu_ptr(vlan_dev_priv(dev)->vlan_pcpu_stats, i);
+		u64_stats_init(&vlan_stat->syncp);
+	}
+
+
 	return 0;
 }
 
diff --git a/net/bridge/br_device.c b/net/bridge/br_device.c
index ca04163..7893d64 100644
--- a/net/bridge/br_device.c
+++ b/net/bridge/br_device.c
@@ -88,11 +88,18 @@ out:
 static int br_dev_init(struct net_device *dev)
 {
 	struct net_bridge *br = netdev_priv(dev);
+	int i;
 
 	br->stats = alloc_percpu(struct br_cpu_netstats);
 	if (!br->stats)
 		return -ENOMEM;
 
+	for_each_possible_cpu(i) {
+		struct br_cpu_netstats *br_dev_stats;
+		br_dev_stats = per_cpu_ptr(br->stats, i);
+		u64_stats_init(&br_dev_stats->syncp);
+	}
+
 	return 0;
 }
 
diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
index 7a1874b..f40ce62 100644
--- a/net/ipv4/af_inet.c
+++ b/net/ipv4/af_inet.c
@@ -1520,6 +1520,7 @@ int snmp_mib_init(void __percpu *ptr[2], size_t mibsize, size_t align)
 	ptr[0] = __alloc_percpu(mibsize, align);
 	if (!ptr[0])
 		return -ENOMEM;
+
 #if SNMP_ARRAY_SZ == 2
 	ptr[1] = __alloc_percpu(mibsize, align);
 	if (!ptr[1]) {
@@ -1563,6 +1564,8 @@ static const struct net_protocol icmp_protocol = {
 
 static __net_init int ipv4_mib_init_net(struct net *net)
 {
+	int i;
+
 	if (snmp_mib_init((void __percpu **)net->mib.tcp_statistics,
 			  sizeof(struct tcp_mib),
 			  __alignof__(struct tcp_mib)) < 0)
@@ -1571,6 +1574,17 @@ static __net_init int ipv4_mib_init_net(struct net *net)
 			  sizeof(struct ipstats_mib),
 			  __alignof__(struct ipstats_mib)) < 0)
 		goto err_ip_mib;
+
+	for_each_possible_cpu(i) {
+		struct ipstats_mib *af_inet_stats;
+		af_inet_stats = per_cpu_ptr(net->mib.ip_statistics[0], i);
+		u64_stats_init(&af_inet_stats->syncp);
+#if SNMP_ARRAY_SZ == 2
+		af_inet_stats = per_cpu_ptr(net->mib.ip_statistics[1], i);
+		u64_stats_init(&af_inet_stats->syncp);
+#endif
+	}
+
 	if (snmp_mib_init((void __percpu **)net->mib.net_statistics,
 			  sizeof(struct linux_mib),
 			  __alignof__(struct linux_mib)) < 0)
diff --git a/net/ipv4/ip_tunnel.c b/net/ipv4/ip_tunnel.c
index ac9fabe..2b9c945 100644
--- a/net/ipv4/ip_tunnel.c
+++ b/net/ipv4/ip_tunnel.c
@@ -976,13 +976,19 @@ int ip_tunnel_init(struct net_device *dev)
 {
 	struct ip_tunnel *tunnel = netdev_priv(dev);
 	struct iphdr *iph = &tunnel->parms.iph;
-	int err;
+	int i, err;
 
 	dev->destructor	= ip_tunnel_dev_free;
 	dev->tstats = alloc_percpu(struct pcpu_tstats);
 	if (!dev->tstats)
 		return -ENOMEM;
 
+	for_each_possible_cpu(i) {
+		struct pcpu_tstats *ipt_stats;
+		ipt_stats = per_cpu_ptr(dev->tstats, i);
+		u64_stats_init(&ipt_stats->syncp);
+	}
+
 	err = gro_cells_init(&tunnel->gro_cells, dev);
 	if (err) {
 		free_percpu(dev->tstats);
diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index d6ff126..390953c 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -281,10 +281,24 @@ static void addrconf_mod_dad_timer(struct inet6_ifaddr *ifp,
 
 static int snmp6_alloc_dev(struct inet6_dev *idev)
 {
+	int i;
+
 	if (snmp_mib_init((void __percpu **)idev->stats.ipv6,
 			  sizeof(struct ipstats_mib),
 			  __alignof__(struct ipstats_mib)) < 0)
 		goto err_ip;
+
+	for_each_possible_cpu(i) {
+		struct ipstats_mib *addrconf_stats;
+		addrconf_stats = per_cpu_ptr(idev->stats.ipv6[0], i);
+		u64_stats_init(&addrconf_stats->syncp);
+#if SNMP_ARRAY_SZ == 2
+		addrconf_stats = per_cpu_ptr(idev->stats.ipv6[1], i);
+		u64_stats_init(&addrconf_stats->syncp);
+#endif
+	}
+
+
 	idev->stats.icmpv6dev = kzalloc(sizeof(struct icmpv6_mib_device),
 					GFP_KERNEL);
 	if (!idev->stats.icmpv6dev)
diff --git a/net/ipv6/af_inet6.c b/net/ipv6/af_inet6.c
index 7c96100..a8f8559 100644
--- a/net/ipv6/af_inet6.c
+++ b/net/ipv6/af_inet6.c
@@ -719,6 +719,8 @@ static void ipv6_packet_cleanup(void)
 
 static int __net_init ipv6_init_mibs(struct net *net)
 {
+	int i;
+
 	if (snmp_mib_init((void __percpu **)net->mib.udp_stats_in6,
 			  sizeof(struct udp_mib),
 			  __alignof__(struct udp_mib)) < 0)
@@ -731,6 +733,18 @@ static int __net_init ipv6_init_mibs(struct net *net)
 			  sizeof(struct ipstats_mib),
 			  __alignof__(struct ipstats_mib)) < 0)
 		goto err_ip_mib;
+
+	for_each_possible_cpu(i) {
+		struct ipstats_mib *af_inet6_stats;
+		af_inet6_stats = per_cpu_ptr(net->mib.ipv6_statistics[0], i);
+		u64_stats_init(&af_inet6_stats->syncp);
+#if SNMP_ARRAY_SZ == 2
+		af_inet6_stats = per_cpu_ptr(net->mib.ipv6_statistics[1], i);
+		u64_stats_init(&af_inet6_stats->syncp);
+#endif
+	}
+
+
 	if (snmp_mib_init((void __percpu **)net->mib.icmpv6_statistics,
 			  sizeof(struct icmpv6_mib),
 			  __alignof__(struct icmpv6_mib)) < 0)
diff --git a/net/ipv6/ip6_gre.c b/net/ipv6/ip6_gre.c
index 6b26e9f..b355cb0 100644
--- a/net/ipv6/ip6_gre.c
+++ b/net/ipv6/ip6_gre.c
@@ -1254,6 +1254,7 @@ static void ip6gre_tunnel_setup(struct net_device *dev)
 static int ip6gre_tunnel_init(struct net_device *dev)
 {
 	struct ip6_tnl *tunnel;
+	int i;
 
 	tunnel = netdev_priv(dev);
 
@@ -1271,6 +1272,13 @@ static int ip6gre_tunnel_init(struct net_device *dev)
 	if (!dev->tstats)
 		return -ENOMEM;
 
+	for_each_possible_cpu(i) {
+		struct pcpu_tstats *ip6gre_tunnel_stats;
+		ip6gre_tunnel_stats = per_cpu_ptr(dev->tstats, i);
+		u64_stats_init(&ip6gre_tunnel_stats->syncp);
+	}
+
+
 	return 0;
 }
 
@@ -1451,6 +1459,7 @@ static void ip6gre_netlink_parms(struct nlattr *data[],
 static int ip6gre_tap_init(struct net_device *dev)
 {
 	struct ip6_tnl *tunnel;
+	int i;
 
 	tunnel = netdev_priv(dev);
 
@@ -1464,6 +1473,12 @@ static int ip6gre_tap_init(struct net_device *dev)
 	if (!dev->tstats)
 		return -ENOMEM;
 
+	for_each_possible_cpu(i) {
+		struct pcpu_tstats *ip6gre_tap_stats;
+		ip6gre_tap_stats = per_cpu_ptr(dev->tstats, i);
+		u64_stats_init(&ip6gre_tap_stats->syncp);
+	}
+
 	return 0;
 }
 
diff --git a/net/ipv6/ip6_tunnel.c b/net/ipv6/ip6_tunnel.c
index 2d8f482..b0e3aa1 100644
--- a/net/ipv6/ip6_tunnel.c
+++ b/net/ipv6/ip6_tunnel.c
@@ -1486,12 +1486,19 @@ static inline int
 ip6_tnl_dev_init_gen(struct net_device *dev)
 {
 	struct ip6_tnl *t = netdev_priv(dev);
+	int i;
 
 	t->dev = dev;
 	t->net = dev_net(dev);
 	dev->tstats = alloc_percpu(struct pcpu_tstats);
 	if (!dev->tstats)
 		return -ENOMEM;
+
+	for_each_possible_cpu(i) {
+		struct pcpu_tstats *ip6_tnl_stats;
+		ip6_tnl_stats = per_cpu_ptr(dev->tstats, i);
+		u64_stats_init(&ip6_tnl_stats->syncp);
+	}
 	return 0;
 }
 
diff --git a/net/ipv6/sit.c b/net/ipv6/sit.c
index 7ee5cb9..24889fc 100644
--- a/net/ipv6/sit.c
+++ b/net/ipv6/sit.c
@@ -1256,6 +1256,7 @@ static void ipip6_tunnel_setup(struct net_device *dev)
 static int ipip6_tunnel_init(struct net_device *dev)
 {
 	struct ip_tunnel *tunnel = netdev_priv(dev);
+	int i;
 
 	tunnel->dev = dev;
 	tunnel->net = dev_net(dev);
@@ -1268,6 +1269,12 @@ static int ipip6_tunnel_init(struct net_device *dev)
 	if (!dev->tstats)
 		return -ENOMEM;
 
+	for_each_possible_cpu(i) {
+		struct pcpu_tstats *ipip6_tunnel_stats;
+		ipip6_tunnel_stats = per_cpu_ptr(dev->tstats, i);
+		u64_stats_init(&ipip6_tunnel_stats->syncp);
+	}
+
 	return 0;
 }
 
@@ -1277,6 +1284,7 @@ static int __net_init ipip6_fb_tunnel_init(struct net_device *dev)
 	struct iphdr *iph = &tunnel->parms.iph;
 	struct net *net = dev_net(dev);
 	struct sit_net *sitn = net_generic(net, sit_net_id);
+	int i;
 
 	tunnel->dev = dev;
 	tunnel->net = dev_net(dev);
@@ -1290,6 +1298,13 @@ static int __net_init ipip6_fb_tunnel_init(struct net_device *dev)
 	dev->tstats = alloc_percpu(struct pcpu_tstats);
 	if (!dev->tstats)
 		return -ENOMEM;
+
+	for_each_possible_cpu(i) {
+		struct pcpu_tstats *ipip6_fb_stats;
+		ipip6_fb_stats = per_cpu_ptr(dev->tstats, i);
+		u64_stats_init(&ipip6_fb_stats->syncp);
+	}
+
 	dev_hold(dev);
 	rcu_assign_pointer(sitn->tunnels_wc[0], tunnel);
 	return 0;
diff --git a/net/netfilter/ipvs/ip_vs_ctl.c b/net/netfilter/ipvs/ip_vs_ctl.c
index c8148e4..5c54c23 100644
--- a/net/netfilter/ipvs/ip_vs_ctl.c
+++ b/net/netfilter/ipvs/ip_vs_ctl.c
@@ -836,7 +836,7 @@ ip_vs_new_dest(struct ip_vs_service *svc, struct ip_vs_dest_user_kern *udest,
 	       struct ip_vs_dest **dest_p)
 {
 	struct ip_vs_dest *dest;
-	unsigned int atype;
+	unsigned int atype, i;
 
 	EnterFunction(2);
 
@@ -863,6 +863,12 @@ ip_vs_new_dest(struct ip_vs_service *svc, struct ip_vs_dest_user_kern *udest,
 	if (!dest->stats.cpustats)
 		goto err_alloc;
 
+	for_each_possible_cpu(i) {
+		struct ip_vs_cpu_stats *ip_vs_dest_stats;
+		ip_vs_dest_stats = per_cpu_ptr(dest->stats.cpustats, i);
+		u64_stats_init(&ip_vs_dest_stats->syncp);
+	}
+
 	dest->af = svc->af;
 	dest->protocol = svc->protocol;
 	dest->vaddr = svc->addr;
@@ -1136,7 +1142,7 @@ static int
 ip_vs_add_service(struct net *net, struct ip_vs_service_user_kern *u,
 		  struct ip_vs_service **svc_p)
 {
-	int ret = 0;
+	int ret = 0, i;
 	struct ip_vs_scheduler *sched = NULL;
 	struct ip_vs_pe *pe = NULL;
 	struct ip_vs_service *svc = NULL;
@@ -1186,6 +1192,13 @@ ip_vs_add_service(struct net *net, struct ip_vs_service_user_kern *u,
 		goto out_err;
 	}
 
+	for_each_possible_cpu(i) {
+		struct ip_vs_cpu_stats *ip_vs_stats;
+		ip_vs_stats = per_cpu_ptr(svc->stats.cpustats, i);
+		u64_stats_init(&ip_vs_stats->syncp);
+	}
+
+
 	/* I'm the first user of the service */
 	atomic_set(&svc->refcnt, 0);
 
@@ -3796,7 +3809,7 @@ static struct notifier_block ip_vs_dst_notifier = {
 
 int __net_init ip_vs_control_net_init(struct net *net)
 {
-	int idx;
+	int i, idx;
 	struct netns_ipvs *ipvs = net_ipvs(net);
 
 	/* Initialize rs_table */
@@ -3815,6 +3828,12 @@ int __net_init ip_vs_control_net_init(struct net *net)
 	if (!ipvs->tot_stats.cpustats)
 		return -ENOMEM;
 
+	for_each_possible_cpu(i) {
+		struct ip_vs_cpu_stats *ipvs_tot_stats;
+		ipvs_tot_stats = per_cpu_ptr(ipvs->tot_stats.cpustats, i);
+		u64_stats_init(&ipvs_tot_stats->syncp);
+	}
+
 	spin_lock_init(&ipvs->tot_stats.lock);
 
 	proc_create("ip_vs", 0, net->proc_net, &ip_vs_info_fops);
diff --git a/net/openvswitch/datapath.c b/net/openvswitch/datapath.c
index 2aa13bd..b92553c 100644
--- a/net/openvswitch/datapath.c
+++ b/net/openvswitch/datapath.c
@@ -1698,6 +1698,12 @@ static int ovs_dp_cmd_new(struct sk_buff *skb, struct genl_info *info)
 		goto err_destroy_table;
 	}
 
+	for_each_possible_cpu(i) {
+		struct dp_stats_percpu *dpath_stats;
+		dpath_stats = per_cpu_ptr(dp->stats_percpu, i);
+		u64_stats_init(&dpath_stats->sync);
+	}
+
 	dp->ports = kmalloc(DP_VPORT_HASH_BUCKETS * sizeof(struct hlist_head),
 			GFP_KERNEL);
 	if (!dp->ports) {
diff --git a/net/openvswitch/vport.c b/net/openvswitch/vport.c
index 6f65dbe..d830a95f 100644
--- a/net/openvswitch/vport.c
+++ b/net/openvswitch/vport.c
@@ -118,6 +118,7 @@ struct vport *ovs_vport_alloc(int priv_size, const struct vport_ops *ops,
 {
 	struct vport *vport;
 	size_t alloc_size;
+	int i;
 
 	alloc_size = sizeof(struct vport);
 	if (priv_size) {
@@ -141,6 +142,13 @@ struct vport *ovs_vport_alloc(int priv_size, const struct vport_ops *ops,
 		return ERR_PTR(-ENOMEM);
 	}
 
+	for_each_possible_cpu(i) {
+		struct pcpu_tstats *vport_stats;
+		vport_stats = per_cpu_ptr(vport->percpu_stats, i);
+		u64_stats_init(&vport_stats->syncp);
+	}
+
+
 	spin_lock_init(&vport->stats_lock);
 
 	return vport;
-- 
1.8.1.2
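
For readers without a kernel tree at hand, the pattern this patch repeats
(allocate per-CPU stats, then explicitly initialize each slot's sync object
before first use) can be sketched in userspace roughly as follows.  This is
only an illustration under stated assumptions: every fake_* name and
NR_FAKE_CPUS are hypothetical, C11 atomics stand in for the kernel's
seqcount, and each slot is assumed to have a single writer, as per-CPU data
normally does.  It is not the kernel implementation.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define NR_FAKE_CPUS 4	/* hypothetical stand-in for the possible-CPU set */

struct fake_sync {
	atomic_uint seq;	/* even: stable, odd: update in progress */
};

struct fake_stats {
	_Atomic uint32_t lo;	/* low half of a 64-bit counter */
	_Atomic uint32_t hi;	/* high half of a 64-bit counter */
	struct fake_sync syncp;
};

/* Explicit init, the userspace analogue of u64_stats_init(). */
static void fake_sync_init(struct fake_sync *s)
{
	atomic_init(&s->seq, 0);
}

/* Mirrors the for_each_possible_cpu() loops added by the patch. */
static void fake_stats_init_all(struct fake_stats *st, int ncpus)
{
	for (int i = 0; i < ncpus; i++) {
		atomic_init(&st[i].lo, 0);
		atomic_init(&st[i].hi, 0);
		fake_sync_init(&st[i].syncp);
	}
}

/* Writer side; assumes one writer per slot. */
static void fake_stats_add(struct fake_stats *st, uint64_t n)
{
	unsigned int seq = atomic_load_explicit(&st->syncp.seq,
						memory_order_relaxed);
	uint64_t v;

	/* Make the sequence odd so readers know an update is in flight. */
	atomic_store_explicit(&st->syncp.seq, seq + 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_release);

	v = ((uint64_t)atomic_load_explicit(&st->hi, memory_order_relaxed) << 32) |
	    atomic_load_explicit(&st->lo, memory_order_relaxed);
	v += n;
	atomic_store_explicit(&st->lo, (uint32_t)v, memory_order_relaxed);
	atomic_store_explicit(&st->hi, (uint32_t)(v >> 32), memory_order_relaxed);

	/* Back to even: the two halves are consistent again. */
	atomic_store_explicit(&st->syncp.seq, seq + 2, memory_order_release);
}

/* Reader side: retry until both halves come from the same update. */
static uint64_t fake_stats_read(struct fake_stats *st)
{
	unsigned int s1, s2;
	uint32_t lo, hi;

	do {
		s1 = atomic_load_explicit(&st->syncp.seq, memory_order_acquire);
		lo = atomic_load_explicit(&st->lo, memory_order_relaxed);
		hi = atomic_load_explicit(&st->hi, memory_order_relaxed);
		atomic_thread_fence(memory_order_acquire);
		s2 = atomic_load_explicit(&st->syncp.seq, memory_order_relaxed);
	} while ((s1 & 1) || s1 != s2);

	return ((uint64_t)hi << 32) | lo;
}

/* Fold all slots into one total, as a stats getter would. */
static uint64_t fake_stats_sum(struct fake_stats *st, int ncpus)
{
	uint64_t total = 0;

	for (int i = 0; i < ncpus; i++)
		total += fake_stats_read(&st[i]);
	return total;
}
```

The counter is split into two 32-bit halves on purpose: that is the case
(32-bit SMP) where a plain 64-bit read could tear and where the sequence
counter, and therefore its initialization, actually matters.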


^ permalink raw reply related

* Re: bug in passing file descriptors
From: Andi Kleen @ 2013-10-07 22:55 UTC (permalink / raw)
  To: David Miller; +Cc: sar, luto, netdev, mtk.manpages, ebiederm
In-Reply-To: <20131007.173237.1669132001431607341.davem@davemloft.net>

David Miller <davem@davemloft.net> writes:

> From: Steve Rago <sar@nec-labs.com>
> Date: Mon, 7 Oct 2013 16:29:15 -0400
>
>> On 10/07/2013 03:42 PM, David Miller wrote:
>>> There is no compatibility issue.
>>>
>>> 32-bit tasks will always see the 4-byte align/length.
>>> 64-bit tasks will always see the 8-byte align/length.
>>>
>> 
>> Really?  So when I compile my application on a 32-bit Linux box and
>> then try to run it on a 64-bit Linux box, you're not going to overrun
>> my buffer when CMSG_SPACE led me to allocate an insufficient amount of
>> memory needed to account for padding on the 64-bit platform?
>
> We have a compatibility layer that gives 32-bit applications the
> same behavior as if they had run on a 32-bit machine.
>
> Search around for the MSG_CMSG_COMPAT flag and how that is used in
> net/socket.c

But it seems the compat layer doesn't handle this correctly,
otherwise Steve's original test case would work.

Must be a bug somewhere in the compat layer.

-Andi

-- 
ak@linux.intel.com -- Speaking for myself only
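
For anyone following the alignment argument above: the amount of control
buffer an application reserves comes from CMSG_SPACE(), while the length
stored in the header comes from CMSG_LEN(), and the difference between the
two is the ABI-dependent trailing padding being discussed.  A small sketch
of that relationship, with hypothetical fd_cmsg_* helper names and assuming
a POSIX-like <sys/socket.h> that provides both macros:

```c
#include <assert.h>
#include <stddef.h>
#include <sys/socket.h>

/* Bytes to reserve in msg_control for passing nfds descriptors. */
static size_t fd_cmsg_space(size_t nfds)
{
	return CMSG_SPACE(nfds * sizeof(int));
}

/* Value to store in cmsg_len for the same payload (no trailing pad). */
static size_t fd_cmsg_len(size_t nfds)
{
	return CMSG_LEN(nfds * sizeof(int));
}

/* Trailing padding the platform ABI adds after the payload. */
static size_t fd_cmsg_padding(size_t nfds)
{
	return fd_cmsg_space(nfds) - fd_cmsg_len(nfds);
}
```

Because both macros are evaluated at compile time against the ABI the
binary was built for, a 32-bit binary carries the 32-bit values with it;
whether the kernel's compat path then honours those values is exactly the
open question in this thread.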

^ permalink raw reply

* [PATCH] net ipv4: Allow unprivileged users to use most of the per net sysctls
From: Eric W. Biederman @ 2013-10-07 23:58 UTC (permalink / raw)
  To: David Miller; +Cc: netdev


Allow unprivileged users to use:
/proc/sys/net/ipv4/icmp_echo_ignore_all
/proc/sys/net/ipv4/icmp_echo_ignore_broadcasts
/proc/sys/net/ipv4/icmp_ignore_bogus_error_response
/proc/sys/net/ipv4/icmp_errors_use_inbound_ifaddr
/proc/sys/net/ipv4/icmp_ratelimit
/proc/sys/net/ipv4/icmp_ratemask
/proc/sys/net/ipv4/ping_group_range
/proc/sys/net/ipv4/tcp_ecn
/proc/sys/net/ipv4/ip_local_port_range

These are occasionally handy and after a quick review I don't see
any problems with unprivileged users using them.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 net/ipv4/sysctl_net_ipv4.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/ipv4/sysctl_net_ipv4.c b/net/ipv4/sysctl_net_ipv4.c
index c08f096d46b5..470ea82fca51 100644
--- a/net/ipv4/sysctl_net_ipv4.c
+++ b/net/ipv4/sysctl_net_ipv4.c
@@ -898,9 +898,9 @@ static __net_init int ipv4_sysctl_init_net(struct net *net)
 		table[8].data =
 			&net->ipv4.sysctl_local_ports.range;
 
-		/* Don't export sysctls to unprivileged users */
+		/* Don't export dangerous sysctls to unprivileged users */
 		if (net->user_ns != &init_user_ns)
-			table[0].procname = NULL;
+			table[9].procname = NULL;
 	}
 
 	/*
-- 
1.7.5.4
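
The one-line change works because a sysctl table of this kind is walked
until the first entry whose procname is NULL, so clearing procname at index
N hides entry N and everything after it.  A userspace sketch of that
mechanism, assuming that walk-until-NULL behaviour; struct fake_ctl, the
helper names and the two "privileged_knob" entries are hypothetical, and
the first nine names simply follow the list in the commit message:

```c
#include <assert.h>
#include <stddef.h>

struct fake_ctl {
	const char *procname;	/* NULL terminates the table */
};

static struct fake_ctl demo_table[] = {
	{ "icmp_echo_ignore_all" },			/* [0] */
	{ "icmp_echo_ignore_broadcasts" },		/* [1] */
	{ "icmp_ignore_bogus_error_response" },		/* [2] */
	{ "icmp_errors_use_inbound_ifaddr" },		/* [3] */
	{ "icmp_ratelimit" },				/* [4] */
	{ "icmp_ratemask" },				/* [5] */
	{ "ping_group_range" },				/* [6] */
	{ "tcp_ecn" },					/* [7] */
	{ "ip_local_port_range" },			/* [8] */
	{ "privileged_knob_a" },			/* [9]  hypothetical */
	{ "privileged_knob_b" },			/* [10] hypothetical */
	{ NULL }
};

/* Count the entries a table walker would see. */
static size_t visible_entries(const struct fake_ctl *t)
{
	size_t n = 0;

	while (t[n].procname)
		n++;
	return n;
}

/* Truncate the table: entry idx and everything after it disappear. */
static void hide_from(struct fake_ctl *t, size_t idx)
{
	t[idx].procname = NULL;
}
```

With the old code (index 0) the whole table vanishes for a non-initial
user namespace; with index 9 only the tail does.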

^ permalink raw reply related

* [PATCH] mac80211: Use ERR_CAST inlined function instead of ERR_PTR(PTR_ERR(...))
From: djduanjiong-Re5JQEeQqe8AvxtiuMwx3w @ 2013-10-08  0:09 UTC (permalink / raw)
  To: Johannes Berg, John W. Linville, David S. Miller
  Cc: linux-wireless-u79uwXL29TY76Z2rM5mHXA,
	netdev-u79uwXL29TY76Z2rM5mHXA, Duan Jiong

From: Duan Jiong <duanj.fnst-BthXqXjhjHXQFUHtdCDX3A@public.gmane.org>

Signed-off-by: Duan Jiong <duanj.fnst-BthXqXjhjHXQFUHtdCDX3A@public.gmane.org>
---
 net/mac80211/key.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/net/mac80211/key.c b/net/mac80211/key.c
index 620677e..3e51dd7 100644
--- a/net/mac80211/key.c
+++ b/net/mac80211/key.c
@@ -879,7 +879,7 @@ ieee80211_gtk_rekey_add(struct ieee80211_vif *vif,
 				  keyconf->keylen, keyconf->key,
 				  0, NULL);
 	if (IS_ERR(key))
-		return ERR_PTR(PTR_ERR(key));
+		return ERR_CAST(key);
 
 	if (sdata->u.mgd.mfp != IEEE80211_MFP_DISABLED)
 		key->conf.flags |= IEEE80211_KEY_FLAG_RX_MGMT;
-- 
1.7.7.6
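
Since the changelog is empty, a short illustration of why the two forms are
equivalent may help.  The helpers below are simplified userspace
re-creations of the kernel's error-pointer helpers (the real ones live in
include/linux/err.h and differ in detail); struct key, struct key_conf and
the *_stub functions are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Encode a negative errno value in a pointer. */
static inline void *ERR_PTR(long error)
{
	return (void *)error;
}

/* Recover the errno value from an error pointer. */
static inline long PTR_ERR(const void *ptr)
{
	return (long)ptr;
}

/* Error pointers occupy the top MAX_ERRNO addresses. */
static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Re-type an error pointer without decoding and re-encoding it. */
static inline void *ERR_CAST(const void *ptr)
{
	return (void *)ptr;
}

struct key { int idx; };
struct key_conf { int flags; };

static struct key *key_alloc_stub(int fail)
{
	static struct key k;

	return fail ? ERR_PTR(-ENOMEM) : &k;
}

/* Returns a different pointer type than the call that failed. */
static struct key_conf *rekey_add_stub(int fail)
{
	static struct key_conf conf;
	struct key *key = key_alloc_stub(fail);

	if (IS_ERR(key))
		return ERR_CAST(key);	/* same as ERR_PTR(PTR_ERR(key)) */
	return &conf;
}
```

ERR_CAST() says directly that only the pointer type changes, which is the
readability gain the patch is after.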


^ permalink raw reply related

* Re: [PATCH 1/2] net: Add layer 2 hardware acceleration operations for macvlan devices
From: Neil Horman @ 2013-10-08  0:52 UTC (permalink / raw)
  To: John Fastabend; +Cc: David Miller, netdev
In-Reply-To: <52533805.2010607@gmail.com>

On Mon, Oct 07, 2013 at 03:39:01PM -0700, John Fastabend wrote:
> On 10/07/2013 02:34 PM, David Miller wrote:
> >From: Neil Horman <nhorman@tuxdriver.com>
> >Date: Mon, 7 Oct 2013 17:20:00 -0400
> >
> >>That's me experimenting.  I was thinking that originally this functionality
> >>might be grouped separately, so that we could handle it independently of the
> >>standard network device operations (you might have noticed in v1 of my patch I
> >>had a size_t variable in there, so I thought the separation might be
> >>organizationally nice).  It was also something I was tinkering with for
> >>potential future work to support other data plane accelerators (like the FM6000
> >>switch chip from intel) in a manner that didn't pollute the more typical host network
> >>devices.  Like I said though, just experimenting at the moment....
> >
> 
> We can do something like the dcbnl ops and add another pointer off
> the net device structure and then use the skb->dev field to find the
> correct set of ops? This seems like the simplest option to me and
> isolates the ops structure.
> 
We certainly could do that, or perhaps, for what we're trying to do here, just
using standard netdev_ops is sufficient.  I kind of like the separation (like
the dcbnl_ops), but like I said, experimenting.  I'll try the next version with
the accel methods added to the netdev structure for comparison.

> Is there some information loss from hanging it off the netdevice
> structure vs the skb? I can't see any.
> 
No, not that I'm aware of.  The only reason I added it to the skb in this
version was that, by doing so, I was able to make dual use of the netdev's
standard tx path.

> >Can these dataplane devices still act like a normal networking port and
> >send and receive packets at the host level?
> >
> 
> Yes, they act like normal networking ports, except that there is a
> switching component in the hardware. These patches are not looking at
> virtual or multiple physical functions at the moment.
> 
To be clear, as John says, these patches aren't addressing any dataplane
acceleration devices beyond the internal switching capabilities of the ixgbe
cards.  That said, other chips will have varying degrees of capabilities, from
simple L2 switching, to full content addressable memories that allow for l2/l3
forwarding, as well as higher level routing functions.  Again however, these
patches are just to integrate macvlans with John's virtual station interface
work.

> >If yes, that would be an extremely strong argument for netdev_ops.
> 
> I agree.
In this specific case, that may well be the case, yes.  I'm not so sure of that
for more advanced switching/routing accelerators, but we probably should do what
makes sense now, and worry about future bridges when we forward over them
(pardon the pun :) ).

Neil

> 
> 
> -- 
> John Fastabend         Intel Corporation
> 

^ permalink raw reply

* Re: [RFC PATCH 0/2 v2] net: alternate proposal for using macvlans with forwarding acceleration
From: Neil Horman @ 2013-10-08  1:08 UTC (permalink / raw)
  To: John Fastabend; +Cc: netdev, John Fastabend, Andy Gospodarek, David Miller
In-Reply-To: <52533107.6060306@gmail.com>

On Mon, Oct 07, 2013 at 03:09:11PM -0700, John Fastabend wrote:
> On 10/04/2013 01:10 PM, Neil Horman wrote:
> >Hey all-
> >      here's the next, updated version of the vsi/macvlan integration that we've
> >been discussing.
> >
> >Some change notes:
> >
> >* Changes to the forwarding ops structure - Removed the priv_size field, and
> >added a flags field.  Removal of the priv_size field was accomplished by just
> >having the add method return a void * and using ERR_PTR and PTR_ERR checks,
> >which also allows us to allocate memory for the acceleration path in the driver,
> >which I like.  I'm not super happy still with how I'm using the flags (currently
> >only used to indicate support for feature sets), but at least we have the flags
> >now, and they can be exposed to user space via iproute2 or ethtool if need be
> 
> For the flag why not use the existing feature flag namespace? Adding
> NETIF_F_HW_L2FW_DOFFLOAD or something equivalent to netdev_features.h
> and also doing the string updates would get the userspace support for
> free.
> 
> The one downside of a global feature flag approach like this is if
> the user wants to create some offloaded macvlan devices and some SW
> only macvlans the control sequence is a bit clumsy. But unless we add
> a new flag to the macvlan netlink message I'm not sure how to avoid
> this. Further if you have multiple control applications
> creating/deleting these macvlans the flag mechanism starts to break
> down. One app sets the flag the other deletes, you get the idea.
> 
> It might be worth adding a netlink flag to change the state of
> the HW offload. I know this goes against the 'it just gets offloaded'
> line of thought, but if you make the default to offload then it should
> be OK.
> 
Good suggestions, I should have thought to use the global flags previously. I
got hung up on the notion that for some reason our feature set should be
separated like the ops.  I really don't need to do that.

> >
> >* Changes to the Transmit path - Specifically I'm using dev_queue_xmit to send
> >frames now, which I like as it makes the macvlan subject to the lowerdevs qdisc
> >configuration.
> 
> This creates perhaps another oddity where on TX the packets will be
> visible to the lowerdev. Think tcpdump and egress qdisc. But on ingress
> the packets will be delivered directly to macvlan device. Also I was
> thinking there might be good reasons to skip the lowerdev qdisc, for
> performance reasons or QOS setups.
> 
Hmm, is that avoidable by doing an extra check in dev_queue_xmit (i.e. checking
to see if accel_data or some other pointer is set)?  Or do you think it's worth
separating the tx path to an accelerated or non accelerated case?

> So it might be best the way you had it in the previous revision even if
> it was not functionally equivalent to the SW path. If you skip the lower
> dev and submit directly to the hardware then you also don't need a
> sk_buff change. The driver can do a lookup via the skb->dev field and
> additional hints from the stack are not needed.
> 
Ok, fair enough, I'll put it back the way I had it previously.

> If there is a perceived problem with not having the HW and SW completely
> equivalent functionally we could always default the feature flag to off.
> Then enabling the feature would be a single 'ethtool' command. This
> seems like a nice compromise.
> 
Yeah, I can agree with that.

> I like where this is going though!
I'm glad you think so!  I'll have another version out later this week!
Best
Neil

> 
> Thanks,
> John
> 
> -- 
> John Fastabend         Intel Corporation
> 

^ permalink raw reply

* Re: [PATCH] {iproute2, xfrm}: Use memcpy to suppress gcc phony buffer overflow warning
From: Fan Du @ 2013-10-08  1:22 UTC (permalink / raw)
  To: David Laight; +Cc: Sohny Thomas, stephen, netdev
In-Reply-To: <AE90C24D6B3A694183C094C60CF0A2F6026B7368@saturn3.aculab.com>

Back from vacation, sorry for the late reply.

On 2013年09月30日 17:29, David Laight wrote:
>>> This is a false positive warning, as the destination pointer "buf"
>>> points to a zero-length array whose storage is actually provided,
>>> for the most part, by alg.buf.
>>> Fix this by using memcpy.
>>>
>>> struct xfrm_algo {
>>>           char            alg_name[64];
>>>           unsigned int    alg_key_len;    /* in bits */
>>>           char            alg_key[0];
>>> };
>>>
>>> struct {
>>>           union {
>>>                   struct xfrm_algo alg;
>>>                   struct xfrm_algo_aead aead;
>>>                   struct xfrm_algo_auth auth;
>>>           } u;
>>>           char buf[XFRM_ALGO_KEY_BUF_SIZE];
>>> } alg = {};
>>>
>>> buf = alg.u.alg.alg_key;
>
> That is worse than horrid...
> The tools have every right to complain about any accesses to alg_key[].

Only when using strcpy, because of the built-in check inserted into that function.

>>> ---
>>>    ip/xfrm_state.c |    2 +-
>>>    1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/ip/xfrm_state.c b/ip/xfrm_state.c
>>> index 0d98e78..5cc87d3 100644
>>> --- a/ip/xfrm_state.c
>>> +++ b/ip/xfrm_state.c
>>> @@ -159,7 +159,7 @@ static int xfrm_algo_parse(struct xfrm_algo *alg,
>>> enum xfrm_attr_type_t type,
>>>                if (len>  max)
>>>                    invarg("\"ALGO-KEY\" makes buffer overflow\n", key);
>
> I presume there is a return hiding in invarg().

Good guess :)

>>>
>>> -            strncpy(buf, key, len);
>>> +            memcpy(buf, key, len);
>
> Passing the length of the SOURCE to strncpy() is almost always wrong.
> You are still not terminating the copied string.

Don't worry.

The length used here has already been increased by 1 at the beginning of the
function, so the string copied to the destination is properly terminated.

> 	David
>
>

-- 
浮沉随浪只记今朝笑

--fan
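
To make the termination argument concrete: if the length handed to memcpy
already counts the trailing NUL, as described above, the copy is terminated
without any help from strncpy.  A minimal sketch under that assumption;
copy_key() is a hypothetical helper, not the iproute2 function:

```c
#include <assert.h>
#include <string.h>

/*
 * Copy key into buf including its terminating NUL.
 * Returns the number of bytes copied, or 0 if it does not fit in max.
 */
static size_t copy_key(char *buf, size_t max, const char *key)
{
	size_t len = strlen(key) + 1;	/* +1 so the NUL is part of the copy */

	if (len > max)
		return 0;
	memcpy(buf, key, len);
	return len;
}
```

Without the +1, neither memcpy nor strncpy with the source length would
write a terminator, which is the case David is warning about.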

^ permalink raw reply

* Re: [PATCH] tun: don't look at current when non-blocking
From: Jason Wang @ 2013-10-08  2:50 UTC (permalink / raw)
  To: Michael S. Tsirkin, netdev, linux-kernel
  Cc: David S. Miller, Eric Dumazet, Pavel Emelyanov
In-Reply-To: <20131006182512.GA16504@redhat.com>

On 10/07/2013 02:25 AM, Michael S. Tsirkin wrote:
> We play with a wait queue even if socket is
> non blocking. This is an obvious waste.
> Besides, it will prevent calling the non blocking
> variant when current is not valid.
>
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> ---

Acked-by: Jason Wang <jasowang@redhat.com>
>  drivers/net/tun.c | 8 +++++---
>  1 file changed, 5 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/net/tun.c b/drivers/net/tun.c
> index 807815f..7cb105c 100644
> --- a/drivers/net/tun.c
> +++ b/drivers/net/tun.c
> @@ -1293,7 +1293,8 @@ static ssize_t tun_do_read(struct tun_struct *tun, struct tun_file *tfile,
>  	if (unlikely(!noblock))
>  		add_wait_queue(&tfile->wq.wait, &wait);
>  	while (len) {
> -		current->state = TASK_INTERRUPTIBLE;
> +		if (unlikely(!noblock))
> +			current->state = TASK_INTERRUPTIBLE;
>  
>  		/* Read frames from the queue */
>  		if (!(skb = skb_dequeue(&tfile->socket.sk->sk_receive_queue))) {
> @@ -1320,9 +1321,10 @@ static ssize_t tun_do_read(struct tun_struct *tun, struct tun_file *tfile,
>  		break;
>  	}
>  
> -	current->state = TASK_RUNNING;
> -	if (unlikely(!noblock))
> +	if (unlikely(!noblock)) {
> +		current->state = TASK_RUNNING;
>  		remove_wait_queue(&tfile->wq.wait, &wait);
> +	}
>  
>  	return ret;
>  }

^ permalink raw reply

* [PATCH 1/2] cgroup: netprio: remove unnecessary task_netprioidx
From: Gao feng @ 2013-10-08  3:05 UTC (permalink / raw)
  To: netdev-u79uwXL29TY76Z2rM5mHXA, cgroups-u79uwXL29TY76Z2rM5mHXA
  Cc: jhs-jkUAjuhPggJWk0Htik3J/w, davem-fT/PcQaiUtIeIZ0/mPfg9Q,
	tj-DgEjT+Ai2ygdnm+yROfE0A, nhorman-2XuSBdqkA4R54TAoqtyWWQ,
	lizefan-hv44wF8Li93QT0dZR+AlfA,
	daniel.wagner-98C5kh4wR6ohFhg+JK9F0w, Gao feng

Since the tasks have already been migrated to the cgroup,
there is no need to call task_netprioidx() to get each
task's cgroup id.

Signed-off-by: Gao feng <gaofeng-BthXqXjhjHXQFUHtdCDX3A@public.gmane.org>
---
 net/core/netprio_cgroup.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/net/core/netprio_cgroup.c b/net/core/netprio_cgroup.c
index d9cd627..9b7cf6c 100644
--- a/net/core/netprio_cgroup.c
+++ b/net/core/netprio_cgroup.c
@@ -222,11 +222,10 @@ static void net_prio_attach(struct cgroup_subsys_state *css,
 			    struct cgroup_taskset *tset)
 {
 	struct task_struct *p;
-	void *v;
+	void *v = (void *)(unsigned long)css->cgroup->id;
 
 	cgroup_taskset_for_each(p, css, tset) {
 		task_lock(p);
-		v = (void *)(unsigned long)task_netprioidx(p);
 		iterate_fd(p->files, 0, update_netprio, v);
 		task_unlock(p);
 	}
-- 
1.8.3.1

^ permalink raw reply related

* [PATCH 2/2] cgroup: cls: remove unnecessary task_cls_classid
From: Gao feng @ 2013-10-08  3:05 UTC (permalink / raw)
  To: netdev-u79uwXL29TY76Z2rM5mHXA, cgroups-u79uwXL29TY76Z2rM5mHXA
  Cc: jhs-jkUAjuhPggJWk0Htik3J/w, davem-fT/PcQaiUtIeIZ0/mPfg9Q,
	tj-DgEjT+Ai2ygdnm+yROfE0A, nhorman-2XuSBdqkA4R54TAoqtyWWQ,
	lizefan-hv44wF8Li93QT0dZR+AlfA,
	daniel.wagner-98C5kh4wR6ohFhg+JK9F0w, Gao feng
In-Reply-To: <1381201520-25938-1-git-send-email-gaofeng-BthXqXjhjHXQFUHtdCDX3A@public.gmane.org>

We can get the classid directly from the cgroup_subsys_state,
which is more straightforward and more efficient.

Signed-off-by: Gao feng <gaofeng-BthXqXjhjHXQFUHtdCDX3A@public.gmane.org>
---
 net/sched/cls_cgroup.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/sched/cls_cgroup.c b/net/sched/cls_cgroup.c
index 867b4a3..16006c9 100644
--- a/net/sched/cls_cgroup.c
+++ b/net/sched/cls_cgroup.c
@@ -72,11 +72,11 @@ static void cgrp_attach(struct cgroup_subsys_state *css,
 			struct cgroup_taskset *tset)
 {
 	struct task_struct *p;
-	void *v;
+	struct cgroup_cls_state *cs = css_cls_state(css);
+	void *v = (void *)(unsigned long)cs->classid;
 
 	cgroup_taskset_for_each(p, css, tset) {
 		task_lock(p);
-		v = (void *)(unsigned long)task_cls_classid(p);
 		iterate_fd(p->files, 0, update_classid, v);
 		task_unlock(p);
 	}
-- 
1.8.3.1

^ permalink raw reply related

* [PATCH] moxa: fix the error handling in moxart_mac_probe()
From: Wei Yongjun @ 2013-10-08  3:19 UTC (permalink / raw)
  To: grant.likely, rob.herring, davem, jg1.han, b.zolnierkie,
	kyungmin.park
  Cc: yongjun_wei, netdev

From: Wei Yongjun <yongjun_wei@trendmicro.com.cn>

This patch fixes the error handling in moxart_mac_probe():
 - return -ENOMEM in some memory alloc fail cases
 - add missing free_netdev() in the error handling case

Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
---
 drivers/net/ethernet/moxa/moxart_ether.c | 22 ++++++++++++++++------
 1 file changed, 16 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/moxa/moxart_ether.c b/drivers/net/ethernet/moxa/moxart_ether.c
index bd1a2d2..ea54d95 100644
--- a/drivers/net/ethernet/moxa/moxart_ether.c
+++ b/drivers/net/ethernet/moxa/moxart_ether.c
@@ -448,7 +448,8 @@ static int moxart_mac_probe(struct platform_device *pdev)
 	irq = irq_of_parse_and_map(node, 0);
 	if (irq <= 0) {
 		netdev_err(ndev, "irq_of_parse_and_map failed\n");
-		return -EINVAL;
+		ret = -EINVAL;
+		goto irq_map_fail;
 	}
 
 	priv = netdev_priv(ndev);
@@ -472,24 +473,32 @@ static int moxart_mac_probe(struct platform_device *pdev)
 	priv->tx_desc_base = dma_alloc_coherent(NULL, TX_REG_DESC_SIZE *
 						TX_DESC_NUM, &priv->tx_base,
 						GFP_DMA | GFP_KERNEL);
-	if (priv->tx_desc_base == NULL)
+	if (priv->tx_desc_base == NULL) {
+		ret = -ENOMEM;
 		goto init_fail;
+	}
 
 	priv->rx_desc_base = dma_alloc_coherent(NULL, RX_REG_DESC_SIZE *
 						RX_DESC_NUM, &priv->rx_base,
 						GFP_DMA | GFP_KERNEL);
-	if (priv->rx_desc_base == NULL)
+	if (priv->rx_desc_base == NULL) {
+		ret = -ENOMEM;
 		goto init_fail;
+	}
 
 	priv->tx_buf_base = kmalloc(priv->tx_buf_size * TX_DESC_NUM,
 				    GFP_ATOMIC);
-	if (!priv->tx_buf_base)
+	if (!priv->tx_buf_base) {
+		ret = -ENOMEM;
 		goto init_fail;
+	}
 
 	priv->rx_buf_base = kmalloc(priv->rx_buf_size * RX_DESC_NUM,
 				    GFP_ATOMIC);
-	if (!priv->rx_buf_base)
+	if (!priv->rx_buf_base) {
+		ret = -ENOMEM;
 		goto init_fail;
+	}
 
 	platform_set_drvdata(pdev, ndev);
 
@@ -522,7 +531,8 @@ static int moxart_mac_probe(struct platform_device *pdev)
 init_fail:
 	netdev_err(ndev, "init failed\n");
 	moxart_mac_free_memory(ndev);
-
+irq_map_fail:
+	free_netdev(ndev);
 	return ret;
 }
 

^ permalink raw reply related
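
The error handling described in the patch above follows the usual goto-unwind idiom: every failure sets ret before jumping, and the labels release resources in reverse order of acquisition. A minimal userspace sketch of that idiom is shown here; struct demo_dev, demo_probe(), demo_remove() and the fail_at parameter are hypothetical names made up for illustration and are not part of the moxart driver.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for driver private data; not the moxart structs. */
struct demo_dev {
	void *tx_buf;
	void *rx_buf;
};

/*
 * fail_at forces a failure at step 1, 2 or 3; 0 means every step succeeds.
 * Every failure path sets ret before the goto, so the caller never sees a
 * stale return value, and the device itself is always freed.
 */
static int demo_probe(struct demo_dev **out, int fail_at)
{
	struct demo_dev *dev;
	int ret;

	dev = calloc(1, sizeof(*dev));
	if (!dev)
		return -ENOMEM;	/* nothing to undo yet */

	if (fail_at == 1) {	/* e.g. a failed IRQ lookup */
		ret = -EINVAL;
		goto free_dev;
	}

	dev->tx_buf = (fail_at == 2) ? NULL : malloc(64);
	if (!dev->tx_buf) {
		ret = -ENOMEM;
		goto free_bufs;
	}

	dev->rx_buf = (fail_at == 3) ? NULL : malloc(64);
	if (!dev->rx_buf) {
		ret = -ENOMEM;
		goto free_bufs;
	}

	*out = dev;
	return 0;

free_bufs:
	free(dev->rx_buf);	/* free(NULL) is a no-op */
	free(dev->tx_buf);
free_dev:
	free(dev);
	return ret;
}

/* Release everything a successful demo_probe() acquired. */
static void demo_remove(struct demo_dev *dev)
{
	free(dev->rx_buf);
	free(dev->tx_buf);
	free(dev);
}
```

Falling through from free_bufs into free_dev mirrors how init_fail falls through into irq_map_fail in the patch: later labels undo earlier steps, so an early failure can jump straight to the last label.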

* [PATCH] qlcnic: add missing destroy_workqueue() on error path in qlcnic_probe()
From: Wei Yongjun @ 2013-10-08  3:32 UTC (permalink / raw)
  To: himanshu.madhani, rajesh.borundia, shahed.shaikh,
	jitendra.kalsaria, sony.chacko, sucheta.chakraborty
  Cc: yongjun_wei, linux-driver, netdev

From: Wei Yongjun <yongjun_wei@trendmicro.com.cn>

Add the missing destroy_workqueue() before return from
qlcnic_probe() in the error handling case.

Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
---
 drivers/net/ethernet/qlogic/qlcnic/qlcnic_main.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_main.c b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_main.c
index 21d00a0..f07f2b0 100644
--- a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_main.c
+++ b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_main.c
@@ -2257,7 +2257,7 @@ qlcnic_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 
 	err = qlcnic_alloc_adapter_resources(adapter);
 	if (err)
-		goto err_out_free_netdev;
+		goto err_out_free_wq;
 
 	adapter->dev_rst_time = jiffies;
 	adapter->ahw->revision_id = pdev->revision;
@@ -2396,6 +2396,9 @@ err_out_disable_msi:
 err_out_free_hw:
 	qlcnic_free_adapter_resources(adapter);
 
+err_out_free_wq:
+	destroy_workqueue(adapter->qlcnic_wq);
+
 err_out_free_netdev:
 	free_netdev(netdev);
 

^ permalink raw reply related

* [PATCH net-next] fib_trie: only calc for the un-first node
From: baker.kernel @ 2013-10-08  3:36 UTC (permalink / raw)
  To: davem, kuznet, jmorris, yoshfuji, kaber, netdev, linux-kernel; +Cc: baker.zhang

From: "baker.zhang" <baker.kernel@gmail.com>

This is an enhancement.

For the first node in fib_trie, newpos is 0 and bits is 1.
pos only needs to be calculated for a leaf or a node with an unmatched key.

Signed-off-by: baker.zhang <baker.kernel@gmail.com>
---
 net/ipv4/fib_trie.c |    6 +-----
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/net/ipv4/fib_trie.c b/net/ipv4/fib_trie.c
index 45c74ba..ec9a9ef 100644
--- a/net/ipv4/fib_trie.c
+++ b/net/ipv4/fib_trie.c
@@ -1117,12 +1117,8 @@ static struct list_head *fib_insert_node(struct trie *t, u32 key, int plen)
 		 *  first tnode need some special handling
 		 */
 
-		if (tp)
-			pos = tp->pos+tp->bits;
-		else
-			pos = 0;
-
 		if (n) {
+			pos = tp ? tp->pos+tp->bits : 0;
 			newpos = tkey_mismatch(key, pos, n->key);
 			tn = tnode_new(n->key, newpos, 1);
 		} else {
-- 
1.7.9.5

^ permalink raw reply related
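
The patch above moves the pos computation into the only branch that passes it to tkey_mismatch(). As a rough illustration of what such a helper computes, here is a userspace sketch that assumes 32-bit keys with bit 0 as the most significant bit; key_mismatch() and KEYLENGTH are names chosen for this sketch, and the body is not copied from net/ipv4/fib_trie.c.

```c
#include <stdint.h>

#define KEYLENGTH 32	/* assumed key width in bits (IPv4) */

/*
 * Return the index (0 = most significant bit) of the first bit at or after
 * offset where a and b differ. Returns 0 when the keys are identical and
 * KEYLENGTH when they differ only before offset.
 */
static int key_mismatch(uint32_t a, int offset, uint32_t b)
{
	uint32_t diff = a ^ b;
	int i = offset;

	if (!diff)
		return 0;
	/* Shifting left by i moves bit i into the top position. */
	while (i < KEYLENGTH && ((diff << i) >> (KEYLENGTH - 1)) == 0)
		i++;
	return i;
}
```

For example, key_mismatch(0xC0A80000u, 0, 0xC0A80100u) returns 23, since 192.168.0.0 and 192.168.1.0 first differ at bit 23 counted from the most significant end. The offset argument only sets where the scan starts, which is why it is needed solely in the branch that performs the comparison.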

* RE: [PATCH] moxa: fix the error handling in moxart_mac_probe()
From: yongjun_wei @ 2013-10-08  3:29 UTC (permalink / raw)
  To: Wei Yongjun, grant.likely@linaro.org, rob.herring@calxeda.com,
	davem@davemloft.net, jg1.han@samsung.com,
	b.zolnierkie@samsung.com, kyungmin.park@samsung.com
  Cc: netdev@vger.kernel.org
In-Reply-To: <CAPgLHd95Rv=sCCWFsuLwtGz5GL5+7N-y6zQ195uOXF3ebgcj3A@mail.gmail.com>

Sorry, this mail was duplicated by the mail server; please ignore it. Thanks.

-----Original Message-----
From: Wei Yongjun [mailto:weiyj.lk@gmail.com] 
Sent: October 8, 2013 11:27
To: grant.likely@linaro.org; rob.herring@calxeda.com; davem@davemloft.net; jg1.han@samsung.com; b.zolnierkie@samsung.com; kyungmin.park@samsung.com
Cc: Yongjun Wei (RD-CN); netdev@vger.kernel.org
Subject: [PATCH] moxa: fix the error handling in moxart_mac_probe()

From: Wei Yongjun <yongjun_wei@trendmicro.com.cn>

This patch fixes the error handling in moxart_mac_probe():
 - return -ENOMEM in some memory alloc fail cases
 - add missing free_netdev() in the error handling case

Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>

^ permalink raw reply
