Netdev List
* Re: [PATCH 3/4 v2 net-next] net: make GRO aware of skb->head_frag
From: Rick Jones @ 2012-05-02 17:16 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Alexander Duyck, Alexander Duyck, David Miller, netdev,
	Neal Cardwell, Tom Herbert, Jeff Kirsher, Michael Chan,
	Matt Carlson, Herbert Xu, Ben Hutchings, Ilpo Järvinen,
	Maciej Żenczykowski
In-Reply-To: <1335947084.22133.134.camel@edumazet-glaptop>

On 05/02/2012 01:24 AM, Eric Dumazet wrote:
> On Tue, 2012-05-01 at 12:45 -0700, Alexander Duyck wrote:
>
>> I have a hacked together ixgbe up and running now with the new build_skb
>> logic and RSC/LRO disabled.  It looks like it is giving me a 5%
>> performance boost for small packet routing, but I am using more CPU for
>> netperf TCP receive tests and I was wondering if you had seen anything
>> similar on the tg3 driver?
>
> Really hard to say, numbers are so small on Gb link :
>
> what do you use to make your numbers ?
>
> netperf -H 172.30.42.23 -t OMNI -C -c
> OMNI Send TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 172.30.42.23 (172.30.42.23) port 0 AF_INET
> Local       Local       Local  Elapsed Throughput Throughput  Local Local  Remote Remote Local   Remote  Service
> Send Socket Send Socket Send   Time               Units       CPU   CPU    CPU    CPU    Service Service Demand
> Size        Size        Size   (sec)                          Util  Util   Util   Util   Demand  Demand  Units
> Final       Final                                             %     Method %      Method
> 1700840     1700840     16384  10.01   931.60     10^6bits/s  4.50  S      1.32   S      1.582   2.783   usec/KB

If there is so little CPU consumed, I'm a bit surprised the throughput 
wasn't 940 Mbit/s.

It might be a good idea to fix the local and remote socket buffer sizes 
for these sorts of A-B comparisons to take the variability of the 
autotuning out.

And then, to see if the small differences are "real" one can light-up 
the confidence intervals.  For example (using kernels unrelated to the 
patch discussion):

raj@tardy:~/netperf2_trunk/src$ ./netperf -H 192.168.1.3 -t omni -c -C \
    -I 99,1 -i 30,3 -- -s 256K -S 256K -m 16K \
    -O throughput,local_cpu_util,local_sd,remote_cpu_util,remote_sd,throughput_confid,local_cpu_confid,remote_cpu_confid,confidence_iteration
OMNI Send TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.1.3 () port 0 AF_INET : +/-0.500% @ 99% conf.  : interval : demo
Throughput Local Local   Remote Remote  Throughput Local      Remote     Confidence
           CPU   Service CPU    Service Confidence CPU        CPU        Iterations
           Util  Demand  Util   Demand  Width (%)  Confidence Confidence Run
           %             %                         Width (%)  Width (%)

941.36     8.70  3.030   45.36  7.895   0.006      18.836     0.209      30

In this instance, I asked to be 99% confident the throughput and CPU 
util were within +/- 0.5% of the "real" mean.  The confidence intervals 
were hit for throughput and remote CPU util, but not for local CPU util 
- netperf was running on my personal workstation, which also receives 
email etc.  Presumably a more isolated and idle system would have hit 
the confidence intervals.
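
For anyone comparing the service demand columns across runs, the figures 
above appear consistent with "CPU microseconds consumed per KB moved". 
A minimal sketch of that arithmetic follows.  The function name is made 
up for illustration, KB is assumed to mean 1024 bytes, and the CPU counts 
(4 local, 2 remote) are inferred from the numbers rather than stated 
anywhere in this thread.

```c
/*
 * Service demand in microseconds of CPU time per KB transferred.
 * Assumptions (not stated in the thread): KB means 1024 bytes, and
 * cpu_util_pct is a percentage averaged over num_cpus processors.
 * The function name is hypothetical, not part of netperf.
 */
double service_demand_usec_per_kb(double cpu_util_pct, int num_cpus,
                                  double mbits_per_sec)
{
    /* 10^6 bits/s -> bytes/s -> KB/s */
    double kb_per_sec = mbits_per_sec * 1e6 / 8.0 / 1024.0;
    /* CPU microseconds burned per wall-clock second, summed over CPUs */
    double cpu_usec_per_sec = cpu_util_pct / 100.0 * num_cpus * 1e6;

    return cpu_usec_per_sec / kb_per_sec;
}
```

With 941.36 * 10^6 bits/s, 8.70% local CPU and an assumed 4 CPUs this 
lands close to the reported 3.030 usec/KB; 45.36% on an assumed 2 CPUs 
lands close to the reported 7.895.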

Other sources of variation to consider eliminating when looking for 
small differences in CPU utilization might be the multiqueue support in 
the NIC.  I'll often just terminate irqbalance and set all the IRQs to a 
single CPU (when doing single stream tests).  Or, one can fully specify 
the four-tuple for the netperf data connection.
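
As a rough sketch of the IRQ pinning step: /proc/irq/<irq>/smp_affinity 
takes a hexadecimal CPU bitmask, so steering everything to one CPU means 
writing a mask with a single bit set.  The helper below only formats that 
mask; its name is hypothetical and it is limited to CPUs 0..31 to keep 
the example short.

```c
#include <stdio.h>

/*
 * Format the hexadecimal bitmask that selects only `cpu`, in the form
 * accepted by /proc/irq/<irq>/smp_affinity.  Hypothetical helper,
 * limited to CPUs 0..31.  Returns the number of characters written,
 * or -1 on bad input.
 */
int single_cpu_affinity_mask(unsigned int cpu, char *buf, size_t len)
{
    if (cpu >= 32 || len == 0)
        return -1;
    return snprintf(buf, len, "%x", 1u << cpu);
}
```

The resulting string would then be written to the smp_affinity file of 
each of the NIC's IRQs, after irqbalance has been stopped so it does not 
move them again.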

rick jones
of course there is also the whole question of the effect of HW threading 
on the meaningfulness of OS-determined utilization...

^ permalink raw reply

* Re: [PATCH 3/4 v2 net-next] net: make GRO aware of skb->head_frag
From: Alexander Duyck @ 2012-05-02 17:04 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Alexander Duyck, David Miller, netdev, Neal Cardwell, Tom Herbert,
	Jeff Kirsher, Michael Chan, Matt Carlson, Herbert Xu,
	Ben Hutchings, Ilpo Järvinen, Maciej Żenczykowski
In-Reply-To: <1335976071.22133.581.camel@edumazet-glaptop>

On 05/02/2012 09:27 AM, Eric Dumazet wrote:
> On Wed, 2012-05-02 at 18:19 +0200, Eric Dumazet wrote:
>> On Wed, 2012-05-02 at 09:16 -0700, Alexander Duyck wrote:
>>
>>> I was working with the out-of-tree ixgbe because I have the option there
>>> of stripping out FCoE and RSC via a couple of build flags.  The problem
>>> is I don't know if the head frag stuff will work out very well with
>>> ixgbe because RSC and FCoE require that we have to use 1K aligned
>>> receive buffers.  It would require us to make us have to bump up our
>>> allocation size by NET_SKB_PAD plus skb_shared_info which would likely
>>> force us up to order 1 pages on most platforms.
>> What is RSC exactly, and why RSC is used in the build_skb() context ?
>>
>>
> It looks like e1000e would be a good candidate for build_skb()
> (without packet split)

Yes, e1000e and e1000 would be good candidates since they have separate
receive paths for jumbo frames.  Odds are they could probably also take advantage
of the page reuse code I have in igb and ixgbe, but I just haven't had
time to get around to updating them.

Thanks,

Alex

^ permalink raw reply

* Re: [PATCH 3/4 v2 net-next] net: make GRO aware of skb->head_frag
From: Alexander Duyck @ 2012-05-02 17:02 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Alexander Duyck, David Miller, netdev, Neal Cardwell, Tom Herbert,
	Jeff Kirsher, Michael Chan, Matt Carlson, Herbert Xu,
	Ben Hutchings, Ilpo Järvinen, Maciej Żenczykowski
In-Reply-To: <1335975578.22133.580.camel@edumazet-glaptop>

On 05/02/2012 09:19 AM, Eric Dumazet wrote:
> On Wed, 2012-05-02 at 09:16 -0700, Alexander Duyck wrote:
>
>> I was working with the out-of-tree ixgbe because I have the option there
>> of stripping out FCoE and RSC via a couple of build flags.  The problem
>> is I don't know if the head frag stuff will work out very well with
>> ixgbe because RSC and FCoE require that we have to use 1K aligned
>> receive buffers.  It would require us to make us have to bump up our
>> allocation size by NET_SKB_PAD plus skb_shared_info which would likely
>> force us up to order 1 pages on most platforms.
> What is RSC exactly, and why RSC is used in the build_skb() context ?
RSC is your in-hardware LRO.  Basically it aggregates the TCP flows in
hardware instead of software.  As a result we have to be able to receive
jumbo frames any time it is enabled.  This means we can end up using the
full data buffer which we can only set in 1K increments.
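
To put rough numbers on the sizing concern, here is a back-of-the-envelope
sketch.  It is not driver code: the helper names are made up, and the
headroom and shared-info sizes are passed in as parameters because the
real NET_SKB_PAD and sizeof(struct skb_shared_info) depend on
architecture and kernel config.

```c
#include <stddef.h>

/* Round x up to the next multiple of a (a must be a power of two). */
size_t align_up(size_t x, size_t a)
{
    return (x + a - 1) & ~(a - 1);
}

/* Smallest page order such that 2^order pages can hold `bytes`. */
unsigned int page_order_for(size_t bytes, size_t page_size)
{
    unsigned int order = 0;
    size_t span = page_size;

    while (span < bytes) {
        span <<= 1;
        order++;
    }
    return order;
}

/*
 * Bytes needed for a build_skb()-style head buffer: headroom, the
 * hardware receive buffer (rounded up to a 1K step), and the trailing
 * shared info.  pad and shinfo are caller-supplied placeholders.
 */
size_t head_frag_bytes(size_t hw_buf, size_t pad, size_t shinfo)
{
    return pad + align_up(hw_buf, 1024) + shinfo;
}
```

With placeholder values of 64 bytes of headroom and 320 bytes of shared
info, a 4K hardware buffer needs 4480 bytes, which no longer fits in a
single 4K page, while a 2K buffer would still fit in order 0.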

Thanks,

Alex

^ permalink raw reply

* Re: [PATCH net-next] net: take care of cloned skbs in tcp_try_coalesce()
From: Eric Dumazet @ 2012-05-02 16:46 UTC (permalink / raw)
  To: Alexander Duyck
  Cc: Alexander Duyck, David Miller, netdev, Neal Cardwell, Tom Herbert,
	Jeff Kirsher, Michael Chan, Matt Carlson, Herbert Xu,
	Ben Hutchings, Ilpo Järvinen, Maciej Żenczykowski
In-Reply-To: <4FA1606A.6040607@intel.com>

On Wed, 2012-05-02 at 09:27 -0700, Alexander Duyck wrote:

> Are you sure about that?  I think this may blow up if a bridge is
> brought into play.  In that case you will have clones that will be going
> through the xmit path of network drivers and I know in the case of the
> older e1000 driver it didn't stop to look at the length but would
> instead just go through and start mapping all frags to the device.  I am
> fairly certain you are risking a data corruption any time you modify
> nr_frags and dataref is != 1.
> 


Hmm...

A driver should not map more fragments than len/data_len permits.
But point taken.

Frankly we can add the test, but it means that any sniffer running will
disable tcp coalescing, while net/packet/af_packet.c does the right
thing.

I'll see what I can do...

> I really think what we should be doing is either not merge period, or we
> have to go through slow paths if either the to or the from is cloned.
> 
> >>> @@ -4515,7 +4521,12 @@ copyfrags:
> >>>  		offset = from->data - (unsigned char *)page_address(page);
> >>>  		skb_fill_page_desc(to, skb_shinfo(to)->nr_frags,
> >>>  				   page, offset, skb_headlen(from));
> >>> -		*fragstolen = true;
> >>> +
> >>> +		if (skb_cloned(from))
> >>> +			get_page(page);
> >>> +		else
> >>> +			*fragstolen = true;
> >>> +
> >>>  		delta = len; /* we dont know real truesize... */
> >>>  		goto copyfrags;
> >>>  	}
> >>>
> >>>
> >> I don't see where we are now addressing the put_page call to release the
> >> page afterwards.  By calling get_page you are incrementing the page
> >> count by one, but where are you decrementing dataref in the shared
> >> info?  Without that we are looking at a memory leak because __kfree_skb
> >> will decrement the dataref but it will never reach 0 so it will never
> >> call put_page on the head frag.
> > really the dataref was already incremented at skb_clone() time
> >
> > It will be properly decremented since we call __kfree_skb()
> >
> > Only the last decrement will perform the put_page()
> >
> > Think about splice() is doing, its the same get_page() game.
> I think you are missing the point.  So skb_clone will bump the dataref
> to 2, calling get_page will bump the page count to 2.  After this
> function you don't call __kfree_skb(skb) instead you call
> kmem_cache_free(skbuff_head_cache, skb).  This will free the sk_buff,
> but not decrement dataref leaving it at 2.  Eventually the raw socket
> will call kfree_skb(skb) on the clone dropping the dataref to 1 and you
> will call put_page dropping the page count to 1, but I don't see where
> the last __kfree_skb call will come from that will drop dataref and the
> page count to 0.

No, you miss that _if_ we added one to page count, then we won't call
kmem_cache_free(skbuff_head_cache, skb) but call __kfree_skb(skb)
instead because fragstolen will be false.

if (fragstolen)
	kmem_cache_free(skbuff_head_cache, skb);
else
	__kfree_skb(skb);

In a future patch (addressing tcp coalescing in tcp_queue_rcv() as well),
I'll add a helper to make this more clear.
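
To make the counting easier to follow, here is a tiny user-space model of
the head-stealing case.  It is only a sketch: every model_* name is
invented for illustration, the page starts with a single reference, and
the model ignores everything except dataref and the page count.

```c
#include <stdbool.h>

struct model_page { int count; };       /* stands in for struct page */
struct model_head {                     /* skb head plus shared info */
    int dataref;
    struct model_page *page;
};

/* skb_clone(): the clone shares the head, so only dataref grows. */
void model_clone(struct model_head *h)
{
    h->dataref++;
}

/* __kfree_skb(): drop one dataref; the last one releases the head page. */
void model_kfree_skb(struct model_head *h)
{
    if (--h->dataref == 0)
        h->page->count--;               /* put_page() */
}

/* Head-stealing step of the coalesce; returns the fragstolen value. */
bool model_coalesce_head(struct model_head *from)
{
    if (from->dataref > 1) {            /* skb_cloned(from) */
        from->page->count++;            /* get_page(): 'to' gets its own ref */
        return false;
    }
    return true;                        /* 'to' takes over from's only ref */
}

/* Caller side: how 'from' is released depends on fragstolen. */
void model_free_from(struct model_head *from, bool fragstolen)
{
    if (!fragstolen)
        model_kfree_skb(from);
    /* else only the sk_buff itself is freed; the page ref moved to 'to' */
}

/* Returns the page count left once every holder has let go. */
int model_run(bool cloned)
{
    struct model_page page = { .count = 1 };
    struct model_head from = { .dataref = 1, .page = &page };
    bool fragstolen;

    if (cloned)
        model_clone(&from);
    fragstolen = model_coalesce_head(&from);
    model_free_from(&from, fragstolen);
    if (cloned)
        model_kfree_skb(&from);         /* the sniffer frees its clone */
    page.count--;                       /* 'to' is consumed: put_page() on its frag */
    return page.count;
}
```

In this model both the cloned and the non-cloned path end with a page
count of 0, which is the behaviour argued for above.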

^ permalink raw reply

* Re: [PATCH 3/4 v2 net-next] net: make GRO aware of skb->head_frag
From: Eric Dumazet @ 2012-05-02 16:27 UTC (permalink / raw)
  To: Alexander Duyck
  Cc: Alexander Duyck, David Miller, netdev, Neal Cardwell, Tom Herbert,
	Jeff Kirsher, Michael Chan, Matt Carlson, Herbert Xu,
	Ben Hutchings, Ilpo Järvinen, Maciej Żenczykowski
In-Reply-To: <1335975578.22133.580.camel@edumazet-glaptop>

On Wed, 2012-05-02 at 18:19 +0200, Eric Dumazet wrote:
> On Wed, 2012-05-02 at 09:16 -0700, Alexander Duyck wrote:
> 
> > I was working with the out-of-tree ixgbe because I have the option there
> > of stripping out FCoE and RSC via a couple of build flags.  The problem
> > is I don't know if the head frag stuff will work out very well with
> > ixgbe because RSC and FCoE require that we have to use 1K aligned
> > receive buffers.  It would require us to make us have to bump up our
> > allocation size by NET_SKB_PAD plus skb_shared_info which would likely
> > force us up to order 1 pages on most platforms.
> 
> What is RSC exactly, and why RSC is used in the build_skb() context ?
> 
> 

It looks like e1000e would be a good candidate for build_skb()
(without packet split)

^ permalink raw reply

* Re: [PATCH net-next] net: take care of cloned skbs in tcp_try_coalesce()
From: Alexander Duyck @ 2012-05-02 16:27 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Alexander Duyck, David Miller, netdev, Neal Cardwell, Tom Herbert,
	Jeff Kirsher, Michael Chan, Matt Carlson, Herbert Xu,
	Ben Hutchings, Ilpo Järvinen, Maciej Żenczykowski
In-Reply-To: <1335975168.22133.578.camel@edumazet-glaptop>

On 05/02/2012 09:12 AM, Eric Dumazet wrote:
> On Wed, 2012-05-02 at 08:52 -0700, Alexander Duyck wrote:
>> On 05/02/2012 01:13 AM, Eric Dumazet wrote:
>>> From: Eric Dumazet <edumazet@google.com>
>>>
>>> Before stealing fragments or skb head, we must make sure skb is not
>>> cloned.
>>>
>>> If skb is cloned, we must take references on pages instead.
>>>
>>> Bug happened using tcpdump (if not using mmap())
>>>
>>> Reported-by: Alexander Duyck <alexander.h.duyck@intel.com>
>>> Signed-off-by: Eric Dumazet <edumazet@google.com>
>>> ---
>>>  net/ipv4/tcp_input.c |   17 ++++++++++++++---
>>>  1 file changed, 14 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
>>> index 96a631d..7686d7f 100644
>>> --- a/net/ipv4/tcp_input.c
>>> +++ b/net/ipv4/tcp_input.c
>>> @@ -4467,7 +4467,7 @@ static bool tcp_try_coalesce(struct sock *sk,
>>>  			     struct sk_buff *from,
>>>  			     bool *fragstolen)
>>>  {
>>> -	int delta, len = from->len;
>>> +	int i, delta, len = from->len;
>>>  
>>>  	*fragstolen = false;
>>>  	if (tcp_hdr(from)->fin)
>>> @@ -4497,7 +4497,13 @@ copyfrags:
>>>  		       skb_shinfo(from)->frags,
>>>  		       skb_shinfo(from)->nr_frags * sizeof(skb_frag_t));
>>>  		skb_shinfo(to)->nr_frags += skb_shinfo(from)->nr_frags;
>>> -		skb_shinfo(from)->nr_frags = 0;
>>> +
>>> +		if (skb_cloned(from))
>>> +			for (i = 0; i < skb_shinfo(from)->nr_frags; i++)
>>> +				skb_frag_ref(from, i);
>>> +		else
>>> +			skb_shinfo(from)->nr_frags = 0;
>>> +
>>>  		to->truesize += delta;
>>>  		atomic_add(delta, &sk->sk_rmem_alloc);
>>>  		sk_mem_charge(sk, delta);
>> I am fairly certain the bug I saw is only masked over by this change. 
>> The underlying problem is that we shouldn't be messing with nr_frags on
>> the from or the to if either one is clone.  You now have a check in
>> place for the from, but what about the to?  This function should
>> probably be calling a pskb_expand_head on the to skb in order to
>> guarantee that the skb->head isn't shared.  Otherwise this is going to
>> cause other issues for any functions that are sharing these skbs that
>> just walk through frags without checking skb->len or skb->data_len first. 
> Its safe to increase to->len and increase nr_frags in this context,
> because we hold a reference to dataref : It cannot disappear under us.
>
> clones will still have their skb->len at skb_clone() time and wont care
> we expanded the frags.
Are you sure about that?  I think this may blow up if a bridge is
brought into play.  In that case you will have clones that will be going
through the xmit path of network drivers and I know in the case of the
older e1000 driver it didn't stop to look at the length but would
instead just go through and start mapping all frags to the device.  I am
fairly certain you are risking a data corruption any time you modify
nr_frags and dataref is != 1.

I really think what we should be doing is either not merge period, or we
have to go through slow paths if either the to or the from is cloned.

>>> @@ -4515,7 +4521,12 @@ copyfrags:
>>>  		offset = from->data - (unsigned char *)page_address(page);
>>>  		skb_fill_page_desc(to, skb_shinfo(to)->nr_frags,
>>>  				   page, offset, skb_headlen(from));
>>> -		*fragstolen = true;
>>> +
>>> +		if (skb_cloned(from))
>>> +			get_page(page);
>>> +		else
>>> +			*fragstolen = true;
>>> +
>>>  		delta = len; /* we dont know real truesize... */
>>>  		goto copyfrags;
>>>  	}
>>>
>>>
>> I don't see where we are now addressing the put_page call to release the
>> page afterwards.  By calling get_page you are incrementing the page
>> count by one, but where are you decrementing dataref in the shared
>> info?  Without that we are looking at a memory leak because __kfree_skb
>> will decrement the dataref but it will never reach 0 so it will never
>> call put_page on the head frag.
> really the dataref was already incremented at skb_clone() time
>
> It will be properly decremented since we call __kfree_skb()
>
> Only the last decrement will perform the put_page()
>
> Think about splice() is doing, its the same get_page() game.
I think you are missing the point.  So skb_clone will bump the dataref
to 2, calling get_page will bump the page count to 2.  After this
function you don't call __kfree_skb(skb) instead you call
kmem_cache_free(skbuff_head_cache, skb).  This will free the sk_buff,
but not decrement dataref leaving it at 2.  Eventually the raw socket
will call kfree_skb(skb) on the clone dropping the dataref to 1 and you
will call put_page dropping the page count to 1, but I don't see where
the last __kfree_skb call will come from that will drop dataref and the
page count to 0.

Thanks,

Alex

^ permalink raw reply

* Re: [PATCH 15/16] mm: Throttle direct reclaimers if PF_MEMALLOC reserves are low and swap is backed by network storage
From: Mel Gorman @ 2012-05-02 16:24 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Linux-MM, Linux-Netdev, LKML, David Miller, Neil Brown,
	Peter Zijlstra, Mike Christie, Eric B Munson
In-Reply-To: <20120501152437.194f0fc2.akpm@linux-foundation.org>

On Tue, May 01, 2012 at 03:24:37PM -0700, Andrew Morton wrote:
> On Mon, 16 Apr 2012 13:17:02 +0100
> Mel Gorman <mgorman@suse.de> wrote:
> 
> > If swap is backed by network storage such as NBD, there is a risk
> > that a large number of reclaimers can hang the system by consuming
> > all PF_MEMALLOC reserves. To avoid these hangs, the administrator
> > must tune min_free_kbytes in advance which is a bit fragile.
> > 
> > This patch throttles direct reclaimers if half the PF_MEMALLOC reserves
> > are in use. If the system is routinely getting throttled the system
> > administrator can increase min_free_kbytes so degradation is smoother
> > but the system will keep running.
> > 
> >
> > ...
> >
> > +static bool pfmemalloc_watermark_ok(pg_data_t *pgdat)
> > +{
> > +	struct zone *zone;
> > +	unsigned long pfmemalloc_reserve = 0;
> > +	unsigned long free_pages = 0;
> > +	int i;
> > +	bool wmark_ok;
> > +
> > +	for (i = 0; i <= ZONE_NORMAL; i++) {
> > +		zone = &pgdat->node_zones[i];
> > +		pfmemalloc_reserve += min_wmark_pages(zone);
> > +		free_pages += zone_page_state(zone, NR_FREE_PAGES);
> > +	}
> > +
> > +	wmark_ok = (free_pages > pfmemalloc_reserve / 2) ? true : false;
> 
> 	wmark_ok = free_pages > pfmemalloc_reserve / 2;
> 

Of course, I don't know what I was on when I wrote that particular line.
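
For reference, the check boils down to the following.  This is a
user-space sketch with invented model_* names, not the kernel function:
it leaves out the kswapd wakeup and takes the per-zone numbers as plain
inputs.

```c
#include <stdbool.h>
#include <stddef.h>

struct model_zone {
    unsigned long min_wmark_pages;  /* the zone's min watermark */
    unsigned long free_pages;       /* free pages currently in the zone */
};

/* True while more than half of the summed min watermarks is still free. */
bool model_pfmemalloc_watermark_ok(const struct model_zone *zones,
                                   size_t nr_zones)
{
    unsigned long reserve = 0;
    unsigned long free_pages = 0;
    size_t i;

    for (i = 0; i < nr_zones; i++) {
        reserve += zones[i].min_wmark_pages;
        free_pages += zones[i].free_pages;
    }

    return free_pages > reserve / 2;
}
```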

> > +
> > +	/* kswapd must be awake if processes are being throttled */
> > +	if (!wmark_ok && waitqueue_active(&pgdat->kswapd_wait)) {
> > +		pgdat->classzone_idx = min(pgdat->classzone_idx,
> > +						(enum zone_type)ZONE_NORMAL);
> > +		wake_up_interruptible(&pgdat->kswapd_wait);
> > +	}
> > +
> > +	return wmark_ok;
> > +}
> > +
> > +/*
> > + * Throttle direct reclaimers if backing storage is backed by the network
> > + * and the PFMEMALLOC reserve for the preferred node is getting dangerously
> > + * depleted. kswapd will continue to make progress and wake the processes
> > + * when the low watermark is reached
> > + */
> > +static void throttle_direct_reclaim(gfp_t gfp_mask, struct zonelist *zonelist,
> > +					nodemask_t *nodemask)
> > +{
> > +	struct zone *zone;
> > +	int high_zoneidx = gfp_zone(gfp_mask);
> > +	pg_data_t *pgdat;
> > +
> > +	/* Kernel threads such as kjournald should not be throttled */
> 
> The comment should explain "why", not "what".  Particularly when the
> "what" was bleedin obvious ;)
> 
> Also...   why?
> 

        /*
         * Kernel threads should not be throttled as they may be indirectly
         * responsible for cleaning pages necessary for reclaim to make forward
         * progress. kjournald for example may enter direct reclaim while
         * committing a transaction where throttling it could force other
         * processes to block on log_wait_commit()
         */

Does that help?

> > +	if (current->flags & PF_KTHREAD)
> > +		return;
> > +
> > +	/* Check if the pfmemalloc reserves are ok */
> > +	first_zones_zonelist(zonelist, high_zoneidx, NULL, &zone);
> > +	pgdat = zone->zone_pgdat;
> > +	if (pfmemalloc_watermark_ok(pgdat))
> > +		return;
> > +
> > +	/*
> > +	 * If the caller cannot enter the filesystem, it's possible that it
> > +	 * is processing a journal transaction. In this case, it is not safe
> > +	 * to block on pfmemalloc_wait as kswapd could also be blocked waiting
> > +	 * to start a transaction. Instead, throttle for up to a second before
> > +	 * the reclaim must continue.
> > +	 */
> 
> I suppose this applies to fs locks in general, not just to
> journal_start()?
> 

Yes. I updated the comment to reflect that.

        /*
         * If the caller cannot enter the filesystem, it's possible that it
         * is due to the caller holding an FS lock or performing a journal
         * transaction in the case of a filesystem like ext[3|4]. In this case,
         * it is not safe to block on pfmemalloc_wait as kswapd could be
         * blocked waiting on the same lock. Instead, throttle for up to a
         * second before continuing.
         */
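
Put together, the decision made by throttle_direct_reclaim() can be
summarised as below.  This is a sketch with made-up names that only
models which kind of wait is chosen; the three inputs stand for the
PF_KTHREAD test, the pfmemalloc_watermark_ok() result and the __GFP_FS
test.

```c
#include <stdbool.h>

enum model_throttle {
    MODEL_NO_THROTTLE,      /* return straight away */
    MODEL_THROTTLE_TIMEOUT, /* wait, but for at most about a second */
    MODEL_THROTTLE_BLOCK    /* wait until kswapd wakes us (killable) */
};

enum model_throttle model_throttle_decision(bool is_kthread, bool wmark_ok,
                                            bool may_enter_fs)
{
    /* Kernel threads may be needed for reclaim to make progress. */
    if (is_kthread)
        return MODEL_NO_THROTTLE;

    /* Reserves are healthy, nothing to do. */
    if (wmark_ok)
        return MODEL_NO_THROTTLE;

    /* The caller may hold an FS lock kswapd needs: never block forever. */
    if (!may_enter_fs)
        return MODEL_THROTTLE_TIMEOUT;

    return MODEL_THROTTLE_BLOCK;
}
```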


> > +	if (!(gfp_mask & __GFP_FS)) {
> > +		wait_event_interruptible_timeout(pgdat->pfmemalloc_wait,
> > +			pfmemalloc_watermark_ok(pgdat), HZ);
> > +		return;
> > +	}
> > +
> > +	/* Throttle until kswapd wakes the process */
> > +	wait_event_killable(zone->zone_pgdat->pfmemalloc_wait,
> > +		pfmemalloc_watermark_ok(pgdat));
> > +}
> > +
> >  unsigned long try_to_free_pages(struct zonelist *zonelist, int order,
> >  				gfp_t gfp_mask, nodemask_t *nodemask)
> >  {
> >
> > ...
> >
> > @@ -2610,6 +2686,20 @@ static bool sleeping_prematurely(pg_data_t *pgdat, int order, long remaining,
> >  	if (remaining)
> >  		return true;
> >  
> > +	/*
> > +	 * There is a potential race between when kswapd checks it watermarks
> 
> "its"
> 

Fixed.

> > +	 * and a process gets throttled. There is also a potential race if
> > +	 * processes get throttled, kswapd wakes, a large process exits therby
> > +	 * balancing the zones that causes kswapd to miss a wakeup. If kswapd
> > +	 * is going to sleep, no process should be sleeping on pfmemalloc_wait
> > +	 * so wake them now if necessary. If necessary, processes will wake
> > +	 * kswapd and get throttled again
> > +	 */
> 
> Yes, the possibility for missed wakeups here worried me.  There's no
> synchronization and it would be easy to leave holes.
> 
> It's good that there is no timeout on the throttling - a timeout would
> cover up rare races most nastily.
> 

Yes and I wanted to avoid that. If there is a lost wakeup, sysrq+t should
show processes stuck in throttle_direct_reclaim() while kswapd is asleep.

> > +	if (waitqueue_active(&pgdat->pfmemalloc_wait)) {
> > +		wake_up(&pgdat->pfmemalloc_wait);
> > +		return true;
> > +	}
> 
> A bool-returning function called "sleeping_prematurely" should have no
> side-effects.  But it now performs wakeups.  Wanna see if there is a
> way of making this nicer?
> 

Minimally, the two instances of "There is a potential race" were a
merging mistake so I deleted the one in kswapd_try_to_sleep().

I looked at moving this wake_up outside sleeping_prematurely() but it
looked worse really. What I did instead was rename
sleeping_prematurely() to prepare_kswapd_sleep() and commented it
like this

/*
 * Prepare kswapd for sleeping. This verifies that there are no processes
 * waiting in throttle_direct_reclaim() and that watermarks have been
 * met.
 *
 * Returns true if kswapd is ready to sleep
 */
static bool prepare_kswapd_sleep(pg_data_t *pgdat, int order, long remaining,
                                        int classzone_idx)

It's cheating a bit but a name like "prepare" implies that it may have
side-effects.
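
Note that the sense of the return value flips with the rename.  A
simplified user-space sketch of the new contract, with invented model_*
names and the watermark walk collapsed into a single flag:

```c
#include <stdbool.h>

struct model_pgdat {
    int throttled_waiters;  /* processes sleeping on pfmemalloc_wait */
    bool balanced;          /* outcome of the watermark checks */
};

/* Returns true if kswapd is ready to sleep; may wake throttled processes. */
bool model_prepare_kswapd_sleep(struct model_pgdat *pgdat, long remaining)
{
    /* Woken early by a direct reclaimer: not ready to sleep. */
    if (remaining)
        return false;

    /* Nobody may stay throttled while kswapd sleeps, so wake them first. */
    if (pgdat->throttled_waiters > 0) {
        pgdat->throttled_waiters = 0;   /* models wake_up() on the queue */
        return false;
    }

    return pgdat->balanced;
}
```

Callers that used to test !sleeping_prematurely(...) would test
prepare_kswapd_sleep(...) directly.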

> >  	/* Check the watermark levels */
> >  	for (i = 0; i <= classzone_idx; i++) {
> >  		struct zone *zone = pgdat->node_zones + i;
> > @@ -2871,6 +2961,12 @@ loop_again:
> >  			}
> >  
> >  		}
> > +
> > +		/* Wake throttled direct reclaimers if low watermark is met */
> 
> s/"what"/"why"/ !
> 

                /*
                 * If the low watermark is met there is no need for processes
                 * to be throttled on pfmemalloc_wait as they should now be
                 * able to safely make forward progress. Wake them
                 */

?

Here is how the patch currently stands

---8<---
mm: Throttle direct reclaimers if PF_MEMALLOC reserves are low and swap is backed by network storage

If swap is backed by network storage such as NBD, there is a risk
that a large number of reclaimers can hang the system by consuming
all PF_MEMALLOC reserves. To avoid these hangs, the administrator
must tune min_free_kbytes in advance which is a bit fragile.

This patch throttles direct reclaimers if half the PF_MEMALLOC reserves
are in use. If the system is routinely getting throttled the system
administrator can increase min_free_kbytes so degradation is smoother
but the system will keep running.

Signed-off-by: Mel Gorman <mgorman@suse.de>
---
 include/linux/mmzone.h |    1 +
 mm/page_alloc.c        |    1 +
 mm/vmscan.c            |  128 +++++++++++++++++++++++++++++++++++++++++++++---
 3 files changed, 122 insertions(+), 8 deletions(-)

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index dff7115..e6b733d 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -663,6 +663,7 @@ typedef struct pglist_data {
 					     range, including holes */
 	int node_id;
 	wait_queue_head_t kswapd_wait;
+	wait_queue_head_t pfmemalloc_wait;
 	struct task_struct *kswapd;
 	int kswapd_max_order;
 	enum zone_type classzone_idx;
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index e225a7c..b9eb64a 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -4326,6 +4326,7 @@ static void __paginginit free_area_init_core(struct pglist_data *pgdat,
 	pgdat_resize_init(pgdat);
 	pgdat->nr_zones = 0;
 	init_waitqueue_head(&pgdat->kswapd_wait);
+	init_waitqueue_head(&pgdat->pfmemalloc_wait);
 	pgdat->kswapd_max_order = 0;
 	pgdat_page_cgroup_init(pgdat);
 	
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 33c332b..6f322e8 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -2431,6 +2431,80 @@ out:
 	return 0;
 }
 
+static bool pfmemalloc_watermark_ok(pg_data_t *pgdat)
+{
+	struct zone *zone;
+	unsigned long pfmemalloc_reserve = 0;
+	unsigned long free_pages = 0;
+	int i;
+	bool wmark_ok;
+
+	for (i = 0; i <= ZONE_NORMAL; i++) {
+		zone = &pgdat->node_zones[i];
+		pfmemalloc_reserve += min_wmark_pages(zone);
+		free_pages += zone_page_state(zone, NR_FREE_PAGES);
+	}
+
+	wmark_ok = free_pages > pfmemalloc_reserve / 2;
+
+	/* kswapd must be awake if processes are being throttled */
+	if (!wmark_ok && waitqueue_active(&pgdat->kswapd_wait)) {
+		pgdat->classzone_idx = min(pgdat->classzone_idx,
+						(enum zone_type)ZONE_NORMAL);
+		wake_up_interruptible(&pgdat->kswapd_wait);
+	}
+
+	return wmark_ok;
+}
+
+/*
+ * Throttle direct reclaimers if backing storage is backed by the network
+ * and the PFMEMALLOC reserve for the preferred node is getting dangerously
+ * depleted. kswapd will continue to make progress and wake the processes
+ * when the low watermark is reached
+ */
+static void throttle_direct_reclaim(gfp_t gfp_mask, struct zonelist *zonelist,
+					nodemask_t *nodemask)
+{
+	struct zone *zone;
+	int high_zoneidx = gfp_zone(gfp_mask);
+	pg_data_t *pgdat;
+
+	/*
+	 * Kernel threads should not be throttled as they may be indirectly
+	 * responsible for cleaning pages necessary for reclaim to make forward
+	 * progress. kjournald for example may enter direct reclaim while
+	 * committing a transaction where throttling it could force other
+	 * processes to block on log_wait_commit().
+	 */
+	if (current->flags & PF_KTHREAD)
+		return;
+
+	/* Check if the pfmemalloc reserves are ok */
+	first_zones_zonelist(zonelist, high_zoneidx, NULL, &zone);
+	pgdat = zone->zone_pgdat;
+	if (pfmemalloc_watermark_ok(pgdat))
+		return;
+
+	/*
+	 * If the caller cannot enter the filesystem, it's possible that it
+	 * is due to the caller holding an FS lock or performing a journal
+	 * transaction in the case of a filesystem like ext[3|4]. In this case,
+	 * it is not safe to block on pfmemalloc_wait as kswapd could be
+	 * blocked waiting on the same lock. Instead, throttle for up to a
+	 * second before continuing.
+	 */
+	if (!(gfp_mask & __GFP_FS)) {
+		wait_event_interruptible_timeout(pgdat->pfmemalloc_wait,
+			pfmemalloc_watermark_ok(pgdat), HZ);
+		return;
+	}
+
+	/* Throttle until kswapd wakes the process */
+	wait_event_killable(zone->zone_pgdat->pfmemalloc_wait,
+		pfmemalloc_watermark_ok(pgdat));
+}
+
 unsigned long try_to_free_pages(struct zonelist *zonelist, int order,
 				gfp_t gfp_mask, nodemask_t *nodemask)
 {
@@ -2449,6 +2523,15 @@ unsigned long try_to_free_pages(struct zonelist *zonelist, int order,
 		.gfp_mask = sc.gfp_mask,
 	};
 
+	throttle_direct_reclaim(gfp_mask, zonelist, nodemask);
+
+	/*
+	 * Do not enter reclaim if fatal signal is pending. 1 is returned so
+	 * that the page allocator does not consider triggering OOM
+	 */
+	if (fatal_signal_pending(current))
+		return 1;
+
 	trace_mm_vmscan_direct_reclaim_begin(order,
 				sc.may_writepage,
 				gfp_mask);
@@ -2598,8 +2681,13 @@ static bool pgdat_balanced(pg_data_t *pgdat, unsigned long balanced_pages,
 	return balanced_pages >= (present_pages >> 2);
 }
 
-/* is kswapd sleeping prematurely? */
-static bool sleeping_prematurely(pg_data_t *pgdat, int order, long remaining,
+/*
+ * Prepare kswapd for sleeping. This verifies that there are no processes
+ * waiting in throttle_direct_reclaim() and that watermarks have been met.
+ *
+ * Returns true if kswapd is ready to sleep
+ */
+static bool prepare_kswapd_sleep(pg_data_t *pgdat, int order, long remaining,
 					int classzone_idx)
 {
 	int i;
@@ -2608,7 +2696,21 @@ static bool sleeping_prematurely(pg_data_t *pgdat, int order, long remaining,
 
 	/* If a direct reclaimer woke kswapd within HZ/10, it's premature */
 	if (remaining)
-		return true;
+		return false;
+
+	/*
+	 * There is a potential race between when kswapd checks its watermarks
+	 * and a process gets throttled. There is also a potential race if
+	 * processes get throttled, kswapd wakes, a large process exits thereby
+	 * balancing the zones that causes kswapd to miss a wakeup. If kswapd
+	 * is going to sleep, no process should be sleeping on pfmemalloc_wait
+	 * so wake them now if necessary. If necessary, processes will wake
+	 * kswapd and get throttled again
+	 */
+	if (waitqueue_active(&pgdat->pfmemalloc_wait)) {
+		wake_up(&pgdat->pfmemalloc_wait);
+		return false;
+	}
 
 	/* Check the watermark levels */
 	for (i = 0; i <= classzone_idx; i++) {
@@ -2641,9 +2743,9 @@ static bool sleeping_prematurely(pg_data_t *pgdat, int order, long remaining,
 	 * must be balanced
 	 */
 	if (order)
-		return !pgdat_balanced(pgdat, balanced, classzone_idx);
+		return pgdat_balanced(pgdat, balanced, classzone_idx);
 	else
-		return !all_zones_ok;
+		return all_zones_ok;
 }
 
 /*
@@ -2871,6 +2973,16 @@ loop_again:
 			}
 
 		}
+
+		/*
+		 * If the low watermark is met there is no need for processes
+		 * to be throttled on pfmemalloc_wait as they should now be
+		 * able to safely make forward progress. Wake them
+		 */
+		if (waitqueue_active(&pgdat->pfmemalloc_wait) &&
+				pfmemalloc_watermark_ok(pgdat))
+			wake_up(&pgdat->pfmemalloc_wait);
+
 		if (all_zones_ok || (order && pgdat_balanced(pgdat, balanced, *classzone_idx)))
 			break;		/* kswapd: all done */
 		/*
@@ -2971,7 +3083,7 @@ out:
 	}
 
 	/*
-	 * Return the order we were reclaiming at so sleeping_prematurely()
+	 * Return the order we were reclaiming at so prepare_kswapd_sleep()
 	 * makes a decision on the order we were last reclaiming at. However,
 	 * if another caller entered the allocator slow path while kswapd
 	 * was awake, order will remain at the higher level
@@ -2991,7 +3103,7 @@ static void kswapd_try_to_sleep(pg_data_t *pgdat, int order, int classzone_idx)
 	prepare_to_wait(&pgdat->kswapd_wait, &wait, TASK_INTERRUPTIBLE);
 
 	/* Try to sleep for a short interval */
-	if (!sleeping_prematurely(pgdat, order, remaining, classzone_idx)) {
+	if (prepare_kswapd_sleep(pgdat, order, remaining, classzone_idx)) {
 		remaining = schedule_timeout(HZ/10);
 		finish_wait(&pgdat->kswapd_wait, &wait);
 		prepare_to_wait(&pgdat->kswapd_wait, &wait, TASK_INTERRUPTIBLE);
@@ -3001,7 +3113,7 @@ static void kswapd_try_to_sleep(pg_data_t *pgdat, int order, int classzone_idx)
 	 * After a short sleep, check if it was a premature sleep. If not, then
 	 * go fully to sleep until explicitly woken up.
 	 */
-	if (!sleeping_prematurely(pgdat, order, remaining, classzone_idx)) {
+	if (prepare_kswapd_sleep(pgdat, order, remaining, classzone_idx)) {
 		trace_mm_vmscan_kswapd_sleep(pgdat->node_id);
 
 		/*

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related

* Re: [PATCH 05/16] mm: allow PF_MEMALLOC from softirq context
From: Mel Gorman @ 2012-05-02 16:24 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Linux-MM, Linux-Netdev, LKML, David Miller, Neil Brown,
	Peter Zijlstra, Mike Christie, Eric B Munson
In-Reply-To: <20120501150813.657cd5c0.akpm@linux-foundation.org>

On Tue, May 01, 2012 at 03:08:13PM -0700, Andrew Morton wrote:
> On Mon, 16 Apr 2012 13:16:52 +0100
> Mel Gorman <mgorman@suse.de> wrote:
> 
> > This is needed to allow network softirq packet processing to make
> > use of PF_MEMALLOC.
> 
> hm, why?  You just added __GFP_MEMALLOC so we don't need to futz with
> PF_MEMALLOC?
> 

The number of call sites is a problem. In patch 12, PF_MEMALLOC is set
where required. For example it is set in __netif_receive_skb() before it
calls packet_type->func() which is a per-protocol receive function such
as net/ipv4/ip_input.c#ip_rcv(). To use __GFP_MEMALLOC, every allocation
on this path would need to check the skb and set the flag as appropriate
for every protocol. This would make a mess and seeing as it is needed for
every allocation it makes more sense to set PF_MEMALLOC.
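
To illustrate the call-site argument, here is a minimal userspace sketch, not
kernel code. All toy_* names and flag values are hypothetical stand-ins. It
only shows why a per-task flag set once at the entry point covers every nested
allocation, whereas a per-allocation flag would have to be passed at each site.

```c
#include <stdbool.h>

/* Toy stand-ins; the real kernel flags have different values and semantics. */
#define TOY_GFP_MEMALLOC 0x1u	/* per-allocation: this one may use reserves */
#define TOY_PF_MEMALLOC  0x2u	/* per-task: every allocation may use reserves */

struct toy_task {
	unsigned int flags;
};

static int toy_reserve_allocs;	/* allocations that were allowed into the reserves */

/* Toy allocator: the reserves are reachable through either flag. */
static void toy_alloc(const struct toy_task *task, unsigned int gfp)
{
	if ((gfp & TOY_GFP_MEMALLOC) || (task->flags & TOY_PF_MEMALLOC))
		toy_reserve_allocs++;
}

/*
 * Hypothetical protocol handler with several nested allocations, none of
 * which knows whether the packet belongs to a swap-related socket.
 */
static void toy_proto_rcv(const struct toy_task *task)
{
	toy_alloc(task, 0);
	toy_alloc(task, 0);
	toy_alloc(task, 0);
}

/* Toy receive path: set the task flag once around the handler call. */
static int toy_receive(struct toy_task *task, bool skb_is_pfmemalloc)
{
	unsigned int orig_flags = task->flags;

	toy_reserve_allocs = 0;
	if (skb_is_pfmemalloc)
		task->flags |= TOY_PF_MEMALLOC;
	toy_proto_rcv(task);
	task->flags = orig_flags;	/* do not leak the flag to the caller */
	return toy_reserve_allocs;
}
```

In this toy model the handler never has to inspect the packet; the single
flag set in toy_receive() decides for all three allocations.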

> > Currently softirq context cannot use PF_MEMALLOC due to it not being
> > associated with a task, and therefore not having task flags to fiddle
> > with - thus the gfp to alloc flag mapping ignores the task flags when
> > in interrupts (hard or soft) context.
> > 
> > Allowing softirqs to make use of PF_MEMALLOC therefore requires some
> > trickery.  We basically borrow the task flags from whatever process
> > happens to be preempted by the softirq.
> > 
> > So we modify the gfp to alloc flags mapping to not exclude task flags
> > in softirq context, and modify the softirq code to save, clear and
> > restore the PF_MEMALLOC flag.
> > 
> > The save and clear, ensures the preempted task's PF_MEMALLOC flag
> > doesn't leak into the softirq. The restore ensures a softirq's
> > PF_MEMALLOC flag cannot leak back into the preempted process.
> > 
> > ...
> >
> > --- a/include/linux/sched.h
> > +++ b/include/linux/sched.h
> > @@ -1913,6 +1913,13 @@ static inline void rcu_copy_process(struct task_struct *p)
> >  
> >  #endif
> >  
> > +static inline void tsk_restore_flags(struct task_struct *p,
> > +				     unsigned long pflags, unsigned long mask)
> 
> The naming is poor.
> 
> p -> "tsk" or "task"
> pflags -> "old_flags"
> mask -> "flags"
> 

I went with orig_flags instead of old_flags so it reads as "restore the
original task flags".

> > +{
> > +	p->flags &= ~mask;
> > +	p->flags |= pflags & mask;
> > +}
> > +
> >  #ifdef CONFIG_SMP
> >  extern void do_set_cpus_allowed(struct task_struct *p,
> >  			       const struct cpumask *new_mask);
> > diff --git a/kernel/softirq.c b/kernel/softirq.c
> > index 671f959..d349caa 100644
> > --- a/kernel/softirq.c
> > +++ b/kernel/softirq.c
> > @@ -210,6 +210,8 @@ asmlinkage void __do_softirq(void)
> >  	__u32 pending;
> >  	int max_restart = MAX_SOFTIRQ_RESTART;
> >  	int cpu;
> > +	unsigned long pflags = current->flags;
> 
> "old_flags"
> 
> > +	current->flags &= ~PF_MEMALLOC;
> 
> The line before this one would be a suitable place for a comment!
> 

        /*
         * Mask out PF_MEMALLOC as the current task context is borrowed for the
         * softirq. A softirq handler such as network RX might set PF_MEMALLOC
         * again if the socket is related to swap
         */

?

> >  	pending = local_softirq_pending();
> >  	account_system_vtime(current);
> > @@ -265,6 +267,7 @@ restart:
> >  
> >  	account_system_vtime(current);
> >  	__local_bh_enable(SOFTIRQ_OFFSET);
> > +	tsk_restore_flags(current, pflags, PF_MEMALLOC);
> >  }
> >  
> > ...
> >
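
The save, clear and restore sequence discussed above can be sketched in
userspace as follows. This is only an illustration under stated assumptions:
the toy_* names, the struct and the flag values are hypothetical, and the
parameter names follow the renaming suggested in the review (task,
orig_flags, flags).

```c
#define TOY_PF_MEMALLOC 0x0800UL	/* illustrative bit value, an assumption */
#define TOY_PF_OTHER    0x0001UL	/* hypothetical unrelated task flag */

struct toy_task {
	unsigned long flags;
};

/* Restore only the bits selected by "flags" from the saved copy. */
static void toy_restore_flags(struct toy_task *task,
			      unsigned long orig_flags, unsigned long flags)
{
	task->flags &= ~flags;
	task->flags |= orig_flags & flags;
}

/*
 * Models the softirq entry/exit pattern: save the borrowed task's flags,
 * clear PF_MEMALLOC, let a handler optionally set it, then restore.
 * Returns the task flags after the softirq; *seen_by_handler receives
 * the flags the handler started with.
 */
static unsigned long toy_do_softirq(struct toy_task *cur,
				    int handler_sets_memalloc,
				    unsigned long *seen_by_handler)
{
	unsigned long orig_flags = cur->flags;

	cur->flags &= ~TOY_PF_MEMALLOC;		/* do not leak into the softirq */
	*seen_by_handler = cur->flags;

	if (handler_sets_memalloc)
		cur->flags |= TOY_PF_MEMALLOC;	/* e.g. RX for a swap socket */

	/* do not leak back into the preempted task */
	toy_restore_flags(cur, orig_flags, TOY_PF_MEMALLOC);
	return cur->flags;
}
```

Only the masked bit is touched on restore, so any other flag the task had
is left exactly as the handler path left it.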

-- 
Mel Gorman
SUSE Labs


^ permalink raw reply

* Re: [PATCH 3/4 v2 net-next] net: make GRO aware of skb->head_frag
From: Eric Dumazet @ 2012-05-02 16:19 UTC (permalink / raw)
  To: Alexander Duyck
  Cc: Alexander Duyck, David Miller, netdev, Neal Cardwell, Tom Herbert,
	Jeff Kirsher, Michael Chan, Matt Carlson, Herbert Xu,
	Ben Hutchings, Ilpo Järvinen, Maciej Żenczykowski
In-Reply-To: <4FA15DDE.5090904@intel.com>

On Wed, 2012-05-02 at 09:16 -0700, Alexander Duyck wrote:

> I was working with the out-of-tree ixgbe because it gives me the option
> of stripping out FCoE and RSC via a couple of build flags.  The problem
> is that I don't know if the head frag stuff will work out very well with
> ixgbe, because RSC and FCoE require 1K-aligned receive buffers.  We
> would have to bump up our allocation size by NET_SKB_PAD plus
> skb_shared_info, which would likely force us up to order 1 pages on
> most platforms.

What is RSC exactly, and why is RSC used in the build_skb() context?

^ permalink raw reply

* Re: [PATCH 3/4 v2 net-next] net: make GRO aware of skb->head_frag
From: Alexander Duyck @ 2012-05-02 16:16 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Alexander Duyck, David Miller, netdev, Neal Cardwell, Tom Herbert,
	Jeff Kirsher, Michael Chan, Matt Carlson, Herbert Xu,
	Ben Hutchings, Ilpo Järvinen, Maciej Żenczykowski
In-Reply-To: <1335947084.22133.134.camel@edumazet-glaptop>

On 05/02/2012 01:24 AM, Eric Dumazet wrote:
> On Tue, 2012-05-01 at 12:45 -0700, Alexander Duyck wrote:
>
>> I have a hacked together ixgbe up and running now with the new build_skb
>> logic and RSC/LRO disabled.  It looks like it is giving me a 5%
>> performance boost for small packet routing, but I am using more CPU for
>> netperf TCP receive tests and I was wondering if you had seen anything
>> similar on the tg3 driver?
> Really hard to say, numbers are so small on Gb link :
>
> what do you use to make your numbers ?
>
> netperf -H 172.30.42.23 -t OMNI -C -c 
> OMNI Send TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 172.30.42.23 (172.30.42.23) port 0 AF_INET
> Local       Local       Local  Elapsed Throughput Throughput  Local Local  Remote Remote Local   Remote  Service  
> Send Socket Send Socket Send   Time               Units       CPU   CPU    CPU    CPU    Service Service Demand   
> Size        Size        Size   (sec)                          Util  Util   Util   Util   Demand  Demand  Units    
> Final       Final                                             %     Method %      Method                          
> 1700840     1700840     16384  10.01   931.60     10^6bits/s  4.50  S      1.32   S      1.582   2.783   usec/KB  
>
> About ixgbe, feel free to send your patch ;)
>
> Thanks !
>
>
I was working with the out-of-tree ixgbe because it gives me the option
of stripping out FCoE and RSC via a couple of build flags.  The problem
is that I don't know if the head frag stuff will work out very well with
ixgbe, because RSC and FCoE require 1K-aligned receive buffers.  We
would have to bump up our allocation size by NET_SKB_PAD plus
skb_shared_info, which would likely force us up to order 1 pages on
most platforms.
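
As a rough illustration of the size arithmetic, the sketch below computes the
smallest page order that fits a receive buffer plus headroom plus the shared
info. Everything here is an assumption for illustration: the helper name
toy_alloc_order() is hypothetical, and real values of NET_SKB_PAD,
sizeof(struct skb_shared_info) and the buffer length depend on the platform
and the hardware configuration, none of which are given in this message.

```c
#include <stddef.h>

/*
 * Smallest order n such that (page_size << n) holds
 * pad + buf_len + shinfo_len bytes. Userspace sketch only.
 */
static unsigned int toy_alloc_order(size_t buf_len, size_t pad,
				    size_t shinfo_len, size_t page_size)
{
	size_t total = pad + buf_len + shinfo_len;
	unsigned int order = 0;

	while ((page_size << order) < total)
		order++;
	return order;
}
```

With assumed values of 64 bytes of pad and 320 bytes of shared info on 4096
byte pages, a buffer that already fills one page no longer fits in order 0
once the extra space is added, while a half-page buffer still does.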

Thanks,

Alex

^ permalink raw reply

* Re: [PATCH net-next] net: take care of cloned skbs in tcp_try_coalesce()
From: Eric Dumazet @ 2012-05-02 16:12 UTC (permalink / raw)
  To: Alexander Duyck
  Cc: Alexander Duyck, David Miller, netdev, Neal Cardwell, Tom Herbert,
	Jeff Kirsher, Michael Chan, Matt Carlson, Herbert Xu,
	Ben Hutchings, Ilpo Järvinen, Maciej Żenczykowski
In-Reply-To: <4FA15830.6080600@intel.com>

On Wed, 2012-05-02 at 08:52 -0700, Alexander Duyck wrote:
> On 05/02/2012 01:13 AM, Eric Dumazet wrote:
> > From: Eric Dumazet <edumazet@google.com>
> >
> > Before stealing fragments or skb head, we must make sure skb is not
> > cloned.
> >
> > If skb is cloned, we must take references on pages instead.
> >
> > Bug happened using tcpdump (if not using mmap())
> >
> > Reported-by: Alexander Duyck <alexander.h.duyck@intel.com>
> > Signed-off-by: Eric Dumazet <edumazet@google.com>
> > ---
> >  net/ipv4/tcp_input.c |   17 ++++++++++++++---
> >  1 file changed, 14 insertions(+), 3 deletions(-)
> >
> > diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
> > index 96a631d..7686d7f 100644
> > --- a/net/ipv4/tcp_input.c
> > +++ b/net/ipv4/tcp_input.c
> > @@ -4467,7 +4467,7 @@ static bool tcp_try_coalesce(struct sock *sk,
> >  			     struct sk_buff *from,
> >  			     bool *fragstolen)
> >  {
> > -	int delta, len = from->len;
> > +	int i, delta, len = from->len;
> >  
> >  	*fragstolen = false;
> >  	if (tcp_hdr(from)->fin)
> > @@ -4497,7 +4497,13 @@ copyfrags:
> >  		       skb_shinfo(from)->frags,
> >  		       skb_shinfo(from)->nr_frags * sizeof(skb_frag_t));
> >  		skb_shinfo(to)->nr_frags += skb_shinfo(from)->nr_frags;
> > -		skb_shinfo(from)->nr_frags = 0;
> > +
> > +		if (skb_cloned(from))
> > +			for (i = 0; i < skb_shinfo(from)->nr_frags; i++)
> > +				skb_frag_ref(from, i);
> > +		else
> > +			skb_shinfo(from)->nr_frags = 0;
> > +
> >  		to->truesize += delta;
> >  		atomic_add(delta, &sk->sk_rmem_alloc);
> >  		sk_mem_charge(sk, delta);
> I am fairly certain the bug I saw is only masked over by this change. 
> The underlying problem is that we shouldn't be messing with nr_frags on
> the from or the to if either one is a clone.  You now have a check in
> place for the from, but what about the to?  This function should
> probably be calling a pskb_expand_head on the to skb in order to
> guarantee that the skb->head isn't shared.  Otherwise this is going to
> cause other issues for any functions that are sharing these skbs that
> just walk through frags without checking skb->len or skb->data_len first. 

It's safe to increase to->len and nr_frags in this context, because we
hold a reference to dataref: it cannot disappear under us.

Clones will still have their skb->len from skb_clone() time and won't
care that we expanded the frags.

> 
> > @@ -4515,7 +4521,12 @@ copyfrags:
> >  		offset = from->data - (unsigned char *)page_address(page);
> >  		skb_fill_page_desc(to, skb_shinfo(to)->nr_frags,
> >  				   page, offset, skb_headlen(from));
> > -		*fragstolen = true;
> > +
> > +		if (skb_cloned(from))
> > +			get_page(page);
> > +		else
> > +			*fragstolen = true;
> > +
> >  		delta = len; /* we dont know real truesize... */
> >  		goto copyfrags;
> >  	}
> >
> >
> I don't see where we are now addressing the put_page call to release the
> page afterwards.  By calling get_page you are incrementing the page
> count by one, but where are you decrementing dataref in the shared
> info?  Without that we are looking at a memory leak because __kfree_skb
> will decrement the dataref but it will never reach 0 so it will never
> call put_page on the head frag.

Really, the dataref was already incremented at skb_clone() time.

It will be properly decremented since we call __kfree_skb().

Only the last decrement will perform the put_page().

Think about what splice() is doing; it's the same get_page() game.
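
The reference accounting described above can be sketched with a toy userspace
model. This is not the kernel implementation: the toy_* types and helpers are
hypothetical, and the model collapses an skb to a dataref counter plus one
head page. It only shows that the page count returns to zero both when "from"
is cloned (an extra page reference is taken) and when it is not (the existing
reference is taken over).

```c
#include <stdbool.h>

struct toy_page {
	int refcount;
};

struct toy_shinfo {
	int dataref;			/* number of skbs sharing the data */
	struct toy_page *head_page;
};

static void toy_get_page(struct toy_page *p) { p->refcount++; }
static void toy_put_page(struct toy_page *p) { p->refcount--; }

/*
 * Drop one skb's hold on the shared data. The last holder releases the
 * head page, unless the coalescing code took over that reference.
 */
static void toy_free_skb(struct toy_shinfo *sh, bool head_stolen)
{
	if (--sh->dataref == 0 && !head_stolen)
		toy_put_page(sh->head_page);
}

/* Returns the final page refcount; 0 means nothing leaked. */
static int toy_coalesce_scenario(bool cloned)
{
	struct toy_page page = { .refcount = 1 };	/* held by from's head */
	struct toy_shinfo sh = { .dataref = 1, .head_page = &page };
	bool fragstolen = false;

	if (cloned)
		sh.dataref++;		/* e.g. a packet tap cloned "from" */

	/* Coalesce: "to" gains a frag that points at the page. */
	if (sh.dataref > 1)
		toy_get_page(&page);	/* shared: take an extra reference */
	else
		fragstolen = true;	/* exclusive: take over the existing one */

	toy_free_skb(&sh, fragstolen);		/* "from" is released */
	if (cloned)
		toy_free_skb(&sh, false);	/* the clone is released later */

	toy_put_page(&page);	/* "to" is consumed and its frag is released */
	return page.refcount;
}
```

In the cloned case the extra toy_get_page() is balanced by the frag release
on "to", and the original reference is dropped by whichever skb performs the
last dataref decrement.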

^ permalink raw reply

* RE: [RFC][PATCH] net: ipv4: ipconfig: decrease CONF_CARRIER_TIMEOUT
From: David Laight @ 2012-05-02 15:59 UTC (permalink / raw)
  To: Christian Hemp, davem, kuznet, jmorris, yoshfuji, kaber, netdev
In-Reply-To: <1335972259-20975-1-git-send-email-c.hemp@phytec.de>

 
> A timeout of two minutes is pretty annoying if _no_ ethernet cable
> is attached on purpose.  This patch decreases the timeout of
> CONF_CARRIER_TIMEOUT to an acceptable value of 10 seconds.
> 
...
>  
>  /* Define the friendly delay before and after opening net devices */
>  #define CONF_POST_OPEN		10	/* After opening: 10 msecs */
> -#define CONF_CARRIER_TIMEOUT	120000	/* Wait for carrier timeout */
> +#define CONF_CARRIER_TIMEOUT	1000	/* Wait for carrier timeout */

Doesn't that reduce it to 1 second?

I'm also not at all sure how long it might take.
I'm sure there are some switches/routers that can take quite a while
to negotiate the link.
This is usually noticed when dhcp takes links down and up as it assigns
an address.
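
A quick check of the arithmetic, assuming (as the question above implies)
that the constant is expressed in milliseconds. The helper names are
hypothetical and this is plain userspace C, not the ipconfig code.

```c
#define TOY_MSEC_PER_SEC 1000U

/* Convert a millisecond timeout to whole seconds. */
static unsigned int toy_msecs_to_secs(unsigned int msecs)
{
	return msecs / TOY_MSEC_PER_SEC;
}

/* Convert a timeout in seconds to milliseconds. */
static unsigned int toy_secs_to_msecs(unsigned int secs)
{
	return secs * TOY_MSEC_PER_SEC;
}
```

Under that assumption the old value of 120000 is two minutes, the patched
value of 1000 is one second, and the ten seconds named in the description
would be 10000.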

	David

^ permalink raw reply

* Re: [PATCH net-next] net: take care of cloned skbs in tcp_try_coalesce()
From: Alexander Duyck @ 2012-05-02 15:52 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Alexander Duyck, David Miller, netdev, Neal Cardwell, Tom Herbert,
	Jeff Kirsher, Michael Chan, Matt Carlson, Herbert Xu,
	Ben Hutchings, Ilpo Järvinen, Maciej Żenczykowski
In-Reply-To: <1335946384.22133.119.camel@edumazet-glaptop>

On 05/02/2012 01:13 AM, Eric Dumazet wrote:
> From: Eric Dumazet <edumazet@google.com>
>
> Before stealing fragments or skb head, we must make sure skb is not
> cloned.
>
> If skb is cloned, we must take references on pages instead.
>
> Bug happened using tcpdump (if not using mmap())
>
> Reported-by: Alexander Duyck <alexander.h.duyck@intel.com>
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> ---
>  net/ipv4/tcp_input.c |   17 ++++++++++++++---
>  1 file changed, 14 insertions(+), 3 deletions(-)
>
> diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
> index 96a631d..7686d7f 100644
> --- a/net/ipv4/tcp_input.c
> +++ b/net/ipv4/tcp_input.c
> @@ -4467,7 +4467,7 @@ static bool tcp_try_coalesce(struct sock *sk,
>  			     struct sk_buff *from,
>  			     bool *fragstolen)
>  {
> -	int delta, len = from->len;
> +	int i, delta, len = from->len;
>  
>  	*fragstolen = false;
>  	if (tcp_hdr(from)->fin)
> @@ -4497,7 +4497,13 @@ copyfrags:
>  		       skb_shinfo(from)->frags,
>  		       skb_shinfo(from)->nr_frags * sizeof(skb_frag_t));
>  		skb_shinfo(to)->nr_frags += skb_shinfo(from)->nr_frags;
> -		skb_shinfo(from)->nr_frags = 0;
> +
> +		if (skb_cloned(from))
> +			for (i = 0; i < skb_shinfo(from)->nr_frags; i++)
> +				skb_frag_ref(from, i);
> +		else
> +			skb_shinfo(from)->nr_frags = 0;
> +
>  		to->truesize += delta;
>  		atomic_add(delta, &sk->sk_rmem_alloc);
>  		sk_mem_charge(sk, delta);
I am fairly certain the bug I saw is only masked over by this change. 
The underlying problem is that we shouldn't be messing with nr_frags on
the from or the to if either one is a clone.  You now have a check in
place for the from, but what about the to?  This function should
probably be calling a pskb_expand_head on the to skb in order to
guarantee that the skb->head isn't shared.  Otherwise this is going to
cause other issues for any functions that are sharing these skbs that
just walk through frags without checking skb->len or skb->data_len first. 

> @@ -4515,7 +4521,12 @@ copyfrags:
>  		offset = from->data - (unsigned char *)page_address(page);
>  		skb_fill_page_desc(to, skb_shinfo(to)->nr_frags,
>  				   page, offset, skb_headlen(from));
> -		*fragstolen = true;
> +
> +		if (skb_cloned(from))
> +			get_page(page);
> +		else
> +			*fragstolen = true;
> +
>  		delta = len; /* we dont know real truesize... */
>  		goto copyfrags;
>  	}
>
>
I don't see where we are now addressing the put_page call to release the
page afterwards.  By calling get_page you are incrementing the page
count by one, but where are you decrementing dataref in the shared
info?  Without that we are looking at a memory leak because __kfree_skb
will decrement the dataref but it will never reach 0 so it will never
call put_page on the head frag.

Thanks,

Alex

^ permalink raw reply

* Re: [PATCH 01/13 v4] usb/net: rndis: inline the cpu_to_le32() macro
From: Jussi Kivilinna @ 2012-05-02 15:29 UTC (permalink / raw)
  To: Linus Walleij
  Cc: netdev-u79uwXL29TY76Z2rM5mHXA, linux-usb-u79uwXL29TY76Z2rM5mHXA,
	Greg Kroah-Hartman, David S. Miller, Felipe Balbi, Haiyang Zhang,
	Wei Yongjun, Ben Hutchings
In-Reply-To: <1335896538-13281-1-git-send-email-linus.walleij-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>

[-- Attachment #1: Type: text/plain, Size: 35319 bytes --]

Quoting Linus Walleij <linus.walleij-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>:

> The header file <linux/usb/rndis_host.h> used a number of #defines
> that included the cpu_to_le32() macro to assure the result will be
> in LE endianness. Inlining this into the code instead of using it
> in the code definitions yields consolidation opportunities later
> on as you will see in the following patches. The individual
> drivers also used local defines - all are switched over to the
> pattern of doing the conversion at the call sites instead.
>

After this patch, the sparse endianness checks output:

   CHECK   drivers/net/usb/rndis_host.c
drivers/net/usb/rndis_host.c:152:18: warning: restricted __le32 degrades to integer
drivers/net/usb/rndis_host.c:152:13: warning: incorrect type in assignment (different base types)
drivers/net/usb/rndis_host.c:152:13:    expected restricted __le32 [usertype] rsp
drivers/net/usb/rndis_host.c:152:13:    got unsigned int

   CHECK   drivers/net/wireless/rndis_wlan.c
drivers/net/wireless/rndis_wlan.c:2627:38: warning: incorrect type in argument 2 (different base types)
drivers/net/wireless/rndis_wlan.c:2627:38:    expected restricted __le32 [usertype] oid
drivers/net/wireless/rndis_wlan.c:2627:38:    got int
drivers/net/wireless/rndis_wlan.c:2657:37: warning: incorrect type in argument 2 (different base types)
drivers/net/wireless/rndis_wlan.c:2657:37:    expected restricted __le32 [usertype] oid
drivers/net/wireless/rndis_wlan.c:2657:37:    got int
drivers/net/wireless/rndis_wlan.c:1258:37: warning: incorrect type in argument 2 (different base types)
drivers/net/wireless/rndis_wlan.c:1258:37:    expected restricted __le32 [usertype] oid
drivers/net/wireless/rndis_wlan.c:1258:37:    got int
drivers/net/wireless/rndis_wlan.c:2116:39: warning: incorrect type in argument 2 (different base types)
drivers/net/wireless/rndis_wlan.c:2116:39:    expected restricted __le32 [usertype] oid
drivers/net/wireless/rndis_wlan.c:2116:39:    got int
drivers/net/wireless/rndis_wlan.c:1036:38: warning: incorrect type in argument 2 (different base types)
drivers/net/wireless/rndis_wlan.c:1036:38:    expected restricted __le32 [usertype] oid
drivers/net/wireless/rndis_wlan.c:1036:38:    got int
drivers/net/wireless/rndis_wlan.c:1045:37: warning: incorrect type in argument 2 (different base types)
drivers/net/wireless/rndis_wlan.c:1045:37:    expected restricted __le32 [usertype] oid
drivers/net/wireless/rndis_wlan.c:1045:37:    got int
drivers/net/wireless/rndis_wlan.c:1062:37: warning: incorrect type in argument 2 (different base types)
drivers/net/wireless/rndis_wlan.c:1062:37:    expected restricted __le32 [usertype] oid
drivers/net/wireless/rndis_wlan.c:1062:37:    got int
drivers/net/wireless/rndis_wlan.c:1086:39: warning: incorrect type in argument 2 (different base types)
drivers/net/wireless/rndis_wlan.c:1086:39:    expected restricted __le32 [usertype] oid
drivers/net/wireless/rndis_wlan.c:1086:39:    got int
drivers/net/wireless/rndis_wlan.c:1097:40: warning: incorrect type in argument 2 (different base types)
drivers/net/wireless/rndis_wlan.c:1097:40:    expected restricted __le32 [usertype] oid
drivers/net/wireless/rndis_wlan.c:1097:40:    got int
drivers/net/wireless/rndis_wlan.c:1122:45: warning: incorrect type in argument 2 (different base types)
drivers/net/wireless/rndis_wlan.c:1122:45:    expected restricted __le32 [usertype] oid
drivers/net/wireless/rndis_wlan.c:1122:45:    got int
drivers/net/wireless/rndis_wlan.c:1184:37: warning: incorrect type in argument 2 (different base types)
drivers/net/wireless/rndis_wlan.c:1184:37:    expected restricted __le32 [usertype] oid
drivers/net/wireless/rndis_wlan.c:1184:37:    got int
drivers/net/wireless/rndis_wlan.c:1211:38: warning: incorrect type in argument 2 (different base types)
drivers/net/wireless/rndis_wlan.c:1211:38:    expected restricted __le32 [usertype] oid
drivers/net/wireless/rndis_wlan.c:1211:38:    got int
drivers/net/wireless/rndis_wlan.c:1237:37: warning: incorrect type in argument 2 (different base types)
drivers/net/wireless/rndis_wlan.c:1237:37:    expected restricted __le32 [usertype] oid
drivers/net/wireless/rndis_wlan.c:1237:37:    got int
drivers/net/wireless/rndis_wlan.c:1285:38: warning: incorrect type in argument 2 (different base types)
drivers/net/wireless/rndis_wlan.c:1285:38:    expected restricted __le32 [usertype] oid
drivers/net/wireless/rndis_wlan.c:1285:38:    got int
drivers/net/wireless/rndis_wlan.c:1299:38: warning: incorrect type in argument 2 (different base types)
drivers/net/wireless/rndis_wlan.c:1299:38:    expected restricted __le32 [usertype] oid
drivers/net/wireless/rndis_wlan.c:1299:38:    got int
drivers/net/wireless/rndis_wlan.c:1336:39: warning: incorrect type in argument 2 (different base types)
drivers/net/wireless/rndis_wlan.c:1336:39:    expected restricted __le32 [usertype] oid
drivers/net/wireless/rndis_wlan.c:1336:39:    got int
drivers/net/wireless/rndis_wlan.c:1344:37: warning: incorrect type in argument 2 (different base types)
drivers/net/wireless/rndis_wlan.c:1344:37:    expected restricted __le32 [usertype] oid
drivers/net/wireless/rndis_wlan.c:1344:37:    got int
drivers/net/wireless/rndis_wlan.c:1362:39: warning: incorrect type in argument 2 (different base types)
drivers/net/wireless/rndis_wlan.c:1362:39:    expected restricted __le32 [usertype] oid
drivers/net/wireless/rndis_wlan.c:1362:39:    got int
drivers/net/wireless/rndis_wlan.c:1416:37: warning: incorrect type in argument 2 (different base types)
drivers/net/wireless/rndis_wlan.c:1416:37:    expected restricted __le32 [usertype] oid
drivers/net/wireless/rndis_wlan.c:1416:37:    got int
drivers/net/wireless/rndis_wlan.c:1507:37: warning: incorrect type in argument 2 (different base types)
drivers/net/wireless/rndis_wlan.c:1507:37:    expected restricted __le32 [usertype] oid
drivers/net/wireless/rndis_wlan.c:1507:37:    got int
drivers/net/wireless/rndis_wlan.c:1597:45: warning: incorrect type in argument 2 (different base types)
drivers/net/wireless/rndis_wlan.c:1597:45:    expected restricted __le32 [usertype] oid
drivers/net/wireless/rndis_wlan.c:1597:45:    got int
drivers/net/wireless/rndis_wlan.c:1603:45: warning: incorrect type in argument 2 (different base types)
drivers/net/wireless/rndis_wlan.c:1603:45:    expected restricted __le32 [usertype] oid
drivers/net/wireless/rndis_wlan.c:1603:45:    got int
drivers/net/wireless/rndis_wlan.c:1672:45: warning: incorrect type in argument 2 (different base types)
drivers/net/wireless/rndis_wlan.c:1672:45:    expected restricted __le32 [usertype] oid
drivers/net/wireless/rndis_wlan.c:1672:45:    got int
drivers/net/wireless/rndis_wlan.c:1685:37: warning: incorrect type in argument 2 (different base types)
drivers/net/wireless/rndis_wlan.c:1685:37:    expected restricted __le32 [usertype] oid
drivers/net/wireless/rndis_wlan.c:1685:37:    got int
drivers/net/wireless/rndis_wlan.c:1751:39: warning: incorrect type in argument 2 (different base types)
drivers/net/wireless/rndis_wlan.c:1751:39:    expected restricted __le32 [usertype] oid
drivers/net/wireless/rndis_wlan.c:1751:39:    got int
drivers/net/wireless/rndis_wlan.c:1779:37: warning: incorrect type in argument 2 (different base types)
drivers/net/wireless/rndis_wlan.c:1779:37:    expected restricted __le32 [usertype] oid
drivers/net/wireless/rndis_wlan.c:1779:37:    got int
drivers/net/wireless/rndis_wlan.c:2514:39: warning: incorrect type in argument 2 (different base types)
drivers/net/wireless/rndis_wlan.c:2514:39:    expected restricted __le32 [usertype] oid
drivers/net/wireless/rndis_wlan.c:2514:39:    got int
drivers/net/wireless/rndis_wlan.c:2521:39: warning: incorrect type in argument 2 (different base types)
drivers/net/wireless/rndis_wlan.c:2521:39:    expected restricted __le32 [usertype] oid
drivers/net/wireless/rndis_wlan.c:2521:39:    got int
drivers/net/wireless/rndis_wlan.c:2696:39: warning: incorrect type in argument 2 (different base types)
drivers/net/wireless/rndis_wlan.c:2696:39:    expected restricted __le32 [usertype] oid
drivers/net/wireless/rndis_wlan.c:2696:39:    got int
drivers/net/wireless/rndis_wlan.c:2723:39: warning: incorrect type in argument 2 (different base types)
drivers/net/wireless/rndis_wlan.c:2723:39:    expected restricted __le32 [usertype] oid
drivers/net/wireless/rndis_wlan.c:2723:39:    got int
drivers/net/wireless/rndis_wlan.c:3100:25: warning: restricted __le32 degrades to integer
drivers/net/wireless/rndis_wlan.c:3151:42: warning: incorrect type in argument 2 (different base types)
drivers/net/wireless/rndis_wlan.c:3151:42:    expected restricted __le32 [usertype] oid
drivers/net/wireless/rndis_wlan.c:3151:42:    got int
drivers/net/wireless/rndis_wlan.c:3176:42: warning: incorrect type in argument 2 (different base types)
drivers/net/wireless/rndis_wlan.c:3176:42:    expected restricted __le32 [usertype] oid
drivers/net/wireless/rndis_wlan.c:3176:42:    got int
drivers/net/wireless/rndis_wlan.c:3250:39: warning: incorrect type in argument 2 (different base types)
drivers/net/wireless/rndis_wlan.c:3250:39:    expected restricted __le32 [usertype] oid
drivers/net/wireless/rndis_wlan.c:3250:39:    got int
drivers/net/wireless/rndis_wlan.c:3278:39: warning: incorrect type in argument 2 (different base types)
drivers/net/wireless/rndis_wlan.c:3278:39:    expected restricted __le32 [usertype] oid
drivers/net/wireless/rndis_wlan.c:3278:39:    got int
drivers/net/wireless/rndis_wlan.c:3286:41: warning: incorrect type in argument 2 (different base types)
drivers/net/wireless/rndis_wlan.c:3286:41:    expected restricted __le32 [usertype] oid
drivers/net/wireless/rndis_wlan.c:3286:41:    got int
drivers/net/wireless/rndis_wlan.c:3606:31: warning: incorrect type in argument 2 (different base types)
drivers/net/wireless/rndis_wlan.c:3606:31:    expected restricted __le32 [usertype] oid
drivers/net/wireless/rndis_wlan.c:3606:31:    got int

Patch fixing this attached.

Patch-set to clean-up ugliness caused by this patch at:  
http://koti.mbnet.fi/axh/kernel/rndis_wlan/
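
For readers unfamiliar with these warnings, the following userspace sketch
shows the idea behind them. It is an illustration under assumptions, not the
attached fix: toy_cpu_to_le32(), toy_le32, toy_query() and the constant value
are hypothetical stand-ins, and plain C cannot enforce the type distinction
that sparse checks. A function that stores an OID into a wire message expects
a value that is already little-endian, so a bare integer constant has to be
converted at the call site.

```c
#include <stdint.h>
#include <string.h>

#define TOY_OID 0x00010202u	/* illustrative OID value, an assumption */

/* Holds little-endian bytes; only a naming convention in plain C. */
typedef uint32_t toy_le32;

/* Stand-in for cpu_to_le32(): lay the value out least significant byte first. */
static toy_le32 toy_cpu_to_le32(uint32_t v)
{
	uint8_t b[4] = {
		(uint8_t)v, (uint8_t)(v >> 8),
		(uint8_t)(v >> 16), (uint8_t)(v >> 24)
	};
	toy_le32 out;

	memcpy(&out, b, sizeof(out));
	return out;
}

struct toy_query_msg {
	toy_le32 oid;
};

/* Takes an already-converted OID, like the oid argument in the warnings. */
static void toy_query(struct toy_query_msg *msg, toy_le32 oid)
{
	msg->oid = oid;
}

/* Returns 1 if the bytes that would go on the wire are little-endian. */
static int toy_wire_bytes_ok(void)
{
	struct toy_query_msg msg;
	uint8_t wire[4];

	toy_query(&msg, toy_cpu_to_le32(TOY_OID));	/* convert at the call site */
	memcpy(wire, &msg.oid, sizeof(wire));
	return wire[0] == 0x02 && wire[1] == 0x02 &&
	       wire[2] == 0x01 && wire[3] == 0x00;
}
```

On a little-endian host the conversion is a no-op, which is why passing the
bare constant appears to work there and only the type checker complains.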

-Jussi

> Signed-off-by: Linus Walleij <linus.walleij-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>
> ---
>  drivers/net/usb/rndis_host.c      |   52 +++++++-------
>  drivers/net/wireless/rndis_wlan.c |  138 +++++++++++++++++++------------------
>  include/linux/usb/rndis_host.h    |   84 +++++++++++-----------
>  3 files changed, 139 insertions(+), 135 deletions(-)
>
> diff --git a/drivers/net/usb/rndis_host.c b/drivers/net/usb/rndis_host.c
> index c8f1b5b..05cad0b 100644
> --- a/drivers/net/usb/rndis_host.c
> +++ b/drivers/net/usb/rndis_host.c
> @@ -78,10 +78,10 @@ static void rndis_msg_indicate(struct usbnet *dev, struct rndis_indicate *msg,
>  		dev->driver_info->indication(dev, msg, buflen);
>  	} else {
>  		switch (msg->status) {
> -		case RNDIS_STATUS_MEDIA_CONNECT:
> +		case cpu_to_le32(RNDIS_STATUS_MEDIA_CONNECT):
>  			dev_info(udev, "rndis media connect\n");
>  			break;
> -		case RNDIS_STATUS_MEDIA_DISCONNECT:
> +		case cpu_to_le32(RNDIS_STATUS_MEDIA_DISCONNECT):
>  			dev_info(udev, "rndis media disconnect\n");
>  			break;
>  		default:
> @@ -117,8 +117,8 @@ int rndis_command(struct usbnet *dev, struct rndis_msg_hdr *buf, int buflen)
>  	 */
>
>  	/* Issue the request; xid is unique, don't bother byteswapping it */
> -	if (likely(buf->msg_type != RNDIS_MSG_HALT &&
> -		   buf->msg_type != RNDIS_MSG_RESET)) {
> +	if (likely(buf->msg_type != cpu_to_le32(RNDIS_MSG_HALT) &&
> +		   buf->msg_type != cpu_to_le32(RNDIS_MSG_RESET))) {
>  		xid = dev->xid++;
>  		if (!xid)
>  			xid = dev->xid++;
> @@ -164,9 +164,10 @@ int rndis_command(struct usbnet *dev, struct rndis_msg_hdr *buf, int buflen)
>  			request_id = (__force u32) buf->request_id;
>  			if (likely(buf->msg_type == rsp)) {
>  				if (likely(request_id == xid)) {
> -					if (unlikely(rsp == RNDIS_MSG_RESET_C))
> +					if (unlikely(rsp ==
> +					    cpu_to_le32(RNDIS_MSG_RESET_C)))
>  						return 0;
> -					if (likely(RNDIS_STATUS_SUCCESS
> +					if (likely(cpu_to_le32(RNDIS_STATUS_SUCCESS)
>  							== buf->status))
>  						return 0;
>  					dev_dbg(&info->control->dev,
> @@ -179,16 +180,15 @@ int rndis_command(struct usbnet *dev, struct rndis_msg_hdr *buf, int buflen)
>  					request_id, xid);
>  				/* then likely retry */
>  			} else switch (buf->msg_type) {
> -			case RNDIS_MSG_INDICATE:	/* fault/event */
> +			case cpu_to_le32(RNDIS_MSG_INDICATE): /* fault/event */
>  				rndis_msg_indicate(dev, (void *)buf, buflen);
> -
>  				break;
> -			case RNDIS_MSG_KEEPALIVE: {	/* ping */
> +			case cpu_to_le32(RNDIS_MSG_KEEPALIVE): { /* ping */
>  				struct rndis_keepalive_c *msg = (void *)buf;
>
> -				msg->msg_type = RNDIS_MSG_KEEPALIVE_C;
> +				msg->msg_type = cpu_to_le32(RNDIS_MSG_KEEPALIVE_C);
>  				msg->msg_len = cpu_to_le32(sizeof *msg);
> -				msg->status = RNDIS_STATUS_SUCCESS;
> +				msg->status = cpu_to_le32(RNDIS_STATUS_SUCCESS);
>  				retval = usb_control_msg(dev->udev,
>  					usb_sndctrlpipe(dev->udev, 0),
>  					USB_CDC_SEND_ENCAPSULATED_COMMAND,
> @@ -251,7 +251,7 @@ static int rndis_query(struct usbnet *dev, struct usb_interface *intf,
>  	u.buf = buf;
>
>  	memset(u.get, 0, sizeof *u.get + in_len);
> -	u.get->msg_type = RNDIS_MSG_QUERY;
> +	u.get->msg_type = cpu_to_le32(RNDIS_MSG_QUERY);
>  	u.get->msg_len = cpu_to_le32(sizeof *u.get + in_len);
>  	u.get->oid = oid;
>  	u.get->len = cpu_to_le32(in_len);
> @@ -324,7 +324,7 @@ generic_rndis_bind(struct usbnet *dev, struct usb_interface *intf, int flags)
>  	if (retval < 0)
>  		goto fail;
>
> -	u.init->msg_type = RNDIS_MSG_INIT;
> +	u.init->msg_type = cpu_to_le32(RNDIS_MSG_INIT);
>  	u.init->msg_len = cpu_to_le32(sizeof *u.init);
>  	u.init->major_version = cpu_to_le32(1);
>  	u.init->minor_version = cpu_to_le32(0);
> @@ -395,22 +395,23 @@ generic_rndis_bind(struct usbnet *dev, struct usb_interface *intf, int flags)
>  	/* Check physical medium */
>  	phym = NULL;
>  	reply_len = sizeof *phym;
> -	retval = rndis_query(dev, intf, u.buf, OID_GEN_PHYSICAL_MEDIUM,
> +	retval = rndis_query(dev, intf, u.buf,
> +			     cpu_to_le32(OID_GEN_PHYSICAL_MEDIUM),
>  			0, (void **) &phym, &reply_len);
>  	if (retval != 0 || !phym) {
>  		/* OID is optional so don't fail here. */
> -		phym_unspec = RNDIS_PHYSICAL_MEDIUM_UNSPECIFIED;
> +		phym_unspec = cpu_to_le32(RNDIS_PHYSICAL_MEDIUM_UNSPECIFIED);
>  		phym = &phym_unspec;
>  	}
>  	if ((flags & FLAG_RNDIS_PHYM_WIRELESS) &&
> -			*phym != RNDIS_PHYSICAL_MEDIUM_WIRELESS_LAN) {
> +	    *phym != cpu_to_le32(RNDIS_PHYSICAL_MEDIUM_WIRELESS_LAN)) {
>  		netif_dbg(dev, probe, dev->net,
>  			  "driver requires wireless physical medium, but device is not\n");
>  		retval = -ENODEV;
>  		goto halt_fail_and_release;
>  	}
>  	if ((flags & FLAG_RNDIS_PHYM_NOT_WIRELESS) &&
> -			*phym == RNDIS_PHYSICAL_MEDIUM_WIRELESS_LAN) {
> +	    *phym == cpu_to_le32(RNDIS_PHYSICAL_MEDIUM_WIRELESS_LAN)) {
>  		netif_dbg(dev, probe, dev->net,
>  			  "driver requires non-wireless physical medium, but device is wireless.\n");
>  		retval = -ENODEV;
> @@ -419,7 +420,8 @@ generic_rndis_bind(struct usbnet *dev, struct usb_interface *intf, int flags)
>
>  	/* Get designated host ethernet address */
>  	reply_len = ETH_ALEN;
> -	retval = rndis_query(dev, intf, u.buf, OID_802_3_PERMANENT_ADDRESS,
> +	retval = rndis_query(dev, intf, u.buf,
> +			     cpu_to_le32(OID_802_3_PERMANENT_ADDRESS),
>  			48, (void **) &bp, &reply_len);
>  	if (unlikely(retval< 0)) {
>  		dev_err(&intf->dev, "rndis get ethaddr, %d\n", retval);
> @@ -430,12 +432,12 @@ generic_rndis_bind(struct usbnet *dev, struct usb_interface *intf, int flags)
>
>  	/* set a nonzero filter to enable data transfers */
>  	memset(u.set, 0, sizeof *u.set);
> -	u.set->msg_type = RNDIS_MSG_SET;
> +	u.set->msg_type = cpu_to_le32(RNDIS_MSG_SET);
>  	u.set->msg_len = cpu_to_le32(4 + sizeof *u.set);
> -	u.set->oid = OID_GEN_CURRENT_PACKET_FILTER;
> +	u.set->oid = cpu_to_le32(OID_GEN_CURRENT_PACKET_FILTER);
>  	u.set->len = cpu_to_le32(4);
>  	u.set->offset = cpu_to_le32((sizeof *u.set) - 8);
> -	*(__le32 *)(u.buf + sizeof *u.set) = RNDIS_DEFAULT_FILTER;
> +	*(__le32 *)(u.buf + sizeof *u.set) = cpu_to_le32(RNDIS_DEFAULT_FILTER);
>
>  	retval = rndis_command(dev, u.header, CONTROL_BUFFER_SIZE);
>  	if (unlikely(retval < 0)) {
> @@ -450,7 +452,7 @@ generic_rndis_bind(struct usbnet *dev, struct usb_interface *intf, int flags)
>
>  halt_fail_and_release:
>  	memset(u.halt, 0, sizeof *u.halt);
> -	u.halt->msg_type = RNDIS_MSG_HALT;
> +	u.halt->msg_type = cpu_to_le32(RNDIS_MSG_HALT);
>  	u.halt->msg_len = cpu_to_le32(sizeof *u.halt);
>  	(void) rndis_command(dev, (void *)u.halt, CONTROL_BUFFER_SIZE);
>  fail_and_release:
> @@ -475,7 +477,7 @@ void rndis_unbind(struct usbnet *dev, struct usb_interface *intf)
>  	/* try to clear any rndis state/activity (no i/o from stack!) */
>  	halt = kzalloc(CONTROL_BUFFER_SIZE, GFP_KERNEL);
>  	if (halt) {
> -		halt->msg_type = RNDIS_MSG_HALT;
> +		halt->msg_type = cpu_to_le32(RNDIS_MSG_HALT);
>  		halt->msg_len = cpu_to_le32(sizeof *halt);
>  		(void) rndis_command(dev, (void *)halt, CONTROL_BUFFER_SIZE);
>  		kfree(halt);
> @@ -501,7 +503,7 @@ int rndis_rx_fixup(struct usbnet *dev, struct sk_buff *skb)
>  		data_len = le32_to_cpu(hdr->data_len);
>
>  		/* don't choke if we see oob, per-packet data, etc */
> -		if (unlikely(hdr->msg_type != RNDIS_MSG_PACKET ||
> +		if (unlikely(hdr->msg_type != cpu_to_le32(RNDIS_MSG_PACKET) ||
>  			     skb->len < msg_len ||
>  			     (data_offset + data_len + 8) > msg_len)) {
>  			dev->net->stats.rx_frame_errors++;
> @@ -569,7 +571,7 @@ rndis_tx_fixup(struct usbnet *dev, struct sk_buff *skb, gfp_t flags)
>  fill:
>  	hdr = (void *) __skb_push(skb, sizeof *hdr);
>  	memset(hdr, 0, sizeof *hdr);
> -	hdr->msg_type = RNDIS_MSG_PACKET;
> +	hdr->msg_type = cpu_to_le32(RNDIS_MSG_PACKET);
>  	hdr->msg_len = cpu_to_le32(skb->len);
>  	hdr->data_offset = cpu_to_le32(sizeof(*hdr) - 8);
>  	hdr->data_len = cpu_to_le32(len);
> diff --git a/drivers/net/wireless/rndis_wlan.c b/drivers/net/wireless/rndis_wlan.c
> index d66e298..a935012 100644
> --- a/drivers/net/wireless/rndis_wlan.c
> +++ b/drivers/net/wireless/rndis_wlan.c
> @@ -90,45 +90,45 @@ MODULE_PARM_DESC(workaround_interval,
>
>
>  /* various RNDIS OID defs */
> -#define OID_GEN_LINK_SPEED			cpu_to_le32(0x00010107)
> -#define OID_GEN_RNDIS_CONFIG_PARAMETER		cpu_to_le32(0x0001021b)
> -
> -#define OID_GEN_XMIT_OK				cpu_to_le32(0x00020101)
> -#define OID_GEN_RCV_OK				cpu_to_le32(0x00020102)
> -#define OID_GEN_XMIT_ERROR			cpu_to_le32(0x00020103)
> -#define OID_GEN_RCV_ERROR			cpu_to_le32(0x00020104)
> -#define OID_GEN_RCV_NO_BUFFER			cpu_to_le32(0x00020105)
> -
> -#define OID_802_3_CURRENT_ADDRESS		cpu_to_le32(0x01010102)
> -#define OID_802_3_MULTICAST_LIST		cpu_to_le32(0x01010103)
> -#define OID_802_3_MAXIMUM_LIST_SIZE		cpu_to_le32(0x01010104)
> -
> -#define OID_802_11_BSSID			cpu_to_le32(0x0d010101)
> -#define OID_802_11_SSID				cpu_to_le32(0x0d010102)
> -#define OID_802_11_INFRASTRUCTURE_MODE		cpu_to_le32(0x0d010108)
> -#define OID_802_11_ADD_WEP			cpu_to_le32(0x0d010113)
> -#define OID_802_11_REMOVE_WEP			cpu_to_le32(0x0d010114)
> -#define OID_802_11_DISASSOCIATE			cpu_to_le32(0x0d010115)
> -#define OID_802_11_AUTHENTICATION_MODE		cpu_to_le32(0x0d010118)
> -#define OID_802_11_PRIVACY_FILTER		cpu_to_le32(0x0d010119)
> -#define OID_802_11_BSSID_LIST_SCAN		cpu_to_le32(0x0d01011a)
> -#define OID_802_11_ENCRYPTION_STATUS		cpu_to_le32(0x0d01011b)
> -#define OID_802_11_ADD_KEY			cpu_to_le32(0x0d01011d)
> -#define OID_802_11_REMOVE_KEY			cpu_to_le32(0x0d01011e)
> -#define OID_802_11_ASSOCIATION_INFORMATION	cpu_to_le32(0x0d01011f)
> -#define OID_802_11_CAPABILITY			cpu_to_le32(0x0d010122)
> -#define OID_802_11_PMKID			cpu_to_le32(0x0d010123)
> -#define OID_802_11_NETWORK_TYPES_SUPPORTED	cpu_to_le32(0x0d010203)
> -#define OID_802_11_NETWORK_TYPE_IN_USE		cpu_to_le32(0x0d010204)
> -#define OID_802_11_TX_POWER_LEVEL		cpu_to_le32(0x0d010205)
> -#define OID_802_11_RSSI				cpu_to_le32(0x0d010206)
> -#define OID_802_11_RSSI_TRIGGER			cpu_to_le32(0x0d010207)
> -#define OID_802_11_FRAGMENTATION_THRESHOLD	cpu_to_le32(0x0d010209)
> -#define OID_802_11_RTS_THRESHOLD		cpu_to_le32(0x0d01020a)
> -#define OID_802_11_SUPPORTED_RATES		cpu_to_le32(0x0d01020e)
> -#define OID_802_11_CONFIGURATION		cpu_to_le32(0x0d010211)
> -#define OID_802_11_POWER_MODE			cpu_to_le32(0x0d010216)
> -#define OID_802_11_BSSID_LIST			cpu_to_le32(0x0d010217)
> +#define OID_GEN_LINK_SPEED			0x00010107
> +#define OID_GEN_RNDIS_CONFIG_PARAMETER		0x0001021b
> +
> +#define OID_GEN_XMIT_OK				0x00020101
> +#define OID_GEN_RCV_OK				0x00020102
> +#define OID_GEN_XMIT_ERROR			0x00020103
> +#define OID_GEN_RCV_ERROR			0x00020104
> +#define OID_GEN_RCV_NO_BUFFER			0x00020105
> +
> +#define OID_802_3_CURRENT_ADDRESS		0x01010102
> +#define OID_802_3_MULTICAST_LIST		0x01010103
> +#define OID_802_3_MAXIMUM_LIST_SIZE		0x01010104
> +
> +#define OID_802_11_BSSID			0x0d010101
> +#define OID_802_11_SSID				0x0d010102
> +#define OID_802_11_INFRASTRUCTURE_MODE		0x0d010108
> +#define OID_802_11_ADD_WEP			0x0d010113
> +#define OID_802_11_REMOVE_WEP			0x0d010114
> +#define OID_802_11_DISASSOCIATE			0x0d010115
> +#define OID_802_11_AUTHENTICATION_MODE		0x0d010118
> +#define OID_802_11_PRIVACY_FILTER		0x0d010119
> +#define OID_802_11_BSSID_LIST_SCAN		0x0d01011a
> +#define OID_802_11_ENCRYPTION_STATUS		0x0d01011b
> +#define OID_802_11_ADD_KEY			0x0d01011d
> +#define OID_802_11_REMOVE_KEY			0x0d01011e
> +#define OID_802_11_ASSOCIATION_INFORMATION	0x0d01011f
> +#define OID_802_11_CAPABILITY			0x0d010122
> +#define OID_802_11_PMKID			0x0d010123
> +#define OID_802_11_NETWORK_TYPES_SUPPORTED	0x0d010203
> +#define OID_802_11_NETWORK_TYPE_IN_USE		0x0d010204
> +#define OID_802_11_TX_POWER_LEVEL		0x0d010205
> +#define OID_802_11_RSSI				0x0d010206
> +#define OID_802_11_RSSI_TRIGGER			0x0d010207
> +#define OID_802_11_FRAGMENTATION_THRESHOLD	0x0d010209
> +#define OID_802_11_RTS_THRESHOLD		0x0d01020a
> +#define OID_802_11_SUPPORTED_RATES		0x0d01020e
> +#define OID_802_11_CONFIGURATION		0x0d010211
> +#define OID_802_11_POWER_MODE			0x0d010216
> +#define OID_802_11_BSSID_LIST			0x0d010217
>
>
>  /* Typical noise/maximum signal level values taken from ndiswrapper iw_ndis.h */
> @@ -151,8 +151,8 @@ MODULE_PARM_DESC(workaround_interval,
>
>
>  /* codes for "status" field of completion messages */
> -#define RNDIS_STATUS_ADAPTER_NOT_READY		cpu_to_le32(0xc0010011)
> -#define RNDIS_STATUS_ADAPTER_NOT_OPEN		cpu_to_le32(0xc0010012)
> +#define RNDIS_STATUS_ADAPTER_NOT_READY		0xc0010011
> +#define RNDIS_STATUS_ADAPTER_NOT_OPEN		0xc0010012
>
>
>  /* Known device types */
> @@ -673,7 +673,7 @@ static int rndis_akm_suite_to_key_mgmt(u32 akm_suite)
>  static const char *oid_to_string(__le32 oid)
>  {
>  	switch (oid) {
> -#define OID_STR(oid) case oid: return(#oid)
> +#define OID_STR(oid) case cpu_to_le32(oid): return(#oid)
>  		/* from rndis_host.h */
>  		OID_STR(OID_802_3_PERMANENT_ADDRESS);
>  		OID_STR(OID_GEN_MAXIMUM_FRAME_SIZE);
> @@ -737,18 +737,18 @@ static int rndis_error_status(__le32 rndis_status)
>  {
>  	int ret = -EINVAL;
>  	switch (rndis_status) {
> -	case RNDIS_STATUS_SUCCESS:
> +	case cpu_to_le32(RNDIS_STATUS_SUCCESS):
>  		ret = 0;
>  		break;
> -	case RNDIS_STATUS_FAILURE:
> -	case RNDIS_STATUS_INVALID_DATA:
> +	case cpu_to_le32(RNDIS_STATUS_FAILURE):
> +	case cpu_to_le32(RNDIS_STATUS_INVALID_DATA):
>  		ret = -EINVAL;
>  		break;
> -	case RNDIS_STATUS_NOT_SUPPORTED:
> +	case cpu_to_le32(RNDIS_STATUS_NOT_SUPPORTED):
>  		ret = -EOPNOTSUPP;
>  		break;
> -	case RNDIS_STATUS_ADAPTER_NOT_READY:
> -	case RNDIS_STATUS_ADAPTER_NOT_OPEN:
> +	case cpu_to_le32(RNDIS_STATUS_ADAPTER_NOT_READY):
> +	case cpu_to_le32(RNDIS_STATUS_ADAPTER_NOT_OPEN):
>  		ret = -EBUSY;
>  		break;
>  	}
> @@ -782,7 +782,7 @@ static int rndis_query_oid(struct usbnet *dev, __le32 oid, void *data, int *len)
>  	mutex_lock(&priv->command_lock);
>
>  	memset(u.get, 0, sizeof *u.get);
> -	u.get->msg_type = RNDIS_MSG_QUERY;
> +	u.get->msg_type = cpu_to_le32(RNDIS_MSG_QUERY);
>  	u.get->msg_len = cpu_to_le32(sizeof *u.get);
>  	u.get->oid = oid;
>
> @@ -866,7 +866,7 @@ static int rndis_set_oid(struct usbnet *dev, __le32 oid, const void *data,
>  	mutex_lock(&priv->command_lock);
>
>  	memset(u.set, 0, sizeof *u.set);
> -	u.set->msg_type = RNDIS_MSG_SET;
> +	u.set->msg_type = cpu_to_le32(RNDIS_MSG_SET);
>  	u.set->msg_len = cpu_to_le32(sizeof(*u.set) + len);
>  	u.set->oid = oid;
>  	u.set->len = cpu_to_le32(len);
> @@ -908,7 +908,7 @@ static int rndis_reset(struct usbnet *usbdev)
>
>  	reset = (void *)priv->command_buffer;
>  	memset(reset, 0, sizeof(*reset));
> -	reset->msg_type = RNDIS_MSG_RESET;
> +	reset->msg_type = cpu_to_le32(RNDIS_MSG_RESET);
>  	reset->msg_len = cpu_to_le32(sizeof(*reset));
>  	priv->current_command_oid = 0;
>  	ret = rndis_command(usbdev, (void *)reset, CONTROL_BUFFER_SIZE);
> @@ -994,7 +994,7 @@ static int rndis_set_config_parameter(struct usbnet *dev, char *param,
>  	}
>  #endif
>
> -	ret = rndis_set_oid(dev, OID_GEN_RNDIS_CONFIG_PARAMETER,
> +	ret = rndis_set_oid(dev, cpu_to_le32(OID_GEN_RNDIS_CONFIG_PARAMETER),
>  							infobuf, info_len);
>  	if (ret != 0)
>  		netdev_dbg(dev->net, "setting rndis config parameter failed, %d\n",
> @@ -1626,14 +1626,14 @@ static void set_multicast_list(struct usbnet *usbdev)
>  	char *mc_addrs = NULL;
>  	int mc_count;
>
> -	basefilter = filter = RNDIS_PACKET_TYPE_DIRECTED |
> -			      RNDIS_PACKET_TYPE_BROADCAST;
> +	basefilter = filter = cpu_to_le32(RNDIS_PACKET_TYPE_DIRECTED |
> +					  RNDIS_PACKET_TYPE_BROADCAST);
>
>  	if (usbdev->net->flags & IFF_PROMISC) {
> -		filter |= RNDIS_PACKET_TYPE_PROMISCUOUS |
> -			RNDIS_PACKET_TYPE_ALL_LOCAL;
> +		filter |= cpu_to_le32(RNDIS_PACKET_TYPE_PROMISCUOUS |
> +				      RNDIS_PACKET_TYPE_ALL_LOCAL);
>  	} else if (usbdev->net->flags & IFF_ALLMULTI) {
> -		filter |= RNDIS_PACKET_TYPE_ALL_MULTICAST;
> +		filter |= cpu_to_le32(RNDIS_PACKET_TYPE_ALL_MULTICAST);
>  	}
>
>  	if (filter != basefilter)
> @@ -1646,7 +1646,7 @@ static void set_multicast_list(struct usbnet *usbdev)
>  	netif_addr_lock_bh(usbdev->net);
>  	mc_count = netdev_mc_count(usbdev->net);
>  	if (mc_count > priv->multicast_size) {
> -		filter |= RNDIS_PACKET_TYPE_ALL_MULTICAST;
> +		filter |= cpu_to_le32(RNDIS_PACKET_TYPE_ALL_MULTICAST);
>  	} else if (mc_count) {
>  		int i = 0;
>
> @@ -1673,9 +1673,9 @@ static void set_multicast_list(struct usbnet *usbdev)
>  				    mc_count * ETH_ALEN);
>  		kfree(mc_addrs);
>  		if (ret == 0)
> -			filter |= RNDIS_PACKET_TYPE_MULTICAST;
> +			filter |= cpu_to_le32(RNDIS_PACKET_TYPE_MULTICAST);
>  		else
> -			filter |= RNDIS_PACKET_TYPE_ALL_MULTICAST;
> +			filter |= cpu_to_le32(RNDIS_PACKET_TYPE_ALL_MULTICAST);
>
>  		netdev_dbg(usbdev->net, "OID_802_3_MULTICAST_LIST(%d, max: %d) -> %d\n",
>  			   mc_count, priv->multicast_size, ret);
> @@ -3096,7 +3096,7 @@ static void rndis_wlan_indication(struct usbnet *usbdev, void *ind, int buflen)
>  	struct rndis_indicate *msg = ind;
>
>  	switch (msg->status) {
> -	case RNDIS_STATUS_MEDIA_CONNECT:
> +	case cpu_to_le32(RNDIS_STATUS_MEDIA_CONNECT):
>  		if (priv->current_command_oid == OID_802_11_ADD_KEY) {
>  			/* OID_802_11_ADD_KEY causes sometimes extra
>  			 * "media connect" indications which confuses driver
> @@ -3116,7 +3116,7 @@ static void rndis_wlan_indication(struct usbnet *usbdev, void *ind, int buflen)
>  		queue_work(priv->workqueue, &priv->work);
>  		break;
>
> -	case RNDIS_STATUS_MEDIA_DISCONNECT:
> +	case cpu_to_le32(RNDIS_STATUS_MEDIA_DISCONNECT):
>  		netdev_info(usbdev->net, "media disconnect\n");
>
>  		/* queue work to avoid recursive calls into rndis_command */
> @@ -3124,7 +3124,7 @@ static void rndis_wlan_indication(struct usbnet *usbdev, void *ind, int buflen)
>  		queue_work(priv->workqueue, &priv->work);
>  		break;
>
> -	case RNDIS_STATUS_MEDIA_SPECIFIC_INDICATION:
> +	case cpu_to_le32(RNDIS_STATUS_MEDIA_SPECIFIC_INDICATION):
>  		rndis_wlan_media_specific_indication(usbdev, msg, buflen);
>  		break;
>
> @@ -3465,13 +3465,15 @@ static int rndis_wlan_bind(struct usbnet *usbdev, struct usb_interface *intf)
>  	 */
>  	usbdev->net->netdev_ops = &rndis_wlan_netdev_ops;
>
> -	tmp = RNDIS_PACKET_TYPE_DIRECTED | RNDIS_PACKET_TYPE_BROADCAST;
> -	retval = rndis_set_oid(usbdev, OID_GEN_CURRENT_PACKET_FILTER, &tmp,
> -								sizeof(tmp));
> +	tmp = cpu_to_le32(RNDIS_PACKET_TYPE_DIRECTED | RNDIS_PACKET_TYPE_BROADCAST);
> +	retval = rndis_set_oid(usbdev,
> +			       cpu_to_le32(OID_GEN_CURRENT_PACKET_FILTER),
> +			       &tmp, sizeof(tmp));
>
>  	len = sizeof(tmp);
> -	retval = rndis_query_oid(usbdev, OID_802_3_MAXIMUM_LIST_SIZE, &tmp,
> -								&len);
> +	retval = rndis_query_oid(usbdev,
> +				 cpu_to_le32(OID_802_3_MAXIMUM_LIST_SIZE),
> +				 &tmp, &len);
>  	priv->multicast_size = le32_to_cpu(tmp);
>  	if (retval < 0 || priv->multicast_size < 0)
>  		priv->multicast_size = 0;
> diff --git a/include/linux/usb/rndis_host.h b/include/linux/usb/rndis_host.h
> index 88fceb7..9a005b6 100644
> --- a/include/linux/usb/rndis_host.h
> +++ b/include/linux/usb/rndis_host.h
> @@ -49,46 +49,46 @@ struct rndis_msg_hdr {
>   */
>  #define	RNDIS_CONTROL_TIMEOUT_MS	(5 * 1000)
>
> -#define RNDIS_MSG_COMPLETION	cpu_to_le32(0x80000000)
> +#define RNDIS_MSG_COMPLETION	0x80000000
>
>  /* codes for "msg_type" field of rndis messages;
>   * only the data channel uses packet messages (maybe batched);
>   * everything else goes on the control channel.
>   */
> -#define RNDIS_MSG_PACKET	cpu_to_le32(0x00000001)	/* 1-N packets */
> -#define RNDIS_MSG_INIT		cpu_to_le32(0x00000002)
> +#define RNDIS_MSG_PACKET	0x00000001	/* 1-N packets */
> +#define RNDIS_MSG_INIT		0x00000002
>  #define RNDIS_MSG_INIT_C	(RNDIS_MSG_INIT|RNDIS_MSG_COMPLETION)
> -#define RNDIS_MSG_HALT		cpu_to_le32(0x00000003)
> -#define RNDIS_MSG_QUERY		cpu_to_le32(0x00000004)
> +#define RNDIS_MSG_HALT		0x00000003
> +#define RNDIS_MSG_QUERY		0x00000004
>  #define RNDIS_MSG_QUERY_C	(RNDIS_MSG_QUERY|RNDIS_MSG_COMPLETION)
> -#define RNDIS_MSG_SET		cpu_to_le32(0x00000005)
> +#define RNDIS_MSG_SET		0x00000005
>  #define RNDIS_MSG_SET_C		(RNDIS_MSG_SET|RNDIS_MSG_COMPLETION)
> -#define RNDIS_MSG_RESET		cpu_to_le32(0x00000006)
> +#define RNDIS_MSG_RESET		0x00000006
>  #define RNDIS_MSG_RESET_C	(RNDIS_MSG_RESET|RNDIS_MSG_COMPLETION)
> -#define RNDIS_MSG_INDICATE	cpu_to_le32(0x00000007)
> -#define RNDIS_MSG_KEEPALIVE	cpu_to_le32(0x00000008)
> +#define RNDIS_MSG_INDICATE	0x00000007
> +#define RNDIS_MSG_KEEPALIVE	0x00000008
>  #define RNDIS_MSG_KEEPALIVE_C	(RNDIS_MSG_KEEPALIVE|RNDIS_MSG_COMPLETION)
>
>  /* codes for "status" field of completion messages */
> -#define	RNDIS_STATUS_SUCCESS			cpu_to_le32(0x00000000)
> -#define	RNDIS_STATUS_FAILURE			cpu_to_le32(0xc0000001)
> -#define	RNDIS_STATUS_INVALID_DATA		cpu_to_le32(0xc0010015)
> -#define	RNDIS_STATUS_NOT_SUPPORTED		cpu_to_le32(0xc00000bb)
> -#define	RNDIS_STATUS_MEDIA_CONNECT		cpu_to_le32(0x4001000b)
> -#define	RNDIS_STATUS_MEDIA_DISCONNECT		cpu_to_le32(0x4001000c)
> -#define	RNDIS_STATUS_MEDIA_SPECIFIC_INDICATION	cpu_to_le32(0x40010012)
> +#define	RNDIS_STATUS_SUCCESS			0x00000000
> +#define	RNDIS_STATUS_FAILURE			0xc0000001
> +#define	RNDIS_STATUS_INVALID_DATA		0xc0010015
> +#define	RNDIS_STATUS_NOT_SUPPORTED		0xc00000bb
> +#define	RNDIS_STATUS_MEDIA_CONNECT		0x4001000b
> +#define	RNDIS_STATUS_MEDIA_DISCONNECT		0x4001000c
> +#define	RNDIS_STATUS_MEDIA_SPECIFIC_INDICATION	0x40010012
>
>  /* codes for OID_GEN_PHYSICAL_MEDIUM */
> -#define	RNDIS_PHYSICAL_MEDIUM_UNSPECIFIED	cpu_to_le32(0x00000000)
> -#define	RNDIS_PHYSICAL_MEDIUM_WIRELESS_LAN	cpu_to_le32(0x00000001)
> -#define	RNDIS_PHYSICAL_MEDIUM_CABLE_MODEM	cpu_to_le32(0x00000002)
> -#define	RNDIS_PHYSICAL_MEDIUM_PHONE_LINE	cpu_to_le32(0x00000003)
> -#define	RNDIS_PHYSICAL_MEDIUM_POWER_LINE	cpu_to_le32(0x00000004)
> -#define	RNDIS_PHYSICAL_MEDIUM_DSL		cpu_to_le32(0x00000005)
> -#define	RNDIS_PHYSICAL_MEDIUM_FIBRE_CHANNEL	cpu_to_le32(0x00000006)
> -#define	RNDIS_PHYSICAL_MEDIUM_1394		cpu_to_le32(0x00000007)
> -#define	RNDIS_PHYSICAL_MEDIUM_WIRELESS_WAN	cpu_to_le32(0x00000008)
> -#define	RNDIS_PHYSICAL_MEDIUM_MAX		cpu_to_le32(0x00000009)
> +#define	RNDIS_PHYSICAL_MEDIUM_UNSPECIFIED	0x00000000
> +#define	RNDIS_PHYSICAL_MEDIUM_WIRELESS_LAN	0x00000001
> +#define	RNDIS_PHYSICAL_MEDIUM_CABLE_MODEM	0x00000002
> +#define	RNDIS_PHYSICAL_MEDIUM_PHONE_LINE	0x00000003
> +#define	RNDIS_PHYSICAL_MEDIUM_POWER_LINE	0x00000004
> +#define	RNDIS_PHYSICAL_MEDIUM_DSL		0x00000005
> +#define	RNDIS_PHYSICAL_MEDIUM_FIBRE_CHANNEL	0x00000006
> +#define	RNDIS_PHYSICAL_MEDIUM_1394		0x00000007
> +#define	RNDIS_PHYSICAL_MEDIUM_WIRELESS_WAN	0x00000008
> +#define	RNDIS_PHYSICAL_MEDIUM_MAX		0x00000009
>
>  struct rndis_data_hdr {
>  	__le32	msg_type;		/* RNDIS_MSG_PACKET */
> @@ -226,24 +226,24 @@ struct rndis_keepalive_c {	/* IN (optionally OUT) */
>   * there are gobs more that may optionally be supported.  We'll avoid as much
>   * of that mess as possible.
>   */
> -#define OID_802_3_PERMANENT_ADDRESS	cpu_to_le32(0x01010101)
> -#define OID_GEN_MAXIMUM_FRAME_SIZE	cpu_to_le32(0x00010106)
> -#define OID_GEN_CURRENT_PACKET_FILTER	cpu_to_le32(0x0001010e)
> -#define OID_GEN_PHYSICAL_MEDIUM		cpu_to_le32(0x00010202)
> +#define OID_802_3_PERMANENT_ADDRESS	0x01010101
> +#define OID_GEN_MAXIMUM_FRAME_SIZE	0x00010106
> +#define OID_GEN_CURRENT_PACKET_FILTER	0x0001010e
> +#define OID_GEN_PHYSICAL_MEDIUM		0x00010202
>
>  /* packet filter bits used by OID_GEN_CURRENT_PACKET_FILTER */
> -#define RNDIS_PACKET_TYPE_DIRECTED		cpu_to_le32(0x00000001)
> -#define RNDIS_PACKET_TYPE_MULTICAST		cpu_to_le32(0x00000002)
> -#define RNDIS_PACKET_TYPE_ALL_MULTICAST		cpu_to_le32(0x00000004)
> -#define RNDIS_PACKET_TYPE_BROADCAST		cpu_to_le32(0x00000008)
> -#define RNDIS_PACKET_TYPE_SOURCE_ROUTING	cpu_to_le32(0x00000010)
> -#define RNDIS_PACKET_TYPE_PROMISCUOUS		cpu_to_le32(0x00000020)
> -#define RNDIS_PACKET_TYPE_SMT			cpu_to_le32(0x00000040)
> -#define RNDIS_PACKET_TYPE_ALL_LOCAL		cpu_to_le32(0x00000080)
> -#define RNDIS_PACKET_TYPE_GROUP			cpu_to_le32(0x00001000)
> -#define RNDIS_PACKET_TYPE_ALL_FUNCTIONAL	cpu_to_le32(0x00002000)
> -#define RNDIS_PACKET_TYPE_FUNCTIONAL		cpu_to_le32(0x00004000)
> -#define RNDIS_PACKET_TYPE_MAC_FRAME		cpu_to_le32(0x00008000)
> +#define RNDIS_PACKET_TYPE_DIRECTED		0x00000001
> +#define RNDIS_PACKET_TYPE_MULTICAST		0x00000002
> +#define RNDIS_PACKET_TYPE_ALL_MULTICAST		0x00000004
> +#define RNDIS_PACKET_TYPE_BROADCAST		0x00000008
> +#define RNDIS_PACKET_TYPE_SOURCE_ROUTING	0x00000010
> +#define RNDIS_PACKET_TYPE_PROMISCUOUS		0x00000020
> +#define RNDIS_PACKET_TYPE_SMT			0x00000040
> +#define RNDIS_PACKET_TYPE_ALL_LOCAL		0x00000080
> +#define RNDIS_PACKET_TYPE_GROUP			0x00001000
> +#define RNDIS_PACKET_TYPE_ALL_FUNCTIONAL	0x00002000
> +#define RNDIS_PACKET_TYPE_FUNCTIONAL		0x00004000
> +#define RNDIS_PACKET_TYPE_MAC_FRAME		0x00008000
>
>  /* default filter used with RNDIS devices */
>  #define RNDIS_DEFAULT_FILTER ( \
> --
> 1.7.7.6
>
>
>



[-- Attachment #2: 02-rndis_host-rndis_wlan-missing-cpu_to_le32s.patch --]
[-- Type: text/x-diff, Size: 15243 bytes --]

rndis_wlan & rndis_host: missing cpu_to_le32()s

From: Jussi Kivilinna <jussi.kivilinna-E01nCVcF24I@public.gmane.org>

Signed-off-by: Jussi Kivilinna <jussi.kivilinna-E01nCVcF24I@public.gmane.org>
---
 drivers/net/usb/rndis_host.c      |    2 +-
 drivers/net/wireless/rndis_wlan.c |   74 +++++++++++++++++++------------------
 2 files changed, 38 insertions(+), 38 deletions(-)

diff --git a/drivers/net/usb/rndis_host.c b/drivers/net/usb/rndis_host.c
index 05cad0b..3b7ddfd 100644
--- a/drivers/net/usb/rndis_host.c
+++ b/drivers/net/usb/rndis_host.c
@@ -149,7 +149,7 @@ int rndis_command(struct usbnet *dev, struct rndis_msg_hdr *buf, int buflen)
 	}
 
 	/* Poll the control channel; the request probably completed immediately */
-	rsp = buf->msg_type | RNDIS_MSG_COMPLETION;
+	rsp = buf->msg_type | cpu_to_le32(RNDIS_MSG_COMPLETION);
 	for (count = 0; count < 10; count++) {
 		memset(buf, 0, CONTROL_BUFFER_SIZE);
 		retval = usb_control_msg(dev->udev,
diff --git a/drivers/net/wireless/rndis_wlan.c b/drivers/net/wireless/rndis_wlan.c
index a935012..a686b5d 100644
--- a/drivers/net/wireless/rndis_wlan.c
+++ b/drivers/net/wireless/rndis_wlan.c
@@ -1033,7 +1033,7 @@ static int rndis_start_bssid_list_scan(struct usbnet *usbdev)
 
 	/* Note: OID_802_11_BSSID_LIST_SCAN clears internal BSS list. */
 	tmp = cpu_to_le32(1);
-	return rndis_set_oid(usbdev, OID_802_11_BSSID_LIST_SCAN, &tmp,
+	return rndis_set_oid(usbdev, cpu_to_le32(OID_802_11_BSSID_LIST_SCAN), &tmp,
 							sizeof(tmp));
 }
 
@@ -1042,7 +1042,7 @@ static int set_essid(struct usbnet *usbdev, struct ndis_80211_ssid *ssid)
 	struct rndis_wlan_private *priv = get_rndis_wlan_priv(usbdev);
 	int ret;
 
-	ret = rndis_set_oid(usbdev, OID_802_11_SSID, ssid, sizeof(*ssid));
+	ret = rndis_set_oid(usbdev, cpu_to_le32(OID_802_11_SSID), ssid, sizeof(*ssid));
 	if (ret < 0) {
 		netdev_warn(usbdev->net, "setting SSID failed (%08X)\n", ret);
 		return ret;
@@ -1059,7 +1059,7 @@ static int set_bssid(struct usbnet *usbdev, const u8 *bssid)
 {
 	int ret;
 
-	ret = rndis_set_oid(usbdev, OID_802_11_BSSID, bssid, ETH_ALEN);
+	ret = rndis_set_oid(usbdev, cpu_to_le32(OID_802_11_BSSID), bssid, ETH_ALEN);
 	if (ret < 0) {
 		netdev_warn(usbdev->net, "setting BSSID[%pM] failed (%08X)\n",
 			    bssid, ret);
@@ -1083,7 +1083,7 @@ static int get_bssid(struct usbnet *usbdev, u8 bssid[ETH_ALEN])
 	int ret, len;
 
 	len = ETH_ALEN;
-	ret = rndis_query_oid(usbdev, OID_802_11_BSSID, bssid, &len);
+	ret = rndis_query_oid(usbdev, cpu_to_le32(OID_802_11_BSSID), bssid, &len);
 
 	if (ret != 0)
 		memset(bssid, 0, ETH_ALEN);
@@ -1094,7 +1094,7 @@ static int get_bssid(struct usbnet *usbdev, u8 bssid[ETH_ALEN])
 static int get_association_info(struct usbnet *usbdev,
 			struct ndis_80211_assoc_info *info, int len)
 {
-	return rndis_query_oid(usbdev, OID_802_11_ASSOCIATION_INFORMATION,
+	return rndis_query_oid(usbdev, cpu_to_le32(OID_802_11_ASSOCIATION_INFORMATION),
 				info, &len);
 }
 
@@ -1119,7 +1119,7 @@ static int disassociate(struct usbnet *usbdev, bool reset_ssid)
 	int i, ret = 0;
 
 	if (priv->radio_on) {
-		ret = rndis_set_oid(usbdev, OID_802_11_DISASSOCIATE, NULL, 0);
+		ret = rndis_set_oid(usbdev, cpu_to_le32(OID_802_11_DISASSOCIATE), NULL, 0);
 		if (ret == 0) {
 			priv->radio_on = false;
 			netdev_dbg(usbdev->net, "%s(): radio_on = false\n",
@@ -1181,7 +1181,7 @@ static int set_auth_mode(struct usbnet *usbdev, u32 wpa_version,
 		return -ENOTSUPP;
 
 	tmp = cpu_to_le32(auth_mode);
-	ret = rndis_set_oid(usbdev, OID_802_11_AUTHENTICATION_MODE, &tmp,
+	ret = rndis_set_oid(usbdev, cpu_to_le32(OID_802_11_AUTHENTICATION_MODE), &tmp,
 								sizeof(tmp));
 	if (ret != 0) {
 		netdev_warn(usbdev->net, "setting auth mode failed (%08X)\n",
@@ -1208,7 +1208,7 @@ static int set_priv_filter(struct usbnet *usbdev)
 	else
 		tmp = cpu_to_le32(NDIS_80211_PRIV_ACCEPT_ALL);
 
-	return rndis_set_oid(usbdev, OID_802_11_PRIVACY_FILTER, &tmp,
+	return rndis_set_oid(usbdev, cpu_to_le32(OID_802_11_PRIVACY_FILTER), &tmp,
 								sizeof(tmp));
 }
 
@@ -1234,7 +1234,7 @@ static int set_encr_mode(struct usbnet *usbdev, int pairwise, int groupwise)
 		encr_mode = NDIS_80211_ENCR_DISABLED;
 
 	tmp = cpu_to_le32(encr_mode);
-	ret = rndis_set_oid(usbdev, OID_802_11_ENCRYPTION_STATUS, &tmp,
+	ret = rndis_set_oid(usbdev, cpu_to_le32(OID_802_11_ENCRYPTION_STATUS), &tmp,
 								sizeof(tmp));
 	if (ret != 0) {
 		netdev_warn(usbdev->net, "setting encr mode failed (%08X)\n",
@@ -1255,7 +1255,7 @@ static int set_infra_mode(struct usbnet *usbdev, int mode)
 		   __func__, priv->infra_mode);
 
 	tmp = cpu_to_le32(mode);
-	ret = rndis_set_oid(usbdev, OID_802_11_INFRASTRUCTURE_MODE, &tmp,
+	ret = rndis_set_oid(usbdev, cpu_to_le32(OID_802_11_INFRASTRUCTURE_MODE), &tmp,
 								sizeof(tmp));
 	if (ret != 0) {
 		netdev_warn(usbdev->net, "setting infra mode failed (%08X)\n",
@@ -1282,7 +1282,7 @@ static int set_rts_threshold(struct usbnet *usbdev, u32 rts_threshold)
 		rts_threshold = 2347;
 
 	tmp = cpu_to_le32(rts_threshold);
-	return rndis_set_oid(usbdev, OID_802_11_RTS_THRESHOLD, &tmp,
+	return rndis_set_oid(usbdev, cpu_to_le32(OID_802_11_RTS_THRESHOLD), &tmp,
 								sizeof(tmp));
 }
 
@@ -1296,7 +1296,7 @@ static int set_frag_threshold(struct usbnet *usbdev, u32 frag_threshold)
 		frag_threshold = 2346;
 
 	tmp = cpu_to_le32(frag_threshold);
-	return rndis_set_oid(usbdev, OID_802_11_FRAGMENTATION_THRESHOLD, &tmp,
+	return rndis_set_oid(usbdev, cpu_to_le32(OID_802_11_FRAGMENTATION_THRESHOLD), &tmp,
 								sizeof(tmp));
 }
 
@@ -1333,7 +1333,7 @@ static int set_channel(struct usbnet *usbdev, int channel)
 	dsconfig = ieee80211_dsss_chan_to_freq(channel) * 1000;
 
 	len = sizeof(config);
-	ret = rndis_query_oid(usbdev, OID_802_11_CONFIGURATION, &config, &len);
+	ret = rndis_query_oid(usbdev, cpu_to_le32(OID_802_11_CONFIGURATION), &config, &len);
 	if (ret < 0) {
 		netdev_dbg(usbdev->net, "%s(): querying configuration failed\n",
 			   __func__);
@@ -1341,7 +1341,7 @@ static int set_channel(struct usbnet *usbdev, int channel)
 	}
 
 	config.ds_config = cpu_to_le32(dsconfig);
-	ret = rndis_set_oid(usbdev, OID_802_11_CONFIGURATION, &config,
+	ret = rndis_set_oid(usbdev, cpu_to_le32(OID_802_11_CONFIGURATION), &config,
 								sizeof(config));
 
 	netdev_dbg(usbdev->net, "%s(): %d -> %d\n", __func__, channel, ret);
@@ -1359,7 +1359,7 @@ static struct ieee80211_channel *get_current_channel(struct usbnet *usbdev,
 
 	/* Get channel and beacon interval */
 	len = sizeof(config);
-	ret = rndis_query_oid(usbdev, OID_802_11_CONFIGURATION, &config, &len);
+	ret = rndis_query_oid(usbdev, cpu_to_le32(OID_802_11_CONFIGURATION), &config, &len);
 	netdev_dbg(usbdev->net, "%s(): OID_802_11_CONFIGURATION -> %d\n",
 				__func__, ret);
 	if (ret < 0)
@@ -1413,7 +1413,7 @@ static int add_wep_key(struct usbnet *usbdev, const u8 *key, int key_len,
 				    ret);
 	}
 
-	ret = rndis_set_oid(usbdev, OID_802_11_ADD_WEP, &ndis_key,
+	ret = rndis_set_oid(usbdev, cpu_to_le32(OID_802_11_ADD_WEP), &ndis_key,
 							sizeof(ndis_key));
 	if (ret != 0) {
 		netdev_warn(usbdev->net, "adding encryption key %d failed (%08X)\n",
@@ -1504,7 +1504,7 @@ static int add_wpa_key(struct usbnet *usbdev, const u8 *key, int key_len,
 			get_bssid(usbdev, ndis_key.bssid);
 	}
 
-	ret = rndis_set_oid(usbdev, OID_802_11_ADD_KEY, &ndis_key,
+	ret = rndis_set_oid(usbdev, cpu_to_le32(OID_802_11_ADD_KEY), &ndis_key,
 					le32_to_cpu(ndis_key.size));
 	netdev_dbg(usbdev->net, "%s(): OID_802_11_ADD_KEY -> %08X\n",
 		   __func__, ret);
@@ -1594,13 +1594,13 @@ static int remove_key(struct usbnet *usbdev, u8 index, const u8 *bssid)
 			memset(remove_key.bssid, 0xff,
 						sizeof(remove_key.bssid));
 
-		ret = rndis_set_oid(usbdev, OID_802_11_REMOVE_KEY, &remove_key,
+		ret = rndis_set_oid(usbdev, cpu_to_le32(OID_802_11_REMOVE_KEY), &remove_key,
 							sizeof(remove_key));
 		if (ret != 0)
 			return ret;
 	} else {
 		keyindex = cpu_to_le32(index);
-		ret = rndis_set_oid(usbdev, OID_802_11_REMOVE_WEP, &keyindex,
+		ret = rndis_set_oid(usbdev, cpu_to_le32(OID_802_11_REMOVE_WEP), &keyindex,
 							sizeof(keyindex));
 		if (ret != 0) {
 			netdev_warn(usbdev->net,
@@ -1669,7 +1669,7 @@ static void set_multicast_list(struct usbnet *usbdev)
 		goto set_filter;
 
 	if (mc_count) {
-		ret = rndis_set_oid(usbdev, OID_802_3_MULTICAST_LIST, mc_addrs,
+		ret = rndis_set_oid(usbdev, cpu_to_le32(OID_802_3_MULTICAST_LIST), mc_addrs,
 				    mc_count * ETH_ALEN);
 		kfree(mc_addrs);
 		if (ret == 0)
@@ -1682,7 +1682,7 @@ static void set_multicast_list(struct usbnet *usbdev)
 	}
 
 set_filter:
-	ret = rndis_set_oid(usbdev, OID_GEN_CURRENT_PACKET_FILTER, &filter,
+	ret = rndis_set_oid(usbdev, cpu_to_le32(OID_GEN_CURRENT_PACKET_FILTER), &filter,
 							sizeof(filter));
 	if (ret < 0) {
 		netdev_warn(usbdev->net, "couldn't set packet filter: %08x\n",
@@ -1748,7 +1748,7 @@ static struct ndis_80211_pmkid *get_device_pmkids(struct usbnet *usbdev)
 	pmkids->length = cpu_to_le32(len);
 	pmkids->bssid_info_count = cpu_to_le32(max_pmkids);
 
-	ret = rndis_query_oid(usbdev, OID_802_11_PMKID, pmkids, &len);
+	ret = rndis_query_oid(usbdev, cpu_to_le32(OID_802_11_PMKID), pmkids, &len);
 	if (ret < 0) {
 		netdev_dbg(usbdev->net, "%s(): OID_802_11_PMKID(%d, %d)"
 				" -> %d\n", __func__, len, max_pmkids, ret);
@@ -1776,7 +1776,7 @@ static int set_device_pmkids(struct usbnet *usbdev,
 
 	debug_print_pmkids(usbdev, pmkids, __func__);
 
-	ret = rndis_set_oid(usbdev, OID_802_11_PMKID, pmkids,
+	ret = rndis_set_oid(usbdev, cpu_to_le32(OID_802_11_PMKID), pmkids,
 						le32_to_cpu(pmkids->length));
 	if (ret < 0) {
 		netdev_dbg(usbdev->net, "%s(): OID_802_11_PMKID(%d, %d) -> %d"
@@ -2113,7 +2113,7 @@ resize_buf:
 	 * resizing until it won't get any bigger.
 	 */
 	new_len = len;
-	ret = rndis_query_oid(usbdev, OID_802_11_BSSID_LIST, buf, &new_len);
+	ret = rndis_query_oid(usbdev, cpu_to_le32(OID_802_11_BSSID_LIST), buf, &new_len);
 	if (ret != 0 || new_len < sizeof(struct ndis_80211_bssid_list_ex))
 		goto out;
 
@@ -2511,14 +2511,14 @@ static void rndis_fill_station_info(struct usbnet *usbdev,
 	memset(sinfo, 0, sizeof(*sinfo));
 
 	len = sizeof(linkspeed);
-	ret = rndis_query_oid(usbdev, OID_GEN_LINK_SPEED, &linkspeed, &len);
+	ret = rndis_query_oid(usbdev, cpu_to_le32(OID_GEN_LINK_SPEED), &linkspeed, &len);
 	if (ret == 0) {
 		sinfo->txrate.legacy = le32_to_cpu(linkspeed) / 1000;
 		sinfo->filled |= STATION_INFO_TX_BITRATE;
 	}
 
 	len = sizeof(rssi);
-	ret = rndis_query_oid(usbdev, OID_802_11_RSSI, &rssi, &len);
+	ret = rndis_query_oid(usbdev, cpu_to_le32(OID_802_11_RSSI), &rssi, &len);
 	if (ret == 0) {
 		sinfo->signal = level_to_qual(le32_to_cpu(rssi));
 		sinfo->filled |= STATION_INFO_SIGNAL;
@@ -2624,7 +2624,7 @@ static int rndis_flush_pmksa(struct wiphy *wiphy, struct net_device *netdev)
 	pmkid.length = cpu_to_le32(sizeof(pmkid));
 	pmkid.bssid_info_count = cpu_to_le32(0);
 
-	return rndis_set_oid(usbdev, OID_802_11_PMKID, &pmkid, sizeof(pmkid));
+	return rndis_set_oid(usbdev, cpu_to_le32(OID_802_11_PMKID), &pmkid, sizeof(pmkid));
 }
 
 static int rndis_set_power_mgmt(struct wiphy *wiphy, struct net_device *dev,
@@ -2654,7 +2654,7 @@ static int rndis_set_power_mgmt(struct wiphy *wiphy, struct net_device *dev,
 	priv->power_mode = power_mode;
 
 	mode = cpu_to_le32(power_mode);
-	ret = rndis_set_oid(usbdev, OID_802_11_POWER_MODE, &mode, sizeof(mode));
+	ret = rndis_set_oid(usbdev, cpu_to_le32(OID_802_11_POWER_MODE), &mode, sizeof(mode));
 
 	netdev_dbg(usbdev->net, "%s(): OID_802_11_POWER_MODE -> %d\n",
 				__func__, ret);
@@ -2693,7 +2693,7 @@ static void rndis_wlan_craft_connected_bss(struct usbnet *usbdev, u8 *bssid,
 	/* Get signal quality, in case of error use rssi=0 and ignore error. */
 	len = sizeof(rssi);
 	rssi = 0;
-	ret = rndis_query_oid(usbdev, OID_802_11_RSSI, &rssi, &len);
+	ret = rndis_query_oid(usbdev, cpu_to_le32(OID_802_11_RSSI), &rssi, &len);
 	signal = level_to_qual(le32_to_cpu(rssi));
 
 	netdev_dbg(usbdev->net, "%s(): OID_802_11_RSSI -> %d, "
@@ -2720,7 +2720,7 @@ static void rndis_wlan_craft_connected_bss(struct usbnet *usbdev, u8 *bssid,
 	/* Get SSID, in case of error, use zero length SSID and ignore error. */
 	len = sizeof(ssid);
 	memset(&ssid, 0, sizeof(ssid));
-	ret = rndis_query_oid(usbdev, OID_802_11_SSID, &ssid, &len);
+	ret = rndis_query_oid(usbdev, cpu_to_le32(OID_802_11_SSID), &ssid, &len);
 	netdev_dbg(usbdev->net, "%s(): OID_802_11_SSID -> %d, len: %d, ssid: "
 				"'%.32s'\n", __func__, ret,
 				le32_to_cpu(ssid.length), ssid.essid);
@@ -3097,7 +3097,7 @@ static void rndis_wlan_indication(struct usbnet *usbdev, void *ind, int buflen)
 
 	switch (msg->status) {
 	case cpu_to_le32(RNDIS_STATUS_MEDIA_CONNECT):
-		if (priv->current_command_oid == OID_802_11_ADD_KEY) {
+		if (priv->current_command_oid == cpu_to_le32(OID_802_11_ADD_KEY)) {
 			/* OID_802_11_ADD_KEY causes sometimes extra
 			 * "media connect" indications which confuses driver
 			 * and userspace to think that device is
@@ -3148,7 +3148,7 @@ static int rndis_wlan_get_caps(struct usbnet *usbdev, struct wiphy *wiphy)
 
 	/* determine supported modes */
 	len = sizeof(networks_supported);
-	retval = rndis_query_oid(usbdev, OID_802_11_NETWORK_TYPES_SUPPORTED,
+	retval = rndis_query_oid(usbdev, cpu_to_le32(OID_802_11_NETWORK_TYPES_SUPPORTED),
 						&networks_supported, &len);
 	if (retval >= 0) {
 		n = le32_to_cpu(networks_supported.num_items);
@@ -3173,7 +3173,7 @@ static int rndis_wlan_get_caps(struct usbnet *usbdev, struct wiphy *wiphy)
 	/* get device 802.11 capabilities, number of PMKIDs */
 	caps = (struct ndis_80211_capability *)caps_buf;
 	len = sizeof(caps_buf);
-	retval = rndis_query_oid(usbdev, OID_802_11_CAPABILITY, caps, &len);
+	retval = rndis_query_oid(usbdev, cpu_to_le32(OID_802_11_CAPABILITY), caps, &len);
 	if (retval >= 0) {
 		netdev_dbg(usbdev->net, "OID_802_11_CAPABILITY -> len %d, "
 				"ver %d, pmkids %d, auth-encr-pairs %d\n",
@@ -3247,7 +3247,7 @@ static void rndis_device_poller(struct work_struct *work)
 	}
 
 	len = sizeof(rssi);
-	ret = rndis_query_oid(usbdev, OID_802_11_RSSI, &rssi, &len);
+	ret = rndis_query_oid(usbdev, cpu_to_le32(OID_802_11_RSSI), &rssi, &len);
 	if (ret == 0) {
 		priv->last_qual = level_to_qual(le32_to_cpu(rssi));
 		rndis_do_cqm(usbdev, le32_to_cpu(rssi));
@@ -3275,7 +3275,7 @@ static void rndis_device_poller(struct work_struct *work)
 		 * working.
 		 */
 		tmp = cpu_to_le32(1);
-		rndis_set_oid(usbdev, OID_802_11_BSSID_LIST_SCAN, &tmp,
+		rndis_set_oid(usbdev, cpu_to_le32(OID_802_11_BSSID_LIST_SCAN), &tmp,
 								sizeof(tmp));
 
 		len = CONTROL_BUFFER_SIZE;
@@ -3283,7 +3283,7 @@ static void rndis_device_poller(struct work_struct *work)
 		if (!buf)
 			goto end;
 
-		rndis_query_oid(usbdev, OID_802_11_BSSID_LIST, buf, &len);
+		rndis_query_oid(usbdev, cpu_to_le32(OID_802_11_BSSID_LIST), buf, &len);
 		kfree(buf);
 	}
 
@@ -3603,7 +3603,7 @@ static int rndis_wlan_stop(struct usbnet *usbdev)
 	/* Set current packet filter zero to block receiving data packets from
 	   device. */
 	filter = 0;
-	rndis_set_oid(usbdev, OID_GEN_CURRENT_PACKET_FILTER, &filter,
+	rndis_set_oid(usbdev, cpu_to_le32(OID_GEN_CURRENT_PACKET_FILTER), &filter,
 								sizeof(filter));
 
 	return retval;
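
For readers following the hunks above: each OID constant is now wrapped in cpu_to_le32() at the call site, which suggests the query/set helpers expect the OID already in little-endian wire order. Below is a minimal userspace sketch of what that conversion does. The sketch_* names are hypothetical stand-ins, not the kernel helpers, and the value used in the usage note is arbitrary.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical userspace stand-in for the kernel's cpu_to_le32():
 * returns a value whose in-memory bytes are little-endian,
 * whatever the host byte order is. */
static uint32_t sketch_cpu_to_le32(uint32_t host)
{
	uint8_t b[4] = {
		(uint8_t)(host & 0xff),		/* least significant byte first */
		(uint8_t)((host >> 8) & 0xff),
		(uint8_t)((host >> 16) & 0xff),
		(uint8_t)((host >> 24) & 0xff),
	};
	uint32_t wire;

	memcpy(&wire, b, sizeof(wire));
	return wire;
}

/* Hypothetical stand-in for le32_to_cpu(): the inverse conversion. */
static uint32_t sketch_le32_to_cpu(uint32_t wire)
{
	uint8_t b[4];

	memcpy(b, &wire, sizeof(b));
	return (uint32_t)b[0] | ((uint32_t)b[1] << 8) |
	       ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
}
```

On a little-endian host both functions are effectively no-ops; on a big-endian host they swap bytes. Comparing a stored little-endian value against cpu_to_le32(CONSTANT), as the rndis_wlan_indication() hunk does, keeps both sides in the same byte order.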


* [RFC][PATCH] net: ipv4: ipconfig: decrease CONF_CARRIER_TIMEOUT
From: Christian Hemp @ 2012-05-02 15:24 UTC (permalink / raw)
  To: davem, kuznet, jmorris, yoshfuji, kaber, netdev; +Cc: Christian Hemp

A timeout of two minutes is pretty annoying if _no_ ethernet cable
is attached on purpose.  This patch decreases the timeout of
CONF_CARRIER_TIMEOUT to an acceptable value of 10 seconds.

Signed-off-by: Christian Hemp <c.hemp@phytec.de>
---
 net/ipv4/ipconfig.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/net/ipv4/ipconfig.c b/net/ipv4/ipconfig.c
index 99ec116..2aa80ac 100644
--- a/net/ipv4/ipconfig.c
+++ b/net/ipv4/ipconfig.c
@@ -89,7 +89,7 @@
 
 /* Define the friendly delay before and after opening net devices */
 #define CONF_POST_OPEN		10	/* After opening: 10 msecs */
-#define CONF_CARRIER_TIMEOUT	120000	/* Wait for carrier timeout */
+#define CONF_CARRIER_TIMEOUT	1000	/* Wait for carrier timeout */
 
 /* Define the timeout for waiting for a DHCP/BOOTP/RARP reply */
 #define CONF_OPEN_RETRIES 	2	/* (Re)open devices twice */
-- 
1.7.0.4
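
A small sketch of the trade-off, assuming (from the "10 msecs" comment on CONF_POST_OPEN and from 120000 matching two minutes) that CONF_CARRIER_TIMEOUT is in milliseconds. Under that assumption the 1000 in the hunk is one second, and 10 seconds would be 10000. The function below is a hypothetical simulation of a carrier wait loop, not the ipconfig code, and it does not actually sleep.

```c
/* Timeout values in milliseconds, taken from the hunk above. */
#define SKETCH_OLD_TIMEOUT_MS	120000
#define SKETCH_NEW_TIMEOUT_MS	1000

/*
 * Simulated wait for carrier. carrier_up_at_ms is the (hypothetical)
 * time at which the link comes up, or -1 if no cable is attached.
 * Returns the number of ms waited, or -1 if the timeout expired first.
 */
static long sketch_wait_for_carrier(long timeout_ms, long carrier_up_at_ms)
{
	long now;

	for (now = 0; now < timeout_ms; now++) {
		if (carrier_up_at_ms >= 0 && now >= carrier_up_at_ms)
			return now;	/* carrier detected */
	}
	return -1;			/* gave up waiting */
}
```

With no cable, the shorter timeout only costs one second instead of two minutes. A link that needs about three seconds to negotiate would be caught by the old value but missed by the new one, which may be worth considering when picking the number.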


* Re: sky2 still badly broken
From: Niccolò Belli @ 2012-05-02 15:12 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: netdev
In-Reply-To: <20120430122501.5fff4b1e@nehalam.linuxnetplumber.net>

On 30/04/2012 21:25, Stephen Hemminger wrote:
> You are getting CRC and FIFO overrun errors. What laptop is this?
> Everything works fine on my old Fujitsu with same chip (but rev 14).
> You could try taking out the status bit checks and see if the
> packets are really okay and the Marvell chip is complaining about
> bogus status.

I compiled 3.4-rc5 + both sky2 patches you recently published and I did 
some more tests:

Point to point: works flawlessly.
Attached to the switch: rx errors, even downloading a very small 89 KB 
file :(

This is the dump using IPv4:
http://files.linuxsystems.it/temp/2012-05/sky2_ipv4.pcap

dmesg (I did an rmmod -f sky2 before doing the test):
[ 1147.885026] sky2 0000:06:00.0: eth0: disabling interface
[ 1148.909548] sky2: driver version 1.30
[ 1148.909764] sky2 0000:06:00.0: Yukon-2 EC Ultra chip revision 3
[ 1148.914367] sky2 0000:06:00.0: irq 45 for MSI/MSI-X
[ 1148.916310] sky2 0000:06:00.0: eth0: addr 00:13:77:b4:1b:fa
[ 1148.942297] sky2 0000:06:00.0: eth0: enabling interface
[ 1148.944225] ADDRCONF(NETDEV_UP): eth0: link is not ready
[ 1151.496295] sky2 0000:06:00.0: eth0: Link is up at 1000 Mbps, full duplex, flow control rx
[ 1151.497614] ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
[ 1178.479437] device eth0 entered promiscuous mode
[ 1179.541707] *sky2 0000:06:00.0: eth0: rx error, status 0x7ffc0001 length 1468*
[ 1179.544110] *sky2 0000:06:00.0: eth0: rx error, status 0x7ffc0001 length 1468*
[ 1181.642598] device eth0 left promiscuous mode




This is the dump using IPv6:
http://files.linuxsystems.it/temp/2012-05/sky2_ipv6.pcap

dmesg (I did an rmmod -f sky2 before doing the test):
[ 1314.225572] sky2 0000:06:00.0: eth0: disabling interface
[ 1315.248523] sky2: driver version 1.30
[ 1315.248731] sky2 0000:06:00.0: Yukon-2 EC Ultra chip revision 3
[ 1315.249333] sky2 0000:06:00.0: irq 45 for MSI/MSI-X
[ 1315.250307] sky2 0000:06:00.0: eth0: addr 00:13:77:b4:1b:fa
[ 1315.271364] sky2 0000:06:00.0: eth0: enabling interface
[ 1315.273015] ADDRCONF(NETDEV_UP): eth0: link is not ready
[ 1317.875985] sky2 0000:06:00.0: eth0: Link is up at 1000 Mbps, full duplex, flow control rx
[ 1317.877311] ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
[ 1345.946062] device eth0 entered promiscuous mode
[ 1349.119848] device eth0 left promiscuous mode
[ 1376.231698] device eth0 entered promiscuous mode
[ 1377.369095] *sky2 0000:06:00.0: eth0: rx error, status 0x7ffc0001 length 1468*
[ 1379.618257] device eth0 left promiscuous mode



Switch is a Netgear GS724Tv3 firmware 5.0.3.5


Cheers,
Niccolò


* Re: [net-next PATCH v4 0/8] Managing the forwarding database(FDB)
From: Michael S. Tsirkin @ 2012-05-02 15:08 UTC (permalink / raw)
  Cc: john.r.fastabend, shemminger, bhutchings, sri, hadi,
	jeffrey.t.kirsher, netdev, gregory.v.rose, krkumar2, roprabhu
In-Reply-To: <20120415.130637.2258594023349277277.davem@davemloft.net>

On Sun, Apr 15, 2012 at 01:06:37PM -0400, David Miller wrote:
> From: John Fastabend <john.r.fastabend@intel.com>
> Date: Sun, 15 Apr 2012 09:43:51 -0700
> 
> > The following series is a submission for net-next to allow
> > embedded switches and other stacked devices other than the
> > Linux bridge to manage a forwarding database.
> > 
> > Previously discussed here,
> > 
> > http://lists.openwall.net/netdev/2012/03/19/26
> > 
> > v4: propagate return codes correctly for ndo_dflt_fdb_dump()
> > 
> > v3: resolve the macvlan patch 8/8 to fix a dev_set_promiscuity()
> >     error and add the flags field to change and get link routines.
> > 
> > v2: addressed feedback from Ben Hutchings resolving a typo in the
> >     multicast add/del routines and improving the error handling
> >     when both NTF_SELF and NTF_MASTER are set.
> > 
> > I've tested this with 'br' tool published by Stephen Hemminger
> > soon to be renamed 'bridge' I believe and various traffic
> > generators mostly pktgen, ping, and netperf.
> 
> All applied, if we need any more tweaks we can just add them
> on top of this work.
> 
> Thanks John.

John, do you plan to update kvm userspace to use this interface?

-- 
MST


* Re: [PATCH v2] net: l2tp: unlock socket lock before returning from l2tp_ip_sendmsg
From: Eric Dumazet @ 2012-05-02 14:32 UTC (permalink / raw)
  To: Sasha Levin; +Cc: davem, jchapman, netdev, linux-kernel, davej
In-Reply-To: <1335967123-2588-1-git-send-email-levinsasha928@gmail.com>

On Wed, 2012-05-02 at 15:58 +0200, Sasha Levin wrote:
> l2tp_ip_sendmsg could return without releasing socket lock, making it all the
> way to userspace, and generating the following warning:
> 
> [  130.891594] ================================================
> [  130.894569] [ BUG: lock held when returning to user space! ]
> [  130.897257] 3.4.0-rc5-next-20120501-sasha #104 Tainted: G        W
> [  130.900336] ------------------------------------------------
> [  130.902996] trinity/8384 is leaving the kernel with locks still held!
> [  130.906106] 1 lock held by trinity/8384:
> [  130.907924]  #0:  (sk_lock-AF_INET){+.+.+.}, at: [<ffffffff82b9503f>] l2tp_ip_sendmsg+0x2f/0x550
> 
> Introduced by commit 2f16270 ("l2tp: Fix locking in l2tp_ip.c").
> 
> Cc: stable@vger.kernel.org
> Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
> ---

Oh well, please read Documentation/stable_kernel_rules.txt

Also, David prefers to handle stable submissions himself for net tree.

Anyway :

Acked-by: Eric Dumazet <edumazet@google.com>


* RE: [PATCH net-next 0/4] be2net fixes
From: Somnath.Kotur @ 2012-05-02 14:04 UTC (permalink / raw)
  To: Somnath.Kotur, netdev
In-Reply-To: <1335965986-31886-1-git-send-email-somnath.kotur@emulex.com>

David,
I have sent the 4 patches from the earlier series of 5 as is, without updating the individual patch numbers, i.e. they still say patch 1/5 etc.
Please let me know if this is a problem and whether you want me to resend them.

Thanks
Somnath

> -----Original Message-----
> From: Somnath Kotur [mailto:somnath.kotur@emulex.com]
> Sent: Wednesday, May 02, 2012 7:10 PM
> To: netdev@vger.kernel.org
> Cc: Kotur, Somnath
> Subject: [PATCH net-next 0/4] be2net fixes
> 
> Re-posting patches 1-4 from the earlier patch set of 5.
> Incorporated review comment from Ben Hutchings in patch 2.
> Will address comments in Patch 5 and send it out separately.
> 
> Somnath Kotur (4):
>   be2net: Fix to not set link speed for disabled functions of a UMC
>     card
>   be2net: Fix to apply duplex value as unknown when link is down.
>   be2net: Record receive queue index in skb to aid RPS.
>   be2net: Fix EEH error reset before a flash dump completes
> 
>  drivers/net/ethernet/emulex/benet/be_ethtool.c |    4 ++--
>  drivers/net/ethernet/emulex/benet/be_main.c    |    7 +++++++
>  2 files changed, 9 insertions(+), 2 deletions(-)


* [PATCH v2] net: l2tp: unlock socket lock before returning from l2tp_ip_sendmsg
From: Sasha Levin @ 2012-05-02 13:58 UTC (permalink / raw)
  To: davem, jchapman, eric.dumazet
  Cc: netdev, linux-kernel, davej, Sasha Levin, stable

l2tp_ip_sendmsg could return without releasing socket lock, making it all the
way to userspace, and generating the following warning:

[  130.891594] ================================================
[  130.894569] [ BUG: lock held when returning to user space! ]
[  130.897257] 3.4.0-rc5-next-20120501-sasha #104 Tainted: G        W
[  130.900336] ------------------------------------------------
[  130.902996] trinity/8384 is leaving the kernel with locks still held!
[  130.906106] 1 lock held by trinity/8384:
[  130.907924]  #0:  (sk_lock-AF_INET){+.+.+.}, at: [<ffffffff82b9503f>] l2tp_ip_sendmsg+0x2f/0x550

Introduced by commit 2f16270 ("l2tp: Fix locking in l2tp_ip.c").

Cc: stable@vger.kernel.org
Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
---
 net/l2tp/l2tp_ip.c |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/net/l2tp/l2tp_ip.c b/net/l2tp/l2tp_ip.c
index 585d93e..6274f0b 100644
--- a/net/l2tp/l2tp_ip.c
+++ b/net/l2tp/l2tp_ip.c
@@ -442,8 +442,9 @@ static int l2tp_ip_sendmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *m
 
 		daddr = lip->l2tp_addr.s_addr;
 	} else {
+		rc = -EDESTADDRREQ;
 		if (sk->sk_state != TCP_ESTABLISHED)
-			return -EDESTADDRREQ;
+			goto out;
 
 		daddr = inet->inet_daddr;
 		connected = 1;
-- 
1.7.8.5
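
The pattern in the hunk above (set rc, then "goto out" instead of returning directly) can be sketched in userspace as follows. This is only an illustration under stated assumptions: the sketch_* types and functions are hypothetical, the lock is a plain counter rather than a real socket lock, and the transmit path is reduced to "rc = 0".

```c
#include <errno.h>

/* Hypothetical stand-in for a socket with a lock and a connection state. */
struct sketch_sock {
	int lock_depth;		/* > 0 while the "lock" is held */
	int established;	/* nonzero if connected */
};

static void sketch_lock(struct sketch_sock *sk)    { sk->lock_depth++; }
static void sketch_release(struct sketch_sock *sk) { sk->lock_depth--; }

static int sketch_sendmsg(struct sketch_sock *sk, int have_dest)
{
	int rc;

	sketch_lock(sk);

	if (!have_dest) {
		rc = -EDESTADDRREQ;
		if (!sk->established)
			goto out;	/* a bare "return" here would leak the lock */
	}

	rc = 0;			/* stand-in for the real transmit path */
out:
	sketch_release(sk);	/* single exit: every path drops the lock */
	return rc;
}
```

Routing every error path through one label keeps the unlock in a single place, so a new early exit added later cannot skip it by accident.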


* Re: [PATCH] mwl8k: Add 0x2a02 PCI device-id (Marvell 88W8361)
From: Ben Hutchings @ 2012-05-02 13:53 UTC (permalink / raw)
  To: sedat.dilek
  Cc: Lennert Buytenhek, John W. Linville, linux-wireless, netdev,
	linux-kernel, lautriv, Jim Cromie, Hauke Mehrtens
In-Reply-To: <CA+icZUV=qtXwVxb+uvQJcsza0y+6k2LHJFFPnECY0M=QWBexaQ@mail.gmail.com>


On Tue, 2012-05-01 at 15:54 +0200, Sedat Dilek wrote:
> On Tue, May 1, 2012 at 2:51 PM, Lennert Buytenhek
> <buytenh@wantstofly.org> wrote:
> > On Sun, Apr 29, 2012 at 12:25:21AM +0200, Sedat Dilek wrote:
> >
> >> > On 1st sight, logs look fine:
> >> >
> >> > [21:52:52] <lautriv> [    6.050967] ieee80211 phy0: 88w8361p v4,
> >> > 00173f3bdde3, STA firmware 2.1.4.25
> >> >
> >> > But WLAN connection is not that fast and stable as lautriv reports
> >> > (several abnormalities were observed).
> >> >
> >> > I requested a tarball which includes:
> >> > * dmesg (Linux-3.3.3)
> >> > * e_n_a (/etc/network/interfaces)
> >> > * ifconfig output
> >> > * iwconfig output
> >> > * iw_phy output
> >> > * ps_axu (WPA) output
> >> >
> >> > lautriv will be so kind to be around on #linux-wireless/Freenode the
> >> > next days (UTC+2: German/Swiss local-time).
> >> > Just ping him.
> >> >
> >> > Hope you have fun, together!
> >> >
> >> > - Sedat -
> >>
> >> A new tarball from lautriv with same outputs as before, but now tested
> >> with Linux-3.4-rc4.
> 
> [ CC hauke (OpenWrt) and Ben Hutchings (linux-firmware maintainer) ]
> 
> > The output looks good enough for me to ACK adding the PCI ID.
> >
> > Can the firmware being used here be submitted to the linux-firmware
> > git tree?
> 
> I can't say much about the firmware [1] inclusion or the procedure of
> it into linux-firmware [2].
> Maybe, Ben can explain the procedure and what has to be considered
> before inclusion in linux-firmware.
> The original firmware and helper images were extracted from a Netgear
> Windows driver [1].
[...]

Even assuming that the original driver binary is freely redistributable,
it's not at all clear that these extracted blobs would be.

If there's some difficulty in getting a sensible licence for the
firmware, you can consider providing a tool to do this extraction, as
has been done for some other drivers whose vendors don't provide the
firmware blobs alone.  However, as yet, linux-firmware doesn't include
any such extraction tools and I would need to discuss such an addition
with David (and maybe other distribution maintainers).  Hopefully that
won't be necessary, though.

Ben.

-- 
Ben Hutchings
Design a system any fool can use, and only a fool will want to use it.



* Re: [PATCH] net: l2tp: unlock socket lock before returning from l2tp_ip_sendmsg
From: Sasha Levin @ 2012-05-02 13:49 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: davem, jchapman, netdev, linux-kernel, davej
In-Reply-To: <1335965318.22133.568.camel@edumazet-glaptop>

On Wed, May 2, 2012 at 3:28 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> Good catch, but please use existing code style in this function.
>
> rc = -EDESTADDRREQ;
> if (sk->sk_state != TCP_ESTABLISHED)
>        goto out;
>
> Also, please add in your commit message bug origin to ease stable team
> work (not counting reviewers work)
>
> Bug added in commit 2f16270f41e1 (l2tp: Fix locking in l2tp_ip.c)
>
> Really, given the amount of patches you already sent, you should already
> know that.
>
> Thanks
>
>

Understood. I'll resend.


* [PATCH net-next 4/5] be2net: Fix EEH error reset before a flash dump completes
From: Somnath Kotur @ 2012-05-02 13:41 UTC (permalink / raw)
  To: netdev; +Cc: Somnath Kotur, Sathya Perla

An EEH error can cause the FW to trigger a flash debug dump.
Resetting the card while flash dump is in progress can cause it not to recover.
Wait for it to finish before letting EEH flow to reset the card.

Signed-off-by: Sathya Perla <Sathya.Perla@emulex.com>
Signed-off-by: Somnath Kotur <somnath.kotur@emulex.com>
---
 drivers/net/ethernet/emulex/benet/be_main.c |    5 +++++
 1 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/drivers/net/ethernet/emulex/benet/be_main.c b/drivers/net/ethernet/emulex/benet/be_main.c
index b7bc905..6d5d30b 100644
--- a/drivers/net/ethernet/emulex/benet/be_main.c
+++ b/drivers/net/ethernet/emulex/benet/be_main.c
@@ -3821,6 +3821,11 @@ static pci_ers_result_t be_eeh_err_detected(struct pci_dev *pdev,
 
 	pci_disable_device(pdev);
 
+	/* The error could cause the FW to trigger a flash debug dump.
+	 * Resetting the card while flash dump is in progress
+	 * can cause it not to recover; wait for it to finish
+	 */
+	ssleep(30);
 	return PCI_ERS_RESULT_NEED_RESET;
 }
 
-- 
1.5.6.1


* [PATCH net-next 3/5] be2net: Record receive queue index in skb to aid RPS.
From: Somnath Kotur @ 2012-05-02 13:40 UTC (permalink / raw)
  To: netdev; +Cc: Somnath Kotur, Sarveshwar Bandi


Signed-off-by: Sarveshwar Bandi <Sarveshwar.Bandi@emulex.com>
Signed-off-by: Somnath Kotur <somnath.kotur@emulex.com>
---
 drivers/net/ethernet/emulex/benet/be_main.c |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/drivers/net/ethernet/emulex/benet/be_main.c b/drivers/net/ethernet/emulex/benet/be_main.c
index c8f7b3a..b7bc905 100644
--- a/drivers/net/ethernet/emulex/benet/be_main.c
+++ b/drivers/net/ethernet/emulex/benet/be_main.c
@@ -1259,6 +1259,7 @@ static void be_rx_compl_process(struct be_rx_obj *rxo,
 		skb_checksum_none_assert(skb);
 
 	skb->protocol = eth_type_trans(skb, netdev);
+	skb_record_rx_queue(skb, rxo - &adapter->rx_obj[0]);
 	if (netdev->features & NETIF_F_RXHASH)
 		skb->rxhash = rxcp->rss_hash;
 
@@ -1315,6 +1316,7 @@ void be_rx_compl_process_gro(struct be_rx_obj *rxo, struct napi_struct *napi,
 	skb->len = rxcp->pkt_size;
 	skb->data_len = rxcp->pkt_size;
 	skb->ip_summed = CHECKSUM_UNNECESSARY;
+	skb_record_rx_queue(skb, rxo - &adapter->rx_obj[0]);
 	if (adapter->netdev->features & NETIF_F_RXHASH)
 		skb->rxhash = rxcp->rss_hash;
 
-- 
1.5.6.1
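
The expression "rxo - &adapter->rx_obj[0]" in both hunks is plain pointer subtraction: for two pointers into the same array it yields the element index, here the receive queue number. A minimal sketch, with hypothetical sketch_* structures standing in for the driver's own (the array size of 4 is arbitrary):

```c
#include <stddef.h>

#define SKETCH_MAX_RX_QS 4	/* arbitrary size for the illustration */

/* Hypothetical stand-ins for the driver's per-queue and adapter structs. */
struct sketch_rx_obj {
	unsigned long pkts;
};

struct sketch_adapter {
	struct sketch_rx_obj rx_obj[SKETCH_MAX_RX_QS];
};

/*
 * Index of rxo within adapter->rx_obj[]. Only valid when rxo really
 * points into that array; the result is in elements, not bytes.
 */
static ptrdiff_t sketch_rxq_index(const struct sketch_adapter *adapter,
				  const struct sketch_rx_obj *rxo)
{
	return rxo - &adapter->rx_obj[0];
}
```

Passing that index to skb_record_rx_queue() lets the stack know which hardware queue delivered the packet, which is what the subject line refers to for RPS.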


* [PATCH net-next 2/5] be2net: Fix to apply duplex value as unknown when link is down.
From: Somnath Kotur @ 2012-05-02 13:40 UTC (permalink / raw)
  To: netdev; +Cc: Somnath Kotur, Sarveshwar Bandi


Suggested-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: Sarveshwar Bandi <sarveshwar.bandi@emulex.com>
Signed-off-by: Somnath Kotur <somnath.kotur@emulex.com>
---
 drivers/net/ethernet/emulex/benet/be_ethtool.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/net/ethernet/emulex/benet/be_ethtool.c b/drivers/net/ethernet/emulex/benet/be_ethtool.c
index c9ba2cb..747f68f 100644
--- a/drivers/net/ethernet/emulex/benet/be_ethtool.c
+++ b/drivers/net/ethernet/emulex/benet/be_ethtool.c
@@ -618,7 +618,7 @@ static int be_get_settings(struct net_device *netdev, struct ethtool_cmd *ecmd)
 		ecmd->supported = adapter->phy.supported;
 	}
 
-	ecmd->duplex = DUPLEX_FULL;
+	ecmd->duplex = netif_carrier_ok(netdev) ? DUPLEX_FULL : DUPLEX_UNKNOWN;
 	ecmd->phy_address = adapter->port_num;
 
 	return 0;
-- 
1.5.6.1
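
A minimal sketch of the reporting rule in the hunk above: full duplex is only claimed while the carrier is up, otherwise the value is reported as unknown. The SKETCH_* constants mirror the DUPLEX_FULL and DUPLEX_UNKNOWN values from linux/ethtool.h but are redefined here so the example builds in userspace; the function name is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Redefined for a userspace build; values mirror <linux/ethtool.h>. */
#define SKETCH_DUPLEX_FULL	0x01
#define SKETCH_DUPLEX_UNKNOWN	0xff

/* Report duplex only when the link is actually up. */
static uint8_t sketch_report_duplex(bool carrier_ok)
{
	return carrier_ok ? SKETCH_DUPLEX_FULL : SKETCH_DUPLEX_UNKNOWN;
}
```

This keeps tools such as ethtool from showing a stale "Full" while the cable is unplugged.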


