Netdev List

* Re: [PATCH 10/17] netvm: Allow skb allocation to use PFMEMALLOC reserves
From: Sebastian Andrzej Siewior @ 2012-06-21 16:43 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Sebastian Andrzej Siewior, Mel Gorman, Andrew Morton, Linux-MM,
	Linux-Netdev, LKML, David Miller, Neil Brown, Peter Zijlstra,
	Mike Christie, Eric B Munson
In-Reply-To: <1340296719.4604.5984.camel@edumazet-glaptop>

> > This is mostly used by NICs to refill their RX skb pool. You add
> > __GFP_MEMALLOC to the allocation to raise the chance of a successful refill
> > for the swap case.
> > A few drivers use build_skb() to create the skb. __netdev_alloc_skb()
> > shouldn't be affected since the allocation happens with GFP_ATOMIC. Looking at
> > TG3, it uses build_skb() and get_pages() / kmalloc(). Shouldn't this be
> > considered?
> 
> Please look at net-next, this was changed recently.
> 
> In fact most RX allocations are done using netdev_alloc_frag(), because
> it's called from __netdev_alloc_skb()

Argh, this is what I meant more or less. I got the flag magic wrong so I assumed
that this is only called without GFP_ATOMIC but it is not. Thanks for the
hint.

> So tg3 is not anymore the exception, but the norm.

Sebastian

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply

* Re: [PATCH 10/17] netvm: Allow skb allocation to use PFMEMALLOC reserves
From: Eric Dumazet @ 2012-06-21 16:38 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: Mel Gorman, Andrew Morton, Linux-MM, Linux-Netdev, LKML,
	David Miller, Neil Brown, Peter Zijlstra, Mike Christie,
	Eric B Munson
In-Reply-To: <20120621163029.GB6045@breakpoint.cc>

On Thu, 2012-06-21 at 18:30 +0200, Sebastian Andrzej Siewior wrote:
> > diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> > index 1d6ecc8..9a58dcc 100644
> > --- a/net/core/skbuff.c
> > +++ b/net/core/skbuff.c
> > @@ -167,14 +206,19 @@ static void skb_under_panic(struct sk_buff *skb, int sz, void *here)
> >   *	%GFP_ATOMIC.
> >   */
> >  struct sk_buff *__alloc_skb(unsigned int size, gfp_t gfp_mask,
> > -			    int fclone, int node)
> > +			    int flags, int node)
> >  {
> >  	struct kmem_cache *cache;
> >  	struct skb_shared_info *shinfo;
> >  	struct sk_buff *skb;
> >  	u8 *data;
> > +	bool pfmemalloc;
> >  
> > -	cache = fclone ? skbuff_fclone_cache : skbuff_head_cache;
> > +	cache = (flags & SKB_ALLOC_FCLONE)
> > +		? skbuff_fclone_cache : skbuff_head_cache;
> > +
> > +	if (sk_memalloc_socks() && (flags & SKB_ALLOC_RX))
> > +		gfp_mask |= __GFP_MEMALLOC;
> >  
> >  	/* Get the HEAD */
> >  	skb = kmem_cache_alloc_node(cache, gfp_mask & ~__GFP_DMA, node);
> 
> This is mostly used by NICs to refill their RX skb pool. You add
> __GFP_MEMALLOC to the allocation to raise the chance of a successful refill
> for the swap case.
> A few drivers use build_skb() to create the skb. __netdev_alloc_skb()
> shouldn't be affected since the allocation happens with GFP_ATOMIC. Looking at
> TG3, it uses build_skb() and get_pages() / kmalloc(). Shouldn't this be
> considered?

Please look at net-next, this was changed recently.

In fact most RX allocations are done using netdev_alloc_frag(), because
it's called from __netdev_alloc_skb()

So tg3 is not anymore the exception, but the norm.



^ permalink raw reply

* Re: [PATCH] bnx2x: fix panic when TX ring is full
From: Dmitry Kravkov @ 2012-06-21 16:34 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Tomas Hruby, David Miller, netdev@vger.kernel.org,
	therbert@google.com, evansr@google.com, Eilon Greenstein,
	Merav Sicron, Yaniv Rosner, willemb@google.com
In-Reply-To: <1340295912.4604.5935.camel@edumazet-glaptop>

On Thu, 2012-06-21 at 18:25 +0200, Eric Dumazet wrote:
> On Thu, 2012-06-21 at 18:56 +0300, Dmitry Kravkov wrote:
> > On Thu, 2012-06-21 at 17:12 +0200, Eric Dumazet wrote:
> > > On Thu, 2012-06-21 at 15:19 +0300, Dmitry Kravkov wrote:
> > > 
> > > > The crash happens with default configuration since
> > > > [4acb41903b2f99f3dffd4c3df9acc84ca5942cb2] "net/tcp: Fix tcp memory
> > > > limits initialization when !CONFIG_SYSCTL", but it can be hit by
> > > > increasing values of tcp_wmem even earlier.
> > > 
> > > This makes no sense.
> > Bisected to this commit and reproduced before the commit only after:
> > echo "4096 16384 4194304" > /proc/sys/net/ipv4/tcp_wmem
> > by this max nr_frags raised from 8 to 17, when running 40 netperfs
> > 
> > i've decreased rx queue to 200, during the test
> 
> I repeat, this bug can be triggered anytime with a skb not provided by
> local tcp stack.
Got it
> 
> By the way, looking at your fix, its pretty obvious the fix has nothing
> to do with TCP stack.
It just describes the "4" number

> 
> commit changelog must be accurate, so please remove this wrong
> information. This will confuse future readers.
will do so
> 
> 
> 
> 

^ permalink raw reply

* Re: [PATCH 10/17] netvm: Allow skb allocation to use PFMEMALLOC reserves
From: Sebastian Andrzej Siewior @ 2012-06-21 16:30 UTC (permalink / raw)
  To: Mel Gorman
  Cc: Andrew Morton, Linux-MM, Linux-Netdev, LKML, David Miller,
	Neil Brown, Peter Zijlstra, Mike Christie, Eric B Munson
In-Reply-To: <1340192652-31658-11-git-send-email-mgorman@suse.de>

> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> index 1d6ecc8..9a58dcc 100644
> --- a/net/core/skbuff.c
> +++ b/net/core/skbuff.c
> @@ -167,14 +206,19 @@ static void skb_under_panic(struct sk_buff *skb, int sz, void *here)
>   *	%GFP_ATOMIC.
>   */
>  struct sk_buff *__alloc_skb(unsigned int size, gfp_t gfp_mask,
> -			    int fclone, int node)
> +			    int flags, int node)
>  {
>  	struct kmem_cache *cache;
>  	struct skb_shared_info *shinfo;
>  	struct sk_buff *skb;
>  	u8 *data;
> +	bool pfmemalloc;
>  
> -	cache = fclone ? skbuff_fclone_cache : skbuff_head_cache;
> +	cache = (flags & SKB_ALLOC_FCLONE)
> +		? skbuff_fclone_cache : skbuff_head_cache;
> +
> +	if (sk_memalloc_socks() && (flags & SKB_ALLOC_RX))
> +		gfp_mask |= __GFP_MEMALLOC;
>  
>  	/* Get the HEAD */
>  	skb = kmem_cache_alloc_node(cache, gfp_mask & ~__GFP_DMA, node);

This is mostly used by NICs to refill their RX skb pool. You add
__GFP_MEMALLOC to the allocation to raise the chance of a successful refill
for the swap case.
A few drivers use build_skb() to create the skb. __netdev_alloc_skb()
shouldn't be affected since the allocation happens with GFP_ATOMIC. Looking at
TG3, it uses build_skb() and get_pages() / kmalloc(). Shouldn't this be
considered?
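
For readers following the flag handling in the quoted hunk, here is a minimal
userspace sketch of how the allocation mask is widened. SKB_ALLOC_FCLONE and
SKB_ALLOC_RX are taken from the patch; the MOCK_GFP_* values and the
mock_alloc_gfp() helper are hypothetical stand-ins and not kernel definitions.

```c
#include <assert.h>
#include <stdbool.h>

#define SKB_ALLOC_FCLONE	0x01	/* value from the patch */
#define SKB_ALLOC_RX		0x02	/* value from the patch */

/* Hypothetical bit values, chosen only for this sketch. */
#define MOCK_GFP_ATOMIC		0x0020u
#define MOCK_GFP_MEMALLOC	0x2000u

/*
 * Mirrors the quoted logic: RX allocations may dip into the reserves,
 * but only while at least one memalloc socket exists.
 */
static unsigned int mock_alloc_gfp(unsigned int gfp_mask, int flags,
				   bool memalloc_socks)
{
	/* Only the two flags defined by the patch are expected here. */
	assert((flags & ~(SKB_ALLOC_FCLONE | SKB_ALLOC_RX)) == 0);

	if (memalloc_socks && (flags & SKB_ALLOC_RX))
		gfp_mask |= MOCK_GFP_MEMALLOC;
	return gfp_mask;
}
```

In this sketch a caller that does not pass SKB_ALLOC_RX never gets the extra
bit, which is the situation the question about build_skb() users is about.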

Sebastian

^ permalink raw reply

* Re: [PATCH] bnx2x: fix panic when TX ring is full
From: Eric Dumazet @ 2012-06-21 16:25 UTC (permalink / raw)
  To: Dmitry Kravkov
  Cc: Tomas Hruby, David Miller, netdev@vger.kernel.org,
	therbert@google.com, evansr@google.com, Eilon Greenstein,
	Merav Sicron, Yaniv Rosner, willemb@google.com
In-Reply-To: <1340294182.18721.30.camel@lb-tlvb-dmitry>

On Thu, 2012-06-21 at 18:56 +0300, Dmitry Kravkov wrote:
> On Thu, 2012-06-21 at 17:12 +0200, Eric Dumazet wrote:
> > On Thu, 2012-06-21 at 15:19 +0300, Dmitry Kravkov wrote:
> > 
> > > The crash happens with default configuration since
> > > [4acb41903b2f99f3dffd4c3df9acc84ca5942cb2] "net/tcp: Fix tcp memory
> > > limits initialization when !CONFIG_SYSCTL", but it can be hit by
> > > increasing values of tcp_wmem even earlier.
> > 
> > This makes no sense.
> Bisected to this commit and reproduced before the commit only after:
> echo "4096 16384 4194304" > /proc/sys/net/ipv4/tcp_wmem
> by this max nr_frags raised from 8 to 17, when running 40 netperfs
> 
> i've decreased rx queue to 200, during the test

I repeat, this bug can be triggered anytime with a skb not provided by
local tcp stack.

By the way, looking at your fix, its pretty obvious the fix has nothing
to do with TCP stack.

commit changelog must be accurate, so please remove this wrong
information. This will confuse future readers.
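
A minimal sketch of the point, assuming a generic TX ring: whether a packet
fits depends on its fragment count and the driver's reserve, not on which
stack built the skb. MOCK_TX_BD_RESERVE and mock_tx_ring_has_room() are
hypothetical names and values, not bnx2x code.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical reserve for descriptors the skb itself does not account
 * for (for example a "next page" BD). The real value is driver specific.
 */
#define MOCK_TX_BD_RESERVE	4u

/* Returns true when a packet with nr_frags fragments fits in the ring. */
static bool mock_tx_ring_has_room(unsigned int free_bds, unsigned int nr_frags)
{
	/* One BD for the linear part, one per fragment, plus the reserve. */
	unsigned int needed = 1u + nr_frags + MOCK_TX_BD_RESERVE;

	assert(nr_frags <= 64u);	/* sanity bound for this sketch only */
	return free_bds >= needed;
}
```

With these assumed numbers, 13 free BDs are enough for an 8-fragment skb but
not for a 17-fragment one, whether it came from local TCP or from a
forwarding path with GRO.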

^ permalink raw reply

* Re: [PATCH] l2tp: synchronise u64 stats writer callsites
From: Tom Parkin @ 2012-06-21 16:19 UTC (permalink / raw)
  To: David Laight; +Cc: netdev, James Chapman
In-Reply-To: <AE90C24D6B3A694183C094C60CF0A2F6026B6F5D@saturn3.aculab.com>

On Thu, Jun 21, 2012 at 04:18:12PM +0100, David Laight wrote:
> The purpose of the u64_stats_update_begin/end is to
> perform lockless writes of the stats.
> If you need to lock them (because multiple threads can
> be writing to stats covered by the same 'syncp' at the
> same time) then the reader might as well use the same lock.
> 
> Otherwise split the 'syncp' such that only one update
> can be happening (for each sync).

Thanks David.

I think the best approach is probably to attempt to partition the l2tp
statistics such that we can be sure of single-threaded writer access
for each dataset, which can then be covered by a 'syncp'.

If that turns out not to be possible, I suppose the fallback position
is to do away with the u64_stats_update* calls and just use a spinlock
instead.

I'll look at implementing the former and put a new patch together.
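
A rough userspace sketch of the partitioning idea, assuming each writer
context owns exactly one slot (in the kernel each slot would also carry its
own 'syncp'). MOCK_NR_WRITERS and all mock_* names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define MOCK_NR_WRITERS	4u	/* hypothetical number of writer contexts */

struct mock_slot {
	uint64_t rx_packets;
	uint64_t rx_bytes;
};

struct mock_split_stats {
	struct mock_slot slot[MOCK_NR_WRITERS];
};

/* Each writer touches only its own slot, so writers never collide. */
static void mock_rx_account(struct mock_split_stats *st, unsigned int writer,
			    uint64_t length)
{
	assert(writer < MOCK_NR_WRITERS);
	st->slot[writer].rx_packets++;
	st->slot[writer].rx_bytes += length;
}

/* The reader folds all slots into one total. */
static struct mock_slot mock_stats_sum(const struct mock_split_stats *st)
{
	struct mock_slot total = { 0, 0 };
	unsigned int i;

	for (i = 0; i < MOCK_NR_WRITERS; i++) {
		total.rx_packets += st->slot[i].rx_packets;
		total.rx_bytes += st->slot[i].rx_bytes;
	}
	return total;
}
```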

Tom
-- 
Tom Parkin
Katalix Systems Ltd
http://www.katalix.com
Catalysts for your Embedded Linux software development

^ permalink raw reply

* Re: [PATCH 10/17] netvm: Allow skb allocation to use PFMEMALLOC reserves
From: Sebastian Andrzej Siewior @ 2012-06-21 16:09 UTC (permalink / raw)
  To: Mel Gorman
  Cc: Andrew Morton, Linux-MM, Linux-Netdev, LKML, David Miller,
	Neil Brown, Peter Zijlstra, Mike Christie, Eric B Munson
In-Reply-To: <1340192652-31658-11-git-send-email-mgorman@suse.de>

On Wed, Jun 20, 2012 at 12:44:05PM +0100, Mel Gorman wrote:
> index b534a1b..61c951f 100644
> --- a/include/linux/skbuff.h
> +++ b/include/linux/skbuff.h
> @@ -505,6 +506,15 @@ struct sk_buff {
>  #include <linux/slab.h>
>  
>  
> +#define SKB_ALLOC_FCLONE	0x01
> +#define SKB_ALLOC_RX		0x02
> +
> +/* Returns true if the skb was allocated from PFMEMALLOC reserves */
> +static inline bool skb_pfmemalloc(struct sk_buff *skb)
> +{
> +	return unlikely(skb->pfmemalloc);
> +}
> +
>  /*
>   * skb might have a dst pointer attached, refcounted or not.
>   * _skb_refdst low order bit is set if refcount was _not_ taken
> @@ -568,7 +578,7 @@ extern bool skb_try_coalesce(struct sk_buff *to, struct sk_buff *from,
> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> index 1d6ecc8..9a58dcc 100644
> --- a/net/core/skbuff.c
> +++ b/net/core/skbuff.c
> @@ -839,6 +900,13 @@ static void copy_skb_header(struct sk_buff *new, const struct sk_buff *old)
>  	skb_shinfo(new)->gso_type = skb_shinfo(old)->gso_type;
>  }
>  
> +static inline int skb_alloc_rx_flag(const struct sk_buff *skb)
> +{
> +	if (skb_pfmemalloc((struct sk_buff *)skb))
> +		return SKB_ALLOC_RX;
> +	return 0;
> +}
> +
>  /**
>   *	skb_copy	-	create private copy of an sk_buff
>   *	@skb: buffer to copy

If you merge this chunk

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 6510a5d..2acfec9 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -510,7 +510,7 @@ struct sk_buff {
 #define SKB_ALLOC_RX		0x02
 
 /* Returns true if the skb was allocated from PFMEMALLOC reserves */
-static inline bool skb_pfmemalloc(struct sk_buff *skb)
+static inline bool skb_pfmemalloc(const struct sk_buff *skb)
 {
 	return unlikely(skb->pfmemalloc);
 }
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index c44ab68..6ce94b5 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -852,7 +852,7 @@ static void copy_skb_header(struct sk_buff *new, const struct sk_buff *old)
 
 static inline int skb_alloc_rx_flag(const struct sk_buff *skb)
 {
-	if (skb_pfmemalloc((struct sk_buff *)skb))
+	if (skb_pfmemalloc(skb))
 		return SKB_ALLOC_RX;
 	return 0;
 }


Then you should be able to drop the cast in skb_alloc_rx_flag() without adding
a warning.
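
The effect can be tried outside the kernel with a small stand-in. This is
only a sketch: struct mock_skb and the mock_* helpers are hypothetical and
keep just the one field the accessor reads.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKB_ALLOC_RX	0x02	/* value from the patch */

/* Hypothetical stand-in for struct sk_buff. */
struct mock_skb {
	unsigned int pfmemalloc:1;
};

/* Taking a const pointer lets read-only callers use this directly. */
static inline bool mock_skb_pfmemalloc(const struct mock_skb *skb)
{
	return skb->pfmemalloc;
}

static inline int mock_skb_alloc_rx_flag(const struct mock_skb *skb)
{
	assert(skb != NULL);
	if (mock_skb_pfmemalloc(skb))	/* no cast, no discarded qualifier */
		return SKB_ALLOC_RX;
	return 0;
}
```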

Sebastian

^ permalink raw reply related

* Re: [RFC] TCP:  Support configurable delayed-ack parameters.
From: Ben Greear @ 2012-06-21 16:04 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: netdev-u79uwXL29TY76Z2rM5mHXA, Daniel Baluta,
	linux-wireless-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
In-Reply-To: <1340082688.7491.2299.camel@edumazet-glaptop>

On 06/18/2012 10:11 PM, Eric Dumazet wrote:
> On Mon, 2012-06-18 at 17:52 -0700, greearb-my8/4N5VtI7c+919tysfdA@public.gmane.org wrote:

>> In order to keep a multiply out of the hot path, the segs * mss
>> computation is recalculated and cached whenever segs or mss changes.
>>
>
> I know David was worried about this multiply, but current cpus do a
> multiply in at most 3 cycles.
>
> Adding a u32 field in socket structure adds 1/16 of a cache line, and
> adds more penalty.
>
> Avoiding to build/send an ACK packet can save us so many cpu cycles that
> the multiply is pure noise.

I modified the patch as you suggested to remove the cached multiply
and just do the multiply in the hot path (and fixed a few other bugs in
the implementation).  And yes, I know Dave doesn't like the patch, so
it's unlikely to ever go upstream...

Test system is i5 processor laptop, 3.3.7+ kernel, Fedora 17, running wifi
traffic and wired NIC through an AP (sending-to-self, with proper
routing rules to make this function as desired).  AP is 3x3 mimo,
laptop is 2x2, max nominal rate of 300Mbps.  Channel is 149.
Both nics are Atheros (ath9k).
Laptop and AP are about 3 feet apart, and AP antenna & laptop rotation
have been tweaked for maximum throughput.

Traffic generator is our in-house tool, but it generally matches
iperf when used with the same configuration.  Send-buffer size
is configured at 1MB (with system defaults performance is much worse).

This is wifi upload, with station sending to wired Ethernet port.

I only changed the max-segs values for this test, leaving the min/max
delay-ack timers at defaults.

Rate is calculated over TCP data throughput, ie not counting headers.
The rates bounce around a bit, but I tried to report the average.

segs == 1:    196Mbps TCP throughput, 17,000 pps tx, 4,000 pps rx on wlan interface.

segs == 20:   203Mbps, 17,300 pps tx, 1,400 pps rx

segs == 64:   217Mbps, 18,300 pps tx, 311 pps rx

segs == 1024: 231Mbps, 19,200 pps tx, 118 pps rx

Note that with pure UDP throughput, I see right at 230-240Mbps when
everything is running smoothly, so setting delack-segs to a high value
allows TCP to approach UDP throughput.
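
To put the pps figures side by side, a small helper can compute how many
packets are sent per packet received on the wlan interface. This is a sketch
that assumes the rx packets are mostly ACKs, which the numbers above suggest
but do not prove; mock_tx_per_rx() is a hypothetical name.

```c
#include <assert.h>

/* Ratio of transmitted to received packets per second. */
static double mock_tx_per_rx(double tx_pps, double rx_pps)
{
	assert(rx_pps > 0.0);
	return tx_pps / rx_pps;
}
```

Fed with the four measurements, the ratio grows from roughly 4 (segs == 1)
through about 12 and 59 to about 163 (segs == 1024).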

I'll repost the patch (against 3.5-rcX) that I'm using later today
after some more testing in case someone else wants to try it out.

Thanks,
Ben

-- 
Ben Greear <greearb-my8/4N5VtI7c+919tysfdA@public.gmane.org>
Candela Technologies Inc  http://www.candelatech.com

--
To unsubscribe from this list: send the line "unsubscribe linux-wireless" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* Re: [PATCH] bnx2x: fix panic when TX ring is full
From: Dmitry Kravkov @ 2012-06-21 16:01 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Tomas Hruby, David Miller, netdev@vger.kernel.org,
	therbert@google.com, evansr@google.com, Eilon Greenstein,
	Merav Sicron, Yaniv Rosner, willemb@google.com
In-Reply-To: <1340294182.18721.30.camel@lb-tlvb-dmitry>

On Thu, 2012-06-21 at 18:56 +0300, Dmitry Kravkov wrote:
> On Thu, 2012-06-21 at 17:12 +0200, Eric Dumazet wrote:
> > On Thu, 2012-06-21 at 15:19 +0300, Dmitry Kravkov wrote:
> > 
> > > The crash happens with default configuration since
> > > [4acb41903b2f99f3dffd4c3df9acc84ca5942cb2] "net/tcp: Fix tcp memory
> > > limits initialization when !CONFIG_SYSCTL", but it can be hit by
> > > increasing values of tcp_wmem even earlier.
> > 
> > This makes no sense.
> Bisected to this commit and reproduced before the commit only after:
> echo "4096 16384 4194304" > /proc/sys/net/ipv4/tcp_wmem
> by this max nr_frags raised from 8 to 17, when running 40 netperfs
> 
> i've decreased rx queue to 200, during the test
sorry, tx queue
> 
> 
> > > From: Dmitry Kravkov <dmitry@broadcom.com>
> > > Subject: [PATCH net-next] bnx2x: reservation for NEXT tx BDs
> > > 
> > > Commit [4acb41903b2f99f3dffd4c3df9acc84ca5942cb2]
> > > net/tcp: Fix tcp memory limits initialization when !CONFIG_SYSCTL
> > > provided new default value for tcp_wmem, since heavy tcp
> > > traffic may cause the TSO packet to consume 20 BDs + 1 for next page
> > > descriptor.
> > 
> > This is completely bogus. I have no idea how you came to this.
> > 
> > A forwarding workload can trigger same bug, if GRO is enabled.
> > 
> > Remove this wrong bit, please ?
> > 
> > 
> > 
> 

^ permalink raw reply

* Re: [PATCH] bnx2x: fix panic when TX ring is full
From: Dmitry Kravkov @ 2012-06-21 15:56 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Tomas Hruby, David Miller, netdev@vger.kernel.org,
	therbert@google.com, evansr@google.com, Eilon Greenstein,
	Merav Sicron, Yaniv Rosner, willemb@google.com
In-Reply-To: <1340291526.4604.5710.camel@edumazet-glaptop>

On Thu, 2012-06-21 at 17:12 +0200, Eric Dumazet wrote:
> On Thu, 2012-06-21 at 15:19 +0300, Dmitry Kravkov wrote:
> 
> > The crash happens with default configuration since
> > [4acb41903b2f99f3dffd4c3df9acc84ca5942cb2] "net/tcp: Fix tcp memory
> > limits initialization when !CONFIG_SYSCTL", but it can be hit by
> > increasing values of tcp_wmem even earlier.
> 
> This makes no sense.
Bisected to this commit and reproduced before the commit only after:
echo "4096 16384 4194304" > /proc/sys/net/ipv4/tcp_wmem
by this max nr_frags raised from 8 to 17, when running 40 netperfs

i've decreased rx queue to 200, during the test


> > From: Dmitry Kravkov <dmitry@broadcom.com>
> > Subject: [PATCH net-next] bnx2x: reservation for NEXT tx BDs
> > 
> > Commit [4acb41903b2f99f3dffd4c3df9acc84ca5942cb2]
> > net/tcp: Fix tcp memory limits initialization when !CONFIG_SYSCTL
> > provided new default value for tcp_wmem, since heavy tcp
> > traffic may cause the TSO packet to consume 20 BDs + 1 for next page
> > descriptor.
> 
> This is completely bogus. I have no idea how you came to this.
> 
> A forwarding workload can trigger same bug, if GRO is enabled.
> 
> Remove this wrong bit, please ?
> 
> 
> 

^ permalink raw reply

* RE: [PATCH] l2tp: synchronise u64 stats writer callsites
From: David Laight @ 2012-06-21 15:18 UTC (permalink / raw)
  To: Tom Parkin, netdev; +Cc: James Chapman
In-Reply-To: <1340291054-16077-1-git-send-email-tparkin@katalix.com>

 
> Subject: [PATCH] l2tp: synchronise u64 stats writer callsites
> 
> Fix statistics update race in l2tp_core.  As described in
> include/linux/u64_stats_sync.h, it is necessary for the writers of
> u64 statistics to ensure mutual exclusion to the seqcount for
> statistics synchronisation.  Failure to do so may result in a
> missed seqcount update, leaving readers blocking forever.
...
> +			spin_lock_bh(&sstats->writelock);
>  			u64_stats_update_begin(&sstats->syncp);
>  			sstats->rx_oos_packets++;
>  			u64_stats_update_end(&sstats->syncp);
> +			spin_unlock_bh(&sstats->writelock);

The purpose of the u64_stats_update_begin/end is to
perform lockless writes of the stats.
If you need to lock them (because multiple threads can
be writing to stats covered by the same 'syncp' at the
same time) then the reader might as well use the same lock.

Otherwise split the 'syncp' such that only one update
can be happening (for each sync).

	David
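
For reference, the lockless scheme can be sketched in userspace with C11
atomics. This is not the kernel's u64_stats_sync code: the mock_* names are
hypothetical, the 64-bit counter is split into two 32-bit halves to mimic a
32-bit machine, and exactly one writer per sequence counter is assumed.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

struct mock_stats {
	atomic_uint seq;	/* even: stable, odd: update in progress */
	_Atomic uint32_t lo;	/* low half of the 64-bit counter */
	_Atomic uint32_t hi;	/* high half of the 64-bit counter */
};

/* Writer side: takes no lock, but must be the only writer for this seq. */
static void mock_stats_add(struct mock_stats *s, uint64_t delta)
{
	unsigned int seq = atomic_load_explicit(&s->seq, memory_order_relaxed);
	uint64_t v;

	assert((seq & 1u) == 0);	/* a second writer would break this */
	atomic_store_explicit(&s->seq, seq + 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_release);

	v = ((uint64_t)atomic_load_explicit(&s->hi, memory_order_relaxed) << 32) |
	    atomic_load_explicit(&s->lo, memory_order_relaxed);
	v += delta;
	atomic_store_explicit(&s->lo, (uint32_t)v, memory_order_relaxed);
	atomic_store_explicit(&s->hi, (uint32_t)(v >> 32), memory_order_relaxed);

	atomic_store_explicit(&s->seq, seq + 2, memory_order_release);
}

/* Reader side: retries until it sees the same even sequence twice. */
static uint64_t mock_stats_read(struct mock_stats *s)
{
	unsigned int start;
	uint32_t lo, hi;

	do {
		start = atomic_load_explicit(&s->seq, memory_order_acquire);
		lo = atomic_load_explicit(&s->lo, memory_order_relaxed);
		hi = atomic_load_explicit(&s->hi, memory_order_relaxed);
		atomic_thread_fence(memory_order_acquire);
	} while ((start & 1u) ||
		 start != atomic_load_explicit(&s->seq, memory_order_relaxed));

	return ((uint64_t)hi << 32) | lo;
}
```

The reader never blocks the writer; it only loops while an update is in
flight, which is why a sequence counter left at an odd value is fatal for it.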

^ permalink raw reply

* Re: [PATCH] bnx2x: fix panic when TX ring is full
From: Eric Dumazet @ 2012-06-21 15:12 UTC (permalink / raw)
  To: Dmitry Kravkov
  Cc: Tomas Hruby, David Miller, netdev@vger.kernel.org,
	therbert@google.com, evansr@google.com, Eilon Greenstein,
	Merav Sicron, Yaniv Rosner, willemb@google.com
In-Reply-To: <1340281166.15484.16.camel@lb-tlvb-dmitry>

On Thu, 2012-06-21 at 15:19 +0300, Dmitry Kravkov wrote:

> The crash happens with default configuration since
> [4acb41903b2f99f3dffd4c3df9acc84ca5942cb2] "net/tcp: Fix tcp memory
> limits initialization when !CONFIG_SYSCTL", but it can be hit by
> increasing values of tcp_wmem even earlier.

This makes no sense.

> From: Dmitry Kravkov <dmitry@broadcom.com>
> Subject: [PATCH net-next] bnx2x: reservation for NEXT tx BDs
> 
> Commit [4acb41903b2f99f3dffd4c3df9acc84ca5942cb2]
> net/tcp: Fix tcp memory limits initialization when !CONFIG_SYSCTL
> provided new default value for tcp_wmem, since heavy tcp
> traffic may cause the TSO packet to consume 20 BDs + 1 for next page
> descriptor.

This is completely bogus. I have no idea how you came to this.

A forwarding workload can trigger same bug, if GRO is enabled.

Remove this wrong bit, please ?

^ permalink raw reply

* [PATCH] l2tp: synchronise u64 stats writer callsites
From: Tom Parkin @ 2012-06-21 15:04 UTC (permalink / raw)
  To: netdev; +Cc: Tom Parkin, James Chapman

Fix statistics update race in l2tp_core.  As described in
include/linux/u64_stats_sync.h, it is necessary for the writers of
u64 statistics to ensure mutual exclusion to the seqcount for
statistics synchronisation.  Failure to do so may result in a
missed seqcount update, leaving readers blocking forever.

This race was discovered on an AMD64 SMP machine running a 32bit
kernel.  Running "ip l2tp" while sending data over an Ethernet
pseudowire resulted in an occasional soft lockup in
u64_stats_fetch_begin() called from l2tp_nl_session_send().

Statistics writers are now serialized via a spinlock in the stats
structure, preventing the seqcount update race.

Signed-off-by: Tom Parkin <tparkin@katalix.com>
Signed-off-by: James Chapman <jchapman@katalix.com>
---
 net/l2tp/l2tp_core.c |   34 +++++++++++++++++++++++++++++++---
 net/l2tp/l2tp_core.h |    5 +++++
 2 files changed, 36 insertions(+), 3 deletions(-)

diff --git a/net/l2tp/l2tp_core.c b/net/l2tp/l2tp_core.c
index 32b2155..e9e7cb0 100644
--- a/net/l2tp/l2tp_core.c
+++ b/net/l2tp/l2tp_core.c
@@ -343,9 +343,11 @@ static void l2tp_recv_queue_skb(struct l2tp_session *session, struct sk_buff *sk
 				 "%s: pkt %hu, inserted before %hu, reorder_q len=%d\n",
 				 session->name, ns, L2TP_SKB_CB(skbp)->ns,
 				 skb_queue_len(&session->reorder_q));
+			spin_lock_bh(&sstats->writelock);
 			u64_stats_update_begin(&sstats->syncp);
 			sstats->rx_oos_packets++;
 			u64_stats_update_end(&sstats->syncp);
+			spin_unlock_bh(&sstats->writelock);
 			goto out;
 		}
 	}
@@ -370,15 +372,20 @@ static void l2tp_recv_dequeue_skb(struct l2tp_session *session, struct sk_buff *
 	skb_orphan(skb);
 
 	tstats = &tunnel->stats;
+	spin_lock_bh(&tstats->writelock);
 	u64_stats_update_begin(&tstats->syncp);
-	sstats = &session->stats;
-	u64_stats_update_begin(&sstats->syncp);
 	tstats->rx_packets++;
 	tstats->rx_bytes += length;
+	u64_stats_update_end(&tstats->syncp);
+	spin_unlock_bh(&tstats->writelock);
+
+	sstats = &session->stats;
+	spin_lock_bh(&sstats->writelock);
+	u64_stats_update_begin(&sstats->syncp);
 	sstats->rx_packets++;
 	sstats->rx_bytes += length;
-	u64_stats_update_end(&tstats->syncp);
 	u64_stats_update_end(&sstats->syncp);
+	spin_unlock_bh(&sstats->writelock);
 
 	if (L2TP_SKB_CB(skb)->has_seq) {
 		/* Bump our Nr */
@@ -420,10 +427,12 @@ start:
 	sstats = &session->stats;
 	skb_queue_walk_safe(&session->reorder_q, skb, tmp) {
 		if (time_after(jiffies, L2TP_SKB_CB(skb)->expires)) {
+			spin_lock_bh(&sstats->writelock);
 			u64_stats_update_begin(&sstats->syncp);
 			sstats->rx_seq_discards++;
 			sstats->rx_errors++;
 			u64_stats_update_end(&sstats->syncp);
+			spin_unlock_bh(&sstats->writelock);
 			l2tp_dbg(session, L2TP_MSG_SEQ,
 				 "%s: oos pkt %u len %d discarded (too old), waiting for %u, reorder_q_len=%d\n",
 				 session->name, L2TP_SKB_CB(skb)->ns,
@@ -599,9 +608,11 @@ void l2tp_recv_common(struct l2tp_session *session, struct sk_buff *skb,
 				  "%s: cookie mismatch (%u/%u). Discarding.\n",
 				  tunnel->name, tunnel->tunnel_id,
 				  session->session_id);
+			spin_lock_bh(&sstats->writelock);
 			u64_stats_update_begin(&sstats->syncp);
 			sstats->rx_cookie_discards++;
 			u64_stats_update_end(&sstats->syncp);
+			spin_unlock_bh(&sstats->writelock);
 			goto discard;
 		}
 		ptr += session->peer_cookie_len;
@@ -670,9 +681,11 @@ void l2tp_recv_common(struct l2tp_session *session, struct sk_buff *skb,
 			l2tp_warn(session, L2TP_MSG_SEQ,
 				  "%s: recv data has no seq numbers when required. Discarding.\n",
 				  session->name);
+			spin_lock_bh(&sstats->writelock);
 			u64_stats_update_begin(&sstats->syncp);
 			sstats->rx_seq_discards++;
 			u64_stats_update_end(&sstats->syncp);
+			spin_unlock_bh(&sstats->writelock);
 			goto discard;
 		}
 
@@ -691,9 +704,11 @@ void l2tp_recv_common(struct l2tp_session *session, struct sk_buff *skb,
 			l2tp_warn(session, L2TP_MSG_SEQ,
 				  "%s: recv data has no seq numbers when required. Discarding.\n",
 				  session->name);
+			spin_lock_bh(&sstats->writelock);
 			u64_stats_update_begin(&sstats->syncp);
 			sstats->rx_seq_discards++;
 			u64_stats_update_end(&sstats->syncp);
+			spin_unlock_bh(&sstats->writelock);
 			goto discard;
 		}
 	}
@@ -747,9 +762,11 @@ void l2tp_recv_common(struct l2tp_session *session, struct sk_buff *skb,
 			 * packets
 			 */
 			if (L2TP_SKB_CB(skb)->ns != session->nr) {
+				spin_lock_bh(&sstats->writelock);
 				u64_stats_update_begin(&sstats->syncp);
 				sstats->rx_seq_discards++;
 				u64_stats_update_end(&sstats->syncp);
+				spin_unlock_bh(&sstats->writelock);
 				l2tp_dbg(session, L2TP_MSG_SEQ,
 					 "%s: oos pkt %u len %d discarded, waiting for %u, reorder_q_len=%d\n",
 					 session->name, L2TP_SKB_CB(skb)->ns,
@@ -775,9 +792,11 @@ void l2tp_recv_common(struct l2tp_session *session, struct sk_buff *skb,
 	return;
 
 discard:
+	spin_lock_bh(&sstats->writelock);
 	u64_stats_update_begin(&sstats->syncp);
 	sstats->rx_errors++;
 	u64_stats_update_end(&sstats->syncp);
+	spin_unlock_bh(&sstats->writelock);
 	kfree_skb(skb);
 
 	if (session->deref)
@@ -892,9 +911,11 @@ discard_bad_csum:
 	LIMIT_NETDEBUG("%s: UDP: bad checksum\n", tunnel->name);
 	UDP_INC_STATS_USER(tunnel->l2tp_net, UDP_MIB_INERRORS, 0);
 	tstats = &tunnel->stats;
+	spin_lock_bh(&tstats->writelock);
 	u64_stats_update_begin(&tstats->syncp);
 	tstats->rx_errors++;
 	u64_stats_update_end(&tstats->syncp);
+	spin_unlock_bh(&tstats->writelock);
 	kfree_skb(skb);
 
 	return 0;
@@ -1051,8 +1072,10 @@ static int l2tp_xmit_core(struct l2tp_session *session, struct sk_buff *skb,
 
 	/* Update stats */
 	tstats = &tunnel->stats;
+	spin_lock_bh(&tstats->writelock);
 	u64_stats_update_begin(&tstats->syncp);
 	sstats = &session->stats;
+	spin_lock_bh(&sstats->writelock);
 	u64_stats_update_begin(&sstats->syncp);
 	if (error >= 0) {
 		tstats->tx_packets++;
@@ -1064,7 +1087,9 @@ static int l2tp_xmit_core(struct l2tp_session *session, struct sk_buff *skb,
 		sstats->tx_errors++;
 	}
 	u64_stats_update_end(&tstats->syncp);
+	spin_unlock_bh(&tstats->writelock);
 	u64_stats_update_end(&sstats->syncp);
+	spin_unlock_bh(&sstats->writelock);
 
 	return 0;
 }
@@ -1575,6 +1600,7 @@ int l2tp_tunnel_create(struct net *net, int fd, int version, u32 tunnel_id, u32
 	tunnel->magic = L2TP_TUNNEL_MAGIC;
 	sprintf(&tunnel->name[0], "tunl %u", tunnel_id);
 	rwlock_init(&tunnel->hlist_lock);
+	spin_lock_init(&tunnel->stats.writelock);
 
 	/* The net we belong to */
 	tunnel->l2tp_net = net;
@@ -1758,6 +1784,8 @@ struct l2tp_session *l2tp_session_create(int priv_size, struct l2tp_tunnel *tunn
 		INIT_HLIST_NODE(&session->hlist);
 		INIT_HLIST_NODE(&session->global_hlist);
 
+		spin_lock_init(&session->stats.writelock);
+
 		/* Inherit debug options from tunnel */
 		session->debug = tunnel->debug;
 
diff --git a/net/l2tp/l2tp_core.h b/net/l2tp/l2tp_core.h
index a38ec6c..80c4475 100644
--- a/net/l2tp/l2tp_core.h
+++ b/net/l2tp/l2tp_core.h
@@ -35,6 +35,10 @@ enum {
 
 struct sk_buff;
 
+/* Stats are synchronised via a synchronisation point for safe
+ * reader/writer access on 64 and 32 bit kernels.  Multi-threaded
+ * stats writers are serialized through a spinlock.
+ */
 struct l2tp_stats {
 	u64			tx_packets;
 	u64			tx_bytes;
@@ -46,6 +50,7 @@ struct l2tp_stats {
 	u64			rx_errors;
 	u64			rx_cookie_discards;
 	struct u64_stats_sync	syncp;
+	spinlock_t		writelock;
 };
 
 struct l2tp_tunnel;
-- 
1.7.9.5
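
The missed seqcount update described in the changelog can be replayed
deterministically in userspace. The sketch below assumes the sequence
increment is a plain load followed by a store, and walks through one bad
interleaving of two writers by hand; all mock_* names are hypothetical.

```c
#include <assert.h>

/* One writer's view of a non-atomic "seq++", split into its two halves. */
struct mock_writer {
	unsigned int loaded;
};

static void mock_seq_load(struct mock_writer *w, const unsigned int *seq)
{
	w->loaded = *seq;
}

static void mock_seq_store_inc(const struct mock_writer *w, unsigned int *seq)
{
	*seq = w->loaded + 1;
}

/* Two writers enter the update section at the same time. */
static unsigned int mock_racy_interleaving(void)
{
	unsigned int seq = 0;
	struct mock_writer a, b;

	/* Both "begin" calls load the old value before either stores. */
	mock_seq_load(&a, &seq);
	mock_seq_load(&b, &seq);
	mock_seq_store_inc(&a, &seq);
	mock_seq_store_inc(&b, &seq);
	assert(seq == 1);	/* one increment was lost */

	/* Both "end" calls then run one after the other. */
	mock_seq_load(&a, &seq);
	mock_seq_store_inc(&a, &seq);
	mock_seq_load(&b, &seq);
	mock_seq_store_inc(&b, &seq);
	return seq;
}

/* The same four increments with the writers serialized by a lock. */
static unsigned int mock_serialized(void)
{
	unsigned int seq = 0;
	struct mock_writer w;
	int i;

	for (i = 0; i < 4; i++) {
		mock_seq_load(&w, &seq);
		mock_seq_store_inc(&w, &seq);
	}
	return seq;
}
```

In the racy case the counter ends on an odd value with no writer active, so
a reader waiting for an even sequence would spin indefinitely.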

^ permalink raw reply related

* smsc95xx: download ok, upload hangs
From: Émeric Vigier @ 2012-06-21 14:34 UTC (permalink / raw)
  To: netdev kernel; +Cc: steve.glendinning, Jérôme Oufella
In-Reply-To: <674701881.595779.1340222798864.JavaMail.root@mail.savoirfairelinux.com>

Hi,

I am experiencing an Ethernet issue with a PandaBoard A3 (OMAP4430 rev 2.2) featuring the SMSC LAN9514-JZX usbnet chipset.
My Panda runs android-4.0.4 with Linux kernel 3.0.8 (the latest from omapzoom).

Receiving Ethernet frames works fine, but transmitting them does not. The driver/chip seems stuck.
Moving the USB mouse (or pressing a key on the USB keyboard) unblocks it and transmission resumes for a second or two. Then Ethernet transmission gets stuck again.

Recently, I cherry-picked a dozen usbnet and smsc95xx patches, and managed to get a watchdog barking (see test 21 below).

Unfortunately I have no JTAG probe, so I am limited to driver tweaks and tryouts...
Here are the tests I have performed so far, along with a todo list:

FAILED means that the issue came up.
PASSED means that the issue has not come up.
DONE, NOT DONE, ONGOING are more related to a todo list than a test report.

1. check with constant cpu load (stress -c 2) - FAILED
2. check if problem occurs on older releases (non ICS) - NOT DONE
3. try CONFIG_PL310_ERRATA_769419 patch in cpuidle - FAILED
4. check without USB hub connected - FAILED
5. check with usbcore.autosuspend=600 added to cmdline - FAILED
6. patch ehci-omap.c to verify clock frequency - NOT DONE
7. check with CPU1 offlined - FAILED
8. check ethtool on android - FAILED
Cannot get register dump: Operation not supported on transport endpoint
9. check without USB_EHCI_TT_NEWSCHED - FAILED

10. try to unbind, rebind smsc95xx - FAILED
11. disable turbo_mode and reset the chip - FAILED
12. test with "CONFIG_NO_HZ is not set" - FAILED
13. test with another external USB ethernet dongle - NOT DONE
14. test linaro-12.05 ICS release and see ethernet behavior - PASSED
Ethernet runs fine on release:
. 12.05 tracking - PASSED
. 11.10 tracking - PASSED
. 11.09 release - PASSED
15. try with "netcfg eth0 dhcp" - FAILED
16. check datasheet - ONGOING
registers description is missing on 9514.pdf, only eeprom is described
17. adapt driver to ethtool - DONE
18. dump registers and check against linaro 11.09 - ONGOING
19. ethtool returns heaps of "0", the pattern I added to the array is all replaced by "0"...
Actually the eeprom is blank. I found it out since each time I unbind/bind the device to smsc95xx driver, I get a random MAC address...

20. test with 11.09 linaro kernel - NOT DONE
zygote not starting
21. uploading 24MB file on the web (http://dl.free.fr) - FAILED
This occurred only with these patches added to my kernel:
 From 8a78335 [PATCH] usbnet: consider device busy at each recieved packet
 From 5d5440a [PATCH] usbnet: don't clear urb->dev in tx_complete
 From 4231d47 [PATCH] net/usbnet: avoid recursive locking in usbnet_stop()
 From 1aa9bc5 [PATCH] usbnet: use netif_tx_wake_queue instead of netif_start_queue
 From 7bdd402 [PATCH] net/usbnet: reserve headroom on rx skbs
 From 0956a8c [PATCH] usbnet: increase URB reference count before usb_unlink_urb
 From 9bbf566 [PATCH] net: usb: smsc95xx: fix mtu
 From 720f3d7 [PATCH] usbnet: fix leak of transfer buffer of dev->interrupt
 From a472384 [PATCH] usbnet: fix failure handling in usbnet_probe
 From 5b6e9bc [PATCH] usbnet: fix skb traversing races during unlink(v2)
 From 07d69d4 [PATCH] smsc95xx: mark link down on startup and let PHY interrupt
a timeout occurred:
http://pastebin.com/KpaTJY3N

My current kernel is based on:
commit 52f476403350050beb0dff135a55c06c9e7a82a9
Author: Jean-Baptiste Queru <jbq@google.com>
Subject: Revert "gpu: pvr: Revert to 1.8@550175"

I managed to get a register and PHY dump when upload is stuck, thanks to ethtool:

000:     01 00 00 ec 00 00 00 00 00 00 00 00 00 00 00 00
010:     04 00 00 00 00 14 00 00 00 00 00 00 00 20 00 00
020:     81 00 00 00 00 00 11 01 1f 00 00 1f a0 30 f8 00
030:     00 00 00 00 00 00 00 00 00 00 00 00 03 00 00 00
040:     00 00 00 80 00 00 00 00 00 00 00 00 00 00 00 00
050:     00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
060:     00 00 00 00 00 00 00 00 00 80 00 00 00 20 00 00
070:     00 00 00 00 83 0f 83 0f 00 00 00 00 0f 06 0f 06
080:     00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
090:     00 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00
0a0:     00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0b0:     00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0c0:     00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0d0:     00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0e0:     00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0f0:     00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
100:     0c 20 10 00 f7 6f 00 00 a2 7d 13 dd 00 00 00 40
110:     20 00 00 80 40 09 00 00 e1 c1 00 00 00 00 00 00
120:     00 81 00 00 ff ff 00 00 00 00 00 00 00 00 00 00
130:     00 31 00 00 2d 78 00 00 07 00 00 00 c3 c0 00 00
140:     e1 0d 00 00 e1 c1 00 00 0b 00 00 00 ff ff 00 00
150:     ff ff 00 00 ff ff 00 00 ff ff 00 00 ff ff 00 00
160:     ff ff 00 00 ff ff 00 00 ff ff 00 00 00 00 00 00
170:     40 00 00 00 02 00 00 00 e1 00 00 00 ff ff 00 00
180:     00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
190:     ff ff 00 00 ff ff 00 00 00 00 00 00 0a 00 00 00
1a0:     00 00 00 00 c8 00 00 00 50 00 00 00 58 10 00 00

However, the 9514.pdf datasheet I have lacks the register descriptions,
so decoding all this is quite troublesome.
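
To make the dump above easier to compare against another release, each line can at least be regrouped into 32-bit words. This is only a minimal sketch: the helper name decode_dump_line is hypothetical, and it assumes that every register is 32 bits wide and that the bytes are printed in little-endian order, which the datasheet at hand does not confirm.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Parse one line of the dump, e.g.
 *   "100:     0c 20 10 00 f7 6f 00 00 a2 7d 13 dd 00 00 00 40"
 * into its offset and up to four 32-bit words.
 *
 * ASSUMPTION: registers are 32 bits wide and dumped little-endian.
 * Returns the number of complete words decoded, or -1 if the line
 * is malformed.
 */
int decode_dump_line(const char *line, unsigned int *offset, uint32_t words[4])
{
	unsigned int b[16];
	int n, i;

	assert(line && offset && words);

	/* One hex offset followed by up to 16 hex bytes. */
	n = sscanf(line,
		   "%x: %x %x %x %x %x %x %x %x %x %x %x %x %x %x %x %x",
		   offset,
		   &b[0], &b[1], &b[2], &b[3], &b[4], &b[5], &b[6], &b[7],
		   &b[8], &b[9], &b[10], &b[11], &b[12], &b[13], &b[14], &b[15]);
	if (n < 1)
		return -1;

	n -= 1; /* number of bytes actually parsed */
	for (i = 0; i < n; i++)
		if (b[i] > 0xff)
			return -1;

	/* Combine each group of four bytes, least significant byte first. */
	for (i = 0; i + 4 <= n; i += 4)
		words[i / 4] = (uint32_t)b[i] |
			       (uint32_t)b[i + 1] << 8 |
			       (uint32_t)b[i + 2] << 16 |
			       (uint32_t)b[i + 3] << 24;

	return n / 4;
}
```

Under that assumption, the line at offset 0x100 decodes to the words 0x0010200c, 0x00006ff7, 0xdd137da2 and 0x40000000.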

I saw that an Ubuntu release had trouble with this chipset and ACPI, but there is no ACPI on Android AFAIK.
Has anyone else experienced this issue?
Does anyone have an idea where it could come from?

Thanks a lot for your kind support,
Emeric

^ permalink raw reply

* PTP + H/W time stamping + CONFIG_PREEMPT_RT_FULL == Oops in slab
From: satpal parmar @ 2012-06-21 14:39 UTC (permalink / raw)
  To: netdev

Hi all!

Recently I successfully integrated an open source implementation of
the PTP protocol (http://sourceforge.net/scm/?type=svn&group_id=503885)
with Linux 3.0.1 running on an ARM-based SoC. Since the PHY on the
board (Phyter DP83640) supports H/W time stamping, I am using it for
that purpose. The integration worked smoothly until I applied the RT
patch and enabled full preemption (CONFIG_PREEMPT_RT_FULL).

After applying the RT patch and enabling full preemption, I started
getting an oops in the slab allocator. I did some googling but got
mixed results: some articles say slab is not stable with the RT patch,
while the code comments in the slab allocator say it is preemption
safe. Since I get this oops only while running the PTP client, I am
not certain that it is a slab corruption issue (network traffic
otherwise runs smoothly). I did a little debugging and found that if I
disable H/W time stamping on TX by commenting out skb_tx_timestamp in
the network driver's ndo_start_xmit function, I do not get any oops.
Moreover, if I run ptp as a high-priority RT thread, I again do not
get any oops. Now I am a little confused, as there are too many black
boxes to debug. With the RT patch, slab, the network driver and the
PTP client all involved, I am not sure which way to go. I am wondering
whether anyone has tried what I am doing and come across similar
issues, or whether anyone well versed in any of these subsystems can
help me eliminate some of the possibilities.

I am not sure if this is the right mailing list. If anyone knows of a
better forum, please direct me to it. I appreciate your patience and time.

-Satpal

Logs:

1. With CONFIG_PREEMPT_RT_FULL=y

Starting kernel ...

Uncompressing Linux... done, booting the kernel.
Linux version 3.0.1-rt11-svn7595 (satpal@BTS-Server) (gcc version
4.3.3 (Sourcery G++ Lite 2009q1-203) ) #29 PREEMPT RT Thu Jun 21
19:47:06 IST 2012
CPU: ARMv7 Processor [413fc082] revision 2 (ARMv7), cr=10c53c7f
CPU: VIPT nonaliasing data cache, VIPT aliasing instruction cache
Machine: picodmb
Truncating RAM at 80000000-bfffffff to -afffffff (vmalloc region overlap).
reserved size = 0 at 0
Memory policy: ECC disabled, Data cache writeback
OMAP chip is TI8148
Built 1 zonelists in Zone order, mobility grouping on.  Total pages: 195072
Kernel command line: console=ttyO0,115200n8 mem=1024MB root=/dev/ram
rw initrd=0x82000000,20MB ramdisk_size=131072 earlyprintk
mtdparts=physmap-flash.0:512k@0(bdp0)ro,512k(bdp1)ro,128k(env0)ro,128k(env1)ro,2M(Kernel0),2M(Kernel1),10M(rootfs0),10M(rootfs1),10M(logs),10M(rw-fs),512k(Factory)ro,128k(post),1M(misc),-(FreeNOR)
run_app=no
PID hash table entries: 4096 (order: 2, 16384 bytes)
Dentry cache hash table entries: 131072 (order: 7, 524288 bytes)
Inode-cache hash table entries: 65536 (order: 6, 262144 bytes)
Memory: 768MB = 768MB total
Memory: 754044k/754044k available, 32388k reserved, 0K highmem
Virtual kernel memory layout:
    vector  : 0xffff0000 - 0xffff1000   (   4 kB)
    fixmap  : 0xfff00000 - 0xfffe0000   ( 896 kB)
    DMA     : 0xffc00000 - 0xffe00000   (   2 MB)
    vmalloc : 0xf0800000 - 0xf8000000   ( 120 MB)
    lowmem  : 0xc0000000 - 0xf0000000   ( 768 MB)
    modules : 0xbf000000 - 0xc0000000   (  16 MB)
      .init : 0xc0008000 - 0xc0032000   ( 168 kB)
      .text : 0xc0032000 - 0xc045f000   (4276 kB)
      .data : 0xc0460000 - 0xc04984a0   ( 226 kB)
       .bss : 0xc04984c4 - 0xc04cb470   ( 204 kB)
Preemptible hierarchical RCU implementation.
        Verbose stalled-CPUs detection is disabled.
NR_IRQS:375
IRQ: Found an INTC at 0xfa200000 (revision 5.0) with 128 interrupts
Total of 128 interrupts on 1 active controller
OMAP clockevent source: GPTIMER1 at 20000000 Hz
Console: colour dummy device 80x30
Calibrating delay loop... 597.60 BogoMIPS (lpj=2988032)
pid_max: default: 32768 minimum: 301
Security Framework initialized
Mount-cache hash table entries: 512
CPU: Testing write buffer coherency: ok
devtmpfs: initialized
omap_hwmod: _populate_mpu_rt_base found no _mpu_rt_va for l3_slow
omap_hwmod: _populate_mpu_rt_base found no _mpu_rt_va for l4_slow
print_constraints: dummy:
NET: Registered protocol family 16
GPMC revision 6.0
Trying to set irq flags for IRQ375
ahci : Failed to get AHCI clock
Registered ti81xx_fb device
NSS Crypto DMA hardware revision 1.9 @ IRQ 116
bio: create slab <bio-0> at 0
omap_i2c omap_i2c.1: bus 1 rev4.0 at 100 kHz
pps_core: LinuxPPS API ver. 1 registered
pps_core: Software ver. 5.3.6 - Copyright 2005-2007 Rodolfo Giometti
<giometti@linux.it>
PTP clock support registered
Switching to clocksource gp timer
NET: Registered protocol family 2
IP route cache hash table entries: 32768 (order: 5, 131072 bytes)
TCP established hash table entries: 131072 (order: 8, 1048576 bytes)
TCP bind hash table entries: 65536 (order: 8, 1572864 bytes)
TCP: Hash tables configured (established 131072 bind 65536)
TCP reno registered
UDP hash table entries: 512 (order: 3, 32768 bytes)
UDP-Lite hash table entries: 512 (order: 3, 32768 bytes)
NET: Registered protocol family 1
RPC: Registered named UNIX socket transport module.
RPC: Registered udp transport module.
RPC: Registered tcp transport module.
RPC: Registered tcp NFSv4.1 backchannel transport module.
Trying to unpack rootfs image as initramfs...
rootfs image is not initramfs (no cpio magic); looks like an initrd
Freeing initrd memory: 20480K
NetWinder Floating Point Emulator V0.97 (double precision)
omap-iommu omap-iommu.0: ducati registered
omap-iommu omap-iommu.1: sys registered
JFFS2 version 2.2. (NAND) © 2001-2006 Red Hat, Inc.
msgmni has been set to 1512
Block layer SCSI generic (bsg) driver version 0.4 loaded (major 252)
io scheduler noop registered
io scheduler deadline registered
io scheduler cfq registered (default)
Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled
omap_uart.0: ttyO0 at MMIO 0x48020000 (irq = 72) is a OMAP UART0
console [ttyO0] enabled
omap_uart.1: ttyO1 at MMIO 0x48022000 (irq = 73) is a OMAP UART1
omap_uart.2: ttyO2 at MMIO 0x48024000 (irq = 74) is a OMAP UART2
omap_uart.3: ttyO3 at MMIO 0x481a6000 (irq = 44) is a OMAP UART3
omap_uart.4: ttyO4 at MMIO 0x481a8000 (irq = 45) is a OMAP UART4
omap_uart.5: ttyO5 at MMIO 0x481aa000 (irq = 46) is a OMAP UART5
brd: module loaded
loop: module loaded
physmap platform flash device: 04000000 at 08000000
physmap-flash.0: Found 1 x16 devices at 0x0 in 16-bit bank.
Manufacturer ID 0x000089 Chip ID 0x008965
Intel/Sharp Extended Query Table at 0x010A
Intel/Sharp Extended Query Table at 0x010A
Intel/Sharp Extended Query Table at 0x010A
Intel/Sharp Extended Query Table at 0x010A
Intel/Sharp Extended Query Table at 0x010A
Using buffer write method
Using auto-unlock on power-up/resume
cfi_cmdset_0001: Erase suspend on write enabled
14 cmdlinepart partitions found on MTD device physmap-flash.0
Creating 14 MTD partitions on "physmap-flash.0":
0x000000000000-0x000000080000 : "bdp0"
0x000000080000-0x000000100000 : "bdp1"
0x000000100000-0x000000120000 : "env0"
0x000000120000-0x000000140000 : "env1"
0x000000140000-0x000000340000 : "Kernel0"
0x000000340000-0x000000540000 : "Kernel1"
0x000000540000-0x000000f40000 : "rootfs0"
0x000000f40000-0x000001940000 : "rootfs1"
0x000001940000-0x000002340000 : "logs"
0x000002340000-0x000002d40000 : "rw-fs"
0x000002d40000-0x000002dc0000 : "Factory"
0x000002dc0000-0x000002de0000 : "post"
0x000002de0000-0x000002ee0000 : "misc"
0x000002ee0000-0x000004000000 : "FreeNOR"
Generic platform RAM MTD, (c) 2004 Simtec Electronics
davinci_mdio davinci_mdio.0: davinci mdio revision 1.6
davinci_mdio davinci_mdio.0: detected phy mask fffffffd
davinci_mdio.0: probed
davinci_mdio davinci_mdio.0: phy[1]: device 0:01, driver NatSemi DP83640
mousedev: PS/2 mouse device common for all mice
lm73 1-0048: hwmon0: sensor 'lm73'
lm73 1-0049: hwmon1: sensor 'lm73'
omap_i2c omap_i2c.1: controller timed out
omap_i2c omap_i2c.1: controller timed out
OMAP Watchdog Timer Rev 0x00: initial timeout 60 sec
nss_aes_mod_init: loading NSS AES driver
nss-aes nss-aes: NSS AES hw accel rev: 3.2 (context 0 @0x41140000)
nss-aes nss-aes: NSS AES hw accel rev: 3.2 (context 1 @0x41141000)
nss-aes nss-aes: NSS AES hw accel rev: 3.2 (context 2 @0x411a0000)
nss-aes nss-aes: NSS AES hw accel rev: 3.2 (context 3 @0x411a1000)
nss_aes_probe: probe() done
nss_des_mod_init: loading NSS DES driver
nss-des nss-des: NSS DES hw accel rev: 2.2 (context 0 @0x41160000)
nss-des nss-des: NSS DES hw accel rev: 2.2 (context 1 @0x41161000)
nss_des_probe: probe() done
nss_sham_mod_init: loading NSS SHA/MD5 driver
nss-sham nss-sham: NSS SHA/MD5 hw accel rev: 4.03 (context 0 @0x41100000)
nss-sham nss-sham: NSS SHA/MD5 hw accel rev: 4.03 (context 1 @0x41101000)
nss-sham nss-sham: NSS SHA/MD5 hw accel rev: 4.03 (context 2 @0x411c0000)
nss-sham nss-sham: NSS SHA/MD5 hw accel rev: 4.03 (context 3 @0x411c1000)
nss_sham_probe: probe() done
dsp_hpi: initialized
fpga: initialized
TCP cubic registered
Initializing XFRM netlink socket
NET: Registered protocol family 10
NET: Registered protocol family 17
NET: Registered protocol family 15
sctp: Hash tables configured (established 32768 bind 43690)
Registering the dns_resolver key type
VFP support v0.3: implementor 41 architecture 3 part 30 variant c rev 3
clock: disabling unused clocks to save power
Detected MACID=3:2:1:2:e0:2

System initialization...

Mounting /sys          : [OK]
Populating /dev        : [OK]
Mounting fstab         : [OK]
Mounting devpts        : [OK]
Starting unconfigured  : [OK]
Starting Net           : ## Error: "ue" not defined
[OK]
Starting Logging       : [OK]
Starting telnet daemon : [OK]
Starting setup.sh      : [OK]
Executing setup        : [OK]
---- Warning application (/root/lm_cp_pico.elf) not started -----
Starting ptpd: [OK]

Please press Enter to activate this console. env variable run_app=no,
so supervisor will "free run", (to keep init happy)
RAMDISK: gzip image found at block 0
VFS: Mounted root (ext2 filesystem) on device 1:0.
devtmpfs: mounted
Freeing init memory: 168K

CPSW phy found : id is : 0x20005ce1
**************phy_start****************
c2cc0e00:phy_start:phydev->state:4

CPSW phyless for slave=1
ADDRCONF(NETDEV_UP): eth0: link is not ready
c2cc0e00:phy_start_aneg:phydev->state:5
PHY: 0:01 - Link is Up - 100/Full
link up on port 1, speed 1000, full duplex
ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
bug:slab names_cache, slabp->inuse= -1, cachep->num:1
kernel BUG at /home/satpal/sandbox/console_p/trunk/bts/source/vendor/linux/mm/slab.c:3205!
Unable to handle kernel NULL pointer dereference at virtual address 00000000
pgd = e9e08000
[00000000] *pgd=ae31c831, *pte=00000000, *ppte=00000000
Internal error: Oops: 817 [#1] PREEMPT
Modules linked in:
CPU: 0    Not tainted  (3.0.1-rt11-svn7595 #29)
PC is at __bug+0x20/0x2c
LR is at vprintk+0x18c/0x470
pc : [<c0041ee0>]    lr : [<c005eaa0>]    psr: 60000013
sp : ef89fea8  ip : 60000013  fp : ef89feb4
r10: ef8009c0  r9 : ef811f20  r8 : ef807e00
r7 : 0000000c  r6 : 000000d0  r5 : ffffffff  r4 : c2cf92c0
r3 : 00000000  r2 : 00000000  r1 : ef89e000  r0 : 00000061
Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
Control: 10c5387d  Table: a9e08019  DAC: 00000015
Process ptp-main (pid: 136, stack limit = 0xef89e2e8)
Stack: (0xef89fea8 to 0xef8a0000)
fea0:                   ef89ff04 ef89feb8 c00c3fd4 c0041ecc c04672a0 00000010
fec0: 000000d0 ef89e000 000000d0 ef811f44 ef811f30 ef811f28 ef89e000 00000000
fee0: ef89e000 c0484cdc ef8009c0 000000d0 ef89e000 00000000 ef89ff2c ef89ff08
ff00: c00c4628 c00c3d60 00120dc4 40b73490 00000002 00000000 c0462284 ef89e000
ff20: ef89ff54 ef89ff30 c00d2468 c00c4554 00000001 40b73490 00000002 ffffff9c
ff40: c003ea28 ef89e000 ef89ff64 ef89ff58 c00d2548 c00d2450 ef89ff94 ef89ff68
ff60: c00c5780 c00d2540 00000002 00000000 00000026 00000100 00000000 40b73490
ff80: 40b72fc0 00000005 ef89ffa4 ef89ff98 c00c5840 c00c56d8 00000000 ef89ffa8
ffa0: c003e880 c00c5828 00000000 40b73490 00120dc4 00000002 00fc0a28 40b72da4
ffc0: 00000000 40b73490 40b72fc0 00000005 003d0f00 00000000 4007e134 40b72d94
ffe0: 00000000 40b72d68 4007b0a7 4007c154 80000010 00120dc4 a87ff0f3 7d3f11fa
Backtrace:
[<c0041ec0>] (__bug+0x0/0x2c) from [<c00c3fd4>] (cache_alloc_refill+0x280/0x680)
[<c00c3d54>] (cache_alloc_refill+0x0/0x680) from [<c00c4628>]
(kmem_cache_alloc+0xe0/0x110)
[<c00c4548>] (kmem_cache_alloc+0x0/0x110) from [<c00d2468>]
(getname_flags+0x24/0xf0)
 r9:ef89e000 r8:c0462284 r7:00000000 r6:00000002 r5:40b73490
r4:00120dc4
[<c00d2444>] (getname_flags+0x0/0xf0) from [<c00d2548>] (getname+0x14/0x18)
 r9:ef89e000 r8:c003ea28 r7:ffffff9c r6:00000002 r5:40b73490
r4:00000001
[<c00d2534>] (getname+0x0/0x18) from [<c00c5780>] (do_sys_open+0xb4/0x13c)
[<c00c56cc>] (do_sys_open+0x0/0x13c) from [<c00c5840>] (sys_open+0x24/0x28)
 r7:00000005 r6:40b72fc0 r5:40b73490 r4:00000000
[<c00c581c>] (sys_open+0x0/0x28) from [<c003e880>] (ret_fast_syscall+0x0/0x30)
Code: e1a01000 e59f000c eb0cb576 e3a03000 (e5833000)

^ permalink raw reply

* [PATCH 13/13] netfilter: nf_conntrack_l4proto_icmpv6 cleanup
From: Gao feng @ 2012-06-21 14:36 UTC (permalink / raw)
  To: pablo; +Cc: netdev, netfilter-devel, Gao feng
In-Reply-To: <1340289410-17642-1-git-send-email-gaofeng@cn.fujitsu.com>

Add the function icmpv6_kmemdup_sysctl_table to make the
code clearer.

Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
---
 net/ipv6/netfilter/nf_conntrack_proto_icmpv6.c |   17 +++++++++++++----
 1 files changed, 13 insertions(+), 4 deletions(-)

diff --git a/net/ipv6/netfilter/nf_conntrack_proto_icmpv6.c b/net/ipv6/netfilter/nf_conntrack_proto_icmpv6.c
index 807ae09..9fc5cf5 100644
--- a/net/ipv6/netfilter/nf_conntrack_proto_icmpv6.c
+++ b/net/ipv6/netfilter/nf_conntrack_proto_icmpv6.c
@@ -333,22 +333,31 @@ static struct ctl_table icmpv6_sysctl_table[] = {
 };
 #endif /* CONFIG_SYSCTL */
 
-static int icmpv6_init_net(struct net *net, u_int16_t proto)
+static int icmpv6_kmemdup_sysctl_table(struct nf_proto_net *pn,
+				       struct nf_icmp_net *in)
 {
-	struct nf_icmp_net *in = icmpv6_pernet(net);
-	struct nf_proto_net *pn = (struct nf_proto_net *)in;
-	in->timeout = nf_ct_icmpv6_timeout;
 #ifdef CONFIG_SYSCTL
 	pn->ctl_table = kmemdup(icmpv6_sysctl_table,
 				sizeof(icmpv6_sysctl_table),
 				GFP_KERNEL);
 	if (!pn->ctl_table)
 		return -ENOMEM;
+
 	pn->ctl_table[0].data = &in->timeout;
 #endif
 	return 0;
 }
 
+static int icmpv6_init_net(struct net *net, u_int16_t proto)
+{
+	struct nf_icmp_net *in = icmpv6_pernet(net);
+	struct nf_proto_net *pn = &in->pn;
+
+	in->timeout = nf_ct_icmpv6_timeout;
+
+	return icmpv6_kmemdup_sysctl_table(pn, in);
+}
+
 struct nf_conntrack_l4proto nf_conntrack_l4proto_icmpv6 __read_mostly =
 {
 	.l3proto		= PF_INET6,
-- 
1.7.7.6
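
The change from the cast "(struct nf_proto_net *)in" to "&in->pn" in the patch above can be illustrated with a small userspace sketch. The cast is only correct while pn is the first member of the per-protocol structure, whereas taking the member address stays correct if the layout changes. The structures below are simplified stand-ins, not the kernel definitions, and nf_icmp_net_reordered is a purely hypothetical layout used for contrast.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for the kernel structure (assumption). */
struct nf_proto_net {
	void *ctl_table;
	unsigned int users;
};

/* Layout in which pn comes first, as the old cast relied on. */
struct nf_icmp_net {
	struct nf_proto_net pn;
	unsigned int timeout;
};

/* Hypothetical layout in which pn is no longer the first member. */
struct nf_icmp_net_reordered {
	unsigned int timeout;
	struct nf_proto_net pn;
};

/* Returns 1 if casting the outer pointer would yield &outer->pn. */
int cast_matches_member(void)
{
	return offsetof(struct nf_icmp_net, pn) == 0;
}

/* Same check for the reordered layout; the cast would be wrong here. */
int cast_matches_member_reordered(void)
{
	return offsetof(struct nf_icmp_net_reordered, pn) == 0;
}

/* The form used by the patch: correct regardless of member order. */
struct nf_proto_net *icmp_proto_net(struct nf_icmp_net *in)
{
	assert(in != NULL);
	return &in->pn;
}
```

With the first layout both forms give the same pointer; with the reordered one only the member-address form would still point at pn.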


^ permalink raw reply related

* [PATCH 12/13] netfilter: nf_conntrack_l4proto_icmp cleanup
From: Gao feng @ 2012-06-21 14:36 UTC (permalink / raw)
  To: pablo; +Cc: netdev, netfilter-devel, Gao feng
In-Reply-To: <1340289410-17642-1-git-send-email-gaofeng@cn.fujitsu.com>

Add two functions, icmp_kmemdup_sysctl_table and
icmp_kmemdup_compat_sysctl_table, to make the code
clearer.

Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
---
 net/ipv4/netfilter/nf_conntrack_proto_icmp.c |   41 ++++++++++++++++++++------
 1 files changed, 32 insertions(+), 9 deletions(-)

diff --git a/net/ipv4/netfilter/nf_conntrack_proto_icmp.c b/net/ipv4/netfilter/nf_conntrack_proto_icmp.c
index 76f7a2f..9c2095c 100644
--- a/net/ipv4/netfilter/nf_conntrack_proto_icmp.c
+++ b/net/ipv4/netfilter/nf_conntrack_proto_icmp.c
@@ -337,34 +337,57 @@ static struct ctl_table icmp_compat_sysctl_table[] = {
 #endif /* CONFIG_NF_CONNTRACK_PROC_COMPAT */
 #endif /* CONFIG_SYSCTL */
 
-static int icmp_init_net(struct net *net, u_int16_t proto)
+static int icmp_kmemdup_sysctl_table(struct nf_proto_net *pn,
+				     struct nf_icmp_net *in)
 {
-	struct nf_icmp_net *in = icmp_pernet(net);
-	struct nf_proto_net *pn = (struct nf_proto_net *)in;
-	in->timeout = nf_ct_icmp_timeout;
-
 #ifdef CONFIG_SYSCTL
 	pn->ctl_table = kmemdup(icmp_sysctl_table,
 				sizeof(icmp_sysctl_table),
 				GFP_KERNEL);
 	if (!pn->ctl_table)
 		return -ENOMEM;
+
 	pn->ctl_table[0].data = &in->timeout;
+#endif
+	return 0;
+}
+
+static int icmp_kmemdup_compat_sysctl_table(struct nf_proto_net *pn,
+					    struct nf_icmp_net *in)
+{
+#ifdef CONFIG_SYSCTL
 #ifdef CONFIG_NF_CONNTRACK_PROC_COMPAT
 	pn->ctl_compat_table = kmemdup(icmp_compat_sysctl_table,
 				       sizeof(icmp_compat_sysctl_table),
 				       GFP_KERNEL);
-	if (!pn->ctl_compat_table) {
-		kfree(pn->ctl_table);
-		pn->ctl_table = NULL;
+	if (!pn->ctl_compat_table)
 		return -ENOMEM;
-	}
+
 	pn->ctl_compat_table[0].data = &in->timeout;
 #endif
 #endif
 	return 0;
 }
 
+static int icmp_init_net(struct net *net, u_int16_t proto)
+{
+	int ret;
+	struct nf_icmp_net *in = icmp_pernet(net);
+	struct nf_proto_net *pn = &in->pn;
+
+	in->timeout = nf_ct_icmp_timeout;
+
+	ret = icmp_kmemdup_compat_sysctl_table(pn, in);
+	if (ret < 0)
+		return ret;
+
+	ret = icmp_kmemdup_sysctl_table(pn, in);
+	if (ret < 0)
+		nf_ct_kfree_compat_sysctl_table(pn);
+
+	return ret;
+}
+
 struct nf_conntrack_l4proto nf_conntrack_l4proto_icmp __read_mostly =
 {
 	.l3proto		= PF_INET,
-- 
1.7.7.6


^ permalink raw reply related

* [PATCH 10/13] netfilter: nf_conntrack_l4proto_generic cleanup
From: Gao feng @ 2012-06-21 14:36 UTC (permalink / raw)
  To: pablo; +Cc: netdev, netfilter-devel, Gao feng
In-Reply-To: <1340289410-17642-1-git-send-email-gaofeng@cn.fujitsu.com>

Some cleanup of nf_conntrack_l4proto_generic:
split the code to make it clearer.

Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
---
 net/netfilter/nf_conntrack_proto_generic.c |   39 ++++++++++++++++++++++-----
 1 files changed, 31 insertions(+), 8 deletions(-)

diff --git a/net/netfilter/nf_conntrack_proto_generic.c b/net/netfilter/nf_conntrack_proto_generic.c
index d1ed7b4..7c11c54 100644
--- a/net/netfilter/nf_conntrack_proto_generic.c
+++ b/net/netfilter/nf_conntrack_proto_generic.c
@@ -135,34 +135,57 @@ static struct ctl_table generic_compat_sysctl_table[] = {
 #endif /* CONFIG_NF_CONNTRACK_PROC_COMPAT */
 #endif /* CONFIG_SYSCTL */
 
-static int generic_init_net(struct net *net, u_int16_t proto)
+static int generic_kmemdup_sysctl_table(struct nf_proto_net *pn,
+					struct nf_generic_net *gn)
 {
-	struct nf_generic_net *gn = generic_pernet(net);
-	struct nf_proto_net *pn = (struct nf_proto_net *)gn;
-	gn->timeout = nf_ct_generic_timeout;
 #ifdef CONFIG_SYSCTL
 	pn->ctl_table = kmemdup(generic_sysctl_table,
 				sizeof(generic_sysctl_table),
 				GFP_KERNEL);
 	if (!pn->ctl_table)
 		return -ENOMEM;
+
 	pn->ctl_table[0].data = &gn->timeout;
+#endif
+	return 0;
+}
 
+static int generic_kmemdup_compat_sysctl_table(struct nf_proto_net *pn,
+					       struct nf_generic_net *gn)
+{
+#ifdef CONFIG_SYSCTL
 #ifdef CONFIG_NF_CONNTRACK_PROC_COMPAT
 	pn->ctl_compat_table = kmemdup(generic_compat_sysctl_table,
 				       sizeof(generic_compat_sysctl_table),
 				       GFP_KERNEL);
-	if (!pn->ctl_compat_table) {
-		kfree(pn->ctl_table);
-		pn->ctl_table = NULL;
+	if (!pn->ctl_compat_table)
 		return -ENOMEM;
-	}
+
 	pn->ctl_compat_table[0].data = &gn->timeout;
 #endif
 #endif
 	return 0;
 }
 
+static int generic_init_net(struct net *net, u_int16_t proto)
+{
+	int ret;
+	struct nf_generic_net *gn = generic_pernet(net);
+	struct nf_proto_net *pn = &gn->pn;
+
+	gn->timeout = nf_ct_generic_timeout;
+
+	ret = generic_kmemdup_compat_sysctl_table(pn, gn);
+	if (ret < 0)
+		return ret;
+
+	ret = generic_kmemdup_sysctl_table(pn, gn);
+	if (ret < 0)
+		nf_ct_kfree_compat_sysctl_table(pn);
+
+	return ret;
+}
+
 struct nf_conntrack_l4proto nf_conntrack_l4proto_generic __read_mostly =
 {
 	.l3proto		= PF_UNSPEC,
-- 
1.7.7.6


^ permalink raw reply related

* [PATCH 08/13] netfilter: nf_conntrack_l4proto_udplite[4,6] cleanup
From: Gao feng @ 2012-06-21 14:36 UTC (permalink / raw)
  To: pablo; +Cc: netdev, netfilter-devel, Gao feng
In-Reply-To: <1340289410-17642-1-git-send-email-gaofeng@cn.fujitsu.com>

Some cleanup for nf_conntrack_l4proto_udplite[4,6]:
make the code clearer and ready for moving the
sysctl code to nf_conntrack_proto_*_sysctl.c to
reduce the ifdef pollution.

Also use nf_proto_net.users to identify whether this is the first
time we use the nf_proto_net. If it is the first time, we
initialize it.

Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
---
 net/netfilter/nf_conntrack_proto_udplite.c |   43 +++++++++++++++++-----------
 1 files changed, 26 insertions(+), 17 deletions(-)

diff --git a/net/netfilter/nf_conntrack_proto_udplite.c b/net/netfilter/nf_conntrack_proto_udplite.c
index d33e511..4b66df2 100644
--- a/net/netfilter/nf_conntrack_proto_udplite.c
+++ b/net/netfilter/nf_conntrack_proto_udplite.c
@@ -234,29 +234,38 @@ static struct ctl_table udplite_sysctl_table[] = {
 };
 #endif /* CONFIG_SYSCTL */
 
-static int udplite_init_net(struct net *net, u_int16_t proto)
+static int udplite_kmemdup_sysctl_table(struct nf_proto_net *pn,
+					struct udplite_net *un)
 {
-	int i;
-	struct udplite_net *un = udplite_pernet(net);
-	struct nf_proto_net *pn = (struct nf_proto_net *)un;
 #ifdef CONFIG_SYSCTL
-	if (!pn->ctl_table) {
-#else
-	if (!pn->users++) {
+	if (pn->ctl_table)
+		return 0;
+
+	pn->ctl_table = kmemdup(udplite_sysctl_table,
+				sizeof(udplite_sysctl_table),
+				GFP_KERNEL);
+	if (!pn->ctl_table)
+		return -ENOMEM;
+
+	pn->ctl_table[0].data = &un->timeouts[UDPLITE_CT_UNREPLIED];
+	pn->ctl_table[1].data = &un->timeouts[UDPLITE_CT_REPLIED];
 #endif
+	return 0;
+}
+
+static int udplite_init_net(struct net *net, u_int16_t proto)
+{
+	struct udplite_net *un = udplite_pernet(net);
+	struct nf_proto_net *pn = &un->pn;
+
+	if (!pn->users) {
+		int i;
+
 		for (i = 0 ; i < UDPLITE_CT_MAX; i++)
 			un->timeouts[i] = udplite_timeouts[i];
-#ifdef CONFIG_SYSCTL
-		pn->ctl_table = kmemdup(udplite_sysctl_table,
-					sizeof(udplite_sysctl_table),
-					GFP_KERNEL);
-		if (!pn->ctl_table)
-			return -ENOMEM;
-		pn->ctl_table[0].data = &un->timeouts[UDPLITE_CT_UNREPLIED];
-		pn->ctl_table[1].data = &un->timeouts[UDPLITE_CT_REPLIED];
-#endif
 	}
-	return 0;
+
+	return udplite_kmemdup_sysctl_table(pn, un);
 }
 
 static struct nf_conntrack_l4proto nf_conntrack_l4proto_udplite4 __read_mostly =
-- 
1.7.7.6


^ permalink raw reply related

* [PATCH 07/13] netfilter: merge udpv[4,6]_net_init into udp_net_init
From: Gao feng @ 2012-06-21 14:36 UTC (permalink / raw)
  To: pablo; +Cc: netdev, netfilter-devel, Gao feng
In-Reply-To: <1340289410-17642-1-git-send-email-gaofeng@cn.fujitsu.com>

Merge udpv4_init_net and udpv6_init_net into udp_init_net to
reduce the redundant code.

Also use nf_proto_net.users to identify whether this is the first
time we use the nf_proto_net. If it is the first time, we
initialize it.

Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
---
 net/netfilter/nf_conntrack_proto_udp.c |   65 +++++++++++--------------------
 1 files changed, 23 insertions(+), 42 deletions(-)

diff --git a/net/netfilter/nf_conntrack_proto_udp.c b/net/netfilter/nf_conntrack_proto_udp.c
index 2b978e6..e7e0434 100644
--- a/net/netfilter/nf_conntrack_proto_udp.c
+++ b/net/netfilter/nf_conntrack_proto_udp.c
@@ -235,10 +235,10 @@ static struct ctl_table udp_compat_sysctl_table[] = {
 #endif /* CONFIG_NF_CONNTRACK_PROC_COMPAT */
 #endif /* CONFIG_SYSCTL */
 
-static int udp_kmemdup_sysctl_table(struct nf_proto_net *pn)
+static int udp_kmemdup_sysctl_table(struct nf_proto_net *pn,
+				    struct nf_udp_net *un)
 {
 #ifdef CONFIG_SYSCTL
-	struct nf_udp_net *un = (struct nf_udp_net *)pn;
 	if (pn->ctl_table)
 		return 0;
 	pn->ctl_table = kmemdup(udp_sysctl_table,
@@ -252,11 +252,11 @@ static int udp_kmemdup_sysctl_table(struct nf_proto_net *pn)
 	return 0;
 }
 
-static int udp_kmemdup_compat_sysctl_table(struct nf_proto_net *pn)
+static int udp_kmemdup_compat_sysctl_table(struct nf_proto_net *pn,
+					   struct nf_udp_net *un)
 {
 #ifdef CONFIG_SYSCTL
 #ifdef CONFIG_NF_CONNTRACK_PROC_COMPAT
-	struct nf_udp_net *un = (struct nf_udp_net *)pn;
 	pn->ctl_compat_table = kmemdup(udp_compat_sysctl_table,
 				       sizeof(udp_compat_sysctl_table),
 				       GFP_KERNEL);
@@ -270,50 +270,31 @@ static int udp_kmemdup_compat_sysctl_table(struct nf_proto_net *pn)
 	return 0;
 }
 
-static void udp_init_net_data(struct nf_udp_net *un)
-{
-	int i;
-#ifdef CONFIG_SYSCTL
-	if (!un->pn.ctl_table) {
-#else
-	if (!un->pn.users++) {
-#endif
-		for (i = 0; i < UDP_CT_MAX; i++)
-			un->timeouts[i] = udp_timeouts[i];
-	}
-}
-
-static int udpv4_init_net(struct net *net, u_int16_t proto)
+static int udp_init_net(struct net *net, u_int16_t proto)
 {
 	int ret;
 	struct nf_udp_net *un = udp_pernet(net);
-	struct nf_proto_net *pn = (struct nf_proto_net *)un;
-
-	udp_init_net_data(un);
+	struct nf_proto_net *pn = &un->pn;
 
-	ret = udp_kmemdup_compat_sysctl_table(pn);
-	if (ret < 0)
-		return ret;
+	if (!pn->users) {
+		int i;
 
-	ret = udp_kmemdup_sysctl_table(pn);
-#ifdef CONFIG_SYSCTL
-#ifdef CONFIG_NF_CONNTRACK_PROC_COMPAT
-	if (ret < 0) {
-		kfree(pn->ctl_compat_table);
-		pn->ctl_compat_table = NULL;
+		for (i = 0; i < UDP_CT_MAX; i++)
+			un->timeouts[i] = udp_timeouts[i];
 	}
-#endif
-#endif
-	return ret;
-}
 
-static int udpv6_init_net(struct net *net, u_int16_t proto)
-{
-	struct nf_udp_net *un = udp_pernet(net);
-	struct nf_proto_net *pn = (struct nf_proto_net *)un;
+	if (proto == AF_INET) {
+		ret = udp_kmemdup_compat_sysctl_table(pn, un);
+		if (ret < 0)
+			return ret;
 
-	udp_init_net_data(un);
-	return udp_kmemdup_sysctl_table(pn);
+		ret = udp_kmemdup_sysctl_table(pn, un);
+		if (ret < 0)
+			nf_ct_kfree_compat_sysctl_table(pn);
+	} else
+		ret = udp_kmemdup_sysctl_table(pn, un);
+
+	return ret;
 }
 
 struct nf_conntrack_l4proto nf_conntrack_l4proto_udp4 __read_mostly =
@@ -343,7 +324,7 @@ struct nf_conntrack_l4proto nf_conntrack_l4proto_udp4 __read_mostly =
 		.nla_policy	= udp_timeout_nla_policy,
 	},
 #endif /* CONFIG_NF_CT_NETLINK_TIMEOUT */
-	.init_net		= udpv4_init_net,
+	.init_net		= udp_init_net,
 };
 EXPORT_SYMBOL_GPL(nf_conntrack_l4proto_udp4);
 
@@ -374,6 +355,6 @@ struct nf_conntrack_l4proto nf_conntrack_l4proto_udp6 __read_mostly =
 		.nla_policy	= udp_timeout_nla_policy,
 	},
 #endif /* CONFIG_NF_CT_NETLINK_TIMEOUT */
-	.init_net		= udpv6_init_net,
+	.init_net		= udp_init_net,
 };
 EXPORT_SYMBOL_GPL(nf_conntrack_l4proto_udp6);
-- 
1.7.7.6
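
The control flow of the merged udp_init_net above (initialize the timeouts only on first use, then duplicate the compat table before the main table and roll the compat table back if the second step fails) can be modelled in userspace roughly as follows. This is a sketch under stated assumptions, not the kernel code: dup_table stands in for kmemdup, the fail_* parameters exist only to simulate allocation failures, the structures are simplified, the timeout values are illustrative, and incrementing users is left to the caller.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

#define UDP_CT_MAX 2

/* Simplified stand-ins for the kernel structures (assumption). */
struct nf_proto_net {
	unsigned int *ctl_table;
	unsigned int *ctl_compat_table;
	unsigned int users;
};

struct nf_udp_net {
	struct nf_proto_net pn;
	unsigned int timeouts[UDP_CT_MAX];
};

/* Illustrative defaults, not the kernel values. */
static const unsigned int udp_timeouts[UDP_CT_MAX] = { 30, 180 };

/* Stand-in for kmemdup(); 'fail' simulates an allocation failure. */
static void *dup_table(const void *src, size_t len, int fail)
{
	void *p = fail ? NULL : malloc(len);

	if (p)
		memcpy(p, src, len);
	return p;
}

static int dup_sysctl_table(struct nf_proto_net *pn, int fail)
{
	if (pn->ctl_table)	/* already set up by the other family */
		return 0;
	pn->ctl_table = dup_table(udp_timeouts, sizeof(udp_timeouts), fail);
	return pn->ctl_table ? 0 : -ENOMEM;
}

static int dup_compat_table(struct nf_proto_net *pn, int fail)
{
	pn->ctl_compat_table = dup_table(udp_timeouts, sizeof(udp_timeouts), fail);
	return pn->ctl_compat_table ? 0 : -ENOMEM;
}

/* Mirrors the shape of udp_init_net(): the compat table is IPv4 only. */
int udp_init_net_sketch(struct nf_udp_net *un, int is_ipv4,
			int fail_compat, int fail_table)
{
	struct nf_proto_net *pn;
	int ret, i;

	assert(un != NULL);
	pn = &un->pn;

	if (!pn->users)		/* first use: load the default timeouts */
		for (i = 0; i < UDP_CT_MAX; i++)
			un->timeouts[i] = udp_timeouts[i];

	if (is_ipv4) {
		ret = dup_compat_table(pn, fail_compat);
		if (ret < 0)
			return ret;

		ret = dup_sysctl_table(pn, fail_table);
		if (ret < 0) {	/* roll back the first allocation */
			free(pn->ctl_compat_table);
			pn->ctl_compat_table = NULL;
		}
	} else {
		ret = dup_sysctl_table(pn, fail_table);
	}

	return ret;
}
```

Calling it with a simulated failure of the second allocation returns -ENOMEM and leaves both table pointers NULL, which is the property the rollback is meant to preserve.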


^ permalink raw reply related

* [PATCH 06/13] netfilter: merge tcpv[4,6]_net_init into tcp_net_init
From: Gao feng @ 2012-06-21 14:36 UTC (permalink / raw)
  To: pablo; +Cc: netdev, netfilter-devel, Gao feng
In-Reply-To: <1340289410-17642-1-git-send-email-gaofeng@cn.fujitsu.com>

Merge tcpv4_init_net and tcpv6_init_net into tcp_init_net to
reduce the redundant code.

Also use nf_proto_net.users to identify whether this is the first
time we use the nf_proto_net. If it is the first time, we
initialize it.

Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
---
 net/netfilter/nf_conntrack_proto_tcp.c |   71 +++++++++----------------------
 1 files changed, 21 insertions(+), 50 deletions(-)

diff --git a/net/netfilter/nf_conntrack_proto_tcp.c b/net/netfilter/nf_conntrack_proto_tcp.c
index 6db9d3c..44f0da8 100644
--- a/net/netfilter/nf_conntrack_proto_tcp.c
+++ b/net/netfilter/nf_conntrack_proto_tcp.c
@@ -1533,11 +1533,10 @@ static struct ctl_table tcp_compat_sysctl_table[] = {
 #endif /* CONFIG_NF_CONNTRACK_PROC_COMPAT */
 #endif /* CONFIG_SYSCTL */
 
-static int tcp_kmemdup_sysctl_table(struct nf_proto_net *pn)
+static int tcp_kmemdup_sysctl_table(struct nf_proto_net *pn,
+				    struct nf_tcp_net *tn)
 {
 #ifdef CONFIG_SYSCTL
-	struct nf_tcp_net *tn = (struct nf_tcp_net *)pn;
-
 	if (pn->ctl_table)
 		return 0;
 
@@ -1564,11 +1563,11 @@ static int tcp_kmemdup_sysctl_table(struct nf_proto_net *pn)
 	return 0;
 }
 
-static int tcp_kmemdup_compat_sysctl_table(struct nf_proto_net *pn)
+static int tcp_kmemdup_compat_sysctl_table(struct nf_proto_net *pn,
+					   struct nf_tcp_net *tn)
 {
 #ifdef CONFIG_SYSCTL
 #ifdef CONFIG_NF_CONNTRACK_PROC_COMPAT
-	struct nf_tcp_net *tn = (struct nf_tcp_net *)pn;
 	pn->ctl_compat_table = kmemdup(tcp_compat_sysctl_table,
 				       sizeof(tcp_compat_sysctl_table),
 				       GFP_KERNEL);
@@ -1593,18 +1592,15 @@ static int tcp_kmemdup_compat_sysctl_table(struct nf_proto_net *pn)
 	return 0;
 }
 
-static int tcpv4_init_net(struct net *net, u_int16_t proto)
+static int tcp_init_net(struct net *net, u_int16_t proto)
 {
-	int i;
-	int ret = 0;
+	int ret;
 	struct nf_tcp_net *tn = tcp_pernet(net);
-	struct nf_proto_net *pn = (struct nf_proto_net *)tn;
+	struct nf_proto_net *pn = &tn->pn;
+
+	if (!pn->users) {
+		int i;
 
-#ifdef CONFIG_SYSCTL
-	if (!pn->ctl_table) {
-#else
-	if (!pn->users++) {
-#endif
 		for (i = 0; i < TCP_CONNTRACK_TIMEOUT_MAX; i++)
 			tn->timeouts[i] = tcp_timeouts[i];
 
@@ -1613,45 +1609,20 @@ static int tcpv4_init_net(struct net *net, u_int16_t proto)
 		tn->tcp_max_retrans = nf_ct_tcp_max_retrans;
 	}
 
-	ret = tcp_kmemdup_compat_sysctl_table(pn);
-
-	if (ret < 0)
-		return ret;
+	if (proto == AF_INET) {
+		ret = tcp_kmemdup_compat_sysctl_table(pn, tn);
+		if (ret < 0)
+			return ret;
 
-	ret = tcp_kmemdup_sysctl_table(pn);
+		ret = tcp_kmemdup_sysctl_table(pn, tn);
+		if (ret < 0)
+			nf_ct_kfree_compat_sysctl_table(pn);
+	} else
+		ret = tcp_kmemdup_sysctl_table(pn, tn);
 
-#ifdef CONFIG_SYSCTL
-#ifdef CONFIG_NF_CONNTRACK_PROC_COMPAT
-	if (ret < 0) {
-		kfree(pn->ctl_compat_table);
-		pn->ctl_compat_table = NULL;
-	}
-#endif
-#endif
 	return ret;
 }
 
-static int tcpv6_init_net(struct net *net, u_int16_t proto)
-{
-	int i;
-	struct nf_tcp_net *tn = tcp_pernet(net);
-	struct nf_proto_net *pn = (struct nf_proto_net *)tn;
-
-#ifdef CONFIG_SYSCTL
-	if (!pn->ctl_table) {
-#else
-	if (!pn->users++) {
-#endif
-		for (i = 0; i < TCP_CONNTRACK_TIMEOUT_MAX; i++)
-			tn->timeouts[i] = tcp_timeouts[i];
-		tn->tcp_loose = nf_ct_tcp_loose;
-		tn->tcp_be_liberal = nf_ct_tcp_be_liberal;
-		tn->tcp_max_retrans = nf_ct_tcp_max_retrans;
-	}
-
-	return tcp_kmemdup_sysctl_table(pn);
-}
-
 struct nf_conntrack_l4proto nf_conntrack_l4proto_tcp4 __read_mostly =
 {
 	.l3proto		= PF_INET,
@@ -1684,7 +1655,7 @@ struct nf_conntrack_l4proto nf_conntrack_l4proto_tcp4 __read_mostly =
 		.nla_policy	= tcp_timeout_nla_policy,
 	},
 #endif /* CONFIG_NF_CT_NETLINK_TIMEOUT */
-	.init_net		= tcpv4_init_net,
+	.init_net		= tcp_init_net,
 };
 EXPORT_SYMBOL_GPL(nf_conntrack_l4proto_tcp4);
 
@@ -1720,6 +1691,6 @@ struct nf_conntrack_l4proto nf_conntrack_l4proto_tcp6 __read_mostly =
 		.nla_policy	= tcp_timeout_nla_policy,
 	},
 #endif /* CONFIG_NF_CT_NETLINK_TIMEOUT */
-	.init_net		= tcpv6_init_net,
+	.init_net		= tcp_init_net,
 };
 EXPORT_SYMBOL_GPL(nf_conntrack_l4proto_tcp6);
-- 
1.7.7.6


^ permalink raw reply related

* [PATCH 05/13] netfilter: fix memory leak when register sysctl failed
From: Gao feng @ 2012-06-21 14:36 UTC (permalink / raw)
  To: pablo; +Cc: netdev, netfilter-devel, Gao feng
In-Reply-To: <1340289410-17642-1-git-send-email-gaofeng@cn.fujitsu.com>

In nf_ct_l4proto_register_sysctl(), when registering the l4proto's
sysctl table fails, we should also free the compat sysctl table.

Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
---
 net/netfilter/nf_conntrack_proto.c |   29 ++++++++++++++++-------------
 1 files changed, 16 insertions(+), 13 deletions(-)

diff --git a/net/netfilter/nf_conntrack_proto.c b/net/netfilter/nf_conntrack_proto.c
index 63612e6..42e686b 100644
--- a/net/netfilter/nf_conntrack_proto.c
+++ b/net/netfilter/nf_conntrack_proto.c
@@ -341,25 +341,28 @@ int nf_ct_l4proto_register_sysctl(struct net *net,
 				kfree(pn->ctl_table);
 				pn->ctl_table = NULL;
 			}
-			goto out;
 		}
 	}
 #ifdef CONFIG_NF_CONNTRACK_PROC_COMPAT
 	if (l4proto->l3proto != AF_INET6 && pn->ctl_compat_table != NULL) {
-		err = nf_ct_register_sysctl(net,
-					    &pn->ctl_compat_header,
-					    "net/ipv4/netfilter",
-					    pn->ctl_compat_table);
-		if (err == 0)
-			goto out;
-
-		nf_ct_kfree_compat_sysctl_table(pn);
-		nf_ct_unregister_sysctl(&pn->ctl_table_header,
-					&pn->ctl_table,
-					pn->users);
+		if (err < 0)
+			nf_ct_kfree_compat_sysctl_table(pn);
+		else {
+			err = nf_ct_register_sysctl(net,
+						    &pn->ctl_compat_header,
+						    "net/ipv4/netfilter",
+						    pn->ctl_compat_table);
+			if (err == 0)
+				goto out;
+
+			nf_ct_kfree_compat_sysctl_table(pn);
+			nf_ct_unregister_sysctl(&pn->ctl_table_header,
+						&pn->ctl_table,
+						pn->users);
+		}
 	}
-#endif /* CONFIG_NF_CONNTRACK_PROC_COMPAT */
 out:
+#endif /* CONFIG_NF_CONNTRACK_PROC_COMPAT */
 #endif /* CONFIG_SYSCTL */
 	return err;
 }
-- 
1.7.7.6


^ permalink raw reply related

* [PATCH 04/13] netfilter: regard users as refcount for l4proto's per-net data
From: Gao feng @ 2012-06-21 14:36 UTC (permalink / raw)
  To: pablo; +Cc: netdev, netfilter-devel, Gao feng
In-Reply-To: <1340289410-17642-1-git-send-email-gaofeng@cn.fujitsu.com>

Currently, the meaning of nf_proto_net's users field is confusing.
We should treat it as the refcount for an l4proto's per-net data,
because two l4protos may share the same per-net data.

So increment pn->users when nf_conntrack_l4proto_register()
succeeds, and decrement it in nf_conntrack_l4proto_unregister().

Because nf_conntrack_l3proto_ipv[4|6] do not share per-net data,
we do not need to add a refcount for their per-net data.
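
The refcount idea described above can be sketched in user-space C. This
is only an illustration, not the kernel code: struct proto_net_sketch and
the sketch_* functions are hypothetical stand-ins, and the sysctl
registration is reduced to a flag.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct nf_proto_net; only the parts needed
 * to show the counting are modeled. */
struct proto_net_sketch {
	unsigned int users;     /* registered l4protos sharing this data */
	int initialized;        /* defaults were filled in */
	int sysctl_registered;  /* stands in for ctl_table_header != NULL */
};

/* Fill in defaults only when there is no registered user yet. */
static void sketch_init_net(struct proto_net_sketch *pn)
{
	if (!pn->users)
		pn->initialized = 1;
}

static int sketch_register(struct proto_net_sketch *pn)
{
	assert(pn != NULL);
	sketch_init_net(pn);
	pn->sysctl_registered = 1;  /* a second registration is a no-op */
	pn->users++;                /* count only after every step succeeded */
	return 0;
}

static void sketch_unregister(struct proto_net_sketch *pn)
{
	assert(pn != NULL && pn->users > 0);
	pn->users--;
	if (pn->users > 0)
		return;                 /* another l4proto still uses the data */
	pn->sysctl_registered = 0;  /* last user gone: tear down */
}
```

With two registrations sharing one structure, the first unregister only
drops the count; the shared state is released on the second.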

Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
---
 net/netfilter/nf_conntrack_proto.c |   76 ++++++++++++++++++++++--------------
 1 files changed, 46 insertions(+), 30 deletions(-)

diff --git a/net/netfilter/nf_conntrack_proto.c b/net/netfilter/nf_conntrack_proto.c
index 9d6b6ab..63612e6 100644
--- a/net/netfilter/nf_conntrack_proto.c
+++ b/net/netfilter/nf_conntrack_proto.c
@@ -39,16 +39,13 @@ static int
 nf_ct_register_sysctl(struct net *net,
 		      struct ctl_table_header **header,
 		      const char *path,
-		      struct ctl_table *table,
-		      unsigned int *users)
+		      struct ctl_table *table)
 {
 	if (*header == NULL) {
 		*header = register_net_sysctl(net, path, table);
 		if (*header == NULL)
 			return -ENOMEM;
 	}
-	if (users != NULL)
-		(*users)++;
 
 	return 0;
 }
@@ -56,9 +53,9 @@ nf_ct_register_sysctl(struct net *net,
 static void
 nf_ct_unregister_sysctl(struct ctl_table_header **header,
 			struct ctl_table **table,
-			unsigned int *users)
+			unsigned int users)
 {
-	if (users != NULL && --*users > 0)
+	if (users > 0)
 		return;
 
 	unregister_net_sysctl_table(*header);
@@ -191,8 +188,7 @@ static int nf_ct_l3proto_register_sysctl(struct net *net,
 		err = nf_ct_register_sysctl(net,
 					    &in->ctl_table_header,
 					    l3proto->ctl_table_path,
-					    in->ctl_table,
-					    NULL);
+					    in->ctl_table);
 		if (err < 0) {
 			kfree(in->ctl_table);
 			in->ctl_table = NULL;
@@ -213,7 +209,7 @@ static void nf_ct_l3proto_unregister_sysctl(struct net *net,
 	if (in->ctl_table_header != NULL)
 		nf_ct_unregister_sysctl(&in->ctl_table_header,
 					&in->ctl_table,
-					NULL);
+					0);
 #endif
 }
 
@@ -329,20 +325,17 @@ static struct nf_proto_net *nf_ct_l4proto_net(struct net *net,
 
 static
 int nf_ct_l4proto_register_sysctl(struct net *net,
+				  struct nf_proto_net *pn,
 				  struct nf_conntrack_l4proto *l4proto)
 {
 	int err = 0;
-	struct nf_proto_net *pn = nf_ct_l4proto_net(net, l4proto);
-	if (pn == NULL)
-		return 0;
 
 #ifdef CONFIG_SYSCTL
 	if (pn->ctl_table != NULL) {
 		err = nf_ct_register_sysctl(net,
 					    &pn->ctl_table_header,
 					    "net/netfilter",
-					    pn->ctl_table,
-					    &pn->users);
+					    pn->ctl_table);
 		if (err < 0) {
 			if (!pn->users) {
 				kfree(pn->ctl_table);
@@ -356,15 +349,14 @@ int nf_ct_l4proto_register_sysctl(struct net *net,
 		err = nf_ct_register_sysctl(net,
 					    &pn->ctl_compat_header,
 					    "net/ipv4/netfilter",
-					    pn->ctl_compat_table,
-					    NULL);
+					    pn->ctl_compat_table);
 		if (err == 0)
 			goto out;
 
 		nf_ct_kfree_compat_sysctl_table(pn);
 		nf_ct_unregister_sysctl(&pn->ctl_table_header,
 					&pn->ctl_table,
-					&pn->users);
+					pn->users);
 	}
 #endif /* CONFIG_NF_CONNTRACK_PROC_COMPAT */
 out:
@@ -374,25 +366,21 @@ out:
 
 static
 void nf_ct_l4proto_unregister_sysctl(struct net *net,
+				     struct nf_proto_net *pn,
 				     struct nf_conntrack_l4proto *l4proto)
 {
-	struct nf_proto_net *pn = nf_ct_l4proto_net(net, l4proto);
-	if (pn == NULL)
-		return;
 #ifdef CONFIG_SYSCTL
 	if (pn->ctl_table_header != NULL)
 		nf_ct_unregister_sysctl(&pn->ctl_table_header,
 					&pn->ctl_table,
-					&pn->users);
+					pn->users);
 
 #ifdef CONFIG_NF_CONNTRACK_PROC_COMPAT
 	if (l4proto->l3proto != AF_INET6 && pn->ctl_compat_header != NULL)
 		nf_ct_unregister_sysctl(&pn->ctl_compat_header,
 					&pn->ctl_compat_table,
-					NULL);
+					0);
 #endif /* CONFIG_NF_CONNTRACK_PROC_COMPAT */
-#else
-	pn->users--;
 #endif /* CONFIG_SYSCTL */
 }
 
@@ -458,23 +446,32 @@ int nf_conntrack_l4proto_register(struct net *net,
 				  struct nf_conntrack_l4proto *l4proto)
 {
 	int ret = 0;
+	struct nf_proto_net *pn = NULL;
 
 	if (l4proto->init_net) {
 		ret = l4proto->init_net(net, l4proto->l3proto);
 		if (ret < 0)
-			return ret;
+			goto out;
 	}
 
-	ret = nf_ct_l4proto_register_sysctl(net, l4proto);
+	pn = nf_ct_l4proto_net(net, l4proto);
+	if (pn == NULL)
+		goto out;
+
+	ret = nf_ct_l4proto_register_sysctl(net, pn, l4proto);
 	if (ret < 0)
-		return ret;
+		goto out;
 
 	if (net == &init_net) {
 		ret = nf_conntrack_l4proto_register_net(l4proto);
-		if (ret < 0)
-			nf_ct_l4proto_unregister_sysctl(net, l4proto);
+		if (ret < 0) {
+			nf_ct_l4proto_unregister_sysctl(net, pn, l4proto);
+			goto out;
+		}
 	}
 
+	pn->users++;
+out:
 	return ret;
 }
 EXPORT_SYMBOL_GPL(nf_conntrack_l4proto_register);
@@ -499,10 +496,18 @@ nf_conntrack_l4proto_unregister_net(struct nf_conntrack_l4proto *l4proto)
 void nf_conntrack_l4proto_unregister(struct net *net,
 				     struct nf_conntrack_l4proto *l4proto)
 {
+	struct nf_proto_net *pn = NULL;
+
 	if (net == &init_net)
 		nf_conntrack_l4proto_unregister_net(l4proto);
 
-	nf_ct_l4proto_unregister_sysctl(net, l4proto);
+	pn = nf_ct_l4proto_net(net, l4proto);
+	if (pn == NULL)
+		return;
+
+	pn->users--;
+	nf_ct_l4proto_unregister_sysctl(net, pn, l4proto);
+
 	/* Remove all contrack entries for this protocol */
 	rtnl_lock();
 	nf_ct_iterate_cleanup(net, kill_l4proto, l4proto);
@@ -514,11 +519,15 @@ int nf_conntrack_proto_init(struct net *net)
 {
 	unsigned int i;
 	int err;
+	struct nf_proto_net *pn = nf_ct_l4proto_net(net,
+					&nf_conntrack_l4proto_generic);
+
 	err = nf_conntrack_l4proto_generic.init_net(net,
 					nf_conntrack_l4proto_generic.l3proto);
 	if (err < 0)
 		return err;
 	err = nf_ct_l4proto_register_sysctl(net,
+					    pn,
 					    &nf_conntrack_l4proto_generic);
 	if (err < 0)
 		return err;
@@ -528,13 +537,20 @@ int nf_conntrack_proto_init(struct net *net)
 			rcu_assign_pointer(nf_ct_l3protos[i],
 					   &nf_conntrack_l3proto_generic);
 	}
+
+	pn->users++;
 	return 0;
 }
 
 void nf_conntrack_proto_fini(struct net *net)
 {
 	unsigned int i;
+	struct nf_proto_net *pn = nf_ct_l4proto_net(net,
+					&nf_conntrack_l4proto_generic);
+
+	pn->users--;
 	nf_ct_l4proto_unregister_sysctl(net,
+					pn,
 					&nf_conntrack_l4proto_generic);
 	if (net == &init_net) {
 		/* free l3proto protocol tables */
-- 
1.7.7.6


^ permalink raw reply related

* [PATCH 01/13] netfilter: fix problem with proto register
From: Gao feng @ 2012-06-21 14:36 UTC (permalink / raw)
  To: pablo; +Cc: netdev, netfilter-devel, Gao feng

Before commit 2c352f444ccfa966a1aa4fd8e9ee29381c467448
(netfilter: nf_conntrack: prepare namespace support for
l4 protocol trackers), we registered sysctl before registering
the protos, so if sysctl registration failed, the protos were
not registered.

But now we register the protos first, so when sysctl
registration fails the protos can still be used, which differs
from the previous behavior.

So change back to registering sysctl before registering the protos.
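
The ordering described above can be sketched in user-space C as follows.
This is a simplified illustration under assumptions, not the kernel
code: struct reg_state_sketch and the sketch_* helpers are hypothetical,
and failures are injected through flags.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical state; each flag records whether that step is in effect. */
struct reg_state_sketch {
	int sysctl_registered;
	int proto_registered;
};

static int sketch_register_sysctl(struct reg_state_sketch *st, int fail)
{
	if (fail)
		return -ENOMEM;
	st->sysctl_registered = 1;
	return 0;
}

static int sketch_register_proto(struct reg_state_sketch *st, int fail)
{
	if (fail)
		return -EBUSY;
	st->proto_registered = 1;
	return 0;
}

/* Register sysctl first; make the proto visible only if that worked,
 * and undo the sysctl step if the proto step fails. */
static int sketch_proto_register(struct reg_state_sketch *st,
				 int sysctl_fails, int proto_fails)
{
	int ret;

	assert(st != NULL);
	ret = sketch_register_sysctl(st, sysctl_fails);
	if (ret < 0)
		return ret;             /* proto never becomes usable */

	ret = sketch_register_proto(st, proto_fails);
	if (ret < 0)
		st->sysctl_registered = 0;  /* roll back the earlier step */

	return ret;
}
```

With this order, a sysctl failure leaves nothing registered, which is
the behavior the commit message describes as the one to restore.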

Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
---
 net/netfilter/nf_conntrack_proto.c |   36 +++++++++++++++++++++++-------------
 1 files changed, 23 insertions(+), 13 deletions(-)

diff --git a/net/netfilter/nf_conntrack_proto.c b/net/netfilter/nf_conntrack_proto.c
index 1ea9194..9bd88aa 100644
--- a/net/netfilter/nf_conntrack_proto.c
+++ b/net/netfilter/nf_conntrack_proto.c
@@ -253,18 +253,23 @@ int nf_conntrack_l3proto_register(struct net *net,
 {
 	int ret = 0;
 
-	if (net == &init_net)
-		ret = nf_conntrack_l3proto_register_net(proto);
+	if (proto->init_net) {
+		ret = proto->init_net(net);
+		if (ret < 0)
+			return ret;
+	}
 
+	ret = nf_ct_l3proto_register_sysctl(net, proto);
 	if (ret < 0)
 		return ret;
 
-	if (proto->init_net) {
-		ret = proto->init_net(net);
+	if (net == &init_net) {
+		ret = nf_conntrack_l3proto_register_net(proto);
 		if (ret < 0)
-			return ret;
+			nf_ct_l3proto_unregister_sysctl(net, proto);
 	}
-	return nf_ct_l3proto_register_sysctl(net, proto);
+
+	return ret;
 }
 EXPORT_SYMBOL_GPL(nf_conntrack_l3proto_register);
 
@@ -454,19 +459,24 @@ int nf_conntrack_l4proto_register(struct net *net,
 				  struct nf_conntrack_l4proto *l4proto)
 {
 	int ret = 0;
-	if (net == &init_net)
-		ret = nf_conntrack_l4proto_register_net(l4proto);
 
-	if (ret < 0)
-		return ret;
-
-	if (l4proto->init_net)
+	if (l4proto->init_net) {
 		ret = l4proto->init_net(net);
+		if (ret < 0)
+			return ret;
+	}
 
+	ret = nf_ct_l4proto_register_sysctl(net, l4proto);
 	if (ret < 0)
 		return ret;
 
-	return nf_ct_l4proto_register_sysctl(net, l4proto);
+	if (net == &init_net) {
+		ret = nf_conntrack_l4proto_register_net(l4proto);
+		if (ret < 0)
+			nf_ct_l4proto_unregister_sysctl(net, l4proto);
+	}
+
+	return ret;
 }
 EXPORT_SYMBOL_GPL(nf_conntrack_l4proto_register);
 
-- 
1.7.7.6


^ permalink raw reply related

* [PATCH 11/13] netfilter: nf_conntrack_l4proto_dccp[4,6] cleanup
From: Gao feng @ 2012-06-21 14:36 UTC (permalink / raw)
  To: pablo; +Cc: netdev, netfilter-devel, Gao feng
In-Reply-To: <1340289410-17642-1-git-send-email-gaofeng@cn.fujitsu.com>

Some cleanup of nf_conntrack_l4proto_dccp[4,6] to make the
code clearer and ready for moving the sysctl code to
nf_conntrack_proto_*_sysctl.c, which reduces the ifdef
pollution.

Also use nf_proto_net.users to identify whether this is the first
time we use the nf_proto_net. If it is, we initialize it.

Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
---
 net/netfilter/nf_conntrack_proto_dccp.c |   54 +++++++++++++++++--------------
 1 files changed, 30 insertions(+), 24 deletions(-)

diff --git a/net/netfilter/nf_conntrack_proto_dccp.c b/net/netfilter/nf_conntrack_proto_dccp.c
index 52da8f0..6535326 100644
--- a/net/netfilter/nf_conntrack_proto_dccp.c
+++ b/net/netfilter/nf_conntrack_proto_dccp.c
@@ -387,7 +387,7 @@ dccp_state_table[CT_DCCP_ROLE_MAX + 1][DCCP_PKT_SYNCACK + 1][CT_DCCP_MAX + 1] =
 /* this module per-net specifics */
 static int dccp_net_id __read_mostly;
 struct dccp_net {
-	struct nf_proto_net np;
+	struct nf_proto_net pn;
 	int dccp_loose;
 	unsigned int dccp_timeout[CT_DCCP_MAX + 1];
 };
@@ -815,16 +815,37 @@ static struct ctl_table dccp_sysctl_table[] = {
 };
 #endif /* CONFIG_SYSCTL */
 
+static int dccp_kmemdup_sysctl_table(struct nf_proto_net *pn,
+				     struct dccp_net *dn)
+{
+#ifdef CONFIG_SYSCTL
+	if (pn->ctl_table)
+		return 0;
+
+	pn->ctl_table = kmemdup(dccp_sysctl_table,
+				sizeof(dccp_sysctl_table),
+				GFP_KERNEL);
+	if (!pn->ctl_table)
+		return -ENOMEM;
+
+	pn->ctl_table[0].data = &dn->dccp_timeout[CT_DCCP_REQUEST];
+	pn->ctl_table[1].data = &dn->dccp_timeout[CT_DCCP_RESPOND];
+	pn->ctl_table[2].data = &dn->dccp_timeout[CT_DCCP_PARTOPEN];
+	pn->ctl_table[3].data = &dn->dccp_timeout[CT_DCCP_OPEN];
+	pn->ctl_table[4].data = &dn->dccp_timeout[CT_DCCP_CLOSEREQ];
+	pn->ctl_table[5].data = &dn->dccp_timeout[CT_DCCP_CLOSING];
+	pn->ctl_table[6].data = &dn->dccp_timeout[CT_DCCP_TIMEWAIT];
+	pn->ctl_table[7].data = &dn->dccp_loose;
+#endif
+	return 0;
+}
+
 static int dccp_init_net(struct net *net, u_int16_t proto)
 {
 	struct dccp_net *dn = dccp_pernet(net);
-	struct nf_proto_net *pn = (struct nf_proto_net *)dn;
+	struct nf_proto_net *pn = &dn->pn;
 
-#ifdef CONFIG_SYSCTL
-	if (!pn->ctl_table) {
-#else
-	if (!pn->users++) {
-#endif
+	if (!pn->users) {
 		/* default values */
 		dn->dccp_loose = 1;
 		dn->dccp_timeout[CT_DCCP_REQUEST]	= 2 * DCCP_MSL;
@@ -834,24 +855,9 @@ static int dccp_init_net(struct net *net, u_int16_t proto)
 		dn->dccp_timeout[CT_DCCP_CLOSEREQ]	= 64 * HZ;
 		dn->dccp_timeout[CT_DCCP_CLOSING]	= 64 * HZ;
 		dn->dccp_timeout[CT_DCCP_TIMEWAIT]	= 2 * DCCP_MSL;
-#ifdef CONFIG_SYSCTL
-		pn->ctl_table = kmemdup(dccp_sysctl_table,
-					sizeof(dccp_sysctl_table),
-					GFP_KERNEL);
-		if (!pn->ctl_table)
-			return -ENOMEM;
-
-		pn->ctl_table[0].data = &dn->dccp_timeout[CT_DCCP_REQUEST];
-		pn->ctl_table[1].data = &dn->dccp_timeout[CT_DCCP_RESPOND];
-		pn->ctl_table[2].data = &dn->dccp_timeout[CT_DCCP_PARTOPEN];
-		pn->ctl_table[3].data = &dn->dccp_timeout[CT_DCCP_OPEN];
-		pn->ctl_table[4].data = &dn->dccp_timeout[CT_DCCP_CLOSEREQ];
-		pn->ctl_table[5].data = &dn->dccp_timeout[CT_DCCP_CLOSING];
-		pn->ctl_table[6].data = &dn->dccp_timeout[CT_DCCP_TIMEWAIT];
-		pn->ctl_table[7].data = &dn->dccp_loose;
-#endif
 	}
-	return 0;
+
+	return dccp_kmemdup_sysctl_table(pn, dn);
 }
 
 static struct nf_conntrack_l4proto dccp_proto4 __read_mostly = {
-- 
1.7.7.6

^ permalink raw reply related
