Netdev List


* Re: [PATCH] ipv6: fix NULL reference in proxy neighbor discovery
From: YOSHIFUJI Hideaki @ 2010-06-23 14:44 UTC (permalink / raw)
  To: Stephen Hemminger, David Miller
  Cc: Andreas Klauer, Hagen Paul Pfeifer, netdev, Octavian Purdila,
	YOSHIFUJI Hideaki
In-Reply-To: <20100621140013.508741df@nehalam>

Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>

(2010/06/22 6:00), Stephen Hemminger wrote:
> The addition of TLLAO option created a kernel OOPS regression
> for the case where neighbor advertisement is being sent via
> proxy path.  When using proxy, ipv6_get_ifaddr() returns NULL
> causing the NULL dereference.
> 
> Change causing the bug was:
> commit f7734fdf61ec6bb848e0bafc1fb8bad2c124bb50
> Author: Octavian Purdila<opurdila@ixiacom.com>
> Date:   Fri Oct 2 11:39:15 2009 +0000
> 
>      make TLLAO option for NA packets configurable
> 
> Signed-off-by: Stephen Hemminger<shemminger@vyatta.com>
> 
> ---
> Patch for -net and -stable.
> Applies to 2.6.33 and later.
> 
> --- a/net/ipv6/ndisc.c	2010-06-11 08:13:13.008657498 -0700
> +++ b/net/ipv6/ndisc.c	2010-06-21 13:52:57.961486303 -0700
> @@ -586,6 +586,7 @@ static void ndisc_send_na(struct net_dev
>   		src_addr = solicited_addr;
>   		if (ifp->flags & IFA_F_OPTIMISTIC)
>   			override = 0;
> +		inc_opt |= ifp->idev->cnf.force_tllao;
>   		in6_ifa_put(ifp);
>   	} else {
>   		if (ipv6_dev_get_saddr(dev_net(dev), dev, daddr,
> @@ -599,7 +600,6 @@ static void ndisc_send_na(struct net_dev
>   	icmp6h.icmp6_solicited = solicited;
>   	icmp6h.icmp6_override = override;
> 
> -	inc_opt |= ifp->idev->cnf.force_tllao;
>   	__ndisc_send(dev, neigh, daddr, src_addr,
>   		&icmp6h, solicited_addr,
>   		     inc_opt ? ND_OPT_TARGET_LL_ADDR : 0);
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
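
The fix quoted above can be illustrated with a minimal userspace sketch. The structures and the helper na_include_tllao() below are hypothetical stand-ins, not the kernel's definitions; the point is only that the force_tllao setting is read in the branch where the address pointer is known to be non-NULL, so the proxy path never dereferences it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures; not the real definitions. */
struct dev_conf { int force_tllao; };
struct idev { struct dev_conf cnf; };
struct ifaddr { struct idev *idev; };

/*
 * Decide whether the target link-layer address option is included.
 * ifp is NULL on the proxy path, so its fields are read only inside
 * the branch that has already checked the pointer.
 */
static int na_include_tllao(const struct ifaddr *ifp, int inc_opt)
{
	/* Assumption of this sketch: a found address always has a device. */
	assert(ifp == NULL || ifp->idev != NULL);

	if (ifp) {
		/* Address found: safe to consult the per-device setting. */
		inc_opt |= ifp->idev->cnf.force_tllao;
	}
	/* Proxy path: the caller's inc_opt is returned unchanged. */
	return inc_opt;
}
```

Calling na_include_tllao(NULL, 0) models the proxy case and simply returns 0 instead of faulting.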



* Re: BUG: unable to handle kernel NULL pointer dereference at 00000000000000a0
From: Justin P. Mattock @ 2010-06-23 14:41 UTC (permalink / raw)
  To: John W. Linville; +Cc: Linux Kernel Mailing List, netdev
In-Reply-To: <20100623141622.GC15205@tuxdriver.com>

On 06/23/2010 07:16 AM, John W. Linville wrote:
> On Tue, Jun 22, 2010 at 04:16:53PM -0700, Justin Mattock wrote:
>> I remember ipsec was able to work cleanly on my machines probably
>> about 4/6 months ago
>> now I get this:
>
> <snip>
>
> Perhaps netdev would be a more appropriate list than linux-wireless
> for this?
>
> John

alright added the cc's..
almost done bisecting..hopefully this points to
the right area..(if not I'll re-bisect until I get this).

as for the troubled machines I've a macbookpro2,2(no proprietary
stuff) and an imac9,1 with broadcom-sta(blah) both
seem to hit this right when the ipsec transaction starts
with ssh, vncviewer..(last good kernel I have is
2.6.33-rc5-01007-ge9449d8).

Justin P. Mattock


* [PATCH] ipv6: remove ipv6_statistics
From: Eric Dumazet @ 2010-06-23 12:51 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, Hideaki YOSHIFUJI, Denis V.Lunev, Alexey Dobriyan

commit 9261e5370112 (ipv6: making ip and icmp statistics per/namespace)
forgot to remove ipv6_statistics variable.

commit bc417d99bf27 (ipv6: remove stale MIB definitions) took care of
icmpv6_statistics & icmpv6msg_statistics

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Denis V. Lunev <den@openvz.org>
CC: Alexey Dobriyan <adobriyan@gmail.com>
CC: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
---
 net/ipv6/ipv6_sockglue.c |    2 --
 1 files changed, 2 deletions(-)

diff --git a/net/ipv6/ipv6_sockglue.c b/net/ipv6/ipv6_sockglue.c
index bd43f01..a7f66bc 100644
--- a/net/ipv6/ipv6_sockglue.c
+++ b/net/ipv6/ipv6_sockglue.c
@@ -55,8 +55,6 @@
 
 #include <asm/uaccess.h>
 
-DEFINE_SNMP_STAT(struct ipstats_mib, ipv6_statistics) __read_mostly;
-
 struct ip6_ra_chain *ip6_ra_chain;
 DEFINE_RWLOCK(ip6_ra_lock);
 




* Re: [PATCH 31/40] trace syscalls: Convert various generic compat syscalls
From: Christoph Hellwig @ 2010-06-23 12:41 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Frederic Weisbecker, Ian Munsie, linux-kernel, linuxppc-dev,
	Jason Baron, Steven Rostedt, Ingo Molnar, Benjamin Herrenschmidt,
	Paul Mackerras, Michael Ellerman, Alexander Viro, Andrew Morton,
	Jeff Moyer, David Howells, Oleg Nesterov, Arnd Bergmann,
	David S. Miller, Greg Kroah-Hartman, Dinakar Guniguntala,
	Thomas Gleixner, Ingo Molnar, Eric Biederman, Simon Kagstrom,
	WANG Cong
In-Reply-To: <4C21FF9A.2040207@linux.intel.com>

On Wed, Jun 23, 2010 at 02:35:38PM +0200, Andi Kleen wrote:
> >I haven't heard any complaints about the existing syscall wrappers.
> 
> At least for me they always interrupt my grepping.
> 
> >
> >What kind of annotations could solve that?
> 
> If you put the annotation in a separate macro and leave the original
> prototype alone, then C parsers could still parse it.

I personally hate the way SYSCALL_DEFINE works with a passion, mostly
for the grep reason, but also because it looks horribly ugly.

But there is no reason not to be consistent here.  We already use
the wrappers for all native system calls, so leaving the compat
calls out doesn't make any sense.  And I'd cheer for anyone who
comes up with a better scheme for the native and compat wrappers.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>


* Re: [PATCH 31/40] trace syscalls: Convert various generic compat syscalls
From: Andi Kleen @ 2010-06-23 12:35 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Ian Munsie, linux-kernel, linuxppc-dev, Jason Baron,
	Steven Rostedt, Ingo Molnar, Benjamin Herrenschmidt,
	Paul Mackerras, Michael Ellerman, Alexander Viro, Andrew Morton,
	Jeff Moyer, David Howells, Oleg Nesterov, Arnd Bergmann,
	David S. Miller, Greg Kroah-Hartman, Dinakar Guniguntala,
	Thomas Gleixner, Ingo Molnar, Eric Biederman, Simon Kagstrom,
	WANG Cong, Sam Ravnborg
In-Reply-To: <20100623113806.GD5242@nowhere>


>
> I haven't heard any complaints about the existing syscall wrappers.

At least for me they always interrupt my grepping.

>
> What kind of annotations could solve that?

If you put the annotation in a separate macro and leave the original
prototype alone, then C parsers could still parse it.

-Andi
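
One way to read the suggestion above is sketched here. The names SYSCALL_NOTE, struct syscall_note and compat_sys_example are hypothetical and exist only for illustration; nothing like them is claimed to be in the kernel. The function keeps an ordinary prototype, and the annotation sits next to it in its own macro.

```c
/* Hypothetical metadata record; the kernel's real tracing metadata differs. */
struct syscall_note {
	const char *name;
	int nargs;
};

/* Ordinary definition: grep, ctags and cscope see a normal prototype. */
long compat_sys_example(int fd, unsigned int flags)
{
	return fd + (long)flags;
}

/* The annotation lives in its own macro, beside the function, not around it. */
#define SYSCALL_NOTE(fn, n) \
	static const struct syscall_note note_##fn = { #fn, (n) }

SYSCALL_NOTE(compat_sys_example, 2);
```

A search for "compat_sys_example(" finds the definition directly, while tooling that wants the metadata can look at note_compat_sys_example.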


* RE: [PATCH] net: add dependency on fw class module
From: Amit Salecha @ 2010-06-23 11:49 UTC (permalink / raw)
  To: Ben Hutchings
  Cc: davem@davemloft.net, netdev@vger.kernel.org, Ameen Rahman,
	Anirban Chakraborty
In-Reply-To: <1277291941.26161.9.camel@localhost>

Actually, that does not really reflect the board type.
"board_type" is an encoding used to find the number of physical ports and the PHY speed.

NX3031 (netxen_nic) and QLOGIC CNA (qlcnic) devices share some common board forms.

But not all board types are valid for QLOGIC CNA devices. We will clean this up.

Thanks for pointing this out.

-Amit

-----Original Message-----
From: Ben Hutchings [mailto:bhutchings@solarflare.com] 
Sent: Wednesday, June 23, 2010 4:49 PM
To: Amit Salecha
Cc: davem@davemloft.net; netdev@vger.kernel.org; Ameen Rahman; Anirban Chakraborty
Subject: Re: [PATCH] net: add dependency on fw class module

On Tue, 2010-06-22 at 23:54 -0700, Amit Kumar Salecha wrote:
> From: Anirban Chakraborty <anirban.chakraborty@qlogic.com>
> 
> netxen_nic and qlcnic driver depends on firmware_class module.
[...]

By the way, I noticed that these drivers are very similar, and it looks
like qlcnic has code for a bunch of NetXen boards even though it doesn't
include their device IDs in its device ID table.  Have you considered
merging the two?

Ben.

-- 
Ben Hutchings, Senior Software Engineer, Solarflare Communications
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.


* Re: [PATCH 31/40] trace syscalls: Convert various generic compat syscalls
From: Frederic Weisbecker @ 2010-06-23 11:38 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Ian Munsie, linux-kernel, linuxppc-dev, Jason Baron,
	Steven Rostedt, Ingo Molnar, Benjamin Herrenschmidt,
	Paul Mackerras, Michael Ellerman, Alexander Viro, Andrew Morton,
	Jeff Moyer, David Howells, Oleg Nesterov, Arnd Bergmann,
	David S. Miller, Greg Kroah-Hartman, Dinakar Guniguntala,
	Thomas Gleixner, Ingo Molnar, Eric Biederman, Simon Kagstrom,
	WANG Cong, Sam Ravnborg, Rol
In-Reply-To: <4C21E3F8.9000405@linux.intel.com>

On Wed, Jun 23, 2010 at 12:37:44PM +0200, Andi Kleen wrote:
> , Frederic Weisbecker wrote:
>> On Wed, Jun 23, 2010 at 12:19:38PM +0200, Andi Kleen wrote:
>>> , Ian Munsie wrote:
>>>> From: Ian Munsie<imunsie@au1.ibm.com>
>>>>
>>>> This patch converts numerous trivial compat syscalls through the generic
>>>> kernel code to use the COMPAT_SYSCALL_DEFINE family of macros.
>>>
>>> Why? This just makes the code look uglier and the functions harder
>>> to grep for.
>>
>>
>> Because it makes them usable with syscall tracing.
>
> Ok that information is missing in the changelog then.


Agreed, the changelog lacks the purpose of what it does.



> Also I hope the uglification<->usefulness factor is really worth it.
> The patch is certainly no slouch on the uglification side.


It's worth it because the kernel's syscall tracing is not complete: we lack
the whole compat part.

These wrappers let us create a TRACE_EVENT() for every syscall automatically.
If we had to create them manually, the uglification would be much worse.

Most syscalls use the syscall wrappers already, so the uglification
is mostly there. We just forgot to uglify a bunch of them :)
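
As a rough illustration of why a wrapper macro helps here, the sketch below uses a hypothetical DEMO_SYSCALL_DEFINE2 that is far simpler than the kernel's real SYSCALL_DEFINEx family. Because the macro sees the syscall name and argument list, it can emit a metadata record alongside the function without any hand-written per-syscall code.

```c
/* Hypothetical, much simplified metadata; not the kernel's structure. */
struct syscall_meta {
	const char *name;
	int nargs;
};

/*
 * One macro invocation produces both the metadata record and the
 * function header, so a tracer could enumerate syscalls automatically.
 */
#define DEMO_SYSCALL_DEFINE2(name, t1, a1, t2, a2)                   \
	static const struct syscall_meta meta_##name = { #name, 2 }; \
	long demo_sys_##name(t1 a1, t2 a2)

DEMO_SYSCALL_DEFINE2(add, int, x, int, y)
{
	return (long)x + y;
}
```

The cost discussed in this thread is visible too: the text "demo_sys_add(" never appears in the source, which is what breaks grep and tag tools.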


> It also has maintenance costs, e.g. I doubt ctags and cscope
> will be able to deal with these kinds of macros, so it has a
> high cost for everyone using these tools. For those
> it would actually be better if you used a separate annotation
> that does not confuse standard C parsers.


I haven't heard any complaints about the existing syscall wrappers.

What kind of annotations could solve that?


* Re: [PATCH] net: add dependency on fw class module
From: Ben Hutchings @ 2010-06-23 11:19 UTC (permalink / raw)
  To: Amit Kumar Salecha; +Cc: davem, netdev, ameen.rahman, Anirban Chakraborty
In-Reply-To: <1277276075-29322-1-git-send-email-amit.salecha@qlogic.com>

On Tue, 2010-06-22 at 23:54 -0700, Amit Kumar Salecha wrote:
> From: Anirban Chakraborty <anirban.chakraborty@qlogic.com>
> 
> netxen_nic and qlcnic driver depends on firmware_class module.
[...]

By the way, I noticed that these drivers are very similar, and it looks
like qlcnic has code for a bunch of NetXen boards even though it doesn't
include their device IDs in its device ID table.  Have you considered
merging the two?

Ben.

-- 
Ben Hutchings, Senior Software Engineer, Solarflare Communications
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.



* Re: [PATCH 31/40] trace syscalls: Convert various generic compat syscalls
From: Andi Kleen @ 2010-06-23 10:37 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Ian Munsie, linux-kernel, linuxppc-dev, Jason Baron,
	Steven Rostedt, Ingo Molnar, Benjamin Herrenschmidt,
	Paul Mackerras, Michael Ellerman, Alexander Viro, Andrew Morton,
	Jeff Moyer, David Howells, Oleg Nesterov, Arnd Bergmann,
	David S. Miller, Greg Kroah-Hartman, Dinakar Guniguntala,
	Thomas Gleixner, Ingo Molnar, Eric Biederman, Simon Kagstrom,
	WANG Cong, Sam Ravnborg
In-Reply-To: <20100623102931.GB5242@nowhere>

, Frederic Weisbecker wrote:
> On Wed, Jun 23, 2010 at 12:19:38PM +0200, Andi Kleen wrote:
>> , Ian Munsie wrote:
>>> From: Ian Munsie<imunsie@au1.ibm.com>
>>>
>>> This patch converts numerous trivial compat syscalls through the generic
>>> kernel code to use the COMPAT_SYSCALL_DEFINE family of macros.
>>
>> Why? This just makes the code look uglier and the functions harder
>> to grep for.
>
>
> Because it makes them usable with syscall tracing.

Ok that information is missing in the changelog then.

Also I hope the uglification<->usefulness factor is really worth it.
The patch is certainly no slouch on the uglification side.

It also has maintenance costs, e.g. I doubt ctags and cscope
will be able to deal with these kinds of macros, so it has a
high cost for everyone using these tools. For those
it would actually be better if you used a separate annotation
that does not confuse standard C parsers.

-Andi


* [PATCH] snmp: fix SNMP_ADD_STATS()
From: Eric Dumazet @ 2010-06-23 10:32 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, Tom Herbert

commit aa2ea0586d9d (tcp: fix outsegs stat for TSO segments) incorrectly
assumed SNMP_ADD_STATS() was used from BH context.

Fix this using mib[!in_softirq()] instead of mib[0]

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Tom Herbert <therbert@google.com>
---
 include/net/snmp.h |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/net/snmp.h b/include/net/snmp.h
index 92456f1..899003d 100644
--- a/include/net/snmp.h
+++ b/include/net/snmp.h
@@ -134,7 +134,7 @@ struct linux_xfrm_mib {
 #define SNMP_ADD_STATS_USER(mib, field, addend)	\
 			this_cpu_add(mib[1]->mibs[field], addend)
 #define SNMP_ADD_STATS(mib, field, addend)	\
-			this_cpu_add(mib[0]->mibs[field], addend)
+			this_cpu_add(mib[!in_softirq()]->mibs[field], addend)
 /*
  * Use "__typeof__(*mib[0]) *ptr" instead of "__typeof__(mib[0]) ptr"
  * to make @ptr a non-percpu pointer.
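
The indexing trick in this patch can be modelled in userspace as follows. This is a sketch under stated assumptions: fake_in_softirq stands in for the kernel's in_softirq(), plain arrays replace the per-cpu counters, and struct mib_counters and snmp_add_stats() are hypothetical names. The patch context shows the _USER variant using slot 1, so slot 0 is taken to be the softirq slot.

```c
#include <assert.h>
#include <stdbool.h>

#define DEMO_MIB_MAX 4

/* Hypothetical model of one counter set; the real ones are per-cpu. */
struct mib_counters {
	unsigned long mibs[DEMO_MIB_MAX];
};

/* Stand-in for the kernel's in_softirq(); set by the caller in this sketch. */
static bool fake_in_softirq;

/*
 * Pick the counter set from the current context instead of assuming
 * softirq: !true == 0 selects slot 0, !false == 1 selects slot 1.
 */
static void snmp_add_stats(struct mib_counters *mib[2], int field,
			   unsigned long addend)
{
	assert(field >= 0 && field < DEMO_MIB_MAX);
	mib[!fake_in_softirq]->mibs[field] += addend;
}
```

With a hard-coded index of 0, an update made from process context would land in the softirq slot, which is the mistake the patch description points out.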




* Re: [PATCH 31/40] trace syscalls: Convert various generic compat syscalls
From: KOSAKI Motohiro @ 2010-06-23 10:30 UTC (permalink / raw)
  To: Andi Kleen
  Cc: kosaki.motohiro, Ian Munsie, linux-kernel, linuxppc-dev,
	Jason Baron, Frederic Weisbecker, Steven Rostedt, Ingo Molnar,
	Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	Alexander Viro, Andrew Morton, Jeff Moyer, David Howells,
	Oleg Nesterov, Arnd Bergmann, David S. Miller, Greg Kroah-Hartman,
	Dinakar Guniguntala, Thomas Gleixner, Ingo 
In-Reply-To: <4C21DFBA.2070202@linux.intel.com>

> , Ian Munsie wrote:
> > From: Ian Munsie<imunsie@au1.ibm.com>
> >
> > This patch converts numerous trivial compat syscalls through the generic
> > kernel code to use the COMPAT_SYSCALL_DEFINE family of macros.
> 
> Why? This just makes the code look uglier and the functions harder
> to grep for.

I guess the trace-syscall feature needs to override COMPAT_SYSCALL_DEFINE, but
it's only a guess...



* Re: [PATCH 31/40] trace syscalls: Convert various generic compat syscalls
From: Frederic Weisbecker @ 2010-06-23 10:29 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Ian Munsie, linux-kernel, linuxppc-dev, Jason Baron,
	Steven Rostedt, Ingo Molnar, Benjamin Herrenschmidt,
	Paul Mackerras, Michael Ellerman, Alexander Viro, Andrew Morton,
	Jeff Moyer, David Howells, Oleg Nesterov, Arnd Bergmann,
	David S. Miller, Greg Kroah-Hartman, Dinakar Guniguntala,
	Thomas Gleixner, Ingo Molnar, Eric Biederman, Simon Kagstrom,
	WANG Cong, Sam Ravnborg, Rol
In-Reply-To: <4C21DFBA.2070202@linux.intel.com>

On Wed, Jun 23, 2010 at 12:19:38PM +0200, Andi Kleen wrote:
> , Ian Munsie wrote:
>> From: Ian Munsie<imunsie@au1.ibm.com>
>>
>> This patch converts numerous trivial compat syscalls through the generic
>> kernel code to use the COMPAT_SYSCALL_DEFINE family of macros.
>
> Why? This just makes the code look uglier and the functions harder
> to grep for.


Because it makes them usable with syscall tracing.


* Re: [PATCH 31/40] trace syscalls: Convert various generic compat syscalls
From: Andi Kleen @ 2010-06-23 10:19 UTC (permalink / raw)
  To: Ian Munsie
  Cc: linux-kernel, linuxppc-dev, Jason Baron, Frederic Weisbecker,
	Steven Rostedt, Ingo Molnar, Benjamin Herrenschmidt,
	Paul Mackerras, Michael Ellerman, Alexander Viro, Andrew Morton,
	Jeff Moyer, David Howells, Oleg Nesterov, Arnd Bergmann,
	David S. Miller, Greg Kroah-Hartman, Dinakar Guniguntala,
	Thomas Gleixner, Ingo Molnar, Eric Biederman, Simon Kagstrom,
	WANG Cong, Sam Ravnborg
In-Reply-To: <1277287401-28571-32-git-send-email-imunsie@au1.ibm.com>

, Ian Munsie wrote:
> From: Ian Munsie<imunsie@au1.ibm.com>
>
> This patch converts numerous trivial compat syscalls through the generic
> kernel code to use the COMPAT_SYSCALL_DEFINE family of macros.

Why? This just makes the code look uglier and the functions harder
to grep for.

-Andi


* RE: [RFC PATCH v7 01/19] Add a new structure for skb buffer from external.
From: Dong, Eddie @ 2010-06-23 10:05 UTC (permalink / raw)
  To: Herbert Xu
  Cc: Xin, Xiaohui, Stephen Hemminger, netdev@vger.kernel.org,
	kvm@vger.kernel.org, linux-kernel@vger.kernel.org, mst@redhat.com,
	mingo@elte.hu, davem@davemloft.net, jdike@linux.intel.com,
	Dong, Eddie
In-Reply-To: <20100623095254.GA32491@gondor.apana.org.au>

Herbert Xu wrote:
> On Wed, Jun 23, 2010 at 04:09:40PM +0800, Dong, Eddie wrote:
>> 
>> Xiaohui & Herbert:
>> 	Mixing copy of head & 0-copy of bulk data imposes an additional
>> 	challenge to find the guest buffer. The backend driver may be
>> 	unable to find a spare guest buffer from the virtqueue at that
>> 	time, which may then block the receiving process. Can't we
>> 	completely eliminate netdev_alloc_skb here? Assigning the guest
>> 	buffer at this time makes life much easier.
> 
> I'm not sure I understand your concern.  If you mean that when
> the guest doesn't give enough pages to the host and the host
> can't receive on behalf of the guest then isn't that already
> the case with the original patch-set?
> 

I mean that once the frontend driver posts buffers to the backend
driver, the backend driver will "immediately" use those buffers to
compose skbs or gro_frags and post them to the assigned host NIC driver
as receive buffers. In that case, if the backend driver receives a
packet from the NIC that has to be copied, it may be unable to find an
additional free guest buffer because all of them are already in use by
the NIC driver. We have to reserve some guest buffers for the possible
copy even if the buffer address is not identified by the original skb :(

Thx, Eddie
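
The reservation idea can be sketched as simple accounting. This is only an assumption about how such a reserve might look; struct guest_pool and both helpers are hypothetical and do not come from the patch set. The zero-copy path may not draw the pool below the reserve, while the copy path may.

```c
/* Hypothetical accounting for guest-posted receive buffers. */
struct guest_pool {
	unsigned int free;     /* buffers posted by the guest, not yet used */
	unsigned int reserve;  /* buffers held back for the copy path */
};

/* Zero-copy path: hand a buffer to the NIC only if the reserve stays intact. */
static int pool_take_for_nic(struct guest_pool *p)
{
	if (p->free <= p->reserve)
		return 0;
	p->free--;
	return 1;
}

/* Copy path: may dip into the reserve; fails only when nothing is left. */
static int pool_take_for_copy(struct guest_pool *p)
{
	if (p->free == 0)
		return 0;
	p->free--;
	return 1;
}
```

With three free buffers and a reserve of one, the NIC path succeeds twice and is then refused, leaving one buffer for a packet that has to be copied.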


* [PATCH 31/40] trace syscalls: Convert various generic compat syscalls
From: Ian Munsie @ 2010-06-23 10:03 UTC (permalink / raw)
  To: linux-kernel, linuxppc-dev
  Cc: Jason Baron, Frederic Weisbecker, Steven Rostedt, Ingo Molnar,
	Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	Ian Munsie, Alexander Viro, Andrew Morton, Jeff Moyer,
	David Howells, Oleg Nesterov, Arnd Bergmann, David S. Miller,
	Greg Kroah-Hartman, Dinakar Guniguntala, Thomas Gleixner,
	Ingo Molnar, Eric Biederman, Simon Kagstrom <simon.kags
In-Reply-To: <1277287401-28571-1-git-send-email-imunsie@au1.ibm.com>

From: Ian Munsie <imunsie@au1.ibm.com>

This patch converts numerous trivial compat syscalls through the generic
kernel code to use the COMPAT_SYSCALL_DEFINE family of macros.

Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
---
 fs/compat.c            |    2 +-
 fs/compat_ioctl.c      |    4 ++--
 ipc/compat_mq.c        |   24 ++++++++++++------------
 kernel/futex_compat.c  |   19 ++++++++++---------
 kernel/kexec.c         |    8 ++++----
 kernel/ptrace.c        |    4 ++--
 kernel/sysctl_binary.c |    2 +-
 mm/mempolicy.c         |   19 ++++++++++---------
 net/compat.c           |   28 ++++++++++++++--------------
 9 files changed, 56 insertions(+), 54 deletions(-)

diff --git a/fs/compat.c b/fs/compat.c
index df0b502..9897b7b 100644
--- a/fs/compat.c
+++ b/fs/compat.c
@@ -1823,7 +1823,7 @@ struct compat_sel_arg_struct {
 	compat_uptr_t tvp;
 };
 
-asmlinkage long compat_sys_old_select(struct compat_sel_arg_struct __user *arg)
+COMPAT_SYSCALL_DEFINE1(old_select, struct compat_sel_arg_struct __user *, arg)
 {
 	struct compat_sel_arg_struct a;
 
diff --git a/fs/compat_ioctl.c b/fs/compat_ioctl.c
index 641640d..60d7e91 100644
--- a/fs/compat_ioctl.c
+++ b/fs/compat_ioctl.c
@@ -1674,8 +1674,8 @@ static int compat_ioctl_check_table(unsigned int xcmd)
 	return ioctl_pointer[i] == xcmd;
 }
 
-asmlinkage long compat_sys_ioctl(unsigned int fd, unsigned int cmd,
-				unsigned long arg)
+COMPAT_SYSCALL_DEFINE3(ioctl, unsigned int, fd, unsigned int, cmd,
+				unsigned long, arg)
 {
 	struct file *filp;
 	int error = -EBADF;
diff --git a/ipc/compat_mq.c b/ipc/compat_mq.c
index d8d1e9f..53593d3 100644
--- a/ipc/compat_mq.c
+++ b/ipc/compat_mq.c
@@ -46,9 +46,9 @@ static inline int put_compat_mq_attr(const struct mq_attr *attr,
 		| __put_user(attr->mq_curmsgs, &uattr->mq_curmsgs);
 }
 
-asmlinkage long compat_sys_mq_open(const char __user *u_name,
-			int oflag, compat_mode_t mode,
-			struct compat_mq_attr __user *u_attr)
+COMPAT_SYSCALL_DEFINE4(mq_open, const char __user *, u_name,
+			int, oflag, compat_mode_t, mode,
+			struct compat_mq_attr __user *, u_attr)
 {
 	void __user *p = NULL;
 	if (u_attr && oflag & O_CREAT) {
@@ -75,10 +75,10 @@ static int compat_prepare_timeout(struct timespec __user * *p,
 	return 0;
 }
 
-asmlinkage long compat_sys_mq_timedsend(mqd_t mqdes,
-			const char __user *u_msg_ptr,
-			size_t msg_len, unsigned int msg_prio,
-			const struct compat_timespec __user *u_abs_timeout)
+COMPAT_SYSCALL_DEFINE5(mq_timedsend, mqd_t, mqdes,
+			const char __user *, u_msg_ptr,
+			size_t, msg_len, unsigned int, msg_prio,
+			const struct compat_timespec __user *, u_abs_timeout)
 {
 	struct timespec __user *u_ts;
 
@@ -102,8 +102,8 @@ asmlinkage ssize_t compat_sys_mq_timedreceive(mqd_t mqdes,
 			u_msg_prio, u_ts);
 }
 
-asmlinkage long compat_sys_mq_notify(mqd_t mqdes,
-			const struct compat_sigevent __user *u_notification)
+COMPAT_SYSCALL_DEFINE2(mq_notify, mqd_t, mqdes,
+			const struct compat_sigevent __user *, u_notification)
 {
 	struct sigevent __user *p = NULL;
 	if (u_notification) {
@@ -119,9 +119,9 @@ asmlinkage long compat_sys_mq_notify(mqd_t mqdes,
 	return sys_mq_notify(mqdes, p);
 }
 
-asmlinkage long compat_sys_mq_getsetattr(mqd_t mqdes,
-			const struct compat_mq_attr __user *u_mqstat,
-			struct compat_mq_attr __user *u_omqstat)
+COMPAT_SYSCALL_DEFINE3(mq_getsetattr, mqd_t, mqdes,
+			const struct compat_mq_attr __user *, u_mqstat,
+			struct compat_mq_attr __user *, u_omqstat)
 {
 	struct mq_attr mqstat;
 	struct mq_attr __user *p = compat_alloc_user_space(2 * sizeof(*p));
diff --git a/kernel/futex_compat.c b/kernel/futex_compat.c
index d49afb2..d798c9f 100644
--- a/kernel/futex_compat.c
+++ b/kernel/futex_compat.c
@@ -10,6 +10,7 @@
 #include <linux/compat.h>
 #include <linux/nsproxy.h>
 #include <linux/futex.h>
+#include <linux/syscalls.h>
 
 #include <asm/uaccess.h>
 
@@ -114,9 +115,9 @@ void compat_exit_robust_list(struct task_struct *curr)
 	}
 }
 
-asmlinkage long
-compat_sys_set_robust_list(struct compat_robust_list_head __user *head,
-			   compat_size_t len)
+COMPAT_SYSCALL_DEFINE2(set_robust_list,
+		struct compat_robust_list_head __user *, head,
+		compat_size_t, len)
 {
 	if (!futex_cmpxchg_enabled)
 		return -ENOSYS;
@@ -129,9 +130,9 @@ compat_sys_set_robust_list(struct compat_robust_list_head __user *head,
 	return 0;
 }
 
-asmlinkage long
-compat_sys_get_robust_list(int pid, compat_uptr_t __user *head_ptr,
-			   compat_size_t __user *len_ptr)
+COMPAT_SYSCALL_DEFINE3(get_robust_list, int, pid,
+		compat_uptr_t __user *, head_ptr,
+		compat_size_t __user *, len_ptr)
 {
 	struct compat_robust_list_head __user *head;
 	unsigned long ret;
@@ -170,9 +171,9 @@ err_unlock:
 	return ret;
 }
 
-asmlinkage long compat_sys_futex(u32 __user *uaddr, int op, u32 val,
-		struct compat_timespec __user *utime, u32 __user *uaddr2,
-		u32 val3)
+COMPAT_SYSCALL_DEFINE6(futex, u32 __user *, uaddr, int, op, u32, val,
+		struct compat_timespec __user *, utime, u32 __user *, uaddr2,
+		u32, val3)
 {
 	struct timespec ts;
 	ktime_t t, *tp = NULL;
diff --git a/kernel/kexec.c b/kernel/kexec.c
index 474a847..0b261ed 100644
--- a/kernel/kexec.c
+++ b/kernel/kexec.c
@@ -1024,10 +1024,10 @@ out:
 }
 
 #ifdef CONFIG_COMPAT
-asmlinkage long compat_sys_kexec_load(unsigned long entry,
-				unsigned long nr_segments,
-				struct compat_kexec_segment __user *segments,
-				unsigned long flags)
+COMPAT_SYSCALL_DEFINE4(kexec_load, unsigned long, entry,
+				unsigned long, nr_segments,
+				struct compat_kexec_segment __user *, segments,
+				unsigned long, flags)
 {
 	struct compat_kexec_segment in;
 	struct kexec_segment out, __user *ksegments;
diff --git a/kernel/ptrace.c b/kernel/ptrace.c
index 74a3d69..0d91d7f 100644
--- a/kernel/ptrace.c
+++ b/kernel/ptrace.c
@@ -826,8 +826,8 @@ int compat_ptrace_request(struct task_struct *child, compat_long_t request,
 	return ret;
 }
 
-asmlinkage long compat_sys_ptrace(compat_long_t request, compat_long_t pid,
-				  compat_long_t addr, compat_long_t data)
+COMPAT_SYSCALL_DEFINE4(ptrace, compat_long_t, request, compat_long_t, pid,
+				  compat_long_t, addr, compat_long_t, data)
 {
 	struct task_struct *child;
 	long ret;
diff --git a/kernel/sysctl_binary.c b/kernel/sysctl_binary.c
index 1357c57..fb061c7 100644
--- a/kernel/sysctl_binary.c
+++ b/kernel/sysctl_binary.c
@@ -1502,7 +1502,7 @@ struct compat_sysctl_args {
 	compat_ulong_t	__unused[4];
 };
 
-asmlinkage long compat_sys_sysctl(struct compat_sysctl_args __user *args)
+COMPAT_SYSCALL_DEFINE1(sysctl, struct compat_sysctl_args __user *, args)
 {
 	struct compat_sysctl_args tmp;
 	compat_size_t __user *compat_oldlenp;
diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index 5d6fb33..b9fbceb 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -1372,10 +1372,10 @@ SYSCALL_DEFINE5(get_mempolicy, int __user *, policy,
 
 #ifdef CONFIG_COMPAT
 
-asmlinkage long compat_sys_get_mempolicy(int __user *policy,
-				     compat_ulong_t __user *nmask,
-				     compat_ulong_t maxnode,
-				     compat_ulong_t addr, compat_ulong_t flags)
+COMPAT_SYSCALL_DEFINE5(get_mempolicy, int __user *, policy,
+				     compat_ulong_t __user *, nmask,
+				     compat_ulong_t, maxnode,
+				     compat_ulong_t, addr, compat_ulong_t, flags)
 {
 	long err;
 	unsigned long __user *nm = NULL;
@@ -1400,8 +1400,9 @@ asmlinkage long compat_sys_get_mempolicy(int __user *policy,
 	return err;
 }
 
-asmlinkage long compat_sys_set_mempolicy(int mode, compat_ulong_t __user *nmask,
-				     compat_ulong_t maxnode)
+COMPAT_SYSCALL_DEFINE3(set_mempolicy, int, mode,
+				     compat_ulong_t __user *, nmask,
+				     compat_ulong_t, maxnode)
 {
 	long err = 0;
 	unsigned long __user *nm = NULL;
@@ -1423,9 +1424,9 @@ asmlinkage long compat_sys_set_mempolicy(int mode, compat_ulong_t __user *nmask,
 	return sys_set_mempolicy(mode, nm, nr_bits+1);
 }
 
-asmlinkage long compat_sys_mbind(compat_ulong_t start, compat_ulong_t len,
-			     compat_ulong_t mode, compat_ulong_t __user *nmask,
-			     compat_ulong_t maxnode, compat_ulong_t flags)
+COMPAT_SYSCALL_DEFINE6(mbind, compat_ulong_t, start, compat_ulong_t, len,
+			     compat_ulong_t, mode, compat_ulong_t __user *, nmask,
+			     compat_ulong_t, maxnode, compat_ulong_t, flags)
 {
 	long err = 0;
 	unsigned long __user *nm = NULL;
diff --git a/net/compat.c b/net/compat.c
index ec24d9e..eb861c6 100644
--- a/net/compat.c
+++ b/net/compat.c
@@ -385,8 +385,8 @@ static int compat_sock_setsockopt(struct socket *sock, int level, int optname,
 	return sock_setsockopt(sock, level, optname, optval, optlen);
 }
 
-asmlinkage long compat_sys_setsockopt(int fd, int level, int optname,
-				char __user *optval, unsigned int optlen)
+COMPAT_SYSCALL_DEFINE5(setsockopt, int, fd, int, level, int, optname,
+				char __user *, optval, unsigned int, optlen)
 {
 	int err;
 	struct socket *sock;
@@ -498,8 +498,8 @@ int compat_sock_get_timestampns(struct sock *sk, struct timespec __user *usersta
 }
 EXPORT_SYMBOL(compat_sock_get_timestampns);
 
-asmlinkage long compat_sys_getsockopt(int fd, int level, int optname,
-				char __user *optval, int __user *optlen)
+COMPAT_SYSCALL_DEFINE5(getsockopt, int, fd, int, level, int, optname,
+				char __user *, optval, int __user *, optlen)
 {
 	int err;
 	struct socket *sock;
@@ -731,31 +731,31 @@ static unsigned char nas[20]={AL(0),AL(3),AL(3),AL(3),AL(2),AL(3),
 				AL(4),AL(5)};
 #undef AL
 
-asmlinkage long compat_sys_sendmsg(int fd, struct compat_msghdr __user *msg, unsigned flags)
+COMPAT_SYSCALL_DEFINE3(sendmsg, int, fd, struct compat_msghdr __user *, msg, unsigned, flags)
 {
 	return sys_sendmsg(fd, (struct msghdr __user *)msg, flags | MSG_CMSG_COMPAT);
 }
 
-asmlinkage long compat_sys_recvmsg(int fd, struct compat_msghdr __user *msg, unsigned int flags)
+COMPAT_SYSCALL_DEFINE3(recvmsg, int, fd, struct compat_msghdr __user *, msg, unsigned int, flags)
 {
 	return sys_recvmsg(fd, (struct msghdr __user *)msg, flags | MSG_CMSG_COMPAT);
 }
 
-asmlinkage long compat_sys_recv(int fd, void __user *buf, size_t len, unsigned flags)
+COMPAT_SYSCALL_DEFINE4(recv, int, fd, void __user *, buf, size_t, len, unsigned, flags)
 {
 	return sys_recv(fd, buf, len, flags | MSG_CMSG_COMPAT);
 }
 
-asmlinkage long compat_sys_recvfrom(int fd, void __user *buf, size_t len,
-				    unsigned flags, struct sockaddr __user *addr,
-				    int __user *addrlen)
+COMPAT_SYSCALL_DEFINE6(recvfrom, int, fd, void __user *, buf, size_t, len,
+				    unsigned, flags, struct sockaddr __user *, addr,
+				    int __user *, addrlen)
 {
 	return sys_recvfrom(fd, buf, len, flags | MSG_CMSG_COMPAT, addr, addrlen);
 }
 
-asmlinkage long compat_sys_recvmmsg(int fd, struct compat_mmsghdr __user *mmsg,
-				    unsigned vlen, unsigned int flags,
-				    struct compat_timespec __user *timeout)
+COMPAT_SYSCALL_DEFINE5(recvmmsg, int, fd, struct compat_mmsghdr __user *, mmsg,
+				    unsigned, vlen, unsigned int, flags,
+				    struct compat_timespec __user *, timeout)
 {
 	int datagrams;
 	struct timespec ktspec;
@@ -775,7 +775,7 @@ asmlinkage long compat_sys_recvmmsg(int fd, struct compat_mmsghdr __user *mmsg,
 	return datagrams;
 }
 
-asmlinkage long compat_sys_socketcall(int call, u32 __user *args)
+COMPAT_SYSCALL_DEFINE2(socketcall, int, call, u32 __user *, args)
 {
 	int ret;
 	u32 a[6];
-- 
1.7.1


* Re: [RFC PATCH v7 01/19] Add a new structure for skb buffer from external.
From: Herbert Xu @ 2010-06-23  9:52 UTC (permalink / raw)
  To: Dong, Eddie
  Cc: Xin, Xiaohui, Stephen Hemminger, netdev@vger.kernel.org,
	kvm@vger.kernel.org, linux-kernel@vger.kernel.org, mst@redhat.com,
	mingo@elte.hu, davem@davemloft.net, jdike@linux.intel.com
In-Reply-To: <1A42CE6F5F474C41B63392A5F80372B21F58CD9A@shsmsx501.ccr.corp.intel.com>

On Wed, Jun 23, 2010 at 04:09:40PM +0800, Dong, Eddie wrote:
>
> Xiaohui & Herbert:
> 	Mixing copy of head & 0-copy of bulk data imposes an additional
> 	challenge in finding the guest buffer. The backend driver may be
> 	unable to find a spare guest buffer from the virtqueue at that time,
> 	which may then block the receiving process.
> 	Can't we completely eliminate netdev_alloc_skb here? Assigning the
> 	guest buffer at this time makes life much easier.

I'm not sure I understand your concern.  If you mean that the host
can't receive on behalf of the guest when the guest doesn't give it
enough pages, then isn't that already the case with the original
patch-set?

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply

* Re: [PATCH] phylib: Add autoload support for the LXT973 phy.
From: David Woodhouse @ 2010-06-23  9:00 UTC (permalink / raw)
  To: Richard Cochran; +Cc: netdev
In-Reply-To: <20100623053723.GA3404@riccoc20.at.omicron.at>

On Wed, 2010-06-23 at 07:37 +0200, Richard Cochran wrote:
> 
> Question about the whole PHY MODULE_DEVICE_TABLE system:
> 
> I recently posted a phy driver for the National Semiconductor
> DP83640. During development, I used drivers/net/arm/ixp4xx_eth.c as
> the MAC driver, which was linked into the kernel (not a module). I
> noticed that the phy driver's probe function only gets called if the
> phy driver is also statically linked, but not when it is loaded as a
> module.
>
> Is this the correct behavior? 

Hm, that seems like the _expected_ behaviour, certainly. The MAC driver
will probe its device at boot time, and will issue a request_module() to
load a specific PHY driver if there is one. When no such module
turns up (which it won't if you have no file system mounted yet), it'll
just fall back to the generic PHY support.
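
A rough user-space sketch of the fallback described above, for
illustration only. The enum, module_available() and bind_phy_driver()
are hypothetical names invented for this sketch, not phylib interfaces;
in the kernel the on-demand load is attempted with request_module().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum phy_binding {
	PHY_BOUND_SPECIFIC,	/* the device-specific PHY driver probes */
	PHY_BOUND_GENERIC,	/* generic PHY support takes over */
};

/* Hypothetical stand-in for the on-demand load: a module file can only
 * be found once a file system holding it has been mounted. */
bool module_available(bool fs_mounted)
{
	return fs_mounted;
}

/* Decide which driver a PHY ends up with when the MAC driver probes. */
enum phy_binding bind_phy_driver(const char *name, bool built_in,
				 bool fs_mounted)
{
	assert(name != NULL);

	if (built_in)
		return PHY_BOUND_SPECIFIC;	/* already registered */
	if (module_available(fs_mounted))
		return PHY_BOUND_SPECIFIC;	/* loaded on demand */
	return PHY_BOUND_GENERIC;		/* nothing turned up */
}
```

With a built-in MAC driver probing at boot, a modular PHY driver and no
file system mounted (built_in == false, fs_mounted == false), the sketch
returns PHY_BOUND_GENERIC, which matches the behaviour Richard observed.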

-- 
David Woodhouse                            Open Source Technology Centre
David.Woodhouse@intel.com                              Intel Corporation

^ permalink raw reply

* RE: [RFC PATCH v7 01/19] Add a new structure for skb buffer from external.
From: Dong, Eddie @ 2010-06-23  8:09 UTC (permalink / raw)
  To: Xin, Xiaohui, Herbert Xu
  Cc: Stephen Hemminger, netdev@vger.kernel.org, kvm@vger.kernel.org,
	linux-kernel@vger.kernel.org, mst@redhat.com, mingo@elte.hu,
	davem@davemloft.net, jdike@linux.intel.com, Dong, Eddie
In-Reply-To: <F2E9EB7348B8264F86B6AB8151CE2D7915089FE4CC@shsmsx502.ccr.corp.intel.com>


> 3) As I have mentioned above, with this idea, netdev_alloc_skb() will
> allocate as usual, and the data pointed to by skb->data will be copied
> into the first guest buffer. That means we should reserve sufficient
> room in the guest buffer. For a driver that supports PS mode (for
> example ixgbe), the room will be more than 128 bytes. After the 128
> bytes, we will put the first frag data. Looking at the functions
> page_to_skb() and receive_mergeable() in virtio-net.c, that means we
> should modify the guest virtio-net driver to compute the offset as the
> parameter for skb_set_frag().
>
> What do you think about this? Attached is a patch showing how to modify
> the guest driver. I reserve 512 bytes as an example, and transfer the
> header length of the skb in hdr->hdr_len.
> 
Xiaohui & Herbert:
	Mixing copy of head & 0-copy of bulk data imposes an additional
	challenge in finding the guest buffer. The backend driver may be
	unable to find a spare guest buffer from the virtqueue at that time,
	which may then block the receiving process.
	Can't we completely eliminate netdev_alloc_skb here? Assigning the
	guest buffer at this time makes life much easier.
Thx, Eddie

^ permalink raw reply

* Re: [PATCH] sched: silence PROVE_RCU in sched_fork()
From: Li Zefan @ 2010-06-23  7:25 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Miles Lane, paulmck, Vivek Goyal, Eric Paris, Lai Jiangshan,
	Ingo Molnar, LKML, nauman, eric.dumazet, netdev, Jens Axboe,
	Gui Jianfeng, Johannes Berg, Paul Menage
In-Reply-To: <1277199893.1875.690.camel@laptop>

Peter Zijlstra wrote:
> Paul Menage, Li Zefan, any comments?
> 
> 
> ---
> Because cgroup_fork() is run before sched_fork() [ from copy_process() ]
> and the child's pid is not yet visible, the child is pinned to its
> cgroup. Therefore we can silence this warning.
> 

The explanation is correct.

We silenced another warning for the same reason.

See freezer_fork() in kernel/cgroup_freezer.c

> A nicer solution would be moving cgroup_fork() to right after
> dup_task_struct() and exclude PF_STARTING from task_subsys_state().
> 

It seems PF_STARTING is set in copy_flags(), which runs after
dup_task_struct() and before cgroup_fork(). But it is cleared after
copy_process(), that is, after the task has been linked into the tasklist,
so this doesn't seem to work...

> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>

For this patch:

Reviewed-by: Li Zefan <lizf@cn.fujitsu.com>

> ---
>  kernel/sched.c |    9 +++++++++
>  1 files changed, 9 insertions(+), 0 deletions(-)
> 
> diff --git a/kernel/sched.c b/kernel/sched.c
> index b697606..2e79518 100644
> --- a/kernel/sched.c
> +++ b/kernel/sched.c
> @@ -2561,7 +2561,16 @@ void sched_fork(struct task_struct *p, int clone_flags)
>  	if (p->sched_class->task_fork)
>  		p->sched_class->task_fork(p);
>  
> +	/*
> +	 * The child is not yet in the pid-hash so no cgroup attach races,
> +	 * and the cgroup is pinned to this child due to cgroup_fork()
> +	 * is ran before sched_fork().
> +	 *
> +	 * Silence PROVE_RCU.
> +	 */
> +	rcu_read_lock();
>  	set_task_cpu(p, cpu);
> +	rcu_read_unlock();
>  
>  #if defined(CONFIG_SCHEDSTATS) || defined(CONFIG_TASK_DELAY_ACCT)
>  	if (likely(sched_info_on()))
> 
> 
> 

^ permalink raw reply

* [PATCH net-2.6] cnic: Disable statistics initialization for eth clients that do not support statistics
From: Dmitry Kravkov @ 2010-06-23  7:15 UTC (permalink / raw)
  To: davem, netdev; +Cc: mchan, eilong

Disable statistics initialization for eth clients that do not support
statistics. This prevents memory corruption on bnx2x hardware.

Author: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
---
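
To illustrate the guard this patch adds, here is a minimal user-space
model. The memory layout, the MAX_STAT_COUNTER_ID value, STATS_WORDS,
fake_intmem_fill() and reset_client_stats() are assumptions made up for
the sketch; the driver uses a separate limit per storm
(MAX_X_/MAX_T_/MAX_U_STAT_COUNTER_ID) and writes through CNIC_WR().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_STAT_COUNTER_ID	4	/* hypothetical limit */
#define STATS_WORDS		2	/* hypothetical block size, in 32-bit words */

/* Hypothetical model of internal memory: one stats block per counter id,
 * followed directly by unrelated state. */
struct fake_intmem {
	uint32_t stats[MAX_STAT_COUNTER_ID][STATS_WORDS];
	uint32_t other_state;
};

/* Fill the whole model with a non-zero pattern so later writes show up. */
void fake_intmem_fill(struct fake_intmem *mem)
{
	assert(mem != NULL);
	memset(mem, 0xff, sizeof(*mem));
}

/* Zero the per-client block only if the client id owns one.
 * Returns 1 if statistics were reset, 0 if the client has none. */
int reset_client_stats(struct fake_intmem *mem, unsigned int cli)
{
	size_t i;

	assert(mem != NULL);
	if (cli >= MAX_STAT_COUNTER_ID)
		return 0;	/* no block for this client: write nothing */

	for (i = 0; i < STATS_WORDS; i++)
		mem->stats[cli][i] = 0;
	return 1;
}
```

In this model, dropping the bounds check would let a client id at or
above the limit index past stats[] and overwrite whatever follows it,
which is presumably the kind of corruption the commit message refers to.
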
 drivers/net/cnic.c |   55 ++++++++++++++++++++++++++++++++-------------------
 1 files changed, 34 insertions(+), 21 deletions(-)

diff --git a/drivers/net/cnic.c b/drivers/net/cnic.c
index fe92566..b314407 100644
--- a/drivers/net/cnic.c
+++ b/drivers/net/cnic.c
@@ -3919,8 +3919,9 @@ static void cnic_init_bnx2x_tx_ring(struct cnic_dev *dev)
 		HC_INDEX_DEF_C_ETH_ISCSI_CQ_CONS;
 	context->cstorm_st_context.status_block_id = BNX2X_DEF_SB_ID;
 
-	context->xstorm_st_context.statistics_data = (cli |
-				XSTORM_ETH_ST_CONTEXT_STATISTICS_ENABLE);
+	if (cli < MAX_X_STAT_COUNTER_ID)
+		context->xstorm_st_context.statistics_data = cli |
+				XSTORM_ETH_ST_CONTEXT_STATISTICS_ENABLE;
 
 	context->xstorm_ag_context.cdu_reserved =
 		CDU_RSRVD_VALUE_TYPE_A(BNX2X_HW_CID(BNX2X_ISCSI_L2_CID, func),
@@ -3928,10 +3929,12 @@ static void cnic_init_bnx2x_tx_ring(struct cnic_dev *dev)
 					ETH_CONNECTION_TYPE);
 
 	/* reset xstorm per client statistics */
-	val = BAR_XSTRORM_INTMEM +
-	      XSTORM_PER_COUNTER_ID_STATS_OFFSET(port, cli);
-	for (i = 0; i < sizeof(struct xstorm_per_client_stats) / 4; i++)
-		CNIC_WR(dev, val + i * 4, 0);
+	if (cli < MAX_X_STAT_COUNTER_ID) {
+		val = BAR_XSTRORM_INTMEM +
+		      XSTORM_PER_COUNTER_ID_STATS_OFFSET(port, cli);
+		for (i = 0; i < sizeof(struct xstorm_per_client_stats) / 4; i++)
+			CNIC_WR(dev, val + i * 4, 0);
+	}
 
 	cp->tx_cons_ptr =
 		&cp->bnx2x_def_status_blk->c_def_status_block.index_values[
@@ -3978,9 +3981,11 @@ static void cnic_init_bnx2x_rx_ring(struct cnic_dev *dev)
 						BNX2X_ISCSI_RX_SB_INDEX_NUM;
 	context->ustorm_st_context.common.clientId = cli;
 	context->ustorm_st_context.common.status_block_id = BNX2X_DEF_SB_ID;
-	context->ustorm_st_context.common.flags =
-		USTORM_ETH_ST_CONTEXT_CONFIG_ENABLE_STATISTICS;
-	context->ustorm_st_context.common.statistics_counter_id = cli;
+	if (cli < MAX_U_STAT_COUNTER_ID) {
+		context->ustorm_st_context.common.flags =
+			USTORM_ETH_ST_CONTEXT_CONFIG_ENABLE_STATISTICS;
+		context->ustorm_st_context.common.statistics_counter_id = cli;
+	}
 	context->ustorm_st_context.common.mc_alignment_log_size = 0;
 	context->ustorm_st_context.common.bd_buff_size =
 						cp->l2_single_buf_size;
@@ -4011,10 +4016,13 @@ static void cnic_init_bnx2x_rx_ring(struct cnic_dev *dev)
 
 	/* client tstorm info */
 	tstorm_client.mtu = cp->l2_single_buf_size - 14;
-	tstorm_client.config_flags =
-			(TSTORM_ETH_CLIENT_CONFIG_E1HOV_REM_ENABLE |
-			TSTORM_ETH_CLIENT_CONFIG_STATSITICS_ENABLE);
-	tstorm_client.statistics_counter_id = cli;
+	tstorm_client.config_flags = TSTORM_ETH_CLIENT_CONFIG_E1HOV_REM_ENABLE;
+
+	if (cli < MAX_T_STAT_COUNTER_ID) {
+		tstorm_client.config_flags |=
+				TSTORM_ETH_CLIENT_CONFIG_STATSITICS_ENABLE;
+		tstorm_client.statistics_counter_id = cli;
+	}
 
 	CNIC_WR(dev, BAR_TSTRORM_INTMEM +
 		   TSTORM_CLIENT_CONFIG_OFFSET(port, cli),
@@ -4024,16 +4032,21 @@ static void cnic_init_bnx2x_rx_ring(struct cnic_dev *dev)
 		   ((u32 *)&tstorm_client)[1]);
 
 	/* reset tstorm per client statistics */
-	val = BAR_TSTRORM_INTMEM +
-	      TSTORM_PER_COUNTER_ID_STATS_OFFSET(port, cli);
-	for (i = 0; i < sizeof(struct tstorm_per_client_stats) / 4; i++)
-		CNIC_WR(dev, val + i * 4, 0);
+	if (cli < MAX_T_STAT_COUNTER_ID) {
+
+		val = BAR_TSTRORM_INTMEM +
+		      TSTORM_PER_COUNTER_ID_STATS_OFFSET(port, cli);
+		for (i = 0; i < sizeof(struct tstorm_per_client_stats) / 4; i++)
+			CNIC_WR(dev, val + i * 4, 0);
+	}
 
 	/* reset ustorm per client statistics */
-	val = BAR_USTRORM_INTMEM +
-	      USTORM_PER_COUNTER_ID_STATS_OFFSET(port, cli);
-	for (i = 0; i < sizeof(struct ustorm_per_client_stats) / 4; i++)
-		CNIC_WR(dev, val + i * 4, 0);
+	if (cli < MAX_U_STAT_COUNTER_ID) {
+		val = BAR_USTRORM_INTMEM +
+		      USTORM_PER_COUNTER_ID_STATS_OFFSET(port, cli);
+		for (i = 0; i < sizeof(struct ustorm_per_client_stats) / 4; i++)
+			CNIC_WR(dev, val + i * 4, 0);
+	}
 
 	cp->rx_cons_ptr =
 		&cp->bnx2x_def_status_blk->u_def_status_block.index_values[
-- 
1.7.1




^ permalink raw reply related

* [PATCH] netfilter: xtables target SYNPROXY
From: Changli Gao @ 2010-06-23  7:06 UTC (permalink / raw)
  To: Patrick McHardy
  Cc: David S. Miller, Alexey Kuznetsov, Jan Engelhardt,
	Jozsef Kadlecsik, Pekka Savola (ipv6), James Morris,
	Hideaki YOSHIFUJI, netfilter-devel, netdev, Changli Gao

xtables target SYNPROXY.

This patch implements an xtables target, SYNPROXY. Because the connection to
the TCP server is not established until the ACK from the client is received,
it can protect the TCP server from SYN-flood attacks.

It works in the raw table of the PREROUTING chain, before the conntracking
system. Syncookies are used, so no new state is introduced into the
conntracking system. In fact, until the first connection is established, the
conntracking system doesn't see any packets. So during a SYN-flood attack, the
conntracking system won't be busy finding and deleting unassured conntrack
entries.

As the SYN packet of the second connection request is sent locally, DNAT
rules that are in the PREROUTING chain should be moved to the OUTPUT chain.
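
Because the client completes its handshake against a sequence number
chosen by the proxy, while the second connection gets its sequence number
from the real server, later segments need their sequence/ACK fields
shifted by a fixed offset. The sketch below shows that arithmetic in
user-space C. Only the client-to-server ACK adjustment
(ack_seq - seq_diff) is visible in this part of the patch; the struct,
the function names and the server-to-client direction are assumptions
made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-connection state for the sketch. */
struct seq_splice {
	uint32_t diff;	/* proxy ISN minus server ISN, modulo 2^32 */
};

void seq_splice_init(struct seq_splice *s, uint32_t proxy_isn,
		     uint32_t server_isn)
{
	assert(s != NULL);
	s->diff = proxy_isn - server_isn;	/* unsigned: wraps cleanly */
}

/* Client -> server: the client acknowledges the proxy's numbering, so
 * shift the ACK back into the server's numbering. */
uint32_t ack_to_server(const struct seq_splice *s, uint32_t client_ack)
{
	assert(s != NULL);
	return client_ack - s->diff;
}

/* Server -> client (assumed direction): shift the server's sequence
 * number into the numbering the client saw during the handshake. */
uint32_t seq_to_client(const struct seq_splice *s, uint32_t server_seq)
{
	assert(s != NULL);
	return server_seq + s->diff;
}
```

Using unsigned 32-bit arithmetic throughout means the translation stays
correct when either sequence space wraps past 2^32.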

Signed-off-by: Changli Gao <xiaosuo@gmail.com>
---
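
For readers unfamiliar with how a syncookie carries the MSS without
keeping state, here is a small user-space sketch of the table lookup used
by __cookie_v4_init_sequence() and cookie_v4_check_sequence() in this
patch. The table values and the helper names mss_to_index() and
index_to_mss() are illustrative assumptions; only the loop shape and the
bounds check are taken from the diff below.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Illustrative table only; the kernel's msstab values may differ. */
static const uint16_t msstab[] = { 64, 512, 536, 1024, 1440, 1460, 4312, 8960 };

/* Replace *mssp with the largest table entry not above it and return
 * that entry's index. Values below msstab[0] map to index 0. */
unsigned int mss_to_index(uint16_t *mssp)
{
	unsigned int mssind;

	assert(mssp != NULL);
	for (mssind = ARRAY_SIZE(msstab) - 1; mssind; mssind--)
		if (*mssp >= msstab[mssind])
			break;
	*mssp = msstab[mssind];
	return mssind;
}

/* Recover the MSS from a decoded index; 0 means the index is out of
 * range, i.e. the value was not a valid cookie. */
uint16_t index_to_mss(unsigned int mssind)
{
	return mssind < ARRAY_SIZE(msstab) ? msstab[mssind] : 0;
}
```

Only the small index has to fit into the cookie, so the proxy can
rebuild an approximate MSS for the SYN it sends to the server from the
client's final ACK alone.
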
 include/net/netfilter/nf_conntrack.h        |   10 
 include/net/netfilter/nf_conntrack_core.h   |   21 
 include/net/netfilter/nf_conntrack_extend.h |    2 
 include/net/tcp.h                           |    7 
 net/ipv4/syncookies.c                       |   22 
 net/ipv4/tcp_ipv4.c                         |    9 
 net/netfilter/Kconfig                       |   17 
 net/netfilter/Makefile                      |    1 
 net/netfilter/nf_conntrack_core.c           |   42 +
 net/netfilter/xt_SYNPROXY.c                 |  678 ++++++++++++++++++++++++++++
 10 files changed, 790 insertions(+), 19 deletions(-)
diff --git a/include/net/netfilter/nf_conntrack.h b/include/net/netfilter/nf_conntrack.h
index e624dae..5e6d8e4 100644
--- a/include/net/netfilter/nf_conntrack.h
+++ b/include/net/netfilter/nf_conntrack.h
@@ -311,5 +311,15 @@ do {							\
 #define MODULE_ALIAS_NFCT_HELPER(helper) \
         MODULE_ALIAS("nfct-helper-" helper)
 
+#if defined(CONFIG_NETFILTER_XT_TARGET_SYNPROXY) || \
+    defined(CONFIG_NETFILTER_XT_TARGET_SYNPROXY_MODULE)
+extern unsigned int (*syn_proxy_pre_hook)(struct sk_buff *skb,
+					  struct nf_conn *ct,
+					  enum ip_conntrack_info ctinfo);
+
+extern unsigned int (*syn_proxy_post_hook)(struct sk_buff *skb,
+					   struct nf_conn *ct,
+					   enum ip_conntrack_info ctinfo);
+#endif
 #endif /* __KERNEL__ */
 #endif /* _NF_CONNTRACK_H */
diff --git a/include/net/netfilter/nf_conntrack_core.h b/include/net/netfilter/nf_conntrack_core.h
index aced085..637b404 100644
--- a/include/net/netfilter/nf_conntrack_core.h
+++ b/include/net/netfilter/nf_conntrack_core.h
@@ -54,6 +54,23 @@ nf_conntrack_find_get(struct net *net, u16 zone,
 
 extern int __nf_conntrack_confirm(struct sk_buff *skb);
 
+static inline unsigned int syn_proxy_post_call(struct sk_buff *skb,
+					       struct nf_conn *ct,
+					       enum ip_conntrack_info ctinfo)
+{
+	unsigned int ret = NF_ACCEPT;
+#if defined(CONFIG_NETFILTER_XT_TARGET_SYNPROXY) || \
+    defined(CONFIG_NETFILTER_XT_TARGET_SYNPROXY_MODULE)
+	unsigned int (*syn_proxy)(struct sk_buff *, struct nf_conn *,
+				  enum ip_conntrack_info);
+	syn_proxy = rcu_dereference(syn_proxy_post_hook);
+	if (syn_proxy)
+		ret = syn_proxy(skb, ct, ctinfo);
+#endif
+
+	return ret;
+}
+
 /* Confirm a connection: returns NF_DROP if packet must be dropped. */
 static inline int nf_conntrack_confirm(struct sk_buff *skb)
 {
@@ -63,8 +80,10 @@ static inline int nf_conntrack_confirm(struct sk_buff *skb)
 	if (ct && !nf_ct_is_untracked(ct)) {
 		if (!nf_ct_is_confirmed(ct))
 			ret = __nf_conntrack_confirm(skb);
-		if (likely(ret == NF_ACCEPT))
+		if (likely(ret == NF_ACCEPT)) {
 			nf_ct_deliver_cached_events(ct);
+			ret = syn_proxy_post_call(skb, ct, skb->nfctinfo);
+		}
 	}
 	return ret;
 }
diff --git a/include/net/netfilter/nf_conntrack_extend.h b/include/net/netfilter/nf_conntrack_extend.h
index 32d15bd..b2ae7e9 100644
--- a/include/net/netfilter/nf_conntrack_extend.h
+++ b/include/net/netfilter/nf_conntrack_extend.h
@@ -11,6 +11,7 @@ enum nf_ct_ext_id {
 	NF_CT_EXT_ACCT,
 	NF_CT_EXT_ECACHE,
 	NF_CT_EXT_ZONE,
+	NF_CT_EXT_SYNPROXY,
 	NF_CT_EXT_NUM,
 };
 
@@ -19,6 +20,7 @@ enum nf_ct_ext_id {
 #define NF_CT_EXT_ACCT_TYPE struct nf_conn_counter
 #define NF_CT_EXT_ECACHE_TYPE struct nf_conntrack_ecache
 #define NF_CT_EXT_ZONE_TYPE struct nf_conntrack_zone
+#define NF_CT_EXT_SYNPROXY_TYPE struct syn_proxy_state
 
 /* Extensions: optional stuff which isn't permanently in struct. */
 struct nf_ct_ext {
diff --git a/include/net/tcp.h b/include/net/tcp.h
index 18c246c..e1fa5f9 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -460,8 +460,11 @@ extern int			tcp_disconnect(struct sock *sk, int flags);
 extern __u32 syncookie_secret[2][16-4+SHA_DIGEST_WORDS];
 extern struct sock *cookie_v4_check(struct sock *sk, struct sk_buff *skb, 
 				    struct ip_options *opt);
-extern __u32 cookie_v4_init_sequence(struct sock *sk, struct sk_buff *skb, 
-				     __u16 *mss);
+extern __u32 __cookie_v4_init_sequence(__be32 saddr, __be32 daddr,
+				       __be16 sport, __be16 dport, __u32 seq,
+				       __u16 *mssp);
+extern int cookie_v4_check_sequence(const struct iphdr *iph,
+				    const struct tcphdr *th, __u32 cookie);
 
 extern __u32 cookie_init_timestamp(struct request_sock *req);
 extern bool cookie_check_timestamp(struct tcp_options_received *tcp_opt);
diff --git a/net/ipv4/syncookies.c b/net/ipv4/syncookies.c
index 51b5662..c6b5e84 100644
--- a/net/ipv4/syncookies.c
+++ b/net/ipv4/syncookies.c
@@ -160,26 +160,21 @@ static __u16 const msstab[] = {
  * Generate a syncookie.  mssp points to the mss, which is returned
  * rounded down to the value encoded in the cookie.
  */
-__u32 cookie_v4_init_sequence(struct sock *sk, struct sk_buff *skb, __u16 *mssp)
+__u32 __cookie_v4_init_sequence(__be32 saddr, __be32 daddr, __be16 sport,
+				__be16 dport, __u32 seq, __u16 *mssp)
 {
-	const struct iphdr *iph = ip_hdr(skb);
-	const struct tcphdr *th = tcp_hdr(skb);
 	int mssind;
 	const __u16 mss = *mssp;
 
-	tcp_synq_overflow(sk);
-
 	for (mssind = ARRAY_SIZE(msstab) - 1; mssind ; mssind--)
 		if (mss >= msstab[mssind])
 			break;
 	*mssp = msstab[mssind];
 
-	NET_INC_STATS_BH(sock_net(sk), LINUX_MIB_SYNCOOKIESSENT);
-
-	return secure_tcp_syn_cookie(iph->saddr, iph->daddr,
-				     th->source, th->dest, ntohl(th->seq),
+	return secure_tcp_syn_cookie(saddr, daddr, sport, dport, seq,
 				     jiffies / (HZ * 60), mssind);
 }
+EXPORT_SYMBOL(__cookie_v4_init_sequence);
 
 /*
  * This (misnamed) value is the age of syncookie which is permitted.
@@ -192,10 +187,9 @@ __u32 cookie_v4_init_sequence(struct sock *sk, struct sk_buff *skb, __u16 *mssp)
  * Check if a ack sequence number is a valid syncookie.
  * Return the decoded mss if it is, or 0 if not.
  */
-static inline int cookie_check(struct sk_buff *skb, __u32 cookie)
+int cookie_v4_check_sequence(const struct iphdr *iph, const struct tcphdr *th,
+			     __u32 cookie)
 {
-	const struct iphdr *iph = ip_hdr(skb);
-	const struct tcphdr *th = tcp_hdr(skb);
 	__u32 seq = ntohl(th->seq) - 1;
 	__u32 mssind = check_tcp_syn_cookie(cookie, iph->saddr, iph->daddr,
 					    th->source, th->dest, seq,
@@ -204,6 +198,7 @@ static inline int cookie_check(struct sk_buff *skb, __u32 cookie)
 
 	return mssind < ARRAY_SIZE(msstab) ? msstab[mssind] : 0;
 }
+EXPORT_SYMBOL(cookie_v4_check_sequence);
 
 static inline struct sock *get_cookie_sock(struct sock *sk, struct sk_buff *skb,
 					   struct request_sock *req,
@@ -283,7 +278,8 @@ struct sock *cookie_v4_check(struct sock *sk, struct sk_buff *skb,
 		goto out;
 
 	if (tcp_synq_no_recent_overflow(sk) ||
-	    (mss = cookie_check(skb, cookie)) == 0) {
+	    (mss = cookie_v4_check_sequence(ip_hdr(skb), tcp_hdr(skb),
+					    cookie)) == 0) {
 		NET_INC_STATS_BH(sock_net(sk), LINUX_MIB_SYNCOOKIESFAILED);
 		goto out;
 	}
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 2e41e6f..3c4456d 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -1333,9 +1333,16 @@ int tcp_v4_conn_request(struct sock *sk, struct sk_buff *skb)
 
 	if (want_cookie) {
 #ifdef CONFIG_SYN_COOKIES
+		struct tcphdr *th;
+
 		req->cookie_ts = tmp_opt.tstamp_ok;
+		tcp_synq_overflow(sk);
+		NET_INC_STATS_BH(sock_net(sk), LINUX_MIB_SYNCOOKIESSENT);
+		th = tcp_hdr(skb);
+		isn = __cookie_v4_init_sequence(saddr, daddr, th->source,
+						th->dest, ntohl(th->seq),
+						&req->mss);
 #endif
-		isn = cookie_v4_init_sequence(sk, skb, &req->mss);
 	} else if (!isn) {
 		struct inet_peer *peer = NULL;
 
diff --git a/net/netfilter/Kconfig b/net/netfilter/Kconfig
index 413ed24..fd8ad8c 100644
--- a/net/netfilter/Kconfig
+++ b/net/netfilter/Kconfig
@@ -560,6 +560,23 @@ config NETFILTER_XT_TARGET_SECMARK
 
 	  To compile it as a module, choose M here.  If unsure, say N.
 
+config NETFILTER_XT_TARGET_SYNPROXY
+	tristate '"SYNPROXY" target support (EXPERIMENTAL)'
+	depends on EXPERIMENTAL
+	depends on SYN_COOKIES
+	depends on IP_NF_RAW
+	depends on NF_CONNTRACK
+	depends on NETFILTER_ADVANCED
+	help
+	  The SYNPROXY target allows a raw rule to specify that some TCP
+	  connections are relayed to protect the TCP servers from the SYN-flood
+	  DoS attacks. Syn cookies is used to save the initial state, so no
+	  conntrack is needed until the client side connection is established.
+	  It frees the connection tracking system from creating/deleting
+	  conntracks when SYN-flood DoS attack acts.
+
+	  To compile it as a module, choose M here.  If unsure, say N.
+
 config NETFILTER_XT_TARGET_TCPMSS
 	tristate '"TCPMSS" target support'
 	depends on (IPV6 || IPV6=n)
diff --git a/net/netfilter/Makefile b/net/netfilter/Makefile
index e28420a..4e32834 100644
--- a/net/netfilter/Makefile
+++ b/net/netfilter/Makefile
@@ -62,6 +62,7 @@ obj-$(CONFIG_NETFILTER_XT_TARGET_TCPOPTSTRIP) += xt_TCPOPTSTRIP.o
 obj-$(CONFIG_NETFILTER_XT_TARGET_TEE) += xt_TEE.o
 obj-$(CONFIG_NETFILTER_XT_TARGET_TRACE) += xt_TRACE.o
 obj-$(CONFIG_NETFILTER_XT_TARGET_IDLETIMER) += xt_IDLETIMER.o
+obj-$(CONFIG_NETFILTER_XT_TARGET_SYNPROXY) += xt_SYNPROXY.o
 
 # matches
 obj-$(CONFIG_NETFILTER_XT_MATCH_CLUSTER) += xt_cluster.o
diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c
index 16b41b4..011fa34 100644
--- a/net/netfilter/nf_conntrack_core.c
+++ b/net/netfilter/nf_conntrack_core.c
@@ -800,6 +800,26 @@ resolve_normal_ct(struct net *net, struct nf_conn *tmpl,
 	return ct;
 }
 
+static inline unsigned int syn_proxy_pre_call(int protonum, struct sk_buff *skb,
+					      struct nf_conn *ct,
+					      enum ip_conntrack_info ctinfo)
+{
+	unsigned int ret = NF_ACCEPT;
+#if defined(CONFIG_NETFILTER_XT_TARGET_SYNPROXY) || \
+    defined(CONFIG_NETFILTER_XT_TARGET_SYNPROXY_MODULE)
+	unsigned int (*syn_proxy)(struct sk_buff *, struct nf_conn *,
+				  enum ip_conntrack_info);
+
+	if (protonum == IPPROTO_TCP) {
+		syn_proxy = rcu_dereference(syn_proxy_pre_hook);
+		if (syn_proxy)
+			ret = syn_proxy(skb, ct, ctinfo);
+	}
+#endif
+
+	return ret;
+}
+
 unsigned int
 nf_conntrack_in(struct net *net, u_int8_t pf, unsigned int hooknum,
 		struct sk_buff *skb)
@@ -855,8 +875,9 @@ nf_conntrack_in(struct net *net, u_int8_t pf, unsigned int hooknum,
 			       l3proto, l4proto, &set_reply, &ctinfo);
 	if (!ct) {
 		/* Not valid part of a connection */
-		NF_CT_STAT_INC_ATOMIC(net, invalid);
-		ret = NF_ACCEPT;
+		ret = syn_proxy_pre_call(protonum, skb, NULL, ctinfo);
+		if (ret == NF_ACCEPT)
+			NF_CT_STAT_INC_ATOMIC(net, invalid);
 		goto out;
 	}
 
@@ -869,6 +890,9 @@ nf_conntrack_in(struct net *net, u_int8_t pf, unsigned int hooknum,
 
 	NF_CT_ASSERT(skb->nfct);
 
+	ret = syn_proxy_pre_call(protonum, skb, ct, ctinfo);
+	if (ret != NF_ACCEPT)
+		goto out;
 	ret = l4proto->packet(ct, skb, dataoff, ctinfo, pf, hooknum);
 	if (ret <= 0) {
 		/* Invalid: inverse of the return code tells
@@ -1476,6 +1500,17 @@ s16 (*nf_ct_nat_offset)(const struct nf_conn *ct,
 			u32 seq);
 EXPORT_SYMBOL_GPL(nf_ct_nat_offset);
 
+#if defined(CONFIG_NETFILTER_XT_TARGET_SYNPROXY) || \
+    defined(CONFIG_NETFILTER_XT_TARGET_SYNPROXY_MODULE)
+unsigned int (*syn_proxy_pre_hook)(struct sk_buff *skb, struct nf_conn *ct,
+				   enum ip_conntrack_info ctinfo);
+EXPORT_SYMBOL(syn_proxy_pre_hook);
+
+unsigned int (*syn_proxy_post_hook)(struct sk_buff *skb, struct nf_conn *ct,
+				    enum ip_conntrack_info ctinfo);
+EXPORT_SYMBOL(syn_proxy_post_hook);
+#endif
+
 int nf_conntrack_init(struct net *net)
 {
 	int ret;
@@ -1496,6 +1531,9 @@ int nf_conntrack_init(struct net *net)
 
 		/* Howto get NAT offsets */
 		rcu_assign_pointer(nf_ct_nat_offset, NULL);
+
+		rcu_assign_pointer(syn_proxy_pre_hook, NULL);
+		rcu_assign_pointer(syn_proxy_post_hook, NULL);
 	}
 	return 0;
 
diff --git a/net/netfilter/xt_SYNPROXY.c b/net/netfilter/xt_SYNPROXY.c
new file mode 100644
index 0000000..5e05259
--- /dev/null
+++ b/net/netfilter/xt_SYNPROXY.c
@@ -0,0 +1,678 @@
+/* (C) 2010- Changli Gao <xiaosuo@gmail.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * It bases on ipt_REJECT.c
+ */
+#define pr_fmt(fmt) "SYNPROXY: " fmt
+#include <linux/module.h>
+#include <linux/skbuff.h>
+#include <linux/slab.h>
+#include <linux/ip.h>
+#include <linux/udp.h>
+#include <linux/icmp.h>
+#include <linux/unaligned/access_ok.h>
+#include <net/icmp.h>
+#include <net/ip.h>
+#include <net/tcp.h>
+#include <net/route.h>
+#include <net/dst.h>
+#include <net/netfilter/nf_conntrack.h>
+#include <net/netfilter/nf_conntrack_extend.h>
+#include <linux/netfilter/x_tables.h>
+#include <linux/netfilter_ipv4/ip_tables.h>
+
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("Changli Gao <xiaosuo@gmail.com>");
+MODULE_DESCRIPTION("Xtables: \"SYNPROXY\" target for IPv4");
+MODULE_ALIAS("ipt_SYNPROXY");
+
+enum {
+	TCP_SEND_FLAG_NOTRACE	= 0x1,
+	TCP_SEND_FLAG_SYNCOOKIE	= 0x2,
+	TCP_SEND_FLAG_ACK2SYN	= 0x4,
+};
+
+struct syn_proxy_state {
+	u16	seq_inited;
+	__be16	window;
+	u32	seq_diff;
+};
+
+static int get_mtu(const struct dst_entry *dst)
+{
+	int mtu;
+
+	mtu = dst_mtu(dst);
+	if (mtu)
+		return mtu;
+
+	return dst->dev ? dst->dev->mtu : 0;
+}
+
+static int get_advmss(const struct dst_entry *dst)
+{
+	int advmss;
+
+	advmss = dst_metric(dst, RTAX_ADVMSS);
+	if (advmss)
+		return advmss;
+	advmss = get_mtu(dst);
+	if (advmss)
+		return advmss - (sizeof(struct iphdr) + sizeof(struct tcphdr));
+
+	return TCP_MSS_DEFAULT;
+}
+
+static int syn_proxy_route(struct sk_buff *skb, struct net *net, u16 *pmss)
+{
+	const struct iphdr *iph = ip_hdr(skb);
+	struct rtable *rt;
+	struct flowi fl = {};
+	unsigned int type;
+	int flags = 0;
+	int err;
+	u16 mss;
+
+	type = inet_addr_type(net, iph->saddr);
+	if (type != RTN_LOCAL) {
+		type = inet_addr_type(net, iph->daddr);
+		if (type == RTN_LOCAL)
+			flags |= FLOWI_FLAG_ANYSRC;
+	}
+
+	if (type == RTN_LOCAL) {
+		fl.nl_u.ip4_u.daddr = iph->daddr;
+		fl.nl_u.ip4_u.saddr = iph->saddr;
+		fl.nl_u.ip4_u.tos = RT_TOS(iph->tos);
+		fl.flags = flags;
+		err = ip_route_output_key(net, &rt, &fl);
+		if (err)
+			goto out;
+
+		skb_dst_set(skb, &rt->dst);
+	} else {
+		/* non-local src, find valid iif to satisfy
+		 * rp-filter when calling ip_route_input. */
+		fl.nl_u.ip4_u.daddr = iph->saddr;
+		err = ip_route_output_key(net, &rt, &fl);
+		if (err)
+			goto out;
+
+		err = ip_route_input(skb, iph->daddr, iph->saddr,
+				     RT_TOS(iph->tos), rt->dst.dev);
+		if (err) {
+			dst_release(&rt->dst);
+			goto out;
+		}
+		if (pmss) {
+			mss = get_advmss(&rt->dst);
+			if (*pmss > mss)
+				*pmss = mss;
+		}
+		dst_release(&rt->dst);
+	}
+
+	err = skb_dst(skb)->error;
+	if (!err && pmss) {
+		mss = get_advmss(skb_dst(skb));
+		if (*pmss > mss)
+			*pmss = mss;
+	}
+
+out:
+	return err;
+}
+
+static int tcp_send(__be32 src, __be32 dst, __be16 sport, __be16 dport,
+		    u32 seq, u32 ack_seq, __be16 window, u16 mss, u8 tcp_flags,
+		    u8 tos, struct net_device *dev, int flags,
+		    struct sk_buff *oskb)
+{
+	struct sk_buff *skb;
+	struct iphdr *iph;
+	struct tcphdr *th;
+	int err, len;
+
+	len = sizeof(*th);
+	if (mss)
+		len += TCPOLEN_MSS;
+
+	skb = NULL;
+	/* caller must give me a large enough oskb */
+	if (oskb) {
+		unsigned char *odata = oskb->data;
+
+		if (skb_recycle_check(oskb, 0)) {
+			oskb->data = odata;
+			skb_reset_tail_pointer(oskb);
+			skb = oskb;
+			pr_debug("recycle skb\n");
+		}
+	}
+	if (!skb) {
+		skb = alloc_skb(LL_MAX_HEADER + sizeof(*iph) + len, GFP_ATOMIC);
+		if (!skb) {
+			err = -ENOMEM;
+			goto out;
+		}
+		skb_reserve(skb, LL_MAX_HEADER);
+	}
+
+	skb_reset_network_header(skb);
+	if (!(flags & TCP_SEND_FLAG_ACK2SYN) || skb != oskb) {
+		iph = (struct iphdr *)skb_put(skb, sizeof(*iph));
+		iph->version	= 4;
+		iph->ihl	= sizeof(*iph) / 4;
+		iph->tos	= tos;
+		/* tot_len is set in ip_local_out() */
+		iph->id		= 0;
+		iph->frag_off	= htons(IP_DF);
+		iph->protocol	= IPPROTO_TCP;
+		iph->saddr	= src;
+		iph->daddr	= dst;
+		th = (struct tcphdr *)skb_put(skb, len);
+		th->source	= sport;
+		th->dest	= dport;
+	} else {
+		iph = (struct iphdr *)skb->data;
+		iph->id		= 0;
+		iph->frag_off	= htons(IP_DF);
+		skb_put(skb, iph->ihl * 4 + len);
+		th = (struct tcphdr *)(skb->data + iph->ihl * 4);
+	}
+
+	th->seq		= htonl(seq);
+	th->ack_seq	= htonl(ack_seq);
+	tcp_flag_byte(th) = tcp_flags;
+	th->doff	= len / 4;
+	th->window	= window;
+	th->urg_ptr	= 0;
+
+	if ((flags & TCP_SEND_FLAG_SYNCOOKIE) && mss)
+		err = syn_proxy_route(skb, dev_net(dev), &mss);
+	else
+		err = syn_proxy_route(skb, dev_net(dev), NULL);
+	if (err)
+		goto err_out;
+
+	if ((flags & TCP_SEND_FLAG_SYNCOOKIE)) {
+		if (mss) {
+			th->seq = htonl(__cookie_v4_init_sequence(dst, src,
+								  dport, sport,
+								  ack_seq - 1,
+								  &mss));
+		} else {
+			mss = TCP_MSS_DEFAULT;
+			th->seq = htonl(__cookie_v4_init_sequence(dst, src,
+								  dport, sport,
+								  ack_seq - 1,
+								  &mss));
+			mss = 0;
+		}
+	}
+
+	if (mss)
+		* (__force __be32 *)(th + 1) = htonl((TCPOPT_MSS << 24) |
+						     (TCPOLEN_MSS << 16) |
+						     mss);
+	skb->ip_summed = CHECKSUM_PARTIAL;
+	th->check = ~tcp_v4_check(len, src, dst, 0);
+	skb->csum_start = (unsigned char *)th - skb->head;
+	skb->csum_offset = offsetof(struct tcphdr, check);
+
+	if (!(flags & TCP_SEND_FLAG_ACK2SYN) || skb != oskb)
+		iph->ttl	= dst_metric(skb_dst(skb), RTAX_HOPLIMIT);
+
+	if (skb->len > get_mtu(skb_dst(skb))) {
+		if (printk_ratelimit())
+			pr_warning("%s has smaller mtu: %d\n",
+				   skb_dst(skb)->dev->name,
+				   get_mtu(skb_dst(skb)));
+		err = -EINVAL;
+		goto err_out;
+	}
+
+	if ((flags & TCP_SEND_FLAG_NOTRACE)) {
+		skb->nfct = &nf_ct_untracked_get()->ct_general;
+		skb->nfctinfo = IP_CT_NEW;
+		nf_conntrack_get(skb->nfct);
+	}
+
+	pr_debug("ip_local_out: %pI4n:%hu -> %pI4n:%hu (seq=%u, "
+		 "ack_seq=%u mss=%hu flags=%hhx)\n", &src, ntohs(th->source),
+		 &dst, ntohs(th->dest), ntohl(th->seq), ack_seq, mss,
+		 tcp_flags);
+
+	err = ip_local_out(skb);
+	if (err > 0)
+		err = net_xmit_errno(err);
+
+	pr_debug("ip_local_out: return with %d\n", err);
+out:
+	if (oskb && oskb != skb)
+		kfree_skb(oskb);
+
+	return err;
+
+err_out:
+	kfree_skb(skb);
+	goto out;
+}
+
+static int get_mss(u8 *data, int len)
+{
+	u8 olen;
+
+	while (len >= TCPOLEN_MSS) {
+		switch (data[0]) {
+		case TCPOPT_EOL:
+			return 0;
+		case TCPOPT_NOP:
+			data++;
+			len--;
+			break;
+		case TCPOPT_MSS:
+			if (data[1] != TCPOLEN_MSS)
+				return -EINVAL;
+			return get_unaligned_be16(data + 2);
+		default:
+			olen = data[1];
+			if (olen < 2 || olen > len)
+				return -EINVAL;
+			data += olen;
+			len -= olen;
+			break;
+		}
+	}
+
+	return 0;
+}
+
+static DEFINE_PER_CPU(struct syn_proxy_state, syn_proxy_state);
+
+/* syn_proxy_pre isn't under the protection of nf_conntrack_proto_tcp.c */
+static unsigned int syn_proxy_pre(struct sk_buff *skb, struct nf_conn *ct,
+				  enum ip_conntrack_info ctinfo)
+{
+	struct syn_proxy_state *state;
+	struct iphdr *iph;
+	struct tcphdr *th, _th;
+
+	/* only support IPv4 now */
+	iph = ip_hdr(skb);
+	if (iph->version != 4)
+		return NF_ACCEPT;
+
+	th = skb_header_pointer(skb, iph->ihl * 4, sizeof(_th), &_th);
+	if (th == NULL)
+		return NF_DROP;
+
+	if (!ct || !nf_ct_is_confirmed(ct)) {
+		int ret;
+
+		if (!th->syn && th->ack) {
+			u16 mss;
+			struct sk_buff *rec_skb;
+
+			mss = cookie_v4_check_sequence(iph, th,
+						       ntohl(th->ack_seq) - 1);
+			if (!mss)
+				return NF_ACCEPT;
+
+			pr_debug("%pI4n:%hu -> %pI4n:%hu(mss=%hu)\n",
+				 &iph->saddr, ntohs(th->source),
+				 &iph->daddr, ntohs(th->dest), mss);
+
+			if (skb_tailroom(skb) < TCPOLEN_MSS &&
+			    skb->len < iph->ihl * 4 + sizeof(*th) + TCPOLEN_MSS)
+				rec_skb = NULL;
+			else
+				rec_skb = skb;
+
+			local_bh_disable();
+			state = &__get_cpu_var(syn_proxy_state);
+			state->seq_inited = 1;
+			state->window = th->window;
+			state->seq_diff = ntohl(th->ack_seq) - 1;
+			if (rec_skb)
+				tcp_send(iph->saddr, iph->daddr, 0, 0,
+					 ntohl(th->seq) - 1, 0, th->window,
+					 mss, TCPHDR_SYN, 0, skb->dev,
+					 TCP_SEND_FLAG_ACK2SYN, rec_skb);
+			else
+				tcp_send(iph->saddr, iph->daddr, th->source,
+					 th->dest, ntohl(th->seq) - 1, 0,
+					 th->window, mss, TCPHDR_SYN,
+					 iph->tos, skb->dev, 0, NULL);
+			state->seq_inited = 0;
+			local_bh_enable();
+
+			if (!rec_skb)
+				kfree_skb(skb);
+
+			return NF_STOLEN;
+		}
+
+		if (!ct || !th->syn || th->ack)
+			return NF_ACCEPT;
+
+		ret = NF_ACCEPT;
+		local_bh_disable();
+		state = &__get_cpu_var(syn_proxy_state);
+		if (state->seq_inited) {
+			struct syn_proxy_state *nstate;
+
+			nstate = nf_ct_ext_add(ct, NF_CT_EXT_SYNPROXY,
+					       GFP_ATOMIC);
+			if (nstate != NULL) {
+				nstate->seq_inited = 0;
+				nstate->window = state->window;
+				nstate->seq_diff = state->seq_diff;
+				pr_debug("seq_diff: %u\n", nstate->seq_diff);
+			} else {
+				ret = NF_DROP;
+			}
+		}
+		local_bh_enable();
+
+		return ret;
+	}
+
+	state = nf_ct_ext_find(ct, NF_CT_EXT_SYNPROXY);
+	if (!state)
+		return NF_ACCEPT;
+
+	if (CTINFO2DIR(ctinfo) == IP_CT_DIR_ORIGINAL) {
+		__be32 newack;
+
+		/* don't need to mangle duplicate SYN packets */
+		if (th->syn && !th->ack)
+			return NF_ACCEPT;
+		if (!skb_make_writable(skb, ip_hdrlen(skb) + sizeof(*th)))
+			return NF_DROP;
+		th = (struct tcphdr *)(skb->data + ip_hdrlen(skb));
+		newack = htonl(ntohl(th->ack_seq) - state->seq_diff);
+		inet_proto_csum_replace4(&th->check, skb, th->ack_seq, newack,
+					 0);
+		pr_debug("alter ack seq: %u -> %u\n",
+			 ntohl(th->ack_seq), ntohl(newack));
+		th->ack_seq = newack;
+	} else {
+		/* Simultaneous open ? Oh, no. The connection between
+		 * client and us is established. */
+		if (th->syn && !th->ack)
+			return NF_DROP;
+	}
+
+	return NF_ACCEPT;
+}
+
+static unsigned int syn_proxy_mangle_pkt(struct sk_buff *skb, struct iphdr *iph,
+					 struct tcphdr *th, u32 seq_diff)
+{
+	__be32 new;
+	int olen;
+
+	if (skb->len < (iph->ihl + th->doff) * 4)
+		return NF_DROP;
+	if (!skb_make_writable(skb, (iph->ihl + th->doff) * 4))
+		return NF_DROP;
+	iph = (struct iphdr *)(skb->data);
+	th = (struct tcphdr *)(skb->data + iph->ihl * 4);
+
+	new = tcp_flag_word(th) & (~TCP_FLAG_SYN);
+	inet_proto_csum_replace4(&th->check, skb, tcp_flag_word(th), new, 0);
+	tcp_flag_word(th) = new;
+
+	new = htonl(ntohl(th->seq) + seq_diff);
+	inet_proto_csum_replace4(&th->check, skb, th->seq, new, 0);
+	pr_debug("alter seq: %u -> %u\n", ntohl(th->seq), ntohl(new));
+	th->seq = new;
+
+	olen = th->doff - sizeof(*th) / 4;
+	if (olen) {
+		__be32 *opt;
+
+		opt = (__force __be32 *)(th + 1);
+#define TCPOPT_EOL_WORD ((TCPOPT_EOL << 24) + (TCPOPT_EOL << 16) + \
+			 (TCPOPT_EOL << 8) + TCPOPT_EOL)
+		inet_proto_csum_replace4(&th->check, skb, *opt, TCPOPT_EOL_WORD,
+					 0);
+		*opt = TCPOPT_EOL_WORD;
+	}
+
+	return NF_ACCEPT;
+}
+
+static unsigned int syn_proxy_post(struct sk_buff *skb, struct nf_conn *ct,
+				   enum ip_conntrack_info ctinfo)
+{
+	struct syn_proxy_state *state;
+	struct iphdr *iph;
+	struct tcphdr *th;
+
+	/* untraced packets don't have NF_CT_EXT_SYNPROXY ext, as they don't
+	 * enter syn_proxy_pre() */
+	state = nf_ct_ext_find(ct, NF_CT_EXT_SYNPROXY);
+	if (state == NULL)
+		return NF_ACCEPT;
+
+	iph = ip_hdr(skb);
+	if (!skb_make_writable(skb, iph->ihl * 4 + sizeof(*th)))
+		return NF_DROP;
+	th = (struct tcphdr *)(skb->data + iph->ihl * 4);
+	if (!state->seq_inited) {
+		if (th->syn) {
+			/* It must be from original direction, as the ones
+			 * from the other side are dropped in function
+			 * syn_proxy_pre() */
+			if (!th->ack)
+				return NF_ACCEPT;
+
+			pr_debug("SYN-ACK %pI4n:%hu -> %pI4n:%hu "
+				 "(seq=%u ack_seq=%u)\n",
+				 &iph->saddr, ntohs(th->source), &iph->daddr,
+				 ntohs(th->dest), ntohl(th->seq),
+				 ntohl(th->ack_seq));
+
+			/* SYN-ACK from reply direction with the protection
+			 * of conntrack */
+			spin_lock_bh(&ct->lock);
+			if (!state->seq_inited) {
+				state->seq_inited = 1;
+				pr_debug("update seq_diff %u -> %u\n",
+					 state->seq_diff,
+					 state->seq_diff - ntohl(th->seq));
+				state->seq_diff -= ntohl(th->seq);
+			}
+			spin_unlock_bh(&ct->lock);
+			tcp_send(iph->daddr, iph->saddr, th->dest, th->source,
+				 ntohl(th->ack_seq),
+				 ntohl(th->seq) + 1 + state->seq_diff,
+				 state->window, 0, TCPHDR_ACK, iph->tos,
+				 skb->dev, 0, NULL);
+
+			return syn_proxy_mangle_pkt(skb, iph, th,
+						    state->seq_diff + 1);
+		} else {
+			__be32 newseq;
+
+			if (!th->rst)
+				return NF_ACCEPT;
+			newseq = htonl(state->seq_diff + 1);
+			inet_proto_csum_replace4(&th->check, skb, th->seq,
+						 newseq, 0);
+			pr_debug("alter RST seq: %u -> %u\n",
+				 ntohl(th->seq), ntohl(newseq));
+			th->seq = newseq;
+
+			return NF_ACCEPT;
+		}
+	}
+
+	/* ct should be in ESTABLISHED state by now, unless the ACK packets
+	 * we sent were lost. */
+	if (th->syn) {
+		if (!th->ack)
+			return NF_ACCEPT;
+
+		tcp_send(iph->daddr, iph->saddr, th->dest, th->source,
+			 ntohl(th->ack_seq),
+			 ntohl(th->seq) + 1 + state->seq_diff,
+			 state->window, 0, TCPHDR_ACK, iph->tos,
+			 skb->dev, 0, NULL);
+
+		return syn_proxy_mangle_pkt(skb, iph, th, state->seq_diff + 1);
+	}
+
+	if (CTINFO2DIR(ctinfo) == IP_CT_DIR_REPLY) {
+		__be32 newseq;
+
+		newseq = htonl(ntohl(th->seq) + state->seq_diff);
+		inet_proto_csum_replace4(&th->check, skb, th->seq, newseq, 0);
+		pr_debug("alter seq: %u -> %u\n", ntohl(th->seq),
+			 ntohl(newseq));
+		th->seq = newseq;
+	}
+
+	return NF_ACCEPT;
+}
+
+static unsigned int tcp_process(struct sk_buff *skb)
+{
+	const struct iphdr *iph;
+	const struct tcphdr *th;
+	int err;
+	u16 mss;
+
+	iph = ip_hdr(skb);
+	if (iph->frag_off & htons(IP_OFFSET))
+		goto out;
+	if (!pskb_may_pull(skb, iph->ihl * 4 + sizeof(*th)))
+		goto out;
+	th = (const struct tcphdr *)(skb->data + iph->ihl * 4);
+	if ((tcp_flag_byte(th) &
+	     (TCPHDR_FIN | TCPHDR_RST | TCPHDR_ACK | TCPHDR_SYN)) != TCPHDR_SYN)
+		goto out;
+
+	if (nf_ip_checksum(skb, NF_INET_PRE_ROUTING, iph->ihl * 4, IPPROTO_TCP))
+		goto out;
+	mss = 0;
+	if (th->doff > sizeof(*th) / 4) {
+		if (!pskb_may_pull(skb, (iph->ihl + th->doff) * 4))
+			goto out;
+		err = get_mss((u8 *)(th + 1), th->doff * 4 - sizeof(*th));
+		if (err < 0)
+			goto out;
+		if (err != 0)
+			mss = err;
+	} else if (th->doff != sizeof(*th) / 4)
+		goto out;
+
+	tcp_send(iph->daddr, iph->saddr, th->dest, th->source, 0,
+		 ntohl(th->seq) + 1, 0, mss, TCPHDR_SYN | TCPHDR_ACK,
+		 iph->tos, skb->dev,
+		 TCP_SEND_FLAG_NOTRACE | TCP_SEND_FLAG_SYNCOOKIE, skb);
+
+	return NF_STOLEN;
+
+out:
+	return NF_DROP;
+}
+
+static unsigned int synproxy_tg(struct sk_buff *skb,
+				const struct xt_action_param *par)
+{
+	struct nf_conn *ct;
+	enum ip_conntrack_info ctinfo;
+	int ret;
+
+	/* received from lo */
+	ct = nf_ct_get(skb, &ctinfo);
+	if (ct)
+		return IPT_CONTINUE;
+
+	local_bh_disable();
+	if (!__get_cpu_var(syn_proxy_state).seq_inited)
+		ret = tcp_process(skb);
+	else
+		ret = IPT_CONTINUE;
+	local_bh_enable();
+
+	return ret;
+}
+
+static int synproxy_tg_check(const struct xt_tgchk_param *par)
+{
+	int ret;
+
+	ret = nf_ct_l3proto_try_module_get(par->family);
+	if (ret < 0)
+		pr_info("cannot load conntrack support for proto=%u\n",
+			par->family);
+
+	return ret;
+}
+
+static void synproxy_tg_destroy(const struct xt_tgdtor_param *par)
+{
+	nf_ct_l3proto_module_put(par->family);
+}
+
+static struct xt_target synproxy_tg_reg __read_mostly = {
+	.name		= "SYNPROXY",
+	.family		= NFPROTO_IPV4,
+	.target		= synproxy_tg,
+	.table		= "raw",
+	.hooks		= 1 << NF_INET_PRE_ROUTING,
+	.proto		= IPPROTO_TCP,
+	.checkentry	= synproxy_tg_check,
+	.destroy	= synproxy_tg_destroy,
+	.me		= THIS_MODULE,
+};
+
+static struct nf_ct_ext_type syn_proxy_state_ext __read_mostly = {
+	.len	= sizeof(struct syn_proxy_state),
+	.align	= __alignof__(struct syn_proxy_state),
+	.id	= NF_CT_EXT_SYNPROXY,
+};
+
+static int __init synproxy_tg_init(void)
+{
+	int err;
+
+	rcu_assign_pointer(syn_proxy_pre_hook, syn_proxy_pre);
+	rcu_assign_pointer(syn_proxy_post_hook, syn_proxy_post);
+	err = nf_ct_extend_register(&syn_proxy_state_ext);
+	if (err)
+		goto err_out;
+	err = xt_register_target(&synproxy_tg_reg);
+	if (err)
+		goto err_out2;
+
+	return err;
+
+err_out2:
+	nf_ct_extend_unregister(&syn_proxy_state_ext);
+err_out:
+	rcu_assign_pointer(syn_proxy_post_hook, NULL);
+	rcu_assign_pointer(syn_proxy_pre_hook, NULL);
+	rcu_barrier();
+
+	return err;
+}
+
+static void __exit synproxy_tg_exit(void)
+{
+	xt_unregister_target(&synproxy_tg_reg);
+	nf_ct_extend_unregister(&syn_proxy_state_ext);
+	rcu_assign_pointer(syn_proxy_post_hook, NULL);
+	rcu_assign_pointer(syn_proxy_pre_hook, NULL);
+	rcu_barrier();
+}
+
+module_init(synproxy_tg_init);
+module_exit(synproxy_tg_exit);
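
Note on the sequence mangling in the patch above: the hooks appear to keep one offset per connection (seq_diff, the proxy's cookie ISN minus the server's ISN), add it to seq in the reply direction, and subtract it from ack_seq in the original direction, all modulo 2^32. The following is only a userspace sketch of that arithmetic in host byte order. Every demo_* name is hypothetical and not part of the patch, and the real code also updates checksums and handles the SYN-ACK and RST special cases.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-connection state; only the arithmetic mirrors the patch. */
struct demo_proxy_state {
	uint32_t seq_diff;	/* proxy ISN minus server ISN, modulo 2^32 */
};

/* Record the offset once the server's SYN-ACK reveals its ISN. */
static void demo_set_diff(struct demo_proxy_state *st,
			  uint32_t proxy_isn, uint32_t server_isn)
{
	st->seq_diff = proxy_isn - server_isn;	/* unsigned wraparound is intended */
}

/* Reply direction (server -> client): shift seq into the client's space. */
static uint32_t demo_fix_reply_seq(const struct demo_proxy_state *st,
				   uint32_t seq)
{
	return seq + st->seq_diff;
}

/* Original direction (client -> server): shift ack_seq back again. */
static uint32_t demo_fix_orig_ack(const struct demo_proxy_state *st,
				  uint32_t ack)
{
	return ack - st->seq_diff;
}

/* Translating a value one way and back must be the identity. */
static void demo_selftest(void)
{
	struct demo_proxy_state st;

	demo_set_diff(&st, 1000u, 4000000000u);
	assert(demo_fix_orig_ack(&st, demo_fix_reply_seq(&st, 4000000001u))
	       == 4000000001u);
}
```

Using plain uint32_t arithmetic means wraparound needs no special casing, which matches how the patch treats seq_diff as a u32.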


* [PATCH net-next-2.6] snmp: add align parameter to snmp_mib_init()
From: Eric Dumazet @ 2010-06-23  6:58 UTC (permalink / raw)
  To: David Miller
  Cc: netdev, Hideaki YOSHIFUJI, Arnaldo Carvalho de Melo,
	Vlad Yasevich, Herbert Xu

In preparation for 64bit snmp counters for some mibs,
add an 'align' parameter to snmp_mib_init(), instead
of assuming mibs only contain 'unsigned long' fields.

Callers can use __alignof__(type) to provide correct
alignment.
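
To illustrate the calling convention described above, here is a userspace sketch, not kernel code. It assumes GCC or Clang for __alignof__, uses C11 aligned_alloc() as a stand-in for __alloc_percpu(), and all demo_* names and structures are hypothetical. The point is only that the caller derives the alignment from the MIB type instead of the allocator hard-coding that of 'unsigned long'.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-ins; not the kernel's MIB definitions. */
struct demo_mib_ulong { unsigned long mibs[4]; };
struct demo_mib_u64 { uint64_t mibs[4]; };

/* Userspace analogue of an allocator that takes (size, align). */
static void *demo_alloc(size_t size, size_t align)
{
	size_t rounded;

	/* alignment must be a non-zero power of two */
	assert(align && (align & (align - 1)) == 0);
	/* aligned_alloc() wants size to be a multiple of align */
	rounded = (size + align - 1) / align * align;

	return aligned_alloc(align, rounded);
}

/* Allocate both slots with the caller-supplied alignment. */
static int demo_mib_init(void *ptr[2], size_t mibsize, size_t align)
{
	ptr[0] = demo_alloc(mibsize, align);
	if (!ptr[0])
		return -1;
	ptr[1] = demo_alloc(mibsize, align);
	if (!ptr[1]) {
		free(ptr[0]);
		ptr[0] = NULL;
		return -1;
	}
	return 0;
}
```

A caller would then write demo_mib_init(p, sizeof(struct demo_mib_u64), __alignof__(struct demo_mib_u64)), in the same shape as the converted call sites below.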

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Herbert Xu <herbert@gondor.apana.org.au>
CC: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
CC: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
CC: Vlad Yasevich <vladislav.yasevich@hp.com>
---
 include/net/ip.h       |    2 +-
 net/dccp/proto.c       |    3 ++-
 net/ipv4/af_inet.c     |   27 +++++++++++++++++----------
 net/ipv6/addrconf.c    |    9 ++++++---
 net/ipv6/af_inet6.c    |   15 ++++++++++-----
 net/sctp/protocol.c    |    3 ++-
 net/xfrm/xfrm_policy.c |    3 ++-
 7 files changed, 40 insertions(+), 22 deletions(-)

diff --git a/include/net/ip.h b/include/net/ip.h
index d52f011..3b524df 100644
--- a/include/net/ip.h
+++ b/include/net/ip.h
@@ -178,7 +178,7 @@ extern struct ipv4_config ipv4_config;
 #define NET_ADD_STATS_USER(net, field, adnd) SNMP_ADD_STATS_USER((net)->mib.net_statistics, field, adnd)
 
 extern unsigned long snmp_fold_field(void __percpu *mib[], int offt);
-extern int snmp_mib_init(void __percpu *ptr[2], size_t mibsize);
+extern int snmp_mib_init(void __percpu *ptr[2], size_t mibsize, size_t align);
 extern void snmp_mib_free(void __percpu *ptr[2]);
 
 extern struct local_ports {
diff --git a/net/dccp/proto.c b/net/dccp/proto.c
index f79bcef..096250d 100644
--- a/net/dccp/proto.c
+++ b/net/dccp/proto.c
@@ -1002,7 +1002,8 @@ EXPORT_SYMBOL_GPL(dccp_shutdown);
 static inline int dccp_mib_init(void)
 {
 	return snmp_mib_init((void __percpu **)dccp_statistics,
-			     sizeof(struct dccp_mib));
+			     sizeof(struct dccp_mib),
+			     __alignof__(struct dccp_mib));
 }
 
 static inline void dccp_mib_exit(void)
diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
index d99e7e0..711de7c 100644
--- a/net/ipv4/af_inet.c
+++ b/net/ipv4/af_inet.c
@@ -1425,13 +1425,13 @@ unsigned long snmp_fold_field(void __percpu *mib[], int offt)
 }
 EXPORT_SYMBOL_GPL(snmp_fold_field);
 
-int snmp_mib_init(void __percpu *ptr[2], size_t mibsize)
+int snmp_mib_init(void __percpu *ptr[2], size_t mibsize, size_t align)
 {
 	BUG_ON(ptr == NULL);
-	ptr[0] = __alloc_percpu(mibsize, __alignof__(unsigned long));
+	ptr[0] = __alloc_percpu(mibsize, align);
 	if (!ptr[0])
 		goto err0;
-	ptr[1] = __alloc_percpu(mibsize, __alignof__(unsigned long));
+	ptr[1] = __alloc_percpu(mibsize, align);
 	if (!ptr[1])
 		goto err1;
 	return 0;
@@ -1488,25 +1488,32 @@ static const struct net_protocol icmp_protocol = {
 static __net_init int ipv4_mib_init_net(struct net *net)
 {
 	if (snmp_mib_init((void __percpu **)net->mib.tcp_statistics,
-			  sizeof(struct tcp_mib)) < 0)
+			  sizeof(struct tcp_mib),
+			  __alignof__(struct tcp_mib)) < 0)
 		goto err_tcp_mib;
 	if (snmp_mib_init((void __percpu **)net->mib.ip_statistics,
-			  sizeof(struct ipstats_mib)) < 0)
+			  sizeof(struct ipstats_mib),
+			  __alignof__(struct ipstats_mib)) < 0)
 		goto err_ip_mib;
 	if (snmp_mib_init((void __percpu **)net->mib.net_statistics,
-			  sizeof(struct linux_mib)) < 0)
+			  sizeof(struct linux_mib),
+			  __alignof__(struct linux_mib)) < 0)
 		goto err_net_mib;
 	if (snmp_mib_init((void __percpu **)net->mib.udp_statistics,
-			  sizeof(struct udp_mib)) < 0)
+			  sizeof(struct udp_mib),
+			  __alignof__(struct udp_mib)) < 0)
 		goto err_udp_mib;
 	if (snmp_mib_init((void __percpu **)net->mib.udplite_statistics,
-			  sizeof(struct udp_mib)) < 0)
+			  sizeof(struct udp_mib),
+			  __alignof__(struct udp_mib)) < 0)
 		goto err_udplite_mib;
 	if (snmp_mib_init((void __percpu **)net->mib.icmp_statistics,
-			  sizeof(struct icmp_mib)) < 0)
+			  sizeof(struct icmp_mib),
+			  __alignof__(struct icmp_mib)) < 0)
 		goto err_icmp_mib;
 	if (snmp_mib_init((void __percpu **)net->mib.icmpmsg_statistics,
-			  sizeof(struct icmpmsg_mib)) < 0)
+			  sizeof(struct icmpmsg_mib),
+			  __alignof__(struct icmpmsg_mib)) < 0)
 		goto err_icmpmsg_mib;
 
 	tcp_mib_init(net);
diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index b97bb1f..c20a7c2 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -284,13 +284,16 @@ static void addrconf_mod_timer(struct inet6_ifaddr *ifp,
 static int snmp6_alloc_dev(struct inet6_dev *idev)
 {
 	if (snmp_mib_init((void __percpu **)idev->stats.ipv6,
-			  sizeof(struct ipstats_mib)) < 0)
+			  sizeof(struct ipstats_mib),
+			  __alignof__(struct ipstats_mib)) < 0)
 		goto err_ip;
 	if (snmp_mib_init((void __percpu **)idev->stats.icmpv6,
-			  sizeof(struct icmpv6_mib)) < 0)
+			  sizeof(struct icmpv6_mib),
+			  __alignof__(struct icmpv6_mib)) < 0)
 		goto err_icmp;
 	if (snmp_mib_init((void __percpu **)idev->stats.icmpv6msg,
-			  sizeof(struct icmpv6msg_mib)) < 0)
+			  sizeof(struct icmpv6msg_mib),
+			  __alignof__(struct icmpv6msg_mib)) < 0)
 		goto err_icmpmsg;
 
 	return 0;
diff --git a/net/ipv6/af_inet6.c b/net/ipv6/af_inet6.c
index 94b1b9c..e830cd4 100644
--- a/net/ipv6/af_inet6.c
+++ b/net/ipv6/af_inet6.c
@@ -971,19 +971,24 @@ static void ipv6_packet_cleanup(void)
 static int __net_init ipv6_init_mibs(struct net *net)
 {
 	if (snmp_mib_init((void __percpu **)net->mib.udp_stats_in6,
-			  sizeof (struct udp_mib)) < 0)
+			  sizeof(struct udp_mib),
+			  __alignof__(struct udp_mib)) < 0)
 		return -ENOMEM;
 	if (snmp_mib_init((void __percpu **)net->mib.udplite_stats_in6,
-			  sizeof (struct udp_mib)) < 0)
+			  sizeof(struct udp_mib),
+			  __alignof__(struct udp_mib)) < 0)
 		goto err_udplite_mib;
 	if (snmp_mib_init((void __percpu **)net->mib.ipv6_statistics,
-			  sizeof(struct ipstats_mib)) < 0)
+			  sizeof(struct ipstats_mib),
+			  __alignof__(struct ipstats_mib)) < 0)
 		goto err_ip_mib;
 	if (snmp_mib_init((void __percpu **)net->mib.icmpv6_statistics,
-			  sizeof(struct icmpv6_mib)) < 0)
+			  sizeof(struct icmpv6_mib),
+			  __alignof__(struct icmpv6_mib)) < 0)
 		goto err_icmp_mib;
 	if (snmp_mib_init((void __percpu **)net->mib.icmpv6msg_statistics,
-			  sizeof(struct icmpv6msg_mib)) < 0)
+			  sizeof(struct icmpv6msg_mib),
+			  __alignof__(struct icmpv6msg_mib)) < 0)
 		goto err_icmpmsg_mib;
 	return 0;
 
diff --git a/net/sctp/protocol.c b/net/sctp/protocol.c
index a0e1a7f..c0e162a 100644
--- a/net/sctp/protocol.c
+++ b/net/sctp/protocol.c
@@ -1002,7 +1002,8 @@ int sctp_register_pf(struct sctp_pf *pf, sa_family_t family)
 static inline int init_sctp_mibs(void)
 {
 	return snmp_mib_init((void __percpu **)sctp_statistics,
-			     sizeof(struct sctp_mib));
+			     sizeof(struct sctp_mib),
+			     __alignof__(struct sctp_mib));
 }
 
 static inline void cleanup_sctp_mibs(void)
diff --git a/net/xfrm/xfrm_policy.c b/net/xfrm/xfrm_policy.c
index 4bf27d9..593c06b 100644
--- a/net/xfrm/xfrm_policy.c
+++ b/net/xfrm/xfrm_policy.c
@@ -2480,7 +2480,8 @@ static int __net_init xfrm_statistics_init(struct net *net)
 	int rv;
 
 	if (snmp_mib_init((void __percpu **)net->mib.xfrm_statistics,
-			  sizeof(struct linux_xfrm_mib)) < 0)
+			  sizeof(struct linux_xfrm_mib),
+			  __alignof__(struct linux_xfrm_mib)) < 0)
 		return -ENOMEM;
 	rv = xfrm_proc_init(net);
 	if (rv < 0)




* [PATCH] net: add dependency on fw class module
From: Amit Kumar Salecha @ 2010-06-23  6:54 UTC (permalink / raw)
  To: davem; +Cc: netdev, ameen.rahman, Anirban Chakraborty

From: Anirban Chakraborty <anirban.chakraborty@qlogic.com>

The netxen_nic and qlcnic drivers depend on the firmware_class module.

Signed-off-by: Anirban Chakraborty <anirban.chakraborty@qlogic.com>
Signed-off-by: Amit Kumar Salecha <amit.salecha@qlogic.com>
---
 drivers/net/Kconfig |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/drivers/net/Kconfig b/drivers/net/Kconfig
index fe113d0..71e6f8f 100644
--- a/drivers/net/Kconfig
+++ b/drivers/net/Kconfig
@@ -2755,6 +2755,7 @@ config MYRI10GE_DCA
 config NETXEN_NIC
 	tristate "NetXen Multi port (1/10) Gigabit Ethernet NIC"
 	depends on PCI
+	select FW_LOADER
 	help
 	  This enables the support for NetXen's Gigabit Ethernet card.
 
@@ -2820,6 +2821,7 @@ config BNX2X
 config QLCNIC
 	tristate "QLOGIC QLCNIC 1/10Gb Converged Ethernet NIC Support"
 	depends on PCI
+	select FW_LOADER
 	help
 	  This driver supports QLogic QLE8240 and QLE8242 Converged Ethernet
 	  devices.
-- 
1.6.0.2



* Re: [PATCH] phylib: Add autoload support for the LXT973 phy.
From: Richard Cochran @ 2010-06-23  5:37 UTC (permalink / raw)
  To: David Woodhouse; +Cc: netdev
In-Reply-To: <1277210293.21798.11.camel@localhost>

On Tue, Jun 22, 2010 at 01:38:13PM +0100, David Woodhouse wrote:
> prefer that we just remember to update the table and don't need to be
> forced :)

Oops, and thanks for catching this.

>  static struct mdio_device_id lxt_tbl[] = {
>  	{ 0x78100000, 0xfffffff0 },
>  	{ 0x001378e0, 0xfffffff0 },
> +	{ 0x00137a10, 0xfffffff0 },
>  	{ }
>  };

Question about the whole PHY MODULE_DEVICE_TABLE system:

I recently posted a phy driver for the National Semiconductor
DP83640. During development, I used drivers/net/arm/ixp4xx_eth.c as
the MAC driver, which was linked into the kernel (not a module). I
noticed that the phy driver's probe function only gets called if the
phy driver is also statically linked, but not when it is loaded as a
module.

Is this the correct behavior?

Thanks,

Richard


* Re: linux-next: manual merge of the net tree with the net-current tree
From: Herbert Xu @ 2010-06-23  4:14 UTC (permalink / raw)
  To: Stephen Rothwell
  Cc: David Miller, netdev, linux-next, linux-kernel, Changli Gao,
	Eric Dumazet
In-Reply-To: <20100623125116.1370bdd4.sfr@canb.auug.org.au>

On Wed, Jun 23, 2010 at 12:51:16PM +1000, Stephen Rothwell wrote:
> Hi all,
> 
> Today's linux-next merge of the net tree got a conflict in
> net/ipv4/ip_output.c between commit
> 26cde9f7e2747b6d254b704594eed87ab959afa5 ("udp: Fix bogus UFO packet
> generation") from the net-current tree and commit
> d8d1f30b95a635dbd610dcc5eb641aca8f4768cf ("net-next: remove useless union
> keyword") from the net tree.
> 
> Just context changes. I fixed it up (see below) and can carry the fix as
> necessary.

Looks good to me.

Thanks,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

