Netdev List
* Re: [PATCH] nf_nat: use local variable hdrlen
From: Patrick McHardy @ 2010-08-02 15:16 UTC (permalink / raw)
  To: Changli Gao; +Cc: David S. Miller, netfilter-devel, netdev
In-Reply-To: <1280476725-7637-1-git-send-email-xiaosuo@gmail.com>

On 30.07.2010 09:58, Changli Gao wrote:
> use local variable hdrlen instead of ip_hdrlen(skb).

Applied. But again, please prefix your subject lines with
"netfilter: " and use capital letters for the beginning of
a sentence as I asked you to.


* Re: [PATCH v2 1/2] nf_nat: make unique_tuple return void
From: Patrick McHardy @ 2010-08-02 15:21 UTC (permalink / raw)
  To: Changli Gao; +Cc: David S. Miller, netfilter-devel, netdev
In-Reply-To: <1280560117-30747-1-git-send-email-xiaosuo@gmail.com>

On 31.07.2010 09:08, Changli Gao wrote:
> the only user of unique_tuple() get_unique_tuple() doesn't care about the
> return value of unique_tuple(), so make unique_tuple() return void (nothing).

Applied, thanks.


* Re: [PATCH v2 2/2] nf_nat: don't check if the tuple is unique when there isn't any other choice
From: Patrick McHardy @ 2010-08-02 15:36 UTC (permalink / raw)
  To: Changli Gao; +Cc: David S. Miller, netfilter-devel, netdev
In-Reply-To: <1280560164-30783-1-git-send-email-xiaosuo@gmail.com>

On 31.07.2010 09:09, Changli Gao wrote:
> the tuple got from unique_tuple() doesn't need to be really unique, so the
> check for the unique tuple isn't necessary, when there isn't any other
> choice. Eliminating the unnecessary nf_nat_used_tuple() can save some CPU
> cycles too.

Applied.


* Re: [PATCH v2] nf_nat: no IP_NAT_RANGE_MAP_IPS flags when alloc_null_binding()
From: Patrick McHardy @ 2010-08-02 15:41 UTC (permalink / raw)
  To: Changli Gao; +Cc: David S. Miller, netfilter-devel, netdev
In-Reply-To: <1280673907-12619-1-git-send-email-xiaosuo@gmail.com>

On 01.08.2010 16:45, Changli Gao wrote:
> when alloc_null_binding(), no IP_NAT_RANGE_MAP_IPS in flags means no IP address
> translation is needed. It isn't necessary to specify the address explicitly.

When sending updated patches, you have to tell people what has changed.


* Re: [PATCH] nf: make skb_make_writable() return bool
From: Patrick McHardy @ 2010-08-02 15:45 UTC (permalink / raw)
  To: Changli Gao; +Cc: David S. Miller, netfilter-devel, netdev
In-Reply-To: <1280559384-25197-1-git-send-email-xiaosuo@gmail.com>

On 31.07.2010 08:56, Changli Gao wrote:
> Signed-off-by: Changli Gao <xiaosuo@gmail.com>
> ----

> -int skb_make_writable(struct sk_buff *skb, unsigned int writable_len)
> +bool skb_make_writable(struct sk_buff *skb, unsigned int writable_len)

You can change things like this when making actual changes to the code,
but this is just useless noise in the changelogs and makes it harder
to locate real changes in the history.


* Re: [PATCH] ip_fragment: fix subtracting PPPOE_SES_HLEN from mtu twice
From: Patrick McHardy @ 2010-08-02 15:50 UTC (permalink / raw)
  To: Changli Gao; +Cc: David S. Miller, Bart De Schuymer, netdev
In-Reply-To: <1280618708-9241-1-git-send-email-xiaosuo@gmail.com>

On 01.08.2010 01:25, Changli Gao wrote:
> 6c79bf0f2440fd250c8fce8d9b82fcf03d4e8350 subtracts PPPOE_SES_HLEN from mtu at
> the front of ip_fragment(). So the later subtraction should be removed. The
> MTU of 802.1q is also 1500, so MTU should not be changed.

Bart, please review, thanks.

> Signed-off-by: Changli Gao <xiaosuo@gmail.com>
> ----
>  net/ipv4/ip_output.c |    6 ++----
>  1 file changed, 2 insertions(+), 4 deletions(-)
> diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
> index 6652bd9..04b6989 100644
> --- a/net/ipv4/ip_output.c
> +++ b/net/ipv4/ip_output.c
> @@ -446,7 +446,7 @@ int ip_fragment(struct sk_buff *skb, int (*output)(struct sk_buff *))
>  	int ptr;
>  	struct net_device *dev;
>  	struct sk_buff *skb2;
> -	unsigned int mtu, hlen, left, len, ll_rs, pad;
> +	unsigned int mtu, hlen, left, len, ll_rs;
>  	int offset;
>  	__be16 not_last_frag;
>  	struct rtable *rt = skb_rtable(skb);
> @@ -585,9 +585,7 @@ slow_path:
>  	/* for bridged IP traffic encapsulated inside f.e. a vlan header,
>  	 * we need to make room for the encapsulating header
>  	 */
> -	pad = nf_bridge_pad(skb);
> -	ll_rs = LL_RESERVED_SPACE_EXTRA(rt->dst.dev, pad);
> -	mtu -= pad;
> +	ll_rs = LL_RESERVED_SPACE_EXTRA(rt->dst.dev, nf_bridge_pad(skb));
>  
>  	/*
>  	 *	Fragment the datagram.
> 



* Re: [PATCH v2 2/2] macvtap: Implement multiqueue macvtap driver
From: Arnd Bergmann @ 2010-08-02 15:52 UTC (permalink / raw)
  To: Krishna Kumar; +Cc: davem, bhutchings, netdev, mst, therbert
In-Reply-To: <20100802143310.1517.55824.sendpatchset@krkumar2.in.ibm.com>

On Monday 02 August 2010, Krishna Kumar wrote:
> Implement multiqueue facility for macvtap driver. The idea is that
> a macvtap device can be opened multiple times and the fd's can be
> used to register eg, as backend for vhost.
>
> Please review.

Only two very minor points from my side:

> diff -ruNp org/include/linux/netdevice.h new/include/linux/netdevice.h
> --- org/include/linux/netdevice.h	2010-07-25 16:57:07.000000000 +0530
> +++ new/include/linux/netdevice.h	2010-08-02 16:05:57.000000000 +0530
> @@ -2253,6 +2253,7 @@ static inline const char *netdev_name(co
>  	return dev->name;
>  }
>  
> +extern int skb_calculate_flow(struct net_device *dev, struct sk_buff *skb);
>  extern int netdev_printk(const char *level, const struct net_device *dev,
>  			 const char *format, ...)
>  	__attribute__ ((format (printf, 3, 4)));

This logically belongs in the first patch.

> diff -ruNp org/include/linux/if_macvlan.h new/include/linux/if_macvlan.h
> --- org/include/linux/if_macvlan.h	2010-08-02 15:32:33.000000000 +0530
> +++ new/include/linux/if_macvlan.h	2010-08-02 15:32:33.000000000 +0530
> @@ -40,6 +40,14 @@ struct macvlan_rx_stats {
>  	unsigned long		rx_errors;
>  };
>  
> +#define MIN(x, y)		(((x) < (y)) ? (x) : (y))
> +
> +/*
> + * Maximum times a macvtap device can be opened. This can be used to
> + * configure the number of receive queue, e.g. for multiqueue virtio.
> + */
> +#define MAX_MACVTAP_QUEUES	MIN(16, NR_CPUS)
> +

Please use the existing min() or min_t() macro instead of providing your own.

	Arnd


* Re: [PATCH] nf_conntrack_acct: use skb->len for accounting
From: Patrick McHardy @ 2010-08-02 15:56 UTC (permalink / raw)
  To: Changli Gao; +Cc: David S. Miller, netfilter-devel, netdev
In-Reply-To: <1279856399-9058-1-git-send-email-xiaosuo@gmail.com>

On 23.07.2010 05:39, Changli Gao wrote:
> use skb->len for accounting as xt_quota does. Since nf_conntrack works at
> the network layer, skb_network_offset should always return zero.

Applied.


* Re: [RFC PATCH v8 00/16] Provide a zero-copy method on KVM virtio-net.
From: Shirley Ma @ 2010-08-02 16:01 UTC (permalink / raw)
  To: Avi Kivity
  Cc: xiaohui.xin, netdev, kvm, linux-kernel, mst, mingo, davem,
	herbert, jdike
In-Reply-To: <4C552DC4.5000600@redhat.com>

Hello Avi,

On Sun, 2010-08-01 at 11:18 +0300, Avi Kivity wrote:
> I don't understand.  Under what conditions do you use
> get_user_pages() 
> instead of get_user_pages_fast()?  Why?

The code always calls get_user_pages_fast; however, the page will be
unpinned in skb_free if the same page is not used again for a new
buffer. The reason for unpinning the page is that we don't want to pin
all of the guest kernel memory (memory overcommit). So
get_user_pages_fast will call the slow path, get_user_pages.

Your previous comment is suggesting to keep the page pinned for
get_user_pages_fast fast path?

Thanks
Shirley



* Re: [RFC PATCH v8 00/16] Provide a zero-copy method on KVM virtio-net.
From: Shirley Ma @ 2010-08-02 16:04 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: xiaohui.xin, netdev, kvm, linux-kernel, mingo, davem, herbert,
	jdike
In-Reply-To: <20100801083113.GB16158@redhat.com>

Hello Michael,

On Sun, 2010-08-01 at 11:31 +0300, Michael S. Tsirkin wrote:
> I think we should explore the idea for the driver to fall back on data
> copy
> for small message sizes.
> The benefit of zero copy would then be CPU utilization on large
> messages.

Yes, we used to use 128 bytes as the small-copy threshold in another
driver. I saw that Xiaohui's patch here uses 64 bytes. I think we need to
compare the performance on different platforms to decide the best
threshold for small messages.

Thanks
Shirley



* Re: [RFC PATCH v8 00/16] Provide a zero-copy method on KVM virtio-net.
From: Shirley Ma @ 2010-08-02 16:10 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: xiaohui.xin, netdev, kvm, linux-kernel, mingo, davem, herbert,
	jdike
In-Reply-To: <20100801083113.GB16158@redhat.com>

Hello Michael,

On Sun, 2010-08-01 at 11:31 +0300, Michael S. Tsirkin wrote:
> Could you provide an example of a good setup?
> Specifically, is it a good idea for the vhost thread
> to inherit CPU affinities from qemu? 

I need to retest my setup with multi-threaded vhost. My previous setup
applies to single-threaded vhost. For example, with two quad-core
sockets, the following single-stream netperf/netserver setup gets a
consistent 9.4Gb/s of bandwidth:

socket 1:
cpu0: netperf/netserver
cpu1: ixgbe 10GbE NIC IRQ 
cpu2: I/O thread
cpu3: vhost thread

socket 2:
cpu0: QEMU VCPU0
cpu1: QEMU VCPU1
cpu2:
cpu3:

Thanks
Shirley



* Re: Re: why do we need printk on sending syn flood cookie?
From: Franchoze Eric @ 2010-08-02 16:11 UTC (permalink / raw)
  To: Florian Westphal; +Cc: netdev
In-Reply-To: <20100802081716.GA8374@Chamillionaire.breakpoint.cc>



02.08.10, 12:17, "Florian Westphal" <fw@strlen.de>:

> Franchoze Eric  wrote:
>  >  Just curious: why do we need a printk each 1 second (60*HZ) about a possible syn-flood? It really floods dmesg. Is there something dangerous? I suggest turning off the printk about sending a tcp cookie each 1 second.
>  
>  It is handled exactly like other printks in the networking path,
>  e.g. receipt of tcp wscale == 15.
>  
>  Why does this need special treatment?
>  

For now I see the "possible SYN flooding on port %d. Sending cookies.\n" message each second on my server. I know that there are a lot of SYNs and I know that the kernel sends cookies. Why do I need so many printks?
So I suggested adding a new value to /proc/sys/net/ipv4/tcp_syncookies, which will enable cookies but turn this printk off.
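
A minimal userspace sketch of the behaviour being discussed, assuming the warning is limited to one message per 60*HZ ticks as quoted above. The `quiet` field is a hypothetical stand-in for the proposed extra tcp_syncookies value; the struct, the function name and the HZ value are illustrative and are not the kernel's code.

```c
#include <assert.h>
#include <stdbool.h>

#define HZ 100UL	/* assumed tick rate for this model only */

struct flood_warn {
	unsigned long last;	/* tick at which the last message was emitted */
	bool printed;		/* true once any message has been emitted */
	bool quiet;		/* hypothetical knob: cookies on, message off */
};

/* Returns true when a warning should be emitted at tick "now". */
bool flood_warn_due(struct flood_warn *w, unsigned long now)
{
	if (w->quiet)
		return false;	/* proposed mode: never print */
	if (w->printed && now - w->last < 60 * HZ)
		return false;	/* still inside the rate-limit window */
	w->last = now;
	w->printed = true;
	return true;
}
```

With `quiet` left false this reproduces the current one-message-per-window behaviour; setting it models the requested "cookies without the message" mode.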


* Re: [RFC PATCH v8 00/16] Provide a zero-copy method on KVM virtio-net.
From: Avi Kivity @ 2010-08-02 16:11 UTC (permalink / raw)
  To: Shirley Ma
  Cc: xiaohui.xin, netdev, kvm, linux-kernel, mst, mingo, davem,
	herbert, jdike
In-Reply-To: <1280764918.22830.7.camel@localhost.localdomain>

  On 08/02/2010 07:01 PM, Shirley Ma wrote:
> Hello Avi,
>
> On Sun, 2010-08-01 at 11:18 +0300, Avi Kivity wrote:
>> I don't understand.  Under what conditions do you use
>> get_user_pages()
>> instead of get_user_pages_fast()?  Why?
> The code always calls get_user_pages_fast; however, the page will be
> unpinned in skb_free if the same page is not used again for a new
> buffer. The reason for unpinning the page is that we don't want to pin
> all of the guest kernel memory (memory overcommit).

That is fine.

> So get_user_pages_fast will
> call slow path get_user_pages.

I don't understand this. gup_fast() only calls gup() if the page is 
swapped out or read-only.

> Your previous comment is suggesting to keep the page pinned for
> get_user_pages_fast fast path?
>

Right now I'm not sure I understand what's happening.

-- 
error compiling committee.c: too many arguments to function


* Re: [RFC PATCH v8 00/16] Provide a zero-copy method on KVM virtio-net.
From: Shirley Ma @ 2010-08-02 16:25 UTC (permalink / raw)
  To: Avi Kivity
  Cc: xiaohui.xin, netdev, kvm, linux-kernel, mst, mingo, davem,
	herbert, jdike
In-Reply-To: <4C56EE3D.1050203@redhat.com>

On Mon, 2010-08-02 at 19:11 +0300, Avi Kivity wrote:
> I don't understand this. gup_fast() only calls gup() if the page is 
> swapped out or read-only.

Oh, I used the page as read-only on xmit path. Should I use write
instead?

Thanks
Shirley



* Re: [PATCH v2 2/2] macvtap: Implement multiqueue macvtap driver
From: Krishna Kumar2 @ 2010-08-02 16:28 UTC (permalink / raw)
  To: Arnd Bergmann; +Cc: bhutchings, davem, mst, netdev, therbert
In-Reply-To: <201008021752.28475.arnd@arndb.de>

Hi Arnd,

Thanks for your comments. The declaration was in the 2nd patch
since the function was not used outside net/core/dev.c after the
1st patch is applied, but now I think you are right.

Regarding min/min_t, I had tried both and got this error:

"include/linux/if_macvlan.h:57: error: braced-group within expression
	allowed only inside a function"
Please let me know if there is any alternative (curly braces cannot be
used outside of functions). Otherwise one change required is to add:

#ifndef MIN
#endif

I will wait for a few hours and resubmit the patches.

thanks,

- KK




* Re: 2.6.35-rc6-git6: Reported regressions from 2.6.34
From: Tejun Heo @ 2010-08-02 16:32 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Rafael J. Wysocki, Jens Axboe, Linux Kernel Mailing List,
	Maciej Rutecki, Andrew Morton, Kernel Testers List,
	Network Development, Linux ACPI, Linux PM List, Linux SCSI List,
	Linux Wireless List, DRI
In-Reply-To: <AANLkTimcH7+Bq1UEbaSU7SQpzArPgmSLegiqE13V8=CF@mail.gmail.com>

Hello, Linus.

On 08/01/2010 08:01 PM, Linus Torvalds wrote:
> This has a proposed patch. I don't know what the status of it is, though. Jens?
> 
>    http://marc.info/?l=linux-kernel&m=127950018204029&w=2
> 
>> Bug-Entry       : http://bugzilla.kernel.org/show_bug.cgi?id=16393
>> Subject         : kernel BUG at fs/block_dev.c:765!
>> Submitter       : Markus Trippelsdorf <markus@trippelsdorf.de>
>> Date            : 2010-07-14 13:52 (19 days old)
>> Message-ID      : <20100714135217.GA1797@arch.tripp.de>
>> References      : http://marc.info/?l=linux-kernel&m=127911564213748&w=2
> 
> This one is interesting. And I think I perhaps see where it's coming from.
> 
> bd_start_claiming() (through bd_prepare_to_claim()) has two separate
> success cases: either there was no holder (bd_claiming is NULL) or the
> new holder was already claiming it (bd_claiming == holder).
> 
> Note in particular the case of the holder _already_ holding it. What happens is:
> 
>  - bd_start_claiming() succeeds because we had _already_ claimed it
> with the same holder
> 
>  - then some error happens, and we call bd_abort_claiming(), which
> does whole->bd_claiming = NULL;
> 
>  - the original holder thinks it still holds the bd, but it has been released!
> 
>  - a new claimer comes in, and succeeds because bd_claiming is now NULL.
> 
>  - we now have two "owners" of the bd, but bd_claiming only points to
> the second one.
> 
> I think bd_start_claiming() needs to do some kind of refcount for the
> nested holder case, and bd_abort_claiming() needs to decrement the
> refcount and only clear the bd_claiming field when it goes down to
> zero.
> 
> I dunno. Maybe there's something else going on, but it does look
> suspicious, and the above would explain the BUG_ON().

Yeah, that definitely sounds plausible.  I think the condition check
in bd_prepare_to_claim() should have been "if (whole->bd_claiming)"
instead of "if (whole->bd_claiming && whole->bd_claiming != holder)".
It doesn't make much sense to allow multiple parallel claiming
operations anyway and the comment above already says - "This function
fails if @bdev is already claimed by another holder and waits if
another claiming is in progress."

I'll try to build a test case and verify it.
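
A simplified userspace model of the scenario described above may help when building that test case. It is a sketch only: the struct and the three helper names are hypothetical, locking and waiting are omitted, and the real bd_prepare_to_claim() / bd_start_claiming() split is collapsed into a single step.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for the whole block device; only the field under discussion. */
struct bdev_model {
	const void *bd_claiming;	/* holder with a claim in progress, or NULL */
};

/* Current check: succeeds if nobody is claiming, or the same holder is. */
bool prepare_to_claim_lenient(struct bdev_model *whole, const void *holder)
{
	if (whole->bd_claiming && whole->bd_claiming != holder)
		return false;
	whole->bd_claiming = holder;
	return true;
}

/* Suggested check: any claim already in progress blocks a new one. */
bool prepare_to_claim_strict(struct bdev_model *whole, const void *holder)
{
	if (whole->bd_claiming)
		return false;
	whole->bd_claiming = holder;
	return true;
}

/* Abort path: unconditionally forgets the claimer, nested or not. */
void abort_claiming(struct bdev_model *whole)
{
	whole->bd_claiming = NULL;
}
```

With the lenient check, a nested claim by holder A followed by one abort leaves bd_claiming NULL while A still believes it holds the device, so a second holder B can get in. With the strict check the nested claim is refused up front, so the abort never happens.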

Thank you.

-- 
tejun


* Re: [RFC PATCH v8 00/16] Provide a zero-copy method on KVM virtio-net.
From: Avi Kivity @ 2010-08-02 16:32 UTC (permalink / raw)
  To: Shirley Ma
  Cc: xiaohui.xin, netdev, kvm, linux-kernel, mst, mingo, davem,
	herbert, jdike
In-Reply-To: <1280766319.22830.24.camel@localhost.localdomain>

  On 08/02/2010 07:25 PM, Shirley Ma wrote:
> On Mon, 2010-08-02 at 19:11 +0300, Avi Kivity wrote:
>> I don't understand this. gup_fast() only calls gup() if the page is
>> swapped out or read-only.
> Oh, I used the page as read-only on xmit path. Should I use write
> instead?

No, for xmit getting the page as read only is fine.

I was inaccurate, gup_fast() performs as follows:

- if .write = 1, gup_fast() will be fast if the page is mapped and writeable
- if .write = 0, gup_fast() will be fast if the page is mapped

so, using .write = 0 for the xmit path will be faster in more cases than 
.write = 1.

When are you seeing gup_fast() fall back to gup()?  It should be at most 
once per page (when a guest starts up none of its pages are mapped, it 
faults them in on demand).
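
The two rules above can be written down as a small predicate. This is only a model of the description in this mail, not kernel code; `struct page_state` and `gup_fast_path_ok()` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical summary of a page's mapping state; not a kernel structure. */
struct page_state {
	bool mapped;	/* page is present in the page tables */
	bool writable;	/* mapping allows writes */
};

/* Returns true when, per the rules above, gup_fast() needs no fallback. */
bool gup_fast_path_ok(struct page_state pg, bool write)
{
	if (!pg.mapped)
		return false;		/* unmapped pages always take the slow path */
	return write ? pg.writable : true;
}
```

A mapped read-only page stays on the fast path with .write = 0 but falls back with .write = 1, which is why read-only pinning is the better choice for xmit.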

-- 
error compiling committee.c: too many arguments to function



* Re: [PATCH v2 2/2] macvtap: Implement multiqueue macvtap driver
From: Arnd Bergmann @ 2010-08-02 16:34 UTC (permalink / raw)
  To: Krishna Kumar2; +Cc: bhutchings, davem, mst, netdev, therbert
In-Reply-To: <OFE0F15F8D.2C34C2AD-ON65257773.00579311-65257773.005A60A9@in.ibm.com>

On Monday 02 August 2010, Krishna Kumar2 wrote:
> "include/linux/if_macvlan.h:57: error: braced-group within expression
>         allowed only inside a function"
> Please let me know if there is any alternative (curly braces cannot be
> used outside of functions). Otherwise one change required is to add:
> 
> #ifndef MIN
> #endif
> 
> I will wait for a few hours and resubmit the patches.

Maybe just open-code the minimum computation:

#define MAX_MACVTAP_QUEUES   (NR_CPUS < 16 ? NR_CPUS : 16)
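
A small GNU C sketch of why the open-coded form works where a min()-style macro does not. The NR_CPUS value of 4 and the names min_stmt, queue_slots and runtime_queue_limit are assumptions made for illustration; min_stmt only imitates the statement-expression style of the kernel macro.

```c
#include <assert.h>

/* Stand-in for the kernel config value; 4 is an arbitrary assumption. */
#define NR_CPUS 4

/* Open-coded minimum: a plain constant expression, valid at file scope. */
#define MAX_MACVTAP_QUEUES (NR_CPUS < 16 ? NR_CPUS : 16)

/* Statement-expression minimum in the style of min(); GNU C accepts
 * "({ ... })" only inside a function body. */
#define min_stmt(x, y) ({ __typeof__(x) _a = (x); \
			  __typeof__(y) _b = (y); \
			  _a < _b ? _a : _b; })

/* Fine: the array bound is an integer constant expression. */
int queue_slots[MAX_MACVTAP_QUEUES];

/* Would fail with "braced-group within expression allowed only inside
 * a function":
 *
 * int bad_slots[min_stmt(16, NR_CPUS)];
 */

/* Inside a function the statement expression is accepted. */
int runtime_queue_limit(int requested)
{
	return min_stmt(requested, MAX_MACVTAP_QUEUES);
}
```

The ternary form evaluates its arguments more than once, which is harmless here because both operands are constants.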

	Arnd


* Re: [PATCH] tc: make symbols loaded from tc action modules global.
From: Stephen Hemminger @ 2010-08-02 16:55 UTC (permalink / raw)
  To: Andreas Henriksson; +Cc: netdev
In-Reply-To: <20100802073032.GA32046@amd64.fatal.se>

On Mon, 2 Aug 2010 09:30:33 +0200
Andreas Henriksson <andreas@fatal.se> wrote:

> Fixes problems with xtables based MARK target ("ipt" module).
> When tc loads the "ipt" (xt) module it kept the symbols local,
> this made loading of libxtables not find the required struct.
> 
> currently ipt/xt is the only tc action module.
> iproute2 never seems to do dlclose.
> hopefully the modules don't export more symbols than needed.
> 
> In this situation hopefully the RTLD_GLOBAL flag won't hurt us.
> 
> I've been using this patch in the Debian package of iproute for
> the last 3 weeks and no one has complained.
> ( This fixes http://bugs.debian.org/584898 )
> 
> Signed-off-by: Andreas Henriksson <andreas@fatal.se>
> ---
>  tc/m_action.c |    2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
> 
> diff --git a/tc/m_action.c b/tc/m_action.c
> index a198158..6464b2e 100644
> --- a/tc/m_action.c
> +++ b/tc/m_action.c
> @@ -99,7 +99,7 @@ restart_s:
>  	}
>  
>  	snprintf(buf, sizeof(buf), "%s/m_%s.so", get_tc_lib(), str);
> -	dlh = dlopen(buf, RTLD_LAZY);
> +	dlh = dlopen(buf, RTLD_LAZY | RTLD_GLOBAL);
>  	if (dlh == NULL) {
>  		dlh = aBODY;
>  		if (dlh == NULL) {


Applied


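* Get a raise by relying on yourself, not your boss; professional coaching, 30,000-80,000!!
From: coleenwcq06046432 @ 2010-08-02 17:49 UTC (permalink / raw)
  To: chiupc, 7637090, llen880921, mike140250360, ccs7418, alduyafang,
	helesa22, ayrtg

Do you spend your whole paycheck every month, or is it never enough?
Are you tired of always having to ask others for money?
Do you want to stop rushing to pick up the kids after work?
Join the WFS work-from-home system now.
WFS is scientifically designed
to suit modern and future ways of working.
Just join us,
enter the system and follow it step by step,
and you will change your current life and realize your dreams.

Click the link below now to change your life (age 23 and over only)
http://www.moonnini.com/w/herblifetw/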




* Re: why do we need printk on sending syn flood cookie?
From: Mitchell Erblich @ 2010-08-02 18:10 UTC (permalink / raw)
  To: Franchoze Eric; +Cc: Florian Westphal, netdev
In-Reply-To: <23001280765498@web50.yandex.ru>


On Aug 2, 2010, at 9:11 AM, Franchoze Eric wrote:

> 
> 
> 02.08.10, 12:17, "Florian Westphal" <fw@strlen.de>:
> 
>> Franchoze Eric  wrote:
>>> Just curious: why do we need a printk each 1 second (60*HZ) about a possible syn-flood? It really floods dmesg. Is there something dangerous? I suggest turning off the printk about sending a tcp cookie each 1 second.
>> 
>> It is handled exactly like other printks in the networking path,
>> e.g. receipt of tcp wscale == 15.
>> 
>> Why does this need special treatment?
>> 
> 
> For now I see the "possible SYN flooding on port %d. Sending cookies.\n" message each second on my server. I know that there are a lot of SYNs and I know that the kernel sends cookies. Why do I need so many printks?
> So I suggested adding a new value to /proc/sys/net/ipv4/tcp_syncookies, which will enable cookies but turn this printk off.


One print per second is a very good generic informative message to an
admin that this system either has some very small configured or default
values (normally set up as a percentage of memory, or via a socket option,
and/or ...), and/or that for some reason a large number of SYNs are being
received, and/or that a number of connections are being intentionally or
unintentionally retried and/or dropped.

Remember that each printk may represent only a small fraction of the
number of SYNs received, and this fraction could depend on the Mb/Gb speed
of the interface(s) or, more likely, on some kind of average or sum over
the network paths involved.

Mitchell Erblich






* softirq warnings when calling dev_kfree_skb_irq - bug in conntrack?
From: Jeremy Fitzhardinge @ 2010-08-02 18:54 UTC (permalink / raw)
  To: NetDev
  Cc: Xu, Dongxiao, Xen-devel@lists.xensource.com, Ian Campbell,
	Patrick McHardy, Eric Dumazet

  Hi,

I'm seeing this in the current linux-next tree:

------------[ cut here ]------------
WARNING: at kernel/softirq.c:143 local_bh_enable+0x40/0x87()
Modules linked in: xt_state dm_mirror dm_region_hash dm_log microcode [last unloaded: scsi_wait_scan]
Pid: 0, comm: swapper Not tainted 2.6.35-rc6-next-20100729+ #29
Call Trace:
  <IRQ>   [<ffffffff81030de3>] warn_slowpath_common+0x80/0x98
  [<ffffffff81030e10>] warn_slowpath_null+0x15/0x17
  [<ffffffff81035ff3>] local_bh_enable+0x40/0x87
  [<ffffffff814236e5>] destroy_conntrack+0x78/0x9e
  [<ffffffff810bea55>] ? __kmalloc_track_caller+0xc3/0x135
  [<ffffffff814203b4>] nf_conntrack_destroy+0x16/0x18
  [<ffffffff813fadee>] skb_release_head_state+0x97/0xd9
  [<ffffffff813fabbe>] __kfree_skb+0x11/0x7a
  [<ffffffff813fac4e>] consume_skb+0x27/0x29
  [<ffffffff81402d3a>] dev_kfree_skb_irq+0x18/0x62
  [<ffffffff8130a762>] xennet_tx_buf_gc+0xfc/0x192
  [<ffffffff8130a8fb>] smart_poll_function+0x50/0x121
  [<ffffffff8130a8ab>] ? smart_poll_function+0x0/0x121
  [<ffffffff8104b8d1>] __run_hrtimer+0xcc/0x127
  [<ffffffff8104bad3>] hrtimer_interrupt+0x9c/0x17b
  [<ffffffff81005f24>] xen_timer_interrupt+0x2a/0x13e
  [<ffffffff81006180>] ? check_events+0x12/0x22
  [<ffffffff81005be9>] ? xen_force_evtchn_callback+0xd/0xf
  [<ffffffff81005be9>] ? xen_force_evtchn_callback+0xd/0xf
  [<ffffffff81077641>] handle_IRQ_event+0x52/0x119
  [<ffffffff81079abe>] handle_level_irq+0x6c/0xb2
  [<ffffffff8127b3dd>] __xen_evtchn_do_upcall+0xa9/0x12a
  [<ffffffff8100616d>] ? xen_restore_fl_direct_end+0x0/0x1
  [<ffffffff8127b491>] xen_evtchn_do_upcall+0x28/0x39
  [<ffffffff810097ac>] xen_do_hypervisor_callback+0x1c/0x30
  <EOI>   [<ffffffff810013aa>] ? hypercall_page+0x3aa/0x1000
  [<ffffffff810013aa>] ? hypercall_page+0x3aa/0x1000
  [<ffffffff81005c2d>] ? xen_safe_halt+0x10/0x1a
  [<ffffffff81003eb4>] ? xen_idle+0x38/0x44
  [<ffffffff81007de4>] ? cpu_idle+0x82/0xe9
  [<ffffffff814b84e3>] ? rest_init+0x67/0x69
  [<ffffffff81afcc10>] ? start_kernel+0x387/0x392
  [<ffffffff81afc2c8>] ? x86_64_start_reservations+0xb3/0xb7
  [<ffffffff81affed2>] ? xen_start_kernel+0x4be/0x4c2
---[ end trace 755676650ea49003 ]---


The warning is:

	WARN_ON_ONCE(in_irq() || irqs_disabled());


It seems the basic problem is that xennet_tx_buf_gc() is being called in
interrupt context. With smartpoll it is called from the timer interrupt,
but even without smartpoll it is called from xennet_interrupt().
xennet_tx_buf_gc() then calls dev_kfree_skb_irq().

Since this should be perfectly OK, it appears the problem is actually in
conntrack.  I'm not sure when this bug started happening, but it's
relatively recent, I think.

Thanks,
     J


* Re: [PATCH] ip_fragment: fix subtracting PPPOE_SES_HLEN from mtu twice
From: Bart De Schuymer @ 2010-08-02 19:20 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: Changli Gao, David S. Miller, netdev
In-Reply-To: <4C56E961.9070107@trash.net>

Patrick McHardy wrote:
> On 01.08.2010 01:25, Changli Gao wrote:
>   
>> 6c79bf0f2440fd250c8fce8d9b82fcf03d4e8350 subtracts PPPOE_SES_HLEN from mtu at
>> the front of ip_fragment(). So the later subtraction should be removed. The
>> MTU of 802.1q is also 1500, so MTU should not be changed.
>>     
>
> Bart, please review, thanks.
>
>   
The patch looks correct. The commit Changli refers to fixed the case 
where fragments are already available but broke the slow_path. The MTU 
for 802.1Q is indeed also 1500...
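
To make the arithmetic concrete, here is a minimal sketch assuming a 1500-byte device MTU and bridged PPPoE traffic, where the padding equals PPPOE_SES_HLEN (8 bytes). The two helper names are hypothetical and only model the subtraction, not ip_fragment() itself.

```c
#include <assert.h>

/* PPPoE session header length in bytes (same value as the kernel constant). */
#define PPPOE_SES_HLEN 8u

/* Hypothetical helper: the padding is accounted for once, up front. */
unsigned int frag_mtu_fixed(unsigned int dev_mtu, unsigned int pad)
{
	assert(dev_mtu > pad);
	return dev_mtu - pad;
}

/* Hypothetical helper: the pre-patch slow path, which subtracts the
 * same padding a second time. */
unsigned int frag_mtu_buggy(unsigned int dev_mtu, unsigned int pad)
{
	unsigned int mtu = frag_mtu_fixed(dev_mtu, pad);

	return mtu - pad;
}
```

For a 1500-byte MTU the fixed path leaves 1492 bytes for fragments, while the double subtraction leaves only 1484.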

cheers,
Bart

Signed-off-by: Bart De Schuymer <bdschuym@pandora.bo>
>> Signed-off-by: Changli Gao <xiaosuo@gmail.com>
>> ----
>>  net/ipv4/ip_output.c |    6 ++----
>>  1 file changed, 2 insertions(+), 4 deletions(-)
>> diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
>> index 6652bd9..04b6989 100644
>> --- a/net/ipv4/ip_output.c
>> +++ b/net/ipv4/ip_output.c
>> @@ -446,7 +446,7 @@ int ip_fragment(struct sk_buff *skb, int (*output)(struct sk_buff *))
>>  	int ptr;
>>  	struct net_device *dev;
>>  	struct sk_buff *skb2;
>> -	unsigned int mtu, hlen, left, len, ll_rs, pad;
>> +	unsigned int mtu, hlen, left, len, ll_rs;
>>  	int offset;
>>  	__be16 not_last_frag;
>>  	struct rtable *rt = skb_rtable(skb);
>> @@ -585,9 +585,7 @@ slow_path:
>>  	/* for bridged IP traffic encapsulated inside f.e. a vlan header,
>>  	 * we need to make room for the encapsulating header
>>  	 */
>> -	pad = nf_bridge_pad(skb);
>> -	ll_rs = LL_RESERVED_SPACE_EXTRA(rt->dst.dev, pad);
>> -	mtu -= pad;
>> +	ll_rs = LL_RESERVED_SPACE_EXTRA(rt->dst.dev, nf_bridge_pad(skb));
>>  
>>  	/*
>>  	 *	Fragment the datagram.
>>
>>     
>
>
>   


-- 
Bart De Schuymer
www.artinalgorithms.be



* [PATCH 1/2] phy/marvell: add 88e1121 interface mode support
From: Cyril Chemparathy @ 2010-08-02 19:44 UTC (permalink / raw)
  To: netdev; +Cc: Cyril Chemparathy
In-Reply-To: <1280778294-2993-1-git-send-email-cyril@ti.com>

This patch adds support for RGMII RX/TX delay configuration on marvell 88e1121
and derivatives.  With this patch, PHY_INTERFACE_MODE_RGMII_*ID modes are now
supported on these devices.

Signed-off-by: Cyril Chemparathy <cyril@ti.com>
---
 drivers/net/phy/marvell.c |   35 ++++++++++++++++++++++++++++++++---
 1 files changed, 32 insertions(+), 3 deletions(-)

diff --git a/drivers/net/phy/marvell.c b/drivers/net/phy/marvell.c
index 78b74e8..b1413ae 100644
--- a/drivers/net/phy/marvell.c
+++ b/drivers/net/phy/marvell.c
@@ -69,6 +69,12 @@
 #define MII_M1111_COPPER		0
 #define MII_M1111_FIBER			1
 
+#define MII_88E1121_PHY_MSCR_PAGE	2
+#define MII_88E1121_PHY_MSCR_REG	21
+#define MII_88E1121_PHY_MSCR_RX_DELAY	BIT(5)
+#define MII_88E1121_PHY_MSCR_TX_DELAY	BIT(4)
+#define MII_88E1121_PHY_MSCR_DELAY_MASK	(~(0x3 << 4))
+
 #define MII_88E1121_PHY_LED_CTRL	16
 #define MII_88E1121_PHY_LED_PAGE	3
 #define MII_88E1121_PHY_LED_DEF		0x0030
@@ -180,7 +186,30 @@ static int marvell_config_aneg(struct phy_device *phydev)
 
 static int m88e1121_config_aneg(struct phy_device *phydev)
 {
-	int err, temp;
+	int err, oldpage, mscr;
+
+	oldpage = phy_read(phydev, MII_88E1121_PHY_PAGE);
+
+	err = phy_write(phydev, MII_88E1121_PHY_PAGE,
+			MII_88E1121_PHY_MSCR_PAGE);
+	if (err < 0)
+		return err;
+	mscr = phy_read(phydev, MII_88E1121_PHY_MSCR_REG) &
+		MII_88E1121_PHY_MSCR_DELAY_MASK;
+
+	if (phydev->interface == PHY_INTERFACE_MODE_RGMII_ID)
+		mscr |= (MII_88E1121_PHY_MSCR_RX_DELAY |
+			 MII_88E1121_PHY_MSCR_TX_DELAY);
+	else if (phydev->interface == PHY_INTERFACE_MODE_RGMII_RXID)
+		mscr |= MII_88E1121_PHY_MSCR_RX_DELAY;
+	else if (phydev->interface == PHY_INTERFACE_MODE_RGMII_TXID)
+		mscr |= MII_88E1121_PHY_MSCR_TX_DELAY;
+
+	err = phy_write(phydev, MII_88E1121_PHY_MSCR_REG, mscr);
+	if (err < 0)
+		return err;
+
+	phy_write(phydev, MII_88E1121_PHY_PAGE, oldpage);
 
 	err = phy_write(phydev, MII_BMCR, BMCR_RESET);
 	if (err < 0)
@@ -191,11 +220,11 @@ static int m88e1121_config_aneg(struct phy_device *phydev)
 	if (err < 0)
 		return err;
 
-	temp = phy_read(phydev, MII_88E1121_PHY_PAGE);
+	oldpage = phy_read(phydev, MII_88E1121_PHY_PAGE);
 
 	phy_write(phydev, MII_88E1121_PHY_PAGE, MII_88E1121_PHY_LED_PAGE);
 	phy_write(phydev, MII_88E1121_PHY_LED_CTRL, MII_88E1121_PHY_LED_DEF);
-	phy_write(phydev, MII_88E1121_PHY_PAGE, temp);
+	phy_write(phydev, MII_88E1121_PHY_PAGE, oldpage);
 
 	err = genphy_config_aneg(phydev);
 
-- 
1.7.0.4
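
The delay-bit handling in the patch above can be exercised on its own as a pure function. This is a sketch under assumptions: the bit positions and mask are copied from the patch, while the enum and the name mscr_with_delays() are hypothetical stand-ins for the kernel's interface mode values and for the inline logic in m88e1121_config_aneg().

```c
#include <assert.h>

#define BIT(n)			(1u << (n))
#define MSCR_RX_DELAY		BIT(5)
#define MSCR_TX_DELAY		BIT(4)
#define MSCR_DELAY_MASK		(~(0x3u << 4))

/* Hypothetical stand-in for the PHY_INTERFACE_MODE_RGMII* values. */
enum rgmii_mode { RGMII, RGMII_ID, RGMII_RXID, RGMII_TXID };

/* Clear both delay bits, then set the ones the interface mode asks for. */
unsigned int mscr_with_delays(unsigned int mscr, enum rgmii_mode mode)
{
	mscr &= MSCR_DELAY_MASK;

	if (mode == RGMII_ID)
		mscr |= MSCR_RX_DELAY | MSCR_TX_DELAY;
	else if (mode == RGMII_RXID)
		mscr |= MSCR_RX_DELAY;
	else if (mode == RGMII_TXID)
		mscr |= MSCR_TX_DELAY;

	return mscr;
}
```

Because the mask is applied first, plain RGMII mode clears any delay bits left over from a previous configuration instead of preserving them.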



* [PATCH 0/2] Minor extensions to marvell phy driver
From: Cyril Chemparathy @ 2010-08-02 19:44 UTC (permalink / raw)
  To: netdev; +Cc: Cyril Chemparathy

This patch series adds a couple of minor extensions to the marvell phy driver.
The first patch in the series allows for RGMII TX and RX delay configuration
via interface mode.  The second patch adds support for a new device (88ec048).

Cyril Chemparathy (2):
  phy/marvell: add 88e1121 interface mode support
  phy/marvell: add 88ec048 support

 drivers/net/phy/marvell.c |   76 +++++++++++++++++++++++++++++++++++++++++++--
 1 files changed, 73 insertions(+), 3 deletions(-)



