Netdev List
 help / color / mirror / Atom feed
* Re: [PATCH 3/6] vxge: NETIF_F_LLTX removal
From: David Miller @ 2010-07-16  3:50 UTC (permalink / raw)
  To: jon.mason; +Cc: netdev, ramkrishna.vepa
In-Reply-To: <1279219648-10345-3-git-send-email-jon.mason@exar.com>

From: Jon Mason <jon.mason@exar.com>
Date: Thu, 15 Jul 2010 13:47:25 -0500

> NETIF_F_LLTX and it's usage of local transmit locks are depricated in
> favor of using the netdev queue's transmit lock.  Remove the local
> lock and all references to it, and use the netdev queue transmit lock
> in the transmit completion handler.
> 
> Signed-off-by: Jon Mason <jon.mason@exar.com>
> Signed-off-by: Ramkrishna Vepa <ramkrishna.vepa@exar.com>

Thanks a lot for doing this work.

Applied.

^ permalink raw reply

* Re: [PATCH 4/6] vxge: Update copyright information
From: David Miller @ 2010-07-16  3:51 UTC (permalink / raw)
  To: jon.mason; +Cc: netdev, sreenivasa.honnur, ramkrishna.vepa
In-Reply-To: <1279219648-10345-4-git-send-email-jon.mason@exar.com>

From: Jon Mason <jon.mason@exar.com>
Date: Thu, 15 Jul 2010 13:47:26 -0500

> Update copyright information to reflect the Exar purchase of Neterion
> 
> Signed-off-by: Jon Mason <jon.mason@exar.com>
> Signed-off-by: Sreenivasa Honnur <sreenivasa.honnur@exar.com>
> Signed-off-by: Ramkrishna Vepa <ramkrishna.vepa@exar.com>

Applied.

^ permalink raw reply

* Re: [PATCH 5/6] vxge: Update maintainers information
From: David Miller @ 2010-07-16  3:51 UTC (permalink / raw)
  To: jon.mason; +Cc: netdev, sreenivasa.honnur, ramkrishna.vepa
In-Reply-To: <1279219648-10345-5-git-send-email-jon.mason@exar.com>

From: Jon Mason <jon.mason@exar.com>
Date: Thu, 15 Jul 2010 13:47:27 -0500

> Update and correct maintainers information
> 
> Signed-off-by: Jon Mason <jon.mason@exar.com>
> Signed-off-by: Sreenivasa Honnur <sreenivasa.honnur@exar.com>
> Signed-off-by: Ramkrishna Vepa <ramkrishna.vepa@exar.com>

Applied.

^ permalink raw reply

* Re: [PATCH 6/6] vxge: Version update
From: David Miller @ 2010-07-16  3:51 UTC (permalink / raw)
  To: jon.mason; +Cc: netdev, sreenivasa.honnur, ramkrishna.vepa
In-Reply-To: <1279219648-10345-6-git-send-email-jon.mason@exar.com>

From: Jon Mason <jon.mason@exar.com>
Date: Thu, 15 Jul 2010 13:47:28 -0500

> Version update
> 
> Signed-off-by: Jon Mason <jon.mason@exar.com>
> Signed-off-by: Sreenivasa Honnur <sreenivasa.honnur@exar.com>
> Signed-off-by: Ramkrishna Vepa <ramkrishna.vepa@exar.com>

Applied.

^ permalink raw reply

* [PATCH] tcp: sizeof struct tcp_skb_cb is 44
From: Eric Dumazet @ 2010-07-16  4:20 UTC (permalink / raw)
  To: David Miller; +Cc: netdev

Correct comment stating sizeof(struct tcp_skb_cb) is 36 or 40, since its
44 bytes, since commit 951dbc8ac714b04 ([IPV6]: Move nextheader offset
to the IP6CB).

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
---
 include/net/tcp.h |    9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/include/net/tcp.h b/include/net/tcp.h
index 468b01f..df6a2eb 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -570,11 +570,10 @@ extern u32 __tcp_select_window(struct sock *sk);
 #define TCPHDR_CWR 0x80
 
 /* This is what the send packet queuing engine uses to pass
- * TCP per-packet control information to the transmission
- * code.  We also store the host-order sequence numbers in
- * here too.  This is 36 bytes on 32-bit architectures,
- * 40 bytes on 64-bit machines, if this grows please adjust
- * skbuff.h:skbuff->cb[xxx] size appropriately.
+ * TCP per-packet control information to the transmission code.
+ * We also store the host-order sequence numbers in here too.
+ * This is 44 bytes if IPV6 is enabled.
+ * If this grows please adjust skbuff.h:skbuff->cb[xxx] size appropriately.
  */
 struct tcp_skb_cb {
 	union {



^ permalink raw reply related

* Re: [PATCH] tcp: sizeof struct tcp_skb_cb is 44
From: David Miller @ 2010-07-16  4:41 UTC (permalink / raw)
  To: eric.dumazet; +Cc: netdev
In-Reply-To: <1279254059.2433.9.camel@edumazet-laptop>

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Fri, 16 Jul 2010 06:20:59 +0200

> Correct comment stating sizeof(struct tcp_skb_cb) is 36 or 40, since its
> 44 bytes, since commit 951dbc8ac714b04 ([IPV6]: Move nextheader offset
> to the IP6CB).
> 
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>

Applied, thanks Eric.

^ permalink raw reply

* Re: [PATCH RFC] act_cpu: packet distributing
From: Eric Dumazet @ 2010-07-16  4:42 UTC (permalink / raw)
  To: Changli Gao; +Cc: hadi, David S. Miller, Patrick McHardy, Tom Herbert, netdev
In-Reply-To: <AANLkTiljdBpQkZAGybwFQtej-_j7sFtD72CTErT9TaOX@mail.gmail.com>

Le vendredi 16 juillet 2010 à 07:30 +0800, Changli Gao a écrit :
> On Fri, Jul 16, 2010 at 12:54 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:

> > Even if we solve this locking problem, using percpu variables, act_cpu
> > hits another problem :
> >
> > The socket refcount, taken by the 'master' cpu, and released by the
> > consumer cpu.
> This is why I asked if I can assign sk to skb.
> 

This assignement doesnt change the cache line ping pong problem.

Let me explain for you :

You lookup the TCP or UDP socket. This automatically takes a refcount on
it (atomic operation on sk->sk_refcnt).

Then you assign sk to skb.  (skb->sk = sk)

Then you transfert skb handling to another cpu (IPI cost already in RPS
as we all know)

This remote cpu finds skb->sk (and avoids the lookup/refcnt) and handles
the packet through TCP/UDP stack.

This remote cpu _releases_ the socket refcount.

Conclusion :

"The socket refcount, taken by the 'master' cpu, and released by the
consumer cpu."




^ permalink raw reply

* Re: [PATCH 2.6.35-rc1] net-next: fix LRO feature update in vmxnet3
From: David Miller @ 2010-07-16  5:17 UTC (permalink / raw)
  To: sbhatewara; +Cc: netdev, linux-kernel, pv-drivers
In-Reply-To: <alpine.LRH.2.00.1007151821440.10176@localhost.localdomain>

From: Shreyas Bhatewara <sbhatewara@vmware.com>
Date: Thu, 15 Jul 2010 18:28:31 -0700 (PDT)

> 
> From: Shreyas Bhatewara <sbhatewara@vmware.com>
> 
> Fix LRO feature update.
> 
> Signed-off-by: Shreyas Bhatewara <sbhatewara@vmware.com>

Applied, thanks.

^ permalink raw reply

* Re: [PATCH 2.6.35-rc1] net-next: vmxnet3 fixes [2/5] Interrupt control bitmap
From: David Miller @ 2010-07-16  5:19 UTC (permalink / raw)
  To: sbhatewara; +Cc: netdev, linux-kernel, pv-drivers, ronghua
In-Reply-To: <alpine.LRH.2.00.1007150121400.25472@localhost.localdomain>

From: Shreyas Bhatewara <sbhatewara@vmware.com>
Date: Thu, 15 Jul 2010 18:20:03 -0700 (PDT)

> Reposting the patch with the endianness fix.
> 
> ---
> 
> A new bit map 'intrCtrl' is introduced in the DriverShared area. The 
> driver Ashould update VMXNET3_IC_DISABLE_ALL bit before writing IMR.
> 
> Signed-off-by: Ronghua Zang <ronghua@vmware.com>
> Signed-off-by: Shreyas Bhatewara <sbhatewara@vmware.com>

Applied.

^ permalink raw reply

* [RFC PATCH] ipv6: Make IP6CB(skb)->nhoff 16-bit.
From: David Miller @ 2010-07-16  5:30 UTC (permalink / raw)
  To: netdev


Even with jumbograms I cannot see any way in which we would need
to records a larger than 65535 valued next-header offset.

The maximum extension header length is (256 << 3) == 2048.
There are only a handful of extension headers specified which
we'd even accept (say 5 or 6), therefore the largest next-header
offset we'd ever have to contend with is something less than
say 16k.

Therefore make it a u16 instead of a u32.

Signed-off-by: David S. Miller <davem@davemloft.net>
---
Can anyone find any holes in this?  I want to do this so we
can legitimately rip ndisc_nodetype out of struct sk_buff
and stick it into the IP6CB where it belongs.  Currently
there isn't enough space, but with this change there will
be.

diff --git a/include/linux/ipv6.h b/include/linux/ipv6.h
index 940e215..ab9e9e8 100644
--- a/include/linux/ipv6.h
+++ b/include/linux/ipv6.h
@@ -246,7 +246,7 @@ struct inet6_skb_parm {
 	__u16			srcrt;
 	__u16			dst1;
 	__u16			lastopt;
-	__u32			nhoff;
+	__u16			nhoff;
 	__u16			flags;
 #if defined(CONFIG_IPV6_MIP6) || defined(CONFIG_IPV6_MIP6_MODULE)
 	__u16			dsthao;

^ permalink raw reply related

* Re: [ipmr] ipmr: Don't leak memory if fib lookup fails.
From: David Miller @ 2010-07-16  5:39 UTC (permalink / raw)
  To: greearb; +Cc: netdev
In-Reply-To: <4C3FA22A.7090706@candelatech.com>

From: Ben Greear <greearb@candelatech.com>
Date: Thu, 15 Jul 2010 17:04:58 -0700

> On 07/15/2010 04:22 PM, Ben Greear wrote:
>> This was detected using two mcast router tables.  The
>> pimreg for the second interface did not have a specific
>> mrule, so packets received by it were handled by the
>> default table, which had nothing configured.
> 
> I just realized this should be applied against 2.6.35-pre, not 2.6.34.

Fix looks good, applied to net-2.6, thanks!

^ permalink raw reply

* Re: [PATCH 2.6.35-rc1] net-next: vmxnet3 fixes [3/5] Initialize link state at probe time
From: David Miller @ 2010-07-16  5:44 UTC (permalink / raw)
  To: sbhatewara; +Cc: netdev, linux-kernel, pv-drivers
In-Reply-To: <alpine.LRH.2.00.1007150125240.25472@localhost.localdomain>

From: Shreyas Bhatewara <sbhatewara@vmware.com>
Date: Thu, 15 Jul 2010 18:20:14 -0700 (PDT)

> 
> On Wed, 14 Jul 2010, David Miller wrote:
> 
>> From: Shreyas Bhatewara <sbhatewara@vmware.com>
>> Date: Tue, 13 Jul 2010 17:48:55 -0700 (PDT)
>> 
>> > 
>> > Initialize vmxnet3 link state at probe time
>> > 
>> > This change initializes the state of link at the time when driver is
>> > loaded. The ethtool output for 'link detected' and 'link speed'
>> > is thus valid even before the interface is brought up.
>> > 
>> > Signed-off-by: Shreyas Bhatewara <sbhatewara@vmware.com>
>> 
>> You should never, ever, call netif_start_queue() on a device which has
>> not been brought up.
>> 
>> But that is what this patch is doing.
>> 
> 
> I do not understand why you say so. vmxnet3_check_link() is called in 
> probe() with affectTxQueue as false. Hence netif_start_queue() will not be 
> called before device is brought up.
> vmxnet3_check_link() is again called with affectTxQueue as true in 
> vmxnet3_activate_dev() after device was activated.

Aha, I see how the logic works now.

But still there is a problem with this patch, please remove the
driver version bump and resubmit.

You should only version bump at the last patch in a series.

Thanks.

^ permalink raw reply

* Re: [PATCH] cxgb4vf: fix SGE resource resource deallocation bug
From: David Miller @ 2010-07-16  5:47 UTC (permalink / raw)
  To: leedom; +Cc: netdev
In-Reply-To: <201007151554.43269.leedom@chelsio.com>

From: Casey Leedom <leedom@chelsio.com>
Date: Thu, 15 Jul 2010 15:54:43 -0700

>>From 1ecf246196c43d75e75525211450a64d650f4fbc Mon Sep 17 00:00:00 2001
> From: Casey Leedom <leedom@chelsio.com>
> Date: Thu, 15 Jul 2010 15:50:16 -0700
> Subject: [PATCH] cxgb4vf: fix SGE resource resource deallocation bug
> 
> Fix SGE resource resource deallocation bug.  Forgot to increment the RXQ and
> TXQ cursors in the loop ...
> 
> Signed-off-by: Casey Leedom <leedom@chelsio.com>

Applied, thanks Casey.

^ permalink raw reply

* Re: [PATCH 1/2] bnx2: allocate with GFP_KERNEL flag on RX path init
From: Mitchell Erblich @ 2010-07-16  5:57 UTC (permalink / raw)
  To: David Miller; +Cc: mchan, sgruszka, netdev
In-Reply-To: <20100715.202537.116398412.davem@davemloft.net>


On Jul 15, 2010, at 8:25 PM, David Miller wrote:

> From: "Michael Chan" <mchan@broadcom.com>
> Date: Thu, 15 Jul 2010 07:48:40 -0700
> 
>> Stanislaw Gruszka wrote:
>> 
>>> Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
>>> ---
>>> @@ -2974,7 +2974,7 @@ bnx2_rx_skb(struct bnx2 *bp, struct
>>> bnx2_rx_ring_info *rxr, struct sk_buff *skb,
>>>      int err;
>>>      u16 prod = ring_idx & 0xffff;
>>> 
>>> -     err = bnx2_alloc_rx_skb(bp, rxr, prod);
>>> +     err = bnx2_alloc_rx_skb(bp, rxr, prod, GFP_KERNEL);
>> 
>> This should be GFP_ATOMIC since it is called from NAPI softirq
>> context.
> 
> This fatal issue gives me doubts about whether this patch was even
> tested at all.
> 
> Immediately the kernel memory allocator should have issued a warning
> due to this GFP_KERNEL allocation in a non-sleep'able context.
> 
> Stanislaw, how did you test this patch?
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
Group,

		Why NOT GFP_NOWAIT. This won't use the last resource pages
		versus GFP_ATOMIC?

		GFP_ATOMIC IMO, SHOULD be used in the paths that cleans
		and frees pages.

Mitchell Erblich



^ permalink raw reply

* Re: Question about way that NICs deliver packets to the kernel
From: Junchang Wang @ 2010-07-16  7:05 UTC (permalink / raw)
  To: Ben Hutchings; +Cc: romieu, netdev
In-Reply-To: <1279204417.2118.12.camel@achroite.uk.solarflarecom.com>

>
> You should also compare the CPU usage.
>
> Ben.
>
Hi Ben,
I added options -c -C to netperf's command line. Result is as follows:
                    scheme 1    scheme 2    Imp.
Throughput:     683M        718M       5%
CPU usage:     47.8%       45.6%

That really surprised me because "top" command showed the CPU usage
was fluctuating between 0.5% and 1.5% rather that between 45% and 50%.

How can I get the exact CPU usage?

Thanks.

-- 
--Junchang

^ permalink raw reply

* Re: [PATCH 1/2] bnx2: allocate with GFP_KERNEL flag on RX path init
From: Stanislaw Gruszka @ 2010-07-16  7:13 UTC (permalink / raw)
  To: David Miller; +Cc: mchan, netdev
In-Reply-To: <20100715.202537.116398412.davem@davemloft.net>

On Thu, 15 Jul 2010 20:25:37 -0700 (PDT)
David Miller <davem@davemloft.net> wrote:

> > This should be GFP_ATOMIC since it is called from NAPI softirq
> > context.
> 
> This fatal issue gives me doubts about whether this patch was even
> tested at all.
> 
> Immediately the kernel memory allocator should have issued a warning
> due to this GFP_KERNEL allocation in a non-sleep'able context.
> 
> Stanislaw, how did you test this patch?

I run net-next-2.6 kernel with patches on machine with bnx2 device,
but I compiled kernel with CONFIG_DEBUG_KOBJECT and all dmesg was filled
by messages like:

kobject: 'block' (ffff8801663122c0): kobject_add_internal: parent: '2:2:1:0', set: '(null)'
kobject: 'sdc' (ffff8801642ca070): kobject_add_internal: parent: 'block', set: 'devices'
kobject: 'sdc' (ffff8801642ca070): kobject_uevent_env

so I missed the warning, grr...

Stanislaw

^ permalink raw reply

* Re: Kexec Reboot Network Issue
From: Simon Horman @ 2010-07-16  7:15 UTC (permalink / raw)
  To: Richard Genthner
  Cc: netdev-u79uwXL29TY76Z2rM5mHXA, Matt Carlson,
	kexec-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r, Michael Chan
In-Reply-To: <4C3F2ACF.60601-2CkDue2ckGJOSnsfY10OVw@public.gmane.org>

[ CCed Michael Chan, Matt Carlson and Netdev ]

On Thu, Jul 15, 2010 at 11:35:43AM -0400, Richard Genthner wrote:
> On 07/15/2010 10:20 AM, Simon Horman wrote:
> >On Thu, Jul 15, 2010 at 07:59:07AM -0400, Richard Genthner wrote:
> >>I'm currently using the following string:
> >>
> >>kexex --type=elf-x86_64 --args-linux -l /boot/vmlinuz-2.6.18-12.el5
> >>--initrid=/boot/initrd-2.6.18-128.el6.img --append="`cat
> >>/proc/cmdline`"
> >>kexec -e
> >>
> >>Some times we can get to the box from any subnet, other times we can
> >>only get to the box from the same subnet only. Our solution to this
> >>is to down the iface and then restart networking. Has anyone else
> >>run into this issue?
> >Hi Richard,
> >
> >could you be more specific about which NIC you are using?
> >And is it at all possible to test a newer kernel version?
> >
> >What I suspect is happening is that the NIC is getting into an unknown
> >state. And what I'm hoping is that is a problem thats already been
> >addressed.
>
> I would try a different kenerl but our cluster fs has us locked to
> this kernel until we finish the upgrade to the new cluster fs
> version. heres ethtool on the iface
> 
> from lshw
> *-network
>                 description: Ethernet interface
>                 product: NetXtreme BCM5721 Gigabit Ethernet PCI Express
>                 vendor: Broadcom Corporation
>                 physical id: 0
>                 bus info: pci@0000:03:00.0
>                 logical name: eth0
>                 version: 21
>                 serial: 00:25:64:3b:9c:ae
>                 size: 1GB/s
>                 capacity: 1GB/s
>                 width: 64 bits
>                 clock: 33MHz
>                 capabilities: pm vpd msi pciexpress bus_master
> cap_list ethernet physical tp 10bt 10bt-fd 100bt 100bt-fd 1000bt
> 1000bt-fd autonegotiation
>                 configuration: autonegotiation=on broadcast=yes
> driver=tg3 driverversion=3.93 duplex=full firmware=5721-v3.65,
> ASFIPMI v6.25 ip=172.16.1.123 latency=0 link=yes module=tg3
> multicast=yes port=twisted pair speed=1GB/s

Hi Richard,

first, please don't top-post, its not the done thing in these parts.

I had a quick hunt through the git change log and the onl changed that
jumped out was "[TG3]: Fix msi issue with kexec/kdump"[1], but this
seems to have been back-ported to the initrd-2.6.18-128.el5 (I assume
you meant 5 not 6) kernel.

[1] http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=ee6a99b539a50b4e9398938a0a6d37f8bf911550

In any case I strongly suspect that the problem is a kernel problem and as
such can't be solved modifying the kernel or at least tg.ko module (which
is probably in the initrd). I suggest logging bug report with Red Hat.

^ permalink raw reply

* Re: [PATCH 01/11] Removing dead RT2800PCI_SOC
From: Gertjan van Wingerde @ 2010-07-16  7:18 UTC (permalink / raw)
  To: Helmut Schaa
  Cc: Bartlomiej Zolnierkiewicz, Felix Fietkau, John W. Linville,
	Ivo Van Doorn, Christoph Egger, linux-wireless, users, netdev,
	linux-kernel, vamos-dev, Luis Correia
In-Reply-To: <AANLkTilqoYMMKYJT-YYbEzdnytUSYa0EylEiJ4x5xXXH@mail.gmail.com>

On 07/16/10 08:57, Helmut Schaa wrote:
> On Thu, Jul 15, 2010 at 10:41 AM, Bartlomiej Zolnierkiewicz <bzolnier@gmail.com <mailto:bzolnier@gmail.com>> wrote:
> 
>     On Wednesday 14 July 2010 04:44:44 pm Felix Fietkau wrote:
>     > On 2010-07-14 3:15 PM, John W. Linville wrote:
>     > > On Wed, Jul 14, 2010 at 02:52:14PM +0200, Ivo Van Doorn wrote:
>     > >> On Wed, Jul 14, 2010 at 2:46 PM, Luis Correia <luis.f.correia@gmail.com <mailto:luis.f.correia@gmail.com>> wrote:
>     > >> > On Wed, Jul 14, 2010 at 13:39, Christoph Egger <siccegge@cs.fau.de <mailto:siccegge@cs.fau.de>> wrote:
>     > >> >> While RT2800PCI_SOC exists in Kconfig, it depends on either
>     > >> >> RALINK_RT288X or RALINK_RT305X which are both not available in Kconfig
>     > >> >> so all Code depending on that can't ever be selected and, if there's
>     > >> >> no plan to add these options, should be cleaned up
>     > >> >>
>     > >> >> Signed-off-by: Christoph Egger <siccegge@cs.fau.de <mailto:siccegge@cs.fau.de>>
>     > >> >
>     > >> > NAK,
>     > >> >
>     > >> > this is not dead code, it is needed for the Ralink System-on-Chip
>     > >> > Platform devices.
>     > >> >
>     > >> > While I can't fix Kconfig errors and the current KConfig file may be
>     > >> > wrong, this code cannot and will not be deleted.
>     > >>
>     > >> When the config option was introduced, the config options RALINK_RT288X and
>     > >> RALINK_RT305X were supposed to be merged as well soon after by somebody (Felix?)
>     > >>
>     > >> But since testing is done on SoC boards by Helmut and Felix, I assume the code
>     > >> isn't dead but actually in use.
>     > >
>     > > Perhaps Helmut and Felix can send us the missing code?
>     > The missing code is a MIPS platform port, which is currently being
>     > maintained in OpenWrt, but is not ready for upstream submission yet.
>     > I'm not working on this code at the moment, but I think it will be
>     > submitted once it's ready.
> 
>     People are using automatic scripts to catch unused config options nowadays
>     so the issue is quite likely to come back again sooner or later..
> 
>     Would it be possible to improve situation somehow till the missing parts
>     get merged?  Maybe by adding a tiny comment documenting RT2800PCI_SOC
>     situation to Kconfig (if the config option itself really cannot be removed)
>     until all code is ready etc.?
> 
> 
> Or we could just remove RT2800PCI_SOC completely and build the soc specific
> parts always as part of rt2800pci. I mean it's not much code, just the platform
> driver stuff and the eeprom access.
> 

I'm not sure if that is feasible. Sure, we can reduce the usage of the variable by
unconditionally compiling in the generic SOC code, but we should not unconditionally
register the SOC platform device, which is currently also under the scope of this
Kconfig variable.

---
Gertjan.

^ permalink raw reply

* Re: [PATCH 1/2] bnx2: allocate with GFP_KERNEL flag on RX path init
From: Stanislaw Gruszka @ 2010-07-16  7:30 UTC (permalink / raw)
  To: Mitchell Erblich; +Cc: netdev, Michael Chan
In-Reply-To: <5AD9FE05-1F9B-44DE-8CC4-4D63F43E79C8@earthlink.net>

On Thu, 15 Jul 2010 11:57:59 -0700
Mitchell Erblich <erblichs@earthlink.net> wrote:

> > @@ -3039,7 +3039,8 @@ bnx2_rx_skb(struct bnx2 *bp, struct bnx2_rx_ring_info *rxr, struct sk_buff *skb,
> > 			rx_pg->page = NULL;
> > 
> > 			err = bnx2_alloc_rx_page(bp, rxr,
> > -						 RX_PG_RING_IDX(pg_prod));
> > +						 RX_PG_RING_IDX(pg_prod),
> > +	
> 
> > 					 GFP_ATOMIC);
> 
> Why not GFP_NOWAIT here?
> This would then not use the last reserved pages of memory.
> This still would remove the possibe sleep asociated with GFP_KERNEL.

There is no GFP_NOWAIT usage in any network driver. I'm not sure if
this flag is intended to driver usage. Anyway I can not judge if
GFP_ATOMIC -> GFP_NOWAIT conversion is good or bad idea, I think you
should ask mm guys about that.

Stanislaw

^ permalink raw reply

* Re: Question about way that NICs deliver packets to the kernel
From: Junchang Wang @ 2010-07-16  7:35 UTC (permalink / raw)
  To: Francois Romieu; +Cc: netdev
In-Reply-To: <20100715211249.GA7278@electric-eye.fr.zoreil.com>

> It is a simple, straightforward fix against a 8169 hardware bug.
>
> See commit c0cd884af045338476b8e69a61fceb3f34ff22f1.
>
Fortunately, it seems my device is unaffected by this issue. :)

Thanks Francois.

-- 
--Junchang

^ permalink raw reply

* Re: [PATCH 2.6.35-rc1] net-next: vmxnet3 fixes [3/5] Initialize link state at probe time
From: Shreyas Bhatewara @ 2010-07-16  7:51 UTC (permalink / raw)
  To: David Miller
  Cc: netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
	pv-drivers@vmware.com
In-Reply-To: <20100715.224435.45920963.davem@davemloft.net>



On Thu, 15 Jul 2010, David Miller wrote:

> From: Shreyas Bhatewara <sbhatewara@vmware.com>
> Date: Thu, 15 Jul 2010 18:20:14 -0700 (PDT)
> 
> > 
> > On Wed, 14 Jul 2010, David Miller wrote:
> > 
> >> From: Shreyas Bhatewara <sbhatewara@vmware.com>
> >> Date: Tue, 13 Jul 2010 17:48:55 -0700 (PDT)
> >> 
> >> > 
> >> > Initialize vmxnet3 link state at probe time
> >> > 
> >> > This change initializes the state of link at the time when driver is
> >> > loaded. The ethtool output for 'link detected' and 'link speed'
> >> > is thus valid even before the interface is brought up.
> >> > 
> >> > Signed-off-by: Shreyas Bhatewara <sbhatewara@vmware.com>
> >> 
> >> You should never, ever, call netif_start_queue() on a device which has
> >> not been brought up.
> >> 
> >> But that is what this patch is doing.
> >> 
> > 
> > I do not understand why you say so. vmxnet3_check_link() is called in 
> > probe() with affectTxQueue as false. Hence netif_start_queue() will not be 
> > called before device is brought up.
> > vmxnet3_check_link() is again called with affectTxQueue as true in 
> > vmxnet3_activate_dev() after device was activated.
> 
> Aha, I see how the logic works now.
> 
> But still there is a problem with this patch, please remove the
> driver version bump and resubmit.
> 
> You should only version bump at the last patch in a series.
> 
> Thanks.
> 

---

Initialize vmxnet3 link state at probe time

This change initializes the state of link at the time when driver is
loaded. The ethtool output for 'link detected' and 'link speed'
is thus valid even before the interface is brought up.

Signed-off-by: Shreyas Bhatewara <sbhatewara@vmware.com>

---
 drivers/net/vmxnet3/vmxnet3_drv.c |   13 ++++++++-----
 1 files changed, 8 insertions(+), 5 deletions(-)

diff --git a/drivers/net/vmxnet3/vmxnet3_drv.c b/drivers/net/vmxnet3/vmxnet3_drv.c
index 7792a44..1e31d40 100644
--- a/drivers/net/vmxnet3/vmxnet3_drv.c
+++ b/drivers/net/vmxnet3/vmxnet3_drv.c
@@ -130,7 +130,7 @@ vmxnet3_tq_stop(struct vmxnet3_tx_queue *tq, struct vmxnet3_adapter *adapter)
  * Check the link state. This may start or stop the tx queue.
  */
 static void
-vmxnet3_check_link(struct vmxnet3_adapter *adapter)
+vmxnet3_check_link(struct vmxnet3_adapter *adapter, bool affectTxQueue)
 {
 	u32 ret;
 
@@ -143,14 +143,16 @@ vmxnet3_check_link(struct vmxnet3_adapter *adapter)
 		if (!netif_carrier_ok(adapter->netdev))
 			netif_carrier_on(adapter->netdev);
 
-		vmxnet3_tq_start(&adapter->tx_queue, adapter);
+		if (affectTxQueue)
+			vmxnet3_tq_start(&adapter->tx_queue, adapter);
 	} else {
 		printk(KERN_INFO "%s: NIC Link is Down\n",
 		       adapter->netdev->name);
 		if (netif_carrier_ok(adapter->netdev))
 			netif_carrier_off(adapter->netdev);
 
-		vmxnet3_tq_stop(&adapter->tx_queue, adapter);
+		if (affectTxQueue)
+			vmxnet3_tq_stop(&adapter->tx_queue, adapter);
 	}
 }
 
@@ -165,7 +167,7 @@ vmxnet3_process_events(struct vmxnet3_adapter *adapter)
 
 	/* Check if link state has changed */
 	if (events & VMXNET3_ECR_LINK)
-		vmxnet3_check_link(adapter);
+		vmxnet3_check_link(adapter, true);
 
 	/* Check if there is an error on xmit/recv queues */
 	if (events & (VMXNET3_ECR_TQERR | VMXNET3_ECR_RQERR)) {
@@ -1947,7 +1949,7 @@ vmxnet3_activate_dev(struct vmxnet3_adapter *adapter)
 	 * Check link state when first activating device. It will start the
 	 * tx queue if the link is up.
 	 */
-	vmxnet3_check_link(adapter);
+	vmxnet3_check_link(adapter, true);
 
 	napi_enable(&adapter->napi);
 	vmxnet3_enable_all_intrs(adapter);
@@ -2549,6 +2551,7 @@ vmxnet3_probe_device(struct pci_dev *pdev,
 	}
 
 	set_bit(VMXNET3_STATE_BIT_QUIESCED, &adapter->state);
+	vmxnet3_check_link(adapter, false);
 	atomic_inc(&devices_found);
 	return 0;


^ permalink raw reply related

* sky2 mac hung again (or still?)
From: Helmut Grohne @ 2010-07-16  7:44 UTC (permalink / raw)
  To: Stephen Hemminger, netdev

Dear sky2 maintainers,

thanks for all your work on the continuously improving stability issues
of the sky2 driver. Unfortunately this battle is not yet over. To aid in
getting this fully working I'd like to give another data point.

Kernel: vanilla 2.6.33.2

lspci output:
05:00.0 Ethernet controller: Marvell Technology Group Ltd. 88E8053 PCI-E Gigabit Ethernet Controller (rev 19)
        Subsystem: ASUSTeK Computer Inc. Marvell 88E8053 Gigabit Ethernet controller PCIe (Asus)
        Flags: bus master, fast devsel, latency 0, IRQ 36
        Memory at d1000000 (64-bit, non-prefetchable) [size=16K]
        I/O ports at a000 [size=256]
        [virtual] Expansion ROM at 80800000 [disabled] [size=128K]
        Capabilities: [48] Power Management version 2
        Capabilities: [50] Vital Product Data
        Capabilities: [5c] MSI: Enable- Count=1/2 Maskable- 64bit+
        Capabilities: [e0] Express Legacy Endpoint, MSI 00
        Capabilities: [100] Advanced Error Reporting
        Kernel driver in use: sky2

dmesg output after rmmod and modprobe (for a currently very hung card):
sky2 driver version 1.26
sky2 0000:05:00.0: PCI INT A -> GSI 36 (level, low) -> IRQ 36
sky2 0000:05:00.0: setting latency timer to 64
sky2 0000:05:00.0: PCI: Disallowing DAC for device
sky2 0000:05:00.0: Yukon-2 EC chip revision 2
sky2 0000:05:00.0: irq 53 for MSI/MSI-X
sky2 0000:05:00.0: No interrupt generated using MSI, switching to INTx mode.
sky2 eth0: addr 00:15:f2:36:00:e9

The problem (again/still) is as usual:
sky2 eth0: hung mac 124:91 fifo 194 (182:176)
sky2 eth0: receiver hang detected
sky2 eth0: disabling interface
sky2 eth0: enabling interface

(I can get you more of these hung mac messages with slightly different
numbers if need be.)

This kind of hang happens about once a day for me and it kills my pptp vpn
connection, cause the card takes too long to recover.

Today in contrast the card took the opportunity to crash a bit harder,
so I did not get it up again (maybe a reboot can help later). I was
lucky to have an unused rtl8139too in that machine...

It seems like a precondition for the card to hang is a bit network
traffic. I don't remember getting it crashed with a load of less than
10Mbit symmetric (receive/send) while using this kernel version. On the
other I have seen the card survive this workload for 8 hours a few
times. It does not seem to have any problems with a quick and high
receive load (downloading Debian packages from the next 1Gbit mirror,
usually around 200Mbit).

To me this sounds like a race condition.

Did I miss out any details?

Thanks in advance

Helmut

^ permalink raw reply

* Re: [PATCH 2.6.35-rc1] net-next: vmxnet3 fixes [4/5] Do not reset when the device is not opened
From: Shreyas Bhatewara @ 2010-07-16  8:17 UTC (permalink / raw)
  To: David Miller
  Cc: netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
	pv-drivers@vmware.com, Ronghua Zhang, Matthieu Bucchianeri
In-Reply-To: <20100715.183210.226762655.davem@davemloft.net>



On Thu, 15 Jul 2010, David Miller wrote:

> From: Shreyas Bhatewara <sbhatewara@vmware.com>
> Date: Thu, 15 Jul 2010 18:20:52 -0700 (PDT)
> 
> > Is this what you suggest :
> > 
> > ---
> > 
> > Hold rtnl_lock to get the right link state.
> 
> It ought to work, but make sure that it is legal to take the
> RTNL semaphore in all contexts in which this code block
> might be called.
> 

This code block is called only from the workqueue handler, which runs in
process context, so it is legal to take rtnl semaphore.
Tested this code by simulating event interrupts (which schedule this 
code) at considerable frequency while the interface was brought up and
down in a loop. Similar stress testing had revealed the bug originally. 

^ permalink raw reply

* [rfc] ipvs: Add ip_vs_onepacket_enabled()
From: Simon Horman @ 2010-07-16  8:30 UTC (permalink / raw)
  To: Thomas Graf
  Cc: Nick Chalk, Wensong Zhang, Julian Anastasov, lvs-devel, netdev

Add ip_vs_onepacket_enabled() instead of open-coding the logic three times.

Signed-off-by: Simon Horman <horms@verge.net.au>

--- 

Compile tested only.

Extracted from a patch by Thomas Graf <tgraf@infradead.org>.
Thomas, feel free to make this patch from you and add a signed-off-by line.

Index: nf-next-2.6/net/netfilter/ipvs/ip_vs_core.c
===================================================================
--- nf-next-2.6.orig/net/netfilter/ipvs/ip_vs_core.c	2010-07-16 16:44:49.000000000 +0900
+++ nf-next-2.6/net/netfilter/ipvs/ip_vs_core.c	2010-07-16 16:45:20.000000000 +0900
@@ -165,7 +165,6 @@ ip_vs_conn_stats(struct ip_vs_conn *cp,
 	spin_unlock(&ip_vs_stats.lock);
 }
 
-
 static inline int
 ip_vs_set_state(struct ip_vs_conn *cp, int direction,
 		const struct sk_buff *skb,
@@ -176,6 +175,12 @@ ip_vs_set_state(struct ip_vs_conn *cp, i
 	return pp->state_transition(cp, direction, skb, pp);
 }
 
+static inline __u16
+ip_vs_onepacket_enabled(struct ip_vs_service *svc, struct ip_vs_iphdr iph)
+{
+	return (svc->flags & IP_VS_SVC_F_ONEPACKET &&
+		iph->protocol == IPPROTO_UDP) ? IP_VS_CONN_F_ONE_PACKET : 0;
+}
 
 /*
  *  IPVS persistent scheduling function
@@ -194,7 +199,6 @@ ip_vs_sched_persist(struct ip_vs_service
 	struct ip_vs_dest *dest;
 	struct ip_vs_conn *ct;
 	__be16  dport;			/* destination port to forward */
-	__be16  flags;
 	union nf_inet_addr snet;	/* source network of the client,
 					   after masking */
 
@@ -341,10 +345,6 @@ ip_vs_sched_persist(struct ip_vs_service
 		dport = ports[1];
 	}
 
-	flags = (svc->flags & IP_VS_SVC_F_ONEPACKET
-		 && iph.protocol == IPPROTO_UDP)?
-		IP_VS_CONN_F_ONE_PACKET : 0;
-
 	/*
 	 *    Create a new connection according to the template
 	 */
@@ -352,7 +352,7 @@ ip_vs_sched_persist(struct ip_vs_service
 			    &iph.saddr, ports[0],
 			    &iph.daddr, ports[1],
 			    &dest->addr, dport,
-			    flags,
+			    ip_vs_onepacket_enabled(svc, &iph),
 			    dest);
 	if (cp == NULL) {
 		ip_vs_conn_put(ct);
@@ -382,7 +382,7 @@ ip_vs_schedule(struct ip_vs_service *svc
 	struct ip_vs_conn *cp = NULL;
 	struct ip_vs_iphdr iph;
 	struct ip_vs_dest *dest;
-	__be16 _ports[2], *pptr, flags;
+	__be16 _ports[2], *pptr;
 
 	ip_vs_fill_iphdr(svc->af, skb_network_header(skb), &iph);
 	pptr = skb_header_pointer(skb, iph.len, sizeof(_ports), _ports);
@@ -412,10 +412,6 @@ ip_vs_schedule(struct ip_vs_service *svc
 		return NULL;
 	}
 
-	flags = (svc->flags & IP_VS_SVC_F_ONEPACKET
-		 && iph.protocol == IPPROTO_UDP)?
-		IP_VS_CONN_F_ONE_PACKET : 0;
-
 	/*
 	 *    Create a connection entry.
 	 */
@@ -423,7 +419,7 @@ ip_vs_schedule(struct ip_vs_service *svc
 			    &iph.saddr, pptr[0],
 			    &iph.daddr, pptr[1],
 			    &dest->addr, dest->port ? dest->port : pptr[1],
-			    flags,
+			    ip_vs_onepacket_enabled(svc, &iph),
 			    dest);
 	if (cp == NULL)
 		return NULL;
@@ -473,9 +469,6 @@ int ip_vs_leave(struct ip_vs_service *sv
 	if (sysctl_ip_vs_cache_bypass && svc->fwmark && unicast) {
 		int ret, cs;
 		struct ip_vs_conn *cp;
-		__u16 flags = (svc->flags & IP_VS_SVC_F_ONEPACKET &&
-				iph.protocol == IPPROTO_UDP)?
-				IP_VS_CONN_F_ONE_PACKET : 0;
 		union nf_inet_addr daddr =  { .all = { 0, 0, 0, 0 } };
 
 		ip_vs_service_put(svc);
@@ -486,7 +479,8 @@ int ip_vs_leave(struct ip_vs_service *sv
 				    &iph.saddr, pptr[0],
 				    &iph.daddr, pptr[1],
 				    &daddr, 0,
-				    IP_VS_CONN_F_BYPASS | flags,
+				    IP_VS_CONN_F_BYPASS,
+				    ip_vs_onepacket_enabled(svc, &iph),
 				    NULL);
 		if (cp == NULL)
 			return NF_DROP;

^ permalink raw reply

* [PATCH v2 1/2] bnx2: allocate with GFP_KERNEL flag on RX path init
From: Stanislaw Gruszka @ 2010-07-16  8:55 UTC (permalink / raw)
  To: netdev; +Cc: Michael Chan
In-Reply-To: <C27F8246C663564A84BB7AB3439772421B842BD26F@IRVEXCHCCR01.corp.ad.broadcom.com>

Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
---
v1->v2: use GFP_ATOMIC in bnx2_rx_skb

 drivers/net/bnx2.c |   17 +++++++++--------
 1 files changed, 9 insertions(+), 8 deletions(-)

diff --git a/drivers/net/bnx2.c b/drivers/net/bnx2.c
index a203f39..a7df539 100644
--- a/drivers/net/bnx2.c
+++ b/drivers/net/bnx2.c
@@ -2664,13 +2664,13 @@ bnx2_set_mac_addr(struct bnx2 *bp, u8 *mac_addr, u32 pos)
 }
 
 static inline int
-bnx2_alloc_rx_page(struct bnx2 *bp, struct bnx2_rx_ring_info *rxr, u16 index)
+bnx2_alloc_rx_page(struct bnx2 *bp, struct bnx2_rx_ring_info *rxr, u16 index, gfp_t gfp)
 {
 	dma_addr_t mapping;
 	struct sw_pg *rx_pg = &rxr->rx_pg_ring[index];
 	struct rx_bd *rxbd =
 		&rxr->rx_pg_desc_ring[RX_RING(index)][RX_IDX(index)];
-	struct page *page = alloc_page(GFP_ATOMIC);
+	struct page *page = alloc_page(gfp);
 
 	if (!page)
 		return -ENOMEM;
@@ -2705,7 +2705,7 @@ bnx2_free_rx_page(struct bnx2 *bp, struct bnx2_rx_ring_info *rxr, u16 index)
 }
 
 static inline int
-bnx2_alloc_rx_skb(struct bnx2 *bp, struct bnx2_rx_ring_info *rxr, u16 index)
+bnx2_alloc_rx_skb(struct bnx2 *bp, struct bnx2_rx_ring_info *rxr, u16 index, gfp_t gfp)
 {
 	struct sk_buff *skb;
 	struct sw_bd *rx_buf = &rxr->rx_buf_ring[index];
@@ -2713,7 +2713,7 @@ bnx2_alloc_rx_skb(struct bnx2 *bp, struct bnx2_rx_ring_info *rxr, u16 index)
 	struct rx_bd *rxbd = &rxr->rx_desc_ring[RX_RING(index)][RX_IDX(index)];
 	unsigned long align;
 
-	skb = netdev_alloc_skb(bp->dev, bp->rx_buf_size);
+	skb = __netdev_alloc_skb(bp->dev, bp->rx_buf_size, gfp);
 	if (skb == NULL) {
 		return -ENOMEM;
 	}
@@ -2974,7 +2974,7 @@ bnx2_rx_skb(struct bnx2 *bp, struct bnx2_rx_ring_info *rxr, struct sk_buff *skb,
 	int err;
 	u16 prod = ring_idx & 0xffff;
 
-	err = bnx2_alloc_rx_skb(bp, rxr, prod);
+	err = bnx2_alloc_rx_skb(bp, rxr, prod, GFP_ATOMIC);
 	if (unlikely(err)) {
 		bnx2_reuse_rx_skb(bp, rxr, skb, (u16) (ring_idx >> 16), prod);
 		if (hdr_len) {
@@ -3039,7 +3039,8 @@ bnx2_rx_skb(struct bnx2 *bp, struct bnx2_rx_ring_info *rxr, struct sk_buff *skb,
 			rx_pg->page = NULL;
 
 			err = bnx2_alloc_rx_page(bp, rxr,
-						 RX_PG_RING_IDX(pg_prod));
+						 RX_PG_RING_IDX(pg_prod),
+						 GFP_ATOMIC);
 			if (unlikely(err)) {
 				rxr->rx_pg_cons = pg_cons;
 				rxr->rx_pg_prod = pg_prod;
@@ -5179,7 +5180,7 @@ bnx2_init_rx_ring(struct bnx2 *bp, int ring_num)
 
 	ring_prod = prod = rxr->rx_pg_prod;
 	for (i = 0; i < bp->rx_pg_ring_size; i++) {
-		if (bnx2_alloc_rx_page(bp, rxr, ring_prod) < 0) {
+		if (bnx2_alloc_rx_page(bp, rxr, ring_prod, GFP_KERNEL) < 0) {
 			netdev_warn(bp->dev, "init'ed rx page ring %d with %d/%d pages only\n",
 				    ring_num, i, bp->rx_pg_ring_size);
 			break;
@@ -5191,7 +5192,7 @@ bnx2_init_rx_ring(struct bnx2 *bp, int ring_num)
 
 	ring_prod = prod = rxr->rx_prod;
 	for (i = 0; i < bp->rx_ring_size; i++) {
-		if (bnx2_alloc_rx_skb(bp, rxr, ring_prod) < 0) {
+		if (bnx2_alloc_rx_skb(bp, rxr, ring_prod, GFP_KERNEL) < 0) {
 			netdev_warn(bp->dev, "init'ed rx ring %d with %d/%d skbs only\n",
 				    ring_num, i, bp->rx_ring_size);
 			break;
-- 
1.7.1


^ permalink raw reply related


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox