Netdev List

Netdev List
 help / color / mirror / Atom feed

* mmotm 2010-05-11 - dies in pm_qos_update_request()
From: Valdis.Kletnieks @ 2010-05-12 19:52 UTC (permalink / raw)
  To: Andrew Morton, Rafael J. Wysocki, David S. Miller
  Cc: linux-kernel, e1000-devel, netdev
In-Reply-To: <201005120149.o4C1n7P4002637@imap1.linux-foundation.org>

[-- Attachment #1: Type: text/plain, Size: 4678 bytes --]

On Tue, 11 May 2010 18:21:22 PDT, akpm@linux-foundation.org said:
> The mm-of-the-moment snapshot 2010-05-11-18-20 has been uploaded to
> 
>    http://userweb.kernel.org/~akpm/mmotm/

Dell Latitude E6500, x86_64 kernel.

Died a horrid death at boot in the e1000e driver.  Seems to be
something in linux-next.patch. Didn't get a netconsole trace for obvious
reasons...

Copied-by-hand traceback:
pm_qos_update_request()+0x22
e1000_configure+0x478
e1000_open_device+0xee
? _raw_notifier_call_chain+0xf
__dev_open+0xec
dev_open+0x1b
netpoll_setup+0x28b
init_netconsole+0xbc

I suspect this commit:

commit 23606cf5d1192c2b17912cb2ef6e62f9b11de133
Author: Rafael J. Wysocki <rjw@sisk.pl>
Date:   Sun Mar 14 14:35:17 2010 +0000

    e1000e / PCI / PM: Add basic runtime PM support (rev. 4)

    Use the PCI runtime power management framework to add basic PCI
    runtime PM support to the e1000e driver.  Namely, make the driver
    suspend the device when the link is off and set it up for generating
    a wakeup event after the link has been detected again.  [This
    feature is disabled until the user space enables it with the help of
    the /sys/devices/.../power/contol device attribute.]

Not sure how to reconcile "this feature is disabled until" with my
traceback, since userspace hasn't even started when init_netconsole is
happening.

        /*
         * Enable early receives on supported devices, only takes effect when
         * packet size is equal or larger than the specified value (in 8 byte
         * units), e.g. using jumbo frames when setting to E1000_ERT_2048
         */
        if (adapter->flags & FLAG_HAS_ERT) {

How the heck did FLAG_HAS_ERT get set, we can't call pm_qos_update_request()
otherwise?

Aha.  Over in ich8lan.c, we have:

struct e1000_info e1000_ich9_info = {
        .mac                    = e1000_ich9lan,
        .flags                  = FLAG_HAS_JUMBO_FRAMES
                                  | FLAG_IS_ICH
                                  | FLAG_HAS_WOL
                                  | FLAG_RX_CSUM_ENABLED
                                  | FLAG_HAS_CTRLEXT_ON_LOAD
                                  | FLAG_HAS_AMT
                                  | FLAG_HAS_ERT
                                  | FLAG_HAS_FLASH
                                  | FLAG_APME_IN_WUC,

And lspci says:
00:00.0 Host bridge: Intel Corporation Mobile 4 Series Chipset Memory Controller Hub (rev 07)
00:01.0 PCI bridge: Intel Corporation Mobile 4 Series Chipset PCI Express Graphics Port (rev 07)
00:19.0 Ethernet controller: Intel Corporation 82567LM Gigabit Network Connection (rev 03)
00:1a.0 USB Controller: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #4 (rev 03)
00:1a.1 USB Controller: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #5 (rev 03)
00:1a.2 USB Controller: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #6 (rev 03)
00:1a.7 USB Controller: Intel Corporation 82801I (ICH9 Family) USB2 EHCI Controller #2 (rev 03)
00:1b.0 Audio device: Intel Corporation 82801I (ICH9 Family) HD Audio Controller (rev 03)
00:1c.0 PCI bridge: Intel Corporation 82801I (ICH9 Family) PCI Express Port 1 (rev 03)
00:1c.1 PCI bridge: Intel Corporation 82801I (ICH9 Family) PCI Express Port 2 (rev 03)
00:1c.2 PCI bridge: Intel Corporation 82801I (ICH9 Family) PCI Express Port 3 (rev 03)
00:1c.3 PCI bridge: Intel Corporation 82801I (ICH9 Family) PCI Express Port 4 (rev 03)
00:1d.0 USB Controller: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #1 (rev 03)
00:1d.1 USB Controller: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #2 (rev 03)
00:1d.2 USB Controller: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #3 (rev 03)
00:1d.7 USB Controller: Intel Corporation 82801I (ICH9 Family) USB2 EHCI Controller #1 (rev 03)
00:1e.0 PCI bridge: Intel Corporation 82801 Mobile PCI Bridge (rev 93)
00:1f.0 ISA bridge: Intel Corporation ICH9M-E LPC Interface Controller (rev 03)
00:1f.2 RAID bus controller: Intel Corporation Mobile 82801 SATA RAID Controller (rev 03)
00:1f.3 SMBus: Intel Corporation 82801I (ICH9 Family) SMBus Controller (rev 03)
01:00.0 VGA compatible controller: nVidia Corporation G98M [Quadro NVS 160M] (rev a1)
03:01.0 CardBus bridge: Ricoh Co Ltd RL5c476 II (rev ba)
03:01.1 FireWire (IEEE 1394): Ricoh Co Ltd R5C832 IEEE 1394 Controller (rev 04)
03:01.2 SD Host controller: Ricoh Co Ltd R5C822 SD/SDIO/MMC/MS/MSPro Host Adapter (rev 21)
0c:00.0 Network controller: Intel Corporation Wireless WiFi Link 5100

I see a lot of ICH9 listed there, guess a Mobile 4 is ICH9-ish enough...

Unfortunately, that's the limit of my ability to chase this down.






[-- Attachment #2: Type: application/pgp-signature, Size: 227 bytes --]

^ permalink raw reply

* Re: linux-next: Tree for May 12 (netfilter: xt_TEE)
From: Jan Engelhardt @ 2010-05-12 20:11 UTC (permalink / raw)
  To: Patrick McHardy
  Cc: Stephen Rothwell, netdev, Netfilter Developer Mailing List,
	linux-next, LKML, sebastian.classen, Randy Dunlap
In-Reply-To: <20100512085045.483f0e2d.randy.dunlap@oracle.com>

On Wednesday 2010-05-12 17:50, Randy Dunlap wrote:
>On Wed, 12 May 2010 16:46:38 +1000 Stephen Rothwell wrote:
>
>net/built-in.o: In function `tee_tg_route6':
>xt_TEE.c:(.text+0x45ca5): undefined reference to `ip6_route_output'
>net/built-in.o: In function `tee_tg6':
>xt_TEE.c:(.text+0x45d79): undefined reference to `ip6_local_out'

parent cba7a98a474a4f2a9316473734ba76829191a78a (v2.6.34-rc5-1489-gcba7a98)
commit cff884ea0de4bf8729ac33bb5d98ac9d39d09019
Author: Jan Engelhardt <jengelh@medozas.de>
Date:   Wed May 12 22:10:03 2010 +0200

netfilter: xtables: add missing depends for xt_TEE

Aviod these link-time errors when IPV6=m, XT_TEE=y:

net/built-in.o: In function `tee_tg_route6':
xt_TEE.c:(.text+0x45ca5): undefined reference to `ip6_route_output'
net/built-in.o: In function `tee_tg6':
xt_TEE.c:(.text+0x45d79): undefined reference to `ip6_local_out'

Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
---
 net/netfilter/Kconfig |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/net/netfilter/Kconfig b/net/netfilter/Kconfig
index 673a6c8..e223f47 100644
--- a/net/netfilter/Kconfig
+++ b/net/netfilter/Kconfig
@@ -505,6 +505,7 @@ config NETFILTER_XT_TARGET_RATEEST
 config NETFILTER_XT_TARGET_TEE
 	tristate '"TEE" - packet cloning to alternate destiantion'
 	depends on NETFILTER_ADVANCED
+	depends on (IPV6 || IPV6=n)
 	---help---
 	This option adds a "TEE" target with which a packet can be cloned and
 	this clone be rerouted to another nexthop.
-- 
# Created with git-export-patch

^ permalink raw reply related

* Re: linux-next: Tree for May 12 (netfilter: xt_TEE)
From: Randy Dunlap @ 2010-05-12 20:19 UTC (permalink / raw)
  To: Jan Engelhardt
  Cc: Patrick McHardy, Stephen Rothwell, netdev,
	Netfilter Developer Mailing List, linux-next, LKML,
	sebastian.classen, Randy Dunlap
In-Reply-To: <alpine.LSU.2.01.1005122210180.11190@obet.zrqbmnf.qr>

On Wed, 12 May 2010 22:11:35 +0200 (CEST) Jan Engelhardt wrote:

> On Wednesday 2010-05-12 17:50, Randy Dunlap wrote:
> >On Wed, 12 May 2010 16:46:38 +1000 Stephen Rothwell wrote:
> >
> >net/built-in.o: In function `tee_tg_route6':
> >xt_TEE.c:(.text+0x45ca5): undefined reference to `ip6_route_output'
> >net/built-in.o: In function `tee_tg6':
> >xt_TEE.c:(.text+0x45d79): undefined reference to `ip6_local_out'
> 
> parent cba7a98a474a4f2a9316473734ba76829191a78a (v2.6.34-rc5-1489-gcba7a98)
> commit cff884ea0de4bf8729ac33bb5d98ac9d39d09019
> Author: Jan Engelhardt <jengelh@medozas.de>
> Date:   Wed May 12 22:10:03 2010 +0200
> 
> netfilter: xtables: add missing depends for xt_TEE
> 
> Aviod these link-time errors when IPV6=m, XT_TEE=y:
> 
> net/built-in.o: In function `tee_tg_route6':
> xt_TEE.c:(.text+0x45ca5): undefined reference to `ip6_route_output'
> net/built-in.o: In function `tee_tg6':
> xt_TEE.c:(.text+0x45d79): undefined reference to `ip6_local_out'
> 
> Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
> ---
>  net/netfilter/Kconfig |    1 +
>  1 files changed, 1 insertions(+), 0 deletions(-)
> 
> diff --git a/net/netfilter/Kconfig b/net/netfilter/Kconfig
> index 673a6c8..e223f47 100644
> --- a/net/netfilter/Kconfig
> +++ b/net/netfilter/Kconfig
> @@ -505,6 +505,7 @@ config NETFILTER_XT_TARGET_RATEEST
>  config NETFILTER_XT_TARGET_TEE
>  	tristate '"TEE" - packet cloning to alternate destiantion'
>  	depends on NETFILTER_ADVANCED
> +	depends on (IPV6 || IPV6=n)
>  	---help---
>  	This option adds a "TEE" target with which a packet can be cloned and
>  	this clone be rerouted to another nexthop.
> -- 

Acked-by: Randy Dunlap <randy.dunlap@oracle.com>

Thanks.

---
~Randy
*** Remember to use Documentation/SubmitChecklist when testing your code ***

^ permalink raw reply

* Re: [PATCH] net sched: cleanup and rate limit warning
From: Stephen Hemminger @ 2010-05-12 20:20 UTC (permalink / raw)
  To: hadi; +Cc: David Miller, netdev
In-Reply-To: <1273691628.16074.15.camel@bigi>

On Wed, 12 May 2010 15:13:48 -0400
jamal <hadi@cyberus.ca> wrote:

> Thanks for the info Stephen.
> 
> On Wed, 2010-05-12 at 11:17 -0700, Stephen Hemminger wrote:
> 
> > The Vyatta syntax is:
> > 
> >  traffic-limiter test-traffic-limit {
> >      class 2048 {
> >          bandwidth 1mbit
> >          burst 500kbit
> >          match onebox {
> >              ip {
> >                  destination {
> >                      address 192.168.100.99/32
> >                  }
> >              }
> >          }
> >      }
> >  }
> > 
> 
> ;-> I guess kids these days prefer juniperism over ciscoism?
> Why dont they just learn linuxism?;->
> 
> > Which generates these TC commands.
> > 
> > root@VC6:~# tc qdisc show dev eth0
> > qdisc pfifo_fast 0: root bands 3 priomap  1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
> > qdisc ingress ffff: parent ffff:fff1 ---------------- 
> > 
> > root@VC6:~# tc filter show dev eth0 parent ffff:
> > filter protocol all pref 20 u32 
> > filter protocol all pref 20 u32 fh 800: ht divisor 1 
> > filter protocol all pref 20 u32 fh 800::800 order 2048 key ht 800 bkt 0 flowid ffff:800 
> >   match c0a86463/ffffffff at 16
> >  police 0x3 rate 1000Kbit burst 63999b mtu 2Kb action reclassify overhead 0b 
> > ref 1 bind 1
> > 
> > 
> > I think the bad part is the huge burst size.
> 
> That may be - but it seems your tool is the culprit.
> It is generating wrong tc commands if i read the intent correctly.
> Basically what the tc command is saying is "if you exceed the 1Mbit upto
> a burst of 500kbit then reclassify".
> "Reclassify" means literally that: to reuse the same classification rule
> again, which will find that  we have exceeded 1M which will ask
> reclassify .... loop.... I am glad that code is there ;->

The tool isn't generating an action (just tc filter ... police ..)
so it is getting the unfortunate default of reclassify.


-- 

^ permalink raw reply

* Re: [PATCH] net sched: cleanup and rate limit warning
From: jamal @ 2010-05-12 20:41 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: David Miller, netdev, Patrick McHardy
In-Reply-To: <20100512132042.3b38da18@nehalam>

On Wed, 2010-05-12 at 13:20 -0700, Stephen Hemminger wrote:

> The tool isn't generating an action (just tc filter ... police ..)
> so it is getting the unfortunate default of reclassify.

Ah ok. 
My advice: you should never ever depend on defaults when you
can be explicit and say "drop". Or have the users in your tool be able
to specify what action to take if rate is exceeded etc (actually i think
juniper does that) 
I think "drop" would be the sane default for over-limit - my memory is
hazy because i assumed that was the default but there may have been some
reservations on that default. Patrick?

cheers,
jamal

^ permalink raw reply

* Re: mmotm 2010-05-11 - dies in pm_qos_update_request()
From: Rafael J. Wysocki @ 2010-05-12 21:07 UTC (permalink / raw)
  To: Valdis.Kletnieks
  Cc: e1000-devel, netdev, Andrew Morton, David S. Miller, linux-kernel
In-Reply-To: <5525.1273693962@localhost>

On Wednesday 12 May 2010, Valdis.Kletnieks@vt.edu wrote:
> On Tue, 11 May 2010 18:21:22 PDT, akpm@linux-foundation.org said:
> > The mm-of-the-moment snapshot 2010-05-11-18-20 has been uploaded to
> > 
> >    http://userweb.kernel.org/~akpm/mmotm/
> 
> Dell Latitude E6500, x86_64 kernel.
> 
> Died a horrid death at boot in the e1000e driver.  Seems to be
> something in linux-next.patch. Didn't get a netconsole trace for obvious
> reasons...
> 
> Copied-by-hand traceback:
> pm_qos_update_request()+0x22
> e1000_configure+0x478
> e1000_open_device+0xee
> ? _raw_notifier_call_chain+0xf
> __dev_open+0xec
> dev_open+0x1b
> netpoll_setup+0x28b
> init_netconsole+0xbc
> 
> I suspect this commit:
> 
> commit 23606cf5d1192c2b17912cb2ef6e62f9b11de133
> Author: Rafael J. Wysocki <rjw@sisk.pl>
> Date:   Sun Mar 14 14:35:17 2010 +0000
> 
>     e1000e / PCI / PM: Add basic runtime PM support (rev. 4)

No, I don't think so.  I'm running -rc6 with this patch applied on a box with
e1000e and it works just fine.

Please try to revert this one instead:

http://git.kernel.org/?p=linux/kernel/git/rafael/suspend-2.6.git;a=patch;h=ed77134bfccf5e75b6cbadab268e559dbe6a4ebb

Rafael

------------------------------------------------------------------------------

_______________________________________________
E1000-devel mailing list
E1000-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/e1000-devel
To learn more about Intel&#174; Ethernet, visit http://communities.intel.com/community/wired

^ permalink raw reply

* Re: linux-next: Tree for May 12 (netfilter: xt_TEE)
From: David Miller @ 2010-05-12 21:53 UTC (permalink / raw)
  To: randy.dunlap
  Cc: jengelh, kaber, sfr, netdev, netfilter-devel, linux-next,
	linux-kernel, sebastian.classen
In-Reply-To: <20100512131942.f487b3a9.randy.dunlap@oracle.com>

From: Randy Dunlap <randy.dunlap@oracle.com>
Date: Wed, 12 May 2010 13:19:42 -0700

> On Wed, 12 May 2010 22:11:35 +0200 (CEST) Jan Engelhardt wrote:
> 
>> parent cba7a98a474a4f2a9316473734ba76829191a78a (v2.6.34-rc5-1489-gcba7a98)
>> commit cff884ea0de4bf8729ac33bb5d98ac9d39d09019
>> Author: Jan Engelhardt <jengelh@medozas.de>
>> Date:   Wed May 12 22:10:03 2010 +0200
>> 
>> netfilter: xtables: add missing depends for xt_TEE
>> 
>> Aviod these link-time errors when IPV6=m, XT_TEE=y:
>> 
>> net/built-in.o: In function `tee_tg_route6':
>> xt_TEE.c:(.text+0x45ca5): undefined reference to `ip6_route_output'
>> net/built-in.o: In function `tee_tg6':
>> xt_TEE.c:(.text+0x45d79): undefined reference to `ip6_local_out'
>> 
>> Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
 ...
> Acked-by: Randy Dunlap <randy.dunlap@oracle.com>

I'll apply this directly, thanks everyone.

^ permalink raw reply

* Re: mmotm 2010-05-11 - dies in pm_qos_update_request()
From: Valdis.Kletnieks @ 2010-05-12 22:12 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Andrew Morton, David S. Miller, linux-kernel, e1000-devel, netdev
In-Reply-To: <201005122307.20860.rjw@sisk.pl>

[-- Attachment #1: Type: text/plain, Size: 959 bytes --]

On Wed, 12 May 2010 23:07:20 +0200, "Rafael J. Wysocki" said:
 
> > I suspect this commit:
> > 
> > commit 23606cf5d1192c2b17912cb2ef6e62f9b11de133
> > Author: Rafael J. Wysocki <rjw@sisk.pl>
> > Date:   Sun Mar 14 14:35:17 2010 +0000
> > 
> >     e1000e / PCI / PM: Add basic runtime PM support (rev. 4)
> 
> No, I don't think so.  I'm running -rc6 with this patch applied on a box with
> e1000e and it works just fine.
> 
> Please try to revert this one instead:
> 
> http://git.kernel.org/?p=linux/kernel/git/rafael/suspend-2.6.git;a=patch;h=ed77134bfccf5e75b6cbadab268e559dbe6a4ebb

Isn't a clean revert - build dies with:

  CC      net/mac80211/scan.o
net/mac80211/scan.c: In function 'ieee80211_scan_state_decision':
net/mac80211/scan.c:510: error: implicit declaration of function 'pm_qos_request'
make[1]: *** [net/mac80211/scan.o] Error 1
make: *** [net/mac80211/scan.o] Error 2

Will investigate more later, find what other commit needs reverting.

[-- Attachment #2: Type: application/pgp-signature, Size: 227 bytes --]

^ permalink raw reply

* Re: [PATCH net-next-2.6 2/2] bonding: allow user-controlled output slave selection
From: Andy Gospodarek @ 2010-05-12 22:14 UTC (permalink / raw)
  To: Jay Vosburgh; +Cc: Neil Horman, netdev
In-Reply-To: <25238.1273693314@death.nxdomain.ibm.com>

On Wed, May 12, 2010 at 12:41:54PM -0700, Jay Vosburgh wrote:
> Neil Horman <nhorman@tuxdriver.com> wrote:
> 
> >On Tue, May 11, 2010 at 01:09:39PM -0700, Jay Vosburgh wrote:
> >> Andy Gospodarek <andy@greyhouse.net> wrote:
> >> 
> >> >This patch give the user the ability to control the output slave for
> >> >round-robin and active-backup bonding.  Similar functionality was
> >> >discussed in the past, but Jay Vosburgh indicated he would rather see a
> >> >feature like this added to existing modes rather than creating a
> >> >completely new mode.  Jay's thoughts as well as Neil's input surrounding
> >> >some of the issues with the first implementation pushed us toward a
> >> >design that relied on the queue_mapping rather than skb marks.
> >> >Round-robin and active-backup modes were chosen as the first users of
> >> >this slave selection as they seemed like the most logical choices when
> >> >considering a multi-switch environment.
> >> >
> >> >Round-robin mode works without any modification, but active-backup does
> >> >require inclusion of the first patch in this series and setting
> >> >the 'keep_all' flag.  This will allow reception of unicast traffic on
> >> >any of the backup interfaces.
> >> 
> >> 	Yes, I did think that the mark business fit better into existing
> >> modes (I thought of it as kind of a new hash for xor and 802.3ad modes).
> >> I also didn't expect to see so much new stuff (this, as well as the FCOE
> >> special cases being discussed elsewhere) being shoehorned into the
> >> active-backup mode.  I'm not so sure that adding so many special cases
> >> to active-backup is a good thing.
> >> 
> >> 	Now, I'm starting to wonder if you were right, and it would be
> >> better overall to have a "manual" mode that would hopefully satisfy this
> >> case as well as the FCOE special case.  I don't think either of these is
> >> a bad use case, I'm just not sure the right way to handle them is
> >> another special knob in active-backup mode (either directly, or
> >> implicitly in __netif_receive_skb), which wasn't what I expected to see.
> >> 
> >I honestly don't think a separate mode is warranted here.  While I'm not opposed
> >to adding a new mode, I really think doing so is no different from overloading
> >an existing mode.  I say that because to add a new mode in which we explicitly
> >expect traffic to be directed to various slaves requires that we implement a
> >policy for frames which have no queue mapping determined on egress.  Any policy
> >I can think of is really an approximation of an existing policy, so we may as
> >well reuse the policy code that we already have in place.  About the only way a
> >separate mode makes sense is in the 'passthrough' queue mode you document below.
> >In this model, in which queue ids map to slaves in a 1:1 fashion it doesn't make
> >senes.
> 
> 	One goal I'm hoping to achieve is something that would satisfy
> both the queue map stuff that you're looking for, and would meet the
> needs of the FCOE people who also want to disable the duplicate
> suppression (i.e., permit incoming traffic on the inactive slave) for a
> different special case.
> 
> 	The FCOE proposal revolves around, essentially, active-backup
> mode, but permitting incoming traffic on the inactive slave.  At the
> moment, the patches attempt to special case things such that only
> dev_add_pack listeners directly bound to the inactive slave are checked
> (to permit the FCOE traffic to pass on the inactive slave, but still
> dropping IP, as ip_rcv is a wildcard bind).
> 
> 	Your keep_all patch is, by and large, the same thing, except
> that it permits anything to come in on the "inactive" slave, and it's a
> switch that has to be turned on.
> 
> 	This seems like needless duplication to me; I'd prefer to see a
> single solution that handles both cases instead of two special cases
> that each do 90% of what the other does.
> 
> 	As far as a new mode goes, one major reason I think a separate
> mode is warranted is the semantics: with either of these changes (to
> permit more or less regular use of the "inactive" slaves), the mode
> isn't really an "active-backup" mode any longer; there is no "inactive"
> or "backup" slave.  I think of this as being a major change of
> functionality, not simply a minor option.
> 
> 	Hence my thought that "active-backup" could stay as a "true" hot
> standby mode (backup slaves are just that: for backup, only), and this
> new mode would be the place to do the special queue-map / FCOE /
> whatever that isn't really a hot standby configuration any longer.
> 
> 	As far as the behavior of the new mode (your concern about its
> policy map approximations), in the end, it would probably act pretty
> much like active-backup with your patch applied: traffic goes out the
> active slave, unless directed otherwise.  It's a lot less complicated
> than I had feared.
> 

It's beginning to sound like the 'FCoE use-case' and the one Neil and I
are proposing are quite similar.  The main goal of both is to have the
option to have multiple slaves send and receive traffic during the
steady-state, but in the event of a failover all traffic would run on a
single interface.

The implementation proposed with this patch is a bit different that the
'mark-mode' patch you may recall I posted a few months ago.  It created
a new mode that essentially did exactly what you are describing --
transmit on the primary interface unless pushed to another interface via
info in the skb and receive on all interfaces.  We initially did not
create a new mode based on your reservations about the previous
mark-mode patch and went the direction of enhancing one or two modes
initially (figuring it would be good to run before walking), with the
idea that other modes could take care of this output slave selection
logic in the future.


> >> 	I presume you're overloading active-backup because it's not
> >> etherchannel, 802.3ad, etc, and just talks right to the switch.  For the
> >> regular load balance modes, I still think overlay into the existing
> >> modes is preferable (more on that later); I'm thinking of "manual"
> >> instead of another tweak to active-backup.
> >> 
> >> 	If users want to have actual hot-standby functionality, then
> >> active-backup would do that, and nothing else (and it can be multi-queue
> >> aware, but only one slave active at a time).
> >> 
> >Yes, but active backup doesn't provide prefered output path selection in and of
> >itself.  Thats the feature here.
> 
> 	I understand that; I'm suggesting that active-backup should
> provide no service other than hot standby, and not be overloaded into a
> manual load balancing scheme (both for your use, and for FCOE).
> 
> 	Maybe I'm worrying too much about defending the purity of the
> active-backup mode; I understand what you're trying to do a little
> better now, and yes, the "manual" mode I think of (in your queue mapping
> scheme, not the other doodads I talked about) would basically be
> active-backup with your queue mapper, minus the duplicate suppressor.
> 

It doesn't matter terribly to me which direction is taken.  Again, a
major reason this route was proposed was that you were not as keen on
creating a new mode as I was at the time of that patch posting.  It's
somewhat understandable as once a mode is added it's tough to take away,
but when one sees how much we are really changing the way active-backup
might behave in some cases maybe it makes sense to use a new mode?

I guess I like the idea of adding this output selection to existing
modes because it at least gives us the option to use queue maps to
select output interfaces for more than a mode that looks like
present-day active-backup minus the duplicate suppression.   I'm happy to
code-up a patch that creates a new mode, but before I go do that and
test it, I'd like to know we have come to an agreement on the direction
for the future.

> >> 	Users who want the set of bonded slaves to look like a big
> >> multiqueue buffet could use this "manual" mode and set things up however
> >> they want.  One way to set it up is simply that the bond is N queues
> >> wide, where N is the total of the queue counts of all the slaves.  If a
> >> slave fails, N gets smaller, and the user code has to deal with that.
> >> Since the queue count of a device can't change dynamically, the bond
> >> would have to actually be set up with some big number of queues, and
> >> then only a subset is actually active (or there is some sort of wrap).
> >> 
> >> 	In such an implementation, each slave would have a range of
> >> queue IDs, not necessarily just one.  I'm a bit leery of exposing an API
> >> where each slave is one queue ID, as it could make transitioning to real
> >> multi-queue awareness difficult.
> >> 
> >I'm sorry, what exactly do you mean when you say 'real' multi queue
> >awareness?  How is this any less real than any other implementation?  The
> >approach you outline above isn't any more or less valid than this one.
> 
> 	That was my misunderstanding of how you planned to handle
> things.  I had thought this patch was simply a scheme to use the queue
> IDs for slave selection, without any method to further perform queue
> selection on the slave itself (I hadn't thought of placing a tc action
> on the slave itself, which you described later on).  I had been thinking
> in terms of schemes to expose all of the slave queues on the bonding
> device.

It wasn't our original intention either.  I didn't mention it in my
original post as it wasn't really the intent of our patch, but a nice
side-effect for the informed user. :) Obviously a bit more testing could
take place and we could add more examples to the documentation for the
nice side-effect feature of this patch, but since this wasn't our
original intent and we didn't test it, we did not advertise it.

> 	So, I don't see any issues with the queue mapping part.  I still
> want to find a common solution for FCOE / your patch with regards to the
> duplicate suppressor.

Understood.

> >While we're on the subject, Andy and I did discuss a model simmilar to what you
> >describe above (what I'll refer to as a queue id passthrough model), in which
> >you can tell the bonding driver to map a frame to a queue, and the bonding
> >driver doesn't really do anything with the queue id other than pass to the slave
> >device for hardware based multiqueue tx handling.  While we could do that, its
> >my feeling such a model isn't the way to go for two primary reasons:
> >
> >1) Inconsistent behavior.  Such an implementation makes assumptions regarding
> >queue id specification within a driver.  For example, What if one of the slaves
> >reserves some fixed number of low order queues for a sepecific purpose, and as
> >such general use queues begin an at offset from zero, while other slaves do not.
> >While its easy to accomidate such needs when writing the tc filters, if a slave
> >fails over, such a bias would change output traffic behavior, as the bonding
> >driver can't be clearly informed of such a bias.  Likewise, what if a slave
> >driver allocates more queues than it actually supports in hardware (like the
> >implementation you propose, ixgbe IIRC actually does this).  If slaves handled
> >unimplemented tx queues different (if one wrapped queues, while the other simply
> >dropped frames to unimplemented queues for instance).  A failover would change
> >traffic patterns dramatically.
> >
> >2) Need.  While (1) can pretty easily be managed with a few configuration
> >guidelines (output queues on slaves have to be configured identically, lets
> >chaos and madness befall you, etc), theres really no reason to bind users to
> >such a system.  We're using tc filters to set the queue id on skbs enqueued to
> >the bonding driver, theres absolutely no reason you can add addition filters to
> >the slaves directly.  Since the bonding driver uses dev_queue_xmit to send a
> >frame to a slave, it has the opportunity to pass through another set of queuing
> >diciplines and filters that can reset and re-assign the skbs queue mapping.  So
> >with the approach in this patch you can get both direct output control without
> >sacrificing actual hardware tx output queue control.  With a passthrough model,
> >you save a bit of filter configuration, but at the expense of having to be much
> >more careful about how you configure your slave nics, and detecting such errors
> >in configuration would be rather difficult to track down, as it would require
> >the generation of traffic that hit the right filter after a failover.
> 
> 	I don't disagree with any of this.  One thought I do have is
> that Eric Dumazet, I believe, has mentioned that the read lock in
> bonding is a limiting factor on 10G performance.  In the far distant
> future when bonding is RCU, going through the lock(s) on the tc actions
> of the slave could have the same net effect, and in such a case, a
> qdisc-less path may be of benefit.  Not a concern for today, I suspect.
> 
> >> 	There might also be a way to tie it in to the new RPS code on
> >> the receive side.
> >> 
> >> 	If the slaves all have the same MAC and attach to a single
> >> switch via etherchannel, then it all looks pretty much like a single big
> >> honkin' multiqueue device.  The switch probably won't map the flows back
> >> the same way, though.
> >> 
> >I agree, they probably wont.  Receive side handling wasn't really our focus here
> >though.  Thats largely why we chose round robin and active backup as our first
> >modes to use this with.  They are already written to expect frames on either
> >interface.
> >
> >> 	If the slaves are on discrete switches (without etherchannel),
> >> things become more complicated.  If the slaves have the same MAC, then
> >> the switches will be irritated about seeing that same MAC coming in from
> >> multiple places.  If the slaves have different MACs, then ARP has the
> >> same sort of issues.
> >> 
> >> 	In thinking about it, if it's linux bonding at both ends, there
> >> could be any number of discrete switches in the path, and it wouldn't
> >> matter as long as the linux end can work things out, e.g.,
> >> 
> >>         -- switch 1 --
> >> hostA  /              \  hostB
> >> bond  ---- switch 2 ---- bond
> >>        \              /
> >>         -- switch 3 --
> >> 
> >> 	For something like this, the switches would never share MAC
> >> information for the bonding slaves.  The issue here then becomes more of
> >> detecting link failures (it would require either a "trunk failover" type
> >> of function on the switch, or some kind of active probe between the
> >> bonds).
> >> 
> >> 	Now, I realize that I'm babbling a bit, as from reading your
> >> description, this isn't necessarily your target topology (which sounded
> >> more like a case of slave A can reach only network X, and slave B can
> >> reach anywhere, so sending to network X should use slave A
> >> preferentially), or, as long as I'm doing ASCII-art,
> >> 
> >>        --- switch 1 ---- network X
> >> hostA /               /
> >> bond  ---- switch 2 -+-- anywhere
> >> 
> >> 	Is that an accurate representation?  Or is it something a bit
> >> different, e.g.,
> >> 
> >>        --- switch 1 ---- network X -\
> >> hostA /                             /
> >> bond  ---- switch 2 ---- anywhere --
> >> 
> >> 	I.e., the "anywhere" connects back to network X from the
> >> outside, so to speak.  Or, oh, maybe I'm missing it entirely, and you're
> >> thinking of something like this:
> >> 
> >>        --- switch 1 --- VPN --- web site
> >> hostA /                          /
> >> bond  ---- switch 2 - Internet -/
> >> 
> >> 	Where you prefer to hit "web site" via the VPN (perhaps it's a
> >> more efficient or secure path), but can do it from the public network at
> >> large if necessary.
> >> 
> >Yes, this one.  I think the other models are equally interesting, but this model
> >in which either path had universal reachabilty, but for some classes of traffic
> >one path is preferred over the other is the one we had in mind.
> >
> >> 	Now, regardless of the above, your first patch ("keep_all") is
> >> to deal with the reverse problem, if this is a piggyback on top of
> >> active-backup mode: how to get packets back, when both channels can be
> >> active simultaneously.  That actually dovetails to a degree with work
> >> I've been doing lately, but the solution there probably isn't what
> >> you're looking for (there's a user space daemon to do path finding, and
> >> the "bond IP" address is piggybacked on the slaves' MAC addresses, which
> >> are not changed; the "bond IP" set exists in a separate subnet all its
> >> own).
> >> 
> >> 	As I said, I'm not convinced that the "keep_all" option to
> >> active-backup is really better than just a "manual" mode that lacks the
> >> dup suppression and expects the user to set everything up.
> >> 
> >> 	As for the round-robin change in this patch, if I'm reading it
> >> right, then the way it works is that the packets are round-robined,
> >> unless there's a queue id passed in, in which case it's assigned to the
> >> slave mapped to that queue id.  I'm not entirely sure why you picked
> >> round-robin mode for that over balance-xor; it doesn't seem to fit well
> >> with the description in the documentation.  Or is it just sort of a
> >> demonstrator?
> >> 
> >It was selected because round robin allows transmits on any interface already,
> >and expects frames on any interface, so it was a 'safe' choice.  I would think
> >balance-xor would also work.  Ideally it would be nice to get more modes
> >supporting this mechanism.
> 
> 	I think that this should work for balance-xor and 802.3ad.  The
> only limitation for 802.3ad is that the spec requires "conversations" to
> not be striped or to skip around in a manner that could lead to out of
> order delivery.

Agreed.  Checking would probably also have to be done to make sure that
we were not trasmitting on an inactive aggregator.

> 
> 	I'm not so sure about the alb/tlb modes; at first thought, I
> think it could have conflicts with the internal balancing done within
> the modes (if, e.g., the tc action put traffic for the same destination
> on two slaves).
> 

TLB and ALB modes would certainly have to be done differently.  It
should not be terribly difficult to move from the existing hashing
that's done to one that relies on the queue_mapping, but it will take a
bit to make sure it's not a complete hack.

We decided against doing that for all modes on the first pass as it
seemed like the active-backup and round-robin were the most-likely
users.  We also wanted present the code early rather that spending time
supporting this on every-mode to find out that it just wasn't rational
to do it on some of them.

> >> 	I do like one other aspect of the patch, and that's the concept
> >> of overlaying the queue map on top of the balance algorithm.  So, e.g.,
> >> balance-xor would do its usual thing, unless the packet is queue mapped,
> >> in which case the packet's assignment is obeyed.  The balance-xor could
> >> even optionally do its xor across the full set of all slaves output
> >> queues instead of just across the slaves.  Round-robin can operate
> >> similarly.  For those modes, a "balance by queue vs. balance by slave"
> >> seems like a reasonable knob to have.
> >Not sure what you mean here.  In the model implemented by this patch, there is
> >one output queue per slave, and as such, balance by queue == balance by slave.
> >That would make sense in the model you describe earlier in this note, but not in
> >the model presented by this patch.
> 
> 	Yes, I was thinking about what I had described; again,
> predicated on my misunderstanding of how it all worked.
> 
> >> 	I do understand that you're proposing something relatively
> >> simple, and I'm thinking out loud about alternate or additional
> >> implementation details.  Some of this is "ooh ahh what if", but we also
> >> don't want to end up with something that's forwards incompatible, and
> >> I'm hoping to find one solution to multiple problems.
> >> 
> >For clarification, can you ennumerate what other problems you are trying to
> >solve with this feature, or features simmilar to this?  From this email, the one
> >that I most clearly see is the desire to allow a passthrough mode of queue
> >selection, which I think I've noted can be done already (even without this
> >patch), by attaching additional tc filters to the slaves output queues directly.
> >What else do you have in mind?
> 
> 	As I said above, I hadn't thought of stacking tc actions on to
> the slaves directly, so I was thinking on ways to expose the slave
> queues.
> 
> 	I still find something intriguing about a round-robin or xor
> mode that robins/xors through all of the slave queues, though, but that
> should be something separate (I'm not sure if such a scheme is actually
> "better", either).
> 
> 	-J
> 
> ---
> 	-Jay Vosburgh, IBM Linux Technology Center, fubar@us.ibm.com
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* Re: [BUG] crashes with kvm/nat networking and net-next
From: Stephen Hemminger @ 2010-05-12 22:15 UTC (permalink / raw)
  To: Bart De Schuymer; +Cc: Patrick McHardy, Eric Dumazet, netdev
In-Reply-To: <4BEAB54A.2070203@pandora.be>

On Wed, 12 May 2010 16:03:54 +0200
Bart De Schuymer <bdschuym@pandora.be> wrote:

> Patrick McHardy wrote:
> > Eric Dumazet wrote:
> >> Le mardi 11 mai 2010 à 20:25 -0700, Stephen Hemminger a écrit :
> >>> This is a regression that is showing up now in net-next, not sure what
> >>> changed recently in bridge netfilter that could be causing it?
> >>>
> >>> [ 4593.956206] BUG: unable to handle kernel NULL pointer dereference at 0000000000000018
> >>> [ 4593.956219] IP: [<ffffffffa03357a4>] br_nf_forward_finish+0x154/0x170 [bridge]
> >> Not sure, but br_nf_forward_ip() has following check :
> >>
> >> if (!skb->nf_bridge)
> >> 	return NF_ACCEPT;
> >>
> >> while br_nf_forward_arp() missed this check ...
> >>
> >> So we can dereference null pointer later
> > 
> > That looks correct to me, offset 0x18 would be nf_bridge_info->mask.
> > Bart, please review, thanks.
> > 
> >> diff --git a/net/bridge/br_netfilter.c b/net/bridge/br_netfilter.c
> >> index 93f80fe..cd2e5f5 100644
> >> --- a/net/bridge/br_netfilter.c
> >> +++ b/net/bridge/br_netfilter.c
> >> @@ -723,6 +723,9 @@ static unsigned int br_nf_forward_arp(unsigned int hook, struct sk_buff *skb,
> >>  		return NF_ACCEPT;
> >>  #endif
> >>  
> >> +	if (!skb->nf_bridge)
> >> +		return NF_ACCEPT;
> >> +
> >>  	if (skb->protocol != htons(ETH_P_ARP)) {
> >>  		if (!IS_VLAN_ARP(skb))
> >>  			return NF_ACCEPT;
> 
> That won't fix it since nf_bridge isn't used in the ARP case. The
> correct fix is below. Does anyone know why the null pointer
> dereference (skb->nf_bridge->mask & BRNF_8021Q) in
> nf_bridge_update_protocol() didn't cause my uml kernel to crash??
> 
> cheers,
> Bart
> 
> 
> Don't call nf_bridge_update_protocol() for ARP traffic as
> skb->nf_bridge isn't used in the ARP case.
> 
> 
> Signed-off-by: Bart De Schuymer <bdschuym@pandora.be>
> Reported-by: Stephen Hemminger <shemminger@vyatta.com>
> 
> diff --git a/net/bridge/br_netfilter.c b/net/bridge/br_netfilter.c
> index 93f80fe..4442099 100644
> --- a/net/bridge/br_netfilter.c
> +++ b/net/bridge/br_netfilter.c
> @@ -643,10 +643,10 @@ static int br_nf_forward_finish(struct sk_buff *skb)
>  			skb->pkt_type = PACKET_OTHERHOST;
>  			nf_bridge->mask ^= BRNF_PKT_TYPE;
>  		}
> +		nf_bridge_update_protocol(skb);
>  	} else {
>  		in = *((struct net_device **)(skb->cb));
>  	}
> -	nf_bridge_update_protocol(skb);
>  	nf_bridge_push_encap_header(skb);
>  
>  	NF_HOOK_THRESH(NFPROTO_BRIDGE, NF_BR_FORWARD, skb, in,
> 

This has worked all day for me without problem.


-- 

^ permalink raw reply

* Re: TCP-MD5 checksum failure on x86_64 SMP
From: Stephen Hemminger @ 2010-05-12 22:22 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Bijay Singh, David Miller, <bhaskie@gmail.com>,
	<bhutchings@solarflare.com>, netdev, Ilpo Järvinen
In-Reply-To: <1273634421.2512.21.camel@edumazet-laptop>

On Wed, 12 May 2010 05:20:21 +0200
Eric Dumazet <eric.dumazet@gmail.com> wrote:

> Le mardi 11 mai 2010 à 22:50 +0200, Eric Dumazet a écrit :
> > Le mardi 11 mai 2010 à 04:08 +0000, Bijay Singh a écrit :
> > > Hi Eric,
> > > 
> > > I guess that makes me the enviable one. So I am keen to test out this feature completely, as long as I know what to do as a next step, directions, patches.
> > > 
> > > Thanks
> > 
> > 
> > I believe third problem comes from commit 4957faad
> > (TCPCT part 1g: Responder Cookie => Initiator), from William Allen
> > Simpson.
> 
> And a fourth problem might be that tcp_md5_hash_skb_data() is not
> frag_list aware ?
> 
> 
> 
> 
> diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
> index 8ce2974..56ee0f2 100644
> --- a/net/ipv4/tcp.c
> +++ b/net/ipv4/tcp.c
> @@ -2985,6 +2985,7 @@ int tcp_md5_hash_skb_data(struct tcp_md5sig_pool *hp,
>  	const unsigned head_data_len = skb_headlen(skb) > header_len ?
>  				       skb_headlen(skb) - header_len : 0;
>  	const struct skb_shared_info *shi = skb_shinfo(skb);
> +	struct sk_buff *frag_iter;
>  
>  	sg_init_table(&sg, 1);
>  
> @@ -2999,6 +3000,10 @@ int tcp_md5_hash_skb_data(struct tcp_md5sig_pool *hp,
>  			return 1;
>  	}
>  
> +	skb_walk_frags(skb, frag_iter)
> +		if (tcp_md5_hash_skb_data(hp, frag_iter, 0))
> +			return 1;
> +
>  	return 0;
>  }

Yes, that looks like a possible bug, not sure what hardware
generates frag_list.


^ permalink raw reply

* Re: TCP-MD5 checksum failure on x86_64 SMP
From: David Miller @ 2010-05-12 22:24 UTC (permalink / raw)
  To: shemminger
  Cc: eric.dumazet, Bijay.Singh, bhaskie, bhutchings, netdev,
	ilpo.jarvinen
In-Reply-To: <20100512152207.0e0c321a@nehalam>

From: Stephen Hemminger <shemminger@vyatta.com>
Date: Wed, 12 May 2010 15:22:07 -0700

> Yes, that looks like a possible bug, not sure what hardware
> generates frag_list.

GRO generates frag_list

^ permalink raw reply

* Re: mmotm 2010-05-11 - dies in pm_qos_update_request()
From: Rafael J. Wysocki @ 2010-05-12 22:33 UTC (permalink / raw)
  To: Valdis.Kletnieks
  Cc: Andrew Morton, David S. Miller, linux-kernel, e1000-devel, netdev
In-Reply-To: <17286.1273702370@localhost>

On Thursday 13 May 2010, Valdis.Kletnieks@vt.edu wrote:
> On Wed, 12 May 2010 23:07:20 +0200, "Rafael J. Wysocki" said:
>  
> > > I suspect this commit:
> > > 
> > > commit 23606cf5d1192c2b17912cb2ef6e62f9b11de133
> > > Author: Rafael J. Wysocki <rjw@sisk.pl>
> > > Date:   Sun Mar 14 14:35:17 2010 +0000
> > > 
> > >     e1000e / PCI / PM: Add basic runtime PM support (rev. 4)
> > 
> > No, I don't think so.  I'm running -rc6 with this patch applied on a box with
> > e1000e and it works just fine.
> > 
> > Please try to revert this one instead:
> > 
> > http://git.kernel.org/?p=linux/kernel/git/rafael/suspend-2.6.git;a=patch;h=ed77134bfccf5e75b6cbadab268e559dbe6a4ebb
> 
> Isn't a clean revert - build dies with:
> 
>   CC      net/mac80211/scan.o
> net/mac80211/scan.c: In function 'ieee80211_scan_state_decision':
> net/mac80211/scan.c:510: error: implicit declaration of function 'pm_qos_request'
> make[1]: *** [net/mac80211/scan.o] Error 1
> make: *** [net/mac80211/scan.o] Error 2
> 
> Will investigate more later, find what other commit needs reverting.

Simply rename pm_qos_request() to pm_qos_requirement() in
ieee80211_scan_state_decision() and it should build.

Rafael

^ permalink raw reply

* [PATCH 0/5] sky2: Avoid memory allocations on suspend/resume
From: Mike McCormack @ 2010-05-12 23:16 UTC (permalink / raw)
  To: shemminger; +Cc: netdev

Hi Stephen,

This patch set rearranges sky2_suspend and sky2_resume so that we don't free memory and reallocate it when suspending.

Furthermore, there's two small fixes (Avoid race in sky2_change_mtu) for which I've clarified the commit message, and
(Restore multicast after restart) which restores the multicast settings after reset, as they may be lost.

thanks,

Mike

^ permalink raw reply

* [PATCH 1/5] sky2: Avoid race in sky2_change_mtu
From: Mike McCormack @ 2010-05-12 23:16 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: netdev

netif_stop_queue does not ensure all in-progress transmits are complete,
 so use netif_tx_disable() instead.

Secondly, make sure NAPI polls are disabled before stopping the tx queue,
 otherwise sky2_status_intr might trigger a TX queue wakeup between when
 we stop the queue and NAPI is disabled.

Signed-off-by: Mike McCormack <mikem@ring3k.org>
---
 drivers/net/sky2.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/net/sky2.c b/drivers/net/sky2.c
index 088c797..b839bae 100644
--- a/drivers/net/sky2.c
+++ b/drivers/net/sky2.c
@@ -2236,8 +2236,8 @@ static int sky2_change_mtu(struct net_device *dev, int new_mtu)
 	sky2_write32(hw, B0_IMSK, 0);
 
 	dev->trans_start = jiffies;	/* prevent tx timeout */
-	netif_stop_queue(dev);
 	napi_disable(&hw->napi);
+	netif_tx_disable(dev);
 
 	synchronize_irq(hw->pdev->irq);
 
-- 
1.5.6.5



^ permalink raw reply related

* [PATCH 2/5] sky2: Restore multicast after restart
From: Mike McCormack @ 2010-05-12 23:16 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: netdev

Multicast settings will be lost on reset, so restore them.

Signed-off-by: Mike McCormack <mikem@ring3k.org>
---
 drivers/net/sky2.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/drivers/net/sky2.c b/drivers/net/sky2.c
index b839bae..58ae840 100644
--- a/drivers/net/sky2.c
+++ b/drivers/net/sky2.c
@@ -3293,6 +3293,7 @@ static void sky2_restart(struct work_struct *work)
 			continue;
 
 		sky2_hw_up(sky2);
+		sky2_set_multicast(dev);
 		netif_wake_queue(dev);
 	}
 
-- 
1.5.6.5



^ permalink raw reply related

* [PATCH 3/5] sky2: Shut off interrupts before NAPI
From: Mike McCormack @ 2010-05-12 23:29 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: netdev

Interrupts should be masked, then synchronized, and
finally NAPI should be disabled.

Signed-off-by: Mike McCormack <mikem@ring3k.org>
---
 drivers/net/sky2.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/sky2.c b/drivers/net/sky2.c
index 58ae840..04adcee 100644
--- a/drivers/net/sky2.c
+++ b/drivers/net/sky2.c
@@ -3266,10 +3266,10 @@ static void sky2_restart(struct work_struct *work)
 
 	rtnl_lock();
 
-	napi_disable(&hw->napi);
-	synchronize_irq(hw->pdev->irq);
 	imask = sky2_read32(hw, B0_IMSK);
 	sky2_write32(hw, B0_IMSK, 0);
+	synchronize_irq(hw->pdev->irq);
+	napi_disable(&hw->napi);
 
 	for (i = 0; i < hw->ports; i++) {
 		struct net_device *dev = hw->dev[i];
-- 
1.5.6.5



^ permalink raw reply related

* [PATCH 4/5] sky2: Refactor down/up code out of sky2_restart()
From: Mike McCormack @ 2010-05-12 23:29 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: netdev

Code to bring down all sky2 interfaces and bring it up
again can be reused in sky2_suspend and sky2_resume.

Factor the code to bring the interfaces down into
sky2_all_down and the up code into sky2_all_up.

Signed-off-by: Mike McCormack <mikem@ring3k.org>
---
 drivers/net/sky2.c |   26 +++++++++++++++++++-------
 1 files changed, 19 insertions(+), 7 deletions(-)

diff --git a/drivers/net/sky2.c b/drivers/net/sky2.c
index 04adcee..a8060c8 100644
--- a/drivers/net/sky2.c
+++ b/drivers/net/sky2.c
@@ -3258,15 +3258,11 @@ static int sky2_reattach(struct net_device *dev)
 	return err;
 }
 
-static void sky2_restart(struct work_struct *work)
+static void sky2_all_down(struct sky2_hw *hw)
 {
-	struct sky2_hw *hw = container_of(work, struct sky2_hw, restart_work);
-	u32 imask;
 	int i;
 
-	rtnl_lock();
-
-	imask = sky2_read32(hw, B0_IMSK);
+	sky2_read32(hw, B0_IMSK);
 	sky2_write32(hw, B0_IMSK, 0);
 	synchronize_irq(hw->pdev->irq);
 	napi_disable(&hw->napi);
@@ -3282,8 +3278,12 @@ static void sky2_restart(struct work_struct *work)
 		netif_tx_disable(dev);
 		sky2_hw_down(sky2);
 	}
+}
 
-	sky2_reset(hw);
+static void sky2_all_up(struct sky2_hw *hw)
+{
+	u32 imask = Y2_IS_BASE;
+	int i;
 
 	for (i = 0; i < hw->ports; i++) {
 		struct net_device *dev = hw->dev[i];
@@ -3294,6 +3294,7 @@ static void sky2_restart(struct work_struct *work)
 
 		sky2_hw_up(sky2);
 		sky2_set_multicast(dev);
+		imask |= portirq_msk[i];
 		netif_wake_queue(dev);
 	}
 
@@ -3302,6 +3303,17 @@ static void sky2_restart(struct work_struct *work)
 
 	sky2_read32(hw, B0_Y2_SP_LISR);
 	napi_enable(&hw->napi);
+}
+
+static void sky2_restart(struct work_struct *work)
+{
+	struct sky2_hw *hw = container_of(work, struct sky2_hw, restart_work);
+
+	rtnl_lock();
+
+	sky2_all_down(hw);
+	sky2_reset(hw);
+	sky2_all_up(hw);
 
 	rtnl_unlock();
 }
-- 
1.5.6.5



^ permalink raw reply related

* [PATCH 5/5] sky2: Avoid allocating memory in sky2_resume
From: Mike McCormack @ 2010-05-12 23:30 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: netdev

Allocating memory can fail, and since we have the memory we need
in sky2_resume when sky2_suspend is called, just stop the hardware
without freeing the memory it's using.

This avoids the possibility of failing because we can't allocate
memory in sky2_resume(), and allows sharing code with sky2_restart().

Signed-off-by: Mike McCormack <mikem@ring3k.org>
---
 drivers/net/sky2.c |   20 +++++---------------
 1 files changed, 5 insertions(+), 15 deletions(-)

diff --git a/drivers/net/sky2.c b/drivers/net/sky2.c
index a8060c8..02d9d6f 100644
--- a/drivers/net/sky2.c
+++ b/drivers/net/sky2.c
@@ -4842,12 +4842,12 @@ static int sky2_suspend(struct pci_dev *pdev, pm_message_t state)
 	cancel_work_sync(&hw->restart_work);
 
 	rtnl_lock();
+
+	sky2_all_down(hw);
 	for (i = 0; i < hw->ports; i++) {
 		struct net_device *dev = hw->dev[i];
 		struct sky2_port *sky2 = netdev_priv(dev);
 
-		sky2_detach(dev);
-
 		if (sky2->wol)
 			sky2_wol_init(sky2);
 
@@ -4856,8 +4856,6 @@ static int sky2_suspend(struct pci_dev *pdev, pm_message_t state)
 
 	device_set_wakeup_enable(&pdev->dev, wol != 0);
 
-	sky2_write32(hw, B0_IMSK, 0);
-	napi_disable(&hw->napi);
 	sky2_power_aux(hw);
 	rtnl_unlock();
 
@@ -4872,12 +4870,11 @@ static int sky2_suspend(struct pci_dev *pdev, pm_message_t state)
 static int sky2_resume(struct pci_dev *pdev)
 {
 	struct sky2_hw *hw = pci_get_drvdata(pdev);
-	int i, err;
+	int err;
 
 	if (!hw)
 		return 0;
 
-	rtnl_lock();
 	err = pci_set_power_state(pdev, PCI_D0);
 	if (err)
 		goto out;
@@ -4895,20 +4892,13 @@ static int sky2_resume(struct pci_dev *pdev)
 		goto out;
 	}
 
+	rtnl_lock();
 	sky2_reset(hw);
-	sky2_write32(hw, B0_IMSK, Y2_IS_BASE);
-	napi_enable(&hw->napi);
-
-	for (i = 0; i < hw->ports; i++) {
-		err = sky2_reattach(hw->dev[i]);
-		if (err)
-			goto out;
-	}
+	sky2_all_up(hw);
 	rtnl_unlock();
 
 	return 0;
 out:
-	rtnl_unlock();
 
 	dev_err(&pdev->dev, "resume failed (%d)\n", err);
 	pci_disable_device(pdev);
-- 
1.5.6.5


^ permalink raw reply related

* Re: linux-next: manual merge of the net tree with the wireless-current tree
From: Stephen Rothwell @ 2010-05-13  1:02 UTC (permalink / raw)
  To: reinette chatre
  Cc: David Miller, netdev, linux-next, linux-kernel, Berg, Johannes,
	John W. Linville
In-Reply-To: <1273680943.2370.9533.camel@rchatre-DESK>

[-- Attachment #1: Type: text/plain, Size: 865 bytes --]

Hi Reinette,

On Wed, 12 May 2010 09:15:43 -0700 reinette chatre <reinette.chatre@intel.com> wrote:
>
> Please note that this member of iwl_priv is now located in two
> places ... here and also in the device specific union a few lines up
> inside the _agn struct. In the code it is indeed the one in the _agn
> struct that is being used (see usage in iwl-agn.c), so adding this line
> is not necessary.
> 
> This should not cause any problems with driver operation and can
> probably remain as such if indeed it is cleaned up somewhere else before
> it hits you again or linux-2.6.

Thanks for pointing this out.  I will update my merge resolution in
linux-next and I assume it will be done correctly when Dave or John also
do that merge.

-- 
Cheers,
Stephen Rothwell                    sfr@canb.auug.org.au
http://www.canb.auug.org.au/~sfr/

[-- Attachment #2: Type: application/pgp-signature, Size: 198 bytes --]

^ permalink raw reply

* Re: Config Items appearing twice in same Kconfig file?
From: Michael Ellerman @ 2010-05-13  1:08 UTC (permalink / raw)
  To: Christoph Egger
  Cc: linux-ia64, vamos, Jesse Barnes, Timo Teras, Chen Liqin,
	Paul Mackerras, H. Peter Anvin, Lennox Wu, Jesper Nilsson,
	Pekka Savola (ipv6), x86, James Morris, Ingo Molnar,
	Alexey Kuznetsov, Matt Turner, Fenghua Yu, Mike Frysinger,
	Arnd Bergmann, Graf Yang, Mikael Starvik, Ivan Kokshaysky,
	Thomas Gleixner, Arjan van de Ven, Richard Henderson,
	Karsten Keil
In-Reply-To: <20100512144017.GA722@faui48a.informatik.uni-erlangen.de>


[-- Attachment #1.1: Type: text/plain, Size: 994 bytes --]

On Wed, 2010-05-12 at 16:40 +0200, Christoph Egger wrote:
> Just noticed it might have been worth adding the output so everyone
> out there doesn't ned to grep through his tree for reference:
> 
> % grep  ^config **/Kconfig* | sort | uniq -dc                               

That didn't actually work for me?

>       2 arch/powerpc/Kconfig:config KERNEL_START
>       2 arch/powerpc/Kconfig:config PAGE_OFFSET
>       2 arch/powerpc/Kconfig:config PHYSICAL_START
>       2 arch/powerpc/Kconfig:config RELOCATABLE

The duplication of these is caused by the second definitions being
inside an "if PPC64" block.

For PAGE_OFFSET and KERNEL_START we want the 2nd definitions, the
default value is different between 32 & 64 bit.

For RELOCATABLE it's a little weird, but we want the 2nd definition
because the depends condition is different between 32 & 64bit - though
they could probably be merged.

I don't really see why we need two versions of PHYSICAL_START.

cheers

[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 198 bytes --]

[-- Attachment #2: Type: text/plain, Size: 150 bytes --]

_______________________________________________
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

^ permalink raw reply

* printk mac address --hangs
From: ratheesh k @ 2010-05-13  5:16 UTC (permalink / raw)
  To: netdev

/* code snippent */
 char src[ETH_ALEN+2];
 char dst[ETH_ALEN+2];

 eth_header = eth_hdr(skb);
 memcpy(src , eth_header->h_source , ETH_ALEN );
 memcpy(dst , eth_header->h_dest , ETH_ALEN );

Linux crashes . But if comment out memcpy lines  . It works . What is
the problem here ?
How can i printk both source mac and dest mac address  ?

Thanks,
Ratheesh

^ permalink raw reply

* RE: [PATCH 2.6.34-rc6] net: Improve ks8851 snl transmit performance
From: Ha, Tristram @ 2010-05-13  5:38 UTC (permalink / raw)
  To: Arce, Abraham, Ben Dooks
  Cc: David Miller, netdev, linux-kernel, Jan, Sebastien
In-Reply-To: <27F9C60D11D683428E133F85D2BB4A53043DF7779A@dlee03.ent.ti.com>

I use a web browser to send patches through my company's e-mail system.  The message is composed by cut and paste, so it may not conform to Linux standard.

The latest nuttcp default size for UDP is 1500 bytes, rather than 8192 bytes.  In my case, the transmit performance improves from 10 Mbps to 11.  Have you tried TCP?


-----Original Message-----
From: Arce, Abraham [mailto:x0066660@ti.com]
Sent: Thu 5/6/2010 10:02 PM
To: Ha, Tristram; Ben Dooks
Cc: David Miller; netdev@vger.kernel.org; linux-kernel@vger.kernel.org; Jan, Sebastien
Subject: RE: [PATCH 2.6.34-rc6] net: Improve ks8851 snl transmit performance
 
Hi,

[snip]

> There is a driver option no_tx_opt so that the driver can revert to original
> implementation.  This allows user to verify if the transmit performance
> actually improves.

Should we limit patch description to 80 characters also?

> Signed-off-by: Tristram Ha <Tristram.Ha@micrel.com>
> ---
> This replaces the [patch 01/13] patch I submitted and was objected by David.
> 
> Other users with Micrel KSZ8851 SNL chip please verify the transmit
> performance does improve or not.

Tested-by: Abraham Arce <x0066660@ti.com>

Executing some nuttcp scenarios:

- Without Patch -

# /testsuites/ethernet/bin/nuttcp -u -i -Ri50m <serverip>
 1.2676 MB /   1.00 sec =   10.6330 Mbps     0 /  1298 ~drop/pkt  0.00 ~%loss
 1.2705 MB /   1.00 sec =   10.6579 Mbps     0 /  1301 ~drop/pkt  0.00 ~%loss
 1.2686 MB /   1.00 sec =   10.6414 Mbps     0 /  1299 ~drop/pkt  0.00 ~%loss
 1.2695 MB /   1.00 sec =   10.6496 Mbps     0 /  1300 ~drop/pkt  0.00 ~%loss
 1.2695 MB /   1.00 sec =   10.6496 Mbps     0 /  1300 ~drop/pkt  0.00 ~%loss
 1.2686 MB /   1.00 sec =   10.6414 Mbps     0 /  1299 ~drop/pkt  0.00 ~%loss
 1.2686 MB /   1.00 sec =   10.6414 Mbps     0 /  1299 ~drop/pkt  0.00 ~%loss
 1.2646 MB /   1.00 sec =   10.6086 Mbps     0 /  1295 ~drop/pkt  0.00 ~%loss
 1.2686 MB /   1.00 sec =   10.6412 Mbps     0 /  1299 ~drop/pkt  0.00 ~%loss
 1.2695 MB /   1.00 sec =   10.6498 Mbps     0 /  1300 ~drop/pkt  0.00 ~%loss

12.7314 MB /  10.02 sec =   10.6637 Mbps 4 %TX 0 %RX 0 / 13037 drop/pkt 0.00 %loss

- With Patch -

# /testsuites/ethernet/bin/nuttcp -u -i -Ri50m 10.87.231.229
    1.2891 MB /   1.00 sec =   10.8133 Mbps     0 /  1320 ~drop/pkt  0.00 ~%loss
    1.2900 MB /   1.00 sec =   10.8217 Mbps     0 /  1321 ~drop/pkt  0.00 ~%loss
    1.2900 MB /   1.00 sec =   10.8217 Mbps     0 /  1321 ~drop/pkt  0.00 ~%loss
    1.2910 MB /   1.00 sec =   10.8298 Mbps     0 /  1322 ~drop/pkt  0.00 ~%loss
    1.2910 MB /   1.00 sec =   10.8299 Mbps     0 /  1322 ~drop/pkt  0.00 ~%loss
    1.2900 MB /   1.00 sec =   10.8216 Mbps     0 /  1321 ~drop/pkt  0.00 ~%loss
    1.2900 MB /   1.00 sec =   10.8216 Mbps     0 /  1321 ~drop/pkt  0.00 ~%loss
    1.2891 MB /   1.00 sec =   10.8135 Mbps     0 /  1320 ~drop/pkt  0.00 ~%loss
    1.2900 MB /   1.00 sec =   10.8216 Mbps     0 /  1321 ~drop/pkt  0.00 ~%loss
    1.2910 MB /   1.00 sec =   10.8298 Mbps     0 /  1322 ~drop/pkt  0.00 ~%loss

   12.9492 MB /  10.02 sec =   10.8461 Mbps 4 %TX 0 %RX 0 / 13260 drop/pkt 0.00
%loss

Also simulated heavy transmission consisting of 40 processes executed in parallel:

 - 20 ping instances using packet size of 32768
 - 20 dd instances creating a 50MB file each under the nfs rootfs

If any specific test scenario/application is required please do let me know...

Best Regards
Abraham


^ permalink raw reply

* Re: [PATCH net-next 0/16] tipc: 1st integration of basic changes from sourceforge
From: David Miller @ 2010-05-13  6:28 UTC (permalink / raw)
  To: paul.gortmaker; +Cc: netdev, allan.stephens
In-Reply-To: <1273624218-22514-1-git-send-email-paul.gortmaker@windriver.com>

From: Paul Gortmaker <paul.gortmaker@windriver.com>
Date: Tue, 11 May 2010 20:30:02 -0400

> The following patches are step one in making use of the changes that
> were stored away on sourceforge but not yet integrated into the kernel.
> 
> Since I am far from knowledgeable on tipc, this starts by pulling in
> the 1st batch of basic/cosmetic changes that will at least start to
> reduce the delta between the two.   Hopefully getting all the simple
> stuff out of the way 1st will help clarify what is left of interest.
> 
> I've also put the same commits on the branch tipc-May11_2010 in:
> git://git.kernel.org/pub/scm/linux/kernel/git/paulg/net-next-2.6.git
> in case that is more convenient for people.

All applied to net-next-2.6, thanks for doing this work Paul.

^ permalink raw reply

* Re: [PATCH NEXT 0/4]netxen: bug fixes
From: David Miller @ 2010-05-13  6:29 UTC (permalink / raw)
  To: amit.salecha; +Cc: netdev, ameen.rahman
In-Reply-To: <1273657985-13405-1-git-send-email-amit.salecha@qlogic.com>

From: Amit Kumar Salecha <amit.salecha@qlogic.com>
Date: Wed, 12 May 2010 02:53:01 -0700

>   Series of 4 patches to fix diagnostic tools access and register usage
>   for NX3031.
>   Please apply them on net-next branch.

All applied, thank you.

^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox