Netdev List


* Re: [PATCH 0/2] cfg80211: firmware and hardware version
From: Inaky Perez-Gonzalez @ 2009-10-01 20:12 UTC
  To: Luis R. Rodriguez
  Cc: John W. Linville, Kalle Valo, linux-wireless@vger.kernel.org,
	netdev@vger.kernel.org
In-Reply-To: <43e72e890910011256v18b30e7ck420ce80b5d35fdcb@mail.gmail.com>

On Thu, 2009-10-01 at 13:56 -0600, Luis R. Rodriguez wrote:
> On Thu, Oct 1, 2009 at 5:07 PM, John W. Linville <linville@tuxdriver.com> wrote:
> 
> > I don't predict a huge problem if there are
> > valid extensions required for use by wireless drivers in the future.
> > But for now, I'd like to see us make use of some of the debugging
> > facilities available in the ethtool API -- hopefully the iwlwifi guys
> > are listening... ;-)
> 
> Does the same apply to wimax then? Ethtool for 802.11 and wimax? Eh.

Not really -- WiMAX is not eth-frame based, but IP based.

The WiMAX stack doesn't impose any framing or network device type
requirement. That is left up to the device driver writer
(although yes, emulating eth is easier).

-- 
-- Inaky




* Re: [PATCH 0/2] cfg80211: firmware and hardware version
From: Luis R. Rodriguez @ 2009-10-01 19:56 UTC
  To: John W. Linville, Perez-Gonzalez, Inaky
  Cc: Kalle Valo, linux-wireless@vger.kernel.org,
	netdev@vger.kernel.org
In-Reply-To: <20091001170722.GC2895-2XuSBdqkA4R54TAoqtyWWQ@public.gmane.org>

On Thu, Oct 1, 2009 at 5:07 PM, John W. Linville <linville@tuxdriver.com> wrote:

> I don't predict a huge problem if there are
> valid extensions required for use by wireless drivers in the future.
> But for now, I'd like to see us make use of some of the debugging
> facilities available in the ethtool API -- hopefully the iwlwifi guys
> are listening... ;-)

Does the same apply to wimax then? Ethtool for 802.11 and wimax? Eh.

  Luis
--
To unsubscribe from this list: send the line "unsubscribe linux-wireless" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* 2.6.32-rc1-git2: Reported regressions 2.6.30 -> 2.6.31
From: Rafael J. Wysocki @ 2009-10-01 19:53 UTC
  To: Linux Kernel Mailing List
  Cc: Andrew Morton, Linus Torvalds, Natalie Protasevich,
	Kernel Testers List, Network Development, Linux ACPI,
	Linux PM List, Linux SCSI List, Linux Wireless List, DRI

[Notes:

 * Quite a number of new regressions from 2.6.30 have been reported during
   the last three weeks.

 * The number of unresolved regressions 2.6.30 -> 2.6.31 is now the second
   highest ever.]

This message contains a list of some regressions introduced between 2.6.30 and
2.6.31, for which there are no fixes in the mainline I know of.  If any of them
have been fixed already, please let me know.

If you know of any other unresolved regressions introduced between 2.6.30
and 2.6.31, please let me know as well and I'll add them to the list.
Also, please let me know if any of the entries below are invalid.

Each entry from the list will be sent additionally in an automatic reply to
this message with CCs to the people involved in reporting and handling the
issue.


Listed regressions statistics:

  Date          Total  Pending  Unresolved
  ----------------------------------------
  2009-10-02      151       49          42
  2009-09-06      123       34          27
  2009-08-26      108       33          26
  2009-08-20      102       32          29
  2009-08-10       89       27          24
  2009-08-02       76       36          28
  2009-07-27       70       51          43
  2009-07-07       35       25          21
  2009-06-29       22       22          15


Unresolved regressions
----------------------

Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14301
Subject		: WARNING: at net/ipv4/af_inet.c:154
Submitter	: Ralf Hildebrandt <Ralf.Hildebrandt@charite.de>
Date		: 2009-09-30 12:24 (2 days old)
References	: http://marc.info/?l=linux-kernel&m=125431350218137&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14294
Subject		: kernel BUG at drivers/ide/ide-disk.c:187
Submitter	: Santiago Garcia Mantinan <manty@manty.net>
Date		: 2009-09-30 11:05 (2 days old)
References	: http://marc.info/?l=linux-kernel&m=125430926311466&w=4
Handled-By	: David Miller <davem@davemloft.net>


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14270
Subject		: Cannot boot on a PIII Celeron
Submitter	: Michael Tokarev <mjt@tls.msk.ru>
Date		: 2009-09-28 15:26 (4 days old)
References	: http://marc.info/?l=linux-kernel&m=125415160524110&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14267
Subject		: Disassociating atheros wlan
Submitter	: Kristoffer Ericson <kristoffer.ericson@gmail.com>
Date		: 2009-09-24 10:16 (8 days old)
References	: http://marc.info/?l=linux-kernel&m=125378723723384&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14266
Subject		: regression in page writeback
Submitter	: Shaohua Li <shaohua.li@intel.com>
Date		: 2009-09-22 5:49 (10 days old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=d7831a0bdf06b9f722b947bb0c205ff7d77cebd8
References	: http://marc.info/?l=linux-kernel&m=125359858117176&w=4
Handled-By	: Wu Fengguang <fengguang.wu@intel.com>


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14265
Subject		: ifconfig: page allocation failure. order:5, mode:0x8020 w/ e100
Submitter	: Karol Lewandowski <karol.k.lewandowski@gmail.com>
Date		: 2009-09-15 12:05 (17 days old)
References	: http://marc.info/?l=linux-kernel&m=125301636509517&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14264
Subject		: ehci problem - mouse dead on scroll
Submitter	: Volker Armin Hemmann <volkerarmin@googlemail.com>
Date		: 2009-09-12 7:46 (20 days old)
References	: http://marc.info/?l=linux-kernel&m=125274202707893&w=4
Handled-By	: Alan Stern <stern@rowland.harvard.edu>


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14257
Subject		: Not able to boot on 32 bit System
Submitter	: Rishikesh <risrajak@linux.vnet.ibm.com>
Date		: 2009-09-21 15:25 (11 days old)
References	: http://marc.info/?l=linux-kernel&m=125354604314412&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14256
Subject		: kernel BUG at fs/ext3/super.c:435
Submitter	: Mikael Pettersson <mikpe@it.uu.se>
Date		: 2009-09-21 7:29 (11 days old)
References	: http://marc.info/?l=linux-kernel&m=125351816109264&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14255
Subject		: WARNING: at drivers/char/tty_io.c:1267
Submitter	: Heinz Diehl <htd@fancy-poultry.org>
Date		: 2009-09-20 11:37 (12 days old)
References	: http://marc.info/?l=linux-kernel&m=125344629506309&w=4
		  http://lkml.org/lkml/2009/9/8/393
Handled-By	: Linus Torvalds <torvalds@linux-foundation.org>


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14254
Subject		: Hibernation broken by clocksource: Save mult_orig in clocksource_disable()
Submitter	: Ondrej Zary <linux@rainbow-software.org>
Date		: 2009-09-19 19:55 (13 days old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=c7121843685de2bf7f3afd3ae1d6a146010bf1fc
References	: http://marc.info/?l=linux-kernel&m=125339012527719&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14252
Subject		: WARNING: at include/linux/skbuff.h:1382 w/ e1000
Submitter	: Stephan von Krawczynski <skraw@ithnet.com>
Date		: 2009-09-20 11:26 (12 days old)
References	: http://marc.info/?l=linux-kernel&m=125344599006033&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14251
Subject		: 2.6.31: no login prompt
Submitter	: Frédéric L. W. Meunier <fredlwm@gmail.com>
Date		: 2009-09-19 22:43 (13 days old)
References	: http://marc.info/?l=linux-kernel&m=125340020804711&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14249
Subject		: BUG: oops in gss_validate on 2.6.31
Submitter	: Bastian Blank <bastian@waldi.eu.org>
Date		: 2009-09-16 10:29 (16 days old)
References	: http://marc.info/?l=linux-kernel&m=125309700417283&w=4
Handled-By	: Trond Myklebust <trond.myklebust@fys.uio.no>


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14248
Subject		: 2.6.31 wireless: WARNING: at net/wireless/ibss.c:34
Submitter	: Jurriaan <thunder8@xs4all.nl>
Date		: 2009-09-13 7:32 (19 days old)
References	: http://marc.info/?l=linux-kernel&m=125282721113553&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14222
Subject		: Hibernation oopses for the 2nd time with 2.6.31 (won't fit the screen)
Submitter	: Ondrej Zary <linux@rainbow-software.org>
Date		: 2009-09-24 14:07 (8 days old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=c7121843685de2bf7f3afd3ae1d6a146010bf1fc


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14205
Subject		: Intel DX58SO mainboard - powering off takes really long
Submitter	: Tomasz Chmielewski <tch@wpkg.org>
Date		: 2009-09-22 10:14 (10 days old)


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14204
Subject		: MCE prevent booting on my computer(pentium iii @500Mhz)
Submitter	: GNUtoo <GNUtoo@no-log.org>
Date		: 2009-09-21 20:36 (11 days old)


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14185
Subject		: Oops in drivers/base/firmware_class
Submitter	: Lars Ericsson <lars_ericsson@telia.com>
Date		: 2009-09-17 05:09 (15 days old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=6e03a201bbe8137487f340d26aa662110e324b20


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14181
Subject		: b43 causes panic at system shutdown
Submitter	: Jeremy Huddleston <jeremyhu@freedesktop.org>
Date		: 2009-09-15 18:34 (17 days old)


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14157
Subject		: end_request: I/O error, dev cciss/cXdX, sector 0
Submitter	:  <jiri.harcarik@gmail.com>
Date		: 2009-09-11 07:42 (21 days old)


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14143
Subject		: OOPS when setting nr_requests for md devices
Submitter	: aCaB <acab@clamav.net>
Date		: 2009-09-08 08:48 (24 days old)


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14141
Subject		: order 2 page allocation failures in iwlagn
Submitter	: Frans Pop <elendil@planet.nl>
Date		: 2009-09-06 7:40 (26 days old)
References	: http://marc.info/?l=linux-kernel&m=125222287419691&w=4
Handled-By	: Pekka Enberg <penberg@cs.helsinki.fi>


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14133
Subject		: WARNING: at arch/x86/kernel/smp.c:117 native_smp_send_reschedule
Submitter	: Jens Axboe <jens.axboe@oracle.com>
Date		: 2009-08-31 20:43 (32 days old)
References	: http://marc.info/?l=linux-kernel&m=125175143918050&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14114
Subject		: Tuning a saa7134 based card is broken in kernel 2.6.31-rc7
Submitter	: Tsvety Petrov <Tsvetoslav.Petrov@itron.com>
Date		: 2009-09-03 21:06 (29 days old)


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14090
Subject		: WARNING: at fs/notify/inotify/inotify_user.c:394
Submitter	: Joerg Platte <bugzilla@jako.ping.de>
Date		: 2009-08-30 15:21 (33 days old)


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14070
Subject		: lockdep warning triggered by dup_fd
Submitter	: Bart Van Assche <bart.vanassche@gmail.com>
Date		: 2009-08-23 09:36 (40 days old)
References	: http://lkml.org/lkml/2009/8/23/8


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14058
Subject		: Oops in fsnotify
Submitter	: Grant Wilson <grant.wilson@zen.co.uk>
Date		: 2009-08-20 15:48 (43 days old)
References	: http://marc.info/?l=linux-kernel&m=125078450923133&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14013
Subject		: hd don't show up
Submitter	: Tim Blechmann <tim@klingt.org>
Date		: 2009-08-14 8:26 (49 days old)
References	: http://marc.info/?l=linux-kernel&m=125023842514480&w=4
Handled-By	: Tejun Heo <tj@kernel.org>


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13987
Subject		: Received NMI interrupt at resume
Submitter	: Christian Casteyde <casteyde.christian@free.fr>
Date		: 2009-08-15 07:55 (48 days old)


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13950
Subject		: Oops when USB Serial disconnected while in use
Submitter	: Bruno Prémont <bonbons@linux-vserver.org>
Date		: 2009-08-08 17:47 (55 days old)
References	: http://marc.info/?l=linux-kernel&m=124975432900466&w=4
Handled-By	: Alan Stern <stern@rowland.harvard.edu>


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13943
Subject		: WARNING: at net/mac80211/mlme.c:2292 with ath5k
Submitter	: Fabio Comolli <fabio.comolli@gmail.com>
Date		: 2009-08-06 20:15 (57 days old)
References	: http://marc.info/?l=linux-kernel&m=124958978600600&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13942
Subject		: Troubles with AoE and uninitialized object
Submitter	: Bruno Prémont <bonbons@linux-vserver.org>
Date		: 2009-08-04 10:12 (59 days old)
References	: http://marc.info/?l=linux-kernel&m=124938117104811&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13941
Subject		: x86 Geode issue
Submitter	: Martin-Éric Racine <q-funk@iki.fi>
Date		: 2009-08-03 12:58 (60 days old)
References	: http://marc.info/?l=linux-kernel&m=124930434732481&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13940
Subject		: iwlagn and sky2 stopped working, ACPI-related
Submitter	: Ricardo Jorge da Fonseca Marques Ferreira <storm@sys49152.net>
Date		: 2009-08-07 22:33 (56 days old)
References	: http://marc.info/?l=linux-kernel&m=124968457731107&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13935
Subject		: 2.6.31-rcX breaks Apple MightyMouse (Bluetooth version)
Submitter	: Adrian Ulrich <kernel@blinkenlights.ch>
Date		: 2009-08-08 22:08 (55 days old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=fa047e4f6fa63a6e9d0ae4d7749538830d14a343


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13906
Subject		: Huawei E169 GPRS connection causes Ooops
Submitter	: Clemens Eisserer <linuxhippy@gmail.com>
Date		: 2009-08-04 09:02 (59 days old)


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13869
Subject		: Radeon framebuffer (w/o KMS) corruption at boot.
Submitter	: Duncan <1i5t5.duncan@cox.net>
Date		: 2009-07-29 16:44 (65 days old)


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13836
Subject		: suspend script fails, related to stdout?
Submitter	: Tomas M. <tmezzadra@gmail.com>
Date		: 2009-07-17 21:24 (77 days old)
References	: http://marc.info/?l=linux-kernel&m=124785853811667&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13809
Subject		: oprofile: possible circular locking dependency detected
Submitter	: Jerome Marchand <jmarchan@redhat.com>
Date		: 2009-07-22 13:35 (72 days old)


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13733
Subject		: 2.6.31-rc2: irq 16: nobody cared
Submitter	: Niel Lambrechts <niel.lambrechts@gmail.com>
Date		: 2009-07-06 18:32 (88 days old)
References	: http://marc.info/?l=linux-kernel&m=124690524027166&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13645
Subject		: NULL pointer dereference at (null) (level2_spare_pgt)
Submitter	: poornima nayak <mpnayak@linux.vnet.ibm.com>
Date		: 2009-06-17 17:56 (107 days old)
References	: http://lkml.org/lkml/2009/6/17/194


Regressions with patches
------------------------

Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14275
Subject		: kernel>=2.6.31: ahci.c: do not force unconditionally sb600 to 32bit dma any more?
Submitter	: gabriele balducci <balducci@units.it>
Date		: 2009-09-30 15:02 (2 days old)
Patch		: http://bugzilla.kernel.org/show_bug.cgi?id=14275#c0


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14261
Subject		: e1000e jumbo frames no longer work: 'Unsupported MTU setting'
Submitter	: Nix <nix@esperi.org.uk>
Date		: 2009-09-26 11:16 (6 days old)
References	: http://marc.info/?l=linux-kernel&m=125396433321342&w=4
Handled-By	: Alexander Duyck <alexander.duyck@gmail.com>
Patch		: http://patchwork.kernel.org/patch/50277/


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14258
Subject		: Memory leak in SCSI initialization
Submitter	: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Date		: 2009-09-22 4:18 (10 days old)
References	: http://marc.info/?l=linux-kernel&m=125359311312243&w=4
Handled-By	: Michael Ellerman <michael@ellerman.id.au>
Patch		: http://patchwork.kernel.org/patch/49258/


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14253
Subject		: Oops in drivers/base/firmware_class
Submitter	: Lars Ericsson <Lars_Ericsson@telia.com>
Date		: 2009-09-16 20:44 (16 days old)
References	: http://lkml.org/lkml/2009/9/16/461
Handled-By	: Frederik Deweerdt <frederik.deweerdt@xprog.eu>
Patch		: http://patchwork.kernel.org/patch/49914/


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14137
Subject		: usb console regressions
Submitter	: Jason Wessel <jason.wessel@windriver.com>
Date		: 2009-09-05 21:08 (27 days old)
References	: http://marc.info/?l=linux-kernel&m=125218501310512&w=4
Handled-By	: Jason Wessel <jason.wessel@windriver.com>
Patch		: http://patchwork.kernel.org/patch/45953/
		  http://patchwork.kernel.org/patch/45952/


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14017
Subject		: _end symbol missing from Symbol.map
Submitter	: Hannes Reinecke <hare@suse.de>
Date		: 2009-08-13 6:45 (50 days old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=091e52c3551d3031343df24b573b770b4c6c72b6
References	: http://marc.info/?l=linux-kernel&m=125014649102253&w=4
Handled-By	: Hannes Reinecke <hare@suse.de>
Patch		: http://marc.info/?l=linux-kernel&m=125014649102253&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13948
Subject		: ath5k broken after suspend-to-ram
Submitter	: Johannes Stezenbach <js@sig21.net>
Date		: 2009-08-07 21:51 (56 days old)
References	: http://marc.info/?l=linux-kernel&m=124968192727854&w=4
Handled-By	: Nick Kossifidis <mickflemm@gmail.com>
Patch		: http://patchwork.kernel.org/patch/38550/


For details, please visit the bug entries and follow the links given in
references.

As you can see, there is a Bugzilla entry for each of the listed regressions.
There also is a Bugzilla entry used for tracking the regressions introduced
between 2.6.30 and 2.6.31, unresolved as well as resolved, at:

http://bugzilla.kernel.org/show_bug.cgi?id=13615

Please let me know if there are any Bugzilla entries that should be added to
the list in there.

Thanks,
Rafael


* [RFC][PATCH] ethtool: Add reset operation
From: Ben Hutchings @ 2009-10-01 19:43 UTC
  To: David Miller; +Cc: netdev, linux-net-drivers

After updating firmware stored in flash, users may wish to reset the
relevant hardware and start the new firmware immediately.  This should
not be completely automatic as it may be disruptive.

A selective reset may also be useful for debugging or diagnostics.

This adds a separate reset operation which takes flags indicating the
components to be reset.  Drivers are allowed to reset only a subset of
those requested, and must report the actual subset.  This allows the
use of generic component masks and some future expansion.
---
This is intentionally not signed off yet.

Our new controller has an embedded management controller shared between
two ports.  We have a customer requirement to be able to update its
firmware in flash and then reboot it into the new firmware under driver
control.

David, you indicated that the proper interface for this is a new ethtool
op and not something specific to sfc.  So I've tried to make this
reasonably generic.

Ben.

 include/linux/ethtool.h |   29 +++++++++++++++++++++++++++++
 net/core/ethtool.c      |   23 +++++++++++++++++++++++
 2 files changed, 52 insertions(+), 0 deletions(-)

diff --git a/include/linux/ethtool.h b/include/linux/ethtool.h
index 9cbe5f3..acf0242 100644
--- a/include/linux/ethtool.h
+++ b/include/linux/ethtool.h
@@ -503,6 +503,7 @@ struct ethtool_ops {
 	int	(*get_rxnfc)(struct net_device *, struct ethtool_rxnfc *, void *);
 	int	(*set_rxnfc)(struct net_device *, struct ethtool_rxnfc *);
 	int     (*flash_device)(struct net_device *, struct ethtool_flash *);
+	int	(*reset)(struct net_device *, u32 *);
 };
 #endif /* __KERNEL__ */
 
@@ -560,6 +561,7 @@ struct ethtool_ops {
 #define	ETHTOOL_SRXCLSRLDEL	0x00000031 /* Delete RX classification rule */
 #define	ETHTOOL_SRXCLSRLINS	0x00000032 /* Insert RX classification rule */
 #define	ETHTOOL_FLASHDEV	0x00000033 /* Flash firmware to device */
+#define	ETHTOOL_RESET		0x00000034 /* Reset hardware */
 
 /* compatibility with older code */
 #define SPARC_ETH_GSET		ETHTOOL_GSET
@@ -690,4 +692,31 @@ struct ethtool_ops {
 
 #define	RX_CLS_FLOW_DISC	0xffffffffffffffffULL
 
+/* Reset flags */
+/* The driver must update the flags to indicate which components were
+ * actually reset, which must be equal to or a subset of those requested.
+ */
+enum ethtool_reset_flags {
+	/* These flags represent components dedicated to the interface
+	 * the command is addressed to.  Shift any flag left by
+	 * ETH_RESET_SHARED_SHIFT to reset a shared component of the
+	 * same type.
+	 */
+	ETH_RESET_MGMT		= 1 << 0,	/* Management processor */
+	ETH_RESET_IRQ		= 1 << 1,	/* Interrupt requester */
+	ETH_RESET_DMA		= 1 << 2,	/* DMA engine */
+	ETH_RESET_FILTER	= 1 << 3,	/* Filtering/flow direction */
+	ETH_RESET_OFFLOAD	= 1 << 4,	/* Protocol offload */
+	ETH_RESET_MAC		= 1 << 5,	/* Media access controller */
+	ETH_RESET_PHY		= 1 << 6,	/* Transceiver/PHY */
+	ETH_RESET_RAM		= 1 << 7,	/* RAM shared between
+						 * multiple components */
+
+	ETH_RESET_DEDICATED	= 0x0000ffff,	/* All components dedicated to
+						 * this interface */
+	ETH_RESET_ALL		= 0xffffffff,	/* All components used by this
+						 * interface, even if shared */
+};
+#define ETH_RESET_SHARED_SHIFT	16
+
 #endif /* _LINUX_ETHTOOL_H */
diff --git a/net/core/ethtool.c b/net/core/ethtool.c
index 4c12ddb..6c7429c 100644
--- a/net/core/ethtool.c
+++ b/net/core/ethtool.c
@@ -309,6 +309,26 @@ static int ethtool_get_regs(struct net_device *dev, char __user *useraddr)
 	return ret;
 }
 
+static int ethtool_reset(struct net_device *dev, char __user *useraddr)
+{
+	struct ethtool_value reset;
+	int ret;
+
+	if (!dev->ethtool_ops->reset)
+		return -EOPNOTSUPP;
+
+	if (copy_from_user(&reset, useraddr, sizeof(reset)))
+		return -EFAULT;
+
+	ret = dev->ethtool_ops->reset(dev, &reset.data);
+	if (ret)
+		return ret;
+
+	if (copy_to_user(useraddr, &reset, sizeof(reset)))
+		return -EFAULT;
+	return 0;
+}
+
 static int ethtool_get_wol(struct net_device *dev, char __user *useraddr)
 {
 	struct ethtool_wolinfo wol = { ETHTOOL_GWOL };
@@ -1127,6 +1147,9 @@ int dev_ethtool(struct net *net, struct ifreq *ifr)
 	case ETHTOOL_FLASHDEV:
 		rc = ethtool_flash_device(dev, useraddr);
 		break;
+	case ETHTOOL_RESET:
+		rc = ethtool_reset(dev, useraddr);
+		break;
 	default:
 		rc = -EOPNOTSUPP;
 	}

-- 
Ben Hutchings, Senior Software Engineer, Solarflare Communications
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.
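
The flag contract described in this RFC (the driver reports back the subset of components it actually reset, and shared components are addressed by shifting a flag left by ETH_RESET_SHARED_SHIFT) can be illustrated with a small userspace model. This is only a sketch: the flag values are copied from the patch, while `example_reset()` and `EXAMPLE_RESETTABLE` are hypothetical names standing in for a driver's `reset` hook, and no hardware is touched.

```c
#include <assert.h>
#include <stdint.h>

/* Flag values copied from the RFC patch (subset only). */
enum {
	ETH_RESET_MGMT	= 1 << 0,	/* Management processor */
	ETH_RESET_IRQ	= 1 << 1,	/* Interrupt requester */
	ETH_RESET_DMA	= 1 << 2,	/* DMA engine */
	ETH_RESET_MAC	= 1 << 5,	/* Media access controller */
	ETH_RESET_PHY	= 1 << 6,	/* Transceiver/PHY */
};
#define ETH_RESET_SHARED_SHIFT	16

/* Hypothetical: the components this imaginary driver can reset. */
#define EXAMPLE_RESETTABLE \
	((uint32_t)(ETH_RESET_MAC | ETH_RESET_PHY | \
		    (ETH_RESET_MGMT << ETH_RESET_SHARED_SHIFT)))

/*
 * Hypothetical reset handler following the contract stated in the RFC:
 * on return, *flags holds the components actually reset, which is equal
 * to or a subset of those requested.
 */
int example_reset(uint32_t *flags)
{
	uint32_t requested = *flags;
	uint32_t done = requested & EXAMPLE_RESETTABLE;

	/* A real driver would reset the hardware blocks in 'done' here. */

	assert((done & ~requested) == 0);	/* never more than requested */
	*flags = done;
	return 0;
}
```

A caller that asks for the PHY, the DMA engine and the shared management processor would, under these assumptions, get back only the PHY and shared-MGMT bits, which tells it that the DMA engine was left alone.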



* Re: pull request: wireless-2.6 2009-10-01
From: David Miller @ 2009-10-01 19:43 UTC
  To: linville
  Cc: linux-wireless,
	netdev,
	linux-kernel
In-Reply-To: <20091001182450.GD2895-2XuSBdqkA4R54TAoqtyWWQ@public.gmane.org>

From: "John W. Linville" <linville@tuxdriver.com>
Date: Thu, 1 Oct 2009 14:24:51 -0400

> One is a brown paper bag fix for an uninitialized variable, another is a
> USB ID.  There is a beaconing fix for the mac80211_hwsim "fake" driver,
> and a bug fix for AP mode related to buffering frames for stations in
> power save mode.
> 
> The b43 fix looks a bit long, but it is more-or-less the same simple
> fix applied in multiple places.  It addresses a bug where "the last
> bytes of data sent/received to/from PIO FIFOs on SDIO-based cards get
> 'swizzled' when its length is not multiple of 4 bytes."
> 
> Please let me know if there are problems!

Pulled, thanks a lot John.

I'll try to make sure the netif_rx_ni() et al. discussion keeps
making forward progress so that will get fixed too :-)


* Re: [PATCH] ethtool: Add space between obsolete operations and later additions
From: David Miller @ 2009-10-01 19:41 UTC
  To: bhutchings; +Cc: netdev
In-Reply-To: <1254425841.2735.9.camel@achroite>

From: Ben Hutchings <bhutchings@solarflare.com>
Date: Thu, 01 Oct 2009 20:37:21 +0100

> Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>

While a space is nice, a comment is even better! :-)


* Re: [PATCH] make TLLAO option for NA packets configurable
From: David Miller @ 2009-10-01 19:37 UTC
  To: shemminger; +Cc: opurdila, cratiu, netdev
In-Reply-To: <20091001115611.0baa77b8@s6510>

From: Stephen Hemminger <shemminger@vyatta.com>
Date: Thu, 1 Oct 2009 11:56:11 -0700

> On Thu, 1 Oct 2009 21:39:32 +0300
> Octavian Purdila <opurdila@ixiacom.com> wrote:
> 
>> On Thursday 01 October 2009 21:14:50 you wrote:
>> > 
>> > Probably this should be a per interface property rather than per namespace.
>> 
>> In our case, where we have lots of interfaces active, it would be nice to have 
>> the per namespace property as well.
> 
> The ipv6 control infrastructure already has that option. If you changed your
> patch to use a per-interface control then there would be:
> 
>   /proc/sys/net/ipv6/conf/all/force_tllao

Right, this would work a lot better.
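
As a rough illustration of the per-interface layout suggested above, the sketch below builds the sysctl path for one interface or for the special "all" entry. It assumes the control ends up being named force_tllao as in the quoted path; `example_tllao_path()` is a hypothetical helper, not part of any existing tool.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Build the path of the proposed force_tllao control for one interface
 * (or for the special "all" entry). Returns 0 on success, -1 if the
 * buffer is too small.
 */
int example_tllao_path(const char *ifname, char *buf, size_t len)
{
	int n = snprintf(buf, len, "/proc/sys/net/ipv6/conf/%s/force_tllao",
			 ifname);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

With many active interfaces, a single write to the "all" entry would then cover the use case Octavian describes, while individual interfaces could still be addressed by name.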


* [PATCH] ethtool: Add space between obsolete operations and later additions
From: Ben Hutchings @ 2009-10-01 19:37 UTC
  To: David Miller; +Cc: netdev

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
---
 include/linux/ethtool.h |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/include/linux/ethtool.h b/include/linux/ethtool.h
index 15e4eb7..9cbe5f3 100644
--- a/include/linux/ethtool.h
+++ b/include/linux/ethtool.h
@@ -499,6 +499,7 @@ struct ethtool_ops {
 	/* the following hooks are obsolete */
 	int	(*self_test_count)(struct net_device *);/* use get_sset_count */
 	int	(*get_stats_count)(struct net_device *);/* use get_sset_count */
+
 	int	(*get_rxnfc)(struct net_device *, struct ethtool_rxnfc *, void *);
 	int	(*set_rxnfc)(struct net_device *, struct ethtool_rxnfc *);
 	int     (*flash_device)(struct net_device *, struct ethtool_flash *);

-- 
Ben Hutchings, Senior Software Engineer, Solarflare Communications
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.



* Re: [RFC] pkt_sched: gen_estimator: Dont report fake rate estimators
From: David Miller @ 2009-10-01 19:37 UTC
  To: eric.dumazet; +Cc: kaber, netdev
In-Reply-To: <4AC4FE07.5070204@gmail.com>

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Thu, 01 Oct 2009 21:07:51 +0200

> We currently send TCA_STATS_RATE_EST elements to netlink users, even if no estimator
> is running.
> 
> # tc -s -d qdisc
> qdisc pfifo_fast 0: dev eth0 root bands 3 priomap  1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
>  Sent 112833764978 bytes 1495081739 pkt (dropped 0, overlimits 0 requeues 0)
>  rate 0bit 0pps backlog 0b 0p requeues 0
> 
> User has no way to tell if the "rate 0bit 0pps" is a real estimation, or a fake
> one (because no estimator is active)
> 
> After this patch, tc command output is :
> $ tc -s -d qdisc
> qdisc pfifo_fast 0: dev eth0 root bands 3 priomap  1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
>  Sent 561075 bytes 1196 pkt (dropped 0, overlimits 0 requeues 0)
>  backlog 0b 0p requeues 0
> 
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>

I'm generally fine with this idea.

The new behavior is certainly more intuitive even to me :-)

Unless there are other objections I'm ok with this and I'll apply
your final version when I start taking changes for net-next-2.6
(which is probably right after -rc3 is released).
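
The behaviour change discussed above (omit the rate element when no estimator is running, instead of printing a zero that looks like a measurement) can be modelled in a few lines of userspace C. This is a simplified sketch, not the kernel or tc code: `struct example_stats` and `example_format()` are hypothetical, and real tc output formats rates with unit suffixes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a qdisc's statistics; not a kernel structure. */
struct example_stats {
	bool est_active;	/* is a rate estimator attached? */
	unsigned int bps, pps;	/* estimator output, valid only if active */
	unsigned int backlog_bytes, backlog_pkts, requeues;
};

/* Format a "tc -s"-like status line. Returns its length, or -1 on overflow. */
int example_format(const struct example_stats *st, char *buf, size_t len)
{
	int n = 0, m;

	/* Report a rate only when an estimator actually produced one. */
	if (st->est_active)
		n = snprintf(buf, len, "rate %ubit %upps ", st->bps, st->pps);
	if (n < 0 || (size_t)n >= len)
		return -1;

	m = snprintf(buf + n, len - (size_t)n, "backlog %ub %up requeues %u",
		     st->backlog_bytes, st->backlog_pkts, st->requeues);
	if (m < 0 || (size_t)m >= len - (size_t)n)
		return -1;
	return n + m;
}
```

With `est_active` false, the line starts directly at "backlog", so a reader can tell that no rate is being estimated rather than wondering whether the rate really is zero.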


* Re: [PATCH] net: fix NOHZ: local_softirq_pending 08
From: David Miller @ 2009-10-01 19:32 UTC
  To: mb; +Cc: johannes, oliver, kalle.valo, linville, linux-wireless, netdev
In-Reply-To: <200910012110.34216.mb@bu3sch.de>

From: Michael Buesch <mb@bu3sch.de>
Date: Thu, 1 Oct 2009 21:10:32 +0200

> For the benefit of a much bigger critical section? I don't get it
> why this would be any better.

Think about what you are saying when you introduce things
like this into your code:

	if (in_interrupt())
		foo();
	else
		bar();

That thing there means you don't know anything about how you'll need
to do locking properly, because you have no idea about even the
context in which your code is executed.

Sure, you can lock for the most stringent case, but that's silly and
wasteful.


* 2.6.32-rc1-git2: Reported regressions from 2.6.31
From: Rafael J. Wysocki @ 2009-10-01 19:26 UTC
  To: Linux Kernel Mailing List
  Cc: Adrian Bunk, Andrew Morton, Linus Torvalds, Natalie Protasevich,
	Kernel Testers List, Network Development, Linux ACPI,
	Linux PM List, Linux SCSI List, Linux Wireless List, DRI

[Notes:

 * Here's the first summary report of known regressions from 2.6.31.  There
   are not too many of them at the moment, which is nice.

 * We're still getting quite a number of reports of regressions from 2.6.30 and
   it's been that way since 2.6.31 was released.  For details please see the
   summary report of regressions 2.6.30 -> 2.6.31 that will follow shortly.]

This message contains a list of some regressions from 2.6.31, for which there
are no fixes in the mainline I know of.  If any of them have been fixed already,
please let me know.

If you know of any other unresolved regressions from 2.6.31, please let me know
as well and I'll add them to the list.  Also, please let me know if any of the
entries below are invalid.

Each entry from the list will be sent additionally in an automatic reply to
this message with CCs to the people involved in reporting and handling the
issue.


Listed regressions statistics:

  Date          Total  Pending  Unresolved
  ----------------------------------------
  2009-10-02       22       15           9


Unresolved regressions
----------------------

Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14299
Subject		: oops in wireless, iwl3945 related?
Submitter	: Pavel Machek <pavel@ucw.cz>
Date		: 2009-09-29 17:12 (3 days old)
References	: http://marc.info/?l=linux-kernel&m=125424439725743&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14298
Subject		: warning at manage.c:361 (set_irq_wake), matrix-keypad related?
Submitter	: Pavel Machek <pavel@ucw.cz>
Date		: 2009-09-30 20:07 (2 days old)
References	: http://marc.info/?l=linux-kernel&m=125434130703538&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14297
Subject		: console resume broken since ba15ab0e8d
Submitter	: Sascha Hauer <s.hauer@pengutronix.de>
Date		: 2009-09-30 15:11 (2 days old)
References	: http://marc.info/?l=linux-kernel&m=125432349404060&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14296
Subject		: spitz boots but suspend/resume is broken
Submitter	: Pavel Machek <pavel@ucw.cz>
Date		: 2009-09-30 12:06 (2 days old)
References	: http://marc.info/?l=linux-kernel&m=125431244516449&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14279
Subject		: Suspend to RAM freeze totally since 2.6.32-rc1
Submitter	: Christian Casteyde <casteyde.christian@free.fr>
Date		: 2009-09-30 18:14 (2 days old)


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14277
Subject		: Caught 8-bit read from freed memory in b43 driver at association
Submitter	: Christian Casteyde <casteyde.christian@free.fr>
Date		: 2009-09-30 18:06 (2 days old)


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14276
Subject		: nfsroot will not remount rw and claims illegal options
Submitter	: Hans de Bruin <bruinjm@xs4all.nl>
Date		: 2009-09-30 15:08 (2 days old)


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14260
Subject		: T400 suspend/resume regression
Submitter	: Theodore Tso <tytso@mit.edu>
Date		: 2009-09-26 6:57 (6 days old)
References	: http://marc.info/?l=linux-kernel&m=125394827806011&w=4
Handled-By	: Rafael J. Wysocki <rjw@sisk.pl>


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14214
Subject		: BUG at drivers/scsi/scsi_lib.c:1108!
Submitter	: Plamen Petrov <pvp-lsts@fs.ru.acad.bg>
Date		: 2009-09-23 11:13 (9 days old)


Regressions with patches
------------------------

Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14302
Subject		: Kernel panic on i386 machine when booting with profile=2
Submitter	: Shi, Alex <alex.shi@intel.com>
Date		: 2009-10-01 3:23 (1 days old)
References	: http://marc.info/?l=linux-kernel&m=125436749607199&w=4
Handled-By	: Alex Shi <alex.shi@intel.com>
Patch		: http://patchwork.kernel.org/patch/50813/


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14300
Subject		: BUG_ON crash w/ ext4
Submitter	: Markus Trippelsdorf <markus@trippelsdorf.de>
Date		: 2009-10-01 1:41 (1 days old)
References	: http://marc.info/?l=linux-kernel&m=125436130800340&w=4
		  http://marc.info/?l=linux-kernel&m=125436568504914&w=4
Handled-By	: Theodore Tso <tytso@mit.edu>
Patch		: http://patchwork.kernel.org/patch/50810/


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14278
Subject		: New message "NOHZ: local_softirq_pending 08" at each ping request
Submitter	: Christian Casteyde <casteyde.christian@free.fr>
Date		: 2009-09-30 18:12 (2 days old)
Handled-By	: Michael Buesch <mb@bu3sch.de>
Patch		: http://bugzilla.kernel.org/attachment.cgi?id=23220


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14271
Subject		: ACPI boot memory leaks
Submitter	: Zdenek Kabelac <zdenek.kabelac@gmail.com>
Date		: 2009-09-29 9:18 (3 days old)
References	: http://marc.info/?l=linux-kernel&m=125421594111690&w=4
Handled-By	: Bjorn Helgaas <bjorn.helgaas@hp.com>
Patch		: http://patchwork.kernel.org/patch/50565/


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14259
Subject		: NFS problem with past 2.6.31 git tree
Submitter	: Zdenek Kabelac <zdenek.kabelac@gmail.com>
Date		: 2009-09-25 15:12 (7 days old)
References	: http://marc.info/?l=linux-kernel&m=125389156504570&w=4
Handled-By	: Trond Myklebust <Trond.Myklebust@netapp.com>
Patch		: http://patchwork.kernel.org/patch/50428/


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14247
Subject		: ACPI Exception: AE_TIME, Returned by Handler for [EmbeddedControl] flooding logs
Submitter	: Thomas Backlund <tmb@mandriva.org>
Date		: 2009-09-25 15:08 (7 days old)
References	: http://lkml.org/lkml/2009/9/25/121
Handled-By	: Alexey Starikovskiy <astarikovskiy@suse.de>
Patch		: http://patchwork.kernel.org/patch/50516/


For details, please visit the bug entries and follow the links given in
references.

As you can see, there is a Bugzilla entry for each of the listed regressions.
There also is a Bugzilla entry used for tracking the regressions from 2.6.31,
unresolved as well as resolved, at:

http://bugzilla.kernel.org/show_bug.cgi?id=14230

Please let me know if there are any Bugzilla entries that should be added to
the list in there.

Thanks,
Rafael


^ permalink raw reply

* Re: [PATCH] net: fix NOHZ: local_softirq_pending 08
From: Johannes Berg @ 2009-10-01 19:26 UTC (permalink / raw)
  To: Michael Buesch
  Cc: David Miller, oliver-fJ+pQTUTwRTk1uMJSBkQmQ,
	kalle.valo-X3B1VOXEql0, linville-2XuSBdqkA4R54TAoqtyWWQ,
	linux-wireless-u79uwXL29TY76Z2rM5mHXA,
	netdev-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <200910012110.34216.mb-fseUSCV1ubazQB+pC5nmwQ@public.gmane.org>

[-- Attachment #1: Type: text/plain, Size: 489 bytes --]

On Thu, 2009-10-01 at 21:10 +0200, Michael Buesch wrote:

> > I agree with davem, don't. Just fix the driver to local_bh_disable()
> > around the rx function if necessary.
> 
> For the benefit of a much bigger critical section? I don't get why this would be any better.

And how do you know mac80211 is actually safe with this change? It uses
tasklets too. At the very least you'd have to require drivers to never
mix & match the regular/irqsafe functions at all.

johannes

^ permalink raw reply

* Re: [PATCHv5 3/3] vhost_net: a kernel-level virtio server
From: Gregory Haskins @ 2009-10-01 19:24 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Ira W. Snyder, Michael S. Tsirkin, netdev, virtualization, kvm,
	linux-kernel, mingo, linux-mm, akpm, hpa, Rusty Russell, s.hetze,
	alacrityvm-devel
In-Reply-To: <4AC46989.7030502@redhat.com>

[-- Attachment #1: Type: text/plain, Size: 42845 bytes --]

Avi Kivity wrote:
> On 09/30/2009 10:04 PM, Gregory Haskins wrote:
> 
> 
>>> A 2.6.27 guest, or Windows guest with the existing virtio drivers,
>>> won't work
>>> over vbus.
>>>      
>> Binary compatibility with existing virtio drivers, while nice to have,
>> is not a specific requirement nor goal.  We will simply load an updated
>> KMP/MSI into those guests and they will work again.  As previously
>> discussed, this is how more or less any system works today.  It's like
>> we are removing an old adapter card and adding a new one to "uprev the
>> silicon".
>>    
> 
> Virtualization is about not doing that.  Sometimes it's necessary (when
> you have made unfixable design mistakes), but not just to replace a bus,
> with no advantages to the guest that has to be changed (other
> hypervisors or hypervisorless deployment scenarios aren't advantages).

The problem is that your continued assertion that there is no advantage
to the guest is a completely unsubstantiated claim.  As it stands right
now, I have a public git tree that, to my knowledge, is the fastest KVM
PV networking implementation around.  It also has capabilities that are
demonstrably not found elsewhere, such as the ability to render generic
shared-memory interconnects (scheduling, timers), interrupt-priority
(qos), and interrupt-coalescing (exit-ratio reduction).  I designed each
of these capabilities after carefully analyzing where KVM was coming up
short.

Those are facts.

I can't easily prove which of my new features alone is what makes it
special per se, because I don't have unit tests for each part that
break it down.  What I _can_ state is that it's the fastest and most
feature-rich KVM-PV tree that I am aware of, and others may download and
test it themselves to verify my claims.

The disproof, on the other hand, would be in a counter example that
still meets all the performance and feature criteria under all the same
conditions while maintaining the existing ABI.  To my knowledge, this
doesn't exist.

Therefore, if you believe my work is irrelevant, show me a git tree that
accomplishes the same feats in a binary compatible way, and I'll rethink
my position.  Until then, complaining about lack of binary compatibility
is pointless since it is not an insurmountable proposition, and the one
and only available solution declares it a required casualty.

> 
>>>   Further, non-shmem virtio can't work over vbus.
>>>      
>> Actually I misspoke earlier when I said virtio works over non-shmem.
>> Thinking about it some more, both virtio and vbus fundamentally require
>> shared-memory, since sharing their metadata concurrently on both sides
>> is their raison d'être.
>>
>> The difference is that virtio utilizes a pre-translation/mapping (via
>> ->add_buf) from the guest side.  OTOH, vbus uses a post translation
>> scheme (via memctx) from the host-side.  If anything, vbus is actually
>> more flexible because it doesn't assume the entire guest address space
>> is directly mappable.
>>
>> In summary, your statement is incorrect (though it is my fault for
>> putting that idea in your head).
>>    
> 
> Well, Xen requires pre-translation (since the guest has to give the host
> (which is just another guest) permissions to access the data).

Actually I am not sure that it does require pre-translation.  You might
be able to use the memctx->copy_to/copy_from scheme in post translation
as well, since those would be able to communicate to something like the
xen kernel.  But I suppose either method would result in extra exits, so
there is no distinct benefit to using vbus there; as you say below, "they're
just different".

The biggest difference is that my proposed model gets around the notion
that the entire guest address space can be represented by an arbitrary
pointer.  For instance, the copy_to/copy_from routines take a GPA, but
may use something indirect like a DMA controller to access that GPA.  On
the other hand, virtio fully expects a viable pointer to come out of the
interface iiuc.  This is in part what makes vbus more adaptable to non-virt.

> So neither is a superset of the other, they're just different.
> 
> It doesn't really matter since Xen is unlikely to adopt virtio.

Agreed.

> 
>> An interesting thing here is that you don't even need a fancy
>> multi-homed setup to see the effects of my exit-ratio reduction work:
>> even single port configurations suffer from the phenomenon since many
>> devices have multiple signal-flows (e.g. network adapters tend to have
>> at least 3 flows: rx-ready, tx-complete, and control-events (link-state,
>> etc)).  What's worse is that the flows are often indirectly related (for
>> instance, many host adapters will free tx skbs during rx operations, so
>> you tend to get bursts of tx-completes at the same time as rx-ready).  If
>> the flows map 1:1 with IDT, they will suffer the same problem.
>>    
> 
> You can simply use the same vector for both rx and tx and poll both at
> every interrupt.

Yes, but that has its own problems: e.g. additional exits or at least
additional overhead figuring out what happens each time.  This is even
more important as we scale out to MQ which may have dozens of queue
pairs.  You really want finer grained signal-path decode if you want
peak performance.

> 
>> In any case, here is an example run of a simple single-homed guest over
>> standard GigE.  What's interesting here is the .qnotify to .notify
>> ratio, as this is the interrupt-to-signal ratio.  In this case, it's
>> 151918/170047, which comes out to about 11% savings in interrupt
>> injections:
>>
>> vbus-guest:/home/ghaskins # netperf -H dev
>> TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to
>> dev.laurelwood.net (192.168.1.10) port 0 AF_INET
>> Recv   Send    Send
>> Socket Socket  Message  Elapsed
>> Size   Size    Size     Time     Throughput
>> bytes  bytes   bytes    secs.    10^6bits/sec
>>
>> 1048576  16384  16384    10.01     940.77
>> vbus-guest:/home/ghaskins # cat /sys/kernel/debug/pci-to-vbus-bridge
>>    .events                        : 170048
>>    .qnotify                       : 151918
>>    .qinject                       : 0
>>    .notify                        : 170047
>>    .inject                        : 18238
>>    .bridgecalls                   : 18
>>    .buscalls                      : 12
>> vbus-guest:/home/ghaskins # cat /proc/interrupts
>>              CPU0
>>     0:         87   IO-APIC-edge      timer
>>     1:          6   IO-APIC-edge      i8042
>>     4:        733   IO-APIC-edge      serial
>>     6:          2   IO-APIC-edge      floppy
>>     7:          0   IO-APIC-edge      parport0
>>     8:          0   IO-APIC-edge      rtc0
>>     9:          0   IO-APIC-fasteoi   acpi
>>    10:          0   IO-APIC-fasteoi   virtio1
>>    12:         90   IO-APIC-edge      i8042
>>    14:       3041   IO-APIC-edge      ata_piix
>>    15:       1008   IO-APIC-edge      ata_piix
>>    24:     151933   PCI-MSI-edge      vbus
>>    25:          0   PCI-MSI-edge      virtio0-config
>>    26:        190   PCI-MSI-edge      virtio0-input
>>    27:         28   PCI-MSI-edge      virtio0-output
>>   NMI:          0   Non-maskable interrupts
>>   LOC:       9854   Local timer interrupts
>>   SPU:          0   Spurious interrupts
>>   CNT:          0   Performance counter interrupts
>>   PND:          0   Performance pending work
>>   RES:          0   Rescheduling interrupts
>>   CAL:          0   Function call interrupts
>>   TLB:          0   TLB shootdowns
>>   TRM:          0   Thermal event interrupts
>>   THR:          0   Threshold APIC interrupts
>>   MCE:          0   Machine check exceptions
>>   MCP:          1   Machine check polls
>>   ERR:          0
>>   MIS:          0
>>
>> It's important to note here that we are actually looking at the interrupt
>> rate, not the exit rate (which is usually a multiple of the interrupt
>> rate, since you have to factor in as many as three exits per interrupt
>> (IPI, window, EOI)).  Therefore we saved about 18k interrupts in this 10
>> second burst, but we may have actually saved up to 54k exits in the
>> process. This is only over a 10 second window at GigE rates, so YMMV.
>> These numbers get even more dramatic on higher end hardware, but I
>> haven't had a chance to generate new numbers yet.
>>    
> 
> (irq window exits should only be required on a small percentage of
> interrupt injections, since the guest will try to disable interrupts for
> short periods only)

Good point. You are probably right. Certainly the other 2 remain, however.

Ultimately, the fastest exit is the one you do not take.  That is what I
am trying to achieve.
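
For anyone who wants to recheck the savings quoted above, the arithmetic
can be sketched as follows.  This is only a sketch: it assumes that .notify
counts signals raised and .qnotify counts interrupts actually injected (as
the quoted text implies), and the variable names and the worst-case factor
of three exits per interrupt are taken from the discussion, not from any
code.

```python
# Counters copied from the /sys/kernel/debug/pci-to-vbus-bridge dump above.
notify = 170047    # signals raised (assumed meaning of .notify)
qnotify = 151918   # interrupts injected (assumed meaning of .qnotify)

# Worst case discussed in the thread: IPI, window and EOI exits per interrupt.
MAX_EXITS_PER_INTERRUPT = 3

saved_interrupts = notify - qnotify
savings_ratio = saved_interrupts / notify
max_saved_exits = saved_interrupts * MAX_EXITS_PER_INTERRUPT

print(f"interrupts saved: {saved_interrupts}")
print(f"savings: {savings_ratio:.1%}")
print(f"upper bound on exits saved: {max_saved_exits}")
```

That works out to roughly 18k interrupts (about 11%) and at most about 54k
exits over the run.  If window exits are rare, as suggested above, the real
exit savings would sit closer to two per interrupt than three.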

> 
>> Looking at some external stats paints an even bleaker picture: "exits"
>> as reported by kvm_stat for virtio-pci based virtio-net tip the scales
>> at 65k/s vs 36k/s for vbus based venet.  And virtio is consuming ~30% of
>> my quad-core's cpu, vs 19% for venet during the test.  It's hard to know
>> which innovation or innovations may be responsible for the entire
>> reduction, but certainly the interrupt-to-signal ratio mentioned above
>> is probably helping.
>>    
> 
> Can you please stop comparing userspace-based virtio hosts to
> kernel-based venet hosts?  We know the userspace implementation sucks.

Sorry, but it's all I have right now.  Last time I tried vhost it
required a dedicated adapter which was a non-starter for my lab rig
since I share it with others.  I didn't want to tear apart the bridge
setup, especially since mst told me the performance was worse than
userspace.  Therefore, there was no real point in working hard to get it
running.  I figured I would wait till the config and performance issues
were resolved and there is a git tree to pull in.

> 
>> The even worse news for 1:1 models is that the ratio of
>> exits-per-interrupt climbs with load (exactly when it hurts the most)
>> since that is when the probability that the vcpu will need all three
>> exits is the highest.
>>    
> 
> Requiring all three exits means the guest is spending most of its time
> with interrupts disabled; that's unlikely.

(see "softirqs" above)

> 
> Thanks for the numbers.  Are those 11% attributable to rx/tx
> piggybacking from the same interface?

It's hard to tell, since I am not instrumented to discern the difference
in this run.  I do know from previous traces on the 10GE rig that the
Chelsio T3 that I am running reaps the pending-tx ring at the same time
as rx polling, so it's very likely that both events are often
coincident at least there.

> 
> Also, 170K interrupts -> 17K interrupts/sec -> 55kbit/interrupt ->
> 6.8kB/interrupt.  Ignoring interrupt merging and assuming equal rx/tx
> distribution, that's about 13kB/interrupt.  Seems rather low for a
> saturated link.

I am not following: Do you suspect that I have too few interrupts to
represent 940Mb/s, or that I have too little data/interrupt and this
ratio should be improved?
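
For reference, the per-interrupt figures in the quoted estimate can be
reproduced from the run above.  This is a rough sketch under stated
assumptions: it uses the .notify count as "interrupts" (as the quoted text
does), takes kB to mean 1024 bytes, and the variable names are made up for
illustration.

```python
# Figures taken from the netperf run and bridge counters quoted earlier.
interrupts = 170047          # .notify count over the whole run
elapsed_s = 10.01            # netperf elapsed time, seconds
throughput_bps = 940.77e6    # netperf throughput, bits per second

interrupts_per_s = interrupts / elapsed_s
bits_per_interrupt = throughput_bps / interrupts_per_s
kib_per_interrupt = bits_per_interrupt / 8 / 1024

# Assumption from the quoted text: rx and tx interrupts are equally
# distributed, so only half of the interrupts carry received payload.
kib_per_rx_interrupt = 2 * kib_per_interrupt

print(f"{interrupts_per_s:.0f} interrupts/s")
print(f"{bits_per_interrupt / 1000:.1f} kbit/interrupt")
print(f"{kib_per_interrupt:.2f} KiB/interrupt")
print(f"{kib_per_rx_interrupt:.2f} KiB per rx interrupt")
```

Using the injected count (.qnotify) instead of .notify would give a
somewhat larger amount of data per interrupt, so the choice of counter
matters when comparing against other setups.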

> 
>>>     
>>>> and prioritizable/nestable signals.
>>>>
>>>>        
>>> That doesn't belong in a bus.
>>>      
>> Everyone is of course entitled to an opinion, but the industry as a
>> whole would disagree with you.  Signal path routing (1:1, aggregated,
>> etc) is at the discretion of the bus designer.  Most buses actually do
>> _not_ support 1:1 with IDT (think USB, SCSI, IDE, etc).
>>    
> 
> With standard PCI, they do not.  But all modern host adapters support
> MSI and they will happily give you one interrupt per queue.

While MSI is a good technological advancement for PCI, I was referring
to signal:IDT ratio.  MSI would still classify as 1:1.

> 
>> PCI is somewhat of an outlier in that regard afaict.  It's actually a
>> nice feature of PCI when it's used within its design spec (HW).  For
>> SW/PV, 1:1 suffers from, among other issues, that "triple-exit scaling"
>> issue in the signal path I mentioned above.  This is one of the many
>> reasons I think PCI is not the best choice for PV.
>>    
> 
> Look at the vmxnet3 submission (recently posted on virtualization@). 
> It's a perfectly ordinary PCI NIC driver, apart from having so many 'V's
> in the code.  16 rx queues, 8 tx queues, 25 MSIs, BARs for the
> registers.  So while the industry as a whole might disagree with me, it
> seems VMware does not.

At the very least, using BARs for the registers is worrisome, but I will
reserve judgment until I see the numbers and review the code.

> 
> 
>>>> http://developer.novell.com/wiki/images/b/b7/31-rc4_throughput.png
>>>>
>>>>        
>>> That's a red herring.  The problem is not with virtio as an ABI, but
>>> with its implementation in userspace.  vhost-net should offer equivalent
>>> performance to vbus.
>>>      
>> That's pure speculation.  I would advise you to reserve such statements
>> until after a proper bakeoff can be completed.
> 
> Let's do that then.  Please reserve the corresponding comparisons from
> your side as well.

That is quite the odd request.  My graphs are all built using readily
available code and open tools and do not speculate as to what someone
else may come up with in the future.  They reflect what is available
today.  Do you honestly think I should wait indefinitely for a competing
idea to try to catch up before I talk about my results?  That's
certainly an interesting perspective.

With all due respect, the only red herring is your unsubstantiated
claim that my results do not matter.

> 
>> This is not to mention
>> that vhost-net does nothing to address our other goals, like scheduler
>> coordination and non-802.x fabrics.
>>    
> 
> What are scheduler coordination and non-802.x fabrics?

We are working on real-time, IB and QOS, for example, in addition to
the now well known 802.x venet driver.

> 
>>> Right, when you ignore the points where they don't fit, it's a perfect
>>> mesh.
>>>      
>> Where doesn't it fit?
>>    
> 
> (avoiding infinite loop)

I'm serious.  Where doesn't it fit?  Point me at a URL if it's already
discussed.

> 
>>>>> But that's not a strong argument for vbus; instead of adding vbus you
>>>>> could make virtio more friendly to non-virt
>>>>>
>>>>>          
>>>> Actually, it _is_ a strong argument then because adding vbus is what
>>>> helps make virtio friendly to non-virt, at least for when performance
>>>> matters.
>>>>
>>>>        
>>> As vhost-net shows, you can do that without vbus
>>>      
>> Citation please.  Afaict, the one use case that we looked at for vhost
>> outside of KVM failed to adapt properly, so I do not see how this is
>> true.
>>    
> 
> I think Ira said he can make vhost work?
> 

Not exactly.  It kind of works for 802.x only (albeit awkwardly) because
there is no strong distinction between "resource" and "consumer" with
ethernet.  So you can run it inverted without any serious consequences
(at least, not from consequences of the inversion).  Since the x86
boards are the actual resource providers in his system, other device
types will fail to map to the vhost model properly, like disk-io or
consoles for instance.

>>> and without breaking compatibility.
>>>      
>> Compatibility with what?  vhost hasn't even been officially deployed in
>> KVM environments afaict, never mind non-virt.  Therefore, how could it
>> possibly have compatibility constraints with something non-virt already?
>>   Citation please.
>>    
> 
> virtio-net over pci is deployed.  Replacing the backend with vhost-net
> will require no guest modifications.

That _is_ a nice benefit, I agree.  I just do not agree it's a hard
requirement.

>  Replacing the frontend with venet or virt-net/vbus-pci will require guest modifications.

Understood, and I am ok with that.  I think it's necessary to gain
critical performance enhancing features, and I think it will help in the
long term to support more guests.  I have not yet been proven wrong.

> 
> Obviously virtio-net isn't deployed in non-virt.  But if we adopt vbus,
> we have to migrate guests.

As a first step, let's just shoot for "support" instead of "adopt".

I'll continue to push patches to you that help interfacing with the guest
in a vbus neutral way (like irqfd/ioeventfd) and we can go from there.
Are you open to this work assuming it passes normal review cycles, etc?
It would presumably be of use to others that want to interface to a
guest (e.g. vhost) as well.

> 
> 
> 
>>> Of course there is such a thing as native, a pci-ready guest has tons of
>>> support built into it
>>>      
>> I specifically mentioned that already ([1]).
>>
>> You are also overstating its role, since the basic OS is what implements
>> the native support for bus-objects, hotswap, etc, _not_ PCI.  PCI just
>> rides underneath and feeds trivial events up, as do other bus-types
>> (usb, scsi, vbus, etc).
> 
> But we have to implement vbus for each guest we want to support.  That
> includes Windows and older Linux which has a different internal API, so
> we have to port the code multiple times, to get existing functionality.

Perhaps, but in reality it's not very bad.  The Windows driver will
already support any recent version that matters (at least back to
2000/XP), and the Linux side doesn't do anything weird so I know it
works at least back to 2.6.16 iirc, and probably further.

> 
>> And once those events are fed, you still need a
>> PV layer to actually handle the bus interface in a high-performance
>> manner so it's not like you really have a "native" stack in either case.
>>    
> 
> virtio-net doesn't use any pv layer.

Well, it does when you really look closely at how it works.  For one, it
has the virtqueues library that would be (or at least _should be_)
common for all virtio-X adapters, etc etc.  Even if this layer is
collapsed into each driver on the Windows platform, it's still there
nonetheless.

> 
>>> that doesn't need to be retrofitted.
>>>      
>> No, that is incorrect.  You have to heavily modify the pci model with
>> layers on top to get any kind of performance out of it.  Otherwise, we
>> would just use realtek emulation, which is technically the native PCI
>> you are apparently so enamored with.
>>    
> 
> virtio-net doesn't modify the PCI model.

Sure it does.  It doesn't use MMIO/PIO BARs for registers, it uses
vq->kick().  It doesn't use pci-config-space, it uses virtio->features.
It doesn't use PCI interrupts, it uses a callback on the vq, etc.
You would never use raw "registers", as the exit rate would crush you.
You would never use raw interrupts, as you need a shared-memory based
mitigation scheme.

IOW: Virtio has a device model layer that tunnels over PCI.  It doesn't
actually use PCI directly.  This is in fact what allows the Linux
version to work over lguest, s390 and vbus in addition to PCI.

>  And if you look at vmxnet3,
> they mention that it conforms to something called UPT, which allows
> hardware vendors to implement parts of their NIC model.  So vmxnet3 is
> apparently suitable to both hardware and software implementations.
> 

That's interesting and all, but the charter for vbus is for optimal
software-to-software interfaces to a linux host, so I don't mind if my
spec doesn't look conducive to a hardware implementation.  As it turns
out, I'm sure it would work there as well, but some of the optimizations
wouldn't matter as much since hardware behaves differently.

>> Not to mention there are things you just plain can't do in PCI today,
>> like dynamically assign signal-paths,
> 
> You can have dynamic MSI/queue routing with virtio, and each MSI can be
> routed to a vcpu at will.

Can you arbitrarily create a new MSI/queue on a per-device basis on the
fly?   We want to do this for some upcoming designs.  Or do you need to
predeclare the vectors when the device is hot-added?

> 
>> priority, and coalescing, etc.
>>    
> 
> Do you mean interrupt priority?  Well, apic allows interrupt priorities
> and Windows uses them; Linux doesn't.  I don't see a reason to provide
> more than native hardware.

The APIC model is not optimal for PV given the exits required for a
basic operation like an interrupt injection, and has scaling/flexibility
issues with its 16:16 priority mapping.

OTOH, you don't necessarily want to rip it out because of all the
additional features it has like the IPI facility and the handling of
many low-performance data-paths.  Therefore, I am of the opinion that
the optimal placement for advanced signal handling is directly at the
bus that provides the high-performance resources.  I could be convinced
otherwise with a compelling argument, but I think this is the path of
least resistance.

> 
>>> Since
>>> practically everyone (including Xen) does their paravirt drivers atop
>>> pci, the claim that pci isn't suitable for high performance is
>>> incorrect.
>>>      
>> Actually IIUC, I think Xen bridges to their own bus as well (and only
>> where they have to), just like vbus.  They don't use PCI natively.  PCI
>> is perfectly suited as a bridge transport for PV, as I think the Xen and
>> vbus examples have demonstrated.  It's the 1:1 device-model where PCI has
>> the most problems.
>>    
> 
> N:1 breaks down on large guests since one vcpu will have to process all
> events.

Well, first of all that is not necessarily true.  Some high performance
buses like SCSI and FC work fine with an aggregated model, so it's not a
foregone conclusion that aggregation kills SMP IO performance.  This is
especially true when you add coalescing on top, like AlacrityVM does.

I do agree that other subsystems, like networking for instance, may
sometimes benefit from flexible signal-routing because of multiqueue,
etc, for particularly large guests.  However, the decision to make the
current kvm-connector used in AlacrityVM aggregate one priority FIFO per
IRQ was an intentional design tradeoff.  My experience with my target
user base is that these data-centers are typically deploying 1-4 vcpu
guests, so I optimized for that.  YMMV, so we can design a different
connector, or a different mode of the existing connector, to accommodate
large guests as well if that is desirable.

> You could do N:M, with commands to change routings, but where's
> your userspace interface?

Well, we should be able to add that when/if it's needed.  I just don't
think the need is there yet.  KVM tops out at 16 IIUC anyway.

> you can't tell from /proc/interrupts which
> vbus interrupts are active

It should be trivial to add some kind of *fs display.  I will fix this
shortly.

> and irqbalance can't steer them towards less
> busy cpus since they're invisible to the interrupt controller.

(see N:M above)

> 
> 
>>>> And lastly, why would you _need_ to use the so called "native"
>>>> mechanism?  The short answer is, "you don't".  Any given system (guest
>>>> or bare-metal) already has a wide range of buses (try running "tree
>>>> /sys/bus" in Linux).  More importantly, the concept of adding new buses
>>>> is widely supported in both the Windows and Linux driver model (and
>>>> probably any other guest-type that matters).  Therefore, despite claims
>>>> to the contrary, it's not hard or even unusual to add a new bus to the
>>>> mix.
>>>>
>>>>        
>>> The short answer is "compatibility".
>>>      
>> There was a point in time where the same could be said for virtio-pci
>> based drivers vs realtek and e1000, so that argument is demonstrably
>> silly.  No one tried to make virtio work in a binary compatible way with
>> realtek emulation, yet we all survived the requirement for loading a
>> virtio driver to my knowledge.
>>    
> 
> The larger your installed base, the more difficult it is.  Of course
> it's doable, but I prefer not doing it and instead improving things in a
> binary backwards compatible manner.  If there is no choice we will bow
> to the inevitable and make our users upgrade.  But at this point there
> is a choice, and I prefer to stick with vhost-net until it is proven
> that it won't work.

Fair enough.  But note you are likely going to need to respin your
existing drivers anyway to gain peak performance, since there are known
shortcomings in the virtio-pci ABI today (like queue identification in
the interrupt hotpath) as it stands.  So that pain is coming one way or
the other.

> 
>> The bottom line is: Binary device compatibility is not required in any
>> other system (as long as you follow sensible versioning/id rules), so
>> why is KVM considered special?
>>    
> 
> One of the benefits of virtualization is that the guest model is
> stable.  You can live-migrate guests and upgrade the hardware
> underneath.  You can have a single guest image that you clone to
> provision new guests.  If you switch to a new model, you give up those
> benefits, or you support both models indefinitely.

I understand what you are saying, but I don't buy it.  If you add a new
feature to an existing model even without something as drastic as a new
bus, you still have the same exact dilemma:  The migration target needs
feature parity with consumed features in the guest.  It's really the same
no matter what unless you never add guest-visible features.

> 
> Note even hardware nowadays is binary compatible.  One e1000 driver
> supports a ton of different cards, and I think (not sure) newer cards
> will work with older drivers, just without all their features.

Noted, but that is not really the same thing.  That's more like adding a
feature bit to virtio, not replacing GigE with 10GE.

> 
>> The fact is, it isn't special (at least not in this regard).  What _is_
>> required is "support" and we fully intend to support these proposed
>> components.  I assure you that at least the users that care about
>> maximum performance will not generally mind loading a driver.  Most of
>> them would have to anyway if they want to get beyond realtek emulation.
>>    
> 
> For a new install, sure.  I'm talking about existing deployments (and
> those that will exist by the time vbus is ready for roll out).

The user will either specify "-net nic,model=venet", or they won't.  It's
their choice.  Changing those parameters, vbus or otherwise, has
ramifications w.r.t. what drivers must be loaded, and the user will
understand this.

> 
>> I am certainly in no position to tell you how to feel, but this
>> declaration would seem from my perspective to be more of a means to an
>> end than a legitimate concern.  Otherwise we would never have had virtio
>> support in the first place, since it was not "compatible" with previous
>> releases.
>>    
> 
> virtio was certainly not pain free, needing Windows drivers, updates to
> management tools (you can't enable it by default, so you have to offer
> it as a choice), mkinitrd, etc.  I'd rather not have to go through that
> again.

No general argument here, other than to reiterate that the driver is
going to have to be redeployed anyway, since it will likely need new
feature bits to fix the current ABI.

> 
>>>   Especially if the device changed is your boot disk.
>>>      
>> If and when that becomes a priority concern, that would be a function
>> transparently supported in the BIOS shipped with the hypervisor, and
>> would thus be invisible to the user.
>>    
> 
> No, you have to update the driver in your initrd (for Linux)

That's fine, the distros generally do this automatically when you load
the updated KMP package.

> or properly install the new driver (for Windows).  It's especially
> difficult for Windows.

What is difficult here?  I never seem to have any problems and I have
all kinds of guests from XP to Win7.

>>>   You may not care about the pain caused to users, but I do, so I will
>>> continue to insist on compatibility.
>>>      
>> For the users that don't care about maximum performance, there is no
>> change (and thus zero pain) required.  They can use realtek or virtio if
>> they really want to.  Neither is going away to my knowledge, and lets
>> face it: 2.6Gb/s out of virtio to userspace isn't *that* bad.  But "good
>> enough" isn't good enough, and I won't rest till we get to native
>> performance.
> 
> I don't want to support both virtio and vbus in parallel.  There's
> enough work already.

Until I find some compelling reason that indicates I was wrong about all
of this, I will continue building a community around the vbus code base
and developing support for its components anyway.  So that effort is
going to happen in parallel regardless.

This is purely a question about whether you will work with me to make
vbus an available option in upstream KVM or not.

>  If we adopt vbus, we'll have to deprecate and eventually kill off virtio.

That's more hyperbole.  virtio is technically fine and complementary as
it is.  No one says you have to do anything drastic w.r.t. virtio.  If
you _did_ adopt vbus, perhaps you would want to optionally deprecate
vhost or possibly the virtio-pci adapter, but that is about it.  The
rest of the infrastructure should be preserved if it was designed properly.

> 
>> 2) True pain to users is not caused by lack of binary compatibility.
>> Its caused by lack of support.  And its a good thing or we would all be
>> emulating 8086 architecture forever...
>>
>> ..oh wait, I guess we kind of do that already ;).  But at least we can
>> slip in something more advanced once in a while (APIC vs PIC, USB vs
>> uart, iso9660 vs floppy, for instance) and update the guest stack
>> instead of insisting it must look like ISA forever for compatibility's
>> sake.
>>    
> 
> PCI is continuously updated, with MSI, MSI-X, and IOMMU support being
> some recent updates.  I'd like to ride on top of that instead of having
> to clone it for every guest I support.

While a noble goal, one of the points I keep making though, as someone
who has built the stack both ways, is almost none of the PCI stack is
actually needed to get the PV job done.  The part you do need is
primarily a function of the generic OS stack and trivial to interface
with anyway.

Plus, as a lesser point: it doesn't work everywhere so you end up
solving the same kind of vbus-like design problem again and again when
PCI is missing.

> 
>>> So we have: vbus needs a connector, vhost needs a connector.  vbus
>>> doesn't need userspace to program the addresses (but does need userspace
>>> to instantiate the devices and to program the bus address decode)
>>>      
>> First of all, bus-decode is substantially easier than per-device decode
>> (you have to track all those per-device/per-signal fds somewhere,
>> integrate with hotswap, etc), and its only done once per guest at
>> startup and left alone.  So its already not apples to apples.
>>    
> 
> Right, it means you can hand off those eventfds to other qemus or other
> pure userspace servers.  It's more flexible.
> 
>> Second, while its true that the general kvm-connector bus-decode needs
>> to be programmed,  that is a function of adapting to the environment
>> that _you_ created for me.  The original kvm-connector was discovered
>> via cpuid and hypercalls, and didn't need userspace at all to set it up.
>>   Therefore it would be entirely unfair of you to turn around and somehow
>> try to use that trait of the design against me since you yourself
>> imposed it.
>>    
> 
> No kvm feature will ever be exposed to a guest without userspace
> intervention.  It's a basic requirement.  If it causes complexity (and
> it does) we have to live with it.

Right.  cpuid is exposed by userspace, so that was the control point in
the original design.  The presence of the PCI_BRIDGE in the new code
(again exported by userspace) is what controls it now.  From there,
there are various mechanisms we employ to control what features the
guest may see, such as the sysfs/attribute system, the revision of the
bridge, and the feature bits that it and its subordinate devices expose.

> 
>>>   Does it work on Windows?
>>>      
>> This question doesn't make sense.  Hotswap control occurs on the host,
>> which is always Linux.
>>
>> If you were asking about whether a windows guest will support hotswap:
>> the answer is "yes".  Our windows driver presents a unique PDO/FDO pair
>> for each logical device instance that is pushed out (just like the built
>> in usb, pci, scsi bus drivers that windows supports natively).
>>    
> 
> Ah, you have a Windows venet driver?

Almost.  It's WIP, but hopefully soon, along with core support for the
bus, etc.

> 
> 
>>>> As an added bonus, its device-model is modular.  A developer can
>>>> write a
>>>> new device model, compile it, insmod it to the host kernel, hotplug it
>>>> to the running guest with mkdir/ln, and then come back out again
>>>> (hotunplug with rmdir, rmmod, etc).  They may do this all without
>>>> taking
>>>> the guest down, and while eating QEMU based IO solutions for breakfast
>>>> performance wise.
>>>>
>>>> Afaict, qemu can't do either of those things.
>>>>
>>>>        
>>> We've seen that herring before,
>>>      
>> Citation?
>>    
> 
> It's the compare venet-in-kernel to virtio-in-userspace thing again.

No, you said KVM has "userspace hotplug".  I retorted that vbus not only
has hotplug, it also has a modular architecture.  You then countered
that this feature is a red-herring.  If this was previously discussed
and rejected for some reason, I would like to know the history.  Or did
I misunderstand you?

Or if you are somehow implying that the lack of modularity has to do
with virtio-in-userspace, I beg to differ.  Even with vhost, you still
have to have a paired model in qemu, so it will not be a modular
architecture by virtue of the vhost patch series either.  You would need
qemu to support modular devices as well, which I've been told isn't
going to happen any time soon.

> Let's defer that until mst completes vhost-net mergeable buffers, at which
> time we can compare vhost-net to venet and see how much vbus contributes
> to performance and how much of it comes from being in-kernel.

I look forward to it.

> 
>>>>> Refactor instead of duplicating.
>>>>>
>>>>>          
>>>> There is no duplicating.  vbus has no equivalent today as virtio
>>>> doesn't
>>>> define these layers.
>>>>
>>>>        
>>> So define them if they're missing.
>>>      
>> I just did.
>>    
> 
> Since this is getting confusing to me, I'll start from scratch looking
> at the vbus layers, top to bottom:

I wouldn't describe it like this

> 
> Guest side:
> 1. venet guest kernel driver - AFAICT, duplicates the virtio-net guest
> driver functionality
> 2. vbus guest driver (config and hotplug) - duplicates pci, or if you
> need non-pci support, virtio config and its pci bindings; needs
> reimplementation for all supported guests
> 3. vbus guest driver (interrupt coalescing, priority) - if needed,
> should be implemented as an irqchip (and be totally orthogonal to the
> driver); needs reimplementation for all supported guests
> 4. vbus guest driver (shm/ioq) - finer grained layering than virtio
> (which only supports the combination, due to the need for Xen support);
> can be retrofitted to virtio at some cost
> 
> Host side:
> 1. venet host kernel driver - is duplicated by vhost-net; doesn't
> support live migration, unprivileged users, or slirp
> 2. vbus host driver (config and hotplug) - duplicates pci support in
> userspace (which will need to be kept in any case); already has two
> userspace interfaces
> 3. vbus host driver (interrupt coalescing, priority) - if we think we
> need it (and I don't), should be part of kvm core, not a bus
> 4. vbus host driver (shm) - partially duplicated by vhost memory slots
> 5. vbus host driver (ioq) - duplicates userspace virtio, duplicated by
> vhost

For one, we have the common layer of shm-signal, and IOQ.  These
libraries were designed to be reused on both sides of the link.
Generally shm-signal has no counterpart in the existing model, though
its functionality is integrated into the virtqueue.  IOQ is duplicated
by virtqueue, but I think it's a better design at least in this role, so
I use it pervasively throughout the stack.  We can discuss that in a
separate thread.

From there, going down the stack, it looks like

    (guest-side)
|-------------------------
| venet (competes with virtio-net)
|-------------------------
| vbus-proxy (competes with pci-bus, config+hotplug, sync/async)
|-------------------------
| vbus-pcibridge (interrupt coalescing + priority, fastpath)
|-------------------------
           |
|-------------------------
| vbus-kvmconnector (interrupt coalescing + priority, fast-path)
|-------------------------
| vbus-core (hotplug, address decoding, etc)
|-------------------------
| venet-device (ioq frame/deframe to tap/macvlan/vmdq, etc)
|-------------------------

If you want to use virtio, insert a virtio layer between the "driver"
and "device" components at the outer edges of the stack.

> 
>>>> There is no rewriting.  vbus has no equivalent today as virtio doesn't
>>>> define these layers.
>>>>
>>>> By your own admission, you said if you wanted that capability, use a
>>>> library.  What I think you are not understanding is vbus _is_ that
>>>> library.  So what is the problem, exactly?
>>>>
>>>>        
>>> It's not compatible.
>>>      
>> No, that is incorrect.  What you are apparently not understanding is
>> that not only is vbus that library, but its extensible.  So even if
>> compatibility is your goal (it doesn't need to be IMO) it can be
>> accommodated by how you interface to the library.
>>    
> 
> To me, compatible means I can live migrate an image to a new system
> without the user knowing about the change.  You'll be able to do that
> with vhost-net.

As soon as you add any new guest-visible feature, you are in the same
exact boat.

> 
>>>>>
>>>>>          
>>>> No, it does not.  vbus just needs a relatively simple single message
>>>> pipe between the guest and host (think "hypercall tunnel", if you
>>>> will).
>>>>
>>>>        
>>> That's ioeventfd.  So far so similar.
>>>      
>> No, that is incorrect.  For one, vhost uses them on a per-signal path
>> basis, whereas vbus only has one channel for the entire guest->host.
>>    
> 
> You'll probably need to change that as you start running smp guests.

The hypercall channel is already SMP optimized over a single PIO path,
so I think we are covered there.  See "fastcall" in my code for details:

http://git.kernel.org/?p=linux/kernel/git/ghaskins/alacrityvm/linux-2.6.git;a=blob;f=drivers/vbus/pci-bridge.c;h=81f7cdd2167ae2f53406850ebac448a2183842f2;hb=fd1c156be7735f8b259579f18268a756beccfc96#l102

It just passes the cpuid into the PIO write so we can have parallel,
lockless "hypercalls".  This forms the basis of our guest scheduler
support, for instance.
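
As a rough illustration of why carrying the cpu id in the PIO write
avoids locking, here is a sketch that gives each CPU a private
descriptor slot.  It is not taken from pci-bridge.c: the slot layout,
MAX_CPUS, and the function names are assumptions made purely for this
example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_CPUS 8	/* hypothetical limit for the sketch */

/* Hypothetical per-CPU call descriptor shared between guest and host. */
struct fastcall_desc {
	uint32_t call_id;
	uint64_t arg;
};

static struct fastcall_desc slots[MAX_CPUS];

/*
 * Guest side: fill this CPU's private slot and return the value that
 * would be written to the single PIO port (here simply the cpu index).
 * No lock is needed because no other CPU ever touches slots[cpu].
 */
uint32_t fastcall_prepare(unsigned int cpu, uint32_t call_id, uint64_t arg)
{
	assert(cpu < MAX_CPUS);
	slots[cpu].call_id = call_id;
	slots[cpu].arg = arg;
	return cpu;
}

/* Host side: the PIO value alone identifies which descriptor to service. */
const struct fastcall_desc *fastcall_lookup(uint32_t pio_value)
{
	assert(pio_value < MAX_CPUS);
	return &slots[pio_value];
}
```

Two CPUs calling fastcall_prepare() at the same time write to different
slots, so the single PIO path does not serialize them.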

> 
>> Second, I do not use ioeventfd anymore because it has too many problems
>> with the surrounding technology.  However, that is a topic for a
>> different thread.
>>    
> 
> Please post your issues.  I see ioeventfd/irqfd as critical kvm interfaces.

Will do.  It would be nice to come back to this interface.

> 
>>> vbus devices aren't magically instantiated.  Userspace needs to
>>> instantiate them too.  Sure, there's less work on the host side since
>>> you're using vbus instead of the native interface, but more work on the
>>> guest side since you're using vbus instead of the native interface.
>>>      
>>
>> No, that is incorrect.  The amount of "work" that a guest does is
>> actually the same in both cases, since the guest OS performs the hotswap
>> handling natively for all bus types (at least for Linux and Windows).
>> You still need to have a PV layer to interface with those objects in
>> both cases, as well, so there is no such thing as "native interface" for
>> PV.  Its only a matter of where it occurs in the stack.
>>    
> 
> I'm missing something.  Where's the pv layer for virtio-net?

covered above

> 
> Linux drivers have an abstraction layer to deal with non-pci.  But the
> Windows drivers are ordinary pci drivers with nothing that looks
> pv-ish.

They certainly do not have to be since Windows supports a similar notion
as the LDM in Linux.  In fact, we are exploiting that Windows facility
in our drivers.  It's rather unfortunate if it's true that your
drivers were not designed this way, since virtio has a rather nice
stack model on Linux that could work in Windows as well.

>  You could implement virtio-net hardware if you wanted to.

Technically you could build vbus in hardware too, I suppose, since the
bridge is PCI compliant.  I would never advocate it, however, since many
of our tricks do not matter if it's real hardware (e.g. they are
optimized for the costs associated with VM).

> 
>>>   non-privileged-user capable?
>>>      
>> The short answer is "not yet (I think)".  I need to write a patch to
>> properly set the mode attribute in sysfs, but I think this will be
>> trivial.
>>
>>    
> 
> (and selinux label)

If any of these things are problems, they can simply be exposed via
the new ioctl admin interface, I suppose.

> 
>>> Ah, so you have two control planes.
>>>      
>> So what?  If anything, it goes to show how extensible the framework is
>> that a new plane could be added in 119 lines of code:
>>
>> ~/git/linux-2.6>  stg show vbus-add-admin-ioctls.patch | diffstat
>>   Makefile       |    3 -
>>   config-ioctl.c |  117
>> +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>   2 files changed, 119 insertions(+), 1 deletion(-)
>>
>> if and when having two control planes exceeds its utility, I will submit
>> a simple patch that removes the useless one.
>>    
> 
> It always begins with a 119-line patch and then grows, that's life.
> 

I can't argue with that.

>>> kvm didn't have an existing counterpart in Linux when it was
>>> proposed/merged.
>>>      
>> And likewise, neither does vbus.
>>
>>    
> 
> For virt uses, I don't see the need.  For non-virt, I have no opinion.
> 
> 

Well, I hope to change your mind on both counts, then.

Kind Regards,
-Greg

^ permalink raw reply

* Re: [PATCH] net: fix NOHZ: local_softirq_pending 08
From: Michael Buesch @ 2009-10-01 19:10 UTC (permalink / raw)
  To: Johannes Berg
  Cc: David Miller, oliver-fJ+pQTUTwRTk1uMJSBkQmQ,
	kalle.valo-X3B1VOXEql0, linville-2XuSBdqkA4R54TAoqtyWWQ,
	linux-wireless-u79uwXL29TY76Z2rM5mHXA,
	netdev-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <1254422548.3959.24.camel-YfaajirXv2244ywRPIzf9A@public.gmane.org>

On Thursday 01 October 2009 20:42:28 Johannes Berg wrote:
> On Thu, 2009-10-01 at 16:04 +0200, Michael Buesch wrote:
> > On Thursday 01 October 2009 01:33:33 David Miller wrote:
> > 
> > > I'm not applying this until all of these details are sorted out 
> > 
> > John, please apply my fix to wireless-testing to get rid of the regression.
> > You can revert it later, if there's a better fix available.
> 
> I agree with davem, don't. Just fix the driver to local_bh_disable()
> around the rx function if necessary.

For the benefit of a much bigger critical section? I don't get why this would be any better.

I _am_ going to do one thing now, however. That is ignoring any regression bugreport.
(Yes, it _is_ a regression for b43)

-- 
Greetings, Michael.

^ permalink raw reply

* [RFC] pkt_sched: gen_estimator: Dont report fake rate estimators
From: Eric Dumazet @ 2009-10-01 19:07 UTC (permalink / raw)
  To: David S. Miller, Patrick McHardy; +Cc: Linux Netdev List

We currently send TCA_STATS_RATE_EST elements to netlink users, even if no estimator
is running.

# tc -s -d qdisc
qdisc pfifo_fast 0: dev eth0 root bands 3 priomap  1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
 Sent 112833764978 bytes 1495081739 pkt (dropped 0, overlimits 0 requeues 0)
 rate 0bit 0pps backlog 0b 0p requeues 0

User has no way to tell if the "rate 0bit 0pps" is a real estimation, or a fake
one (because no estimator is active)

After this patch, tc command output is :
$ tc -s -d qdisc
qdisc pfifo_fast 0: dev eth0 root bands 3 priomap  1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
 Sent 561075 bytes 1196 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
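
The idea can be sketched in user-space C as follows.  This is a
simplified illustration, not the kernel code: the struct names mirror
the patch loosely, and copy_rate_est() is a hypothetical stand-in for
the netlink copy step.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define RATE_EST_VALID 1

/* What user space sees: just the two rates. */
struct user_rate_est {
	uint32_t bps;
	uint32_t pps;
};

/* Internal form: the rates plus a flag set once an estimator has run. */
struct rate_est {
	struct user_rate_est est;
	int flags;
};

/*
 * Copy the estimate to 'out' only if it is real.  Returns the number of
 * bytes copied, so 0 means "nothing reported" rather than "rate is 0".
 */
size_t copy_rate_est(const struct rate_est *r, void *out, size_t outlen)
{
	assert(r != NULL && out != NULL);
	if (!(r->flags & RATE_EST_VALID))
		return 0;
	if (outlen < sizeof(r->est))
		return 0;
	memcpy(out, &r->est, sizeof(r->est));
	return sizeof(r->est);
}
```

Only the inner struct is copied out, so the layout seen by user space
stays the same as before.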

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
---
 include/linux/gen_stats.h |   16 +++++++++++++---
 net/core/gen_estimator.c  |    9 +++++----
 net/core/gen_stats.c      |    9 ++++++---
 net/sched/act_police.c    |    2 +-
 4 files changed, 25 insertions(+), 11 deletions(-)

diff --git a/include/linux/gen_stats.h b/include/linux/gen_stats.h
index 710e901..7678ded 100644
--- a/include/linux/gen_stats.h
+++ b/include/linux/gen_stats.h
@@ -30,17 +30,27 @@ struct gnet_stats_basic_packed
 } __attribute__ ((packed));
 
 /**
- * struct gnet_stats_rate_est - rate estimator
+ * struct gnet_stats_user_rate_est - rate estimator
  * @bps: current byte rate
  * @pps: current packet rate
  */
-struct gnet_stats_rate_est
-{
+struct gnet_stats_user_rate_est {
 	__u32	bps;
 	__u32	pps;
 };
 
 /**
+ * struct gnet_stats_rate_est - rate estimator with flags
+ * @est: current byte/packet rate
+ * @flags: set to one if estimation is valid
+ */
+struct gnet_stats_rate_est {
+	struct gnet_stats_user_rate_est	est;
+	int				flags;
+};
+#define RATE_EST_VALID 1
+
+/**
  * struct gnet_stats_queue - queuing statistics
  * @qlen: queue length
  * @backlog: backlog size of queue
diff --git a/net/core/gen_estimator.c b/net/core/gen_estimator.c
index 493775f..5ba9d90 100644
--- a/net/core/gen_estimator.c
+++ b/net/core/gen_estimator.c
@@ -129,12 +129,13 @@ static void est_timer(unsigned long arg)
 		brate = (nbytes - e->last_bytes)<<(7 - idx);
 		e->last_bytes = nbytes;
 		e->avbps += (brate >> e->ewma_log) - (e->avbps >> e->ewma_log);
-		e->rate_est->bps = (e->avbps+0xF)>>5;
+		e->rate_est->est.bps = (e->avbps+0xF)>>5;
 
 		rate = (npackets - e->last_packets)<<(12 - idx);
 		e->last_packets = npackets;
 		e->avpps += (rate >> e->ewma_log) - (e->avpps >> e->ewma_log);
-		e->rate_est->pps = (e->avpps+0x1FF)>>10;
+		e->rate_est->est.pps = (e->avpps+0x1FF)>>10;
+		e->rate_est->flags |= RATE_EST_VALID;
 skip:
 		read_unlock(&est_lock);
 		spin_unlock(e->stats_lock);
@@ -227,9 +228,9 @@ int gen_new_estimator(struct gnet_stats_basic_packed *bstats,
 	est->stats_lock = stats_lock;
 	est->ewma_log = parm->ewma_log;
 	est->last_bytes = bstats->bytes;
-	est->avbps = rate_est->bps<<5;
+	est->avbps = rate_est->est.bps<<5;
 	est->last_packets = bstats->packets;
-	est->avpps = rate_est->pps<<10;
+	est->avpps = rate_est->est.pps<<10;
 
 	if (!elist[idx].timer.function) {
 		INIT_LIST_HEAD(&elist[idx].list);
diff --git a/net/core/gen_stats.c b/net/core/gen_stats.c
index 8569310..b6f723c 100644
--- a/net/core/gen_stats.c
+++ b/net/core/gen_stats.c
@@ -138,13 +138,16 @@ gnet_stats_copy_basic(struct gnet_dump *d, struct gnet_stats_basic_packed *b)
 int
 gnet_stats_copy_rate_est(struct gnet_dump *d, struct gnet_stats_rate_est *r)
 {
+	if (!(r->flags & RATE_EST_VALID))
+		return 0;
+
 	if (d->compat_tc_stats) {
-		d->tc_stats.bps = r->bps;
-		d->tc_stats.pps = r->pps;
+		d->tc_stats.bps = r->est.bps;
+		d->tc_stats.pps = r->est.pps;
 	}
 
 	if (d->tail)
-		return gnet_stats_copy(d, TCA_STATS_RATE_EST, r, sizeof(*r));
+		return gnet_stats_copy(d, TCA_STATS_RATE_EST, &r->est, sizeof(r->est));
 
 	return 0;
 }
diff --git a/net/sched/act_police.c b/net/sched/act_police.c
index 723964c..ba01081 100644
--- a/net/sched/act_police.c
+++ b/net/sched/act_police.c
@@ -292,7 +292,7 @@ static int tcf_act_police(struct sk_buff *skb, struct tc_action *a,
 	police->tcf_bstats.packets++;
 
 	if (police->tcfp_ewma_rate &&
-	    police->tcf_rate_est.bps >= police->tcfp_ewma_rate) {
+	    police->tcf_rate_est.est.bps >= police->tcfp_ewma_rate) {
 		police->tcf_qstats.overlimits++;
 		if (police->tcf_action == TC_ACT_SHOT)
 			police->tcf_qstats.drops++;

^ permalink raw reply related

* Re: [PATCH] make TLLAO option for NA packets configurable
From: Stephen Hemminger @ 2009-10-01 18:56 UTC (permalink / raw)
  To: Octavian Purdila; +Cc: Cosmin Ratiu, David Miller, netdev
In-Reply-To: <200910012139.32070.opurdila@ixiacom.com>

On Thu, 1 Oct 2009 21:39:32 +0300
Octavian Purdila <opurdila@ixiacom.com> wrote:

> On Thursday 01 October 2009 21:14:50 you wrote:
> > 
> > Probably this should be a per interface property rather than per namespace.
> 
> In our case, where we have lots of interfaces active, it would be nice to have 
> the per namespace property as well.

The ipv6 control infrastructure already has that option. If you changed your
patch to use a per-interface control then there would be:

  /proc/sys/net/ipv6/conf/all/force_tllao
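
For illustration only, a small helper that builds the per-interface
path under the same directory.  It assumes the control ends up named
force_tllao as in the path above; tllao_sysctl_path() is a hypothetical
name, not an existing API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Build "/proc/sys/net/ipv6/conf/<ifname>/force_tllao" into buf.
 * Pass "all" as ifname for the all-interfaces entry.
 * Returns the path length, or -1 if buf is too small.
 */
int tllao_sysctl_path(char *buf, size_t size, const char *ifname)
{
	int n;

	assert(buf != NULL && ifname != NULL);
	n = snprintf(buf, size, "/proc/sys/net/ipv6/conf/%s/force_tllao",
		     ifname);
	if (n < 0 || (size_t)n >= size)
		return -1;
	return n;
}
```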

^ permalink raw reply

* Re: [PATCH] net: fix NOHZ: local_softirq_pending 08
From: Johannes Berg @ 2009-10-01 18:42 UTC (permalink / raw)
  To: Michael Buesch
  Cc: David Miller, oliver-fJ+pQTUTwRTk1uMJSBkQmQ,
	kalle.valo-X3B1VOXEql0, linville-2XuSBdqkA4R54TAoqtyWWQ,
	linux-wireless-u79uwXL29TY76Z2rM5mHXA,
	netdev-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <200910011604.42916.mb-fseUSCV1ubazQB+pC5nmwQ@public.gmane.org>

On Thu, 2009-10-01 at 16:04 +0200, Michael Buesch wrote:
> On Thursday 01 October 2009 01:33:33 David Miller wrote:
> 
> > I'm not applying this until all of these details are sorted out 
> 
> John, please apply my fix to wireless-testing to get rid of the regression.
> You can revert it later, if there's a better fix available.

I agree with davem, don't. Just fix the driver to local_bh_disable()
around the rx function if necessary.

johannes

^ permalink raw reply

* Re: [PATCH] make TLLAO option for NA packets configurable
From: Octavian Purdila @ 2009-10-01 18:39 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: Cosmin Ratiu, David Miller, netdev
In-Reply-To: <20091001111450.221969cc@s6510>

On Thursday 01 October 2009 21:14:50 you wrote:
> 
> Probably this should be a per interface property rather than per namespace.

In our case, where we have lots of interfaces active, it would be nice to have 
the per namespace property as well.

But, as Cosmin suggested, perhaps it would be better to just send this option 
by default? (it's an RFC SHOULD after all...)

Thanks,
tavi

^ permalink raw reply

* pull request: wireless-2.6 2009-10-01
From: John W. Linville @ 2009-10-01 18:24 UTC (permalink / raw)
  To: davem; +Cc: linux-wireless, netdev, linux-kernel

Dave,

A small collection of fixes for 2.6.32...

One is a brown paper bag fix for an uninitialized variable, another is a
USB ID.  There is a beaconing fix for the mac80211_hwsim "fake" driver,
and a bug fix for AP mode related to buffering frames for stations in
power save mode.

The b43 fix looks a bit long, but it is more-or-less the same simple
fix applied in multiple places.  It addresses a bug where "the last
bytes of data sent/received to/from PIO FIFOs on SDIO-based cards get
'swizzled' when its length is not multiple of 4 bytes."
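
The "same simple fix" amounts to copying the trailing bytes into a
small zero-padded buffer, in memory order, and handing that buffer to
the block-I/O routine instead of assembling a register value with
shifts.  A user-space sketch of the packing step follows; the function
name pio_pack_tail4 is hypothetical and the actual ssb_block_write()
call is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Copy the trailing (len % 4) bytes of 'data' into a zeroed 4-byte
 * buffer, preserving their order in memory.  Returns how many tail
 * bytes were copied (0..3).  The caller would then pass 'tail' to a
 * block write of exactly 4 bytes.
 */
size_t pio_pack_tail4(const uint8_t *data, size_t len, uint8_t tail[4])
{
	size_t rem = len & 3;

	assert(data != NULL && tail != NULL);
	memset(tail, 0, 4);
	memcpy(tail, data + (len - rem), rem);
	return rem;
}
```

Because the tail is expressed as bytes rather than as a u32, every path
that moves it sees the same byte order as the bulk of the frame.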

Please let me know if there are problems!

Thanks,

John

---

Individual patches are available here:

	http://www.kernel.org/pub/linux/kernel/people/linville/wireless-2.6/

---

The following changes since commit eb1cf0f8f7a9e5a6d573d5bd72c015686a042db0:
  David S. Miller (1):
        Merge branch 'master' of ssh://master.kernel.org/.../linville/wireless-2.6

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6.git master

Christian Lamparter (1):
      ar9170: fix bug in iq-auto calibration value calculation

Igor Perminov (1):
      mac80211: Fix [re]association power saving issue on AP side

Jouni Malinen (1):
      mac80211_hwsim: Fix initial beacon timer configuration

Michael Buesch (1):
      b43: Always use block-I/O for the PIO data registers

Michal Szalata (1):
      rt2x00: Thrustmaster FunAccess WIFI USB and rt73usb

 drivers/net/wireless/ath/ar9170/phy.c |    6 +--
 drivers/net/wireless/b43/pio.c        |   60 +++++++++++++++++++++------------
 drivers/net/wireless/mac80211_hwsim.c |    3 ++
 drivers/net/wireless/rt2x00/rt73usb.c |    1 +
 net/mac80211/tx.c                     |    5 ++-
 5 files changed, 48 insertions(+), 27 deletions(-)

diff --git a/drivers/net/wireless/ath/ar9170/phy.c b/drivers/net/wireless/ath/ar9170/phy.c
index b3e5cf3..dbd488d 100644
--- a/drivers/net/wireless/ath/ar9170/phy.c
+++ b/drivers/net/wireless/ath/ar9170/phy.c
@@ -1141,7 +1141,8 @@ static int ar9170_set_freq_cal_data(struct ar9170 *ar,
 	u8 vpds[2][AR5416_PD_GAIN_ICEPTS];
 	u8 pwrs[2][AR5416_PD_GAIN_ICEPTS];
 	int chain, idx, i;
-	u8 f;
+	u32 phy_data = 0;
+	u8 f, tmp;
 
 	switch (channel->band) {
 	case IEEE80211_BAND_2GHZ:
@@ -1208,9 +1209,6 @@ static int ar9170_set_freq_cal_data(struct ar9170 *ar,
 		}
 
 		for (i = 0; i < 76; i++) {
-			u32 phy_data;
-			u8 tmp;
-
 			if (i < 25) {
 				tmp = ar9170_interpolate_val(i, &pwrs[0][0],
 							     &vpds[0][0]);
diff --git a/drivers/net/wireless/b43/pio.c b/drivers/net/wireless/b43/pio.c
index e96091b..9c13979 100644
--- a/drivers/net/wireless/b43/pio.c
+++ b/drivers/net/wireless/b43/pio.c
@@ -340,10 +340,15 @@ static u16 tx_write_2byte_queue(struct b43_pio_txqueue *q,
 			q->mmio_base + B43_PIO_TXDATA,
 			sizeof(u16));
 	if (data_len & 1) {
+		u8 tail[2] = { 0, };
+
 		/* Write the last byte. */
 		ctl &= ~B43_PIO_TXCTL_WRITEHI;
 		b43_piotx_write16(q, B43_PIO_TXCTL, ctl);
-		b43_piotx_write16(q, B43_PIO_TXDATA, data[data_len - 1]);
+		tail[0] = data[data_len - 1];
+		ssb_block_write(dev->dev, tail, 2,
+				q->mmio_base + B43_PIO_TXDATA,
+				sizeof(u16));
 	}
 
 	return ctl;
@@ -386,26 +391,31 @@ static u32 tx_write_4byte_queue(struct b43_pio_txqueue *q,
 			q->mmio_base + B43_PIO8_TXDATA,
 			sizeof(u32));
 	if (data_len & 3) {
-		u32 value = 0;
+		u8 tail[4] = { 0, };
 
 		/* Write the last few bytes. */
 		ctl &= ~(B43_PIO8_TXCTL_8_15 | B43_PIO8_TXCTL_16_23 |
 			 B43_PIO8_TXCTL_24_31);
-		data = &(data[data_len - 1]);
 		switch (data_len & 3) {
 		case 3:
-			ctl |= B43_PIO8_TXCTL_16_23;
-			value |= (u32)(*data) << 16;
-			data--;
+			ctl |= B43_PIO8_TXCTL_16_23 | B43_PIO8_TXCTL_8_15;
+			tail[0] = data[data_len - 3];
+			tail[1] = data[data_len - 2];
+			tail[2] = data[data_len - 1];
+			break;
 		case 2:
 			ctl |= B43_PIO8_TXCTL_8_15;
-			value |= (u32)(*data) << 8;
-			data--;
+			tail[0] = data[data_len - 2];
+			tail[1] = data[data_len - 1];
+			break;
 		case 1:
-			value |= (u32)(*data);
+			tail[0] = data[data_len - 1];
+			break;
 		}
 		b43_piotx_write32(q, B43_PIO8_TXCTL, ctl);
-		b43_piotx_write32(q, B43_PIO8_TXDATA, value);
+		ssb_block_write(dev->dev, tail, 4,
+				q->mmio_base + B43_PIO8_TXDATA,
+				sizeof(u32));
 	}
 
 	return ctl;
@@ -693,21 +703,25 @@ data_ready:
 			       q->mmio_base + B43_PIO8_RXDATA,
 			       sizeof(u32));
 		if (len & 3) {
-			u32 value;
-			char *data;
+			u8 tail[4] = { 0, };
 
 			/* Read the last few bytes. */
-			value = b43_piorx_read32(q, B43_PIO8_RXDATA);
-			data = &(skb->data[len + padding - 1]);
+			ssb_block_read(dev->dev, tail, 4,
+				       q->mmio_base + B43_PIO8_RXDATA,
+				       sizeof(u32));
 			switch (len & 3) {
 			case 3:
-				*data = (value >> 16);
-				data--;
+				skb->data[len + padding - 3] = tail[0];
+				skb->data[len + padding - 2] = tail[1];
+				skb->data[len + padding - 1] = tail[2];
+				break;
 			case 2:
-				*data = (value >> 8);
-				data--;
+				skb->data[len + padding - 2] = tail[0];
+				skb->data[len + padding - 1] = tail[1];
+				break;
 			case 1:
-				*data = value;
+				skb->data[len + padding - 1] = tail[0];
+				break;
 			}
 		}
 	} else {
@@ -715,11 +729,13 @@ data_ready:
 			       q->mmio_base + B43_PIO_RXDATA,
 			       sizeof(u16));
 		if (len & 1) {
-			u16 value;
+			u8 tail[2] = { 0, };
 
 			/* Read the last byte. */
-			value = b43_piorx_read16(q, B43_PIO_RXDATA);
-			skb->data[len + padding - 1] = value;
+			ssb_block_read(dev->dev, tail, 2,
+				       q->mmio_base + B43_PIO_RXDATA,
+				       sizeof(u16));
+			skb->data[len + padding - 1] = tail[0];
 		}
 	}
 
diff --git a/drivers/net/wireless/mac80211_hwsim.c b/drivers/net/wireless/mac80211_hwsim.c
index 896f532..38cfd79 100644
--- a/drivers/net/wireless/mac80211_hwsim.c
+++ b/drivers/net/wireless/mac80211_hwsim.c
@@ -631,6 +631,9 @@ static void mac80211_hwsim_bss_info_changed(struct ieee80211_hw *hw,
 		data->beacon_int = 1024 * info->beacon_int / 1000 * HZ / 1000;
 		if (WARN_ON(!data->beacon_int))
 			data->beacon_int = 1;
+		if (data->started)
+			mod_timer(&data->beacon_timer,
+				  jiffies + data->beacon_int);
 	}
 
 	if (changed & BSS_CHANGED_ERP_CTS_PROT) {
diff --git a/drivers/net/wireless/rt2x00/rt73usb.c b/drivers/net/wireless/rt2x00/rt73usb.c
index 1cbd9b4..b8f5ee3 100644
--- a/drivers/net/wireless/rt2x00/rt73usb.c
+++ b/drivers/net/wireless/rt2x00/rt73usb.c
@@ -2381,6 +2381,7 @@ static struct usb_device_id rt73usb_device_table[] = {
 	/* Huawei-3Com */
 	{ USB_DEVICE(0x1472, 0x0009), USB_DEVICE_DATA(&rt73usb_ops) },
 	/* Hercules */
+	{ USB_DEVICE(0x06f8, 0xe002), USB_DEVICE_DATA(&rt73usb_ops) },
 	{ USB_DEVICE(0x06f8, 0xe010), USB_DEVICE_DATA(&rt73usb_ops) },
 	{ USB_DEVICE(0x06f8, 0xe020), USB_DEVICE_DATA(&rt73usb_ops) },
 	/* Linksys */
diff --git a/net/mac80211/tx.c b/net/mac80211/tx.c
index 5143d20..fd40282 100644
--- a/net/mac80211/tx.c
+++ b/net/mac80211/tx.c
@@ -367,7 +367,10 @@ ieee80211_tx_h_unicast_ps_buf(struct ieee80211_tx_data *tx)
 	struct ieee80211_hdr *hdr = (struct ieee80211_hdr *)tx->skb->data;
 	u32 staflags;
 
-	if (unlikely(!sta || ieee80211_is_probe_resp(hdr->frame_control)))
+	if (unlikely(!sta || ieee80211_is_probe_resp(hdr->frame_control)
+			|| ieee80211_is_auth(hdr->frame_control)
+			|| ieee80211_is_assoc_resp(hdr->frame_control)
+			|| ieee80211_is_reassoc_resp(hdr->frame_control)))
 		return TX_CONTINUE;
 
 	staflags = get_sta_flags(sta);
-- 
John W. Linville		Someday the world will need a hero, and you
linville@tuxdriver.com			might be all we have.  Be ready.

^ permalink raw reply related

* Re: [PATCH] make TLLAO option for NA packets configurable
From: Stephen Hemminger @ 2009-10-01 18:14 UTC (permalink / raw)
  To: Cosmin Ratiu; +Cc: David Miller, netdev, opurdila
In-Reply-To: <200910012108.41071.cratiu@ixiacom.com>

On Thu, 1 Oct 2009 21:08:40 +0300
Cosmin Ratiu <cratiu@ixiacom.com> wrote:

> On Thursday 01 October 2009 19:43:56 David Miller wrote:
> > Using CTL_UNNUMBERED is a must these days.
> >
> > Also, please fix the prefixing of the paths in your patch.
> > See Documentation/SubmittingPatches in the kernel tree.
> 
> Here is the new variant. Please let me know what you think.
> 
> And I apologize for using [PATCH] instead of [RFC] in the subject, I don't 
> know much about netdev protocol (yet).
> 
> Cosmin.

Probably this should be a per interface property rather than per namespace.

^ permalink raw reply

* [PATCH v3] skge: use unique IRQ name
From: Stephen Hemminger @ 2009-10-01 18:13 UTC (permalink / raw)
  To: Michal Schmidt; +Cc: David Miller, netdev
In-Reply-To: <20091001200208.2907b8ff@leela>

From: Michal Schmidt <mschmidt@redhat.com>

Most network drivers request their IRQ when the interface is activated.
skge does it in ->probe() instead, because it can work with two-port
cards where the two net_devices use the same IRQ. This works fine most
of the time, except in some situations when the interface gets renamed.
Consider this example:

1. modprobe skge
   The card is detected as eth0 and requests IRQ 17. Directory
   /proc/irq/17/eth0 is created.
2. There is a udev rule which says this interface should be called
   eth1, so udev renames eth0 -> eth1.
3. modprobe 8139too
   The Realtek card is detected as eth0. It will be using IRQ 17 too.
4. ip link set eth0 up
   Now 8139too requests IRQ 17.

The result is:
WARNING: at fs/proc/generic.c:590 proc_register ...
proc_dir_entry '17/eth0' already registered
...
And "ls /proc/irq/17" shows two subdirectories, both called eth0.

Fix it by using a unique name for skge's IRQ, based on the PCI address.
The naming from the example then looks like this:
$ grep skge /proc/interrupts
 17:        169   IO-APIC-fasteoi   skge@0000:00:0a.0, eth0

irqbalance daemon will have to be taught to recognize "skge@" as an
Ethernet interrupt. This will be a one-liner addition in classify.c. I
will send a patch to irqbalance if this change is accepted.

Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>

---
 Changes:
   v2 use pci: in irq name
   v3 simplify calculation of string length

--- a/drivers/net/skge.c	2009-10-01 11:09:19.057676199 -0700
+++ b/drivers/net/skge.c	2009-10-01 11:10:45.786382284 -0700
@@ -3935,11 +3935,14 @@ static int __devinit skge_probe(struct p
 #endif

 	err = -ENOMEM;
-	hw = kzalloc(sizeof(*hw), GFP_KERNEL);
+	/* space for skge@pci:0000:04:00.0 */
+	hw = kzalloc(sizeof(*hw) + strlen(DRV_NAME "@pci:" )
+		     + strlen(pci_name(pdev)) + 1, GFP_KERNEL);
 	if (!hw) {
 		dev_err(&pdev->dev, "cannot allocate hardware struct\n");
 		goto err_out_free_regions;
 	}
+	sprintf(hw->irq_name, DRV_NAME "@pci:%s", pci_name(pdev));

 	hw->pdev = pdev;
 	spin_lock_init(&hw->hw_lock);
@@ -3974,7 +3977,7 @@ static int __devinit skge_probe(struct p
 		goto err_out_free_netdev;
 	}

-	err = request_irq(pdev->irq, skge_intr, IRQF_SHARED, dev->name, hw);
+	err = request_irq(pdev->irq, skge_intr, IRQF_SHARED, hw->irq_name, hw);
 	if (err) {
 		dev_err(&pdev->dev, "%s: cannot assign irq %d\n",
 		       dev->name, pdev->irq);
--- a/drivers/net/skge.h	2009-10-01 11:09:19.067695070 -0700
+++ b/drivers/net/skge.h	2009-10-01 11:09:34.147674089 -0700
@@ -2423,6 +2423,8 @@ struct skge_hw {
 	u16		     phy_addr;
 	spinlock_t	     phy_lock;
 	struct tasklet_struct phy_task;
+
+	char		     irq_name[0]; /* skge@pci:0000:04:00.0 */
 };

 enum pause_control {

^ permalink raw reply
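
The allocation trick in the skge patch above (one allocation holding the
struct plus a trailing IRQ name) can be sketched in userspace C. This is only
an illustration under stated assumptions: struct hw_sketch and
hw_sketch_alloc() are hypothetical stand-ins, calloc() replaces kzalloc(), a
C99 flexible array member replaces the patch's irq_name[0], and snprintf() is
used instead of sprintf().

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define DRV_NAME "skge"

/* Hypothetical stand-in for struct skge_hw; only the trailing name matters. */
struct hw_sketch {
	int irq;
	char irq_name[];	/* e.g. skge@pci:0000:04:00.0 */
};

/* Allocate the struct plus room for "skge@pci:<addr>" and its NUL byte. */
static struct hw_sketch *hw_sketch_alloc(const char *pci_addr, int irq)
{
	size_t name_len;
	struct hw_sketch *hw;

	assert(pci_addr != NULL);

	/* Same arithmetic as v3 of the patch: prefix + address + NUL. */
	name_len = strlen(DRV_NAME "@pci:") + strlen(pci_addr) + 1;

	hw = calloc(1, sizeof(*hw) + name_len);
	if (!hw)
		return NULL;

	hw->irq = irq;
	snprintf(hw->irq_name, name_len, DRV_NAME "@pci:%s", pci_addr);
	return hw;
}
```

Because the name is derived from the PCI address rather than the netdev name,
it stays stable when udev renames the interface.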

* Re: [PATCH] make TLLAO option for NA packets configurable
From: Cosmin Ratiu @ 2009-10-01 18:08 UTC (permalink / raw)
  To: David Miller; +Cc: shemminger, netdev, opurdila
In-Reply-To: <20091001.094356.95174955.davem@davemloft.net>

[-- Attachment #1: Type: text/plain, Size: 409 bytes --]

On Thursday 01 October 2009 19:43:56 David Miller wrote:
> Using CTL_UNNUMBERED is a must these days.
>
> Also, please fix the prefixing of the paths in your patch.
> See Documentation/SubmittingPatches in the kernel tree.

Here is the new variant. Please let me know what you think.

And I apologize for using [PATCH] instead of [RFC] in the subject, I don't 
know much about netdev protocol (yet).

Cosmin.

[-- Attachment #2: 0001-ipv6-new-sysctl-for-sending-TLLAO-with-NAs.patch --]
[-- Type: text/x-patch, Size: 2274 bytes --]

From 1911a98df800cedf4c3a63b897163e2935c5f602 Mon Sep 17 00:00:00 2001
From: Cosmin Ratiu <cratiu@ixiacom.com>
Date: Thu, 1 Oct 2009 20:27:39 +0300
Subject: [PATCH] ipv6: new sysctl for sending TLLAO with NAs

Neighbor advertisements responding to unicast Neighbor Solicitations did
not include the TLLAO option. This patch makes this configurable via
/proc/sys/net/ipv6/ndisc_force_tllao, which by default is off.

The need for this arose because certain routers expect the TLLAO in some
situations even as a response to unicast NS packets.

Moreover, RFC 2461 recommends on page 24 sending this to avoid a race.

Signed-off-by: Cosmin Ratiu <cratiu@ixiacom.com>
---
 include/net/netns/ipv6.h   |    1 +
 net/ipv6/ndisc.c           |    1 +
 net/ipv6/sysctl_net_ipv6.c |    8 ++++++++
 3 files changed, 10 insertions(+), 0 deletions(-)

diff --git a/include/net/netns/ipv6.h b/include/net/netns/ipv6.h
index dfeb2d7..dd0a95b 100644
--- a/include/net/netns/ipv6.h
+++ b/include/net/netns/ipv6.h
@@ -16,6 +16,7 @@ struct netns_sysctl_ipv6 {
 	struct ctl_table_header *frags_hdr;
 #endif
 	int bindv6only;
+	int ndisc_force_tllao;
 	int flush_delay;
 	int ip6_rt_max_size;
 	int ip6_rt_gc_min_interval;
diff --git a/net/ipv6/ndisc.c b/net/ipv6/ndisc.c
index f74e4e2..f08cf65 100644
--- a/net/ipv6/ndisc.c
+++ b/net/ipv6/ndisc.c
@@ -598,6 +598,7 @@ static void ndisc_send_na(struct net_device *dev, struct neighbour *neigh,
 	icmp6h.icmp6_solicited = solicited;
 	icmp6h.icmp6_override = override;
 
+	inc_opt |= dev_net(dev)->ipv6.sysctl.ndisc_force_tllao;
 	__ndisc_send(dev, neigh, daddr, src_addr,
 		     &icmp6h, solicited_addr,
 		     inc_opt ? ND_OPT_TARGET_LL_ADDR : 0);
diff --git a/net/ipv6/sysctl_net_ipv6.c b/net/ipv6/sysctl_net_ipv6.c
index 0dc6a4e..fb423ce 100644
--- a/net/ipv6/sysctl_net_ipv6.c
+++ b/net/ipv6/sysctl_net_ipv6.c
@@ -37,6 +37,14 @@ static ctl_table ipv6_table_template[] = {
 		.mode		= 0644,
 		.proc_handler	= proc_dointvec
 	},
+	{
+		.ctl_name	= CTL_UNNUMBERED,
+		.procname	= "ndisc_force_tllao",
+		.data		= &init_net.ipv6.sysctl.ndisc_force_tllao,
+		.maxlen		= sizeof(int),
+		.mode		= 0644,
+		.proc_handler	= proc_dointvec
+	},
 	{ .ctl_name = 0 }
 };
 
-- 
1.6.3.3


^ permalink raw reply related
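
The one-line change to ndisc_send_na() above amounts to OR-ing the sysctl
into the caller's inc_opt flag. A minimal sketch of the resulting decision,
assuming a hypothetical helper name (na_option_type) and plain arguments in
place of the per-namespace sysctl lookup:

```c
#include <stdbool.h>

#define ND_OPT_TARGET_LL_ADDR 2	/* Target Link-Layer Address option type */

/*
 * Returns the option type to attach to the Neighbor Advertisement,
 * or 0 for none. force_tllao models net.ipv6.ndisc_force_tllao.
 */
static int na_option_type(bool inc_opt, int force_tllao)
{
	/* Any non-zero sysctl value forces the option on. */
	inc_opt |= (force_tllao != 0);
	return inc_opt ? ND_OPT_TARGET_LL_ADDR : 0;
}
```

With the sysctl at its default of 0 the behaviour is unchanged; setting it
makes replies to unicast solicitations carry the TLLAO as well.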

* Re: [PATCH] sky2: irqname based on pci address
From: Michal Schmidt @ 2009-10-01 18:03 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: David Miller, netdev
In-Reply-To: <20091001101146.3368b4a4@s6510>

On Thu, 1 Oct 2009 10:11:46 -0700, Stephen Hemminger wrote:
> This is based on Michal Schmidt's fix for skge.
> 
> Most network drivers request their IRQ when the interface is
> activated. sky2 does it in ->probe() instead, because it can work
> with two-port cards where the two net_devices use the same IRQ. This
> works fine most of the time, except in some situations when the
> interface gets renamed. Consider this example:
> 
> 1. modprobe sky2
>    The card is detected as eth0 and requests IRQ 17. Directory
>    /proc/irq/17/eth0 is created.
> 2. There is a udev rule which says this interface should be called
>    eth1, so udev renames eth0 -> eth1.
> 3. modprobe 8139too
>    The Realtek card is detected as eth0. It will be using IRQ 17 too.
> 4. ip link set eth0 up
>    Now 8139too requests IRQ 17.
> 
> The result is:
> WARNING: at fs/proc/generic.c:590 proc_register ...
> proc_dir_entry '17/eth0' already registered
> 
> The fix is for sky2 to name the IRQ based on the PCI device, as is
> done by some other drivers (DRM, InfiniBand, ...), i.e.
> sky2@pci:0000:00:00
> 
> Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>

Reviewed-by: Michal Schmidt <mschmidt@redhat.com>

^ permalink raw reply

* Re: [PATCH] skge: use unique IRQ name
From: Michal Schmidt @ 2009-10-01 18:02 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: David Miller, netdev
In-Reply-To: <20091001095032.17271dcc@s6510>

On Thu, 1 Oct 2009 09:50:32 -0700, Stephen Hemminger wrote:
>  	err = -ENOMEM;
> -	hw = kzalloc(sizeof(*hw), GFP_KERNEL);
> +	/* space for skge@pci:0000:04:00.0 */
> +	irq_name_len = strlen(DRV_NAME) + strlen(dev_name(&pdev->dev)) + 6;

You replaced "dev_name(&pdev->dev)" with "pci_name(pdev)" below.
That's nice, so we should replace it here too for consistency.

> +	hw = kzalloc(sizeof(*hw) + irq_name_len, GFP_KERNEL);
>  	if (!hw) {
>  		dev_err(&pdev->dev, "cannot allocate hardware struct\n");
>  		goto err_out_free_regions;
>  	}
> +	sprintf(hw->irq_name, DRV_NAME "@pci:%s", pci_name(pdev));

Michal

^ permalink raw reply

* Re: [PATCH 00/31] Swap over NFS -v20
From: Christoph Hellwig @ 2009-10-01 17:42 UTC (permalink / raw)
  To: Suresh Jayaraman
  Cc: Linus Torvalds, Andrew Morton, linux-kernel, linux-mm, netdev,
	Neil Brown, Miklos Szeredi, Wouter Verhelst, Peter Zijlstra,
	trond.myklebust
In-Reply-To: <1254405858-15651-1-git-send-email-sjayaraman@suse.de>

On Thu, Oct 01, 2009 at 07:34:18PM +0530, Suresh Jayaraman wrote:
> Hi,
> 
> Here's the latest version of swap over NFS series since -v19 last October by
> Peter Zijlstra. Peter does not have time to pursue this further (though he has
> not lost interest) and that led me to take over this patchset and try merging
> upstream.
> 
> The patches are against the current mmotm. It does not support SLQB, yet.
> These patches can also be found online here:
> 	http://www.suse.de/~sjayaraman/patches/swap-over-nfs/

My advice is the same I already gave to Peter long ago: it's almost
impossible to get a patchset that large, touching that many subsystems, in.

Split it into smaller series that make sense on their own.  One of them
would be the whole VM/net work to just make swap over nbd/iscsi safe.

The other really big one is adding a proper method for safe, page-backed
kernelspace I/O on files.  That is not something like the grotty
swap-tied address_space operations in this patch, but more something in
the direction of the kernel direct I/O patches that Jens Axboe did
for use in the loop driver.  But even those aren't complete, as they
don't touch the locking issue yet.

Especially the latter is an absolutely essential step to make any
progress here, and an excellent patch series of its own, as there are
multiple users for this, like making swap safe on btrfs files, making
the MD bitmap code actually safe, or improving the loop driver.


^ permalink raw reply
