Netdev List
* Re: [PATCH] ks8851_ml ethernet network driver
From: Stephen Hemminger @ 2009-09-17  4:03 UTC
  To: Li, Charles; +Cc: Greg KH, netdev, David S. Miller, Choi, David, Jeff Garzik
In-Reply-To: <20090917023836.GA15260@kroah.com>

On Wed, 16 Sep 2009 19:38:36 -0700
Greg KH <greg@kroah.com> wrote:

> /**
> + * ks_irq - device interrupt handler
> + * @irq: Interrupt number passed from the IRQ handler.
> + * @pw: The private word passed to request_irq(), our struct ks_net.
> + *
> + * This is the handler invoked to find out what happened
> + *
> + * Read the interrupt status, work out what needs to be done and then clear
> + * any of the interrupts that are not needed.
> + */
> +
> +static irqreturn_t ks_irq(int irq, void *pw)
> +{
> +	struct ks_net *ks = pw;
> +	struct net_device *netdev = ks->netdev;
> +	u16 status;
> +
> +	/*this should be the first in IRQ handler */
> +	ks_save_cmd_reg(ks);
> +
> +	status = ks_rdreg16(ks, KS_ISR);
> +	ks_wrreg16(ks, KS_ISR, status);

If status == 0 or status == ~0, then the handler should not return IRQ_HANDLED.
In the former case the IRQ is shared; in the latter case the device is not present
on the bus (hotplug).
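
A minimal sketch of the check described above, written as standalone C so it can be compiled outside the kernel. The `irqreturn_t` enum is a stand-in for the kernel type and `ks_classify_status` is a hypothetical helper, not part of the posted driver. Note that with a 16-bit status a literal comparison against `~0` is never true after integer promotion, so the sketch compares against `0xFFFF`.

```c
#include <stdint.h>

/* Stand-in for the kernel's irqreturn_t, so the sketch builds in userspace. */
typedef enum { IRQ_NONE = 0, IRQ_HANDLED = 1 } irqreturn_t;

/*
 * Hypothetical helper: decide what the handler should return for a
 * 16-bit interrupt status value read from the device.
 */
static irqreturn_t ks_classify_status(uint16_t status)
{
	/* No bits set: the interrupt came from another device on a shared line. */
	if (status == 0)
		return IRQ_NONE;

	/* All bits set: the read hit an absent device (e.g. after hot-unplug). */
	if (status == 0xFFFF)
		return IRQ_NONE;

	return IRQ_HANDLED;
}
```

In a real handler the early return would have to come before the status is written back to the device and before any further register access.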


* Re: [patch 0/7] s390: iucv / af_iucv fixes for 2.6.31+
From: David Miller @ 2009-09-17  3:58 UTC
  To: ursula.braun; +Cc: netdev, linux-s390, schwidefsky, heiko.carstens
In-Reply-To: <20090916143721.863799000@linux.vnet.ibm.com>

From: Ursula Braun <ursula.braun@de.ibm.com>
Date: Wed, 16 Sep 2009 16:37:21 +0200

> Summary:
> 
> Ursula Braun (1)
> iucv: suspend/resume error msg for left over pathes
> 
> Hendrik Brueckner (6)
> iucv: fix iucv_buffer_cpumask check when calling IUCV functions
> iucv: use correct output register in iucv_query_maxconn()
> af_iucv: fix race in __iucv_sock_wait()
> af_iucv: handle non-accepted sockets after resuming from suspend
> af_iucv: do not call iucv_sock_kill() twice
> af_iucv: fix race when queueing skbs on the backlog queue

All applied, thank you.


* Re: [PATCHv5 3/3] vhost_net: a kernel-level virtio server
From: Michael S. Tsirkin @ 2009-09-17  3:57 UTC
  To: Gregory Haskins
  Cc: Avi Kivity, Ira W. Snyder, netdev, virtualization, kvm,
	linux-kernel, mingo, linux-mm, akpm, hpa, Rusty Russell, s.hetze,
	alacrityvm-devel
In-Reply-To: <4AB0F1EF.5050102@gmail.com>

On Wed, Sep 16, 2009 at 10:10:55AM -0400, Gregory Haskins wrote:
> > There is no role reversal.
> 
> So if I have virtio-blk driver running on the x86 and vhost-blk device
> running on the ppc board, I can use the ppc board as a block-device.
> What if I really wanted to go the other way?

It seems ppc is the only one that can initiate DMA to an arbitrary
address, so you can't do this really, or you can by tunneling each
request back to ppc, or doing an extra data copy, but it's unlikely to
work well.

The limitation comes from hardware, not from the API we use.


* Re: [GIT PULL 0/2] Fixes for IEEE 802.15.4
From: David Miller @ 2009-09-17  3:55 UTC
  To: dbaryshkov; +Cc: linux-zigbee-devel, slapin, netdev
In-Reply-To: <1253107333-25043-1-git-send-email-dbaryshkov@gmail.com>

From: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
Date: Wed, 16 Sep 2009 17:22:11 +0400

> Hi, David,
> 
> Please pull both into net/master and net-next/master (as I'd like
> to submit few patches into net-next/master depending on this).
> 
> The following changes since commit 4e36a95e591e9c58dd10bb4103c00993917c27fd:
>   David Howells (1):
>         RxRPC: Use uX/sX rather than uintX_t/intX_t types
> 
> are available in the git repository at:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/lowpan/lowpan.git for-linus

Pulled thanks.


* Re: [PATCH 2/2] net: remove print_mac as it's not anymore used
From: David Miller @ 2009-09-17  3:54 UTC
  To: plagnioj; +Cc: netdev
In-Reply-To: <20090916.205159.191998627.davem@davemloft.net>

From: David Miller <davem@davemloft.net>
Date: Wed, 16 Sep 2009 20:51:59 -0700 (PDT)

> From: Jean-Christophe PLAGNIOL-VILLARD <plagnioj@jcrosoft.com>
> Date: Thu, 17 Sep 2009 02:07:39 +0200
> 
>> Signed-off-by: Jean-Christophe PLAGNIOL-VILLARD <plagnioj@jcrosoft.com>
> 
> Applied.

Actually I had to revert.  It's still used by the FCOE stack in
the scsi layer.

Did you actually run grep on the entire tree to see if it's still used
anywhere or did you only check drivers/net/ and net/ or something
equally lazy?


* Re: [PATCH 2/2] net: remove print_mac as it's not anymore used
From: David Miller @ 2009-09-17  3:51 UTC
  To: plagnioj; +Cc: netdev
In-Reply-To: <1253146059-4169-2-git-send-email-plagnioj@jcrosoft.com>

From: Jean-Christophe PLAGNIOL-VILLARD <plagnioj@jcrosoft.com>
Date: Thu, 17 Sep 2009 02:07:39 +0200

> Signed-off-by: Jean-Christophe PLAGNIOL-VILLARD <plagnioj@jcrosoft.com>

Applied.


* Re: [PATCH 1/2] wl12xx: switch to %pM to print the mac address
From: David Miller @ 2009-09-17  3:51 UTC
  To: linville; +Cc: plagnioj, netdev
In-Reply-To: <20090917002852.GE14393@tuxdriver.com>

From: "John W. Linville" <linville@tuxdriver.com>
Date: Wed, 16 Sep 2009 20:28:52 -0400

> On Thu, Sep 17, 2009 at 02:07:38AM +0200, Jean-Christophe PLAGNIOL-VILLARD wrote:
>> Signed-off-by: Jean-Christophe PLAGNIOL-VILLARD <plagnioj@jcrosoft.com>
 ...
> 
> ACK

Applied.


* Re: [PATCH] ks8851_ml ethernet network driver
From: David Miller @ 2009-09-17  3:48 UTC
  To: greg; +Cc: netdev, Charles.Li, Choi, David.Choi, jgarzik, shemminger
In-Reply-To: <20090917023836.GA15260@kroah.com>

From: Greg KH <greg@kroah.com>
Date: Wed, 16 Sep 2009 19:38:36 -0700

> From: Choi, David <David.Choi@Micrel.Com>
> 
> This is a network driver for the ks8851 16bit MLL ethernet device.
> 
> Signed-off-by: David J. Choi <david.choi@micrel.com>
> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

This doesn't even build cleanly:

drivers/net/ks8851_mll.c: In function ‘ks_inblk’:
drivers/net/ks8851_mll.c:555: warning: cast from pointer to integer of different size
drivers/net/ks8851_mll.c:558: warning: passing argument 1 of ‘_readw’ makes pointer from integer without a cast
drivers/net/ks8851_mll.c: In function ‘ks_outblk’:
drivers/net/ks8851_mll.c:571: warning: cast from pointer to integer of different size
drivers/net/ks8851_mll.c:574: warning: passing argument 2 of ‘_writew’ makes pointer from integer without a cast
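
Warnings of these two kinds typically appear when an I/O address is cast to a fixed-width integer and then passed to an accessor that expects a pointer. A minimal userspace sketch of the pointer-typed alternative follows. The function names are hypothetical, and a plain `volatile uint16_t *` stands in for the kernel's `void __iomem *` and `readw()`, so this illustrates the idea rather than the actual fix for the posted driver.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for a 16-bit FIFO read loop. 'port' plays the role
 * of the mapped data register; in the kernel this would stay a
 * void __iomem * and be passed straight to readw(), never an integer.
 */
static void sketch_inblk(const volatile uint16_t *port, uint16_t *dst,
			 size_t len_bytes)
{
	size_t words = len_bytes / 2;

	while (words--)
		*dst++ = *port;	/* same address every time, like a FIFO port */
}

/* If an address must be inspected as a number, uintptr_t is pointer-sized. */
static int sketch_is_aligned16(const void *p)
{
	return ((uintptr_t)p & 1u) == 0;
}
```

Keeping the address in a pointer type end to end avoids both the truncating cast on 64-bit builds and the integer-to-pointer conversion at the accessor call.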

It also has a big "#define DEBUG" at the beginning of the driver.

And it also has stuff like:

+#define MALLOC(x)		kmalloc(x, GFP_KERNEL)

which actually decreases the readability of this driver.

Please fix this up.


* question about tcp_ack_update_window
From: hong liu @ 2009-09-17  3:24 UTC
  To: netdev

Hi,

In tcp_ack_update_window, we don't scale the window if it's a SYN packet.
This change was introduced by Kevin Lahey during 2.5.75
(http://oss.sgi.com/archives/netdev/2003-10/msg01391.html).
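
The check in question can be reduced to the following standalone sketch. `sketch_scaled_window` and its parameters are hypothetical names for illustration; the kernel works on `struct tcp_sock` and the TCP header directly.

```c
#include <stdint.h>

/*
 * Hypothetical reduction of the window computation: 'raw_window' is the
 * 16-bit window field already converted to host byte order, 'syn' is the
 * SYN flag of the segment, 'snd_wscale' the negotiated shift count.
 */
static uint32_t sketch_scaled_window(uint16_t raw_window, int syn,
				     unsigned int snd_wscale)
{
	uint32_t nwin = raw_window;

	/* RFC 1323: the window field in a SYN segment is never scaled. */
	if (!syn)
		nwin <<= snd_wscale;

	return nwin;
}
```

With a raw window of 1000 and a shift of 7, a non-SYN segment yields 128000 while a SYN (or SYN+ACK) segment yields 1000.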

tcp_ack_update_window is only called by tcp_ack, which is called by
tcp_rcv_synsent_state_process & tcp_rcv_established, and we change
snd_wnd back to th->window in tcp_rcv_synsent_state_process, so why do we
still need this check in tcp_ack_update_window?

I think we may need to add tp->max_window = tp->snd_wnd in
tcp_rcv_synsent_state_process if we remove the check in tcp_ack_update_window.

The only problem I can see is: the client entered the established state and sent
an ack (the last packet in the 3-way handshake) to the server, but the packet
got lost / arrived slightly late. Then the server resent a SYN+ACK packet,
so the client can get a SYN packet and call tcp_ack in tcp_rcv_established.

Is this the only concern?

Thanks,
Hong


* Re: [PATCHv5 3/3] vhost_net: a kernel-level virtio server
From: Gregory Haskins @ 2009-09-17  3:11 UTC
  To: Avi Kivity
  Cc: Michael S. Tsirkin, Ira W. Snyder, netdev, virtualization, kvm,
	linux-kernel, mingo, linux-mm, akpm, hpa, Rusty Russell, s.hetze,
	alacrityvm-devel
In-Reply-To: <4AB151D7.10402@redhat.com>


Avi Kivity wrote:
> On 09/16/2009 10:22 PM, Gregory Haskins wrote:
>> Avi Kivity wrote:
>>   
>>> On 09/16/2009 05:10 PM, Gregory Haskins wrote:
>>>     
>>>>> If kvm can do it, others can.
>>>>>
>>>>>          
>>>> The problem is that you seem to either hand-wave over details like
>>>> this,
>>>> or you give details that are pretty much exactly what vbus does
>>>> already.
>>>>    My point is that I've already sat down and thought about these
>>>> issues
>>>> and solved them in a freely available GPL'ed software package.
>>>>
>>>>        
>>> In the kernel.  IMO that's the wrong place for it.
>>>      
>> 3) "in-kernel": You can do something like virtio-net to vhost to
>> potentially meet some of the requirements, but not all.
>>
>> In order to fully meet (3), you would need to do some of that stuff you
>> mentioned in the last reply with muxing device-nr/reg-nr.  In addition,
>> we need to have a facility for mapping eventfds and establishing a
>> signaling mechanism (like PIO+qid), etc. KVM does this with
>> IRQFD/IOEVENTFD, but we don't have KVM in this case so it needs to be
>> invented.
>>    
> 
> irqfd/eventfd is the abstraction layer, it doesn't need to be reabstracted.

Not per se, but it needs to be interfaced.  How do I register that
eventfd with the fastpath in Ira's rig? How do I signal the eventfd
(x86->ppc, and ppc->x86)?
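
For readers unfamiliar with the primitive under discussion, a userspace sketch of plain eventfd signaling follows. It uses only eventfd(2), read(2) and write(2). It does not show KVM's irqfd/ioeventfd wiring or anything vbus-specific, and the helper names are made up for illustration.

```c
#include <stdint.h>
#include <sys/eventfd.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: add one event to the eventfd counter. */
static int sketch_signal(int efd)
{
	uint64_t one = 1;

	return write(efd, &one, sizeof(one)) == (ssize_t)sizeof(one) ? 0 : -1;
}

/* Hypothetical helper: fetch and reset the counter; returns 0 on error. */
static uint64_t sketch_consume(int efd)
{
	uint64_t count = 0;

	if (read(efd, &count, sizeof(count)) != (ssize_t)sizeof(count))
		return 0;
	return count;
}
```

Without EFD_SEMAPHORE the counter accumulates, so two signals followed by one read return 2. That coalescing only works between contexts that can share a file descriptor, which is the gap the questions above point at for a cross-board link.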

To take it to the next level, how do I organize that mechanism so that
it works for more than one IO-stream (e.g. address the various queues
within ethernet or a different device like the console)?  KVM has
IOEVENTFD and IRQFD managed with MSI and PIO.  This new rig does not
have the luxury of an established IO paradigm.

Is vbus the only way to implement a solution?  No.  But it is _a_ way,
and it's one that was specifically designed to solve this very problem
(as well as others).

(As an aside, note that you generally will want an abstraction on top of
irqfd/eventfd like shm-signal or virtqueues to do shared-memory based
event mitigation, but I digress.  That is a separate topic).

> 
>> To meet performance, this stuff has to be in kernel and there has to be
>> a way to manage it.
> 
> and management belongs in userspace.

vbus does not dictate where the management must be.  It's an extensible
framework, governed by what you plug into it (a la connectors and devices).

For instance, the vbus-kvm connector in alacrityvm chooses to put DEVADD
and DEVDROP hotswap events into the interrupt stream, because they are
simple and we already needed the interrupt stream anyway for fast-path.

As another example: venet chose to put ->call(MACQUERY) "config-space"
into its call namespace because it's simple, and we already need
->calls() for fastpath.  It therefore exports an attribute to sysfs that
allows the management app to set it.

I could likewise have designed the connector or device-model differently
as to keep the mac-address and hotswap-events somewhere else (QEMU/PCI
userspace) but this seems silly to me when they are so trivial, so I didn't.

> 
>> Since vbus was designed to do exactly that, this is
>> what I would advocate.  You could also reinvent these concepts and put
>> your own mux and mapping code in place, in addition to all the other
>> stuff that vbus does.  But I am not clear why anyone would want to.
>>    
> 
> Maybe they like their backward compatibility and Windows support.

This is really not relevant to this thread, since we are talking about
Ira's hardware.  But if you must bring this up, then I will reiterate
that you just design the connector to interface with QEMU+PCI and you
have that too if that was important to you.

But on that topic: Since you could consider KVM a "motherboard
manufacturer" of sorts (it just happens to be virtual hardware), I don't
know why KVM seems to consider itself the only motherboard manufacturer
in the world that has to make everything look legacy.  If a company like
ASUS wants to add some cutting edge IO controller/bus, they simply do
it.  Pretty much every product release may contain a different array of
devices, many of which are not backwards compatible with any prior
silicon.  The guy/gal installing Windows on that system may see a "?" in
device-manager until they load a driver that supports the new chip, and
subsequently it works.  It is certainly not a requirement to make said
chip somehow work with existing drivers/facilities on bare metal, per
se.  Why should virtual systems be different?

So, yeah, the current design of the vbus-kvm connector means I have to
provide a driver.  This is understood, and I have no problem with that.

The only thing that I would agree has to be backwards compatible is the
BIOS/boot function.  If you can't support running an image like the
Windows installer, you are hosed.  If you can't use your ethernet until
you get a chance to install a driver after the install completes, it's
just like most other systems in existence.  IOW: It's not a big deal.

For cases where the IO system is needed as part of the boot/install, you
provide BIOS and/or an install-disk support for it.

> 
>> So no, the kernel is not the wrong place for it.  It's the _only_ place
>> for it.  Otherwise, just use (1) and be done with it.
>>
>>    
> 
> I'm talking about the config stuff, not the data path.

As stated above, where config stuff lives is a function of what you
interface to vbus.  Data-path stuff must be in the kernel for
performance reasons, and this is what I was referring to.  I think we
are generally both in agreement, here.

What I was getting at is that you can't just hand-wave the datapath
stuff.  We do fast path in KVM with IRQFD/IOEVENTFD+PIO, and we do
device discovery/addressing with PCI.  Neither of those is available
here in Ira's case, yet the general concepts are needed.  Therefore, we
have to come up with something else.

> 
>>>   Further, if we adopt
>>> vbus, we drop compatibility with existing guests or have to support both
>>> vbus and virtio-pci.
>>>      
>> We already need to support both (at least to support Ira).  virtio-pci
>> doesn't work here.  Something else (vbus, or vbus-like) is needed.
>>    
> 
> virtio-ira.

Sure, virtio-ira and he is on his own to make a bus-model under that, or
virtio-vbus + vbus-ira-connector to use the vbus framework.  Either
model can work, I agree.

> 
>>>> So the question is: is your position that vbus is all wrong and you
>>>> wish
>>>> to create a new bus-like thing to solve the problem?
>>>>        
>>> I don't intend to create anything new, I am satisfied with virtio.  If
>>> it works for Ira, excellent.  If not, too bad.
>>>      
>> I think that about sums it up, then.
>>    
> 
> Yes.  I'm all for reusing virtio, but I'm not going to switch to vbus or
> support both for this esoteric use case.

With all due respect, no one asked you to.  This sub-thread was
originally about using vhost in Ira's rig.  When problems surfaced in
that proposed model, I highlighted that I had already addressed that
problem in vbus, and here we are.

> 
>>>> If so, how is it
>>>> different from what I've already done?  More importantly, what specific
>>>> objections do you have to what I've done, as perhaps they can be fixed
>>>> instead of starting over?
>>>>
>>>>        
>>> The two biggest objections are:
>>> - the host side is in the kernel
>>>      
>> As it needs to be.
>>    
> 
> vhost-net somehow manages to work without the config stuff in the kernel.

I was referring to data-path stuff, like signal and memory
configuration/routing.

As an aside, it should be noted that vhost under KVM has
IRQFD/IOEVENTFD, PCI-emulation, QEMU, etc to complement it and fill in
some of the pieces one needs for a complete solution.  Not all
environments have all of those pieces (nor should they), and those
pieces need to come from somewhere.

It should also be noted that what remains (config/management) after the
data-path stuff is laid out is actually quite simple.  It consists of
pretty much an enumerated list of device-ids within a container,
DEVADD(id), DEVDROP(id) events, and some sysfs attributes as defined on
a per-device basis (many of which are often needed regardless of whether
the "config-space" operation is handled in-kernel or not)

Therefore, the configuration aspect of the system does not necessitate a
complicated (e.g. full PCI emulation) or external (e.g. userspace)
component per se.  The parts of vbus that could be construed as
"management" are (afaict) built using accepted/best-practices for
managing arbitrary kernel subsystems (sysfs, configfs, ioctls, etc) so
there is nothing new or reasonably controversial there.  It is for this
reason that I think the objection to "in-kernel config" is unfounded.

Disagreements on this point may be settled by the connector design,
while still utilizing vbus, and thus retaining most of the other
benefits of using the vbus framework.  The connector ultimately dictates
how and what is exposed to the "guest".

> 
>> With all due respect, based on all of your comments in aggregate I
>> really do not think you are truly grasping what I am actually building
>> here.
>>    
> 
> Thanks.
> 
> 
> 
>>>> Bingo.  So now it's a question of do you want to write this layer from
>>>> scratch, or re-use my framework.
>>>>
>>>>        
>>> You will have to implement a connector or whatever for vbus as well.
>>> vbus has more layers so it's probably smaller for vbus.
>>>      
>> Bingo!
> 
> (addictive, isn't it)

Apparently.

> 
>> That is precisely the point.
>>
>> All the stuff for how to map eventfds, handle signal mitigation, demux
>> device/function pointers, isolation, etc, are built in.  All the
>> connector has to do is transport the 4-6 verbs and provide a memory
>> mapping/copy function, and the rest is reusable.  The device models
>> would then work in all environments unmodified, and likewise the
>> connectors could use all device-models unmodified.
>>    
> 
> Well, virtio has a similar abstraction on the guest side.  The host side
> abstraction is limited to signalling since all configuration is in
> userspace.  vhost-net ought to work for lguest and s390 without change.

But IIUC that is primarily because the revectoring work is already in
QEMU for virtio-u and it rides on that, right?  Not knocking that, that's
nice and a distinct advantage.  It should just be noted that it's based
on sunk-cost, and not truly free.  It's just already paid for, which is
different.  It also means it only works in environments based on QEMU,
which not all are (as evidenced by this sub-thread).

> 
>>> It was already implemented three times for virtio, so apparently that's
>>> extensible too.
>>>      
>> And to my point, I'm trying to commoditize as much of that process as
>> possible on both the front and backends (at least for cases where
>> performance matters) so that you don't need to reinvent the wheel for
>> each one.
>>    
> 
> Since you're interested in any-to-any connectors it makes sense to you. 
> I'm only interested in kvm-host-to-kvm-guest, so reducing the already
> minor effort to implement a new virtio binding has little appeal to me.
> 

Fair enough.

>>> You mean, if the x86 board was able to access the disks and dma into the
>>> ppc board's memory?  You'd run vhost-blk on x86 and virtio-net on ppc.
>>>      
>> But as we discussed, vhost doesn't work well if you try to run it on the
>> x86 side due to its assumptions about pageable "guest" memory, right?  So
>> is that even an option?  And even still, you would still need to solve
>> the aggregation problem so that multiple devices can coexist.
>>    
> 
> I don't know.  Maybe it can be made to work and maybe it cannot.  It
> probably can with some determined hacking.
> 

I guess you can say the same for any of the solutions.

Kind Regards,
-Greg


* [PATCH] ks8851_ml ethernet network driver
From: Greg KH @ 2009-09-17  2:38 UTC
  To: netdev, David S. Miller
  Cc: Li, Charles, Choi, David, Jeff Garzik, Stephen Hemminger

From: Choi, David <David.Choi@Micrel.Com>

This is a network driver for the ks8851 16bit MLL ethernet device.

Signed-off-by: David J. Choi <david.choi@micrel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>


---

 drivers/net/Kconfig      |    6 
 drivers/net/Makefile     |    1 
 drivers/net/ks8851_mll.c | 1701 +++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 1708 insertions(+)

--- a/drivers/net/Kconfig
+++ b/drivers/net/Kconfig
@@ -1738,6 +1738,12 @@ config KS8851
        help
          SPI driver for Micrel KS8851 SPI attached network chip.
 
+config KS8851_MLL
+	tristate "Micrel KSZ8851"
+	depends on HAS_IOMEM
+	help
+	  This platform driver is for Micrel KSZ8851 MLL chip.
+
 config VIA_RHINE
 	tristate "VIA Rhine support"
 	depends on NET_PCI && PCI
--- /dev/null
+++ b/drivers/net/ks8851_mll.c
@@ -0,0 +1,1701 @@
+/**
+ * drivers/net/ks8851_mll.c
+ * Copyright (c) 2009 Micrel Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
+ */
+
+/**
+ * Supports:
+ * KS8851 16bit MLL chip from Micrel Inc.
+ */
+
+#define DEBUG
+
+#include <linux/module.h>
+#include <linux/kernel.h>
+#include <linux/netdevice.h>
+#include <linux/etherdevice.h>
+#include <linux/ethtool.h>
+#include <linux/cache.h>
+#include <linux/crc32.h>
+#include <linux/mii.h>
+#include <linux/platform_device.h>
+#include <linux/delay.h>
+
+#define	DRV_NAME	"ks8851_mll"
+
+static u8 KS_DEFAULT_MAC_ADDRESS[] = { 0x00, 0x10, 0xA1, 0x86, 0x95, 0x11 };
+#define MAX_RECV_FRAMES			32
+#define MAX_BUF_SIZE			2048
+#define TX_BUF_SIZE			2000
+#define RX_BUF_SIZE			2000
+
+#define KS_CCR				0x08
+#define CCR_EEPROM			(1 << 9)
+#define CCR_SPI				(1 << 8)
+#define CCR_8BIT			(1 << 7)
+#define CCR_16BIT			(1 << 6)
+#define CCR_32BIT			(1 << 5)
+#define CCR_SHARED			(1 << 4)
+#define CCR_32PIN			(1 << 0)
+
+/* MAC address registers */
+#define KS_MARL				0x10
+#define KS_MARM				0x12
+#define KS_MARH				0x14
+
+#define KS_OBCR				0x20
+#define OBCR_ODS_16MA			(1 << 6)
+
+#define KS_EEPCR			0x22
+#define EEPCR_EESA			(1 << 4)
+#define EEPCR_EESB			(1 << 3)
+#define EEPCR_EEDO			(1 << 2)
+#define EEPCR_EESCK			(1 << 1)
+#define EEPCR_EECS			(1 << 0)
+
+#define KS_MBIR				0x24
+#define MBIR_TXMBF			(1 << 12)
+#define MBIR_TXMBFA			(1 << 11)
+#define MBIR_RXMBF			(1 << 4)
+#define MBIR_RXMBFA			(1 << 3)
+
+#define KS_GRR				0x26
+#define GRR_QMU				(1 << 1)
+#define GRR_GSR				(1 << 0)
+
+#define KS_WFCR				0x2A
+#define WFCR_MPRXE			(1 << 7)
+#define WFCR_WF3E			(1 << 3)
+#define WFCR_WF2E			(1 << 2)
+#define WFCR_WF1E			(1 << 1)
+#define WFCR_WF0E			(1 << 0)
+
+#define KS_WF0CRC0			0x30
+#define KS_WF0CRC1			0x32
+#define KS_WF0BM0			0x34
+#define KS_WF0BM1			0x36
+#define KS_WF0BM2			0x38
+#define KS_WF0BM3			0x3A
+
+#define KS_WF1CRC0			0x40
+#define KS_WF1CRC1			0x42
+#define KS_WF1BM0			0x44
+#define KS_WF1BM1			0x46
+#define KS_WF1BM2			0x48
+#define KS_WF1BM3			0x4A
+
+#define KS_WF2CRC0			0x50
+#define KS_WF2CRC1			0x52
+#define KS_WF2BM0			0x54
+#define KS_WF2BM1			0x56
+#define KS_WF2BM2			0x58
+#define KS_WF2BM3			0x5A
+
+#define KS_WF3CRC0			0x60
+#define KS_WF3CRC1			0x62
+#define KS_WF3BM0			0x64
+#define KS_WF3BM1			0x66
+#define KS_WF3BM2			0x68
+#define KS_WF3BM3			0x6A
+
+#define KS_TXCR				0x70
+#define TXCR_TCGICMP			(1 << 8)
+#define TXCR_TCGUDP			(1 << 7)
+#define TXCR_TCGTCP			(1 << 6)
+#define TXCR_TCGIP			(1 << 5)
+#define TXCR_FTXQ			(1 << 4)
+#define TXCR_TXFCE			(1 << 3)
+#define TXCR_TXPE			(1 << 2)
+#define TXCR_TXCRC			(1 << 1)
+#define TXCR_TXE			(1 << 0)
+
+#define KS_TXSR				0x72
+#define TXSR_TXLC			(1 << 13)
+#define TXSR_TXMC			(1 << 12)
+#define TXSR_TXFID_MASK			(0x3f << 0)
+#define TXSR_TXFID_SHIFT		(0)
+#define TXSR_TXFID_GET(_v)		(((_v) >> 0) & 0x3f)
+
+
+#define KS_RXCR1			0x74
+#define RXCR1_FRXQ			(1 << 15)
+#define RXCR1_RXUDPFCC			(1 << 14)
+#define RXCR1_RXTCPFCC			(1 << 13)
+#define RXCR1_RXIPFCC			(1 << 12)
+#define RXCR1_RXPAFMA			(1 << 11)
+#define RXCR1_RXFCE			(1 << 10)
+#define RXCR1_RXEFE			(1 << 9)
+#define RXCR1_RXMAFMA			(1 << 8)
+#define RXCR1_RXBE			(1 << 7)
+#define RXCR1_RXME			(1 << 6)
+#define RXCR1_RXUE			(1 << 5)
+#define RXCR1_RXAE			(1 << 4)
+#define RXCR1_RXINVF			(1 << 1)
+#define RXCR1_RXE			(1 << 0)
+#define RXCR1_FILTER_MASK    		(RXCR1_RXINVF | RXCR1_RXAE | \
+					 RXCR1_RXMAFMA | RXCR1_RXPAFMA)
+
+#define KS_RXCR2			0x76
+#define RXCR2_SRDBL_MASK		(0x7 << 5)
+#define RXCR2_SRDBL_SHIFT		(5)
+#define RXCR2_SRDBL_4B			(0x0 << 5)
+#define RXCR2_SRDBL_8B			(0x1 << 5)
+#define RXCR2_SRDBL_16B			(0x2 << 5)
+#define RXCR2_SRDBL_32B			(0x3 << 5)
+/* #define RXCR2_SRDBL_FRAME		(0x4 << 5) */
+#define RXCR2_IUFFP			(1 << 4)
+#define RXCR2_RXIUFCEZ			(1 << 3)
+#define RXCR2_UDPLFE			(1 << 2)
+#define RXCR2_RXICMPFCC			(1 << 1)
+#define RXCR2_RXSAF			(1 << 0)
+
+#define KS_TXMIR			0x78
+
+#define KS_RXFHSR			0x7C
+#define RXFSHR_RXFV			(1 << 15)
+#define RXFSHR_RXICMPFCS		(1 << 13)
+#define RXFSHR_RXIPFCS			(1 << 12)
+#define RXFSHR_RXTCPFCS			(1 << 11)
+#define RXFSHR_RXUDPFCS			(1 << 10)
+#define RXFSHR_RXBF			(1 << 7)
+#define RXFSHR_RXMF			(1 << 6)
+#define RXFSHR_RXUF			(1 << 5)
+#define RXFSHR_RXMR			(1 << 4)
+#define RXFSHR_RXFT			(1 << 3)
+#define RXFSHR_RXFTL			(1 << 2)
+#define RXFSHR_RXRF			(1 << 1)
+#define RXFSHR_RXCE			(1 << 0)
+#define	RXFSHR_ERR			(RXFSHR_RXCE | RXFSHR_RXRF |\
+					RXFSHR_RXFTL | RXFSHR_RXMR |\
+					RXFSHR_RXICMPFCS | RXFSHR_RXIPFCS |\
+					RXFSHR_RXTCPFCS)
+#define KS_RXFHBCR			0x7E
+#define RXFHBCR_CNT_MASK		0x0FFF
+
+#define KS_TXQCR			0x80
+#define TXQCR_AETFE			(1 << 2)
+#define TXQCR_TXQMAM			(1 << 1)
+#define TXQCR_METFE			(1 << 0)
+
+#define KS_RXQCR			0x82
+#define RXQCR_RXDTTS			(1 << 12)
+#define RXQCR_RXDBCTS			(1 << 11)
+#define RXQCR_RXFCTS			(1 << 10)
+#define RXQCR_RXIPHTOE			(1 << 9)
+#define RXQCR_RXDTTE			(1 << 7)
+#define RXQCR_RXDBCTE			(1 << 6)
+#define RXQCR_RXFCTE			(1 << 5)
+#define RXQCR_ADRFE			(1 << 4)
+#define RXQCR_SDA			(1 << 3)
+#define RXQCR_RRXEF			(1 << 0)
+#define RXQCR_CMD_CNTL                	(RXQCR_RXFCTE|RXQCR_ADRFE)
+
+#define KS_TXFDPR			0x84
+#define TXFDPR_TXFPAI			(1 << 14)
+#define TXFDPR_TXFP_MASK		(0x7ff << 0)
+#define TXFDPR_TXFP_SHIFT		(0)
+
+#define KS_RXFDPR			0x86
+#define RXFDPR_RXFPAI			(1 << 14)
+
+#define KS_RXDTTR			0x8C
+#define KS_RXDBCTR			0x8E
+
+#define KS_IER				0x90
+#define KS_ISR				0x92
+#define IRQ_LCI				(1 << 15)
+#define IRQ_TXI				(1 << 14)
+#define IRQ_RXI				(1 << 13)
+#define IRQ_RXOI			(1 << 11)
+#define IRQ_TXPSI			(1 << 9)
+#define IRQ_RXPSI			(1 << 8)
+#define IRQ_TXSAI			(1 << 6)
+#define IRQ_RXWFDI			(1 << 5)
+#define IRQ_RXMPDI			(1 << 4)
+#define IRQ_LDI				(1 << 3)
+#define IRQ_EDI				(1 << 2)
+#define IRQ_SPIBEI			(1 << 1)
+#define IRQ_DEDI			(1 << 0)
+
+#define KS_RXFCTR			0x9C
+#define RXFCTR_THRESHOLD_MASK     	0x00FF
+
+#define KS_RXFC				0x9D
+#define RXFCTR_RXFC_MASK		(0xff << 8)
+#define RXFCTR_RXFC_SHIFT		(8)
+#define RXFCTR_RXFC_GET(_v)		(((_v) >> 8) & 0xff)
+#define RXFCTR_RXFCT_MASK		(0xff << 0)
+#define RXFCTR_RXFCT_SHIFT		(0)
+
+#define KS_TXNTFSR			0x9E
+
+#define KS_MAHTR0			0xA0
+#define KS_MAHTR1			0xA2
+#define KS_MAHTR2			0xA4
+#define KS_MAHTR3			0xA6
+
+#define KS_FCLWR			0xB0
+#define KS_FCHWR			0xB2
+#define KS_FCOWR			0xB4
+
+#define KS_CIDER			0xC0
+#define CIDER_ID			0x8870
+#define CIDER_REV_MASK			(0x7 << 1)
+#define CIDER_REV_SHIFT			(1)
+#define CIDER_REV_GET(_v)		(((_v) >> 1) & 0x7)
+
+#define KS_CGCR				0xC6
+#define KS_IACR				0xC8
+#define IACR_RDEN			(1 << 12)
+#define IACR_TSEL_MASK			(0x3 << 10)
+#define IACR_TSEL_SHIFT			(10)
+#define IACR_TSEL_MIB			(0x3 << 10)
+#define IACR_ADDR_MASK			(0x1f << 0)
+#define IACR_ADDR_SHIFT			(0)
+
+#define KS_IADLR			0xD0
+#define KS_IAHDR			0xD2
+
+#define KS_PMECR			0xD4
+#define PMECR_PME_DELAY			(1 << 14)
+#define PMECR_PME_POL			(1 << 12)
+#define PMECR_WOL_WAKEUP		(1 << 11)
+#define PMECR_WOL_MAGICPKT		(1 << 10)
+#define PMECR_WOL_LINKUP		(1 << 9)
+#define PMECR_WOL_ENERGY		(1 << 8)
+#define PMECR_AUTO_WAKE_EN		(1 << 7)
+#define PMECR_WAKEUP_NORMAL		(1 << 6)
+#define PMECR_WKEVT_MASK		(0xf << 2)
+#define PMECR_WKEVT_SHIFT		(2)
+#define PMECR_WKEVT_GET(_v)		(((_v) >> 2) & 0xf)
+#define PMECR_WKEVT_ENERGY		(0x1 << 2)
+#define PMECR_WKEVT_LINK		(0x2 << 2)
+#define PMECR_WKEVT_MAGICPKT		(0x4 << 2)
+#define PMECR_WKEVT_FRAME		(0x8 << 2)
+#define PMECR_PM_MASK			(0x3 << 0)
+#define PMECR_PM_SHIFT			(0)
+#define PMECR_PM_NORMAL			(0x0 << 0)
+#define PMECR_PM_ENERGY			(0x1 << 0)
+#define PMECR_PM_SOFTDOWN		(0x2 << 0)
+#define PMECR_PM_POWERSAVE		(0x3 << 0)
+
+/* Standard MII PHY data */
+#define KS_P1MBCR			0xE4
+#define P1MBCR_FORCE_FDX		(1 << 8)
+
+#define KS_P1MBSR			0xE6
+#define P1MBSR_AN_COMPLETE		(1 << 5)
+#define P1MBSR_AN_CAPABLE		(1 << 3)
+#define P1MBSR_LINK_UP			(1 << 2)
+
+#define KS_PHY1ILR			0xE8
+#define KS_PHY1IHR			0xEA
+#define KS_P1ANAR			0xEC
+#define KS_P1ANLPR			0xEE
+
+#define KS_P1SCLMD			0xF4
+#define P1SCLMD_LEDOFF			(1 << 15)
+#define P1SCLMD_TXIDS			(1 << 14)
+#define P1SCLMD_RESTARTAN		(1 << 13)
+#define P1SCLMD_DISAUTOMDIX		(1 << 10)
+#define P1SCLMD_FORCEMDIX		(1 << 9)
+#define P1SCLMD_AUTONEGEN		(1 << 7)
+#define P1SCLMD_FORCE100		(1 << 6)
+#define P1SCLMD_FORCEFDX		(1 << 5)
+#define P1SCLMD_ADV_FLOW		(1 << 4)
+#define P1SCLMD_ADV_100BT_FDX		(1 << 3)
+#define P1SCLMD_ADV_100BT_HDX		(1 << 2)
+#define P1SCLMD_ADV_10BT_FDX		(1 << 1)
+#define P1SCLMD_ADV_10BT_HDX		(1 << 0)
+
+#define KS_P1CR				0xF6
+#define P1CR_HP_MDIX			(1 << 15)
+#define P1CR_REV_POL			(1 << 13)
+#define P1CR_OP_100M			(1 << 10)
+#define P1CR_OP_FDX			(1 << 9)
+#define P1CR_OP_MDI			(1 << 7)
+#define P1CR_AN_DONE			(1 << 6)
+#define P1CR_LINK_GOOD			(1 << 5)
+#define P1CR_PNTR_FLOW			(1 << 4)
+#define P1CR_PNTR_100BT_FDX		(1 << 3)
+#define P1CR_PNTR_100BT_HDX		(1 << 2)
+#define P1CR_PNTR_10BT_FDX		(1 << 1)
+#define P1CR_PNTR_10BT_HDX		(1 << 0)
+
+/* TX Frame control */
+
+#define TXFR_TXIC			(1 << 15)
+#define TXFR_TXFID_MASK			(0x3f << 0)
+#define TXFR_TXFID_SHIFT		(0)
+
+#define KS_P1SR				0xF8
+#define P1SR_HP_MDIX			(1 << 15)
+#define P1SR_REV_POL			(1 << 13)
+#define P1SR_OP_100M			(1 << 10)
+#define P1SR_OP_FDX			(1 << 9)
+#define P1SR_OP_MDI			(1 << 7)
+#define P1SR_AN_DONE			(1 << 6)
+#define P1SR_LINK_GOOD			(1 << 5)
+#define P1SR_PNTR_FLOW			(1 << 4)
+#define P1SR_PNTR_100BT_FDX		(1 << 3)
+#define P1SR_PNTR_100BT_HDX		(1 << 2)
+#define P1SR_PNTR_10BT_FDX		(1 << 1)
+#define P1SR_PNTR_10BT_HDX		(1 << 0)
+
+#define	ENUM_BUS_NONE			0
+#define	ENUM_BUS_8BIT			1
+#define	ENUM_BUS_16BIT			2
+#define	ENUM_BUS_32BIT			3
+
+#define MAX_MCAST_LST			32
+#define HW_MCAST_SIZE			8
+#define MAC_ADDR_LEN			6
+
+/**
+ * union ks_tx_hdr - tx header data
+ * @txb: The header as bytes
+ * @txw: The header as 16bit, little-endian words
+ *
+ * A dual representation of the tx header data to allow
+ * access to individual bytes, and to allow 16bit accesses
+ * with 16bit alignment.
+ */
+union ks_tx_hdr {
+	u8      txb[4];
+	__le16  txw[2];
+};
+
+/**
+ * struct ks_net - KS8851 driver private data
+ * @netdev	: The network device we're bound to
+ * @hw_addr	: start address of data register.
+ * @hw_addr_cmd	: start address of command register.
+ * @txh    	: temporary buffer to save status/length.
+ * @lock	: Lock to ensure that the device is not accessed when busy.
+ * @pdev	: Pointer to platform device.
+ * @mii		: The MII state information for the mii calls.
+ * @frame_head_info   	: frame header information for multi-pkt rx.
+ * @statelock	: Lock on this structure for tx list.
+ * @msg_enable	: The message flags controlling driver output (see ethtool).
+ * @frame_cnt  	: number of frames received.
+ * @bus_width  	: i/o bus width.
+ * @irq    	: irq number assigned to this device.
+ * @rc_rxqcr	: Cached copy of KS_RXQCR.
+ * @rc_txcr	: Cached copy of KS_TXCR.
+ * @rc_ier	: Cached copy of KS_IER.
+ * @sharedbus  	: Multiplexed (addr and data bus) mode indicator.
+ * @cmd_reg_cache	: command register cached.
+ * @cmd_reg_cache_int	: command register cached. Used in the irq handler.
+ * @promiscuous	: promiscuous mode indicator.
+ * @all_mcast  	: multicast indicator.
+ * @mcast_lst_size   	: size of multicast list.
+ * @mcast_lst    	: multicast list.
+ * @mcast_bits    	: multicast enabled.
+ * @mac_addr   		: MAC address assigned to this device.
+ * @fid    		: frame id.
+ * @extra_byte    	: number of extra bytes prepended to rx pkt.
+ * @enabled    		: indicator this device works.
+ *
+ * The @lock ensures that the chip is protected when certain operations are
+ * in progress. When the read or write packet transfer is in progress, most
+ * of the chip registers are not accessible until the transfer is finished and
+ * the DMA has been de-asserted.
+ *
+ * The @statelock is used to protect information in the structure which may
+ * need to be accessed via several sources, such as the network driver layer
+ * or one of the work queues.
+ *
+ */
+#define MALLOC(x)		kmalloc(x, GFP_KERNEL)
+
+/* Receive multiplex framer header info */
+struct type_frame_head {
+	u16	sts;         /* Frame status */
+	u16	len;         /* Byte count */
+};
+
+struct ks_net {
+	struct net_device	*netdev;
+	void __iomem    	*hw_addr;
+	void __iomem    	*hw_addr_cmd;
+	union ks_tx_hdr		txh ____cacheline_aligned;
+	struct mutex      	lock; /* guards chip register access */
+	struct platform_device *pdev;
+	struct mii_if_info	mii;
+	struct type_frame_head	*frame_head_info;
+	spinlock_t		statelock;
+	u32			msg_enable;
+	u32			frame_cnt;
+	int			bus_width;
+	int             	irq;
+
+	u16			rc_rxqcr;
+	u16			rc_txcr;
+	u16			rc_ier;
+	u16			sharedbus;
+	u16			cmd_reg_cache;
+	u16			cmd_reg_cache_int;
+	u16			promiscuous;
+	u16			all_mcast;
+	u16			mcast_lst_size;
+	u8			mcast_lst[MAX_MCAST_LST][MAC_ADDR_LEN];
+	u8			mcast_bits[HW_MCAST_SIZE];
+	u8			mac_addr[6];
+	u8                      fid;
+	u8			extra_byte;
+	u8			enabled;
+};
+
+static int msg_enable;
+
+#define ks_info(_ks, _msg...) dev_info(&(_ks)->pdev->dev, _msg)
+#define ks_warn(_ks, _msg...) dev_warn(&(_ks)->pdev->dev, _msg)
+#define ks_dbg(_ks, _msg...) dev_dbg(&(_ks)->pdev->dev, _msg)
+#define ks_err(_ks, _msg...) dev_err(&(_ks)->pdev->dev, _msg)
+
+#define BE3             0x8000      /* Byte Enable 3 */
+#define BE2             0x4000      /* Byte Enable 2 */
+#define BE1             0x2000      /* Byte Enable 1 */
+#define BE0             0x1000      /* Byte Enable 0 */
+
+/**
+ * register read/write calls.
+ *
+ * All these calls issue transactions to access the chip's registers. They
+ * all require that the necessary lock is held to prevent accesses when the
+ * chip is busy transferring packet data (RX/TX FIFO accesses).
+ */
+
+/**
+ * ks_rdreg8 - read 8 bit register from device
+ * @ks	  : The chip information
+ * @offset: The register address
+ *
+ * Read an 8bit register from the chip, returning the result
+ */
+static u8 ks_rdreg8(struct ks_net *ks, int offset)
+{
+	u16 data;
+	u8 shift_bit = offset & 0x03;
+	u8 shift_data = (offset & 1) << 3;
+	ks->cmd_reg_cache = (u16) offset | (u16)(BE0 << shift_bit);
+	iowrite16(ks->cmd_reg_cache, ks->hw_addr_cmd);
+	data  = ioread16(ks->hw_addr);
+	return (u8)(data >> shift_data);
+}
+
+/**
+ * ks_rdreg16 - read 16 bit register from device
+ * @ks	  : The chip information
+ * @offset: The register address
+ *
+ * Read a 16bit register from the chip, returning the result
+ */
+
+static u16 ks_rdreg16(struct ks_net *ks, int offset)
+{
+	ks->cmd_reg_cache = (u16)offset | ((BE1 | BE0) << (offset & 0x02));
+	iowrite16(ks->cmd_reg_cache, ks->hw_addr_cmd);
+	return ioread16(ks->hw_addr);
+}
+
+/**
+ * ks_wrreg8 - write 8bit register value to chip
+ * @ks: The chip information
+ * @offset: The register address
+ * @value: The value to write
+ *
+ */
+static void ks_wrreg8(struct ks_net *ks, int offset, u8 value)
+{
+	u8  shift_bit = (offset & 0x03);
+	u16 value_write = (u16)(value << ((offset & 1) << 3));
+	ks->cmd_reg_cache = (u16)offset | (BE0 << shift_bit);
+	iowrite16(ks->cmd_reg_cache, ks->hw_addr_cmd);
+	iowrite16(value_write, ks->hw_addr);
+}
+
+/**
+ * ks_wrreg16 - write 16bit register value to chip
+ * @ks: The chip information
+ * @offset: The register address
+ * @value: The value to write
+ *
+ */
+
+static void ks_wrreg16(struct ks_net *ks, int offset, u16 value)
+{
+	ks->cmd_reg_cache = (u16)offset | ((BE1 | BE0) << (offset & 0x02));
+	iowrite16(ks->cmd_reg_cache, ks->hw_addr_cmd);
+	iowrite16(value, ks->hw_addr);
+}
+
+/**
+ * ks_inblk - read a block of data from QMU. This is called after pseudo DMA mode enabled.
+ * @ks: The chip state
+ * @wptr: buffer address to save data
+ * @len: length in bytes to read
+ *
+ */
+static inline void ks_inblk(struct ks_net *ks, u16 *wptr, u32 len)
+{
+	void __iomem *data_port = ks->hw_addr;
+	len >>= 1;
+	do {
+		*wptr++ = (u16)ioread16(data_port);
+	} while (--len);
+}
+
+/**
+ * ks_outblk - write data to QMU. This is called after pseudo DMA mode enabled.
+ * @ks: The chip information
+ * @wptr: buffer address
+ * @len: length in bytes to write
+ *
+ */
+static inline void ks_outblk(struct ks_net *ks, u16 *wptr, u32 len)
+{
+	void __iomem *data_port = ks->hw_addr;
+	len >>= 1;
+	do {
+		iowrite16(*wptr++, data_port);
+	} while (--len);
+}
+
+/**
+ * ks_tx_fifo_space - return the available hardware buffer size.
+ * @ks: The chip information
+ *
+ */
+static inline u16 ks_tx_fifo_space(struct ks_net *ks)
+{
+	return ks_rdreg16(ks, KS_TXMIR) & 0x1fff;
+}
+
+/**
+ * ks_save_cmd_reg - save the command register from the cache.
+ * @ks: The chip information
+ *
+ */
+static inline void ks_save_cmd_reg(struct ks_net *ks)
+{
+	/* The KS8851 MLL cannot reliably read back the command register,
+	 * so rely on software to save its contents.
+	 */
+	ks->cmd_reg_cache_int = ks->cmd_reg_cache;
+}
+
+/**
+ * ks_restore_cmd_reg - restore the command register from the cache and
+ * 	write to hardware register.
+ * @ks: The chip information
+ *
+ */
+static inline void ks_restore_cmd_reg(struct ks_net *ks)
+{
+	ks->cmd_reg_cache = ks->cmd_reg_cache_int;
+	iowrite16(ks->cmd_reg_cache, ks->hw_addr_cmd);
+}
+
+/**
+ * ks_set_powermode - set power mode of the device
+ * @ks: The chip information
+ * @pwrmode: The power mode value to write to KS_PMECR.
+ *
+ * Change the power mode of the chip.
+ */
+static void ks_set_powermode(struct ks_net *ks, unsigned pwrmode)
+{
+	unsigned pmecr;
+
+	if (netif_msg_hw(ks))
+		ks_dbg(ks, "setting power mode %d\n", pwrmode);
+
+	ks_rdreg16(ks, KS_GRR);
+	pmecr = ks_rdreg16(ks, KS_PMECR);
+	pmecr &= ~PMECR_PM_MASK;
+	pmecr |= pwrmode;
+
+	ks_wrreg16(ks, KS_PMECR, pmecr);
+}
+
+/**
+ * ks_read_config - read chip configuration of bus width.
+ * @ks: The chip information
+ *
+ */
+static void ks_read_config(struct ks_net *ks)
+{
+	u16 reg_data = 0;
+
+	/* Regardless of bus width, 8 bit read should always work.*/
+	reg_data = ks_rdreg8(ks, KS_CCR) & 0x00FF;
+	reg_data |= ks_rdreg8(ks, KS_CCR+1) << 8;
+
+	/* addr/data bus are multiplexed */
+	ks->sharedbus = (reg_data & CCR_SHARED) == CCR_SHARED;
+
+	/* Data read from the QMU is preceded by garbage bytes;
+	 * how many depends on the bus width.
+	 */
+
+	if (reg_data & CCR_8BIT) {
+		ks->bus_width = ENUM_BUS_8BIT;
+		ks->extra_byte = 1;
+	} else if (reg_data & CCR_16BIT) {
+		ks->bus_width = ENUM_BUS_16BIT;
+		ks->extra_byte = 2;
+	} else {
+		ks->bus_width = ENUM_BUS_32BIT;
+		ks->extra_byte = 4;
+	}
+}
+
+/**
+ * ks_soft_reset - issue one of the soft resets to the device
+ * @ks: The device state.
+ * @op: The bit(s) to set in the GRR
+ *
+ * Issue the relevant soft-reset command to the device's GRR register
+ * specified by @op.
+ *
+ * Note, the delays are in there as a caution to ensure that the reset
+ * has time to take effect and then complete. Since the datasheet does
+ * not currently specify the exact sequence, we have chosen something
+ * that seems to work with our device.
+ */
+static void ks_soft_reset(struct ks_net *ks, unsigned op)
+{
+	/* Disable interrupt first */
+	ks_wrreg16(ks, KS_IER, 0x0000);
+	ks_wrreg16(ks, KS_GRR, op);
+	mdelay(10);	/* wait a short time to effect reset */
+	ks_wrreg16(ks, KS_GRR, 0);
+	mdelay(1);	/* wait for condition to clear */
+}
+
+
+/**
+ * ks_read_qmu - read 1 pkt data from the QMU.
+ * @ks: The chip information
+ * @buf: buffer address to save 1 pkt
+ * @len: Pkt length
+ * Here is the sequence to read 1 pkt:
+ *	1. set pseudo DMA mode
+ *	2. read prepend data
+ *	3. read pkt data
+ *	4. reset pseudo DMA mode
+ */
+static inline void ks_read_qmu(struct ks_net *ks, u16 *buf, u32 len)
+{
+	u32 r =  ks->extra_byte & 0x1 ;
+	u32 w = ks->extra_byte - r;
+
+	/* 1. set pseudo DMA mode */
+	ks_wrreg16(ks, KS_RXFDPR, RXFDPR_RXFPAI);
+	ks_wrreg8(ks, KS_RXQCR, (ks->rc_rxqcr | RXQCR_SDA) & 0xff);
+
+	/* 2. read prepend data */
+	/**
+	 * read 4 + extra bytes and discard them.
+	 * extra bytes for dummy, 2 for status, 2 for len
+	 */
+
+	/* r is nonzero only on an 8-bit bus; use likely(r) there */
+	if (unlikely(r))
+		ioread8(ks->hw_addr);
+	ks_inblk(ks, buf, w + 2 + 2);
+
+	/* 3. read pkt data */
+	ks_inblk(ks, buf, ALIGN(len, 4));
+
+	/* 4. reset pseudo DMA mode */
+	ks_wrreg8(ks, KS_RXQCR, ks->rc_rxqcr);
+}
+
+/**
+ * ks_rcv - read multiple pkts data from the QMU.
+ * @ks: The chip information
+ * @netdev: The network device the pkts are delivered to.
+ *
+ * Read all of the header information before reading any pkt content.
+ * Leaving only part of the pkts in the QMU after issuing the
+ * interrupt ack is not allowed.
+ */
+static void ks_rcv(struct ks_net *ks, struct net_device *netdev)
+{
+	u32	i;
+	struct type_frame_head *frame_hdr = ks->frame_head_info;
+	struct sk_buff *skb;
+
+	ks->frame_cnt = ks_rdreg16(ks, KS_RXFCTR) >> 8;
+
+	/* read all header information */
+	for (i = 0; i < ks->frame_cnt; i++) {
+		/* Checking Received packet status */
+		frame_hdr->sts = ks_rdreg16(ks, KS_RXFHSR);
+		/* Get packet len from hardware */
+		frame_hdr->len = ks_rdreg16(ks, KS_RXFHBCR);
+		frame_hdr++;
+	}
+
+	frame_hdr = ks->frame_head_info;
+	while (ks->frame_cnt--) {
+		skb = dev_alloc_skb(frame_hdr->len + 16);
+		if (likely(skb && (frame_hdr->sts & RXFSHR_RXFV) &&
+			(frame_hdr->len < RX_BUF_SIZE) && frame_hdr->len)) {
+			skb_reserve(skb, 2);
+			/* read data block including CRC 4 bytes */
+			ks_read_qmu(ks, (u16 *)skb->data, frame_hdr->len + 4);
+			skb_put(skb, frame_hdr->len);
+			skb->dev = netdev;
+			skb->protocol = eth_type_trans(skb, netdev);
+			netif_rx(skb);
+		} else {
+			printk(KERN_ERR "%s: err:skb alloc\n", __func__);
+			ks_wrreg16(ks, KS_RXQCR, (ks->rc_rxqcr | RXQCR_RRXEF));
+			if (skb)
+				dev_kfree_skb_irq(skb);
+		}
+		frame_hdr++;
+	}
+}
+
+/**
+ * ks_update_link_status - link status update.
+ * @netdev: The network device whose carrier state is updated.
+ * @ks: The chip information
+ *
+ */
+
+static void ks_update_link_status(struct net_device *netdev, struct ks_net *ks)
+{
+	/* check the status of the link */
+	u32 link_up_status;
+	if (ks_rdreg16(ks, KS_P1SR) & P1SR_LINK_GOOD) {
+		netif_carrier_on(netdev);
+		link_up_status = true;
+	} else {
+		netif_carrier_off(netdev);
+		link_up_status = false;
+	}
+	if (netif_msg_link(ks))
+		ks_dbg(ks, "%s: %s\n",
+			__func__, link_up_status ? "UP" : "DOWN");
+}
+
+/**
+ * ks_irq - device interrupt handler
+ * @irq: Interrupt number passed from the IRQ handler.
+ * @pw: The private word passed to request_irq(), our struct ks_net.
+ *
+ * This is the handler invoked to find out what happened
+ *
+ * Read the interrupt status, work out what needs to be done and then clear
+ * any of the interrupts that are not needed.
+ */
+
+static irqreturn_t ks_irq(int irq, void *pw)
+{
+	struct ks_net *ks = pw;
+	struct net_device *netdev = ks->netdev;
+	u16 status;
+
+	/* this must be the first call in the IRQ handler */
+	ks_save_cmd_reg(ks);
+
+	status = ks_rdreg16(ks, KS_ISR);
+	ks_wrreg16(ks, KS_ISR, status);
+
+	if (likely(status & IRQ_RXI))
+		ks_rcv(ks, netdev);
+
+	if (unlikely(status & IRQ_LCI))
+		ks_update_link_status(netdev, ks);
+
+	if (unlikely(status & IRQ_TXI))
+		netif_wake_queue(netdev);
+
+	if (unlikely(status & IRQ_LDI)) {
+
+		u16 pmecr = ks_rdreg16(ks, KS_PMECR);
+		pmecr &= ~PMECR_WKEVT_MASK;
+		ks_wrreg16(ks, KS_PMECR, pmecr | PMECR_WKEVT_LINK);
+	}
+
+	/* this must be the last call in the IRQ handler */
+	ks_restore_cmd_reg(ks);
+	return IRQ_HANDLED;
+}
+
+
+/**
+ * ks_net_open - open network device
+ * @netdev: The network device being opened.
+ *
+ * Called when the network device is marked active, such as a user executing
+ * 'ifconfig up' on the device.
+ */
+static int ks_net_open(struct net_device *netdev)
+{
+	struct ks_net *ks = netdev_priv(netdev);
+	int err;
+
+#define	KS_INT_FLAGS	(IRQF_DISABLED|IRQF_TRIGGER_LOW)
+	/* lock the card, even if we may not actually do anything
+	 * else at the moment.
+	 */
+	mutex_lock(&ks->lock);
+
+	if (netif_msg_ifup(ks))
+		ks_dbg(ks, "%s - entry\n", __func__);
+
+	/* hook up the interrupt handler */
+	err = request_irq(ks->irq, ks_irq, KS_INT_FLAGS, DRV_NAME, ks);
+
+	if (err) {
+		printk(KERN_ERR "Failed to request IRQ: %d: %d\n",
+			ks->irq, err);
+		/* do not leave the lock held on the error path */
+		mutex_unlock(&ks->lock);
+		return err;
+	}
+
+	if (netif_msg_ifup(ks))
+		ks_dbg(ks, "network device %s up\n", netdev->name);
+
+	mutex_unlock(&ks->lock);
+
+	return 0;
+}
+
+/**
+ * ks_net_stop - close network device
+ * @netdev: The device being closed.
+ *
+ * Called to close down a network device which has been active. Cancel any
+ * work, shutdown the RX and TX process and then place the chip into a low
+ * power state whilst it is not being used.
+ */
+static int ks_net_stop(struct net_device *netdev)
+{
+	struct ks_net *ks = netdev_priv(netdev);
+
+	if (netif_msg_ifdown(ks))
+		ks_info(ks, "%s: shutting down\n", netdev->name);
+
+	netif_stop_queue(netdev);
+
+	kfree(ks->frame_head_info);
+
+	mutex_lock(&ks->lock);
+
+	/* turn off the IRQs and ack any outstanding */
+	ks_wrreg16(ks, KS_IER, 0x0000);
+	ks_wrreg16(ks, KS_ISR, 0xffff);
+
+	/* shutdown RX process */
+	ks_wrreg16(ks, KS_RXCR1, 0x0000);
+
+	/* shutdown TX process */
+	ks_wrreg16(ks, KS_TXCR, 0x0000);
+
+	/* set powermode to soft power down to save power */
+	ks_set_powermode(ks, PMECR_PM_SOFTDOWN);
+	free_irq(ks->irq, ks);	/* dev_id must match request_irq() */
+	mutex_unlock(&ks->lock);
+	return 0;
+}
+
+
+/**
+ * ks_write_qmu - write 1 pkt data to the QMU.
+ * @ks: The chip information
+ * @pdata: buffer address of the pkt to write
+ * @len: Pkt length in bytes
+ * Here is the sequence to write 1 pkt:
+ *	1. set pseudo DMA mode
+ *	2. write status/length
+ *	3. write pkt data
+ *	4. reset pseudo DMA mode
+ *	5. enqueue the pkt (move it from the TX buffer into TXQ)
+ *	6. Wait until pkt is out
+ */
+static void ks_write_qmu(struct ks_net *ks, u8 *pdata, u16 len)
+{
+	unsigned fid = ks->fid;
+
+	fid = ks->fid;
+	ks->fid = (ks->fid + 1) & TXFR_TXFID_MASK;
+
+	/* reduce the tx interrupt occurrences. */
+	if (!fid)
+		fid |= TXFR_TXIC;       /* irq on completion */
+
+	/* start header at txb[0] to align txw entries */
+	ks->txh.txw[0] = cpu_to_le16(fid);
+	ks->txh.txw[1] = cpu_to_le16(len);
+
+	/* 1. set pseudo-DMA mode */
+	ks_wrreg8(ks, KS_RXQCR, (ks->rc_rxqcr | RXQCR_SDA) & 0xff);
+	/* 2. write status/length info */
+	ks_outblk(ks, ks->txh.txw, 4);
+	/* 3. write pkt data */
+	ks_outblk(ks, (u16 *)pdata, ALIGN(len, 4));
+	/* 4. reset pseudo-DMA mode */
+	ks_wrreg8(ks, KS_RXQCR, ks->rc_rxqcr);
+	/* 5. Enqueue Tx(move the pkt from TX buffer into TXQ) */
+	ks_wrreg16(ks, KS_TXQCR, TXQCR_METFE);
+	/* 6. wait until TXQCR_METFE is auto-cleared */
+	while (ks_rdreg16(ks, KS_TXQCR) & TXQCR_METFE)
+		;
+}
+
+static void ks_disable_int(struct ks_net *ks)
+{
+	ks_wrreg16(ks, KS_IER, 0x0000);
+}  /* ks_disable_int */
+
+static void ks_enable_int(struct ks_net *ks)
+{
+	ks_wrreg16(ks, KS_IER, ks->rc_ier);
+}  /* ks_enable_int */
+
+/**
+ * ks_start_xmit - transmit packet
+ * @skb		: The buffer to transmit
+ * @netdev	: The device used to transmit the packet.
+ *
+ * Called by the network layer to transmit the @skb.
+ * tx and rx must be mutually exclusive, so the device IRQ is disabled and
+ * @statelock is held while tx is in progress.
+ */
+static int ks_start_xmit(struct sk_buff *skb, struct net_device *netdev)
+{
+	int retv = NETDEV_TX_OK;
+	struct ks_net *ks = netdev_priv(netdev);
+
+	disable_irq(netdev->irq);
+	ks_disable_int(ks);
+	spin_lock(&ks->statelock);
+
+	/* Extra space is required:
+	*  4 bytes for alignment, 4 for status/length, 4 for CRC
+	*/
+
+	if (likely(ks_tx_fifo_space(ks) >= skb->len + 12)) {
+		ks_write_qmu(ks, skb->data, skb->len);
+		dev_kfree_skb(skb);
+	} else
+		retv = NETDEV_TX_BUSY;
+	spin_unlock(&ks->statelock);
+	ks_enable_int(ks);
+	enable_irq(netdev->irq);
+	return retv;
+}
+
+/**
+ * ks_start_rx - ready to serve pkts
+ * @ks		: The chip information
+ *
+ */
+static void ks_start_rx(struct ks_net *ks)
+{
+	u16 cntl;
+
+	/* Enables QMU Receive (RXCR1). */
+	cntl = ks_rdreg16(ks, KS_RXCR1);
+	cntl |= RXCR1_RXE ;
+	ks_wrreg16(ks, KS_RXCR1, cntl);
+}  /* ks_start_rx */
+
+/**
+ * ks_stop_rx - stop serving pkts
+ * @ks		: The chip information
+ *
+ */
+static void ks_stop_rx(struct ks_net *ks)
+{
+	u16 cntl;
+
+	/* Disables QMU Receive (RXCR1). */
+	cntl = ks_rdreg16(ks, KS_RXCR1);
+	cntl &= ~RXCR1_RXE ;
+	ks_wrreg16(ks, KS_RXCR1, cntl);
+
+}  /* ks_stop_rx */
+
+static unsigned long const ethernet_polynomial = 0x04c11db7U;
+
+static unsigned long ether_gen_crc(int length, u8 *data)
+{
+	long crc = -1;
+	while (--length >= 0) {
+		u8 current_octet = *data++;
+		int bit;
+
+		for (bit = 0; bit < 8; bit++, current_octet >>= 1) {
+			crc = (crc << 1) ^
+				((crc < 0) ^ (current_octet & 1) ?
+			ethernet_polynomial : 0);
+		}
+	}
+	return (unsigned long)crc;
+}  /* ether_gen_crc */
+
+/**
+* ks_set_grpaddr - set multicast information
+* @ks : The chip information
+*/
+
+static void ks_set_grpaddr(struct ks_net *ks)
+{
+	u8	i;
+	u32	index, position, value;
+
+	memset(ks->mcast_bits, 0, sizeof(u8) * HW_MCAST_SIZE);
+
+	for (i = 0; i < ks->mcast_lst_size; i++) {
+		position = (ether_gen_crc(6, ks->mcast_lst[i]) >> 26) & 0x3f;
+		index = position >> 3;
+		value = 1 << (position & 7);
+		ks->mcast_bits[index] |= (u8)value;
+	}
+
+	for (i  = 0; i < HW_MCAST_SIZE; i++) {
+		if (i & 1) {
+			ks_wrreg16(ks, (u16)((KS_MAHTR0 + i) & ~1),
+				(ks->mcast_bits[i] << 8) |
+				ks->mcast_bits[i - 1]);
+		}
+	}
+}  /* ks_set_grpaddr */
+
+/*
+* ks_clear_mcast - clear multicast information
+*
+* @ks : The chip information
+* This routine removes all mcast addresses set in the hardware.
+*/
+
+static void ks_clear_mcast(struct ks_net *ks)
+{
+	u16	i, mcast_size;
+	for (i = 0; i < HW_MCAST_SIZE; i++)
+		ks->mcast_bits[i] = 0;
+
+	mcast_size = HW_MCAST_SIZE >> 1;	/* 8 hash bytes = 4 16bit registers */
+	for (i = 0; i < mcast_size; i++)
+		ks_wrreg16(ks, KS_MAHTR0 + (2*i), 0);
+}
+
+static void ks_set_promis(struct ks_net *ks, u16 promiscuous_mode)
+{
+	u16		cntl;
+	ks->promiscuous = promiscuous_mode;
+	ks_stop_rx(ks);  /* Stop receiving for reconfiguration */
+	cntl = ks_rdreg16(ks, KS_RXCR1);
+
+	cntl &= ~RXCR1_FILTER_MASK;
+	if (promiscuous_mode)
+		/* Enable Promiscuous mode */
+		cntl |= RXCR1_RXAE | RXCR1_RXINVF;
+	else
+		/* Disable Promiscuous mode (default normal mode) */
+		cntl |= RXCR1_RXPAFMA;
+
+	ks_wrreg16(ks, KS_RXCR1, cntl);
+
+	if (ks->enabled)
+		ks_start_rx(ks);
+
+}  /* ks_set_promis */
+
+static void ks_set_mcast(struct ks_net *ks, u16 mcast)
+{
+	u16	cntl;
+
+	ks->all_mcast = mcast;
+	ks_stop_rx(ks);  /* Stop receiving for reconfiguration */
+	cntl = ks_rdreg16(ks, KS_RXCR1);
+	cntl &= ~RXCR1_FILTER_MASK;
+	if (mcast)
+		/* Enable "Perfect with Multicast address passed mode" */
+		cntl |= (RXCR1_RXAE | RXCR1_RXMAFMA | RXCR1_RXPAFMA);
+	else
+		/**
+		 * Disable "Perfect with Multicast address passed
+		 * mode" (normal mode).
+		 */
+		cntl |= RXCR1_RXPAFMA;
+
+	ks_wrreg16(ks, KS_RXCR1, cntl);
+
+	if (ks->enabled)
+		ks_start_rx(ks);
+}  /* ks_set_mcast */
+
+static void ks_set_rx_mode(struct net_device *netdev)
+{
+	struct ks_net *ks = netdev_priv(netdev);
+	struct dev_mc_list *ptr;
+
+	/* Turn on/off promiscuous mode. */
+	if ((netdev->flags & IFF_PROMISC) == IFF_PROMISC)
+		ks_set_promis(ks,
+			(u16)((netdev->flags & IFF_PROMISC) == IFF_PROMISC));
+	/* Turn on/off all mcast mode. */
+	else if ((netdev->flags & IFF_ALLMULTI) == IFF_ALLMULTI)
+		ks_set_mcast(ks,
+			(u16)((netdev->flags & IFF_ALLMULTI) == IFF_ALLMULTI));
+	else
+		ks_set_promis(ks, false);
+
+	if ((netdev->flags & IFF_MULTICAST) && netdev->mc_count) {
+		if (netdev->mc_count <= MAX_MCAST_LST) {
+			int i = 0;
+			for (ptr = netdev->mc_list; ptr; ptr = ptr->next) {
+				if (!(*ptr->dmi_addr & 1))
+					continue;
+				if (i >= MAX_MCAST_LST)
+					break;
+				memcpy(ks->mcast_lst[i++], ptr->dmi_addr,
+				MAC_ADDR_LEN);
+			}
+			ks->mcast_lst_size = (u8)i;
+			ks_set_grpaddr(ks);
+		} else {
+			/**
+			 * List too big to support so
+			 * turn on all mcast mode.
+			 */
+			ks->mcast_lst_size = MAX_MCAST_LST;
+			ks_set_mcast(ks, true);
+		}
+	} else {
+		ks->mcast_lst_size = 0;
+		ks_clear_mcast(ks);
+	}
+} /* ks_set_rx_mode */
+
+static void ks_set_mac(struct ks_net *ks, u8 *data)
+{
+	u16 *pw = (u16 *)data;
+	u16 w, u;
+
+	ks_stop_rx(ks);  /* Stop receiving for reconfiguration */
+
+	u = *pw++;
+	w = ((u & 0xFF) << 8) | ((u >> 8) & 0xFF);
+	ks_wrreg16(ks, KS_MARH, w);
+
+	u = *pw++;
+	w = ((u & 0xFF) << 8) | ((u >> 8) & 0xFF);
+	ks_wrreg16(ks, KS_MARM, w);
+
+	u = *pw;
+	w = ((u & 0xFF) << 8) | ((u >> 8) & 0xFF);
+	ks_wrreg16(ks, KS_MARL, w);
+
+	memcpy(ks->mac_addr, data, 6);
+
+	if (ks->enabled)
+		ks_start_rx(ks);
+}
+
+static int ks_set_mac_address(struct net_device *netdev, void *paddr)
+{
+	struct ks_net *ks = netdev_priv(netdev);
+	struct sockaddr *addr = paddr;
+	u8 *da;
+
+	memcpy(netdev->dev_addr, addr->sa_data, netdev->addr_len);
+
+	da = (u8 *)netdev->dev_addr;
+
+	ks_set_mac(ks, da);
+	return 0;
+}
+
+static int ks_net_ioctl(struct net_device *netdev, struct ifreq *req, int cmd)
+{
+	struct ks_net *ks = netdev_priv(netdev);
+
+	if (!netif_running(netdev))
+		return -EINVAL;
+
+	return generic_mii_ioctl(&ks->mii, if_mii(req), cmd, NULL);
+}
+
+static const struct net_device_ops ks_netdev_ops = {
+	.ndo_open		= ks_net_open,
+	.ndo_stop		= ks_net_stop,
+	.ndo_do_ioctl		= ks_net_ioctl,
+	.ndo_start_xmit		= ks_start_xmit,
+	.ndo_set_mac_address	= ks_set_mac_address,
+	.ndo_set_rx_mode	= ks_set_rx_mode,
+	.ndo_change_mtu		= eth_change_mtu,
+	.ndo_validate_addr	= eth_validate_addr,
+};
+
+/* ethtool support */
+
+static void ks_get_drvinfo(struct net_device *netdev,
+			       struct ethtool_drvinfo *di)
+{
+	strlcpy(di->driver, DRV_NAME, sizeof(di->driver));
+	strlcpy(di->version, "1.00", sizeof(di->version));
+	strlcpy(di->bus_info, dev_name(netdev->dev.parent),
+		sizeof(di->bus_info));
+}
+
+static u32 ks_get_msglevel(struct net_device *netdev)
+{
+	struct ks_net *ks = netdev_priv(netdev);
+	return ks->msg_enable;
+}
+
+static void ks_set_msglevel(struct net_device *netdev, u32 to)
+{
+	struct ks_net *ks = netdev_priv(netdev);
+	ks->msg_enable = to;
+}
+
+static int ks_get_settings(struct net_device *netdev, struct ethtool_cmd *cmd)
+{
+	struct ks_net *ks = netdev_priv(netdev);
+	return mii_ethtool_gset(&ks->mii, cmd);
+}
+
+static int ks_set_settings(struct net_device *netdev, struct ethtool_cmd *cmd)
+{
+	struct ks_net *ks = netdev_priv(netdev);
+	return mii_ethtool_sset(&ks->mii, cmd);
+}
+
+static u32 ks_get_link(struct net_device *netdev)
+{
+	struct ks_net *ks = netdev_priv(netdev);
+	return mii_link_ok(&ks->mii);
+}
+
+static int ks_nway_reset(struct net_device *netdev)
+{
+	struct ks_net *ks = netdev_priv(netdev);
+	return mii_nway_restart(&ks->mii);
+}
+
+static const struct ethtool_ops ks_ethtool_ops = {
+	.get_drvinfo	= ks_get_drvinfo,
+	.get_msglevel	= ks_get_msglevel,
+	.set_msglevel	= ks_set_msglevel,
+	.get_settings	= ks_get_settings,
+	.set_settings	= ks_set_settings,
+	.get_link	= ks_get_link,
+	.nway_reset	= ks_nway_reset,
+};
+
+/* MII interface controls */
+
+/**
+ * ks_phy_reg - convert MII register into a KS8851 register
+ * @reg: MII register number.
+ *
+ * Return the KS8851 register number for the corresponding MII PHY register
+ * if possible. Return zero if the MII register has no direct mapping to the
+ * KS8851 register set.
+ */
+static int ks_phy_reg(int reg)
+{
+	switch (reg) {
+	case MII_BMCR:
+		return KS_P1MBCR;
+	case MII_BMSR:
+		return KS_P1MBSR;
+	case MII_PHYSID1:
+		return KS_PHY1ILR;
+	case MII_PHYSID2:
+		return KS_PHY1IHR;
+	case MII_ADVERTISE:
+		return KS_P1ANAR;
+	case MII_LPA:
+		return KS_P1ANLPR;
+	}
+
+	return 0x0;
+}
+
+/**
+ * ks_phy_read - MII interface PHY register read.
+ * @netdev: The network device the PHY is on.
+ * @phy_addr: Address of PHY (ignored as we only have one)
+ * @reg: The register to read.
+ *
+ * This call reads data from the PHY register specified in @reg. Since the
+ * device does not support all the MII registers, the non-existent values
+ * are always returned as zero.
+ *
+ * We return zero for unsupported registers as the MII code does not check
+ * the value returned for any error status, and simply returns it to the
+ * caller. The mii-tool that the driver was tested with takes any negative
+ * error as real PHY capabilities, thus displaying incorrect data to the user.
+ */
+static int ks_phy_read(struct net_device *netdev, int phy_addr, int reg)
+{
+	struct ks_net *ks = netdev_priv(netdev);
+	int ksreg;
+	int result;
+
+	ksreg = ks_phy_reg(reg);
+	if (!ksreg)
+		return 0x0;	/* no error return allowed, so use zero */
+
+	mutex_lock(&ks->lock);
+	result = ks_rdreg16(ks, ksreg);
+	mutex_unlock(&ks->lock);
+
+	return result;
+}
+
+static void ks_phy_write(struct net_device *netdev,
+			     int phy, int reg, int value)
+{
+	struct ks_net *ks = netdev_priv(netdev);
+	int ksreg;
+
+	ksreg = ks_phy_reg(reg);
+	if (ksreg) {
+		mutex_lock(&ks->lock);
+		ks_wrreg16(ks, ksreg, value);
+		mutex_unlock(&ks->lock);
+	}
+}
+
+/**
+ * ks_read_selftest - read the selftest memory info.
+ * @ks: The device state
+ *
+ * Read and check the TX/RX memory selftest information.
+ */
+static int ks_read_selftest(struct ks_net *ks)
+{
+	unsigned both_done = MBIR_TXMBF | MBIR_RXMBF;
+	int ret = 0;
+	unsigned rd;
+
+	rd = ks_rdreg16(ks, KS_MBIR);
+
+	if ((rd & both_done) != both_done) {
+		ks_warn(ks, "Memory selftest not finished\n");
+		return 0;
+	}
+
+	if (rd & MBIR_TXMBFA) {
+		ks_err(ks, "TX memory selftest fails\n");
+		ret |= 1;
+	}
+
+	if (rd & MBIR_RXMBFA) {
+		ks_err(ks, "RX memory selftest fails\n");
+		ret |= 2;
+	}
+
+	if (!ret)
+		ks_info(ks, "the selftest passes\n");
+	return ret;
+}
+
+static void ks_disable(struct ks_net *ks)
+{
+	u16	w;
+
+	w = ks_rdreg16(ks, KS_TXCR);
+
+	/* Disables QMU Transmit (TXCR). */
+	w  &= ~TXCR_TXE;
+	ks_wrreg16(ks, KS_TXCR, w);
+
+	/* Disables QMU Receive (RXCR1). */
+	w = ks_rdreg16(ks, KS_RXCR1);
+	w &= ~RXCR1_RXE ;
+	ks_wrreg16(ks, KS_RXCR1, w);
+
+	ks->enabled = false;
+
+}  /* ks_disable */
+
+static void ks_setup(struct ks_net *ks)
+{
+	u16	w;
+
+	/**
+	 * Configure QMU Transmit
+	 */
+
+	/* Setup Transmit Frame Data Pointer Auto-Increment (TXFDPR) */
+	ks_wrreg16(ks, KS_TXFDPR, TXFDPR_TXFPAI);
+
+	/* Setup Receive Frame Data Pointer Auto-Increment */
+	ks_wrreg16(ks, KS_RXFDPR, RXFDPR_RXFPAI);
+
+	/* Setup Receive Frame Threshold - 1 frame (RXFCTFC) */
+	ks_wrreg16(ks, KS_RXFCTR, 1 & RXFCTR_THRESHOLD_MASK);
+
+	/* Setup RxQ Command Control (RXQCR) */
+	ks->rc_rxqcr = RXQCR_CMD_CNTL;
+	ks_wrreg16(ks, KS_RXQCR, ks->rc_rxqcr);
+
+	/**
+	 * Set the forced mode to half duplex (the default is full duplex),
+	 *  because when auto-negotiation fails most switches fall back to
+	 *  half duplex.
+	 */
+
+	w = ks_rdreg16(ks, KS_P1MBCR);
+	w &= ~P1MBCR_FORCE_FDX;
+	ks_wrreg16(ks, KS_P1MBCR, w);
+
+	w = TXCR_TXFCE | TXCR_TXPE | TXCR_TXCRC | TXCR_TCGIP;
+	ks_wrreg16(ks, KS_TXCR, w);
+
+	w = RXCR1_RXFCE | RXCR1_RXBE | RXCR1_RXUE;
+
+	if (ks->promiscuous)         /* bPromiscuous */
+		w |= (RXCR1_RXAE | RXCR1_RXINVF);
+	else if (ks->all_mcast) /* Multicast address passed mode */
+		w |= (RXCR1_RXAE | RXCR1_RXMAFMA | RXCR1_RXPAFMA);
+	else                                   /* Normal mode */
+		w |= RXCR1_RXPAFMA;
+
+	ks_wrreg16(ks, KS_RXCR1, w);
+}  /*ks_setup */
+
+
+static void ks_setup_int(struct ks_net *ks)
+{
+	ks->rc_ier = 0x00;
+	/* Clear the interrupts status of the hardware. */
+	ks_wrreg16(ks, KS_ISR, 0xffff);
+
+	/* Enables the interrupts of the hardware. */
+	ks->rc_ier = (IRQ_LCI | IRQ_TXI | IRQ_RXI);
+}  /* ks_setup_int */
+
+static void ks_enable(struct ks_net *ks)
+{
+	u16 w;
+
+	w = ks_rdreg16(ks, KS_TXCR);
+	/* Enables QMU Transmit (TXCR). */
+	ks_wrreg16(ks, KS_TXCR, w | TXCR_TXE);
+
+	/*
+	 * RX Frame Count Threshold Enable and Auto-Dequeue RXQ Frame
+	 * Enable
+	 */
+
+	w = ks_rdreg16(ks, KS_RXQCR);
+	ks_wrreg16(ks, KS_RXQCR, w | RXQCR_RXFCTE);
+
+	/* Enables QMU Receive (RXCR1). */
+	w = ks_rdreg16(ks, KS_RXCR1);
+	ks_wrreg16(ks, KS_RXCR1, w | RXCR1_RXE);
+	ks->enabled = true;
+}  /* ks_enable */
+
+static int ks_hw_init(struct ks_net *ks)
+{
+	ks->promiscuous = 0;
+	ks->all_mcast = 0;
+	ks->mcast_lst_size = 0;
+
+	ks->frame_head_info =
+		MALLOC(sizeof(struct type_frame_head) * MAX_RECV_FRAMES);
+	if (!ks->frame_head_info) {
+		printk(KERN_ERR "Error: Failed to allocate frame memory\n");
+		return false;
+	}
+
+	ks_set_mac(ks, KS_DEFAULT_MAC_ADDRESS);
+	return true;
+}
+
+
+static int __devinit ks8851_probe(struct platform_device *pdev)
+{
+	int err = -ENOMEM;
+	struct resource *io_d, *io_c;
+	struct net_device *netdev;
+	struct ks_net *ks;
+	u16 id, data;
+
+	io_d = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+	io_c = platform_get_resource(pdev, IORESOURCE_MEM, 1);
+
+	if (!request_mem_region(io_d->start, resource_size(io_d), DRV_NAME))
+		goto err_mem_region;
+
+	if (!request_mem_region(io_c->start, resource_size(io_c), DRV_NAME))
+		goto err_mem_region1;
+
+	netdev = alloc_etherdev(sizeof(struct ks_net));
+	if (!netdev)
+		goto err_alloc_etherdev;
+
+	SET_NETDEV_DEV(netdev, &pdev->dev);
+
+	ks = netdev_priv(netdev);
+	ks->netdev = netdev;
+	ks->hw_addr = ioremap(io_d->start, resource_size(io_d));
+
+	if (!ks->hw_addr)
+		goto err_ioremap;
+
+	ks->hw_addr_cmd = ioremap(io_c->start, resource_size(io_c));
+	if (!ks->hw_addr_cmd)
+		goto err_ioremap1;
+
+	ks->irq = platform_get_irq(pdev, 0);
+
+	if (ks->irq < 0) {
+		err = ks->irq;
+		goto err_get_irq;
+	}
+
+	ks->pdev = pdev;
+
+	mutex_init(&ks->lock);
+	spin_lock_init(&ks->statelock);
+
+	netdev->netdev_ops = &ks_netdev_ops;
+	netdev->ethtool_ops = &ks_ethtool_ops;
+
+	/* setup mii state */
+	ks->mii.dev             = netdev;
+	ks->mii.phy_id          = 1;
+	ks->mii.phy_id_mask     = 1;
+	ks->mii.reg_num_mask    = 0xf;
+	ks->mii.mdio_read       = ks_phy_read;
+	ks->mii.mdio_write      = ks_phy_write;
+
+	ks_info(ks, "message enable is %d\n", msg_enable);
+	/* set the default message enable */
+	ks->msg_enable = netif_msg_init(msg_enable, (NETIF_MSG_DRV |
+						     NETIF_MSG_PROBE |
+						     NETIF_MSG_LINK));
+	ks_read_config(ks);
+
+	/* simple check for a valid chip being connected to the bus */
+	if ((ks_rdreg16(ks, KS_CIDER) & ~CIDER_REV_MASK) != CIDER_ID) {
+		ks_err(ks, "failed to read device ID\n");
+		err = -ENODEV;
+		goto err_register;
+	}
+
+	if (ks_read_selftest(ks)) {
+		ks_err(ks, "device self-test failed\n");
+		err = -ENODEV;
+		goto err_register;
+	}
+
+	err = register_netdev(netdev);
+	if (err)
+		goto err_register;
+
+	platform_set_drvdata(pdev, netdev);
+
+	ks_soft_reset(ks, GRR_GSR);
+	ks_hw_init(ks);
+	ks_disable(ks);
+	ks_setup(ks);
+	ks_setup_int(ks);
+	ks_enable_int(ks);
+	ks_enable(ks);
+	memcpy(netdev->dev_addr, ks->mac_addr, 6);
+
+	data = ks_rdreg16(ks, KS_OBCR);
+	ks_wrreg16(ks, KS_OBCR, data | OBCR_ODS_16MA);
+
+	/**
+	 * If you want to use the default MAC addr,
+	 * comment out the 2 functions below.
+	 */
+
+	random_ether_addr(netdev->dev_addr);
+	ks_set_mac(ks, netdev->dev_addr);
+
+	id = ks_rdreg16(ks, KS_CIDER);
+
+	printk(KERN_INFO DRV_NAME
+		" Found chip, family: 0x%x, id: 0x%x, rev: 0x%x\n",
+		(id >> 8) & 0xff, (id >> 4) & 0xf, (id >> 1) & 0x7);
+	return 0;
+
+err_register:
+err_get_irq:
+	iounmap(ks->hw_addr_cmd);
+err_ioremap1:
+	iounmap(ks->hw_addr);
+err_ioremap:
+	free_netdev(netdev);
+err_alloc_etherdev:
+	release_mem_region(io_c->start, resource_size(io_c));
+err_mem_region1:
+	release_mem_region(io_d->start, resource_size(io_d));
+err_mem_region:
+	return err;
+}
+
+static int __devexit ks8851_remove(struct platform_device *pdev)
+{
+	struct net_device *netdev = platform_get_drvdata(pdev);
+	struct ks_net *ks = netdev_priv(netdev);
+	struct resource *iomem = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+
+	unregister_netdev(netdev);
+	iounmap(ks->hw_addr);
+	free_netdev(netdev);
+	release_mem_region(iomem->start, resource_size(iomem));
+	platform_set_drvdata(pdev, NULL);
+	return 0;
+
+}
+
+static struct platform_driver ks8851_platform_driver = {
+	.driver = {
+		.name = DRV_NAME,
+		.owner = THIS_MODULE,
+	},
+	.probe = ks8851_probe,
+	.remove = __devexit_p(ks8851_remove),
+};
+
+static int __init ks8851_init(void)
+{
+	return platform_driver_register(&ks8851_platform_driver);
+}
+
+static void __exit ks8851_exit(void)
+{
+	platform_driver_unregister(&ks8851_platform_driver);
+}
+
+module_init(ks8851_init);
+module_exit(ks8851_exit);
+
+MODULE_DESCRIPTION("KS8851 MLL Network driver");
+MODULE_AUTHOR("David Choi <david.choi@micrel.com>");
+MODULE_LICENSE("GPL");
+module_param_named(message, msg_enable, int, 0);
+MODULE_PARM_DESC(message, "Message verbosity level (0=none, 31=all)");
+
--- a/drivers/net/Makefile
+++ b/drivers/net/Makefile
@@ -89,6 +89,7 @@ obj-$(CONFIG_SKY2) += sky2.o
 obj-$(CONFIG_SKFP) += skfp/
 obj-$(CONFIG_KS8842)	+= ks8842.o
 obj-$(CONFIG_KS8851)	+= ks8851.o
+obj-$(CONFIG_KS8851_MLL) += ks8851_mll.o
 obj-$(CONFIG_VIA_RHINE) += via-rhine.o
 obj-$(CONFIG_VIA_VELOCITY) += via-velocity.o
 obj-$(CONFIG_ADAPTEC_STARFIRE) += starfire.o

^ permalink raw reply

* [PATCH] b44: the poll handler b44_poll must not enable IRQ unconditionally
From: DDD @ 2009-09-17  2:10 UTC (permalink / raw)
  To: davem, mpm, romieu; +Cc: netdev

net/core/netpoll.c::netpoll_send_skb() calls the poll handler when
one is available. Because netconsole can be used from almost any
context, the NAPI handler of a driver that supports netpoll must not
enable IRQs unconditionally.

Call trace:
netpoll_send_skb()
{
local_irq_save(flags)
  -> netpoll_poll()
    -> poll_napi()
      -> poll_one_napi()
        -> napi->poll()
            -> b44_poll()
local_irq_restore(flags)
}

Signed-off-by: Dongdong Deng <dongdong.deng@windriver.com>
---
 drivers/net/b44.c |    7 +++----
 1 files changed, 3 insertions(+), 4 deletions(-)

diff --git a/drivers/net/b44.c b/drivers/net/b44.c
index 0189dcd..e046943 100644
--- a/drivers/net/b44.c
+++ b/drivers/net/b44.c
@@ -847,23 +847,22 @@ static int b44_poll(struct napi_struct *napi, int budget)
 {
 	struct b44 *bp = container_of(napi, struct b44, napi);
 	int work_done;
+	unsigned long flags;
 
-	spin_lock_irq(&bp->lock);
+	spin_lock_irqsave(&bp->lock, flags);
 
 	if (bp->istat & (ISTAT_TX | ISTAT_TO)) {
 		/* spin_lock(&bp->tx_lock); */
 		b44_tx(bp);
 		/* spin_unlock(&bp->tx_lock); */
 	}
-	spin_unlock_irq(&bp->lock);
+	spin_unlock_irqrestore(&bp->lock, flags);
 
 	work_done = 0;
 	if (bp->istat & ISTAT_RX)
 		work_done += b44_rx(bp, budget);
 
 	if (bp->istat & ISTAT_ERRORS) {
-		unsigned long flags;
-
 		spin_lock_irqsave(&bp->lock, flags);
 		b44_halt(bp);
 		b44_init_rings(bp);
-- 
1.6.0.4



^ permalink raw reply related
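
The difference between the two locking variants in the b44 patch above can
be sketched in userspace. This is only a model under stated assumptions:
a plain boolean stands in for the CPU interrupt-enable state, and all
function names below are hypothetical, not kernel APIs.

```c
#include <stdbool.h>

/* Simulated CPU interrupt-enable flag; true means IRQs are enabled. */
static bool irqs_enabled = true;

/* Models local_irq_save(): remember the current state, then disable. */
static unsigned long model_irq_save(void)
{
	unsigned long flags = irqs_enabled;

	irqs_enabled = false;
	return flags;
}

/* Models local_irq_restore(): put back whatever state was saved. */
static void model_irq_restore(unsigned long flags)
{
	irqs_enabled = flags != 0;
}

/* Models a poll handler built on spin_lock_irq()/spin_unlock_irq(). */
static void poll_unconditional(void)
{
	irqs_enabled = false;	/* lock: disable IRQs */
	/* ... TX completion work would go here ... */
	irqs_enabled = true;	/* unlock: enable IRQs, whatever they were */
}

/* Models a handler built on spin_lock_irqsave()/spin_unlock_irqrestore(). */
static void poll_irqsave(void)
{
	unsigned long flags = model_irq_save();

	/* ... TX completion work would go here ... */
	model_irq_restore(flags);
}

/*
 * Models the netpoll_send_skb() call path: IRQs are disabled around the
 * poll call. Returns the IRQ state seen right after the poll handler
 * returns, i.e. while the caller still expects IRQs to be off.
 */
static bool irqs_on_after_poll(void (*poll)(void))
{
	unsigned long flags = model_irq_save();
	bool seen;

	poll();
	seen = irqs_enabled;
	model_irq_restore(flags);
	return seen;
}
```

In this model irqs_on_after_poll(poll_unconditional) reports that IRQs
came back on inside the caller's critical section, while
irqs_on_after_poll(poll_irqsave) leaves them off until the caller
restores its own saved state.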

* Re: [PATCH 1/2] wl12xx: switch to %pM to print the mac address
From: Ben Hutchings @ 2009-09-17  1:22 UTC (permalink / raw)
  To: Jean-Christophe PLAGNIOL-VILLARD; +Cc: netdev
In-Reply-To: <1253146059-4169-1-git-send-email-plagnioj@jcrosoft.com>

On Thu, 2009-09-17 at 02:07 +0200, Jean-Christophe PLAGNIOL-VILLARD
wrote:
> Signed-off-by: Jean-Christophe PLAGNIOL-VILLARD <plagnioj@jcrosoft.com>
> ---
>  drivers/net/wireless/wl12xx/wl1271_main.c |    3 +--
>  1 files changed, 1 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/wireless/wl12xx/wl1271_main.c b/drivers/net/wireless/wl12xx/wl1271_main.c
> index d9169b4..f6f8895 100644
> --- a/drivers/net/wireless/wl12xx/wl1271_main.c
> +++ b/drivers/net/wireless/wl12xx/wl1271_main.c
> @@ -644,11 +644,10 @@ static int wl1271_op_config_interface(struct ieee80211_hw *hw,
>  {
>  	struct wl1271 *wl = hw->priv;
>  	struct sk_buff *beacon;
> -	DECLARE_MAC_BUF(mac);
>  	int ret;
>  
>  	wl1271_debug(DEBUG_MAC80211, "mac80211 config_interface bssid %s",
> -		     print_mac(mac, conf->bssid));
> +		     printf("%pM", conf->bssid);
>  	wl1271_dump_ascii(DEBUG_MAC80211, "ssid: ", conf->ssid,
>  			  conf->ssid_len);
>  

That isn't even syntactically valid, let alone correct.

Ben.

-- 
Ben Hutchings, Senior Software Engineer, Solarflare Communications
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.


^ permalink raw reply
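
For context on the objection above: the quoted hunk leaves the
parentheses unbalanced, calls printf() (which kernel code does not have),
and keeps %s in the format string. Presumably the intent was to put %pM
in the wl1271_debug() format string and pass conf->bssid directly. %pM is
an extension of the kernel's own vsnprintf() and does not exist in
userspace, so the sketch below uses a hypothetical helper, format_mac(),
that produces the same colon-separated lower-case hex text.

```c
#include <stddef.h>
#include <stdio.h>

#define MAC_STR_SIZE 18	/* "xx:xx:xx:xx:xx:xx" plus the trailing NUL */

/*
 * Hypothetical userspace helper: write the 6-byte address in addr into
 * buf as colon-separated hex and return buf.
 */
static char *format_mac(char *buf, size_t size, const unsigned char *addr)
{
	snprintf(buf, size, "%02x:%02x:%02x:%02x:%02x:%02x",
		 addr[0], addr[1], addr[2], addr[3], addr[4], addr[5]);
	return buf;
}
```

With %pM the caller needs no temporary buffer at all, which is what
makes DECLARE_MAC_BUF() and print_mac() removable.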

* Re: igb bandwidth allocation configuration
From: Simon Horman @ 2009-09-17  1:09 UTC (permalink / raw)
  To: Alexander Duyck
  Cc: Or Gerlitz, e1000-devel@lists.sourceforge.net, Kirsher, Jeffrey T,
	Alexander Duyck, netdev@vger.kernel.org
In-Reply-To: <20090915222926.GA24467@verge.net.au>

On Wed, Sep 16, 2009 at 08:29:26AM +1000, Simon Horman wrote:
> On Tue, Sep 15, 2009 at 11:01:52AM -0700, Alexander Duyck wrote:
> > Or Gerlitz wrote:
> > >If the rate limiter is exposed as a feature of the VF, it doesn't
> > >matter who really enforces it, the "VF portion" of the HW or the
> > >PF itself. I agree that if you have to program the PF for the rate
> > >of a specific VF, then it's more complex. Basically, I would expect
> > >that a VF can be configured with <mac, vlan-id, priority, rate>
> > >such that it can be done where the VF NIC is spawned, host kernel
> > >or guest kernel.
> > 
> > Adding the rate limiter as a feature of the VF doesn't make much
> > sense since the VF could be direct assigned to another OS for all we
> > know so we won't have control over it from there.
> > 
> > The interface for all of this would make sense as part of a virtual
> > ethernet switch control which is the way I am currently leaning on
> > all this.  As such it is probably another thing we can bring up at
> > the BOF session at the Linux Plumbers Conference.
> 
> Unfortunately I won't be able to make it to the BOF or Plumbers.
> I look forward to hearing what is discussed.

Is there any chance of participating in this remotely? Even a
listen-only feed of some sort would be quite valuable to me and
possibly others.

^ permalink raw reply

* Re: [PATCH 1/2] wl12xx: switch to %pM to print the mac address
From: John W. Linville @ 2009-09-17  0:28 UTC (permalink / raw)
  To: Jean-Christophe PLAGNIOL-VILLARD; +Cc: netdev
In-Reply-To: <1253146059-4169-1-git-send-email-plagnioj@jcrosoft.com>

On Thu, Sep 17, 2009 at 02:07:38AM +0200, Jean-Christophe PLAGNIOL-VILLARD wrote:
> Signed-off-by: Jean-Christophe PLAGNIOL-VILLARD <plagnioj@jcrosoft.com>
> ---
>  drivers/net/wireless/wl12xx/wl1271_main.c |    3 +--
>  1 files changed, 1 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/wireless/wl12xx/wl1271_main.c b/drivers/net/wireless/wl12xx/wl1271_main.c
> index d9169b4..f6f8895 100644
> --- a/drivers/net/wireless/wl12xx/wl1271_main.c
> +++ b/drivers/net/wireless/wl12xx/wl1271_main.c
> @@ -644,11 +644,10 @@ static int wl1271_op_config_interface(struct ieee80211_hw *hw,
>  {
>  	struct wl1271 *wl = hw->priv;
>  	struct sk_buff *beacon;
> -	DECLARE_MAC_BUF(mac);
>  	int ret;
>  
>  	wl1271_debug(DEBUG_MAC80211, "mac80211 config_interface bssid %s",
> -		     print_mac(mac, conf->bssid));
> +		     printf("%pM", conf->bssid);
>  	wl1271_dump_ascii(DEBUG_MAC80211, "ssid: ", conf->ssid,
>  			  conf->ssid_len);
>  

ACK

-- 
John W. Linville		Someday the world will need a hero, and you
linville@tuxdriver.com			might be all we have.  Be ready.

^ permalink raw reply

* Re: [PATCH net-next] ipv6: Ignore route option with ROUTER_PREF_INVALID
From: David Miller @ 2009-09-17  0:12 UTC (permalink / raw)
  To: me; +Cc: netdev
In-Reply-To: <1252599911.5980.16.camel@fnki-nb00130>

From: Jens Rosenboom <me@jayr.de>
Date: Thu, 10 Sep 2009 18:25:11 +0200

> RFC4191 says that "If the Reserved (10) value is received, the Route
> Information Option MUST be ignored.", so this patch makes us conform
> to the RFC. This is different to the usage of the Default Router
> Preference, where an invalid value must indeed be treated as
> PREF_MEDIUM.
> 
> Signed-off-by: Jens Rosenboom <me@jayr.de>

Applied, thanks Jens.

^ permalink raw reply
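
The two RFC 4191 rules quoted above can be sketched as follows. This is
a minimal userspace illustration, not the kernel's code: the enum and
function names are hypothetical, and it assumes the 2-bit Prf field sits
in bits 0x18 of the flags byte, which is where RFC 4191 places it in both
the Router Advertisement header and the Route Information Option.

```c
#include <stdbool.h>
#include <stdint.h>

/* 2-bit Prf encodings defined by RFC 4191. */
enum router_pref {
	PREF_MEDIUM   = 0x0,	/* 00 */
	PREF_HIGH     = 0x1,	/* 01 */
	PREF_RESERVED = 0x2,	/* 10 */
	PREF_LOW      = 0x3	/* 11 */
};

/* Extract the Prf field from a flags byte. */
static enum router_pref prf_from_flags(uint8_t flags)
{
	return (enum router_pref)((flags >> 3) & 0x3);
}

/* Default Router Preference: a Reserved value is treated as Medium. */
static enum router_pref default_router_pref(uint8_t ra_flags)
{
	enum router_pref pref = prf_from_flags(ra_flags);

	return pref == PREF_RESERVED ? PREF_MEDIUM : pref;
}

/*
 * Route Information Option: a Reserved value means the whole option is
 * ignored. Returns false in that case and leaves *out untouched;
 * otherwise stores the preference in *out and returns true.
 */
static bool route_info_pref(uint8_t rio_flags, enum router_pref *out)
{
	enum router_pref pref = prf_from_flags(rio_flags);

	if (pref == PREF_RESERVED)
		return false;
	*out = pref;
	return true;
}
```

Keeping the two cases in separate functions mirrors the point of the
patch: the same bit pattern is downgraded in one place and rejected in
the other.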

* [PATCH 1/2] wl12xx: switch to %pM to print the mac address
From: Jean-Christophe PLAGNIOL-VILLARD @ 2009-09-17  0:07 UTC (permalink / raw)
  To: netdev; +Cc: Jean-Christophe PLAGNIOL-VILLARD

Signed-off-by: Jean-Christophe PLAGNIOL-VILLARD <plagnioj@jcrosoft.com>
---
 drivers/net/wireless/wl12xx/wl1271_main.c |    3 +--
 1 files changed, 1 insertions(+), 2 deletions(-)

diff --git a/drivers/net/wireless/wl12xx/wl1271_main.c b/drivers/net/wireless/wl12xx/wl1271_main.c
index d9169b4..f6f8895 100644
--- a/drivers/net/wireless/wl12xx/wl1271_main.c
+++ b/drivers/net/wireless/wl12xx/wl1271_main.c
@@ -644,11 +644,10 @@ static int wl1271_op_config_interface(struct ieee80211_hw *hw,
 {
 	struct wl1271 *wl = hw->priv;
 	struct sk_buff *beacon;
-	DECLARE_MAC_BUF(mac);
 	int ret;
 
 	wl1271_debug(DEBUG_MAC80211, "mac80211 config_interface bssid %s",
-		     print_mac(mac, conf->bssid));
+		     printf("%pM", conf->bssid);
 	wl1271_dump_ascii(DEBUG_MAC80211, "ssid: ", conf->ssid,
 			  conf->ssid_len);
 
-- 
1.6.4


^ permalink raw reply related

* [PATCH 2/2] net: remove print_mac as it's not anymore used
From: Jean-Christophe PLAGNIOL-VILLARD @ 2009-09-17  0:07 UTC (permalink / raw)
  To: netdev; +Cc: Jean-Christophe PLAGNIOL-VILLARD
In-Reply-To: <1253146059-4169-1-git-send-email-plagnioj@jcrosoft.com>

Signed-off-by: Jean-Christophe PLAGNIOL-VILLARD <plagnioj@jcrosoft.com>
---
 include/linux/if_ether.h |    8 --------
 net/ethernet/eth.c       |    7 -------
 2 files changed, 0 insertions(+), 15 deletions(-)

diff --git a/include/linux/if_ether.h b/include/linux/if_ether.h
index 580b600..b1a19a7 100644
--- a/include/linux/if_ether.h
+++ b/include/linux/if_ether.h
@@ -136,14 +136,6 @@ extern struct ctl_table ether_table[];
 
 extern ssize_t sysfs_format_mac(char *buf, const unsigned char *addr, int len);
 
-/*
- *	Display a 6 byte device address (MAC) in a readable format.
- */
-extern char *print_mac(char *buf, const unsigned char *addr) __deprecated;
-#define MAC_FMT "%02x:%02x:%02x:%02x:%02x:%02x"
-#define MAC_BUF_SIZE	18
-#define DECLARE_MAC_BUF(var) char var[MAC_BUF_SIZE]
-
 #endif
 
 #endif	/* _LINUX_IF_ETHER_H */
diff --git a/net/ethernet/eth.c b/net/ethernet/eth.c
index 5a883af..dd3db88 100644
--- a/net/ethernet/eth.c
+++ b/net/ethernet/eth.c
@@ -393,10 +393,3 @@ ssize_t sysfs_format_mac(char *buf, const unsigned char *addr, int len)
 	return ((ssize_t) l);
 }
 EXPORT_SYMBOL(sysfs_format_mac);
-
-char *print_mac(char *buf, const unsigned char *addr)
-{
-	_format_mac_addr(buf, MAC_BUF_SIZE, addr, ETH_ALEN);
-	return buf;
-}
-EXPORT_SYMBOL(print_mac);
-- 
1.6.4


^ permalink raw reply related

* Re: [PATCH net-next-2.6] bonding: make ab_arp select active slaves as other modes
From: David Miller @ 2009-09-17  0:05 UTC (permalink / raw)
  To: fubar; +Cc: jpirko, netdev, bonding-devel, nicolas.2p.debian
In-Reply-To: <30814.1253145765@death.nxdomain.ibm.com>

From: Jay Vosburgh <fubar@us.ibm.com>
Date: Wed, 16 Sep 2009 17:02:45 -0700

> Jiri Pirko <jpirko@redhat.com> wrote:
> 
>>When I was implementing primary_passive option (formerly named primary_lazy) I've
>>run into troubles with ab_arp. This is the only mode which is not using
>>bond_select_active_slave() function to select active slave and instead it
>>selects it itself. This seems to be not the right behaviour and it would be
>>better to do it in bond_select_active_slave() for all cases. This patch makes
>>this happen. Please review.
>>
>>Signed-off-by: Jiri Pirko <jpirko@redhat.com>
> 
> 	I tried to break this, and couldn't.  Tested with regular
> ethernet interfaces, as well as VLANs, and it does the right thing.
> 
> 	-J
> 
> Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>

Applied, thanks everyone.

^ permalink raw reply

* Re: pull request: wireless-next-2.6 2009-09-16
From: David Miller @ 2009-09-17  0:03 UTC (permalink / raw)
  To: linville; +Cc: linux-wireless, netdev, linux-kernel
In-Reply-To: <20090916204150.GG10634@tuxdriver.com>

From: "John W. Linville" <linville@tuxdriver.com>
Date: Wed, 16 Sep 2009 16:41:51 -0400

> Dave,
> 
> Here is a batch of fixes for 2.6.32...nothing too controversial
> AFAICT...
> 
> Please let me know if there are problems!

Pulled, thanks John.

^ permalink raw reply

* Re: [PATCH net-next-2.6] bonding: make ab_arp select active slaves as other modes
From: Jay Vosburgh @ 2009-09-17  0:02 UTC (permalink / raw)
  To: Jiri Pirko; +Cc: netdev, davem, bonding-devel, nicolas.2p.debian
In-Reply-To: <20090831210937.GA3152@psychotron.redhat.com>

Jiri Pirko <jpirko@redhat.com> wrote:

>When I was implementing primary_passive option (formerly named primary_lazy) I've
>run into troubles with ab_arp. This is the only mode which is not using
>bond_select_active_slave() function to select active slave and instead it
>selects it itself. This seems to be not the right behaviour and it would be
>better to do it in bond_select_active_slave() for all cases. This patch makes
>this happen. Please review.
>
>Signed-off-by: Jiri Pirko <jpirko@redhat.com>

	I tried to break this, and couldn't.  Tested with regular
ethernet interfaces, as well as VLANs, and it does the right thing.

	-J

Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>


>diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
>index 7c0e0bd..6ebd88d 100644
>--- a/drivers/net/bonding/bond_main.c
>+++ b/drivers/net/bonding/bond_main.c
>@@ -1093,15 +1093,8 @@ static struct slave *bond_find_best_slave(struct bonding *bond)
> 			return NULL; /* still no slave, return NULL */
> 	}
>
>-	/*
>-	 * first try the primary link; if arping, a link must tx/rx
>-	 * traffic before it can be considered the curr_active_slave.
>-	 * also, we would skip slaves between the curr_active_slave
>-	 * and primary_slave that may be up and able to arp
>-	 */
> 	if ((bond->primary_slave) &&
>-	    (!bond->params.arp_interval) &&
>-	    (IS_UP(bond->primary_slave->dev))) {
>+	    bond->primary_slave->link == BOND_LINK_UP) {
> 		new_active = bond->primary_slave;
> 	}
>
>@@ -1109,15 +1102,14 @@ static struct slave *bond_find_best_slave(struct bonding *bond)
> 	old_active = new_active;
>
> 	bond_for_each_slave_from(bond, new_active, i, old_active) {
>-		if (IS_UP(new_active->dev)) {
>-			if (new_active->link == BOND_LINK_UP) {
>-				return new_active;
>-			} else if (new_active->link == BOND_LINK_BACK) {
>-				/* link up, but waiting for stabilization */
>-				if (new_active->delay < mintime) {
>-					mintime = new_active->delay;
>-					bestslave = new_active;
>-				}
>+		if (new_active->link == BOND_LINK_UP) {
>+			return new_active;
>+		} else if (new_active->link == BOND_LINK_BACK &&
>+			   IS_UP(new_active->dev)) {
>+			/* link up, but waiting for stabilization */
>+			if (new_active->delay < mintime) {
>+				mintime = new_active->delay;
>+				bestslave = new_active;
> 			}
> 		}
> 	}
>@@ -2929,18 +2921,6 @@ static int bond_ab_arp_inspect(struct bonding *bond, int delta_in_ticks)
> 		}
> 	}
>
>-	read_lock(&bond->curr_slave_lock);
>-
>-	/*
>-	 * Trigger a commit if the primary option setting has changed.
>-	 */
>-	if (bond->primary_slave &&
>-	    (bond->primary_slave != bond->curr_active_slave) &&
>-	    (bond->primary_slave->link == BOND_LINK_UP))
>-		commit++;
>-
>-	read_unlock(&bond->curr_slave_lock);
>-
> 	return commit;
> }
>
>@@ -2961,90 +2941,58 @@ static void bond_ab_arp_commit(struct bonding *bond, int delta_in_ticks)
> 			continue;
>
> 		case BOND_LINK_UP:
>-			write_lock_bh(&bond->curr_slave_lock);
>-
>-			if (!bond->curr_active_slave &&
>-			    time_before_eq(jiffies, dev_trans_start(slave->dev) +
>-					   delta_in_ticks)) {
>+			if ((!bond->curr_active_slave &&
>+			     time_before_eq(jiffies,
>+					    dev_trans_start(slave->dev) +
>+					    delta_in_ticks)) ||
>+			    bond->curr_active_slave != slave) {
> 				slave->link = BOND_LINK_UP;
>-				bond_change_active_slave(bond, slave);
> 				bond->current_arp_slave = NULL;
>
> 				pr_info(DRV_NAME
>-				       ": %s: %s is up and now the "
>-				       "active interface\n",
>-				       bond->dev->name, slave->dev->name);
>-
>-			} else if (bond->curr_active_slave != slave) {
>-				/* this slave has just come up but we
>-				 * already have a current slave; this can
>-				 * also happen if bond_enslave adds a new
>-				 * slave that is up while we are searching
>-				 * for a new slave
>-				 */
>-				slave->link = BOND_LINK_UP;
>-				bond_set_slave_inactive_flags(slave);
>-				bond->current_arp_slave = NULL;
>+					": %s: link status definitely "
>+					"up for interface %s.\n",
>+					bond->dev->name, slave->dev->name);
>
>-				pr_info(DRV_NAME
>-				       ": %s: backup interface %s is now up\n",
>-				       bond->dev->name, slave->dev->name);
>-			}
>+				if (!bond->curr_active_slave ||
>+				    (slave == bond->primary_slave))
>+					goto do_failover;
>
>-			write_unlock_bh(&bond->curr_slave_lock);
>+			}
>
>-			break;
>+			continue;
>
> 		case BOND_LINK_DOWN:
> 			if (slave->link_failure_count < UINT_MAX)
> 				slave->link_failure_count++;
>
> 			slave->link = BOND_LINK_DOWN;
>+			bond_set_slave_inactive_flags(slave);
>
>-			if (slave == bond->curr_active_slave) {
>-				pr_info(DRV_NAME
>-				       ": %s: link status down for active "
>-				       "interface %s, disabling it\n",
>-				       bond->dev->name, slave->dev->name);
>-
>-				bond_set_slave_inactive_flags(slave);
>-
>-				write_lock_bh(&bond->curr_slave_lock);
>-
>-				bond_select_active_slave(bond);
>-				if (bond->curr_active_slave)
>-					bond->curr_active_slave->jiffies =
>-						jiffies;
>-
>-				write_unlock_bh(&bond->curr_slave_lock);
>+			pr_info(DRV_NAME
>+				": %s: link status definitely down for "
>+				"interface %s, disabling it\n",
>+				bond->dev->name, slave->dev->name);
>
>+			if (slave == bond->curr_active_slave) {
> 				bond->current_arp_slave = NULL;
>-
>-			} else if (slave->state == BOND_STATE_BACKUP) {
>-				pr_info(DRV_NAME
>-				       ": %s: backup interface %s is now down\n",
>-				       bond->dev->name, slave->dev->name);
>-
>-				bond_set_slave_inactive_flags(slave);
>+				goto do_failover;
> 			}
>-			break;
>+
>+			continue;
>
> 		default:
> 			pr_err(DRV_NAME
> 			       ": %s: impossible: new_link %d on slave %s\n",
> 			       bond->dev->name, slave->new_link,
> 			       slave->dev->name);
>+			continue;
> 		}
>-	}
>
>-	/*
>-	 * No race with changes to primary via sysfs, as we hold rtnl.
>-	 */
>-	if (bond->primary_slave &&
>-	    (bond->primary_slave != bond->curr_active_slave) &&
>-	    (bond->primary_slave->link == BOND_LINK_UP)) {
>+do_failover:
>+		ASSERT_RTNL();
> 		write_lock_bh(&bond->curr_slave_lock);
>-		bond_change_active_slave(bond, bond->primary_slave);
>+		bond_select_active_slave(bond);
> 		write_unlock_bh(&bond->curr_slave_lock);
> 	}
>

^ permalink raw reply
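
The selection order that bond_find_best_slave() follows after the patch
above can be modelled in userspace roughly as below. This is a
simplified sketch: the struct, enum, and function names are
hypothetical, the slaves live in a plain array, and the scan starts at
index 0 rather than at the current active slave as the driver does.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

enum link_state { LINK_UP, LINK_FAIL, LINK_DOWN, LINK_BACK };

struct slave {
	enum link_state link;	/* bonding's view of the link */
	bool dev_up;		/* stands in for IS_UP(slave->dev) */
	int delay;		/* remaining updelay while in LINK_BACK */
};

/*
 * Pick the best slave: the primary if its link is up, else the first
 * slave whose link is up, else the "coming back" slave with the
 * shortest remaining delay. Returns NULL if nothing qualifies.
 */
static const struct slave *find_best_slave(const struct slave *slaves,
					   size_t n,
					   const struct slave *primary)
{
	const struct slave *best = NULL;
	int mintime = INT_MAX;
	size_t i;

	if (primary && primary->link == LINK_UP)
		return primary;

	for (i = 0; i < n; i++) {
		if (slaves[i].link == LINK_UP)
			return &slaves[i];
		if (slaves[i].link == LINK_BACK && slaves[i].dev_up &&
		    slaves[i].delay < mintime) {
			/* link up, but waiting for stabilization */
			mintime = slaves[i].delay;
			best = &slaves[i];
		}
	}
	return best;
}
```

Note that the primary check no longer depends on arp_interval, which is
what lets the ARP monitor share this path with the other modes.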

* Re: [PATCH 2/4] bonding: make sure tx and rx hash tables stay in sync when using alb mode
From: Andy Gospodarek @ 2009-09-16 23:44 UTC (permalink / raw)
  To: Jay Vosburgh; +Cc: Andy Gospodarek, netdev, bonding-devel
In-Reply-To: <27763.1253144169@death.nxdomain.ibm.com>

On Wed, Sep 16, 2009 at 04:36:09PM -0700, Jay Vosburgh wrote:
> Andy Gospodarek <andy@greyhouse.net> wrote:
> 
> >
> >Subject: [PATCH] bonding: make sure tx and rx hash tables stay in sync when using alb mode
> 
> 	When testing this, I'm getting a lockdep warning.  It appears to
> be unhappy that tlb_choose_channel acquires the tx / rx hash table locks
> in the order tx then rx, but rlb_choose_channel -> alb_get_best_slave
> acquires the locks in the other order.  I applied all four patches, but
> it looks like the change that trips lockdep is in this patch (#2).

Interesting.  I specifically enabled lockdep (or so I thought) because I
wanted to be sure by more than my inspection that there were no deadlock
possibilities.

> 
> 	I haven't gotten an actual deadlock from this, although it seems
> plausible if there are two cpus in bond_alb_xmit at the same time, and
> one of them is sending an ARP.
> 
> 	One fairly straightforward fix would be to combine the rx and tx
> hash table locks into a single lock.  I suspect that wouldn't have any
> real performance penalty, since the rx hash table lock is generally not
> acquired very often (unlike the tx lock, which is taken for every packet
> that goes out).
> 

That will probably work.  I'll take a look at this right away and see
how feasible that is.

> 	Also, FYI, two of the four patches had trailing whitespace.  I
> believe it was #2 and #4.

Grrr, I can't believe I did that. :-/

> 
> 	Thoughts?
> 
> 	Here's the lockdep warning:
> 
> Ethernet Channel Bonding Driver: v3.5.0 (November 4, 2008)
> bonding: In ALB mode you might experience client disconnections upon reconnection of a link if the bonding module updelay parameter (0 msec) is incompatible with the forwarding delay time of the switch
> bonding: MII link monitoring set to 10 ms
> ADDRCONF(NETDEV_UP): bond0: link is not ready
> tg3 0000:01:07.1: PME# disabled
> bonding: bond0: enslaving eth0 as an active interface with a down link.
> tg3 0000:01:07.0: PME# enabled
> tg3 0000:01:07.0: PME# disabled
> bonding: bond0: enslaving eth1 as an active interface with a down link.
> tg3: eth0: Link is up at 1000 Mbps, full duplex.
> tg3: eth0: Flow control is off for TX and off for RX.
> bonding: bond0: link status definitely up for interface eth0.
> bonding: bond0: making interface eth0 the new active one.
> bonding: bond0: first active interface up!
> ADDRCONF(NETDEV_CHANGE): bond0: link becomes ready
> tg3: eth1: Link is up at 1000 Mbps, full duplex.
> tg3: eth1: Flow control is off for TX and off for RX.
> bonding: bond0: link status definitely up for interface eth1.
> bonding: bond0: enslaving eth2 as an active interface with a down link.
> e1000: eth2 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX
> bonding: bond0: link status definitely up for interface eth2.
> 
> =======================================================
> [ INFO: possible circular locking dependency detected ]
> 2.6.31-locking #10
> -------------------------------------------------------
> swapper/0 is trying to acquire lock:
>  (&(bond_info->tx_hashtbl_lock)){+.-...}, at: [<e1863ec1>] alb_get_best_slave+0x1b/0x58 [bonding]
> 
> but task is already holding lock:
>  (&(bond_info->rx_hashtbl_lock)){+.-...}, at: [<e1864fe5>] bond_alb_xmit+0x1b7/0x60b [bonding]
> 
> which lock already depends on the new lock.
> 
> 
> the existing dependency chain (in reverse order) is:
> 
> -> #1 (&(bond_info->rx_hashtbl_lock)){+.-...}:
>        [<c014fcbb>] __lock_acquire+0x109f/0x13a0
>        [<c0150064>] lock_acquire+0xa8/0xbf
>        [<c0343678>] _spin_lock_bh+0x2a/0x39
>        [<e186532f>] bond_alb_xmit+0x501/0x60b [bonding]
>        [<e1861a6b>] bond_start_xmit+0x2e9/0x32e [bonding]
>        [<c02dc369>] dev_hard_start_xmit+0x281/0x314
>        [<c02dc81a>] dev_queue_xmit+0x338/0x41b
>        [<c02e1ddc>] neigh_resolve_output+0x260/0x28d
>        [<e170dece>] ip6_output2+0x2dc/0x32a [ipv6]
>        [<e170edaf>] ip6_output+0xe93/0xea0 [ipv6]
>        [<e171be2e>] ndisc_send_skb+0x19d/0x320 [ipv6]
>        [<e171bfeb>] __ndisc_send+0x3a/0x45 [ipv6]
>        [<e171d3df>] ndisc_send_rs+0x34/0x3c [ipv6]
>        [<e171127c>] addrconf_dad_completed+0x5e/0x99 [ipv6]
>        [<e17120df>] addrconf_dad_timer+0x5d/0xe1 [ipv6]
>        [<c0136692>] run_timer_softirq+0x1a0/0x219
>        [<c0132cd9>] __do_softirq+0xd6/0x1a3
>        [<c0132dd1>] do_softirq+0x2b/0x43
>        [<c0132f4e>] irq_exit+0x38/0x74
>        [<c0113669>] smp_apic_timer_interrupt+0x6e/0x7c
>        [<c01032fb>] apic_timer_interrupt+0x2f/0x34
>        [<c0101b43>] cpu_idle+0x49/0x76
>        [<c03335f3>] rest_init+0x67/0x69
>        [<c04937cd>] start_kernel+0x2c1/0x2c6
>        [<c049306a>] __init_begin+0x6a/0x6f
> 
> -> #0 (&(bond_info->tx_hashtbl_lock)){+.-...}:
>        [<c014fa46>] __lock_acquire+0xe2a/0x13a0
>        [<c0150064>] lock_acquire+0xa8/0xbf
>        [<c0343678>] _spin_lock_bh+0x2a/0x39
>        [<e1863ec1>] alb_get_best_slave+0x1b/0x58 [bonding]
>        [<e1865080>] bond_alb_xmit+0x252/0x60b [bonding]
>        [<e1861a6b>] bond_start_xmit+0x2e9/0x32e [bonding]
>        [<c02dc369>] dev_hard_start_xmit+0x281/0x314
>        [<c02dc81a>] dev_queue_xmit+0x338/0x41b
>        [<c03157a8>] arp_send+0x32/0x37
>        [<c0316033>] arp_solicit+0x18a/0x1a1
>        [<c02e3ce3>] neigh_timer_handler+0x1c9/0x20a
>        [<c0136692>] run_timer_softirq+0x1a0/0x219
>        [<c0132cd9>] __do_softirq+0xd6/0x1a3
>        [<c0132dd1>] do_softirq+0x2b/0x43
>        [<c0132f4e>] irq_exit+0x38/0x74
>        [<c0113669>] smp_apic_timer_interrupt+0x6e/0x7c
>        [<c01032fb>] apic_timer_interrupt+0x2f/0x34
>        [<c0101b43>] cpu_idle+0x49/0x76
>        [<c033e289>] start_secondary+0x1ab/0x1b2
> 
> other info that might help us debug this:
> 
> 5 locks held by swapper/0:
>  #0:  (&n->timer){+.-...}, at: [<c0136638>] run_timer_softirq+0x146/0x219
>  #1:  (rcu_read_lock){.+.+..}, at: [<c02dc696>] dev_queue_xmit+0x1b4/0x41b
>  #2:  (&bond->lock){++.?..}, at: [<e1864e6b>] bond_alb_xmit+0x3d/0x60b [bonding]
>  #3:  (&bond->curr_slave_lock){++.?..}, at: [<e1864e7e>] bond_alb_xmit+0x50/0x60b [bonding]
>  #4:  (&(bond_info->rx_hashtbl_lock)){+.-...}, at: [<e1864fe5>] bond_alb_xmit+0x1b7/0x60b [bonding]
> 
> stack backtrace:
> Pid: 0, comm: swapper Not tainted 2.6.31-locking #10
> Call Trace:
>  [<c03408d6>] ? printk+0xf/0x11
>  [<c014e7d5>] print_circular_bug+0x90/0x9c
>  [<c014fa46>] __lock_acquire+0xe2a/0x13a0
>  [<e1864fe5>] ? bond_alb_xmit+0x1b7/0x60b [bonding]
>  [<c0150064>] lock_acquire+0xa8/0xbf
>  [<e1863ec1>] ? alb_get_best_slave+0x1b/0x58 [bonding]
>  [<c0343678>] _spin_lock_bh+0x2a/0x39
>  [<e1863ec1>] ? alb_get_best_slave+0x1b/0x58 [bonding]
>  [<e1863ec1>] alb_get_best_slave+0x1b/0x58 [bonding]
>  [<e1865080>] bond_alb_xmit+0x252/0x60b [bonding]
>  [<c014dff7>] ? mark_held_locks+0x43/0x5b
>  [<c014e211>] ? trace_hardirqs_on_caller+0xe6/0x120
>  [<e1861a6b>] bond_start_xmit+0x2e9/0x32e [bonding]
>  [<c02dc369>] dev_hard_start_xmit+0x281/0x314
>  [<c02dc81a>] dev_queue_xmit+0x338/0x41b
>  [<c031579c>] ? arp_send+0x26/0x37
>  [<c03157a8>] arp_send+0x32/0x37
>  [<c0316033>] arp_solicit+0x18a/0x1a1
>  [<c02e3ce3>] neigh_timer_handler+0x1c9/0x20a
>  [<c0136692>] run_timer_softirq+0x1a0/0x219
>  [<c0136638>] ? run_timer_softirq+0x146/0x219
>  [<c02e3b1a>] ? neigh_timer_handler+0x0/0x20a
>  [<c0132cd9>] __do_softirq+0xd6/0x1a3
>  [<c0132dd1>] do_softirq+0x2b/0x43
>  [<c0132f4e>] irq_exit+0x38/0x74
>  [<c0113669>] smp_apic_timer_interrupt+0x6e/0x7c
>  [<c01032fb>] apic_timer_interrupt+0x2f/0x34
>  [<c0101b3d>] ? cpu_idle+0x43/0x76
>  [<c014007b>] ? sys_timer_create+0x205/0x304
>  [<c0108125>] ? default_idle+0x8a/0xef
>  [<c014b25c>] ? tick_nohz_restart_sched_tick+0x12f/0x138
>  [<c0101b43>] cpu_idle+0x49/0x76
>  [<c033e289>] start_secondary+0x1ab/0x1b2
> 
> 
> 	-J
> 
> ---
> 	-Jay Vosburgh, IBM Linux Technology Center, fubar@us.ibm.com

^ permalink raw reply
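
The lock-ordering complaint discussed above can be illustrated with a
toy order checker. It is a sketch only and far simpler than lockdep: it
tracks two hypothetical locks in a single thread and flags a direct
pairwise inversion, nothing more. All names are made up for the example.

```c
#include <stdbool.h>

enum { LOCK_TX, LOCK_RX, NLOCKS };

/* seen_before[a][b]: at some point b was acquired while a was held. */
static bool seen_before[NLOCKS][NLOCKS];
static bool held[NLOCKS];

/*
 * Record the acquisition of a lock. Returns false if it is taken while
 * holding a lock that was previously acquired after it, i.e. if the
 * two locks have now been seen in both orders.
 */
static bool model_lock(int lock)
{
	bool ok = true;
	int h;

	for (h = 0; h < NLOCKS; h++) {
		if (!held[h] || h == lock)
			continue;
		if (seen_before[lock][h])
			ok = false;	/* inversion: lock->h, now h->lock */
		seen_before[h][lock] = true;
	}
	held[lock] = true;
	return ok;
}

static void model_unlock(int lock)
{
	held[lock] = false;
}

/* One path takes the tx table lock, then the rx table lock. */
static bool path_tx_then_rx(void)
{
	bool ok = model_lock(LOCK_TX);

	ok = model_lock(LOCK_RX) && ok;
	model_unlock(LOCK_RX);
	model_unlock(LOCK_TX);
	return ok;
}

/* Another path takes the same two locks in the opposite order. */
static bool path_rx_then_tx(void)
{
	bool ok = model_lock(LOCK_RX);

	ok = model_lock(LOCK_TX) && ok;
	model_unlock(LOCK_TX);
	model_unlock(LOCK_RX);
	return ok;
}
```

Running path_tx_then_rx() and then path_rx_then_tx() makes the second
call report the inversion even though nothing deadlocked, which matches
the situation in the report: a warning without an observed hang. With
the two hash table locks merged into one, there is no second lock and
therefore no order to get wrong.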

* Re: [PATCH 2/4] bonding: make sure tx and rx hash tables stay in sync when using alb mode
From: Jay Vosburgh @ 2009-09-16 23:36 UTC (permalink / raw)
  To: Andy Gospodarek; +Cc: netdev, bonding-devel
In-Reply-To: <20090911211112.GR8515@gospo.rdu.redhat.com>

Andy Gospodarek <andy@greyhouse.net> wrote:

>
>Subject: [PATCH] bonding: make sure tx and rx hash tables stay in sync when using alb mode

	When testing this, I'm getting a lockdep warning.  It appears to
be unhappy that tlb_choose_channel acquires the tx / rx hash table locks
in the order tx then rx, but rlb_choose_channel -> alb_get_best_slave
acquires the locks in the other order.  I applied all four patches, but
it looks like the change that trips lockdep is in this patch (#2).

	I haven't gotten an actual deadlock from this, although it seems
plausible if there are two cpus in bond_alb_xmit at the same time, and
one of them is sending an ARP.

	One fairly straightforward fix would be to combine the rx and tx
hash table locks into a single lock.  I suspect that wouldn't have any
real performance penalty, since the rx hash table lock is generally not
acquired very often (unlike the tx lock, which is taken for every packet
that goes out).

	Also, FYI, two of the four patches had trailing whitespace.  I
believe it was #2 and #4.

	Thoughts?

	Here's the lockdep warning:

Ethernet Channel Bonding Driver: v3.5.0 (November 4, 2008)
bonding: In ALB mode you might experience client disconnections upon reconnection of a link if the bonding module updelay parameter (0 msec) is incompatible with the forwarding delay time of the switch
bonding: MII link monitoring set to 10 ms
ADDRCONF(NETDEV_UP): bond0: link is not ready
tg3 0000:01:07.1: PME# disabled
bonding: bond0: enslaving eth0 as an active interface with a down link.
tg3 0000:01:07.0: PME# enabled
tg3 0000:01:07.0: PME# disabled
bonding: bond0: enslaving eth1 as an active interface with a down link.
tg3: eth0: Link is up at 1000 Mbps, full duplex.
tg3: eth0: Flow control is off for TX and off for RX.
bonding: bond0: link status definitely up for interface eth0.
bonding: bond0: making interface eth0 the new active one.
bonding: bond0: first active interface up!
ADDRCONF(NETDEV_CHANGE): bond0: link becomes ready
tg3: eth1: Link is up at 1000 Mbps, full duplex.
tg3: eth1: Flow control is off for TX and off for RX.
bonding: bond0: link status definitely up for interface eth1.
bonding: bond0: enslaving eth2 as an active interface with a down link.
e1000: eth2 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX
bonding: bond0: link status definitely up for interface eth2.

=======================================================
[ INFO: possible circular locking dependency detected ]
2.6.31-locking #10
-------------------------------------------------------
swapper/0 is trying to acquire lock:
 (&(bond_info->tx_hashtbl_lock)){+.-...}, at: [<e1863ec1>] alb_get_best_slave+0x1b/0x58 [bonding]

but task is already holding lock:
 (&(bond_info->rx_hashtbl_lock)){+.-...}, at: [<e1864fe5>] bond_alb_xmit+0x1b7/0x60b [bonding]

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #1 (&(bond_info->rx_hashtbl_lock)){+.-...}:
       [<c014fcbb>] __lock_acquire+0x109f/0x13a0
       [<c0150064>] lock_acquire+0xa8/0xbf
       [<c0343678>] _spin_lock_bh+0x2a/0x39
       [<e186532f>] bond_alb_xmit+0x501/0x60b [bonding]
       [<e1861a6b>] bond_start_xmit+0x2e9/0x32e [bonding]
       [<c02dc369>] dev_hard_start_xmit+0x281/0x314
       [<c02dc81a>] dev_queue_xmit+0x338/0x41b
       [<c02e1ddc>] neigh_resolve_output+0x260/0x28d
       [<e170dece>] ip6_output2+0x2dc/0x32a [ipv6]
       [<e170edaf>] ip6_output+0xe93/0xea0 [ipv6]
       [<e171be2e>] ndisc_send_skb+0x19d/0x320 [ipv6]
       [<e171bfeb>] __ndisc_send+0x3a/0x45 [ipv6]
       [<e171d3df>] ndisc_send_rs+0x34/0x3c [ipv6]
       [<e171127c>] addrconf_dad_completed+0x5e/0x99 [ipv6]
       [<e17120df>] addrconf_dad_timer+0x5d/0xe1 [ipv6]
       [<c0136692>] run_timer_softirq+0x1a0/0x219
       [<c0132cd9>] __do_softirq+0xd6/0x1a3
       [<c0132dd1>] do_softirq+0x2b/0x43
       [<c0132f4e>] irq_exit+0x38/0x74
       [<c0113669>] smp_apic_timer_interrupt+0x6e/0x7c
       [<c01032fb>] apic_timer_interrupt+0x2f/0x34
       [<c0101b43>] cpu_idle+0x49/0x76
       [<c03335f3>] rest_init+0x67/0x69
       [<c04937cd>] start_kernel+0x2c1/0x2c6
       [<c049306a>] __init_begin+0x6a/0x6f

-> #0 (&(bond_info->tx_hashtbl_lock)){+.-...}:
       [<c014fa46>] __lock_acquire+0xe2a/0x13a0
       [<c0150064>] lock_acquire+0xa8/0xbf
       [<c0343678>] _spin_lock_bh+0x2a/0x39
       [<e1863ec1>] alb_get_best_slave+0x1b/0x58 [bonding]
       [<e1865080>] bond_alb_xmit+0x252/0x60b [bonding]
       [<e1861a6b>] bond_start_xmit+0x2e9/0x32e [bonding]
       [<c02dc369>] dev_hard_start_xmit+0x281/0x314
       [<c02dc81a>] dev_queue_xmit+0x338/0x41b
       [<c03157a8>] arp_send+0x32/0x37
       [<c0316033>] arp_solicit+0x18a/0x1a1
       [<c02e3ce3>] neigh_timer_handler+0x1c9/0x20a
       [<c0136692>] run_timer_softirq+0x1a0/0x219
       [<c0132cd9>] __do_softirq+0xd6/0x1a3
       [<c0132dd1>] do_softirq+0x2b/0x43
       [<c0132f4e>] irq_exit+0x38/0x74
       [<c0113669>] smp_apic_timer_interrupt+0x6e/0x7c
       [<c01032fb>] apic_timer_interrupt+0x2f/0x34
       [<c0101b43>] cpu_idle+0x49/0x76
       [<c033e289>] start_secondary+0x1ab/0x1b2

other info that might help us debug this:

5 locks held by swapper/0:
 #0:  (&n->timer){+.-...}, at: [<c0136638>] run_timer_softirq+0x146/0x219
 #1:  (rcu_read_lock){.+.+..}, at: [<c02dc696>] dev_queue_xmit+0x1b4/0x41b
 #2:  (&bond->lock){++.?..}, at: [<e1864e6b>] bond_alb_xmit+0x3d/0x60b [bonding]
 #3:  (&bond->curr_slave_lock){++.?..}, at: [<e1864e7e>] bond_alb_xmit+0x50/0x60b [bonding]
 #4:  (&(bond_info->rx_hashtbl_lock)){+.-...}, at: [<e1864fe5>] bond_alb_xmit+0x1b7/0x60b [bonding]

stack backtrace:
Pid: 0, comm: swapper Not tainted 2.6.31-locking #10
Call Trace:
 [<c03408d6>] ? printk+0xf/0x11
 [<c014e7d5>] print_circular_bug+0x90/0x9c
 [<c014fa46>] __lock_acquire+0xe2a/0x13a0
 [<e1864fe5>] ? bond_alb_xmit+0x1b7/0x60b [bonding]
 [<c0150064>] lock_acquire+0xa8/0xbf
 [<e1863ec1>] ? alb_get_best_slave+0x1b/0x58 [bonding]
 [<c0343678>] _spin_lock_bh+0x2a/0x39
 [<e1863ec1>] ? alb_get_best_slave+0x1b/0x58 [bonding]
 [<e1863ec1>] alb_get_best_slave+0x1b/0x58 [bonding]
 [<e1865080>] bond_alb_xmit+0x252/0x60b [bonding]
 [<c014dff7>] ? mark_held_locks+0x43/0x5b
 [<c014e211>] ? trace_hardirqs_on_caller+0xe6/0x120
 [<e1861a6b>] bond_start_xmit+0x2e9/0x32e [bonding]
 [<c02dc369>] dev_hard_start_xmit+0x281/0x314
 [<c02dc81a>] dev_queue_xmit+0x338/0x41b
 [<c031579c>] ? arp_send+0x26/0x37
 [<c03157a8>] arp_send+0x32/0x37
 [<c0316033>] arp_solicit+0x18a/0x1a1
 [<c02e3ce3>] neigh_timer_handler+0x1c9/0x20a
 [<c0136692>] run_timer_softirq+0x1a0/0x219
 [<c0136638>] ? run_timer_softirq+0x146/0x219
 [<c02e3b1a>] ? neigh_timer_handler+0x0/0x20a
 [<c0132cd9>] __do_softirq+0xd6/0x1a3
 [<c0132dd1>] do_softirq+0x2b/0x43
 [<c0132f4e>] irq_exit+0x38/0x74
 [<c0113669>] smp_apic_timer_interrupt+0x6e/0x7c
 [<c01032fb>] apic_timer_interrupt+0x2f/0x34
 [<c0101b3d>] ? cpu_idle+0x43/0x76
 [<c014007b>] ? sys_timer_create+0x205/0x304
 [<c0108125>] ? default_idle+0x8a/0xef
 [<c014b25c>] ? tick_nohz_restart_sched_tick+0x12f/0x138
 [<c0101b43>] cpu_idle+0x49/0x76
 [<c033e289>] start_secondary+0x1ab/0x1b2


	-J

---
	-Jay Vosburgh, IBM Linux Technology Center, fubar@us.ibm.com

^ permalink raw reply

* Re: [E1000-devel] [BUG 2.6.30+] e100 sometimes causes oops during resume
From: Rafael J. Wysocki @ 2009-09-16 23:11 UTC (permalink / raw)
  To: Graham, David
  Cc: Karol Lewandowski, linux-kernel@vger.kernel.org,
	e1000-devel@lists.sourceforge.net, netdev@vger.kernel.org
In-Reply-To: <13830B75AD5A2F42848F92269B11996F5BF592C3@orsmsx509.amr.corp.intel.com>

On Wednesday 16 September 2009, Graham, David wrote:
> A v2.6.30..v2.6.31 diff shows that this is probably exposed by Rafael Wysocki's
> commit 6905b1f1, which now allows systems with e100 to sleep. If I understand
> correctly, it looks like these systems simply couldn't sleep before. Is that right Rafael?

The systems where e100 is not power manageable by any means couldn't suspend
before that commit.  For the other systems, where e100 is power manageable
either with ACPI or natively, the commit doesn't change anything.

> I don't think it's likely that the commit is a direct cause of the problem; rather, the
> suspend/resume cycle now allows us to see another issue. Maybe e100 is
> leaking memory on suspend/resume cycles, or something else is leaking memory,
> or memory is becoming fragmented and the e100 driver is improperly
> requesting, and being refused, an 'atomic' memory allocation from a heavily
> fragmented memory map. Or something else.

I have a couple of test systems with e100 that don't have any resume problems,
FWIW.

Thanks,
Rafael

^ permalink raw reply

