Netdev List
 help / color / mirror / Atom feed
* Re: [PATCH 3/7] flow: allocate hash table for online cpus only
From: Rusty Russell @ 2010-03-31 12:32 UTC (permalink / raw)
  To: Herbert Xu; +Cc: Timo Teras, netdev
In-Reply-To: <20100330121255.GC5731@gondor.apana.org.au>

On Tue, 30 Mar 2010 10:42:55 pm Herbert Xu wrote:
> On Mon, Mar 29, 2010 at 05:12:40PM +0300, Timo Teras wrote:
> > Instead of unconditionally allocating hash table for all possible
> > cpu's, allocate it only for online cpu's and release related
> > memory if cpu goes down.
> > 
> > Signed-off-by: Timo Teras <timo.teras@iki.fi>
> 
> Hmm that's where we started but then Rusty changed it back in 2004:
> 
> commit 0a32dc4d8e83c48f7535d66731eb35d1916b39a8
> Author: Rusty Russell <rusty@rustcorp.com.au>
> Date:   Wed Jan 21 18:14:37 2004 -0800
> 
>     [NET]: Simplify net/flow.c per-cpu handling.
> 
>     The cpu handling in net/core/flow.c is complex: it tries to allocate
>     flow cache as each CPU comes up.  It might as well allocate them for
>     each possible CPU at boot.
> 
> So I'd like to hear his opinion on changing it back again.

It was pretty unique at the time, it no longer is, so the arguments are less
compelling IMHO.

However, we can now use a dynamic percpu variable and get it as a real
per-cpu thing (which currently means it *will* be for every available cpu,
not just online ones).  Haven't thought about it, but that change might be
worth considering instead?

Thanks!
Rusty.

^ permalink raw reply

* [PATCH] can: Add driver for SJA1000 based PCI CAN interface cards by esd
From: Matthias Fuchs @ 2010-03-31 12:28 UTC (permalink / raw)
  To: netdev; +Cc: Socketcan-core

This patch adds support for SJA1000 based PCI CAN interface cards
from electronic system design gmbh.

These boards are supported:

        CAN-PCI/200 (PCI)
        CAN-PCI/266 (PCI)
        CAN-PMC266 (PMC module)
        CAN-PCIe/2000 (PCI Express)
        CAN-CPCI/200 (Compact PCI, 3U)
        CAN-PCI104 (PCI104)

This driver is part of the SocketCAN SVN repository since
April 2009.

Signed-off-by: Matthias Fuchs <matthias.fuchs@esd.eu>
---
 drivers/net/can/sja1000/Kconfig   |    9 +
 drivers/net/can/sja1000/Makefile  |    1 +
 drivers/net/can/sja1000/esd_pci.c |  315 +++++++++++++++++++++++++++++++++++++
 3 files changed, 325 insertions(+), 0 deletions(-)
 create mode 100644 drivers/net/can/sja1000/esd_pci.c

diff --git a/drivers/net/can/sja1000/Kconfig b/drivers/net/can/sja1000/Kconfig
index 9e277d6..c01eb4e 100644
--- a/drivers/net/can/sja1000/Kconfig
+++ b/drivers/net/can/sja1000/Kconfig
@@ -37,6 +37,15 @@ config CAN_EMS_PCI
 	  CPC-PCIe and CPC-104P cards from EMS Dr. Thomas Wuensche
 	  (http://www.ems-wuensche.de).
 
+config CAN_ESD_PCI
+	tristate "ESD PCI Cards"
+	depends on PCI
+	---help---
+	  This driver supports the esd PCI CAN cards CAN-PCI/200,
+	  CAN-PCI/266, CAN-PMC/266 (PMC), CAN-CPCI/200 (CompactPCI),
+	  CAN-PCIe2000 (PCI Express) and CAN-PCI104/200 (PCI104)
+	  from the esd electronic system design gmbh (http://www.esd.eu).
+
 config CAN_KVASER_PCI
 	tristate "Kvaser PCIcanx and Kvaser PCIcan PCI Cards"
 	depends on PCI
diff --git a/drivers/net/can/sja1000/Makefile b/drivers/net/can/sja1000/Makefile
index ce92455..a8dcbe0 100644
--- a/drivers/net/can/sja1000/Makefile
+++ b/drivers/net/can/sja1000/Makefile
@@ -7,6 +7,7 @@ obj-$(CONFIG_CAN_SJA1000_ISA) += sja1000_isa.o
 obj-$(CONFIG_CAN_SJA1000_PLATFORM) += sja1000_platform.o
 obj-$(CONFIG_CAN_SJA1000_OF_PLATFORM) += sja1000_of_platform.o
 obj-$(CONFIG_CAN_EMS_PCI) += ems_pci.o
+obj-$(CONFIG_CAN_ESD_PCI) += esd_pci.o
 obj-$(CONFIG_CAN_KVASER_PCI) += kvaser_pci.o
 obj-$(CONFIG_CAN_PLX_PCI) += plx_pci.o
 
diff --git a/drivers/net/can/sja1000/esd_pci.c b/drivers/net/can/sja1000/esd_pci.c
new file mode 100644
index 0000000..98389f2
--- /dev/null
+++ b/drivers/net/can/sja1000/esd_pci.c
@@ -0,0 +1,315 @@
+/*
+ * Copyright (C) 2007 Wolfgang Grandegger <wg@grandegger.com>
+ * Copyright (C) 2008 Sascha Hauer <s.hauer@pengutronix.de>, Pengutronix
+ * Copyright (C) 2009 Matthias Fuchs <matthias.fuchs@esd.eu>, esd gmbh
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the version 2 of the GNU General Public License
+ * as published by the Free Software Foundation
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software Foundation,
+ * Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
+ */
+
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/interrupt.h>
+#include <linux/netdevice.h>
+#include <linux/delay.h>
+#include <linux/pci.h>
+#include <linux/can.h>
+#include <linux/can/dev.h>
+#include <linux/io.h>
+
+#include "sja1000.h"
+
+#define DRV_NAME  "esd_pci"
+
+MODULE_AUTHOR("Matthias Fuchs <matthias.fuchs@esd.eu");
+MODULE_DESCRIPTION("Socket-CAN driver for esd PCI/PMC/CPCI/PCIe/PCI104 " \
+		   "CAN cards");
+MODULE_SUPPORTED_DEVICE("esd CAN-PCI/200, CAN-PCI/266, CAN-PMC266, " \
+			"CAN-PCIe/2000, CAN-CPCI/200, CAN-PCI104");
+MODULE_LICENSE("GPL v2");
+
+/* Maximum number of interfaces supported on one card. */
+#define ESD_PCI_MAX_CAN 2
+
+struct esd_pci {
+	struct pci_dev *pci_dev;
+	struct net_device *dev[ESD_PCI_MAX_CAN];
+	void __iomem *conf_addr;
+	void __iomem *base_addr;
+};
+
+#define ESD_PCI_CAN_CLOCK	(16000000 / 2)
+
+#define ESD_PCI_OCR		(OCR_TX0_PUSHPULL | OCR_TX1_PUSHPULL)
+#define ESD_PCI_CDR		0
+
+#define CHANNEL_OFFSET		0x100
+
+#define INTCSR_OFFSET		0x4c /* Offset in PLX9050 conf registers */
+#define INTCSR_LINTI1		(1 << 0)
+#define INTCSR_PCI		(1 << 6)
+
+#define INTCSR9056_OFFSET	0x68 /* Offset in PLX9056 conf registers */
+#define INTCSR9056_LINTI	(1 << 11)
+#define INTCSR9056_PCI		(1 << 8)
+
+/* PCI subsystem IDs of esd's SJA1000 based CAN cards */
+
+/* CAN-PCI/200: PCI, 33MHz only, bridge: PLX9050 */
+#define ESD_PCI_SUB_SYS_ID_PCI200	0x0004
+
+/* CAN-PCI/266: PCI, 33/66MHz, bridge: PLX9056 */
+#define ESD_PCI_SUB_SYS_ID_PCI266	0x0009
+
+/* CAN-PMC/266: PMC module, 33/66MHz, bridge: PLX9056 */
+#define ESD_PCI_SUB_SYS_ID_PMC266	0x000e
+
+/* CAN-CPCI/200: Compact PCI, 33MHz only, bridge: PLX9030 */
+#define ESD_PCI_SUB_SYS_ID_CPCI200	0x010b
+
+/* CAN-PCIE/2000: PCI Express 1x, bridge: PEX8311 = PEX8111 + PLX9056 */
+#define ESD_PCI_SUB_SYS_ID_PCIE2000	0x0200
+
+/* CAN-PCI/104: PCI104 module, 33MHz only, bridge: PLX9030 */
+#define ESD_PCI_SUB_SYS_ID_PCI104200	0x0501
+
+static DEFINE_PCI_DEVICE_TABLE(esd_pci_tbl) = {
+	{PCI_VENDOR_ID_PLX, PCI_DEVICE_ID_PLX_9050,
+	 PCI_VENDOR_ID_ESDGMBH, ESD_PCI_SUB_SYS_ID_PCI200},
+	{PCI_VENDOR_ID_PLX, PCI_DEVICE_ID_PLX_9056,
+	 PCI_VENDOR_ID_ESDGMBH, ESD_PCI_SUB_SYS_ID_PCI266},
+	{PCI_VENDOR_ID_PLX, PCI_DEVICE_ID_PLX_9056,
+	 PCI_VENDOR_ID_ESDGMBH, ESD_PCI_SUB_SYS_ID_PMC266},
+	{PCI_VENDOR_ID_PLX, PCI_DEVICE_ID_PLX_9030,
+	 PCI_VENDOR_ID_ESDGMBH, ESD_PCI_SUB_SYS_ID_CPCI200},
+	{PCI_VENDOR_ID_PLX, PCI_DEVICE_ID_PLX_9056,
+	 PCI_VENDOR_ID_ESDGMBH, ESD_PCI_SUB_SYS_ID_PCIE2000},
+	{PCI_VENDOR_ID_PLX, PCI_DEVICE_ID_PLX_9030,
+	 PCI_VENDOR_ID_ESDGMBH, ESD_PCI_SUB_SYS_ID_PCI104200},
+	{0,}
+};
+
+#define ESD_PCI_BASE_SIZE  0x200
+
+MODULE_DEVICE_TABLE(pci, esd_pci_tbl);
+
+static u8 esd_pci_read_reg(const struct sja1000_priv *priv, int port)
+{
+	return readb(priv->reg_base + port);
+}
+
+static void esd_pci_write_reg(const struct sja1000_priv *priv, int port, u8 val)
+{
+	writeb(val, priv->reg_base + port);
+}
+
+static void esd_pci_del_chan(struct pci_dev *pdev, struct net_device *ndev)
+{
+	dev_info(&pdev->dev, "Removing device %s\n", ndev->name);
+
+	unregister_sja1000dev(ndev);
+
+	free_sja1000dev(ndev);
+}
+
+static struct net_device * __devinit esd_pci_add_chan(struct pci_dev *pdev,
+						      void __iomem *base_addr)
+{
+	struct net_device *ndev;
+	struct sja1000_priv *priv;
+	int err;
+
+	ndev = alloc_sja1000dev(0);
+	if (ndev == NULL)
+		return ERR_PTR(-ENOMEM);
+
+	priv = netdev_priv(ndev);
+
+	priv->reg_base = base_addr;
+
+	priv->read_reg = esd_pci_read_reg;
+	priv->write_reg = esd_pci_write_reg;
+
+	priv->can.clock.freq = ESD_PCI_CAN_CLOCK;
+
+	priv->ocr = ESD_PCI_OCR;
+	priv->cdr = ESD_PCI_CDR;
+
+	/* Set and enable PCI interrupts */
+	priv->irq_flags = IRQF_SHARED;
+	ndev->irq = pdev->irq;
+
+	dev_dbg(&pdev->dev, "reg_base=0x%p irq=%d\n",
+			priv->reg_base, ndev->irq);
+
+	SET_NETDEV_DEV(ndev, &pdev->dev);
+
+	err = register_sja1000dev(ndev);
+	if (err) {
+		dev_err(&pdev->dev, "Failed to register (err=%d)\n", err);
+		goto failure;
+	}
+
+	return ndev;
+
+failure:
+	free_sja1000dev(ndev);
+	return ERR_PTR(err);
+}
+
+static int __devinit esd_pci_init_one(struct pci_dev *pdev,
+				       const struct pci_device_id *ent)
+{
+	struct esd_pci *board;
+	int err;
+	void __iomem *base_addr;
+	void __iomem *conf_addr;
+
+	dev_info(&pdev->dev,
+		 "Initializing device %04x:%04x %04x:%04x\n",
+		 pdev->vendor, pdev->device,
+		 pdev->subsystem_vendor, pdev->subsystem_device);
+
+	board = kzalloc(sizeof(*board), GFP_KERNEL);
+	if (!board)
+		return -ENOMEM;
+
+	err = pci_enable_device(pdev);
+	if (err)
+		goto failure;
+
+	err = pci_request_regions(pdev, DRV_NAME);
+	if (err)
+		goto failure;
+
+	conf_addr = pci_iomap(pdev, 0, ESD_PCI_BASE_SIZE);
+	if (conf_addr == NULL) {
+		err = -ENODEV;
+		goto failure_release_pci;
+	}
+
+	board->conf_addr = conf_addr;
+
+	base_addr = pci_iomap(pdev, 2, ESD_PCI_BASE_SIZE);
+	if (base_addr == NULL) {
+		err = -ENODEV;
+		goto failure_iounmap_conf;
+	}
+
+	board->base_addr = base_addr;
+
+	board->dev[0] = esd_pci_add_chan(pdev, base_addr);
+	if (IS_ERR(board->dev[0]))
+		goto failure_iounmap_base;
+
+	/* Check if second channel is available */
+	writeb(MOD_RM, base_addr + CHANNEL_OFFSET + REG_MOD);
+	writeb(CDR_CBP, base_addr + CHANNEL_OFFSET + REG_CDR);
+	writeb(MOD_RM, base_addr + CHANNEL_OFFSET + REG_MOD);
+	if (readb(base_addr + CHANNEL_OFFSET + REG_MOD) == 0x21) {
+		writeb(MOD_SM | MOD_AFM | MOD_STM | MOD_LOM | MOD_RM,
+		       base_addr + CHANNEL_OFFSET + REG_MOD);
+		if (readb(base_addr + CHANNEL_OFFSET + REG_MOD) == 0x3f) {
+			writeb(MOD_RM, base_addr + CHANNEL_OFFSET + REG_MOD);
+			board->dev[1] =
+				esd_pci_add_chan(pdev,
+						 base_addr + CHANNEL_OFFSET);
+			if (IS_ERR(board->dev[1]))
+				goto failure_unreg_dev0;
+		} else
+			writeb(MOD_RM, base_addr + CHANNEL_OFFSET + REG_MOD);
+	} else
+		writeb(MOD_RM, base_addr + CHANNEL_OFFSET + REG_MOD);
+
+	if ((pdev->device == PCI_DEVICE_ID_PLX_9050) ||
+	    (pdev->device == PCI_DEVICE_ID_PLX_9030)) {
+		/* Enable interrupts in PLX9050 */
+		writel(INTCSR_LINTI1 | INTCSR_PCI,
+		       board->conf_addr + INTCSR_OFFSET);
+	} else {
+		/* Enable interrupts in PLX9056*/
+		writel(INTCSR9056_LINTI | INTCSR9056_PCI,
+		       board->conf_addr + INTCSR9056_OFFSET);
+	}
+
+	pci_set_drvdata(pdev, board);
+
+	return 0;
+
+failure_unreg_dev0:
+	esd_pci_del_chan(pdev, board->dev[0]);
+
+failure_iounmap_base:
+	pci_iounmap(pdev, board->base_addr);
+
+failure_iounmap_conf:
+	pci_iounmap(pdev, board->conf_addr);
+
+failure_release_pci:
+	pci_release_regions(pdev);
+
+failure:
+	kfree(board);
+
+	return err;
+}
+
+static void esd_pci_remove_one(struct pci_dev *pdev)
+{
+	struct esd_pci *board = pci_get_drvdata(pdev);
+	int i;
+
+	if ((pdev->device == PCI_DEVICE_ID_PLX_9050) ||
+	    (pdev->device == PCI_DEVICE_ID_PLX_9030)) {
+		/* Disable interrupts in PLX9050*/
+		writel(0, board->conf_addr + INTCSR_OFFSET);
+	} else {
+		/* Disable interrupts in PLX9056*/
+		writel(0, board->conf_addr + INTCSR9056_OFFSET);
+	}
+
+	for (i = 0; i < ESD_PCI_MAX_CAN; i++) {
+		if (!board->dev[i])
+			break;
+		esd_pci_del_chan(pdev, board->dev[i]);
+	}
+
+	pci_iounmap(pdev, board->base_addr);
+	pci_iounmap(pdev, board->conf_addr);
+
+	pci_release_regions(pdev);
+	pci_disable_device(pdev);
+	pci_set_drvdata(pdev, NULL);
+
+	kfree(board);
+}
+
+static struct pci_driver esd_pci_driver = {
+	.name = DRV_NAME,
+	.id_table = esd_pci_tbl,
+	.probe = esd_pci_init_one,
+	.remove = esd_pci_remove_one,
+};
+
+static int __init esd_pci_init(void)
+{
+	return pci_register_driver(&esd_pci_driver);
+}
+
+static void __exit esd_pci_exit(void)
+{
+	pci_unregister_driver(&esd_pci_driver);
+}
+
+module_init(esd_pci_init);
+module_exit(esd_pci_exit);
-- 
1.6.1


^ permalink raw reply related

* Re: [r8169] WARNING: at net/sched/sch_generic.c
From: Eric Dumazet @ 2010-03-31 12:29 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Neil Horman, netdev, Francois Romieu, David S. Miller,
	linux-kernel
In-Reply-To: <20100331121426.GB3294@swordfish.minsk.epam.com>

Le mercredi 31 mars 2010 à 15:14 +0300, Sergey Senozhatsky a écrit :


> PKT_SIZE="pkt_size 2048"
> 

If you use 1024 bytes pktgen messages, do you still have the problem ?



^ permalink raw reply

* Re: [PATCH] netdev/fec.c: add phylib supporting to enable carrier detection (v2)
From: Sascha Hauer @ 2010-03-31 12:26 UTC (permalink / raw)
  To: Bryan Wu
  Cc: gerg, amit.kucheria, w.sang, kernel-team, netdev,
	linux-arm-kernel, linux-kernel
In-Reply-To: <1270037444-8470-1-git-send-email-bryan.wu@canonical.com>

On Wed, Mar 31, 2010 at 08:10:44PM +0800, Bryan Wu wrote:
> BugLink: http://bugs.launchpad.net/bugs/457878
> 
> v2:
>  - remove duplicated phy_speed caculation
>  - fix the phy_speed caculation according to the DataSheet
> 
> v1:
>  - removed old MII phy control code
>  - add phylib supporting
>  - add ethtool interface to make user space NetworkManager works
> 
> Tested on Freescale i.MX51 Babbage board.
> 
> This patch is based on a patch from Frederic Rodo <fred.rodo@gmail.com>
> 
> Cc: Frederic Rodo <fred.rodo@gmail.com>
> Signed-off-by: Bryan Wu <bryan.wu@canonical.com>

Acked-by: Sascha Hauer <s.hauer@pengutronix.de>

Sascha

> ---
>  drivers/net/Kconfig |    1 +
>  drivers/net/fec.c   | 1128 ++++++++++++---------------------------------------
>  2 files changed, 252 insertions(+), 877 deletions(-)
> 
> diff --git a/drivers/net/Kconfig b/drivers/net/Kconfig
> index 0ba5b8e..41f6a70 100644
> --- a/drivers/net/Kconfig
> +++ b/drivers/net/Kconfig
> @@ -1916,6 +1916,7 @@ config FEC
>  	bool "FEC ethernet controller (of ColdFire and some i.MX CPUs)"
>  	depends on M523x || M527x || M5272 || M528x || M520x || M532x || \
>  		MACH_MX27 || ARCH_MX35 || ARCH_MX25 || ARCH_MX5
> +	select PHYLIB
>  	help
>  	  Say Y here if you want to use the built-in 10/100 Fast ethernet
>  	  controller on some Motorola ColdFire and Freescale i.MX processors.
> diff --git a/drivers/net/fec.c b/drivers/net/fec.c
> index 9f98c1c..848eb19 100644
> --- a/drivers/net/fec.c
> +++ b/drivers/net/fec.c
> @@ -40,6 +40,7 @@
>  #include <linux/irq.h>
>  #include <linux/clk.h>
>  #include <linux/platform_device.h>
> +#include <linux/phy.h>
>  
>  #include <asm/cacheflush.h>
>  
> @@ -61,7 +62,6 @@
>   * Define the fixed address of the FEC hardware.
>   */
>  #if defined(CONFIG_M5272)
> -#define HAVE_mii_link_interrupt
>  
>  static unsigned char	fec_mac_default[] = {
>  	0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
> @@ -86,23 +86,6 @@ static unsigned char	fec_mac_default[] = {
>  #endif
>  #endif /* CONFIG_M5272 */
>  
> -/* Forward declarations of some structures to support different PHYs */
> -
> -typedef struct {
> -	uint mii_data;
> -	void (*funct)(uint mii_reg, struct net_device *dev);
> -} phy_cmd_t;
> -
> -typedef struct {
> -	uint id;
> -	char *name;
> -
> -	const phy_cmd_t *config;
> -	const phy_cmd_t *startup;
> -	const phy_cmd_t *ack_int;
> -	const phy_cmd_t *shutdown;
> -} phy_info_t;
> -
>  /* The number of Tx and Rx buffers.  These are allocated from the page
>   * pool.  The code may assume these are power of two, so it it best
>   * to keep them that size.
> @@ -189,29 +172,21 @@ struct fec_enet_private {
>  	uint	tx_full;
>  	/* hold while accessing the HW like ringbuffer for tx/rx but not MAC */
>  	spinlock_t hw_lock;
> -	/* hold while accessing the mii_list_t() elements */
> -	spinlock_t mii_lock;
> -
> -	uint	phy_id;
> -	uint	phy_id_done;
> -	uint	phy_status;
> -	uint	phy_speed;
> -	phy_info_t const	*phy;
> -	struct work_struct phy_task;
>  
> -	uint	sequence_done;
> -	uint	mii_phy_task_queued;
> +	struct  platform_device *pdev;
>  
> -	uint	phy_addr;
> +	int	opened;
>  
> +	/* Phylib and MDIO interface */
> +	struct  mii_bus *mii_bus;
> +	struct  phy_device *phy_dev;
> +	int     mii_timeout;
> +	uint    phy_speed;
>  	int	index;
> -	int	opened;
>  	int	link;
> -	int	old_link;
>  	int	full_duplex;
>  };
>  
> -static void fec_enet_mii(struct net_device *dev);
>  static irqreturn_t fec_enet_interrupt(int irq, void * dev_id);
>  static void fec_enet_tx(struct net_device *dev);
>  static void fec_enet_rx(struct net_device *dev);
> @@ -219,67 +194,20 @@ static int fec_enet_close(struct net_device *dev);
>  static void fec_restart(struct net_device *dev, int duplex);
>  static void fec_stop(struct net_device *dev);
>  
> +/* FEC MII MMFR bits definition */
> +#define FEC_MMFR_ST		(1 << 30)
> +#define FEC_MMFR_OP_READ	(2 << 28)
> +#define FEC_MMFR_OP_WRITE	(1 << 28)
> +#define FEC_MMFR_PA(v)		((v & 0x1f) << 23)
> +#define FEC_MMFR_RA(v)		((v & 0x1f) << 18)
> +#define FEC_MMFR_TA		(2 << 16)
> +#define FEC_MMFR_DATA(v)	(v & 0xffff)
>  
> -/* MII processing.  We keep this as simple as possible.  Requests are
> - * placed on the list (if there is room).  When the request is finished
> - * by the MII, an optional function may be called.
> - */
> -typedef struct mii_list {
> -	uint	mii_regval;
> -	void	(*mii_func)(uint val, struct net_device *dev);
> -	struct	mii_list *mii_next;
> -} mii_list_t;
> -
> -#define		NMII	20
> -static mii_list_t	mii_cmds[NMII];
> -static mii_list_t	*mii_free;
> -static mii_list_t	*mii_head;
> -static mii_list_t	*mii_tail;
> -
> -static int	mii_queue(struct net_device *dev, int request,
> -				void (*func)(uint, struct net_device *));
> -
> -/* Make MII read/write commands for the FEC */
> -#define mk_mii_read(REG)	(0x60020000 | ((REG & 0x1f) << 18))
> -#define mk_mii_write(REG, VAL)	(0x50020000 | ((REG & 0x1f) << 18) | \
> -						(VAL & 0xffff))
> -#define mk_mii_end	0
> +#define FEC_MII_TIMEOUT		10000
>  
>  /* Transmitter timeout */
>  #define TX_TIMEOUT (2 * HZ)
>  
> -/* Register definitions for the PHY */
> -
> -#define MII_REG_CR          0  /* Control Register                         */
> -#define MII_REG_SR          1  /* Status Register                          */
> -#define MII_REG_PHYIR1      2  /* PHY Identification Register 1            */
> -#define MII_REG_PHYIR2      3  /* PHY Identification Register 2            */
> -#define MII_REG_ANAR        4  /* A-N Advertisement Register               */
> -#define MII_REG_ANLPAR      5  /* A-N Link Partner Ability Register        */
> -#define MII_REG_ANER        6  /* A-N Expansion Register                   */
> -#define MII_REG_ANNPTR      7  /* A-N Next Page Transmit Register          */
> -#define MII_REG_ANLPRNPR    8  /* A-N Link Partner Received Next Page Reg. */
> -
> -/* values for phy_status */
> -
> -#define PHY_CONF_ANE	0x0001  /* 1 auto-negotiation enabled */
> -#define PHY_CONF_LOOP	0x0002  /* 1 loopback mode enabled */
> -#define PHY_CONF_SPMASK	0x00f0  /* mask for speed */
> -#define PHY_CONF_10HDX	0x0010  /* 10 Mbit half duplex supported */
> -#define PHY_CONF_10FDX	0x0020  /* 10 Mbit full duplex supported */
> -#define PHY_CONF_100HDX	0x0040  /* 100 Mbit half duplex supported */
> -#define PHY_CONF_100FDX	0x0080  /* 100 Mbit full duplex supported */
> -
> -#define PHY_STAT_LINK	0x0100  /* 1 up - 0 down */
> -#define PHY_STAT_FAULT	0x0200  /* 1 remote fault */
> -#define PHY_STAT_ANC	0x0400  /* 1 auto-negotiation complete	*/
> -#define PHY_STAT_SPMASK	0xf000  /* mask for speed */
> -#define PHY_STAT_10HDX	0x1000  /* 10 Mbit half duplex selected	*/
> -#define PHY_STAT_10FDX	0x2000  /* 10 Mbit full duplex selected	*/
> -#define PHY_STAT_100HDX	0x4000  /* 100 Mbit half duplex selected */
> -#define PHY_STAT_100FDX	0x8000  /* 100 Mbit full duplex selected */
> -
> -
>  static int
>  fec_enet_start_xmit(struct sk_buff *skb, struct net_device *dev)
>  {
> @@ -406,12 +334,6 @@ fec_enet_interrupt(int irq, void * dev_id)
>  			ret = IRQ_HANDLED;
>  			fec_enet_tx(dev);
>  		}
> -
> -		if (int_events & FEC_ENET_MII) {
> -			ret = IRQ_HANDLED;
> -			fec_enet_mii(dev);
> -		}
> -
>  	} while (int_events);
>  
>  	return ret;
> @@ -607,827 +529,311 @@ rx_processing_done:
>  	spin_unlock(&fep->hw_lock);
>  }
>  
> -/* called from interrupt context */
> -static void
> -fec_enet_mii(struct net_device *dev)
> -{
> -	struct	fec_enet_private *fep;
> -	mii_list_t	*mip;
> -
> -	fep = netdev_priv(dev);
> -	spin_lock(&fep->mii_lock);
> -
> -	if ((mip = mii_head) == NULL) {
> -		printk("MII and no head!\n");
> -		goto unlock;
> -	}
> -
> -	if (mip->mii_func != NULL)
> -		(*(mip->mii_func))(readl(fep->hwp + FEC_MII_DATA), dev);
> -
> -	mii_head = mip->mii_next;
> -	mip->mii_next = mii_free;
> -	mii_free = mip;
> -
> -	if ((mip = mii_head) != NULL)
> -		writel(mip->mii_regval, fep->hwp + FEC_MII_DATA);
> -
> -unlock:
> -	spin_unlock(&fep->mii_lock);
> -}
> -
> -static int
> -mii_queue_unlocked(struct net_device *dev, int regval,
> -		void (*func)(uint, struct net_device *))
> +/* ------------------------------------------------------------------------- */
> +#ifdef CONFIG_M5272
> +static void __inline__ fec_get_mac(struct net_device *dev)
>  {
> -	struct fec_enet_private *fep;
> -	mii_list_t	*mip;
> -	int		retval;
> -
> -	/* Add PHY address to register command */
> -	fep = netdev_priv(dev);
> +	struct fec_enet_private *fep = netdev_priv(dev);
> +	unsigned char *iap, tmpaddr[ETH_ALEN];
>  
> -	regval |= fep->phy_addr << 23;
> -	retval = 0;
> -
> -	if ((mip = mii_free) != NULL) {
> -		mii_free = mip->mii_next;
> -		mip->mii_regval = regval;
> -		mip->mii_func = func;
> -		mip->mii_next = NULL;
> -		if (mii_head) {
> -			mii_tail->mii_next = mip;
> -			mii_tail = mip;
> -		} else {
> -			mii_head = mii_tail = mip;
> -			writel(regval, fep->hwp + FEC_MII_DATA);
> -		}
> +	if (FEC_FLASHMAC) {
> +		/*
> +		 * Get MAC address from FLASH.
> +		 * If it is all 1's or 0's, use the default.
> +		 */
> +		iap = (unsigned char *)FEC_FLASHMAC;
> +		if ((iap[0] == 0) && (iap[1] == 0) && (iap[2] == 0) &&
> +		    (iap[3] == 0) && (iap[4] == 0) && (iap[5] == 0))
> +			iap = fec_mac_default;
> +		if ((iap[0] == 0xff) && (iap[1] == 0xff) && (iap[2] == 0xff) &&
> +		    (iap[3] == 0xff) && (iap[4] == 0xff) && (iap[5] == 0xff))
> +			iap = fec_mac_default;
>  	} else {
> -		retval = 1;
> +		*((unsigned long *) &tmpaddr[0]) = readl(fep->hwp + FEC_ADDR_LOW);
> +		*((unsigned short *) &tmpaddr[4]) = (readl(fep->hwp + FEC_ADDR_HIGH) >> 16);
> +		iap = &tmpaddr[0];
>  	}
>  
> -	return retval;
> -}
> -
> -static int
> -mii_queue(struct net_device *dev, int regval,
> -		void (*func)(uint, struct net_device *))
> -{
> -	struct fec_enet_private *fep;
> -	unsigned long   flags;
> -	int             retval;
> -	fep = netdev_priv(dev);
> -	spin_lock_irqsave(&fep->mii_lock, flags);
> -	retval = mii_queue_unlocked(dev, regval, func);
> -	spin_unlock_irqrestore(&fep->mii_lock, flags);
> -	return retval;
> -}
> -
> -static void mii_do_cmd(struct net_device *dev, const phy_cmd_t *c)
> -{
> -	if(!c)
> -		return;
> +	memcpy(dev->dev_addr, iap, ETH_ALEN);
>  
> -	for (; c->mii_data != mk_mii_end; c++)
> -		mii_queue(dev, c->mii_data, c->funct);
> +	/* Adjust MAC if using default MAC address */
> +	if (iap == fec_mac_default)
> +		 dev->dev_addr[ETH_ALEN-1] = fec_mac_default[ETH_ALEN-1] + fep->index;
>  }
> +#endif
>  
> -static void mii_parse_sr(uint mii_reg, struct net_device *dev)
> -{
> -	struct fec_enet_private *fep = netdev_priv(dev);
> -	volatile uint *s = &(fep->phy_status);
> -	uint status;
> -
> -	status = *s & ~(PHY_STAT_LINK | PHY_STAT_FAULT | PHY_STAT_ANC);
> -
> -	if (mii_reg & 0x0004)
> -		status |= PHY_STAT_LINK;
> -	if (mii_reg & 0x0010)
> -		status |= PHY_STAT_FAULT;
> -	if (mii_reg & 0x0020)
> -		status |= PHY_STAT_ANC;
> -	*s = status;
> -}
> +/* ------------------------------------------------------------------------- */
>  
> -static void mii_parse_cr(uint mii_reg, struct net_device *dev)
> +/*
> + * Phy section
> + */
> +static void fec_enet_adjust_link(struct net_device *dev)
>  {
>  	struct fec_enet_private *fep = netdev_priv(dev);
> -	volatile uint *s = &(fep->phy_status);
> -	uint status;
> -
> -	status = *s & ~(PHY_CONF_ANE | PHY_CONF_LOOP);
> -
> -	if (mii_reg & 0x1000)
> -		status |= PHY_CONF_ANE;
> -	if (mii_reg & 0x4000)
> -		status |= PHY_CONF_LOOP;
> -	*s = status;
> -}
> +	struct phy_device *phy_dev = fep->phy_dev;
> +	unsigned long flags;
>  
> -static void mii_parse_anar(uint mii_reg, struct net_device *dev)
> -{
> -	struct fec_enet_private *fep = netdev_priv(dev);
> -	volatile uint *s = &(fep->phy_status);
> -	uint status;
> -
> -	status = *s & ~(PHY_CONF_SPMASK);
> -
> -	if (mii_reg & 0x0020)
> -		status |= PHY_CONF_10HDX;
> -	if (mii_reg & 0x0040)
> -		status |= PHY_CONF_10FDX;
> -	if (mii_reg & 0x0080)
> -		status |= PHY_CONF_100HDX;
> -	if (mii_reg & 0x00100)
> -		status |= PHY_CONF_100FDX;
> -	*s = status;
> -}
> +	int status_change = 0;
>  
> -/* ------------------------------------------------------------------------- */
> -/* The Level one LXT970 is used by many boards				     */
> +	spin_lock_irqsave(&fep->hw_lock, flags);
>  
> -#define MII_LXT970_MIRROR    16  /* Mirror register           */
> -#define MII_LXT970_IER       17  /* Interrupt Enable Register */
> -#define MII_LXT970_ISR       18  /* Interrupt Status Register */
> -#define MII_LXT970_CONFIG    19  /* Configuration Register    */
> -#define MII_LXT970_CSR       20  /* Chip Status Register      */
> +	/* Prevent a state halted on mii error */
> +	if (fep->mii_timeout && phy_dev->state == PHY_HALTED) {
> +		phy_dev->state = PHY_RESUMING;
> +		goto spin_unlock;
> +	}
>  
> -static void mii_parse_lxt970_csr(uint mii_reg, struct net_device *dev)
> -{
> -	struct fec_enet_private *fep = netdev_priv(dev);
> -	volatile uint *s = &(fep->phy_status);
> -	uint status;
> +	/* Duplex link change */
> +	if (phy_dev->link) {
> +		if (fep->full_duplex != phy_dev->duplex) {
> +			fec_restart(dev, phy_dev->duplex);
> +			status_change = 1;
> +		}
> +	}
>  
> -	status = *s & ~(PHY_STAT_SPMASK);
> -	if (mii_reg & 0x0800) {
> -		if (mii_reg & 0x1000)
> -			status |= PHY_STAT_100FDX;
> +	/* Link on or off change */
> +	if (phy_dev->link != fep->link) {
> +		fep->link = phy_dev->link;
> +		if (phy_dev->link)
> +			fec_restart(dev, phy_dev->duplex);
>  		else
> -			status |= PHY_STAT_100HDX;
> -	} else {
> -		if (mii_reg & 0x1000)
> -			status |= PHY_STAT_10FDX;
> -		else
> -			status |= PHY_STAT_10HDX;
> +			fec_stop(dev);
> +		status_change = 1;
>  	}
> -	*s = status;
> -}
> -
> -static phy_cmd_t const phy_cmd_lxt970_config[] = {
> -		{ mk_mii_read(MII_REG_CR), mii_parse_cr },
> -		{ mk_mii_read(MII_REG_ANAR), mii_parse_anar },
> -		{ mk_mii_end, }
> -	};
> -static phy_cmd_t const phy_cmd_lxt970_startup[] = { /* enable interrupts */
> -		{ mk_mii_write(MII_LXT970_IER, 0x0002), NULL },
> -		{ mk_mii_write(MII_REG_CR, 0x1200), NULL }, /* autonegotiate */
> -		{ mk_mii_end, }
> -	};
> -static phy_cmd_t const phy_cmd_lxt970_ack_int[] = {
> -		/* read SR and ISR to acknowledge */
> -		{ mk_mii_read(MII_REG_SR), mii_parse_sr },
> -		{ mk_mii_read(MII_LXT970_ISR), NULL },
> -
> -		/* find out the current status */
> -		{ mk_mii_read(MII_LXT970_CSR), mii_parse_lxt970_csr },
> -		{ mk_mii_end, }
> -	};
> -static phy_cmd_t const phy_cmd_lxt970_shutdown[] = { /* disable interrupts */
> -		{ mk_mii_write(MII_LXT970_IER, 0x0000), NULL },
> -		{ mk_mii_end, }
> -	};
> -static phy_info_t const phy_info_lxt970 = {
> -	.id = 0x07810000,
> -	.name = "LXT970",
> -	.config = phy_cmd_lxt970_config,
> -	.startup = phy_cmd_lxt970_startup,
> -	.ack_int = phy_cmd_lxt970_ack_int,
> -	.shutdown = phy_cmd_lxt970_shutdown
> -};
>  
> -/* ------------------------------------------------------------------------- */
> -/* The Level one LXT971 is used on some of my custom boards                  */
> -
> -/* register definitions for the 971 */
> +spin_unlock:
> +	spin_unlock_irqrestore(&fep->hw_lock, flags);
>  
> -#define MII_LXT971_PCR       16  /* Port Control Register     */
> -#define MII_LXT971_SR2       17  /* Status Register 2         */
> -#define MII_LXT971_IER       18  /* Interrupt Enable Register */
> -#define MII_LXT971_ISR       19  /* Interrupt Status Register */
> -#define MII_LXT971_LCR       20  /* LED Control Register      */
> -#define MII_LXT971_TCR       30  /* Transmit Control Register */
> +	if (status_change)
> +		phy_print_status(phy_dev);
> +}
>  
>  /*
> - * I had some nice ideas of running the MDIO faster...
> - * The 971 should support 8MHz and I tried it, but things acted really
> - * weird, so 2.5 MHz ought to be enough for anyone...
> + * NOTE: a MII transaction is during around 25 us, so polling it...
>   */
> -
> -static void mii_parse_lxt971_sr2(uint mii_reg, struct net_device *dev)
> +static int fec_enet_mdio_read(struct mii_bus *bus, int mii_id, int regnum)
>  {
> -	struct fec_enet_private *fep = netdev_priv(dev);
> -	volatile uint *s = &(fep->phy_status);
> -	uint status;
> +	struct fec_enet_private *fep = bus->priv;
> +	int timeout = FEC_MII_TIMEOUT;
>  
> -	status = *s & ~(PHY_STAT_SPMASK | PHY_STAT_LINK | PHY_STAT_ANC);
> +	fep->mii_timeout = 0;
>  
> -	if (mii_reg & 0x0400) {
> -		fep->link = 1;
> -		status |= PHY_STAT_LINK;
> -	} else {
> -		fep->link = 0;
> -	}
> -	if (mii_reg & 0x0080)
> -		status |= PHY_STAT_ANC;
> -	if (mii_reg & 0x4000) {
> -		if (mii_reg & 0x0200)
> -			status |= PHY_STAT_100FDX;
> -		else
> -			status |= PHY_STAT_100HDX;
> -	} else {
> -		if (mii_reg & 0x0200)
> -			status |= PHY_STAT_10FDX;
> -		else
> -			status |= PHY_STAT_10HDX;
> +	/* clear MII end of transfer bit*/
> +	writel(FEC_ENET_MII, fep->hwp + FEC_IEVENT);
> +
> +	/* start a read op */
> +	writel(FEC_MMFR_ST | FEC_MMFR_OP_READ |
> +		FEC_MMFR_PA(mii_id) | FEC_MMFR_RA(regnum) |
> +		FEC_MMFR_TA, fep->hwp + FEC_MII_DATA);
> +
> +	/* wait for end of transfer */
> +	while (!(readl(fep->hwp + FEC_IEVENT) & FEC_ENET_MII)) {
> +		cpu_relax();
> +		if (timeout-- < 0) {
> +			fep->mii_timeout = 1;
> +			printk(KERN_ERR "FEC: MDIO read timeout\n");
> +			return -ETIMEDOUT;
> +		}
>  	}
> -	if (mii_reg & 0x0008)
> -		status |= PHY_STAT_FAULT;
>  
> -	*s = status;
> +	/* return value */
> +	return FEC_MMFR_DATA(readl(fep->hwp + FEC_MII_DATA));
>  }
>  
> -static phy_cmd_t const phy_cmd_lxt971_config[] = {
> -		/* limit to 10MBit because my prototype board
> -		 * doesn't work with 100. */
> -		{ mk_mii_read(MII_REG_CR), mii_parse_cr },
> -		{ mk_mii_read(MII_REG_ANAR), mii_parse_anar },
> -		{ mk_mii_read(MII_LXT971_SR2), mii_parse_lxt971_sr2 },
> -		{ mk_mii_end, }
> -	};
> -static phy_cmd_t const phy_cmd_lxt971_startup[] = {  /* enable interrupts */
> -		{ mk_mii_write(MII_LXT971_IER, 0x00f2), NULL },
> -		{ mk_mii_write(MII_REG_CR, 0x1200), NULL }, /* autonegotiate */
> -		{ mk_mii_write(MII_LXT971_LCR, 0xd422), NULL }, /* LED config */
> -		/* Somehow does the 971 tell me that the link is down
> -		 * the first read after power-up.
> -		 * read here to get a valid value in ack_int */
> -		{ mk_mii_read(MII_REG_SR), mii_parse_sr },
> -		{ mk_mii_end, }
> -	};
> -static phy_cmd_t const phy_cmd_lxt971_ack_int[] = {
> -		/* acknowledge the int before reading status ! */
> -		{ mk_mii_read(MII_LXT971_ISR), NULL },
> -		/* find out the current status */
> -		{ mk_mii_read(MII_REG_SR), mii_parse_sr },
> -		{ mk_mii_read(MII_LXT971_SR2), mii_parse_lxt971_sr2 },
> -		{ mk_mii_end, }
> -	};
> -static phy_cmd_t const phy_cmd_lxt971_shutdown[] = { /* disable interrupts */
> -		{ mk_mii_write(MII_LXT971_IER, 0x0000), NULL },
> -		{ mk_mii_end, }
> -	};
> -static phy_info_t const phy_info_lxt971 = {
> -	.id = 0x0001378e,
> -	.name = "LXT971",
> -	.config = phy_cmd_lxt971_config,
> -	.startup = phy_cmd_lxt971_startup,
> -	.ack_int = phy_cmd_lxt971_ack_int,
> -	.shutdown = phy_cmd_lxt971_shutdown
> -};
> -
> -/* ------------------------------------------------------------------------- */
> -/* The Quality Semiconductor QS6612 is used on the RPX CLLF                  */
> -
> -/* register definitions */
> -
> -#define MII_QS6612_MCR       17  /* Mode Control Register      */
> -#define MII_QS6612_FTR       27  /* Factory Test Register      */
> -#define MII_QS6612_MCO       28  /* Misc. Control Register     */
> -#define MII_QS6612_ISR       29  /* Interrupt Source Register  */
> -#define MII_QS6612_IMR       30  /* Interrupt Mask Register    */
> -#define MII_QS6612_PCR       31  /* 100BaseTx PHY Control Reg. */
> -
> -static void mii_parse_qs6612_pcr(uint mii_reg, struct net_device *dev)
> +static int fec_enet_mdio_write(struct mii_bus *bus, int mii_id, int regnum,
> +			   u16 value)
>  {
> -	struct fec_enet_private *fep = netdev_priv(dev);
> -	volatile uint *s = &(fep->phy_status);
> -	uint status;
> +	struct fec_enet_private *fep = bus->priv;
> +	int timeout = FEC_MII_TIMEOUT;
>  
> -	status = *s & ~(PHY_STAT_SPMASK);
> +	fep->mii_timeout = 0;
>  
> -	switch((mii_reg >> 2) & 7) {
> -	case 1: status |= PHY_STAT_10HDX; break;
> -	case 2: status |= PHY_STAT_100HDX; break;
> -	case 5: status |= PHY_STAT_10FDX; break;
> -	case 6: status |= PHY_STAT_100FDX; break;
> -}
> -
> -	*s = status;
> -}
> -
> -static phy_cmd_t const phy_cmd_qs6612_config[] = {
> -		/* The PHY powers up isolated on the RPX,
> -		 * so send a command to allow operation.
> -		 */
> -		{ mk_mii_write(MII_QS6612_PCR, 0x0dc0), NULL },
> -
> -		/* parse cr and anar to get some info */
> -		{ mk_mii_read(MII_REG_CR), mii_parse_cr },
> -		{ mk_mii_read(MII_REG_ANAR), mii_parse_anar },
> -		{ mk_mii_end, }
> -	};
> -static phy_cmd_t const phy_cmd_qs6612_startup[] = {  /* enable interrupts */
> -		{ mk_mii_write(MII_QS6612_IMR, 0x003a), NULL },
> -		{ mk_mii_write(MII_REG_CR, 0x1200), NULL }, /* autonegotiate */
> -		{ mk_mii_end, }
> -	};
> -static phy_cmd_t const phy_cmd_qs6612_ack_int[] = {
> -		/* we need to read ISR, SR and ANER to acknowledge */
> -		{ mk_mii_read(MII_QS6612_ISR), NULL },
> -		{ mk_mii_read(MII_REG_SR), mii_parse_sr },
> -		{ mk_mii_read(MII_REG_ANER), NULL },
> -
> -		/* read pcr to get info */
> -		{ mk_mii_read(MII_QS6612_PCR), mii_parse_qs6612_pcr },
> -		{ mk_mii_end, }
> -	};
> -static phy_cmd_t const phy_cmd_qs6612_shutdown[] = { /* disable interrupts */
> -		{ mk_mii_write(MII_QS6612_IMR, 0x0000), NULL },
> -		{ mk_mii_end, }
> -	};
> -static phy_info_t const phy_info_qs6612 = {
> -	.id = 0x00181440,
> -	.name = "QS6612",
> -	.config = phy_cmd_qs6612_config,
> -	.startup = phy_cmd_qs6612_startup,
> -	.ack_int = phy_cmd_qs6612_ack_int,
> -	.shutdown = phy_cmd_qs6612_shutdown
> -};
> -
> -/* ------------------------------------------------------------------------- */
> -/* AMD AM79C874 phy                                                          */
> +	/* clear MII end of transfer bit*/
> +	writel(FEC_ENET_MII, fep->hwp + FEC_IEVENT);
>  
> -/* register definitions for the 874 */
> +	/* start a read op */
> +	writel(FEC_MMFR_ST | FEC_MMFR_OP_READ |
> +		FEC_MMFR_PA(mii_id) | FEC_MMFR_RA(regnum) |
> +		FEC_MMFR_TA | FEC_MMFR_DATA(value),
> +		fep->hwp + FEC_MII_DATA);
> +
> +	/* wait for end of transfer */
> +	while (!(readl(fep->hwp + FEC_IEVENT) & FEC_ENET_MII)) {
> +		cpu_relax();
> +		if (timeout-- < 0) {
> +			fep->mii_timeout = 1;
> +			printk(KERN_ERR "FEC: MDIO write timeout\n");
> +			return -ETIMEDOUT;
> +		}
> +	}
>  
> -#define MII_AM79C874_MFR       16  /* Miscellaneous Feature Register */
> -#define MII_AM79C874_ICSR      17  /* Interrupt/Status Register      */
> -#define MII_AM79C874_DR        18  /* Diagnostic Register            */
> -#define MII_AM79C874_PMLR      19  /* Power and Loopback Register    */
> -#define MII_AM79C874_MCR       21  /* ModeControl Register           */
> -#define MII_AM79C874_DC        23  /* Disconnect Counter             */
> -#define MII_AM79C874_REC       24  /* Recieve Error Counter          */
> +	return 0;
> +}
>  
> -static void mii_parse_am79c874_dr(uint mii_reg, struct net_device *dev)
> +static int fec_enet_mdio_reset(struct mii_bus *bus)
>  {
> -	struct fec_enet_private *fep = netdev_priv(dev);
> -	volatile uint *s = &(fep->phy_status);
> -	uint status;
> -
> -	status = *s & ~(PHY_STAT_SPMASK | PHY_STAT_ANC);
> -
> -	if (mii_reg & 0x0080)
> -		status |= PHY_STAT_ANC;
> -	if (mii_reg & 0x0400)
> -		status |= ((mii_reg & 0x0800) ? PHY_STAT_100FDX : PHY_STAT_100HDX);
> -	else
> -		status |= ((mii_reg & 0x0800) ? PHY_STAT_10FDX : PHY_STAT_10HDX);
> -
> -	*s = status;
> +	return 0;
>  }
>  
> -static phy_cmd_t const phy_cmd_am79c874_config[] = {
> -		{ mk_mii_read(MII_REG_CR), mii_parse_cr },
> -		{ mk_mii_read(MII_REG_ANAR), mii_parse_anar },
> -		{ mk_mii_read(MII_AM79C874_DR), mii_parse_am79c874_dr },
> -		{ mk_mii_end, }
> -	};
> -static phy_cmd_t const phy_cmd_am79c874_startup[] = {  /* enable interrupts */
> -		{ mk_mii_write(MII_AM79C874_ICSR, 0xff00), NULL },
> -		{ mk_mii_write(MII_REG_CR, 0x1200), NULL }, /* autonegotiate */
> -		{ mk_mii_read(MII_REG_SR), mii_parse_sr },
> -		{ mk_mii_end, }
> -	};
> -static phy_cmd_t const phy_cmd_am79c874_ack_int[] = {
> -		/* find out the current status */
> -		{ mk_mii_read(MII_REG_SR), mii_parse_sr },
> -		{ mk_mii_read(MII_AM79C874_DR), mii_parse_am79c874_dr },
> -		/* we only need to read ISR to acknowledge */
> -		{ mk_mii_read(MII_AM79C874_ICSR), NULL },
> -		{ mk_mii_end, }
> -	};
> -static phy_cmd_t const phy_cmd_am79c874_shutdown[] = { /* disable interrupts */
> -		{ mk_mii_write(MII_AM79C874_ICSR, 0x0000), NULL },
> -		{ mk_mii_end, }
> -	};
> -static phy_info_t const phy_info_am79c874 = {
> -	.id = 0x00022561,
> -	.name = "AM79C874",
> -	.config = phy_cmd_am79c874_config,
> -	.startup = phy_cmd_am79c874_startup,
> -	.ack_int = phy_cmd_am79c874_ack_int,
> -	.shutdown = phy_cmd_am79c874_shutdown
> -};
> -
> -
> -/* ------------------------------------------------------------------------- */
> -/* Kendin KS8721BL phy                                                       */
> -
> -/* register definitions for the 8721 */
> -
> -#define MII_KS8721BL_RXERCR	21
> -#define MII_KS8721BL_ICSR	27
> -#define	MII_KS8721BL_PHYCR	31
> -
> -static phy_cmd_t const phy_cmd_ks8721bl_config[] = {
> -		{ mk_mii_read(MII_REG_CR), mii_parse_cr },
> -		{ mk_mii_read(MII_REG_ANAR), mii_parse_anar },
> -		{ mk_mii_end, }
> -	};
> -static phy_cmd_t const phy_cmd_ks8721bl_startup[] = {  /* enable interrupts */
> -		{ mk_mii_write(MII_KS8721BL_ICSR, 0xff00), NULL },
> -		{ mk_mii_write(MII_REG_CR, 0x1200), NULL }, /* autonegotiate */
> -		{ mk_mii_read(MII_REG_SR), mii_parse_sr },
> -		{ mk_mii_end, }
> -	};
> -static phy_cmd_t const phy_cmd_ks8721bl_ack_int[] = {
> -		/* find out the current status */
> -		{ mk_mii_read(MII_REG_SR), mii_parse_sr },
> -		/* we only need to read ISR to acknowledge */
> -		{ mk_mii_read(MII_KS8721BL_ICSR), NULL },
> -		{ mk_mii_end, }
> -	};
> -static phy_cmd_t const phy_cmd_ks8721bl_shutdown[] = { /* disable interrupts */
> -		{ mk_mii_write(MII_KS8721BL_ICSR, 0x0000), NULL },
> -		{ mk_mii_end, }
> -	};
> -static phy_info_t const phy_info_ks8721bl = {
> -	.id = 0x00022161,
> -	.name = "KS8721BL",
> -	.config = phy_cmd_ks8721bl_config,
> -	.startup = phy_cmd_ks8721bl_startup,
> -	.ack_int = phy_cmd_ks8721bl_ack_int,
> -	.shutdown = phy_cmd_ks8721bl_shutdown
> -};
> -
> -/* ------------------------------------------------------------------------- */
> -/* register definitions for the DP83848 */
> -
> -#define MII_DP8384X_PHYSTST    16  /* PHY Status Register */
> -
> -static void mii_parse_dp8384x_sr2(uint mii_reg, struct net_device *dev)
> +static int fec_enet_mii_probe(struct net_device *dev)
>  {
>  	struct fec_enet_private *fep = netdev_priv(dev);
> -	volatile uint *s = &(fep->phy_status);
> -
> -	*s &= ~(PHY_STAT_SPMASK | PHY_STAT_LINK | PHY_STAT_ANC);
> -
> -	/* Link up */
> -	if (mii_reg & 0x0001) {
> -		fep->link = 1;
> -		*s |= PHY_STAT_LINK;
> -	} else
> -		fep->link = 0;
> -	/* Status of link */
> -	if (mii_reg & 0x0010)   /* Autonegotioation complete */
> -		*s |= PHY_STAT_ANC;
> -	if (mii_reg & 0x0002) {   /* 10MBps? */
> -		if (mii_reg & 0x0004)   /* Full Duplex? */
> -			*s |= PHY_STAT_10FDX;
> -		else
> -			*s |= PHY_STAT_10HDX;
> -	} else {                  /* 100 Mbps? */
> -		if (mii_reg & 0x0004)   /* Full Duplex? */
> -			*s |= PHY_STAT_100FDX;
> -		else
> -			*s |= PHY_STAT_100HDX;
> -	}
> -	if (mii_reg & 0x0008)
> -		*s |= PHY_STAT_FAULT;
> -}
> -
> -static phy_info_t phy_info_dp83848= {
> -	0x020005c9,
> -	"DP83848",
> +	struct phy_device *phy_dev = NULL;
> +	int phy_addr;
>  
> -	(const phy_cmd_t []) {  /* config */
> -		{ mk_mii_read(MII_REG_CR), mii_parse_cr },
> -		{ mk_mii_read(MII_REG_ANAR), mii_parse_anar },
> -		{ mk_mii_read(MII_DP8384X_PHYSTST), mii_parse_dp8384x_sr2 },
> -		{ mk_mii_end, }
> -	},
> -	(const phy_cmd_t []) {  /* startup - enable interrupts */
> -		{ mk_mii_write(MII_REG_CR, 0x1200), NULL }, /* autonegotiate */
> -		{ mk_mii_read(MII_REG_SR), mii_parse_sr },
> -		{ mk_mii_end, }
> -	},
> -	(const phy_cmd_t []) { /* ack_int - never happens, no interrupt */
> -		{ mk_mii_end, }
> -	},
> -	(const phy_cmd_t []) {  /* shutdown */
> -		{ mk_mii_end, }
> -	},
> -};
> +	/* find the first phy */
> +	for (phy_addr = 0; phy_addr < PHY_MAX_ADDR; phy_addr++) {
> +		if (fep->mii_bus->phy_map[phy_addr]) {
> +			phy_dev = fep->mii_bus->phy_map[phy_addr];
> +			break;
> +		}
> +	}
>  
> -static phy_info_t phy_info_lan8700 = {
> -	0x0007C0C,
> -	"LAN8700",
> -	(const phy_cmd_t []) { /* config */
> -		{ mk_mii_read(MII_REG_CR), mii_parse_cr },
> -		{ mk_mii_read(MII_REG_ANAR), mii_parse_anar },
> -		{ mk_mii_end, }
> -	},
> -	(const phy_cmd_t []) { /* startup */
> -		{ mk_mii_write(MII_REG_CR, 0x1200), NULL }, /* autonegotiate */
> -		{ mk_mii_read(MII_REG_SR), mii_parse_sr },
> -		{ mk_mii_end, }
> -	},
> -	(const phy_cmd_t []) { /* act_int */
> -		{ mk_mii_end, }
> -	},
> -	(const phy_cmd_t []) { /* shutdown */
> -		{ mk_mii_end, }
> -	},
> -};
> -/* ------------------------------------------------------------------------- */
> +	if (!phy_dev) {
> +		printk(KERN_ERR "%s: no PHY found\n", dev->name);
> +		return -ENODEV;
> +	}
>  
> -static phy_info_t const * const phy_info[] = {
> -	&phy_info_lxt970,
> -	&phy_info_lxt971,
> -	&phy_info_qs6612,
> -	&phy_info_am79c874,
> -	&phy_info_ks8721bl,
> -	&phy_info_dp83848,
> -	&phy_info_lan8700,
> -	NULL
> -};
> +	/* attach the mac to the phy */
> +	phy_dev = phy_connect(dev, dev_name(&phy_dev->dev),
> +			     &fec_enet_adjust_link, 0,
> +			     PHY_INTERFACE_MODE_MII);
> +	if (IS_ERR(phy_dev)) {
> +		printk(KERN_ERR "%s: Could not attach to PHY\n", dev->name);
> +		return PTR_ERR(phy_dev);
> +	}
>  
> -/* ------------------------------------------------------------------------- */
> -#ifdef HAVE_mii_link_interrupt
> -static irqreturn_t
> -mii_link_interrupt(int irq, void * dev_id);
> +	/* mask with MAC supported features */
> +	phy_dev->supported &= PHY_BASIC_FEATURES;
> +	phy_dev->advertising = phy_dev->supported;
>  
> -/*
> - *	This is specific to the MII interrupt setup of the M5272EVB.
> - */
> -static void __inline__ fec_request_mii_intr(struct net_device *dev)
> -{
> -	if (request_irq(66, mii_link_interrupt, IRQF_DISABLED, "fec(MII)", dev) != 0)
> -		printk("FEC: Could not allocate fec(MII) IRQ(66)!\n");
> -}
> +	fep->phy_dev = phy_dev;
> +	fep->link = 0;
> +	fep->full_duplex = 0;
>  
> -static void __inline__ fec_disable_phy_intr(struct net_device *dev)
> -{
> -	free_irq(66, dev);
> +	return 0;
>  }
> -#endif
>  
> -#ifdef CONFIG_M5272
> -static void __inline__ fec_get_mac(struct net_device *dev)
> +static int fec_enet_mii_init(struct platform_device *pdev)
>  {
> +	struct net_device *dev = platform_get_drvdata(pdev);
>  	struct fec_enet_private *fep = netdev_priv(dev);
> -	unsigned char *iap, tmpaddr[ETH_ALEN];
> +	int err = -ENXIO, i;
>  
> -	if (FEC_FLASHMAC) {
> -		/*
> -		 * Get MAC address from FLASH.
> -		 * If it is all 1's or 0's, use the default.
> -		 */
> -		iap = (unsigned char *)FEC_FLASHMAC;
> -		if ((iap[0] == 0) && (iap[1] == 0) && (iap[2] == 0) &&
> -		    (iap[3] == 0) && (iap[4] == 0) && (iap[5] == 0))
> -			iap = fec_mac_default;
> -		if ((iap[0] == 0xff) && (iap[1] == 0xff) && (iap[2] == 0xff) &&
> -		    (iap[3] == 0xff) && (iap[4] == 0xff) && (iap[5] == 0xff))
> -			iap = fec_mac_default;
> -	} else {
> -		*((unsigned long *) &tmpaddr[0]) = readl(fep->hwp + FEC_ADDR_LOW);
> -		*((unsigned short *) &tmpaddr[4]) = (readl(fep->hwp + FEC_ADDR_HIGH) >> 16);
> -		iap = &tmpaddr[0];
> -	}
> -
> -	memcpy(dev->dev_addr, iap, ETH_ALEN);
> -
> -	/* Adjust MAC if using default MAC address */
> -	if (iap == fec_mac_default)
> -		 dev->dev_addr[ETH_ALEN-1] = fec_mac_default[ETH_ALEN-1] + fep->index;
> -}
> -#endif
> +	fep->mii_timeout = 0;
>  
> -/* ------------------------------------------------------------------------- */
> -
> -static void mii_display_status(struct net_device *dev)
> -{
> -	struct fec_enet_private *fep = netdev_priv(dev);
> -	volatile uint *s = &(fep->phy_status);
> +	/*
> +	 * Set MII speed to 2.5 MHz (= clk_get_rate() / 2 * phy_speed)
> +	 */
> +	fep->phy_speed = DIV_ROUND_UP(clk_get_rate(fep->clk), 5000000) << 1;
> +	writel(fep->phy_speed, fep->hwp + FEC_MII_SPEED);
>  
> -	if (!fep->link && !fep->old_link) {
> -		/* Link is still down - don't print anything */
> -		return;
> +	fep->mii_bus = mdiobus_alloc();
> +	if (fep->mii_bus == NULL) {
> +		err = -ENOMEM;
> +		goto err_out;
>  	}
>  
> -	printk("%s: status: ", dev->name);
> -
> -	if (!fep->link) {
> -		printk("link down");
> -	} else {
> -		printk("link up");
> -
> -		switch(*s & PHY_STAT_SPMASK) {
> -		case PHY_STAT_100FDX: printk(", 100MBit Full Duplex"); break;
> -		case PHY_STAT_100HDX: printk(", 100MBit Half Duplex"); break;
> -		case PHY_STAT_10FDX: printk(", 10MBit Full Duplex"); break;
> -		case PHY_STAT_10HDX: printk(", 10MBit Half Duplex"); break;
> -		default:
> -			printk(", Unknown speed/duplex");
> -		}
> -
> -		if (*s & PHY_STAT_ANC)
> -			printk(", auto-negotiation complete");
> +	fep->mii_bus->name = "fec_enet_mii_bus";
> +	fep->mii_bus->read = fec_enet_mdio_read;
> +	fep->mii_bus->write = fec_enet_mdio_write;
> +	fep->mii_bus->reset = fec_enet_mdio_reset;
> +	snprintf(fep->mii_bus->id, MII_BUS_ID_SIZE, "%x", pdev->id);
> +	fep->mii_bus->priv = fep;
> +	fep->mii_bus->parent = &pdev->dev;
> +
> +	fep->mii_bus->irq = kmalloc(sizeof(int) * PHY_MAX_ADDR, GFP_KERNEL);
> +	if (!fep->mii_bus->irq) {
> +		err = -ENOMEM;
> +		goto err_out_free_mdiobus;
>  	}
>  
> -	if (*s & PHY_STAT_FAULT)
> -		printk(", remote fault");
> -
> -	printk(".\n");
> -}
> -
> -static void mii_display_config(struct work_struct *work)
> -{
> -	struct fec_enet_private *fep = container_of(work, struct fec_enet_private, phy_task);
> -	struct net_device *dev = fep->netdev;
> -	uint status = fep->phy_status;
> +	for (i = 0; i < PHY_MAX_ADDR; i++)
> +		fep->mii_bus->irq[i] = PHY_POLL;
>  
> -	/*
> -	** When we get here, phy_task is already removed from
> -	** the workqueue.  It is thus safe to allow to reuse it.
> -	*/
> -	fep->mii_phy_task_queued = 0;
> -	printk("%s: config: auto-negotiation ", dev->name);
> -
> -	if (status & PHY_CONF_ANE)
> -		printk("on");
> -	else
> -		printk("off");
> +	platform_set_drvdata(dev, fep->mii_bus);
>  
> -	if (status & PHY_CONF_100FDX)
> -		printk(", 100FDX");
> -	if (status & PHY_CONF_100HDX)
> -		printk(", 100HDX");
> -	if (status & PHY_CONF_10FDX)
> -		printk(", 10FDX");
> -	if (status & PHY_CONF_10HDX)
> -		printk(", 10HDX");
> -	if (!(status & PHY_CONF_SPMASK))
> -		printk(", No speed/duplex selected?");
> +	if (mdiobus_register(fep->mii_bus))
> +		goto err_out_free_mdio_irq;
>  
> -	if (status & PHY_CONF_LOOP)
> -		printk(", loopback enabled");
> +	if (fec_enet_mii_probe(dev) != 0)
> +		goto err_out_unregister_bus;
>  
> -	printk(".\n");
> +	return 0;
>  
> -	fep->sequence_done = 1;
> +err_out_unregister_bus:
> +	mdiobus_unregister(fep->mii_bus);
> +err_out_free_mdio_irq:
> +	kfree(fep->mii_bus->irq);
> +err_out_free_mdiobus:
> +	mdiobus_free(fep->mii_bus);
> +err_out:
> +	return err;
>  }
>  
> -static void mii_relink(struct work_struct *work)
> +static void fec_enet_mii_remove(struct fec_enet_private *fep)
>  {
> -	struct fec_enet_private *fep = container_of(work, struct fec_enet_private, phy_task);
> -	struct net_device *dev = fep->netdev;
> -	int duplex;
> -
> -	/*
> -	** When we get here, phy_task is already removed from
> -	** the workqueue.  It is thus safe to allow to reuse it.
> -	*/
> -	fep->mii_phy_task_queued = 0;
> -	fep->link = (fep->phy_status & PHY_STAT_LINK) ? 1 : 0;
> -	mii_display_status(dev);
> -	fep->old_link = fep->link;
> -
> -	if (fep->link) {
> -		duplex = 0;
> -		if (fep->phy_status
> -		    & (PHY_STAT_100FDX | PHY_STAT_10FDX))
> -			duplex = 1;
> -		fec_restart(dev, duplex);
> -	} else
> -		fec_stop(dev);
> +	if (fep->phy_dev)
> +		phy_disconnect(fep->phy_dev);
> +	mdiobus_unregister(fep->mii_bus);
> +	kfree(fep->mii_bus->irq);
> +	mdiobus_free(fep->mii_bus);
>  }
>  
> -/* mii_queue_relink is called in interrupt context from mii_link_interrupt */
> -static void mii_queue_relink(uint mii_reg, struct net_device *dev)
> +static int fec_enet_get_settings(struct net_device *dev,
> +				  struct ethtool_cmd *cmd)
>  {
>  	struct fec_enet_private *fep = netdev_priv(dev);
> +	struct phy_device *phydev = fep->phy_dev;
>  
> -	/*
> -	 * We cannot queue phy_task twice in the workqueue.  It
> -	 * would cause an endless loop in the workqueue.
> -	 * Fortunately, if the last mii_relink entry has not yet been
> -	 * executed now, it will do the job for the current interrupt,
> -	 * which is just what we want.
> -	 */
> -	if (fep->mii_phy_task_queued)
> -		return;
> +	if (!phydev)
> +		return -ENODEV;
>  
> -	fep->mii_phy_task_queued = 1;
> -	INIT_WORK(&fep->phy_task, mii_relink);
> -	schedule_work(&fep->phy_task);
> +	return phy_ethtool_gset(phydev, cmd);
>  }
>  
> -/* mii_queue_config is called in interrupt context from fec_enet_mii */
> -static void mii_queue_config(uint mii_reg, struct net_device *dev)
> +static int fec_enet_set_settings(struct net_device *dev,
> +				 struct ethtool_cmd *cmd)
>  {
>  	struct fec_enet_private *fep = netdev_priv(dev);
> +	struct phy_device *phydev = fep->phy_dev;
>  
> -	if (fep->mii_phy_task_queued)
> -		return;
> +	if (!phydev)
> +		return -ENODEV;
>  
> -	fep->mii_phy_task_queued = 1;
> -	INIT_WORK(&fep->phy_task, mii_display_config);
> -	schedule_work(&fep->phy_task);
> +	return phy_ethtool_sset(phydev, cmd);
>  }
>  
> -phy_cmd_t const phy_cmd_relink[] = {
> -	{ mk_mii_read(MII_REG_CR), mii_queue_relink },
> -	{ mk_mii_end, }
> -	};
> -phy_cmd_t const phy_cmd_config[] = {
> -	{ mk_mii_read(MII_REG_CR), mii_queue_config },
> -	{ mk_mii_end, }
> -	};
> -
> -/* Read remainder of PHY ID. */
> -static void
> -mii_discover_phy3(uint mii_reg, struct net_device *dev)
> +static void fec_enet_get_drvinfo(struct net_device *dev,
> +				 struct ethtool_drvinfo *info)
>  {
> -	struct fec_enet_private *fep;
> -	int i;
> -
> -	fep = netdev_priv(dev);
> -	fep->phy_id |= (mii_reg & 0xffff);
> -	printk("fec: PHY @ 0x%x, ID 0x%08x", fep->phy_addr, fep->phy_id);
> -
> -	for(i = 0; phy_info[i]; i++) {
> -		if(phy_info[i]->id == (fep->phy_id >> 4))
> -			break;
> -	}
> -
> -	if (phy_info[i])
> -		printk(" -- %s\n", phy_info[i]->name);
> -	else
> -		printk(" -- unknown PHY!\n");
> +	struct fec_enet_private *fep = netdev_priv(dev);
>  
> -	fep->phy = phy_info[i];
> -	fep->phy_id_done = 1;
> +	strcpy(info->driver, fep->pdev->dev.driver->name);
> +	strcpy(info->version, "Revision: 1.0");
> +	strcpy(info->bus_info, dev_name(&dev->dev));
>  }
>  
> -/* Scan all of the MII PHY addresses looking for someone to respond
> - * with a valid ID.  This usually happens quickly.
> - */
> -static void
> -mii_discover_phy(uint mii_reg, struct net_device *dev)
> -{
> -	struct fec_enet_private *fep;
> -	uint phytype;
> -
> -	fep = netdev_priv(dev);
> -
> -	if (fep->phy_addr < 32) {
> -		if ((phytype = (mii_reg & 0xffff)) != 0xffff && phytype != 0) {
> -
> -			/* Got first part of ID, now get remainder */
> -			fep->phy_id = phytype << 16;
> -			mii_queue_unlocked(dev, mk_mii_read(MII_REG_PHYIR2),
> -							mii_discover_phy3);
> -		} else {
> -			fep->phy_addr++;
> -			mii_queue_unlocked(dev, mk_mii_read(MII_REG_PHYIR1),
> -							mii_discover_phy);
> -		}
> -	} else {
> -		printk("FEC: No PHY device found.\n");
> -		/* Disable external MII interface */
> -		writel(0, fep->hwp + FEC_MII_SPEED);
> -		fep->phy_speed = 0;
> -#ifdef HAVE_mii_link_interrupt
> -		fec_disable_phy_intr(dev);
> -#endif
> -	}
> -}
> +static struct ethtool_ops fec_enet_ethtool_ops = {
> +	.get_settings		= fec_enet_get_settings,
> +	.set_settings		= fec_enet_set_settings,
> +	.get_drvinfo		= fec_enet_get_drvinfo,
> +	.get_link		= ethtool_op_get_link,
> +};
>  
> -/* This interrupt occurs when the PHY detects a link change */
> -#ifdef HAVE_mii_link_interrupt
> -static irqreturn_t
> -mii_link_interrupt(int irq, void * dev_id)
> +static int fec_enet_ioctl(struct net_device *dev, struct ifreq *rq, int cmd)
>  {
> -	struct	net_device *dev = dev_id;
>  	struct fec_enet_private *fep = netdev_priv(dev);
> +	struct phy_device *phydev = fep->phy_dev;
> +
> +	if (!netif_running(dev))
> +		return -EINVAL;
>  
> -	mii_do_cmd(dev, fep->phy->ack_int);
> -	mii_do_cmd(dev, phy_cmd_relink);  /* restart and display status */
> +	if (!phydev)
> +		return -ENODEV;
>  
> -	return IRQ_HANDLED;
> +	return phy_mii_ioctl(phydev, if_mii(rq), cmd);
>  }
> -#endif
>  
>  static void fec_enet_free_buffers(struct net_device *dev)
>  {
> @@ -1509,35 +915,8 @@ fec_enet_open(struct net_device *dev)
>  	if (ret)
>  		return ret;
>  
> -	fep->sequence_done = 0;
> -	fep->link = 0;
> -
> -	fec_restart(dev, 1);
> -
> -	if (fep->phy) {
> -		mii_do_cmd(dev, fep->phy->ack_int);
> -		mii_do_cmd(dev, fep->phy->config);
> -		mii_do_cmd(dev, phy_cmd_config);  /* display configuration */
> -
> -		/* Poll until the PHY tells us its configuration
> -		 * (not link state).
> -		 * Request is initiated by mii_do_cmd above, but answer
> -		 * comes by interrupt.
> -		 * This should take about 25 usec per register at 2.5 MHz,
> -		 * and we read approximately 5 registers.
> -		 */
> -		while(!fep->sequence_done)
> -			schedule();
> -
> -		mii_do_cmd(dev, fep->phy->startup);
> -	}
> -
> -	/* Set the initial link state to true. A lot of hardware
> -	 * based on this device does not implement a PHY interrupt,
> -	 * so we are never notified of link change.
> -	 */
> -	fep->link = 1;
> -
> +	/* schedule a link state check */
> +	phy_start(fep->phy_dev);
>  	netif_start_queue(dev);
>  	fep->opened = 1;
>  	return 0;
> @@ -1550,6 +929,7 @@ fec_enet_close(struct net_device *dev)
>  
>  	/* Don't know what to do yet. */
>  	fep->opened = 0;
> +	phy_stop(fep->phy_dev);
>  	netif_stop_queue(dev);
>  	fec_stop(dev);
>  
> @@ -1666,6 +1046,7 @@ static const struct net_device_ops fec_netdev_ops = {
>  	.ndo_validate_addr	= eth_validate_addr,
>  	.ndo_tx_timeout		= fec_timeout,
>  	.ndo_set_mac_address	= fec_set_mac_address,
> +	.ndo_do_ioctl           = fec_enet_ioctl,
>  };
>  
>   /*
> @@ -1689,7 +1070,6 @@ static int fec_enet_init(struct net_device *dev, int index)
>  	}
>  
>  	spin_lock_init(&fep->hw_lock);
> -	spin_lock_init(&fep->mii_lock);
>  
>  	fep->index = index;
>  	fep->hwp = (void __iomem *)dev->base_addr;
> @@ -1716,20 +1096,10 @@ static int fec_enet_init(struct net_device *dev, int index)
>  	fep->rx_bd_base = cbd_base;
>  	fep->tx_bd_base = cbd_base + RX_RING_SIZE;
>  
> -#ifdef HAVE_mii_link_interrupt
> -	fec_request_mii_intr(dev);
> -#endif
>  	/* The FEC Ethernet specific entries in the device structure */
>  	dev->watchdog_timeo = TX_TIMEOUT;
>  	dev->netdev_ops = &fec_netdev_ops;
> -
> -	for (i=0; i<NMII-1; i++)
> -		mii_cmds[i].mii_next = &mii_cmds[i+1];
> -	mii_free = mii_cmds;
> -
> -	/* Set MII speed to 2.5 MHz */
> -	fep->phy_speed = ((((clk_get_rate(fep->clk) / 2 + 4999999)
> -					/ 2500000) / 2) & 0x3F) << 1;
> +	dev->ethtool_ops = &fec_enet_ethtool_ops;
>  
>  	/* Initialize the receive buffer descriptors. */
>  	bdp = fep->rx_bd_base;
> @@ -1760,13 +1130,6 @@ static int fec_enet_init(struct net_device *dev, int index)
>  
>  	fec_restart(dev, 0);
>  
> -	/* Queue up command to detect the PHY and initialize the
> -	 * remainder of the interface.
> -	 */
> -	fep->phy_id_done = 0;
> -	fep->phy_addr = 0;
> -	mii_queue(dev, mk_mii_read(MII_REG_PHYIR1), mii_discover_phy);
> -
>  	return 0;
>  }
>  
> @@ -1835,8 +1198,7 @@ fec_restart(struct net_device *dev, int duplex)
>  	writel(0, fep->hwp + FEC_R_DES_ACTIVE);
>  
>  	/* Enable interrupts we wish to service */
> -	writel(FEC_ENET_TXF | FEC_ENET_RXF | FEC_ENET_MII,
> -			fep->hwp + FEC_IMASK);
> +	writel(FEC_ENET_TXF | FEC_ENET_RXF, fep->hwp + FEC_IMASK);
>  }
>  
>  static void
> @@ -1859,7 +1221,6 @@ fec_stop(struct net_device *dev)
>  	/* Clear outstanding MII command interrupts. */
>  	writel(FEC_ENET_MII, fep->hwp + FEC_IEVENT);
>  
> -	writel(FEC_ENET_MII, fep->hwp + FEC_IMASK);
>  	writel(fep->phy_speed, fep->hwp + FEC_MII_SPEED);
>  }
>  
> @@ -1891,6 +1252,7 @@ fec_probe(struct platform_device *pdev)
>  	memset(fep, 0, sizeof(*fep));
>  
>  	ndev->base_addr = (unsigned long)ioremap(r->start, resource_size(r));
> +	fep->pdev = pdev;
>  
>  	if (!ndev->base_addr) {
>  		ret = -ENOMEM;
> @@ -1926,13 +1288,24 @@ fec_probe(struct platform_device *pdev)
>  	if (ret)
>  		goto failed_init;
>  
> +	ret = fec_enet_mii_init(pdev);
> +	if (ret)
> +		goto failed_mii_init;
> +
>  	ret = register_netdev(ndev);
>  	if (ret)
>  		goto failed_register;
>  
> +	printk(KERN_INFO "%s: Freescale FEC PHY driver [%s] "
> +		"(mii_bus:phy_addr=%s, irq=%d)\n", ndev->name,
> +		fep->phy_dev->drv->name, dev_name(&fep->phy_dev->dev),
> +		fep->phy_dev->irq);
> +
>  	return 0;
>  
>  failed_register:
> +	fec_enet_mii_remove(fep);
> +failed_mii_init:
>  failed_init:
>  	clk_disable(fep->clk);
>  	clk_put(fep->clk);
> @@ -1959,6 +1332,7 @@ fec_drv_remove(struct platform_device *pdev)
>  	platform_set_drvdata(pdev, NULL);
>  
>  	fec_stop(ndev);
> +	fec_enet_mii_remove(fep);
>  	clk_disable(fep->clk);
>  	clk_put(fep->clk);
>  	iounmap((void __iomem *)ndev->base_addr);
> -- 
> 1.7.0.1
> 
> 

-- 
Pengutronix e.K.                           |                             |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |
Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |

^ permalink raw reply

* [PATCH nf-next-2.6] xt_hashlimit: RCU conversion
From: Eric Dumazet @ 2010-03-31 12:23 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: Jorrit Kronjee, netfilter-devel, netdev
In-Reply-To: <1269514928.3626.13.camel@edumazet-laptop>

xt_hashlimit uses a central lock per hash table and suffers from
contention on some workloads. (Multiqueue NIC or if RPS is enabled)

After RCU conversion, central lock is only used when a writer wants to
add or delete an entry.

For 'readers', updating an existing entry, they use an individual lock
per entry.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
---
 net/netfilter/xt_hashlimit.c |  115 ++++++++++++++++++---------------
 1 file changed, 66 insertions(+), 49 deletions(-)
diff --git a/net/netfilter/xt_hashlimit.c b/net/netfilter/xt_hashlimit.c
index 5470bb0..f245de6 100644
--- a/net/netfilter/xt_hashlimit.c
+++ b/net/netfilter/xt_hashlimit.c
@@ -81,12 +81,14 @@ struct dsthash_ent {
 	struct dsthash_dst dst;
 
 	/* modified structure members in the end */
+	spinlock_t lock;
 	unsigned long expires;		/* precalculated expiry time */
 	struct {
 		unsigned long prev;	/* last modification */
 		u_int32_t credit;
 		u_int32_t credit_cap, cost;
 	} rateinfo;
+	struct rcu_head rcu;
 };
 
 struct xt_hashlimit_htable {
@@ -143,54 +145,30 @@ dsthash_find(const struct xt_hashlimit_htable *ht,
 	u_int32_t hash = hash_dst(ht, dst);
 
 	if (!hlist_empty(&ht->hash[hash])) {
-		hlist_for_each_entry(ent, pos, &ht->hash[hash], node)
-			if (dst_cmp(ent, dst))
+		hlist_for_each_entry_rcu(ent, pos, &ht->hash[hash], node)
+			if (dst_cmp(ent, dst)) {
+				spin_lock(&ent->lock);
 				return ent;
+			}
 	}
 	return NULL;
 }
 
-/* allocate dsthash_ent, initialize dst, put in htable and lock it */
-static struct dsthash_ent *
-dsthash_alloc_init(struct xt_hashlimit_htable *ht,
-		   const struct dsthash_dst *dst)
+static void dsthash_free_rcu(struct rcu_head *head)
 {
-	struct dsthash_ent *ent;
+	struct dsthash_ent *ent = container_of(head, struct dsthash_ent, rcu);
 
-	/* initialize hash with random val at the time we allocate
-	 * the first hashtable entry */
-	if (!ht->rnd_initialized) {
-		get_random_bytes(&ht->rnd, sizeof(ht->rnd));
-		ht->rnd_initialized = true;
-	}
-
-	if (ht->cfg.max && ht->count >= ht->cfg.max) {
-		/* FIXME: do something. question is what.. */
-		if (net_ratelimit())
-			pr_err("max count of %u reached\n", ht->cfg.max);
-		return NULL;
-	}
-
-	ent = kmem_cache_alloc(hashlimit_cachep, GFP_ATOMIC);
-	if (!ent) {
-		if (net_ratelimit())
-			pr_err("cannot allocate dsthash_ent\n");
-		return NULL;
-	}
-	memcpy(&ent->dst, dst, sizeof(ent->dst));
-
-	hlist_add_head(&ent->node, &ht->hash[hash_dst(ht, dst)]);
-	ht->count++;
-	return ent;
+	kmem_cache_free(hashlimit_cachep, ent);
 }
 
-static inline void
-dsthash_free(struct xt_hashlimit_htable *ht, struct dsthash_ent *ent)
+static void dsthash_free(struct xt_hashlimit_htable *ht, struct dsthash_ent *ent)
 {
-	hlist_del(&ent->node);
-	kmem_cache_free(hashlimit_cachep, ent);
+	hlist_del_rcu(&ent->node);
+	call_rcu_bh(&ent->rcu, dsthash_free_rcu);
 	ht->count--;
 }
+
+
 static void htable_gc(unsigned long htlong);
 
 static int htable_create(struct net *net, struct xt_hashlimit_mtinfo1 *minfo,
@@ -500,6 +478,49 @@ hashlimit_init_dst(const struct xt_hashlimit_htable *hinfo,
 	return 0;
 }
 
+/* allocate dsthash_ent, initialize dst, put in htable and lock it */
+static struct dsthash_ent *
+dsthash_alloc_init(struct xt_hashlimit_htable *ht,
+		   const struct dsthash_dst *dst)
+{
+	struct dsthash_ent *ent;
+
+	spin_lock(&ht->lock);
+	/* initialize hash with random val at the time we allocate
+	 * the first hashtable entry */
+	if (unlikely(!ht->rnd_initialized)) {
+		get_random_bytes(&ht->rnd, sizeof(ht->rnd));
+		ht->rnd_initialized = true;
+	}
+
+	if (ht->cfg.max && ht->count >= ht->cfg.max) {
+		/* FIXME: do something. question is what.. */
+		if (net_ratelimit())
+			pr_err("max count of %u reached\n", ht->cfg.max);
+		ent = NULL;
+	} else
+		ent = kmem_cache_alloc(hashlimit_cachep, GFP_ATOMIC);
+	if (!ent) {
+		if (net_ratelimit())
+			pr_err("cannot allocate dsthash_ent\n");
+	} else {
+		memcpy(&ent->dst, dst, sizeof(ent->dst));
+		spin_lock_init(&ent->lock);
+
+		ent->expires = jiffies + msecs_to_jiffies(ht->cfg.expire);
+		ent->rateinfo.prev = jiffies;
+		ent->rateinfo.credit = user2credits(ht->cfg.avg *
+		                      ht->cfg.burst);
+		ent->rateinfo.credit_cap = user2credits(ht->cfg.avg *
+		                          ht->cfg.burst);
+		ent->rateinfo.cost = user2credits(ht->cfg.avg);
+		spin_lock(&ent->lock);
+		hlist_add_head_rcu(&ent->node, &ht->hash[hash_dst(ht, dst)]);
+		ht->count++;
+	}
+	spin_unlock(&ht->lock);
+	return ent;
+}
 static bool
 hashlimit_mt(const struct sk_buff *skb, const struct xt_match_param *par)
 {
@@ -512,22 +533,14 @@ hashlimit_mt(const struct sk_buff *skb, const struct xt_match_param *par)
 	if (hashlimit_init_dst(hinfo, &dst, skb, par->thoff) < 0)
 		goto hotdrop;
 
-	spin_lock_bh(&hinfo->lock);
+	rcu_read_lock_bh();
 	dh = dsthash_find(hinfo, &dst);
 	if (dh == NULL) {
 		dh = dsthash_alloc_init(hinfo, &dst);
 		if (dh == NULL) {
-			spin_unlock_bh(&hinfo->lock);
+			rcu_read_unlock_bh();
 			goto hotdrop;
 		}
-
-		dh->expires = jiffies + msecs_to_jiffies(hinfo->cfg.expire);
-		dh->rateinfo.prev = jiffies;
-		dh->rateinfo.credit = user2credits(hinfo->cfg.avg *
-		                      hinfo->cfg.burst);
-		dh->rateinfo.credit_cap = user2credits(hinfo->cfg.avg *
-		                          hinfo->cfg.burst);
-		dh->rateinfo.cost = user2credits(hinfo->cfg.avg);
 	} else {
 		/* update expiration timeout */
 		dh->expires = now + msecs_to_jiffies(hinfo->cfg.expire);
@@ -537,11 +550,13 @@ hashlimit_mt(const struct sk_buff *skb, const struct xt_match_param *par)
 	if (dh->rateinfo.credit >= dh->rateinfo.cost) {
 		/* below the limit */
 		dh->rateinfo.credit -= dh->rateinfo.cost;
-		spin_unlock_bh(&hinfo->lock);
+		spin_unlock(&dh->lock);
+		rcu_read_unlock_bh();
 		return !(info->cfg.mode & XT_HASHLIMIT_INVERT);
 	}
 
-	spin_unlock_bh(&hinfo->lock);
+	spin_unlock(&dh->lock);
+	rcu_read_unlock_bh();
 	/* default match is underlimit - so over the limit, we need to invert */
 	return info->cfg.mode & XT_HASHLIMIT_INVERT;
 
@@ -817,9 +832,11 @@ err1:
 
 static void __exit hashlimit_mt_exit(void)
 {
-	kmem_cache_destroy(hashlimit_cachep);
 	xt_unregister_matches(hashlimit_mt_reg, ARRAY_SIZE(hashlimit_mt_reg));
 	unregister_pernet_subsys(&hashlimit_net_ops);
+
+	rcu_barrier_bh();
+	kmem_cache_destroy(hashlimit_cachep);
 }
 
 module_init(hashlimit_mt_init);



^ permalink raw reply related

* Re: [PATCH] netdev/fec.c: add phylib supporting to enable carrier detection (v2)
From: Amit Kucheria @ 2010-03-31 12:21 UTC (permalink / raw)
  To: Bryan Wu
  Cc: netdev, s.hauer, w.sang, linux-kernel, kernel-team, gerg,
	linux-arm-kernel
In-Reply-To: <1270037444-8470-1-git-send-email-bryan.wu@canonical.com>

On 10 Mar 31, Bryan Wu wrote:
> BugLink: http://bugs.launchpad.net/bugs/457878
> 
> v2:
>  - remove duplicated phy_speed caculation
>  - fix the phy_speed caculation according to the DataSheet
> 
> v1:
>  - removed old MII phy control code
>  - add phylib supporting
>  - add ethtool interface to make user space NetworkManager works
> 
> Tested on Freescale i.MX51 Babbage board.
> 
> This patch is based on a patch from Frederic Rodo <fred.rodo@gmail.com>
> 
> Cc: Frederic Rodo <fred.rodo@gmail.com>
> Signed-off-by: Bryan Wu <bryan.wu@canonical.com>
Acked-by: Amit Kucheria <amit.kucheria@canonical.com>

> ---
>  drivers/net/Kconfig |    1 +
>  drivers/net/fec.c   | 1128 ++++++++++++---------------------------------------
>  2 files changed, 252 insertions(+), 877 deletions(-)
> 
> diff --git a/drivers/net/Kconfig b/drivers/net/Kconfig
> index 0ba5b8e..41f6a70 100644
> --- a/drivers/net/Kconfig
> +++ b/drivers/net/Kconfig
> @@ -1916,6 +1916,7 @@ config FEC
>  	bool "FEC ethernet controller (of ColdFire and some i.MX CPUs)"
>  	depends on M523x || M527x || M5272 || M528x || M520x || M532x || \
>  		MACH_MX27 || ARCH_MX35 || ARCH_MX25 || ARCH_MX5
> +	select PHYLIB
>  	help
>  	  Say Y here if you want to use the built-in 10/100 Fast ethernet
>  	  controller on some Motorola ColdFire and Freescale i.MX processors.
> diff --git a/drivers/net/fec.c b/drivers/net/fec.c
> index 9f98c1c..848eb19 100644
> --- a/drivers/net/fec.c
> +++ b/drivers/net/fec.c
> @@ -40,6 +40,7 @@
>  #include <linux/irq.h>
>  #include <linux/clk.h>
>  #include <linux/platform_device.h>
> +#include <linux/phy.h>
>  
>  #include <asm/cacheflush.h>
>  
> @@ -61,7 +62,6 @@
>   * Define the fixed address of the FEC hardware.
>   */
>  #if defined(CONFIG_M5272)
> -#define HAVE_mii_link_interrupt
>  
>  static unsigned char	fec_mac_default[] = {
>  	0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
> @@ -86,23 +86,6 @@ static unsigned char	fec_mac_default[] = {
>  #endif
>  #endif /* CONFIG_M5272 */
>  
> -/* Forward declarations of some structures to support different PHYs */
> -
> -typedef struct {
> -	uint mii_data;
> -	void (*funct)(uint mii_reg, struct net_device *dev);
> -} phy_cmd_t;
> -
> -typedef struct {
> -	uint id;
> -	char *name;
> -
> -	const phy_cmd_t *config;
> -	const phy_cmd_t *startup;
> -	const phy_cmd_t *ack_int;
> -	const phy_cmd_t *shutdown;
> -} phy_info_t;
> -
>  /* The number of Tx and Rx buffers.  These are allocated from the page
>   * pool.  The code may assume these are power of two, so it it best
>   * to keep them that size.
> @@ -189,29 +172,21 @@ struct fec_enet_private {
>  	uint	tx_full;
>  	/* hold while accessing the HW like ringbuffer for tx/rx but not MAC */
>  	spinlock_t hw_lock;
> -	/* hold while accessing the mii_list_t() elements */
> -	spinlock_t mii_lock;
> -
> -	uint	phy_id;
> -	uint	phy_id_done;
> -	uint	phy_status;
> -	uint	phy_speed;
> -	phy_info_t const	*phy;
> -	struct work_struct phy_task;
>  
> -	uint	sequence_done;
> -	uint	mii_phy_task_queued;
> +	struct  platform_device *pdev;
>  
> -	uint	phy_addr;
> +	int	opened;
>  
> +	/* Phylib and MDIO interface */
> +	struct  mii_bus *mii_bus;
> +	struct  phy_device *phy_dev;
> +	int     mii_timeout;
> +	uint    phy_speed;
>  	int	index;
> -	int	opened;
>  	int	link;
> -	int	old_link;
>  	int	full_duplex;
>  };
>  
> -static void fec_enet_mii(struct net_device *dev);
>  static irqreturn_t fec_enet_interrupt(int irq, void * dev_id);
>  static void fec_enet_tx(struct net_device *dev);
>  static void fec_enet_rx(struct net_device *dev);
> @@ -219,67 +194,20 @@ static int fec_enet_close(struct net_device *dev);
>  static void fec_restart(struct net_device *dev, int duplex);
>  static void fec_stop(struct net_device *dev);
>  
> +/* FEC MII MMFR bits definition */
> +#define FEC_MMFR_ST		(1 << 30)
> +#define FEC_MMFR_OP_READ	(2 << 28)
> +#define FEC_MMFR_OP_WRITE	(1 << 28)
> +#define FEC_MMFR_PA(v)		((v & 0x1f) << 23)
> +#define FEC_MMFR_RA(v)		((v & 0x1f) << 18)
> +#define FEC_MMFR_TA		(2 << 16)
> +#define FEC_MMFR_DATA(v)	(v & 0xffff)
>  
> -/* MII processing.  We keep this as simple as possible.  Requests are
> - * placed on the list (if there is room).  When the request is finished
> - * by the MII, an optional function may be called.
> - */
> -typedef struct mii_list {
> -	uint	mii_regval;
> -	void	(*mii_func)(uint val, struct net_device *dev);
> -	struct	mii_list *mii_next;
> -} mii_list_t;
> -
> -#define		NMII	20
> -static mii_list_t	mii_cmds[NMII];
> -static mii_list_t	*mii_free;
> -static mii_list_t	*mii_head;
> -static mii_list_t	*mii_tail;
> -
> -static int	mii_queue(struct net_device *dev, int request,
> -				void (*func)(uint, struct net_device *));
> -
> -/* Make MII read/write commands for the FEC */
> -#define mk_mii_read(REG)	(0x60020000 | ((REG & 0x1f) << 18))
> -#define mk_mii_write(REG, VAL)	(0x50020000 | ((REG & 0x1f) << 18) | \
> -						(VAL & 0xffff))
> -#define mk_mii_end	0
> +#define FEC_MII_TIMEOUT		10000
>  
>  /* Transmitter timeout */
>  #define TX_TIMEOUT (2 * HZ)
>  
> -/* Register definitions for the PHY */
> -
> -#define MII_REG_CR          0  /* Control Register                         */
> -#define MII_REG_SR          1  /* Status Register                          */
> -#define MII_REG_PHYIR1      2  /* PHY Identification Register 1            */
> -#define MII_REG_PHYIR2      3  /* PHY Identification Register 2            */
> -#define MII_REG_ANAR        4  /* A-N Advertisement Register               */
> -#define MII_REG_ANLPAR      5  /* A-N Link Partner Ability Register        */
> -#define MII_REG_ANER        6  /* A-N Expansion Register                   */
> -#define MII_REG_ANNPTR      7  /* A-N Next Page Transmit Register          */
> -#define MII_REG_ANLPRNPR    8  /* A-N Link Partner Received Next Page Reg. */
> -
> -/* values for phy_status */
> -
> -#define PHY_CONF_ANE	0x0001  /* 1 auto-negotiation enabled */
> -#define PHY_CONF_LOOP	0x0002  /* 1 loopback mode enabled */
> -#define PHY_CONF_SPMASK	0x00f0  /* mask for speed */
> -#define PHY_CONF_10HDX	0x0010  /* 10 Mbit half duplex supported */
> -#define PHY_CONF_10FDX	0x0020  /* 10 Mbit full duplex supported */
> -#define PHY_CONF_100HDX	0x0040  /* 100 Mbit half duplex supported */
> -#define PHY_CONF_100FDX	0x0080  /* 100 Mbit full duplex supported */
> -
> -#define PHY_STAT_LINK	0x0100  /* 1 up - 0 down */
> -#define PHY_STAT_FAULT	0x0200  /* 1 remote fault */
> -#define PHY_STAT_ANC	0x0400  /* 1 auto-negotiation complete	*/
> -#define PHY_STAT_SPMASK	0xf000  /* mask for speed */
> -#define PHY_STAT_10HDX	0x1000  /* 10 Mbit half duplex selected	*/
> -#define PHY_STAT_10FDX	0x2000  /* 10 Mbit full duplex selected	*/
> -#define PHY_STAT_100HDX	0x4000  /* 100 Mbit half duplex selected */
> -#define PHY_STAT_100FDX	0x8000  /* 100 Mbit full duplex selected */
> -
> -
>  static int
>  fec_enet_start_xmit(struct sk_buff *skb, struct net_device *dev)
>  {
> @@ -406,12 +334,6 @@ fec_enet_interrupt(int irq, void * dev_id)
>  			ret = IRQ_HANDLED;
>  			fec_enet_tx(dev);
>  		}
> -
> -		if (int_events & FEC_ENET_MII) {
> -			ret = IRQ_HANDLED;
> -			fec_enet_mii(dev);
> -		}
> -
>  	} while (int_events);
>  
>  	return ret;
> @@ -607,827 +529,311 @@ rx_processing_done:
>  	spin_unlock(&fep->hw_lock);
>  }
>  
> -/* called from interrupt context */
> -static void
> -fec_enet_mii(struct net_device *dev)
> -{
> -	struct	fec_enet_private *fep;
> -	mii_list_t	*mip;
> -
> -	fep = netdev_priv(dev);
> -	spin_lock(&fep->mii_lock);
> -
> -	if ((mip = mii_head) == NULL) {
> -		printk("MII and no head!\n");
> -		goto unlock;
> -	}
> -
> -	if (mip->mii_func != NULL)
> -		(*(mip->mii_func))(readl(fep->hwp + FEC_MII_DATA), dev);
> -
> -	mii_head = mip->mii_next;
> -	mip->mii_next = mii_free;
> -	mii_free = mip;
> -
> -	if ((mip = mii_head) != NULL)
> -		writel(mip->mii_regval, fep->hwp + FEC_MII_DATA);
> -
> -unlock:
> -	spin_unlock(&fep->mii_lock);
> -}
> -
> -static int
> -mii_queue_unlocked(struct net_device *dev, int regval,
> -		void (*func)(uint, struct net_device *))
> +/* ------------------------------------------------------------------------- */
> +#ifdef CONFIG_M5272
> +static void __inline__ fec_get_mac(struct net_device *dev)
>  {
> -	struct fec_enet_private *fep;
> -	mii_list_t	*mip;
> -	int		retval;
> -
> -	/* Add PHY address to register command */
> -	fep = netdev_priv(dev);
> +	struct fec_enet_private *fep = netdev_priv(dev);
> +	unsigned char *iap, tmpaddr[ETH_ALEN];
>  
> -	regval |= fep->phy_addr << 23;
> -	retval = 0;
> -
> -	if ((mip = mii_free) != NULL) {
> -		mii_free = mip->mii_next;
> -		mip->mii_regval = regval;
> -		mip->mii_func = func;
> -		mip->mii_next = NULL;
> -		if (mii_head) {
> -			mii_tail->mii_next = mip;
> -			mii_tail = mip;
> -		} else {
> -			mii_head = mii_tail = mip;
> -			writel(regval, fep->hwp + FEC_MII_DATA);
> -		}
> +	if (FEC_FLASHMAC) {
> +		/*
> +		 * Get MAC address from FLASH.
> +		 * If it is all 1's or 0's, use the default.
> +		 */
> +		iap = (unsigned char *)FEC_FLASHMAC;
> +		if ((iap[0] == 0) && (iap[1] == 0) && (iap[2] == 0) &&
> +		    (iap[3] == 0) && (iap[4] == 0) && (iap[5] == 0))
> +			iap = fec_mac_default;
> +		if ((iap[0] == 0xff) && (iap[1] == 0xff) && (iap[2] == 0xff) &&
> +		    (iap[3] == 0xff) && (iap[4] == 0xff) && (iap[5] == 0xff))
> +			iap = fec_mac_default;
>  	} else {
> -		retval = 1;
> +		*((unsigned long *) &tmpaddr[0]) = readl(fep->hwp + FEC_ADDR_LOW);
> +		*((unsigned short *) &tmpaddr[4]) = (readl(fep->hwp + FEC_ADDR_HIGH) >> 16);
> +		iap = &tmpaddr[0];
>  	}
>  
> -	return retval;
> -}
> -
> -static int
> -mii_queue(struct net_device *dev, int regval,
> -		void (*func)(uint, struct net_device *))
> -{
> -	struct fec_enet_private *fep;
> -	unsigned long   flags;
> -	int             retval;
> -	fep = netdev_priv(dev);
> -	spin_lock_irqsave(&fep->mii_lock, flags);
> -	retval = mii_queue_unlocked(dev, regval, func);
> -	spin_unlock_irqrestore(&fep->mii_lock, flags);
> -	return retval;
> -}
> -
> -static void mii_do_cmd(struct net_device *dev, const phy_cmd_t *c)
> -{
> -	if(!c)
> -		return;
> +	memcpy(dev->dev_addr, iap, ETH_ALEN);
>  
> -	for (; c->mii_data != mk_mii_end; c++)
> -		mii_queue(dev, c->mii_data, c->funct);
> +	/* Adjust MAC if using default MAC address */
> +	if (iap == fec_mac_default)
> +		 dev->dev_addr[ETH_ALEN-1] = fec_mac_default[ETH_ALEN-1] + fep->index;
>  }
> +#endif
>  
> -static void mii_parse_sr(uint mii_reg, struct net_device *dev)
> -{
> -	struct fec_enet_private *fep = netdev_priv(dev);
> -	volatile uint *s = &(fep->phy_status);
> -	uint status;
> -
> -	status = *s & ~(PHY_STAT_LINK | PHY_STAT_FAULT | PHY_STAT_ANC);
> -
> -	if (mii_reg & 0x0004)
> -		status |= PHY_STAT_LINK;
> -	if (mii_reg & 0x0010)
> -		status |= PHY_STAT_FAULT;
> -	if (mii_reg & 0x0020)
> -		status |= PHY_STAT_ANC;
> -	*s = status;
> -}
> +/* ------------------------------------------------------------------------- */
>  
> -static void mii_parse_cr(uint mii_reg, struct net_device *dev)
> +/*
> + * Phy section
> + */
> +static void fec_enet_adjust_link(struct net_device *dev)
>  {
>  	struct fec_enet_private *fep = netdev_priv(dev);
> -	volatile uint *s = &(fep->phy_status);
> -	uint status;
> -
> -	status = *s & ~(PHY_CONF_ANE | PHY_CONF_LOOP);
> -
> -	if (mii_reg & 0x1000)
> -		status |= PHY_CONF_ANE;
> -	if (mii_reg & 0x4000)
> -		status |= PHY_CONF_LOOP;
> -	*s = status;
> -}
> +	struct phy_device *phy_dev = fep->phy_dev;
> +	unsigned long flags;
>  
> -static void mii_parse_anar(uint mii_reg, struct net_device *dev)
> -{
> -	struct fec_enet_private *fep = netdev_priv(dev);
> -	volatile uint *s = &(fep->phy_status);
> -	uint status;
> -
> -	status = *s & ~(PHY_CONF_SPMASK);
> -
> -	if (mii_reg & 0x0020)
> -		status |= PHY_CONF_10HDX;
> -	if (mii_reg & 0x0040)
> -		status |= PHY_CONF_10FDX;
> -	if (mii_reg & 0x0080)
> -		status |= PHY_CONF_100HDX;
> -	if (mii_reg & 0x00100)
> -		status |= PHY_CONF_100FDX;
> -	*s = status;
> -}
> +	int status_change = 0;
>  
> -/* ------------------------------------------------------------------------- */
> -/* The Level one LXT970 is used by many boards				     */
> +	spin_lock_irqsave(&fep->hw_lock, flags);
>  
> -#define MII_LXT970_MIRROR    16  /* Mirror register           */
> -#define MII_LXT970_IER       17  /* Interrupt Enable Register */
> -#define MII_LXT970_ISR       18  /* Interrupt Status Register */
> -#define MII_LXT970_CONFIG    19  /* Configuration Register    */
> -#define MII_LXT970_CSR       20  /* Chip Status Register      */
> +	/* Prevent a state halted on mii error */
> +	if (fep->mii_timeout && phy_dev->state == PHY_HALTED) {
> +		phy_dev->state = PHY_RESUMING;
> +		goto spin_unlock;
> +	}
>  
> -static void mii_parse_lxt970_csr(uint mii_reg, struct net_device *dev)
> -{
> -	struct fec_enet_private *fep = netdev_priv(dev);
> -	volatile uint *s = &(fep->phy_status);
> -	uint status;
> +	/* Duplex link change */
> +	if (phy_dev->link) {
> +		if (fep->full_duplex != phy_dev->duplex) {
> +			fec_restart(dev, phy_dev->duplex);
> +			status_change = 1;
> +		}
> +	}
>  
> -	status = *s & ~(PHY_STAT_SPMASK);
> -	if (mii_reg & 0x0800) {
> -		if (mii_reg & 0x1000)
> -			status |= PHY_STAT_100FDX;
> +	/* Link on or off change */
> +	if (phy_dev->link != fep->link) {
> +		fep->link = phy_dev->link;
> +		if (phy_dev->link)
> +			fec_restart(dev, phy_dev->duplex);
>  		else
> -			status |= PHY_STAT_100HDX;
> -	} else {
> -		if (mii_reg & 0x1000)
> -			status |= PHY_STAT_10FDX;
> -		else
> -			status |= PHY_STAT_10HDX;
> +			fec_stop(dev);
> +		status_change = 1;
>  	}
> -	*s = status;
> -}
> -
> -static phy_cmd_t const phy_cmd_lxt970_config[] = {
> -		{ mk_mii_read(MII_REG_CR), mii_parse_cr },
> -		{ mk_mii_read(MII_REG_ANAR), mii_parse_anar },
> -		{ mk_mii_end, }
> -	};
> -static phy_cmd_t const phy_cmd_lxt970_startup[] = { /* enable interrupts */
> -		{ mk_mii_write(MII_LXT970_IER, 0x0002), NULL },
> -		{ mk_mii_write(MII_REG_CR, 0x1200), NULL }, /* autonegotiate */
> -		{ mk_mii_end, }
> -	};
> -static phy_cmd_t const phy_cmd_lxt970_ack_int[] = {
> -		/* read SR and ISR to acknowledge */
> -		{ mk_mii_read(MII_REG_SR), mii_parse_sr },
> -		{ mk_mii_read(MII_LXT970_ISR), NULL },
> -
> -		/* find out the current status */
> -		{ mk_mii_read(MII_LXT970_CSR), mii_parse_lxt970_csr },
> -		{ mk_mii_end, }
> -	};
> -static phy_cmd_t const phy_cmd_lxt970_shutdown[] = { /* disable interrupts */
> -		{ mk_mii_write(MII_LXT970_IER, 0x0000), NULL },
> -		{ mk_mii_end, }
> -	};
> -static phy_info_t const phy_info_lxt970 = {
> -	.id = 0x07810000,
> -	.name = "LXT970",
> -	.config = phy_cmd_lxt970_config,
> -	.startup = phy_cmd_lxt970_startup,
> -	.ack_int = phy_cmd_lxt970_ack_int,
> -	.shutdown = phy_cmd_lxt970_shutdown
> -};
>  
> -/* ------------------------------------------------------------------------- */
> -/* The Level one LXT971 is used on some of my custom boards                  */
> -
> -/* register definitions for the 971 */
> +spin_unlock:
> +	spin_unlock_irqrestore(&fep->hw_lock, flags);
>  
> -#define MII_LXT971_PCR       16  /* Port Control Register     */
> -#define MII_LXT971_SR2       17  /* Status Register 2         */
> -#define MII_LXT971_IER       18  /* Interrupt Enable Register */
> -#define MII_LXT971_ISR       19  /* Interrupt Status Register */
> -#define MII_LXT971_LCR       20  /* LED Control Register      */
> -#define MII_LXT971_TCR       30  /* Transmit Control Register */
> +	if (status_change)
> +		phy_print_status(phy_dev);
> +}
>  
>  /*
> - * I had some nice ideas of running the MDIO faster...
> - * The 971 should support 8MHz and I tried it, but things acted really
> - * weird, so 2.5 MHz ought to be enough for anyone...
> + * NOTE: a MII transaction is during around 25 us, so polling it...
>   */
> -
> -static void mii_parse_lxt971_sr2(uint mii_reg, struct net_device *dev)
> +static int fec_enet_mdio_read(struct mii_bus *bus, int mii_id, int regnum)
>  {
> -	struct fec_enet_private *fep = netdev_priv(dev);
> -	volatile uint *s = &(fep->phy_status);
> -	uint status;
> +	struct fec_enet_private *fep = bus->priv;
> +	int timeout = FEC_MII_TIMEOUT;
>  
> -	status = *s & ~(PHY_STAT_SPMASK | PHY_STAT_LINK | PHY_STAT_ANC);
> +	fep->mii_timeout = 0;
>  
> -	if (mii_reg & 0x0400) {
> -		fep->link = 1;
> -		status |= PHY_STAT_LINK;
> -	} else {
> -		fep->link = 0;
> -	}
> -	if (mii_reg & 0x0080)
> -		status |= PHY_STAT_ANC;
> -	if (mii_reg & 0x4000) {
> -		if (mii_reg & 0x0200)
> -			status |= PHY_STAT_100FDX;
> -		else
> -			status |= PHY_STAT_100HDX;
> -	} else {
> -		if (mii_reg & 0x0200)
> -			status |= PHY_STAT_10FDX;
> -		else
> -			status |= PHY_STAT_10HDX;
> +	/* clear MII end of transfer bit*/
> +	writel(FEC_ENET_MII, fep->hwp + FEC_IEVENT);
> +
> +	/* start a read op */
> +	writel(FEC_MMFR_ST | FEC_MMFR_OP_READ |
> +		FEC_MMFR_PA(mii_id) | FEC_MMFR_RA(regnum) |
> +		FEC_MMFR_TA, fep->hwp + FEC_MII_DATA);
> +
> +	/* wait for end of transfer */
> +	while (!(readl(fep->hwp + FEC_IEVENT) & FEC_ENET_MII)) {
> +		cpu_relax();
> +		if (timeout-- < 0) {
> +			fep->mii_timeout = 1;
> +			printk(KERN_ERR "FEC: MDIO read timeout\n");
> +			return -ETIMEDOUT;
> +		}
>  	}
> -	if (mii_reg & 0x0008)
> -		status |= PHY_STAT_FAULT;
>  
> -	*s = status;
> +	/* return value */
> +	return FEC_MMFR_DATA(readl(fep->hwp + FEC_MII_DATA));
>  }
>  
> -static phy_cmd_t const phy_cmd_lxt971_config[] = {
> -		/* limit to 10MBit because my prototype board
> -		 * doesn't work with 100. */
> -		{ mk_mii_read(MII_REG_CR), mii_parse_cr },
> -		{ mk_mii_read(MII_REG_ANAR), mii_parse_anar },
> -		{ mk_mii_read(MII_LXT971_SR2), mii_parse_lxt971_sr2 },
> -		{ mk_mii_end, }
> -	};
> -static phy_cmd_t const phy_cmd_lxt971_startup[] = {  /* enable interrupts */
> -		{ mk_mii_write(MII_LXT971_IER, 0x00f2), NULL },
> -		{ mk_mii_write(MII_REG_CR, 0x1200), NULL }, /* autonegotiate */
> -		{ mk_mii_write(MII_LXT971_LCR, 0xd422), NULL }, /* LED config */
> -		/* Somehow does the 971 tell me that the link is down
> -		 * the first read after power-up.
> -		 * read here to get a valid value in ack_int */
> -		{ mk_mii_read(MII_REG_SR), mii_parse_sr },
> -		{ mk_mii_end, }
> -	};
> -static phy_cmd_t const phy_cmd_lxt971_ack_int[] = {
> -		/* acknowledge the int before reading status ! */
> -		{ mk_mii_read(MII_LXT971_ISR), NULL },
> -		/* find out the current status */
> -		{ mk_mii_read(MII_REG_SR), mii_parse_sr },
> -		{ mk_mii_read(MII_LXT971_SR2), mii_parse_lxt971_sr2 },
> -		{ mk_mii_end, }
> -	};
> -static phy_cmd_t const phy_cmd_lxt971_shutdown[] = { /* disable interrupts */
> -		{ mk_mii_write(MII_LXT971_IER, 0x0000), NULL },
> -		{ mk_mii_end, }
> -	};
> -static phy_info_t const phy_info_lxt971 = {
> -	.id = 0x0001378e,
> -	.name = "LXT971",
> -	.config = phy_cmd_lxt971_config,
> -	.startup = phy_cmd_lxt971_startup,
> -	.ack_int = phy_cmd_lxt971_ack_int,
> -	.shutdown = phy_cmd_lxt971_shutdown
> -};
> -
> -/* ------------------------------------------------------------------------- */
> -/* The Quality Semiconductor QS6612 is used on the RPX CLLF                  */
> -
> -/* register definitions */
> -
> -#define MII_QS6612_MCR       17  /* Mode Control Register      */
> -#define MII_QS6612_FTR       27  /* Factory Test Register      */
> -#define MII_QS6612_MCO       28  /* Misc. Control Register     */
> -#define MII_QS6612_ISR       29  /* Interrupt Source Register  */
> -#define MII_QS6612_IMR       30  /* Interrupt Mask Register    */
> -#define MII_QS6612_PCR       31  /* 100BaseTx PHY Control Reg. */
> -
> -static void mii_parse_qs6612_pcr(uint mii_reg, struct net_device *dev)
> +static int fec_enet_mdio_write(struct mii_bus *bus, int mii_id, int regnum,
> +			   u16 value)
>  {
> -	struct fec_enet_private *fep = netdev_priv(dev);
> -	volatile uint *s = &(fep->phy_status);
> -	uint status;
> +	struct fec_enet_private *fep = bus->priv;
> +	int timeout = FEC_MII_TIMEOUT;
>  
> -	status = *s & ~(PHY_STAT_SPMASK);
> +	fep->mii_timeout = 0;
>  
> -	switch((mii_reg >> 2) & 7) {
> -	case 1: status |= PHY_STAT_10HDX; break;
> -	case 2: status |= PHY_STAT_100HDX; break;
> -	case 5: status |= PHY_STAT_10FDX; break;
> -	case 6: status |= PHY_STAT_100FDX; break;
> -}
> -
> -	*s = status;
> -}
> -
> -static phy_cmd_t const phy_cmd_qs6612_config[] = {
> -		/* The PHY powers up isolated on the RPX,
> -		 * so send a command to allow operation.
> -		 */
> -		{ mk_mii_write(MII_QS6612_PCR, 0x0dc0), NULL },
> -
> -		/* parse cr and anar to get some info */
> -		{ mk_mii_read(MII_REG_CR), mii_parse_cr },
> -		{ mk_mii_read(MII_REG_ANAR), mii_parse_anar },
> -		{ mk_mii_end, }
> -	};
> -static phy_cmd_t const phy_cmd_qs6612_startup[] = {  /* enable interrupts */
> -		{ mk_mii_write(MII_QS6612_IMR, 0x003a), NULL },
> -		{ mk_mii_write(MII_REG_CR, 0x1200), NULL }, /* autonegotiate */
> -		{ mk_mii_end, }
> -	};
> -static phy_cmd_t const phy_cmd_qs6612_ack_int[] = {
> -		/* we need to read ISR, SR and ANER to acknowledge */
> -		{ mk_mii_read(MII_QS6612_ISR), NULL },
> -		{ mk_mii_read(MII_REG_SR), mii_parse_sr },
> -		{ mk_mii_read(MII_REG_ANER), NULL },
> -
> -		/* read pcr to get info */
> -		{ mk_mii_read(MII_QS6612_PCR), mii_parse_qs6612_pcr },
> -		{ mk_mii_end, }
> -	};
> -static phy_cmd_t const phy_cmd_qs6612_shutdown[] = { /* disable interrupts */
> -		{ mk_mii_write(MII_QS6612_IMR, 0x0000), NULL },
> -		{ mk_mii_end, }
> -	};
> -static phy_info_t const phy_info_qs6612 = {
> -	.id = 0x00181440,
> -	.name = "QS6612",
> -	.config = phy_cmd_qs6612_config,
> -	.startup = phy_cmd_qs6612_startup,
> -	.ack_int = phy_cmd_qs6612_ack_int,
> -	.shutdown = phy_cmd_qs6612_shutdown
> -};
> -
> -/* ------------------------------------------------------------------------- */
> -/* AMD AM79C874 phy                                                          */
> +	/* clear MII end of transfer bit*/
> +	writel(FEC_ENET_MII, fep->hwp + FEC_IEVENT);
>  
> -/* register definitions for the 874 */
> +	/* start a read op */
> +	writel(FEC_MMFR_ST | FEC_MMFR_OP_READ |
> +		FEC_MMFR_PA(mii_id) | FEC_MMFR_RA(regnum) |
> +		FEC_MMFR_TA | FEC_MMFR_DATA(value),
> +		fep->hwp + FEC_MII_DATA);
> +
> +	/* wait for end of transfer */
> +	while (!(readl(fep->hwp + FEC_IEVENT) & FEC_ENET_MII)) {
> +		cpu_relax();
> +		if (timeout-- < 0) {
> +			fep->mii_timeout = 1;
> +			printk(KERN_ERR "FEC: MDIO write timeout\n");
> +			return -ETIMEDOUT;
> +		}
> +	}
>  
> -#define MII_AM79C874_MFR       16  /* Miscellaneous Feature Register */
> -#define MII_AM79C874_ICSR      17  /* Interrupt/Status Register      */
> -#define MII_AM79C874_DR        18  /* Diagnostic Register            */
> -#define MII_AM79C874_PMLR      19  /* Power and Loopback Register    */
> -#define MII_AM79C874_MCR       21  /* ModeControl Register           */
> -#define MII_AM79C874_DC        23  /* Disconnect Counter             */
> -#define MII_AM79C874_REC       24  /* Recieve Error Counter          */
> +	return 0;
> +}
>  
> -static void mii_parse_am79c874_dr(uint mii_reg, struct net_device *dev)
> +static int fec_enet_mdio_reset(struct mii_bus *bus)
>  {
> -	struct fec_enet_private *fep = netdev_priv(dev);
> -	volatile uint *s = &(fep->phy_status);
> -	uint status;
> -
> -	status = *s & ~(PHY_STAT_SPMASK | PHY_STAT_ANC);
> -
> -	if (mii_reg & 0x0080)
> -		status |= PHY_STAT_ANC;
> -	if (mii_reg & 0x0400)
> -		status |= ((mii_reg & 0x0800) ? PHY_STAT_100FDX : PHY_STAT_100HDX);
> -	else
> -		status |= ((mii_reg & 0x0800) ? PHY_STAT_10FDX : PHY_STAT_10HDX);
> -
> -	*s = status;
> +	return 0;
>  }
>  
> -static phy_cmd_t const phy_cmd_am79c874_config[] = {
> -		{ mk_mii_read(MII_REG_CR), mii_parse_cr },
> -		{ mk_mii_read(MII_REG_ANAR), mii_parse_anar },
> -		{ mk_mii_read(MII_AM79C874_DR), mii_parse_am79c874_dr },
> -		{ mk_mii_end, }
> -	};
> -static phy_cmd_t const phy_cmd_am79c874_startup[] = {  /* enable interrupts */
> -		{ mk_mii_write(MII_AM79C874_ICSR, 0xff00), NULL },
> -		{ mk_mii_write(MII_REG_CR, 0x1200), NULL }, /* autonegotiate */
> -		{ mk_mii_read(MII_REG_SR), mii_parse_sr },
> -		{ mk_mii_end, }
> -	};
> -static phy_cmd_t const phy_cmd_am79c874_ack_int[] = {
> -		/* find out the current status */
> -		{ mk_mii_read(MII_REG_SR), mii_parse_sr },
> -		{ mk_mii_read(MII_AM79C874_DR), mii_parse_am79c874_dr },
> -		/* we only need to read ISR to acknowledge */
> -		{ mk_mii_read(MII_AM79C874_ICSR), NULL },
> -		{ mk_mii_end, }
> -	};
> -static phy_cmd_t const phy_cmd_am79c874_shutdown[] = { /* disable interrupts */
> -		{ mk_mii_write(MII_AM79C874_ICSR, 0x0000), NULL },
> -		{ mk_mii_end, }
> -	};
> -static phy_info_t const phy_info_am79c874 = {
> -	.id = 0x00022561,
> -	.name = "AM79C874",
> -	.config = phy_cmd_am79c874_config,
> -	.startup = phy_cmd_am79c874_startup,
> -	.ack_int = phy_cmd_am79c874_ack_int,
> -	.shutdown = phy_cmd_am79c874_shutdown
> -};
> -
> -
> -/* ------------------------------------------------------------------------- */
> -/* Kendin KS8721BL phy                                                       */
> -
> -/* register definitions for the 8721 */
> -
> -#define MII_KS8721BL_RXERCR	21
> -#define MII_KS8721BL_ICSR	27
> -#define	MII_KS8721BL_PHYCR	31
> -
> -static phy_cmd_t const phy_cmd_ks8721bl_config[] = {
> -		{ mk_mii_read(MII_REG_CR), mii_parse_cr },
> -		{ mk_mii_read(MII_REG_ANAR), mii_parse_anar },
> -		{ mk_mii_end, }
> -	};
> -static phy_cmd_t const phy_cmd_ks8721bl_startup[] = {  /* enable interrupts */
> -		{ mk_mii_write(MII_KS8721BL_ICSR, 0xff00), NULL },
> -		{ mk_mii_write(MII_REG_CR, 0x1200), NULL }, /* autonegotiate */
> -		{ mk_mii_read(MII_REG_SR), mii_parse_sr },
> -		{ mk_mii_end, }
> -	};
> -static phy_cmd_t const phy_cmd_ks8721bl_ack_int[] = {
> -		/* find out the current status */
> -		{ mk_mii_read(MII_REG_SR), mii_parse_sr },
> -		/* we only need to read ISR to acknowledge */
> -		{ mk_mii_read(MII_KS8721BL_ICSR), NULL },
> -		{ mk_mii_end, }
> -	};
> -static phy_cmd_t const phy_cmd_ks8721bl_shutdown[] = { /* disable interrupts */
> -		{ mk_mii_write(MII_KS8721BL_ICSR, 0x0000), NULL },
> -		{ mk_mii_end, }
> -	};
> -static phy_info_t const phy_info_ks8721bl = {
> -	.id = 0x00022161,
> -	.name = "KS8721BL",
> -	.config = phy_cmd_ks8721bl_config,
> -	.startup = phy_cmd_ks8721bl_startup,
> -	.ack_int = phy_cmd_ks8721bl_ack_int,
> -	.shutdown = phy_cmd_ks8721bl_shutdown
> -};
> -
> -/* ------------------------------------------------------------------------- */
> -/* register definitions for the DP83848 */
> -
> -#define MII_DP8384X_PHYSTST    16  /* PHY Status Register */
> -
> -static void mii_parse_dp8384x_sr2(uint mii_reg, struct net_device *dev)
> +static int fec_enet_mii_probe(struct net_device *dev)
>  {
>  	struct fec_enet_private *fep = netdev_priv(dev);
> -	volatile uint *s = &(fep->phy_status);
> -
> -	*s &= ~(PHY_STAT_SPMASK | PHY_STAT_LINK | PHY_STAT_ANC);
> -
> -	/* Link up */
> -	if (mii_reg & 0x0001) {
> -		fep->link = 1;
> -		*s |= PHY_STAT_LINK;
> -	} else
> -		fep->link = 0;
> -	/* Status of link */
> -	if (mii_reg & 0x0010)   /* Autonegotioation complete */
> -		*s |= PHY_STAT_ANC;
> -	if (mii_reg & 0x0002) {   /* 10MBps? */
> -		if (mii_reg & 0x0004)   /* Full Duplex? */
> -			*s |= PHY_STAT_10FDX;
> -		else
> -			*s |= PHY_STAT_10HDX;
> -	} else {                  /* 100 Mbps? */
> -		if (mii_reg & 0x0004)   /* Full Duplex? */
> -			*s |= PHY_STAT_100FDX;
> -		else
> -			*s |= PHY_STAT_100HDX;
> -	}
> -	if (mii_reg & 0x0008)
> -		*s |= PHY_STAT_FAULT;
> -}
> -
> -static phy_info_t phy_info_dp83848= {
> -	0x020005c9,
> -	"DP83848",
> +	struct phy_device *phy_dev = NULL;
> +	int phy_addr;
>  
> -	(const phy_cmd_t []) {  /* config */
> -		{ mk_mii_read(MII_REG_CR), mii_parse_cr },
> -		{ mk_mii_read(MII_REG_ANAR), mii_parse_anar },
> -		{ mk_mii_read(MII_DP8384X_PHYSTST), mii_parse_dp8384x_sr2 },
> -		{ mk_mii_end, }
> -	},
> -	(const phy_cmd_t []) {  /* startup - enable interrupts */
> -		{ mk_mii_write(MII_REG_CR, 0x1200), NULL }, /* autonegotiate */
> -		{ mk_mii_read(MII_REG_SR), mii_parse_sr },
> -		{ mk_mii_end, }
> -	},
> -	(const phy_cmd_t []) { /* ack_int - never happens, no interrupt */
> -		{ mk_mii_end, }
> -	},
> -	(const phy_cmd_t []) {  /* shutdown */
> -		{ mk_mii_end, }
> -	},
> -};
> +	/* find the first phy */
> +	for (phy_addr = 0; phy_addr < PHY_MAX_ADDR; phy_addr++) {
> +		if (fep->mii_bus->phy_map[phy_addr]) {
> +			phy_dev = fep->mii_bus->phy_map[phy_addr];
> +			break;
> +		}
> +	}
>  
> -static phy_info_t phy_info_lan8700 = {
> -	0x0007C0C,
> -	"LAN8700",
> -	(const phy_cmd_t []) { /* config */
> -		{ mk_mii_read(MII_REG_CR), mii_parse_cr },
> -		{ mk_mii_read(MII_REG_ANAR), mii_parse_anar },
> -		{ mk_mii_end, }
> -	},
> -	(const phy_cmd_t []) { /* startup */
> -		{ mk_mii_write(MII_REG_CR, 0x1200), NULL }, /* autonegotiate */
> -		{ mk_mii_read(MII_REG_SR), mii_parse_sr },
> -		{ mk_mii_end, }
> -	},
> -	(const phy_cmd_t []) { /* act_int */
> -		{ mk_mii_end, }
> -	},
> -	(const phy_cmd_t []) { /* shutdown */
> -		{ mk_mii_end, }
> -	},
> -};
> -/* ------------------------------------------------------------------------- */
> +	if (!phy_dev) {
> +		printk(KERN_ERR "%s: no PHY found\n", dev->name);
> +		return -ENODEV;
> +	}
>  
> -static phy_info_t const * const phy_info[] = {
> -	&phy_info_lxt970,
> -	&phy_info_lxt971,
> -	&phy_info_qs6612,
> -	&phy_info_am79c874,
> -	&phy_info_ks8721bl,
> -	&phy_info_dp83848,
> -	&phy_info_lan8700,
> -	NULL
> -};
> +	/* attach the mac to the phy */
> +	phy_dev = phy_connect(dev, dev_name(&phy_dev->dev),
> +			     &fec_enet_adjust_link, 0,
> +			     PHY_INTERFACE_MODE_MII);
> +	if (IS_ERR(phy_dev)) {
> +		printk(KERN_ERR "%s: Could not attach to PHY\n", dev->name);
> +		return PTR_ERR(phy_dev);
> +	}
>  
> -/* ------------------------------------------------------------------------- */
> -#ifdef HAVE_mii_link_interrupt
> -static irqreturn_t
> -mii_link_interrupt(int irq, void * dev_id);
> +	/* mask with MAC supported features */
> +	phy_dev->supported &= PHY_BASIC_FEATURES;
> +	phy_dev->advertising = phy_dev->supported;
>  
> -/*
> - *	This is specific to the MII interrupt setup of the M5272EVB.
> - */
> -static void __inline__ fec_request_mii_intr(struct net_device *dev)
> -{
> -	if (request_irq(66, mii_link_interrupt, IRQF_DISABLED, "fec(MII)", dev) != 0)
> -		printk("FEC: Could not allocate fec(MII) IRQ(66)!\n");
> -}
> +	fep->phy_dev = phy_dev;
> +	fep->link = 0;
> +	fep->full_duplex = 0;
>  
> -static void __inline__ fec_disable_phy_intr(struct net_device *dev)
> -{
> -	free_irq(66, dev);
> +	return 0;
>  }
> -#endif
>  
> -#ifdef CONFIG_M5272
> -static void __inline__ fec_get_mac(struct net_device *dev)
> +static int fec_enet_mii_init(struct platform_device *pdev)
>  {
> +	struct net_device *dev = platform_get_drvdata(pdev);
>  	struct fec_enet_private *fep = netdev_priv(dev);
> -	unsigned char *iap, tmpaddr[ETH_ALEN];
> +	int err = -ENXIO, i;
>  
> -	if (FEC_FLASHMAC) {
> -		/*
> -		 * Get MAC address from FLASH.
> -		 * If it is all 1's or 0's, use the default.
> -		 */
> -		iap = (unsigned char *)FEC_FLASHMAC;
> -		if ((iap[0] == 0) && (iap[1] == 0) && (iap[2] == 0) &&
> -		    (iap[3] == 0) && (iap[4] == 0) && (iap[5] == 0))
> -			iap = fec_mac_default;
> -		if ((iap[0] == 0xff) && (iap[1] == 0xff) && (iap[2] == 0xff) &&
> -		    (iap[3] == 0xff) && (iap[4] == 0xff) && (iap[5] == 0xff))
> -			iap = fec_mac_default;
> -	} else {
> -		*((unsigned long *) &tmpaddr[0]) = readl(fep->hwp + FEC_ADDR_LOW);
> -		*((unsigned short *) &tmpaddr[4]) = (readl(fep->hwp + FEC_ADDR_HIGH) >> 16);
> -		iap = &tmpaddr[0];
> -	}
> -
> -	memcpy(dev->dev_addr, iap, ETH_ALEN);
> -
> -	/* Adjust MAC if using default MAC address */
> -	if (iap == fec_mac_default)
> -		 dev->dev_addr[ETH_ALEN-1] = fec_mac_default[ETH_ALEN-1] + fep->index;
> -}
> -#endif
> +	fep->mii_timeout = 0;
>  
> -/* ------------------------------------------------------------------------- */
> -
> -static void mii_display_status(struct net_device *dev)
> -{
> -	struct fec_enet_private *fep = netdev_priv(dev);
> -	volatile uint *s = &(fep->phy_status);
> +	/*
> +	 * Set MII speed to 2.5 MHz (= clk_get_rate() / 2 * phy_speed)
> +	 */
> +	fep->phy_speed = DIV_ROUND_UP(clk_get_rate(fep->clk), 5000000) << 1;
> +	writel(fep->phy_speed, fep->hwp + FEC_MII_SPEED);
>  
> -	if (!fep->link && !fep->old_link) {
> -		/* Link is still down - don't print anything */
> -		return;
> +	fep->mii_bus = mdiobus_alloc();
> +	if (fep->mii_bus == NULL) {
> +		err = -ENOMEM;
> +		goto err_out;
>  	}
>  
> -	printk("%s: status: ", dev->name);
> -
> -	if (!fep->link) {
> -		printk("link down");
> -	} else {
> -		printk("link up");
> -
> -		switch(*s & PHY_STAT_SPMASK) {
> -		case PHY_STAT_100FDX: printk(", 100MBit Full Duplex"); break;
> -		case PHY_STAT_100HDX: printk(", 100MBit Half Duplex"); break;
> -		case PHY_STAT_10FDX: printk(", 10MBit Full Duplex"); break;
> -		case PHY_STAT_10HDX: printk(", 10MBit Half Duplex"); break;
> -		default:
> -			printk(", Unknown speed/duplex");
> -		}
> -
> -		if (*s & PHY_STAT_ANC)
> -			printk(", auto-negotiation complete");
> +	fep->mii_bus->name = "fec_enet_mii_bus";
> +	fep->mii_bus->read = fec_enet_mdio_read;
> +	fep->mii_bus->write = fec_enet_mdio_write;
> +	fep->mii_bus->reset = fec_enet_mdio_reset;
> +	snprintf(fep->mii_bus->id, MII_BUS_ID_SIZE, "%x", pdev->id);
> +	fep->mii_bus->priv = fep;
> +	fep->mii_bus->parent = &pdev->dev;
> +
> +	fep->mii_bus->irq = kmalloc(sizeof(int) * PHY_MAX_ADDR, GFP_KERNEL);
> +	if (!fep->mii_bus->irq) {
> +		err = -ENOMEM;
> +		goto err_out_free_mdiobus;
>  	}
>  
> -	if (*s & PHY_STAT_FAULT)
> -		printk(", remote fault");
> -
> -	printk(".\n");
> -}
> -
> -static void mii_display_config(struct work_struct *work)
> -{
> -	struct fec_enet_private *fep = container_of(work, struct fec_enet_private, phy_task);
> -	struct net_device *dev = fep->netdev;
> -	uint status = fep->phy_status;
> +	for (i = 0; i < PHY_MAX_ADDR; i++)
> +		fep->mii_bus->irq[i] = PHY_POLL;
>  
> -	/*
> -	** When we get here, phy_task is already removed from
> -	** the workqueue.  It is thus safe to allow to reuse it.
> -	*/
> -	fep->mii_phy_task_queued = 0;
> -	printk("%s: config: auto-negotiation ", dev->name);
> -
> -	if (status & PHY_CONF_ANE)
> -		printk("on");
> -	else
> -		printk("off");
> +	platform_set_drvdata(dev, fep->mii_bus);
>  
> -	if (status & PHY_CONF_100FDX)
> -		printk(", 100FDX");
> -	if (status & PHY_CONF_100HDX)
> -		printk(", 100HDX");
> -	if (status & PHY_CONF_10FDX)
> -		printk(", 10FDX");
> -	if (status & PHY_CONF_10HDX)
> -		printk(", 10HDX");
> -	if (!(status & PHY_CONF_SPMASK))
> -		printk(", No speed/duplex selected?");
> +	if (mdiobus_register(fep->mii_bus))
> +		goto err_out_free_mdio_irq;
>  
> -	if (status & PHY_CONF_LOOP)
> -		printk(", loopback enabled");
> +	if (fec_enet_mii_probe(dev) != 0)
> +		goto err_out_unregister_bus;
>  
> -	printk(".\n");
> +	return 0;
>  
> -	fep->sequence_done = 1;
> +err_out_unregister_bus:
> +	mdiobus_unregister(fep->mii_bus);
> +err_out_free_mdio_irq:
> +	kfree(fep->mii_bus->irq);
> +err_out_free_mdiobus:
> +	mdiobus_free(fep->mii_bus);
> +err_out:
> +	return err;
>  }
>  
> -static void mii_relink(struct work_struct *work)
> +static void fec_enet_mii_remove(struct fec_enet_private *fep)
>  {
> -	struct fec_enet_private *fep = container_of(work, struct fec_enet_private, phy_task);
> -	struct net_device *dev = fep->netdev;
> -	int duplex;
> -
> -	/*
> -	** When we get here, phy_task is already removed from
> -	** the workqueue.  It is thus safe to allow to reuse it.
> -	*/
> -	fep->mii_phy_task_queued = 0;
> -	fep->link = (fep->phy_status & PHY_STAT_LINK) ? 1 : 0;
> -	mii_display_status(dev);
> -	fep->old_link = fep->link;
> -
> -	if (fep->link) {
> -		duplex = 0;
> -		if (fep->phy_status
> -		    & (PHY_STAT_100FDX | PHY_STAT_10FDX))
> -			duplex = 1;
> -		fec_restart(dev, duplex);
> -	} else
> -		fec_stop(dev);
> +	if (fep->phy_dev)
> +		phy_disconnect(fep->phy_dev);
> +	mdiobus_unregister(fep->mii_bus);
> +	kfree(fep->mii_bus->irq);
> +	mdiobus_free(fep->mii_bus);
>  }
>  
> -/* mii_queue_relink is called in interrupt context from mii_link_interrupt */
> -static void mii_queue_relink(uint mii_reg, struct net_device *dev)
> +static int fec_enet_get_settings(struct net_device *dev,
> +				  struct ethtool_cmd *cmd)
>  {
>  	struct fec_enet_private *fep = netdev_priv(dev);
> +	struct phy_device *phydev = fep->phy_dev;
>  
> -	/*
> -	 * We cannot queue phy_task twice in the workqueue.  It
> -	 * would cause an endless loop in the workqueue.
> -	 * Fortunately, if the last mii_relink entry has not yet been
> -	 * executed now, it will do the job for the current interrupt,
> -	 * which is just what we want.
> -	 */
> -	if (fep->mii_phy_task_queued)
> -		return;
> +	if (!phydev)
> +		return -ENODEV;
>  
> -	fep->mii_phy_task_queued = 1;
> -	INIT_WORK(&fep->phy_task, mii_relink);
> -	schedule_work(&fep->phy_task);
> +	return phy_ethtool_gset(phydev, cmd);
>  }
>  
> -/* mii_queue_config is called in interrupt context from fec_enet_mii */
> -static void mii_queue_config(uint mii_reg, struct net_device *dev)
> +static int fec_enet_set_settings(struct net_device *dev,
> +				 struct ethtool_cmd *cmd)
>  {
>  	struct fec_enet_private *fep = netdev_priv(dev);
> +	struct phy_device *phydev = fep->phy_dev;
>  
> -	if (fep->mii_phy_task_queued)
> -		return;
> +	if (!phydev)
> +		return -ENODEV;
>  
> -	fep->mii_phy_task_queued = 1;
> -	INIT_WORK(&fep->phy_task, mii_display_config);
> -	schedule_work(&fep->phy_task);
> +	return phy_ethtool_sset(phydev, cmd);
>  }
>  
> -phy_cmd_t const phy_cmd_relink[] = {
> -	{ mk_mii_read(MII_REG_CR), mii_queue_relink },
> -	{ mk_mii_end, }
> -	};
> -phy_cmd_t const phy_cmd_config[] = {
> -	{ mk_mii_read(MII_REG_CR), mii_queue_config },
> -	{ mk_mii_end, }
> -	};
> -
> -/* Read remainder of PHY ID. */
> -static void
> -mii_discover_phy3(uint mii_reg, struct net_device *dev)
> +static void fec_enet_get_drvinfo(struct net_device *dev,
> +				 struct ethtool_drvinfo *info)
>  {
> -	struct fec_enet_private *fep;
> -	int i;
> -
> -	fep = netdev_priv(dev);
> -	fep->phy_id |= (mii_reg & 0xffff);
> -	printk("fec: PHY @ 0x%x, ID 0x%08x", fep->phy_addr, fep->phy_id);
> -
> -	for(i = 0; phy_info[i]; i++) {
> -		if(phy_info[i]->id == (fep->phy_id >> 4))
> -			break;
> -	}
> -
> -	if (phy_info[i])
> -		printk(" -- %s\n", phy_info[i]->name);
> -	else
> -		printk(" -- unknown PHY!\n");
> +	struct fec_enet_private *fep = netdev_priv(dev);
>  
> -	fep->phy = phy_info[i];
> -	fep->phy_id_done = 1;
> +	strcpy(info->driver, fep->pdev->dev.driver->name);
> +	strcpy(info->version, "Revision: 1.0");
> +	strcpy(info->bus_info, dev_name(&dev->dev));
>  }
>  
> -/* Scan all of the MII PHY addresses looking for someone to respond
> - * with a valid ID.  This usually happens quickly.
> - */
> -static void
> -mii_discover_phy(uint mii_reg, struct net_device *dev)
> -{
> -	struct fec_enet_private *fep;
> -	uint phytype;
> -
> -	fep = netdev_priv(dev);
> -
> -	if (fep->phy_addr < 32) {
> -		if ((phytype = (mii_reg & 0xffff)) != 0xffff && phytype != 0) {
> -
> -			/* Got first part of ID, now get remainder */
> -			fep->phy_id = phytype << 16;
> -			mii_queue_unlocked(dev, mk_mii_read(MII_REG_PHYIR2),
> -							mii_discover_phy3);
> -		} else {
> -			fep->phy_addr++;
> -			mii_queue_unlocked(dev, mk_mii_read(MII_REG_PHYIR1),
> -							mii_discover_phy);
> -		}
> -	} else {
> -		printk("FEC: No PHY device found.\n");
> -		/* Disable external MII interface */
> -		writel(0, fep->hwp + FEC_MII_SPEED);
> -		fep->phy_speed = 0;
> -#ifdef HAVE_mii_link_interrupt
> -		fec_disable_phy_intr(dev);
> -#endif
> -	}
> -}
> +static struct ethtool_ops fec_enet_ethtool_ops = {
> +	.get_settings		= fec_enet_get_settings,
> +	.set_settings		= fec_enet_set_settings,
> +	.get_drvinfo		= fec_enet_get_drvinfo,
> +	.get_link		= ethtool_op_get_link,
> +};
>  
> -/* This interrupt occurs when the PHY detects a link change */
> -#ifdef HAVE_mii_link_interrupt
> -static irqreturn_t
> -mii_link_interrupt(int irq, void * dev_id)
> +static int fec_enet_ioctl(struct net_device *dev, struct ifreq *rq, int cmd)
>  {
> -	struct	net_device *dev = dev_id;
>  	struct fec_enet_private *fep = netdev_priv(dev);
> +	struct phy_device *phydev = fep->phy_dev;
> +
> +	if (!netif_running(dev))
> +		return -EINVAL;
>  
> -	mii_do_cmd(dev, fep->phy->ack_int);
> -	mii_do_cmd(dev, phy_cmd_relink);  /* restart and display status */
> +	if (!phydev)
> +		return -ENODEV;
>  
> -	return IRQ_HANDLED;
> +	return phy_mii_ioctl(phydev, if_mii(rq), cmd);
>  }
> -#endif
>  
>  static void fec_enet_free_buffers(struct net_device *dev)
>  {
> @@ -1509,35 +915,8 @@ fec_enet_open(struct net_device *dev)
>  	if (ret)
>  		return ret;
>  
> -	fep->sequence_done = 0;
> -	fep->link = 0;
> -
> -	fec_restart(dev, 1);
> -
> -	if (fep->phy) {
> -		mii_do_cmd(dev, fep->phy->ack_int);
> -		mii_do_cmd(dev, fep->phy->config);
> -		mii_do_cmd(dev, phy_cmd_config);  /* display configuration */
> -
> -		/* Poll until the PHY tells us its configuration
> -		 * (not link state).
> -		 * Request is initiated by mii_do_cmd above, but answer
> -		 * comes by interrupt.
> -		 * This should take about 25 usec per register at 2.5 MHz,
> -		 * and we read approximately 5 registers.
> -		 */
> -		while(!fep->sequence_done)
> -			schedule();
> -
> -		mii_do_cmd(dev, fep->phy->startup);
> -	}
> -
> -	/* Set the initial link state to true. A lot of hardware
> -	 * based on this device does not implement a PHY interrupt,
> -	 * so we are never notified of link change.
> -	 */
> -	fep->link = 1;
> -
> +	/* schedule a link state check */
> +	phy_start(fep->phy_dev);
>  	netif_start_queue(dev);
>  	fep->opened = 1;
>  	return 0;
> @@ -1550,6 +929,7 @@ fec_enet_close(struct net_device *dev)
>  
>  	/* Don't know what to do yet. */
>  	fep->opened = 0;
> +	phy_stop(fep->phy_dev);
>  	netif_stop_queue(dev);
>  	fec_stop(dev);
>  
> @@ -1666,6 +1046,7 @@ static const struct net_device_ops fec_netdev_ops = {
>  	.ndo_validate_addr	= eth_validate_addr,
>  	.ndo_tx_timeout		= fec_timeout,
>  	.ndo_set_mac_address	= fec_set_mac_address,
> +	.ndo_do_ioctl           = fec_enet_ioctl,
>  };
>  
>   /*
> @@ -1689,7 +1070,6 @@ static int fec_enet_init(struct net_device *dev, int index)
>  	}
>  
>  	spin_lock_init(&fep->hw_lock);
> -	spin_lock_init(&fep->mii_lock);
>  
>  	fep->index = index;
>  	fep->hwp = (void __iomem *)dev->base_addr;
> @@ -1716,20 +1096,10 @@ static int fec_enet_init(struct net_device *dev, int index)
>  	fep->rx_bd_base = cbd_base;
>  	fep->tx_bd_base = cbd_base + RX_RING_SIZE;
>  
> -#ifdef HAVE_mii_link_interrupt
> -	fec_request_mii_intr(dev);
> -#endif
>  	/* The FEC Ethernet specific entries in the device structure */
>  	dev->watchdog_timeo = TX_TIMEOUT;
>  	dev->netdev_ops = &fec_netdev_ops;
> -
> -	for (i=0; i<NMII-1; i++)
> -		mii_cmds[i].mii_next = &mii_cmds[i+1];
> -	mii_free = mii_cmds;
> -
> -	/* Set MII speed to 2.5 MHz */
> -	fep->phy_speed = ((((clk_get_rate(fep->clk) / 2 + 4999999)
> -					/ 2500000) / 2) & 0x3F) << 1;
> +	dev->ethtool_ops = &fec_enet_ethtool_ops;
>  
>  	/* Initialize the receive buffer descriptors. */
>  	bdp = fep->rx_bd_base;
> @@ -1760,13 +1130,6 @@ static int fec_enet_init(struct net_device *dev, int index)
>  
>  	fec_restart(dev, 0);
>  
> -	/* Queue up command to detect the PHY and initialize the
> -	 * remainder of the interface.
> -	 */
> -	fep->phy_id_done = 0;
> -	fep->phy_addr = 0;
> -	mii_queue(dev, mk_mii_read(MII_REG_PHYIR1), mii_discover_phy);
> -
>  	return 0;
>  }
>  
> @@ -1835,8 +1198,7 @@ fec_restart(struct net_device *dev, int duplex)
>  	writel(0, fep->hwp + FEC_R_DES_ACTIVE);
>  
>  	/* Enable interrupts we wish to service */
> -	writel(FEC_ENET_TXF | FEC_ENET_RXF | FEC_ENET_MII,
> -			fep->hwp + FEC_IMASK);
> +	writel(FEC_ENET_TXF | FEC_ENET_RXF, fep->hwp + FEC_IMASK);
>  }
>  
>  static void
> @@ -1859,7 +1221,6 @@ fec_stop(struct net_device *dev)
>  	/* Clear outstanding MII command interrupts. */
>  	writel(FEC_ENET_MII, fep->hwp + FEC_IEVENT);
>  
> -	writel(FEC_ENET_MII, fep->hwp + FEC_IMASK);
>  	writel(fep->phy_speed, fep->hwp + FEC_MII_SPEED);
>  }
>  
> @@ -1891,6 +1252,7 @@ fec_probe(struct platform_device *pdev)
>  	memset(fep, 0, sizeof(*fep));
>  
>  	ndev->base_addr = (unsigned long)ioremap(r->start, resource_size(r));
> +	fep->pdev = pdev;
>  
>  	if (!ndev->base_addr) {
>  		ret = -ENOMEM;
> @@ -1926,13 +1288,24 @@ fec_probe(struct platform_device *pdev)
>  	if (ret)
>  		goto failed_init;
>  
> +	ret = fec_enet_mii_init(pdev);
> +	if (ret)
> +		goto failed_mii_init;
> +
>  	ret = register_netdev(ndev);
>  	if (ret)
>  		goto failed_register;
>  
> +	printk(KERN_INFO "%s: Freescale FEC PHY driver [%s] "
> +		"(mii_bus:phy_addr=%s, irq=%d)\n", ndev->name,
> +		fep->phy_dev->drv->name, dev_name(&fep->phy_dev->dev),
> +		fep->phy_dev->irq);
> +
>  	return 0;
>  
>  failed_register:
> +	fec_enet_mii_remove(fep);
> +failed_mii_init:
>  failed_init:
>  	clk_disable(fep->clk);
>  	clk_put(fep->clk);
> @@ -1959,6 +1332,7 @@ fec_drv_remove(struct platform_device *pdev)
>  	platform_set_drvdata(pdev, NULL);
>  
>  	fec_stop(ndev);
> +	fec_enet_mii_remove(fep);
>  	clk_disable(fep->clk);
>  	clk_put(fep->clk);
>  	iounmap((void __iomem *)ndev->base_addr);
> -- 
> 1.7.0.1
> 

-- 
----------------------------------------------------------------------
Amit Kucheria, Kernel Engineer || amit.kucheria@canonical.com
----------------------------------------------------------------------

^ permalink raw reply

* Re: [r8169] WARNING: at net/sched/sch_generic.c
From: Sergey Senozhatsky @ 2010-03-31 12:14 UTC (permalink / raw)
  To: Neil Horman
  Cc: netdev, Francois Romieu, Eric Dumazet, David S. Miller,
	linux-kernel
In-Reply-To: <20100331111942.GB13963@hmsreliant.think-freely.org>

[-- Attachment #1: Type: text/plain, Size: 3536 bytes --]

Hello,

On (03/31/10 07:19), Neil Horman wrote:
> On Wed, Mar 31, 2010 at 01:21:42PM +0300, Sergey Senozhatsky wrote:
> > Hello,
> > I have the following problem:
> > 
> > [  296.337510] ------------[ cut here ]------------
> > [  296.337523] WARNING: at net/sched/sch_generic.c:255 dev_watchdog+0xc1/0x125()
> > [  296.337527] Hardware name: F3JC                
> > [  296.337530] NETDEV WATCHDOG: eth0 (r8169): transmit queue 0 timed out
> > [  296.337533] Modules linked in: pktgen ipv6 snd_hwdep snd_hda_codec_si3054 snd_hda_codec_realtek sdhci_pci sdhci asus_laptop sparse_keymap mmc_core led_class snd_hda_intel
> > snd_hda_codec psmouse snd_pcm snd_timer snd soundcore snd_page_alloc serio_raw i2c_i801 rng_core evdev sg r8169 mii usbhid hid uhci_hcd ehci_hcd sr_mod cdrom sd_mod usbcore
> > ata_piix
> > [  296.337586] Pid: 0, comm: swapper Not tainted 2.6.34-rc3-dbg #74
> > [  296.337589] Call Trace:
> > [  296.337597]  [<c102e71f>] warn_slowpath_common+0x65/0x7c
> > [  296.337603]  [<c126e30c>] ? dev_watchdog+0xc1/0x125
> > [  296.337608]  [<c102e76a>] warn_slowpath_fmt+0x24/0x27
> > [  296.337613]  [<c126e30c>] dev_watchdog+0xc1/0x125
> > [  296.337620]  [<c1040039>] ? prepare_to_wait_exclusive+0x52/0x5b
> > [  296.337627]  [<c1037053>] ? run_timer_softirq+0x120/0x1eb
> > [  296.337632]  [<c10370a9>] run_timer_softirq+0x176/0x1eb
> > [  296.337637]  [<c1037053>] ? run_timer_softirq+0x120/0x1eb
> > [  296.337643]  [<c126e24b>] ? dev_watchdog+0x0/0x125
> > [  296.337650]  [<c10331c9>] __do_softirq+0x8d/0x117
> > [  296.337655]  [<c103327e>] do_softirq+0x2b/0x43
> > [  296.337660]  [<c10333a3>] irq_exit+0x38/0x75
> > [  296.337667]  [<c1015138>] smp_apic_timer_interrupt+0x6d/0x7b
> > [  296.337673]  [<c12cbada>] apic_timer_interrupt+0x36/0x3c
> > [  296.337679]  [<c104007b>] ? prepare_to_wait+0x39/0x57
> > [  296.337685]  [<c11dd835>] ? acpi_idle_enter_simple+0x119/0x144
> > [  296.337692]  [<c124d358>] cpuidle_idle_call+0x6d/0xa5
> > [  296.337697]  [<c1001b51>] cpu_idle+0x92/0xc1
> > [  296.337704]  [<c12c63d0>] start_secondary+0x1f3/0x1fa
> > [  296.337708] ---[ end trace cd4a1b50139837df ]---
> > 
> > 
> > Reproducing 100% with pktgen tests.
> > 
> What kind of packets are you sending with pktgen?


Here is the sh file to run pktgen.

=====
#!/bin/sh

function pgset() {
    local result

    echo $1 > $PGDEV

    result=`cat $PGDEV | fgrep "Result: OK:"`
    if [ "$result" = "" ]; then
         cat $PGDEV | fgrep Result:
    fi
}

function pg() {
    echo inject > $PGDEV
    cat $PGDEV
}

# Config Start Here -----------------------------------------------------------


# thread config
# Each CPU has own thread. Two CPU exammple. We add eth1, eth2 respectivly.

PGDEV=/proc/net/pktgen/kpktgend_0
  echo "Removing all devices"
 pgset "rem_device_all" 
  echo "Adding eth0"
 pgset "add_device eth0" 
  echo "Setting max_before_softirq 10000"
 pgset "max_before_softirq 10000"


# device config
# delay 0 means maximum speed.

CLONE_SKB="clone_skb 500"
# NIC adds 4 bytes CRC
PKT_SIZE="pkt_size 2048"

# COUNT 0 means forever
#COUNT="count 0"
COUNT="count 110110"
DELAY="delay 3"

PGDEV=/proc/net/pktgen/eth0
  echo "Configuring $PGDEV"
 pgset "$COUNT"
 pgset "$CLONE_SKB"
 pgset "$PKT_SIZE"
 pgset "$DELAY"
 pgset "dst ???????"
 pgset "dst_mac ????????"


# Time to run
PGDEV=/proc/net/pktgen/pgctrl

 echo "Running... ctrl^C to stop"
 pgset "start" 
 echo "Done"

====


	Sergey

[-- Attachment #2: Type: application/pgp-signature, Size: 316 bytes --]

^ permalink raw reply

* Re: [PATCH] netdev/fec.c: add phylib supporting to enable carrier detection
From: Bryan Wu @ 2010-03-31 12:11 UTC (permalink / raw)
  To: Wolfram Sang, Sascha Hauer
  Cc: netdev, linux-kernel, kernel-team, gerg, linux-arm-kernel
In-Reply-To: <20100331081004.GB23391@pengutronix.de>

On 03/31/2010 04:10 PM, Wolfram Sang wrote:
>> With rounding this would be:
>>
>> (clk_get_rate() + 4999999) / 5000000
>
> Better use DIV_ROUND_UP(clk_get_rate(), 5000000)?
>
> Regards,
>
>     Wolfram
>

Sascha and Wolfram,

Thanks a lot. I updated the patch to v2 and sent it already.

-- 
Bryan Wu <bryan.wu@canonical.com>
Kernel Developer    +86.138-1617-6545 Mobile
Ubuntu Kernel Team | Hardware Enablement Team
Canonical Ltd.      www.canonical.com
Ubuntu - Linux for human beings | www.ubuntu.com

^ permalink raw reply

* [PATCH] netdev/fec.c: add phylib supporting to enable carrier detection (v2)
From: Bryan Wu @ 2010-03-31 12:10 UTC (permalink / raw)
  To: s.hauer, gerg, amit.kucheria, w.sang
  Cc: netdev, kernel-team, linux-kernel, linux-arm-kernel

BugLink: http://bugs.launchpad.net/bugs/457878

v2:
 - remove duplicated phy_speed caculation
 - fix the phy_speed caculation according to the DataSheet

v1:
 - removed old MII phy control code
 - add phylib supporting
 - add ethtool interface to make user space NetworkManager works

Tested on Freescale i.MX51 Babbage board.

This patch is based on a patch from Frederic Rodo <fred.rodo@gmail.com>

Cc: Frederic Rodo <fred.rodo@gmail.com>
Signed-off-by: Bryan Wu <bryan.wu@canonical.com>
---
 drivers/net/Kconfig |    1 +
 drivers/net/fec.c   | 1128 ++++++++++++---------------------------------------
 2 files changed, 252 insertions(+), 877 deletions(-)

diff --git a/drivers/net/Kconfig b/drivers/net/Kconfig
index 0ba5b8e..41f6a70 100644
--- a/drivers/net/Kconfig
+++ b/drivers/net/Kconfig
@@ -1916,6 +1916,7 @@ config FEC
 	bool "FEC ethernet controller (of ColdFire and some i.MX CPUs)"
 	depends on M523x || M527x || M5272 || M528x || M520x || M532x || \
 		MACH_MX27 || ARCH_MX35 || ARCH_MX25 || ARCH_MX5
+	select PHYLIB
 	help
 	  Say Y here if you want to use the built-in 10/100 Fast ethernet
 	  controller on some Motorola ColdFire and Freescale i.MX processors.
diff --git a/drivers/net/fec.c b/drivers/net/fec.c
index 9f98c1c..848eb19 100644
--- a/drivers/net/fec.c
+++ b/drivers/net/fec.c
@@ -40,6 +40,7 @@
 #include <linux/irq.h>
 #include <linux/clk.h>
 #include <linux/platform_device.h>
+#include <linux/phy.h>
 
 #include <asm/cacheflush.h>
 
@@ -61,7 +62,6 @@
  * Define the fixed address of the FEC hardware.
  */
 #if defined(CONFIG_M5272)
-#define HAVE_mii_link_interrupt
 
 static unsigned char	fec_mac_default[] = {
 	0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
@@ -86,23 +86,6 @@ static unsigned char	fec_mac_default[] = {
 #endif
 #endif /* CONFIG_M5272 */
 
-/* Forward declarations of some structures to support different PHYs */
-
-typedef struct {
-	uint mii_data;
-	void (*funct)(uint mii_reg, struct net_device *dev);
-} phy_cmd_t;
-
-typedef struct {
-	uint id;
-	char *name;
-
-	const phy_cmd_t *config;
-	const phy_cmd_t *startup;
-	const phy_cmd_t *ack_int;
-	const phy_cmd_t *shutdown;
-} phy_info_t;
-
 /* The number of Tx and Rx buffers.  These are allocated from the page
  * pool.  The code may assume these are power of two, so it it best
  * to keep them that size.
@@ -189,29 +172,21 @@ struct fec_enet_private {
 	uint	tx_full;
 	/* hold while accessing the HW like ringbuffer for tx/rx but not MAC */
 	spinlock_t hw_lock;
-	/* hold while accessing the mii_list_t() elements */
-	spinlock_t mii_lock;
-
-	uint	phy_id;
-	uint	phy_id_done;
-	uint	phy_status;
-	uint	phy_speed;
-	phy_info_t const	*phy;
-	struct work_struct phy_task;
 
-	uint	sequence_done;
-	uint	mii_phy_task_queued;
+	struct  platform_device *pdev;
 
-	uint	phy_addr;
+	int	opened;
 
+	/* Phylib and MDIO interface */
+	struct  mii_bus *mii_bus;
+	struct  phy_device *phy_dev;
+	int     mii_timeout;
+	uint    phy_speed;
 	int	index;
-	int	opened;
 	int	link;
-	int	old_link;
 	int	full_duplex;
 };
 
-static void fec_enet_mii(struct net_device *dev);
 static irqreturn_t fec_enet_interrupt(int irq, void * dev_id);
 static void fec_enet_tx(struct net_device *dev);
 static void fec_enet_rx(struct net_device *dev);
@@ -219,67 +194,20 @@ static int fec_enet_close(struct net_device *dev);
 static void fec_restart(struct net_device *dev, int duplex);
 static void fec_stop(struct net_device *dev);
 
+/* FEC MII MMFR bits definition */
+#define FEC_MMFR_ST		(1 << 30)
+#define FEC_MMFR_OP_READ	(2 << 28)
+#define FEC_MMFR_OP_WRITE	(1 << 28)
+#define FEC_MMFR_PA(v)		((v & 0x1f) << 23)
+#define FEC_MMFR_RA(v)		((v & 0x1f) << 18)
+#define FEC_MMFR_TA		(2 << 16)
+#define FEC_MMFR_DATA(v)	(v & 0xffff)
 
-/* MII processing.  We keep this as simple as possible.  Requests are
- * placed on the list (if there is room).  When the request is finished
- * by the MII, an optional function may be called.
- */
-typedef struct mii_list {
-	uint	mii_regval;
-	void	(*mii_func)(uint val, struct net_device *dev);
-	struct	mii_list *mii_next;
-} mii_list_t;
-
-#define		NMII	20
-static mii_list_t	mii_cmds[NMII];
-static mii_list_t	*mii_free;
-static mii_list_t	*mii_head;
-static mii_list_t	*mii_tail;
-
-static int	mii_queue(struct net_device *dev, int request,
-				void (*func)(uint, struct net_device *));
-
-/* Make MII read/write commands for the FEC */
-#define mk_mii_read(REG)	(0x60020000 | ((REG & 0x1f) << 18))
-#define mk_mii_write(REG, VAL)	(0x50020000 | ((REG & 0x1f) << 18) | \
-						(VAL & 0xffff))
-#define mk_mii_end	0
+#define FEC_MII_TIMEOUT		10000
 
 /* Transmitter timeout */
 #define TX_TIMEOUT (2 * HZ)
 
-/* Register definitions for the PHY */
-
-#define MII_REG_CR          0  /* Control Register                         */
-#define MII_REG_SR          1  /* Status Register                          */
-#define MII_REG_PHYIR1      2  /* PHY Identification Register 1            */
-#define MII_REG_PHYIR2      3  /* PHY Identification Register 2            */
-#define MII_REG_ANAR        4  /* A-N Advertisement Register               */
-#define MII_REG_ANLPAR      5  /* A-N Link Partner Ability Register        */
-#define MII_REG_ANER        6  /* A-N Expansion Register                   */
-#define MII_REG_ANNPTR      7  /* A-N Next Page Transmit Register          */
-#define MII_REG_ANLPRNPR    8  /* A-N Link Partner Received Next Page Reg. */
-
-/* values for phy_status */
-
-#define PHY_CONF_ANE	0x0001  /* 1 auto-negotiation enabled */
-#define PHY_CONF_LOOP	0x0002  /* 1 loopback mode enabled */
-#define PHY_CONF_SPMASK	0x00f0  /* mask for speed */
-#define PHY_CONF_10HDX	0x0010  /* 10 Mbit half duplex supported */
-#define PHY_CONF_10FDX	0x0020  /* 10 Mbit full duplex supported */
-#define PHY_CONF_100HDX	0x0040  /* 100 Mbit half duplex supported */
-#define PHY_CONF_100FDX	0x0080  /* 100 Mbit full duplex supported */
-
-#define PHY_STAT_LINK	0x0100  /* 1 up - 0 down */
-#define PHY_STAT_FAULT	0x0200  /* 1 remote fault */
-#define PHY_STAT_ANC	0x0400  /* 1 auto-negotiation complete	*/
-#define PHY_STAT_SPMASK	0xf000  /* mask for speed */
-#define PHY_STAT_10HDX	0x1000  /* 10 Mbit half duplex selected	*/
-#define PHY_STAT_10FDX	0x2000  /* 10 Mbit full duplex selected	*/
-#define PHY_STAT_100HDX	0x4000  /* 100 Mbit half duplex selected */
-#define PHY_STAT_100FDX	0x8000  /* 100 Mbit full duplex selected */
-
-
 static int
 fec_enet_start_xmit(struct sk_buff *skb, struct net_device *dev)
 {
@@ -406,12 +334,6 @@ fec_enet_interrupt(int irq, void * dev_id)
 			ret = IRQ_HANDLED;
 			fec_enet_tx(dev);
 		}
-
-		if (int_events & FEC_ENET_MII) {
-			ret = IRQ_HANDLED;
-			fec_enet_mii(dev);
-		}
-
 	} while (int_events);
 
 	return ret;
@@ -607,827 +529,311 @@ rx_processing_done:
 	spin_unlock(&fep->hw_lock);
 }
 
-/* called from interrupt context */
-static void
-fec_enet_mii(struct net_device *dev)
-{
-	struct	fec_enet_private *fep;
-	mii_list_t	*mip;
-
-	fep = netdev_priv(dev);
-	spin_lock(&fep->mii_lock);
-
-	if ((mip = mii_head) == NULL) {
-		printk("MII and no head!\n");
-		goto unlock;
-	}
-
-	if (mip->mii_func != NULL)
-		(*(mip->mii_func))(readl(fep->hwp + FEC_MII_DATA), dev);
-
-	mii_head = mip->mii_next;
-	mip->mii_next = mii_free;
-	mii_free = mip;
-
-	if ((mip = mii_head) != NULL)
-		writel(mip->mii_regval, fep->hwp + FEC_MII_DATA);
-
-unlock:
-	spin_unlock(&fep->mii_lock);
-}
-
-static int
-mii_queue_unlocked(struct net_device *dev, int regval,
-		void (*func)(uint, struct net_device *))
+/* ------------------------------------------------------------------------- */
+#ifdef CONFIG_M5272
+static void __inline__ fec_get_mac(struct net_device *dev)
 {
-	struct fec_enet_private *fep;
-	mii_list_t	*mip;
-	int		retval;
-
-	/* Add PHY address to register command */
-	fep = netdev_priv(dev);
+	struct fec_enet_private *fep = netdev_priv(dev);
+	unsigned char *iap, tmpaddr[ETH_ALEN];
 
-	regval |= fep->phy_addr << 23;
-	retval = 0;
-
-	if ((mip = mii_free) != NULL) {
-		mii_free = mip->mii_next;
-		mip->mii_regval = regval;
-		mip->mii_func = func;
-		mip->mii_next = NULL;
-		if (mii_head) {
-			mii_tail->mii_next = mip;
-			mii_tail = mip;
-		} else {
-			mii_head = mii_tail = mip;
-			writel(regval, fep->hwp + FEC_MII_DATA);
-		}
+	if (FEC_FLASHMAC) {
+		/*
+		 * Get MAC address from FLASH.
+		 * If it is all 1's or 0's, use the default.
+		 */
+		iap = (unsigned char *)FEC_FLASHMAC;
+		if ((iap[0] == 0) && (iap[1] == 0) && (iap[2] == 0) &&
+		    (iap[3] == 0) && (iap[4] == 0) && (iap[5] == 0))
+			iap = fec_mac_default;
+		if ((iap[0] == 0xff) && (iap[1] == 0xff) && (iap[2] == 0xff) &&
+		    (iap[3] == 0xff) && (iap[4] == 0xff) && (iap[5] == 0xff))
+			iap = fec_mac_default;
 	} else {
-		retval = 1;
+		*((unsigned long *) &tmpaddr[0]) = readl(fep->hwp + FEC_ADDR_LOW);
+		*((unsigned short *) &tmpaddr[4]) = (readl(fep->hwp + FEC_ADDR_HIGH) >> 16);
+		iap = &tmpaddr[0];
 	}
 
-	return retval;
-}
-
-static int
-mii_queue(struct net_device *dev, int regval,
-		void (*func)(uint, struct net_device *))
-{
-	struct fec_enet_private *fep;
-	unsigned long   flags;
-	int             retval;
-	fep = netdev_priv(dev);
-	spin_lock_irqsave(&fep->mii_lock, flags);
-	retval = mii_queue_unlocked(dev, regval, func);
-	spin_unlock_irqrestore(&fep->mii_lock, flags);
-	return retval;
-}
-
-static void mii_do_cmd(struct net_device *dev, const phy_cmd_t *c)
-{
-	if(!c)
-		return;
+	memcpy(dev->dev_addr, iap, ETH_ALEN);
 
-	for (; c->mii_data != mk_mii_end; c++)
-		mii_queue(dev, c->mii_data, c->funct);
+	/* Adjust MAC if using default MAC address */
+	if (iap == fec_mac_default)
+		 dev->dev_addr[ETH_ALEN-1] = fec_mac_default[ETH_ALEN-1] + fep->index;
 }
+#endif
 
-static void mii_parse_sr(uint mii_reg, struct net_device *dev)
-{
-	struct fec_enet_private *fep = netdev_priv(dev);
-	volatile uint *s = &(fep->phy_status);
-	uint status;
-
-	status = *s & ~(PHY_STAT_LINK | PHY_STAT_FAULT | PHY_STAT_ANC);
-
-	if (mii_reg & 0x0004)
-		status |= PHY_STAT_LINK;
-	if (mii_reg & 0x0010)
-		status |= PHY_STAT_FAULT;
-	if (mii_reg & 0x0020)
-		status |= PHY_STAT_ANC;
-	*s = status;
-}
+/* ------------------------------------------------------------------------- */
 
-static void mii_parse_cr(uint mii_reg, struct net_device *dev)
+/*
+ * Phy section
+ */
+static void fec_enet_adjust_link(struct net_device *dev)
 {
 	struct fec_enet_private *fep = netdev_priv(dev);
-	volatile uint *s = &(fep->phy_status);
-	uint status;
-
-	status = *s & ~(PHY_CONF_ANE | PHY_CONF_LOOP);
-
-	if (mii_reg & 0x1000)
-		status |= PHY_CONF_ANE;
-	if (mii_reg & 0x4000)
-		status |= PHY_CONF_LOOP;
-	*s = status;
-}
+	struct phy_device *phy_dev = fep->phy_dev;
+	unsigned long flags;
 
-static void mii_parse_anar(uint mii_reg, struct net_device *dev)
-{
-	struct fec_enet_private *fep = netdev_priv(dev);
-	volatile uint *s = &(fep->phy_status);
-	uint status;
-
-	status = *s & ~(PHY_CONF_SPMASK);
-
-	if (mii_reg & 0x0020)
-		status |= PHY_CONF_10HDX;
-	if (mii_reg & 0x0040)
-		status |= PHY_CONF_10FDX;
-	if (mii_reg & 0x0080)
-		status |= PHY_CONF_100HDX;
-	if (mii_reg & 0x00100)
-		status |= PHY_CONF_100FDX;
-	*s = status;
-}
+	int status_change = 0;
 
-/* ------------------------------------------------------------------------- */
-/* The Level one LXT970 is used by many boards				     */
+	spin_lock_irqsave(&fep->hw_lock, flags);
 
-#define MII_LXT970_MIRROR    16  /* Mirror register           */
-#define MII_LXT970_IER       17  /* Interrupt Enable Register */
-#define MII_LXT970_ISR       18  /* Interrupt Status Register */
-#define MII_LXT970_CONFIG    19  /* Configuration Register    */
-#define MII_LXT970_CSR       20  /* Chip Status Register      */
+	/* Prevent a state halted on mii error */
+	if (fep->mii_timeout && phy_dev->state == PHY_HALTED) {
+		phy_dev->state = PHY_RESUMING;
+		goto spin_unlock;
+	}
 
-static void mii_parse_lxt970_csr(uint mii_reg, struct net_device *dev)
-{
-	struct fec_enet_private *fep = netdev_priv(dev);
-	volatile uint *s = &(fep->phy_status);
-	uint status;
+	/* Duplex link change */
+	if (phy_dev->link) {
+		if (fep->full_duplex != phy_dev->duplex) {
+			fec_restart(dev, phy_dev->duplex);
+			status_change = 1;
+		}
+	}
 
-	status = *s & ~(PHY_STAT_SPMASK);
-	if (mii_reg & 0x0800) {
-		if (mii_reg & 0x1000)
-			status |= PHY_STAT_100FDX;
+	/* Link on or off change */
+	if (phy_dev->link != fep->link) {
+		fep->link = phy_dev->link;
+		if (phy_dev->link)
+			fec_restart(dev, phy_dev->duplex);
 		else
-			status |= PHY_STAT_100HDX;
-	} else {
-		if (mii_reg & 0x1000)
-			status |= PHY_STAT_10FDX;
-		else
-			status |= PHY_STAT_10HDX;
+			fec_stop(dev);
+		status_change = 1;
 	}
-	*s = status;
-}
-
-static phy_cmd_t const phy_cmd_lxt970_config[] = {
-		{ mk_mii_read(MII_REG_CR), mii_parse_cr },
-		{ mk_mii_read(MII_REG_ANAR), mii_parse_anar },
-		{ mk_mii_end, }
-	};
-static phy_cmd_t const phy_cmd_lxt970_startup[] = { /* enable interrupts */
-		{ mk_mii_write(MII_LXT970_IER, 0x0002), NULL },
-		{ mk_mii_write(MII_REG_CR, 0x1200), NULL }, /* autonegotiate */
-		{ mk_mii_end, }
-	};
-static phy_cmd_t const phy_cmd_lxt970_ack_int[] = {
-		/* read SR and ISR to acknowledge */
-		{ mk_mii_read(MII_REG_SR), mii_parse_sr },
-		{ mk_mii_read(MII_LXT970_ISR), NULL },
-
-		/* find out the current status */
-		{ mk_mii_read(MII_LXT970_CSR), mii_parse_lxt970_csr },
-		{ mk_mii_end, }
-	};
-static phy_cmd_t const phy_cmd_lxt970_shutdown[] = { /* disable interrupts */
-		{ mk_mii_write(MII_LXT970_IER, 0x0000), NULL },
-		{ mk_mii_end, }
-	};
-static phy_info_t const phy_info_lxt970 = {
-	.id = 0x07810000,
-	.name = "LXT970",
-	.config = phy_cmd_lxt970_config,
-	.startup = phy_cmd_lxt970_startup,
-	.ack_int = phy_cmd_lxt970_ack_int,
-	.shutdown = phy_cmd_lxt970_shutdown
-};
 
-/* ------------------------------------------------------------------------- */
-/* The Level one LXT971 is used on some of my custom boards                  */
-
-/* register definitions for the 971 */
+spin_unlock:
+	spin_unlock_irqrestore(&fep->hw_lock, flags);
 
-#define MII_LXT971_PCR       16  /* Port Control Register     */
-#define MII_LXT971_SR2       17  /* Status Register 2         */
-#define MII_LXT971_IER       18  /* Interrupt Enable Register */
-#define MII_LXT971_ISR       19  /* Interrupt Status Register */
-#define MII_LXT971_LCR       20  /* LED Control Register      */
-#define MII_LXT971_TCR       30  /* Transmit Control Register */
+	if (status_change)
+		phy_print_status(phy_dev);
+}
 
 /*
- * I had some nice ideas of running the MDIO faster...
- * The 971 should support 8MHz and I tried it, but things acted really
- * weird, so 2.5 MHz ought to be enough for anyone...
+ * NOTE: a MII transaction is during around 25 us, so polling it...
  */
-
-static void mii_parse_lxt971_sr2(uint mii_reg, struct net_device *dev)
+static int fec_enet_mdio_read(struct mii_bus *bus, int mii_id, int regnum)
 {
-	struct fec_enet_private *fep = netdev_priv(dev);
-	volatile uint *s = &(fep->phy_status);
-	uint status;
+	struct fec_enet_private *fep = bus->priv;
+	int timeout = FEC_MII_TIMEOUT;
 
-	status = *s & ~(PHY_STAT_SPMASK | PHY_STAT_LINK | PHY_STAT_ANC);
+	fep->mii_timeout = 0;
 
-	if (mii_reg & 0x0400) {
-		fep->link = 1;
-		status |= PHY_STAT_LINK;
-	} else {
-		fep->link = 0;
-	}
-	if (mii_reg & 0x0080)
-		status |= PHY_STAT_ANC;
-	if (mii_reg & 0x4000) {
-		if (mii_reg & 0x0200)
-			status |= PHY_STAT_100FDX;
-		else
-			status |= PHY_STAT_100HDX;
-	} else {
-		if (mii_reg & 0x0200)
-			status |= PHY_STAT_10FDX;
-		else
-			status |= PHY_STAT_10HDX;
+	/* clear MII end of transfer bit*/
+	writel(FEC_ENET_MII, fep->hwp + FEC_IEVENT);
+
+	/* start a read op */
+	writel(FEC_MMFR_ST | FEC_MMFR_OP_READ |
+		FEC_MMFR_PA(mii_id) | FEC_MMFR_RA(regnum) |
+		FEC_MMFR_TA, fep->hwp + FEC_MII_DATA);
+
+	/* wait for end of transfer */
+	while (!(readl(fep->hwp + FEC_IEVENT) & FEC_ENET_MII)) {
+		cpu_relax();
+		if (timeout-- < 0) {
+			fep->mii_timeout = 1;
+			printk(KERN_ERR "FEC: MDIO read timeout\n");
+			return -ETIMEDOUT;
+		}
 	}
-	if (mii_reg & 0x0008)
-		status |= PHY_STAT_FAULT;
 
-	*s = status;
+	/* return value */
+	return FEC_MMFR_DATA(readl(fep->hwp + FEC_MII_DATA));
 }
 
-static phy_cmd_t const phy_cmd_lxt971_config[] = {
-		/* limit to 10MBit because my prototype board
-		 * doesn't work with 100. */
-		{ mk_mii_read(MII_REG_CR), mii_parse_cr },
-		{ mk_mii_read(MII_REG_ANAR), mii_parse_anar },
-		{ mk_mii_read(MII_LXT971_SR2), mii_parse_lxt971_sr2 },
-		{ mk_mii_end, }
-	};
-static phy_cmd_t const phy_cmd_lxt971_startup[] = {  /* enable interrupts */
-		{ mk_mii_write(MII_LXT971_IER, 0x00f2), NULL },
-		{ mk_mii_write(MII_REG_CR, 0x1200), NULL }, /* autonegotiate */
-		{ mk_mii_write(MII_LXT971_LCR, 0xd422), NULL }, /* LED config */
-		/* Somehow does the 971 tell me that the link is down
-		 * the first read after power-up.
-		 * read here to get a valid value in ack_int */
-		{ mk_mii_read(MII_REG_SR), mii_parse_sr },
-		{ mk_mii_end, }
-	};
-static phy_cmd_t const phy_cmd_lxt971_ack_int[] = {
-		/* acknowledge the int before reading status ! */
-		{ mk_mii_read(MII_LXT971_ISR), NULL },
-		/* find out the current status */
-		{ mk_mii_read(MII_REG_SR), mii_parse_sr },
-		{ mk_mii_read(MII_LXT971_SR2), mii_parse_lxt971_sr2 },
-		{ mk_mii_end, }
-	};
-static phy_cmd_t const phy_cmd_lxt971_shutdown[] = { /* disable interrupts */
-		{ mk_mii_write(MII_LXT971_IER, 0x0000), NULL },
-		{ mk_mii_end, }
-	};
-static phy_info_t const phy_info_lxt971 = {
-	.id = 0x0001378e,
-	.name = "LXT971",
-	.config = phy_cmd_lxt971_config,
-	.startup = phy_cmd_lxt971_startup,
-	.ack_int = phy_cmd_lxt971_ack_int,
-	.shutdown = phy_cmd_lxt971_shutdown
-};
-
-/* ------------------------------------------------------------------------- */
-/* The Quality Semiconductor QS6612 is used on the RPX CLLF                  */
-
-/* register definitions */
-
-#define MII_QS6612_MCR       17  /* Mode Control Register      */
-#define MII_QS6612_FTR       27  /* Factory Test Register      */
-#define MII_QS6612_MCO       28  /* Misc. Control Register     */
-#define MII_QS6612_ISR       29  /* Interrupt Source Register  */
-#define MII_QS6612_IMR       30  /* Interrupt Mask Register    */
-#define MII_QS6612_PCR       31  /* 100BaseTx PHY Control Reg. */
-
-static void mii_parse_qs6612_pcr(uint mii_reg, struct net_device *dev)
+static int fec_enet_mdio_write(struct mii_bus *bus, int mii_id, int regnum,
+			   u16 value)
 {
-	struct fec_enet_private *fep = netdev_priv(dev);
-	volatile uint *s = &(fep->phy_status);
-	uint status;
+	struct fec_enet_private *fep = bus->priv;
+	int timeout = FEC_MII_TIMEOUT;
 
-	status = *s & ~(PHY_STAT_SPMASK);
+	fep->mii_timeout = 0;
 
-	switch((mii_reg >> 2) & 7) {
-	case 1: status |= PHY_STAT_10HDX; break;
-	case 2: status |= PHY_STAT_100HDX; break;
-	case 5: status |= PHY_STAT_10FDX; break;
-	case 6: status |= PHY_STAT_100FDX; break;
-}
-
-	*s = status;
-}
-
-static phy_cmd_t const phy_cmd_qs6612_config[] = {
-		/* The PHY powers up isolated on the RPX,
-		 * so send a command to allow operation.
-		 */
-		{ mk_mii_write(MII_QS6612_PCR, 0x0dc0), NULL },
-
-		/* parse cr and anar to get some info */
-		{ mk_mii_read(MII_REG_CR), mii_parse_cr },
-		{ mk_mii_read(MII_REG_ANAR), mii_parse_anar },
-		{ mk_mii_end, }
-	};
-static phy_cmd_t const phy_cmd_qs6612_startup[] = {  /* enable interrupts */
-		{ mk_mii_write(MII_QS6612_IMR, 0x003a), NULL },
-		{ mk_mii_write(MII_REG_CR, 0x1200), NULL }, /* autonegotiate */
-		{ mk_mii_end, }
-	};
-static phy_cmd_t const phy_cmd_qs6612_ack_int[] = {
-		/* we need to read ISR, SR and ANER to acknowledge */
-		{ mk_mii_read(MII_QS6612_ISR), NULL },
-		{ mk_mii_read(MII_REG_SR), mii_parse_sr },
-		{ mk_mii_read(MII_REG_ANER), NULL },
-
-		/* read pcr to get info */
-		{ mk_mii_read(MII_QS6612_PCR), mii_parse_qs6612_pcr },
-		{ mk_mii_end, }
-	};
-static phy_cmd_t const phy_cmd_qs6612_shutdown[] = { /* disable interrupts */
-		{ mk_mii_write(MII_QS6612_IMR, 0x0000), NULL },
-		{ mk_mii_end, }
-	};
-static phy_info_t const phy_info_qs6612 = {
-	.id = 0x00181440,
-	.name = "QS6612",
-	.config = phy_cmd_qs6612_config,
-	.startup = phy_cmd_qs6612_startup,
-	.ack_int = phy_cmd_qs6612_ack_int,
-	.shutdown = phy_cmd_qs6612_shutdown
-};
-
-/* ------------------------------------------------------------------------- */
-/* AMD AM79C874 phy                                                          */
+	/* clear MII end of transfer bit*/
+	writel(FEC_ENET_MII, fep->hwp + FEC_IEVENT);
 
-/* register definitions for the 874 */
+	/* start a read op */
+	writel(FEC_MMFR_ST | FEC_MMFR_OP_READ |
+		FEC_MMFR_PA(mii_id) | FEC_MMFR_RA(regnum) |
+		FEC_MMFR_TA | FEC_MMFR_DATA(value),
+		fep->hwp + FEC_MII_DATA);
+
+	/* wait for end of transfer */
+	while (!(readl(fep->hwp + FEC_IEVENT) & FEC_ENET_MII)) {
+		cpu_relax();
+		if (timeout-- < 0) {
+			fep->mii_timeout = 1;
+			printk(KERN_ERR "FEC: MDIO write timeout\n");
+			return -ETIMEDOUT;
+		}
+	}
 
-#define MII_AM79C874_MFR       16  /* Miscellaneous Feature Register */
-#define MII_AM79C874_ICSR      17  /* Interrupt/Status Register      */
-#define MII_AM79C874_DR        18  /* Diagnostic Register            */
-#define MII_AM79C874_PMLR      19  /* Power and Loopback Register    */
-#define MII_AM79C874_MCR       21  /* ModeControl Register           */
-#define MII_AM79C874_DC        23  /* Disconnect Counter             */
-#define MII_AM79C874_REC       24  /* Recieve Error Counter          */
+	return 0;
+}
 
-static void mii_parse_am79c874_dr(uint mii_reg, struct net_device *dev)
+static int fec_enet_mdio_reset(struct mii_bus *bus)
 {
-	struct fec_enet_private *fep = netdev_priv(dev);
-	volatile uint *s = &(fep->phy_status);
-	uint status;
-
-	status = *s & ~(PHY_STAT_SPMASK | PHY_STAT_ANC);
-
-	if (mii_reg & 0x0080)
-		status |= PHY_STAT_ANC;
-	if (mii_reg & 0x0400)
-		status |= ((mii_reg & 0x0800) ? PHY_STAT_100FDX : PHY_STAT_100HDX);
-	else
-		status |= ((mii_reg & 0x0800) ? PHY_STAT_10FDX : PHY_STAT_10HDX);
-
-	*s = status;
+	return 0;
 }
 
-static phy_cmd_t const phy_cmd_am79c874_config[] = {
-		{ mk_mii_read(MII_REG_CR), mii_parse_cr },
-		{ mk_mii_read(MII_REG_ANAR), mii_parse_anar },
-		{ mk_mii_read(MII_AM79C874_DR), mii_parse_am79c874_dr },
-		{ mk_mii_end, }
-	};
-static phy_cmd_t const phy_cmd_am79c874_startup[] = {  /* enable interrupts */
-		{ mk_mii_write(MII_AM79C874_ICSR, 0xff00), NULL },
-		{ mk_mii_write(MII_REG_CR, 0x1200), NULL }, /* autonegotiate */
-		{ mk_mii_read(MII_REG_SR), mii_parse_sr },
-		{ mk_mii_end, }
-	};
-static phy_cmd_t const phy_cmd_am79c874_ack_int[] = {
-		/* find out the current status */
-		{ mk_mii_read(MII_REG_SR), mii_parse_sr },
-		{ mk_mii_read(MII_AM79C874_DR), mii_parse_am79c874_dr },
-		/* we only need to read ISR to acknowledge */
-		{ mk_mii_read(MII_AM79C874_ICSR), NULL },
-		{ mk_mii_end, }
-	};
-static phy_cmd_t const phy_cmd_am79c874_shutdown[] = { /* disable interrupts */
-		{ mk_mii_write(MII_AM79C874_ICSR, 0x0000), NULL },
-		{ mk_mii_end, }
-	};
-static phy_info_t const phy_info_am79c874 = {
-	.id = 0x00022561,
-	.name = "AM79C874",
-	.config = phy_cmd_am79c874_config,
-	.startup = phy_cmd_am79c874_startup,
-	.ack_int = phy_cmd_am79c874_ack_int,
-	.shutdown = phy_cmd_am79c874_shutdown
-};
-
-
-/* ------------------------------------------------------------------------- */
-/* Kendin KS8721BL phy                                                       */
-
-/* register definitions for the 8721 */
-
-#define MII_KS8721BL_RXERCR	21
-#define MII_KS8721BL_ICSR	27
-#define	MII_KS8721BL_PHYCR	31
-
-static phy_cmd_t const phy_cmd_ks8721bl_config[] = {
-		{ mk_mii_read(MII_REG_CR), mii_parse_cr },
-		{ mk_mii_read(MII_REG_ANAR), mii_parse_anar },
-		{ mk_mii_end, }
-	};
-static phy_cmd_t const phy_cmd_ks8721bl_startup[] = {  /* enable interrupts */
-		{ mk_mii_write(MII_KS8721BL_ICSR, 0xff00), NULL },
-		{ mk_mii_write(MII_REG_CR, 0x1200), NULL }, /* autonegotiate */
-		{ mk_mii_read(MII_REG_SR), mii_parse_sr },
-		{ mk_mii_end, }
-	};
-static phy_cmd_t const phy_cmd_ks8721bl_ack_int[] = {
-		/* find out the current status */
-		{ mk_mii_read(MII_REG_SR), mii_parse_sr },
-		/* we only need to read ISR to acknowledge */
-		{ mk_mii_read(MII_KS8721BL_ICSR), NULL },
-		{ mk_mii_end, }
-	};
-static phy_cmd_t const phy_cmd_ks8721bl_shutdown[] = { /* disable interrupts */
-		{ mk_mii_write(MII_KS8721BL_ICSR, 0x0000), NULL },
-		{ mk_mii_end, }
-	};
-static phy_info_t const phy_info_ks8721bl = {
-	.id = 0x00022161,
-	.name = "KS8721BL",
-	.config = phy_cmd_ks8721bl_config,
-	.startup = phy_cmd_ks8721bl_startup,
-	.ack_int = phy_cmd_ks8721bl_ack_int,
-	.shutdown = phy_cmd_ks8721bl_shutdown
-};
-
-/* ------------------------------------------------------------------------- */
-/* register definitions for the DP83848 */
-
-#define MII_DP8384X_PHYSTST    16  /* PHY Status Register */
-
-static void mii_parse_dp8384x_sr2(uint mii_reg, struct net_device *dev)
+static int fec_enet_mii_probe(struct net_device *dev)
 {
 	struct fec_enet_private *fep = netdev_priv(dev);
-	volatile uint *s = &(fep->phy_status);
-
-	*s &= ~(PHY_STAT_SPMASK | PHY_STAT_LINK | PHY_STAT_ANC);
-
-	/* Link up */
-	if (mii_reg & 0x0001) {
-		fep->link = 1;
-		*s |= PHY_STAT_LINK;
-	} else
-		fep->link = 0;
-	/* Status of link */
-	if (mii_reg & 0x0010)   /* Autonegotioation complete */
-		*s |= PHY_STAT_ANC;
-	if (mii_reg & 0x0002) {   /* 10MBps? */
-		if (mii_reg & 0x0004)   /* Full Duplex? */
-			*s |= PHY_STAT_10FDX;
-		else
-			*s |= PHY_STAT_10HDX;
-	} else {                  /* 100 Mbps? */
-		if (mii_reg & 0x0004)   /* Full Duplex? */
-			*s |= PHY_STAT_100FDX;
-		else
-			*s |= PHY_STAT_100HDX;
-	}
-	if (mii_reg & 0x0008)
-		*s |= PHY_STAT_FAULT;
-}
-
-static phy_info_t phy_info_dp83848= {
-	0x020005c9,
-	"DP83848",
+	struct phy_device *phy_dev = NULL;
+	int phy_addr;
 
-	(const phy_cmd_t []) {  /* config */
-		{ mk_mii_read(MII_REG_CR), mii_parse_cr },
-		{ mk_mii_read(MII_REG_ANAR), mii_parse_anar },
-		{ mk_mii_read(MII_DP8384X_PHYSTST), mii_parse_dp8384x_sr2 },
-		{ mk_mii_end, }
-	},
-	(const phy_cmd_t []) {  /* startup - enable interrupts */
-		{ mk_mii_write(MII_REG_CR, 0x1200), NULL }, /* autonegotiate */
-		{ mk_mii_read(MII_REG_SR), mii_parse_sr },
-		{ mk_mii_end, }
-	},
-	(const phy_cmd_t []) { /* ack_int - never happens, no interrupt */
-		{ mk_mii_end, }
-	},
-	(const phy_cmd_t []) {  /* shutdown */
-		{ mk_mii_end, }
-	},
-};
+	/* find the first phy */
+	for (phy_addr = 0; phy_addr < PHY_MAX_ADDR; phy_addr++) {
+		if (fep->mii_bus->phy_map[phy_addr]) {
+			phy_dev = fep->mii_bus->phy_map[phy_addr];
+			break;
+		}
+	}
 
-static phy_info_t phy_info_lan8700 = {
-	0x0007C0C,
-	"LAN8700",
-	(const phy_cmd_t []) { /* config */
-		{ mk_mii_read(MII_REG_CR), mii_parse_cr },
-		{ mk_mii_read(MII_REG_ANAR), mii_parse_anar },
-		{ mk_mii_end, }
-	},
-	(const phy_cmd_t []) { /* startup */
-		{ mk_mii_write(MII_REG_CR, 0x1200), NULL }, /* autonegotiate */
-		{ mk_mii_read(MII_REG_SR), mii_parse_sr },
-		{ mk_mii_end, }
-	},
-	(const phy_cmd_t []) { /* act_int */
-		{ mk_mii_end, }
-	},
-	(const phy_cmd_t []) { /* shutdown */
-		{ mk_mii_end, }
-	},
-};
-/* ------------------------------------------------------------------------- */
+	if (!phy_dev) {
+		printk(KERN_ERR "%s: no PHY found\n", dev->name);
+		return -ENODEV;
+	}
 
-static phy_info_t const * const phy_info[] = {
-	&phy_info_lxt970,
-	&phy_info_lxt971,
-	&phy_info_qs6612,
-	&phy_info_am79c874,
-	&phy_info_ks8721bl,
-	&phy_info_dp83848,
-	&phy_info_lan8700,
-	NULL
-};
+	/* attach the mac to the phy */
+	phy_dev = phy_connect(dev, dev_name(&phy_dev->dev),
+			     &fec_enet_adjust_link, 0,
+			     PHY_INTERFACE_MODE_MII);
+	if (IS_ERR(phy_dev)) {
+		printk(KERN_ERR "%s: Could not attach to PHY\n", dev->name);
+		return PTR_ERR(phy_dev);
+	}
 
-/* ------------------------------------------------------------------------- */
-#ifdef HAVE_mii_link_interrupt
-static irqreturn_t
-mii_link_interrupt(int irq, void * dev_id);
+	/* mask with MAC supported features */
+	phy_dev->supported &= PHY_BASIC_FEATURES;
+	phy_dev->advertising = phy_dev->supported;
 
-/*
- *	This is specific to the MII interrupt setup of the M5272EVB.
- */
-static void __inline__ fec_request_mii_intr(struct net_device *dev)
-{
-	if (request_irq(66, mii_link_interrupt, IRQF_DISABLED, "fec(MII)", dev) != 0)
-		printk("FEC: Could not allocate fec(MII) IRQ(66)!\n");
-}
+	fep->phy_dev = phy_dev;
+	fep->link = 0;
+	fep->full_duplex = 0;
 
-static void __inline__ fec_disable_phy_intr(struct net_device *dev)
-{
-	free_irq(66, dev);
+	return 0;
 }
-#endif
 
-#ifdef CONFIG_M5272
-static void __inline__ fec_get_mac(struct net_device *dev)
+static int fec_enet_mii_init(struct platform_device *pdev)
 {
+	struct net_device *dev = platform_get_drvdata(pdev);
 	struct fec_enet_private *fep = netdev_priv(dev);
-	unsigned char *iap, tmpaddr[ETH_ALEN];
+	int err = -ENXIO, i;
 
-	if (FEC_FLASHMAC) {
-		/*
-		 * Get MAC address from FLASH.
-		 * If it is all 1's or 0's, use the default.
-		 */
-		iap = (unsigned char *)FEC_FLASHMAC;
-		if ((iap[0] == 0) && (iap[1] == 0) && (iap[2] == 0) &&
-		    (iap[3] == 0) && (iap[4] == 0) && (iap[5] == 0))
-			iap = fec_mac_default;
-		if ((iap[0] == 0xff) && (iap[1] == 0xff) && (iap[2] == 0xff) &&
-		    (iap[3] == 0xff) && (iap[4] == 0xff) && (iap[5] == 0xff))
-			iap = fec_mac_default;
-	} else {
-		*((unsigned long *) &tmpaddr[0]) = readl(fep->hwp + FEC_ADDR_LOW);
-		*((unsigned short *) &tmpaddr[4]) = (readl(fep->hwp + FEC_ADDR_HIGH) >> 16);
-		iap = &tmpaddr[0];
-	}
-
-	memcpy(dev->dev_addr, iap, ETH_ALEN);
-
-	/* Adjust MAC if using default MAC address */
-	if (iap == fec_mac_default)
-		 dev->dev_addr[ETH_ALEN-1] = fec_mac_default[ETH_ALEN-1] + fep->index;
-}
-#endif
+	fep->mii_timeout = 0;
 
-/* ------------------------------------------------------------------------- */
-
-static void mii_display_status(struct net_device *dev)
-{
-	struct fec_enet_private *fep = netdev_priv(dev);
-	volatile uint *s = &(fep->phy_status);
+	/*
+	 * Set MII speed to 2.5 MHz (= clk_get_rate() / 2 * phy_speed)
+	 */
+	fep->phy_speed = DIV_ROUND_UP(clk_get_rate(fep->clk), 5000000) << 1;
+	writel(fep->phy_speed, fep->hwp + FEC_MII_SPEED);
 
-	if (!fep->link && !fep->old_link) {
-		/* Link is still down - don't print anything */
-		return;
+	fep->mii_bus = mdiobus_alloc();
+	if (fep->mii_bus == NULL) {
+		err = -ENOMEM;
+		goto err_out;
 	}
 
-	printk("%s: status: ", dev->name);
-
-	if (!fep->link) {
-		printk("link down");
-	} else {
-		printk("link up");
-
-		switch(*s & PHY_STAT_SPMASK) {
-		case PHY_STAT_100FDX: printk(", 100MBit Full Duplex"); break;
-		case PHY_STAT_100HDX: printk(", 100MBit Half Duplex"); break;
-		case PHY_STAT_10FDX: printk(", 10MBit Full Duplex"); break;
-		case PHY_STAT_10HDX: printk(", 10MBit Half Duplex"); break;
-		default:
-			printk(", Unknown speed/duplex");
-		}
-
-		if (*s & PHY_STAT_ANC)
-			printk(", auto-negotiation complete");
+	fep->mii_bus->name = "fec_enet_mii_bus";
+	fep->mii_bus->read = fec_enet_mdio_read;
+	fep->mii_bus->write = fec_enet_mdio_write;
+	fep->mii_bus->reset = fec_enet_mdio_reset;
+	snprintf(fep->mii_bus->id, MII_BUS_ID_SIZE, "%x", pdev->id);
+	fep->mii_bus->priv = fep;
+	fep->mii_bus->parent = &pdev->dev;
+
+	fep->mii_bus->irq = kmalloc(sizeof(int) * PHY_MAX_ADDR, GFP_KERNEL);
+	if (!fep->mii_bus->irq) {
+		err = -ENOMEM;
+		goto err_out_free_mdiobus;
 	}
 
-	if (*s & PHY_STAT_FAULT)
-		printk(", remote fault");
-
-	printk(".\n");
-}
-
-static void mii_display_config(struct work_struct *work)
-{
-	struct fec_enet_private *fep = container_of(work, struct fec_enet_private, phy_task);
-	struct net_device *dev = fep->netdev;
-	uint status = fep->phy_status;
+	for (i = 0; i < PHY_MAX_ADDR; i++)
+		fep->mii_bus->irq[i] = PHY_POLL;
 
-	/*
-	** When we get here, phy_task is already removed from
-	** the workqueue.  It is thus safe to allow to reuse it.
-	*/
-	fep->mii_phy_task_queued = 0;
-	printk("%s: config: auto-negotiation ", dev->name);
-
-	if (status & PHY_CONF_ANE)
-		printk("on");
-	else
-		printk("off");
+	platform_set_drvdata(dev, fep->mii_bus);
 
-	if (status & PHY_CONF_100FDX)
-		printk(", 100FDX");
-	if (status & PHY_CONF_100HDX)
-		printk(", 100HDX");
-	if (status & PHY_CONF_10FDX)
-		printk(", 10FDX");
-	if (status & PHY_CONF_10HDX)
-		printk(", 10HDX");
-	if (!(status & PHY_CONF_SPMASK))
-		printk(", No speed/duplex selected?");
+	if (mdiobus_register(fep->mii_bus))
+		goto err_out_free_mdio_irq;
 
-	if (status & PHY_CONF_LOOP)
-		printk(", loopback enabled");
+	if (fec_enet_mii_probe(dev) != 0)
+		goto err_out_unregister_bus;
 
-	printk(".\n");
+	return 0;
 
-	fep->sequence_done = 1;
+err_out_unregister_bus:
+	mdiobus_unregister(fep->mii_bus);
+err_out_free_mdio_irq:
+	kfree(fep->mii_bus->irq);
+err_out_free_mdiobus:
+	mdiobus_free(fep->mii_bus);
+err_out:
+	return err;
 }
 
-static void mii_relink(struct work_struct *work)
+static void fec_enet_mii_remove(struct fec_enet_private *fep)
 {
-	struct fec_enet_private *fep = container_of(work, struct fec_enet_private, phy_task);
-	struct net_device *dev = fep->netdev;
-	int duplex;
-
-	/*
-	** When we get here, phy_task is already removed from
-	** the workqueue.  It is thus safe to allow to reuse it.
-	*/
-	fep->mii_phy_task_queued = 0;
-	fep->link = (fep->phy_status & PHY_STAT_LINK) ? 1 : 0;
-	mii_display_status(dev);
-	fep->old_link = fep->link;
-
-	if (fep->link) {
-		duplex = 0;
-		if (fep->phy_status
-		    & (PHY_STAT_100FDX | PHY_STAT_10FDX))
-			duplex = 1;
-		fec_restart(dev, duplex);
-	} else
-		fec_stop(dev);
+	if (fep->phy_dev)
+		phy_disconnect(fep->phy_dev);
+	mdiobus_unregister(fep->mii_bus);
+	kfree(fep->mii_bus->irq);
+	mdiobus_free(fep->mii_bus);
 }
 
-/* mii_queue_relink is called in interrupt context from mii_link_interrupt */
-static void mii_queue_relink(uint mii_reg, struct net_device *dev)
+static int fec_enet_get_settings(struct net_device *dev,
+				  struct ethtool_cmd *cmd)
 {
 	struct fec_enet_private *fep = netdev_priv(dev);
+	struct phy_device *phydev = fep->phy_dev;
 
-	/*
-	 * We cannot queue phy_task twice in the workqueue.  It
-	 * would cause an endless loop in the workqueue.
-	 * Fortunately, if the last mii_relink entry has not yet been
-	 * executed now, it will do the job for the current interrupt,
-	 * which is just what we want.
-	 */
-	if (fep->mii_phy_task_queued)
-		return;
+	if (!phydev)
+		return -ENODEV;
 
-	fep->mii_phy_task_queued = 1;
-	INIT_WORK(&fep->phy_task, mii_relink);
-	schedule_work(&fep->phy_task);
+	return phy_ethtool_gset(phydev, cmd);
 }
 
-/* mii_queue_config is called in interrupt context from fec_enet_mii */
-static void mii_queue_config(uint mii_reg, struct net_device *dev)
+static int fec_enet_set_settings(struct net_device *dev,
+				 struct ethtool_cmd *cmd)
 {
 	struct fec_enet_private *fep = netdev_priv(dev);
+	struct phy_device *phydev = fep->phy_dev;
 
-	if (fep->mii_phy_task_queued)
-		return;
+	if (!phydev)
+		return -ENODEV;
 
-	fep->mii_phy_task_queued = 1;
-	INIT_WORK(&fep->phy_task, mii_display_config);
-	schedule_work(&fep->phy_task);
+	return phy_ethtool_sset(phydev, cmd);
 }
 
-phy_cmd_t const phy_cmd_relink[] = {
-	{ mk_mii_read(MII_REG_CR), mii_queue_relink },
-	{ mk_mii_end, }
-	};
-phy_cmd_t const phy_cmd_config[] = {
-	{ mk_mii_read(MII_REG_CR), mii_queue_config },
-	{ mk_mii_end, }
-	};
-
-/* Read remainder of PHY ID. */
-static void
-mii_discover_phy3(uint mii_reg, struct net_device *dev)
+static void fec_enet_get_drvinfo(struct net_device *dev,
+				 struct ethtool_drvinfo *info)
 {
-	struct fec_enet_private *fep;
-	int i;
-
-	fep = netdev_priv(dev);
-	fep->phy_id |= (mii_reg & 0xffff);
-	printk("fec: PHY @ 0x%x, ID 0x%08x", fep->phy_addr, fep->phy_id);
-
-	for(i = 0; phy_info[i]; i++) {
-		if(phy_info[i]->id == (fep->phy_id >> 4))
-			break;
-	}
-
-	if (phy_info[i])
-		printk(" -- %s\n", phy_info[i]->name);
-	else
-		printk(" -- unknown PHY!\n");
+	struct fec_enet_private *fep = netdev_priv(dev);
 
-	fep->phy = phy_info[i];
-	fep->phy_id_done = 1;
+	strcpy(info->driver, fep->pdev->dev.driver->name);
+	strcpy(info->version, "Revision: 1.0");
+	strcpy(info->bus_info, dev_name(&dev->dev));
 }
 
-/* Scan all of the MII PHY addresses looking for someone to respond
- * with a valid ID.  This usually happens quickly.
- */
-static void
-mii_discover_phy(uint mii_reg, struct net_device *dev)
-{
-	struct fec_enet_private *fep;
-	uint phytype;
-
-	fep = netdev_priv(dev);
-
-	if (fep->phy_addr < 32) {
-		if ((phytype = (mii_reg & 0xffff)) != 0xffff && phytype != 0) {
-
-			/* Got first part of ID, now get remainder */
-			fep->phy_id = phytype << 16;
-			mii_queue_unlocked(dev, mk_mii_read(MII_REG_PHYIR2),
-							mii_discover_phy3);
-		} else {
-			fep->phy_addr++;
-			mii_queue_unlocked(dev, mk_mii_read(MII_REG_PHYIR1),
-							mii_discover_phy);
-		}
-	} else {
-		printk("FEC: No PHY device found.\n");
-		/* Disable external MII interface */
-		writel(0, fep->hwp + FEC_MII_SPEED);
-		fep->phy_speed = 0;
-#ifdef HAVE_mii_link_interrupt
-		fec_disable_phy_intr(dev);
-#endif
-	}
-}
+static struct ethtool_ops fec_enet_ethtool_ops = {
+	.get_settings		= fec_enet_get_settings,
+	.set_settings		= fec_enet_set_settings,
+	.get_drvinfo		= fec_enet_get_drvinfo,
+	.get_link		= ethtool_op_get_link,
+};
 
-/* This interrupt occurs when the PHY detects a link change */
-#ifdef HAVE_mii_link_interrupt
-static irqreturn_t
-mii_link_interrupt(int irq, void * dev_id)
+static int fec_enet_ioctl(struct net_device *dev, struct ifreq *rq, int cmd)
 {
-	struct	net_device *dev = dev_id;
 	struct fec_enet_private *fep = netdev_priv(dev);
+	struct phy_device *phydev = fep->phy_dev;
+
+	if (!netif_running(dev))
+		return -EINVAL;
 
-	mii_do_cmd(dev, fep->phy->ack_int);
-	mii_do_cmd(dev, phy_cmd_relink);  /* restart and display status */
+	if (!phydev)
+		return -ENODEV;
 
-	return IRQ_HANDLED;
+	return phy_mii_ioctl(phydev, if_mii(rq), cmd);
 }
-#endif
 
 static void fec_enet_free_buffers(struct net_device *dev)
 {
@@ -1509,35 +915,8 @@ fec_enet_open(struct net_device *dev)
 	if (ret)
 		return ret;
 
-	fep->sequence_done = 0;
-	fep->link = 0;
-
-	fec_restart(dev, 1);
-
-	if (fep->phy) {
-		mii_do_cmd(dev, fep->phy->ack_int);
-		mii_do_cmd(dev, fep->phy->config);
-		mii_do_cmd(dev, phy_cmd_config);  /* display configuration */
-
-		/* Poll until the PHY tells us its configuration
-		 * (not link state).
-		 * Request is initiated by mii_do_cmd above, but answer
-		 * comes by interrupt.
-		 * This should take about 25 usec per register at 2.5 MHz,
-		 * and we read approximately 5 registers.
-		 */
-		while(!fep->sequence_done)
-			schedule();
-
-		mii_do_cmd(dev, fep->phy->startup);
-	}
-
-	/* Set the initial link state to true. A lot of hardware
-	 * based on this device does not implement a PHY interrupt,
-	 * so we are never notified of link change.
-	 */
-	fep->link = 1;
-
+	/* schedule a link state check */
+	phy_start(fep->phy_dev);
 	netif_start_queue(dev);
 	fep->opened = 1;
 	return 0;
@@ -1550,6 +929,7 @@ fec_enet_close(struct net_device *dev)
 
 	/* Don't know what to do yet. */
 	fep->opened = 0;
+	phy_stop(fep->phy_dev);
 	netif_stop_queue(dev);
 	fec_stop(dev);
 
@@ -1666,6 +1046,7 @@ static const struct net_device_ops fec_netdev_ops = {
 	.ndo_validate_addr	= eth_validate_addr,
 	.ndo_tx_timeout		= fec_timeout,
 	.ndo_set_mac_address	= fec_set_mac_address,
+	.ndo_do_ioctl           = fec_enet_ioctl,
 };
 
  /*
@@ -1689,7 +1070,6 @@ static int fec_enet_init(struct net_device *dev, int index)
 	}
 
 	spin_lock_init(&fep->hw_lock);
-	spin_lock_init(&fep->mii_lock);
 
 	fep->index = index;
 	fep->hwp = (void __iomem *)dev->base_addr;
@@ -1716,20 +1096,10 @@ static int fec_enet_init(struct net_device *dev, int index)
 	fep->rx_bd_base = cbd_base;
 	fep->tx_bd_base = cbd_base + RX_RING_SIZE;
 
-#ifdef HAVE_mii_link_interrupt
-	fec_request_mii_intr(dev);
-#endif
 	/* The FEC Ethernet specific entries in the device structure */
 	dev->watchdog_timeo = TX_TIMEOUT;
 	dev->netdev_ops = &fec_netdev_ops;
-
-	for (i=0; i<NMII-1; i++)
-		mii_cmds[i].mii_next = &mii_cmds[i+1];
-	mii_free = mii_cmds;
-
-	/* Set MII speed to 2.5 MHz */
-	fep->phy_speed = ((((clk_get_rate(fep->clk) / 2 + 4999999)
-					/ 2500000) / 2) & 0x3F) << 1;
+	dev->ethtool_ops = &fec_enet_ethtool_ops;
 
 	/* Initialize the receive buffer descriptors. */
 	bdp = fep->rx_bd_base;
@@ -1760,13 +1130,6 @@ static int fec_enet_init(struct net_device *dev, int index)
 
 	fec_restart(dev, 0);
 
-	/* Queue up command to detect the PHY and initialize the
-	 * remainder of the interface.
-	 */
-	fep->phy_id_done = 0;
-	fep->phy_addr = 0;
-	mii_queue(dev, mk_mii_read(MII_REG_PHYIR1), mii_discover_phy);
-
 	return 0;
 }
 
@@ -1835,8 +1198,7 @@ fec_restart(struct net_device *dev, int duplex)
 	writel(0, fep->hwp + FEC_R_DES_ACTIVE);
 
 	/* Enable interrupts we wish to service */
-	writel(FEC_ENET_TXF | FEC_ENET_RXF | FEC_ENET_MII,
-			fep->hwp + FEC_IMASK);
+	writel(FEC_ENET_TXF | FEC_ENET_RXF, fep->hwp + FEC_IMASK);
 }
 
 static void
@@ -1859,7 +1221,6 @@ fec_stop(struct net_device *dev)
 	/* Clear outstanding MII command interrupts. */
 	writel(FEC_ENET_MII, fep->hwp + FEC_IEVENT);
 
-	writel(FEC_ENET_MII, fep->hwp + FEC_IMASK);
 	writel(fep->phy_speed, fep->hwp + FEC_MII_SPEED);
 }
 
@@ -1891,6 +1252,7 @@ fec_probe(struct platform_device *pdev)
 	memset(fep, 0, sizeof(*fep));
 
 	ndev->base_addr = (unsigned long)ioremap(r->start, resource_size(r));
+	fep->pdev = pdev;
 
 	if (!ndev->base_addr) {
 		ret = -ENOMEM;
@@ -1926,13 +1288,24 @@ fec_probe(struct platform_device *pdev)
 	if (ret)
 		goto failed_init;
 
+	ret = fec_enet_mii_init(pdev);
+	if (ret)
+		goto failed_mii_init;
+
 	ret = register_netdev(ndev);
 	if (ret)
 		goto failed_register;
 
+	printk(KERN_INFO "%s: Freescale FEC PHY driver [%s] "
+		"(mii_bus:phy_addr=%s, irq=%d)\n", ndev->name,
+		fep->phy_dev->drv->name, dev_name(&fep->phy_dev->dev),
+		fep->phy_dev->irq);
+
 	return 0;
 
 failed_register:
+	fec_enet_mii_remove(fep);
+failed_mii_init:
 failed_init:
 	clk_disable(fep->clk);
 	clk_put(fep->clk);
@@ -1959,6 +1332,7 @@ fec_drv_remove(struct platform_device *pdev)
 	platform_set_drvdata(pdev, NULL);
 
 	fec_stop(ndev);
+	fec_enet_mii_remove(fep);
 	clk_disable(fep->clk);
 	clk_put(fep->clk);
 	iounmap((void __iomem *)ndev->base_addr);
-- 
1.7.0.1

^ permalink raw reply related

* Re: [PATCH] r8169: Fix rtl8169_rx_interrupt()
From: Eric Dumazet @ 2010-03-31 12:08 UTC (permalink / raw)
  To: Sergey Senozhatsky; +Cc: Oleg Nesterov, David Miller, Francois Romieu, netdev
In-Reply-To: <20100325113038.GA3471@swordfish.minsk.epam.com>

Le jeudi 25 mars 2010 à 13:30 +0200, Sergey Senozhatsky a écrit :
> Hello,
> 
> On (03/16/10 19:48), Eric Dumazet wrote:
> [..]
> > OK thanks for the report, this rtl8169_reset_task() seems pretty buggy,
> > or multiple invocation...
> > 
> > Did you tried removing the rtl8169_schedule_work() call from
> > rtl8169_rx_interrupt() ?
> > 
> > Maybe the reset is not necessary at all in case of fifo overflow..
> > 
> > Cumulative patch :
> > 
> [..]
> 
> Eric, I think you can put Tested-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
> on the cumulative patch as I don't see any errors with pktgen over LAN testing.
> 

OK thanks, here is the patch then, but I prefer let the
rtl8169_schedule_work() from rtl8169_rx_interrupt() for now, its removal
might be the subject of another patch.

Lets be very conservative for now.

[PATCH net-next-2.6] r8169: Fix rtl8169_rx_interrupt()

In case a reset is performed, rtl8169_rx_interrupt() is called from
process context instead of softirq context. Special care must be taken
to call appropriate network core services (netif_rx() instead of
netif_receive_skb()). VLAN handling also corrected.

Reported-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Tested-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Diagnosed-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
---
 drivers/net/r8169.c |   22 +++++++++++++++++-----
 1 file changed, 17 insertions(+), 5 deletions(-)

diff --git a/drivers/net/r8169.c b/drivers/net/r8169.c
index 964305c..1dffe29 100644
--- a/drivers/net/r8169.c
+++ b/drivers/net/r8169.c
@@ -1054,14 +1054,14 @@ static void rtl8169_vlan_rx_register(struct net_device *dev,
 }
 
 static int rtl8169_rx_vlan_skb(struct rtl8169_private *tp, struct RxDesc *desc,
-			       struct sk_buff *skb)
+			       struct sk_buff *skb, int polling)
 {
 	u32 opts2 = le32_to_cpu(desc->opts2);
 	struct vlan_group *vlgrp = tp->vlgrp;
 	int ret;
 
 	if (vlgrp && (opts2 & RxVlanTag)) {
-		vlan_hwaccel_receive_skb(skb, vlgrp, swab16(opts2 & 0xffff));
+		__vlan_hwaccel_rx(skb, vlgrp, swab16(opts2 & 0xffff), polling);
 		ret = 0;
 	} else
 		ret = -1;
@@ -1078,7 +1078,7 @@ static inline u32 rtl8169_tx_vlan_tag(struct rtl8169_private *tp,
 }
 
 static int rtl8169_rx_vlan_skb(struct rtl8169_private *tp, struct RxDesc *desc,
-			       struct sk_buff *skb)
+			       struct sk_buff *skb, int polling)
 {
 	return -1;
 }
@@ -4467,12 +4467,20 @@ out:
 	return done;
 }
 
+/*
+ * Warning : rtl8169_rx_interrupt() might be called :
+ * 1) from NAPI (softirq) context
+ *	(polling = 1 : we should call netif_receive_skb())
+ * 2) from process context (rtl8169_reset_task())
+ *	(polling = 0 : we must call netif_rx() instead)
+ */		
 static int rtl8169_rx_interrupt(struct net_device *dev,
 				struct rtl8169_private *tp,
 				void __iomem *ioaddr, u32 budget)
 {
 	unsigned int cur_rx, rx_left;
 	unsigned int delta, count;
+	int polling = (budget != ~(u32)0) ? 1 : 0;
 
 	cur_rx = tp->cur_rx;
 	rx_left = NUM_RX_DESC + tp->dirty_rx - cur_rx;
@@ -4534,8 +4542,12 @@ static int rtl8169_rx_interrupt(struct net_device *dev,
 			skb_put(skb, pkt_size);
 			skb->protocol = eth_type_trans(skb, dev);
 
-			if (rtl8169_rx_vlan_skb(tp, desc, skb) < 0)
-				netif_receive_skb(skb);
+			if (rtl8169_rx_vlan_skb(tp, desc, skb, polling) < 0) {
+				if (likely(polling))
+					netif_receive_skb(skb);
+				else
+					netif_rx(skb);
+			}
 
 			dev->stats.rx_bytes += pkt_size;
 			dev->stats.rx_packets++;



^ permalink raw reply related

* Re: [PATCH v2] Add Mergeable RX buffer feature to vhost_net
From: Michael S. Tsirkin @ 2010-03-31 12:02 UTC (permalink / raw)
  To: David Stevens; +Cc: kvm, kvm-owner, netdev, rusty, virtualization
In-Reply-To: <OF35D5EAC9.45A7DC5F-ON882576F7.0006F45E-882576F7.0007AB62@us.ibm.com>

On Tue, Mar 30, 2010 at 07:23:48PM -0600, David Stevens wrote:
> This patch adds support for the Mergeable Receive Buffers feature to
> vhost_net.
> 
> Changes:
>         1) generalize descriptor allocation functions to allow multiple
>                 descriptors per packet
>         2) add socket "peek" to know datalen at buffer allocation time
>         3) change notification to enable a multi-buffer max packet, rather
>                 than the single-buffer run until completely empty
> 
> Changes from previous revision:
> 1) incorporate review comments from Michael Tsirkin
> 2) assume use of TUNSETVNETHDRSZ ioctl by qemu, which simplifies vnet 
> header
>         processing
> 3) fixed notification code to only affect the receive side
> 
> Signed-Off-By: David L Stevens <dlstevens@us.ibm.com>
> 
> [in-line for review, attached for applying w/o whitespace mangling]

attached patch seems to be whiespace damaged as well.
Does the origin pass checkpatch.pl for you?

> diff -ruNp net-next-p0/drivers/vhost/net.c net-next-p3/drivers/vhost/net.c
> --- net-next-p0/drivers/vhost/net.c     2010-03-22 12:04:38.000000000 
> -0700
> +++ net-next-p3/drivers/vhost/net.c     2010-03-30 12:50:57.000000000 
> -0700
> @@ -54,26 +54,6 @@ struct vhost_net {
>         enum vhost_net_poll_state tx_poll_state;
>  };
>  
> -/* Pop first len bytes from iovec. Return number of segments used. */
> -static int move_iovec_hdr(struct iovec *from, struct iovec *to,
> -                         size_t len, int iov_count)
> -{
> -       int seg = 0;
> -       size_t size;
> -       while (len && seg < iov_count) {
> -               size = min(from->iov_len, len);
> -               to->iov_base = from->iov_base;
> -               to->iov_len = size;
> -               from->iov_len -= size;
> -               from->iov_base += size;
> -               len -= size;
> -               ++from;
> -               ++to;
> -               ++seg;
> -       }
> -       return seg;
> -}
> -
>  /* Caller must have TX VQ lock */
>  static void tx_poll_stop(struct vhost_net *net)
>  {
> @@ -97,7 +77,8 @@ static void tx_poll_start(struct vhost_n
>  static void handle_tx(struct vhost_net *net)
>  {
>         struct vhost_virtqueue *vq = &net->dev.vqs[VHOST_NET_VQ_TX];
> -       unsigned head, out, in, s;
> +       unsigned out, in;
> +       struct iovec head;
>         struct msghdr msg = {
>                 .msg_name = NULL,
>                 .msg_namelen = 0,
> @@ -108,8 +89,8 @@ static void handle_tx(struct vhost_net *
>         };
>         size_t len, total_len = 0;
>         int err, wmem;
> -       size_t hdr_size;
>         struct socket *sock = rcu_dereference(vq->private_data);
> +
>         if (!sock)
>                 return;
>  
> @@ -127,22 +108,19 @@ static void handle_tx(struct vhost_net *
>  
>         if (wmem < sock->sk->sk_sndbuf / 2)
>                 tx_poll_stop(net);
> -       hdr_size = vq->hdr_size;
>  
>         for (;;) {
> -               head = vhost_get_vq_desc(&net->dev, vq, vq->iov,
> -                                        ARRAY_SIZE(vq->iov),
> -                                        &out, &in,
> -                                        NULL, NULL);
> +               head.iov_base = (void *)vhost_get_vq_desc(&net->dev, vq,
> +                       vq->iov, ARRAY_SIZE(vq->iov), &out, &in, NULL, 
> NULL);

I this casting confusing.
Is it really expensive to add an array of heads so that
we do not need to cast?


>                 /* Nothing new?  Wait for eventfd to tell us they 
> refilled. */
> -               if (head == vq->num) {
> +               if (head.iov_base == (void *)vq->num) {
>                         wmem = atomic_read(&sock->sk->sk_wmem_alloc);
>                         if (wmem >= sock->sk->sk_sndbuf * 3 / 4) {
>                                 tx_poll_start(net, sock);
>                                 set_bit(SOCK_ASYNC_NOSPACE, &sock->flags);
>                                 break;
>                         }
> -                       if (unlikely(vhost_enable_notify(vq))) {
> +                       if (unlikely(vhost_enable_notify(vq, 0))) {
>                                 vhost_disable_notify(vq);
>                                 continue;
>                         }
> @@ -154,27 +132,30 @@ static void handle_tx(struct vhost_net *
>                         break;
>                 }
>                 /* Skip header. TODO: support TSO. */
> -               s = move_iovec_hdr(vq->iov, vq->hdr, hdr_size, out);
>                 msg.msg_iovlen = out;
> -               len = iov_length(vq->iov, out);
> +               head.iov_len = len = iov_length(vq->iov, out);
> +
>                 /* Sanity check */
>                 if (!len) {
> -                       vq_err(vq, "Unexpected header len for TX: "
> -                              "%zd expected %zd\n",
> -                              iov_length(vq->hdr, s), hdr_size);
> +                       vq_err(vq, "Unexpected buffer len for TX: %zd ", 
> len);
>                         break;
>                 }
> -               /* TODO: Check specific error and bomb out unless ENOBUFS? 
> */
>                 err = sock->ops->sendmsg(NULL, sock, &msg, len);
>                 if (unlikely(err < 0)) {
> -                       vhost_discard_vq_desc(vq);
> -                       tx_poll_start(net, sock);
> +                       if (err == -EAGAIN) {
> +                               vhost_discard_vq_desc(vq, 1);
> +                               tx_poll_start(net, sock);
> +                       } else {
> +                               vq_err(vq, "sendmsg: errno %d\n", -err);
> +                               /* drop packet; do not discard/resend */
> +                               vhost_add_used_and_signal(&net->dev, vq, 
> &head,
> +                                                         1, 0);
> +                       }
>                         break;
>                 }
>                 if (err != len)
> -                       pr_err("Truncated TX packet: "
> -                              " len %d != %zd\n", err, len);
> -               vhost_add_used_and_signal(&net->dev, vq, head, 0);
> +                       pr_err("Truncated TX packet: len %d != %zd\n", 
> err, len);
> +               vhost_add_used_and_signal(&net->dev, vq, &head, 1, 0);
>                 total_len += len;
>                 if (unlikely(total_len >= VHOST_NET_WEIGHT)) {
>                         vhost_poll_queue(&vq->poll);
> @@ -186,12 +167,25 @@ static void handle_tx(struct vhost_net *
>         unuse_mm(net->dev.mm);
>  }
>  
> +static int vhost_head_len(struct sock *sk)
> +{
> +       struct sk_buff *head;
> +       int len = 0;
> +
> +       lock_sock(sk);
> +       head = skb_peek(&sk->sk_receive_queue);
> +       if (head)
> +               len = head->len;
> +       release_sock(sk);
> +       return len;
> +}
> +
>  /* Expects to be always run from workqueue - which acts as
>   * read-size critical section for our kind of RCU. */
>  static void handle_rx(struct vhost_net *net)
>  {
>         struct vhost_virtqueue *vq = &net->dev.vqs[VHOST_NET_VQ_RX];
> -       unsigned head, out, in, log, s;
> +       unsigned in, log;
>         struct vhost_log *vq_log;
>         struct msghdr msg = {
>                 .msg_name = NULL,
> @@ -202,34 +196,25 @@ static void handle_rx(struct vhost_net *
>                 .msg_flags = MSG_DONTWAIT,
>         };
>  
> -       struct virtio_net_hdr hdr = {
> -               .flags = 0,
> -               .gso_type = VIRTIO_NET_HDR_GSO_NONE
> -       };
> -
>         size_t len, total_len = 0;
> -       int err;
> -       size_t hdr_size;
> +       int err, headcount, datalen;
>         struct socket *sock = rcu_dereference(vq->private_data);
> +
>         if (!sock || skb_queue_empty(&sock->sk->sk_receive_queue))
>                 return;
>  
>         use_mm(net->dev.mm);
>         mutex_lock(&vq->mutex);
>         vhost_disable_notify(vq);
> -       hdr_size = vq->hdr_size;
>  
>         vq_log = unlikely(vhost_has_feature(&net->dev, VHOST_F_LOG_ALL)) ?
>                 vq->log : NULL;
>  
> -       for (;;) {
> -               head = vhost_get_vq_desc(&net->dev, vq, vq->iov,
> -                                        ARRAY_SIZE(vq->iov),
> -                                        &out, &in,
> -                                        vq_log, &log);
> +       while ((datalen = vhost_head_len(sock->sk))) {
> +               headcount = vhost_get_heads(vq, datalen, &in, vq_log, 
> &log);
>                 /* OK, now we need to know about added descriptors. */
> -               if (head == vq->num) {
> -                       if (unlikely(vhost_enable_notify(vq))) {
> +               if (!headcount) {
> +                       if (unlikely(vhost_enable_notify(vq, 1))) {
>                                 /* They have slipped one in as we were
>                                  * doing that: check again. */
>                                 vhost_disable_notify(vq);
> @@ -240,46 +225,42 @@ static void handle_rx(struct vhost_net *
>                         break;
>                 }
>                 /* We don't need to be notified again. */
> -               if (out) {
> -                       vq_err(vq, "Unexpected descriptor format for RX: "
> -                              "out %d, int %d\n",
> -                              out, in);
> -                       break;
> -               }
> -               /* Skip header. TODO: support TSO/mergeable rx buffers. */
> -               s = move_iovec_hdr(vq->iov, vq->hdr, hdr_size, in);
> +               if (vq->rxmaxheadcount < headcount)
> +                       vq->rxmaxheadcount = headcount;

This seems the only place where we set the rxmaxheadcount
value. Maybe it can be moved out of vhost.c to net.c?
If vhost library needs this it can get it as function
parameter.

> +               /* Skip header. TODO: support TSO. */

You seem to have removed the code that skips the header.
Won't this break non-GSO backends such as raw?

>                 msg.msg_iovlen = in;
>                 len = iov_length(vq->iov, in);
>                 /* Sanity check */
>                 if (!len) {
> -                       vq_err(vq, "Unexpected header len for RX: "
> -                              "%zd expected %zd\n",
> -                              iov_length(vq->hdr, s), hdr_size);
> +                       vq_err(vq, "Unexpected buffer len for RX: %zd\n", 
> len);
>                         break;
>                 }
>                 err = sock->ops->recvmsg(NULL, sock, &msg,
>                                          len, MSG_DONTWAIT | MSG_TRUNC);
> -               /* TODO: Check specific error and bomb out unless EAGAIN? 
> */

Do you think it's not a good idea?

>                 if (err < 0) {
> -                       vhost_discard_vq_desc(vq);
> +                       vhost_discard_vq_desc(vq, headcount);
>                         break;
>                 }

I think we should detect and discard truncated messages,
since len might not be reliable if userspace pulls
a packet from under us. Also, if new packet is
shorter than the old one, there's no truncation but
headcount is wrong.

So simplest fix IMO would be to compare err with expected len.
If there's a difference, we hit the race and so
we would discard the packet.


>                 /* TODO: Should check and handle checksum. */
> +               if (vhost_has_feature(&net->dev, VIRTIO_NET_F_MRG_RXBUF)) 
> {
> +                       struct virtio_net_hdr_mrg_rxbuf *vhdr =
> +                               (struct virtio_net_hdr_mrg_rxbuf *)
> +                               vq->iov[0].iov_base;
> +                       /* add num_bufs */
> +                       if (put_user(headcount, &vhdr->num_buffers)) {
> +                               vq_err(vq, "Failed to write num_buffers");
> +                               vhost_discard_vq_desc(vq, headcount);

Let's do memcpy_to_iovecend etc so that we do not assume layout.
This is also why we need move_iovec: sendmsg might modify the iovec.
It would also be nice not to corrupt memory and
get a reasonable error if buffer size
that we get is smaller than expected header size.

> +                               break;
> +                       }
> +               }
>                 if (err > len) {
>                         pr_err("Discarded truncated rx packet: "
>                                " len %d > %zd\n", err, len);
> -                       vhost_discard_vq_desc(vq);
> +                       vhost_discard_vq_desc(vq, headcount);
>                         continue;
>                 }
>                 len = err;
> -               err = memcpy_toiovec(vq->hdr, (unsigned char *)&hdr, 
> hdr_size);
> -               if (err) {
> -                       vq_err(vq, "Unable to write vnet_hdr at addr %p: 
> %d\n",
> -                              vq->iov->iov_base, err);
> -                       break;
> -               }
> -               len += hdr_size;

This seems to break non-GSO backends as well.

> -               vhost_add_used_and_signal(&net->dev, vq, head, len);
> + vhost_add_used_and_signal(&net->dev,vq,vq->heads,headcount,1);
>                 if (unlikely(vq_log))
>                         vhost_log_write(vq, vq_log, log, len);
>                 total_len += len;
> @@ -560,9 +541,6 @@ done:
>  
>  static int vhost_net_set_features(struct vhost_net *n, u64 features)
>  {
> -       size_t hdr_size = features & (1 << VHOST_NET_F_VIRTIO_NET_HDR) ?
> -               sizeof(struct virtio_net_hdr) : 0;
> -       int i;
>         mutex_lock(&n->dev.mutex);
>         if ((features & (1 << VHOST_F_LOG_ALL)) &&
>             !vhost_log_access_ok(&n->dev)) {
> @@ -571,11 +549,6 @@ static int vhost_net_set_features(struct
>         }
>         n->dev.acked_features = features;
>         smp_wmb();
> -       for (i = 0; i < VHOST_NET_VQ_MAX; ++i) {
> -               mutex_lock(&n->vqs[i].mutex);
> -               n->vqs[i].hdr_size = hdr_size;
> -               mutex_unlock(&n->vqs[i].mutex);

I expect the above is a leftover from the previous version
which calculated header size in kernel?

> -       }
>         vhost_net_flush(n);
>         mutex_unlock(&n->dev.mutex);
>         return 0;
> diff -ruNp net-next-p0/drivers/vhost/vhost.c 
> net-next-p3/drivers/vhost/vhost.c
> --- net-next-p0/drivers/vhost/vhost.c   2010-03-22 12:04:38.000000000 
> -0700
> +++ net-next-p3/drivers/vhost/vhost.c   2010-03-29 20:12:42.000000000 
> -0700
> @@ -113,7 +113,7 @@ static void vhost_vq_reset(struct vhost_
>         vq->used_flags = 0;
>         vq->log_used = false;
>         vq->log_addr = -1ull;
> -       vq->hdr_size = 0;
> +       vq->rxmaxheadcount = 0;
>         vq->private_data = NULL;
>         vq->log_base = NULL;
>         vq->error_ctx = NULL;
> @@ -410,6 +410,7 @@ static long vhost_set_vring(struct vhost
>                 vq->last_avail_idx = s.num;
>                 /* Forget the cached index value. */
>                 vq->avail_idx = vq->last_avail_idx;
> +               vq->rxmaxheadcount = 0;
>                 break;
>         case VHOST_GET_VRING_BASE:
>                 s.index = idx;
> @@ -856,6 +857,48 @@ static unsigned get_indirect(struct vhos
>         return 0;
>  }
>  
> +/* This is a multi-head version of vhost_get_vq_desc

How about:
version of vhost_get_vq_desc that returns multiple
descriptors

> + * @vq         - the relevant virtqueue
> + * datalen     - data length we'll be reading
> + * @iovcount   - returned count of io vectors we fill
> + * @log                - vhost log
> + * @log_num    - log offset

return value?
Also - why unsigned?

> + */
> +unsigned vhost_get_heads(struct vhost_virtqueue *vq, int datalen, int 

Would vhost_get_vq_desc_multiple be a better name?

> *iovcount,
> +       struct vhost_log *log, unsigned int *log_num)
> +{
> +       struct iovec *heads = vq->heads;

I think it's better to pass in heads array than take it from vq->heads.

> +       int out, in = 0;

Why is in initialized here?

> +       int seg = 0;            /* iov index */
> +       int hc = 0;             /* head count */
> +
> +       while (datalen > 0) {

Can't this simply call vhost_get_vq_desc in a loop somehow,
or use a common function that both this and vhost_get_vq_desc call?

> +               if (hc >= VHOST_NET_MAX_SG) {
> +                       vhost_discard_vq_desc(vq, hc);
> +                       return 0;
> +               }
> +               heads[hc].iov_base = (void *)vhost_get_vq_desc(vq->dev, 
> vq,
> +                       vq->iov+seg, ARRAY_SIZE(vq->iov)-seg, &out, &in, 
> log,
> +                       log_num);
> +               if (heads[hc].iov_base == (void *)vq->num) {
> +                       vhost_discard_vq_desc(vq, hc);
> +                       return 0;
> +               }
> +               if (out || in <= 0) {
> +                       vq_err(vq, "unexpected descriptor format for RX: "
> +                               "out %d, in %d\n", out, in);
> +                       vhost_discard_vq_desc(vq, hc);

goto err above might help simplify cleanup.

> +                       return 0;
> +               }
> +               heads[hc].iov_len = iov_length(vq->iov+seg, in);
> +               datalen -= heads[hc].iov_len;
> +               hc++;
> +               seg += in;
> +       }
> +       *iovcount = seg;
> +       return hc;
> +}
> +
>  /* This looks in the virtqueue and for the first available buffer, and 
> converts
>   * it to an iovec for convenient access.  Since descriptors consist of 
> some
>   * number of output then some number of input descriptors, it's actually 
> two
> @@ -981,31 +1024,36 @@ unsigned vhost_get_vq_desc(struct vhost_
>  }
>  
>  /* Reverse the effect of vhost_get_vq_desc. Useful for error handling. */
> -void vhost_discard_vq_desc(struct vhost_virtqueue *vq)
> +void vhost_discard_vq_desc(struct vhost_virtqueue *vq, int n)
>  {
> -       vq->last_avail_idx--;
> +       vq->last_avail_idx -= n;
>  }
>  
>  /* After we've used one of their buffers, we tell them about it.  We'll 
> then
>   * want to notify the guest, using eventfd. */
> -int vhost_add_used(struct vhost_virtqueue *vq, unsigned int head, int 
> len)
> +int vhost_add_used(struct vhost_virtqueue *vq, struct iovec *heads, int 
> count)

count is always 0 for send, right?
I think it is better to have two APIs here as well:
vhost_add_used
vhost_add_used_multiple
we can use functions to avoid code duplication.

>  {
> -       struct vring_used_elem *used;
> +       struct vring_used_elem *used = 0;

why is used initialized here?

> +       int i;
>  
> -       /* The virtqueue contains a ring of used buffers.  Get a pointer 
> to the
> -        * next entry in that used ring. */
> -       used = &vq->used->ring[vq->last_used_idx % vq->num];
> -       if (put_user(head, &used->id)) {
> -               vq_err(vq, "Failed to write used id");
> -               return -EFAULT;
> -       }
> -       if (put_user(len, &used->len)) {
> -               vq_err(vq, "Failed to write used len");
> -               return -EFAULT;
> +       if (count <= 0)
> +               return -EINVAL;
> +
> +       for (i = 0; i < count; ++i) {
> +               used = &vq->used->ring[vq->last_used_idx % vq->num];
> +               if (put_user((unsigned)heads[i].iov_base, &used->id)) {
> +                       vq_err(vq, "Failed to write used id");
> +                       return -EFAULT;
> +               }
> +               if (put_user(heads[i].iov_len, &used->len)) {
> +                       vq_err(vq, "Failed to write used len");
> +                       return -EFAULT;
> +               }
> +               vq->last_used_idx++;

I think we should update last_used_idx on success only,
at the end. Simply use last_used_idx + count
instead of last_used_idx + 1.

>         }
>         /* Make sure buffer is written before we update index. */
>         smp_wmb();
> -       if (put_user(vq->last_used_idx + 1, &vq->used->idx)) {
> +       if (put_user(vq->last_used_idx, &vq->used->idx)) {
>                 vq_err(vq, "Failed to increment used idx");
>                 return -EFAULT;
>         }
> @@ -1023,22 +1071,35 @@ int vhost_add_used(struct vhost_virtqueu
>                 if (vq->log_ctx)
>                         eventfd_signal(vq->log_ctx, 1);
>         }
> -       vq->last_used_idx++;
>         return 0;
>  }
>  
> +int vhost_available(struct vhost_virtqueue *vq)

since this function is non-static, please document what it does.

> +{
> +       int avail;
> +
> +       if (!vq->rxmaxheadcount)        /* haven't got any yet */
> +               return 1;

since seems to make net - specific asumptions.
How about moving this check out to net.c?

> +       avail = vq->avail_idx - vq->last_avail_idx;
> +       if (avail < 0)
> +               avail += 0x10000; /* wrapped */

A branch that is likely non-taken. Also, rxmaxheadcount
is really unlikely to get as large as half the ring.
So this just wastes cycles?

> +       return avail;
> +}
> +
>  /* This actually signals the guest, using eventfd. */
> -void vhost_signal(struct vhost_dev *dev, struct vhost_virtqueue *vq)
> +void vhost_signal(struct vhost_dev *dev, struct vhost_virtqueue *vq, bool 
> recvr)

This one is not elegant. receive is net.c concept, let's not
put it in vhost.c. Pass in headcount if you must.

>  {
>         __u16 flags = 0;
> +
>         if (get_user(flags, &vq->avail->flags)) {
>                 vq_err(vq, "Failed to get flags");
>                 return;
>         }
>  
> -       /* If they don't want an interrupt, don't signal, unless empty. */
> +       /* If they don't want an interrupt, don't signal, unless
> +        * empty or receiver can't get a max-sized packet. */
>         if ((flags & VRING_AVAIL_F_NO_INTERRUPT) &&
> -           (vq->avail_idx != vq->last_avail_idx ||
> +           (!recvr || vhost_available(vq) >= vq->rxmaxheadcount ||

Is the above really worth the complexity?
Guests can't rely on this kind of fuzzy logic, can they?
Do you see this helping performance at all?

>              !vhost_has_feature(dev, VIRTIO_F_NOTIFY_ON_EMPTY)))
>                 return;
>  
> @@ -1050,14 +1111,14 @@ void vhost_signal(struct vhost_dev *dev,
>  /* And here's the combo meal deal.  Supersize me! */
>  void vhost_add_used_and_signal(struct vhost_dev *dev,
>                                struct vhost_virtqueue *vq,
> -                              unsigned int head, int len)
> +                              struct iovec *heads, int count, bool recvr)
>  {
> -       vhost_add_used(vq, head, len);
> -       vhost_signal(dev, vq);
> +       vhost_add_used(vq, heads, count);
> +       vhost_signal(dev, vq, recvr);
>  }
>  
>  /* OK, now we need to know about added descriptors. */
> -bool vhost_enable_notify(struct vhost_virtqueue *vq)
> +bool vhost_enable_notify(struct vhost_virtqueue *vq, bool recvr)
>  {
>         u16 avail_idx;
>         int r;
> @@ -1080,6 +1141,8 @@ bool vhost_enable_notify(struct vhost_vi
>                 return false;
>         }
>  
> +       if (recvr && vq->rxmaxheadcount)
> +               return (avail_idx - vq->last_avail_idx) >= 
> vq->rxmaxheadcount;

The fuzzy logic behind rxmaxheadcount might be a good
optimization, but I am not comfortable using it
for correctness. Maybe vhost_enable_notify should get
the last head so we can redo poll when another one
is added.

>         return avail_idx != vq->last_avail_idx;
>  }
>  
> diff -ruNp net-next-p0/drivers/vhost/vhost.h 
> net-next-p3/drivers/vhost/vhost.h
> --- net-next-p0/drivers/vhost/vhost.h   2010-03-22 12:04:38.000000000 
> -0700
> +++ net-next-p3/drivers/vhost/vhost.h   2010-03-29 20:07:17.000000000 
> -0700
> @@ -82,9 +82,9 @@ struct vhost_virtqueue {
>         u64 log_addr;
>  
>         struct iovec indirect[VHOST_NET_MAX_SG];
> -       struct iovec iov[VHOST_NET_MAX_SG];
> -       struct iovec hdr[VHOST_NET_MAX_SG];
> -       size_t hdr_size;
> +       struct iovec iov[VHOST_NET_MAX_SG+1]; /* an extra for vnet hdr */

VHOST_NET_MAX_SG should already include iovec vnet hdr.

> +       struct iovec heads[VHOST_NET_MAX_SG];
> +       int rxmaxheadcount;
>         /* We use a kind of RCU to access private pointer.
>          * All readers access it from workqueue, which makes it possible 
> to
>          * flush the workqueue instead of synchronize_rcu. Therefore 
> readers do
> @@ -120,18 +120,20 @@ long vhost_dev_ioctl(struct vhost_dev *,
>  int vhost_vq_access_ok(struct vhost_virtqueue *vq);
>  int vhost_log_access_ok(struct vhost_dev *);
>  
> +unsigned vhost_get_heads(struct vhost_virtqueue *, int datalen, int 
> *iovcount,
> +                       struct vhost_log *log, unsigned int *log_num);
>  unsigned vhost_get_vq_desc(struct vhost_dev *, struct vhost_virtqueue *,
>                            struct iovec iov[], unsigned int iov_count,
>                            unsigned int *out_num, unsigned int *in_num,
>                            struct vhost_log *log, unsigned int *log_num);
> -void vhost_discard_vq_desc(struct vhost_virtqueue *);
> +void vhost_discard_vq_desc(struct vhost_virtqueue *, int);
>  
> -int vhost_add_used(struct vhost_virtqueue *, unsigned int head, int len);
> -void vhost_signal(struct vhost_dev *, struct vhost_virtqueue *);
> +int vhost_add_used(struct vhost_virtqueue *, struct iovec *heads, int 
> count);
>  void vhost_add_used_and_signal(struct vhost_dev *, struct vhost_virtqueue 
> *,
> -                              unsigned int head, int len);
> +                              struct iovec *heads, int count, bool 
> recvr);
> +void vhost_signal(struct vhost_dev *, struct vhost_virtqueue *, bool);
>  void vhost_disable_notify(struct vhost_virtqueue *);
> -bool vhost_enable_notify(struct vhost_virtqueue *);
> +bool vhost_enable_notify(struct vhost_virtqueue *, bool);
>  
>  int vhost_log_write(struct vhost_virtqueue *vq, struct vhost_log *log,
>                     unsigned int log_num, u64 len);
> @@ -149,7 +151,8 @@ enum {
>         VHOST_FEATURES = (1 << VIRTIO_F_NOTIFY_ON_EMPTY) |
>                          (1 << VIRTIO_RING_F_INDIRECT_DESC) |
>                          (1 << VHOST_F_LOG_ALL) |
> -                        (1 << VHOST_NET_F_VIRTIO_NET_HDR),
> +                        (1 << VHOST_NET_F_VIRTIO_NET_HDR) |
> +                        (1 << VIRTIO_NET_F_MRG_RXBUF),
>  };
>  
>  static inline int vhost_has_feature(struct vhost_dev *dev, int bit)
> 



^ permalink raw reply

* [PATCH NEXT 5/8] qlcnic: use IDC defined timeout value
From: Amit Kumar Salecha @ 2010-03-31 12:03 UTC (permalink / raw)
  To: davem; +Cc: netdev, ameen.rahman, Sucheta Chakraborty
In-Reply-To: <1270037026-9062-1-git-send-email-amit.salecha@qlogic.com>

From: Sucheta Chakraborty <sucheta.chakraborty@qlogic.com>

o USE/Read IDC defined timeout value from ROM.
o While resetting chip, don't wait for other pci-func to respond,
  more than reset_ack_timeo seconds,

Signed-off-by: Sucheta Chakraborty <sucheta.chakraborty@qlogic.com>
Signed-off-by: Amit Kumar Salecha <amit.salecha@qlogic.com>
---
 drivers/net/qlcnic/qlcnic.h      |    4 +++-
 drivers/net/qlcnic/qlcnic_hdr.h  |    2 ++
 drivers/net/qlcnic/qlcnic_init.c |   16 ++++++++++++++++
 drivers/net/qlcnic/qlcnic_main.c |   26 ++++++++++++++++----------
 4 files changed, 37 insertions(+), 11 deletions(-)

diff --git a/drivers/net/qlcnic/qlcnic.h b/drivers/net/qlcnic/qlcnic.h
index 8a3446d..87cd1a7 100644
--- a/drivers/net/qlcnic/qlcnic.h
+++ b/drivers/net/qlcnic/qlcnic.h
@@ -958,8 +958,9 @@ struct qlcnic_adapter {
 	u8 dev_state;
 	u8 diag_test;
 	u8 diag_cnt;
+	u8 reset_ack_timeo;
+	u8 dev_init_timeo;
 	u8 rsrd1;
-	u16 rsrd2;
 
 	u8 mac_addr[ETH_ALEN];
 
@@ -1040,6 +1041,7 @@ int qlcnic_need_fw_reset(struct qlcnic_adapter *adapter);
 void qlcnic_request_firmware(struct qlcnic_adapter *adapter);
 void qlcnic_release_firmware(struct qlcnic_adapter *adapter);
 int qlcnic_pinit_from_rom(struct qlcnic_adapter *adapter);
+void qlcnic_setup_idc_param(struct qlcnic_adapter *adapter);
 
 int qlcnic_rom_fast_read(struct qlcnic_adapter *adapter, int addr, int *valp);
 int qlcnic_rom_fast_read_words(struct qlcnic_adapter *adapter, int addr,
diff --git a/drivers/net/qlcnic/qlcnic_hdr.h b/drivers/net/qlcnic/qlcnic_hdr.h
index e9fb692..51fa3fb 100644
--- a/drivers/net/qlcnic/qlcnic_hdr.h
+++ b/drivers/net/qlcnic/qlcnic_hdr.h
@@ -695,6 +695,8 @@ enum {
 #define QLCNIC_CRB_DRV_SCRATCH             (QLCNIC_CAM_RAM(0x148))
 #define QLCNIC_CRB_DEV_PARTITION_INFO      (QLCNIC_CAM_RAM(0x14c))
 #define QLCNIC_CRB_DRV_IDC_VER             (QLCNIC_CAM_RAM(0x14c))
+#define QLCNIC_ROM_DEV_INIT_TIMEOUT	(0x3e885c)
+#define QLCNIC_ROM_DRV_RESET_TIMEOUT	(0x3e8860)
 
 		 /* Device State */
 #define QLCNIC_DEV_COLD 		1
diff --git a/drivers/net/qlcnic/qlcnic_init.c b/drivers/net/qlcnic/qlcnic_init.c
index 0a424e0..ccd24f4 100644
--- a/drivers/net/qlcnic/qlcnic_init.c
+++ b/drivers/net/qlcnic/qlcnic_init.c
@@ -529,6 +529,22 @@ int qlcnic_pinit_from_rom(struct qlcnic_adapter *adapter)
 	return 0;
 }
 
+void
+qlcnic_setup_idc_param(struct qlcnic_adapter *adapter) {
+
+	int timeo;
+
+	if (qlcnic_rom_fast_read(adapter, QLCNIC_ROM_DEV_INIT_TIMEOUT, &timeo))
+		timeo = 30;
+
+	adapter->dev_init_timeo = timeo;
+
+	if (qlcnic_rom_fast_read(adapter, QLCNIC_ROM_DRV_RESET_TIMEOUT, &timeo))
+		timeo = 10;
+
+	adapter->reset_ack_timeo = timeo;
+}
+
 static int
 qlcnic_has_mn(struct qlcnic_adapter *adapter)
 {
diff --git a/drivers/net/qlcnic/qlcnic_main.c b/drivers/net/qlcnic/qlcnic_main.c
index a234622..38e0829 100644
--- a/drivers/net/qlcnic/qlcnic_main.c
+++ b/drivers/net/qlcnic/qlcnic_main.c
@@ -649,7 +649,10 @@ qlcnic_start_firmware(struct qlcnic_adapter *adapter)
 	if (err)
 		return err;
 
-	if (!qlcnic_can_start_firmware(adapter))
+	err = qlcnic_can_start_firmware(adapter);
+	if (err < 0)
+		return err;
+	else if (!err)
 		goto wait_init;
 
 	first_boot = QLCRD32(adapter, QLCNIC_CAM_RAM(0x1fc));
@@ -1138,6 +1141,7 @@ qlcnic_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 		goto err_out_iounmap;
 	}
 
+	qlcnic_setup_idc_param(adapter);
 
 	err = qlcnic_start_firmware(adapter);
 	if (err)
@@ -2027,7 +2031,7 @@ static int
 qlcnic_can_start_firmware(struct qlcnic_adapter *adapter)
 {
 	u32 val, prev_state;
-	int cnt = 0;
+	u8 dev_init_timeo = adapter->dev_init_timeo;
 	int portnum = adapter->portnum;
 
 	if (qlcnic_api_lock(adapter))
@@ -2072,12 +2076,13 @@ start_fw:
 	}
 
 	qlcnic_api_unlock(adapter);
-	msleep(1000);
-	while ((QLCRD32(adapter, QLCNIC_CRB_DEV_STATE) != QLCNIC_DEV_READY) &&
-			++cnt < 20)
+
+	do {
 		msleep(1000);
+	} while ((QLCRD32(adapter, QLCNIC_CRB_DEV_STATE) != QLCNIC_DEV_READY)
+			&& --dev_init_timeo);
 
-	if (cnt >= 20)
+	if (!dev_init_timeo)
 		return -1;
 
 	if (qlcnic_api_lock(adapter))
@@ -2099,12 +2104,10 @@ qlcnic_fwinit_work(struct work_struct *work)
 			struct qlcnic_adapter, fw_work.work);
 	int dev_state;
 
-	if (++adapter->fw_wait_cnt > FW_POLL_THRESH)
-		goto err_ret;
-
 	if (test_bit(__QLCNIC_START_FW, &adapter->state)) {
 
-		if (qlcnic_check_drv_state(adapter)) {
+		if (qlcnic_check_drv_state(adapter) &&
+			(adapter->fw_wait_cnt++ < adapter->reset_ack_timeo)) {
 			qlcnic_schedule_work(adapter,
 					qlcnic_fwinit_work, FW_POLL_DELAY);
 			return;
@@ -2118,6 +2121,9 @@ qlcnic_fwinit_work(struct work_struct *work)
 		goto err_ret;
 	}
 
+	if (adapter->fw_wait_cnt++ > (adapter->dev_init_timeo / 2))
+		goto err_ret;
+
 	dev_state = QLCRD32(adapter, QLCNIC_CRB_DEV_STATE);
 	switch (dev_state) {
 	case QLCNIC_DEV_READY:
-- 
1.6.0.2


^ permalink raw reply related

* [PATCH NEXT 6/8] qlcnic: add driver debug support
From: Amit Kumar Salecha @ 2010-03-31 12:03 UTC (permalink / raw)
  To: davem; +Cc: netdev, ameen.rahman
In-Reply-To: <1270037026-9062-1-git-send-email-amit.salecha@qlogic.com>

Signed-off-by: Amit Kumar Salecha <amit.salecha@qlogic.com>
---
 drivers/net/qlcnic/qlcnic.h         |    9 ++++++++-
 drivers/net/qlcnic/qlcnic_ethtool.c |   16 ++++++++++++++++
 drivers/net/qlcnic/qlcnic_hw.c      |    6 +++++-
 drivers/net/qlcnic/qlcnic_main.c    |   23 ++++++++++++++++++++---
 4 files changed, 49 insertions(+), 5 deletions(-)

diff --git a/drivers/net/qlcnic/qlcnic.h b/drivers/net/qlcnic/qlcnic.h
index 87cd1a7..dbf6335 100644
--- a/drivers/net/qlcnic/qlcnic.h
+++ b/drivers/net/qlcnic/qlcnic.h
@@ -960,7 +960,7 @@ struct qlcnic_adapter {
 	u8 diag_cnt;
 	u8 reset_ack_timeo;
 	u8 dev_init_timeo;
-	u8 rsrd1;
+	u8 msglvl;
 
 	u8 mac_addr[ETH_ALEN];
 
@@ -1135,4 +1135,11 @@ static inline u32 qlcnic_tx_avail(struct qlcnic_host_tx_ring *tx_ring)
 
 extern const struct ethtool_ops qlcnic_ethtool_ops;
 
+#define QLCDB(adapter, _fmt, _args...) do { \
+	if (adapter->msglvl) \
+		printk(KERN_INFO "%s: %s: " _fmt,		\
+			dev_name(&adapter->pdev->dev),\
+			__func__, ##_args);		\
+	} while (0)
+
 #endif				/* __QLCNIC_H_ */
diff --git a/drivers/net/qlcnic/qlcnic_ethtool.c b/drivers/net/qlcnic/qlcnic_ethtool.c
index f83e15f..1cf3e59 100644
--- a/drivers/net/qlcnic/qlcnic_ethtool.c
+++ b/drivers/net/qlcnic/qlcnic_ethtool.c
@@ -998,6 +998,20 @@ static int qlcnic_set_flags(struct net_device *netdev, u32 data)
 	return 0;
 }
 
+static u32 qlcnic_get_msglevel(struct net_device *netdev)
+{
+	struct qlcnic_adapter *adapter = netdev_priv(netdev);
+
+	return adapter->msglvl;
+}
+
+static void qlcnic_set_msglevel(struct net_device *netdev, u32 msglvl)
+{
+	struct qlcnic_adapter *adapter = netdev_priv(netdev);
+
+	adapter->msglvl = !!msglvl;
+}
+
 const struct ethtool_ops qlcnic_ethtool_ops = {
 	.get_settings = qlcnic_get_settings,
 	.set_settings = qlcnic_set_settings,
@@ -1029,4 +1043,6 @@ const struct ethtool_ops qlcnic_ethtool_ops = {
 	.get_flags = ethtool_op_get_flags,
 	.set_flags = qlcnic_set_flags,
 	.phys_id = qlcnic_blink_led,
+	.set_msglevel = qlcnic_set_msglevel,
+	.get_msglevel = qlcnic_get_msglevel,
 };
diff --git a/drivers/net/qlcnic/qlcnic_hw.c b/drivers/net/qlcnic/qlcnic_hw.c
index 0933c2d..14c999a 100644
--- a/drivers/net/qlcnic/qlcnic_hw.c
+++ b/drivers/net/qlcnic/qlcnic_hw.c
@@ -294,8 +294,12 @@ qlcnic_pcie_sem_lock(struct qlcnic_adapter *adapter, int sem, u32 id_reg)
 		done = QLCRD32(adapter, QLCNIC_PCIE_REG(PCIE_SEM_LOCK(sem)));
 		if (done == 1)
 			break;
-		if (++timeout >= QLCNIC_PCIE_SEM_TIMEOUT)
+		if (++timeout >= QLCNIC_PCIE_SEM_TIMEOUT) {
+			dev_err(&adapter->pdev->dev,
+				"Failed to acquire sem=%d lock;reg_id=%d\n",
+				sem, id_reg);
 			return -EIO;
+		}
 		msleep(1);
 	}
 
diff --git a/drivers/net/qlcnic/qlcnic_main.c b/drivers/net/qlcnic/qlcnic_main.c
index 38e0829..87511f1 100644
--- a/drivers/net/qlcnic/qlcnic_main.c
+++ b/drivers/net/qlcnic/qlcnic_main.c
@@ -1742,6 +1742,7 @@ static void qlcnic_tx_timeout_task(struct work_struct *work)
 request_reset:
 	adapter->need_fw_reset = 1;
 	clear_bit(__QLCNIC_RESETTING, &adapter->state);
+	QLCDB(adapter, "Resetting adapter\n");
 }
 
 static struct net_device_stats *qlcnic_get_stats(struct net_device *netdev)
@@ -2046,6 +2047,7 @@ qlcnic_can_start_firmware(struct qlcnic_adapter *adapter)
 	}
 
 	prev_state = QLCRD32(adapter, QLCNIC_CRB_DEV_STATE);
+	QLCDB(adapter, "Device state = %u\n", prev_state);
 
 	switch (prev_state) {
 	case QLCNIC_DEV_COLD:
@@ -2082,8 +2084,11 @@ start_fw:
 	} while ((QLCRD32(adapter, QLCNIC_CRB_DEV_STATE) != QLCNIC_DEV_READY)
 			&& --dev_init_timeo);
 
-	if (!dev_init_timeo)
+	if (!dev_init_timeo) {
+		dev_err(&adapter->pdev->dev,
+			"Waiting for device to initialize timeout\n");
 		return -1;
+	}
 
 	if (qlcnic_api_lock(adapter))
 		return -1;
@@ -2113,6 +2118,7 @@ qlcnic_fwinit_work(struct work_struct *work)
 			return;
 		}
 
+		QLCDB(adapter, "Resetting FW\n");
 		if (!qlcnic_start_firmware(adapter)) {
 			qlcnic_schedule_work(adapter, qlcnic_attach_work, 0);
 			return;
@@ -2121,10 +2127,15 @@ qlcnic_fwinit_work(struct work_struct *work)
 		goto err_ret;
 	}
 
-	if (adapter->fw_wait_cnt++ > (adapter->dev_init_timeo / 2))
+	if (adapter->fw_wait_cnt++ > (adapter->dev_init_timeo / 2)) {
+		dev_err(&adapter->pdev->dev,
+				"Waiting for device to reset timeout\n");
 		goto err_ret;
+	}
 
 	dev_state = QLCRD32(adapter, QLCNIC_CRB_DEV_STATE);
+	QLCDB(adapter, "Func waiting: Device state=%d\n", dev_state);
+
 	switch (dev_state) {
 	case QLCNIC_DEV_READY:
 		if (!qlcnic_start_firmware(adapter)) {
@@ -2177,6 +2188,8 @@ qlcnic_detach_work(struct work_struct *work)
 	return;
 
 err_ret:
+	dev_err(&adapter->pdev->dev, "detach failed; status=%d temp=%d\n",
+			status, adapter->temp);
 	qlcnic_clr_all_drv_state(adapter);
 
 }
@@ -2194,6 +2207,7 @@ qlcnic_dev_request_reset(struct qlcnic_adapter *adapter)
 	if (state != QLCNIC_DEV_INITALIZING && state != QLCNIC_DEV_NEED_RESET) {
 		QLCWR32(adapter, QLCNIC_CRB_DEV_STATE, QLCNIC_DEV_NEED_RESET);
 		set_bit(__QLCNIC_START_FW, &adapter->state);
+		QLCDB(adapter, "NEED_RESET state set\n");
 	}
 
 	qlcnic_api_unlock(adapter);
@@ -2290,8 +2304,11 @@ detach:
 		QLCNIC_DEV_NEED_RESET;
 
 	if ((auto_fw_reset == AUTO_FW_RESET_ENABLED) &&
-			!test_and_set_bit(__QLCNIC_RESETTING, &adapter->state))
+		!test_and_set_bit(__QLCNIC_RESETTING, &adapter->state)) {
+
 		qlcnic_schedule_work(adapter, qlcnic_detach_work, 0);
+		QLCDB(adapter, "fw recovery scheduled.\n");
+	}
 
 	return 1;
 }
-- 
1.6.0.2


^ permalink raw reply related

* [PATCH NEXT 8/8] qlcnic: update version to 5.0.1
From: Amit Kumar Salecha @ 2010-03-31 12:03 UTC (permalink / raw)
  To: davem; +Cc: netdev, ameen.rahman
In-Reply-To: <1270037026-9062-1-git-send-email-amit.salecha@qlogic.com>

Signed-off-by: Amit Kumar Salecha <amit.salecha@qlogic.com>
---
 drivers/net/qlcnic/qlcnic.h |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/qlcnic/qlcnic.h b/drivers/net/qlcnic/qlcnic.h
index dbf6335..b726db2 100644
--- a/drivers/net/qlcnic/qlcnic.h
+++ b/drivers/net/qlcnic/qlcnic.h
@@ -51,8 +51,8 @@
 
 #define _QLCNIC_LINUX_MAJOR 5
 #define _QLCNIC_LINUX_MINOR 0
-#define _QLCNIC_LINUX_SUBVERSION 0
-#define QLCNIC_LINUX_VERSIONID  "5.0.0"
+#define _QLCNIC_LINUX_SUBVERSION 1
+#define QLCNIC_LINUX_VERSIONID  "5.0.1"
 
 #define QLCNIC_VERSION_CODE(a, b, c)	(((a) << 24) + ((b) << 16) + (c))
 #define _major(v)	(((v) >> 24) & 0xff)
-- 
1.6.0.2


^ permalink raw reply related

* [PATCH NEXT 7/8] qlcnic: fix interface attach sequence
From: Amit Kumar Salecha @ 2010-03-31 12:03 UTC (permalink / raw)
  To: davem; +Cc: netdev, ameen.rahman
In-Reply-To: <1270037026-9062-1-git-send-email-amit.salecha@qlogic.com>

Interface should be visible even if resource allocation fails.
netif_device_attach should be called for every netif_device_detach.

Signed-off-by: Amit Kumar Salecha <amit.salecha@qlogic.com>
---
 drivers/net/qlcnic/qlcnic_main.c |   20 ++++++++++----------
 1 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/drivers/net/qlcnic/qlcnic_main.c b/drivers/net/qlcnic/qlcnic_main.c
index 87511f1..6faabf4 100644
--- a/drivers/net/qlcnic/qlcnic_main.c
+++ b/drivers/net/qlcnic/qlcnic_main.c
@@ -952,11 +952,11 @@ void qlcnic_diag_free_res(struct net_device *netdev, int max_sds_rings)
 	adapter->max_sds_rings = max_sds_rings;
 
 	if (qlcnic_attach(adapter))
-		return;
+		goto out;
 
 	if (netif_running(netdev))
 		__qlcnic_up(adapter, netdev);
-
+out:
 	netif_device_attach(netdev);
 }
 
@@ -978,8 +978,10 @@ int qlcnic_diag_alloc_res(struct net_device *netdev, int test)
 	adapter->diag_test = test;
 
 	ret = qlcnic_attach(adapter);
-	if (ret)
+	if (ret) {
+		netif_device_attach(netdev);
 		return ret;
+	}
 
 	if (adapter->diag_test == QLCNIC_INTERRUPT_TEST) {
 		for (ring = 0; ring < adapter->max_sds_rings; ring++) {
@@ -1012,16 +1014,12 @@ qlcnic_reset_context(struct qlcnic_adapter *adapter)
 		if (netif_running(netdev)) {
 			err = qlcnic_attach(adapter);
 			if (!err)
-				err = __qlcnic_up(adapter, netdev);
-
-			if (err)
-				goto done;
+				__qlcnic_up(adapter, netdev);
 		}
 
 		netif_device_attach(netdev);
 	}
 
-done:
 	clear_bit(__QLCNIC_RESETTING, &adapter->state);
 	return err;
 }
@@ -1337,6 +1335,7 @@ err_out_detach:
 	qlcnic_detach(adapter);
 err_out:
 	qlcnic_clr_all_drv_state(adapter);
+	netif_device_attach(netdev);
 	return err;
 }
 #endif
@@ -2152,6 +2151,7 @@ qlcnic_fwinit_work(struct work_struct *work)
 	}
 
 err_ret:
+	netif_device_attach(adapter->netdev);
 	qlcnic_clr_all_drv_state(adapter);
 }
 
@@ -2190,6 +2190,7 @@ qlcnic_detach_work(struct work_struct *work)
 err_ret:
 	dev_err(&adapter->pdev->dev, "detach failed; status=%d temp=%d\n",
 			status, adapter->temp);
+	netif_device_attach(netdev);
 	qlcnic_clr_all_drv_state(adapter);
 
 }
@@ -2252,9 +2253,8 @@ qlcnic_attach_work(struct work_struct *work)
 		qlcnic_config_indev_addr(netdev, NETDEV_UP);
 	}
 
-	netif_device_attach(netdev);
-
 done:
+	netif_device_attach(netdev);
 	adapter->fw_fail_cnt = 0;
 	clear_bit(__QLCNIC_RESETTING, &adapter->state);
 
-- 
1.6.0.2


^ permalink raw reply related

* [PATCH NEXT 2/8] qlcnic: handle queue manager access
From: Amit Kumar Salecha @ 2010-03-31 12:03 UTC (permalink / raw)
  To: davem; +Cc: netdev, ameen.rahman, Dhananjay Phadke
In-Reply-To: <1270037026-9062-1-git-send-email-amit.salecha@qlogic.com>

From: Dhananjay Phadke <dhananjay.phadke@qlogic.com>

Check the access by tools for hardware queue engine and handle it
separately than other block registers, otherwise incorrect data
is returned.

Signed-off-by: Dhananjay Phadke <dhananjay.phadke@qlogic.com>
Signed-off-by: Amit Kumar Salecha <amit.salecha@qlogic.com>
---
 drivers/net/qlcnic/qlcnic.h      |    5 +++++
 drivers/net/qlcnic/qlcnic_hdr.h  |    3 ++-
 drivers/net/qlcnic/qlcnic_hw.c   |   25 ++++++++++++++++++++++---
 drivers/net/qlcnic/qlcnic_main.c |   35 +++++++++++++++++++++++++++--------
 4 files changed, 56 insertions(+), 12 deletions(-)

diff --git a/drivers/net/qlcnic/qlcnic.h b/drivers/net/qlcnic/qlcnic.h
index 0da94b2..8a3446d 100644
--- a/drivers/net/qlcnic/qlcnic.h
+++ b/drivers/net/qlcnic/qlcnic.h
@@ -994,6 +994,11 @@ u32 qlcnic_hw_read_wx_2M(struct qlcnic_adapter *adapter, ulong off);
 int qlcnic_hw_write_wx_2M(struct qlcnic_adapter *, ulong off, u32 data);
 int qlcnic_pci_mem_write_2M(struct qlcnic_adapter *, u64 off, u64 data);
 int qlcnic_pci_mem_read_2M(struct qlcnic_adapter *, u64 off, u64 *data);
+void qlcnic_pci_camqm_read_2M(struct qlcnic_adapter *, u64, u64 *);
+void qlcnic_pci_camqm_write_2M(struct qlcnic_adapter *, u64, u64);
+
+#define ADDR_IN_RANGE(addr, low, high)	\
+	(((addr) < (high)) && ((addr) >= (low)))
 
 #define QLCRD32(adapter, off) \
 	(qlcnic_hw_read_wx_2M(adapter, off))
diff --git a/drivers/net/qlcnic/qlcnic_hdr.h b/drivers/net/qlcnic/qlcnic_hdr.h
index 0469f84..25465a9 100644
--- a/drivers/net/qlcnic/qlcnic_hdr.h
+++ b/drivers/net/qlcnic/qlcnic_hdr.h
@@ -435,9 +435,10 @@ enum {
 #define QLCNIC_PCI_MS_2M	(0x80000)
 #define QLCNIC_PCI_OCM0_2M	(0x000c0000UL)
 #define QLCNIC_PCI_CRBSPACE	(0x06000000UL)
+#define QLCNIC_PCI_CAMQM	(0x04800000UL)
+#define QLCNIC_PCI_CAMQM_END	(0x04800800UL)
 #define QLCNIC_PCI_2MB_SIZE	(0x00200000UL)
 #define QLCNIC_PCI_CAMQM_2M_BASE	(0x000ff800UL)
-#define QLCNIC_PCI_CAMQM_2M_END 	(0x04800800UL)
 
 #define QLCNIC_CRB_CAM	QLCNIC_PCI_CRB_WINDOW(QLCNIC_HW_PX_MAP_CRB_CAM)
 
diff --git a/drivers/net/qlcnic/qlcnic_hw.c b/drivers/net/qlcnic/qlcnic_hw.c
index da00e16..b977874 100644
--- a/drivers/net/qlcnic/qlcnic_hw.c
+++ b/drivers/net/qlcnic/qlcnic_hw.c
@@ -53,9 +53,6 @@ static inline void writeq(u64 val, void __iomem *addr)
 }
 #endif
 
-#define ADDR_IN_RANGE(addr, low, high)	\
-	(((addr) < (high)) && ((addr) >= (low)))
-
 #define PCI_OFFSET_FIRST_RANGE(adapter, off)    \
 	((adapter)->ahw.pci_base0 + (off))
 
@@ -936,6 +933,28 @@ unlock:
 	return ret;
 }
 
+void
+qlcnic_pci_camqm_read_2M(struct qlcnic_adapter *adapter, u64 off, u64 *data)
+{
+	void __iomem *addr = adapter->ahw.pci_base0 +
+		QLCNIC_PCI_CAMQM_2M_BASE + (off - QLCNIC_PCI_CAMQM);
+
+	mutex_lock(&adapter->ahw.mem_lock);
+	*data = readq(addr);
+	mutex_unlock(&adapter->ahw.mem_lock);
+}
+
+void
+qlcnic_pci_camqm_write_2M(struct qlcnic_adapter *adapter, u64 off, u64 data)
+{
+	void __iomem *addr = adapter->ahw.pci_base0 +
+		QLCNIC_PCI_CAMQM_2M_BASE + (off - QLCNIC_PCI_CAMQM);
+
+	mutex_lock(&adapter->ahw.mem_lock);
+	writeq(data, addr);
+	mutex_unlock(&adapter->ahw.mem_lock);
+}
+
 #define MAX_CTL_CHECK   1000
 
 int
diff --git a/drivers/net/qlcnic/qlcnic_main.c b/drivers/net/qlcnic/qlcnic_main.c
index fc72156..a234622 100644
--- a/drivers/net/qlcnic/qlcnic_main.c
+++ b/drivers/net/qlcnic/qlcnic_main.c
@@ -2386,14 +2386,21 @@ static int
 qlcnic_sysfs_validate_crb(struct qlcnic_adapter *adapter,
 		loff_t offset, size_t size)
 {
+	size_t crb_size = 4;
+
 	if (!(adapter->flags & QLCNIC_DIAG_ENABLED))
 		return -EIO;
 
-	if ((size != 4) || (offset & 0x3))
-		return  -EINVAL;
+	if (offset < QLCNIC_PCI_CRBSPACE) {
+		if (ADDR_IN_RANGE(offset, QLCNIC_PCI_CAMQM,
+					QLCNIC_PCI_CAMQM_END))
+			crb_size = 8;
+		else
+			return -EINVAL;
+	}
 
-	if (offset < QLCNIC_PCI_CRBSPACE)
-		return -EINVAL;
+	if ((size != crb_size) || (offset & (crb_size-1)))
+		return  -EINVAL;
 
 	return 0;
 }
@@ -2405,14 +2412,20 @@ qlcnic_sysfs_read_crb(struct kobject *kobj, struct bin_attribute *attr,
 	struct device *dev = container_of(kobj, struct device, kobj);
 	struct qlcnic_adapter *adapter = dev_get_drvdata(dev);
 	u32 data;
+	u64 qmdata;
 	int ret;
 
 	ret = qlcnic_sysfs_validate_crb(adapter, offset, size);
 	if (ret != 0)
 		return ret;
 
-	data = QLCRD32(adapter, offset);
-	memcpy(buf, &data, size);
+	if (ADDR_IN_RANGE(offset, QLCNIC_PCI_CAMQM, QLCNIC_PCI_CAMQM_END)) {
+		qlcnic_pci_camqm_read_2M(adapter, offset, &qmdata);
+		memcpy(buf, &qmdata, size);
+	} else {
+		data = QLCRD32(adapter, offset);
+		memcpy(buf, &data, size);
+	}
 	return size;
 }
 
@@ -2423,14 +2436,20 @@ qlcnic_sysfs_write_crb(struct kobject *kobj, struct bin_attribute *attr,
 	struct device *dev = container_of(kobj, struct device, kobj);
 	struct qlcnic_adapter *adapter = dev_get_drvdata(dev);
 	u32 data;
+	u64 qmdata;
 	int ret;
 
 	ret = qlcnic_sysfs_validate_crb(adapter, offset, size);
 	if (ret != 0)
 		return ret;
 
-	memcpy(&data, buf, size);
-	QLCWR32(adapter, offset, data);
+	if (ADDR_IN_RANGE(offset, QLCNIC_PCI_CAMQM, QLCNIC_PCI_CAMQM_END)) {
+		memcpy(&qmdata, buf, size);
+		qlcnic_pci_camqm_write_2M(adapter, offset, qmdata);
+	} else {
+		memcpy(&data, buf, size);
+		QLCWR32(adapter, offset, data);
+	}
 	return size;
 }
 
-- 
1.6.0.2


^ permalink raw reply related

* [PATCH NEXT 0/8]qlcnic: fix diagnostic tools access
From: Amit Kumar Salecha @ 2010-03-31 12:03 UTC (permalink / raw)
  To: davem; +Cc: netdev, ameen.rahman

Hi
  Series of 8 patches to fix diagnostic tools access and added debug
  messages in driver.
  Apply them in net-next branch.

-Amit

^ permalink raw reply

* [PATCH NEXT 3/8] qlcnic: update oncard memory size check
From: Amit Kumar Salecha @ 2010-03-31 12:03 UTC (permalink / raw)
  To: davem; +Cc: netdev, ameen.rahman, Dhananjay Phadke
In-Reply-To: <1270037026-9062-1-git-send-email-amit.salecha@qlogic.com>

From: Dhananjay Phadke <dhananjay.phadke@qlogic.com>

All QLogic converged NICs have 128-bit 128MB on card memory.
Fix the limit check from 64MB to 128MB and remove unnecessary
64-bit read/write checks.

Signed-off-by: Dhananjay Phadke <dhananjay.phadke@qlogic.com>
Signed-off-by: Amit Kumar Salecha <amit.salecha@qlogic.com>
---
 drivers/net/qlcnic/qlcnic_hdr.h |    2 +-
 drivers/net/qlcnic/qlcnic_hw.c  |   55 +++++++++++++++++----------------------
 2 files changed, 25 insertions(+), 32 deletions(-)

diff --git a/drivers/net/qlcnic/qlcnic_hdr.h b/drivers/net/qlcnic/qlcnic_hdr.h
index 25465a9..e9fb692 100644
--- a/drivers/net/qlcnic/qlcnic_hdr.h
+++ b/drivers/net/qlcnic/qlcnic_hdr.h
@@ -449,7 +449,7 @@ enum {
 #define QLCNIC_ADDR_OCM1	(0x0000000200400000ULL)
 #define QLCNIC_ADDR_OCM1_MAX	(0x00000002004fffffULL)
 #define QLCNIC_ADDR_QDR_NET	(0x0000000300000000ULL)
-#define QLCNIC_ADDR_QDR_NET_MAX_P3 (0x0000000303ffffffULL)
+#define QLCNIC_ADDR_QDR_NET_MAX (0x0000000307ffffffULL)
 
 /*
  *   Register offsets for MN
diff --git a/drivers/net/qlcnic/qlcnic_hw.c b/drivers/net/qlcnic/qlcnic_hw.c
index b977874..419f46e 100644
--- a/drivers/net/qlcnic/qlcnic_hw.c
+++ b/drivers/net/qlcnic/qlcnic_hw.c
@@ -963,7 +963,6 @@ qlcnic_pci_mem_write_2M(struct qlcnic_adapter *adapter,
 {
 	int i, j, ret;
 	u32 temp, off8;
-	u64 stride;
 	void __iomem *mem_crb;
 
 	/* Only 64-bit aligned access */
@@ -972,7 +971,7 @@ qlcnic_pci_mem_write_2M(struct qlcnic_adapter *adapter,
 
 	/* P3 onward, test agent base for MIU and SIU is same */
 	if (ADDR_IN_RANGE(off, QLCNIC_ADDR_QDR_NET,
-				QLCNIC_ADDR_QDR_NET_MAX_P3)) {
+				QLCNIC_ADDR_QDR_NET_MAX)) {
 		mem_crb = qlcnic_get_ioaddr(adapter,
 				QLCNIC_CRB_QDR_NET+MIU_TEST_AGT_BASE);
 		goto correct;
@@ -990,9 +989,7 @@ qlcnic_pci_mem_write_2M(struct qlcnic_adapter *adapter,
 	return -EIO;
 
 correct:
-	stride = QLCNIC_IS_REVISION_P3P(adapter->ahw.revision_id) ? 16 : 8;
-
-	off8 = off & ~(stride-1);
+	off8 = off & ~0xf;
 
 	mutex_lock(&adapter->ahw.mem_lock);
 
@@ -1000,30 +997,28 @@ correct:
 	writel(0, (mem_crb + MIU_TEST_AGT_ADDR_HI));
 
 	i = 0;
-	if (stride == 16) {
-		writel(TA_CTL_ENABLE, (mem_crb + TEST_AGT_CTRL));
-		writel((TA_CTL_START | TA_CTL_ENABLE),
-				(mem_crb + TEST_AGT_CTRL));
-
-		for (j = 0; j < MAX_CTL_CHECK; j++) {
-			temp = readl(mem_crb + TEST_AGT_CTRL);
-			if ((temp & TA_CTL_BUSY) == 0)
-				break;
-		}
+	writel(TA_CTL_ENABLE, (mem_crb + TEST_AGT_CTRL));
+	writel((TA_CTL_START | TA_CTL_ENABLE),
+			(mem_crb + TEST_AGT_CTRL));
 
-		if (j >= MAX_CTL_CHECK) {
-			ret = -EIO;
-			goto done;
-		}
+	for (j = 0; j < MAX_CTL_CHECK; j++) {
+		temp = readl(mem_crb + TEST_AGT_CTRL);
+		if ((temp & TA_CTL_BUSY) == 0)
+			break;
+	}
 
-		i = (off & 0xf) ? 0 : 2;
-		writel(readl(mem_crb + MIU_TEST_AGT_RDDATA(i)),
-				mem_crb + MIU_TEST_AGT_WRDATA(i));
-		writel(readl(mem_crb + MIU_TEST_AGT_RDDATA(i+1)),
-				mem_crb + MIU_TEST_AGT_WRDATA(i+1));
-		i = (off & 0xf) ? 2 : 0;
+	if (j >= MAX_CTL_CHECK) {
+		ret = -EIO;
+		goto done;
 	}
 
+	i = (off & 0xf) ? 0 : 2;
+	writel(readl(mem_crb + MIU_TEST_AGT_RDDATA(i)),
+			mem_crb + MIU_TEST_AGT_WRDATA(i));
+	writel(readl(mem_crb + MIU_TEST_AGT_RDDATA(i+1)),
+			mem_crb + MIU_TEST_AGT_WRDATA(i+1));
+	i = (off & 0xf) ? 2 : 0;
+
 	writel(data & 0xffffffff,
 			mem_crb + MIU_TEST_AGT_WRDATA(i));
 	writel((data >> 32) & 0xffffffff,
@@ -1059,7 +1054,7 @@ qlcnic_pci_mem_read_2M(struct qlcnic_adapter *adapter,
 {
 	int j, ret;
 	u32 temp, off8;
-	u64 val, stride;
+	u64 val;
 	void __iomem *mem_crb;
 
 	/* Only 64-bit aligned access */
@@ -1068,7 +1063,7 @@ qlcnic_pci_mem_read_2M(struct qlcnic_adapter *adapter,
 
 	/* P3 onward, test agent base for MIU and SIU is same */
 	if (ADDR_IN_RANGE(off, QLCNIC_ADDR_QDR_NET,
-				QLCNIC_ADDR_QDR_NET_MAX_P3)) {
+				QLCNIC_ADDR_QDR_NET_MAX)) {
 		mem_crb = qlcnic_get_ioaddr(adapter,
 				QLCNIC_CRB_QDR_NET+MIU_TEST_AGT_BASE);
 		goto correct;
@@ -1088,9 +1083,7 @@ qlcnic_pci_mem_read_2M(struct qlcnic_adapter *adapter,
 	return -EIO;
 
 correct:
-	stride = QLCNIC_IS_REVISION_P3P(adapter->ahw.revision_id) ? 16 : 8;
-
-	off8 = off & ~(stride-1);
+	off8 = off & ~0xf;
 
 	mutex_lock(&adapter->ahw.mem_lock);
 
@@ -1112,7 +1105,7 @@ correct:
 		ret = -EIO;
 	} else {
 		off8 = MIU_TEST_AGT_RDDATA_LO;
-		if ((stride == 16) && (off & 0xf))
+		if (off & 0xf)
 			off8 = MIU_TEST_AGT_RDDATA_UPPER_LO;
 
 		temp = readl(mem_crb + off8 + 4);
-- 
1.6.0.2


^ permalink raw reply related

* [PATCH NEXT 1/8] qlcnic: fix fw load from file
From: Amit Kumar Salecha @ 2010-03-31 12:03 UTC (permalink / raw)
  To: davem; +Cc: netdev, ameen.rahman
In-Reply-To: <1270037026-9062-1-git-send-email-amit.salecha@qlogic.com>

Rarely: Fw file size can be unaligned to 8.

Signed-off-by: Amit Kumar Salecha <amit.salecha@qlogic.com>
---
 drivers/net/qlcnic/qlcnic_init.c |   10 ++++++++++
 1 files changed, 10 insertions(+), 0 deletions(-)

diff --git a/drivers/net/qlcnic/qlcnic_init.c b/drivers/net/qlcnic/qlcnic_init.c
index 7c34e4e..0a424e0 100644
--- a/drivers/net/qlcnic/qlcnic_init.c
+++ b/drivers/net/qlcnic/qlcnic_init.c
@@ -949,6 +949,16 @@ qlcnic_load_firmware(struct qlcnic_adapter *adapter)
 
 			flashaddr += 8;
 		}
+
+		size = (__force u32)qlcnic_get_fw_size(adapter) % 8;
+		if (size) {
+			data = cpu_to_le64(ptr64[i]);
+
+			if (qlcnic_pci_mem_write_2M(adapter,
+						flashaddr, data))
+				return -EIO;
+		}
+
 	} else {
 		u64 data;
 		u32 hi, lo;
-- 
1.6.0.2


^ permalink raw reply related

* [PATCH NEXT 4/8] qlcnic: fix onchip memory access
From: Amit Kumar Salecha @ 2010-03-31 12:03 UTC (permalink / raw)
  To: davem; +Cc: netdev, ameen.rahman, Dhananjay Phadke
In-Reply-To: <1270037026-9062-1-git-send-email-amit.salecha@qlogic.com>

From: Dhananjay Phadke <dhananjay.phadke@qlogic.com>

Fix incorrect offset calculation and remove unnecessary remap
of the region in bar 0 to access onchip memory.

This was leading to read incorrect values by debug tools.

Signed-off-by: Dhananjay Phadke <dhananjay.phadke@qlogic.com>
Signed-off-by: Amit Kumar Salecha <amit.salecha@qlogic.com>
---
 drivers/net/qlcnic/qlcnic_hw.c |   39 ++-------------------------------------
 1 files changed, 2 insertions(+), 37 deletions(-)

diff --git a/drivers/net/qlcnic/qlcnic_hw.c b/drivers/net/qlcnic/qlcnic_hw.c
index 419f46e..0933c2d 100644
--- a/drivers/net/qlcnic/qlcnic_hw.c
+++ b/drivers/net/qlcnic/qlcnic_hw.c
@@ -53,18 +53,6 @@ static inline void writeq(u64 val, void __iomem *addr)
 }
 #endif
 
-#define PCI_OFFSET_FIRST_RANGE(adapter, off)    \
-	((adapter)->ahw.pci_base0 + (off))
-
-static void __iomem *pci_base_offset(struct qlcnic_adapter *adapter,
-					    unsigned long off)
-{
-	if (ADDR_IN_RANGE(off, FIRST_PAGE_GROUP_START, FIRST_PAGE_GROUP_END))
-		return PCI_OFFSET_FIRST_RANGE(adapter, off);
-
-	return NULL;
-}
-
 static const struct crb_128M_2M_block_map
 crb_128M_2M_map[64] __cacheline_aligned_in_smp = {
     {{{0, 0,         0,         0} } },		/* 0: PCI */
@@ -871,13 +859,6 @@ qlcnic_pci_set_window_2M(struct qlcnic_adapter *adapter,
 		u64 addr, u32 *start)
 {
 	u32 window;
-	struct pci_dev *pdev = adapter->pdev;
-
-	if ((addr & 0x00ff800) == 0xff800) {
-		if (printk_ratelimit())
-			dev_warn(&pdev->dev, "QM access not handled\n");
-		return -EIO;
-	}
 
 	window = OCM_WIN_P3P(addr);
 
@@ -894,8 +875,7 @@ static int
 qlcnic_pci_mem_access_direct(struct qlcnic_adapter *adapter, u64 off,
 		u64 *data, int op)
 {
-	void __iomem *addr, *mem_ptr = NULL;
-	resource_size_t mem_base;
+	void __iomem *addr;
 	int ret;
 	u32 start;
 
@@ -905,21 +885,8 @@ qlcnic_pci_mem_access_direct(struct qlcnic_adapter *adapter, u64 off,
 	if (ret != 0)
 		goto unlock;
 
-	addr = pci_base_offset(adapter, start);
-	if (addr)
-		goto noremap;
-
-	mem_base = pci_resource_start(adapter->pdev, 0) + (start & PAGE_MASK);
-
-	mem_ptr = ioremap(mem_base, PAGE_SIZE);
-	if (mem_ptr == NULL) {
-		ret = -EIO;
-		goto unlock;
-	}
-
-	addr = mem_ptr + (start & (PAGE_SIZE - 1));
+	addr = adapter->ahw.pci_base0 + start;
 
-noremap:
 	if (op == 0)	/* read */
 		*data = readq(addr);
 	else		/* write */
@@ -928,8 +895,6 @@ noremap:
 unlock:
 	mutex_unlock(&adapter->ahw.mem_lock);
 
-	if (mem_ptr)
-		iounmap(mem_ptr);
 	return ret;
 }
 
-- 
1.6.0.2


^ permalink raw reply related

* [PATCH 3/3] be2net: fix bug in vlan rx path for big endian architecture
From: Ajit Khaparde @ 2010-03-31 12:00 UTC (permalink / raw)
  To: David Miller; +Cc: netdev

vlan traffic on big endian architecture is broken.
Need to swap the vid before giving packet to stack.
This patch fixes it.

Signed-off-by: Ajit Khaparde <ajitk@serverengines.com>
---
 drivers/net/benet/be_main.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/benet/be_main.c b/drivers/net/benet/be_main.c
index b0faaa2..ec6ace8 100644
--- a/drivers/net/benet/be_main.c
+++ b/drivers/net/benet/be_main.c
@@ -807,7 +807,7 @@ static void be_rx_compl_process(struct be_adapter *adapter,
 			return;
 		}
 		vid = AMAP_GET_BITS(struct amap_eth_rx_compl, vlan_tag, rxcp);
-		vid = be16_to_cpu(vid);
+		vid = swab16(vid);
 		vlan_hwaccel_receive_skb(skb, adapter->vlan_grp, vid);
 	} else {
 		netif_receive_skb(skb);
@@ -884,7 +884,7 @@ static void be_rx_compl_process_gro(struct be_adapter *adapter,
 		napi_gro_frags(&eq_obj->napi);
 	} else {
 		vid = AMAP_GET_BITS(struct amap_eth_rx_compl, vlan_tag, rxcp);
-		vid = be16_to_cpu(vid);
+		vid = swab16(vid);
 
 		if (!adapter->vlan_grp || adapter->vlans_added == 0)
 			return;
-- 
1.6.3.3


^ permalink raw reply related

* [PATCH 2/3] be2net: fix flashing on big endian architectures
From: Ajit Khaparde @ 2010-03-31 11:57 UTC (permalink / raw)
  To: David Miller; +Cc: netdev

Flashing is broken on big endian architectures like ppc.
This patch fixes it.

From: Naresh G <nareshg@serverengines.com>
Signed-off-by: Ajit Khaparde <ajitk@serverengines.com>
---
 drivers/net/benet/be_cmds.c |    4 ++--
 drivers/net/benet/be_main.c |   15 +++++++--------
 2 files changed, 9 insertions(+), 10 deletions(-)

diff --git a/drivers/net/benet/be_cmds.c b/drivers/net/benet/be_cmds.c
index 50e6259..d0ef4ac 100644
--- a/drivers/net/benet/be_cmds.c
+++ b/drivers/net/benet/be_cmds.c
@@ -1464,8 +1464,8 @@ int be_cmd_get_flash_crc(struct be_adapter *adapter, u8 *flashed_crc,
 
 	req->params.op_type = cpu_to_le32(IMG_TYPE_REDBOOT);
 	req->params.op_code = cpu_to_le32(FLASHROM_OPER_REPORT);
-	req->params.offset = offset;
-	req->params.data_buf_size = 0x4;
+	req->params.offset = cpu_to_le32(offset);
+	req->params.data_buf_size = cpu_to_le32(0x4);
 
 	status = be_mcc_notify_wait(adapter);
 	if (!status)
diff --git a/drivers/net/benet/be_main.c b/drivers/net/benet/be_main.c
index a08faf3..b0faaa2 100644
--- a/drivers/net/benet/be_main.c
+++ b/drivers/net/benet/be_main.c
@@ -1991,7 +1991,7 @@ int be_load_fw(struct be_adapter *adapter, u8 *func)
 	struct flash_file_hdr_g3 *fhdr3;
 	struct image_hdr *img_hdr_ptr = NULL;
 	struct be_dma_mem flash_cmd;
-	int status, i = 0;
+	int status, i = 0, num_imgs = 0;
 	const u8 *p;
 
 	strcpy(fw_file, func);
@@ -2017,15 +2017,14 @@ int be_load_fw(struct be_adapter *adapter, u8 *func)
 	if ((adapter->generation == BE_GEN3) &&
 			(get_ufigen_type(fhdr) == BE_GEN3)) {
 		fhdr3 = (struct flash_file_hdr_g3 *) fw->data;
-		for (i = 0; i < fhdr3->num_imgs; i++) {
+		num_imgs = le32_to_cpu(fhdr3->num_imgs);
+		for (i = 0; i < num_imgs; i++) {
 			img_hdr_ptr = (struct image_hdr *) (fw->data +
 					(sizeof(struct flash_file_hdr_g3) +
-					i * sizeof(struct image_hdr)));
-			if (img_hdr_ptr->imageid == 1) {
-				status = be_flash_data(adapter, fw,
-						&flash_cmd, fhdr3->num_imgs);
-			}
-
+					 i * sizeof(struct image_hdr)));
+			if (le32_to_cpu(img_hdr_ptr->imageid) == 1)
+				status = be_flash_data(adapter, fw, &flash_cmd,
+							num_imgs);
 		}
 	} else if ((adapter->generation == BE_GEN2) &&
 			(get_ufigen_type(fhdr) == BE_GEN2)) {
-- 
1.6.3.3


^ permalink raw reply related

* [PATCH 1/3] be2net: fix a bug in flashing the redboot section
From: Ajit Khaparde @ 2010-03-31 11:47 UTC (permalink / raw)
  To: David Miller; +Cc: netdev


Signed-off-by: Ajit Khaparde <ajitk@serverengines.com>
---
 drivers/net/benet/be_main.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/net/benet/be_main.c b/drivers/net/benet/be_main.c
index 43e8032..a08faf3 100644
--- a/drivers/net/benet/be_main.c
+++ b/drivers/net/benet/be_main.c
@@ -1855,7 +1855,7 @@ static bool be_flash_redboot(struct be_adapter *adapter,
 	p += crc_offset;
 
 	status = be_cmd_get_flash_crc(adapter, flashed_crc,
-			(img_start + image_size - 4));
+			(image_size - 4));
 	if (status) {
 		dev_err(&adapter->pdev->dev,
 		"could not get crc from flash, not flashing redboot\n");
-- 
1.6.3.3


^ permalink raw reply related

* Undefined behaviour of connect(fd, NULL, 0);
From: Neil Brown @ 2010-03-31 11:36 UTC (permalink / raw)
  To: netdev


Hi Netdev.

We have a customer who was reporting strangely unpredictable behaviour of an
in-house application that used networking.

It called connect on a non-blocking socket and subsequently called
   connect(fd, NULL, 0)

to check if the connection had succeeded.
This would sometime "work" and sometimes close the connection.

Looking at the code (sys_connect, move_addr_to_kernel, inet_stream_connect),
it seems that in this case an uninitialised on-stack address is passed
to inet_stream_connect and it makes a decision based on ->sa_family (which is
uninitialised).

It seems clear that connect(fd, NULL, 0) is the wrong thing to do in this
circumstance, but I think it would be good if it failed consistently rather
than unpredictably.

Would it be appropriate for move_addr_to_kernel to zero out the remainder of
the address?
   memset(kaddr+ulen, 0, MAX_SOCK_ADDR-ulen);
??

Then connect(fd, NULL, 0) would always break the connection.

Thanks,
NeilBrown

^ permalink raw reply

* Re: [Patch] bonding: fix potential deadlock in bond_uninit()
From: Eric W. Biederman @ 2010-03-31 11:28 UTC (permalink / raw)
  To: Amerigo Wang
  Cc: linux-kernel, Jiri Pirko, Stephen Hemminger, netdev,
	David S. Miller, bonding-devel, Jay Vosburgh
In-Reply-To: <20100331105559.5607.38643.sendpatchset@localhost.localdomain>

Amerigo Wang <amwang@redhat.com> writes:

> bond_uninit() is invoked with rtnl_lock held, when it does destroy_workqueue()
> which will potentially flush all works in this workqueue, if we hold rtnl_lock
> again in the work function, it will deadlock.
>
> So unlock rtnl_lock before calling destroy_workqueue().

Ouch.  That seems rather rude to our caller, and likely very
dangerous.

Is this a deadlock you actually hit, or is this something lockdep
warned about?

My gut feel says we need to move the destroy_workqueue into
the network device destructor.

Eric



> Signed-off-by: WANG Cong <amwang@redhat.com>
> Cc: Jay Vosburgh <fubar@us.ibm.com>
> Cc: "David S. Miller" <davem@davemloft.net>
> Cc: Stephen Hemminger <shemminger@vyatta.com>
> Cc: Jiri Pirko <jpirko@redhat.com>
> Cc: "Eric W. Biederman" <ebiederm@xmission.com>
>
> ---
> diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
> index 5b92fbf..b781728 100644
> --- a/drivers/net/bonding/bond_main.c
> +++ b/drivers/net/bonding/bond_main.c
> @@ -4542,8 +4542,11 @@ static void bond_uninit(struct net_device *bond_dev)
>  
>  	bond_remove_proc_entry(bond);
>  
> -	if (bond->wq)
> +	if (bond->wq) {
> +		rtnl_unlock();
>  		destroy_workqueue(bond->wq);
> +		rtnl_lock();
> +	}
>  
>  	netif_addr_lock_bh(bond_dev);
>  	bond_mc_list_destroy(bond);

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox