Netdev List
* Re: 2.6.34-rc3-git8: Reported regressions 2.6.32 -> 2.6.33
From: Rafael J. Wysocki @ 2010-04-09 19:56 UTC
  To: Gertjan van Wingerde
  Cc: Linux Kernel Mailing List, Maciej Rutecki, Andrew Morton,
	Linus Torvalds, Kernel Testers List, Network Development,
	Linux ACPI, Linux PM List, Linux SCSI List, Linux Wireless List,
	DRI
In-Reply-To: <4BBE92AE.6090301@gmail.com>

On Friday 09 April 2010, Gertjan van Wingerde wrote:
> On 04/09/10 00:54, Rafael J. Wysocki wrote:
> > 
> > Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=15699
> > Subject		: rt2500usb driver cannot remain connected
> > Submitter	:  <abcd@gentoo.org>
> > Date		: 2010-04-05 19:30 (4 days old)
> > Handled-By	: Ivo van Doorn <IvDoorn@gmail.com>
> > 
> 
> This one ought to be fixed by commit 9e76ad2a27f592c1390248867391880c7efe78b3
> in Linus' tree.

Thanks, closing.

Rafael


* [net-next-2.6 PATCH] igb: modify register test for i350 to reflect read only bits in RDLEN/TDLEN
From: Jeff Kirsher @ 2010-04-09 19:53 UTC
  To: davem; +Cc: netdev, gospo, Alexander Duyck, Jeff Kirsher

From: Alexander Duyck <alexander.h.duyck@intel.com>

The registers for RDLEN/TDLEN on i350 have the first 7 bits as read only.
This is a change from previous hardware in which it was only the first 4
bits that were read only.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---

 drivers/net/igb/igb_ethtool.c |    8 ++++----
 1 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/net/igb/igb_ethtool.c b/drivers/net/igb/igb_ethtool.c
index cdebfbf..be0678a 100644
--- a/drivers/net/igb/igb_ethtool.c
+++ b/drivers/net/igb/igb_ethtool.c
@@ -909,10 +909,10 @@ static struct igb_reg_test reg_test_i350[] = {
 	{ E1000_VET,	   0x100, 1,  PATTERN_TEST, 0xFFFF0000, 0xFFFF0000 },
 	{ E1000_RDBAL(0),  0x100, 4,  PATTERN_TEST, 0xFFFFFF80, 0xFFFFFFFF },
 	{ E1000_RDBAH(0),  0x100, 4,  PATTERN_TEST, 0xFFFFFFFF, 0xFFFFFFFF },
-	{ E1000_RDLEN(0),  0x100, 4,  PATTERN_TEST, 0x000FFFF0, 0x000FFFFF },
+	{ E1000_RDLEN(0),  0x100, 4,  PATTERN_TEST, 0x000FFF80, 0x000FFFFF },
 	{ E1000_RDBAL(4),  0x40,  4,  PATTERN_TEST, 0xFFFFFF80, 0xFFFFFFFF },
 	{ E1000_RDBAH(4),  0x40,  4,  PATTERN_TEST, 0xFFFFFFFF, 0xFFFFFFFF },
-	{ E1000_RDLEN(4),  0x40,  4,  PATTERN_TEST, 0x000FFFF0, 0x000FFFFF },
+	{ E1000_RDLEN(4),  0x40,  4,  PATTERN_TEST, 0x000FFF80, 0x000FFFFF },
 	/* RDH is read-only for i350, only test RDT. */
 	{ E1000_RDT(0),	   0x100, 4,  PATTERN_TEST, 0x0000FFFF, 0x0000FFFF },
 	{ E1000_RDT(4),	   0x40,  4,  PATTERN_TEST, 0x0000FFFF, 0x0000FFFF },
@@ -921,10 +921,10 @@ static struct igb_reg_test reg_test_i350[] = {
 	{ E1000_TIPG,	   0x100, 1,  PATTERN_TEST, 0x3FFFFFFF, 0x3FFFFFFF },
 	{ E1000_TDBAL(0),  0x100, 4,  PATTERN_TEST, 0xFFFFFF80, 0xFFFFFFFF },
 	{ E1000_TDBAH(0),  0x100, 4,  PATTERN_TEST, 0xFFFFFFFF, 0xFFFFFFFF },
-	{ E1000_TDLEN(0),  0x100, 4,  PATTERN_TEST, 0x000FFFF0, 0x000FFFFF },
+	{ E1000_TDLEN(0),  0x100, 4,  PATTERN_TEST, 0x000FFF80, 0x000FFFFF },
 	{ E1000_TDBAL(4),  0x40,  4,  PATTERN_TEST, 0xFFFFFF80, 0xFFFFFFFF },
 	{ E1000_TDBAH(4),  0x40,  4,  PATTERN_TEST, 0xFFFFFFFF, 0xFFFFFFFF },
-	{ E1000_TDLEN(4),  0x40,  4,  PATTERN_TEST, 0x000FFFF0, 0x000FFFFF },
+	{ E1000_TDLEN(4),  0x40,  4,  PATTERN_TEST, 0x000FFF80, 0x000FFFFF },
 	{ E1000_TDT(0),	   0x100, 4,  PATTERN_TEST, 0x0000FFFF, 0x0000FFFF },
 	{ E1000_TDT(4),	   0x40,  4,  PATTERN_TEST, 0x0000FFFF, 0x0000FFFF },
 	{ E1000_RCTL,	   0x100, 1,  SET_READ_TEST, 0xFFFFFFFF, 0x00000000 },
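
As an illustration of why the mask changes from 0x000FFFF0 to 0x000FFF80, here
is a minimal sketch, not the driver's actual test routine. It assumes a pattern
test that writes (pattern & write), reads the register back, and compares only
the bits selected by mask. The register model and the names model_readback()
and pattern_test() are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical model of a descriptor-length register: only bits covered by
 * valid_mask exist, and the low ro_bits bits always read back as zero.
 */
static uint32_t model_readback(uint32_t written, uint32_t valid_mask,
			       unsigned int ro_bits)
{
	assert(ro_bits < 32);
	uint32_t ro_mask = (1u << ro_bits) - 1u;

	return written & valid_mask & ~ro_mask;
}

/*
 * Assumed shape of a pattern test: write (pattern & write), read back,
 * and compare only the bits selected by mask. Returns 1 on pass, 0 on fail.
 */
static int pattern_test(uint32_t mask, uint32_t write, unsigned int ro_bits)
{
	static const uint32_t patterns[] = {
		0x5A5A5A5A, 0xA5A5A5A5, 0x00000000, 0xFFFFFFFF
	};
	size_t i;

	for (i = 0; i < sizeof(patterns) / sizeof(patterns[0]); i++) {
		uint32_t wr = patterns[i] & write;
		uint32_t rd = model_readback(wr, 0x000FFFFF, ro_bits);

		if ((rd & mask) != (wr & mask))
			return 0;
	}
	return 1;
}
```

Under this model, a register with 7 read-only low bits passes with mask
0x000FFF80 but fails with the old mask 0x000FFFF0, because bits 4-6 of the
written pattern never read back. A register with only 4 read-only bits passes
with the old mask.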



* [net-next-2.6 PATCH] igb: add support for reporting 5GT/s during probe on PCIe Gen2
From: Jeff Kirsher @ 2010-04-09 19:52 UTC
  To: davem; +Cc: netdev, gospo, Alexander Duyck, Jeff Kirsher

From: Alexander Duyck <alexander.h.duyck@intel.com>

This change corrects the fact that we were not reporting Gen2 link speeds
when we were in fact connected at Gen2 rates.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---

 drivers/net/igb/e1000_defines.h |    3 +++
 drivers/net/igb/e1000_mac.c     |   19 ++++++++++++++++---
 drivers/net/igb/igb_main.c      |    1 +
 3 files changed, 20 insertions(+), 3 deletions(-)

diff --git a/drivers/net/igb/e1000_defines.h b/drivers/net/igb/e1000_defines.h
index 31d24e0..8e440e8 100644
--- a/drivers/net/igb/e1000_defines.h
+++ b/drivers/net/igb/e1000_defines.h
@@ -615,6 +615,9 @@
 
 #define PCIE_LINK_WIDTH_MASK         0x3F0
 #define PCIE_LINK_WIDTH_SHIFT        4
+#define PCIE_LINK_SPEED_MASK         0x0F
+#define PCIE_LINK_SPEED_2500         0x01
+#define PCIE_LINK_SPEED_5000         0x02
 #define PCIE_DEVICE_CONTROL2_16ms    0x0005
 
 #define PHY_REVISION_MASK      0xFFFFFFF0
diff --git a/drivers/net/igb/e1000_mac.c b/drivers/net/igb/e1000_mac.c
index be8d010..4371835 100644
--- a/drivers/net/igb/e1000_mac.c
+++ b/drivers/net/igb/e1000_mac.c
@@ -53,17 +53,30 @@ s32 igb_get_bus_info_pcie(struct e1000_hw *hw)
 	u16 pcie_link_status;
 
 	bus->type = e1000_bus_type_pci_express;
-	bus->speed = e1000_bus_speed_2500;
 
 	ret_val = igb_read_pcie_cap_reg(hw,
 					  PCIE_LINK_STATUS,
 					  &pcie_link_status);
-	if (ret_val)
+	if (ret_val) {
 		bus->width = e1000_bus_width_unknown;
-	else
+		bus->speed = e1000_bus_speed_unknown;
+	} else {
+		switch (pcie_link_status & PCIE_LINK_SPEED_MASK) {
+		case PCIE_LINK_SPEED_2500:
+			bus->speed = e1000_bus_speed_2500;
+			break;
+		case PCIE_LINK_SPEED_5000:
+			bus->speed = e1000_bus_speed_5000;
+			break;
+		default:
+			bus->speed = e1000_bus_speed_unknown;
+			break;
+		}
+
 		bus->width = (enum e1000_bus_width)((pcie_link_status &
 						     PCIE_LINK_WIDTH_MASK) >>
 						     PCIE_LINK_WIDTH_SHIFT);
+	}
 
 	reg = rd32(E1000_STATUS);
 	bus->func = (reg & E1000_STATUS_FUNC_MASK) >> E1000_STATUS_FUNC_SHIFT;
diff --git a/drivers/net/igb/igb_main.c b/drivers/net/igb/igb_main.c
index 876a49b..a3c79ad 100644
--- a/drivers/net/igb/igb_main.c
+++ b/drivers/net/igb/igb_main.c
@@ -1637,6 +1637,7 @@ static int __devinit igb_probe(struct pci_dev *pdev,
 	dev_info(&pdev->dev, "%s: (PCIe:%s:%s) %pM\n",
 		 netdev->name,
 		 ((hw->bus.speed == e1000_bus_speed_2500) ? "2.5Gb/s" :
+		  (hw->bus.speed == e1000_bus_speed_5000) ? "5.0Gb/s" :
 		                                            "unknown"),
 		 ((hw->bus.width == e1000_bus_width_pcie_x4) ? "Width x4" :
 		  (hw->bus.width == e1000_bus_width_pcie_x2) ? "Width x2" :
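
The decoding done by the patch above can be tried outside the kernel. The
following is a minimal standalone sketch: the mask and speed constants are
copied from the patch, while the helper names link_speed_name() and
link_width() are hypothetical and the 16-bit value stands in for what
igb_read_pcie_cap_reg() would return for PCIE_LINK_STATUS.

```c
#include <stdint.h>

/* Constants as defined in e1000_defines.h by the patch. */
#define PCIE_LINK_SPEED_MASK   0x0F
#define PCIE_LINK_SPEED_2500   0x01
#define PCIE_LINK_SPEED_5000   0x02
#define PCIE_LINK_WIDTH_MASK   0x3F0
#define PCIE_LINK_WIDTH_SHIFT  4

/* Hypothetical helper: map the speed field to the string printed at probe. */
static const char *link_speed_name(uint16_t link_status)
{
	switch (link_status & PCIE_LINK_SPEED_MASK) {
	case PCIE_LINK_SPEED_2500:
		return "2.5Gb/s";
	case PCIE_LINK_SPEED_5000:
		return "5.0Gb/s";
	default:
		return "unknown";
	}
}

/* Hypothetical helper: extract the negotiated lane count. */
static unsigned int link_width(uint16_t link_status)
{
	return (link_status & PCIE_LINK_WIDTH_MASK) >> PCIE_LINK_WIDTH_SHIFT;
}
```

For example, a link status value of 0x0042 decodes to a 5.0Gb/s link that is
4 lanes wide, and any speed code other than 0x01 or 0x02 falls through to
"unknown", matching the default case in the patch.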



* [net-2.6 PATCH] igb: restrict WoL for 82576 ET2 Quad Port Server Adapter
From: Jeff Kirsher @ 2010-04-09 19:51 UTC
  To: davem; +Cc: netdev, gospo, Stefan Assmann, Alexander Duyck, Jeff Kirsher

From: Stefan Assmann <sassmann@redhat.com>

Restrict Wake-on-LAN to first port on 82576 ET2 quad port NICs, as it is
only supported there.

Signed-off-by: Stefan Assmann <sassmann@redhat.com>
Acked-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---

 drivers/net/igb/igb_ethtool.c |    1 +
 drivers/net/igb/igb_main.c    |    1 +
 2 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/drivers/net/igb/igb_ethtool.c b/drivers/net/igb/igb_ethtool.c
index 1d4ee41..cdebfbf 100644
--- a/drivers/net/igb/igb_ethtool.c
+++ b/drivers/net/igb/igb_ethtool.c
@@ -1863,6 +1863,7 @@ static int igb_wol_exclusion(struct igb_adapter *adapter,
 		retval = 0;
 		break;
 	case E1000_DEV_ID_82576_QUAD_COPPER:
+	case E1000_DEV_ID_82576_QUAD_COPPER_ET2:
 		/* quad port adapters only support WoL on port A */
 		if (!(adapter->flags & IGB_FLAG_QUAD_PORT_A)) {
 			wol->supported = 0;
diff --git a/drivers/net/igb/igb_main.c b/drivers/net/igb/igb_main.c
index 2745e17..876a49b 100644
--- a/drivers/net/igb/igb_main.c
+++ b/drivers/net/igb/igb_main.c
@@ -1593,6 +1593,7 @@ static int __devinit igb_probe(struct pci_dev *pdev,
 			adapter->eeprom_wol = 0;
 		break;
 	case E1000_DEV_ID_82576_QUAD_COPPER:
+	case E1000_DEV_ID_82576_QUAD_COPPER_ET2:
 		/* if quad port adapter, disable WoL on all but port A */
 		if (global_quad_port_a != 0)
 			adapter->eeprom_wol = 0;



* Re: [PATCH 2/2] [V5] Add non-Virtex5 support for LL TEMAC driver
From: Grant Likely @ 2010-04-09 18:10 UTC
  To: John Linn
  Cc: netdev, linuxppc-dev, jwboyer, eric.dumazet, john.williams,
	michal.simek, John Tyner, David Miller
In-Reply-To: <960dddba-8a63-4480-8245-f06fad59ab36@SG2EHSMHS005.ehs.local>

On Thu, Apr 8, 2010 at 11:08 AM, John Linn <john.linn@xilinx.com> wrote:
> This patch adds support for using the LL TEMAC Ethernet driver on
> non-Virtex 5 platforms by adding support for accessing the Soft DMA
> registers as if they were memory mapped instead of solely through the
> DCR's (available on the Virtex 5).
>
> The patch also updates the driver so that it runs on the MicroBlaze.
> The changes were tested on the PowerPC 440, PowerPC 405, and the
> MicroBlaze platforms.
>
> Signed-off-by: John Tyner <jtyner@cs.ucr.edu>
> Signed-off-by: John Linn <john.linn@xilinx.com>

Picked up and build tested both patches on 405, 440, 60x and ppc64.
No build problems found either built-in or as a module.

for both:
Acked-by: Grant Likely <grant.likely@secretlab.ca>

g.

>
> ---
>
> V2 - Incorporated comments from Grant and added more logic to allow the driver
> to work on MicroBlaze.
>
> V3 - Only updated it to apply to head, minor change to include slab.h. Also
> verified that it now builds for MicroBlaze. Retested on PowerPC and MicroBlaze.
>
> V4 - Removed buffer alignment for skb and called the network functions that
> already do the alignment for cache line and word alignment. Added constants
> to MicroBlaze system to make sure network alignment is maintained. Also updated
> the Kconfig so it depends on Microblaze or PPC based on Grant's comment.
>
> V5 - Respun the patch on top of a new patch to the driver which removed the
> call to virt_to_bus as it's now illegal and caused a failure when building
> the driver in linux-next. Retested with 440, 405 and Microblaze.
>
> Grant, can you do a build test to verify no build issues?
> ---
>  arch/microblaze/include/asm/system.h |   11 +++
>  drivers/net/Kconfig                  |    2 +-
>  drivers/net/ll_temac.h               |   14 +++-
>  drivers/net/ll_temac_main.c          |  137 +++++++++++++++++++++++++--------
>  4 files changed, 126 insertions(+), 38 deletions(-)
>
> diff --git a/arch/microblaze/include/asm/system.h b/arch/microblaze/include/asm/system.h
> index 59efb3f..48c4f03 100644
> --- a/arch/microblaze/include/asm/system.h
> +++ b/arch/microblaze/include/asm/system.h
> @@ -12,6 +12,7 @@
>  #include <asm/registers.h>
>  #include <asm/setup.h>
>  #include <asm/irqflags.h>
> +#include <asm/cache.h>
>
>  #include <asm-generic/cmpxchg.h>
>  #include <asm-generic/cmpxchg-local.h>
> @@ -96,4 +97,14 @@ extern struct dentry *of_debugfs_root;
>
>  #define arch_align_stack(x) (x)
>
> +/*
> + * MicroBlaze doesn't handle unaligned accesses in hardware.
> + *
> + * Based on this we force the IP header alignment in network drivers.
> + * We also modify NET_SKB_PAD to be a cacheline in size, thus maintaining
> + * cacheline alignment of buffers.
> + */
> +#define NET_IP_ALIGN   2
> +#define NET_SKB_PAD    L1_CACHE_BYTES
> +
>  #endif /* _ASM_MICROBLAZE_SYSTEM_H */
> diff --git a/drivers/net/Kconfig b/drivers/net/Kconfig
> index 7b832c7..9073741 100644
> --- a/drivers/net/Kconfig
> +++ b/drivers/net/Kconfig
> @@ -2434,8 +2434,8 @@ config MV643XX_ETH
>
>  config XILINX_LL_TEMAC
>        tristate "Xilinx LL TEMAC (LocalLink Tri-mode Ethernet MAC) driver"
> +       depends on PPC || MICROBLAZE
>        select PHYLIB
> -       depends on PPC_DCR_NATIVE
>        help
>          This driver supports the Xilinx 10/100/1000 LocalLink TEMAC
>          core used in Xilinx Spartan and Virtex FPGAs
> diff --git a/drivers/net/ll_temac.h b/drivers/net/ll_temac.h
> index 1af66a1..c033584 100644
> --- a/drivers/net/ll_temac.h
> +++ b/drivers/net/ll_temac.h
> @@ -5,8 +5,11 @@
>  #include <linux/netdevice.h>
>  #include <linux/of.h>
>  #include <linux/spinlock.h>
> +
> +#ifdef CONFIG_PPC_DCR
>  #include <asm/dcr.h>
>  #include <asm/dcr-regs.h>
> +#endif
>
>  /* packet size info */
>  #define XTE_HDR_SIZE                   14      /* size of Ethernet header */
> @@ -290,9 +293,6 @@ This option defaults to enabled (set) */
>
>  #define TX_CONTROL_CALC_CSUM_MASK   1
>
> -#define XTE_ALIGN       32
> -#define BUFFER_ALIGN(adr) ((XTE_ALIGN - ((u32) adr)) % XTE_ALIGN)
> -
>  #define MULTICAST_CAM_TABLE_NUM 4
>
>  /* TX/RX CURDESC_PTR points to first descriptor */
> @@ -335,9 +335,15 @@ struct temac_local {
>        struct mii_bus *mii_bus;        /* MII bus reference */
>        int mdio_irqs[PHY_MAX_ADDR];    /* IRQs table for MDIO bus */
>
> -       /* IO registers and IRQs */
> +       /* IO registers, dma functions and IRQs */
>        void __iomem *regs;
> +       void __iomem *sdma_regs;
> +#ifdef CONFIG_PPC_DCR
>        dcr_host_t sdma_dcrs;
> +#endif
> +       u32 (*dma_in)(struct temac_local *, int);
> +       void (*dma_out)(struct temac_local *, int, u32);
> +
>        int tx_irq;
>        int rx_irq;
>        int emac_num;
> diff --git a/drivers/net/ll_temac_main.c b/drivers/net/ll_temac_main.c
> index ce9aa78..2b69d6c 100644
> --- a/drivers/net/ll_temac_main.c
> +++ b/drivers/net/ll_temac_main.c
> @@ -20,9 +20,6 @@
>  *   or rx, so this should be okay.
>  *
>  * TODO:
> - * - Fix driver to work on more than just Virtex5.  Right now the driver
> - *   assumes that the locallink DMA registers are accessed via DCR
> - *   instructions.
>  * - Factor out locallink DMA code into separate driver
>  * - Fix multicast assignment.
>  * - Fix support for hardware checksumming.
> @@ -116,17 +113,86 @@ void temac_indirect_out32(struct temac_local *lp, int reg, u32 value)
>        temac_iow(lp, XTE_CTL0_OFFSET, CNTLREG_WRITE_ENABLE_MASK | reg);
>  }
>
> +/**
> + * temac_dma_in32 - Memory mapped DMA read, this function expects a
> + * register input that is based on DCR word addresses which
> + * are then converted to memory mapped byte addresses
> + */
>  static u32 temac_dma_in32(struct temac_local *lp, int reg)
>  {
> -       return dcr_read(lp->sdma_dcrs, reg);
> +       return in_be32((u32 *)(lp->sdma_regs + (reg << 2)));
>  }
>
> +/**
> + * temac_dma_out32 - Memory mapped DMA read, this function expects a
> + * register input that is based on DCR word addresses which
> + * are then converted to memory mapped byte addresses
> + */
>  static void temac_dma_out32(struct temac_local *lp, int reg, u32 value)
>  {
> +       out_be32((u32 *)(lp->sdma_regs + (reg << 2)), value);
> +}
> +
> +/* DMA register access functions can be DCR based or memory mapped.
> + * The PowerPC 440 is DCR based, the PowerPC 405 and MicroBlaze are both
> + * memory mapped.
> + */
> +#ifdef CONFIG_PPC_DCR
> +
> +/**
> + * temac_dma_dcr_in32 - DCR based DMA read
> + */
> +static u32 temac_dma_dcr_in(struct temac_local *lp, int reg)
> +{
> +       return dcr_read(lp->sdma_dcrs, reg);
> +}
> +
> +/**
> + * temac_dma_dcr_out32 - DCR based DMA write
> + */
> +static void temac_dma_dcr_out(struct temac_local *lp, int reg, u32 value)
> +{
>        dcr_write(lp->sdma_dcrs, reg, value);
>  }
>
>  /**
> + * temac_dcr_setup - If the DMA is DCR based, then setup the address and
> + * I/O  functions
> + */
> +static int temac_dcr_setup(struct temac_local *lp, struct of_device *op,
> +                               struct device_node *np)
> +{
> +       unsigned int dcrs;
> +
> +       /* setup the dcr address mapping if it's in the device tree */
> +
> +       dcrs = dcr_resource_start(np, 0);
> +       if (dcrs != 0) {
> +               lp->sdma_dcrs = dcr_map(np, dcrs, dcr_resource_len(np, 0));
> +               lp->dma_in = temac_dma_dcr_in;
> +               lp->dma_out = temac_dma_dcr_out;
> +               dev_dbg(&op->dev, "DCR base: %x\n", dcrs);
> +               return 0;
> +       }
> +       /* no DCR in the device tree, indicate a failure */
> +       return -1;
> +}
> +
> +#else
> +
> +/*
> + * temac_dcr_setup - This is a stub for when DCR is not supported,
> + * such as with MicroBlaze
> + */
> +static int temac_dcr_setup(struct temac_local *lp, struct of_device *op,
> +                               struct device_node *np)
> +{
> +       return -1;
> +}
> +
> +#endif
> +
> +/**
>  * temac_dma_bd_init - Setup buffer descriptor rings
>  */
>  static int temac_dma_bd_init(struct net_device *ndev)
> @@ -156,14 +222,14 @@ static int temac_dma_bd_init(struct net_device *ndev)
>                lp->rx_bd_v[i].next = lp->rx_bd_p +
>                                sizeof(*lp->rx_bd_v) * ((i + 1) % RX_BD_NUM);
>
> -               skb = alloc_skb(XTE_MAX_JUMBO_FRAME_SIZE
> -                               + XTE_ALIGN, GFP_ATOMIC);
> +               skb = netdev_alloc_skb_ip_align(ndev,
> +                                               XTE_MAX_JUMBO_FRAME_SIZE);
> +
>                if (skb == 0) {
>                        dev_err(&ndev->dev, "alloc_skb error %d\n", i);
>                        return -1;
>                }
>                lp->rx_skb[i] = skb;
> -               skb_reserve(skb,  BUFFER_ALIGN(skb->data));
>                /* returns physical address of skb->data */
>                lp->rx_bd_v[i].phys = dma_map_single(ndev->dev.parent,
>                                                     skb->data,
> @@ -173,23 +239,23 @@ static int temac_dma_bd_init(struct net_device *ndev)
>                lp->rx_bd_v[i].app0 = STS_CTRL_APP0_IRQONEND;
>        }
>
> -       temac_dma_out32(lp, TX_CHNL_CTRL, 0x10220400 |
> +       lp->dma_out(lp, TX_CHNL_CTRL, 0x10220400 |
>                                          CHNL_CTRL_IRQ_EN |
>                                          CHNL_CTRL_IRQ_DLY_EN |
>                                          CHNL_CTRL_IRQ_COAL_EN);
>        /* 0x10220483 */
>        /* 0x00100483 */
> -       temac_dma_out32(lp, RX_CHNL_CTRL, 0xff010000 |
> +       lp->dma_out(lp, RX_CHNL_CTRL, 0xff010000 |
>                                          CHNL_CTRL_IRQ_EN |
>                                          CHNL_CTRL_IRQ_DLY_EN |
>                                          CHNL_CTRL_IRQ_COAL_EN |
>                                          CHNL_CTRL_IRQ_IOE);
>        /* 0xff010283 */
>
> -       temac_dma_out32(lp, RX_CURDESC_PTR,  lp->rx_bd_p);
> -       temac_dma_out32(lp, RX_TAILDESC_PTR,
> +       lp->dma_out(lp, RX_CURDESC_PTR,  lp->rx_bd_p);
> +       lp->dma_out(lp, RX_TAILDESC_PTR,
>                       lp->rx_bd_p + (sizeof(*lp->rx_bd_v) * (RX_BD_NUM - 1)));
> -       temac_dma_out32(lp, TX_CURDESC_PTR, lp->tx_bd_p);
> +       lp->dma_out(lp, TX_CURDESC_PTR, lp->tx_bd_p);
>
>        return 0;
>  }
> @@ -427,9 +493,9 @@ static void temac_device_reset(struct net_device *ndev)
>        temac_indirect_out32(lp, XTE_RXC1_OFFSET, val & ~XTE_RXC1_RXEN_MASK);
>
>        /* Reset Local Link (DMA) */
> -       temac_dma_out32(lp, DMA_CONTROL_REG, DMA_CONTROL_RST);
> +       lp->dma_out(lp, DMA_CONTROL_REG, DMA_CONTROL_RST);
>        timeout = 1000;
> -       while (temac_dma_in32(lp, DMA_CONTROL_REG) & DMA_CONTROL_RST) {
> +       while (lp->dma_in(lp, DMA_CONTROL_REG) & DMA_CONTROL_RST) {
>                udelay(1);
>                if (--timeout == 0) {
>                        dev_err(&ndev->dev,
> @@ -437,7 +503,7 @@ static void temac_device_reset(struct net_device *ndev)
>                        break;
>                }
>        }
> -       temac_dma_out32(lp, DMA_CONTROL_REG, DMA_TAIL_ENABLE);
> +       lp->dma_out(lp, DMA_CONTROL_REG, DMA_TAIL_ENABLE);
>
>        temac_dma_bd_init(ndev);
>
> @@ -598,7 +664,7 @@ static int temac_start_xmit(struct sk_buff *skb, struct net_device *ndev)
>                lp->tx_bd_tail = 0;
>
>        /* Kick off the transfer */
> -       temac_dma_out32(lp, TX_TAILDESC_PTR, tail_p); /* DMA start */
> +       lp->dma_out(lp, TX_TAILDESC_PTR, tail_p); /* DMA start */
>
>        return NETDEV_TX_OK;
>  }
> @@ -638,16 +704,15 @@ static void ll_temac_recv(struct net_device *ndev)
>                ndev->stats.rx_packets++;
>                ndev->stats.rx_bytes += length;
>
> -               new_skb = alloc_skb(XTE_MAX_JUMBO_FRAME_SIZE + XTE_ALIGN,
> -                               GFP_ATOMIC);
> +               new_skb = netdev_alloc_skb_ip_align(ndev,
> +                                               XTE_MAX_JUMBO_FRAME_SIZE);
> +
>                if (new_skb == 0) {
>                        dev_err(&ndev->dev, "no memory for new sk_buff\n");
>                        spin_unlock_irqrestore(&lp->rx_lock, flags);
>                        return;
>                }
>
> -               skb_reserve(new_skb, BUFFER_ALIGN(new_skb->data));
> -
>                cur_p->app0 = STS_CTRL_APP0_IRQONEND;
>                cur_p->phys = dma_map_single(ndev->dev.parent, new_skb->data,
>                                             XTE_MAX_JUMBO_FRAME_SIZE,
> @@ -662,7 +727,7 @@ static void ll_temac_recv(struct net_device *ndev)
>                cur_p = &lp->rx_bd_v[lp->rx_bd_ci];
>                bdstat = cur_p->app0;
>        }
> -       temac_dma_out32(lp, RX_TAILDESC_PTR, tail_p);
> +       lp->dma_out(lp, RX_TAILDESC_PTR, tail_p);
>
>        spin_unlock_irqrestore(&lp->rx_lock, flags);
>  }
> @@ -673,8 +738,8 @@ static irqreturn_t ll_temac_tx_irq(int irq, void *_ndev)
>        struct temac_local *lp = netdev_priv(ndev);
>        unsigned int status;
>
> -       status = temac_dma_in32(lp, TX_IRQ_REG);
> -       temac_dma_out32(lp, TX_IRQ_REG, status);
> +       status = lp->dma_in(lp, TX_IRQ_REG);
> +       lp->dma_out(lp, TX_IRQ_REG, status);
>
>        if (status & (IRQ_COAL | IRQ_DLY))
>                temac_start_xmit_done(lp->ndev);
> @@ -691,8 +756,8 @@ static irqreturn_t ll_temac_rx_irq(int irq, void *_ndev)
>        unsigned int status;
>
>        /* Read and clear the status registers */
> -       status = temac_dma_in32(lp, RX_IRQ_REG);
> -       temac_dma_out32(lp, RX_IRQ_REG, status);
> +       status = lp->dma_in(lp, RX_IRQ_REG);
> +       lp->dma_out(lp, RX_IRQ_REG, status);
>
>        if (status & (IRQ_COAL | IRQ_DLY))
>                ll_temac_recv(lp->ndev);
> @@ -793,7 +858,7 @@ static ssize_t temac_show_llink_regs(struct device *dev,
>        int i, len = 0;
>
>        for (i = 0; i < 0x11; i++)
> -               len += sprintf(buf + len, "%.8x%s", temac_dma_in32(lp, i),
> +               len += sprintf(buf + len, "%.8x%s", lp->dma_in(lp, i),
>                               (i % 8) == 7 ? "\n" : " ");
>        len += sprintf(buf + len, "\n");
>
> @@ -819,7 +884,6 @@ temac_of_probe(struct of_device *op, const struct of_device_id *match)
>        struct net_device *ndev;
>        const void *addr;
>        int size, rc = 0;
> -       unsigned int dcrs;
>
>        /* Init network device structure */
>        ndev = alloc_etherdev(sizeof(*lp));
> @@ -869,13 +933,20 @@ temac_of_probe(struct of_device *op, const struct of_device_id *match)
>                goto nodev;
>        }
>
> -       dcrs = dcr_resource_start(np, 0);
> -       if (dcrs == 0) {
> -               dev_err(&op->dev, "could not get DMA register address\n");
> -               goto nodev;
> +       /* Setup the DMA register accesses, could be DCR or memory mapped */
> +       if (temac_dcr_setup(lp, op, np)) {
> +
> +               /* no DCR in the device tree, try non-DCR */
> +               lp->sdma_regs = of_iomap(np, 0);
> +               if (lp->sdma_regs) {
> +                       lp->dma_in = temac_dma_in32;
> +                       lp->dma_out = temac_dma_out32;
> +                       dev_dbg(&op->dev, "MEM base: %p\n", lp->sdma_regs);
> +               } else {
> +                       dev_err(&op->dev, "unable to map DMA registers\n");
> +                       goto nodev;
> +               }
>        }
> -       lp->sdma_dcrs = dcr_map(np, dcrs, dcr_resource_len(np, 0));
> -       dev_dbg(&op->dev, "DCR base: %x\n", dcrs);
>
>        lp->rx_irq = irq_of_parse_and_map(np, 0);
>        lp->tx_irq = irq_of_parse_and_map(np, 1);
> --
> 1.6.2.1
>



-- 
Grant Likely, B.Sc., P.Eng.
Secret Lab Technologies Ltd.
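
The quoted patch selects the DMA register accessors at probe time through the
dma_in/dma_out function pointers, and the memory-mapped accessors convert a
DCR word index into a byte offset with (reg << 2). A minimal userspace sketch
of that pattern follows. The names struct dma_dev, mem_in(), mem_out() and
dma_setup() are hypothetical, a plain array stands in for the ioremapped
registers, and the big-endian access done by in_be32()/out_be32() is
deliberately left out.

```c
#include <stddef.h>
#include <stdint.h>

struct dma_dev {
	uint32_t *mem_regs;	/* stands in for the mapped sdma_regs */
	uint32_t (*dma_in)(struct dma_dev *, int);
	void (*dma_out)(struct dma_dev *, int, uint32_t);
};

/* Registers are numbered as 32-bit words; memory is addressed in bytes. */
static size_t reg_to_byte_offset(int reg)
{
	return (size_t)reg << 2;
}

/* Memory-mapped read (native endianness in this sketch). */
static uint32_t mem_in(struct dma_dev *d, int reg)
{
	return *(uint32_t *)((uint8_t *)d->mem_regs + reg_to_byte_offset(reg));
}

/* Memory-mapped write (native endianness in this sketch). */
static void mem_out(struct dma_dev *d, int reg, uint32_t value)
{
	*(uint32_t *)((uint8_t *)d->mem_regs + reg_to_byte_offset(reg)) = value;
}

/* Install the accessors once; callers then use d->dma_in / d->dma_out only. */
static int dma_setup(struct dma_dev *d, uint32_t *regs)
{
	if (!regs)
		return -1;	/* nothing to map, mirrors the probe failure path */
	d->mem_regs = regs;
	d->dma_in = mem_in;
	d->dma_out = mem_out;
	return 0;
}
```

After dma_setup() succeeds, a call such as d.dma_out(&d, 2, value) touches the
word at byte offset 8, and the calling code does not need to know which access
method is behind the pointers.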


* Re: [PATCH 1/1] add ethtool loopback support
From: Laurent Chavey @ 2010-04-09 18:09 UTC
  To: Jeff Garzik; +Cc: Ben Hutchings, davem, netdev, therbert
In-Reply-To: <4BBF5D72.8010603@garzik.org>

On Fri, Apr 9, 2010 at 10:01 AM, Jeff Garzik <jeff@garzik.org> wrote:
> On 04/09/2010 12:43 PM, Laurent Chavey wrote:
>>
>> Isn't the existing ETHTOOL_TEST ioctl used for something like a self test?
>>
>> The intent of this patch is to enable a mode whereby one could run
>> netperf / iperf and other applications and have the packets sent and
>> received by the driver.
>
> I said "ethtool flags interface", which is ETHTOOL_[GS]FLAGS.
>
> ethtool private flags interface would also work, ETHTOOL_[GS]PFLAGS.
>
> Both are interfaces enabling user setting/clearing of 32 on/off switches
> (bits).

agree, no need to patch

>
>        Jeff
>
>
>
>
>
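
For reference, the "32 on/off switches" model discussed above amounts to plain
bit operations on a 32-bit word. The sketch below assumes a hypothetical flag
bit, EXAMPLE_FLAG_LOOPBACK, which is not a real ethtool flag. Reading and
writing the word through the ETHTOOL_[GS]FLAGS or ETHTOOL_[GS]PFLAGS ioctls is
not shown.

```c
#include <stdint.h>

/* Hypothetical flag bit for illustration only; not a real ethtool flag. */
#define EXAMPLE_FLAG_LOOPBACK (1u << 5)

/* Return flags with the given bit switched on. */
static uint32_t flag_set(uint32_t flags, uint32_t bit)
{
	return flags | bit;
}

/* Return flags with the given bit switched off. */
static uint32_t flag_clear(uint32_t flags, uint32_t bit)
{
	return flags & ~bit;
}

/* Return 1 if the given bit is on, 0 otherwise. */
static int flag_test(uint32_t flags, uint32_t bit)
{
	return (flags & bit) != 0;
}
```

A userspace tool would typically fetch the current word, apply flag_set() or
flag_clear() to the one bit it cares about, and write the result back, so that
the other 31 switches are left untouched.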


* [PATCH linux-2.6.34-rc3] drivers/net: ks8851 MLL ethernet network driver
From: Choi, David @ 2010-04-09 17:50 UTC
  To: davem; +Cc: netdev

Hello,

From: David J. Choi <david.choi@micrel.com>
Summary:
 1. Support big-endian mode.
 2. Receive interrupt consolidation: to enhance performance, a receive interrupt is raised
    when 2 consecutive packets are received within a certain period of time.
 3. Rename a register whose name was mistyped (KS_IAHDR becomes KS_IADHR).
 4. Disable the device interrupt by controlling a device register, instead of a host
    register, when packet transmission starts.
 5. Display the network interface name after device registration, for maintenance purposes.
 6. Fix the unmapping of memory when the device is removed.

Signed-off-by: David J. Choi

---
--- linux-2.6.34-rc3/drivers/net/ks8851_mll.c.orig	2010-04-09 10:23:35.000000000 -0700
+++ linux-2.6.34-rc3/drivers/net/ks8851_mll.c	2010-04-09 10:23:23.000000000 -0700
@@ -41,6 +41,7 @@ static u8 KS_DEFAULT_MAC_ADDRESS[] = { 0
 #define RX_BUF_SIZE			2000
 
 #define KS_CCR				0x08
+#define CCR_ENDIAN			(1 << 10)
 #define CCR_EEPROM			(1 << 9)
 #define CCR_SPI				(1 << 8)
 #define CCR_8BIT			(1 << 7)
@@ -143,7 +144,7 @@ static u8 KS_DEFAULT_MAC_ADDRESS[] = { 0
 #define RXCR1_RXAE			(1 << 4)
 #define RXCR1_RXINVF			(1 << 1)
 #define RXCR1_RXE			(1 << 0)
-#define RXCR1_FILTER_MASK    		(RXCR1_RXINVF | RXCR1_RXAE | \
+#define RXCR1_FILTER_MASK		(RXCR1_RXINVF | RXCR1_RXAE | \
 					 RXCR1_RXMAFMA | RXCR1_RXPAFMA)
 
 #define KS_RXCR2			0x76
@@ -199,7 +200,7 @@ static u8 KS_DEFAULT_MAC_ADDRESS[] = { 0
 #define RXQCR_ADRFE			(1 << 4)
 #define RXQCR_SDA			(1 << 3)
 #define RXQCR_RRXEF			(1 << 0)
-#define RXQCR_CMD_CNTL                	(RXQCR_RXFCTE|RXQCR_ADRFE)
+#define RXQCR_CMD_CNTL			(RXQCR_RXFCTE|RXQCR_ADRFE|RXQCR_RXDTTE)
 
 #define KS_TXFDPR			0x84
 #define TXFDPR_TXFPAI			(1 << 14)
@@ -208,6 +209,7 @@ static u8 KS_DEFAULT_MAC_ADDRESS[] = { 0
 
 #define KS_RXFDPR			0x86
 #define RXFDPR_RXFPAI			(1 << 14)
+#define RXFDPR_ENDIAN			(1 << 11)
 
 #define KS_RXDTTR			0x8C
 #define KS_RXDBCTR			0x8E
@@ -229,7 +231,7 @@ static u8 KS_DEFAULT_MAC_ADDRESS[] = { 0
 #define IRQ_DEDI			(1 << 0)
 
 #define KS_RXFCTR			0x9C
-#define RXFCTR_THRESHOLD_MASK     	0x00FF
+#define RXFCTR_THRESHOLD_MASK		0x00FF
 
 #define KS_RXFC				0x9D
 #define RXFCTR_RXFC_MASK		(0xff << 8)
@@ -265,7 +267,7 @@ static u8 KS_DEFAULT_MAC_ADDRESS[] = { 0
 #define IACR_ADDR_SHIFT			(0)
 
 #define KS_IADLR			0xD0
-#define KS_IAHDR			0xD2
+#define KS_IADHR			0xD2
 
 #define KS_PMECR			0xD4
 #define PMECR_PME_DELAY			(1 << 14)
@@ -361,6 +363,10 @@ static u8 KS_DEFAULT_MAC_ADDRESS[] = { 0
 #define MAX_MCAST_LST			32
 #define HW_MCAST_SIZE			8
 #define MAC_ADDR_LEN			6
+/* count to consolidate rx packets */
+#define	CONSOLIDATE_INT_FIRE_CNT 2
+/* nsec to trigger an interrupt */
+#define	CONSOLIDATE_INT_FIRE_DUR 100
 
 /**
  * union ks_tx_hdr - tx header data
@@ -378,34 +384,34 @@ union ks_tx_hdr {
 
 /**
  * struct ks_net - KS8851 driver private data
- * @net_device 	: The network device we're bound to
+ * @net_device	: The network device we're bound to
  * @hw_addr	: start address of data register.
  * @hw_addr_cmd	: start address of command register.
- * @txh    	: temporaly buffer to save status/length.
+ * @txh		: temporaly buffer to save status/length.
  * @lock	: Lock to ensure that the device is not accessed when busy.
  * @pdev	: Pointer to platform device.
  * @mii		: The MII state information for the mii calls.
- * @frame_head_info   	: frame header information for multi-pkt rx.
+ * @frame_head_info	: frame header information for multi-pkt rx.
  * @statelock	: Lock on this structure for tx list.
  * @msg_enable	: The message flags controlling driver output (see ethtool).
- * @frame_cnt  	: number of frames received.
- * @bus_width  	: i/o bus width.
- * @irq    	: irq number assigned to this device.
+ * @frame_cnt	: number of frames received.
+ * @bus_width	: i/o bus width.
+ * @irq		: irq number assigned to this device.
  * @rc_rxqcr	: Cached copy of KS_RXQCR.
  * @rc_txcr	: Cached copy of KS_TXCR.
  * @rc_ier	: Cached copy of KS_IER.
- * @sharedbus  	: Multipex(addr and data bus) mode indicator.
+ * @sharedbus	: Multipex(addr and data bus) mode indicator.
  * @cmd_reg_cache	: command register cached.
  * @cmd_reg_cache_int	: command register cached. Used in the irq handler.
  * @promiscuous	: promiscuous mode indicator.
- * @all_mcast  	: mutlicast indicator.
- * @mcast_lst_size   	: size of multicast list.
- * @mcast_lst    	: multicast list.
- * @mcast_bits    	: multicast enabed.
- * @mac_addr   		: MAC address assigned to this device.
- * @fid    		: frame id.
- * @extra_byte    	: number of extra byte prepended rx pkt.
- * @enabled    		: indicator this device works.
+ * @all_mcast	: mutlicast indicator.
+ * @mcast_lst_size	: size of multicast list.
+ * @mcast_lst		: multicast list.
+ * @mcast_bits		: multicast enabed.
+ * @mac_addr		: MAC address assigned to this device.
+ * @fid			: frame id.
+ * @extra_byte		: number of extra byte prepended rx pkt.
+ * @enabled		: indicator this device works.
  *
  * The @lock ensures that the chip is protected when certain operations are
  * in progress. When the read or write packet transfer is in progress, most
@@ -426,10 +432,10 @@ struct type_frame_head {
 
 struct ks_net {
 	struct net_device	*netdev;
-	void __iomem    	*hw_addr;
-	void __iomem    	*hw_addr_cmd;
+	void __iomem		*hw_addr;
+	void __iomem		*hw_addr_cmd;
 	union ks_tx_hdr		txh ____cacheline_aligned;
-	struct mutex      	lock; /* spinlock to be interrupt safe */
+	struct mutex		lock; /* spinlock to be interrupt safe */
 	struct platform_device *pdev;
 	struct mii_if_info	mii;
 	struct type_frame_head	*frame_head_info;
@@ -437,7 +443,7 @@ struct ks_net {
 	u32			msg_enable;
 	u32			frame_cnt;
 	int			bus_width;
-	int             	irq;
+	int			irq;
 
 	u16			rc_rxqcr;
 	u16			rc_txcr;
@@ -463,10 +469,27 @@ static int msg_enable;
 #define ks_dbg(_ks, _msg...) dev_dbg(&(_ks)->pdev->dev, _msg)
 #define ks_err(_ks, _msg...) dev_err(&(_ks)->pdev->dev, _msg)
 
+#if defined(__LITTLE_ENDIAN)
 #define BE3             0x8000      /* Byte Enable 3 */
 #define BE2             0x4000      /* Byte Enable 2 */
 #define BE1             0x2000      /* Byte Enable 1 */
 #define BE0             0x1000      /* Byte Enable 0 */
+#define MAKE_ADDR16(offset)	((u16)offset | ((BE1 | BE0) << (offset & 0x02)))
+#define MAKE_ADDR8(offset, shift_bit)	((u16)offset | (BE0 << shift_bit))
+#define	MAKE_SHIFT_INDEX(offset)	((offset & 1) << 3)
+#define	MAKE_SHIFT_VALUE(offset)	((offset & 1) << 8)
+#define	CONV_MAC_VALUE(u)	(((u & 0xFF) << 8) | ((u >> 8) & 0xFF))
+#else
+#define BE3             0x1000      /* Byte Enable 3 */
+#define BE2             0x2000      /* Byte Enable 2 */
+#define BE1             0x4000      /* Byte Enable 1 */
+#define BE0             0x8000      /* Byte Enable 0 */
+#define MAKE_ADDR16(offset)	((u16)offset | ((BE1 | BE0) >> (offset & 0x02)))
+#define MAKE_ADDR8(offset, shift_bit)	((u16)offset | (BE0 >> shift_bit))
+#define	MAKE_SHIFT_INDEX(offset)	(!(offset & 1) << 3)
+#define	MAKE_SHIFT_VALUE(offset)	(!(offset & 1) << 8)
+#define	CONV_MAC_VALUE(u)		u
+#endif
 
 /**
  * register read/write calls.
@@ -487,8 +510,8 @@ static u8 ks_rdreg8(struct ks_net *ks, i
 {
 	u16 data;
 	u8 shift_bit = offset & 0x03;
-	u8 shift_data = (offset & 1) << 3;
-	ks->cmd_reg_cache = (u16) offset | (u16)(BE0 << shift_bit);
+	u8 shift_data = MAKE_SHIFT_INDEX(offset);
+	ks->cmd_reg_cache = MAKE_ADDR8(offset, shift_bit);
 	iowrite16(ks->cmd_reg_cache, ks->hw_addr_cmd);
 	data  = ioread16(ks->hw_addr);
 	return (u8)(data >> shift_data);
@@ -504,7 +527,7 @@ static u8 ks_rdreg8(struct ks_net *ks, i
 
 static u16 ks_rdreg16(struct ks_net *ks, int offset)
 {
-	ks->cmd_reg_cache = (u16)offset | ((BE1 | BE0) << (offset & 0x02));
+	ks->cmd_reg_cache = MAKE_ADDR16(offset);
 	iowrite16(ks->cmd_reg_cache, ks->hw_addr_cmd);
 	return ioread16(ks->hw_addr);
 }
@@ -519,8 +542,8 @@ static u16 ks_rdreg16(struct ks_net *ks,
 static void ks_wrreg8(struct ks_net *ks, int offset, u8 value)
 {
 	u8  shift_bit = (offset & 0x03);
-	u16 value_write = (u16)(value << ((offset & 1) << 3));
-	ks->cmd_reg_cache = (u16)offset | (BE0 << shift_bit);
+	u16 value_write = (u16)(value << (MAKE_SHIFT_INDEX(offset)));
+	ks->cmd_reg_cache = MAKE_ADDR8(offset, shift_bit);
 	iowrite16(ks->cmd_reg_cache, ks->hw_addr_cmd);
 	iowrite16(value_write, ks->hw_addr);
 }
@@ -535,7 +558,7 @@ static void ks_wrreg8(struct ks_net *ks,
 
 static void ks_wrreg16(struct ks_net *ks, int offset, u16 value)
 {
-	ks->cmd_reg_cache = (u16)offset | ((BE1 | BE0) << (offset & 0x02));
+	ks->cmd_reg_cache = MAKE_ADDR16(offset);
 	iowrite16(ks->cmd_reg_cache, ks->hw_addr_cmd);
 	iowrite16(value, ks->hw_addr);
 }
@@ -571,6 +594,9 @@ static inline void ks_outblk(struct ks_n
 static void ks_disable_int(struct ks_net *ks)
 {
 	ks_wrreg16(ks, KS_IER, 0x0000);
+	/* guarantee that device interrupt is disabled. */
+	ks_rdreg16(ks, KS_IER);
+
 }  /* ks_disable_int */
 
 static void ks_enable_int(struct ks_net *ks)
@@ -603,7 +629,7 @@ static inline void ks_save_cmd_reg(struc
 
 /**
  * ks_restore_cmd_reg - restore the command register from the cache and
- * 	write to hardware register.
+ * write to hardware register.
  * @ks: The chip information
  *
  */
@@ -639,8 +665,10 @@ static void ks_set_powermode(struct ks_n
  * ks_read_config - read chip configuration of bus width.
  * @ks: The chip information
  *
+ * When endianness does not match between hardware and software, return false.
+ *
  */
-static void ks_read_config(struct ks_net *ks)
+static bool ks_read_config(struct ks_net *ks)
 {
 	u16 reg_data = 0;
 
@@ -648,6 +676,17 @@ static void ks_read_config(struct ks_net
 	reg_data = ks_rdreg8(ks, KS_CCR) & 0x00FF;
 	reg_data |= ks_rdreg8(ks, KS_CCR+1) << 8;
 
+#if defined(__LITTLE_ENDIAN)
+	if (!(reg_data & CCR_ENDIAN)) {
+		printk(KERN_ERR "%s: Endian mode error\n", __func__);
+		return false;
+	}
+#else
+	if (reg_data & CCR_ENDIAN) {
+		printk(KERN_ERR "%s: Endian mode error\n", __func__);
+		return false;
+	}
+#endif
 	/* addr/data bus are multiplexed */
 	ks->sharedbus = (reg_data & CCR_SHARED) == CCR_SHARED;
 
@@ -665,6 +704,7 @@ static void ks_read_config(struct ks_net
 		ks->bus_width = ENUM_BUS_32BIT;
 		ks->extra_byte = 4;
 	}
+	return true;
 }
 
 /**
@@ -809,7 +849,7 @@ static void ks_rcv(struct ks_net *ks, st
 			skb->protocol = eth_type_trans(skb, netdev);
 			netif_rx(skb);
 		} else {
-			printk(KERN_ERR "%s: err:skb alloc\n", __func__);
+			printk(KERN_ERR "err:skb alloc or frame\n");
 			ks_wrreg16(ks, KS_RXQCR, (ks->rc_rxqcr | RXQCR_RRXEF));
 			if (skb)
 				dev_kfree_skb_irq(skb);
@@ -985,9 +1025,13 @@ static int ks_net_stop(struct net_device
 static void ks_write_qmu(struct ks_net *ks, u8 *pdata, u16 len)
 {
 	/* start header at txb[0] to align txw entries */
+#if defined(__LITTLE_ENDIAN)
 	ks->txh.txw[0] = 0;
 	ks->txh.txw[1] = cpu_to_le16(len);
-
+#else
+	ks->txh.txw[0] = 0;
+	ks->txh.txw[1] = cpu_to_be16(len);
+#endif
 	/* 1. set sudo-DMA mode */
 	ks_wrreg8(ks, KS_RXQCR, (ks->rc_rxqcr | RXQCR_SDA) & 0xff);
 	/* 2. write status/lenth info */
@@ -1017,7 +1061,6 @@ static int ks_start_xmit(struct sk_buff 
 	int retv = NETDEV_TX_OK;
 	struct ks_net *ks = netdev_priv(netdev);
 
-	disable_irq(netdev->irq);
 	ks_disable_int(ks);
 	spin_lock(&ks->statelock);
 
@@ -1032,7 +1075,6 @@ static int ks_start_xmit(struct sk_buff 
 		retv = NETDEV_TX_BUSY;
 	spin_unlock(&ks->statelock);
 	ks_enable_int(ks);
-	enable_irq(netdev->irq);
 	return retv;
 }
 
@@ -1224,20 +1266,16 @@ static void ks_set_rx_mode(struct net_de
 static void ks_set_mac(struct ks_net *ks, u8 *data)
 {
 	u16 *pw = (u16 *)data;
-	u16 w, u;
+	u16 w;
 
 	ks_stop_rx(ks);  /* Stop receiving for reconfiguration */
-
-	u = *pw++;
-	w = ((u & 0xFF) << 8) | ((u >> 8) & 0xFF);
+	w = CONV_MAC_VALUE(*pw);
 	ks_wrreg16(ks, KS_MARH, w);
 
-	u = *pw++;
-	w = ((u & 0xFF) << 8) | ((u >> 8) & 0xFF);
+	w = CONV_MAC_VALUE(*(pw+1));
 	ks_wrreg16(ks, KS_MARM, w);
 
-	u = *pw;
-	w = ((u & 0xFF) << 8) | ((u >> 8) & 0xFF);
+	w = CONV_MAC_VALUE(*(pw+2));
 	ks_wrreg16(ks, KS_MARL, w);
 
 	memcpy(ks->mac_addr, data, 6);
@@ -1461,8 +1499,10 @@ static void ks_setup(struct ks_net *ks)
 	/* Setup Receive Frame Data Pointer Auto-Increment */
 	ks_wrreg16(ks, KS_RXFDPR, RXFDPR_RXFPAI);
 
-	/* Setup Receive Frame Threshold - 1 frame (RXFCTFC) */
-	ks_wrreg16(ks, KS_RXFCTR, 1 & RXFCTR_THRESHOLD_MASK);
+
+	/* Setup Receive Frame Threshold */
+	ks_wrreg16(ks, KS_RXFCTR, CONSOLIDATE_INT_FIRE_CNT);
+	ks_wrreg16(ks, KS_RXDTTR, CONSOLIDATE_INT_FIRE_DUR);
 
 	/* Setup RxQ Command Control (RXQCR) */
 	ks->rc_rxqcr = RXQCR_CMD_CNTL;
@@ -1585,7 +1625,19 @@ static int __devinit ks8851_probe(struct
 	ks->msg_enable = netif_msg_init(msg_enable, (NETIF_MSG_DRV |
 						     NETIF_MSG_PROBE |
 						     NETIF_MSG_LINK));
-	ks_read_config(ks);
+	/* set endian mode, which is write-only */
+	data = ks_rdreg16(ks, KS_RXFDPR);
+#if defined(__LITTLE_ENDIAN)
+	data &= ~RXFDPR_ENDIAN;
+#else
+	data |= RXFDPR_ENDIAN;
+#endif
+	ks_wrreg16(ks, KS_RXFDPR, data);
+
+	if (!ks_read_config(ks)) {
+		err = -ENODEV;
+		goto err_register;
+	}
 
 	/* simple check for a valid chip being connected to the bus */
 	if ((ks_rdreg16(ks, KS_CIDER) & ~CIDER_REV_MASK) != CIDER_ID) {
@@ -1604,6 +1656,7 @@ static int __devinit ks8851_probe(struct
 	if (err)
 		goto err_register;
 
+	printk(KERN_INFO "Network Interface of ks8851: %s\n", netdev->name);
 	platform_set_drvdata(pdev, netdev);
 
 	ks_soft_reset(ks, GRR_GSR);
@@ -1655,6 +1708,7 @@ static int __devexit ks8851_remove(struc
 	kfree(ks->frame_head_info);
 	unregister_netdev(netdev);
 	iounmap(ks->hw_addr);
+	iounmap(ks->hw_addr_cmd);
 	free_netdev(netdev);
 	release_mem_region(iomem->start, resource_size(iomem));
 	platform_set_drvdata(pdev, NULL);

---

^ permalink raw reply

* RE: net-next: 2.6.34-rc1 regression: panic when running diagnostic on interface with IPv6
From: Tantilov, Emil S @ 2010-04-09 17:50 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: netdev@vger.kernel.org, David Miller
In-Reply-To: <25868121.76371270825640054.JavaMail.root@tahiti.vyatta.com>

[-- Attachment #1: Type: text/plain, Size: 543 bytes --]

Stephen Hemminger wrote:
> Send me your kernel config. And are you running tests online or
> offline 

Config attached. The kernel runs on top of RHEL5.4, not sure if that is significant, but should explain some of the sysfs deprecated options enabled in it. 

The test is offline (online doesn't do much other than link check).

I also reproduced it with a fresh pull from this morning. Looks like it is easier to reproduce after you pass some traffic, just assigning IPv6 address may not be enough. Also rmmod hangs.

Thanks,
Emil

[-- Attachment #2: config.gz --]
[-- Type: application/x-gzip, Size: 12115 bytes --]

^ permalink raw reply

* Re: [-next April 8] eHEA driver failure on powerpc
From: Sachin Sant @ 2010-04-09 17:48 UTC (permalink / raw)
  To: Linux/PPC Development
  Cc: linux-next, netdev, HERING2, Anton Blanchard,
	Benjamin Herrenschmidt
In-Reply-To: <4BBDC447.3040901@in.ibm.com>

Sachin Sant wrote:
> With today's next release, eHEA network interface on couple
> of power6 boxes fails to initialize.
>
> # modprobe ehea
> IBM eHEA ethernet device driver (Release EHEA_0102)
>  alloc irq_desc for 256 on node 0
>  alloc kstat_irqs on node 0
> irq: irq 590080 on host null mapped to virtual irq 256
> ehea: Error in ehea_plpar_hcall_norets: opcode=26c 
> ret=fffffffffffffffc arg1=8000000003000000 arg2=0 
> arg3=7000000000050400 arg4=fc9b0000 arg5=200 arg6=0 arg7=0
> ehea: Error in ehea_reg_mr_section: register_rpage_mr failed
> ehea: Error in ehea_reg_kernel_mr: registering mr failed
> ehea: Error in ehea_setup_ports: creating MR failed
> ehea 23c00200.lhea: setup_ports failed
> ehea: probe of 23c00200.lhea failed with error -5
I tracked this problem to the following commit.

commit 7545ba6f82924d4523f8f8a2baf2e517a750265d
powerpc/mm: Bump SECTION_SIZE_BITS from 16MB to 256MB

If I revert this commit, the network interface is initialized
properly. Verified this on two different power6 boxes.

Thanks
-Sachin

-- 

---------------------------------
Sachin Sant
IBM Linux Technology Center
India Systems and Technology Labs
Bangalore, India
---------------------------------

^ permalink raw reply

* Re: [PATCH] vhost: Make it more scalable by creating a vhost thread per device.
From: Rick Jones @ 2010-04-09 17:13 UTC (permalink / raw)
  To: Sridhar Samudrala
  Cc: Michael S. Tsirkin, Tom Lendacky, netdev, kvm@vger.kernel.org
In-Reply-To: <1270827580.25555.13.camel@w-sridhar.beaverton.ibm.com>

Sridhar Samudrala wrote:
> On Thu, 2010-04-08 at 17:14 -0700, Rick Jones wrote:
> 
>>>Here are the results with netperf TCP_STREAM 64K guest to host on a
>>>8-cpu Nehalem system.
>>
>>I presume you mean 8 core Nehalem-EP, or did you mean 8 processor Nehalem-EX?
> 
> 
> Yes. It is a 2 socket quad-core Nehalem, so I guess it is an 8 core
> Nehalem-EP.
> 
>>Don't get me wrong, I *like* the netperf 64K TCP_STREAM test, I like it a lot!-) 
>>but I find it incomplete and also like to run things like single-instance TCP_RR 
>>and multiple-instance, multiple "transaction" (./configure --enable-burst) 
>>TCP_RR tests, particularly when concerned with "scaling" issues.
> 
> 
> Can we run multiple instance and multiple transaction tests with a
> single netperf commandline?

Do you count a shell for loop as a single command line?

> Is there any easy way to get consolidated throughput when a netserver on
> the host is servicing netperf clients from multiple guests?

I tend to use a script such as:

ftp://ftp.netperf.org/netperf/misc/runemomniagg2.sh

which presumes that netperf/netserver have been built with:

./configure --enable-omni --enable-burst ...

and uses the CSV output format of the omni tests.  When I want sums I then turn 
to a spreadsheet, or I suppose I could turn to awk etc.

The TCP_RR test can be flipped around request size for response size etc, so 
when I have a single system under test, I initiate the netperf commands on it, 
targeting netservers on the clients.  If I want inbound bulk throughput I use 
the TCP_MAERTS test rather than the TCP_STREAM test.

happy benchmarking,

rick jones

^ permalink raw reply

* Re: pull request: wireless-2.6 2010-04-09
From: David Miller @ 2010-04-09 17:03 UTC (permalink / raw)
  To: linville
  Cc: jeff.chua.linux, shanyu.zhao, reinette.chatre, stable,
	linux-kernel, torvalds, viro, wey-yi.w.guy, linux-wireless,
	netdev
In-Reply-To: <20100409153807.GA3014@tuxdriver.com>

From: "John W. Linville" <linville@tuxdriver.com>
Date: Fri, 9 Apr 2010 11:38:07 -0400

> This fix is intended for 2.6.34.  It resolves an issue involving an
> Oops on boxes w/ iwl4965 hardware, apparently introduced by another
> recent patch.  The thread describing the issue and the resolution
> is here:
> 
> 	http://marc.info/?l=linux-kernel&m=127074721531649&w=2
> 
> In order to avoid reverting that patch, please accept this fix instead.
> As usual, please let me know if there are problems!

Pulled, thanks John.

^ permalink raw reply

* Re: [PATCH 1/1] add ethtool loopback support
From: Jeff Garzik @ 2010-04-09 17:01 UTC (permalink / raw)
  To: Laurent Chavey; +Cc: Ben Hutchings, davem, netdev, therbert
In-Reply-To: <x2o97949e3e1004090943o4b6b29e5pd261e2cb4c7f421d@mail.gmail.com>

On 04/09/2010 12:43 PM, Laurent Chavey wrote:
> isn't the existing ETHTOOL_TEST ioctl used for something like self test ?
>
> the intent of this patch is to enable a mode whereby one could run
> netperf / iperf and other application  and have the packets sent and
> received by the driver.

I said "ethtool flags interface", which is ETHTOOL_[GS]FLAGS.

ethtool private flags interface would also work, ETHTOOL_[GS]PFLAGS.

Both are interfaces enabling user setting/clearing of 32 on/off switches 
(bits).

	Jeff





^ permalink raw reply

* Re: [PATCH 1/1] add ethtool loopback support
From: Ben Hutchings @ 2010-04-09 16:55 UTC (permalink / raw)
  To: Laurent Chavey; +Cc: Jeff Garzik, davem, netdev, therbert
In-Reply-To: <x2o97949e3e1004090943o4b6b29e5pd261e2cb4c7f421d@mail.gmail.com>

Don't top-post.

On Fri, 2010-04-09 at 09:43 -0700, Laurent Chavey wrote:
> isn't the existing ETHTOOL_TEST ioctl used for something like self test ?
> 
> the intent of this patch is to enable a mode whereby one could run
> netperf / iperf and other application  and have the packets sent and
> received by the driver.
[...]

If you send to a local address, the traffic will be routed over the
internal loopback interface.  Applications will not use a network
interface in loopback unless they override routing or send raw packets.
netperf and iperf don't allow you to do that, so far as I'm aware.

Also those applications are *performance* tests; they are not very
useful for fault-finding.

Ben.

-- 
Ben Hutchings, Senior Software Engineer, Solarflare Communications
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.


^ permalink raw reply

* Re: [PATCH 1/1] add ethtool loopback support
From: Laurent Chavey @ 2010-04-09 16:43 UTC (permalink / raw)
  To: Jeff Garzik; +Cc: Ben Hutchings, davem, netdev, therbert
In-Reply-To: <4BBE78E2.2000709@garzik.org>

isn't the existing ETHTOOL_TEST ioctl used for something like self test ?

the intent of this patch is to enable a mode whereby one could run
netperf / iperf and other application  and have the packets sent and
received by the driver.


On Thu, Apr 8, 2010 at 5:46 PM, Jeff Garzik <jeff@garzik.org> wrote:
> On 04/08/2010 06:43 PM, Laurent Chavey wrote:
>>
>> On Thu, Apr 8, 2010 at 12:35 PM, Ben Hutchings
>> <bhutchings@solarflare.com>  wrote:
>>>
>>> On Thu, 2010-04-08 at 12:17 -0700, Laurent Chavey wrote:
>>>>
>>>> On Thu, Apr 8, 2010 at 11:29 AM, Ben Hutchings
>>>> <bhutchings@solarflare.com>  wrote:
>>>>>
>>>>> On Thu, 2010-04-08 at 10:35 -0700, chavey@google.com wrote:
>>>
>>> [...]
>>>>>>
>>>>>> +enum ethtool_loopback_type {
>>>>>> +     ETH_MAC                 = 0x00000001,
>>>>>> +     ETH_PHY_INT             = 0x00000002,
>>>>>> +     ETH_PHY_EXT             = 0x00000004
>>>>>> +};
>>>>>
>>>>> [...]
>>>>>
>>>>> There are many different places you can loop back within a MAC or PHY,
>>>>> not to mention bypassing the MAC altogether.  See
>>>>> drivers/net/sfc/mcdi_pcol.h, starting from the line
>>>>> '#define MC_CMD_LOOPBACK_NONE 0'.  I believe we implement all of those
>>>>> loopback modes on at least one board.
>>>>>
>>>>> Also are these supposed to be an enumeration or flags?  In theory you
>>>>
>>>> those are enums that can be OR'ed together.
>>>
>>> I.e. they are flags.  So how do you answer this:
>>>
>>>>> could use wire-side and host-side loopback at the same time if they
>>>>> don't overlap, but it's probably too much trouble to bother with.  Any
>>>>> other combination is meaningless.
>>
>> since the intent is to enable the sending and receiving of packets at
>> the hw/driver interfaces, a simple loopback mode on/off is sufficient
>> and the ethtool_loopback_type are not necessary. the implementor can
>> choose
>> how to implement the loopback. From drivers/net/sfc/mcdi_pcol.h it is
>> clear
>> that unless ethtool_loopback_type as defined are meaningless.
>
> If an off/on switch is sufficient, the existing ethtool flags interface
> should work just fine.
>
>        Jeff
>
>
>
>

^ permalink raw reply

* Re: HTB - What's the minimal value for 'rate' parameter?
From: Antonio Almeida @ 2010-04-09 15:40 UTC (permalink / raw)
  To: Jarek Poplawski; +Cc: netdev, kaber, davem, devik
In-Reply-To: <4BBE4BB4.1060209@gmail.com>

So, what about the rate limit miss?
As you can see the ceil of class 1:2 is set to 4096Kbit but its
sending rate is actually 8071Kbit!
It looks like classes 1:10 and 1:11 are ignoring hierarchical rate
restrictions of class 1:2
Here:
class htb 1:2 parent 1:1 rate 4096Kbit ceil 4096Kbit burst 3655b cburst 3655b
 Sent 84285894 bytes 55671 pkt (dropped 0, overlimits 0 requeues 0)
 rate 8071Kbit 666pps backlog 0b 0p requeues 0
 lended: 0 borrowed: 0 giants: 0
 tokens: -937499999 ctokens: -937499999




On Thu, Apr 8, 2010 at 10:33 PM, Jarek Poplawski <jarkao2@gmail.com> wrote:
> Antonio Almeida wrote, On 04/08/2010 01:07 PM:
>
>> Hi!
>
> Hi!
>
>> I've been using HTB for a while, and we've already sent some e-mails
>> each other when resolving HTB accuracy issue.
>> When using HTB, I realised that for some configurations the rate limit
>> doesn't work.
>> I suspect that the problem is the minimum value of the rate parameter,
>> which I can't figure out.
>>
>> A simple configuration that turns out to go wrong is as follows: The
>> root (1:1) gets the link bandwidth configuration; the second (1:2) is
>> set to 4096Kbit; then I have two branches (1:10 and 1:11) with rate
>> 1024Kbit and ceil 4096Kbit; and finally a leaf class in each branch
>> (1:111 below 1:11, and 1:101 below 1:10) with rate 8bit and ceil
>> 4096Kbit, and the same priority.
>> I don't want a sustained rate, and since I must configure the 'rate'
>> parameter I decided to set it to 8bit, which is the minimal accepted
>> value. My suspicion falls on the 'rate' parameter. If I set 'rate' to
>> 1Kbit for instance, the problem disappears and the shaping is done
>> perfectly.
>>
>> So, I'm looking for help to find out if the problem is actually in
>> this parameter configuration or if it's just coincidence and I'll get
>> the same problem ahead :(
>> What's the minimal value for 'rate' parameter using HTB qdisc?
>
>
> I think "reasonable" or "minimally useful" (for common use) should be
> enough, and 8bits meaning one 1500 byte packet per 25 minutes or
> something, doesn't look like this to me.
>
> This changelog:
> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=a4a710c4a7490587406462bf1d54504b7783d7d7
> mentions ~2 minutes as max time for accounting, so 1 max packet
> per 2 minutes should give such a minimal rate, I guess, but I'd
> still multiply it a few times to call it useful.
>
> Regards,
> Jarek P.
>
>
>

^ permalink raw reply

* Re: [PATCH] vhost: Make it more scalable by creating a vhost thread per device.
From: Sridhar Samudrala @ 2010-04-09 15:39 UTC (permalink / raw)
  To: Rick Jones; +Cc: Michael S. Tsirkin, Tom Lendacky, netdev, kvm@vger.kernel.org
In-Reply-To: <4BBE716B.7050904@hp.com>

On Thu, 2010-04-08 at 17:14 -0700, Rick Jones wrote:
> > Here are the results with netperf TCP_STREAM 64K guest to host on a
> > 8-cpu Nehalem system.
> 
> I presume you mean 8 core Nehalem-EP, or did you mean 8 processor Nehalem-EX?

Yes. It is a 2 socket quad-core Nehalem, so I guess it is an 8 core
Nehalem-EP.
> 
> Don't get me wrong, I *like* the netperf 64K TCP_STREAM test, I like it a lot!-) 
> but I find it incomplete and also like to run things like single-instance TCP_RR 
> and multiple-instance, multiple "transaction" (./configure --enable-burst) 
> TCP_RR tests, particularly when concerned with "scaling" issues.

Can we run multiple instance and multiple transaction tests with a
single netperf commandline?

Is there any easy way to get consolidated throughput when a netserver on
the host is servicing netperf clients from multiple guests?

Thanks
Sridhar

> 
> happy benchmarking,
> 
> rick jones
> 
> > It shows cumulative bandwidth in Mbps and host 
> > CPU utilization.
> > 
> > Current default single vhost thread
> > -----------------------------------
> > 1 guest:  12500  37%    
> > 2 guests: 12800  46%
> > 3 guests: 12600  47%
> > 4 guests: 12200  47%
> > 5 guests: 12000  47%
> > 6 guests: 11700  47%
> > 7 guests: 11340  47%
> > 8 guests: 11200  48%
> > 
> > vhost thread per cpu
> > --------------------
> > 1 guest:   4900 25%
> > 2 guests: 10800 49%
> > 3 guests: 17100 67%
> > 4 guests: 20400 84%
> > 5 guests: 21000 90%
> > 6 guests: 22500 92%
> > 7 guests: 23500 96%
> > 8 guests: 24500 99%
> > 
> > vhost thread per guest interface
> > --------------------------------
> > 1 guest:  12500 37%
> > 2 guests: 21000 72%
> > 3 guests: 21600 79%
> > 4 guests: 21600 85%
> > 5 guests: 22500 89%
> > 6 guests: 22800 94%
> > 7 guests: 24500 98%
> > 8 guests: 26400 99%
> > 
> > Thanks
> > Sridhar
> > 
> > 
> > --
> > To unsubscribe from this list: send the line "unsubscribe netdev" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html


^ permalink raw reply

* pull request: wireless-2.6 2010-04-09
From: John W. Linville @ 2010-04-09 15:38 UTC (permalink / raw)
  To: davem-fT/PcQaiUtIeIZ0/mPfg9Q
  Cc: Jeff Chua, Zhao, Shanyu, Chatre, Reinette,
	stable-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org, Linux Kernel,
	Linus Torvalds, Al Viro, Guy, Wey-Yi,
	linux-wireless-u79uwXL29TY76Z2rM5mHXA,
	netdev-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <1270762107.20845.3.camel@wwguy-ubuntu>

Dave,

This fix is intended for 2.6.34.  It resolves an issue involving an
Oops on boxes w/ iwl4965 hardware, apparently introduced by another
recent patch.  The thread describing the issue and the resolution
is here:

	http://marc.info/?l=linux-kernel&m=127074721531649&w=2

In order to avoid reverting that patch, please accept this fix instead.
As usual, please let me know if there are problems!

Thanks,

John

---

The following changes since commit 2626419ad5be1a054d350786b684b41d23de1538:
  David S. Miller (1):
        tcp: Set CHECKSUM_UNNECESSARY in tcp_init_nondata_skb

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6.git master

Wey-Yi Guy (1):
      iwlwifi: need check for valid qos packet before free

 drivers/net/wireless/iwlwifi/iwl-4965.c |   13 +++++++++----
 1 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/drivers/net/wireless/iwlwifi/iwl-4965.c b/drivers/net/wireless/iwlwifi/iwl-4965.c
index 83c52a6..8972166 100644
--- a/drivers/net/wireless/iwlwifi/iwl-4965.c
+++ b/drivers/net/wireless/iwlwifi/iwl-4965.c
@@ -2015,7 +2015,9 @@ static void iwl4965_rx_reply_tx(struct iwl_priv *priv,
 			IWL_DEBUG_TX_REPLY(priv, "Retry scheduler reclaim scd_ssn "
 					   "%d index %d\n", scd_ssn , index);
 			freed = iwl_tx_queue_reclaim(priv, txq_id, index);
-			iwl_free_tfds_in_queue(priv, sta_id, tid, freed);
+			if (qc)
+				iwl_free_tfds_in_queue(priv, sta_id,
+						       tid, freed);
 
 			if (priv->mac80211_registered &&
 			    (iwl_queue_space(&txq->q) > txq->q.low_mark) &&
@@ -2041,14 +2043,17 @@ static void iwl4965_rx_reply_tx(struct iwl_priv *priv,
 				   tx_resp->failure_frame);
 
 		freed = iwl_tx_queue_reclaim(priv, txq_id, index);
-		iwl_free_tfds_in_queue(priv, sta_id, tid, freed);
+		if (qc && likely(sta_id != IWL_INVALID_STATION))
+			iwl_free_tfds_in_queue(priv, sta_id, tid, freed);
+		else if (sta_id == IWL_INVALID_STATION)
+			IWL_DEBUG_TX_REPLY(priv, "Station not known\n");
 
 		if (priv->mac80211_registered &&
 		    (iwl_queue_space(&txq->q) > txq->q.low_mark))
 			iwl_wake_queue(priv, txq_id);
 	}

^ permalink raw reply related

* Re: net-next: 2.6.34-rc1 regression: panic when running diagnostic on interface with IPv6
From: Stephen Hemminger @ 2010-04-09 15:07 UTC (permalink / raw)
  To: Emil S Tantilov; +Cc: netdev, David Miller
In-Reply-To: <EA929A9653AAE14F841771FB1DE5A1365FE4F6B39F@rrsmsx501.amr.corp.intel.com>

Send me your kernel config. And are you running tests online or offline

^ permalink raw reply

* Re: mmotm 2010-04-05-16-09 uploaded
From: Patrick McHardy @ 2010-04-09 14:49 UTC (permalink / raw)
  To: Valdis.Kletnieks
  Cc: Andrew Morton, Peter Zijlstra, Ingo Molnar, David S. Miller,
	linux-kernel, netfilter-devel, netdev
In-Reply-To: <8356.1270774206@localhost>

[-- Attachment #1: Type: text/plain, Size: 970 bytes --]

Valdis.Kletnieks@vt.edu wrote:
> On Thu, 08 Apr 2010 17:36:07 +0200, Patrick McHardy said:
> 
>> Valdis.Kletnieks@vt.edu wrote:
> 
>>> Well, it *changed* it.  Does the rcu_dereference_check() only fire on the
>>> first time it hits something, so we've fixed the first one and now we get to
>>> see the second one?
>> It appears that way, otherwise you should have seen a second warning in
>> nf_conntrack_ecache the last time.
>>
>>> (For what it's worth, if this is going to be one-at-a-time whack-a-mole, I'm
>>> OK on that, just want to know up front.)
>> I went through the other files and I believe this should be it.
>> We already removed most of these incorrect rcu_dereference()
>> calls a while back.
> 
> Confirming - the second version of the patch fixes all the network-related
> RCU complaints I've been able to trigger...

Thanks. I've added the attached commit to the nf-next tree. I'll push
it to Dave shortly so this can get included in the next tree.


[-- Attachment #2: x --]
[-- Type: text/plain, Size: 4009 bytes --]

>From ed86308f6179d8fa6151c2d0f652aad0091548e2 Mon Sep 17 00:00:00 2001
From: Patrick McHardy <kaber@trash.net>
Date: Fri, 9 Apr 2010 16:42:15 +0200
Subject: [PATCH] netfilter: remove invalid rcu_dereference() calls

The CONFIG_PROVE_RCU option discovered a few invalid uses of
rcu_dereference() in netfilter. In all these cases, the code
intends to check whether a pointer is already assigned when
performing registration or whether the assigned pointer matches
when performing unregistration. The entire registration/
unregistration is protected by a mutex, so we don't need the
rcu_dereference() calls.

Reported-by: Valdis Kletnieks <Valdis.Kletnieks@vt.edu>
Tested-by: Valdis Kletnieks <Valdis.Kletnieks@vt.edu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
---
 net/netfilter/nf_conntrack_ecache.c |   18 ++++--------------
 net/netfilter/nf_log.c              |    8 ++------
 2 files changed, 6 insertions(+), 20 deletions(-)

diff --git a/net/netfilter/nf_conntrack_ecache.c b/net/netfilter/nf_conntrack_ecache.c
index d5a9bcd..849614a 100644
--- a/net/netfilter/nf_conntrack_ecache.c
+++ b/net/netfilter/nf_conntrack_ecache.c
@@ -81,11 +81,9 @@ EXPORT_SYMBOL_GPL(nf_ct_deliver_cached_events);
 int nf_conntrack_register_notifier(struct nf_ct_event_notifier *new)
 {
 	int ret = 0;
-	struct nf_ct_event_notifier *notify;
 
 	mutex_lock(&nf_ct_ecache_mutex);
-	notify = rcu_dereference(nf_conntrack_event_cb);
-	if (notify != NULL) {
+	if (nf_conntrack_event_cb != NULL) {
 		ret = -EBUSY;
 		goto out_unlock;
 	}
@@ -101,11 +99,8 @@ EXPORT_SYMBOL_GPL(nf_conntrack_register_notifier);
 
 void nf_conntrack_unregister_notifier(struct nf_ct_event_notifier *new)
 {
-	struct nf_ct_event_notifier *notify;
-
 	mutex_lock(&nf_ct_ecache_mutex);
-	notify = rcu_dereference(nf_conntrack_event_cb);
-	BUG_ON(notify != new);
+	BUG_ON(nf_conntrack_event_cb != new);
 	rcu_assign_pointer(nf_conntrack_event_cb, NULL);
 	mutex_unlock(&nf_ct_ecache_mutex);
 }
@@ -114,11 +109,9 @@ EXPORT_SYMBOL_GPL(nf_conntrack_unregister_notifier);
 int nf_ct_expect_register_notifier(struct nf_exp_event_notifier *new)
 {
 	int ret = 0;
-	struct nf_exp_event_notifier *notify;
 
 	mutex_lock(&nf_ct_ecache_mutex);
-	notify = rcu_dereference(nf_expect_event_cb);
-	if (notify != NULL) {
+	if (nf_expect_event_cb != NULL) {
 		ret = -EBUSY;
 		goto out_unlock;
 	}
@@ -134,11 +127,8 @@ EXPORT_SYMBOL_GPL(nf_ct_expect_register_notifier);
 
 void nf_ct_expect_unregister_notifier(struct nf_exp_event_notifier *new)
 {
-	struct nf_exp_event_notifier *notify;
-
 	mutex_lock(&nf_ct_ecache_mutex);
-	notify = rcu_dereference(nf_expect_event_cb);
-	BUG_ON(notify != new);
+	BUG_ON(nf_expect_event_cb != new);
 	rcu_assign_pointer(nf_expect_event_cb, NULL);
 	mutex_unlock(&nf_ct_ecache_mutex);
 }
diff --git a/net/netfilter/nf_log.c b/net/netfilter/nf_log.c
index 015725a..908f599 100644
--- a/net/netfilter/nf_log.c
+++ b/net/netfilter/nf_log.c
@@ -35,7 +35,6 @@ static struct nf_logger *__find_logger(int pf, const char *str_logger)
 /* return EEXIST if the same logger is registred, 0 on success. */
 int nf_log_register(u_int8_t pf, struct nf_logger *logger)
 {
-	const struct nf_logger *llog;
 	int i;
 
 	if (pf >= ARRAY_SIZE(nf_loggers))
@@ -52,8 +51,7 @@ int nf_log_register(u_int8_t pf, struct nf_logger *logger)
 	} else {
 		/* register at end of list to honor first register win */
 		list_add_tail(&logger->list[pf], &nf_loggers_l[pf]);
-		llog = rcu_dereference(nf_loggers[pf]);
-		if (llog == NULL)
+		if (nf_loggers[pf] == NULL)
 			rcu_assign_pointer(nf_loggers[pf], logger);
 	}
 
@@ -65,13 +63,11 @@ EXPORT_SYMBOL(nf_log_register);
 
 void nf_log_unregister(struct nf_logger *logger)
 {
-	const struct nf_logger *c_logger;
 	int i;
 
 	mutex_lock(&nf_log_mutex);
 	for (i = 0; i < ARRAY_SIZE(nf_loggers); i++) {
-		c_logger = rcu_dereference(nf_loggers[i]);
-		if (c_logger == logger)
+		if (nf_loggers[i] == logger)
 			rcu_assign_pointer(nf_loggers[i], NULL);
 		list_del(&logger->list[i]);
 	}
-- 
1.7.0.4


^ permalink raw reply related

* Re: [PATCH] net/wireless/libertas: do not call wiphy_unregister() w/o wiphy_register()
From: Holger Schurig @ 2010-04-09 13:51 UTC (permalink / raw)
  To: John W. Linville
  Cc: Dan Williams, Daniel Mack, libertas-dev, netdev, linux-wireless,
	linux-kernel
In-Reply-To: <20100408190358.GB2999@tuxdriver.com>

> Ping?

Pong.

I'm a bit swamped with other stuff, so I can't do that right now. That's the 
pity of being someone who can only work on kernel projects now and then.

^ permalink raw reply

* Re: [PATCH v3] rfs: Receive Flow Steering
From: Tom Herbert @ 2010-04-09 13:50 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: davem, netdev
In-Reply-To: <1270797457.2623.19.camel@edumazet-laptop>

>>   * @mc_ttl - Multicasting TTL
>>   * @is_icsk - is this an inet_connection_sock?
>> @@ -124,6 +126,9 @@ struct inet_sock {
>>       __u16                   cmsg_flags;
>>       __be16                  inet_sport;
>>       __u16                   inet_id;
>> +#ifdef CONFIG_RPS
>> +     __u32                   rxhash;
>> +#endif
>
> I am a bit worried, because dirtying this cache line might hurt non RPS
> setups (if network interrupts are balanced to all cpus)
>
The rxhash should only be written when it changes.  So as long as the
device or lower stack provides a consistent rxhash for a connection,
this should be okay.
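
The "only written when it changes" idea can be sketched in userspace C as a
compare-before-store. This is an illustration only, not code from the patch:
struct flow_sock and record_rxhash() are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the socket field discussed above. */
struct flow_sock {
	uint32_t rxhash;
};

/*
 * Store the hash only if it differs from the cached value, so a flow
 * with a consistent hash takes the read-only path and does not dirty
 * the cache line. Returns true if a write actually happened.
 */
static bool record_rxhash(struct flow_sock *sk, uint32_t hash)
{
	assert(sk != NULL);
	if (sk->rxhash == hash)
		return false;	/* unchanged: no store issued */
	sk->rxhash = hash;
	return true;
}
```

With a stable per-connection hash, only the first packet of a flow (or a
packet after the hash changes) triggers the store.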

> Best place would be to put rxhash close to sk_refcnt (because we dirty
> it to get a reference on rcu sk lookups)
>
In sock_common?... I don't know if we need this in every socket yet.

> I believe we have a 32bits hole on 64bit arches for this :)
>
>
> While testing the latest net-next-2.6 on my Nehalem machine, I got a crash
> (in RPS I am afraid...)
>
> I am going to correct this crash before testing RFS and let you know the
> results.

Thanks for doing that.

>
> Thanks
>
>

^ permalink raw reply

* Re: [Patch 1/3] sysctl: refactor integer handling proc code
From: Octavian Purdila @ 2010-04-09 13:40 UTC (permalink / raw)
  To: Changli Gao
  Cc: Amerigo Wang, linux-kernel, Eric Dumazet, netdev, Neil Horman,
	David Miller, ebiederm
In-Reply-To: <n2j412e6f7f1004090349q37cb01f0u177aaa4fc16664ec@mail.gmail.com>


Hi and thanks for reviewing.

On Friday 09 April 2010 13:49:12 you wrote:
> > + *
> > + * In case of success 0 is returned and buf and size are updated with
> > + * the amount of bytes read. If tr is non NULL and a trailing
> > + * character exist (size is non zero after returning from this
> > + * function) tr is updated with the trailing character.
> > + */
> > +static int proc_get_ulong(char __user **buf, size_t *size,
> > +                         unsigned long *val, bool *neg,
> > +                         const char *perm_tr, unsigned perm_tr_len,
> > +                         char *tr)
> > +{
> > +       int len;
> > +       char *p, tmp[TMPBUFLEN];
> > +
> > +       if (!*size)
> > +               return -EINVAL;
> > +
> > +       len = *size;
> > +       if (len > TMPBUFLEN-1)
> > +               len = TMPBUFLEN-1;
> > +
> > +       if (copy_from_user(tmp, *buf, len))
> > +               return -EFAULT;
> > +
> > +       tmp[len] = 0;
> > +       p = tmp;
> > +       if (*p == '-' && *size > 1) {
> > +               *neg = 1;
> > +               p++;
> > +       } else
> > +               *neg = 0;
> 
> the function name implies that it is used to parse unsigned long, so
> negative value should not be supported.
> 

My intention was to signal that the argument is an unsigned long and that the
sign comes separately in neg, but I am OK with changing the function name to
proc_get_long() if you think that is better.

> > +       if (!isdigit(*p))
> > +               return -EINVAL;
> 
> It seems that leading whitespace should be allowed, so this check
> isn't needed, and simple_strtoul can handle it.
> 

Leading whitespace is skipped with proc_skip_space() before this function is
called. AFAICS simple_strtoul() does not handle whitespace.
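
A minimal userspace sketch of that split, assuming the caller skips blanks
first and the parser itself insists on a digit. skip_space() and get_long()
are hypothetical names, not the patch's functions. Note that the C library's
strtoul() does skip whitespace, so the isdigit() guard is what enforces the
strict behaviour here.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical helper: advance past leading blanks (the caller's job). */
static const char *skip_space(const char *s)
{
	while (isspace((unsigned char)*s))
		s++;
	return s;
}

/*
 * Parse an optional '-' followed by digits. The magnitude goes to *val
 * and the sign to *neg. Returns 0 on success, or -EINVAL if the first
 * character after the sign is not a digit (so leading blanks are
 * rejected here). *end is set to the first unparsed character.
 */
static int get_long(const char *s, unsigned long *val, bool *neg,
		    const char **end)
{
	char *stop;

	assert(s && val && neg && end);
	*neg = (*s == '-');
	if (*neg)
		s++;
	if (!isdigit((unsigned char)*s))
		return -EINVAL;
	*val = strtoul(s, &stop, 0);
	*end = stop;
	return 0;
}
```

Calling get_long(skip_space(input), ...) accepts padded input, while calling
get_long() directly on padded input fails with -EINVAL.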

> > +
> > +       *val = simple_strtoul(p, &p, 0);
> > +
> > +       len = p - tmp;
> > +
> > +       /* We don't know if the next char is whitespace thus we may accept
> > +        * invalid integers (e.g. 1234...a) or two integers instead of one
> > +        * (e.g. 123...1). So lets not allow such large numbers. */
> > +       if (len == TMPBUFLEN - 1)
> > +               return -EINVAL;
> > +
> > +       if (len < *size && perm_tr_len && !isanyof(*p, perm_tr, perm_tr_len))
> > +               return -EINVAL;
> 
> is strspn() better?
> 

I don't think it will work out: \0 is an accepted trailer for many of the
callers of this function.


> > +
> > +       if (tr && (len < *size))
> > +               *tr = *p;
> > +
> > +       *buf += len;
> > +       *size -= len;
> > +
> > +       return 0;
> > +}
> > +
> > +/**
> > + * proc_put_ulong - coverts an integer to a decimal ASCII formated string
> > + *
> > + * @buf - the user buffer
> > + * @size - the size of the user buffer
> > + * @val - the integer to be converted
> > + * @neg - sign of the number, %TRUE for negative
> > + * @first - if %FALSE will insert a separator character before the number
> > + * @separator - the separator character
> > + *
> > + * In case of success 0 is returned and buf and size are updated with
> > + * the amount of bytes read.
> > + */
> > +static int proc_put_ulong(char __user **buf, size_t *size, unsigned long val,
> > +                         bool neg, bool first, char separator)
> > +{
> > +       int len;
> > +       char tmp[TMPBUFLEN], *p = tmp;
> > +
> > +       if (!first)
> > +               *p++ = separator;
> > +       sprintf(p, "%s%lu", neg ? "-" : "", val);
> 
> negative should not be supported too.
> 

We need negatives in proc_dointvec. Again, we can change the function name if
that will clear things up.
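
The sign-plus-magnitude output side can be sketched in userspace C as follows.
This only mirrors the sprintf() format quoted above; put_long() is a
hypothetical name, and it writes to a plain buffer rather than to user memory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Format val as decimal, prefixed with '-' when neg is set and with the
 * separator when this is not the first value on the line. Returns the
 * number of characters written, or -1 if the buffer is too small.
 */
static int put_long(char *buf, size_t size, unsigned long val,
		    bool neg, bool first, char separator)
{
	int len;

	assert(buf != NULL && size > 0);
	if (first)
		len = snprintf(buf, size, "%s%lu", neg ? "-" : "", val);
	else
		len = snprintf(buf, size, "%c%s%lu", separator,
			       neg ? "-" : "", val);
	if (len < 0 || (size_t)len >= size)
		return -1;	/* output would have been truncated */
	return len;
}
```

Keeping the magnitude unsigned lets one helper serve both the int and the
unsigned long handlers, with the caller deciding whether neg may be set.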

<snip>
> >                int val = *valp;
> >                unsigned long lval;
> >                if (val < 0) {
> > -                       *negp = -1;
> > +                       *negp = 1;
> >                        lval = (unsigned long)-val;
> >                } else {
> >                        *negp = 0;
> 
> These functions have so much lines of code. I think you can make them
> less. Please refer to strsep().
> 

Hmm, the input is pretty permissive, and maybe this is why it looks so fat: we
need to account for quite a few cases.

Or maybe I spent too much time on this code already and I can't see the simple 
solution :)

Thanks,
tavi

^ permalink raw reply

* Re: Strange packet drops with heavy firewalling
From: Eric Dumazet @ 2010-04-09 13:29 UTC (permalink / raw)
  To: Benny Amorsen; +Cc: netdev
In-Reply-To: <m3vdc0ztyc.fsf@ursa.amorsen.dk>

Le vendredi 09 avril 2010 à 14:33 +0200, Benny Amorsen a écrit :

> Thank you very much for the help! I will report back whether it was the
> hash buckets.

OK

You could try :

ethtool -C eth0 tx-usecs 200 tx-frames 100 tx-frames-irq 100
ethtool -C eth1 tx-usecs 200 tx-frames 100 tx-frames-irq 100

(to reduce tx completion irqs)


Before buying multiqueue devices, you could also try a net-next-2.6 kernel,
because RPS (Receive Packet Steering) is in.

In your setup, this might help a bit by distributing the packets to all CPUs,
with appropriate cache handling.




^ permalink raw reply

* Re: [Patch 3/3] net: reserve ports for applications using fixed port numbers
From: Tetsuo Handa @ 2010-04-09 13:21 UTC (permalink / raw)
  To: amwang, linux-kernel
  Cc: opurdila, eric.dumazet, netdev, nhorman, davem, ebiederm
In-Reply-To: <20100409101513.5051.97926.sendpatchset@localhost.localdomain>

Hello.

Amerigo Wang wrote:
> Index: linux-2.6/drivers/infiniband/core/cma.c
> ===================================================================
> --- linux-2.6.orig/drivers/infiniband/core/cma.c
> +++ linux-2.6/drivers/infiniband/core/cma.c
> @@ -1980,6 +1980,8 @@ retry:
>  	/* FIXME: add proper port randomization per like inet_csk_get_port */
>  	do {
>  		ret = idr_get_new_above(ps, bind_list, next_port, &port);
> +		if (inet_is_reserved_local_port(port))
> +			ret = -EAGAIN;

You should not overwrite ret with -EAGAIN when idr_get_new_above() returned
-ENOSPC. I don't know about idr, thus I don't know whether

		if (!ret && inet_is_reserved_local_port(port))
			ret = -EAGAIN;

is correct or not.

>  	} while ((ret == -EAGAIN) && idr_pre_get(ps, GFP_KERNEL));
>  
>  	if (ret)
> @@ -2996,10 +2998,13 @@ static int __init cma_init(void)
>  {
>  	int ret, low, high, remaining;
>  
> -	get_random_bytes(&next_port, sizeof next_port);
>  	inet_get_local_port_range(&low, &high);
> +again:
> +	get_random_bytes(&next_port, sizeof next_port);
>  	remaining = (high - low) + 1;
>  	next_port = ((unsigned int) next_port % remaining) + low;
> +	if (inet_is_reserved_local_port(next_port))
> +		goto again;
>  

You should not unconditionally "goto again;".
If all ports were reserved, it will loop forever (CPU stalls).
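
One way to bound the retry, sketched in userspace C under stated assumptions:
reserved_fn stands in for inet_is_reserved_local_port(), and
pick_start_port(), all_reserved() and even_reserved() are hypothetical names
used only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical predicate type standing in for the reserved-port check. */
typedef bool (*reserved_fn)(int port);

/* Example predicates used only to exercise the sketch. */
static bool all_reserved(int port)
{
	(void)port;
	return true;
}

static bool even_reserved(int port)
{
	return (port % 2) == 0;
}

/*
 * Pick a random non-reserved port in [low, high]. At most max_tries
 * random draws are made, followed by one linear scan of the range, so
 * the function always terminates. Returns -1 if every port is reserved.
 */
static int pick_start_port(int low, int high, reserved_fn is_reserved,
			   int max_tries)
{
	int remaining = high - low + 1;
	int i, port;

	assert(low <= high && is_reserved != NULL);
	for (i = 0; i < max_tries; i++) {
		port = low + rand() % remaining;
		if (!is_reserved(port))
			return port;
	}
	for (port = low; port <= high; port++)
		if (!is_reserved(port))
			return port;
	return -1;
}
```

The caller then has to handle the -1 case explicitly instead of spinning.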

>  	cma_wq = create_singlethread_workqueue("rdma_cm");
>  	if (!cma_wq)


> Index: linux-2.6/net/sctp/socket.c
> ===================================================================
> --- linux-2.6.orig/net/sctp/socket.c
> +++ linux-2.6/net/sctp/socket.c
> @@ -5436,6 +5436,8 @@ static long sctp_get_port_local(struct s
>  			rover++;
>  			if ((rover < low) || (rover > high))
>  				rover = low;
> +			if (inet_is_reserved_local_port(rover))
> +				continue;

This one needs to be

			if (inet_is_reserved_local_port(rover))
				goto next_nolock;

>  			index = sctp_phashfn(rover);
>  			head = &sctp_port_hashtable[index];
>  			sctp_spin_lock(&head->lock);

 next:
			sctp_spin_unlock(&head->lock);
+next_nolock:
		} while (--remaining > 0);

otherwise, it will loop forever if all ports were reserved.

^ permalink raw reply

* Re: [Patch 2/3] sysctl: add proc_do_large_bitmap
From: Octavian Purdila @ 2010-04-09 12:35 UTC (permalink / raw)
  To: Changli Gao
  Cc: Amerigo Wang, linux-kernel, ebiederm, Eric Dumazet, netdev,
	Neil Horman, David Miller
In-Reply-To: <s2w412e6f7f1004090333g3b23eb94udb1e6cc3939a07e5@mail.gmail.com>

On Friday 09 April 2010 13:33:29 you wrote:
> On Fri, Apr 9, 2010 at 6:11 PM, Amerigo Wang <amwang@redhat.com> wrote:
> > From: Octavian Purdila <opurdila@ixiacom.com>
> >
> > The new function can be used to read/write large bitmaps via /proc. A
> > comma separated range format is used for compact output and input
> > (e.g. 1,3-4,10-10).
> >
> > Writing into the file will first reset the bitmap then update it
> > based on the given input.
> 
> We have bitmap_scnprintf() and bitmap_parse_user(), why invent a new suite?
> 

Decimal comma-separated ranges seem the best option for this feature, and
unfortunately both of the above functions only support hexadecimal and no
ranges.
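
For illustration, a userspace C sketch of emitting a bitmap in that decimal
range format. This is not the patch's implementation: test_bit_at() and
format_ranges() are hypothetical names, the bitmap is a plain byte array, and
single bits are printed as "10" rather than "10-10" (an assumption about the
output side).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Test one bit in a byte-array bitmap (bit 0 is the LSB of map[0]). */
static bool test_bit_at(const unsigned char *map, size_t bit)
{
	return (map[bit / 8] >> (bit % 8)) & 1;
}

/*
 * Write the set bits of map[0..nbits) to out as comma-separated decimal
 * ranges, e.g. "1,3-4,10". Returns the length written, or -1 if out is
 * too small.
 */
static int format_ranges(const unsigned char *map, size_t nbits,
			 char *out, size_t outlen)
{
	size_t pos = 0, bit = 0;
	bool first = true;

	assert(map != NULL && out != NULL && outlen > 0);
	out[0] = '\0';
	while (bit < nbits) {
		size_t start, end;
		int n;

		if (!test_bit_at(map, bit)) {
			bit++;
			continue;
		}
		/* Extend the run while the next bit is also set. */
		start = bit;
		while (bit + 1 < nbits && test_bit_at(map, bit + 1))
			bit++;
		end = bit++;

		if (start == end)
			n = snprintf(out + pos, outlen - pos, "%s%zu",
				     first ? "" : ",", start);
		else
			n = snprintf(out + pos, outlen - pos, "%s%zu-%zu",
				     first ? "" : ",", start, end);
		if (n < 0 || (size_t)n >= outlen - pos)
			return -1;
		pos += (size_t)n;
		first = false;
	}
	return (int)pos;
}
```

A sparse bitmap over a large range such as the port space stays short in this
form, whereas a hexadecimal dump grows with the bitmap size.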

^ permalink raw reply

