Netdev List
 help / color / mirror / Atom feed
* Re: [PATCH] genetlink: use genl_register_family_with_ops()
From: David Miller @ 2010-07-27  3:54 UTC (permalink / raw)
  To: xiaosuo; +Cc: linux-kernel, netdev
In-Reply-To: <1280136065-17566-1-git-send-email-xiaosuo@gmail.com>

From: Changli Gao <xiaosuo@gmail.com>
Date: Mon, 26 Jul 2010 17:21:05 +0800

> Signed-off-by: Changli Gao <xiaosuo@gmail.com>

Applied, thanks.

^ permalink raw reply

* Re: [patch] caif: handle snprintf() return
From: David Miller @ 2010-07-27  4:05 UTC (permalink / raw)
  To: error27; +Cc: sjur.brandeland, netdev, kernel-janitors
In-Reply-To: <20100726072358.GK26313@bicker>

From: Dan Carpenter <error27@gmail.com>
Date: Mon, 26 Jul 2010 09:23:59 +0200

> snprintf() returns the number of bytes that would have been written.  It
> can be larger than the size of the buffer.  The current code won't
> overflow, but people cut and paste this stuff so lets do it right and
> also make the static checkers happy.
> 
> Signed-off-by: Dan Carpenter <error27@gmail.com>

Applied.

^ permalink raw reply

* Re: [PATCH v2] ks8842: Support 100Mbps when accessed via timberdale
From: David Miller @ 2010-07-27  4:05 UTC (permalink / raw)
  To: richard.rojfors; +Cc: netdev
In-Reply-To: <1280134265.3498.4.camel@debian>

From: Richard Röjfors <richard.rojfors@pelagicore.com>
Date: Mon, 26 Jul 2010 10:51:05 +0200

> This patch removes the code which disables 100Mbps advertising when
> the ks8842 is accessed via timberdale.
> 
> At higher speed it's good to be nice to the internal state machine
> of timberdale by acking interrupts. That is done by a write to the
> interrupt ack register (IAR).
> 
> Signed-off-by: Richard Röjfors <richard.rojfors@pelagicore.com>

Applied.

^ permalink raw reply

* Re: [PATCH] usbnet: use jiffies in schedule_timeout(), not msecs
From: David Miller @ 2010-07-27  4:05 UTC (permalink / raw)
  To: segooon
  Cc: kernel-janitors, dbrownell, gregkh, ben, joe, shemminger,
	jussi.kivilinna, netdev, linux-usb, linux-kernel
In-Reply-To: <1280132742-8126-1-git-send-email-segooon@gmail.com>

From: Kulikov Vasiliy <segooon@gmail.com>
Date: Mon, 26 Jul 2010 12:25:42 +0400

> usbnet_terminate_urbs() uses schedule_timeout() with argument 3 msecs.
> schedule_timeout() uses jiffies as argument, so convert msecs to jiffies
> with msecs_to_jiffies().
> 
> Signed-off-by: Kulikov Vasiliy <segooon@gmail.com>

Applied, thanks.

^ permalink raw reply

* Re: [PATCH repost] sched: export sched_set/getaffinity to modules
From: Michael S. Tsirkin @ 2010-07-27  4:55 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Sridhar Samudrala, Peter Zijlstra, Tejun Heo, Ingo Molnar, netdev,
	lkml, kvm@vger.kernel.org, Andrew Morton, Dmitri Vorobiev,
	Jiri Kosina, Thomas Gleixner, Andi Kleen
In-Reply-To: <20100726180834.GA26988@redhat.com>

On Mon, Jul 26, 2010 at 08:08:34PM +0200, Oleg Nesterov wrote:
> On 07/26, Sridhar Samudrala wrote:
> >
> > I have been testing out a similar patch that uses kernel_thread() without CLONE_FILES
> > flag rather than create_kthread() and then closing the files.
> 
> !CLONE_FILES can't help. copy_files() does dup_fd() in this case.
> The child still inherits the files.
> 
> > Either version should be fine.
> 
> I think neither version is fine ;)
> 
> exit_files() is not enough too. How about the signals, reparenting?
> 
> 
> I already forgot all details, probably I missed somethig. But it
> seems to me that it is better to just export get/set affinity and
> forget about all complications.
> 
> Oleg.


Peter, could you please indicate whether you think this is the way to
go, too?

-- 
MST

^ permalink raw reply

* [GIT] Networking
From: David Miller @ 2010-07-27  6:31 UTC (permalink / raw)
  To: torvalds; +Cc: akpm, netdev, linux-kernel


1) Add wimax device IDs for devices found in thinkpad laptops.  From
   Alexey Shvetsov.

2) Caching netfilter settings across dev_forward_skb() breaks
   namespaces and conntrack zones, we need to call nf_reset() here.
   From Ben Greear.

3) pskb_expand_head() can corrupt skb->csum with some devices.  Fix
   from Andrea Shepard.  skb_copy_expand() had the same bug, so fix it
   there too.

4) Fix crashes wrt. RLB mode in bonding, from Greg Edwards.

5) macvtap must limit packet queue backlog size else OOM, fix from
   Herbert Xu.

6) GSO errors in tun driver trigger a BUG(), which makes it impossible
   to get useful data and debug this problem.  Just warn, print a
   useful message, and drop the packet.  From Michael S. Tsirkin.

7) The mirred packet scheduler action holds references to network
   devices.  However, it doesn't have a netdevice notifier so those
   entries can drop their refs on unregister.  This makes network	
   devices unremovable and leads to hangs.  Fix from Stephen Hemminger.

8) ath9k driver uses wrong DMA direction values in map/unmap calls
   during ath_rx_tasklet.  Fix from Ming Lei.

9) ieee80211_send_layer2_update does receive from process context,
   therefore it must use netif_rx_ni().  From John Linville.

10) When ipv6 is disabled on an interface, we should never add ipv6 routes
    to it.  From Brian Haley.

11) Trap invalid virtual function settings in ixgbe/igb driver, from
    Andy Gospodarek.

12) bnx2x driver needs more mutual exclusion to prevent corruption of
    the statistics handling and state.  From Vladislav Zolotarov.

Please pull, thanks a lot!

The following changes since commit 1a041a23da7c77b53c71fe11b4f940388bee37b1:

  Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip (2010-07-26 16:02:07 -0700)

are available in the git repository at:

  master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6.git master

Alexey Shvetsov (1):
      wimax/i2400m: Add PID & VID for Intel WiMAX 6250

Andrea Shepard (1):
      net: Fix corruption of skb csum field in pskb_expand_head() of net/core/skbuff.c

Andy Gospodarek (1):
      ixgbe/igb: catch invalid VF settings

Ben Greear (1):
      net: dev_forward_skb should call nf_reset

Breno Leitao (1):
      s2io: fixing DBG_PRINT() macro

Brian Haley (1):
      ipv6: Don't add routes to ipv6 disabled interfaces.

David S. Miller (3):
      net: Fix skb_copy_expand() handling of ->csum_start
      Merge branch 'wimax-2.6.35.y' of git://git.kernel.org/.../inaky/wimax
      Merge branch 'master' of git://git.kernel.org/.../linville/wireless-2.6

Greg Edwards (1):
      bonding: set device in RLB ARP packet handler

Herbert Xu (1):
      macvtap: Limit packet queue length

John W. Linville (1):
      wireless: use netif_rx_ni in ieee80211_send_layer2_update

Michael S. Tsirkin (1):
      tun: avoid BUG, dump packet on GSO errors

Ming Lei (1):
      ath9k: fix dma direction for map/unmap in ath_rx_tasklet

Vladislav Zolotarov (3):
      bnx2x: Protect a SM state change
      bnx2x: Protect statistics ramrod and sequence number
      bnx2x: Advance a module version

stephen hemminger (1):
      net sched: fix race in mirred device removal

 drivers/net/bnx2x.h                   |    4 +++
 drivers/net/bnx2x_main.c              |   42 ++++++++++++++++++++-----------
 drivers/net/bonding/bond_alb.c        |    2 +-
 drivers/net/igb/igb_main.c            |    9 +++++++
 drivers/net/ixgbe/ixgbe_main.c        |    9 +++++++
 drivers/net/macvlan.c                 |   10 ++++++-
 drivers/net/macvtap.c                 |   18 ++++++++++++-
 drivers/net/s2io.h                    |    2 +-
 drivers/net/tun.c                     |   14 +++++++++-
 drivers/net/wimax/i2400m/i2400m-usb.h |    1 +
 drivers/net/wimax/i2400m/usb.c        |    2 +
 drivers/net/wireless/ath/ath9k/recv.c |    4 +-
 include/linux/if_macvlan.h            |    2 +
 include/net/tc_act/tc_mirred.h        |    1 +
 net/core/dev.c                        |    1 +
 net/core/skbuff.c                     |    7 ++++-
 net/ipv6/addrconf.c                   |   14 +++++++----
 net/mac80211/cfg.c                    |    2 +-
 net/sched/act_mirred.c                |   43 ++++++++++++++++++++++++++++++--
 19 files changed, 151 insertions(+), 36 deletions(-)

^ permalink raw reply

* [net-next-2.6 PATCH] e1000: use netif_<level> instead of netdev_<level>
From: Jeff Kirsher @ 2010-07-27  6:33 UTC (permalink / raw)
  To: davem; +Cc: netdev, gospo, bphilips, Joe Perches, Emil Tantilov, Jeff Kirsher

From: Emil Tantilov <emil.s.tantilov@intel.com>

This patch restores the ability to set msglvl through ethtool.
The issue was introduced by:
commit 675ad47375c76a7c3be4ace9554d92cd55518ced

CC: Joe Perches <joe@perches.com>

Reported-by: Joe Perches <joe@perches.com>
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---

 drivers/net/e1000/e1000.h         |   18 ++++----
 drivers/net/e1000/e1000_ethtool.c |   27 ++++++------
 drivers/net/e1000/e1000_main.c    |   86 ++++++++++++++++++++-----------------
 3 files changed, 70 insertions(+), 61 deletions(-)

diff --git a/drivers/net/e1000/e1000.h b/drivers/net/e1000/e1000.h
index 65298a6..99288b9 100644
--- a/drivers/net/e1000/e1000.h
+++ b/drivers/net/e1000/e1000.h
@@ -324,18 +324,20 @@ enum e1000_state_t {
 extern struct net_device *e1000_get_hw_dev(struct e1000_hw *hw);
 #define e_dbg(format, arg...) \
 	netdev_dbg(e1000_get_hw_dev(hw), format, ## arg)
-#define e_err(format, arg...) \
-	netdev_err(adapter->netdev, format, ## arg)
-#define e_info(format, arg...) \
-	netdev_info(adapter->netdev, format, ## arg)
-#define e_warn(format, arg...) \
-	netdev_warn(adapter->netdev, format, ## arg)
-#define e_notice(format, arg...) \
-	netdev_notice(adapter->netdev, format, ## arg)
+#define e_err(msglvl, format, arg...) \
+	netif_err(adapter, msglvl, adapter->netdev, format, ## arg)
+#define e_info(msglvl, format, arg...) \
+	netif_info(adapter, msglvl, adapter->netdev, format, ## arg)
+#define e_warn(msglvl, format, arg...) \
+	netif_warn(adapter, msglvl, adapter->netdev, format, ## arg)
+#define e_notice(msglvl, format, arg...) \
+	netif_notice(adapter, msglvl, adapter->netdev, format, ## arg)
 #define e_dev_info(format, arg...) \
 	dev_info(&adapter->pdev->dev, format, ## arg)
 #define e_dev_warn(format, arg...) \
 	dev_warn(&adapter->pdev->dev, format, ## arg)
+#define e_dev_err(format, arg...) \
+	dev_err(&adapter->pdev->dev, format, ## arg)
 
 extern char e1000_driver_name[];
 extern const char e1000_driver_version[];
diff --git a/drivers/net/e1000/e1000_ethtool.c b/drivers/net/e1000/e1000_ethtool.c
index d5ff029..f4d0922 100644
--- a/drivers/net/e1000/e1000_ethtool.c
+++ b/drivers/net/e1000/e1000_ethtool.c
@@ -346,7 +346,7 @@ static int e1000_set_tso(struct net_device *netdev, u32 data)
 
 	netdev->features &= ~NETIF_F_TSO6;
 
-	e_info("TSO is %s\n", data ? "Enabled" : "Disabled");
+	e_info(probe, "TSO is %s\n", data ? "Enabled" : "Disabled");
 	adapter->tso_force = true;
 	return 0;
 }
@@ -714,9 +714,9 @@ static bool reg_pattern_test(struct e1000_adapter *adapter, u64 *data, int reg,
 		writel(write & test[i], address);
 		read = readl(address);
 		if (read != (write & test[i] & mask)) {
-			e_info("pattern test reg %04X failed: "
-			       "got 0x%08X expected 0x%08X\n",
-			       reg, read, (write & test[i] & mask));
+			e_err(drv, "pattern test reg %04X failed: "
+			      "got 0x%08X expected 0x%08X\n",
+			      reg, read, (write & test[i] & mask));
 			*data = reg;
 			return true;
 		}
@@ -734,7 +734,7 @@ static bool reg_set_and_check(struct e1000_adapter *adapter, u64 *data, int reg,
 	writel(write & mask, address);
 	read = readl(address);
 	if ((read & mask) != (write & mask)) {
-		e_err("set/check reg %04X test failed: "
+		e_err(drv, "set/check reg %04X test failed: "
 		      "got 0x%08X expected 0x%08X\n",
 		      reg, (read & mask), (write & mask));
 		*data = reg;
@@ -779,7 +779,7 @@ static int e1000_reg_test(struct e1000_adapter *adapter, u64 *data)
 	ew32(STATUS, toggle);
 	after = er32(STATUS) & toggle;
 	if (value != after) {
-		e_err("failed STATUS register test got: "
+		e_err(drv, "failed STATUS register test got: "
 		      "0x%08X expected: 0x%08X\n", after, value);
 		*data = 1;
 		return 1;
@@ -894,7 +894,8 @@ static int e1000_intr_test(struct e1000_adapter *adapter, u64 *data)
 		*data = 1;
 		return -1;
 	}
-	e_info("testing %s interrupt\n", (shared_int ? "shared" : "unshared"));
+	e_info(hw, "testing %s interrupt\n", (shared_int ?
+	       "shared" : "unshared"));
 
 	/* Disable all the interrupts */
 	ew32(IMC, 0xFFFFFFFF);
@@ -1561,7 +1562,7 @@ static void e1000_diag_test(struct net_device *netdev,
 		u8 forced_speed_duplex = hw->forced_speed_duplex;
 		u8 autoneg = hw->autoneg;
 
-		e_info("offline testing starting\n");
+		e_info(hw, "offline testing starting\n");
 
 		/* Link test performed before hardware reset so autoneg doesn't
 		 * interfere with test result */
@@ -1601,7 +1602,7 @@ static void e1000_diag_test(struct net_device *netdev,
 		if (if_running)
 			dev_open(netdev);
 	} else {
-		e_info("online testing starting\n");
+		e_info(hw, "online testing starting\n");
 		/* Online tests */
 		if (e1000_link_test(adapter, &data[4]))
 			eth_test->flags |= ETH_TEST_FL_FAILED;
@@ -1694,8 +1695,8 @@ static void e1000_get_wol(struct net_device *netdev,
 		wol->supported &= ~WAKE_UCAST;
 
 		if (adapter->wol & E1000_WUFC_EX)
-			e_err("Interface does not support "
-		        "directed (unicast) frame wake-up packets\n");
+			e_err(drv, "Interface does not support directed "
+			      "(unicast) frame wake-up packets\n");
 		break;
 	default:
 		break;
@@ -1726,8 +1727,8 @@ static int e1000_set_wol(struct net_device *netdev, struct ethtool_wolinfo *wol)
 	switch (hw->device_id) {
 	case E1000_DEV_ID_82546GB_QUAD_COPPER_KSP3:
 		if (wol->wolopts & WAKE_UCAST) {
-			e_err("Interface does not support "
-			      "directed (unicast) frame wake-up packets\n");
+			e_err(drv, "Interface does not support directed "
+			      "(unicast) frame wake-up packets\n");
 			return -EOPNOTSUPP;
 		}
 		break;
diff --git a/drivers/net/e1000/e1000_main.c b/drivers/net/e1000/e1000_main.c
index 68a8089..02833af 100644
--- a/drivers/net/e1000/e1000_main.c
+++ b/drivers/net/e1000/e1000_main.c
@@ -275,7 +275,7 @@ static int e1000_request_irq(struct e1000_adapter *adapter)
 	err = request_irq(adapter->pdev->irq, handler, irq_flags, netdev->name,
 	                  netdev);
 	if (err) {
-		e_err("Unable to allocate interrupt Error: %d\n", err);
+		e_err(probe, "Unable to allocate interrupt Error: %d\n", err);
 	}
 
 	return err;
@@ -657,7 +657,7 @@ void e1000_reset(struct e1000_adapter *adapter)
 		ew32(WUC, 0);
 
 	if (e1000_init_hw(hw))
-		e_err("Hardware Error\n");
+		e_dev_err("Hardware Error\n");
 	e1000_update_mng_vlan(adapter);
 
 	/* if (adapter->hwflags & HWFLAGS_PHY_PWR_BIT) { */
@@ -925,7 +925,7 @@ static int __devinit e1000_probe(struct pci_dev *pdev,
 
 	/* initialize eeprom parameters */
 	if (e1000_init_eeprom_params(hw)) {
-		e_err("EEPROM initialization failed\n");
+		e_err(probe, "EEPROM initialization failed\n");
 		goto err_eeprom;
 	}
 
@@ -936,7 +936,7 @@ static int __devinit e1000_probe(struct pci_dev *pdev,
 
 	/* make sure the EEPROM is good */
 	if (e1000_validate_eeprom_checksum(hw) < 0) {
-		e_err("The EEPROM Checksum Is Not Valid\n");
+		e_err(probe, "The EEPROM Checksum Is Not Valid\n");
 		e1000_dump_eeprom(adapter);
 		/*
 		 * set MAC address to all zeroes to invalidate and temporary
@@ -950,14 +950,14 @@ static int __devinit e1000_probe(struct pci_dev *pdev,
 	} else {
 		/* copy the MAC address out of the EEPROM */
 		if (e1000_read_mac_addr(hw))
-			e_err("EEPROM Read Error\n");
+			e_err(probe, "EEPROM Read Error\n");
 	}
 	/* don't block initalization here due to bad MAC address */
 	memcpy(netdev->dev_addr, hw->mac_addr, netdev->addr_len);
 	memcpy(netdev->perm_addr, hw->mac_addr, netdev->addr_len);
 
 	if (!is_valid_ether_addr(netdev->perm_addr))
-		e_err("Invalid MAC Address\n");
+		e_err(probe, "Invalid MAC Address\n");
 
 	e1000_get_bus_info(hw);
 
@@ -1047,7 +1047,7 @@ static int __devinit e1000_probe(struct pci_dev *pdev,
 		goto err_register;
 
 	/* print bus type/speed/width info */
-	e_info("(PCI%s:%dMHz:%d-bit) %pM\n",
+	e_info(probe, "(PCI%s:%dMHz:%d-bit) %pM\n",
 	       ((hw->bus_type == e1000_bus_type_pcix) ? "-X" : ""),
 	       ((hw->bus_speed == e1000_bus_speed_133) ? 133 :
 		(hw->bus_speed == e1000_bus_speed_120) ? 120 :
@@ -1059,7 +1059,7 @@ static int __devinit e1000_probe(struct pci_dev *pdev,
 	/* carrier off reporting is important to ethtool even BEFORE open */
 	netif_carrier_off(netdev);
 
-	e_info("Intel(R) PRO/1000 Network Connection\n");
+	e_info(probe, "Intel(R) PRO/1000 Network Connection\n");
 
 	cards_found++;
 	return 0;
@@ -1159,7 +1159,7 @@ static int __devinit e1000_sw_init(struct e1000_adapter *adapter)
 	/* identify the MAC */
 
 	if (e1000_set_mac_type(hw)) {
-		e_err("Unknown MAC Type\n");
+		e_err(probe, "Unknown MAC Type\n");
 		return -EIO;
 	}
 
@@ -1192,7 +1192,7 @@ static int __devinit e1000_sw_init(struct e1000_adapter *adapter)
 	adapter->num_rx_queues = 1;
 
 	if (e1000_alloc_queues(adapter)) {
-		e_err("Unable to allocate memory for queues\n");
+		e_err(probe, "Unable to allocate memory for queues\n");
 		return -ENOMEM;
 	}
 
@@ -1386,7 +1386,8 @@ static int e1000_setup_tx_resources(struct e1000_adapter *adapter,
 	size = sizeof(struct e1000_buffer) * txdr->count;
 	txdr->buffer_info = vmalloc(size);
 	if (!txdr->buffer_info) {
-		e_err("Unable to allocate memory for the Tx descriptor ring\n");
+		e_err(probe, "Unable to allocate memory for the Tx descriptor "
+		      "ring\n");
 		return -ENOMEM;
 	}
 	memset(txdr->buffer_info, 0, size);
@@ -1401,7 +1402,8 @@ static int e1000_setup_tx_resources(struct e1000_adapter *adapter,
 	if (!txdr->desc) {
 setup_tx_desc_die:
 		vfree(txdr->buffer_info);
-		e_err("Unable to allocate memory for the Tx descriptor ring\n");
+		e_err(probe, "Unable to allocate memory for the Tx descriptor "
+		      "ring\n");
 		return -ENOMEM;
 	}
 
@@ -1409,7 +1411,7 @@ setup_tx_desc_die:
 	if (!e1000_check_64k_bound(adapter, txdr->desc, txdr->size)) {
 		void *olddesc = txdr->desc;
 		dma_addr_t olddma = txdr->dma;
-		e_err("txdr align check failed: %u bytes at %p\n",
+		e_err(tx_err, "txdr align check failed: %u bytes at %p\n",
 		      txdr->size, txdr->desc);
 		/* Try again, without freeing the previous */
 		txdr->desc = dma_alloc_coherent(&pdev->dev, txdr->size,
@@ -1427,7 +1429,7 @@ setup_tx_desc_die:
 					  txdr->dma);
 			dma_free_coherent(&pdev->dev, txdr->size, olddesc,
 					  olddma);
-			e_err("Unable to allocate aligned memory "
+			e_err(probe, "Unable to allocate aligned memory "
 			      "for the transmit descriptor ring\n");
 			vfree(txdr->buffer_info);
 			return -ENOMEM;
@@ -1460,7 +1462,7 @@ int e1000_setup_all_tx_resources(struct e1000_adapter *adapter)
 	for (i = 0; i < adapter->num_tx_queues; i++) {
 		err = e1000_setup_tx_resources(adapter, &adapter->tx_ring[i]);
 		if (err) {
-			e_err("Allocation for Tx Queue %u failed\n", i);
+			e_err(probe, "Allocation for Tx Queue %u failed\n", i);
 			for (i-- ; i >= 0; i--)
 				e1000_free_tx_resources(adapter,
 							&adapter->tx_ring[i]);
@@ -1580,7 +1582,8 @@ static int e1000_setup_rx_resources(struct e1000_adapter *adapter,
 	size = sizeof(struct e1000_buffer) * rxdr->count;
 	rxdr->buffer_info = vmalloc(size);
 	if (!rxdr->buffer_info) {
-		e_err("Unable to allocate memory for the Rx descriptor ring\n");
+		e_err(probe, "Unable to allocate memory for the Rx descriptor "
+		      "ring\n");
 		return -ENOMEM;
 	}
 	memset(rxdr->buffer_info, 0, size);
@@ -1596,7 +1599,8 @@ static int e1000_setup_rx_resources(struct e1000_adapter *adapter,
 					GFP_KERNEL);
 
 	if (!rxdr->desc) {
-		e_err("Unable to allocate memory for the Rx descriptor ring\n");
+		e_err(probe, "Unable to allocate memory for the Rx descriptor "
+		      "ring\n");
 setup_rx_desc_die:
 		vfree(rxdr->buffer_info);
 		return -ENOMEM;
@@ -1606,7 +1610,7 @@ setup_rx_desc_die:
 	if (!e1000_check_64k_bound(adapter, rxdr->desc, rxdr->size)) {
 		void *olddesc = rxdr->desc;
 		dma_addr_t olddma = rxdr->dma;
-		e_err("rxdr align check failed: %u bytes at %p\n",
+		e_err(rx_err, "rxdr align check failed: %u bytes at %p\n",
 		      rxdr->size, rxdr->desc);
 		/* Try again, without freeing the previous */
 		rxdr->desc = dma_alloc_coherent(&pdev->dev, rxdr->size,
@@ -1615,8 +1619,8 @@ setup_rx_desc_die:
 		if (!rxdr->desc) {
 			dma_free_coherent(&pdev->dev, rxdr->size, olddesc,
 					  olddma);
-			e_err("Unable to allocate memory for the Rx descriptor "
-			      "ring\n");
+			e_err(probe, "Unable to allocate memory for the Rx "
+			      "descriptor ring\n");
 			goto setup_rx_desc_die;
 		}
 
@@ -1626,8 +1630,8 @@ setup_rx_desc_die:
 					  rxdr->dma);
 			dma_free_coherent(&pdev->dev, rxdr->size, olddesc,
 					  olddma);
-			e_err("Unable to allocate aligned memory for the Rx "
-			      "descriptor ring\n");
+			e_err(probe, "Unable to allocate aligned memory for "
+			      "the Rx descriptor ring\n");
 			goto setup_rx_desc_die;
 		} else {
 			/* Free old allocation, new allocation was successful */
@@ -1659,7 +1663,7 @@ int e1000_setup_all_rx_resources(struct e1000_adapter *adapter)
 	for (i = 0; i < adapter->num_rx_queues; i++) {
 		err = e1000_setup_rx_resources(adapter, &adapter->rx_ring[i]);
 		if (err) {
-			e_err("Allocation for Rx Queue %u failed\n", i);
+			e_err(probe, "Allocation for Rx Queue %u failed\n", i);
 			for (i-- ; i >= 0; i--)
 				e1000_free_rx_resources(adapter,
 							&adapter->rx_ring[i]);
@@ -2110,7 +2114,7 @@ static void e1000_set_rx_mode(struct net_device *netdev)
 	u32 *mcarray = kcalloc(mta_reg_count, sizeof(u32), GFP_ATOMIC);
 
 	if (!mcarray) {
-		e_err("memory allocation failed\n");
+		e_err(probe, "memory allocation failed\n");
 		return;
 	}
 
@@ -2648,7 +2652,8 @@ static bool e1000_tx_csum(struct e1000_adapter *adapter,
 		break;
 	default:
 		if (unlikely(net_ratelimit()))
-			e_warn("checksum_partial proto=%x!\n", skb->protocol);
+			e_warn(drv, "checksum_partial proto=%x!\n",
+			       skb->protocol);
 		break;
 	}
 
@@ -2992,7 +2997,8 @@ static netdev_tx_t e1000_xmit_frame(struct sk_buff *skb,
 				/* fall through */
 				pull_size = min((unsigned int)4, skb->data_len);
 				if (!__pskb_pull_tail(skb, pull_size)) {
-					e_err("__pskb_pull_tail failed.\n");
+					e_err(drv, "__pskb_pull_tail "
+					      "failed.\n");
 					dev_kfree_skb_any(skb);
 					return NETDEV_TX_OK;
 				}
@@ -3140,7 +3146,7 @@ static int e1000_change_mtu(struct net_device *netdev, int new_mtu)
 
 	if ((max_frame < MINIMUM_ETHERNET_FRAME_SIZE) ||
 	    (max_frame > MAX_JUMBO_FRAME_SIZE)) {
-		e_err("Invalid MTU setting\n");
+		e_err(probe, "Invalid MTU setting\n");
 		return -EINVAL;
 	}
 
@@ -3148,7 +3154,7 @@ static int e1000_change_mtu(struct net_device *netdev, int new_mtu)
 	switch (hw->mac_type) {
 	case e1000_undefined ... e1000_82542_rev2_1:
 		if (max_frame > (ETH_FRAME_LEN + ETH_FCS_LEN)) {
-			e_err("Jumbo Frames not supported.\n");
+			e_err(probe, "Jumbo Frames not supported.\n");
 			return -EINVAL;
 		}
 		break;
@@ -3500,7 +3506,7 @@ static bool e1000_clean_tx_irq(struct e1000_adapter *adapter,
 		    !(er32(STATUS) & E1000_STATUS_TXOFF)) {
 
 			/* detected Tx unit hang */
-			e_err("Detected Tx Unit Hang\n"
+			e_err(drv, "Detected Tx Unit Hang\n"
 			      "  Tx Queue             <%lu>\n"
 			      "  TDH                  <%x>\n"
 			      "  TDT                  <%x>\n"
@@ -3749,7 +3755,7 @@ static bool e1000_clean_jumbo_rx_irq(struct e1000_adapter *adapter,
 
 		/* eth type trans needs skb->data to point to something */
 		if (!pskb_may_pull(skb, ETH_HLEN)) {
-			e_err("pskb_may_pull failed.\n");
+			e_err(drv, "pskb_may_pull failed.\n");
 			dev_kfree_skb(skb);
 			goto next_desc;
 		}
@@ -3874,7 +3880,7 @@ static bool e1000_clean_rx_irq(struct e1000_adapter *adapter,
 
 		if (adapter->discarding) {
 			/* All receives must fit into a single buffer */
-			e_info("Receive packet consumed multiple buffers\n");
+			e_dbg("Receive packet consumed multiple buffers\n");
 			/* recycle */
 			buffer_info->skb = skb;
 			if (status & E1000_RXD_STAT_EOP)
@@ -3986,8 +3992,8 @@ e1000_alloc_jumbo_rx_buffers(struct e1000_adapter *adapter,
 		/* Fix for errata 23, can't cross 64kB boundary */
 		if (!e1000_check_64k_bound(adapter, skb->data, bufsz)) {
 			struct sk_buff *oldskb = skb;
-			e_err("skb align check failed: %u bytes at %p\n",
-			      bufsz, skb->data);
+			e_err(rx_err, "skb align check failed: %u bytes at "
+			      "%p\n", bufsz, skb->data);
 			/* Try again, without freeing the previous */
 			skb = netdev_alloc_skb_ip_align(netdev, bufsz);
 			/* Failed allocation, critical failure */
@@ -4095,8 +4101,8 @@ static void e1000_alloc_rx_buffers(struct e1000_adapter *adapter,
 		/* Fix for errata 23, can't cross 64kB boundary */
 		if (!e1000_check_64k_bound(adapter, skb->data, bufsz)) {
 			struct sk_buff *oldskb = skb;
-			e_err("skb align check failed: %u bytes at %p\n",
-			      bufsz, skb->data);
+			e_err(rx_err, "skb align check failed: %u bytes at "
+			      "%p\n", bufsz, skb->data);
 			/* Try again, without freeing the previous */
 			skb = netdev_alloc_skb_ip_align(netdev, bufsz);
 			/* Failed allocation, critical failure */
@@ -4141,8 +4147,8 @@ map_skb:
 		if (!e1000_check_64k_bound(adapter,
 					(void *)(unsigned long)buffer_info->dma,
 					adapter->rx_buffer_len)) {
-			e_err("dma align check failed: %u bytes at %p\n",
-			      adapter->rx_buffer_len,
+			e_err(rx_err, "dma align check failed: %u bytes at "
+			      "%p\n", adapter->rx_buffer_len,
 			      (void *)(unsigned long)buffer_info->dma);
 			dev_kfree_skb(skb);
 			buffer_info->skb = NULL;
@@ -4355,7 +4361,7 @@ void e1000_pci_set_mwi(struct e1000_hw *hw)
 	int ret_val = pci_set_mwi(adapter->pdev);
 
 	if (ret_val)
-		e_err("Error in setting MWI\n");
+		e_err(probe, "Error in setting MWI\n");
 }
 
 void e1000_pci_clear_mwi(struct e1000_hw *hw)
@@ -4486,7 +4492,7 @@ int e1000_set_spd_dplx(struct e1000_adapter *adapter, u16 spddplx)
 	/* Fiber NICs only allow 1000 gbps Full duplex */
 	if ((hw->media_type == e1000_media_type_fiber) &&
 		spddplx != (SPEED_1000 + DUPLEX_FULL)) {
-		e_err("Unsupported Speed/Duplex configuration\n");
+		e_err(probe, "Unsupported Speed/Duplex configuration\n");
 		return -EINVAL;
 	}
 
@@ -4509,7 +4515,7 @@ int e1000_set_spd_dplx(struct e1000_adapter *adapter, u16 spddplx)
 		break;
 	case SPEED_1000 + DUPLEX_HALF: /* not supported */
 	default:
-		e_err("Unsupported Speed/Duplex configuration\n");
+		e_err(probe, "Unsupported Speed/Duplex configuration\n");
 		return -EINVAL;
 	}
 	return 0;


^ permalink raw reply related

* Re: [net-next-2.6 PATCH] e1000: use netif_<level> instead of netdev_<level>
From: David Miller @ 2010-07-27  6:38 UTC (permalink / raw)
  To: jeffrey.t.kirsher; +Cc: netdev, gospo, bphilips, joe, emil.s.tantilov
In-Reply-To: <20100727063129.24362.20955.stgit@localhost.localdomain>

From: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Date: Mon, 26 Jul 2010 23:33:31 -0700

> From: Emil Tantilov <emil.s.tantilov@intel.com>
> 
> This patch restores the ability to set msglvl through ethtool.
> The issue was introduced by:
> commit 675ad47375c76a7c3be4ace9554d92cd55518ced
> 
> CC: Joe Perches <joe@perches.com>
> 
> Reported-by: Joe Perches <joe@perches.com>
> Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
> Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

Applied, thanks.

^ permalink raw reply

* [net-next-2.6 PATCH] ixgbe: priority tagging FCoE frames without FCoE offload
From: Jeff Kirsher @ 2010-07-27  6:41 UTC (permalink / raw)
  To: davem; +Cc: netdev, gospo, bphilips, Yi Zou, John Fastabend, Jeff Kirsher

From: John Fastabend <john.r.fastabend@intel.com>

The DCB user priority for FCoE is available regardless of whether
FCoE offload is enabled (IXGBE_FLAG_FCOE_ENABLED bit is set).
This allows proper DCB user priority tagging for FCoE
traffic on both 82598 and 82599 devices.

Signed-off-by: Yi Zou <yi.zou@intel.com>
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---

 drivers/net/ixgbe/ixgbe_main.c |   45 +++++++++++++++++++++-------------------
 1 files changed, 24 insertions(+), 21 deletions(-)

diff --git a/drivers/net/ixgbe/ixgbe_main.c b/drivers/net/ixgbe/ixgbe_main.c
index 9203759..bc22ab4 100644
--- a/drivers/net/ixgbe/ixgbe_main.c
+++ b/drivers/net/ixgbe/ixgbe_main.c
@@ -4783,6 +4783,7 @@ static int __devinit ixgbe_sw_init(struct ixgbe_adapter *adapter)
 #ifdef CONFIG_IXGBE_DCB
 		/* Default traffic class to use for FCoE */
 		adapter->fcoe.tc = IXGBE_FCOE_DEFTC;
+		adapter->fcoe.up = IXGBE_FCOE_DEFTC;
 #endif
 #endif /* IXGBE_FCOE */
 	}
@@ -6147,21 +6148,26 @@ static u16 ixgbe_select_queue(struct net_device *dev, struct sk_buff *skb)
 	struct ixgbe_adapter *adapter = netdev_priv(dev);
 	int txq = smp_processor_id();
 
+#ifdef IXGBE_FCOE
+	if ((skb->protocol == htons(ETH_P_FCOE)) ||
+	    (skb->protocol == htons(ETH_P_FIP))) {
+		if (adapter->flags & IXGBE_FLAG_FCOE_ENABLED) {
+			txq &= (adapter->ring_feature[RING_F_FCOE].indices - 1);
+			txq += adapter->ring_feature[RING_F_FCOE].mask;
+			return txq;
+		} else if (adapter->flags & IXGBE_FLAG_DCB_ENABLED) {
+			txq = adapter->fcoe.up;
+			return txq;
+		}
+	}
+#endif
+
 	if (adapter->flags & IXGBE_FLAG_FDIR_HASH_CAPABLE) {
 		while (unlikely(txq >= dev->real_num_tx_queues))
 			txq -= dev->real_num_tx_queues;
 		return txq;
 	}
 
-#ifdef IXGBE_FCOE
-	if ((adapter->flags & IXGBE_FLAG_FCOE_ENABLED) &&
-	    ((skb->protocol == htons(ETH_P_FCOE)) ||
-	     (skb->protocol == htons(ETH_P_FIP)))) {
-		txq &= (adapter->ring_feature[RING_F_FCOE].indices - 1);
-		txq += adapter->ring_feature[RING_F_FCOE].mask;
-		return txq;
-	}
-#endif
 	if (adapter->flags & IXGBE_FLAG_DCB_ENABLED) {
 		if (skb->priority == TC_PRIO_CONTROL)
 			txq = adapter->ring_feature[RING_F_DCB].indices-1;
@@ -6205,18 +6211,15 @@ static netdev_tx_t ixgbe_xmit_frame(struct sk_buff *skb,
 	tx_ring = adapter->tx_ring[skb->queue_mapping];
 
 #ifdef IXGBE_FCOE
-	if (adapter->flags & IXGBE_FLAG_FCOE_ENABLED) {
-#ifdef CONFIG_IXGBE_DCB
-		/* for FCoE with DCB, we force the priority to what
-		 * was specified by the switch */
-		if ((skb->protocol == htons(ETH_P_FCOE)) ||
-		    (skb->protocol == htons(ETH_P_FIP))) {
-			tx_flags &= ~(IXGBE_TX_FLAGS_VLAN_PRIO_MASK
-				      << IXGBE_TX_FLAGS_VLAN_SHIFT);
-			tx_flags |= ((adapter->fcoe.up << 13)
-				     << IXGBE_TX_FLAGS_VLAN_SHIFT);
-		}
-#endif
+	/* for FCoE with DCB, we force the priority to what
+	 * was specified by the switch */
+	if (adapter->flags & IXGBE_FLAG_FCOE_ENABLED &&
+	    (skb->protocol == htons(ETH_P_FCOE) ||
+	     skb->protocol == htons(ETH_P_FIP))) {
+		tx_flags &= ~(IXGBE_TX_FLAGS_VLAN_PRIO_MASK
+			      << IXGBE_TX_FLAGS_VLAN_SHIFT);
+		tx_flags |= ((adapter->fcoe.up << 13)
+			      << IXGBE_TX_FLAGS_VLAN_SHIFT);
 		/* flag for FCoE offloads */
 		if (skb->protocol == htons(ETH_P_FCOE))
 			tx_flags |= IXGBE_TX_FLAGS_FCOE;


^ permalink raw reply related

* ixgbe->bonding->vlan->bridge->ebtables&iptables causes memory corruption
From: Gyorgy Jeney @ 2010-07-27  7:15 UTC (permalink / raw)
  To: netdev, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 736 bytes --]

Hi,

I  have a rather interesting network setup here.  First I have a dual
10Gbit ixgbe ethernet card, being bonded into one link, on which we
run a number of vlans, which are bridged with various virtual
machines.  Now, to protect the box itself and the virtual machines I
have a set of iptables and ebtables rules.  Whenever one of the
virtual machines start the kernel, which is 2.6.35-rc6, will reliably
panic.

This error needs both ebtables and iptables rules, without either one
or the other, things seem to work quite well.

The errors and backtraces are many and long, so I attached the full
dmes output, please say if you need more.

nog.

P.S.: Please Cc: me on replies as I'm subscribed to neither
linux-kernel, nor netdev.

[-- Attachment #2: dmesg_net3 --]
[-- Type: application/octet-stream, Size: 100394 bytes --]

^ permalink raw reply

* Re: [PATCH UPDATED 1/3] vhost: replace vhost_workqueue with per-vhost kthread
From: Tejun Heo @ 2010-07-27  8:18 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Oleg Nesterov, Sridhar Samudrala, netdev, lkml,
	kvm@vger.kernel.org, Andrew Morton, Dmitri Vorobiev, Jiri Kosina,
	Thomas Gleixner, Ingo Molnar, Andi Kleen
In-Reply-To: <20100726195714.GD27644@redhat.com>

Hello,

On 07/26/2010 09:57 PM, Michael S. Tsirkin wrote:
>> For freeze, it probably is okay but for stop, I think it's better to
>> keep the semantics straight forward.
> 
> What are the semantics then? What do we want stop followed
> by queue and flush to do?

One scenario I can think of is the following.

 kthread_worker allows kthreads to be attached and stopped anytime, so
 if the caller stops the current worker while flushing is pending and
 attaches a new worker, the flushing which was pending will never
 happen.

But, in general, it's nasty to allow execution and its completion to
be separated.  Things like that are likely to bite us back in obscure
ways.  I think it would be silly to have such oddity in generic code
when it can be avoided without too much trouble.

Thanks.

-- 
tejun

^ permalink raw reply

* Re: [PATCH UPDATED 1/3] vhost: replace vhost_workqueue with per-vhost kthread
From: Tejun Heo @ 2010-07-27  8:21 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Oleg Nesterov, Sridhar Samudrala, netdev, lkml,
	kvm@vger.kernel.org, Andrew Morton, Dmitri Vorobiev, Jiri Kosina,
	Thomas Gleixner, Ingo Molnar, Andi Kleen
In-Reply-To: <20100726201907.GF27644@redhat.com>

Hello,

On 07/26/2010 10:19 PM, Michael S. Tsirkin wrote:
> Let's try to define what do we want to achieve then.  Do you want
> code that flushes workers not to block when workers are frozen? How
> will we handle work submitted when worker is frozen?

As I wrote earlier, it's not necessarily about correctness but rather
avoiding unnecessary surprises and of course flushing can and should
stall if the queue is frozen but let's not separate execution of a
work and its completion with something which can take undeterminate
amount of time.

Thanks.

-- 
tejun

^ permalink raw reply

* Re: ixgbe->bonding->vlan->bridge->ebtables&iptables causes memory corruption
From: Eric Dumazet @ 2010-07-27  8:55 UTC (permalink / raw)
  To: Gyorgy Jeney; +Cc: netdev, linux-kernel
In-Reply-To: <AANLkTimEYWiz4d-go6PszKnR0U29FYfjvHeLbPavi-RH@mail.gmail.com>

Le mardi 27 juillet 2010 à 09:15 +0200, Gyorgy Jeney a écrit :
> Hi,
> 
> I  have a rather interesting network setup here.  First I have a dual
> 10Gbit ixgbe ethernet card, being bonded into one link, on which we
> run a number of vlans, which are bridged with various virtual
> machines.  Now, to protect the box itself and the virtual machines I
> have a set of iptables and ebtables rules.  Whenever one of the
> virtual machines start the kernel, which is 2.6.35-rc6, will reliably
> panic.
> 
> This error needs both ebtables and iptables rules, without either one
> or the other, things seem to work quite well.
> 
> The errors and backtraces are many and long, so I attached the full
> dmes output, please say if you need more.

Seems tricky :(

Could you send  your {eb/ip}tables rules ?

^ permalink raw reply

* Re: [PATCH] Driver-core: Fix bluetooth network device rename regression
From: Kay Sievers @ 2010-07-27  9:10 UTC (permalink / raw)
  To: Greg KH
  Cc: Eric W. Biederman, Greg KH, Johannes Berg, Andrew Morton,
	Rafael J. Wysocki, Maciej W. Rozycki, netdev
In-Reply-To: <20100726180941.GB4883@kroah.com>

On Mon, Jul 26, 2010 at 20:09, Greg KH <greg@kroah.com> wrote:
>> Does this version of the change look less bleh worthy?  The effect is
>> the same as my previous patch but the test is more abstract so the
>> effect is not strictly limited to /sys/class/net.
>
> What other class type has a namespace at this point in time?
> Essentially this is the same exact thing, just in a different format
> that obfuscates what you are really doing here.

The patch still looks broken, and does not belong into the core the
way it is done. We denied hacks like this for good reason. But
out-of-the-blue it was a bluetooth naming regression to fix in the
driver core? Interesting!

If someone is going to add namespaces to 'block' or 'input', the sysfs
layout will break, and userspace will be unable to handle the
resulting changes.

Kay

^ permalink raw reply

* [PATCH] ks8842: Support DMA when accessed via timberdale
From: Richard Röjfors @ 2010-07-27  9:18 UTC (permalink / raw)
  To: netdev; +Cc: davem

This patch adds support for RX and TX DMA via the DMA API,
this is only supported when the KS8842 is accessed via timberdale.

There is no support for DMA on the generic bus interface it self,
a state machine inside the FPGA is handling RX and TX transfers to/from
buffers in the FPGA. The host CPU can do DMA to and from these buffers.

The FPGA has to handle the RX interrupts, so these must be enabled in
the ks8842 but not in the FPGA. The driver must not disable the RX interrupt
that would mean that the data transfers into the FPGA buffers would stop.

The host shall not enable TX interrupts since TX is handled by the FPGA,
the host is notified by DMA callbacks when transfers are finished.

Which DMA channels to use are added as parameters in the platform data struct.

Signed-off-by: Richard Röjfors <richard.rojfors@pelagicore.com>
---
diff --git a/drivers/net/Kconfig b/drivers/net/Kconfig
index 70b98e7..386e283 100644
--- a/drivers/net/Kconfig
+++ b/drivers/net/Kconfig
@@ -1757,6 +1757,16 @@ config KS8842
 	 ethernet switch chip (managed, VLAN, QoS) from Micrel or
 	 Timberdale(FPGA).
 
+config KS8842_TIMB_DMA
+	bool "Use Timberdale specific DMA engine"
+	depends on KS8842 && MFD_TIMBERDALE
+	select CONFIG_TIMB_DMA
+	help
+	  This option enables usage of the timberdale specific DMA engine
+	  for the KS8842 driver. Rather than using PIO which results in
+	  single accesses over PCIe, the DMA block of the timberdale FPGA
+	  will burst data to and from the KS8842.
+
 config KS8851
        tristate "Micrel KS8851 SPI"
        depends on SPI
diff --git a/drivers/net/ks8842.c b/drivers/net/ks8842.c
index 289b0be..c59c8e1 100644
--- a/drivers/net/ks8842.c
+++ b/drivers/net/ks8842.c
@@ -30,6 +30,11 @@
 #include <linux/etherdevice.h>
 #include <linux/ethtool.h>
 #include <linux/ks8842.h>
+#ifdef CONFIG_KS8842_TIMB_DMA
+#include <linux/dmaengine.h>
+#include <linux/dma-mapping.h>
+#include <linux/scatterlist.h>
+#endif
 
 #define DRV_NAME "ks8842"
 
@@ -82,6 +87,17 @@
 #define IRQ_RX_ERROR	0x0080
 #define ENABLED_IRQS	(IRQ_LINK_CHANGE | IRQ_TX | IRQ_RX | IRQ_RX_STOPPED | \
 		IRQ_TX_STOPPED | IRQ_RX_OVERRUN | IRQ_RX_ERROR)
+#ifdef CONFIG_KS8842_TIMB_DMA
+	/* When running via timberdale in DMA mode, the RX interrupt should be
+	   enabled in the KS8842, but not in the FPGA IP, since the IP handles
+	   RX DMA internally.
+	   TX interrupts are not needed it is handled by the FPGA the driver is
+	   notified via DMA callbacks.
+	*/
+	#define ENABLED_IRQS_IP	(IRQ_LINK_CHANGE | IRQ_RX_STOPPED | \
+		IRQ_TX_STOPPED | IRQ_RX_OVERRUN | IRQ_RX_ERROR)
+	#define ENABLED_IRQS_DMA	(ENABLED_IRQS_IP | IRQ_RX)
+#endif
 #define REG_ISR		0x02
 #define REG_RXSR	0x04
 #define RXSR_VALID	0x8000
@@ -124,6 +140,31 @@
 #define	MICREL_KS884X		0x01	/* 0=Timeberdale(FPGA), 1=Micrel */
 #define	KS884X_16BIT		0x02	/*  1=16bit, 0=32bit */
 
+#ifdef CONFIG_KS8842_TIMB_DMA
+#define DMA_BUFFER_SIZE		2048
+
+struct ks8842_tx_dma_ctl {
+	struct dma_chan *chan;
+	struct dma_async_tx_descriptor *adesc;
+	void *buf;
+	struct scatterlist sg;
+	int channel;
+};
+
+struct ks8842_rx_dma_ctl {
+	struct dma_chan *chan;
+	struct dma_async_tx_descriptor *adesc;
+	struct sk_buff  *skb;
+	struct scatterlist sg;
+	struct tasklet_struct tasklet;
+	int channel;
+};
+
+#define KS8842_USE_DMA(adapter) (((adapter)->dma_tx.channel != -1) && \
+	 ((adapter)->dma_rx.channel != -1))
+
+#endif
+
 struct ks8842_adapter {
 	void __iomem	*hw_addr;
 	int		irq;
@@ -132,8 +173,23 @@ struct ks8842_adapter {
 	spinlock_t	lock; /* spinlock to be interrupt safe */
 	struct work_struct timeout_work;
 	struct net_device *netdev;
+#ifdef CONFIG_KS8842_TIMB_DMA
+	struct device *dev;
+	struct ks8842_tx_dma_ctl	dma_tx;
+	struct ks8842_rx_dma_ctl	dma_rx;
+#endif
 };
 
+#ifdef CONFIG_KS8842_TIMB_DMA
+static void ks8842_dma_rx_cb(void *data);
+static void ks8842_dma_tx_cb(void *data);
+
+static inline void ks8842_resume_dma(struct ks8842_adapter *adapter)
+{
+	iowrite32(1, adapter->hw_addr + REQ_TIMB_DMA_RESUME);
+}
+#endif
+
 static inline void ks8842_select_bank(struct ks8842_adapter *adapter, u16 bank)
 {
 	iowrite16(bank, adapter->hw_addr + REG_SELECT_BANK);
@@ -297,8 +353,23 @@ static void ks8842_reset_hw(struct ks8842_adapter *adapter)
 	ks8842_write16(adapter, 18, 0xffff, REG_ISR);
 
 	/* enable interrupts */
-	ks8842_write16(adapter, 18, ENABLED_IRQS, REG_IER);
-
+#ifdef CONFIG_KS8842_TIMB_DMA
+	if (KS8842_USE_DMA(adapter)) {
+		/* When running in DMA Mode the RX interrupt is not enabled in
+		   timberdale because RX data is received by DMA callbacks
+		   it must still be enabled in the KS8842 because it indicates
+		   to timberdale when there is RX data for it's DMA FIFOs */
+		iowrite16(ENABLED_IRQS_IP, adapter->hw_addr + REG_TIMB_IER);
+		ks8842_write16(adapter, 18, ENABLED_IRQS_DMA, REG_IER);
+	} else {
+#endif
+		if (!(adapter->conf_flags & MICREL_KS884X))
+			iowrite16(ENABLED_IRQS,
+				adapter->hw_addr + REG_TIMB_IER);
+		ks8842_write16(adapter, 18, ENABLED_IRQS, REG_IER);
+#ifdef CONFIG_KS8842_TIMB_DMA
+	}
+#endif
 	/* enable the switch */
 	ks8842_write16(adapter, 32, 0x1, REG_SW_ID_AND_ENABLE);
 }
@@ -371,6 +442,55 @@ static inline u16 ks8842_tx_fifo_space(struct ks8842_adapter *adapter)
 	return ks8842_read16(adapter, 16, REG_TXMIR) & 0x1fff;
 }
 
+#ifdef CONFIG_KS8842_TIMB_DMA
+static int ks8842_tx_frame_dma(struct sk_buff *skb, struct net_device *netdev)
+{
+	struct ks8842_adapter *adapter = netdev_priv(netdev);
+	struct ks8842_tx_dma_ctl *ctl = &adapter->dma_tx;
+	u8 *buf = ctl->buf;
+
+	if (ctl->adesc) {
+		netdev_dbg(netdev, "%s: TX ongoing\n", __func__);
+		/* transfer ongoing */
+		return NETDEV_TX_BUSY;
+	}
+
+	sg_dma_len(&ctl->sg) = skb->len + sizeof(u32);
+
+	/* copy data to the TX buffer */
+	/* the control word, enable IRQ, port 1 and the length */
+	*buf++ = 0x00;
+	*buf++ = 0x01; /* Port 1 */
+	*buf++ = skb->len & 0xff;
+	*buf++ = (skb->len >> 8) & 0xff;
+	skb_copy_from_linear_data(skb, buf, skb->len);
+
+	dma_sync_single_range_for_device(adapter->dev,
+		sg_dma_address(&ctl->sg), 0, sg_dma_len(&ctl->sg),
+		DMA_TO_DEVICE);
+
+	/* make sure the length is a multiple of 4 */
+	if (sg_dma_len(&ctl->sg) % 4)
+		sg_dma_len(&ctl->sg) += 4 - sg_dma_len(&ctl->sg) % 4;
+
+	ctl->adesc = ctl->chan->device->device_prep_slave_sg(ctl->chan,
+		&ctl->sg, 1, DMA_TO_DEVICE,
+		DMA_PREP_INTERRUPT | DMA_COMPL_SKIP_SRC_UNMAP);
+	if (!ctl->adesc)
+		return NETDEV_TX_BUSY;
+
+	ctl->adesc->callback_param = netdev;
+	ctl->adesc->callback = ks8842_dma_tx_cb;
+	ctl->adesc->tx_submit(ctl->adesc);
+
+	netdev->stats.tx_bytes += skb->len;
+
+	dev_kfree_skb(skb);
+
+	return NETDEV_TX_OK;
+}
+#endif
+
 static int ks8842_tx_frame(struct sk_buff *skb, struct net_device *netdev)
 {
 	struct ks8842_adapter *adapter = netdev_priv(netdev);
@@ -422,6 +542,123 @@ static int ks8842_tx_frame(struct sk_buff *skb, struct net_device *netdev)
 	return NETDEV_TX_OK;
 }
 
+static void ks8842_update_rx_err_counters(struct net_device *netdev, u32 status)
+{
+	netdev_dbg(netdev, "RX error, status: %x\n", status);
+
+	netdev->stats.rx_errors++;
+	if (status & RXSR_TOO_LONG)
+		netdev->stats.rx_length_errors++;
+	if (status & RXSR_CRC_ERROR)
+		netdev->stats.rx_crc_errors++;
+	if (status & RXSR_RUNT)
+		netdev->stats.rx_frame_errors++;
+}
+
+static void ks8842_update_rx_counters(struct net_device *netdev, u32 status,
+	int len)
+{
+	netdev_dbg(netdev, "RX packet, len: %d\n", len);
+
+	netdev->stats.rx_packets++;
+	netdev->stats.rx_bytes += len;
+	if (status & RXSR_MULTICAST)
+		netdev->stats.multicast++;
+}
+
+#ifdef CONFIG_KS8842_TIMB_DMA
+static int __ks8842_start_new_rx_dma(struct net_device *netdev)
+{
+	struct ks8842_adapter *adapter = netdev_priv(netdev);
+	struct ks8842_rx_dma_ctl *ctl = &adapter->dma_rx;
+	struct scatterlist *sg = &ctl->sg;
+	int err;
+
+	ctl->skb = netdev_alloc_skb(netdev, DMA_BUFFER_SIZE);
+	if (ctl->skb) {
+		sg_init_table(sg, 1);
+		sg_dma_address(sg) = dma_map_single(adapter->dev,
+			ctl->skb->data, DMA_BUFFER_SIZE, DMA_FROM_DEVICE);
+		err = dma_mapping_error(adapter->dev, sg_dma_address(sg));
+		if (unlikely(err)) {
+			sg_dma_address(sg) = 0;
+			goto out;
+		}
+
+		sg_dma_len(sg) = DMA_BUFFER_SIZE;
+
+		ctl->adesc = ctl->chan->device->device_prep_slave_sg(ctl->chan,
+			sg, 1, DMA_FROM_DEVICE,
+			DMA_PREP_INTERRUPT | DMA_COMPL_SKIP_SRC_UNMAP);
+
+		if (!ctl->adesc)
+			goto out;
+
+		ctl->adesc->callback_param = netdev;
+		ctl->adesc->callback = ks8842_dma_rx_cb;
+		ctl->adesc->tx_submit(ctl->adesc);
+	} else {
+		err = -ENOMEM;
+		sg_dma_address(sg) = 0;
+		goto out;
+	}
+
+	return err;
+out:
+	if (sg_dma_address(sg))
+		dma_unmap_single(adapter->dev, sg_dma_address(sg),
+			DMA_BUFFER_SIZE, DMA_FROM_DEVICE);
+	sg_dma_address(sg) = 0;
+	if (ctl->skb)
+		dev_kfree_skb(ctl->skb);
+
+	ctl->skb = NULL;
+
+	printk(KERN_ERR DRV_NAME": Failed to start RX DMA: %d\n", err);
+	return err;
+}
+
+static void ks8842_rx_frame_dma_tasklet(unsigned long arg)
+{
+	struct net_device *netdev = (struct net_device *)arg;
+	struct ks8842_adapter *adapter = netdev_priv(netdev);
+	struct ks8842_rx_dma_ctl *ctl = &adapter->dma_rx;
+	struct sk_buff *skb = ctl->skb;
+	dma_addr_t addr = sg_dma_address(&ctl->sg);
+	u32 status;
+
+	ctl->adesc = NULL;
+
+	/* kick next transfer going */
+	__ks8842_start_new_rx_dma(netdev);
+
+	/* now handle the data we got */
+	dma_unmap_single(adapter->dev, addr, DMA_BUFFER_SIZE, DMA_FROM_DEVICE);
+
+	status = *((u32 *)skb->data);
+
+	netdev_dbg(netdev, "%s - rx_data: status: %x\n",
+		__func__, status & 0xffff);
+
+	/* check the status */
+	if ((status & RXSR_VALID) && !(status & RXSR_ERROR)) {
+		int len = (status >> 16) & 0x7ff;
+
+		ks8842_update_rx_counters(netdev, status, len);
+
+		/* reserve 4 bytes which is the status word */
+		skb_reserve(skb, 4);
+		skb_put(skb, len);
+
+		skb->protocol = eth_type_trans(skb, netdev);
+		netif_rx(skb);
+	} else {
+		ks8842_update_rx_err_counters(netdev, status);
+		dev_kfree_skb(skb);
+	}
+}
+#endif
+
 static void ks8842_rx_frame(struct net_device *netdev,
 	struct ks8842_adapter *adapter)
 {
@@ -445,13 +682,9 @@ static void ks8842_rx_frame(struct net_device *netdev,
 	if ((status & RXSR_VALID) && !(status & RXSR_ERROR)) {
 		struct sk_buff *skb = netdev_alloc_skb_ip_align(netdev, len);
 
-		netdev_dbg(netdev, "%s, got package, len: %d\n", __func__, len);
 		if (skb) {
 
-			netdev->stats.rx_packets++;
-			netdev->stats.rx_bytes += len;
-			if (status & RXSR_MULTICAST)
-				netdev->stats.multicast++;
+			ks8842_update_rx_counters(netdev, status, len);
 
 			if (adapter->conf_flags & KS884X_16BIT) {
 				u16 *data16 = (u16 *)skb_put(skb, len);
@@ -477,16 +710,8 @@ static void ks8842_rx_frame(struct net_device *netdev,
 			netif_rx(skb);
 		} else
 			netdev->stats.rx_dropped++;
-	} else {
-		netdev_dbg(netdev, "RX error, status: %x\n", status);
-		netdev->stats.rx_errors++;
-		if (status & RXSR_TOO_LONG)
-			netdev->stats.rx_length_errors++;
-		if (status & RXSR_CRC_ERROR)
-			netdev->stats.rx_crc_errors++;
-		if (status & RXSR_RUNT)
-			netdev->stats.rx_frame_errors++;
-	}
+	} else
+		ks8842_update_rx_err_counters(netdev, status);
 
 	/* set high watermark to 3K */
 	ks8842_clear_bits(adapter, 0, 1 << 12, REG_QRFCR);
@@ -541,6 +766,14 @@ void ks8842_tasklet(unsigned long arg)
 	isr = ks8842_read16(adapter, 18, REG_ISR);
 	netdev_dbg(netdev, "%s - ISR: 0x%x\n", __func__, isr);
 
+#ifdef CONFIG_KS8842_TIMB_DMA
+	/* when running in DMA mode, do not ack RX interrupts, it is handled
+	   internally by timberdale, otherwise it's DMA FIFO:s would stop
+	*/
+	if (KS8842_USE_DMA(adapter))
+		isr &= ~IRQ_RX;
+#endif
+
 	/* Ack */
 	ks8842_write16(adapter, 18, isr, REG_ISR);
 
@@ -554,9 +787,14 @@ void ks8842_tasklet(unsigned long arg)
 	if (isr & IRQ_LINK_CHANGE)
 		ks8842_update_link_status(netdev, adapter);
 
+	/* should not get IRQ_RX when running DMA mode */
 	if (isr & (IRQ_RX | IRQ_RX_ERROR))
-		ks8842_handle_rx(netdev, adapter);
+#ifdef CONFIG_KS8842_TIMB_DMA
+		if (!KS8842_USE_DMA(adapter))
+#endif
+			ks8842_handle_rx(netdev, adapter);
 
+	/* should only happen when in PIO mode */
 	if (isr & IRQ_TX)
 		ks8842_handle_tx(netdev, adapter);
 
@@ -575,8 +813,19 @@ void ks8842_tasklet(unsigned long arg)
 
 	/* re-enable interrupts, put back the bank selection register */
 	spin_lock_irqsave(&adapter->lock, flags);
-	ks8842_write16(adapter, 18, ENABLED_IRQS, REG_IER);
+#ifdef CONFIG_KS8842_TIMB_DMA
+	if (KS8842_USE_DMA(adapter))
+		ks8842_write16(adapter, 18, ENABLED_IRQS_DMA, REG_IER);
+	else
+#endif
+		ks8842_write16(adapter, 18, ENABLED_IRQS, REG_IER);
 	iowrite16(entry_bank, adapter->hw_addr + REG_SELECT_BANK);
+#ifdef CONFIG_KS8842_TIMB_DMA
+	/* Make sure timberdale continues DMA operations, they are stopped while
+	   we are handling the ks8842 because we might change bank */
+	if (KS8842_USE_DMA(adapter))
+		ks8842_resume_dma(adapter);
+#endif
 	spin_unlock_irqrestore(&adapter->lock, flags);
 }
 
@@ -592,8 +841,14 @@ static irqreturn_t ks8842_irq(int irq, void *devid)
 	netdev_dbg(netdev, "%s - ISR: 0x%x\n", __func__, isr);
 
 	if (isr) {
-		/* disable IRQ */
-		ks8842_write16(adapter, 18, 0x00, REG_IER);
+#ifdef CONFIG_KS8842_TIMB_DMA
+		if (KS8842_USE_DMA(adapter))
+			/* disable all but RX IRQ, since the FPGA relies on it*/
+			ks8842_write16(adapter, 18, IRQ_RX, REG_IER);
+		else
+#endif
+			/* disable IRQ */
+			ks8842_write16(adapter, 18, 0x00, REG_IER);
 
 		/* schedule tasklet */
 		tasklet_schedule(&adapter->tasklet);
@@ -602,10 +857,154 @@ static irqreturn_t ks8842_irq(int irq, void *devid)
 	}
 
 	iowrite16(entry_bank, adapter->hw_addr + REG_SELECT_BANK);
-
+#ifdef CONFIG_KS8842_TIMB_DMA
+	/* After an interrupt, tell timberdale to continue DMA operations.
+	   DMA is disabled while we are handling the ks8842 because we might
+	   change bank */
+	ks8842_resume_dma(adapter);
+#endif
 	return ret;
 }
 
+#ifdef CONFIG_KS8842_TIMB_DMA
+static void ks8842_dma_rx_cb(void *data)
+{
+	struct net_device	*netdev = data;
+	struct ks8842_adapter	*adapter = netdev_priv(netdev);
+
+	netdev_dbg(netdev, "RX DMA finished\n");
+	/* schedule tasklet */
+	if (adapter->dma_rx.adesc)
+		tasklet_schedule(&adapter->dma_rx.tasklet);
+}
+
+static void ks8842_dma_tx_cb(void *data)
+{
+	struct net_device		*netdev = data;
+	struct ks8842_adapter		*adapter = netdev_priv(netdev);
+	struct ks8842_tx_dma_ctl	*ctl = &adapter->dma_tx;
+
+	netdev_dbg(netdev, "TX DMA finished\n");
+
+	if (!ctl->adesc)
+		return;
+
+	netdev->stats.tx_packets++;
+	ctl->adesc = NULL;
+
+	if (netif_queue_stopped(netdev))
+		netif_wake_queue(netdev);
+}
+
+static void ks8842_stop_dma(struct ks8842_adapter *adapter)
+{
+	struct ks8842_tx_dma_ctl *tx_ctl = &adapter->dma_tx;
+	struct ks8842_rx_dma_ctl *rx_ctl = &adapter->dma_rx;
+
+	tx_ctl->adesc = NULL;
+	if (tx_ctl->chan)
+		tx_ctl->chan->device->device_control(tx_ctl->chan,
+			DMA_TERMINATE_ALL, 0);
+
+	rx_ctl->adesc = NULL;
+	if (rx_ctl->chan)
+		rx_ctl->chan->device->device_control(rx_ctl->chan,
+			DMA_TERMINATE_ALL, 0);
+
+	if (sg_dma_address(&rx_ctl->sg))
+		dma_unmap_single(adapter->dev, sg_dma_address(&rx_ctl->sg),
+			DMA_BUFFER_SIZE, DMA_FROM_DEVICE);
+	sg_dma_address(&rx_ctl->sg) = 0;
+
+	dev_kfree_skb(rx_ctl->skb);
+	rx_ctl->skb = NULL;
+}
+
+static void ks8842_dealloc_dma_bufs(struct ks8842_adapter *adapter)
+{
+	struct ks8842_tx_dma_ctl *tx_ctl = &adapter->dma_tx;
+	struct ks8842_rx_dma_ctl *rx_ctl = &adapter->dma_rx;
+
+	ks8842_stop_dma(adapter);
+
+	if (tx_ctl->chan)
+		dma_release_channel(tx_ctl->chan);
+	tx_ctl->chan = NULL;
+
+	if (rx_ctl->chan)
+		dma_release_channel(rx_ctl->chan);
+	rx_ctl->chan = NULL;
+
+	tasklet_kill(&rx_ctl->tasklet);
+
+	if (sg_dma_address(&tx_ctl->sg))
+		dma_unmap_single(adapter->dev, sg_dma_address(&tx_ctl->sg),
+			DMA_BUFFER_SIZE, DMA_TO_DEVICE);
+	sg_dma_address(&tx_ctl->sg) = 0;
+
+	kfree(tx_ctl->buf);
+	tx_ctl->buf = NULL;
+}
+
+static bool ks8842_dma_filter_fn(struct dma_chan *chan, void *filter_param)
+{
+	return chan->chan_id == (int)filter_param;
+}
+
+static int ks8842_alloc_dma_bufs(struct net_device *netdev)
+{
+	struct ks8842_adapter *adapter = netdev_priv(netdev);
+	struct ks8842_tx_dma_ctl *tx_ctl = &adapter->dma_tx;
+	struct ks8842_rx_dma_ctl *rx_ctl = &adapter->dma_rx;
+	int err;
+
+	dma_cap_mask_t mask;
+
+	dma_cap_zero(mask);
+	dma_cap_set(DMA_SLAVE, mask);
+	dma_cap_set(DMA_PRIVATE, mask);
+
+	sg_init_table(&tx_ctl->sg, 1);
+
+	tx_ctl->chan = dma_request_channel(mask, ks8842_dma_filter_fn,
+		(void *)tx_ctl->channel);
+	if (!tx_ctl->chan) {
+		err = -ENODEV;
+		goto err;
+	}
+
+	/* allocate DMA buffer */
+	tx_ctl->buf = kmalloc(DMA_BUFFER_SIZE, GFP_KERNEL);
+	if (!tx_ctl->buf) {
+		err = -ENOMEM;
+		goto err;
+	}
+
+	sg_dma_address(&tx_ctl->sg) = dma_map_single(adapter->dev,
+		tx_ctl->buf, DMA_BUFFER_SIZE, DMA_TO_DEVICE);
+	err = dma_mapping_error(adapter->dev,
+		sg_dma_address(&tx_ctl->sg));
+	if (err) {
+		sg_dma_address(&tx_ctl->sg) = 0;
+		goto err;
+	}
+
+	rx_ctl->chan = dma_request_channel(mask, ks8842_dma_filter_fn,
+		(void *)rx_ctl->channel);
+	if (!rx_ctl->chan) {
+		err = -ENODEV;
+		goto err;
+	}
+
+	tasklet_init(&rx_ctl->tasklet, ks8842_rx_frame_dma_tasklet,
+		(unsigned long)netdev);
+
+	return 0;
+err:
+	ks8842_dealloc_dma_bufs(adapter);
+	return err;
+}
+#endif
 
 /* Netdevice operations */
 
@@ -616,6 +1015,27 @@ static int ks8842_open(struct net_device *netdev)
 
 	netdev_dbg(netdev, "%s - entry\n", __func__);
 
+#ifdef CONFIG_KS8842_TIMB_DMA
+	if (KS8842_USE_DMA(adapter)) {
+		err = ks8842_alloc_dma_bufs(netdev);
+
+		if (!err) {
+			/* start RX dma */
+			err = __ks8842_start_new_rx_dma(netdev);
+			if (err)
+				ks8842_dealloc_dma_bufs(adapter);
+		}
+
+		if (err) {
+			printk(KERN_WARNING DRV_NAME
+				": Failed to initiate DMA, running PIO\n");
+			ks8842_dealloc_dma_bufs(adapter);
+			adapter->dma_rx.channel = -1;
+			adapter->dma_tx.channel = -1;
+		}
+	}
+#endif
+
 	/* reset the HW */
 	ks8842_reset_hw(adapter);
 
@@ -641,6 +1061,11 @@ static int ks8842_close(struct net_device *netdev)
 
 	cancel_work_sync(&adapter->timeout_work);
 
+#ifdef CONFIG_KS8842_TIMB_DMA
+	if (KS8842_USE_DMA(adapter))
+		ks8842_dealloc_dma_bufs(adapter);
+#endif
+
 	/* free the irq */
 	free_irq(adapter->irq, netdev);
 
@@ -658,6 +1083,19 @@ static netdev_tx_t ks8842_xmit_frame(struct sk_buff *skb,
 
 	netdev_dbg(netdev, "%s: entry\n", __func__);
 
+#ifdef CONFIG_KS8842_TIMB_DMA
+	if (KS8842_USE_DMA(adapter)) {
+		unsigned long flags;
+		ret = ks8842_tx_frame_dma(skb, netdev);
+		/* for now only allow one transfer at the time */
+		spin_lock_irqsave(&adapter->lock, flags);
+		if (adapter->dma_tx.adesc)
+			netif_stop_queue(netdev);
+		spin_unlock_irqrestore(&adapter->lock, flags);
+		return ret;
+	}
+#endif
+
 	ret = ks8842_tx_frame(skb, netdev);
 
 	if (ks8842_tx_fifo_space(adapter) <  netdev->mtu + 8)
@@ -693,6 +1131,10 @@ static void ks8842_tx_timeout_work(struct work_struct *work)
 	netdev_dbg(netdev, "%s: entry\n", __func__);
 
 	spin_lock_irqsave(&adapter->lock, flags);
+#ifdef CONFIG_KS8842_TIMB_DMA
+	if (KS8842_USE_DMA(adapter))
+		ks8842_stop_dma(adapter);
+#endif
 	/* disable interrupts */
 	ks8842_write16(adapter, 18, 0, REG_IER);
 	ks8842_write16(adapter, 18, 0xFFFF, REG_ISR);
@@ -706,6 +1148,11 @@ static void ks8842_tx_timeout_work(struct work_struct *work)
 	ks8842_write_mac_addr(adapter, netdev->dev_addr);
 
 	ks8842_update_link_status(netdev, adapter);
+
+#ifdef CONFIG_KS8842_TIMB_DMA
+	if (KS8842_USE_DMA(adapter))
+		__ks8842_start_new_rx_dma(netdev);
+#endif
 }
 
 static void ks8842_tx_timeout(struct net_device *netdev)
@@ -765,6 +1212,20 @@ static int __devinit ks8842_probe(struct platform_device *pdev)
 		goto err_get_irq;
 	}
 
+#ifdef CONFIG_KS8842_TIMB_DMA
+	adapter->dev = (pdev->dev.parent) ? pdev->dev.parent : &pdev->dev;
+
+	/* DMA is only supported when accessed via timberdale */
+	if (!(adapter->conf_flags & MICREL_KS884X) && pdata &&
+		(pdata->tx_dma_channel != -1) &&
+		(pdata->rx_dma_channel != -1)) {
+		adapter->dma_rx.channel = pdata->rx_dma_channel;
+		adapter->dma_tx.channel = pdata->tx_dma_channel;
+	} else {
+		adapter->dma_rx.channel = -1;
+		adapter->dma_tx.channel = -1;
+	}
+#endif
 	tasklet_init(&adapter->tasklet, ks8842_tasklet, (unsigned long)netdev);
 	spin_lock_init(&adapter->lock);
 
diff --git a/include/linux/ks8842.h b/include/linux/ks8842.h
index da0341b..953b3fb 100644
--- a/include/linux/ks8842.h
+++ b/include/linux/ks8842.h
@@ -25,10 +25,14 @@
  * struct ks8842_platform_data - Platform data of the KS8842 network driver
  * @macaddr:	The MAC address of the device, set to all 0:s to use the on in
  *		the chip.
+ * @rx_dma_channel:	The DMA channel to use for RX, -1 for none.
+ * @tx_dma_channel:	The DMA channel to use for RX, -1 for none.
  *
  */
 struct ks8842_platform_data {
 	u8 macaddr[ETH_ALEN];
+	int rx_dma_channel;
+	int tx_dma_channel;
 };
 
 #endif


^ permalink raw reply related

* [net-next-2.6 PATCH] igbvf, ixgbevf: use dev_hw_addr_random
From: Stefan Assmann @ 2010-07-27  9:24 UTC (permalink / raw)
  To: netdev; +Cc: e1000-devel, jeffrey.t.kirsher, Rose, Gregory V, David Miller

From: Stefan Assmann <sassmann@redhat.com>

Both igbvf and ixgbevf should set addr_assign_type to NET_ADDR_RANDOM
so udev creates persistent net rules by matching the device path.
Do this by using the dev_hw_addr_random helper function.

Signed-off-by: Stefan Assmann <sassmann@redhat.com>
---
 drivers/net/igbvf/netdev.c         |    2 +-
 drivers/net/ixgbevf/ixgbevf_main.c |    2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/igbvf/netdev.c b/drivers/net/igbvf/netdev.c
index 5e2b2a8..048595b 100644
--- a/drivers/net/igbvf/netdev.c
+++ b/drivers/net/igbvf/netdev.c
@@ -2751,7 +2751,7 @@ static int __devinit igbvf_probe(struct pci_dev *pdev,
 		dev_info(&pdev->dev,
 			 "PF still in reset state, assigning new address."
 			 " Is the PF interface up?\n");
-		random_ether_addr(hw->mac.addr);
+		dev_hw_addr_random(adapter->netdev, hw->mac.addr);
 	} else {
 		err = hw->mac.ops.read_mac_addr(hw);
 		if (err) {
diff --git a/drivers/net/ixgbevf/ixgbevf_main.c b/drivers/net/ixgbevf/ixgbevf_main.c
index af49135..4867440 100644
--- a/drivers/net/ixgbevf/ixgbevf_main.c
+++ b/drivers/net/ixgbevf/ixgbevf_main.c
@@ -2229,7 +2229,7 @@ static int __devinit ixgbevf_sw_init(struct ixgbevf_adapter *adapter)
 	if (err) {
 		dev_info(&pdev->dev,
 		         "PF still in reset state, assigning new address\n");
-		random_ether_addr(hw->mac.addr);
+		dev_hw_addr_random(adapter->netdev, hw->mac.addr);
 	} else {
 		err = hw->mac.ops.init_hw(hw);
 		if (err) {
-- 
1.6.5.2


^ permalink raw reply related

* [patch -next] ixgbe: potential null dereference
From: Dan Carpenter @ 2010-07-27 10:05 UTC (permalink / raw)
  To: Jeff Kirsher
  Cc: Jesse Brandeburg, Bruce Allan, Alex Duyck, PJ Waskiewicz,
	John Ronciak, David S. Miller, Don Skidmore,
	Mallikarjuna R Chilakala, e1000-devel, netdev, kernel-janitors

The e_dev_err() macro dereferences "adapter" which is NULL here.

Signed-off-by: Dan Carpenter <error27@gmail.com>

diff --git a/drivers/net/ixgbe/ixgbe_main.c b/drivers/net/ixgbe/ixgbe_main.c
index 9203759..0360260 100644
--- a/drivers/net/ixgbe/ixgbe_main.c
+++ b/drivers/net/ixgbe/ixgbe_main.c
@@ -6549,8 +6549,8 @@ static int __devinit ixgbe_probe(struct pci_dev *pdev,
 			err = dma_set_coherent_mask(&pdev->dev,
 						    DMA_BIT_MASK(32));
 			if (err) {
-				e_dev_err("No usable DMA configuration, "
-					  "aborting\n");
+				dev_err(&pdev->dev,
+					"No usable DMA configuration, aborting\n");
 				goto err_dma;
 			}
 		}
@@ -6560,7 +6560,8 @@ static int __devinit ixgbe_probe(struct pci_dev *pdev,
 	err = pci_request_selected_regions(pdev, pci_select_bars(pdev,
 	                                   IORESOURCE_MEM), ixgbe_driver_name);
 	if (err) {
-		e_dev_err("pci_request_selected_regions failed 0x%x\n", err);
+		dev_err(&pdev->dev,
+			"pci_request_selected_regions failed 0x%x\n", err);
 		goto err_pci_reg;
 	}
 

^ permalink raw reply related

* [net-next 2/3] stmmac: fix timer setup when use dual mac Kconfig
From: Giuseppe CAVALLARO @ 2010-07-27 10:09 UTC (permalink / raw)
  To: netdev; +Cc: Giuseppe Cavallaro
In-Reply-To: <1280225387-26240-1-git-send-email-peppe.cavallaro@st.com>

The driver erroneously sets the tmrate to zero when the
TMU initialisation fails. This actually generates problems
while using the dual GMAC configuration.

With this patch, enabling both the dual gmac and the timer
optimisation, the first interface opened will use the tmu
channel 2, the second one won't be able to use the timer but
will continue to work without mitigating the interrupts by
using the external timer (i.e. TMU channel 2).

Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
---
 drivers/net/stmmac/stmmac_main.c |    1 -
 1 files changed, 0 insertions(+), 1 deletions(-)

diff --git a/drivers/net/stmmac/stmmac_main.c b/drivers/net/stmmac/stmmac_main.c
index 0bdd332..1083334 100644
--- a/drivers/net/stmmac/stmmac_main.c
+++ b/drivers/net/stmmac/stmmac_main.c
@@ -829,7 +829,6 @@ static int stmmac_open(struct net_device *dev)
 	 * In case of failure continue without timer. */
 	if (unlikely((stmmac_open_ext_timer(dev, priv->tm)) < 0)) {
 		pr_warning("stmmaceth: cannot attach the external timer.\n");
-		tmrate = 0;
 		priv->tm->freq = 0;
 		priv->tm->timer_start = stmmac_no_timer_started;
 		priv->tm->timer_stop = stmmac_no_timer_stopped;
-- 
1.5.5.6


^ permalink raw reply related

* [net-next 1/3] stmmac: remove the STMMAC_DUAL_MAC option
From: Giuseppe CAVALLARO @ 2010-07-27 10:09 UTC (permalink / raw)
  To: netdev; +Cc: Giuseppe Cavallaro

The STMMAC_DUAL_MAC is now removed from the driver's Kconfig.
It will be available from a specific STM boards Kconfig.

Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
---
 drivers/net/stmmac/Kconfig |    9 ---------
 1 files changed, 0 insertions(+), 9 deletions(-)

diff --git a/drivers/net/stmmac/Kconfig b/drivers/net/stmmac/Kconfig
index eb63d44..2513555 100644
--- a/drivers/net/stmmac/Kconfig
+++ b/drivers/net/stmmac/Kconfig
@@ -20,15 +20,6 @@ config STMMAC_DA
 	  By default, the DMA arbitration scheme is based on Round-robin
 	  (rx:tx priority is 1:1).
 
-config STMMAC_DUAL_MAC
-	bool "STMMAC: dual mac support (EXPERIMENTAL)"
-	default n
-        depends on EXPERIMENTAL && STMMAC_ETH && !STMMAC_TIMER
-	help
-	  Some ST SoCs (for example the stx7141 and stx7200c2) have two
-	  Ethernet Controllers. This option turns on the second Ethernet
-	  device on this kind of platforms.
-
 config STMMAC_TIMER
 	bool "STMMAC Timer optimisation"
 	default n
-- 
1.5.5.6


^ permalink raw reply related

* [net-next 3/3] stmmac: fix automatic PAD/FCS stripping
From: Giuseppe CAVALLARO @ 2010-07-27 10:09 UTC (permalink / raw)
  To: netdev; +Cc: Giuseppe Cavallaro
In-Reply-To: <1280225387-26240-2-git-send-email-peppe.cavallaro@st.com>

For Simple Ethernet frames (802.2 and 802.3) the GMAC Core
never strips pad and fcs. This means the ACS has no effect
on IPv4/6 frames.
The FL bits, in the RDES0, include the FCS so the driver
has to remove it in SW.
For 802.3 frame format with LLC or LLC-SNAP, when set the ACS
bit, the HW strips both PAD and FCS.
The FL bits, in the RDES0, actually represents the frame length
already stripped.
This patch fixes this logic within the device driver that
erroneously removed 4byte from 802.3 frames already stripped
corrupting the payload.

Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
---
 drivers/net/stmmac/common.h      |    1 +
 drivers/net/stmmac/dwmac1000.h   |    2 +-
 drivers/net/stmmac/enh_desc.c    |    2 +-
 drivers/net/stmmac/stmmac_main.c |    8 ++++++--
 4 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/drivers/net/stmmac/common.h b/drivers/net/stmmac/common.h
index 144f76f..66b9da0 100644
--- a/drivers/net/stmmac/common.h
+++ b/drivers/net/stmmac/common.h
@@ -108,6 +108,7 @@ enum rx_frame_status { /* IPC status */
 	good_frame = 0,
 	discard_frame = 1,
 	csum_none = 2,
+	llc_snap = 4,
 };
 
 enum tx_dma_irq_status {
diff --git a/drivers/net/stmmac/dwmac1000.h b/drivers/net/stmmac/dwmac1000.h
index d8d0f35..8b20b19 100644
--- a/drivers/net/stmmac/dwmac1000.h
+++ b/drivers/net/stmmac/dwmac1000.h
@@ -93,7 +93,7 @@ enum inter_frame_gap {
 #define GMAC_CONTROL_IPC	0x00000400 /* Checksum Offload */
 #define GMAC_CONTROL_DR		0x00000200 /* Disable Retry */
 #define GMAC_CONTROL_LUD	0x00000100 /* Link up/down */
-#define GMAC_CONTROL_ACS	0x00000080 /* Automatic Pad Stripping */
+#define GMAC_CONTROL_ACS	0x00000080 /* Automatic Pad/FCS Stripping */
 #define GMAC_CONTROL_DC		0x00000010 /* Deferral Check */
 #define GMAC_CONTROL_TE		0x00000008 /* Transmitter Enable */
 #define GMAC_CONTROL_RE		0x00000004 /* Receiver Enable */
diff --git a/drivers/net/stmmac/enh_desc.c b/drivers/net/stmmac/enh_desc.c
index 3c18ebe..f612f98 100644
--- a/drivers/net/stmmac/enh_desc.c
+++ b/drivers/net/stmmac/enh_desc.c
@@ -123,7 +123,7 @@ static int enh_desc_coe_rdes0(int ipc_err, int type, int payload_err)
 	 */
 	if (status == 0x0) {
 		CHIP_DBG(KERN_INFO "RX Des0 status: IEEE 802.3 Type frame.\n");
-		ret = good_frame;
+		ret = llc_snap;
 	} else if (status == 0x4) {
 		CHIP_DBG(KERN_INFO "RX Des0 status: IPv4/6 No CSUM errorS.\n");
 		ret = good_frame;
diff --git a/drivers/net/stmmac/stmmac_main.c b/drivers/net/stmmac/stmmac_main.c
index 1083334..bbb7951 100644
--- a/drivers/net/stmmac/stmmac_main.c
+++ b/drivers/net/stmmac/stmmac_main.c
@@ -1216,9 +1216,13 @@ static int stmmac_rx(struct stmmac_priv *priv, int limit)
 			priv->dev->stats.rx_errors++;
 		else {
 			struct sk_buff *skb;
-			/* Length should omit the CRC */
-			int frame_len = priv->hw->desc->get_rx_frame_len(p) - 4;
+			int frame_len;
 
+			frame_len = priv->hw->desc->get_rx_frame_len(p);
+			/* ACS is set; GMAC core strips PAD/FCS for IEEE 802.3
+			 * Type frames (LLC/LLC-SNAP) */
+			if (unlikely(status != llc_snap))
+				frame_len -= ETH_FCS_LEN;
 #ifdef STMMAC_RX_DEBUG
 			if (frame_len > ETH_FRAME_LEN)
 				pr_debug("\tRX frame size %d, COE status: %d\n",
-- 
1.5.5.6


^ permalink raw reply related

* br_forward.c - rcu dereference warning
From: Johannes Berg @ 2010-07-27 10:45 UTC (permalink / raw)
  To: netdev

I couldn't find this reported yet, apologies if I missed it.

johannes

[   60.140433] ===================================================
[   60.140437] [ INFO: suspicious rcu_dereference_check() usage. ]
[   60.140440] ---------------------------------------------------
[   60.140444] /home/johannes/sys/wireless-testing/net/bridge/br_forward.c:215 invoked rcu_dereference_check() without protection!
[   60.140447] 
[   60.140448] other info that might help us debug this:
[   60.140449] 
[   60.140452] 
[   60.140453] rcu_scheduler_active = 1, debug_locks = 1
[   60.140457] 2 locks held by Xorg/3083:
[   60.140459]  #0:  (&im->timer){+.-...}, at: [<ffffffff81058180>] call_timer_fn+0x0/0x2f0
[   60.140473]  #1:  (rcu_read_lock_bh){.+....}, at: [<ffffffff813b6c3a>] dev_queue_xmit+0x5a/0x690
[   60.140484] 
[   60.140484] stack backtrace:
[   60.140489] Pid: 3083, comm: Xorg Not tainted 2.6.35-rc6-wl-47665-gc2e2180-dirty #174
[   60.140492] Call Trace:
[   60.140495]  <IRQ>  [<ffffffff8107e494>] lockdep_rcu_dereference+0xa4/0xc0
[   60.140514]  [<ffffffffa07617c3>] br_multicast_flood+0x293/0x310 [bridge]
[   60.140531]  [<ffffffffa0761877>] br_multicast_deliver+0x17/0x20 [bridge]
[   60.140539]  [<ffffffffa07607bc>] br_dev_xmit+0x10c/0x170 [bridge]
[   60.140550]  [<ffffffff813b69ba>] dev_hard_start_xmit+0x21a/0x2e0
[   60.140556]  [<ffffffff813b708e>] dev_queue_xmit+0x4ae/0x690
[   60.140576]  [<ffffffff813c0d43>] neigh_resolve_output+0x113/0x250
[   60.140582]  [<ffffffff813eec46>] ip_finish_output+0x2a6/0x570
[   60.140588]  [<ffffffff813ef33c>] ip_mc_output+0x1dc/0x320
[   60.140593]  [<ffffffff813ed7ed>] ip_local_out+0x2d/0x80
[   60.140600]  [<ffffffff81422236>] igmp_send_report+0x1c6/0x200
[   60.140610]  [<ffffffff814238f0>] igmp_timer_expire+0x100/0x130
[   60.140615]  [<ffffffff81058219>] call_timer_fn+0x99/0x2f0
[   60.140636]  [<ffffffff810585e3>] run_timer_softirq+0x173/0x330
[   60.140641]  [<ffffffff8104f114>] __do_softirq+0x114/0x3d0
[   60.140652]  [<ffffffff8100360c>] call_softirq+0x1c/0x50



^ permalink raw reply

* Tx queue selection
From: Benjamin Herrenschmidt @ 2010-07-27 10:51 UTC (permalink / raw)
  To: netdev

Hi folks !

I'm putting my newbie hat on ... :-)

While looking at our ehea driver (and in fact another upcoming driver
I'm helping with), I noticed it's using the "old style" multiqueue. IE.
It doesn't use the alloc_netdev_mq() variant, creates one queue on the
linux side, an makes its own selection of HW queue in start_xmit.

This had many drawbacks, obviously, such as not getting per-queue locks
etc...

Now, the mechanics of converting that to the new scheme are easy enough
to figure out by reading the code. However, where my lack of networking
background fails me is when it comes to the policy of choosing a Tx
queue.

ehea uses its own hash of the header, different from the "default" queue
selection in the net core. Looking at other drivers such as ixgbe, I see
that it can chose to use smp_processor_id() when a flag is set for which
I don't totally understand the meaning or default to the core algorithm.

Now, while I can understand why it's a good idea to use the current
processor, in order to limit cache ping pong etc... I'm not really
confident I understand the pro/cons of using the hashing for tx. I
understand that the net core can play interesting games with associating
sockets with queues etc... but I'm a bit at a loss when it comes to
deciding what's best for this driver. I suppose I could start by
implementing my own queue selection based on what ehea does today but I
have the nasty feeling that's going to be sub-optimal :-)

So I would very much appreciate (and reward with free beer at the next
conference) if somebody could give me a bit of a heads up on how things
are expected to be done there, pro/cons, perf impact etc...

Thanks in avance !

Cheers,
Ben.



^ permalink raw reply

* Re: Tx queue selection
From: Neil Horman @ 2010-07-27 11:50 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: netdev
In-Reply-To: <1280227867.1970.208.camel@pasglop>

On Tue, Jul 27, 2010 at 08:51:07PM +1000, Benjamin Herrenschmidt wrote:
> Hi folks !
> 
> I'm putting my newbie hat on ... :-)
> 
> While looking at our ehea driver (and in fact another upcoming driver
> I'm helping with), I noticed it's using the "old style" multiqueue. IE.
> It doesn't use the alloc_netdev_mq() variant, creates one queue on the
> linux side, an makes its own selection of HW queue in start_xmit.
> 
> This had many drawbacks, obviously, such as not getting per-queue locks
> etc...
> 
> Now, the mechanics of converting that to the new scheme are easy enough
> to figure out by reading the code. However, where my lack of networking
> background fails me is when it comes to the policy of choosing a Tx
> queue.
> 
> ehea uses its own hash of the header, different from the "default" queue
> selection in the net core. Looking at other drivers such as ixgbe, I see
> that it can chose to use smp_processor_id() when a flag is set for which
> I don't totally understand the meaning or default to the core algorithm.
> 
IIRC, that intels Flow Director feature.  I've not used it, but from what I
understand flow director is a technology that allows a card to isolate flows to
and from a socket to a single cpu.  i.e. the cpu which handles the transmission
of frames will be the cpu that handles frames destined for that flow as well.
To do this they do some additional analysis and steering configuration in the
driver using a user space utility as well.  I don't think it makes much sense to
use smp_processor_id in your tx hashing if you don't have some sort of specific
feature like that.  The other drivers which implement ndo_select_queue don't do
it, they hash on the skb header like the core does, and make modifications to
that based on hardware capabilities.

HTH
Neil

> 

^ permalink raw reply

* Re: Tx queue selection
From: Eric Dumazet @ 2010-07-27 11:57 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: netdev
In-Reply-To: <1280227867.1970.208.camel@pasglop>

Le mardi 27 juillet 2010 à 20:51 +1000, Benjamin Herrenschmidt a écrit :
> Hi folks !
> 
> I'm putting my newbie hat on ... :-)
> 
> While looking at our ehea driver (and in fact another upcoming driver
> I'm helping with), I noticed it's using the "old style" multiqueue. IE.
> It doesn't use the alloc_netdev_mq() variant, creates one queue on the
> linux side, an makes its own selection of HW queue in start_xmit.
> 
> This had many drawbacks, obviously, such as not getting per-queue locks
> etc...
> 
> Now, the mechanics of converting that to the new scheme are easy enough
> to figure out by reading the code. However, where my lack of networking
> background fails me is when it comes to the policy of choosing a Tx
> queue.
> 
> ehea uses its own hash of the header, different from the "default" queue
> selection in the net core. Looking at other drivers such as ixgbe, I see
> that it can chose to use smp_processor_id() when a flag is set for which
> I don't totally understand the meaning or default to the core algorithm.
> 
> Now, while I can understand why it's a good idea to use the current
> processor, in order to limit cache ping pong etc... I'm not really
> confident I understand the pro/cons of using the hashing for tx. I
> understand that the net core can play interesting games with associating
> sockets with queues etc... but I'm a bit at a loss when it comes to
> deciding what's best for this driver. I suppose I could start by
> implementing my own queue selection based on what ehea does today but I
> have the nasty feeling that's going to be sub-optimal :-)
> 
> So I would very much appreciate (and reward with free beer at the next
> conference) if somebody could give me a bit of a heads up on how things
> are expected to be done there, pro/cons, perf impact etc...

I am not sure ndo_select_queue() is really needed these days. It was
done before core network was able to use a socket provided hash.

tx queue selection done by default (skb_tx_hash()) should be fine.

bnx2 for example doesnt provide a ndo_select_queue()




^ permalink raw reply

* RE: nfs client hang
From: Eric Dumazet @ 2010-07-27 12:21 UTC (permalink / raw)
  To: Andy Chittenden
  Cc: Linux Kernel Mailing List (linux-kernel@vger.kernel.org),
	Trond Myklebust, netdev
In-Reply-To: <ABFC24E4C13D81489F7F624E14891C8607DDF8D1E4@uk-ex-mbx1.terastack.bluearc.com>

Le mardi 27 juillet 2010 à 11:53 +0100, Andy Chittenden a écrit :
> > >> IE the client starts a connection and then closes it again without sending data.
> > > Once this happens, here's some rpcdebug info for the rpc module using 2.6.34.1 kernel:
> > >
> > > ... lots of the following nfsv3 WRITE requests:
> > > [ 7670.026741] 57793 0001    -11 ffff88012e32b000   (null)        0 ffffffffa03beb10 nfsv3 WRITE a:call_reserveresult q:xprt_backlog
> > > [ 7670.026759] 57794 0001    -11 ffff88012e32b000   (null)        0 ffffffffa03beb10 nfsv3 WRITE a:call_reserveresult q:xprt_backlog
> > > [ 7670.026778] 57795 0001    -11 ffff88012e32b000   (null)        0 ffffffffa03beb10 nfsv3 WRITE a:call_reserveresult q:xprt_backlog
> > > [ 7670.026797] 57796 0001    -11 ffff88012e32b000   (null)        0 ffffffffa03beb10 nfsv3 WRITE a:call_reserveresult q:xprt_backlog
> > > [ 7670.026815] 57797 0001    -11 ffff88012e32b000   (null)        0 ffffffffa03beb10 nfsv3 WRITE a:call_reserveresult q:xprt_backlog
> > > [ 7670.026834] 57798 0001    -11 ffff88012e32b000   (null)        0 ffffffffa03beb10 nfsv3 WRITE a:call_reserveresult q:xprt_backlog
> > > [ 7670.026853] 57799 0001    -11 ffff88012e32b000   (null)        0 ffffffffa03beb10 nfsv3 WRITE a:call_reserveresult q:xprt_backlog
> > > [ 7670.026871] 57800 0001    -11 ffff88012e32b000   (null)        0 ffffffffa03beb10 nfsv3 WRITE a:call_reserveresult q:xprt_backlog
> > > [ 7670.026890] 57801 0001    -11 ffff88012e32b000   (null)        0 ffffffffa03beb10 nfsv3 WRITE a:call_reserveresult q:xprt_backlog
> > > [ 7670.026909] 57802 0001    -11 ffff88012e32b000   (null)        0 ffffffffa03beb10 nfsv3 WRITE a:call_reserveresult q:xprt_backlog
> > > [ 7680.520042] RPC:       worker connecting xprt ffff88013e62d800 via tcp to 10.1.6.102 (port 2049)
> > > [ 7680.520066] RPC:       ffff88013e62d800 connect status 99 connected 0 sock state 7
> > > [ 7680.520074] RPC: 33550 __rpc_wake_up_task (now 4296812426)
> > > [ 7680.520079] RPC: 33550 disabling timer
> > > [ 7680.520084] RPC: 33550 removed from queue ffff88013e62db20 "xprt_pending"
> > > [ 7680.520089] RPC:       __rpc_wake_up_task done
> > > [ 7680.520094] RPC: 33550 __rpc_execute flags=0x1
> > > [ 7680.520098] RPC: 33550 xprt_connect_status: retrying
> > > [ 7680.520103] RPC: 33550 call_connect_status (status -11)
> > > [ 7680.520108] RPC: 33550 call_transmit (status 0)
> > > [ 7680.520112] RPC: 33550 xprt_prepare_transmit
> > > [ 7680.520118] RPC: 33550 rpc_xdr_encode (status 0)
> > > [ 7680.520123] RPC: 33550 marshaling UNIX cred ffff88012e002300
> > > [ 7680.520130] RPC: 33550 using AUTH_UNIX cred ffff88012e002300 to wrap rpc data
> > > [ 7680.520136] RPC: 33550 xprt_transmit(32920)
> > > [ 7680.520145] RPC:       xs_tcp_send_request(32920) = -32
> > > [ 7680.520151] RPC:       xs_tcp_state_change client ffff88013e62d800...
> > > [ 7680.520156] RPC:       state 7 conn 0 dead 0 zapped 1
> 
> > I changed that debug to output sk_shutdown too. That has a value of 2 
> > (IE SEND_SHUTDOWN). Looking at tcp_sendmsg(), I see this:
> 
> >          err = -EPIPE;
> >          if (sk->sk_err || (sk->sk_shutdown & SEND_SHUTDOWN))
> >                  goto out_err;
> 
> > which correlates with the trace "xs_tcp_send_request(32920) = -32". So, 
> > this looks like a problem in the sockets/tcp layer. The rpc layer issues 
> > a shutdown and then reconnects using the same socket. So either 
> > sk_shutdown needs zeroing once the shutdown completes or should be 
> > zeroed on subsequent connect. The latter sounds safer.
> 
> This patch for 2.6.34.1 fixes the issue:
> 
> --- /home/company/software/src/linux-2.6.34.1/net/ipv4/tcp_output.c     2010-07-27 08:46:46.917000000 +0100
> +++ net/ipv4/tcp_output.c       2010-07-27 09:19:16.000000000 +0100
> @@ -2522,6 +2522,13 @@
>         struct tcp_sock *tp = tcp_sk(sk);
>         __u8 rcv_wscale;
>  
> +       /* clear down any previous shutdown attempts so that
> +        * reconnects on a socket that's been shutdown leave the
> +        * socket in a usable state (otherwise tcp_sendmsg() returns
> +        * -EPIPE).
> +        */
> +       sk->sk_shutdown = 0;
> +
>         /* We'll fix this up when we get a response from the other end.
>          * See tcp_input.c:tcp_rcv_state_process case TCP_SYN_SENT.
>          */
> 
> As I mentioned in my first message, we first saw this issue in 2.6.32 as supplied by debian (linux-image-2.6.32-5-amd64 Version: 2.6.32-17). It looks like the same patch would fix the problem there too.
> 

CC netdev

This reminds me a similar problem we had in the past, fixed with commit
1fdf475a (tcp: tcp_disconnect() should clear window_clamp)

But tcp_disconnect() already clears sk->sk_shutdown

If NFS calls tcp_disconnect(), then shutdown(), there is a problem.

Maybe xs_tcp_shutdown() should make some sanity tests ?

Following sequence is legal, and your patch might break it.

fd = socket(...);
shutdown(fd, SHUT_WR);
...
connect(fd, ...);




^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox