Netdev List
* Re: [PATCH] NFS: Fix infinite loop in gss_create_upcall()
From: Bryan Schumaker @ 2011-04-13 20:42 UTC (permalink / raw)
  To: Jiri Slaby
  Cc: Trond Myklebust, Jiri Slaby, linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b,
	mm-commits-u79uwXL29TY76Z2rM5mHXA, ML netdev,
	linux-nfs-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <4DA49F7F.8060005-AlSwsSmVLrQ@public.gmane.org>

On 04/12/2011 02:52 PM, Jiri Slaby wrote:
> On 04/12/2011 08:43 PM, Bryan Schumaker wrote:
>> On 04/12/2011 02:34 PM, Jiri Slaby wrote:
>>> On 04/12/2011 08:31 PM, Trond Myklebust wrote:
>>>>> Yes, it fixes the problem. But it waits 15s before it times out. This is
>>>>> unacceptable for automounted NFS dirs.
>>>>
>>>> I'm still confused as to why you are hitting it at all. In the normal
>>>> autonegotiation case, the client should be trying to use AUTH_SYS first
>>>> and then trying rpcsec_gss if and only if that fails.
>>>>
>>>> Are you really exporting a filesystem using AUTH_NULL as the only
>>>> supported flavour?
>>>
>>> I don't know, I connect to a nfs server which is not maintained by me.
>>> It looks like that. How can I find out?
>>
>> If you're not using gss for anything, you could try rmmod-ing rpcsec_gss_krb5 (and other rpcsec_gss_* modules).
> 
> I don't have NFS in modules. It's all built-in. And this one is
> unconditionally selected because of CONFIG_NFS_V4.

Does this patch help?

- Bryan

We should attempt an AUTH_NULL style mount before
trying gss flavors.  This should prevent a hang if
gss modules are loaded but the userspace program
isn't running.

diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 9bf41ea..4e3c16b 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -2218,8 +2218,8 @@ static int nfs4_proc_get_root(struct nfs_server *server, struct nfs_fh *fhandle,
 	rpc_authflavor_t flav_array[NFS_MAX_SECFLAVORS + 2];
 
 	flav_array[0] = RPC_AUTH_UNIX;
-	len = gss_mech_list_pseudoflavors(&flav_array[1]);
-	flav_array[1+len] = RPC_AUTH_NULL;
+	flav_array[1] = RPC_AUTH_NULL;
+	len = gss_mech_list_pseudoflavors(&flav_array[2]);
 	len += 2;
 
 	for (i = 0; i < len; i++) {
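
For readers following along outside the kernel tree, the ordering the patch
aims for can be sketched in plain C.  This is only an illustration under
stated assumptions: list_gss_pseudoflavors() is a hypothetical stand-in for
gss_mech_list_pseudoflavors(), MAX_SECFLAVORS is an illustrative bound, and
the three krb5 pseudoflavor numbers are the RFC 2623 values rather than
anything taken from this thread.

```c
#include <assert.h>
#include <stddef.h>

/* RPC authentication flavor numbers (AUTH_NONE = 0, AUTH_SYS = 1). */
enum { RPC_AUTH_NULL = 0, RPC_AUTH_UNIX = 1 };

#define MAX_SECFLAVORS 12	/* illustrative bound, not the kernel's value */

/*
 * Hypothetical stand-in for gss_mech_list_pseudoflavors(): writes the
 * pseudoflavors of the "loaded" GSS mechanisms and returns their count.
 * The three krb5 values are the ones registered in RFC 2623.
 */
static int list_gss_pseudoflavors(unsigned int *out)
{
	static const unsigned int krb5[] = { 390003, 390004, 390005 };
	size_t i;

	for (i = 0; i < sizeof(krb5) / sizeof(krb5[0]); i++)
		out[i] = krb5[i];
	return (int)i;
}

/*
 * Fill flav[] in the order the patch proposes: AUTH_UNIX first,
 * AUTH_NULL second, then every GSS pseudoflavor.  Returns the number
 * of entries written.
 */
static int build_flavor_order(unsigned int *flav)
{
	assert(flav != NULL);

	flav[0] = RPC_AUTH_UNIX;
	flav[1] = RPC_AUTH_NULL;
	return list_gss_pseudoflavors(&flav[2]) + 2;
}
```

With this ordering, a caller that walks flav[] from index 0 tries both
non-GSS flavors before it reaches anything that needs the userspace GSS
daemon.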


> 
> regards,

--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* [PATCH net-next 1/5] tg3: Provide full regdump on tx timeout
From: Matt Carlson @ 2011-04-13 21:05 UTC (permalink / raw)
  To: davem; +Cc: netdev, mcarlson, Michael Chan

The current amount of information provided in the output of a tx timeout
is insufficient to determine a root cause.  This patch replaces the
terse, four-register status output with a more complete body of
information.  For PCIe devices, the full register space is dumped.  For
other devices, select registers are dumped instead.

Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Reviewed-by: Benjamin Li <benli@broadcom.com>
---
 drivers/net/tg3.c |  189 ++++++++++++++++++++++++++++++++++-------------------
 drivers/net/tg3.h |    2 +
 2 files changed, 123 insertions(+), 68 deletions(-)

diff --git a/drivers/net/tg3.c b/drivers/net/tg3.c
index 9d7defc..7274435 100644
--- a/drivers/net/tg3.c
+++ b/drivers/net/tg3.c
@@ -4459,6 +4459,123 @@ static inline int tg3_irq_sync(struct tg3 *tp)
 	return tp->irq_sync;
 }
 
+static inline void tg3_rd32_loop(struct tg3 *tp, u32 *dst, u32 off, u32 len)
+{
+	int i;
+
+	dst = (u32 *)((u8 *)dst + off);
+	for (i = 0; i < len; i += sizeof(u32))
+		*dst++ = tr32(off + i);
+}
+
+static void tg3_dump_legacy_regs(struct tg3 *tp, u32 *regs)
+{
+	tg3_rd32_loop(tp, regs, TG3PCI_VENDOR, 0xb0);
+	tg3_rd32_loop(tp, regs, MAILBOX_INTERRUPT_0, 0x200);
+	tg3_rd32_loop(tp, regs, MAC_MODE, 0x4f0);
+	tg3_rd32_loop(tp, regs, SNDDATAI_MODE, 0xe0);
+	tg3_rd32_loop(tp, regs, SNDDATAC_MODE, 0x04);
+	tg3_rd32_loop(tp, regs, SNDBDS_MODE, 0x80);
+	tg3_rd32_loop(tp, regs, SNDBDI_MODE, 0x48);
+	tg3_rd32_loop(tp, regs, SNDBDC_MODE, 0x04);
+	tg3_rd32_loop(tp, regs, RCVLPC_MODE, 0x20);
+	tg3_rd32_loop(tp, regs, RCVLPC_SELLST_BASE, 0x15c);
+	tg3_rd32_loop(tp, regs, RCVDBDI_MODE, 0x0c);
+	tg3_rd32_loop(tp, regs, RCVDBDI_JUMBO_BD, 0x3c);
+	tg3_rd32_loop(tp, regs, RCVDBDI_BD_PROD_IDX_0, 0x44);
+	tg3_rd32_loop(tp, regs, RCVDCC_MODE, 0x04);
+	tg3_rd32_loop(tp, regs, RCVBDI_MODE, 0x20);
+	tg3_rd32_loop(tp, regs, RCVCC_MODE, 0x14);
+	tg3_rd32_loop(tp, regs, RCVLSC_MODE, 0x08);
+	tg3_rd32_loop(tp, regs, MBFREE_MODE, 0x08);
+	tg3_rd32_loop(tp, regs, HOSTCC_MODE, 0x100);
+
+	if (tp->tg3_flags & TG3_FLAG_SUPPORT_MSIX)
+		tg3_rd32_loop(tp, regs, HOSTCC_RXCOL_TICKS_VEC1, 0x180);
+
+	tg3_rd32_loop(tp, regs, MEMARB_MODE, 0x10);
+	tg3_rd32_loop(tp, regs, BUFMGR_MODE, 0x58);
+	tg3_rd32_loop(tp, regs, RDMAC_MODE, 0x08);
+	tg3_rd32_loop(tp, regs, WDMAC_MODE, 0x08);
+	tg3_rd32_loop(tp, regs, RX_CPU_MODE, 0x04);
+	tg3_rd32_loop(tp, regs, RX_CPU_STATE, 0x04);
+	tg3_rd32_loop(tp, regs, RX_CPU_PGMCTR, 0x04);
+	tg3_rd32_loop(tp, regs, RX_CPU_HWBKPT, 0x04);
+
+	if (!(tp->tg3_flags2 & TG3_FLG2_5705_PLUS)) {
+		tg3_rd32_loop(tp, regs, TX_CPU_MODE, 0x04);
+		tg3_rd32_loop(tp, regs, TX_CPU_STATE, 0x04);
+		tg3_rd32_loop(tp, regs, TX_CPU_PGMCTR, 0x04);
+	}
+
+	tg3_rd32_loop(tp, regs, GRCMBOX_INTERRUPT_0, 0x110);
+	tg3_rd32_loop(tp, regs, FTQ_RESET, 0x120);
+	tg3_rd32_loop(tp, regs, MSGINT_MODE, 0x0c);
+	tg3_rd32_loop(tp, regs, DMAC_MODE, 0x04);
+	tg3_rd32_loop(tp, regs, GRC_MODE, 0x4c);
+
+	if (tp->tg3_flags & TG3_FLAG_NVRAM)
+		tg3_rd32_loop(tp, regs, NVRAM_CMD, 0x24);
+}
+
+static void tg3_dump_state(struct tg3 *tp)
+{
+	int i;
+	u32 *regs;
+
+	regs = kzalloc(TG3_REG_BLK_SIZE, GFP_ATOMIC);
+	if (!regs) {
+		netdev_err(tp->dev, "Failed allocating register dump buffer\n");
+		return;
+	}
+
+	if (tp->tg3_flags2 & TG3_FLG2_PCI_EXPRESS) {
+		/* Read up to but not including private PCI registers */
+		for (i = 0; i < TG3_PCIE_TLDLPL_PORT; i += sizeof(u32))
+			regs[i / sizeof(u32)] = tr32(i);
+	} else
+		tg3_dump_legacy_regs(tp, regs);
+
+	for (i = 0; i < TG3_REG_BLK_SIZE / sizeof(u32); i += 4) {
+		if (!regs[i + 0] && !regs[i + 1] &&
+		    !regs[i + 2] && !regs[i + 3])
+			continue;
+
+		netdev_err(tp->dev, "0x%08x: 0x%08x, 0x%08x, 0x%08x, 0x%08x\n",
+			   i * 4,
+			   regs[i + 0], regs[i + 1], regs[i + 2], regs[i + 3]);
+	}
+
+	kfree(regs);
+
+	for (i = 0; i < tp->irq_cnt; i++) {
+		struct tg3_napi *tnapi = &tp->napi[i];
+
+		/* SW status block */
+		netdev_err(tp->dev,
+			 "%d: Host status block [%08x:%08x:(%04x:%04x:%04x):(%04x:%04x)]\n",
+			   i,
+			   tnapi->hw_status->status,
+			   tnapi->hw_status->status_tag,
+			   tnapi->hw_status->rx_jumbo_consumer,
+			   tnapi->hw_status->rx_consumer,
+			   tnapi->hw_status->rx_mini_consumer,
+			   tnapi->hw_status->idx[0].rx_producer,
+			   tnapi->hw_status->idx[0].tx_consumer);
+
+		netdev_err(tp->dev,
+		"%d: NAPI info [%08x:%08x:(%04x:%04x:%04x):%04x:(%04x:%04x:%04x:%04x)]\n",
+			   i,
+			   tnapi->last_tag, tnapi->last_irq_tag,
+			   tnapi->tx_prod, tnapi->tx_cons, tnapi->tx_pending,
+			   tnapi->rx_rcb_ptr,
+			   tnapi->prodring.rx_std_prod_idx,
+			   tnapi->prodring.rx_std_cons_idx,
+			   tnapi->prodring.rx_jmb_prod_idx,
+			   tnapi->prodring.rx_jmb_cons_idx);
+	}
+}
+
 /* This is called whenever we suspect that the system chipset is re-
  * ordering the sequence of MMIO to the tx send mailbox. The symptom
  * is bogus tx completions. We try to recover by setting the
@@ -5516,21 +5633,13 @@ out:
 		tg3_phy_start(tp);
 }
 
-static void tg3_dump_short_state(struct tg3 *tp)
-{
-	netdev_err(tp->dev, "DEBUG: MAC_TX_STATUS[%08x] MAC_RX_STATUS[%08x]\n",
-		   tr32(MAC_TX_STATUS), tr32(MAC_RX_STATUS));
-	netdev_err(tp->dev, "DEBUG: RDMAC_STATUS[%08x] WDMAC_STATUS[%08x]\n",
-		   tr32(RDMAC_STATUS), tr32(WDMAC_STATUS));
-}
-
 static void tg3_tx_timeout(struct net_device *dev)
 {
 	struct tg3 *tp = netdev_priv(dev);
 
 	if (netif_msg_tx_err(tp)) {
 		netdev_err(dev, "transmit timed out, resetting\n");
-		tg3_dump_short_state(tp);
+		tg3_dump_state(tp);
 	}
 
 	schedule_work(&tp->reset_task);
@@ -9624,82 +9733,26 @@ static void tg3_set_rx_mode(struct net_device *dev)
 	tg3_full_unlock(tp);
 }
 
-#define TG3_REGDUMP_LEN		(32 * 1024)
-
 static int tg3_get_regs_len(struct net_device *dev)
 {
-	return TG3_REGDUMP_LEN;
+	return TG3_REG_BLK_SIZE;
 }
 
 static void tg3_get_regs(struct net_device *dev,
 		struct ethtool_regs *regs, void *_p)
 {
-	u32 *p = _p;
 	struct tg3 *tp = netdev_priv(dev);
-	u8 *orig_p = _p;
-	int i;
 
 	regs->version = 0;
 
-	memset(p, 0, TG3_REGDUMP_LEN);
+	memset(_p, 0, TG3_REG_BLK_SIZE);
 
 	if (tp->phy_flags & TG3_PHYFLG_IS_LOW_POWER)
 		return;
 
 	tg3_full_lock(tp, 0);
 
-#define __GET_REG32(reg)	(*(p)++ = tr32(reg))
-#define GET_REG32_LOOP(base, len)		\
-do {	p = (u32 *)(orig_p + (base));		\
-	for (i = 0; i < len; i += 4)		\
-		__GET_REG32((base) + i);	\
-} while (0)
-#define GET_REG32_1(reg)			\
-do {	p = (u32 *)(orig_p + (reg));		\
-	__GET_REG32((reg));			\
-} while (0)
-
-	GET_REG32_LOOP(TG3PCI_VENDOR, 0xb0);
-	GET_REG32_LOOP(MAILBOX_INTERRUPT_0, 0x200);
-	GET_REG32_LOOP(MAC_MODE, 0x4f0);
-	GET_REG32_LOOP(SNDDATAI_MODE, 0xe0);
-	GET_REG32_1(SNDDATAC_MODE);
-	GET_REG32_LOOP(SNDBDS_MODE, 0x80);
-	GET_REG32_LOOP(SNDBDI_MODE, 0x48);
-	GET_REG32_1(SNDBDC_MODE);
-	GET_REG32_LOOP(RCVLPC_MODE, 0x20);
-	GET_REG32_LOOP(RCVLPC_SELLST_BASE, 0x15c);
-	GET_REG32_LOOP(RCVDBDI_MODE, 0x0c);
-	GET_REG32_LOOP(RCVDBDI_JUMBO_BD, 0x3c);
-	GET_REG32_LOOP(RCVDBDI_BD_PROD_IDX_0, 0x44);
-	GET_REG32_1(RCVDCC_MODE);
-	GET_REG32_LOOP(RCVBDI_MODE, 0x20);
-	GET_REG32_LOOP(RCVCC_MODE, 0x14);
-	GET_REG32_LOOP(RCVLSC_MODE, 0x08);
-	GET_REG32_1(MBFREE_MODE);
-	GET_REG32_LOOP(HOSTCC_MODE, 0x100);
-	GET_REG32_LOOP(MEMARB_MODE, 0x10);
-	GET_REG32_LOOP(BUFMGR_MODE, 0x58);
-	GET_REG32_LOOP(RDMAC_MODE, 0x08);
-	GET_REG32_LOOP(WDMAC_MODE, 0x08);
-	GET_REG32_1(RX_CPU_MODE);
-	GET_REG32_1(RX_CPU_STATE);
-	GET_REG32_1(RX_CPU_PGMCTR);
-	GET_REG32_1(RX_CPU_HWBKPT);
-	GET_REG32_1(TX_CPU_MODE);
-	GET_REG32_1(TX_CPU_STATE);
-	GET_REG32_1(TX_CPU_PGMCTR);
-	GET_REG32_LOOP(GRCMBOX_INTERRUPT_0, 0x110);
-	GET_REG32_LOOP(FTQ_RESET, 0x120);
-	GET_REG32_LOOP(MSGINT_MODE, 0x0c);
-	GET_REG32_1(DMAC_MODE);
-	GET_REG32_LOOP(GRC_MODE, 0x4c);
-	if (tp->tg3_flags & TG3_FLAG_NVRAM)
-		GET_REG32_LOOP(NVRAM_CMD, 0x24);
-
-#undef __GET_REG32
-#undef GET_REG32_LOOP
-#undef GET_REG32_1
+	tg3_dump_legacy_regs(tp, (u32 *)_p);
 
 	tg3_full_unlock(tp);
 }
diff --git a/drivers/net/tg3.h b/drivers/net/tg3.h
index 829a84a..9912010 100644
--- a/drivers/net/tg3.h
+++ b/drivers/net/tg3.h
@@ -1954,6 +1954,8 @@
 #define TG3_PCIE_PL_LO_PHYCTL5		 0x00000014
 #define TG3_PCIE_PL_LO_PHYCTL5_DIS_L2CLKREQ	  0x80000000
 
+#define TG3_REG_BLK_SIZE		0x00008000
+
 /* OTP bit definitions */
 #define TG3_OTP_AGCTGT_MASK		0x000000e0
 #define TG3_OTP_AGCTGT_SHIFT		1
-- 
1.7.3.4
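
The print loop in tg3_dump_state() can be tried in userspace with a small
sketch like the one below.  It assumes nothing about the hardware: the
function name dump_nonzero_rows() and the use of printf() in place of
netdev_err() are illustrative choices, not part of the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Print a register snapshot four 32-bit words per row, skipping rows
 * that are entirely zero.  Returns the number of rows printed.
 */
static size_t dump_nonzero_rows(const uint32_t *regs, size_t nwords)
{
	size_t i, rows = 0;

	/* The row logic reads four words at a time. */
	assert(regs != NULL && nwords % 4 == 0);

	for (i = 0; i < nwords; i += 4) {
		if (!regs[i] && !regs[i + 1] && !regs[i + 2] && !regs[i + 3])
			continue;

		/* i * 4 converts the word index back to a byte offset. */
		printf("0x%08zx: 0x%08x, 0x%08x, 0x%08x, 0x%08x\n",
		       i * 4,
		       (unsigned int)regs[i], (unsigned int)regs[i + 1],
		       (unsigned int)regs[i + 2], (unsigned int)regs[i + 3]);
		rows++;
	}

	return rows;
}
```

Because the buffer is zero-filled before registers are copied in at their
own offsets, ranges that were never read show up as all-zero rows and are
dropped from the output.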




* [PATCH net-next 0/5] tg3: Add more selftest and debug support
From: Matt Carlson @ 2011-04-13 21:05 UTC (permalink / raw)
  To: davem; +Cc: netdev, mcarlson

This patchset adds register dump capabilities for first failure debugging,
a jumbo frame loopback test mode, and extended VPD block handling.




* [PATCH net-next 4/5] tg3: Add jumbo frame loopback tests to selftest
From: Matt Carlson @ 2011-04-13 21:05 UTC (permalink / raw)
  To: davem; +Cc: netdev, mcarlson

This patch adds jumbo frame loopback test support to the ethtool
selftest.

Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Reviewed-by: Michael Chan <mchan@broadcom.com>
Reviewed-by: Benjamin Li <benli@broadcom.com>
---
 drivers/net/tg3.c |   34 +++++++++++++++++++++++++---------
 1 files changed, 25 insertions(+), 9 deletions(-)

diff --git a/drivers/net/tg3.c b/drivers/net/tg3.c
index 9975cdb..52dd516 100644
--- a/drivers/net/tg3.c
+++ b/drivers/net/tg3.c
@@ -10935,7 +10935,7 @@ static int tg3_test_memory(struct tg3 *tp)
 #define TG3_MAC_LOOPBACK	0
 #define TG3_PHY_LOOPBACK	1
 
-static int tg3_run_loopback(struct tg3 *tp, int loopback_mode)
+static int tg3_run_loopback(struct tg3 *tp, u32 pktsz, int loopback_mode)
 {
 	u32 mac_mode, rx_start_idx, rx_idx, tx_idx, opaque_key;
 	u32 desc_idx, coal_now;
@@ -11033,7 +11033,7 @@ static int tg3_run_loopback(struct tg3 *tp, int loopback_mode)
 
 	err = -EIO;
 
-	tx_len = 1514;
+	tx_len = pktsz;
 	skb = netdev_alloc_skb(tp->dev, tx_len);
 	if (!skb)
 		return -ENOMEM;
@@ -11042,7 +11042,7 @@ static int tg3_run_loopback(struct tg3 *tp, int loopback_mode)
 	memcpy(tx_data, tp->dev->dev_addr, 6);
 	memset(tx_data + 6, 0x0, 8);
 
-	tw32(MAC_RX_MTU_SIZE, tx_len + 4);
+	tw32(MAC_RX_MTU_SIZE, tx_len + ETH_FCS_LEN);
 
 	for (i = 14; i < tx_len; i++)
 		tx_data[i] = (u8) (i & 0xff);
@@ -11098,8 +11098,6 @@ static int tg3_run_loopback(struct tg3 *tp, int loopback_mode)
 	desc = &rnapi->rx_rcb[rx_start_idx];
 	desc_idx = desc->opaque & RXD_OPAQUE_INDEX_MASK;
 	opaque_key = desc->opaque & RXD_OPAQUE_RING_MASK;
-	if (opaque_key != RXD_OPAQUE_RING_STD)
-		goto out;
 
 	if ((desc->err_vlan & RXD_ERR_MASK) != 0 &&
 	    (desc->err_vlan != RXD_ERR_ODD_NIBBLE_RCVD_MII))
@@ -11109,9 +11107,20 @@ static int tg3_run_loopback(struct tg3 *tp, int loopback_mode)
 	if (rx_len != tx_len)
 		goto out;
 
-	rx_skb = tpr->rx_std_buffers[desc_idx].skb;
+	if (pktsz <= TG3_RX_STD_DMA_SZ - ETH_FCS_LEN) {
+		if (opaque_key != RXD_OPAQUE_RING_STD)
+			goto out;
+
+		rx_skb = tpr->rx_std_buffers[desc_idx].skb;
+		map = dma_unmap_addr(&tpr->rx_std_buffers[desc_idx], mapping);
+	} else {
+		if (opaque_key != RXD_OPAQUE_RING_JUMBO)
+			goto out;
+
+		rx_skb = tpr->rx_jmb_buffers[desc_idx].skb;
+		map = dma_unmap_addr(&tpr->rx_jmb_buffers[desc_idx], mapping);
+	}
 
-	map = dma_unmap_addr(&tpr->rx_std_buffers[desc_idx], mapping);
 	pci_dma_sync_single_for_cpu(tp->pdev, map, rx_len, PCI_DMA_FROMDEVICE);
 
 	for (i = 14; i < tx_len; i++) {
@@ -11177,9 +11186,13 @@ static int tg3_test_loopback(struct tg3 *tp)
 				  CPMU_CTRL_LINK_AWARE_MODE));
 	}
 
-	if (tg3_run_loopback(tp, TG3_MAC_LOOPBACK))
+	if (tg3_run_loopback(tp, ETH_FRAME_LEN, TG3_MAC_LOOPBACK))
 		err |= TG3_MAC_LOOPBACK_FAILED;
 
+	if ((tp->tg3_flags & TG3_FLAG_JUMBO_RING_ENABLE) &&
+	    tg3_run_loopback(tp, 9000 + ETH_HLEN, TG3_MAC_LOOPBACK))
+		err |= (TG3_MAC_LOOPBACK_FAILED << 2);
+
 	if (tp->tg3_flags & TG3_FLAG_CPMU_PRESENT) {
 		tw32(TG3_CPMU_CTRL, cpmuctrl);
 
@@ -11189,8 +11202,11 @@ static int tg3_test_loopback(struct tg3 *tp)
 
 	if (!(tp->phy_flags & TG3_PHYFLG_PHY_SERDES) &&
 	    !(tp->tg3_flags3 & TG3_FLG3_USE_PHYLIB)) {
-		if (tg3_run_loopback(tp, TG3_PHY_LOOPBACK))
+		if (tg3_run_loopback(tp, ETH_FRAME_LEN, TG3_PHY_LOOPBACK))
 			err |= TG3_PHY_LOOPBACK_FAILED;
+		if ((tp->tg3_flags & TG3_FLAG_JUMBO_RING_ENABLE) &&
+		    tg3_run_loopback(tp, 9000 + ETH_HLEN, TG3_PHY_LOOPBACK))
+			err |= (TG3_PHY_LOOPBACK_FAILED << 2);
 	}
 
 	/* Re-enable gphy autopowerdown. */
-- 
1.7.3.4
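
The payload check that the loopback test relies on is easy to reproduce
outside the driver.  The sketch below is an assumption-laden illustration:
fill_loopback_frame() and check_loopback_frame() are hypothetical names, and
only the frame layout (destination MAC, eight zero bytes, then a counting
byte pattern from offset 14) is taken from the code in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define HDR_LEN 14	/* Ethernet header: two 6-byte MACs plus 2-byte type */

/* Build the test frame: MAC, 8 zero bytes, then a counting pattern. */
static void fill_loopback_frame(uint8_t *buf, size_t len,
				const uint8_t mac[6])
{
	size_t i;

	assert(buf != NULL && len >= HDR_LEN);

	memcpy(buf, mac, 6);
	memset(buf + 6, 0, 8);
	for (i = HDR_LEN; i < len; i++)
		buf[i] = (uint8_t)(i & 0xff);
}

/* Return 1 if the payload of a received frame matches, otherwise 0. */
static int check_loopback_frame(const uint8_t *buf, size_t len)
{
	size_t i;

	for (i = HDR_LEN; i < len; i++)
		if (buf[i] != (uint8_t)(i & 0xff))
			return 0;
	return 1;
}
```

The same two helpers work for the standard 1514-byte frame and for the
9000 + 14 byte jumbo frame, since the pattern depends only on the byte
offset.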




* [PATCH net-next 5/5] tg3: Add support for extended VPD blocks
From: Matt Carlson @ 2011-04-13 21:05 UTC (permalink / raw)
  To: davem; +Cc: netdev, mcarlson

In some devices, the VPD block is relocated to a different area in
NVRAM.  The original location can still contain old, but still valid VPD
data.  This patch changes the code to look for an extended VPD block in
NVRAM.  If one is found, that block is used for all VPD operations
instead.

Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Reviewed-by: Michael Chan <mchan@broadcom.com>
Reviewed-by: Benjamin Li <benli@broadcom.com>
---
 drivers/net/tg3.c |  125 ++++++++++++++++++++++++++++++++++-------------------
 drivers/net/tg3.h |    2 +
 2 files changed, 83 insertions(+), 44 deletions(-)

diff --git a/drivers/net/tg3.c b/drivers/net/tg3.c
index 52dd516..10fa476 100644
--- a/drivers/net/tg3.c
+++ b/drivers/net/tg3.c
@@ -10416,6 +10416,81 @@ static void tg3_get_ethtool_stats(struct net_device *dev,
 	memcpy(tmp_stats, tg3_get_estats(tp), sizeof(tp->estats));
 }
 
+static __be32 * tg3_vpd_readblock(struct tg3 *tp)
+{
+	int i;
+	__be32 *buf;
+	u32 offset = 0, len = 0;
+	u32 magic, val;
+
+	if ((tp->tg3_flags3 & TG3_FLG3_NO_NVRAM) ||
+	    tg3_nvram_read(tp, 0, &magic))
+		return NULL;
+
+	if (magic == TG3_EEPROM_MAGIC) {
+		for (offset = TG3_NVM_DIR_START;
+		     offset < TG3_NVM_DIR_END;
+		     offset += TG3_NVM_DIRENT_SIZE) {
+			if (tg3_nvram_read(tp, offset, &val))
+				return NULL;
+
+			if ((val >> TG3_NVM_DIRTYPE_SHIFT) ==
+			    TG3_NVM_DIRTYPE_EXTVPD)
+				break;
+		}
+
+		if (offset != TG3_NVM_DIR_END) {
+			len = (val & TG3_NVM_DIRTYPE_LENMSK) * 4;
+			if (tg3_nvram_read(tp, offset + 4, &offset))
+				return NULL;
+
+			offset = tg3_nvram_logical_addr(tp, offset);
+		}
+	}
+
+	if (!offset || !len) {
+		offset = TG3_NVM_VPD_OFF;
+		len = TG3_NVM_VPD_LEN;
+	}
+
+	buf = kmalloc(len, GFP_KERNEL);
+	if (buf == NULL)
+		return NULL;
+
+	if (magic == TG3_EEPROM_MAGIC) {
+		for (i = 0; i < len; i += 4) {
+			/* The data is in little-endian format in NVRAM.
+			 * Use the big-endian read routines to preserve
+			 * the byte order as it exists in NVRAM.
+			 */
+			if (tg3_nvram_read_be32(tp, offset + i, &buf[i/4]))
+				goto error;
+		}
+	} else {
+		u8 *ptr;
+		ssize_t cnt;
+		unsigned int pos = 0;
+
+		ptr = (u8 *)&buf[0];
+		for (i = 0; pos < len && i < 3; i++, pos += cnt, ptr += cnt) {
+			cnt = pci_read_vpd(tp->pdev, pos,
+					   len - pos, ptr);
+			if (cnt == -ETIMEDOUT || cnt == -EINTR)
+				cnt = 0;
+			else if (cnt < 0)
+				goto error;
+		}
+		if (pos != len)
+			goto error;
+	}
+
+	return buf;
+
+error:
+	kfree(buf);
+	return NULL;
+}
+
 #define NVRAM_TEST_SIZE 0x100
 #define NVRAM_SELFBOOT_FORMAT1_0_SIZE	0x14
 #define NVRAM_SELFBOOT_FORMAT1_2_SIZE	0x18
@@ -10555,14 +10630,11 @@ static int tg3_test_nvram(struct tg3 *tp)
 	if (csum != le32_to_cpu(buf[0xfc/4]))
 		goto out;
 
-	for (i = 0; i < TG3_NVM_VPD_LEN; i += 4) {
-		/* The data is in little-endian format in NVRAM.
-		 * Use the big-endian read routines to preserve
-		 * the byte order as it exists in NVRAM.
-		 */
-		if (tg3_nvram_read_be32(tp, TG3_NVM_VPD_OFF + i, &buf[i/4]))
-			goto out;
-	}
+	kfree(buf);
+
+	buf = tg3_vpd_readblock(tp);
+	if (!buf)
+		return -ENOMEM;
 
 	i = pci_vpd_find_tag((u8 *)buf, 0, TG3_NVM_VPD_LEN,
 			     PCI_VPD_LRDT_RO_DATA);
@@ -12905,46 +12977,11 @@ static void __devinit tg3_read_vpd(struct tg3 *tp)
 	u8 *vpd_data;
 	unsigned int block_end, rosize, len;
 	int j, i = 0;
-	u32 magic;
-
-	if ((tp->tg3_flags3 & TG3_FLG3_NO_NVRAM) ||
-	    tg3_nvram_read(tp, 0x0, &magic))
-		goto out_no_vpd;
 
-	vpd_data = kmalloc(TG3_NVM_VPD_LEN, GFP_KERNEL);
+	vpd_data = (u8 *)tg3_vpd_readblock(tp);
 	if (!vpd_data)
 		goto out_no_vpd;
 
-	if (magic == TG3_EEPROM_MAGIC) {
-		for (i = 0; i < TG3_NVM_VPD_LEN; i += 4) {
-			u32 tmp;
-
-			/* The data is in little-endian format in NVRAM.
-			 * Use the big-endian read routines to preserve
-			 * the byte order as it exists in NVRAM.
-			 */
-			if (tg3_nvram_read_be32(tp, TG3_NVM_VPD_OFF + i, &tmp))
-				goto out_not_found;
-
-			memcpy(&vpd_data[i], &tmp, sizeof(tmp));
-		}
-	} else {
-		ssize_t cnt;
-		unsigned int pos = 0;
-
-		for (; pos < TG3_NVM_VPD_LEN && i < 3; i++, pos += cnt) {
-			cnt = pci_read_vpd(tp->pdev, pos,
-					   TG3_NVM_VPD_LEN - pos,
-					   &vpd_data[pos]);
-			if (cnt == -ETIMEDOUT || cnt == -EINTR)
-				cnt = 0;
-			else if (cnt < 0)
-				goto out_not_found;
-		}
-		if (pos != TG3_NVM_VPD_LEN)
-			goto out_not_found;
-	}
-
 	i = pci_vpd_find_tag(vpd_data, 0, TG3_NVM_VPD_LEN,
 			     PCI_VPD_LRDT_RO_DATA);
 	if (i < 0)
diff --git a/drivers/net/tg3.h b/drivers/net/tg3.h
index b3ccfcc..224c3e0 100644
--- a/drivers/net/tg3.h
+++ b/drivers/net/tg3.h
@@ -2009,7 +2009,9 @@
 #define TG3_NVM_DIR_END			0x78
 #define TG3_NVM_DIRENT_SIZE		0xc
 #define TG3_NVM_DIRTYPE_SHIFT		24
+#define TG3_NVM_DIRTYPE_LENMSK		0x003fffff
 #define TG3_NVM_DIRTYPE_ASFINI		1
+#define TG3_NVM_DIRTYPE_EXTVPD		20
 #define TG3_NVM_PTREV_BCVER		0x94
 #define TG3_NVM_BCVER_MAJMSK		0x0000ff00
 #define TG3_NVM_BCVER_MAJSFT		8
-- 
1.7.3.4
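
As a rough illustration of the directory scan in tg3_vpd_readblock(), the
sketch below searches an in-memory copy of the NVRAM directory.  The shift,
length mask and type value come from the patch; the table layout (three
32-bit words per 12-byte entry, location in the second word) is inferred
from the reads at offset and offset + 4, and find_extvpd() is a hypothetical
name.  The logical-address translation done by the driver is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DIRTYPE_SHIFT	24
#define DIRTYPE_LENMSK	0x003fffffu
#define DIRTYPE_EXTVPD	20
#define DIRENT_WORDS	3	/* 0xc bytes per directory entry */

/*
 * Scan nents directory entries for an extended VPD block.
 * On success store its location and byte length and return 1;
 * return 0 if no such entry exists.
 */
static int find_extvpd(const uint32_t *dir, size_t nents,
		       uint32_t *off, uint32_t *len)
{
	size_t i;

	assert(dir != NULL && off != NULL && len != NULL);

	for (i = 0; i < nents; i++) {
		uint32_t val = dir[i * DIRENT_WORDS];

		if ((val >> DIRTYPE_SHIFT) != DIRTYPE_EXTVPD)
			continue;

		/* The length field counts 32-bit words. */
		*len = (val & DIRTYPE_LENMSK) * 4;
		*off = dir[i * DIRENT_WORDS + 1];
		return 1;
	}

	return 0;
}
```

A caller would fall back to the fixed VPD offset and length when the
function returns 0, or when it reports a zero offset or length, which
mirrors the fallback in the patch.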




* [PATCH net-next 2/5] tg3: Dump registers when status block shows errors
From: Matt Carlson @ 2011-04-13 21:05 UTC (permalink / raw)
  To: davem; +Cc: netdev, mcarlson, Michael Chan

This patch monitors the error bit of the status word within the status
block.  If it is set, the driver will dump the driver state after
validating the error and then reset the chip.

Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Reviewed-by: Benjamin Li <benli@broadcom.com>
---
 drivers/net/tg3.c |   40 +++++++++++++++++++++++++++++++++++++++-
 drivers/net/tg3.h |    3 +++
 2 files changed, 42 insertions(+), 1 deletions(-)

diff --git a/drivers/net/tg3.c b/drivers/net/tg3.c
index 7274435..b61b52f 100644
--- a/drivers/net/tg3.c
+++ b/drivers/net/tg3.c
@@ -5259,6 +5259,40 @@ tx_recovery:
 	return work_done;
 }
 
+static void tg3_process_error(struct tg3 *tp)
+{
+	u32 val;
+	bool real_error = false;
+
+	if (tp->tg3_flags & TG3_FLAG_ERROR_PROCESSED)
+		return;
+
+	/* Check Flow Attention register */
+	val = tr32(HOSTCC_FLOW_ATTN);
+	if (val & ~HOSTCC_FLOW_ATTN_MBUF_LWM) {
+		netdev_err(tp->dev, "FLOW Attention error.  Resetting chip.\n");
+		real_error = true;
+	}
+
+	if (tr32(MSGINT_STATUS) & ~MSGINT_STATUS_MSI_REQ) {
+		netdev_err(tp->dev, "MSI Status error.  Resetting chip.\n");
+		real_error = true;
+	}
+
+	if (tr32(RDMAC_STATUS) || tr32(WDMAC_STATUS)) {
+		netdev_err(tp->dev, "DMA Status error.  Resetting chip.\n");
+		real_error = true;
+	}
+
+	if (!real_error)
+		return;
+
+	tg3_dump_state(tp);
+
+	tp->tg3_flags |= TG3_FLAG_ERROR_PROCESSED;
+	schedule_work(&tp->reset_task);
+}
+
 static int tg3_poll(struct napi_struct *napi, int budget)
 {
 	struct tg3_napi *tnapi = container_of(napi, struct tg3_napi, napi);
@@ -5267,6 +5301,9 @@ static int tg3_poll(struct napi_struct *napi, int budget)
 	struct tg3_hw_status *sblk = tnapi->hw_status;
 
 	while (1) {
+		if (sblk->status & SD_STATUS_ERROR)
+			tg3_process_error(tp);
+
 		tg3_poll_link(tp);
 
 		work_done = tg3_poll_work(tnapi, work_done, budget);
@@ -7316,7 +7353,8 @@ static int tg3_chip_reset(struct tg3 *tp)
 
 	tg3_restore_pci_state(tp);
 
-	tp->tg3_flags &= ~TG3_FLAG_CHIP_RESETTING;
+	tp->tg3_flags &= ~(TG3_FLAG_CHIP_RESETTING |
+			   TG3_FLAG_ERROR_PROCESSED);
 
 	val = 0;
 	if (tp->tg3_flags2 & TG3_FLG2_5780_CLASS)
diff --git a/drivers/net/tg3.h b/drivers/net/tg3.h
index 9912010..b3ccfcc 100644
--- a/drivers/net/tg3.h
+++ b/drivers/net/tg3.h
@@ -1201,6 +1201,7 @@
 #define HOSTCC_STATS_BLK_NIC_ADDR	0x00003c40
 #define HOSTCC_STATUS_BLK_NIC_ADDR	0x00003c44
 #define HOSTCC_FLOW_ATTN		0x00003c48
+#define HOSTCC_FLOW_ATTN_MBUF_LWM	 0x00000040
 /* 0x3c4c --> 0x3c50 unused */
 #define HOSTCC_JUMBO_CON_IDX		0x00003c50
 #define HOSTCC_STD_CON_IDX		0x00003c54
@@ -1611,6 +1612,7 @@
 #define  MSGINT_MODE_ONE_SHOT_DISABLE	 0x00000020
 #define  MSGINT_MODE_MULTIVEC_EN	 0x00000080
 #define MSGINT_STATUS			0x00006004
+#define  MSGINT_STATUS_MSI_REQ		 0x00000001
 #define MSGINT_FIFO			0x00006008
 /* 0x600c --> 0x6400 unused */
 
@@ -2886,6 +2888,7 @@ struct tg3 {
 #define TG3_FLAG_TAGGED_STATUS		0x00000001
 #define TG3_FLAG_TXD_MBOX_HWBUG		0x00000002
 #define TG3_FLAG_USE_LINKCHG_REG	0x00000008
+#define TG3_FLAG_ERROR_PROCESSED	0x00000010
 #define TG3_FLAG_ENABLE_ASF		0x00000020
 #define TG3_FLAG_ASPM_WORKAROUND	0x00000040
 #define TG3_FLAG_POLL_SERDES		0x00000080
-- 
1.7.3.4
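
The validation step in tg3_process_error() boils down to masking out the
bits that are expected during normal operation.  A minimal sketch, assuming
the four register values have already been read: is_real_error() is a
hypothetical helper, while the two mask values are the ones added to tg3.h
above.

```c
#include <stdbool.h>
#include <stdint.h>

#define FLOW_ATTN_MBUF_LWM	0x00000040u	/* benign low-watermark bit */
#define MSGINT_STATUS_MSI_REQ	0x00000001u	/* benign MSI request bit */

/*
 * Decide whether the status-block error bit reflects a real fault.
 * Any flow-attention bit other than MBUF_LWM, any MSI status bit other
 * than MSI_REQ, or any read/write DMA status bit counts as real.
 */
static bool is_real_error(uint32_t flow_attn, uint32_t msi_status,
			  uint32_t rdmac_status, uint32_t wdmac_status)
{
	if (flow_attn & ~FLOW_ATTN_MBUF_LWM)
		return true;
	if (msi_status & ~MSGINT_STATUS_MSI_REQ)
		return true;
	if (rdmac_status || wdmac_status)
		return true;
	return false;
}
```

In the driver, a true result leads to the state dump, sets
TG3_FLAG_ERROR_PROCESSED so the dump is not repeated on the next poll, and
schedules the reset task; the chip reset clears the flag again.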




* [PATCH net-next 3/5] tg3: Automatically size stat/test string arrays
From: Matt Carlson @ 2011-04-13 21:05 UTC (permalink / raw)
  To: davem; +Cc: netdev, mcarlson, Benjamin Li

This patch reimplements the size preprocessor constants of the stats and
ethtool test string arrays.  The size is calculated at compile time
rather than using static constants.

Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Signed-off-by: Benjamin Li <benli@broadcom.com>
Reviewed-by: Michael Chan <mchan@broadcom.com>
---
 drivers/net/tg3.c |   15 ++++++++-------
 1 files changed, 8 insertions(+), 7 deletions(-)

diff --git a/drivers/net/tg3.c b/drivers/net/tg3.c
index b61b52f..9975cdb 100644
--- a/drivers/net/tg3.c
+++ b/drivers/net/tg3.c
@@ -165,11 +165,6 @@
 
 #define TG3_RAW_IP_ALIGN 2
 
-/* number of ETHTOOL_GSTATS u64's */
-#define TG3_NUM_STATS		(sizeof(struct tg3_ethtool_stats)/sizeof(u64))
-
-#define TG3_NUM_TEST		6
-
 #define TG3_FW_UPDATE_TIMEOUT_SEC	5
 
 #define FIRMWARE_TG3		"tigon/tg3.bin"
@@ -279,7 +274,7 @@ MODULE_DEVICE_TABLE(pci, tg3_pci_tbl);
 
 static const struct {
 	const char string[ETH_GSTRING_LEN];
-} ethtool_stats_keys[TG3_NUM_STATS] = {
+} ethtool_stats_keys[] = {
 	{ "rx_octets" },
 	{ "rx_fragments" },
 	{ "rx_ucast_packets" },
@@ -358,9 +353,12 @@ static const struct {
 	{ "nic_tx_threshold_hit" }
 };
 
+#define TG3_NUM_STATS	ARRAY_SIZE(ethtool_stats_keys)
+
+
 static const struct {
 	const char string[ETH_GSTRING_LEN];
-} ethtool_test_keys[TG3_NUM_TEST] = {
+} ethtool_test_keys[] = {
 	{ "nvram test     (online) " },
 	{ "link test      (online) " },
 	{ "register test  (offline)" },
@@ -369,6 +367,9 @@ static const struct {
 	{ "interrupt test (offline)" },
 };
 
+#define TG3_NUM_TEST	ARRAY_SIZE(ethtool_test_keys)
+
+
 static void tg3_write32(struct tg3 *tp, u32 off, u32 val)
 {
 	writel(val, tp->regs + off);
-- 
1.7.3.4
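
The idea behind this patch can be shown with a standalone sketch.  It uses a
simplified ARRAY_SIZE() (the kernel macro additionally rejects non-array
arguments at compile time), and GSTRING_LEN and test_keys are illustrative
names standing in for ETH_GSTRING_LEN and ethtool_test_keys.

```c
#include <stddef.h>

/* Simplified version of the kernel macro, without the array type check. */
#define ARRAY_SIZE(a)	(sizeof(a) / sizeof((a)[0]))

#define GSTRING_LEN	32	/* stand-in for ETH_GSTRING_LEN */

static const struct {
	const char string[GSTRING_LEN];
} test_keys[] = {
	{ "nvram test     (online) " },
	{ "link test      (online) " },
	{ "register test  (offline)" },
};

/* The count follows the initializer list; no separate constant to update. */
#define NUM_TEST	ARRAY_SIZE(test_keys)

/* Small accessor so callers do not need to see the macro. */
static size_t num_tests(void)
{
	return NUM_TEST;
}
```

Adding or removing an entry in the initializer changes NUM_TEST
automatically, which is the maintenance benefit the patch is after.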




* Re: [net-next-2.6 RFC PATCH v2 12/13] sky2: set ethtool set_phys_id on/off cycle frequency to 1/sec
From: Stephen Hemminger @ 2011-04-13 21:00 UTC (permalink / raw)
  To: Bruce Allan; +Cc: netdev
In-Reply-To: <20110413195949.25901.86878.stgit@gitlad.jf.intel.com>

On Wed, 13 Apr 2011 12:59:49 -0700
Bruce Allan <bruce.w.allan@intel.com> wrote:

> Physical identification frequency based on how it was done prior to the
> introduction of set_phys_id.  Compile tested only.
> 
> Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
> Cc: Stephen Hemminger <shemminger@linux-foundation.org>

Acked-by: Stephen Hemminger <shemminger@vyatta.com>

Assume same for skge


* Re: SMSC 8720a/MDIO/PHY help.
From: Michael Riesch @ 2011-04-13 21:19 UTC (permalink / raw)
  To: netdev; +Cc: ANDY KENNEDY
In-Reply-To: <9AC3F0E75060224C8BBC5BA2DDC8853A1FA8E632@EXV1.corp.adtran.com>


> If you have an idea of something for me to try, I'd love to entertain
> it.

I am rather new to PHYLIB, but these are my ideas:

 1) make sure phy_connect is executed (AFAIK called by MDIO bus driver)

 2) maybe you need to call phy_start / phy_stop (AFAIK from the PHY
driver's open / close function)

HTH,
Michael



* Re: [PATCH 1/1] ipv6: ignore looped-back NA while dad is running
From: David Miller @ 2011-04-13 21:30 UTC (permalink / raw)
  To: dwalter; +Cc: netdev, linux-kernel
In-Reply-To: <1302706963.8923.25.camel@localhost>

From: Daniel Walter <dwalter@barracuda.com>
Date: Wed, 13 Apr 2011 17:02:43 +0200

> This message and any attached files are confidential and intended
> solely for the addressee(s). Any publication, transmission or other
> use of the information by a person or entity other than the intended
> addressee is prohibited. If you receive this in error please contact
> the sender and delete the material. The sender does not accept
> liability for any errors or omissions as a result of the
> transmission.

I'm not applying patches that have legal disclaimers like this.

It has no place in a posting made on a public mailing list where open
and unrestricted discussions are essential.


* RE: SMSC 8720a/MDIO/PHY help.
From: ANDY KENNEDY @ 2011-04-13 21:38 UTC (permalink / raw)
  To: michael, netdev
In-Reply-To: <1302729564.2742.28.camel@malcolm>

> -----Original Message-----
> From: Michael Riesch [mailto:michael@riesch.at]
> Sent: Wednesday, April 13, 2011 4:19 PM
> To: netdev@vger.kernel.org
> Cc: ANDY KENNEDY
> Subject: Re: SMSC 8720a/MDIO/PHY help.
> 
> 
> > If you have an idea of something for me to try, I'd love to
> entertain
> > it.
> 
> I am rather new to PHYLIB, but these are my ideas:
> 
>  1) make sure phy_connect is executed (AFAIK called by MDIO bus
> driver)

Going through the phy.txt doc under Documentation/networking ("PHY
Abstraction Layer", updated 2008-04-08), which may be a bit out of date,
I did see what you are talking about.  What I'm hung up on at the moment
is the behavior of adjust_link().  It appears that I only need to start
the queues, though I'm not sure.

> 
>  2) maybe you need to call phy_start / phy_stop (AFAIK from the PHY
> driver's open / close function)

Currently, when I do this I only get the call to adjust_link() over and over again.

> 
> HTH,
> Michael

Thanks for the help!

Andy


* [RFC][PATCH] Zero-copy receive from socket into bio
From: Andreas Gruenbacher @ 2011-04-13 21:39 UTC (permalink / raw)
  To: David S. Miller, netdev; +Cc: linux-kernel

Hello,

I'm currently looking into supporting zero-copy receive in drbd.

The basic idea is this: drbd transmits bios via sockets.  An ideal sender
sends the packet header and data in separate packets, and the network driver
supports RX_COPYBREAK and receives them into separate socket buffers.  The
socket buffers end up aligned properly, and we add them to bios and submit
them, no copying required.

This scenario doesn't seem to be supported by the existing infrastructure, so
does this patch make sense?

Thanks,
Andreas

---

[PATCH] Add a generic zero-copy-receive primitive

This requires a network driver which supports header-data split, i.e.,
receiving small header packets and big data packets into different
buffers so that the data will end up aligned well enough for consumption
by the block layer (search for RX_COPYBREAK in the drivers).

diff --git a/tcp_recvbio.c b/tcp_recvbio.c
new file mode 100644
index 0000000..38342e9
--- /dev/null
+++ b/tcp_recvbio.c
@@ -0,0 +1,185 @@
+#include <linux/kernel.h>
+#include <net/tcp.h>
+#include <linux/bio.h>
+#include <linux/blkdev.h>
+#include <linux/fs.h>
+#include "tcp_recvbio.h"
+
+static int tcp_recvbio_add(struct sk_buff *skb, struct bio *bio,
+			   struct bio_vec *last)
+{
+	struct request_queue *q = bio->bi_bdev->bd_disk->queue;
+	struct sk_buff **frag_list = &skb_shinfo(skb)->frag_list;
+	int ret;
+
+	/*
+	 * Reject fragmented skbs: there should be no need to support them.  We
+	 * use frag_list to keep track of the skbs attached to a bio instead.
+	 */
+	if (*frag_list && skb != (struct sk_buff *)bio->bi_private)
+		return false;
+
+	if (!blk_rq_aligned(q, last->bv_offset, last->bv_len))
+		return false;
+	ret = bio_add_page(bio, last->bv_page, last->bv_len, last->bv_offset);
+
+	if (ret && !*frag_list) {
+		/* Tell the network layer to leave @skb alone.  */
+		skb_get(skb);
+
+		/* Put this skb on the list.  */
+		*frag_list = (struct sk_buff *)bio->bi_private;
+		bio->bi_private = skb;
+	}
+	return ret;
+}
+
+static int tcp_recvbio_data(read_descriptor_t *rd_desc, struct sk_buff *skb,
+			    unsigned int offset, size_t len)
+{
+	struct bio *bio = rd_desc->arg.data;
+	struct request_queue *q = bio->bi_bdev->bd_disk->queue;
+	int start = skb_headlen(skb), consumed = 0, i;
+	struct bio_vec last = { };
+
+	/* Cannot zero-copy from the header.  */
+	if (offset < start)
+		goto give_up;
+
+	/* Give up if the payload is unaligned.  */
+	if (!blk_rq_aligned(q, offset - start, 0))
+		goto give_up;
+
+	/* Do not consume more data than we need.  */
+	if (len > rd_desc->count - rd_desc->written)
+		len = rd_desc->count - rd_desc->written;
+
+	for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) {
+		struct skb_frag_struct *frag = &skb_shinfo(skb)->frags[i];
+		int end, frag_len;
+
+		WARN_ON(start > offset + len);
+
+		end = start + frag->size;
+		frag_len = end - offset;
+		if (frag_len > 0) {
+			bool merged = false;
+			unsigned int page_offset;
+
+			if (frag_len > len)
+				frag_len = len;
+
+			page_offset = frag->page_offset + offset - start;
+			if (last.bv_page == frag->page &&
+			    last.bv_offset + last.bv_len == page_offset) {
+				/* Merge with the previous fragment.  */
+				last.bv_len += frag_len;
+				merged = true;
+			}
+			len -= frag_len;
+			offset += frag_len;
+			if (!len || !merged) {
+				if (last.bv_page) {
+					if (!tcp_recvbio_add(skb, bio, &last))
+						goto give_up;
+					consumed += last.bv_len;
+				}
+				if (!len)
+					goto out;
+				last.bv_page = frag->page;
+				last.bv_offset = page_offset;
+				last.bv_len = frag_len;
+			}
+		}
+		start = end;
+	}
+
+	/*
+	 * We don't care if there are additional blocks in the skb's frag_list
+	 * that are zero-copyable: at worst, we end up copying too many blocks.
+	 * (See skb_copy_bits() for an example of walking the frag_list.)
+	 */
+
+out:
+	rd_desc->written += consumed;
+	return consumed;
+
+give_up:
+	rd_desc->count = 0;
+	goto out;
+}
+
+/**
+ * tcp_recvbio  -  zero-copy receive a bio from a socket
+ * @sk: socket to receive from
+ * @bio: bio to add socket data to
+ * @size: bytes to receive
+ * @list: single linked list of skbs added to @bio
+ *
+ * Zero-copy receive data from @sk into @bio by directly using the socket
+ * buffer pages, bypassing the page cache.  To keep the network layer from
+ * modifying the socket buffers while in use by @bio, we skb_get() them and
+ * return a list of skbs that @bio now references.  The caller is
+ * responsible for releasing @list with consume_skbs() once done.
+ *
+ * Returns the number of bytes received into @bio.
+ */
+int tcp_recvbio(struct sock *sk, struct bio *bio, size_t size,
+		struct sk_buff **list)
+{
+	read_descriptor_t rd_desc = {
+		.count = size,
+		.arg = { .data = bio },
+	};
+	void *old_bi_private;
+	int err = 0;
+
+	/* Temporarily build referenced skb list in bi_private.  */
+	old_bi_private = bio->bi_private;
+	bio->bi_private = NULL;
+
+	lock_sock(sk);
+	while (rd_desc.written < rd_desc.count) {
+		long timeo = sock_rcvtimeo(sk, 0);
+
+		sk_wait_data(sk, &timeo);
+		if (signal_pending(current)) {
+			err = sock_intr_errno(timeo);
+			break;
+		}
+		if (!timeo) {
+			if (!rd_desc.written)
+				err = -EAGAIN;
+			break;
+		}
+		read_lock(&sk->sk_callback_lock);
+		err = tcp_read_sock(sk, &rd_desc, tcp_recvbio_data);
+		read_unlock(&sk->sk_callback_lock);
+		if (err < 0)
+			break;
+	}
+	release_sock(sk);
+
+	*list = (struct sk_buff *)bio->bi_private;
+	bio->bi_private = old_bi_private;
+
+	if (err)
+		return err;
+	return rd_desc.written;
+}
+
+/**
+ * consume_skbs  -  consume a list of skbs
+ *
+ * This assumes that the skbs are linked on frag_list, as the @list returned
+ * from tcp_recvbio().
+ */
+void consume_skbs(struct sk_buff **skb)
+{
+	while (*skb) {
+		struct sk_buff *tmp = *skb;
+		*skb = skb_shinfo(tmp)->frag_list;
+		skb_shinfo(tmp)->frag_list = NULL;
+		consume_skb(tmp);
+	}
+}
diff --git a/tcp_recvbio.h b/tcp_recvbio.h
new file mode 100644
index 0000000..0ba30ee
--- /dev/null
+++ b/tcp_recvbio.h
@@ -0,0 +1,9 @@
+#ifndef __TCP_RECVBIO_H
+#define __TCP_RECVBIO_H
+
+
+extern int tcp_recvbio(struct sock *, struct bio *, size_t, struct sk_buff **);
+extern void consume_skbs(struct sk_buff **);
+
+
+#endif  /* __TCP_RECVBIO_H */
-- 
1.7.4.1.415.g5e839
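
The fragment walk in tcp_recvbio_data() coalesces page fragments that sit back to back in the same page before handing them to the bio. A minimal userspace sketch of that coalescing idea follows. It is a simplified model, not a transcription of the patch: `struct seg` and `merge_segs()` are hypothetical names, pages are represented by plain integers, and alignment checks are left out.

```c
#include <stddef.h>

/* Hypothetical stand-in for a (page, offset, length) triple. */
struct seg {
	int page;            /* page identifier (an integer in this model) */
	unsigned int off;    /* byte offset inside the page */
	unsigned int len;    /* number of bytes */
};

/*
 * Coalesce src[0..n) into dst and return the number of segments written.
 * Two segments merge when they are in the same page and the second one
 * starts exactly where the first one ends.  dst needs room for n entries.
 */
static size_t merge_segs(const struct seg *src, size_t n, struct seg *dst)
{
	size_t out = 0, i;

	for (i = 0; i < n; i++) {
		if (src[i].len == 0)
			continue;	/* nothing to add */
		if (out > 0 &&
		    dst[out - 1].page == src[i].page &&
		    dst[out - 1].off + dst[out - 1].len == src[i].off) {
			/* Contiguous with the previous segment: extend it. */
			dst[out - 1].len += src[i].len;
			continue;
		}
		dst[out++] = src[i];
	}
	return out;
}
```

Fewer, larger segments mean fewer calls into the block layer for the same payload, which is presumably why the patch bothers to merge at all.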

^ permalink raw reply related

* Re: [PATCH] bridge: reset IPCB in br_parse_ip_options
From: David Miller @ 2011-04-13 21:48 UTC (permalink / raw)
  To: eric.dumazet; +Cc: lkml, shemminger, shimoda.hiroaki, netdev
In-Reply-To: <1302708487.3725.0.camel@edumazet-laptop>

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Wed, 13 Apr 2011 17:28:07 +0200

> Dont worry, Stephen or me will send it asap.

I'm looking forward to it :)

^ permalink raw reply

* Re: [PATCHv2 net-next-2.6] rndis_host: Poll status before control channel where necessary
From: David Miller @ 2011-04-13 21:49 UTC (permalink / raw)
  To: ben-/+tVBieCtBitmTQ+vhA3Yw
  Cc: netdev-u79uwXL29TY76Z2rM5mHXA, vzeeaxwl-ubggFOsnOr3gwBMGfI3FeA,
	linux-usb-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <1302670523.5282.610.camel@localhost>

From: Ben Hutchings <ben-/+tVBieCtBitmTQ+vhA3Yw@public.gmane.org>
Date: Wed, 13 Apr 2011 05:55:23 +0100

> Some RNDIS devices don't respond on the control channel until polled
> on the status channel.  In particular, this was reported to be the
> case for the 2Wire HomePortal 1000SW and for some Windows Mobile
> devices.
> 
> This is roughly based on a patch by John Carr <john.carr-3P/l8hQepEe9FHfhHBbuYA@public.gmane.org>
> which is currently applied by Mandriva.
> 
> Reported-by: Mark Glassberg <vzeeaxwl-ubggFOsnOr3gwBMGfI3FeA@public.gmane.org>
> Signed-off-by: Ben Hutchings <ben-/+tVBieCtBitmTQ+vhA3Yw@public.gmane.org>
> ---
> The first version made this behaviour unconditional and had to be
> reverted.  This version adds a quirk flag instead.

Applied, thanks Ben.

The feedback about whether to use the point-to-point flag or not should
be addressed, but separately.
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* RE: [net-next-2.6 RFC PATCH v2 01/13] ethtool: allow custom interval for physical identification
From: Allan, Bruce W @ 2011-04-13 22:39 UTC (permalink / raw)
  To: Ben Hutchings; +Cc: netdev@vger.kernel.org
In-Reply-To: <1302726313.2873.18.camel@bwh-desktop>



>-----Original Message-----
>From: Ben Hutchings [mailto:bhutchings@solarflare.com]
>Sent: Wednesday, April 13, 2011 1:25 PM
>To: Allan, Bruce W
>Cc: netdev@vger.kernel.org
>Subject: Re: [net-next-2.6 RFC PATCH v2 01/13] ethtool: allow custom interval
>for physical identification
>
>I'm sure there ought to be a clearer way to do this, and to avoid any
>weird effects from integer overflow in the multiplication.  How about
>using an inner loop for each second:
>
>		/* Driver expects to be called at twice the frequency in rc */
>		int n = rc * 2, i, interval = HZ / n;
>
>		do {
>			i = n;
>			do {
>	 			rtnl_lock();
> 				rc = dev->ethtool_ops->set_phys_id(
>					dev, (i & 1) ? ETHTOOL_ID_OFF : ETHTOOL_ID_ON);
>	 			rtnl_unlock();
> 				if (rc)
> 					break;
>				schedule_timeout_interruptible(interval);
>			} while (!signal_pending(current) && --i != 0);
> 		} while (!signal_pending(current) &&
>			 (id.data == 0 || --id.data != 0));
>
>Ben.

OK, if that is clearer to you...v3 forthcoming.

Thanks,
Bruce.

^ permalink raw reply

* RE: [net-next-2.6 RFC PATCH v2 01/13] ethtool: allow custom interval for physical identification
From: Ben Hutchings @ 2011-04-13 22:44 UTC (permalink / raw)
  To: Allan, Bruce W; +Cc: netdev@vger.kernel.org
In-Reply-To: <8DD2590731AB5D4C9DBF71A877482A90018A3427B6@orsmsx509.amr.corp.intel.com>

On Wed, 2011-04-13 at 15:39 -0700, Allan, Bruce W wrote:
> 
> >-----Original Message-----
> >From: Ben Hutchings [mailto:bhutchings@solarflare.com]
> >Sent: Wednesday, April 13, 2011 1:25 PM
> >To: Allan, Bruce W
> >Cc: netdev@vger.kernel.org
> >Subject: Re: [net-next-2.6 RFC PATCH v2 01/13] ethtool: allow custom interval
> >for physical identification
> >
> >I'm sure there ought to be a clearer way to do this, and to avoid any
> >weird effects from integer overflow in the multiplication.  How about
> >using an inner loop for each second:
> >
> >		/* Driver expects to be called at twice the frequency in rc */
> >		int n = rc * 2, i, interval = HZ / n;
> >

		/* Count down seconds */
> >		do {
			/* Count down iterations per second */
> >			i = n;
> >			do {
> >	 			rtnl_lock();
> > 				rc = dev->ethtool_ops->set_phys_id(
> >					dev, (i & 1) ? ETHTOOL_ID_OFF : ETHTOOL_ID_ON);
> >	 			rtnl_unlock();
> > 				if (rc)
> > 					break;
> >				schedule_timeout_interruptible(interval);
> >			} while (!signal_pending(current) && --i != 0);
> > 		} while (!signal_pending(current) &&
> >			 (id.data == 0 || --id.data != 0));
> >
> >Ben.
> 
> OK, if that is clearer to you...v3 forthcoming.

I guess it wouldn't hurt to add comments too.  Would you agree that it's
clear with the additions above?

Ben.

-- 
Ben Hutchings, Senior Software Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.


^ permalink raw reply

* RE: [net-next-2.6 RFC PATCH v2 01/13] ethtool: allow custom interval for physical identification
From: Allan, Bruce W @ 2011-04-13 22:55 UTC (permalink / raw)
  To: Ben Hutchings; +Cc: netdev@vger.kernel.org
In-Reply-To: <1302734679.2873.23.camel@bwh-desktop>

>-----Original Message-----
>From: Ben Hutchings [mailto:bhutchings@solarflare.com]
>Sent: Wednesday, April 13, 2011 3:45 PM
>To: Allan, Bruce W
>Cc: netdev@vger.kernel.org
>Subject: RE: [net-next-2.6 RFC PATCH v2 01/13] ethtool: allow custom interval
>for physical identification
>
>On Wed, 2011-04-13 at 15:39 -0700, Allan, Bruce W wrote:
>>
>> >-----Original Message-----
>> >From: Ben Hutchings [mailto:bhutchings@solarflare.com]
>> >Sent: Wednesday, April 13, 2011 1:25 PM
>> >To: Allan, Bruce W
>> >Cc: netdev@vger.kernel.org
>> >Subject: Re: [net-next-2.6 RFC PATCH v2 01/13] ethtool: allow custom interval
>> >for physical identification
>> >
>> >I'm sure there ought to be a clearer way to do this, and to avoid any
>> >weird effects from integer overflow in the multiplication.  How about
>> >using an inner loop for each second:
>> >
>> >		/* Driver expects to be called at twice the frequency in rc */
>> >		int n = rc * 2, i, interval = HZ / n;
>> >
>
>		/* Count down seconds */
>> >		do {
>			/* Count down iterations per second */
>> >			i = n;
>> >			do {
>> >	 			rtnl_lock();
>> > 				rc = dev->ethtool_ops->set_phys_id(
>> >					dev, (i & 1) ? ETHTOOL_ID_OFF : ETHTOOL_ID_ON);
>> >	 			rtnl_unlock();
>> > 				if (rc)
>> > 					break;
>> >				schedule_timeout_interruptible(interval);
>> >			} while (!signal_pending(current) && --i != 0);
>> > 		} while (!signal_pending(current) &&
>> >			 (id.data == 0 || --id.data != 0));
>> >
>> >Ben.
>>
>> OK, if that is clearer to you...v3 forthcoming.
>
>I guess it wouldn't hurt to add comments too.  Would you agree that it's
>clear with the additions above?
>
>Ben.

Sure, makes sense to me.

Thanks,
Bruce.

^ permalink raw reply

* kernel panic, 2.6.38.2, gretap
From: Denys Fedoryshchenko @ 2011-04-13 22:58 UTC (permalink / raw)
  To: netdev

 I set up the following rule to route traffic arriving on eth0 over a
 gretap interface.

 Bringing up the interface:
 ip link add eoip1 type gretap remote X.X.X.X local Y.Y.Y.Y nopmtudisc
 ifconfig eoip1 10.255.254.1 netmask 255.255.255.252 up mtu 1500

 Source routing rule:
 32000:  from all iif eth0 lookup 203

 Some routes were added to table 203.

 After a few (1-3) seconds of running at around 30-40 Mbps I get a
 kernel panic (log below).

 Notes: I have a VLAN on the same interface, eth0.2023, which carries the
 rest of the traffic and is shaped by HTB. It is not involved in the
 gretap operation.
 On eth0 I have a huge bfifo:
 qdisc bfifo 8001: dev eth0 root refcnt 9 limit 100000000b
  Sent 14652829681 bytes 15646355 pkt (dropped 0, overlimits 0 requeues 8)
  backlog 0b 0p requeues 8


 [  658.492347] skb_over_panic: text:f80f37d4 len:3028 put:1514 head:d1af2000 data:d1af20a4 tail:0xd1af2c78 end:0xd1af2700 dev:eth0.2022
 [  658.492975] ------------[ cut here ]------------
 [  658.493264] Kernel BUG at c0377eaf [verbose debug info unavailable]
 [  658.493317] invalid opcode: 0000 [#1] SMP
 [  658.493317] last sysfs file: /sys/devices/virtual/net/eth0.2022/address
 [  658.493317] Modules linked in: ip_gre gre netconsole ipmi_si tun configfs cls_u32 sch_htb 8021q garp stp llc iptable_filter ipt_addrtype xt_dscp xt_string xt_owner xt_multiport xt_iprange xt_hashlimit xt_conntrack xt_DSCP xt_NFQUEUE xt_mark xt_connmark nf_conntrack ip_tables x_tables bnx2 ipmi_devintf ipmi_msghandler processor ata_piix i5k_amb iTCO_wdt pata_acpi hwmon [last unloaded: netconsole]
 [  658.493317]
 [  658.493317] Pid: 0, comm: kworker/0:1 Not tainted 2.6.38.2-devel2 #2 Dell Inc. PowerEdge 1950/0D8635
 [  658.493317] EIP: 0060:[<c0377eaf>] EFLAGS: 00010282 CPU: 3
 [  658.493317] EIP is at skb_put+0x7f/0x89
 [  658.493317] EAX: 0000008e EBX: d1af2c78 ECX: f64b5e40 EDX: c05032e8
 [  658.493317] ESI: 000005ea EDI: f5f28380 EBP: 006d006d ESP: f64b5e3c
 [  658.493317]  DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
 [  658.493317] Process kworker/0:1 (pid: 0, ti=f64b4000 task=f64a4a80 task.ti=f64b0000)
 [  658.493317] Stack:
 [  658.493317]  c05032e8 f80f37d4 00000bd4 000005ea d1af2000 d1af20a4 d1af2c78 d1af2700
 [  658.493317]  f5e54000 00000000 eee81e00 f80f37d4 00000604 00000002 00000000 e5602500
 [  658.493317]  00000001 f6b02cb8 0000004d 512e75c0 eef28480 00000000 f5f28400 f5f28380
 [  658.493317] Call Trace:
 [  658.493317]  [<f80f37d4>] ? bnx2_poll_work+0x980/0xf48 [bnx2]
 [  658.493317]  [<f80f37d4>] ? bnx2_poll_work+0x980/0xf48 [bnx2]
 [  658.493317]  [<c0140e49>] ? hrtimer_start+0x20/0x25
 [  658.493317]  [<f826ffd1>] ? htb_dequeue+0x757/0x770 [sch_htb]
 [  658.493317]  [<f80f3f27>] ? bnx2_poll+0xf7/0x1d9 [bnx2]
 [  658.493317]  [<c037f564>] ? net_rx_action+0x8c/0x176
 [  658.493317]  [<c012f28f>] ? __do_softirq+0x6b/0x104
 [  658.493317]  [<c012f224>] ? __do_softirq+0x0/0x104
 [  658.493317]  <IRQ>
 [  658.493317]  [<c012f17e>] ? irq_exit+0x26/0x59
 [  658.493317]  [<c0103b3d>] ? do_IRQ+0x81/0x95
 [  658.493317]  [<c0102ca9>] ? common_interrupt+0x29/0x30
 [  658.493317]  [<c010807a>] ? mwait_idle+0x51/0x56
 [  658.493317]  [<c0101a97>] ? cpu_idle+0x41/0x5e
 [  658.493317] Code: 24 14 8b 81 a4 00 00 00 89 74 24 0c 89 44 24 10 8b 41 4c c7 04 24 e8 32 50 c0 89 44 24 08 8b 44 24 2c 89 44 24 04 e8 51 85 07 00 <0f> 0b eb fe 83 c4 24 5b 5e c3 55 57 56 53 83 ec 24 fc 89 c5 89
 [  658.493317] EIP: [<c0377eaf>] skb_put+0x7f/0x89 SS:ESP 0068:f64b5e3c
 [  658.512472] ---[ end trace d06a076521439891 ]---
 [  658.512750] Kernel panic - not syncing: Fatal exception in interrupt
 [  658.514034] Rebooting in 5 seconds..
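
For anyone decoding the skb_over_panic line: skb_put() advances the tail pointer and panics when the tail ends up past the end of the buffer. A minimal sketch of that arithmetic, using the head/data/tail/end values from the log above, is shown here. `struct skb_bounds` and the helpers are hypothetical names for illustration, and 32-bit addresses are assumed as in the report.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical holder for the four pointers printed by skb_over_panic. */
struct skb_bounds {
	uint32_t head, data, tail, end;
};

/* Bytes of linear data after the put: tail - data. */
static uint32_t skb_linear_len(const struct skb_bounds *b)
{
	return b->tail - b->data;
}

/* Size of the linear buffer: end - head. */
static uint32_t skb_buf_size(const struct skb_bounds *b)
{
	return b->end - b->head;
}

/* The condition that triggers the panic: tail beyond end. */
static bool skb_overran(const struct skb_bounds *b)
{
	return b->tail > b->end;
}

/* How far past the end the tail landed, or 0 if it still fits. */
static uint32_t skb_overrun_bytes(const struct skb_bounds *b)
{
	return skb_overran(b) ? b->tail - b->end : 0;
}
```

Plugging in head=0xd1af2000, data=0xd1af20a4, tail=0xd1af2c78 and end=0xd1af2700 gives a linear length of 3028 bytes in a 1792-byte buffer. Since put is 1514 and len is 3028, the numbers suggest the skb already held 1514 bytes when a second 1514-byte put was attempted, though the log alone does not say why.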



^ permalink raw reply

* Best route for re-implementing TCPHA
From: RichardFliam @ 2011-04-13 23:08 UTC (permalink / raw)
  To: netdev

TCPHA (http://dragon.linux-vs.org/~dragonfly/htm/tcpha.htm) provided
several neat features for content and health aware load balancing. I
am looking to re-implement on the 2.6 kernel and I am struck by
indecision on a few key features.

In particular the original project created its own polling methods for
TCP sockets based on fs/select.c and tcp_poll but to me this seems
inelegant. I am wondering if there is a "correct" way to poll sockets
in kernel or should I simply call sock_map_fd on the kernel socket.

After extensive searching I did find this post
http://permalink.gmane.org/gmane.linux.network/180354 to this mailing
list, but it does not seem to contain an answer as to the correct
direction for polling tcp sockets in kernel.

--
--Richard Fliam

^ permalink raw reply

* [net-next-2.6 RFC PATCH v3] ethtool: allow custom interval for physical identification
From: Bruce Allan @ 2011-04-13 23:09 UTC (permalink / raw)
  To: netdev
  Cc: Bruce Allan, Ben Hutchings, Sathya Perla, Subbu Seetharaman,
	Ajit Khaparde, Michael Chan, Eilon Greenstein, Divy Le Ray,
	Don Fry, Jon Mason, Solarflare linux maintainers, Steve Hodgson,
	Stephen Hemminger, Matt Carlson

When physical identification of an adapter is done by toggling the
mechanism on and off through software utilizing the set_phys_id operation,
it is done with a fixed duration for both on and off states.  Some drivers
may want to set a custom duration for the on/off intervals.  This patch
changes the API so the return code from the driver's entry point when it
is called with ETHTOOL_ID_ACTIVE can specify the frequency at which to
cycle the on/off states, and updates the drivers that have already been
converted to use the new set_phys_id and use the synchronous method for
identifying an adapter.

The physical identification frequency set in the updated drivers is based
on how it was done prior to the introduction of set_phys_id.

Compile tested only.  Also fixes a compiler warning in sfc.

v2: drivers do not return -EINVAL for ETHTOOL_ID_ACTIVE
v3: fold patchset into single patch and cleanup per Ben's feedback

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Cc: Ben Hutchings <bhutchings@solarflare.com>
Cc: Sathya Perla <sathya.perla@emulex.com>
Cc: Subbu Seetharaman <subbu.seetharaman@emulex.com>
Cc: Ajit Khaparde <ajit.khaparde@emulex.com>
Cc: Michael Chan <mchan@broadcom.com>
Cc: Eilon Greenstein <eilong@broadcom.com>
Cc: Divy Le Ray <divy@chelsio.com>
Cc: Don Fry <pcnet32@frontier.com>
Cc: Jon Mason <jdmason@kudzu.us>
Cc: Solarflare linux maintainers <linux-net-drivers@solarflare.com>
Cc: Steve Hodgson <shodgson@solarflare.com>
Cc: Stephen Hemminger <shemminger@linux-foundation.org>
Cc: Matt Carlson <mcarlson@broadcom.com>
---

 drivers/net/benet/be_ethtool.c    |    2 +-
 drivers/net/bnx2.c                |    2 +-
 drivers/net/bnx2x/bnx2x_ethtool.c |    2 +-
 drivers/net/cxgb3/cxgb3_main.c    |    2 +-
 drivers/net/ewrk3.c               |    2 +-
 drivers/net/niu.c                 |    2 +-
 drivers/net/pcnet32.c             |    2 +-
 drivers/net/s2io.c                |    2 +-
 drivers/net/sfc/ethtool.c         |    6 +++---
 drivers/net/skge.c                |    2 +-
 drivers/net/sky2.c                |    2 +-
 drivers/net/tg3.c                 |    2 +-
 include/linux/ethtool.h           |    6 ++++--
 net/core/ethtool.c                |   31 ++++++++++++++++---------------
 14 files changed, 34 insertions(+), 31 deletions(-)

diff --git a/drivers/net/benet/be_ethtool.c b/drivers/net/benet/be_ethtool.c
index 96f5502..80226e4 100644
--- a/drivers/net/benet/be_ethtool.c
+++ b/drivers/net/benet/be_ethtool.c
@@ -516,7 +516,7 @@ be_set_phys_id(struct net_device *netdev,
 	case ETHTOOL_ID_ACTIVE:
 		be_cmd_get_beacon_state(adapter, adapter->hba_port_num,
 					&adapter->beacon_state);
-		return -EINVAL;
+		return 1;	/* cycle on/off once per second */
 
 	case ETHTOOL_ID_ON:
 		be_cmd_set_beacon_state(adapter, adapter->hba_port_num, 0, 0,
diff --git a/drivers/net/bnx2.c b/drivers/net/bnx2.c
index 0a52079..bf729ee 100644
--- a/drivers/net/bnx2.c
+++ b/drivers/net/bnx2.c
@@ -7473,7 +7473,7 @@ bnx2_set_phys_id(struct net_device *dev, enum ethtool_phys_id_state state)
 
 		bp->leds_save = REG_RD(bp, BNX2_MISC_CFG);
 		REG_WR(bp, BNX2_MISC_CFG, BNX2_MISC_CFG_LEDMODE_MAC);
-		return -EINVAL;
+		return 1;	/* cycle on/off once per second */
 
 	case ETHTOOL_ID_ON:
 		REG_WR(bp, BNX2_EMAC_LED, BNX2_EMAC_LED_OVERRIDE |
diff --git a/drivers/net/bnx2x/bnx2x_ethtool.c b/drivers/net/bnx2x/bnx2x_ethtool.c
index ad7d91e..0a5e88d 100644
--- a/drivers/net/bnx2x/bnx2x_ethtool.c
+++ b/drivers/net/bnx2x/bnx2x_ethtool.c
@@ -2025,7 +2025,7 @@ static int bnx2x_set_phys_id(struct net_device *dev,
 
 	switch (state) {
 	case ETHTOOL_ID_ACTIVE:
-		return -EINVAL;
+		return 1;	/* cycle on/off once per second */
 
 	case ETHTOOL_ID_ON:
 		bnx2x_set_led(&bp->link_params, &bp->link_vars,
diff --git a/drivers/net/cxgb3/cxgb3_main.c b/drivers/net/cxgb3/cxgb3_main.c
index 802c7a7..a087e06 100644
--- a/drivers/net/cxgb3/cxgb3_main.c
+++ b/drivers/net/cxgb3/cxgb3_main.c
@@ -1757,7 +1757,7 @@ static int set_phys_id(struct net_device *dev,
 
 	switch (state) {
 	case ETHTOOL_ID_ACTIVE:
-		return -EINVAL;
+		return 1;	/* cycle on/off once per second */
 
 	case ETHTOOL_ID_OFF:
 		t3_set_reg_field(adapter, A_T3DBG_GPIO_EN, F_GPIO0_OUT_VAL, 0);
diff --git a/drivers/net/ewrk3.c b/drivers/net/ewrk3.c
index c7ce443..17b6027 100644
--- a/drivers/net/ewrk3.c
+++ b/drivers/net/ewrk3.c
@@ -1618,7 +1618,7 @@ static int ewrk3_set_phys_id(struct net_device *dev,
 		/* Prevent ISR from twiddling the LED */
 		lp->led_mask = 0;
 		spin_unlock_irq(&lp->hw_lock);
-		return -EINVAL;
+		return 2;	/* cycle on/off twice per second */
 
 	case ETHTOOL_ID_ON:
 		cr = inb(EWRK3_CR);
diff --git a/drivers/net/niu.c b/drivers/net/niu.c
index 3fa1e9c..ea2272f 100644
--- a/drivers/net/niu.c
+++ b/drivers/net/niu.c
@@ -7896,7 +7896,7 @@ static int niu_set_phys_id(struct net_device *dev,
 	switch (state) {
 	case ETHTOOL_ID_ACTIVE:
 		np->orig_led_state = niu_led_state_save(np);
-		return -EINVAL;
+		return 1;	/* cycle on/off once per second */
 
 	case ETHTOOL_ID_ON:
 		niu_force_led(np, 1);
diff --git a/drivers/net/pcnet32.c b/drivers/net/pcnet32.c
index e89afb9..0a1efba 100644
--- a/drivers/net/pcnet32.c
+++ b/drivers/net/pcnet32.c
@@ -1038,7 +1038,7 @@ static int pcnet32_set_phys_id(struct net_device *dev,
 		for (i = 4; i < 8; i++)
 			lp->save_regs[i - 4] = a->read_bcr(ioaddr, i);
 		spin_unlock_irqrestore(&lp->lock, flags);
-		return -EINVAL;
+		return 2;	/* cycle on/off twice per second */
 
 	case ETHTOOL_ID_ON:
 	case ETHTOOL_ID_OFF:
diff --git a/drivers/net/s2io.c b/drivers/net/s2io.c
index 2d5cc61..2302d97 100644
--- a/drivers/net/s2io.c
+++ b/drivers/net/s2io.c
@@ -5541,7 +5541,7 @@ static int s2io_ethtool_set_led(struct net_device *dev,
 	switch (state) {
 	case ETHTOOL_ID_ACTIVE:
 		sp->adapt_ctrl_org = readq(&bar0->gpio_control);
-		return -EINVAL;
+		return 1;	/* cycle on/off once per second */
 
 	case ETHTOOL_ID_ON:
 		s2io_set_led(sp, true);
diff --git a/drivers/net/sfc/ethtool.c b/drivers/net/sfc/ethtool.c
index 644f7c1..5d8468f 100644
--- a/drivers/net/sfc/ethtool.c
+++ b/drivers/net/sfc/ethtool.c
@@ -182,7 +182,7 @@ static int efx_ethtool_phys_id(struct net_device *net_dev,
 			       enum ethtool_phys_id_state state)
 {
 	struct efx_nic *efx = netdev_priv(net_dev);
-	enum efx_led_mode mode;
+	enum efx_led_mode mode = EFX_LED_DEFAULT;
 
 	switch (state) {
 	case ETHTOOL_ID_ON:
@@ -194,8 +194,8 @@ static int efx_ethtool_phys_id(struct net_device *net_dev,
 	case ETHTOOL_ID_INACTIVE:
 		mode = EFX_LED_DEFAULT;
 		break;
-	default:
-		return -EINVAL;
+	case ETHTOOL_ID_ACTIVE:
+		return 1;	/* cycle on/off once per second */
 	}
 
 	efx->type->set_id_led(efx, mode);
diff --git a/drivers/net/skge.c b/drivers/net/skge.c
index 310dcbc..176d784 100644
--- a/drivers/net/skge.c
+++ b/drivers/net/skge.c
@@ -753,7 +753,7 @@ static int skge_set_phys_id(struct net_device *dev,
 
 	switch (state) {
 	case ETHTOOL_ID_ACTIVE:
-		return -EINVAL;
+		return 2;	/* cycle on/off twice per second */
 
 	case ETHTOOL_ID_ON:
 		skge_led(skge, LED_MODE_TST);
diff --git a/drivers/net/sky2.c b/drivers/net/sky2.c
index a4b8fe5..c8d0451 100644
--- a/drivers/net/sky2.c
+++ b/drivers/net/sky2.c
@@ -3813,7 +3813,7 @@ static int sky2_set_phys_id(struct net_device *dev,
 
 	switch (state) {
 	case ETHTOOL_ID_ACTIVE:
-		return -EINVAL;
+		return 1;	/* cycle on/off once per second */
 	case ETHTOOL_ID_INACTIVE:
 		sky2_led(sky2, MO_LED_NORM);
 		break;
diff --git a/drivers/net/tg3.c b/drivers/net/tg3.c
index 9d7defc..7c1a9dd 100644
--- a/drivers/net/tg3.c
+++ b/drivers/net/tg3.c
@@ -10292,7 +10292,7 @@ static int tg3_set_phys_id(struct net_device *dev,
 
 	switch (state) {
 	case ETHTOOL_ID_ACTIVE:
-		return -EINVAL;
+		return 1;	/* cycle on/off once per second */
 
 	case ETHTOOL_ID_ON:
 		tw32(MAC_LED_CTRL, LED_CTRL_LNKLED_OVERRIDE |
diff --git a/include/linux/ethtool.h b/include/linux/ethtool.h
index ad22a68..9de3127 100644
--- a/include/linux/ethtool.h
+++ b/include/linux/ethtool.h
@@ -798,8 +798,10 @@ bool ethtool_invalid_flags(struct net_device *dev, u32 data, u32 supported);
  *	attached to it.  The implementation may update the indicator
  *	asynchronously or synchronously, but in either case it must return
  *	quickly.  It is initially called with the argument %ETHTOOL_ID_ACTIVE,
- *	and must either activate asynchronous updates or return -%EINVAL.
- *	If it returns -%EINVAL then it will be called again at intervals with
+ *	and must either activate asynchronous updates and return zero, return
+ *	a negative error or return a positive frequency for synchronous
+ *	indication (e.g. 1 for one on/off cycle per second).  If it returns
+ *	a frequency then it will be called again at intervals with the
  *	argument %ETHTOOL_ID_ON or %ETHTOOL_ID_OFF and should set the state of
  *	the indicator accordingly.  Finally, it is called with the argument
  *	%ETHTOOL_ID_INACTIVE and must deactivate the indicator.  Returns a
diff --git a/net/core/ethtool.c b/net/core/ethtool.c
index 41dee2d..13d79f5 100644
--- a/net/core/ethtool.c
+++ b/net/core/ethtool.c
@@ -1669,7 +1669,7 @@ static int ethtool_phys_id(struct net_device *dev, void __user *useraddr)
 		return dev->ethtool_ops->phys_id(dev, id.data);
 
 	rc = dev->ethtool_ops->set_phys_id(dev, ETHTOOL_ID_ACTIVE);
-	if (rc && rc != -EINVAL)
+	if (rc < 0)
 		return rc;
 
 	/* Drop the RTNL lock while waiting, but prevent reentry or
@@ -1684,21 +1684,22 @@ static int ethtool_phys_id(struct net_device *dev, void __user *useraddr)
 		schedule_timeout_interruptible(
 			id.data ? (id.data * HZ) : MAX_SCHEDULE_TIMEOUT);
 	} else {
-		/* Driver expects to be called periodically */
+		/* Driver expects to be called at twice the frequency in rc */
+		int n = rc * 2, i, interval = HZ / n;
+
+		/* Count down seconds */
 		do {
-			rtnl_lock();
-			rc = dev->ethtool_ops->set_phys_id(dev, ETHTOOL_ID_ON);
-			rtnl_unlock();
-			if (rc)
-				break;
-			schedule_timeout_interruptible(HZ / 2);
-
-			rtnl_lock();
-			rc = dev->ethtool_ops->set_phys_id(dev, ETHTOOL_ID_OFF);
-			rtnl_unlock();
-			if (rc)
-				break;
-			schedule_timeout_interruptible(HZ / 2);
+			/* Count down iterations per second */
+			i = n;
+			do {
+				rtnl_lock();
+				rc = dev->ethtool_ops->set_phys_id(dev,
+				    (i & 1) ? ETHTOOL_ID_OFF : ETHTOOL_ID_ON);
+				rtnl_unlock();
+				if (rc)
+					break;
+				schedule_timeout_interruptible(interval);
+			} while (!signal_pending(current) && --i != 0);
 		} while (!signal_pending(current) &&
 			 (id.data == 0 || --id.data != 0));
 	}
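
The nested countdown in the ethtool_phys_id() hunk can be checked in userspace. The sketch below is a simplified model under stated assumptions: `simulate_blink()` and `struct blink_stats` are hypothetical names, locking, sleeping and signal handling are dropped, and the "run forever" case (id.data == 0) is not modelled, so `seconds` must be at least 1.

```c
/* Stand-ins for ETHTOOL_ID_OFF / ETHTOOL_ID_ON in this model. */
enum id_state { ID_OFF = 0, ID_ON = 1 };

struct blink_stats {
	int calls;                 /* number of ON/OFF callbacks issued */
	int on_calls;              /* how many of them requested ON */
	int interval;              /* ticks slept between callbacks */
	enum id_state first, last; /* first and last state requested */
};

/*
 * freq is the positive value a driver returned for ETHTOOL_ID_ACTIVE,
 * seconds is the requested duration, hz is the tick rate.
 */
static struct blink_stats simulate_blink(int freq, unsigned int seconds, int hz)
{
	struct blink_stats st = { 0 };
	int n = freq * 2, i;

	if (freq <= 0 || seconds == 0)
		return st;	/* outside what this model covers */

	st.interval = hz / n;

	/* Count down seconds */
	do {
		/* Count down iterations per second */
		i = n;
		do {
			enum id_state s = (i & 1) ? ID_OFF : ID_ON;

			if (st.calls == 0)
				st.first = s;
			st.last = s;
			st.on_calls += (s == ID_ON);
			st.calls++;
		} while (--i != 0);
	} while (--seconds != 0);

	return st;
}
```

Because n is always even, each second starts with ON and ends with OFF. With freq = 1 and hz = 1000 the interval comes out at 500 ticks, which matches the HZ / 2 sleep of the code being replaced.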


^ permalink raw reply related

* Re: [PATCH] iproute2: tc add mqprio qdisc support
From: Ben Hutchings @ 2011-04-13 23:10 UTC (permalink / raw)
  To: John Fastabend; +Cc: shemminger, netdev
In-Reply-To: <20110412155727.4656.42756.stgit@jf-dev1-dcblab>

I know that this has already been applied, but:

On Tue, 2011-04-12 at 08:57 -0700, John Fastabend wrote:
> Add mqprio qdisc support. Output matches the following,
> 
> # ./tc/tc qdisc
> qdisc mq 0: dev eth1 root
> qdisc mq 0: dev eth2 root
> qdisc mqprio 8001: dev eth3 root  tc 8 map 0 1 2 3 4 5 6 7 1 1 1 1 1 1 1 1
>              queues:(0:7) (8:15) (16:23) (24:31) (32:39) (40:47) (48:55) (56:63)
> 
> And usage is,
> 
> # ./tc/tc qdisc add dev eth3 root mqprio help
> Usage: ... mclass [num_tc NUMBER] [map P0 P1...]

mclass?

>                   [offset txq0 txq1 ...] [count cnt0 cnt1 ...] [hw 1|0]

Of course I wrote something similar to this, but I never finished it
off, so thanks.

I don't think it makes sense to require count and offset to be specified
as separate lists.  The arguments could be interleaved but that adds
more opportunity for error.  Since offsets have to be in order and you
generally don't want to have gaps then the offsets could normally be
inferred.  So maybe something like:

	queues cnt0[@txq0] cnt1[@txq1] ...
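
A minimal sketch of how such a "cnt[@txq]" list could be parsed, with missing offsets inferred from the end of the previous range. This is only an illustration of the proposed syntax: `parse_queues()`, `struct queue_map` and `MAX_TC` are hypothetical names, not part of tc.

```c
#include <stdlib.h>

#define MAX_TC 16	/* assumed upper bound on traffic classes */

struct queue_map {
	unsigned short count[MAX_TC];
	unsigned short offset[MAX_TC];
};

/*
 * Parse tokens of the form "cnt" or "cnt@off".  A missing "@off" is
 * inferred as the end of the previous range.  Gaps are accepted,
 * overlapping or out-of-order ranges are not.  Returns the number of
 * traffic classes parsed, or -1 on error.
 */
static int parse_queues(int argc, char **argv, struct queue_map *map)
{
	unsigned long next = 0;
	int tc;

	if (argc > MAX_TC)
		return -1;

	for (tc = 0; tc < argc; tc++) {
		char *end;
		unsigned long cnt = strtoul(argv[tc], &end, 10);
		unsigned long off = next;

		if (end == argv[tc] || cnt == 0 || cnt > 0xffff)
			return -1;
		if (*end == '@') {
			char *p = end + 1;

			off = strtoul(p, &end, 10);
			if (end == p || off > 0xffff)
				return -1;
		}
		if (*end != '\0' || off < next || off + cnt > 0xffff)
			return -1;

		map->count[tc] = (unsigned short)cnt;
		map->offset[tc] = (unsigned short)off;
		next = off + cnt;
	}
	return tc;
}
```

With this shape, "8 8 4@32" yields ranges starting at 0, 8 and 32, while "8 4@2" is rejected because the second range would overlap the first.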

[...]
> +static int mqprio_parse_opt(struct qdisc_util *qu, int argc,
> +			    char **argv, struct nlmsghdr *n)
> +{
> +	int idx;
> +	struct tc_mqprio_qopt opt = {
> +				     8,
> +				     {0, 1, 2, 3, 4, 5, 6, 7, 0, 1, 1, 1, 3, 3, 3, 3},
> +				     1,
> +				    };

It would be clearer to name the fields being initialised.
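
A sketch of what the named form could look like, using C99 designated initializers. `struct mqprio_qopt_model` is a hypothetical stand-in so the example compiles on its own; the field names num_tc, prio_tc_map, count and offset are taken from the print routine quoted below, while `hw` and the field order are assumptions based on the usage string and the positional initializer.

```c
#include <stdint.h>

/* Hypothetical userspace model of the option block. */
struct mqprio_qopt_model {
	uint8_t  num_tc;
	uint8_t  prio_tc_map[16];
	uint8_t  hw;
	uint16_t count[16];
	uint16_t offset[16];
};

/* Same default values as the positional initializer, with names attached. */
static struct mqprio_qopt_model mqprio_defaults(void)
{
	struct mqprio_qopt_model opt = {
		.num_tc      = 8,
		.prio_tc_map = { 0, 1, 2, 3, 4, 5, 6, 7, 0, 1, 1, 1, 3, 3, 3, 3 },
		.hw          = 1,
		/* count[] and offset[] are left zero-initialized */
	};

	return opt;
}
```

Named fields keep the defaults correct even if members are later added or reordered, and they make it obvious which value is the hw flag.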

[...]
> +int mqprio_print_opt(struct qdisc_util *qu, FILE *f, struct rtattr *opt)
> +{
> +	int i;
> +	struct tc_mqprio_qopt *qopt;
> +
> +	if (opt == NULL)
> +		return 0;
> +
> +	qopt = RTA_DATA(opt);
> +
> +	fprintf(f, " tc %u map ", qopt->num_tc);
> +	for (i = 0; i <= TC_PRIO_MAX; i++)
> +		fprintf(f, "%d ", qopt->prio_tc_map[i]);
> +	fprintf(f, "\n             queues:");
> +	for (i = 0; i < qopt->num_tc; i++)
> +		fprintf(f, "(%i:%i) ", qopt->offset[i],
> +			qopt->offset[i] + qopt->count[i] - 1);
[...]

Shouldn't this output be consistent with the command-line syntax?

Ben.

-- 
Ben Hutchings, Senior Software Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.


^ permalink raw reply

* Re: [PATCH] iproute2: tc add mqprio qdisc support
From: John Fastabend @ 2011-04-13 23:37 UTC (permalink / raw)
  To: Ben Hutchings; +Cc: shemminger@vyatta.com, netdev@vger.kernel.org
In-Reply-To: <1302736222.2873.39.camel@bwh-desktop>

On 4/13/2011 4:10 PM, Ben Hutchings wrote:
> I know that this has already been applied, but:
> 
> On Tue, 2011-04-12 at 08:57 -0700, John Fastabend wrote:
>> Add mqprio qdisc support. Output matches the following,
>>
>> # ./tc/tc qdisc
>> qdisc mq 0: dev eth1 root
>> qdisc mq 0: dev eth2 root
>> qdisc mqprio 8001: dev eth3 root  tc 8 map 0 1 2 3 4 5 6 7 1 1 1 1 1 1 1 1
>>              queues:(0:7) (8:15) (16:23) (24:31) (32:39) (40:47) (48:55) (56:63)
>>
>> And usage is,
>>
>> # ./tc/tc qdisc add dev eth3 root mqprio help
>> Usage: ... mclass [num_tc NUMBER] [map P0 P1...]
> 
> mclass?

Agh, a stupid typo in the description: that was my working name
for the qdisc some time ago. The help text in 'tc' is correct.

> 
>>                   [offset txq0 txq1 ...] [count cnt0 cnt1 ...] [hw 1|0]
> 
> Of course I wrote something similar to this, but I never finished it
> off, so thanks.
> 
> I don't think it makes sense to require count and offset to be specified
> as separate lists.  The arguments could be interleaved but that adds
> more opportunity for error.  Since offsets have to be in order and you
> generally don't want to have gaps then the offsets could normally be
> inferred.  So maybe something like:
> 
> 	queues cnt0[@txq0] cnt1[@txq1] ...

OK. I agree with you this is better.

> 
> [...]
>> +static int mqprio_parse_opt(struct qdisc_util *qu, int argc,
>> +			    char **argv, struct nlmsghdr *n)
>> +{
>> +	int idx;
>> +	struct tc_mqprio_qopt opt = {
>> +				     8,
>> +				     {0, 1, 2, 3, 4, 5, 6, 7, 0, 1, 1, 1, 3, 3, 3, 3},
>> +				     1,
>> +				    };
> 
> It would be clearer to name the fields being initialised.
> 

OK.

> [...]
>> +int mqprio_print_opt(struct qdisc_util *qu, FILE *f, struct rtattr *opt)
>> +{
>> +	int i;
>> +	struct tc_mqprio_qopt *qopt;
>> +
>> +	if (opt == NULL)
>> +		return 0;
>> +
>> +	qopt = RTA_DATA(opt);
>> +
>> +	fprintf(f, " tc %u map ", qopt->num_tc);
>> +	for (i = 0; i <= TC_PRIO_MAX; i++)
>> +		fprintf(f, "%d ", qopt->prio_tc_map[i]);
>> +	fprintf(f, "\n             queues:");
>> +	for (i = 0; i < qopt->num_tc; i++)
>> +		fprintf(f, "(%i:%i) ", qopt->offset[i],
>> +			qopt->offset[i] + qopt->count[i] - 1);
> [...]
> 
> Shouldn't this output be consistent with the command-line syntax?

I'm not sure, here's what it is now,

queues:(0:7) (8:15) (16:23) (24:31) (32:39) (40:47) (48:55) (56:63)

And here's what it would be with the change,

queues: 8@0 8@8 8@16 8@24 8@32 8@40 8@48 8@56

I like the first option with (#:#) it seems a bit more obvious to me
what the layout is. I'll get this fixed up tomorrow. Thanks for
taking a look Ben.

~John.


^ permalink raw reply

* Re: [PATCH] bridge: reset IPCB in br_parse_ip_options
From: Stephen Hemminger @ 2011-04-14  0:03 UTC (permalink / raw)
  To: David Miller; +Cc: eric.dumazet, lkml, shimoda.hiroaki, netdev
In-Reply-To: <20110413.144812.116375845.davem@davemloft.net>

On Wed, 13 Apr 2011 14:48:12 -0700 (PDT)
David Miller <davem@davemloft.net> wrote:

> From: Eric Dumazet <eric.dumazet@gmail.com>
> Date: Wed, 13 Apr 2011 17:28:07 +0200
> 
> > Dont worry, Stephen or me will send it asap.
> 
> I'm looking forward to it :)

You applied the clear of ipcb already.

^ permalink raw reply

* Re: [PATCH] bridge: reset IPCB in br_parse_ip_options
From: David Miller @ 2011-04-14  0:05 UTC (permalink / raw)
  To: shemminger; +Cc: eric.dumazet, lkml, shimoda.hiroaki, netdev
In-Reply-To: <20110413170351.078cfa2f@nehalam>

From: Stephen Hemminger <shemminger@vyatta.com>
Date: Wed, 13 Apr 2011 17:03:51 -0700

> On Wed, 13 Apr 2011 14:48:12 -0700 (PDT)
> David Miller <davem@davemloft.net> wrote:
> 
>> From: Eric Dumazet <eric.dumazet@gmail.com>
>> Date: Wed, 13 Apr 2011 17:28:07 +0200
>> 
>> > Don't worry, Stephen or I will send it ASAP.
>> 
>> I'm looking forward to it :)
> 
> You applied the clear of ipcb already.

There are other patches involved, I think.

The one with the NULL route handling, for one.

Please follow back in this thread for the details. The IPCB clear
wasn't sufficient to get rid of all of the reporter's OOPSes.

* Re: [PATCH] bridge: reset IPCB in br_parse_ip_options
From: Stephen Hemminger @ 2011-04-14  0:08 UTC (permalink / raw)
  To: David Miller; +Cc: eric.dumazet, lkml, shimoda.hiroaki, netdev
In-Reply-To: <20110413.170503.193709874.davem@davemloft.net>

On Wed, 13 Apr 2011 17:05:03 -0700 (PDT)
David Miller <davem@davemloft.net> wrote:

> From: Stephen Hemminger <shemminger@vyatta.com>
> Date: Wed, 13 Apr 2011 17:03:51 -0700
> 
> > On Wed, 13 Apr 2011 14:48:12 -0700 (PDT)
> > David Miller <davem@davemloft.net> wrote:
> > 
> >> From: Eric Dumazet <eric.dumazet@gmail.com>
> >> Date: Wed, 13 Apr 2011 17:28:07 +0200
> >> 
> >> > Don't worry, Stephen or I will send it ASAP.
> >> 
> >> I'm looking forward to it :)
> > 
> > You applied the clear of ipcb already.
> 
> There are other patches involved, I think.
> 
> The one with the NULL route handling, for one.
> 
> Please follow back in this thread for the details. The IPCB clear
> wasn't sufficient to get rid of all of the reporter's OOPSes.

Agreed, it is not the complete fix. 
