Netdev List
 help / color / mirror / Atom feed
* Re: [PATCH] ipv4: Remove unnecessary code from rt_check_expire().
From: Eric Dumazet @ 2012-06-26  8:46 UTC (permalink / raw)
  To: David Miller; +Cc: netdev
In-Reply-To: <20120626.013730.902797211256084220.davem@davemloft.net>

On Tue, 2012-06-26 at 01:37 -0700, David Miller wrote:

> And think, we don't do any stupidity like this for the inetpeer cache
> and no small cute animals have died as a result.

Thats because inetpeer is kept small.

We do have a smart gc on inetpeer cache (inet_peer_gc()),
and inetpeer threshold is 65536 

^ permalink raw reply

* [PATCH 2/3] net: fec: enable regulator for fec phy
From: Shawn Guo @ 2012-06-26  8:45 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, linux-arm-kernel, Shawn Guo
In-Reply-To: <1340700308-8315-1-git-send-email-shawn.guo@linaro.org>

If bootloader or platform initialization code does not enable the
power supply to fec phy, we need to do it in fec driver before calling
fec_reset_phy to have the phy powered on.

Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
---
 drivers/net/ethernet/freescale/fec.c |   13 +++++++++++++
 1 files changed, 13 insertions(+), 0 deletions(-)

diff --git a/drivers/net/ethernet/freescale/fec.c b/drivers/net/ethernet/freescale/fec.c
index e868a37..4dce9e3 100644
--- a/drivers/net/ethernet/freescale/fec.c
+++ b/drivers/net/ethernet/freescale/fec.c
@@ -49,6 +49,7 @@
 #include <linux/of_gpio.h>
 #include <linux/of_net.h>
 #include <linux/pinctrl/consumer.h>
+#include <linux/regulator/consumer.h>
 
 #include <asm/cacheflush.h>
 
@@ -1546,6 +1547,7 @@ fec_probe(struct platform_device *pdev)
 	const struct of_device_id *of_id;
 	static int dev_id;
 	struct pinctrl *pinctrl;
+	struct regulator *reg_phy;
 
 	of_id = of_match_device(fec_dt_ids, &pdev->dev);
 	if (of_id)
@@ -1632,6 +1634,16 @@ fec_probe(struct platform_device *pdev)
 	clk_prepare_enable(fep->clk_ahb);
 	clk_prepare_enable(fep->clk_ipg);
 
+	reg_phy = devm_regulator_get(&pdev->dev, "phy");
+	if (!IS_ERR(reg_phy)) {
+		ret = regulator_enable(reg_phy);
+		if (ret) {
+			dev_err(&pdev->dev,
+				"Failed to enable phy regulator: %d\n", ret);
+			goto failed_regulator;
+		}
+	}
+
 	fec_reset_phy(pdev);
 
 	ret = fec_enet_init(ndev);
@@ -1655,6 +1667,7 @@ failed_register:
 	fec_enet_mii_remove(fep);
 failed_mii_init:
 failed_init:
+failed_regulator:
 	clk_disable_unprepare(fep->clk_ahb);
 	clk_disable_unprepare(fep->clk_ipg);
 failed_pin:
-- 
1.7.5.4

^ permalink raw reply related

* [PATCH 3/3] net: fec: add phy-reset-interval for device tree probe
From: Shawn Guo @ 2012-06-26  8:45 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, linux-arm-kernel, Shawn Guo
In-Reply-To: <1340700308-8315-1-git-send-email-shawn.guo@linaro.org>

Different boards may require different phy reset interval time.  Add
property phy-reset-interval for device tree probe, so that the boards
that need a longer interval time can specify it in their device tree.

Along with the update to phy related stuff, it also makes a minor fix
on phy-reset-gpios in binding document to have it be optional to match
the driver code.

Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
---
 Documentation/devicetree/bindings/net/fsl-fec.txt |    5 ++++-
 drivers/net/ethernet/freescale/fec.c              |    4 +++-
 2 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/Documentation/devicetree/bindings/net/fsl-fec.txt b/Documentation/devicetree/bindings/net/fsl-fec.txt
index 7ab9e1a..74a4ce7 100644
--- a/Documentation/devicetree/bindings/net/fsl-fec.txt
+++ b/Documentation/devicetree/bindings/net/fsl-fec.txt
@@ -7,10 +7,13 @@ Required properties:
 - phy-mode : String, operation mode of the PHY interface.
   Supported values are: "mii", "gmii", "sgmii", "tbi", "rmii",
   "rgmii", "rgmii-id", "rgmii-rxid", "rgmii-txid", "rtbi", "smii".
-- phy-reset-gpios : Should specify the gpio for phy reset
 
 Optional properties:
 - local-mac-address : 6 bytes, mac address
+- phy-reset-gpios : Should specify the gpio for phy reset
+- phy-reset-interval : Reset interval time in milliseconds.  Should present
+  only if property "phy-reset-gpios" is available.  When "phy-reset-gpios"
+  is available, missing the property will have the interval be 1 millisecond.
 
 Example:
 
diff --git a/drivers/net/ethernet/freescale/fec.c b/drivers/net/ethernet/freescale/fec.c
index 4dce9e3..86ecaae 100644
--- a/drivers/net/ethernet/freescale/fec.c
+++ b/drivers/net/ethernet/freescale/fec.c
@@ -1507,18 +1507,20 @@ static int __devinit fec_get_phy_mode_dt(struct platform_device *pdev)
 static void __devinit fec_reset_phy(struct platform_device *pdev)
 {
 	int err, phy_reset;
+	int msec = 1;
 	struct device_node *np = pdev->dev.of_node;
 
 	if (!np)
 		return;
 
+	of_property_read_u32(np, "phy-reset-interval", &msec);
 	phy_reset = of_get_named_gpio(np, "phy-reset-gpios", 0);
 	err = gpio_request_one(phy_reset, GPIOF_OUT_INIT_LOW, "phy-reset");
 	if (err) {
 		pr_debug("FEC: failed to get gpio phy-reset: %d\n", err);
 		return;
 	}
-	msleep(1);
+	msleep(msec);
 	gpio_set_value(phy_reset, 1);
 }
 #else /* CONFIG_OF */
-- 
1.7.5.4

^ permalink raw reply related

* [PATCH 1/3] net: fec: reset phy after pinctrl setup
From: Shawn Guo @ 2012-06-26  8:45 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, linux-arm-kernel, Shawn Guo
In-Reply-To: <1340700308-8315-1-git-send-email-shawn.guo@linaro.org>

In case that bootloader or platform initialization does not set up
fec pins, the fec_reset_phy will not be able to succeed, because
fec_reset_phy is currently called before devm_pinctrl_get_select_default.
Move fec_reset_phy call to the place between devm_pinctrl_get_select_default
and fec_enet_init to have above case be taken care.

Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
---
 drivers/net/ethernet/freescale/fec.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/freescale/fec.c b/drivers/net/ethernet/freescale/fec.c
index ff7f4c5..e868a37 100644
--- a/drivers/net/ethernet/freescale/fec.c
+++ b/drivers/net/ethernet/freescale/fec.c
@@ -1593,8 +1593,6 @@ fec_probe(struct platform_device *pdev)
 		fep->phy_interface = ret;
 	}
 
-	fec_reset_phy(pdev);
-
 	for (i = 0; i < FEC_IRQ_NUM; i++) {
 		irq = platform_get_irq(pdev, i);
 		if (irq < 0) {
@@ -1634,6 +1632,8 @@ fec_probe(struct platform_device *pdev)
 	clk_prepare_enable(fep->clk_ahb);
 	clk_prepare_enable(fep->clk_ipg);
 
+	fec_reset_phy(pdev);
+
 	ret = fec_enet_init(ndev);
 	if (ret)
 		goto failed_init;
-- 
1.7.5.4

^ permalink raw reply related

* [PATCH 0/3] fec driver updates
From: Shawn Guo @ 2012-06-26  8:45 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, linux-arm-kernel, Shawn Guo

The series makes a few updates on fec driver to make it work better
for device tree boot.

Shawn Guo (3):
  net: fec: reset phy after pinctrl setup
  net: fec: enable regulator for fec phy
  net: fec: add phy-reset-interval for device tree probe

 Documentation/devicetree/bindings/net/fsl-fec.txt |    5 ++++-
 drivers/net/ethernet/freescale/fec.c              |   21 ++++++++++++++++++---
 2 files changed, 22 insertions(+), 4 deletions(-)

-- 
1.7.5.4

^ permalink raw reply

* Re: [net-next patch] bnx2x,bnx2fc,bnx2i,cnic: Add statistics support and FCoE capabilities advertisement
From: Eilon Greenstein @ 2012-06-26  8:44 UTC (permalink / raw)
  To: David Miller; +Cc: meravs, netdev, barak, eddie.wai, mchan, bprakash
In-Reply-To: <20120626.013837.1956509898964121509.davem@davemloft.net>

On Tue, 2012-06-26 at 01:38 -0700, David Miller wrote:
> How many times are you going to post this patch?
> 
> The only difference is this copy is that you removed Eilon
> from the CC: list.  That's really wasteful and confusing.
> 

Sorry about that - the first two attempts did not reach netdev
(semicolon was used instead of a comma as address separator) but
apparently the first person on the email did receive them… Sorry about
the spam.

^ permalink raw reply

* [patch] net: qmi_wwan: simplify a check in qmi_wwan_bind()
From: Dan Carpenter @ 2012-06-26  8:39 UTC (permalink / raw)
  To: netdev; +Cc: Bjørn Mork, David S. Miller

This code is easier to read if we specify which flags we want at the
condition instead of at the top of the function.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Bjørn Mork <bjorn@mork.no>

diff --git a/drivers/net/usb/qmi_wwan.c b/drivers/net/usb/qmi_wwan.c
index f1e7791..23cb13c 100644
--- a/drivers/net/usb/qmi_wwan.c
+++ b/drivers/net/usb/qmi_wwan.c
@@ -129,7 +129,6 @@ static int qmi_wwan_bind(struct usbnet *dev, struct usb_interface *intf)
 	struct usb_interface_descriptor *desc = &intf->cur_altsetting->desc;
 	struct usb_cdc_union_desc *cdc_union = NULL;
 	struct usb_cdc_ether_desc *cdc_ether = NULL;
-	u32 required = 1 << USB_CDC_HEADER_TYPE | 1 << USB_CDC_UNION_TYPE;
 	u32 found = 0;
 	struct usb_driver *driver = driver_of(intf);
 	struct qmi_wwan_state *info = (void *)&dev->data;
@@ -197,7 +196,8 @@ next_desc:
 	}
 
 	/* did we find all the required ones? */
-	if ((found & required) != required) {
+	if (!(found & (1 << USB_CDC_HEADER_TYPE)) ||
+	    !(found & (1 << USB_CDC_UNION_TYPE))) {
 		dev_err(&intf->dev, "CDC functional descriptors missing\n");
 		goto err;
 	}

^ permalink raw reply related

* Re: [net-next patch] bnx2x,bnx2fc,bnx2i,cnic: Add statistics support and FCoE capabilities advertisement
From: David Miller @ 2012-06-26  8:38 UTC (permalink / raw)
  To: meravs; +Cc: netdev, eilong, barak, eddie.wai, mchan, bprakash
In-Reply-To: <1340710279-18241-1-git-send-email-meravs@broadcom.com>


How many times are you going to post this patch?

The only difference is this copy is that you removed Eilon
from the CC: list.  That's really wasteful and confusing.

^ permalink raw reply

* Re: [PATCH] ipv4: Remove unnecessary code from rt_check_expire().
From: David Miller @ 2012-06-26  8:37 UTC (permalink / raw)
  To: eric.dumazet; +Cc: netdev
In-Reply-To: <1340698984.10893.248.camel@edumazet-glaptop>

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Tue, 26 Jun 2012 10:23:04 +0200

> Thats because gc_interval (60) is big compared to ip_rt_gc_timeout
> (300)
> 
> So each time rt_check_expire() triggers, we handle a big part of the
> cache. On big servers I had to lower gc_interval to smooth things.

I know it's stupid, that's why I want to eventually kill this off
completely.

> Garbage collect is needed to not waste kernel memory, even on legitimate
> traffic on a typical web server.
> 
> Taken from my 8GB machine : 
> 
> # cat /proc/sys/net/ipv4/route/gc_thresh
> 262144
> 
> 320 bytes per dst : 262144*320 = 83886080 bytes to store one dst per hash chain.

83MB on a machine with 8GB of ram that does enough networking to fill
the routing cache.  This sounds absolutely reasonable to me.

> Also, why keeping a dst in cache if no traffic uses it in a 5 minutes period ?

Traffic is bursty and periodic.

And think, we don't do any stupidity like this for the inetpeer cache
and no small cute animals have died as a result.

^ permalink raw reply

* [PATCH net-next] bnx2x: organize BDs calculation for stop/resume
From: Dmitry Kravkov @ 2012-06-26  8:32 UTC (permalink / raw)
  To: davem, netdev; +Cc: Dmitry Kravkov, Eilon Greenstein, Eric Dumazet

Put the numbers used for stop/resume queue in a single place and
fix the condition for sanity check.

Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Cc: Eric Dumazet <edumazet@google.com>
---
 drivers/net/ethernet/broadcom/bnx2x/bnx2x.h     |   16 ++++++++++++++++
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c |   10 ++++++----
 2 files changed, 22 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x.h b/drivers/net/ethernet/broadcom/bnx2x/bnx2x.h
index 7211cb0..392fffe 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x.h
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x.h
@@ -612,6 +612,22 @@ struct bnx2x_fastpath {
 #define TX_BD(x)		((x) & MAX_TX_BD)
 #define TX_BD_POFF(x)		((x) & MAX_TX_DESC_CNT)
 
+/* number of NEXT_PAGE descriptors may be required during placement */
+#define NEXT_CNT_PER_TX_PKT(bds)	\
+				(((bds) + MAX_TX_DESC_CNT - 1) / \
+				 MAX_TX_DESC_CNT * NEXT_PAGE_TX_DESC_CNT)
+/* max BDs per tx packet w/o next_pages:
+ * START_BD		- describes packed
+ * START_BD(splitted)	- includes unpaged data segment for GSO
+ * PARSING_BD		- for TSO and CSUM data
+ * Frag BDs		- decribes pages for frags
+ */
+#define BDS_PER_TX_PKT		3
+#define MAX_BDS_PER_TX_PKT	(MAX_SKB_FRAGS + BDS_PER_TX_PKT)
+/* max BDs per tx packet including next pages */
+#define MAX_DESC_PER_TX_PKT	(MAX_BDS_PER_TX_PKT + \
+				 NEXT_CNT_PER_TX_PKT(MAX_BDS_PER_TX_PKT))
+
 /* The RX BD ring is special, each bd is 8 bytes but the last one is 16 */
 #define NUM_RX_RINGS		8
 #define RX_DESC_CNT		(BCM_PAGE_SIZE / sizeof(struct eth_rx_bd))
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
index 00951b3..e357a5d 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
@@ -221,7 +221,7 @@ int bnx2x_tx_int(struct bnx2x *bp, struct bnx2x_fp_txdata *txdata)
 
 		if ((netif_tx_queue_stopped(txq)) &&
 		    (bp->state == BNX2X_STATE_OPEN) &&
-		    (bnx2x_tx_avail(bp, txdata) >= MAX_SKB_FRAGS + 4))
+		    (bnx2x_tx_avail(bp, txdata) >= MAX_DESC_PER_TX_PKT))
 			netif_tx_wake_queue(txq);
 
 		__netif_tx_unlock(txq);
@@ -2937,7 +2937,9 @@ netdev_tx_t bnx2x_start_xmit(struct sk_buff *skb, struct net_device *dev)
 	   txdata->cid, fp_index, txdata_index, txdata, fp); */
 
 	if (unlikely(bnx2x_tx_avail(bp, txdata) <
-		     (skb_shinfo(skb)->nr_frags + 3))) {
+			skb_shinfo(skb)->nr_frags +
+			BDS_PER_TX_PKT +
+			NEXT_CNT_PER_TX_PKT(MAX_BDS_PER_TX_PKT))) {
 		bnx2x_fp_qstats(bp, txdata->parent_fp)->driver_xoff++;
 		netif_tx_stop_queue(txq);
 		BNX2X_ERR("BUG! Tx ring full when queue awake!\n");
@@ -3212,7 +3214,7 @@ netdev_tx_t bnx2x_start_xmit(struct sk_buff *skb, struct net_device *dev)
 
 	txdata->tx_bd_prod += nbd;
 
-	if (unlikely(bnx2x_tx_avail(bp, txdata) < MAX_SKB_FRAGS + 4)) {
+	if (unlikely(bnx2x_tx_avail(bp, txdata) < MAX_DESC_PER_TX_PKT)) {
 		netif_tx_stop_queue(txq);
 
 		/* paired memory barrier is in bnx2x_tx_int(), we have to keep
@@ -3221,7 +3223,7 @@ netdev_tx_t bnx2x_start_xmit(struct sk_buff *skb, struct net_device *dev)
 		smp_mb();
 
 		bnx2x_fp_qstats(bp, txdata->parent_fp)->driver_xoff++;
-		if (bnx2x_tx_avail(bp, txdata) >= MAX_SKB_FRAGS + 4)
+		if (bnx2x_tx_avail(bp, txdata) >= MAX_DESC_PER_TX_PKT)
 			netif_tx_wake_queue(txq);
 	}
 	txdata->tx_pkt++;
-- 
1.7.1

^ permalink raw reply related

* [net-next patch] bnx2x,bnx2fc,bnx2i,cnic: Add statistics support and FCoE capabilities advertisement
From: Merav Sicron @ 2012-06-26 11:31 UTC (permalink / raw)
  To: davem, netdev, eilong
  Cc: Merav Sicron, Barak Witkowski, Eddie Wai, Michael Chan,
	Bhanu Prakash Gollapudi

From: Barak Witkowski <barak@broadcom.com>

1. When FCoE offload driver is registered, copy its capabilities to the chip
   scratchpad.
2. Copy FCoE/iSCSI MAC addresses in aligned manner to chip scratchpad.
3. Add FCoE/iSCSI statistics collection support

Signed-off-by: Barak Witkowski <barak@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: Eddie Wai <eddie.wai@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: Bhanu Prakash Gollapudi <bprakash@broadcom.com>
---
Hi Dave,

Please consider applying this patch to net-next.

Thanks,
Merav

 drivers/net/ethernet/broadcom/bnx2x/bnx2x.h        |    2 +
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_hsi.h    |  120 +--------------
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c   |   41 ++++-
 .../net/ethernet/broadcom/bnx2x/bnx2x_mfw_req.h    |  168 ++++++++++++++++++++
 drivers/net/ethernet/broadcom/cnic.c               |   11 +-
 drivers/net/ethernet/broadcom/cnic_if.h            |    9 +
 drivers/scsi/bnx2fc/bnx2fc.h                       |    4 +
 drivers/scsi/bnx2fc/bnx2fc_fcoe.c                  |   44 +++++
 drivers/scsi/bnx2i/57xx_iscsi_hsi.h                |   16 ++-
 drivers/scsi/bnx2i/bnx2i.h                         |   58 +++++++
 drivers/scsi/bnx2i/bnx2i_hwi.c                     |   35 +++-
 drivers/scsi/bnx2i/bnx2i_init.c                    |   40 +++++
 drivers/scsi/bnx2i/bnx2i_iscsi.c                   |   11 ++
 13 files changed, 420 insertions(+), 139 deletions(-)
 create mode 100644 drivers/net/ethernet/broadcom/bnx2x/bnx2x_mfw_req.h

diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x.h b/drivers/net/ethernet/broadcom/bnx2x/bnx2x.h
index 1f2e03a..27a4dd9 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x.h
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x.h
@@ -51,6 +51,7 @@
 
 #include "bnx2x_reg.h"
 #include "bnx2x_fw_defs.h"
+#include "bnx2x_mfw_req.h"
 #include "bnx2x_hsi.h"
 #include "bnx2x_link.h"
 #include "bnx2x_sp.h"
@@ -1332,6 +1333,7 @@ struct bnx2x {
 #define NO_ISCSI_FLAG			(1 << 14)
 #define NO_FCOE_FLAG			(1 << 15)
 #define BC_SUPPORTS_PFC_STATS		(1 << 17)
+#define BC_SUPPORTS_FCOE_FEATURES	(1 << 19)
 #define USING_SINGLE_MSIX_FLAG		(1 << 20)
 #define BC_SUPPORTS_DCBX_MSG_NON_PMF	(1 << 21)
 
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_hsi.h b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_hsi.h
index 6b77630..b4837c8 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_hsi.h
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_hsi.h
@@ -10,6 +10,7 @@
 #define BNX2X_HSI_H
 
 #include "bnx2x_fw_defs.h"
+#include "bnx2x_mfw_req.h"
 
 #define FW_ENCODE_32BIT_PATTERN         0x1e1e1e1e
 
@@ -33,12 +34,6 @@ struct license_key {
 	u32 reserved_b[4];
 };
 
-
-#define PORT_0			0
-#define PORT_1			1
-#define PORT_MAX		2
-#define NVM_PATH_MAX		2
-
 /****************************************************************************
  * Shared HW configuration                                                  *
  ****************************************************************************/
@@ -1250,6 +1245,7 @@ struct drv_func_mb {
 	#define REQ_BC_VER_4_VRFY_AFEX_SUPPORTED        0x00070002
 	#define REQ_BC_VER_4_SFP_TX_DISABLE_SUPPORTED   0x00070014
 	#define REQ_BC_VER_4_PFC_STATS_SUPPORTED        0x00070201
+	#define REQ_BC_VER_4_FCOE_FEATURES              0x00070209
 
 	#define DRV_MSG_CODE_DCBX_ADMIN_PMF_MSG         0xb0000000
 	#define DRV_MSG_CODE_DCBX_PMF_DRV_OK            0xb2000000
@@ -2698,118 +2694,6 @@ struct host_func_stats {
 /* VIC definitions */
 #define VICSTATST_UIF_INDEX 2
 
-/* current drv_info version */
-#define DRV_INFO_CUR_VER 1
-
-/* drv_info op codes supported */
-enum drv_info_opcode {
-	ETH_STATS_OPCODE,
-	FCOE_STATS_OPCODE,
-	ISCSI_STATS_OPCODE
-};
-
-#define ETH_STAT_INFO_VERSION_LEN	12
-/*  Per PCI Function Ethernet Statistics required from the driver */
-struct eth_stats_info {
-	/* Function's Driver Version. padded to 12 */
-	u8 version[ETH_STAT_INFO_VERSION_LEN];
-	/* Locally Admin Addr. BigEndian EIU48. Actual size is 6 bytes */
-	u8 mac_local[8];
-	u8 mac_add1[8];		/* Additional Programmed MAC Addr 1. */
-	u8 mac_add2[8];		/* Additional Programmed MAC Addr 2. */
-	u32 mtu_size;		/* MTU Size. Note   : Negotiated MTU */
-	u32 feature_flags;	/* Feature_Flags. */
-#define FEATURE_ETH_CHKSUM_OFFLOAD_MASK		0x01
-#define FEATURE_ETH_LSO_MASK			0x02
-#define FEATURE_ETH_BOOTMODE_MASK		0x1C
-#define FEATURE_ETH_BOOTMODE_SHIFT		2
-#define FEATURE_ETH_BOOTMODE_NONE		(0x0 << 2)
-#define FEATURE_ETH_BOOTMODE_PXE		(0x1 << 2)
-#define FEATURE_ETH_BOOTMODE_ISCSI		(0x2 << 2)
-#define FEATURE_ETH_BOOTMODE_FCOE		(0x3 << 2)
-#define FEATURE_ETH_TOE_MASK			0x20
-	u32 lso_max_size;	/* LSO MaxOffloadSize. */
-	u32 lso_min_seg_cnt;	/* LSO MinSegmentCount. */
-	/* Num Offloaded Connections TCP_IPv4. */
-	u32 ipv4_ofld_cnt;
-	/* Num Offloaded Connections TCP_IPv6. */
-	u32 ipv6_ofld_cnt;
-	u32 promiscuous_mode;	/* Promiscuous Mode. non-zero true */
-	u32 txq_size;		/* TX Descriptors Queue Size */
-	u32 rxq_size;		/* RX Descriptors Queue Size */
-	/* TX Descriptor Queue Avg Depth. % Avg Queue Depth since last poll */
-	u32 txq_avg_depth;
-	/* RX Descriptors Queue Avg Depth. % Avg Queue Depth since last poll */
-	u32 rxq_avg_depth;
-	/* IOV_Offload. 0=none; 1=MultiQueue, 2=VEB 3= VEPA*/
-	u32 iov_offload;
-	/* Number of NetQueue/VMQ Config'd. */
-	u32 netq_cnt;
-	u32 vf_cnt;		/* Num VF assigned to this PF. */
-};
-
-/*  Per PCI Function FCOE Statistics required from the driver */
-struct fcoe_stats_info {
-	u8 version[12];		/* Function's Driver Version. */
-	u8 mac_local[8];	/* Locally Admin Addr. */
-	u8 mac_add1[8];		/* Additional Programmed MAC Addr 1. */
-	u8 mac_add2[8];		/* Additional Programmed MAC Addr 2. */
-	/* QoS Priority (per 802.1p). 0-7255 */
-	u32 qos_priority;
-	u32 txq_size;		/* FCoE TX Descriptors Queue Size. */
-	u32 rxq_size;		/* FCoE RX Descriptors Queue Size. */
-	/* FCoE TX Descriptor Queue Avg Depth. */
-	u32 txq_avg_depth;
-	/* FCoE RX Descriptors Queue Avg Depth. */
-	u32 rxq_avg_depth;
-	u32 rx_frames_lo;	/* FCoE RX Frames received. */
-	u32 rx_frames_hi;	/* FCoE RX Frames received. */
-	u32 rx_bytes_lo;	/* FCoE RX Bytes received. */
-	u32 rx_bytes_hi;	/* FCoE RX Bytes received. */
-	u32 tx_frames_lo;	/* FCoE TX Frames sent. */
-	u32 tx_frames_hi;	/* FCoE TX Frames sent. */
-	u32 tx_bytes_lo;	/* FCoE TX Bytes sent. */
-	u32 tx_bytes_hi;	/* FCoE TX Bytes sent. */
-};
-
-/* Per PCI  Function iSCSI Statistics required from the driver*/
-struct iscsi_stats_info {
-	u8 version[12];		/* Function's Driver Version. */
-	u8 mac_local[8];	/* Locally Admin iSCSI MAC Addr. */
-	u8 mac_add1[8];		/* Additional Programmed MAC Addr 1. */
-	/* QoS Priority (per 802.1p). 0-7255 */
-	u32 qos_priority;
-	u8 initiator_name[64];	/* iSCSI Boot Initiator Node name. */
-	u8 ww_port_name[64];	/* iSCSI World wide port name */
-	u8 boot_target_name[64];/* iSCSI Boot Target Name. */
-	u8 boot_target_ip[16];	/* iSCSI Boot Target IP. */
-	u32 boot_target_portal;	/* iSCSI Boot Target Portal. */
-	u8 boot_init_ip[16];	/* iSCSI Boot Initiator IP Address. */
-	u32 max_frame_size;	/* Max Frame Size. bytes */
-	u32 txq_size;		/* PDU TX Descriptors Queue Size. */
-	u32 rxq_size;		/* PDU RX Descriptors Queue Size. */
-	u32 txq_avg_depth;	/* PDU TX Descriptor Queue Avg Depth. */
-	u32 rxq_avg_depth;	/* PDU RX Descriptors Queue Avg Depth. */
-	u32 rx_pdus_lo;		/* iSCSI PDUs received. */
-	u32 rx_pdus_hi;		/* iSCSI PDUs received. */
-	u32 rx_bytes_lo;	/* iSCSI RX Bytes received. */
-	u32 rx_bytes_hi;	/* iSCSI RX Bytes received. */
-	u32 tx_pdus_lo;		/* iSCSI PDUs sent. */
-	u32 tx_pdus_hi;		/* iSCSI PDUs sent. */
-	u32 tx_bytes_lo;	/* iSCSI PDU TX Bytes sent. */
-	u32 tx_bytes_hi;	/* iSCSI PDU TX Bytes sent. */
-	u32 pcp_prior_map_tbl;	/* C-PCP to S-PCP Priority MapTable.
-				 * 9 nibbles, the position of each nibble
-				 * represents the C-PCP value, the value
-				 * of the nibble = S-PCP value.
-				 */
-};
-
-union drv_info_to_mcp {
-	struct eth_stats_info	ether_stat;
-	struct fcoe_stats_info	fcoe_stat;
-	struct iscsi_stats_info	iscsi_stat;
-};
 
 /* stats collected for afex.
  * NOTE: structure is exactly as expected to be received by the switch.
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
index 26a8296..eb3e55a 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
@@ -74,6 +74,8 @@
 #define FW_FILE_NAME_E1H	"bnx2x/bnx2x-e1h-" FW_FILE_VERSION ".fw"
 #define FW_FILE_NAME_E2		"bnx2x/bnx2x-e2-" FW_FILE_VERSION ".fw"
 
+#define MAC_LEADING_ZERO_CNT (ALIGN(ETH_ALEN, sizeof(u32)) - ETH_ALEN)
+
 /* Time in jiffies before concluding the transmitter is hung */
 #define TX_TIMEOUT		(5*HZ)
 
@@ -3060,7 +3062,8 @@ static void bnx2x_drv_info_fcoe_stat(struct bnx2x *bp)
 	struct fcoe_stats_info *fcoe_stat =
 		&bp->slowpath->drv_info_to_mcp.fcoe_stat;
 
-	memcpy(fcoe_stat->mac_local, bp->fip_mac, ETH_ALEN);
+	memcpy(fcoe_stat->mac_local + MAC_LEADING_ZERO_CNT,
+	       bp->fip_mac, ETH_ALEN);
 
 	fcoe_stat->qos_priority =
 		app->traffic_type_priority[LLFC_TRAFFIC_TYPE_FCOE];
@@ -3151,7 +3154,8 @@ static void bnx2x_drv_info_iscsi_stat(struct bnx2x *bp)
 	struct iscsi_stats_info *iscsi_stat =
 		&bp->slowpath->drv_info_to_mcp.iscsi_stat;
 
-	memcpy(iscsi_stat->mac_local, bp->cnic_eth_dev.iscsi_mac, ETH_ALEN);
+	memcpy(iscsi_stat->mac_local + MAC_LEADING_ZERO_CNT,
+	       bp->cnic_eth_dev.iscsi_mac, ETH_ALEN);
 
 	iscsi_stat->qos_priority =
 		app->traffic_type_priority[LLFC_TRAFFIC_TYPE_ISCSI];
@@ -9732,6 +9736,9 @@ static void __devinit bnx2x_get_common_hwinfo(struct bnx2x *bp)
 	bp->flags |= (val >= REQ_BC_VER_4_PFC_STATS_SUPPORTED) ?
 			BC_SUPPORTS_PFC_STATS : 0;
 
+	bp->flags |= (val >= REQ_BC_VER_4_FCOE_FEATURES) ?
+			BC_SUPPORTS_FCOE_FEATURES : 0;
+
 	bp->flags |= (val >= REQ_BC_VER_4_DCBX_ADMIN_MSG_NON_PMF) ?
 			BC_SUPPORTS_DCBX_MSG_NON_PMF : 0;
 	boot_mode = SHMEM_RD(bp,
@@ -12548,21 +12555,45 @@ static int bnx2x_drv_ctl(struct net_device *dev, struct drv_ctl_info *ctl)
 		break;
 	}
 	case DRV_CTL_ULP_REGISTER_CMD: {
-		int ulp_type = ctl->data.ulp_type;
+		int ulp_type = ctl->data.register_data.ulp_type;
 
 		if (CHIP_IS_E3(bp)) {
 			int idx = BP_FW_MB_IDX(bp);
-			u32 cap;
+			u32 cap = SHMEM2_RD(bp, drv_capabilities_flag[idx]);
+			int path = BP_PATH(bp);
+			int port = BP_PORT(bp);
+			int i;
+			u32 scratch_offset;
+			u32 *host_addr;
 
-			cap = SHMEM2_RD(bp, drv_capabilities_flag[idx]);
+			/* first write capability to shmem2 */
 			if (ulp_type == CNIC_ULP_ISCSI)
 				cap |= DRV_FLAGS_CAPABILITIES_LOADED_ISCSI;
 			else if (ulp_type == CNIC_ULP_FCOE)
 				cap |= DRV_FLAGS_CAPABILITIES_LOADED_FCOE;
 			SHMEM2_WR(bp, drv_capabilities_flag[idx], cap);
+
+			if ((ulp_type != CNIC_ULP_FCOE) ||
+			    (!SHMEM2_HAS(bp, ncsi_oem_data_addr)) ||
+			    (!(bp->flags &  BC_SUPPORTS_FCOE_FEATURES)))
+				break;
+
+			/* if reached here - should write fcoe capabilities */
+			scratch_offset = SHMEM2_RD(bp, ncsi_oem_data_addr);
+			if (!scratch_offset)
+				break;
+			scratch_offset += offsetof(struct glob_ncsi_oem_data,
+						   fcoe_features[path][port]);
+			host_addr = (u32 *) &(ctl->data.register_data.
+					      fcoe_features);
+			for (i = 0; i < sizeof(struct fcoe_capabilities);
+			     i += 4)
+				REG_WR(bp, scratch_offset + i,
+				       *(host_addr + i/4));
 		}
 		break;
 	}
+
 	case DRV_CTL_ULP_UNREGISTER_CMD: {
 		int ulp_type = ctl->data.ulp_type;
 
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_mfw_req.h b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_mfw_req.h
new file mode 100644
index 0000000..ddd5106
--- /dev/null
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_mfw_req.h
@@ -0,0 +1,168 @@
+/* bnx2x_mfw_req.h: Broadcom Everest network driver.
+ *
+ * Copyright (c) 2012 Broadcom Corporation
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation.
+ */
+
+#ifndef BNX2X_MFW_REQ_H
+#define BNX2X_MFW_REQ_H
+
+#define PORT_0			0
+#define PORT_1			1
+#define PORT_MAX		2
+#define NVM_PATH_MAX		2
+
+/* FCoE capabilities required from the driver */
+struct fcoe_capabilities {
+	u32 capability1;
+	/* Maximum number of I/Os per connection */
+	#define FCOE_IOS_PER_CONNECTION_MASK    0x0000ffff
+	#define FCOE_IOS_PER_CONNECTION_SHIFT   0
+	/* Maximum number of Logins per port */
+	#define FCOE_LOGINS_PER_PORT_MASK       0xffff0000
+	#define FCOE_LOGINS_PER_PORT_SHIFT   16
+
+	u32 capability2;
+	/* Maximum number of exchanges */
+	#define FCOE_NUMBER_OF_EXCHANGES_MASK   0x0000ffff
+	#define FCOE_NUMBER_OF_EXCHANGES_SHIFT  0
+	/* Maximum NPIV WWN per port */
+	#define FCOE_NPIV_WWN_PER_PORT_MASK     0xffff0000
+	#define FCOE_NPIV_WWN_PER_PORT_SHIFT    16
+
+	u32 capability3;
+	/* Maximum number of targets supported */
+	#define FCOE_TARGETS_SUPPORTED_MASK     0x0000ffff
+	#define FCOE_TARGETS_SUPPORTED_SHIFT    0
+	/* Maximum number of outstanding commands across all connections */
+	#define FCOE_OUTSTANDING_COMMANDS_MASK  0xffff0000
+	#define FCOE_OUTSTANDING_COMMANDS_SHIFT 16
+
+	u32 capability4;
+	#define FCOE_CAPABILITY4_STATEFUL			0x00000001
+	#define FCOE_CAPABILITY4_STATELESS			0x00000002
+	#define FCOE_CAPABILITY4_CAPABILITIES_REPORTED_VALID	0x00000004
+};
+
+struct glob_ncsi_oem_data {
+	u32 driver_version;
+	u32 unused[3];
+	struct fcoe_capabilities fcoe_features[NVM_PATH_MAX][PORT_MAX];
+};
+
+/* current drv_info version */
+#define DRV_INFO_CUR_VER 2
+
+/* drv_info op codes supported */
+enum drv_info_opcode {
+	ETH_STATS_OPCODE,
+	FCOE_STATS_OPCODE,
+	ISCSI_STATS_OPCODE
+};
+
+#define ETH_STAT_INFO_VERSION_LEN	12
+/*  Per PCI Function Ethernet Statistics required from the driver */
+struct eth_stats_info {
+	/* Function's Driver Version. padded to 12 */
+	u8 version[ETH_STAT_INFO_VERSION_LEN];
+	/* Locally Admin Addr. BigEndian EIU48. Actual size is 6 bytes */
+	u8 mac_local[8];
+	u8 mac_add1[8];		/* Additional Programmed MAC Addr 1. */
+	u8 mac_add2[8];		/* Additional Programmed MAC Addr 2. */
+	u32 mtu_size;		/* MTU Size. Note   : Negotiated MTU */
+	u32 feature_flags;	/* Feature_Flags. */
+#define FEATURE_ETH_CHKSUM_OFFLOAD_MASK		0x01
+#define FEATURE_ETH_LSO_MASK			0x02
+#define FEATURE_ETH_BOOTMODE_MASK		0x1C
+#define FEATURE_ETH_BOOTMODE_SHIFT		2
+#define FEATURE_ETH_BOOTMODE_NONE		(0x0 << 2)
+#define FEATURE_ETH_BOOTMODE_PXE		(0x1 << 2)
+#define FEATURE_ETH_BOOTMODE_ISCSI		(0x2 << 2)
+#define FEATURE_ETH_BOOTMODE_FCOE		(0x3 << 2)
+#define FEATURE_ETH_TOE_MASK			0x20
+	u32 lso_max_size;	/* LSO MaxOffloadSize. */
+	u32 lso_min_seg_cnt;	/* LSO MinSegmentCount. */
+	/* Num Offloaded Connections TCP_IPv4. */
+	u32 ipv4_ofld_cnt;
+	/* Num Offloaded Connections TCP_IPv6. */
+	u32 ipv6_ofld_cnt;
+	u32 promiscuous_mode;	/* Promiscuous Mode. non-zero true */
+	u32 txq_size;		/* TX Descriptors Queue Size */
+	u32 rxq_size;		/* RX Descriptors Queue Size */
+	/* TX Descriptor Queue Avg Depth. % Avg Queue Depth since last poll */
+	u32 txq_avg_depth;
+	/* RX Descriptors Queue Avg Depth. % Avg Queue Depth since last poll */
+	u32 rxq_avg_depth;
+	/* IOV_Offload. 0=none; 1=MultiQueue, 2=VEB 3= VEPA*/
+	u32 iov_offload;
+	/* Number of NetQueue/VMQ Config'd. */
+	u32 netq_cnt;
+	u32 vf_cnt;		/* Num VF assigned to this PF. */
+};
+
+/*  Per PCI Function FCOE Statistics required from the driver */
+struct fcoe_stats_info {
+	u8 version[12];		/* Function's Driver Version. */
+	u8 mac_local[8];	/* Locally Admin Addr. */
+	u8 mac_add1[8];		/* Additional Programmed MAC Addr 1. */
+	u8 mac_add2[8];		/* Additional Programmed MAC Addr 2. */
+	/* QoS Priority (per 802.1p). 0-7255 */
+	u32 qos_priority;
+	u32 txq_size;		/* FCoE TX Descriptors Queue Size. */
+	u32 rxq_size;		/* FCoE RX Descriptors Queue Size. */
+	/* FCoE TX Descriptor Queue Avg Depth. */
+	u32 txq_avg_depth;
+	/* FCoE RX Descriptors Queue Avg Depth. */
+	u32 rxq_avg_depth;
+	u32 rx_frames_lo;	/* FCoE RX Frames received. */
+	u32 rx_frames_hi;	/* FCoE RX Frames received. */
+	u32 rx_bytes_lo;	/* FCoE RX Bytes received. */
+	u32 rx_bytes_hi;	/* FCoE RX Bytes received. */
+	u32 tx_frames_lo;	/* FCoE TX Frames sent. */
+	u32 tx_frames_hi;	/* FCoE TX Frames sent. */
+	u32 tx_bytes_lo;	/* FCoE TX Bytes sent. */
+	u32 tx_bytes_hi;	/* FCoE TX Bytes sent. */
+};
+
+/* Per PCI  Function iSCSI Statistics required from the driver*/
+struct iscsi_stats_info {
+	u8 version[12];		/* Function's Driver Version. */
+	u8 mac_local[8];	/* Locally Admin iSCSI MAC Addr. */
+	u8 mac_add1[8];		/* Additional Programmed MAC Addr 1. */
+	/* QoS Priority (per 802.1p). 0-7255 */
+	u32 qos_priority;
+	u8 initiator_name[64];	/* iSCSI Boot Initiator Node name. */
+	u8 ww_port_name[64];	/* iSCSI World wide port name */
+	u8 boot_target_name[64];/* iSCSI Boot Target Name. */
+	u8 boot_target_ip[16];	/* iSCSI Boot Target IP. */
+	u32 boot_target_portal;	/* iSCSI Boot Target Portal. */
+	u8 boot_init_ip[16];	/* iSCSI Boot Initiator IP Address. */
+	u32 max_frame_size;	/* Max Frame Size. bytes */
+	u32 txq_size;		/* PDU TX Descriptors Queue Size. */
+	u32 rxq_size;		/* PDU RX Descriptors Queue Size. */
+	u32 txq_avg_depth;	/* PDU TX Descriptor Queue Avg Depth. */
+	u32 rxq_avg_depth;	/* PDU RX Descriptors Queue Avg Depth. */
+	u32 rx_pdus_lo;		/* iSCSI PDUs received. */
+	u32 rx_pdus_hi;		/* iSCSI PDUs received. */
+	u32 rx_bytes_lo;	/* iSCSI RX Bytes received. */
+	u32 rx_bytes_hi;	/* iSCSI RX Bytes received. */
+	u32 tx_pdus_lo;		/* iSCSI PDUs sent. */
+	u32 tx_pdus_hi;		/* iSCSI PDUs sent. */
+	u32 tx_bytes_lo;	/* iSCSI PDU TX Bytes sent. */
+	u32 tx_bytes_hi;	/* iSCSI PDU TX Bytes sent. */
+	u32 pcp_prior_map_tbl;	/* C-PCP to S-PCP Priority MapTable.
+				 * 9 nibbles, the position of each nibble
+				 * represents the C-PCP value, the value
+				 * of the nibble = S-PCP value.
+				 */
+};
+
+union drv_info_to_mcp {
+	struct eth_stats_info	ether_stat;
+	struct fcoe_stats_info	fcoe_stat;
+	struct iscsi_stats_info	iscsi_stat;
+};
+#endif /* BNX2X_MFW_REQ_H */
diff --git a/drivers/net/ethernet/broadcom/cnic.c b/drivers/net/ethernet/broadcom/cnic.c
index 65e66ca..0e9be2b 100644
--- a/drivers/net/ethernet/broadcom/cnic.c
+++ b/drivers/net/ethernet/broadcom/cnic.c
@@ -256,11 +256,16 @@ static void cnic_ulp_ctl(struct cnic_dev *dev, int ulp_type, bool reg)
 	struct cnic_local *cp = dev->cnic_priv;
 	struct cnic_eth_dev *ethdev = cp->ethdev;
 	struct drv_ctl_info info;
+	struct fcoe_capabilities *fcoe_cap =
+		&info.data.register_data.fcoe_features;
 
-	if (reg)
+	if (reg) {
 		info.cmd = DRV_CTL_ULP_REGISTER_CMD;
-	else
+		if (ulp_type == CNIC_ULP_FCOE && dev->fcoe_cap)
+			memcpy(fcoe_cap, dev->fcoe_cap, sizeof(*fcoe_cap));
+	} else {
 		info.cmd = DRV_CTL_ULP_UNREGISTER_CMD;
+	}
 
 	info.data.ulp_type = ulp_type;
 	ethdev->drv_ctl(dev->netdev, &info);
@@ -611,6 +616,8 @@ static int cnic_unregister_device(struct cnic_dev *dev, int ulp_type)
 
 	if (ulp_type == CNIC_ULP_ISCSI)
 		cnic_send_nlmsg(cp, ISCSI_KEVENT_IF_DOWN, NULL);
+	else if (ulp_type == CNIC_ULP_FCOE)
+		dev->fcoe_cap = NULL;
 
 	synchronize_rcu();
 
diff --git a/drivers/net/ethernet/broadcom/cnic_if.h b/drivers/net/ethernet/broadcom/cnic_if.h
index 289274e..d63d455 100644
--- a/drivers/net/ethernet/broadcom/cnic_if.h
+++ b/drivers/net/ethernet/broadcom/cnic_if.h
@@ -12,6 +12,8 @@
 #ifndef CNIC_IF_H
 #define CNIC_IF_H
 
+#include "bnx2x/bnx2x_mfw_req.h"
+
 #define CNIC_MODULE_VERSION	"2.5.10"
 #define CNIC_MODULE_RELDATE	"March 21, 2012"
 
@@ -131,6 +133,11 @@ struct drv_ctl_l2_ring {
 	u32		cid;
 };
 
+struct drv_ctl_register_data {
+	int ulp_type;
+	struct fcoe_capabilities fcoe_features;
+};
+
 struct drv_ctl_info {
 	int	cmd;
 	union {
@@ -138,6 +145,7 @@ struct drv_ctl_info {
 		struct drv_ctl_io io;
 		struct drv_ctl_l2_ring ring;
 		int ulp_type;
+		struct drv_ctl_register_data register_data;
 		char bytes[MAX_DRV_CTL_DATA];
 	} data;
 };
@@ -305,6 +313,7 @@ struct cnic_dev {
 	int		max_rdma_conn;
 
 	union drv_info_to_mcp	*stats_addr;
+	struct fcoe_capabilities	*fcoe_cap;
 
 	void		*cnic_priv;
 };
diff --git a/drivers/scsi/bnx2fc/bnx2fc.h b/drivers/scsi/bnx2fc/bnx2fc.h
index 0578fa0..42969e8 100644
--- a/drivers/scsi/bnx2fc/bnx2fc.h
+++ b/drivers/scsi/bnx2fc/bnx2fc.h
@@ -59,6 +59,7 @@
 #include "57xx_hsi_bnx2fc.h"
 #include "bnx2fc_debug.h"
 #include "../../net/ethernet/broadcom/cnic_if.h"
+#include  "../../net/ethernet/broadcom/bnx2x/bnx2x_mfw_req.h"
 #include "bnx2fc_constants.h"
 
 #define BNX2FC_NAME		"bnx2fc"
@@ -84,6 +85,8 @@
 #define BNX2FC_NUM_MAX_SESS	1024
 #define BNX2FC_NUM_MAX_SESS_LOG	(ilog2(BNX2FC_NUM_MAX_SESS))
 
+#define BNX2FC_MAX_NPIV		256
+
 #define BNX2FC_MAX_OUTSTANDING_CMNDS	2048
 #define BNX2FC_CAN_QUEUE		BNX2FC_MAX_OUTSTANDING_CMNDS
 #define BNX2FC_ELSTM_XIDS		BNX2FC_CAN_QUEUE
@@ -206,6 +209,7 @@ struct bnx2fc_hba {
 	struct fcoe_statistics_params *stats_buffer;
 	dma_addr_t stats_buf_dma;
 	struct completion stat_req_done;
+	struct fcoe_capabilities fcoe_cap;
 
 	/*destroy handling */
 	struct timer_list destroy_timer;
diff --git a/drivers/scsi/bnx2fc/bnx2fc_fcoe.c b/drivers/scsi/bnx2fc/bnx2fc_fcoe.c
index f52f668..05fe662 100644
--- a/drivers/scsi/bnx2fc/bnx2fc_fcoe.c
+++ b/drivers/scsi/bnx2fc/bnx2fc_fcoe.c
@@ -1326,6 +1326,7 @@ static void bnx2fc_hba_destroy(struct bnx2fc_hba *hba)
 static struct bnx2fc_hba *bnx2fc_hba_create(struct cnic_dev *cnic)
 {
 	struct bnx2fc_hba *hba;
+	struct fcoe_capabilities *fcoe_cap;
 	int rc;
 
 	hba = kzalloc(sizeof(*hba), GFP_KERNEL);
@@ -1361,6 +1362,21 @@ static struct bnx2fc_hba *bnx2fc_hba_create(struct cnic_dev *cnic)
 		printk(KERN_ERR PFX "em_config:bnx2fc_cmd_mgr_alloc failed\n");
 		goto cmgr_err;
 	}
+	fcoe_cap = &hba->fcoe_cap;
+
+	fcoe_cap->capability1 = BNX2FC_TM_MAX_SQES <<
+					FCOE_IOS_PER_CONNECTION_SHIFT;
+	fcoe_cap->capability1 |= BNX2FC_NUM_MAX_SESS <<
+					FCOE_LOGINS_PER_PORT_SHIFT;
+	fcoe_cap->capability2 = BNX2FC_MAX_OUTSTANDING_CMNDS <<
+					FCOE_NUMBER_OF_EXCHANGES_SHIFT;
+	fcoe_cap->capability2 |= BNX2FC_MAX_NPIV <<
+					FCOE_NPIV_WWN_PER_PORT_SHIFT;
+	fcoe_cap->capability3 = BNX2FC_NUM_MAX_SESS <<
+					FCOE_TARGETS_SUPPORTED_SHIFT;
+	fcoe_cap->capability3 |= BNX2FC_MAX_OUTSTANDING_CMNDS <<
+					FCOE_OUTSTANDING_COMMANDS_SHIFT;
+	fcoe_cap->capability4 = FCOE_CAPABILITY4_STATEFUL;
 
 	init_waitqueue_head(&hba->shutdown_wait);
 	init_waitqueue_head(&hba->destroy_wait);
@@ -1691,6 +1707,32 @@ static void bnx2fc_unbind_pcidev(struct bnx2fc_hba *hba)
 	hba->pcidev = NULL;
 }
 
+/**
+ * bnx2fc_ulp_get_stats - cnic callback to populate FCoE stats
+ *
+ * @handle:    transport handle pointing to adapter struture
+ */
+static int bnx2fc_ulp_get_stats(void *handle)
+{
+	struct bnx2fc_hba *hba = handle;
+	struct cnic_dev *cnic;
+	struct fcoe_stats_info *stats_addr;
+
+	if (!hba)
+		return -EINVAL;
+
+	cnic = hba->cnic;
+	stats_addr = &cnic->stats_addr->fcoe_stat;
+	if (!stats_addr)
+		return -EINVAL;
+
+	strncpy(stats_addr->version, BNX2FC_VERSION,
+		sizeof(stats_addr->version));
+	stats_addr->txq_size = BNX2FC_SQ_WQES_MAX;
+	stats_addr->rxq_size = BNX2FC_CQ_WQES_MAX;
+
+	return 0;
+}
 
 
 /**
@@ -1944,6 +1986,7 @@ static void bnx2fc_ulp_init(struct cnic_dev *dev)
 	adapter_count++;
 	mutex_unlock(&bnx2fc_dev_lock);
 
+	dev->fcoe_cap = &hba->fcoe_cap;
 	clear_bit(BNX2FC_CNIC_REGISTERED, &hba->reg_with_cnic);
 	rc = dev->register_device(dev, CNIC_ULP_FCOE,
 						(void *) hba);
@@ -2643,4 +2686,5 @@ static struct cnic_ulp_ops bnx2fc_cnic_cb = {
 	.cnic_stop		= bnx2fc_ulp_stop,
 	.indicate_kcqes		= bnx2fc_indicate_kcqe,
 	.indicate_netevent	= bnx2fc_indicate_netevent,
+	.cnic_get_stats		= bnx2fc_ulp_get_stats,
 };
diff --git a/drivers/scsi/bnx2i/57xx_iscsi_hsi.h b/drivers/scsi/bnx2i/57xx_iscsi_hsi.h
index dc0a08e..f2db5fe 100644
--- a/drivers/scsi/bnx2i/57xx_iscsi_hsi.h
+++ b/drivers/scsi/bnx2i/57xx_iscsi_hsi.h
@@ -267,7 +267,13 @@ struct bnx2i_cmd_request {
  * task statistics for write response
  */
 struct bnx2i_write_resp_task_stat {
-	u32 num_data_ins;
+#if defined(__BIG_ENDIAN)
+	u16 num_r2ts;
+	u16 num_data_outs;
+#elif defined(__LITTLE_ENDIAN)
+	u16 num_data_outs;
+	u16 num_r2ts;
+#endif
 };
 
 /*
@@ -275,11 +281,11 @@ struct bnx2i_write_resp_task_stat {
  */
 struct bnx2i_read_resp_task_stat {
 #if defined(__BIG_ENDIAN)
-	u16 num_data_outs;
-	u16 num_r2ts;
+	u16 reserved;
+	u16 num_data_ins;
 #elif defined(__LITTLE_ENDIAN)
-	u16 num_r2ts;
-	u16 num_data_outs;
+	u16 num_data_ins;
+	u16 reserved;
 #endif
 };
 
diff --git a/drivers/scsi/bnx2i/bnx2i.h b/drivers/scsi/bnx2i/bnx2i.h
index 0c53c28..2e32614 100644
--- a/drivers/scsi/bnx2i/bnx2i.h
+++ b/drivers/scsi/bnx2i/bnx2i.h
@@ -44,6 +44,8 @@
 #include "57xx_iscsi_hsi.h"
 #include "57xx_iscsi_constants.h"
 
+#include "../../net/ethernet/broadcom/bnx2x/bnx2x_mfw_req.h"
+
 #define BNX2_ISCSI_DRIVER_NAME		"bnx2i"
 
 #define BNX2I_MAX_ADAPTERS		8
@@ -126,6 +128,43 @@
 #define REG_WR(__hba, offset, val)			\
 		writel(val, __hba->regview + offset)
 
+#ifdef CONFIG_32BIT
+#define GET_STATS_64(__hba, dst, field)				\
+	do {							\
+		spin_lock_bh(&__hba->stat_lock);		\
+		dst->field##_lo = __hba->stats.field##_lo;	\
+		dst->field##_hi = __hba->stats.field##_hi;	\
+		spin_unlock_bh(&__hba->stat_lock);		\
+	} while (0)
+
+#define ADD_STATS_64(__hba, field, len)				\
+	do {							\
+		if (spin_trylock(&__hba->stat_lock)) {		\
+			if (__hba->stats.field##_lo + len <	\
+			    __hba->stats.field##_lo)		\
+				__hba->stats.field##_hi++;	\
+			__hba->stats.field##_lo += len;		\
+			spin_unlock(&__hba->stat_lock);		\
+		}						\
+	} while (0)
+
+#else
+#define GET_STATS_64(__hba, dst, field)				\
+	do {							\
+		u64 val, *out;					\
+								\
+		val = __hba->bnx2i_stats.field;			\
+		out = (u64 *)&__hba->stats.field##_lo;		\
+		*out = cpu_to_le64(val);			\
+		out = (u64 *)&dst->field##_lo;			\
+		*out = cpu_to_le64(val);			\
+	} while (0)
+
+#define ADD_STATS_64(__hba, field, len)				\
+	do {							\
+		__hba->bnx2i_stats.field += len;		\
+	} while (0)
+#endif
 
 /**
  * struct generic_pdu_resc - login pdu resource structure
@@ -288,6 +327,15 @@ struct iscsi_cid_queue {
 	struct bnx2i_conn **conn_cid_tbl;
 };
 
+
+struct bnx2i_stats_info {
+	u64 rx_pdus;
+	u64 rx_bytes;
+	u64 tx_pdus;
+	u64 tx_bytes;
+};
+
+
 /**
  * struct bnx2i_hba - bnx2i adapter structure
  *
@@ -341,6 +389,8 @@ struct iscsi_cid_queue {
  * @ctx_ccell_tasks:       captures number of ccells and tasks supported by
  *                         currently offloaded connection, used to decode
  *                         context memory
+ * @stat_lock:		   spin lock used by the statistic collector (32 bit)
+ * @stats:		   local iSCSI statistic collection place holder
  *
  * Adapter Data Structure
  */
@@ -426,6 +476,12 @@ struct bnx2i_hba {
 	u32 num_sess_opened;
 	u32 num_conn_opened;
 	unsigned int ctx_ccell_tasks;
+
+#ifdef CONFIG_32BIT
+	spinlock_t stat_lock;
+#endif
+	struct bnx2i_stats_info bnx2i_stats;
+	struct iscsi_stats_info stats;
 };
 
 
@@ -749,6 +805,8 @@ extern void bnx2i_ulp_init(struct cnic_dev *dev);
 extern void bnx2i_ulp_exit(struct cnic_dev *dev);
 extern void bnx2i_start(void *handle);
 extern void bnx2i_stop(void *handle);
+extern int bnx2i_get_stats(void *handle);
+
 extern struct bnx2i_hba *get_adapter_list_head(void);
 
 struct bnx2i_conn *bnx2i_get_conn_from_id(struct bnx2i_hba *hba,
diff --git a/drivers/scsi/bnx2i/bnx2i_hwi.c b/drivers/scsi/bnx2i/bnx2i_hwi.c
index ece47e5..49e8b05 100644
--- a/drivers/scsi/bnx2i/bnx2i_hwi.c
+++ b/drivers/scsi/bnx2i/bnx2i_hwi.c
@@ -1350,6 +1350,7 @@ int bnx2i_process_scsi_cmd_resp(struct iscsi_session *session,
 				struct cqe *cqe)
 {
 	struct iscsi_conn *conn = bnx2i_conn->cls_conn->dd_data;
+	struct bnx2i_hba *hba = bnx2i_conn->hba;
 	struct bnx2i_cmd_response *resp_cqe;
 	struct bnx2i_cmd *bnx2i_cmd;
 	struct iscsi_task *task;
@@ -1367,16 +1368,26 @@ int bnx2i_process_scsi_cmd_resp(struct iscsi_session *session,
 
 	if (bnx2i_cmd->req.op_attr & ISCSI_CMD_REQUEST_READ) {
 		conn->datain_pdus_cnt +=
-			resp_cqe->task_stat.read_stat.num_data_outs;
+			resp_cqe->task_stat.read_stat.num_data_ins;
 		conn->rxdata_octets +=
 			bnx2i_cmd->req.total_data_transfer_length;
+		ADD_STATS_64(hba, rx_pdus,
+			     resp_cqe->task_stat.read_stat.num_data_ins);
+		ADD_STATS_64(hba, rx_bytes,
+			     bnx2i_cmd->req.total_data_transfer_length);
 	} else {
 		conn->dataout_pdus_cnt +=
-			resp_cqe->task_stat.read_stat.num_data_outs;
+			resp_cqe->task_stat.write_stat.num_data_outs;
 		conn->r2t_pdus_cnt +=
-			resp_cqe->task_stat.read_stat.num_r2ts;
+			resp_cqe->task_stat.write_stat.num_r2ts;
 		conn->txdata_octets +=
 			bnx2i_cmd->req.total_data_transfer_length;
+		ADD_STATS_64(hba, tx_pdus,
+			     resp_cqe->task_stat.write_stat.num_data_outs);
+		ADD_STATS_64(hba, tx_bytes,
+			     bnx2i_cmd->req.total_data_transfer_length);
+		ADD_STATS_64(hba, rx_pdus,
+			     resp_cqe->task_stat.write_stat.num_r2ts);
 	}
 	bnx2i_iscsi_unmap_sg_list(bnx2i_cmd);
 
@@ -1961,6 +1972,7 @@ static int bnx2i_process_new_cqes(struct bnx2i_conn *bnx2i_conn)
 {
 	struct iscsi_conn *conn = bnx2i_conn->cls_conn->dd_data;
 	struct iscsi_session *session = conn->session;
+	struct bnx2i_hba *hba = bnx2i_conn->hba;
 	struct qp_info *qp;
 	struct bnx2i_nop_in_msg *nopin;
 	int tgt_async_msg;
@@ -1973,7 +1985,7 @@ static int bnx2i_process_new_cqes(struct bnx2i_conn *bnx2i_conn)
 
 	if (!qp->cq_virt) {
 		printk(KERN_ALERT "bnx2i (%s): cq resr freed in bh execution!",
-			bnx2i_conn->hba->netdev->name);
+		       hba->netdev->name);
 		goto out;
 	}
 	while (1) {
@@ -1985,9 +1997,9 @@ static int bnx2i_process_new_cqes(struct bnx2i_conn *bnx2i_conn)
 			if (nopin->op_code == ISCSI_OP_NOOP_IN &&
 			    nopin->itt == (u16) RESERVED_ITT) {
 				printk(KERN_ALERT "bnx2i: Unsolicited "
-					"NOP-In detected for suspended "
-					"connection dev=%s!\n",
-					bnx2i_conn->hba->netdev->name);
+				       "NOP-In detected for suspended "
+				       "connection dev=%s!\n",
+				       hba->netdev->name);
 				bnx2i_unsol_pdu_adjust_rq(bnx2i_conn);
 				goto cqe_out;
 			}
@@ -2001,7 +2013,7 @@ static int bnx2i_process_new_cqes(struct bnx2i_conn *bnx2i_conn)
 			/* Run the kthread engine only for data cmds
 			   All other cmds will be completed in this bh! */
 			bnx2i_queue_scsi_cmd_resp(session, bnx2i_conn, nopin);
-			break;
+			goto done;
 		case ISCSI_OP_LOGIN_RSP:
 			bnx2i_process_login_resp(session, bnx2i_conn,
 						 qp->cq_cons_qe);
@@ -2044,11 +2056,15 @@ static int bnx2i_process_new_cqes(struct bnx2i_conn *bnx2i_conn)
 			printk(KERN_ALERT "bnx2i: unknown opcode 0x%x\n",
 					  nopin->op_code);
 		}
+
+		ADD_STATS_64(hba, rx_pdus, 1);
+		ADD_STATS_64(hba, rx_bytes, nopin->data_length);
+done:
 		if (!tgt_async_msg) {
 			if (!atomic_read(&bnx2i_conn->ep->num_active_cmds))
 				printk(KERN_ALERT "bnx2i (%s): no active cmd! "
 				       "op 0x%x\n",
-				       bnx2i_conn->hba->netdev->name,
+				       hba->netdev->name,
 				       nopin->op_code);
 			else
 				atomic_dec(&bnx2i_conn->ep->num_active_cmds);
@@ -2692,6 +2708,7 @@ struct cnic_ulp_ops bnx2i_cnic_cb = {
 	.cm_remote_close = bnx2i_cm_remote_close,
 	.cm_remote_abort = bnx2i_cm_remote_abort,
 	.iscsi_nl_send_msg = bnx2i_send_nl_mesg,
+	.cnic_get_stats = bnx2i_get_stats,
 	.owner = THIS_MODULE
 };
 
diff --git a/drivers/scsi/bnx2i/bnx2i_init.c b/drivers/scsi/bnx2i/bnx2i_init.c
index 8b68167..7729a52 100644
--- a/drivers/scsi/bnx2i/bnx2i_init.c
+++ b/drivers/scsi/bnx2i/bnx2i_init.c
@@ -381,6 +381,46 @@ void bnx2i_ulp_exit(struct cnic_dev *dev)
 
 
 /**
+ * bnx2i_get_stats - Retrieve various statistic from iSCSI offload
+ * @handle:	bnx2i_hba
+ *
+ * function callback exported via bnx2i - cnic driver interface to
+ *      retrieve various iSCSI offload related statistics.
+ */
+int bnx2i_get_stats(void *handle)
+{
+	struct bnx2i_hba *hba = handle;
+	struct iscsi_stats_info *stats;
+
+	if (!hba)
+		return -EINVAL;
+
+	stats = (struct iscsi_stats_info *)hba->cnic->stats_addr;
+
+	if (!stats)
+		return -ENOMEM;
+
+	memcpy(stats->version, DRV_MODULE_VERSION, sizeof(stats->version));
+	memcpy(stats->mac_add1 + 2, hba->cnic->mac_addr, ETH_ALEN);
+
+	stats->max_frame_size = hba->netdev->mtu;
+	stats->txq_size = hba->max_sqes;
+	stats->rxq_size = hba->max_cqes;
+
+	stats->txq_avg_depth = 0;
+	stats->rxq_avg_depth = 0;
+
+	GET_STATS_64(hba, stats, rx_pdus);
+	GET_STATS_64(hba, stats, rx_bytes);
+
+	GET_STATS_64(hba, stats, tx_pdus);
+	GET_STATS_64(hba, stats, tx_bytes);
+
+	return 0;
+}
+
+
+/**
  * bnx2i_percpu_thread_create - Create a receive thread for an
  *				online CPU
  *
diff --git a/drivers/scsi/bnx2i/bnx2i_iscsi.c b/drivers/scsi/bnx2i/bnx2i_iscsi.c
index f8d516b..b40ac01 100644
--- a/drivers/scsi/bnx2i/bnx2i_iscsi.c
+++ b/drivers/scsi/bnx2i/bnx2i_iscsi.c
@@ -874,6 +874,11 @@ struct bnx2i_hba *bnx2i_alloc_hba(struct cnic_dev *cnic)
 		hba->conn_ctx_destroy_tmo = 2 * HZ;
 	}
 
+#ifdef CONFIG_32BIT
+	spin_lock_init(&hba->stat_lock);
+#endif
+	memset(&hba->stats, 0, sizeof(struct iscsi_stats_info));
+
 	if (iscsi_host_add(shost, &hba->pcidev->dev))
 		goto free_dump_mem;
 	return hba;
@@ -1181,12 +1186,18 @@ static int
 bnx2i_mtask_xmit(struct iscsi_conn *conn, struct iscsi_task *task)
 {
 	struct bnx2i_conn *bnx2i_conn = conn->dd_data;
+	struct bnx2i_hba *hba = bnx2i_conn->hba;
 	struct bnx2i_cmd *cmd = task->dd_data;
 
 	memset(bnx2i_conn->gen_pdu.req_buf, 0, ISCSI_DEF_MAX_RECV_SEG_LEN);
 
 	bnx2i_setup_cmd_wqe_template(cmd);
 	bnx2i_conn->gen_pdu.req_buf_size = task->data_count;
+
+	/* Tx PDU/data length count */
+	ADD_STATS_64(hba, tx_pdus, 1);
+	ADD_STATS_64(hba, tx_bytes, task->data_count);
+
 	if (task->data_count) {
 		memcpy(bnx2i_conn->gen_pdu.req_buf, task->data,
 		       task->data_count);
-- 
1.7.0.6

^ permalink raw reply related

* Re: [PATCH] ipv4: Remove unnecessary code from rt_check_expire().
From: Eric Dumazet @ 2012-06-26  8:23 UTC (permalink / raw)
  To: David Miller; +Cc: netdev
In-Reply-To: <20120626.004658.2123525722448546355.davem@davemloft.net>

On Tue, 2012-06-26 at 00:46 -0700, David Miller wrote:

> And for legitimate traffic it's completely the wrong thing to do.
> 
> There is absolutely zero reason to pure valid entries when hash chains
> average length of one.
> 
> I've been monitoring routing cache activity, and it's the height of
> stupidity.  Every 5 minutes we pure, and then they all get regenerated
> again.
> 

Thats because gc_interval (60) is big compared to ip_rt_gc_timeout
(300)

So each time rt_check_expire() triggers, we handle a big part of the
cache. On big servers I had to lower gc_interval to smooth things.

Garbage collect is needed to not waste kernel memory, even on legitimate
traffic on a typical web server.

Taken from my 8GB machine : 

# cat /proc/sys/net/ipv4/route/gc_thresh
262144

320 bytes per dst : 262144*320 = 83886080 bytes to store one dst per hash chain.

Also, why keeping a dst in cache if no traffic uses it in a 5 minutes period ?

^ permalink raw reply

* Re: [PATCH] ipv4: Remove unnecessary code from rt_check_expire().
From: Eric Dumazet @ 2012-06-26  7:58 UTC (permalink / raw)
  To: David Miller; +Cc: netdev
In-Reply-To: <20120626.004658.2123525722448546355.davem@davemloft.net>

On Tue, 2012-06-26 at 00:46 -0700, David Miller wrote:

> But regardless, could you actually answer the question I asked of you?

I did a revert, no more no less.

^ permalink raw reply

* [net] ixgbe: Do not pad FCoE frames as this can cause issues with FCoE DDP
From: Jeff Kirsher @ 2012-06-26  7:54 UTC (permalink / raw)
  To: davem; +Cc: Alexander Duyck, netdev, gospo, sassmann, stable, Jeff Kirsher

From: Alexander Duyck <alexander.h.duyck@intel.com>

FCoE target mode was experiencing issues due to the fact that we were
sending up data frames that were padded to 60 bytes after the DDP logic had
already stripped the frame down to 52 or 56 depending on the use of VLANs.
This was resulting in the FCoE DDP logic having issues since it thought the
frame still had data in it due to the padding.

To resolve this, adding code so that we do not pad FCoE frames prior to
handling them to the stack.

CC: <stable@vger.kernel.org>
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/ixgbe/ixgbe.h      |    4 ++--
 drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c  |    2 +-
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c |   14 ++++++++++----
 3 files changed, 13 insertions(+), 7 deletions(-)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe.h b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
index 3ef3c52..7af291e 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe.h
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
@@ -196,7 +196,7 @@ enum ixgbe_ring_state_t {
 	__IXGBE_HANG_CHECK_ARMED,
 	__IXGBE_RX_RSC_ENABLED,
 	__IXGBE_RX_CSUM_UDP_ZERO_ERR,
-	__IXGBE_RX_FCOE_BUFSZ,
+	__IXGBE_RX_FCOE,
 };
 
 #define check_for_tx_hang(ring) \
@@ -290,7 +290,7 @@ struct ixgbe_ring_feature {
 #if defined(IXGBE_FCOE) && (PAGE_SIZE < 8192)
 static inline unsigned int ixgbe_rx_pg_order(struct ixgbe_ring *ring)
 {
-	return test_bit(__IXGBE_RX_FCOE_BUFSZ, &ring->state) ? 1 : 0;
+	return test_bit(__IXGBE_RX_FCOE, &ring->state) ? 1 : 0;
 }
 #else
 #define ixgbe_rx_pg_order(_ring) 0
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c
index af1a531..c377706 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c
@@ -634,7 +634,7 @@ static int ixgbe_alloc_q_vector(struct ixgbe_adapter *adapter,
 			f = &adapter->ring_feature[RING_F_FCOE];
 			if ((rxr_idx >= f->mask) &&
 			    (rxr_idx < f->mask + f->indices))
-				set_bit(__IXGBE_RX_FCOE_BUFSZ, &ring->state);
+				set_bit(__IXGBE_RX_FCOE, &ring->state);
 		}
 
 #endif /* IXGBE_FCOE */
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
index cbb05d6..18ca3bc 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
@@ -1058,17 +1058,17 @@ static inline void ixgbe_rx_hash(struct ixgbe_ring *ring,
 #ifdef IXGBE_FCOE
 /**
  * ixgbe_rx_is_fcoe - check the rx desc for incoming pkt type
- * @adapter: address of board private structure
+ * @ring: structure containing ring specific data
  * @rx_desc: advanced rx descriptor
  *
  * Returns : true if it is FCoE pkt
  */
-static inline bool ixgbe_rx_is_fcoe(struct ixgbe_adapter *adapter,
+static inline bool ixgbe_rx_is_fcoe(struct ixgbe_ring *ring,
 				    union ixgbe_adv_rx_desc *rx_desc)
 {
 	__le16 pkt_info = rx_desc->wb.lower.lo_dword.hs_rss.pkt_info;
 
-	return (adapter->flags & IXGBE_FLAG_FCOE_ENABLED) &&
+	return test_bit(__IXGBE_RX_FCOE, &ring->state) &&
 	       ((pkt_info & cpu_to_le16(IXGBE_RXDADV_PKTTYPE_ETQF_MASK)) ==
 		(cpu_to_le16(IXGBE_ETQF_FILTER_FCOE <<
 			     IXGBE_RXDADV_PKTTYPE_ETQF_SHIFT)));
@@ -1549,6 +1549,12 @@ static bool ixgbe_cleanup_headers(struct ixgbe_ring *rx_ring,
 		skb->truesize -= ixgbe_rx_bufsz(rx_ring);
 	}
 
+#ifdef IXGBE_FCOE
+	/* do not attempt to pad FCoE Frames as this will disrupt DDP */
+	if (ixgbe_rx_is_fcoe(rx_ring, rx_desc))
+		return false;
+
+#endif
 	/* if skb_pad returns an error the skb was freed */
 	if (unlikely(skb->len < 60)) {
 		int pad_len = 60 - skb->len;
@@ -1775,7 +1781,7 @@ static bool ixgbe_clean_rx_irq(struct ixgbe_q_vector *q_vector,
 
 #ifdef IXGBE_FCOE
 		/* if ddp, not passing to ULD unless for FCP_RSP or error */
-		if (ixgbe_rx_is_fcoe(adapter, rx_desc)) {
+		if (ixgbe_rx_is_fcoe(rx_ring, rx_desc)) {
 			ddp_bytes = ixgbe_fcoe_ddp(adapter, rx_desc, skb);
 			if (!ddp_bytes) {
 				dev_kfree_skb_any(skb);
-- 
1.7.10.2

^ permalink raw reply related

* Re: [net] ixgbe: Do not pad FCoE frames as this can cause issues with FCoE DDP
From: Jeff Kirsher @ 2012-06-26  7:53 UTC (permalink / raw)
  To: David Miller; +Cc: alexander.h.duyck, netdev, gospo, sassmann, stable
In-Reply-To: <20120626.005029.909254566674577767.davem@davemloft.net>

[-- Attachment #1: Type: text/plain, Size: 502 bytes --]

On Tue, 2012-06-26 at 00:50 -0700, David Miller wrote:
> Sorry, quotes don't work either, what you did is still a SMTP syntax error,
> here's what is in the bounce I get back:
> 
> 	<stable@vger.kernel.org> "[3.4]",
> 	Jeff Kirsher <jeffrey.t.kirsher@intel.com>
> Illegal-Object:	Syntax error in Cc: address found on vger.kernel.org:
> 	Cc:	<stable@vger.kernel.org>"[3.4]"
> 						^-missing end of address

Grrr...

I will re-send without the "[3.4]", Greg will just have to deal with it.

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply

* Re: [net] ixgbe: Do not pad FCoE frames as this can cause issues with FCoE DDP
From: David Miller @ 2012-06-26  7:50 UTC (permalink / raw)
  To: jeffrey.t.kirsher; +Cc: alexander.h.duyck, netdev, gospo, sassmann, stable
In-Reply-To: <1340696980.2255.18.camel@jtkirshe-mobl>

From: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Date: Tue, 26 Jun 2012 00:49:40 -0700

> On Tue, 2012-06-26 at 00:43 -0700, David Miller wrote:
>> You can't put things like "[3.4]" unquoted into the CC: list, that's
>> not kosher and vger rejected it.
> 
> Re-sent with the proper quoting.

Doesn't work, see my reply.

^ permalink raw reply

* Re: [net] ixgbe: Do not pad FCoE frames as this can cause issues with FCoE DDP
From: David Miller @ 2012-06-26  7:50 UTC (permalink / raw)
  To: jeffrey.t.kirsher; +Cc: alexander.h.duyck, netdev, gospo, sassmann, stable
In-Reply-To: <1340696951-2686-1-git-send-email-jeffrey.t.kirsher@intel.com>


Sorry, quotes don't work either, what you did is still a SMTP syntax error,
here's what is in the bounce I get back:

	<stable@vger.kernel.org> "[3.4]",
	Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Illegal-Object:	Syntax error in Cc: address found on vger.kernel.org:
	Cc:	<stable@vger.kernel.org>"[3.4]"
						^-missing end of address

^ permalink raw reply

* Re: [net] ixgbe: Do not pad FCoE frames as this can cause issues with FCoE DDP
From: Jeff Kirsher @ 2012-06-26  7:49 UTC (permalink / raw)
  To: David Miller; +Cc: alexander.h.duyck, netdev, gospo, sassmann, stable
In-Reply-To: <20120626.004302.1065943933933136878.davem@davemloft.net>

[-- Attachment #1: Type: text/plain, Size: 200 bytes --]

On Tue, 2012-06-26 at 00:43 -0700, David Miller wrote:
> You can't put things like "[3.4]" unquoted into the CC: list, that's
> not kosher and vger rejected it.

Re-sent with the proper quoting.

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply

* Re: [net] ixgbe: Do not pad FCoE frames as this can cause issues with FCoE DDP
From: Jeff Kirsher @ 2012-06-26  7:47 UTC (permalink / raw)
  To: David Miller; +Cc: alexander.h.duyck, netdev, gospo, sassmann, stable
In-Reply-To: <20120626.004302.1065943933933136878.davem@davemloft.net>

[-- Attachment #1: Type: text/plain, Size: 263 bytes --]

On Tue, 2012-06-26 at 00:43 -0700, David Miller wrote:
> You can't put things like "[3.4]" unquoted into the CC: list, that's
> not kosher and vger rejected it.

Sorry, that was what Greg told me to do.  I did not realize it needed to
be in quotes, my bad.

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply

* Re: [PATCH] ipv4: Remove unnecessary code from rt_check_expire().
From: David Miller @ 2012-06-26  7:46 UTC (permalink / raw)
  To: eric.dumazet; +Cc: netdev
In-Reply-To: <1340696398.10893.209.camel@edumazet-glaptop>

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Tue, 26 Jun 2012 09:39:58 +0200

> On Tue, 2012-06-26 at 00:21 -0700, David Miller wrote:
>> IPv4 routing cache entries no longer use dst->expires, because the
>> metrics, PMTU, and redirect information are stored in the inetpeer
>> cache.
>> 
>> Signed-off-by: David S. Miller <davem@davemloft.net>
>> ---
>> 
>> Eric, when you did commit 9f28a2fc0bd77511f649c0a788c7bf9a5fd04edb
>> (ipv4: reintroduce route cache garbage collector) do you remember
>> if the thing we needed was the real expiry or both the
>> rt_is_expired() and the rt_may_expire() cases?
>> 
>> I really want to remove rt_may_expire() from this conditional because
>> it results in absolutely stupid behavior.  If your system is idle
>> for 5 minutes, all of your input routing cache entries are purged.
> 
> Hmm, after a DDOS, purging all those routing cache entries in 5 minutes
> is good to recover some Mbytes of kernel memory.

And for legitimate traffic it's completely the wrong thing to do.

There is absolutely zero reason to pure valid entries when hash chains
average length of one.

I've been monitoring routing cache activity, and it's the height of
stupidity.  Every 5 minutes we pure, and then they all get regenerated
again.

Routing cache entries are expensive to recreate, it's much easier to
just keep them around then to potentially eat four full trie lookups
because that's what it will cost to reconstitute those guys.

But regardless, could you actually answer the question I asked of you?

^ permalink raw reply

* Re: [net] ixgbe: Do not pad FCoE frames as this can cause issues with FCoE DDP
From: David Miller @ 2012-06-26  7:43 UTC (permalink / raw)
  To: jeffrey.t.kirsher; +Cc: alexander.h.duyck, netdev, gospo, sassmann, stable
In-Reply-To: <1340695993-1911-1-git-send-email-jeffrey.t.kirsher@intel.com>


You can't put things like "[3.4]" unquoted into the CC: list, that's
not kosher and vger rejected it.

^ permalink raw reply

* Re: [PATCH] ipv4: Remove unnecessary code from rt_check_expire().
From: Eric Dumazet @ 2012-06-26  7:39 UTC (permalink / raw)
  To: David Miller; +Cc: netdev
In-Reply-To: <20120626.002124.2220875506847485306.davem@davemloft.net>

On Tue, 2012-06-26 at 00:21 -0700, David Miller wrote:
> IPv4 routing cache entries no longer use dst->expires, because the
> metrics, PMTU, and redirect information are stored in the inetpeer
> cache.
> 
> Signed-off-by: David S. Miller <davem@davemloft.net>
> ---
> 
> Eric, when you did commit 9f28a2fc0bd77511f649c0a788c7bf9a5fd04edb
> (ipv4: reintroduce route cache garbage collector) do you remember
> if the thing we needed was the real expiry or both the
> rt_is_expired() and the rt_may_expire() cases?
> 
> I really want to remove rt_may_expire() from this conditional because
> it results in absolutely stupid behavior.  If your system is idle
> for 5 minutes, all of your input routing cache entries are purged.

Hmm, after a DDOS, purging all those routing cache entries in 5 minutes
is good to recover some Mbytes of kernel memory.

^ permalink raw reply

* Re: [PATCH v2 net-next] tcp: avoid tx starvation by SYNACK packets
From: Hans Schillstrom @ 2012-06-26  7:27 UTC (permalink / raw)
  To: David Miller
  Cc: eric.dumazet@gmail.com, subramanian.vijay@gmail.com,
	dave.taht@gmail.com, netdev@vger.kernel.org, ncardwell@google.com,
	therbert@google.com, brouer@redhat.com
In-Reply-To: <20120626.001124.36486380031998542.davem@davemloft.net>

On Tuesday 26 June 2012 09:11:24 David Miller wrote:
> From: Hans Schillstrom <hans.schillstrom@ericsson.com>
> Date: Tue, 26 Jun 2012 07:34:31 +0200
> 
> > The big cycle consumer during a syn attack is SHA sum right now, 
> > so from that perspective it's better to add aes crypto (by using AES-NI) 
> > to the syn cookies instead of SHA sum. Even if only newer x86_64 can use it.
> 
> I'm surprised that x86 lacks an SHA1 instruction, even shitty sparcs
> have one now.
> 
> SHA1 seems overkill for what the syncookie code is trying to do, could
> you give the following a try?
> 

Sure, I'll give it a try later today 

Thanks

^ permalink raw reply

* Re: [PATCH net] net: qmi_wwan: fix Oops while disconnecting
From: Ming Lei @ 2012-06-26  7:23 UTC (permalink / raw)
  To: Oliver Neukum
  Cc: Bjørn Mork, netdev, linux-usb, Marius Bjørnstad Kotsbak
In-Reply-To: <201206251410.13454.oliver@neukum.org>

On Mon, Jun 25, 2012 at 8:10 PM, Oliver Neukum <oliver@neukum.org> wrote:
> Am Montag, 25. Juni 2012, 09:15:21 schrieb Ming Lei:
>> Any locking isn't needed if the set to NULL is put after
>> driver_info->unbind,  since ->unbind will call subdriver->disconnect,
>> which will hold the open/disconnect lock of wdm_mutex.
>
> True for cdc_wdm. But usbnet needs to work well for everything.

Suppose there are other usbnet drivers which may have this kind of
subdriver, and they have to take one lock to avoid open/disconnect
race, there are only two ways to do it:

          - the lock is held before calling usbnet_disconnect
          - the lock is held inside driver_info->unbind

So putting the set to NULL after driver_info->unbind should work
for the both two ways above.

Also we can document the usage in comments.

>
>> > We can move to after unregister_netdev()
>> > I am unhappy with it going after unbind.
>> >
>>
>> Could you let us know the reason? I think it may let the
>> patch not necessary.
>
> Very well. This is the code:
>
>  void usbnet_disconnect (struct usb_interface *intf)
> {
>        struct usbnet           *dev;
>        struct usb_device       *xdev;
>        struct net_device       *net;
>
>        dev = usb_get_intfdata(intf);
>        usb_set_intfdata(intf, NULL);
>        if (!dev)
>                return;
>
> This code needs to check for NULL (cdc_ether and similar drivers)
> It is cleaner that if we need to check for NULL we also set to NULL.
> But that is no good reason to keep it if there's real trouble
>
>        xdev = interface_to_usbdev (intf);
>
>        netif_info(dev, probe, dev->net, "unregister '%s' usb-%s-%s, %s\n",
>                   intf->dev.driver->name,
>                   xdev->bus->bus_name, xdev->devpath,
>                   dev->driver_info->description);
>
>        net = dev->net;
>        unregister_netdev (net);
>
> Here intfdata is NULL.
>
>        cancel_work_sync(&dev->kevent);
>
>        if (dev->driver_info->unbind)
>                dev->driver_info->unbind (dev, intf);
>
> At this point a minidriver must not follow the intfdata pointer,
> because the interface may again be probed. So if here a minidriver

IMO, probe is serialized strictly with driver unbind since both the parent
lock and its own device lock have been held, so the probe may only be
started after driver unbinding is completed.

> still uses intfdata, locking will be needed. We want to catch those
> casees.

Suppose infdata is used here somewhere, it is surely a bug because
the usbnet instance pointed by intfdata will be freed soon.

So looks putting the set to NULL after driver_info->unbind is good,
doesn't it?

>
>        usb_kill_urb(dev->interrupt);
>        usb_free_urb(dev->interrupt);
>
>        free_netdev(net);
>        usb_put_dev (xdev);
> }
>
>> > Sure, it is a debugging aid. It has the drawback that minidrivers have
>> > to be able to deal with intfdata being NULL. That is not hard to do.
>>
>> The check isn't needed if the set to NULL is put after  driver_info->unbind
>> in usbnet_disconnect.
>
> True, but we don't catch bugs.

If the check is added, the bugs may be hided, and no stack will be
dumped, :-)


Thanks,
-- 
Ming Lei

^ permalink raw reply

* [PATCH] ipv4: Remove unnecessary code from rt_check_expire().
From: David Miller @ 2012-06-26  7:21 UTC (permalink / raw)
  To: netdev; +Cc: eric.dumazet


IPv4 routing cache entries no longer use dst->expires, because the
metrics, PMTU, and redirect information are stored in the inetpeer
cache.

Signed-off-by: David S. Miller <davem@davemloft.net>
---

Eric, when you did commit 9f28a2fc0bd77511f649c0a788c7bf9a5fd04edb
(ipv4: reintroduce route cache garbage collector) do you remember
if the thing we needed was the real expiry or both the
rt_is_expired() and the rt_may_expire() cases?

I really want to remove rt_may_expire() from this conditional because
it results in absolutely stupid behavior.  If your system is idle
for 5 minutes, all of your input routing cache entries are purged.

 net/ipv4/route.c |   34 +++++++++++-----------------------
 1 file changed, 11 insertions(+), 23 deletions(-)

diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index 8d62d85..846961c 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -870,34 +870,22 @@ static void rt_check_expire(void)
 		while ((rth = rcu_dereference_protected(*rthp,
 					lockdep_is_held(rt_hash_lock_addr(i)))) != NULL) {
 			prefetch(rth->dst.rt_next);
-			if (rt_is_expired(rth)) {
+			if (rt_is_expired(rth) ||
+			    rt_may_expire(rth, tmo, ip_rt_gc_timeout)) {
 				*rthp = rth->dst.rt_next;
 				rt_free(rth);
 				continue;
 			}
-			if (rth->dst.expires) {
-				/* Entry is expired even if it is in use */
-				if (time_before_eq(jiffies, rth->dst.expires)) {
-nofree:
-					tmo >>= 1;
-					rthp = &rth->dst.rt_next;
-					/*
-					 * We only count entries on
-					 * a chain with equal hash inputs once
-					 * so that entries for different QOS
-					 * levels, and other non-hash input
-					 * attributes don't unfairly skew
-					 * the length computation
-					 */
-					length += has_noalias(rt_hash_table[i].chain, rth);
-					continue;
-				}
-			} else if (!rt_may_expire(rth, tmo, ip_rt_gc_timeout))
-				goto nofree;
 
-			/* Cleanup aged off entries. */
-			*rthp = rth->dst.rt_next;
-			rt_free(rth);
+			/* We only count entries on a chain with equal
+			 * hash inputs once so that entries for
+			 * different QOS levels, and other non-hash
+			 * input attributes don't unfairly skew the
+			 * length computation
+			 */
+			tmo >>= 1;
+			rthp = &rth->dst.rt_next;
+			length += has_noalias(rt_hash_table[i].chain, rth);
 		}
 		spin_unlock_bh(rt_hash_lock_addr(i));
 		sum += length;
-- 
1.7.10

^ permalink raw reply related


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox