Netdev List
 help / color / mirror / Atom feed
* [PATCH 2/9] stmmac: remove the mmc code
From: Giuseppe CAVALLARO @ 2011-08-25  8:00 UTC (permalink / raw)
  To: netdev; +Cc: Giuseppe Cavallaro
In-Reply-To: <1314259229-13767-1-git-send-email-peppe.cavallaro@st.com>

DWMAC Management Counters (MMC) are not fully support.
The minimal support added in the past allowed to
only disable counters (if present) and mask their
interrupts.
This patch prepares the driver to support the MMC
removing obsolete code.

Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
---
 drivers/net/stmmac/common.h         |   11 -----------
 drivers/net/stmmac/dwmac1000_core.c |    2 --
 drivers/net/stmmac/dwmac100_core.c  |   11 -----------
 drivers/net/stmmac/stmmac_main.c    |    4 ----
 4 files changed, 0 insertions(+), 28 deletions(-)

diff --git a/drivers/net/stmmac/common.h b/drivers/net/stmmac/common.h
index 375ea19..290b97a 100644
--- a/drivers/net/stmmac/common.h
+++ b/drivers/net/stmmac/common.h
@@ -130,17 +130,6 @@ enum tx_dma_irq_status {
 #define MAC_ENABLE_TX		0x00000008	/* Transmitter Enable */
 #define MAC_RNABLE_RX		0x00000004	/* Receiver Enable */
 
-/* MAC Management Counters register */
-#define MMC_CONTROL		0x00000100	/* MMC Control */
-#define MMC_HIGH_INTR		0x00000104	/* MMC High Interrupt */
-#define MMC_LOW_INTR		0x00000108	/* MMC Low Interrupt */
-#define MMC_HIGH_INTR_MASK	0x0000010c	/* MMC High Interrupt Mask */
-#define MMC_LOW_INTR_MASK	0x00000110	/* MMC Low Interrupt Mask */
-
-#define MMC_CONTROL_MAX_FRM_MASK	0x0003ff8	/* Maximum Frame Size */
-#define MMC_CONTROL_MAX_FRM_SHIFT	3
-#define MMC_CONTROL_MAX_FRAME		0x7FF
-
 struct stmmac_desc_ops {
 	/* DMA RX descriptor ring initialization */
 	void (*init_rx_desc) (struct dma_desc *p, unsigned int ring_size,
diff --git a/drivers/net/stmmac/dwmac1000_core.c b/drivers/net/stmmac/dwmac1000_core.c
index eea184a..9ba9cae 100644
--- a/drivers/net/stmmac/dwmac1000_core.c
+++ b/drivers/net/stmmac/dwmac1000_core.c
@@ -37,8 +37,6 @@ static void dwmac1000_core_init(void __iomem *ioaddr)
 	value |= GMAC_CORE_INIT;
 	writel(value, ioaddr + GMAC_CONTROL);
 
-	/* Freeze MMC counters */
-	writel(0x8, ioaddr + GMAC_MMC_CTRL);
 	/* Mask GMAC interrupts */
 	writel(0x207, ioaddr + GMAC_INT_MASK);
 
diff --git a/drivers/net/stmmac/dwmac100_core.c b/drivers/net/stmmac/dwmac100_core.c
index 743a580..aacfc6e 100644
--- a/drivers/net/stmmac/dwmac100_core.c
+++ b/drivers/net/stmmac/dwmac100_core.c
@@ -70,17 +70,6 @@ static void dwmac100_dump_mac_regs(void __iomem *ioaddr)
 		readl(ioaddr + MAC_VLAN1));
 	pr_info("\tVLAN2 tag (offset 0x%x): 0x%08x\n", MAC_VLAN2,
 		readl(ioaddr + MAC_VLAN2));
-	pr_info("\n\tMAC management counter registers\n");
-	pr_info("\t MMC crtl (offset 0x%x): 0x%08x\n",
-		MMC_CONTROL, readl(ioaddr + MMC_CONTROL));
-	pr_info("\t MMC High Interrupt (offset 0x%x): 0x%08x\n",
-		MMC_HIGH_INTR, readl(ioaddr + MMC_HIGH_INTR));
-	pr_info("\t MMC Low Interrupt (offset 0x%x): 0x%08x\n",
-		MMC_LOW_INTR, readl(ioaddr + MMC_LOW_INTR));
-	pr_info("\t MMC High Interrupt Mask (offset 0x%x): 0x%08x\n",
-		MMC_HIGH_INTR_MASK, readl(ioaddr + MMC_HIGH_INTR_MASK));
-	pr_info("\t MMC Low Interrupt Mask (offset 0x%x): 0x%08x\n",
-		MMC_LOW_INTR_MASK, readl(ioaddr + MMC_LOW_INTR_MASK));
 }
 
 static void dwmac100_irq_status(void __iomem *ioaddr)
diff --git a/drivers/net/stmmac/stmmac_main.c b/drivers/net/stmmac/stmmac_main.c
index c6e567e..da11405 100644
--- a/drivers/net/stmmac/stmmac_main.c
+++ b/drivers/net/stmmac/stmmac_main.c
@@ -826,10 +826,6 @@ static int stmmac_open(struct net_device *dev)
 		pr_info("\tTX Checksum insertion supported\n");
 	netdev_update_features(dev);
 
-	/* Initialise the MMC (if present) to disable all interrupts. */
-	writel(0xffffffff, priv->ioaddr + MMC_HIGH_INTR_MASK);
-	writel(0xffffffff, priv->ioaddr + MMC_LOW_INTR_MASK);
-
 	/* Request the IRQ lines */
 	ret = request_irq(dev->irq, stmmac_interrupt,
 			 IRQF_SHARED, dev->name, dev);
-- 
1.7.4.4

^ permalink raw reply related

* [PATCH 3/9] stmmac: add MMC supports exported via debugfs.
From: Giuseppe CAVALLARO @ 2011-08-25  8:00 UTC (permalink / raw)
  To: netdev; +Cc: Giuseppe Cavallaro
In-Reply-To: <1314259229-13767-1-git-send-email-peppe.cavallaro@st.com>

This patch adds the MMC management counters support.
MMC module is an extension of the register address
space and all the hardware counters can be accessed
via debugfs.
Below an example of the output:

  bash-3.00# cat /sys/kernel/debug/stmmaceth/mmc
==============================
MMC Management HW MAC counters
==============================
 -- MMC TX counter counters ---
        Number of bytes in good and bad frames: 1439120
        Number of good and bad frames transmitted: 9844
        Number of good Broadcast frame transmitted: 9
        Good and bad frames of 64bytes: 15
 [snip]
        Good and bad frames of 256-511: 11
        Good and bad frames of 512-1024: 15
        Good and bad frames bigger than 1024: 1930709609
        Good and bad unicast frames: 9835
        Good and bad broadcast frames: 9
        Frame aborted after carrier sense error: 1
        Number of bytes (no preamble) in good frames: 1439120
        Number of good frames transmitted: 9844
 -- MMC RX counter counters ---
        Number of good and bad frames: 14981
        Number of bytes in  good and bad frames recv: 12242993
 [snip]
 -- MMC IPC counters  ---
 -- MMC RX IPv4 counters  ---
 -- MMC RX IPV6 counters  ---
 -- MMC Protocols RX counters  ---

Note that, the MMC interrupts remain masked and the logic
to handle this kind of interrupt will be added later (if
actually useful).

Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
---
 drivers/net/stmmac/Kconfig       |    9 ++
 drivers/net/stmmac/Makefile      |    3 +-
 drivers/net/stmmac/mmc.h         |  131 ++++++++++++++++++
 drivers/net/stmmac/mmc_core.c    |  265 ++++++++++++++++++++++++++++++++++++
 drivers/net/stmmac/stmmac.h      |    2 +
 drivers/net/stmmac/stmmac_main.c |  277 +++++++++++++++++++++++++++++++++++++-
 6 files changed, 685 insertions(+), 2 deletions(-)
 create mode 100644 drivers/net/stmmac/mmc.h
 create mode 100644 drivers/net/stmmac/mmc_core.c

diff --git a/drivers/net/stmmac/Kconfig b/drivers/net/stmmac/Kconfig
index 7df7df4..0a1bd82 100644
--- a/drivers/net/stmmac/Kconfig
+++ b/drivers/net/stmmac/Kconfig
@@ -11,6 +11,15 @@ config STMMAC_ETH
 
 if STMMAC_ETH
 
+config STMMAC_MONITOR
+	bool "Enable monitoring via sysFS "
+	default n
+	depends on STMMAC_ETH && DEBUG_FS
+	-- help
+	  To Enable monitoring via debufs.
+	  The stmmac entry in /sys reports the state of the HW monitoring
+	  counters (if supported).
+
 config STMMAC_DA
 	bool "STMMAC DMA arbitration scheme"
 	default n
diff --git a/drivers/net/stmmac/Makefile b/drivers/net/stmmac/Makefile
index 9691733..0f23d95 100644
--- a/drivers/net/stmmac/Makefile
+++ b/drivers/net/stmmac/Makefile
@@ -2,4 +2,5 @@ obj-$(CONFIG_STMMAC_ETH) += stmmac.o
 stmmac-$(CONFIG_STMMAC_TIMER) += stmmac_timer.o
 stmmac-objs:= stmmac_main.o stmmac_ethtool.o stmmac_mdio.o	\
 	      dwmac_lib.o dwmac1000_core.o  dwmac1000_dma.o	\
-	      dwmac100_core.o dwmac100_dma.o enh_desc.o  norm_desc.o $(stmmac-y)
+	      dwmac100_core.o dwmac100_dma.o enh_desc.o  norm_desc.o \
+	      mmc_core.o $(stmmac-y)
diff --git a/drivers/net/stmmac/mmc.h b/drivers/net/stmmac/mmc.h
new file mode 100644
index 0000000..a383520
--- /dev/null
+++ b/drivers/net/stmmac/mmc.h
@@ -0,0 +1,131 @@
+/*******************************************************************************
+  MMC Header file
+
+  Copyright (C) 2011  STMicroelectronics Ltd
+
+  This program is free software; you can redistribute it and/or modify it
+  under the terms and conditions of the GNU General Public License,
+  version 2, as published by the Free Software Foundation.
+
+  This program is distributed in the hope it will be useful, but WITHOUT
+  ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+  FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+  more details.
+
+  You should have received a copy of the GNU General Public License along with
+  this program; if not, write to the Free Software Foundation, Inc.,
+  51 Franklin St - Fifth Floor, Boston, MA 02110-1301 USA.
+
+  The full GNU General Public License is included in this distribution in
+  the file called "COPYING".
+
+  Author: Giuseppe Cavallaro <peppe.cavallaro@st.com>
+*******************************************************************************/
+
+/* MMC control register */
+/* When set, all counter are reset */
+#define MMC_CNTRL_COUNTER_RESET		0x1
+/* When set, do not roll over zero
+ * after reaching the max value*/
+#define MMC_CNTRL_COUNTER_STOP_ROLLOVER	0x2
+#define MMC_CNTRL_RESET_ON_READ		0x4	/* Reset after reading */
+#define MMC_CNTRL_COUNTER_FREEZER	0x8	/* Freeze counter values to the
+						 * current value.*/
+#define MMC_CNTRL_PRESET		0x10
+#define MMC_CNTRL_FULL_HALF_PRESET	0x20
+struct stmmac_counters {
+	unsigned int mmc_tx_octetcount_gb;
+	unsigned int mmc_tx_framecount_gb;
+	unsigned int mmc_tx_broadcastframe_g;
+	unsigned int mmc_tx_multicastframe_g;
+	unsigned int mmc_tx_64_octets_gb;
+	unsigned int mmc_tx_65_to_127_octets_gb;
+	unsigned int mmc_tx_128_to_255_octets_gb;
+	unsigned int mmc_tx_256_to_511_octets_gb;
+	unsigned int mmc_tx_512_to_1023_octets_gb;
+	unsigned int mmc_tx_1024_to_max_octets_gb;
+	unsigned int mmc_tx_unicast_gb;
+	unsigned int mmc_tx_multicast_gb;
+	unsigned int mmc_tx_broadcast_gb;
+	unsigned int mmc_tx_underflow_error;
+	unsigned int mmc_tx_singlecol_g;
+	unsigned int mmc_tx_multicol_g;
+	unsigned int mmc_tx_deferred;
+	unsigned int mmc_tx_latecol;
+	unsigned int mmc_tx_exesscol;
+	unsigned int mmc_tx_carrier_error;
+	unsigned int mmc_tx_octetcount_g;
+	unsigned int mmc_tx_framecount_g;
+	unsigned int mmc_tx_excessdef;
+	unsigned int mmc_tx_pause_frame;
+	unsigned int mmc_tx_vlan_frame_g;
+
+	/* MMC RX counter registers */
+	unsigned int mmc_rx_framecount_gb;
+	unsigned int mmc_rx_octetcount_gb;
+	unsigned int mmc_rx_octetcount_g;
+	unsigned int mmc_rx_broadcastframe_g;
+	unsigned int mmc_rx_multicastframe_g;
+	unsigned int mmc_rx_crc_errror;
+	unsigned int mmc_rx_align_error;
+	unsigned int mmc_rx_run_error;
+	unsigned int mmc_rx_jabber_error;
+	unsigned int mmc_rx_undersize_g;
+	unsigned int mmc_rx_oversize_g;
+	unsigned int mmc_rx_64_octets_gb;
+	unsigned int mmc_rx_65_to_127_octets_gb;
+	unsigned int mmc_rx_128_to_255_octets_gb;
+	unsigned int mmc_rx_256_to_511_octets_gb;
+	unsigned int mmc_rx_512_to_1023_octets_gb;
+	unsigned int mmc_rx_1024_to_max_octets_gb;
+	unsigned int mmc_rx_unicast_g;
+	unsigned int mmc_rx_length_error;
+	unsigned int mmc_rx_autofrangetype;
+	unsigned int mmc_rx_pause_frames;
+	unsigned int mmc_rx_fifo_overflow;
+	unsigned int mmc_rx_vlan_frames_gb;
+	unsigned int mmc_rx_watchdog_error;
+	/* IPC */
+	unsigned int mmc_rx_ipc_intr_mask;
+	unsigned int mmc_rx_ipc_intr;
+	/* IPv4 */
+	unsigned int mmc_rx_ipv4_gd;
+	unsigned int mmc_rx_ipv4_hderr;
+	unsigned int mmc_rx_ipv4_nopay;
+	unsigned int mmc_rx_ipv4_frag;
+	unsigned int mmc_rx_ipv4_udsbl;
+
+	unsigned int mmc_rx_ipv4_gd_octets;
+	unsigned int mmc_rx_ipv4_hderr_octets;
+	unsigned int mmc_rx_ipv4_nopay_octets;
+	unsigned int mmc_rx_ipv4_frag_octets;
+	unsigned int mmc_rx_ipv4_udsbl_octets;
+
+	/* IPV6 */
+	unsigned int mmc_rx_ipv6_gd_octets;
+	unsigned int mmc_rx_ipv6_hderr_octets;
+	unsigned int mmc_rx_ipv6_nopay_octets;
+
+	unsigned int mmc_rx_ipv6_gd;
+	unsigned int mmc_rx_ipv6_hderr;
+	unsigned int mmc_rx_ipv6_nopay;
+
+	/* Protocols */
+	unsigned int mmc_rx_udp_gd;
+	unsigned int mmc_rx_udp_err;
+	unsigned int mmc_rx_tcp_gd;
+	unsigned int mmc_rx_tcp_err;
+	unsigned int mmc_rx_icmp_gd;
+	unsigned int mmc_rx_icmp_err;
+
+	unsigned int mmc_rx_udp_gd_octets;
+	unsigned int mmc_rx_udp_err_octets;
+	unsigned int mmc_rx_tcp_gd_octets;
+	unsigned int mmc_rx_tcp_err_octets;
+	unsigned int mmc_rx_icmp_gd_octets;
+	unsigned int mmc_rx_icmp_err_octets;
+};
+
+extern void dwmac_mmc_ctrl(void __iomem *ioaddr, unsigned int mode);
+extern void dwmac_mmc_intr_all_mask(void __iomem *ioaddr);
+extern void dwmac_mmc_read(void __iomem *ioaddr, struct stmmac_counters *mmc);
diff --git a/drivers/net/stmmac/mmc_core.c b/drivers/net/stmmac/mmc_core.c
new file mode 100644
index 0000000..41e6b33
--- /dev/null
+++ b/drivers/net/stmmac/mmc_core.c
@@ -0,0 +1,265 @@
+/*******************************************************************************
+  DWMAC Management Counters
+
+  Copyright (C) 2011  STMicroelectronics Ltd
+
+  This program is free software; you can redistribute it and/or modify it
+  under the terms and conditions of the GNU General Public License,
+  version 2, as published by the Free Software Foundation.
+
+  This program is distributed in the hope it will be useful, but WITHOUT
+  ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+  FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+  more details.
+
+  You should have received a copy of the GNU General Public License along with
+  this program; if not, write to the Free Software Foundation, Inc.,
+  51 Franklin St - Fifth Floor, Boston, MA 02110-1301 USA.
+
+  The full GNU General Public License is included in this distribution in
+  the file called "COPYING".
+
+  Author: Giuseppe Cavallaro <peppe.cavallaro@st.com>
+*******************************************************************************/
+
+#include <linux/io.h>
+#include "mmc.h"
+
+/* MAC Management Counters register offset */
+
+#define MMC_CNTRL		0x00000100	/* MMC Control */
+#define MMC_RX_INTR		0x00000104	/* MMC RX Interrupt */
+#define MMC_TX_INTR		0x00000108	/* MMC TX Interrupt */
+#define MMC_RX_INTR_MASK	0x0000010c	/* MMC Interrupt Mask */
+#define MMC_TX_INTR_MASK	0x00000110	/* MMC Interrupt Mask */
+#define MMC_DEFAUL_MASK		0xffffffff
+
+/* MMC TX counter registers */
+
+/* Note:
+ * _GB register stands for good and bad frames
+ * _G is for good only.
+ */
+#define MMC_TX_OCTETCOUNT_GB		0x00000114
+#define MMC_TX_FRAMECOUNT_GB		0x00000118
+#define MMC_TX_BROADCASTFRAME_G		0x0000011c
+#define MMC_TX_MULTICASTFRAME_G		0x00000120
+#define MMC_TX_64_OCTETS_GB		0x00000124
+#define MMC_TX_65_TO_127_OCTETS_GB	0x00000128
+#define MMC_TX_128_TO_255_OCTETS_GB	0x0000012c
+#define MMC_TX_256_TO_511_OCTETS_GB	0x00000130
+#define MMC_TX_512_TO_1023_OCTETS_GB	0x00000134
+#define MMC_TX_1024_TO_MAX_OCTETS_GB	0x00000138
+#define MMC_TX_UNICAST_GB		0x0000013c
+#define MMC_TX_MULTICAST_GB		0x00000140
+#define MMC_TX_BROADCAST_GB		0x00000144
+#define MMC_TX_UNDERFLOW_ERROR		0x00000148
+#define MMC_TX_SINGLECOL_G		0x0000014c
+#define MMC_TX_MULTICOL_G		0x00000150
+#define MMC_TX_DEFERRED			0x00000154
+#define MMC_TX_LATECOL			0x00000158
+#define MMC_TX_EXESSCOL			0x0000015c
+#define MMC_TX_CARRIER_ERROR		0x00000160
+#define MMC_TX_OCTETCOUNT_G		0x00000164
+#define MMC_TX_FRAMECOUNT_G		0x00000168
+#define MMC_TX_EXCESSDEF		0x0000016c
+#define MMC_TX_PAUSE_FRAME		0x00000170
+#define MMC_TX_VLAN_FRAME_G		0x00000174
+
+/* MMC RX counter registers */
+#define MMC_RX_FRAMECOUNT_GB		0x00000180
+#define MMC_RX_OCTETCOUNT_GB		0x00000184
+#define MMC_RX_OCTETCOUNT_G		0x00000188
+#define MMC_RX_BROADCASTFRAME_G		0x0000018c
+#define MMC_RX_MULTICASTFRAME_G		0x00000190
+#define MMC_RX_CRC_ERRROR		0x00000194
+#define MMC_RX_ALIGN_ERROR		0x00000198
+#define MMC_RX_RUN_ERROR		0x0000019C
+#define MMC_RX_JABBER_ERROR		0x000001A0
+#define MMC_RX_UNDERSIZE_G		0x000001A4
+#define MMC_RX_OVERSIZE_G		0x000001A8
+#define MMC_RX_64_OCTETS_GB		0x000001AC
+#define MMC_RX_65_TO_127_OCTETS_GB	0x000001b0
+#define MMC_RX_128_TO_255_OCTETS_GB	0x000001b4
+#define MMC_RX_256_TO_511_OCTETS_GB	0x000001b8
+#define MMC_RX_512_TO_1023_OCTETS_GB	0x000001bc
+#define MMC_RX_1024_TO_MAX_OCTETS_GB	0x000001c0
+#define MMC_RX_UNICAST_G		0x000001c4
+#define MMC_RX_LENGTH_ERROR		0x000001c8
+#define MMC_RX_AUTOFRANGETYPE		0x000001cc
+#define MMC_RX_PAUSE_FRAMES		0x000001d0
+#define MMC_RX_FIFO_OVERFLOW		0x000001d4
+#define MMC_RX_VLAN_FRAMES_GB		0x000001d8
+#define MMC_RX_WATCHDOG_ERROR		0x000001dc
+/* IPC*/
+#define MMC_RX_IPC_INTR_MASK		0x00000200
+#define MMC_RX_IPC_INTR			0x00000208
+/* IPv4*/
+#define MMC_RX_IPV4_GD			0x00000210
+#define MMC_RX_IPV4_HDERR		0x00000214
+#define MMC_RX_IPV4_NOPAY		0x00000218
+#define MMC_RX_IPV4_FRAG		0x0000021C
+#define MMC_RX_IPV4_UDSBL		0x00000220
+
+#define MMC_RX_IPV4_GD_OCTETS		0x00000250
+#define MMC_RX_IPV4_HDERR_OCTETS	0x00000254
+#define MMC_RX_IPV4_NOPAY_OCTETS	0x00000258
+#define MMC_RX_IPV4_FRAG_OCTETS		0x0000025c
+#define MMC_RX_IPV4_UDSBL_OCTETS	0x00000260
+
+/* IPV6*/
+#define MMC_RX_IPV6_GD_OCTETS		0x00000264
+#define MMC_RX_IPV6_HDERR_OCTETS	0x00000268
+#define MMC_RX_IPV6_NOPAY_OCTETS	0x0000026c
+
+#define MMC_RX_IPV6_GD			0x00000224
+#define MMC_RX_IPV6_HDERR		0x00000228
+#define MMC_RX_IPV6_NOPAY		0x0000022c
+
+/* Protocols*/
+#define MMC_RX_UDP_GD			0x00000230
+#define MMC_RX_UDP_ERR			0x00000234
+#define MMC_RX_TCP_GD			0x00000238
+#define MMC_RX_TCP_ERR			0x0000023c
+#define MMC_RX_ICMP_GD			0x00000240
+#define MMC_RX_ICMP_ERR			0x00000244
+
+#define MMC_RX_UDP_GD_OCTETS		0x00000270
+#define MMC_RX_UDP_ERR_OCTETS		0x00000274
+#define MMC_RX_TCP_GD_OCTETS		0x00000278
+#define MMC_RX_TCP_ERR_OCTETS		0x0000027c
+#define MMC_RX_ICMP_GD_OCTETS		0x00000280
+#define MMC_RX_ICMP_ERR_OCTETS		0x00000284
+
+void dwmac_mmc_ctrl(void __iomem *ioaddr, unsigned int mode)
+{
+	u32 value = readl(ioaddr + MMC_CNTRL);
+
+	value |= (mode & 0x3F);
+
+	writel(value, ioaddr + MMC_CNTRL);
+
+	pr_debug("stmmac: MMC ctrl register (offset 0x%x): 0x%08x\n",
+		 MMC_CNTRL, value);
+}
+
+/* To mask all all interrupts.*/
+void dwmac_mmc_intr_all_mask(void __iomem *ioaddr)
+{
+	writel(MMC_DEFAUL_MASK, ioaddr + MMC_RX_INTR_MASK);
+	writel(MMC_DEFAUL_MASK, ioaddr + MMC_TX_INTR_MASK);
+}
+
+/* This reads the MAC core counters (if actaully supported).
+ * by default the MMC core is programmed to reset each
+ * counter after a read. So all the field of the mmc struct
+ * have to be incremented.
+ */
+void dwmac_mmc_read(void __iomem *ioaddr, struct stmmac_counters *mmc)
+{
+	mmc->mmc_tx_octetcount_gb += readl(ioaddr + MMC_TX_OCTETCOUNT_GB);
+	mmc->mmc_tx_framecount_gb += readl(ioaddr + MMC_TX_FRAMECOUNT_GB);
+	mmc->mmc_tx_broadcastframe_g += readl(ioaddr + MMC_TX_BROADCASTFRAME_G);
+	mmc->mmc_tx_multicastframe_g += readl(ioaddr + MMC_TX_MULTICASTFRAME_G);
+	mmc->mmc_tx_64_octets_gb += readl(ioaddr + MMC_TX_64_OCTETS_GB);
+	mmc->mmc_tx_65_to_127_octets_gb +=
+	    readl(ioaddr + MMC_TX_65_TO_127_OCTETS_GB);
+	mmc->mmc_tx_128_to_255_octets_gb +=
+	    readl(ioaddr + MMC_TX_128_TO_255_OCTETS_GB);
+	mmc->mmc_tx_256_to_511_octets_gb +=
+	    readl(ioaddr + MMC_TX_256_TO_511_OCTETS_GB);
+	mmc->mmc_tx_512_to_1023_octets_gb +=
+	    readl(ioaddr + MMC_TX_512_TO_1023_OCTETS_GB);
+	mmc->mmc_tx_1024_to_max_octets_gb +=
+	    readl(ioaddr + MMC_TX_1024_TO_MAX_OCTETS_GB);
+	mmc->mmc_tx_unicast_gb += readl(ioaddr + MMC_TX_UNICAST_GB);
+	mmc->mmc_tx_multicast_gb += readl(ioaddr + MMC_TX_MULTICAST_GB);
+	mmc->mmc_tx_broadcast_gb += readl(ioaddr + MMC_TX_BROADCAST_GB);
+	mmc->mmc_tx_underflow_error += readl(ioaddr + MMC_TX_UNDERFLOW_ERROR);
+	mmc->mmc_tx_singlecol_g += readl(ioaddr + MMC_TX_SINGLECOL_G);
+	mmc->mmc_tx_multicol_g += readl(ioaddr + MMC_TX_MULTICOL_G);
+	mmc->mmc_tx_deferred += readl(ioaddr + MMC_TX_DEFERRED);
+	mmc->mmc_tx_latecol += readl(ioaddr + MMC_TX_LATECOL);
+	mmc->mmc_tx_exesscol += readl(ioaddr + MMC_TX_EXESSCOL);
+	mmc->mmc_tx_carrier_error += readl(ioaddr + MMC_TX_CARRIER_ERROR);
+	mmc->mmc_tx_octetcount_g += readl(ioaddr + MMC_TX_OCTETCOUNT_G);
+	mmc->mmc_tx_framecount_g += readl(ioaddr + MMC_TX_FRAMECOUNT_G);
+	mmc->mmc_tx_excessdef += readl(ioaddr + MMC_TX_EXCESSDEF);
+	mmc->mmc_tx_pause_frame += readl(ioaddr + MMC_TX_PAUSE_FRAME);
+	mmc->mmc_tx_vlan_frame_g += readl(ioaddr + MMC_TX_VLAN_FRAME_G);
+
+	/* MMC RX counter registers */
+	mmc->mmc_rx_framecount_gb += readl(ioaddr + MMC_RX_FRAMECOUNT_GB);
+	mmc->mmc_rx_octetcount_gb += readl(ioaddr + MMC_RX_OCTETCOUNT_GB);
+	mmc->mmc_rx_octetcount_g += readl(ioaddr + MMC_RX_OCTETCOUNT_G);
+	mmc->mmc_rx_broadcastframe_g += readl(ioaddr + MMC_RX_BROADCASTFRAME_G);
+	mmc->mmc_rx_multicastframe_g += readl(ioaddr + MMC_RX_MULTICASTFRAME_G);
+	mmc->mmc_rx_crc_errror += readl(ioaddr + MMC_RX_CRC_ERRROR);
+	mmc->mmc_rx_align_error += readl(ioaddr + MMC_RX_ALIGN_ERROR);
+	mmc->mmc_rx_run_error += readl(ioaddr + MMC_RX_RUN_ERROR);
+	mmc->mmc_rx_jabber_error += readl(ioaddr + MMC_RX_JABBER_ERROR);
+	mmc->mmc_rx_undersize_g += readl(ioaddr + MMC_RX_UNDERSIZE_G);
+	mmc->mmc_rx_oversize_g += readl(ioaddr + MMC_RX_OVERSIZE_G);
+	mmc->mmc_rx_64_octets_gb += readl(ioaddr + MMC_RX_64_OCTETS_GB);
+	mmc->mmc_rx_65_to_127_octets_gb +=
+	    readl(ioaddr + MMC_RX_65_TO_127_OCTETS_GB);
+	mmc->mmc_rx_128_to_255_octets_gb +=
+	    readl(ioaddr + MMC_RX_128_TO_255_OCTETS_GB);
+	mmc->mmc_rx_256_to_511_octets_gb +=
+	    readl(ioaddr + MMC_RX_256_TO_511_OCTETS_GB);
+	mmc->mmc_rx_512_to_1023_octets_gb +=
+	    readl(ioaddr + MMC_RX_512_TO_1023_OCTETS_GB);
+	mmc->mmc_rx_1024_to_max_octets_gb +=
+	    readl(ioaddr + MMC_RX_1024_TO_MAX_OCTETS_GB);
+	mmc->mmc_rx_unicast_g += readl(ioaddr + MMC_RX_UNICAST_G);
+	mmc->mmc_rx_length_error += readl(ioaddr + MMC_RX_LENGTH_ERROR);
+	mmc->mmc_rx_autofrangetype += readl(ioaddr + MMC_RX_AUTOFRANGETYPE);
+	mmc->mmc_rx_pause_frames += readl(ioaddr + MMC_RX_PAUSE_FRAMES);
+	mmc->mmc_rx_fifo_overflow += readl(ioaddr + MMC_RX_FIFO_OVERFLOW);
+	mmc->mmc_rx_vlan_frames_gb += readl(ioaddr + MMC_RX_VLAN_FRAMES_GB);
+	mmc->mmc_rx_watchdog_error += readl(ioaddr + MMC_RX_WATCHDOG_ERROR);
+	/* IPC */
+	mmc->mmc_rx_ipc_intr_mask += readl(ioaddr + MMC_RX_IPC_INTR_MASK);
+	mmc->mmc_rx_ipc_intr += readl(ioaddr + MMC_RX_IPC_INTR);
+	/* IPv4 */
+	mmc->mmc_rx_ipv4_gd += readl(ioaddr + MMC_RX_IPV4_GD);
+	mmc->mmc_rx_ipv4_hderr += readl(ioaddr + MMC_RX_IPV4_HDERR);
+	mmc->mmc_rx_ipv4_nopay += readl(ioaddr + MMC_RX_IPV4_NOPAY);
+	mmc->mmc_rx_ipv4_frag += readl(ioaddr + MMC_RX_IPV4_FRAG);
+	mmc->mmc_rx_ipv4_udsbl += readl(ioaddr + MMC_RX_IPV4_UDSBL);
+
+	mmc->mmc_rx_ipv4_gd_octets += readl(ioaddr + MMC_RX_IPV4_GD_OCTETS);
+	mmc->mmc_rx_ipv4_hderr_octets +=
+	    readl(ioaddr + MMC_RX_IPV4_HDERR_OCTETS);
+	mmc->mmc_rx_ipv4_nopay_octets +=
+	    readl(ioaddr + MMC_RX_IPV4_NOPAY_OCTETS);
+	mmc->mmc_rx_ipv4_frag_octets += readl(ioaddr + MMC_RX_IPV4_FRAG_OCTETS);
+	mmc->mmc_rx_ipv4_udsbl_octets +=
+	    readl(ioaddr + MMC_RX_IPV4_UDSBL_OCTETS);
+
+	/* IPV6 */
+	mmc->mmc_rx_ipv6_gd_octets += readl(ioaddr + MMC_RX_IPV6_GD_OCTETS);
+	mmc->mmc_rx_ipv6_hderr_octets +=
+	    readl(ioaddr + MMC_RX_IPV6_HDERR_OCTETS);
+	mmc->mmc_rx_ipv6_nopay_octets +=
+	    readl(ioaddr + MMC_RX_IPV6_NOPAY_OCTETS);
+
+	mmc->mmc_rx_ipv6_gd += readl(ioaddr + MMC_RX_IPV6_GD);
+	mmc->mmc_rx_ipv6_hderr += readl(ioaddr + MMC_RX_IPV6_HDERR);
+	mmc->mmc_rx_ipv6_nopay += readl(ioaddr + MMC_RX_IPV6_NOPAY);
+
+	/* Protocols */
+	mmc->mmc_rx_udp_gd += readl(ioaddr + MMC_RX_UDP_GD);
+	mmc->mmc_rx_udp_err += readl(ioaddr + MMC_RX_UDP_ERR);
+	mmc->mmc_rx_tcp_gd += readl(ioaddr + MMC_RX_TCP_GD);
+	mmc->mmc_rx_tcp_err += readl(ioaddr + MMC_RX_TCP_ERR);
+	mmc->mmc_rx_icmp_gd += readl(ioaddr + MMC_RX_ICMP_GD);
+	mmc->mmc_rx_icmp_err += readl(ioaddr + MMC_RX_ICMP_ERR);
+
+	mmc->mmc_rx_udp_gd_octets += readl(ioaddr + MMC_RX_UDP_GD_OCTETS);
+	mmc->mmc_rx_udp_err_octets += readl(ioaddr + MMC_RX_UDP_ERR_OCTETS);
+	mmc->mmc_rx_tcp_gd_octets += readl(ioaddr + MMC_RX_TCP_GD_OCTETS);
+	mmc->mmc_rx_tcp_err_octets += readl(ioaddr + MMC_RX_TCP_ERR_OCTETS);
+	mmc->mmc_rx_icmp_gd_octets += readl(ioaddr + MMC_RX_ICMP_GD_OCTETS);
+	mmc->mmc_rx_icmp_err_octets += readl(ioaddr + MMC_RX_ICMP_ERR_OCTETS);
+}
diff --git a/drivers/net/stmmac/stmmac.h b/drivers/net/stmmac/stmmac.h
index de1929b..bccf1f1 100644
--- a/drivers/net/stmmac/stmmac.h
+++ b/drivers/net/stmmac/stmmac.h
@@ -27,6 +27,7 @@
 #ifdef CONFIG_STMMAC_TIMER
 #include "stmmac_timer.h"
 #endif
+#include "mmc.h"
 
 struct stmmac_priv {
 	/* Frequently used values are kept adjacent for cache effect */
@@ -76,6 +77,7 @@ struct stmmac_priv {
 	struct stmmac_timer *tm;
 #endif
 	struct plat_stmmacenet_data *plat;
+	struct stmmac_counters mmc;
 };
 
 extern int stmmac_mdio_unregister(struct net_device *ndev);
diff --git a/drivers/net/stmmac/stmmac_main.c b/drivers/net/stmmac/stmmac_main.c
index da11405..4655fab 100644
--- a/drivers/net/stmmac/stmmac_main.c
+++ b/drivers/net/stmmac/stmmac_main.c
@@ -47,6 +47,10 @@
 #include <linux/slab.h>
 #include <linux/prefetch.h>
 #include "stmmac.h"
+#ifdef CONFIG_STMMAC_MONITOR
+#include <linux/debugfs.h>
+#include <linux/seq_file.h>
+#endif
 
 #define STMMAC_RESOURCE_NAME	"stmmaceth"
 
@@ -747,6 +751,17 @@ static void stmmac_dma_interrupt(struct stmmac_priv *priv)
 		stmmac_tx_err(priv);
 }
 
+static void stmmac_mmc_setup(struct stmmac_priv *priv)
+{
+	unsigned int mode = MMC_CNTRL_RESET_ON_READ | MMC_CNTRL_COUNTER_RESET |
+			    MMC_CNTRL_PRESET | MMC_CNTRL_FULL_HALF_PRESET;
+
+	/* Do not manage MMC IRQ (FIXME) */
+	dwmac_mmc_intr_all_mask(priv->ioaddr);
+	dwmac_mmc_ctrl(priv->ioaddr, mode);
+	memset(&priv->mmc, 0, sizeof(struct stmmac_counters));
+}
+
 /**
  *  stmmac_open - open entry point of the driver
  *  @dev : pointer to the device structure.
@@ -845,6 +860,8 @@ static int stmmac_open(struct net_device *dev)
 	memset(&priv->xstats, 0, sizeof(struct stmmac_extra_stats));
 	priv->xstats.threshold = tc;
 
+	stmmac_mmc_setup(priv);
+
 	/* Start the ball rolling... */
 	DBG(probe, DEBUG, "%s: DMA RX/TX processes started...\n", dev->name);
 	priv->hw->dma->start_tx(priv->ioaddr);
@@ -1411,6 +1428,254 @@ static int stmmac_ioctl(struct net_device *dev, struct ifreq *rq, int cmd)
 	return ret;
 }
 
+#ifdef CONFIG_STMMAC_MONITOR
+static struct dentry *stmmac_fs_dir;
+static struct dentry *stmmac_mmc;
+
+/* To only dump counters > 0 */
+static inline void seq_printf_mmc(struct seq_file *seq,
+				  const char *description,
+				  unsigned int counter)
+{
+	if (counter > 0)
+		seq_printf(seq, "\t%s%d\n", description, counter);
+}
+
+static int stmmac_sysfs_mmc_read(struct seq_file *seq, void *v)
+{
+	struct net_device *dev = seq->private;
+	struct stmmac_priv *priv = netdev_priv(dev);
+
+	seq_printf(seq, "==============================\n");
+	seq_printf(seq, "MMC Management HW MAC counters\n");
+	seq_printf(seq, "==============================\n");
+
+	dwmac_mmc_read(priv->ioaddr, &priv->mmc);
+
+	seq_printf(seq, " -- MMC TX counter counters ---\n");
+
+	seq_printf_mmc(seq, "Number of bytes in good and bad frames: ",
+		       priv->mmc.mmc_tx_octetcount_gb);
+	seq_printf_mmc(seq, "Number of good and bad frames transmitted: ",
+		       priv->mmc.mmc_tx_framecount_gb);
+	seq_printf_mmc(seq, "Number of good Broadcast frame transmitted: ",
+		       priv->mmc.mmc_tx_broadcastframe_g);
+	seq_printf_mmc(seq, "Number of good Multicast frame transmitted: ",
+		       priv->mmc.mmc_tx_multicastframe_g);
+	seq_printf_mmc(seq, "Good and bad frames of 64bytes: ",
+		       priv->mmc.mmc_tx_64_octets_gb);
+	seq_printf_mmc(seq, "Good and bad frames of 65-127: ",
+		       priv->mmc.mmc_tx_65_to_127_octets_gb);
+	seq_printf_mmc(seq, "Good and bad frames of 128-255: ",
+		       priv->mmc.mmc_tx_128_to_255_octets_gb);
+	seq_printf_mmc(seq, "Good and bad frames of 256-511: ",
+		       priv->mmc.mmc_tx_256_to_511_octets_gb);
+	seq_printf_mmc(seq, "Good and bad frames of 512-1024: ",
+		       priv->mmc.mmc_tx_512_to_1023_octets_gb);
+	seq_printf_mmc(seq, "Good and bad frames bigger than 1024: ",
+		       priv->mmc.mmc_tx_1024_to_max_octets_gb);
+	seq_printf_mmc(seq, "Good and bad unicast frames: ",
+		       priv->mmc.mmc_tx_unicast_gb);
+	seq_printf_mmc(seq, "Good and bad multicast frames: ",
+		       priv->mmc.mmc_tx_multicast_gb);
+	seq_printf_mmc(seq, "Good and bad broadcast frames: ",
+		       priv->mmc.mmc_tx_broadcast_gb);
+	seq_printf_mmc(seq, "Number of tx Underflow errors: ",
+		       priv->mmc.mmc_tx_underflow_error);
+	seq_printf_mmc(seq, "Good frms after single collision (Half Duplex): ",
+		       priv->mmc.mmc_tx_singlecol_g);
+	seq_printf_mmc(seq, "Good frames after multi collision (Half Duplex): ",
+		       priv->mmc.mmc_tx_multicol_g);
+	seq_printf_mmc(seq, "Good frames after deferral error (Half Duplex): ",
+		       priv->mmc.mmc_tx_deferred);
+	seq_printf_mmc(seq, "Frame aborted after late collisions: ",
+		       priv->mmc.mmc_tx_latecol);
+	seq_printf_mmc(seq, "Frame aborted after excessive collisions: ",
+		       priv->mmc.mmc_tx_exesscol);
+	seq_printf_mmc(seq, "Frame aborted after carrier sense error: ",
+		       priv->mmc.mmc_tx_carrier_error);
+	seq_printf_mmc(seq, "Number of bytes (no preamble) in good frames: ",
+		       priv->mmc.mmc_tx_octetcount_g);
+	seq_printf_mmc(seq, "Number of good frames transmitted: ",
+		       priv->mmc.mmc_tx_framecount_g);
+	seq_printf_mmc(seq, "Frame aborted due to excessive deferral errors: ",
+		       priv->mmc.mmc_tx_excessdef);
+	seq_printf_mmc(seq, "Number of good pause frames: ",
+		       priv->mmc.mmc_tx_pause_frame);
+	seq_printf_mmc(seq, "Number of good VLAN frames: ",
+		       priv->mmc.mmc_tx_vlan_frame_g);
+
+	seq_printf(seq, " -- MMC RX counter counters ---\n");
+
+	seq_printf_mmc(seq, "Number of good and bad frames: ",
+		       priv->mmc.mmc_rx_framecount_gb);
+	seq_printf_mmc(seq, "Number of bytes in  good and bad frames recv: ",
+		       priv->mmc.mmc_rx_octetcount_gb);
+	seq_printf_mmc(seq, "Number of bytes in good frms (no preample): ",
+		       priv->mmc.mmc_rx_octetcount_g);
+	seq_printf_mmc(seq, "Good broadcast frames: ",
+		       priv->mmc.mmc_rx_broadcastframe_g);
+	seq_printf_mmc(seq, "Good multicast frames: ",
+		       priv->mmc.mmc_rx_multicastframe_g);
+	seq_printf_mmc(seq, "Number of CRC err: ",
+		       priv->mmc.mmc_rx_crc_errror);
+	seq_printf_mmc(seq, "Frames with dribble error: ",
+		       priv->mmc.mmc_rx_align_error);
+	seq_printf_mmc(seq, "Number of frames with runt error: ",
+		       priv->mmc.mmc_rx_run_error);
+	seq_printf_mmc(seq, "Number of giant frame: ",
+		       priv->mmc.mmc_rx_jabber_error);
+	seq_printf_mmc(seq, "Number of frames with size < 64bytes w/o error: ",
+		       priv->mmc.mmc_rx_undersize_g);
+	seq_printf_mmc(seq, "Number of oversized frames w/o errors: ",
+		       priv->mmc.mmc_rx_oversize_g);
+	seq_printf_mmc(seq, "Good and bad frames of 64bytes: ",
+		       priv->mmc.mmc_rx_64_octets_gb);
+	seq_printf_mmc(seq, "Good and bad frames of 65-127: ",
+		       priv->mmc.mmc_rx_65_to_127_octets_gb);
+	seq_printf_mmc(seq, "Good and bad frames of 128-255: ",
+		       priv->mmc.mmc_rx_128_to_255_octets_gb);
+	seq_printf_mmc(seq, "Good and bad frames of 256-511: ",
+		       priv->mmc.mmc_rx_256_to_511_octets_gb);
+	seq_printf_mmc(seq, "Good and bad frames of 512-1024: ",
+		       priv->mmc.mmc_rx_512_to_1023_octets_gb);
+	seq_printf_mmc(seq, "Good and bad frames bigger than 1024: ",
+		       priv->mmc.mmc_rx_1024_to_max_octets_gb);
+	seq_printf_mmc(seq, "Good unicast frames: ",
+		       priv->mmc.mmc_rx_unicast_g);
+	seq_printf_mmc(seq, "Number of frames with length errors: ",
+		       priv->mmc.mmc_rx_length_error);
+	seq_printf_mmc(seq, "Number of frames with length not equal to the"
+		       " valid frame size: ",  priv->mmc.mmc_rx_autofrangetype);
+	seq_printf_mmc(seq, "Number of goog and valid PAUSE frames: ",
+		       priv->mmc.mmc_rx_pause_frames);
+	seq_printf_mmc(seq, "Number of FIFO overflow errors: ",
+		       priv->mmc.mmc_rx_fifo_overflow);
+	seq_printf_mmc(seq, "Number of good VLAN frames: ",
+		       priv->mmc.mmc_rx_vlan_frames_gb);
+	seq_printf_mmc(seq, "Number of frames with watchdog error: ",
+		       priv->mmc.mmc_rx_watchdog_error);
+
+	seq_printf(seq, " -- MMC IPC counters  ---\n");
+	seq_printf_mmc(seq, "MMC IPC interrupt mask: ",
+			priv->mmc.mmc_rx_ipc_intr_mask);
+	seq_printf_mmc(seq, "MMC RX IPC interrupt mask:",
+		       priv->mmc.mmc_rx_ipc_intr);
+
+	seq_printf(seq, " -- MMC RX IPv4 counters  ---\n");
+	seq_printf_mmc(seq, "Good TCP/UDP/ICMP datagram: s",
+		       priv->mmc.mmc_rx_ipv4_gd);
+	seq_printf_mmc(seq, "IPv4 with header errors (CRC/len/vers mismatch): ",
+		       priv->mmc.mmc_rx_ipv4_hderr);
+	seq_printf_mmc(seq, "Number of datagram received that did not have"
+		       " TCP/UDP/ICMP payload processed by the Csum engine: ",
+		       priv->mmc.mmc_rx_ipv4_nopay);
+	seq_printf_mmc(seq, "Good datagram with fragmentation: ",
+		       priv->mmc.mmc_rx_ipv4_frag);
+	seq_printf_mmc(seq, "Good datagram that had a UDP payload with"
+		       "checksum disabled: ", priv->mmc.mmc_rx_ipv4_udsbl);
+
+	seq_printf_mmc(seq, "N. of bytes in good datagrams: ",
+		       priv->mmc.mmc_rx_ipv4_gd_octets);
+	seq_printf_mmc(seq, "N. of bytes with header errors: ",
+		       priv->mmc.mmc_rx_ipv4_hderr_octets);
+	seq_printf_mmc(seq, "N. of bytes that not have TCP/UDP/ICMP payload: ",
+		       priv->mmc.mmc_rx_ipv4_nopay_octets);
+	seq_printf_mmc(seq, "N. of bytes in fragmented datagrams: ",
+		       priv->mmc.mmc_rx_ipv4_frag_octets);
+	seq_printf_mmc(seq, "N. of bytes in UDP segment w/o csum: ",
+		       priv->mmc.mmc_rx_ipv4_udsbl_octets);
+
+	seq_printf(seq, " -- MMC RX IPV6 counters  ---\n");
+	seq_printf_mmc(seq, "N. of good datagrams with TCP/UDP/ICMP payload: ",
+		       priv->mmc.mmc_rx_ipv6_gd);
+	seq_printf_mmc(seq, "N. of good datagrams with header errors: ",
+		       priv->mmc.mmc_rx_ipv6_hderr);
+	seq_printf_mmc(seq, "N. of datagram received that did not have"
+		       " TCP/UDP/ICMP payload: ", priv->mmc.mmc_rx_ipv6_nopay);
+
+	seq_printf_mmc(seq, "N. of bytes in good IPv6 datagrams: ",
+		       priv->mmc.mmc_rx_ipv6_gd_octets);
+	seq_printf_mmc(seq, "N. of bytes in IPv6 datagrams with header err: ",
+		       priv->mmc.mmc_rx_ipv6_hderr_octets);
+	seq_printf_mmc(seq, "N. of bytes in datagrams that did not have a"
+		       " TCP/UDP/ICMP payload: ",
+		       priv->mmc.mmc_rx_ipv6_nopay_octets);
+
+	seq_printf(seq, " -- MMC Protocols RX counters  ---\n");
+	seq_printf_mmc(seq, "N. of good IP datagrams with a good UDP payload: ",
+		       priv->mmc.mmc_rx_udp_gd);
+	seq_printf_mmc(seq, " and the number of bytes:",
+		       priv->mmc.mmc_rx_udp_gd_octets);
+	seq_printf_mmc(seq, "N. of good IP datagrams whose UDP payload has a "
+		       "checksum error: ",  priv->mmc.mmc_rx_udp_err);
+	seq_printf_mmc(seq, " and the number of bytes: ",
+		       priv->mmc.mmc_rx_udp_err_octets);
+	seq_printf_mmc(seq, "N. of good IP datagrams with a good TCP payload: ",
+		       priv->mmc.mmc_rx_tcp_gd);
+	seq_printf_mmc(seq, " and the number of bytes:",
+		       priv->mmc.mmc_rx_tcp_gd_octets);
+	seq_printf_mmc(seq, "N. of good IP datagrams whose TCP payload has a "
+		       "checksum error: ",  priv->mmc.mmc_rx_tcp_err);
+	seq_printf_mmc(seq, " and the number of bytes: ",
+		       priv->mmc.mmc_rx_tcp_err_octets);
+	seq_printf_mmc(seq, "N. of good IP datagrams with good ICMP payload: ",
+		       priv->mmc.mmc_rx_icmp_gd);
+	seq_printf_mmc(seq, " and the number of bytes: ",
+		       priv->mmc.mmc_rx_icmp_gd_octets);
+	seq_printf_mmc(seq, "N. of good IP datagrams whose ICMP payload has a "
+		       "checksum error: ",  priv->mmc.mmc_rx_icmp_err);
+	seq_printf_mmc(seq, " and the number of bytes: ",
+		       priv->mmc.mmc_rx_icmp_err_octets);
+
+	return 0;
+}
+
+static int stmmac_sysfs_mmc_open(struct inode *inode, struct file *file)
+{
+	return single_open(file, stmmac_sysfs_mmc_read, inode->i_private);
+}
+
+static const struct file_operations stmmac_mmc_fops = {
+	.owner = THIS_MODULE,
+	.open = stmmac_sysfs_mmc_open,
+	.read = seq_read,
+	.llseek = seq_lseek,
+	.release = seq_release,
+};
+
+static int stmmac_init_fs(struct net_device *dev)
+{
+	/* Create debugfs entries */
+	stmmac_fs_dir = debugfs_create_dir(STMMAC_RESOURCE_NAME, NULL);
+
+	if (!stmmac_fs_dir || IS_ERR(stmmac_fs_dir)) {
+		pr_err("ERROR %s, debugfs create directory failed\n",
+		       STMMAC_RESOURCE_NAME);
+
+		return -ENOMEM;
+	}
+
+	stmmac_mmc = debugfs_create_file("mmc", S_IRUGO, stmmac_fs_dir, dev,
+					   &stmmac_mmc_fops);
+
+	if (!stmmac_mmc || IS_ERR(stmmac_mmc)) {
+		pr_info("ERROR creating stmmac MMC debugfs file\n");
+		debugfs_remove(stmmac_fs_dir);
+
+		return -ENOMEM;
+	}
+
+	return 0;
+}
+
+static void stmmac_exit_fs(void)
+{
+	debugfs_remove(stmmac_mmc);
+	debugfs_remove(stmmac_fs_dir);
+}
+#endif /* CONFIG_STMMAC_MONITOR */
+
 static const struct net_device_ops stmmac_netdev_ops = {
 	.ndo_open = stmmac_open,
 	.ndo_start_xmit = stmmac_xmit,
@@ -1623,6 +1888,13 @@ static int stmmac_dvr_probe(struct platform_device *pdev)
 	if (ret < 0)
 		goto out_unregister;
 	pr_debug("registered!\n");
+
+#ifdef CONFIG_STMMAC_MONITOR
+	ret = stmmac_init_fs(ndev);
+	if (ret < 0)
+		pr_warning("\tFailed debugFS registration");
+#endif
+
 	return 0;
 
 out_unregister:
@@ -1675,6 +1947,10 @@ static int stmmac_dvr_remove(struct platform_device *pdev)
 	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
 	release_mem_region(res->start, resource_size(res));
 
+#ifdef CONFIG_STMMAC_MONITOR
+	stmmac_exit_fs();
+#endif
+
 	free_netdev(ndev);
 
 	return 0;
@@ -1810,7 +2086,6 @@ static struct platform_driver stmmac_driver = {
 static int __init stmmac_init_module(void)
 {
 	int ret;
-
 	ret = platform_driver_register(&stmmac_driver);
 	return ret;
 }
-- 
1.7.4.4

^ permalink raw reply related

* [PATCH 4/9] stmmac: export DMA TX/RX rings via debugfs.
From: Giuseppe CAVALLARO @ 2011-08-25  8:00 UTC (permalink / raw)
  To: netdev; +Cc: Giuseppe Cavallaro
In-Reply-To: <1314259229-13767-1-git-send-email-peppe.cavallaro@st.com>

This patch adds the following debugFs entry to dump the
RX/TX DMA rings:

 /sys/kernel/debug/stmmaceth/descriptors_status

This is an example:
=======================
 RX descriptor ring
=======================
[0] DES0=0x85ee0320 DES1=0x1fff1fff BUF1=0x5fae2022 BUF2=0x0
[1] DES0=0x85ee0320 DES1=0x1fff1fff BUF1=0x5fae0022 BUF2=0x0
[2] DES0=0x81460320 DES1=0x1fff1fff BUF1=0x5f9dd022 BUF2=0x0
...

Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
---
 drivers/net/stmmac/Kconfig       |    3 +-
 drivers/net/stmmac/stmmac_main.c |   68 ++++++++++++++++++++++++++++++++++++++
 2 files changed, 70 insertions(+), 1 deletions(-)

diff --git a/drivers/net/stmmac/Kconfig b/drivers/net/stmmac/Kconfig
index 0a1bd82..57ae3716 100644
--- a/drivers/net/stmmac/Kconfig
+++ b/drivers/net/stmmac/Kconfig
@@ -18,7 +18,8 @@ config STMMAC_MONITOR
 	-- help
 	  To Enable monitoring via debufs.
 	  The stmmac entry in /sys reports the state of the HW monitoring
-	  counters (if supported).
+	  counters (if supported) and other useful debug information
+	  (e.g. DMA TX/RX rings).
 
 config STMMAC_DA
 	bool "STMMAC DMA arbitration scheme"
diff --git a/drivers/net/stmmac/stmmac_main.c b/drivers/net/stmmac/stmmac_main.c
index 4655fab..50fe89f 100644
--- a/drivers/net/stmmac/stmmac_main.c
+++ b/drivers/net/stmmac/stmmac_main.c
@@ -1430,8 +1430,61 @@ static int stmmac_ioctl(struct net_device *dev, struct ifreq *rq, int cmd)
 
 #ifdef CONFIG_STMMAC_MONITOR
 static struct dentry *stmmac_fs_dir;
+static struct dentry *stmmac_rings_status;
 static struct dentry *stmmac_mmc;
 
+static int stmmac_sysfs_ring_read(struct seq_file *seq, void *v)
+{
+	struct tmp_s {
+		u64 a;
+		unsigned int b;
+		unsigned int c;
+	};
+	int i;
+	struct net_device *dev = seq->private;
+	struct stmmac_priv *priv = netdev_priv(dev);
+
+	seq_printf(seq, "=======================\n");
+	seq_printf(seq, " RX descriptor ring\n");
+	seq_printf(seq, "=======================\n");
+
+	for (i = 0; i < priv->dma_rx_size; i++) {
+		struct tmp_s *x = (struct tmp_s *)(priv->dma_rx + i);
+		seq_printf(seq, "[%d] DES0=0x%x DES1=0x%x BUF1=0x%x BUF2=0x%x",
+			   i, (unsigned int)(x->a),
+			   (unsigned int)((x->a) >> 32), x->b, x->c);
+		seq_printf(seq, "\n");
+	}
+
+	seq_printf(seq, "\n");
+	seq_printf(seq, "=======================\n");
+	seq_printf(seq, "  TX descriptor ring\n");
+	seq_printf(seq, "=======================\n");
+
+	for (i = 0; i < priv->dma_tx_size; i++) {
+		struct tmp_s *x = (struct tmp_s *)(priv->dma_tx + i);
+		seq_printf(seq, "[%d] DES0=0x%x DES1=0x%x BUF1=0x%x BUF2=0x%x",
+			   i, (unsigned int)(x->a),
+			   (unsigned int)((x->a) >> 32), x->b, x->c);
+		seq_printf(seq, "\n");
+	}
+
+	return 0;
+}
+
+static int stmmac_sysfs_ring_open(struct inode *inode, struct file *file)
+{
+	return single_open(file, stmmac_sysfs_ring_read, inode->i_private);
+}
+
+static const struct file_operations stmmac_rings_status_fops = {
+	.owner = THIS_MODULE,
+	.open = stmmac_sysfs_ring_open,
+	.read = seq_read,
+	.llseek = seq_lseek,
+	.release = seq_release,
+};
+
 /* To only dump counters > 0 */
 static inline void seq_printf_mmc(struct seq_file *seq,
 				  const char *description,
@@ -1656,11 +1709,25 @@ static int stmmac_init_fs(struct net_device *dev)
 		return -ENOMEM;
 	}
 
+	/* Entry to report DMA RX/TX rings */
+	stmmac_rings_status = debugfs_create_file("descriptors_status",
+					   S_IRUGO, stmmac_fs_dir, dev,
+					   &stmmac_rings_status_fops);
+
+	if (!stmmac_rings_status || IS_ERR(stmmac_rings_status)) {
+		pr_info("ERROR creating stmmac ring debugfs file\n");
+		debugfs_remove(stmmac_fs_dir);
+
+		return -ENOMEM;
+	}
+
+	/* Entry to report MAC Management counters */
 	stmmac_mmc = debugfs_create_file("mmc", S_IRUGO, stmmac_fs_dir, dev,
 					   &stmmac_mmc_fops);
 
 	if (!stmmac_mmc || IS_ERR(stmmac_mmc)) {
 		pr_info("ERROR creating stmmac MMC debugfs file\n");
+		debugfs_remove(stmmac_rings_status);
 		debugfs_remove(stmmac_fs_dir);
 
 		return -ENOMEM;
@@ -1671,6 +1738,7 @@ static int stmmac_init_fs(struct net_device *dev)
 
 static void stmmac_exit_fs(void)
 {
+	debugfs_remove(stmmac_rings_status);
 	debugfs_remove(stmmac_mmc);
 	debugfs_remove(stmmac_fs_dir);
 }
-- 
1.7.4.4

^ permalink raw reply related

* [PATCH 5/9] stmmac: support wake up irq from external sources
From: Giuseppe CAVALLARO @ 2011-08-25  8:00 UTC (permalink / raw)
  To: netdev; +Cc: Deepak Sikri, Giuseppe Cavallaro
In-Reply-To: <1314259229-13767-1-git-send-email-peppe.cavallaro@st.com>

From: Deepak Sikri <deepak.sikri@st.com>

On some platforms e.g. SPEAr the wake up irq differs from the
GMAC interrupt source.
With this patch an external wake up irq can be passed through the
platform code and named as "eth_wake_irq".

In case the wake up interrupt is not passed from the platform
so the driver will continue to use the mac irq (ndev->irq)

Signed-off-by: Deepak Sikri <deepak.sikri@st.com>
Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
---
 drivers/net/stmmac/stmmac.h         |    1 +
 drivers/net/stmmac/stmmac_ethtool.c |    4 ++--
 drivers/net/stmmac/stmmac_main.c    |   14 +++++++++++++-
 3 files changed, 16 insertions(+), 3 deletions(-)

diff --git a/drivers/net/stmmac/stmmac.h b/drivers/net/stmmac/stmmac.h
index bccf1f1..f86ea87 100644
--- a/drivers/net/stmmac/stmmac.h
+++ b/drivers/net/stmmac/stmmac.h
@@ -73,6 +73,7 @@ struct stmmac_priv {
 	spinlock_t lock;
 	int wolopts;
 	int wolenabled;
+	int wol_irq;
 #ifdef CONFIG_STMMAC_TIMER
 	struct stmmac_timer *tm;
 #endif
diff --git a/drivers/net/stmmac/stmmac_ethtool.c b/drivers/net/stmmac/stmmac_ethtool.c
index 7ed8fb6..79df79d 100644
--- a/drivers/net/stmmac/stmmac_ethtool.c
+++ b/drivers/net/stmmac/stmmac_ethtool.c
@@ -321,10 +321,10 @@ static int stmmac_set_wol(struct net_device *dev, struct ethtool_wolinfo *wol)
 	if (wol->wolopts) {
 		pr_info("stmmac: wakeup enable\n");
 		device_set_wakeup_enable(priv->device, 1);
-		enable_irq_wake(dev->irq);
+		enable_irq_wake(priv->wol_irq);
 	} else {
 		device_set_wakeup_enable(priv->device, 0);
-		disable_irq_wake(dev->irq);
+		disable_irq_wake(priv->wol_irq);
 	}
 
 	spin_lock_irq(&priv->lock);
diff --git a/drivers/net/stmmac/stmmac_main.c b/drivers/net/stmmac/stmmac_main.c
index 50fe89f..5b20e1b 100644
--- a/drivers/net/stmmac/stmmac_main.c
+++ b/drivers/net/stmmac/stmmac_main.c
@@ -1845,7 +1845,7 @@ static int stmmac_mac_device_setup(struct net_device *dev)
 
 	if (device_can_wakeup(priv->device)) {
 		priv->wolopts = WAKE_MAGIC; /* Magic Frame as default */
-		enable_irq_wake(dev->irq);
+		enable_irq_wake(priv->wol_irq);
 	}
 
 	return 0;
@@ -1918,6 +1918,18 @@ static int stmmac_dvr_probe(struct platform_device *pdev)
 		pr_info("\tPMT module supported\n");
 		device_set_wakeup_capable(&pdev->dev, 1);
 	}
+	/*
+	 * On some platforms e.g. SPEAr the wake up irq differs from the mac irq
+	 * The external wake up irq can be passed through the platform code
+	 * named as "eth_wake_irq"
+	 *
+	 * In case the wake up interrupt is not passed from the platform
+	 * so the driver will continue to use the mac irq (ndev->irq)
+	 */
+	priv->wol_irq = platform_get_irq_byname(pdev, "eth_wake_irq");
+	if (priv->wol_irq == -ENXIO)
+		priv->wol_irq = ndev->irq;
+
 
 	platform_set_drvdata(pdev, ndev);
 
-- 
1.7.4.4

^ permalink raw reply related

* [PATCH 6/9] stmmac: rework the code to get the Synopsys ID
From: Giuseppe CAVALLARO @ 2011-08-25  8:00 UTC (permalink / raw)
  To: netdev; +Cc: Giuseppe Cavallaro
In-Reply-To: <1314259229-13767-1-git-send-email-peppe.cavallaro@st.com>

The Synopsys ID is now passed from the MAC core
to the main. This info will be used for managing
the HW cap register (supported in the new GMAC
generations).

Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
---
 drivers/net/stmmac/common.h         |    1 +
 drivers/net/stmmac/dwmac1000_core.c |    6 ++----
 drivers/net/stmmac/dwmac100_core.c  |    1 +
 drivers/net/stmmac/stmmac_main.c    |   20 +++++++++++++++++++-
 4 files changed, 23 insertions(+), 5 deletions(-)

diff --git a/drivers/net/stmmac/common.h b/drivers/net/stmmac/common.h
index 290b97a..79c6171 100644
--- a/drivers/net/stmmac/common.h
+++ b/drivers/net/stmmac/common.h
@@ -229,6 +229,7 @@ struct mac_device_info {
 	const struct stmmac_dma_ops	*dma;
 	struct mii_regs mii;	/* MII register Addresses */
 	struct mac_link link;
+	unsigned int synopsys_uid;
 };
 
 struct mac_device_info *dwmac1000_setup(void __iomem *ioaddr);
diff --git a/drivers/net/stmmac/dwmac1000_core.c b/drivers/net/stmmac/dwmac1000_core.c
index 9ba9cae..b1c48b9 100644
--- a/drivers/net/stmmac/dwmac1000_core.c
+++ b/drivers/net/stmmac/dwmac1000_core.c
@@ -224,10 +224,7 @@ static const struct stmmac_ops dwmac1000_ops = {
 struct mac_device_info *dwmac1000_setup(void __iomem *ioaddr)
 {
 	struct mac_device_info *mac;
-	u32 uid = readl(ioaddr + GMAC_VERSION);
-
-	pr_info("\tDWMAC1000 - user ID: 0x%x, Synopsys ID: 0x%x\n",
-		((uid & 0x0000ff00) >> 8), (uid & 0x000000ff));
+	u32 hwid = readl(ioaddr + GMAC_VERSION);
 
 	mac = kzalloc(sizeof(const struct mac_device_info), GFP_KERNEL);
 	if (!mac)
@@ -241,6 +238,7 @@ struct mac_device_info *dwmac1000_setup(void __iomem *ioaddr)
 	mac->link.speed = GMAC_CONTROL_FES;
 	mac->mii.addr = GMAC_MII_ADDR;
 	mac->mii.data = GMAC_MII_DATA;
+	mac->synopsys_uid = hwid;
 
 	return mac;
 }
diff --git a/drivers/net/stmmac/dwmac100_core.c b/drivers/net/stmmac/dwmac100_core.c
index aacfc6e..138fb8d 100644
--- a/drivers/net/stmmac/dwmac100_core.c
+++ b/drivers/net/stmmac/dwmac100_core.c
@@ -188,6 +188,7 @@ struct mac_device_info *dwmac100_setup(void __iomem *ioaddr)
 	mac->link.speed = 0;
 	mac->mii.addr = MAC_MII_ADDR;
 	mac->mii.data = MAC_MII_DATA;
+	mac->synopsys_uid = 0;
 
 	return mac;
 }
diff --git a/drivers/net/stmmac/stmmac_main.c b/drivers/net/stmmac/stmmac_main.c
index 5b20e1b..f5aac12 100644
--- a/drivers/net/stmmac/stmmac_main.c
+++ b/drivers/net/stmmac/stmmac_main.c
@@ -762,6 +762,23 @@ static void stmmac_mmc_setup(struct stmmac_priv *priv)
 	memset(&priv->mmc, 0, sizeof(struct stmmac_counters));
 }
 
+static u32 stmmac_get_synopsys_id(struct stmmac_priv *priv)
+{
+	u32 hwid = priv->hw->synopsys_uid;
+
+	/* Only check valid Synopsys Id because old MAC chips
+	 * have no HW registers where get the ID */
+	if (likely(hwid)) {
+		u32 uid = ((hwid & 0x0000ff00) >> 8);
+		u32 synid = (hwid & 0x000000ff);
+
+		pr_info("STMMAC - user ID: 0x%x, Synopsys ID: 0x%x\n",
+			uid, synid);
+
+		return synid;
+	}
+	return 0;
+}
 /**
  *  stmmac_open - open entry point of the driver
  *  @dev : pointer to the device structure.
@@ -834,7 +851,8 @@ static int stmmac_open(struct net_device *dev)
 	/* Initialize the MAC Core */
 	priv->hw->mac->core_init(priv->ioaddr);
 
-	priv->rx_coe = priv->hw->mac->rx_coe(priv->ioaddr);
+	stmmac_get_synopsys_id(priv);
+
 	if (priv->rx_coe)
 		pr_info("stmmac: Rx Checksum Offload Engine supported\n");
 	if (priv->plat->tx_coe)
-- 
1.7.4.4

^ permalink raw reply related

* [PATCH 7/9] stmmac: add HW DMA feature register
From: Giuseppe CAVALLARO @ 2011-08-25  8:00 UTC (permalink / raw)
  To: netdev; +Cc: Giuseppe Cavallaro
In-Reply-To: <1314259229-13767-1-git-send-email-peppe.cavallaro@st.com>

New GMAC chips have an extra register to indicate
the presence of the optional features/functions of
the DMA core.

This patch adds this support and all the HW cap
are exported via debugfs.

Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
---
 drivers/net/stmmac/common.h        |   33 +++++++++
 drivers/net/stmmac/dwmac1000_dma.c |    6 ++
 drivers/net/stmmac/dwmac_dma.h     |    1 +
 drivers/net/stmmac/stmmac.h        |    1 +
 drivers/net/stmmac/stmmac_main.c   |  132 ++++++++++++++++++++++++++++++++++++
 5 files changed, 173 insertions(+), 0 deletions(-)

diff --git a/drivers/net/stmmac/common.h b/drivers/net/stmmac/common.h
index 79c6171..1733019 100644
--- a/drivers/net/stmmac/common.h
+++ b/drivers/net/stmmac/common.h
@@ -115,6 +115,37 @@ enum tx_dma_irq_status {
 	handle_tx_rx = 3,
 };
 
+/* DMA HW capabilities */
+struct dma_features {
+	unsigned int mbps_10_100;
+	unsigned int mbps_1000;
+	unsigned int half_duplex;
+	unsigned int hash_filter;
+	unsigned int multi_addr;
+	unsigned int pcs;
+	unsigned int sma_mdio;
+	unsigned int pmt_remote_wake_up;
+	unsigned int pmt_magic_frame;
+	unsigned int rmon;
+	/* IEEE 1588-2002*/
+	unsigned int time_stamp;
+	/* IEEE 1588-2008*/
+	unsigned int atime_stamp;
+	/* 802.3az - Energy-Efficient Ethernet (EEE) */
+	unsigned int eee;
+	unsigned int av;
+	/* TX and RX csum */
+	unsigned int tx_coe;
+	unsigned int rx_coe_type1;
+	unsigned int rx_coe_type2;
+	unsigned int rxfifo_over_2048;
+	/* TX and RX number of channels */
+	unsigned int number_rx_channel;
+	unsigned int number_tx_channel;
+	/* Alternate (enhanced) DESC mode*/
+	unsigned int enh_desc;
+};
+
 /* GMAC TX FIFO is 8K, Rx FIFO is 16K */
 #define BUF_SIZE_16KiB 16384
 #define BUF_SIZE_8KiB 8192
@@ -187,6 +218,8 @@ struct stmmac_dma_ops {
 	void (*stop_rx) (void __iomem *ioaddr);
 	int (*dma_interrupt) (void __iomem *ioaddr,
 			      struct stmmac_extra_stats *x);
+	/* If supported then get the optional core features */
+	unsigned int (*get_hw_feature) (void __iomem *ioaddr);
 };
 
 struct stmmac_ops {
diff --git a/drivers/net/stmmac/dwmac1000_dma.c b/drivers/net/stmmac/dwmac1000_dma.c
index 3dbeea6..5c17d35 100644
--- a/drivers/net/stmmac/dwmac1000_dma.c
+++ b/drivers/net/stmmac/dwmac1000_dma.c
@@ -139,6 +139,11 @@ static void dwmac1000_dump_dma_regs(void __iomem *ioaddr)
 	}
 }
 
+static unsigned int dwmac1000_get_hw_feature(void __iomem *ioaddr)
+{
+	return readl(ioaddr + DMA_HW_FEATURE);
+}
+
 const struct stmmac_dma_ops dwmac1000_dma_ops = {
 	.init = dwmac1000_dma_init,
 	.dump_regs = dwmac1000_dump_dma_regs,
@@ -152,4 +157,5 @@ const struct stmmac_dma_ops dwmac1000_dma_ops = {
 	.start_rx = dwmac_dma_start_rx,
 	.stop_rx = dwmac_dma_stop_rx,
 	.dma_interrupt = dwmac_dma_interrupt,
+	.get_hw_feature = dwmac1000_get_hw_feature,
 };
diff --git a/drivers/net/stmmac/dwmac_dma.h b/drivers/net/stmmac/dwmac_dma.h
index da3f5cc..437edac 100644
--- a/drivers/net/stmmac/dwmac_dma.h
+++ b/drivers/net/stmmac/dwmac_dma.h
@@ -34,6 +34,7 @@
 #define DMA_MISSED_FRAME_CTR	0x00001020	/* Missed Frame Counter */
 #define DMA_CUR_TX_BUF_ADDR	0x00001050	/* Current Host Tx Buffer */
 #define DMA_CUR_RX_BUF_ADDR	0x00001054	/* Current Host Rx Buffer */
+#define DMA_HW_FEATURE		0x00001058	/* HW Feature Register */
 
 /* DMA Control register defines */
 #define DMA_CONTROL_ST		0x00002000	/* Start/Stop Transmission */
diff --git a/drivers/net/stmmac/stmmac.h b/drivers/net/stmmac/stmmac.h
index f86ea87..3461676 100644
--- a/drivers/net/stmmac/stmmac.h
+++ b/drivers/net/stmmac/stmmac.h
@@ -79,6 +79,7 @@ struct stmmac_priv {
 #endif
 	struct plat_stmmacenet_data *plat;
 	struct stmmac_counters mmc;
+	struct dma_features dma_cap;
 };
 
 extern int stmmac_mdio_unregister(struct net_device *ndev);
diff --git a/drivers/net/stmmac/stmmac_main.c b/drivers/net/stmmac/stmmac_main.c
index f5aac12..6d806b5 100644
--- a/drivers/net/stmmac/stmmac_main.c
+++ b/drivers/net/stmmac/stmmac_main.c
@@ -779,6 +779,49 @@ static u32 stmmac_get_synopsys_id(struct stmmac_priv *priv)
 	}
 	return 0;
 }
+
+/* New GMAC chips support a new register to indicate the
+ * presence of the optional feature/functions.
+ */
+static int stmmac_get_hw_features(struct stmmac_priv *priv)
+{
+	u32 hw_cap = priv->hw->dma->get_hw_feature(priv->ioaddr);
+
+	if (likely(hw_cap)) {
+		priv->dma_cap.mbps_10_100 = (hw_cap & 0x1);
+		priv->dma_cap.mbps_1000 = (hw_cap & 0x2) >> 1;
+		priv->dma_cap.half_duplex = (hw_cap & 0x4) >> 2;
+		priv->dma_cap.hash_filter = (hw_cap & 0x10) >> 4;
+		priv->dma_cap.multi_addr = (hw_cap & 0x20) >> 5;
+		priv->dma_cap.pcs = (hw_cap & 0x40) >> 6;
+		priv->dma_cap.sma_mdio = (hw_cap & 0x100) >> 8;
+		priv->dma_cap.pmt_remote_wake_up = (hw_cap & 0x200) >> 9;
+		priv->dma_cap.pmt_magic_frame = (hw_cap & 0x400) >> 10;
+		priv->dma_cap.rmon = (hw_cap & 0x800) >> 11; /* MMC */
+		/* IEEE 1588-2002*/
+		priv->dma_cap.time_stamp = (hw_cap & 0x1000) >> 12;
+		/* IEEE 1588-2008*/
+		priv->dma_cap.atime_stamp = (hw_cap & 0x2000) >> 13;
+		/* 802.3az - Energy-Efficient Ethernet (EEE) */
+		priv->dma_cap.eee = (hw_cap & 0x4000) >> 14;
+		priv->dma_cap.av = (hw_cap & 0x8000) >> 15;
+		/* TX and RX csum */
+		priv->dma_cap.tx_coe = (hw_cap & 0x10000) >> 16;
+		priv->dma_cap.rx_coe_type1 = (hw_cap & 0x20000) >> 17;
+		priv->dma_cap.rx_coe_type2 = (hw_cap & 0x40000) >> 18;
+		priv->dma_cap.rxfifo_over_2048 = (hw_cap & 0x80000) >> 19;
+		/* TX and RX number of channels */
+		priv->dma_cap.number_rx_channel = (hw_cap & 0x300000) >> 20;
+		priv->dma_cap.number_tx_channel = (hw_cap & 0xc00000) >> 22;
+		/* Alternate (enhanced) DESC mode*/
+		priv->dma_cap.enh_desc = (hw_cap & 0x1000000) >> 24;
+
+	} else
+		pr_debug("\tNo HW DMA feature register supported");
+
+	return hw_cap;
+}
+
 /**
  *  stmmac_open - open entry point of the driver
  *  @dev : pointer to the device structure.
@@ -853,6 +896,8 @@ static int stmmac_open(struct net_device *dev)
 
 	stmmac_get_synopsys_id(priv);
 
+	stmmac_get_hw_features(priv);
+
 	if (priv->rx_coe)
 		pr_info("stmmac: Rx Checksum Offload Engine supported\n");
 	if (priv->plat->tx_coe)
@@ -1450,6 +1495,7 @@ static int stmmac_ioctl(struct net_device *dev, struct ifreq *rq, int cmd)
 static struct dentry *stmmac_fs_dir;
 static struct dentry *stmmac_rings_status;
 static struct dentry *stmmac_mmc;
+static struct dentry *stmmac_dma_cap;
 
 static int stmmac_sysfs_ring_read(struct seq_file *seq, void *v)
 {
@@ -1715,6 +1761,78 @@ static const struct file_operations stmmac_mmc_fops = {
 	.release = seq_release,
 };
 
+static int stmmac_sysfs_dma_cap_read(struct seq_file *seq, void *v)
+{
+	struct net_device *dev = seq->private;
+	struct stmmac_priv *priv = netdev_priv(dev);
+
+	if (!stmmac_get_hw_features(priv)) {
+		seq_printf(seq, "DMA HW features not supported\n");
+		return 0;
+	}
+
+	seq_printf(seq, "==============================\n");
+	seq_printf(seq, "\tDMA HW features\n");
+	seq_printf(seq, "==============================\n");
+
+	seq_printf(seq, "\t10/100 Mbps %s\n",
+		   (priv->dma_cap.mbps_10_100) ? "Y" : "N");
+	seq_printf(seq, "\t1000 Mbps %s\n",
+		   (priv->dma_cap.mbps_1000) ? "Y" : "N");
+	seq_printf(seq, "\tHalf duple %s\n",
+		   (priv->dma_cap.half_duplex) ? "Y" : "N");
+	seq_printf(seq, "\tHash Filter: %s\n",
+		   (priv->dma_cap.hash_filter) ? "Y" : "N");
+	seq_printf(seq, "\tMultiple MAC address registers: %s\n",
+		   (priv->dma_cap.multi_addr) ? "Y" : "N");
+	seq_printf(seq, "\tPCS (TBI/SGMII/RTBI PHY interfatces): %s\n",
+		   (priv->dma_cap.pcs) ? "Y" : "N");
+	seq_printf(seq, "\tSMA (MDIO) Interface: %s\n",
+		   (priv->dma_cap.sma_mdio) ? "Y" : "N");
+	seq_printf(seq, "\tPMT Remote wake up: %s\n",
+		   (priv->dma_cap.pmt_remote_wake_up) ? "Y" : "N");
+	seq_printf(seq, "\tPMT Magic Frame: %s\n",
+		   (priv->dma_cap.pmt_magic_frame) ? "Y" : "N");
+	seq_printf(seq, "\tRMON module: %s\n",
+		   (priv->dma_cap.rmon) ? "Y" : "N");
+	seq_printf(seq, "\tIEEE 1588-2002 Time Stamp: %s\n",
+		   (priv->dma_cap.time_stamp) ? "Y" : "N");
+	seq_printf(seq, "\tIEEE 1588-2008 Advanced Time Stamp:%s\n",
+		   (priv->dma_cap.atime_stamp) ? "Y" : "N");
+	seq_printf(seq, "\t802.3az - Energy-Efficient Ethernet (EEE) %s\n",
+		   (priv->dma_cap.eee) ? "Y" : "N");
+	seq_printf(seq, "\tAV features: %s\n", (priv->dma_cap.av) ? "Y" : "N");
+	seq_printf(seq, "\tChecksum Offload in TX: %s\n",
+		   (priv->dma_cap.tx_coe) ? "Y" : "N");
+	seq_printf(seq, "\tIP Checksum Offload (type1) in RX: %s\n",
+		   (priv->dma_cap.rx_coe_type1) ? "Y" : "N");
+	seq_printf(seq, "\tIP Checksum Offload (type2) in RX: %s\n",
+		   (priv->dma_cap.rx_coe_type2) ? "Y" : "N");
+	seq_printf(seq, "\tRXFIFO > 2048bytes: %s\n",
+		   (priv->dma_cap.rxfifo_over_2048) ? "Y" : "N");
+	seq_printf(seq, "\tNumber of Additional RX channel: %d\n",
+		   priv->dma_cap.number_rx_channel);
+	seq_printf(seq, "\tNumber of Additional TX channel: %d\n",
+		   priv->dma_cap.number_tx_channel);
+	seq_printf(seq, "\tEnhanced descriptors: %s\n",
+		   (priv->dma_cap.enh_desc) ? "Y" : "N");
+
+	return 0;
+}
+
+static int stmmac_sysfs_dma_cap_open(struct inode *inode, struct file *file)
+{
+	return single_open(file, stmmac_sysfs_dma_cap_read, inode->i_private);
+}
+
+static const struct file_operations stmmac_dma_cap_fops = {
+	.owner = THIS_MODULE,
+	.open = stmmac_sysfs_dma_cap_open,
+	.read = seq_read,
+	.llseek = seq_lseek,
+	.release = seq_release,
+};
+
 static int stmmac_init_fs(struct net_device *dev)
 {
 	/* Create debugfs entries */
@@ -1751,6 +1869,19 @@ static int stmmac_init_fs(struct net_device *dev)
 		return -ENOMEM;
 	}
 
+	/* Entry to report the DMA HW features */
+	stmmac_dma_cap = debugfs_create_file("dma_cap", S_IRUGO, stmmac_fs_dir,
+					     dev, &stmmac_dma_cap_fops);
+
+	if (!stmmac_dma_cap || IS_ERR(stmmac_dma_cap)) {
+		pr_info("ERROR creating stmmac MMC debugfs file\n");
+		debugfs_remove(stmmac_rings_status);
+		debugfs_remove(stmmac_mmc);
+		debugfs_remove(stmmac_fs_dir);
+
+		return -ENOMEM;
+	}
+
 	return 0;
 }
 
@@ -1758,6 +1889,7 @@ static void stmmac_exit_fs(void)
 {
 	debugfs_remove(stmmac_rings_status);
 	debugfs_remove(stmmac_mmc);
+	debugfs_remove(stmmac_dma_cap);
 	debugfs_remove(stmmac_fs_dir);
 }
 #endif /* CONFIG_STMMAC_MONITOR */
-- 
1.7.4.4

^ permalink raw reply related

* [PATCH 8/9] stmmac: update the doc with new info about the driver's debug.
From: Giuseppe CAVALLARO @ 2011-08-25  8:00 UTC (permalink / raw)
  To: netdev; +Cc: Giuseppe Cavallaro
In-Reply-To: <1314259229-13767-1-git-send-email-peppe.cavallaro@st.com>

Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
---
 Documentation/networking/stmmac.txt |   39 ++++++++++++++++++++++++++++++++++-
 1 files changed, 38 insertions(+), 1 deletions(-)

diff --git a/Documentation/networking/stmmac.txt b/Documentation/networking/stmmac.txt
index 57a2410..d82e180 100644
--- a/Documentation/networking/stmmac.txt
+++ b/Documentation/networking/stmmac.txt
@@ -235,7 +235,44 @@ reset procedure etc).
  o enh_desc.c: functions for handling enhanced descriptors
  o norm_desc.c: functions for handling normal descriptors
 
-5) TODO:
+5) Debug Information
+
+The driver exports many information i.e. internal statistics,
+debug information, MAC and DMA registers etc.
+
+These can be read in several ways depending on the
+type of the information actually needed.
+
+For example a user can be use the ethtool support
+to get statistics: e.g. using: ethtool -S ethX
+or sees the MAC/DMA registers: e.g. using: ethtool -d ethX
+
+Compiling the Kernel with CONFIG_DEBUG_FS and enabling the
+STMMAC_MONITORING option the driver will export the following
+debugfs entries:
+
+/sys/kernel/debug/stmmaceth/descriptors_status
+  To show the DMA TX/RX descriptor rings
+
+/sys/kernel/debug/stmmaceth/mmc
+  To show the internal Management counters (MMC)
+  if supported in the core.
+
+/sys/kernel/debug/stmmaceth/dma_cap
+  To show the DMA HW features register (if supported)
+
+Developer can also use the "debug" module parameter to get
+further debug information.
+
+In the end, there are other macros (that cannot be enabled
+via menuconfig) to turn-on the RX/TX DMA debugging,
+specific MAC core debug printk etc. Others to enable the
+debug in the TX and RX processes.
+All these are only useful during the developing stage
+and should never enabled inside the code for general usage.
+In fact, these can generate an huge amount of debug messages.
+
+6) TODO:
  o XGMAC is not supported.
  o Review the timer optimisation code to use an embedded device that will be
   available in new chip generations.
-- 
1.7.4.4

^ permalink raw reply related

* [PATCH 9/9] stmmac: update the driver version (Aug_2011)
From: Giuseppe CAVALLARO @ 2011-08-25  8:00 UTC (permalink / raw)
  To: netdev; +Cc: Giuseppe Cavallaro
In-Reply-To: <1314259229-13767-1-git-send-email-peppe.cavallaro@st.com>

Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
---
 drivers/net/stmmac/stmmac.h |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/net/stmmac/stmmac.h b/drivers/net/stmmac/stmmac.h
index 3461676..6bc0d0c 100644
--- a/drivers/net/stmmac/stmmac.h
+++ b/drivers/net/stmmac/stmmac.h
@@ -20,7 +20,7 @@
   Author: Giuseppe Cavallaro <peppe.cavallaro@st.com>
 *******************************************************************************/
 
-#define DRV_MODULE_VERSION	"July_2011"
+#define DRV_MODULE_VERSION	"Aug_2011"
 #include <linux/stmmac.h>
 
 #include "common.h"
-- 
1.7.4.4

^ permalink raw reply related

* Re: [PATCH] stmmac: remove the STBus bridge setting from the GMAC code
From: Giuseppe CAVALLARO @ 2011-08-25  8:01 UTC (permalink / raw)
  To: ML netdev, David S. Miller
In-Reply-To: <1314021507-26955-1-git-send-email-peppe.cavallaro@st.com>

Hello David
you can discard this because I'm resending it in a bundle of patches to
update the whole driver to the Aug_2011 version.

Peppe

On 8/22/2011 3:58 PM, Giuseppe CAVALLARO wrote:
> This patch removes a piece of code (actually commented)
> only useful for some ST platforms in the past.
> 
> This kind of setting now can be done by using the platform
> callbacks provided in linux/stmmac.h (see the stmmac.txt for
> further details).
> 
> Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
> ---
>  drivers/net/stmmac/dwmac1000_core.c |    3 ---
>  1 files changed, 0 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/net/stmmac/dwmac1000_core.c b/drivers/net/stmmac/dwmac1000_core.c
> index 0f63b3c..eea184a 100644
> --- a/drivers/net/stmmac/dwmac1000_core.c
> +++ b/drivers/net/stmmac/dwmac1000_core.c
> @@ -37,9 +37,6 @@ static void dwmac1000_core_init(void __iomem *ioaddr)
>  	value |= GMAC_CORE_INIT;
>  	writel(value, ioaddr + GMAC_CONTROL);
>  
> -	/* STBus Bridge Configuration */
> -	/*writel(0xc5608, ioaddr + 0x00007000);*/
> -
>  	/* Freeze MMC counters */
>  	writel(0x8, ioaddr + GMAC_MMC_CTRL);
>  	/* Mask GMAC interrupts */

^ permalink raw reply

* how to distribute irqs of ixgbevf
From: J.Hwan Kim @ 2011-08-25  8:07 UTC (permalink / raw)
  To: netdev

Hi, everyone

The interrupts of my ixgbevf driver occurs only Core 0
although the user space "irqbalance" serivce is working.

How can I distribute the interrupt of RX in ixgbevf to all cores?

cat /proc/interrupts | grep "isv"
   97:          8          0          0          0          0          
0          0          0   PCI-MSI-edge      isv0-rx-0
   99:          7          0          0          0          0          
0          0          0   PCI-MSI-edge      isv0:lsc
  103:       2059      0          0          0          0          
0          0          0   PCI-MSI-edge      isv2-rx-0
  104:         14        0          0          0          0          
0          0          0   PCI-MSI-edge      isv2-tx-0
  105:          1         0          0          0          0          
0          0          0   PCI-MSI-edge      isv2:mbx

"isv" is netdevice name of my ixgbevf.



Thanks in advance.

Best Regards,

J.Hwan Kim

^ permalink raw reply

* Re: how to distribute irqs of ixgbevf
From: Eric Dumazet @ 2011-08-25  8:21 UTC (permalink / raw)
  To: J.Hwan Kim; +Cc: netdev
In-Reply-To: <4E5602D6.90807@gmail.com>

Le jeudi 25 août 2011 à 17:07 +0900, J.Hwan Kim a écrit :
> Hi, everyone
> 
> The interrupts of my ixgbevf driver occurs only Core 0
> although the user space "irqbalance" serivce is working.
> 
> How can I distribute the interrupt of RX in ixgbevf to all cores?
> 
> cat /proc/interrupts | grep "isv"
>    97:          8          0          0          0          0          
> 0          0          0   PCI-MSI-edge      isv0-rx-0
>    99:          7          0          0          0          0          
> 0          0          0   PCI-MSI-edge      isv0:lsc
>   103:       2059      0          0          0          0          
> 0          0          0   PCI-MSI-edge      isv2-rx-0
>   104:         14        0          0          0          0          
> 0          0          0   PCI-MSI-edge      isv2-tx-0
>   105:          1         0          0          0          0          
> 0          0          0   PCI-MSI-edge      isv2:mbx
> 
> "isv" is netdevice name of my ixgbevf.
> 
> 

Given load is very small, irqbalance chose to send interrupts on a
single cpu.

^ permalink raw reply

* Re: [PATCH] tcp: bound RTO to minimum
From: Eric Dumazet @ 2011-08-25  8:26 UTC (permalink / raw)
  To: Alexander Zimmermann
  Cc: Yuchung Cheng, Hagen Paul Pfeifer, netdev, Hannemann Arnd,
	Lukowski Damian
In-Reply-To: <4033BFEE-C432-4D94-8372-BA166AF2AA26@comsys.rwth-aachen.de>

Le jeudi 25 août 2011 à 09:28 +0200, Alexander Zimmermann a écrit :
> Hi Eric,
> 
> Am 25.08.2011 um 07:28 schrieb Eric Dumazet:

> > Real question is : do we really want to process ~1000 timer interrupts
> > per tcp session, ~2000 skb alloc/free/build/handling, possibly ~1000 ARP
> > requests, only to make tcp revover in ~1sec when connectivity returns
> > back. This just doesnt scale.
> 
> maybe a stupid question, but 1000?. With an minRTO of 200ms and a maximum
> probing time of 120s, we 600 retransmits in a worst-case-senario
> (assumed that we get for every rot retransmission an icmp). No?

Where is asserted the "max probing time of 120s" ? 

It is not the case on my machine :
I have way more retransmits than that, even if spaced by 1600 ms

07:16:13.389331 write(3, "\350F\235JC\357\376\363&\3\374\270R\21L\26\324{\37p\342\244i\304\356\241I:\301\332\222\26"..., 48) = 48
07:16:13.389417 select(7, [3 4], [], NULL, NULL) = 1 (in [3])
07:31:39.901311 read(3, 0xff8c4c90, 8192) = -1 EHOSTUNREACH (No route to host)

Old kernels where performing up to 15 retries, doing exponential backoff.

Now its kind of unlimited, according to experimental results.

^ permalink raw reply

* [PATCH 0/9] skb fragment API: convert non-network drivers
From: Ian Campbell @ 2011-08-25  8:28 UTC (permalink / raw)
  To: netdev-u79uwXL29TY76Z2rM5mHXA
  Cc: linux-atm-general-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	linux-scsi-u79uwXL29TY76Z2rM5mHXA, devel-s9riP+hp16TNLxjTenLetw

The following series converts some non-network drivers to the SKB pages
fragment API introduced in 131ea6675c76. Included are ATM, Infiniband,
and FibreChannel. I also included the broadcom network drivers since I
was touching the related FC driver.

This is part of my series to enable visibility into SKB paged fragment's
lifecycles, [0] contains some more background and rationale but
basically the completed series will allow entities which inject pages
into the networking stack to receive a notification when the stack has
really finished with those pages (i.e. including retransmissions,
clones, pull-ups etc) and not just when the original skb is finished
with, which is beneficial to many subsystems which wish to inject pages
into the network stack without giving up full ownership of those page's
lifecycle. It implements something broadly along the lines of what was
described in [1].

Cheers,
Ian.

[0] http://marc.info/?l=linux-netdev&m=131072801125521&w=2
[1] http://marc.info/?l=linux-netdev&m=130925719513084&w=2

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* [PATCH 2/9] IB: amso1100: convert to SKB paged frag API.
From: Ian Campbell @ 2011-08-25  8:28 UTC (permalink / raw)
  To: netdev-u79uwXL29TY76Z2rM5mHXA
  Cc: Ian Campbell, Tom Tucker, Steve Wise, Roland Dreier, Sean Hefty,
	Hal Rosenstock, linux-rdma-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <1314260881.10283.48.camel-o4Be2W7LfRlXesXXhkcM7miJhflN2719@public.gmane.org>

Signed-off-by: Ian Campbell <ian.campbell-Sxgqhf6Nn4DQT0dZR+AlfA@public.gmane.org>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
Cc: Tom Tucker <tom-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org>
Cc: Steve Wise <swise-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org>
Cc: Roland Dreier <roland-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
Cc: Sean Hefty <sean.hefty-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Cc: Hal Rosenstock <hal.rosenstock-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Cc: netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
---
 drivers/infiniband/hw/amso1100/c2.c |    8 +++-----
 1 files changed, 3 insertions(+), 5 deletions(-)

diff --git a/drivers/infiniband/hw/amso1100/c2.c b/drivers/infiniband/hw/amso1100/c2.c
index 444470a..6a8f36e 100644
--- a/drivers/infiniband/hw/amso1100/c2.c
+++ b/drivers/infiniband/hw/amso1100/c2.c
@@ -802,11 +802,9 @@ static int c2_xmit_frame(struct sk_buff *skb, struct net_device *netdev)
 		for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) {
 			skb_frag_t *frag = &skb_shinfo(skb)->frags[i];
 			maplen = frag->size;
-			mapaddr =
-			    pci_map_page(c2dev->pcidev, frag->page,
-					 frag->page_offset, maplen,
-					 PCI_DMA_TODEVICE);
-
+			mapaddr = skb_frag_dma_map(&c2dev->pcidev->dev, frag,
+						   0, maplen,
+						   PCI_DMA_TODEVICE);
 			elem = elem->next;
 			elem->skb = NULL;
 			elem->mapaddr = mapaddr;
-- 
1.7.2.5

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related

* [PATCH 3/9] IB: nes: convert to SKB paged frag API.
From: Ian Campbell @ 2011-08-25  8:28 UTC (permalink / raw)
  To: netdev-u79uwXL29TY76Z2rM5mHXA
  Cc: Ian Campbell, Faisal Latif, Roland Dreier, Sean Hefty,
	Hal Rosenstock, linux-rdma-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <1314260881.10283.48.camel-o4Be2W7LfRlXesXXhkcM7miJhflN2719@public.gmane.org>

Signed-off-by: Ian Campbell <ian.campbell-Sxgqhf6Nn4DQT0dZR+AlfA@public.gmane.org>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
Cc: Faisal Latif <faisal.latif-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Cc: Roland Dreier <roland-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
Cc: Sean Hefty <sean.hefty-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Cc: Hal Rosenstock <hal.rosenstock-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Cc: netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
---
 drivers/infiniband/hw/nes/nes_nic.c |   21 +++++++++++----------
 1 files changed, 11 insertions(+), 10 deletions(-)

diff --git a/drivers/infiniband/hw/nes/nes_nic.c b/drivers/infiniband/hw/nes/nes_nic.c
index 66e1229..96cb35a 100644
--- a/drivers/infiniband/hw/nes/nes_nic.c
+++ b/drivers/infiniband/hw/nes/nes_nic.c
@@ -441,11 +441,11 @@ static int nes_nic_send(struct sk_buff *skb, struct net_device *netdev)
 		nesnic->tx_skb[nesnic->sq_head] = skb;
 		for (skb_fragment_index = 0; skb_fragment_index < skb_shinfo(skb)->nr_frags;
 				skb_fragment_index++) {
-			bus_address = pci_map_page( nesdev->pcidev,
-					skb_shinfo(skb)->frags[skb_fragment_index].page,
-					skb_shinfo(skb)->frags[skb_fragment_index].page_offset,
-					skb_shinfo(skb)->frags[skb_fragment_index].size,
-					PCI_DMA_TODEVICE);
+			skb_frag_t *frag =
+				&skb_shinfo(skb)->frags[skb_fragment_index];
+			bus_address = skb_frag_dma_map(&nesdev->pcidev->dev,
+						       frag, 0, frag->size,
+						       PCI_DMA_TODEVICE);
 			wqe_fragment_length[wqe_fragment_index] =
 					cpu_to_le16(skb_shinfo(skb)->frags[skb_fragment_index].size);
 			set_wqe_64bit_value(nic_sqe->wqe_words, NES_NIC_SQ_WQE_FRAG0_LOW_IDX+(2*wqe_fragment_index),
@@ -561,11 +561,12 @@ tso_sq_no_longer_full:
 			/* Map all the buffers */
 			for (tso_frag_count=0; tso_frag_count < skb_shinfo(skb)->nr_frags;
 					tso_frag_count++) {
-				tso_bus_address[tso_frag_count] = pci_map_page( nesdev->pcidev,
-						skb_shinfo(skb)->frags[tso_frag_count].page,
-						skb_shinfo(skb)->frags[tso_frag_count].page_offset,
-						skb_shinfo(skb)->frags[tso_frag_count].size,
-						PCI_DMA_TODEVICE);
+				skb_frag_t *frag =
+					&skb_shinfo(skb)->frags[tso_frag_count];
+				tso_bus_address[tso_frag_count] =
+					skb_frag_dma_map(&nesdev->pcidev->dev,
+							 frag, 0, frag->size,
+							 PCI_DMA_TODEVICE);
 			}
 
 			tso_frag_index = 0;
-- 
1.7.2.5

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related

* [PATCH 4/9] IPoIB: convert to SKB paged frag API.
From: Ian Campbell @ 2011-08-25  8:28 UTC (permalink / raw)
  To: netdev-u79uwXL29TY76Z2rM5mHXA
  Cc: Ian Campbell, Roland Dreier, Sean Hefty, Hal Rosenstock,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <1314260881.10283.48.camel-o4Be2W7LfRlXesXXhkcM7miJhflN2719@public.gmane.org>

Signed-off-by: Ian Campbell <ian.campbell-Sxgqhf6Nn4DQT0dZR+AlfA@public.gmane.org>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
Cc: Roland Dreier <roland-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
Cc: Sean Hefty <sean.hefty-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Cc: Hal Rosenstock <hal.rosenstock-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Cc: netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
---
 drivers/infiniband/ulp/ipoib/ipoib_cm.c |    5 +++--
 drivers/infiniband/ulp/ipoib/ipoib_ib.c |    5 +++--
 2 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/drivers/infiniband/ulp/ipoib/ipoib_cm.c b/drivers/infiniband/ulp/ipoib/ipoib_cm.c
index 39913a0..67a477b 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_cm.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_cm.c
@@ -169,7 +169,7 @@ static struct sk_buff *ipoib_cm_alloc_rx_skb(struct net_device *dev,
 			goto partial_error;
 		skb_fill_page_desc(skb, i, page, 0, PAGE_SIZE);
 
-		mapping[i + 1] = ib_dma_map_page(priv->ca, skb_shinfo(skb)->frags[i].page,
+		mapping[i + 1] = ib_dma_map_page(priv->ca, page,
 						 0, PAGE_SIZE, DMA_FROM_DEVICE);
 		if (unlikely(ib_dma_mapping_error(priv->ca, mapping[i + 1])))
 			goto partial_error;
@@ -537,7 +537,8 @@ static void skb_put_frags(struct sk_buff *skb, unsigned int hdr_space,
 
 		if (length == 0) {
 			/* don't need this page */
-			skb_fill_page_desc(toskb, i, frag->page, 0, PAGE_SIZE);
+			skb_fill_page_desc(toskb, i, skb_frag_page(frag),
+					   0, PAGE_SIZE);
 			--skb_shinfo(skb)->nr_frags;
 		} else {
 			size = min(length, (unsigned) PAGE_SIZE);
diff --git a/drivers/infiniband/ulp/ipoib/ipoib_ib.c b/drivers/infiniband/ulp/ipoib/ipoib_ib.c
index 81ae61d..00435be 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_ib.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_ib.c
@@ -182,7 +182,7 @@ static struct sk_buff *ipoib_alloc_rx_skb(struct net_device *dev, int id)
 			goto partial_error;
 		skb_fill_page_desc(skb, 0, page, 0, PAGE_SIZE);
 		mapping[1] =
-			ib_dma_map_page(priv->ca, skb_shinfo(skb)->frags[0].page,
+			ib_dma_map_page(priv->ca, page,
 					0, PAGE_SIZE, DMA_FROM_DEVICE);
 		if (unlikely(ib_dma_mapping_error(priv->ca, mapping[1])))
 			goto partial_error;
@@ -323,7 +323,8 @@ static int ipoib_dma_map_tx(struct ib_device *ca,
 
 	for (i = 0; i < skb_shinfo(skb)->nr_frags; ++i) {
 		skb_frag_t *frag = &skb_shinfo(skb)->frags[i];
-		mapping[i + off] = ib_dma_map_page(ca, frag->page,
+		mapping[i + off] = ib_dma_map_page(ca,
+						 skb_frag_page(frag),
 						 frag->page_offset, frag->size,
 						 DMA_TO_DEVICE);
 		if (unlikely(ib_dma_mapping_error(ca, mapping[i + off])))
-- 
1.7.2.5

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related

* [PATCH 5/9] tg3: convert to SKB paged frag API.
From: Ian Campbell @ 2011-08-25  8:28 UTC (permalink / raw)
  To: netdev-u79uwXL29TY76Z2rM5mHXA
  Cc: devicetree-discuss-uLR06cmDAlY/bJ5BZ2RsiQ, Matt Carlson,
	Ian Campbell, Michael Chan
In-Reply-To: <1314260881.10283.48.camel-o4Be2W7LfRlXesXXhkcM7miJhflN2719@public.gmane.org>

Signed-off-by: Ian Campbell <ian.campbell-Sxgqhf6Nn4DQT0dZR+AlfA@public.gmane.org>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
Cc: Matt Carlson <mcarlson-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
Cc: Michael Chan <mchan-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
Cc: netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Cc: devicetree-discuss-uLR06cmDAlY/bJ5BZ2RsiQ@public.gmane.org
---
 drivers/net/ethernet/broadcom/tg3.c |    6 ++----
 1 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/tg3.c b/drivers/net/ethernet/broadcom/tg3.c
index 0f81111..a7e28a2 100644
--- a/drivers/net/ethernet/broadcom/tg3.c
+++ b/drivers/net/ethernet/broadcom/tg3.c
@@ -6311,10 +6311,8 @@ static netdev_tx_t tg3_start_xmit(struct sk_buff *skb, struct net_device *dev)
 			skb_frag_t *frag = &skb_shinfo(skb)->frags[i];
 
 			len = frag->size;
-			mapping = pci_map_page(tp->pdev,
-					       frag->page,
-					       frag->page_offset,
-					       len, PCI_DMA_TODEVICE);
+			mapping = skb_frag_dma_map(&tp->pdev->dev, frag, 0,
+						   len, PCI_DMA_TODEVICE);
 
 			tnapi->tx_buffers[entry].skb = NULL;
 			dma_unmap_addr_set(&tnapi->tx_buffers[entry], mapping,
-- 
1.7.2.5

^ permalink raw reply related

* [PATCH 1/9] atm: convert to SKB paged frag API.
From: Ian Campbell @ 2011-08-25  8:28 UTC (permalink / raw)
  To: netdev; +Cc: Ian Campbell, Chas Williams, linux-atm-general
In-Reply-To: <1314260881.10283.48.camel@zakaz.uk.xensource.com>

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Chas Williams <chas@cmf.nrl.navy.mil>
Cc: linux-atm-general@lists.sourceforge.net
Cc: netdev@vger.kernel.org

--
The original logic here appears to be bogus (adding page-offset to the struct
page * itself doesn't seem likely to be correct) but I left that unchanged for
this mechanical change.
---
 drivers/atm/eni.c |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/drivers/atm/eni.c b/drivers/atm/eni.c
index 9307141..f7ca4c1 100644
--- a/drivers/atm/eni.c
+++ b/drivers/atm/eni.c
@@ -1134,7 +1134,8 @@ DPRINTK("doing direct send\n"); /* @@@ well, this doesn't work anyway */
 				    skb_headlen(skb));
 			else
 				put_dma(tx->index,eni_dev->dma,&j,(unsigned long)
-				    skb_shinfo(skb)->frags[i].page + skb_shinfo(skb)->frags[i].page_offset,
+				    skb_frag_page(&skb_shinfo(skb)->frags[i]) +
+					skb_shinfo(skb)->frags[i].page_offset,
 				    skb_shinfo(skb)->frags[i].size);
 	}
 	if (skb->len & 3)
-- 
1.7.2.5

^ permalink raw reply related

* [PATCH 6/9] bnx2: convert to SKB paged frag API.
From: Ian Campbell @ 2011-08-25  8:28 UTC (permalink / raw)
  To: netdev; +Cc: Ian Campbell, Michael Chan
In-Reply-To: <1314260881.10283.48.camel@zakaz.uk.xensource.com>

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Michael Chan <mchan@broadcom.com>
Cc: netdev@vger.kernel.org
---
 drivers/net/ethernet/broadcom/bnx2.c |    8 ++++----
 1 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnx2.c b/drivers/net/ethernet/broadcom/bnx2.c
index 4a9a8c81..9afb653 100644
--- a/drivers/net/ethernet/broadcom/bnx2.c
+++ b/drivers/net/ethernet/broadcom/bnx2.c
@@ -2930,8 +2930,8 @@ bnx2_reuse_rx_skb_pages(struct bnx2 *bp, struct bnx2_rx_ring_info *rxr,
 
 		shinfo = skb_shinfo(skb);
 		shinfo->nr_frags--;
-		page = shinfo->frags[shinfo->nr_frags].page;
-		shinfo->frags[shinfo->nr_frags].page = NULL;
+		page = skb_frag_page(&shinfo->frags[shinfo->nr_frags]);
+		__skb_frag_set_page(&shinfo->frags[shinfo->nr_frags], NULL);
 
 		cons_rx_pg->page = page;
 		dev_kfree_skb(skb);
@@ -6511,8 +6511,8 @@ bnx2_start_xmit(struct sk_buff *skb, struct net_device *dev)
 		txbd = &txr->tx_desc_ring[ring_prod];
 
 		len = frag->size;
-		mapping = dma_map_page(&bp->pdev->dev, frag->page, frag->page_offset,
-				       len, PCI_DMA_TODEVICE);
+		mapping = skb_frag_dma_map(&bp->pdev->dev, frag, 0, len,
+					   PCI_DMA_TODEVICE);
 		if (dma_mapping_error(&bp->pdev->dev, mapping))
 			goto dma_error;
 		dma_unmap_addr_set(&txr->tx_buf_ring[ring_prod], mapping,
-- 
1.7.2.5

^ permalink raw reply related

* [PATCH 7/9] bnx2x: convert to SKB paged frag API.
From: Ian Campbell @ 2011-08-25  8:28 UTC (permalink / raw)
  To: netdev; +Cc: Ian Campbell, Eilon Greenstein
In-Reply-To: <1314260881.10283.48.camel@zakaz.uk.xensource.com>

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Eilon Greenstein <eilong@broadcom.com>
Cc: netdev@vger.kernel.org
---
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c |    5 ++---
 1 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
index 93bff08..5c3eb17 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
@@ -2800,9 +2800,8 @@ netdev_tx_t bnx2x_start_xmit(struct sk_buff *skb, struct net_device *dev)
 	for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) {
 		skb_frag_t *frag = &skb_shinfo(skb)->frags[i];
 
-		mapping = dma_map_page(&bp->pdev->dev, frag->page,
-				       frag->page_offset, frag->size,
-				       DMA_TO_DEVICE);
+		mapping = skb_frag_dma_map(&bp->pdev->dev, frag, 0, frag->size,
+					   DMA_TO_DEVICE);
 		if (unlikely(dma_mapping_error(&bp->pdev->dev, mapping))) {
 
 			DP(NETIF_MSG_TX_QUEUED, "Unable to map page - "
-- 
1.7.2.5

^ permalink raw reply related

* [PATCH 8/9] bnx2fc: convert to SKB paged frag API.
From: Ian Campbell @ 2011-08-25  8:28 UTC (permalink / raw)
  To: netdev
  Cc: Ian Campbell, Bhanu Prakash Gollapudi, James E.J. Bottomley,
	linux-scsi
In-Reply-To: <1314260881.10283.48.camel@zakaz.uk.xensource.com>

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Bhanu Prakash Gollapudi <bprakash@broadcom.com>
Cc: "James E.J. Bottomley" <JBottomley@parallels.com>
Cc: linux-scsi@vger.kernel.org
Cc: netdev@vger.kernel.org
---
 drivers/scsi/bnx2fc/bnx2fc_fcoe.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/scsi/bnx2fc/bnx2fc_fcoe.c b/drivers/scsi/bnx2fc/bnx2fc_fcoe.c
index 7cb2cd4..2c780a7 100644
--- a/drivers/scsi/bnx2fc/bnx2fc_fcoe.c
+++ b/drivers/scsi/bnx2fc/bnx2fc_fcoe.c
@@ -302,7 +302,7 @@ static int bnx2fc_xmit(struct fc_lport *lport, struct fc_frame *fp)
 			return -ENOMEM;
 		}
 		frag = &skb_shinfo(skb)->frags[skb_shinfo(skb)->nr_frags - 1];
-		cp = kmap_atomic(frag->page, KM_SKB_DATA_SOFTIRQ)
+		cp = kmap_atomic(skb_frag_page(frag), KM_SKB_DATA_SOFTIRQ)
 				+ frag->page_offset;
 	} else {
 		cp = (struct fcoe_crc_eof *)skb_put(skb, tlen);
-- 
1.7.2.5

^ permalink raw reply related

* [PATCH 9/9] fcoe: convert to SKB paged frag API.
From: Ian Campbell @ 2011-08-25  8:28 UTC (permalink / raw)
  To: netdev; +Cc: Ian Campbell, Robert Love, James E.J. Bottomley, devel,
	linux-scsi
In-Reply-To: <1314260881.10283.48.camel@zakaz.uk.xensource.com>

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Robert Love <robert.w.love@intel.com>
Cc: "James E.J. Bottomley" <JBottomley@parallels.com>
Cc: devel@open-fcoe.org
Cc: linux-scsi@vger.kernel.org
Cc: netdev@vger.kernel.org
---
 drivers/scsi/fcoe/fcoe.c           |    2 +-
 drivers/scsi/fcoe/fcoe_transport.c |    5 +++--
 2 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/scsi/fcoe/fcoe.c b/drivers/scsi/fcoe/fcoe.c
index ba710e3..3416ab6 100644
--- a/drivers/scsi/fcoe/fcoe.c
+++ b/drivers/scsi/fcoe/fcoe.c
@@ -1514,7 +1514,7 @@ int fcoe_xmit(struct fc_lport *lport, struct fc_frame *fp)
 			return -ENOMEM;
 		}
 		frag = &skb_shinfo(skb)->frags[skb_shinfo(skb)->nr_frags - 1];
-		cp = kmap_atomic(frag->page, KM_SKB_DATA_SOFTIRQ)
+		cp = kmap_atomic(skb_frag_page(frag), KM_SKB_DATA_SOFTIRQ)
 			+ frag->page_offset;
 	} else {
 		cp = (struct fcoe_crc_eof *)skb_put(skb, tlen);
diff --git a/drivers/scsi/fcoe/fcoe_transport.c b/drivers/scsi/fcoe/fcoe_transport.c
index 41068e8..f6613f9 100644
--- a/drivers/scsi/fcoe/fcoe_transport.c
+++ b/drivers/scsi/fcoe/fcoe_transport.c
@@ -108,8 +108,9 @@ u32 fcoe_fc_crc(struct fc_frame *fp)
 		len = frag->size;
 		while (len > 0) {
 			clen = min(len, PAGE_SIZE - (off & ~PAGE_MASK));
-			data = kmap_atomic(frag->page + (off >> PAGE_SHIFT),
-					   KM_SKB_DATA_SOFTIRQ);
+			data = kmap_atomic(
+				skb_frag_page(frag) + (off >> PAGE_SHIFT),
+				KM_SKB_DATA_SOFTIRQ);
 			crc = crc32(crc, data + (off & ~PAGE_MASK), clen);
 			kunmap_atomic(data, KM_SKB_DATA_SOFTIRQ);
 			off += clen;
-- 
1.7.2.5

^ permalink raw reply related

* Re: [PATCH] tcp: bound RTO to minimum
From: Alexander Zimmermann @ 2011-08-25  8:44 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Yuchung Cheng, Hagen Paul Pfeifer, netdev, Hannemann Arnd,
	Lukowski Damian
In-Reply-To: <1314260805.2387.11.camel@edumazet-HP-Compaq-6005-Pro-SFF-PC>


Am 25.08.2011 um 10:26 schrieb Eric Dumazet:

> Le jeudi 25 août 2011 à 09:28 +0200, Alexander Zimmermann a écrit :
>> Hi Eric,
>> 
>> Am 25.08.2011 um 07:28 schrieb Eric Dumazet:
> 
>>> Real question is : do we really want to process ~1000 timer interrupts
>>> per tcp session, ~2000 skb alloc/free/build/handling, possibly ~1000 ARP
>>> requests, only to make tcp revover in ~1sec when connectivity returns
>>> back. This just doesnt scale.
>> 
>> maybe a stupid question, but 1000?. With an minRTO of 200ms and a maximum
>> probing time of 120s, we 600 retransmits in a worst-case-senario
>> (assumed that we get for every rot retransmission an icmp). No?
> 
> Where is asserted the "max probing time of 120s" ? 
> 
> It is not the case on my machine :
> I have way more retransmits than that, even if spaced by 1600 ms
> 
> 07:16:13.389331 write(3, "\350F\235JC\357\376\363&\3\374\270R\21L\26\324{\37p\342\244i\304\356\241I:\301\332\222\26"..., 48) = 48
> 07:16:13.389417 select(7, [3 4], [], NULL, NULL) = 1 (in [3])
> 07:31:39.901311 read(3, 0xff8c4c90, 8192) = -1 EHOSTUNREACH (No route to host)
> 
> Old kernels where performing up to 15 retries, doing exponential backoff.

Yes I know. And in combination with RFC6069 we have to convert this
See Section 7.1

and

http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=6fa12c85031485dff38ce550c24f10da23b0adaa

Is the transformation broken? Damian?


> 
> Now its kind of unlimited, according to experimental results.

Ok, unlimited is not what I expect...


> 
> 
> 

//
// Dipl.-Inform. Alexander Zimmermann
// Department of Computer Science, Informatik 4
// RWTH Aachen University
// Ahornstr. 55, 52056 Aachen, Germany
// phone: (49-241) 80-21422, fax: (49-241) 80-22222
// email: zimmermann@cs.rwth-aachen.de
// web: http://www.umic-mesh.net
//

^ permalink raw reply

* Re: [PATCH] tcp: bound RTO to minimum
From: Arnd Hannemann @ 2011-08-25  8:46 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Alexander Zimmermann, Yuchung Cheng, Hagen Paul Pfeifer, netdev,
	Lukowski Damian
In-Reply-To: <1314260805.2387.11.camel@edumazet-HP-Compaq-6005-Pro-SFF-PC>

Hi,

Am 25.08.2011 10:26, schrieb Eric Dumazet:
> Le jeudi 25 août 2011 à 09:28 +0200, Alexander Zimmermann a écrit :
>> Hi Eric,
>>
>> Am 25.08.2011 um 07:28 schrieb Eric Dumazet:
> 
>>> Real question is : do we really want to process ~1000 timer interrupts
>>> per tcp session, ~2000 skb alloc/free/build/handling, possibly ~1000 ARP
>>> requests, only to make tcp revover in ~1sec when connectivity returns
>>> back. This just doesnt scale.
>>
>> maybe a stupid question, but 1000?. With an minRTO of 200ms and a maximum
>> probing time of 120s, we 600 retransmits in a worst-case-senario
>> (assumed that we get for every rot retransmission an icmp). No?
> 
> Where is asserted the "max probing time of 120s" ? 
> 
> It is not the case on my machine :
> I have way more retransmits than that, even if spaced by 1600 ms
> 
> 07:16:13.389331 write(3, "\350F\235JC\357\376\363&\3\374\270R\21L\26\324{\37p\342\244i\304\356\241I:\301\332\222\26"..., 48) = 48
> 07:16:13.389417 select(7, [3 4], [], NULL, NULL) = 1 (in [3])
> 07:31:39.901311 read(3, 0xff8c4c90, 8192) = -1 EHOSTUNREACH (No route to host)
> 
> Old kernels where performing up to 15 retries, doing exponential backoff.
> 
> Now its kind of unlimited, according to experimental results.

That shouldn't be. It should stop after the same time a TCP connection with an
RTO of Minimum RTO which is doing 15 retries (tcp_retries2=15) and doing exponential backoff.
So it should be around 900s*. But it could be that because of the icsk_retransmit wrapover
this doesn't work as expected.

* 200ms + 400ms + 800ms ...

Best regards,
Arnd

^ permalink raw reply

* Re: [BUG] tcp : how many times a frame can possibly be retransmitted ?
From: Ilpo Järvinen @ 2011-08-25  8:56 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: netdev, Jerry Chu, Damian Lukowski
In-Reply-To: <1314226834.6797.5.camel@edumazet-laptop>

[-- Attachment #1: Type: TEXT/PLAIN, Size: 1973 bytes --]

On Thu, 25 Aug 2011, Eric Dumazet wrote:

> Le jeudi 25 août 2011 à 01:44 +0300, Ilpo Järvinen a écrit :
> > On Wed, 24 Aug 2011, Eric Dumazet wrote:
> > 
> > > On one dev machine running net-next, I just found strange tcp sessions
> > > that retransmit a frame forever (The other peer disappeared)
> > > 
> > > # ss -emoi dst 10.2.1.1
> > > State      Recv-Q Send-Q      Local Address:Port          Peer Address:Port   
> > > ESTAB      0      816              10.2.1.2:37930             10.2.1.1:ssh      timer:(on,630ms,246) ino:60786 sk:ffff8801189aa400
> > > 	 mem:(r0,w3776,f320,t0) ts sack ecn cubic wscale:8,6 rto:1680 rtt:16.25/7.5 ato:40 ssthresh:7 send 1.4Mbps rcv_rtt:10 rcv_space:16632
> > > 
> > > 
> > > You can see the retransmit count : 246 
> > > 
> > > What possibly can be going on ?
> > > 
> > > What happened to backoff ?
> > 
> > But RTO (even without any backoffs) should be lower bounded to some not so 
> > zeroish value?
> 
> Apparently not.
> 
> The only thing that protect us from a flood is that ip_error() uses
> inetpeer cache to ratelimit the icmp_send(ICMP_DEST_UNREACH)
> 
> This is why we get retransmit period >= 1 sec
>
> vi +432 net/ipv4/tcp_ipv4.c
> 
>                 icsk->icsk_backoff--;
>                 inet_csk(sk)->icsk_rto = (tp->srtt ? __tcp_set_rto(tp) :
>                         TCP_TIMEOUT_INIT) << icsk->icsk_backoff;
>                 tcp_bound_rto(sk);
> 
> and __tcp_set_rto() uses : return (tp->srtt >> 3) + tp->rttvar;

So you think that this is not true: ?

        /* NOTE: clamping at TCP_RTO_MIN is not required, current algo
         * guarantees that rto is higher.
         */

...it would still be smaller than 1sec though, but certainly not going to 
cause flooding either. Default tcp_rto_min should be 200ms so it's 
5pkts+5ICMP sent, received and processed per second. Which doesn't sound 
that bad CPU load?!?

It is unclear to me how tp->rttvar could become smaller than 
tcp_rto_min().

-- 
 i.

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox