* [net-next.git 6/7] stmmac: add mitigation and sysfs info in the doc
From: Giuseppe CAVALLARO @ 2012-09-03 7:47 UTC (permalink / raw)
To: netdev; +Cc: Giuseppe Cavallaro
In-Reply-To: <1346658422-1925-1-git-send-email-peppe.cavallaro@st.com>
This patch updates the stmmac.txt addinf some information
about the new rx/tx mitigation schema and the sysFs support.
Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
---
Documentation/networking/stmmac.txt | 34 +++++++++++++++++++++-------------
1 files changed, 21 insertions(+), 13 deletions(-)
diff --git a/Documentation/networking/stmmac.txt b/Documentation/networking/stmmac.txt
index ef9ee71..67eaa35 100644
--- a/Documentation/networking/stmmac.txt
+++ b/Documentation/networking/stmmac.txt
@@ -29,11 +29,9 @@ The kernel configuration option is STMMAC_ETH:
dma_txsize: DMA tx ring size;
buf_sz: DMA buffer size;
tc: control the HW FIFO threshold;
- tx_coe: Enable/Disable Tx Checksum Offload engine;
watchdog: transmit timeout (in milliseconds);
flow_ctrl: Flow control ability [on/off];
pause: Flow Control Pause Time;
- tmrate: timer period (only if timer optimisation is configured).
3) Command line options
Driver parameters can be also passed in command line by using:
@@ -60,17 +58,21 @@ Then the poll method will be scheduled at some future point.
The incoming packets are stored, by the DMA, in a list of pre-allocated socket
buffers in order to avoid the memcpy (Zero-copy).
-4.3) Timer-Driver Interrupt
-Instead of having the device that asynchronously notifies the frame receptions,
-the driver configures a timer to generate an interrupt at regular intervals.
-Based on the granularity of the timer, the frames that are received by the
-device will experience different levels of latency. Some NICs have dedicated
-timer device to perform this task. STMMAC can use either the RTC device or the
-TMU channel 2 on STLinux platforms.
-The timers frequency can be passed to the driver as parameter; when change it,
-take care of both hardware capability and network stability/performance impact.
-Several performance tests on STM platforms showed this optimisation allows to
-spare the CPU while having the maximum throughput.
+4.3) Interrupt Mitigation
+The driver is able to mitigate the number of its DMA interrupts
+using NAPI for the reception on chips older than the 3.50.
+New chips have an HW RX-Watchdog used for this mitigation.
+
+User can tune (also via sysfs) a parameter that is the RI Watchdog
+Timer count. It indicates the number of system clock cycles.
+
+On Tx-side, the mitigation schema is based on a SW timer that calls the
+tx function (stmmac_tx) to reclaim the resource after transmitting the
+frames.
+Also there is another parameter (like a threshold) used to program
+the descriptors avoiding to set the interrupt on completion bit in
+when the frame is sent (xmit).
+These parameters can be tuned by sysfs entries.
4.4) WOL
Wake up on Lan feature through Magic and Unicast frames are supported for the
@@ -324,6 +326,12 @@ To enter in Tx LPI mode the driver needs to have a software timer
that enable and disable the LPI mode when there is nothing to be
transmitted.
+7) sys FS interface
+Some internal driver parameters can be tuned by using some
+entries exposed via sysFS. There parameter currently are,
+for example, for internal timers used to mitigate the rx/tx
+interrupts or for EEE.
+
7) TODO:
o XGMAC is not supported.
o Add the PTP - precision time protocol
--
1.7.4.4
^ permalink raw reply related
* [net-next.git 5/7] stmmac: add sysFs support
From: Giuseppe CAVALLARO @ 2012-09-03 7:47 UTC (permalink / raw)
To: netdev; +Cc: Giuseppe Cavallaro
In-Reply-To: <1346658422-1925-1-git-send-email-peppe.cavallaro@st.com>
This patch adds the sysFs support.
Some internal driver parameters can be tuned by using some
entries exposed via sysFS. There parameter currently are,
for example, for internal timers used to mitigate the rx/tx
interrupts or for EEE.
Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
---
drivers/net/ethernet/stmicro/stmmac/Makefile | 2 +-
drivers/net/ethernet/stmicro/stmmac/common.h | 8 +-
drivers/net/ethernet/stmicro/stmmac/stmmac-sysfs.c | 157 ++++++++++++++++++++
drivers/net/ethernet/stmicro/stmmac/stmmac.h | 3 +
drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 36 ++---
5 files changed, 185 insertions(+), 21 deletions(-)
create mode 100644 drivers/net/ethernet/stmicro/stmmac/stmmac-sysfs.c
diff --git a/drivers/net/ethernet/stmicro/stmmac/Makefile b/drivers/net/ethernet/stmicro/stmmac/Makefile
index c8e8ea6..4450fc6 100644
--- a/drivers/net/ethernet/stmicro/stmmac/Makefile
+++ b/drivers/net/ethernet/stmicro/stmmac/Makefile
@@ -6,4 +6,4 @@ stmmac-$(CONFIG_STMMAC_PCI) += stmmac_pci.o
stmmac-objs:= stmmac_main.o stmmac_ethtool.o stmmac_mdio.o \
dwmac_lib.o dwmac1000_core.o dwmac1000_dma.o \
dwmac100_core.o dwmac100_dma.o enh_desc.o norm_desc.o \
- mmc_core.o $(stmmac-y)
+ mmc_core.o stmmac-sysfs.o $(stmmac-y)
diff --git a/drivers/net/ethernet/stmicro/stmmac/common.h b/drivers/net/ethernet/stmicro/stmmac/common.h
index 63d4bad..b0b08bc 100644
--- a/drivers/net/ethernet/stmicro/stmmac/common.h
+++ b/drivers/net/ethernet/stmicro/stmmac/common.h
@@ -169,7 +169,13 @@ struct stmmac_extra_stats {
#define DMA_HW_FEAT_SAVLANINS 0x08000000 /* Source Addr or VLAN Insertion */
#define DMA_HW_FEAT_ACTPHYIF 0x70000000 /* Active/selected PHY interface */
#define DEFAULT_DMA_PBL 8
-#define DEFAULT_DMA_RIWT 0xff /* Max RI Watchdog Timer count */
+#define MAX_DMA_RIWT 0xff /* Max RI Watchdog Timer count */
+
+#define STMMAC_COAL_TX_TIMER 40000
+#define STMMAC_MAX_COAL_TX_TIMER 100000
+#define STMMAC_TX_MAX_FRAMES 64
+#define STMMAC_DEFAULT_LPI_TIMER 1000
+#define STMMAC_MAX_LPI_TIMER 5000
enum rx_frame_status { /* IPC status */
good_frame = 0,
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac-sysfs.c b/drivers/net/ethernet/stmicro/stmmac/stmmac-sysfs.c
new file mode 100644
index 0000000..92537a0
--- /dev/null
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac-sysfs.c
@@ -0,0 +1,157 @@
+/*******************************************************************************
+ STMMAC sysfs module
+
+ Copyright(C) 2012 STMicroelectronics Ltd
+
+ This program is free software; you can redistribute it and/or modify it
+ under the terms and conditions of the GNU General Public License,
+ version 2, as published by the Free Software Foundation.
+
+ This program is distributed in the hope it will be useful, but WITHOUT
+ ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
+ more details.
+
+ You should have received a copy of the GNU General Public License along with
+ this program; if not, write to the Free Software Foundation, Inc.,
+ 51 Franklin St - Fifth Floor, Boston, MA 02110-1301 USA.
+
+ The full GNU General Public License is included in this distribution in
+ the file called "COPYING".
+
+ Author: Giuseppe Cavallaro <peppe.cavallaro@st.com>
+*******************************************************************************/
+
+#include <linux/sysfs.h>
+
+#include "stmmac.h"
+
+/* EEE Timer attribute */
+static ssize_t eee_timer_show(struct device *dev, struct device_attribute *attr,
+ char *buf)
+{
+ struct stmmac_priv *priv = netdev_priv(to_net_dev(dev));
+
+ return snprintf(buf, PAGE_SIZE, "%u", (u32) priv->eee_timer);
+}
+
+static ssize_t eee_timer_store(struct device *dev,
+ struct device_attribute *attr, const char *buf,
+ size_t count)
+{
+ struct stmmac_priv *priv = netdev_priv(to_net_dev(dev));
+
+ int eee_timer;
+
+ sscanf(buf, "%u", &eee_timer);
+
+ if ((eee_timer <= 0) || (eee_timer > STMMAC_MAX_LPI_TIMER))
+ pr_err("stmmac: invalid EEE timer value\n");
+ else
+ priv->eee_timer = eee_timer;
+
+ return count;
+}
+
+/* TX coalesce parameters */
+static ssize_t tx_coal_timer_show(struct device *dev,
+ struct device_attribute *attr, char *buf)
+{
+ struct stmmac_priv *priv = netdev_priv(to_net_dev(dev));
+
+ return snprintf(buf, PAGE_SIZE, "%u", (u32) priv->tx_coal_timer);
+}
+
+static ssize_t tx_coal_timer_store(struct device *dev,
+ struct device_attribute *attr,
+ const char *buf, size_t count)
+{
+ struct stmmac_priv *priv = netdev_priv(to_net_dev(dev));
+
+ int tx_coal_timer;
+
+ sscanf(buf, "%u", &tx_coal_timer);
+
+ if ((tx_coal_timer <= 0) || (tx_coal_timer > STMMAC_MAX_COAL_TX_TIMER))
+ pr_err("stmmac: Tx coalesce timer value\n");
+ else
+ priv->tx_coal_timer = tx_coal_timer;
+
+ return count;
+}
+
+static ssize_t tx_coal_frames_show(struct device *dev,
+ struct device_attribute *attr, char *buf)
+{
+ struct stmmac_priv *priv = netdev_priv(to_net_dev(dev));
+
+ return snprintf(buf, PAGE_SIZE, "%u", (u32) priv->tx_coal_frames);
+}
+
+static ssize_t tx_coal_frames_store(struct device *dev,
+ struct device_attribute *attr,
+ const char *buf, size_t count)
+{
+ struct stmmac_priv *priv = netdev_priv(to_net_dev(dev));
+
+ int tx_coal_frames;
+
+ sscanf(buf, "%u", &tx_coal_frames);
+
+ if ((tx_coal_frames <= 0) || (tx_coal_frames >= STMMAC_TX_MAX_FRAMES))
+ pr_err("stmmac: invalid Tx coalesce value\n");
+ else
+ priv->tx_coal_frames = tx_coal_frames;
+
+ return count;
+}
+
+/* RX coalesce parameters */
+static ssize_t rx_riwt_show(struct device *dev, struct device_attribute *attr,
+ char *buf)
+{
+ struct stmmac_priv *priv = netdev_priv(to_net_dev(dev));
+
+ return snprintf(buf, PAGE_SIZE, "%u", (u16) priv->rx_riwt);
+}
+
+static ssize_t rx_riwt_store(struct device *dev, struct device_attribute *attr,
+ const char *buf, size_t count)
+{
+ struct stmmac_priv *priv = netdev_priv(to_net_dev(dev));
+
+ int rx_riwt;
+
+ sscanf(buf, "%u", &rx_riwt);
+
+ if ((rx_riwt <= 0) || (rx_riwt >= MAX_DMA_RIWT)) {
+ pr_err("stmmac: invalid RX WDT timer value\n");
+ } else {
+ priv->rx_riwt = rx_riwt;
+ priv->hw->dma->rx_watchdog(priv->ioaddr, priv->rx_riwt);
+ }
+
+ return count;
+}
+
+DEVICE_ATTR(eee_timer, 0644, eee_timer_show, eee_timer_store);
+DEVICE_ATTR(tx_coal_timer, 0644, tx_coal_timer_show,
+ tx_coal_timer_store);
+DEVICE_ATTR(tx_coal_frames, 0644, tx_coal_frames_show,
+ tx_coal_frames_store);
+DEVICE_ATTR(rx_riwt, 0644, rx_riwt_show, rx_riwt_store);
+
+void stmmac_create_sysfs(struct net_device *dev)
+{
+ int rc;
+ struct stmmac_priv *priv = netdev_priv(dev);
+
+ rc = device_create_file(&dev->dev, &dev_attr_tx_coal_timer);
+ rc |= device_create_file(&dev->dev, &dev_attr_tx_coal_frames);
+ if (priv->synopsys_id >= DWMAC_CORE_3_50)
+ rc |= device_create_file(&dev->dev, &dev_attr_rx_riwt);
+ if (priv->eee_enabled)
+ rc |= device_create_file(&dev->dev, &dev_attr_eee_timer);
+ if (rc)
+ pr_err("%s: failed to create the sysfs entries\n", __func__);
+}
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac.h b/drivers/net/ethernet/stmicro/stmmac/stmmac.h
index 0f5ab28..05f17184 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac.h
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac.h
@@ -87,11 +87,13 @@ struct stmmac_priv {
int lpi_irq;
int eee_enabled;
int eee_active;
+ int eee_timer;
int tx_lpi_timer;
struct timer_list txtimer;
u32 tx_count_frames;
u32 tx_coal_frames;
u32 tx_coal_timer;
+ u16 rx_riwt;
};
extern int phyaddr;
@@ -111,6 +113,7 @@ struct stmmac_priv *stmmac_dvr_probe(struct device *device,
void __iomem *addr);
void stmmac_disable_eee_mode(struct stmmac_priv *priv);
bool stmmac_eee_init(struct stmmac_priv *priv);
+void stmmac_create_sysfs(struct net_device *dev);
#ifdef CONFIG_STMMAC_PLATFORM
extern struct platform_driver stmmac_pltfr_driver;
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index bafe694..1895130 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -77,8 +77,6 @@
#define STMMAC_ALIGN(x) L1_CACHE_ALIGN(x)
#define JUMBO_LEN 9000
-#define STMMAC_TX_TM 40000
-#define STMMAC_TX_MAX_FRAMES 64 /* Max coalesced frame */
/* Module parameters */
#define TX_TIMEO 5000 /* default 5 seconds */
@@ -126,11 +124,8 @@ static const u32 default_msg_level = (NETIF_MSG_DRV | NETIF_MSG_PROBE |
NETIF_MSG_LINK | NETIF_MSG_IFUP |
NETIF_MSG_IFDOWN | NETIF_MSG_TIMER);
-#define STMMAC_DEFAULT_LPI_TIMER 1000
-static int eee_timer = STMMAC_DEFAULT_LPI_TIMER;
-module_param(eee_timer, int, S_IRUGO | S_IWUSR);
-MODULE_PARM_DESC(eee_timer, "LPI tx expiration time in msec");
#define STMMAC_LPI_TIMER(x) (jiffies + msecs_to_jiffies(x))
+#define STMMAC_COAL_TIMER(x) (jiffies + usecs_to_jiffies(x))
static irqreturn_t stmmac_interrupt(int irq, void *dev_id);
static int stmmac_rx(struct stmmac_priv *priv, int limit);
@@ -161,8 +156,6 @@ static void stmmac_verify_args(void)
flow_ctrl = FLOW_OFF;
if (unlikely((pause < 0) || (pause > 0xffff)))
pause = PAUSE_TIME;
- if (eee_timer < 0)
- eee_timer = STMMAC_DEFAULT_LPI_TIMER;
}
static void stmmac_clk_csr_set(struct stmmac_priv *priv)
@@ -254,7 +247,7 @@ static void stmmac_eee_ctrl_timer(unsigned long arg)
struct stmmac_priv *priv = (struct stmmac_priv *)arg;
stmmac_enable_eee_mode(priv);
- mod_timer(&priv->eee_ctrl_timer, STMMAC_LPI_TIMER(eee_timer));
+ mod_timer(&priv->eee_ctrl_timer, STMMAC_LPI_TIMER(priv->eee_timer));
}
/**
@@ -280,7 +273,8 @@ bool stmmac_eee_init(struct stmmac_priv *priv)
init_timer(&priv->eee_ctrl_timer);
priv->eee_ctrl_timer.function = stmmac_eee_ctrl_timer;
priv->eee_ctrl_timer.data = (unsigned long)priv;
- priv->eee_ctrl_timer.expires = STMMAC_LPI_TIMER(eee_timer);
+ priv->eee_ctrl_timer.expires =
+ STMMAC_LPI_TIMER(priv->eee_timer);
add_timer(&priv->eee_ctrl_timer);
priv->hw->mac->set_eee_timer(priv->ioaddr,
@@ -770,7 +764,8 @@ static void stmmac_tx(struct stmmac_priv *priv)
if ((priv->eee_enabled) && (!priv->tx_path_in_lpi_mode)) {
stmmac_enable_eee_mode(priv);
- mod_timer(&priv->eee_ctrl_timer, STMMAC_LPI_TIMER(eee_timer));
+ mod_timer(&priv->eee_ctrl_timer,
+ STMMAC_LPI_TIMER(priv->eee_timer));
}
spin_unlock_irqrestore(&priv->tx_lock, flags);
}
@@ -1016,9 +1011,9 @@ static int stmmac_init_tx_coalesce(struct stmmac_priv *priv)
STMMAC_TX_MAX_FRAMES);
if (priv->tx_coal_frames) {
/* Set Tx coalesce parameters and timers */
- priv->tx_coal_timer = jiffies + usecs_to_jiffies(STMMAC_TX_TM);
+ priv->tx_coal_timer = STMMAC_COAL_TX_TIMER;
init_timer(&priv->txtimer);
- priv->txtimer.expires = priv->tx_coal_timer;
+ priv->txtimer.expires = STMMAC_COAL_TIMER(priv->tx_coal_timer);
priv->txtimer.data = (unsigned long)priv;
priv->txtimer.function = stmmac_txtimer;
@@ -1138,6 +1133,7 @@ static int stmmac_open(struct net_device *dev)
phy_start(priv->phydev);
priv->tx_lpi_timer = STMMAC_DEFAULT_TWT_LS_TIMER;
+ priv->eee_timer = STMMAC_DEFAULT_LPI_TIMER;
priv->eee_enabled = stmmac_eee_init(priv);
ret = stmmac_init_tx_coalesce(priv);
@@ -1149,11 +1145,13 @@ static int stmmac_open(struct net_device *dev)
*/
if (priv->synopsys_id < DWMAC_CORE_3_50)
napi_enable(&priv->napi);
- else if (priv->hw->dma->rx_watchdog)
+ else if (priv->hw->dma->rx_watchdog) {
+ priv->rx_riwt = MAX_DMA_RIWT;
/* Program RX Watchdog register to the default values
* FIXME: provide user value for RIWT
*/
- priv->hw->dma->rx_watchdog(priv->ioaddr, DEFAULT_DMA_RIWT);
+ priv->hw->dma->rx_watchdog(priv->ioaddr, priv->rx_riwt);
+ }
skb_queue_head_init(&priv->rx_recycle);
netif_start_queue(dev);
@@ -1317,7 +1315,8 @@ static netdev_tx_t stmmac_xmit(struct sk_buff *skb, struct net_device *dev)
if (priv->tx_coal_frames > priv->tx_count_frames) {
priv->hw->desc->clear_tx_ic(desc);
priv->xstats.tx_reset_ic_bit++;
- mod_timer(&priv->txtimer, priv->tx_coal_timer);
+ mod_timer(&priv->txtimer,
+ STMMAC_COAL_TIMER(priv->tx_coal_timer));
} else
priv->tx_count_frames = 0;
@@ -2081,6 +2080,8 @@ struct stmmac_priv *stmmac_dvr_probe(struct device *device,
goto error_mdio_register;
}
+ stmmac_create_sysfs(ndev);
+
return priv;
error_mdio_register:
@@ -2283,9 +2284,6 @@ static int __init stmmac_cmdline_opt(char *str)
} else if (!strncmp(opt, "pause:", 6)) {
if (kstrtoint(opt + 6, 0, &pause))
goto err;
- } else if (!strncmp(opt, "eee_timer:", 6)) {
- if (kstrtoint(opt + 10, 0, &eee_timer))
- goto err;
}
}
return 0;
--
1.7.4.4
^ permalink raw reply related
* [net-next.git 4/7] stmmac: add Rx watchdog optimization to mitigate the DMA irqs
From: Giuseppe CAVALLARO @ 2012-09-03 7:46 UTC (permalink / raw)
To: netdev; +Cc: Giuseppe Cavallaro
In-Reply-To: <1346658422-1925-1-git-send-email-peppe.cavallaro@st.com>
New GMAC devices (3.50 and newer) have an embedded timer
that can be used for mitigating the number of interrupts.
So this patch adds this optimizations.
Old MAC will continue to use NAPI.
In this implementation the rx timer stored in the Reg9 is fixed
to the max value.
Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
---
drivers/net/ethernet/stmicro/stmmac/common.h | 7 ++
drivers/net/ethernet/stmicro/stmmac/dwmac1000.h | 3 -
.../net/ethernet/stmicro/stmmac/dwmac1000_dma.c | 6 ++
drivers/net/ethernet/stmicro/stmmac/dwmac_dma.h | 3 +-
drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 67 +++++++++++++------
5 files changed, 61 insertions(+), 25 deletions(-)
diff --git a/drivers/net/ethernet/stmicro/stmmac/common.h b/drivers/net/ethernet/stmicro/stmmac/common.h
index 1d6bd3e..63d4bad 100644
--- a/drivers/net/ethernet/stmicro/stmmac/common.h
+++ b/drivers/net/ethernet/stmicro/stmmac/common.h
@@ -48,6 +48,10 @@
#define CHIP_DBG(fmt, args...) do { } while (0)
#endif
+/* Synopsys Core versions */
+#define DWMAC_CORE_3_40 0x34
+#define DWMAC_CORE_3_50 0x35
+
#undef FRAME_FILTER_DEBUG
/* #define FRAME_FILTER_DEBUG */
@@ -165,6 +169,7 @@ struct stmmac_extra_stats {
#define DMA_HW_FEAT_SAVLANINS 0x08000000 /* Source Addr or VLAN Insertion */
#define DMA_HW_FEAT_ACTPHYIF 0x70000000 /* Active/selected PHY interface */
#define DEFAULT_DMA_PBL 8
+#define DEFAULT_DMA_RIWT 0xff /* Max RI Watchdog Timer count */
enum rx_frame_status { /* IPC status */
good_frame = 0,
@@ -301,6 +306,8 @@ struct stmmac_dma_ops {
struct stmmac_extra_stats *x);
/* If supported then get the optional core features */
unsigned int (*get_hw_feature) (void __iomem *ioaddr);
+ /* Manage HW RX Watchdog*/
+ void (*rx_watchdog) (void __iomem *ioaddr, u8 timer);
};
struct stmmac_ops {
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac1000.h b/drivers/net/ethernet/stmicro/stmmac/dwmac1000.h
index 0e4cace..7ad56af 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac1000.h
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac1000.h
@@ -230,8 +230,5 @@ enum rtc_control {
#define GMAC_MMC_TX_INTR 0x108
#define GMAC_MMC_RX_CSUM_OFFLOAD 0x208
-/* Synopsys Core versions */
-#define DWMAC_CORE_3_40 0x34
-
extern const struct stmmac_dma_ops dwmac1000_dma_ops;
#endif /* __DWMAC1000_H__ */
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac1000_dma.c b/drivers/net/ethernet/stmicro/stmmac/dwmac1000_dma.c
index 0335000..e2c9431 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac1000_dma.c
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac1000_dma.c
@@ -174,6 +174,11 @@ static unsigned int dwmac1000_get_hw_feature(void __iomem *ioaddr)
return readl(ioaddr + DMA_HW_FEATURE);
}
+static void dwmac1000_rx_watchdog(void __iomem *ioaddr, u8 timer)
+{
+ writel(timer, ioaddr + DMA_RX_WATCHDOG);
+}
+
const struct stmmac_dma_ops dwmac1000_dma_ops = {
.init = dwmac1000_dma_init,
.dump_regs = dwmac1000_dump_dma_regs,
@@ -187,4 +192,5 @@ const struct stmmac_dma_ops dwmac1000_dma_ops = {
.stop_rx = dwmac_dma_stop_rx,
.dma_interrupt = dwmac_dma_interrupt,
.get_hw_feature = dwmac1000_get_hw_feature,
+ .rx_watchdog = dwmac1000_rx_watchdog,
};
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac_dma.h b/drivers/net/ethernet/stmicro/stmmac/dwmac_dma.h
index e49c9a0..4eeff5d 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac_dma.h
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac_dma.h
@@ -35,7 +35,8 @@
#define DMA_CONTROL 0x00001018 /* Ctrl (Operational Mode) */
#define DMA_INTR_ENA 0x0000101c /* Interrupt Enable */
#define DMA_MISSED_FRAME_CTR 0x00001020 /* Missed Frame Counter */
-#define DMA_AXI_BUS_MODE 0x00001028 /* AXI Bus Mode */
+#define DMA_RX_WATCHDOG 0x00001024 /* Receive Int Watchdog Timer */
+#define DMA_AXI_BUS_MODE 0x00001028 /* AXI Bus Mode */
#define DMA_CUR_TX_BUF_ADDR 0x00001050 /* Current Host Tx Buffer */
#define DMA_CUR_RX_BUF_ADDR 0x00001054 /* Current Host Rx Buffer */
#define DMA_HW_FEATURE 0x00001058 /* HW Feature Register */
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index d7f5482..bafe694 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -133,6 +133,7 @@ MODULE_PARM_DESC(eee_timer, "LPI tx expiration time in msec");
#define STMMAC_LPI_TIMER(x) (jiffies + msecs_to_jiffies(x))
static irqreturn_t stmmac_interrupt(int irq, void *dev_id);
+static int stmmac_rx(struct stmmac_priv *priv, int limit);
#ifdef CONFIG_STMMAC_DEBUG_FS
static int stmmac_init_fs(struct net_device *dev);
@@ -481,7 +482,6 @@ static void display_ring(struct dma_desc *p, int size)
i, (unsigned int)virt_to_phys(&p[i]),
(unsigned int)(x->a), (unsigned int)((x->a) >> 32),
x->b, x->c);
- pr_info("\n");
}
}
@@ -516,7 +516,7 @@ static void init_dma_desc_rings(struct net_device *dev)
unsigned int txsize = priv->dma_tx_size;
unsigned int rxsize = priv->dma_rx_size;
unsigned int bfsize;
- int dis_ic = 0;
+ int dis_ic = 1;
int des3_as_data_buf = 0;
/* Set the max buffer size according to the DESC mode
@@ -603,6 +603,8 @@ static void init_dma_desc_rings(struct net_device *dev)
priv->dirty_tx = 0;
priv->cur_tx = 0;
+ if (priv->synopsys_id < DWMAC_CORE_3_50)
+ dis_ic = 0;
/* Clear the Rx/Tx descriptors */
priv->hw->desc->init_rx_desc(priv->dma_rx, rxsize, dis_ic);
priv->hw->desc->init_tx_desc(priv->dma_tx, txsize);
@@ -746,7 +748,7 @@ static void stmmac_tx(struct stmmac_priv *priv)
skb_recycle_check(skb, priv->dma_buf_sz))
__skb_queue_head(&priv->rx_recycle, skb);
else
- dev_kfree_skb(skb);
+ dev_kfree_skb_any(skb);
priv->tx_skbuff[entry] = NULL;
}
@@ -816,12 +818,15 @@ static void stmmac_tx_err(struct stmmac_priv *priv)
netif_wake_queue(priv->dev);
}
-static void stmmac_rx_schedule(struct stmmac_priv *priv)
+static void stmmac_rx_work(struct stmmac_priv *priv)
{
- if (likely(napi_schedule_prep(&priv->napi))) {
- stmmac_disable_irq(priv);
- __napi_schedule(&priv->napi);
- }
+ if (priv->synopsys_id < DWMAC_CORE_3_50) {
+ if (likely(napi_schedule_prep(&priv->napi))) {
+ stmmac_disable_irq(priv);
+ __napi_schedule(&priv->napi);
+ }
+ } else
+ stmmac_rx(priv, priv->dma_rx_size);
}
static void stmmac_dma_interrupt(struct stmmac_priv *priv)
@@ -831,7 +836,7 @@ static void stmmac_dma_interrupt(struct stmmac_priv *priv)
status = priv->hw->dma->dma_interrupt(priv->ioaddr, &priv->xstats);
if (likely(status == handle_rx)) {
priv->xstats.rx_normal_irq_n++;
- stmmac_rx_schedule(priv);
+ stmmac_rx_work(priv);
}
if (likely(status == handle_tx)) {
priv->xstats.tx_normal_irq_n++;
@@ -1139,7 +1144,17 @@ static int stmmac_open(struct net_device *dev)
if (!ret)
add_timer(&priv->txtimer);
- napi_enable(&priv->napi);
+ /* Enable NAPI on chip older than the 3.50 where the Rx watchdog
+ * is not supported.
+ */
+ if (priv->synopsys_id < DWMAC_CORE_3_50)
+ napi_enable(&priv->napi);
+ else if (priv->hw->dma->rx_watchdog)
+ /* Program RX Watchdog register to the default values
+ * FIXME: provide user value for RIWT
+ */
+ priv->hw->dma->rx_watchdog(priv->ioaddr, DEFAULT_DMA_RIWT);
+
skb_queue_head_init(&priv->rx_recycle);
netif_start_queue(dev);
@@ -1183,7 +1198,8 @@ static int stmmac_release(struct net_device *dev)
netif_stop_queue(dev);
- napi_disable(&priv->napi);
+ if (priv->synopsys_id < DWMAC_CORE_3_50)
+ napi_disable(&priv->napi);
skb_queue_purge(&priv->rx_recycle);
/* Free the IRQ lines */
@@ -1448,14 +1464,15 @@ static int stmmac_rx(struct stmmac_priv *priv, int limit)
#endif
skb->protocol = eth_type_trans(skb, priv->dev);
- if (unlikely(!priv->plat->rx_coe)) {
- /* No RX COE for old mac10/100 devices */
+ if (unlikely(!priv->plat->rx_coe))
skb_checksum_none_assert(skb);
- netif_receive_skb(skb);
- } else {
+ else
skb->ip_summed = CHECKSUM_UNNECESSARY;
+
+ if (priv->synopsys_id < DWMAC_CORE_3_50)
napi_gro_receive(&priv->napi, skb);
- }
+ else
+ netif_rx(skb);
priv->dev->stats.rx_packets++;
priv->dev->stats.rx_bytes += frame_len;
@@ -2025,7 +2042,10 @@ struct stmmac_priv *stmmac_dvr_probe(struct device *device,
if (flow_ctrl)
priv->flow_ctrl = FLOW_AUTO; /* RX/TX pause on */
- netif_napi_add(ndev, &priv->napi, stmmac_poll, 64);
+ if (priv->synopsys_id < DWMAC_CORE_3_50)
+ netif_napi_add(ndev, &priv->napi, stmmac_poll, 64);
+ else
+ pr_info(" Enable Mitigation via HW RX_Watchdog Timer\n");
spin_lock_init(&priv->lock);
spin_lock_init(&priv->tx_lock);
@@ -2068,7 +2088,8 @@ error_mdio_register:
error_clk_get:
unregister_netdev(ndev);
error_netdev_register:
- netif_napi_del(&priv->napi);
+ if (priv->synopsys_id < DWMAC_CORE_3_50)
+ netif_napi_del(&priv->napi);
free_netdev(ndev);
return NULL;
@@ -2102,7 +2123,7 @@ int stmmac_dvr_remove(struct net_device *ndev)
int stmmac_suspend(struct net_device *ndev)
{
struct stmmac_priv *priv = netdev_priv(ndev);
- int dis_ic = 0;
+ int dis_ic = 1;
unsigned long flags;
if (!ndev || !netif_running(ndev))
@@ -2116,11 +2137,14 @@ int stmmac_suspend(struct net_device *ndev)
netif_device_detach(ndev);
netif_stop_queue(ndev);
- napi_disable(&priv->napi);
+ if (priv->synopsys_id < DWMAC_CORE_3_50)
+ napi_disable(&priv->napi);
/* Stop TX/RX DMA */
priv->hw->dma->stop_tx(priv->ioaddr);
priv->hw->dma->stop_rx(priv->ioaddr);
+ if (priv->synopsys_id < DWMAC_CORE_3_50)
+ dis_ic = 0;
/* Clear the Rx/Tx descriptors */
priv->hw->desc->init_rx_desc(priv->dma_rx, priv->dma_rx_size,
dis_ic);
@@ -2166,7 +2190,8 @@ int stmmac_resume(struct net_device *ndev)
priv->hw->dma->start_tx(priv->ioaddr);
priv->hw->dma->start_rx(priv->ioaddr);
- napi_enable(&priv->napi);
+ if (priv->synopsys_id < DWMAC_CORE_3_50)
+ napi_enable(&priv->napi);
netif_start_queue(ndev);
--
1.7.4.4
^ permalink raw reply related
* Re: [PATCH v2] netfilter: take care of timewait sockets
From: Florian Westphal @ 2012-09-03 7:47 UTC (permalink / raw)
To: Eric Dumazet; +Cc: Florian Westphal, Sami Farin, netdev, e1000-devel
In-Reply-To: <1346629684.2563.78.camel@edumazet-glaptop>
Eric Dumazet <eric.dumazet@gmail.com> wrote:
> Sami Farin reported crashes in xt_LOG because it assumes skb->sk is a
> full blown socket.
>
> But with TCP early demux, we can have skb->sk pointing to a timewait
> socket.
>
> Same fix is needed in netfnetlink_log
Looks good, but IMHO it is very un-intuitive that
skb->sk might be a pointer to an object that is not struct sock (or
a compatible object).
^ permalink raw reply
* [net-next.git 2/7] stmmac: manage tx clean out of rx_poll
From: Giuseppe CAVALLARO @ 2012-09-03 7:46 UTC (permalink / raw)
To: netdev; +Cc: Giuseppe Cavallaro
In-Reply-To: <1346658422-1925-1-git-send-email-peppe.cavallaro@st.com>
This patch is to invoke the stmmac_tx (tx handler)
out of the NAPI poll method.
This will make easier the next step to add the new
mitigation schema.
Also the patch enhances the ethtool to report some
stats for normal TX and RX IRQs.
Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
---
drivers/net/ethernet/stmicro/stmmac/common.h | 13 +++++++----
drivers/net/ethernet/stmicro/stmmac/dwmac_lib.c | 7 +++--
.../net/ethernet/stmicro/stmmac/stmmac_ethtool.c | 4 ++-
drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 22 ++++++++++++++-----
4 files changed, 31 insertions(+), 15 deletions(-)
diff --git a/drivers/net/ethernet/stmicro/stmmac/common.h b/drivers/net/ethernet/stmicro/stmmac/common.h
index 719be39..bd32fe6 100644
--- a/drivers/net/ethernet/stmicro/stmmac/common.h
+++ b/drivers/net/ethernet/stmicro/stmmac/common.h
@@ -95,7 +95,9 @@ struct stmmac_extra_stats {
unsigned long threshold;
unsigned long tx_pkt_n;
unsigned long rx_pkt_n;
- unsigned long poll_n;
+ unsigned long rx_napi_poll;
+ unsigned long rx_normal_irq_n;
+ unsigned long tx_normal_irq_n;
unsigned long sched_timer_n;
unsigned long normal_irq_n;
unsigned long mmc_tx_irq_n;
@@ -169,10 +171,11 @@ enum rx_frame_status { /* IPC status */
llc_snap = 4,
};
-enum tx_dma_irq_status {
- tx_hard_error = 1,
- tx_hard_error_bump_tc = 2,
- handle_tx_rx = 3,
+enum dma_irq_status {
+ tx_hard_error = 0x1,
+ tx_hard_error_bump_tc = 0x2,
+ handle_rx = 0x4,
+ handle_tx = 0x8,
};
enum core_specific_irq_mask {
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac_lib.c b/drivers/net/ethernet/stmicro/stmmac/dwmac_lib.c
index 4e0e18a..73766e6 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac_lib.c
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac_lib.c
@@ -206,9 +206,10 @@ int dwmac_dma_interrupt(void __iomem *ioaddr,
/* TX/RX NORMAL interrupts */
if (intr_status & DMA_STATUS_NIS) {
x->normal_irq_n++;
- if (likely((intr_status & DMA_STATUS_RI) ||
- (intr_status & (DMA_STATUS_TI))))
- ret = handle_tx_rx;
+ if (likely(intr_status & DMA_STATUS_RI))
+ ret |= handle_rx;
+ if (intr_status & (DMA_STATUS_TI))
+ ret |= handle_tx;
}
/* Optional hardware blocks, interrupts should be disabled */
if (unlikely(intr_status &
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c
index 76fd61a..505fe71 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c
@@ -90,7 +90,9 @@ static const struct stmmac_stats stmmac_gstrings_stats[] = {
STMMAC_STAT(threshold),
STMMAC_STAT(tx_pkt_n),
STMMAC_STAT(rx_pkt_n),
- STMMAC_STAT(poll_n),
+ STMMAC_STAT(rx_napi_poll),
+ STMMAC_STAT(rx_normal_irq_n),
+ STMMAC_STAT(tx_normal_irq_n),
STMMAC_STAT(sched_timer_n),
STMMAC_STAT(normal_irq_n),
STMMAC_STAT(normal_irq_n),
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index c8985f3..b247c39 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -824,16 +824,27 @@ static void stmmac_tx_err(struct stmmac_priv *priv)
netif_wake_queue(priv->dev);
}
+static inline void stmmac_rx_schedule(struct stmmac_priv *priv)
+{
+ if (likely(napi_schedule_prep(&priv->napi))) {
+ stmmac_disable_irq(priv);
+ __napi_schedule(&priv->napi);
+ }
+}
static void stmmac_dma_interrupt(struct stmmac_priv *priv)
{
int status;
status = priv->hw->dma->dma_interrupt(priv->ioaddr, &priv->xstats);
- if (likely(status == handle_tx_rx))
- _stmmac_schedule(priv);
-
- else if (unlikely(status == tx_hard_error_bump_tc)) {
+ if (likely(status == handle_rx)) {
+ priv->xstats.rx_normal_irq_n++;
+ stmmac_rx_schedule(priv);
+ }
+ if (likely(status == handle_tx)) {
+ priv->xstats.tx_normal_irq_n++;
+ stmmac_tx(priv);
+ } else if (unlikely(status == tx_hard_error_bump_tc)) {
/* Try to bump up the dma threshold on this failure */
if (unlikely(tc != SF_DMA_MODE) && (tc <= 256)) {
tc += 64;
@@ -1443,8 +1454,7 @@ static int stmmac_poll(struct napi_struct *napi, int budget)
struct stmmac_priv *priv = container_of(napi, struct stmmac_priv, napi);
int work_done = 0;
- priv->xstats.poll_n++;
- stmmac_tx(priv);
+ priv->xstats.rx_napi_poll++;
work_done = stmmac_rx(priv, budget);
if (work_done < budget) {
--
1.7.4.4
^ permalink raw reply related
* [net-next.git 1/7] stmmac: remove dead code for TIMER
From: Giuseppe CAVALLARO @ 2012-09-03 7:46 UTC (permalink / raw)
To: netdev; +Cc: Giuseppe Cavallaro
In-Reply-To: <1346658422-1925-1-git-send-email-peppe.cavallaro@st.com>
TIMER option is not longer supported and this
code can be considered dead for this driver in
the new kernel series.
In fact, It was not updated at all and never used.
Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
---
drivers/net/ethernet/stmicro/stmmac/Kconfig | 25 ----
drivers/net/ethernet/stmicro/stmmac/Makefile | 1 -
drivers/net/ethernet/stmicro/stmmac/stmmac.h | 6 -
drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 101 +--------------
drivers/net/ethernet/stmicro/stmmac/stmmac_timer.c | 134 --------------------
drivers/net/ethernet/stmicro/stmmac/stmmac_timer.h | 46 -------
6 files changed, 3 insertions(+), 310 deletions(-)
delete mode 100644 drivers/net/ethernet/stmicro/stmmac/stmmac_timer.c
delete mode 100644 drivers/net/ethernet/stmicro/stmmac/stmmac_timer.h
diff --git a/drivers/net/ethernet/stmicro/stmmac/Kconfig b/drivers/net/ethernet/stmicro/stmmac/Kconfig
index 9f44827..1164930 100644
--- a/drivers/net/ethernet/stmicro/stmmac/Kconfig
+++ b/drivers/net/ethernet/stmicro/stmmac/Kconfig
@@ -54,31 +54,6 @@ config STMMAC_DA
By default, the DMA arbitration scheme is based on Round-robin
(rx:tx priority is 1:1).
-config STMMAC_TIMER
- bool "STMMAC Timer optimisation"
- default n
- depends on RTC_HCTOSYS_DEVICE
- ---help---
- Use an external timer for mitigating the number of network
- interrupts. Currently, for SH architectures, it is possible
- to use the TMU channel 2 and the SH-RTC device.
-
-choice
- prompt "Select Timer device"
- depends on STMMAC_TIMER
-
-config STMMAC_TMU_TIMER
- bool "TMU channel 2"
- depends on CPU_SH4
- ---help---
-
-config STMMAC_RTC_TIMER
- bool "Real time clock"
- depends on RTC_CLASS
- ---help---
-
-endchoice
-
choice
prompt "Select the DMA TX/RX descriptor operating modes"
depends on STMMAC_ETH
diff --git a/drivers/net/ethernet/stmicro/stmmac/Makefile b/drivers/net/ethernet/stmicro/stmmac/Makefile
index bc965ac..c8e8ea6 100644
--- a/drivers/net/ethernet/stmicro/stmmac/Makefile
+++ b/drivers/net/ethernet/stmicro/stmmac/Makefile
@@ -1,5 +1,4 @@
obj-$(CONFIG_STMMAC_ETH) += stmmac.o
-stmmac-$(CONFIG_STMMAC_TIMER) += stmmac_timer.o
stmmac-$(CONFIG_STMMAC_RING) += ring_mode.o
stmmac-$(CONFIG_STMMAC_CHAINED) += chain_mode.o
stmmac-$(CONFIG_STMMAC_PLATFORM) += stmmac_platform.o
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac.h b/drivers/net/ethernet/stmicro/stmmac/stmmac.h
index e872e1d..9f35769 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac.h
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac.h
@@ -31,9 +31,6 @@
#include <linux/phy.h>
#include <linux/pci.h>
#include "common.h"
-#ifdef CONFIG_STMMAC_TIMER
-#include "stmmac_timer.h"
-#endif
struct stmmac_priv {
/* Frequently used values are kept adjacent for cache effect */
@@ -78,9 +75,6 @@ struct stmmac_priv {
spinlock_t tx_lock;
int wolopts;
int wol_irq;
-#ifdef CONFIG_STMMAC_TIMER
- struct stmmac_timer *tm;
-#endif
struct plat_stmmacenet_data *plat;
struct stmmac_counters mmc;
struct dma_features dma_cap;
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index c136162..c8985f3 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -115,16 +115,6 @@ static int tc = TC_DEFAULT;
module_param(tc, int, S_IRUGO | S_IWUSR);
MODULE_PARM_DESC(tc, "DMA threshold control value");
-/* Pay attention to tune this parameter; take care of both
- * hardware capability and network stabitily/performance impact.
- * Many tests showed that ~4ms latency seems to be good enough. */
-#ifdef CONFIG_STMMAC_TIMER
-#define DEFAULT_PERIODIC_RATE 256
-static int tmrate = DEFAULT_PERIODIC_RATE;
-module_param(tmrate, int, S_IRUGO | S_IWUSR);
-MODULE_PARM_DESC(tmrate, "External timer freq. (default: 256Hz)");
-#endif
-
#define DMA_BUFFER_SIZE BUF_SIZE_2KiB
static int buf_sz = DMA_BUFFER_SIZE;
module_param(buf_sz, int, S_IRUGO | S_IWUSR);
@@ -536,12 +526,6 @@ static void init_dma_desc_rings(struct net_device *dev)
else
bfsize = stmmac_set_bfsize(dev->mtu, priv->dma_buf_sz);
-#ifdef CONFIG_STMMAC_TIMER
- /* Disable interrupts on completion for the reception if timer is on */
- if (likely(priv->tm->enable))
- dis_ic = 1;
-#endif
-
DBG(probe, INFO, "stmmac: txsize %d, rxsize %d, bfsize %d\n",
txsize, rxsize, bfsize);
@@ -786,22 +770,12 @@ static void stmmac_tx(struct stmmac_priv *priv)
static inline void stmmac_enable_irq(struct stmmac_priv *priv)
{
-#ifdef CONFIG_STMMAC_TIMER
- if (likely(priv->tm->enable))
- priv->tm->timer_start(tmrate);
- else
-#endif
- priv->hw->dma->enable_dma_irq(priv->ioaddr);
+ priv->hw->dma->enable_dma_irq(priv->ioaddr);
}
static inline void stmmac_disable_irq(struct stmmac_priv *priv)
{
-#ifdef CONFIG_STMMAC_TIMER
- if (likely(priv->tm->enable))
- priv->tm->timer_stop();
- else
-#endif
- priv->hw->dma->disable_dma_irq(priv->ioaddr);
+ priv->hw->dma->disable_dma_irq(priv->ioaddr);
}
static int stmmac_has_work(struct stmmac_priv *priv)
@@ -829,25 +803,6 @@ static inline void _stmmac_schedule(struct stmmac_priv *priv)
}
}
-#ifdef CONFIG_STMMAC_TIMER
-void stmmac_schedule(struct net_device *dev)
-{
- struct stmmac_priv *priv = netdev_priv(dev);
-
- priv->xstats.sched_timer_n++;
-
- _stmmac_schedule(priv);
-}
-
-static void stmmac_no_timer_started(unsigned int x)
-{;
-};
-
-static void stmmac_no_timer_stopped(void)
-{;
-};
-#endif
-
/**
* stmmac_tx_err:
* @priv: pointer to the private device structure
@@ -1049,23 +1004,6 @@ static int stmmac_open(struct net_device *dev)
struct stmmac_priv *priv = netdev_priv(dev);
int ret;
-#ifdef CONFIG_STMMAC_TIMER
- priv->tm = kzalloc(sizeof(struct stmmac_timer *), GFP_KERNEL);
- if (unlikely(priv->tm == NULL))
- return -ENOMEM;
-
- priv->tm->freq = tmrate;
-
- /* Test if the external timer can be actually used.
- * In case of failure continue without timer. */
- if (unlikely((stmmac_open_ext_timer(dev, priv->tm)) < 0)) {
- pr_warning("stmmaceth: cannot attach the external timer.\n");
- priv->tm->freq = 0;
- priv->tm->timer_start = stmmac_no_timer_started;
- priv->tm->timer_stop = stmmac_no_timer_stopped;
- } else
- priv->tm->enable = 1;
-#endif
clk_enable(priv->stmmac_clk);
stmmac_check_ether_addr(priv);
@@ -1152,10 +1090,6 @@ static int stmmac_open(struct net_device *dev)
priv->hw->dma->start_tx(priv->ioaddr);
priv->hw->dma->start_rx(priv->ioaddr);
-#ifdef CONFIG_STMMAC_TIMER
- priv->tm->timer_start(tmrate);
-#endif
-
/* Dump DMA/MAC registers */
if (netif_msg_hw(priv)) {
priv->hw->mac->dump_regs(priv->ioaddr);
@@ -1182,9 +1116,6 @@ open_error_wolirq:
free_irq(dev->irq, dev);
open_error:
-#ifdef CONFIG_STMMAC_TIMER
- kfree(priv->tm);
-#endif
if (priv->phydev)
phy_disconnect(priv->phydev);
@@ -1215,12 +1146,6 @@ static int stmmac_release(struct net_device *dev)
netif_stop_queue(dev);
-#ifdef CONFIG_STMMAC_TIMER
- /* Stop and release the timer */
- stmmac_close_ext_timer();
- if (priv->tm != NULL)
- kfree(priv->tm);
-#endif
napi_disable(&priv->napi);
skb_queue_purge(&priv->rx_recycle);
@@ -1336,12 +1261,6 @@ static netdev_tx_t stmmac_xmit(struct sk_buff *skb, struct net_device *dev)
/* Interrupt on completition only for the latest segment */
priv->hw->desc->close_tx_desc(desc);
-#ifdef CONFIG_STMMAC_TIMER
- /* Clean IC while using timer */
- if (likely(priv->tm->enable))
- priv->hw->desc->clear_tx_ic(desc);
-#endif
-
wmb();
/* To avoid raise condition */
@@ -1539,7 +1458,7 @@ static int stmmac_poll(struct napi_struct *napi, int budget)
* stmmac_tx_timeout
* @dev : Pointer to net device structure
* Description: this function is called when a packet transmission fails to
- * complete within a reasonable tmrate. The driver will mark the error in the
+ * complete within a reasonable time. The driver will mark the error in the
* netdev structure and arrange for the device to be reset to a sane state
* in order to transmit a new packet.
*/
@@ -2157,11 +2076,6 @@ int stmmac_suspend(struct net_device *ndev)
netif_device_detach(ndev);
netif_stop_queue(ndev);
-#ifdef CONFIG_STMMAC_TIMER
- priv->tm->timer_stop();
- if (likely(priv->tm->enable))
- dis_ic = 1;
-#endif
napi_disable(&priv->napi);
/* Stop TX/RX DMA */
@@ -2212,10 +2126,6 @@ int stmmac_resume(struct net_device *ndev)
priv->hw->dma->start_tx(priv->ioaddr);
priv->hw->dma->start_rx(priv->ioaddr);
-#ifdef CONFIG_STMMAC_TIMER
- if (likely(priv->tm->enable))
- priv->tm->timer_start(tmrate);
-#endif
napi_enable(&priv->napi);
netif_start_queue(ndev);
@@ -2311,11 +2221,6 @@ static int __init stmmac_cmdline_opt(char *str)
} else if (!strncmp(opt, "eee_timer:", 6)) {
if (kstrtoint(opt + 10, 0, &eee_timer))
goto err;
-#ifdef CONFIG_STMMAC_TIMER
- } else if (!strncmp(opt, "tmrate:", 7)) {
- if (kstrtoint(opt + 7, 0, &tmrate))
- goto err;
-#endif
}
}
return 0;
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_timer.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_timer.c
deleted file mode 100644
index 2a0e1ab..0000000
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_timer.c
+++ /dev/null
@@ -1,134 +0,0 @@
-/*******************************************************************************
- STMMAC external timer support.
-
- Copyright (C) 2007-2009 STMicroelectronics Ltd
-
- This program is free software; you can redistribute it and/or modify it
- under the terms and conditions of the GNU General Public License,
- version 2, as published by the Free Software Foundation.
-
- This program is distributed in the hope it will be useful, but WITHOUT
- ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
- FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
- more details.
-
- You should have received a copy of the GNU General Public License along with
- this program; if not, write to the Free Software Foundation, Inc.,
- 51 Franklin St - Fifth Floor, Boston, MA 02110-1301 USA.
-
- The full GNU General Public License is included in this distribution in
- the file called "COPYING".
-
- Author: Giuseppe Cavallaro <peppe.cavallaro@st.com>
-*******************************************************************************/
-
-#include <linux/kernel.h>
-#include <linux/etherdevice.h>
-#include "stmmac_timer.h"
-
-static void stmmac_timer_handler(void *data)
-{
- struct net_device *dev = (struct net_device *)data;
-
- stmmac_schedule(dev);
-}
-
-#define STMMAC_TIMER_MSG(timer, freq) \
-printk(KERN_INFO "stmmac_timer: %s Timer ON (freq %dHz)\n", timer, freq);
-
-#if defined(CONFIG_STMMAC_RTC_TIMER)
-#include <linux/rtc.h>
-static struct rtc_device *stmmac_rtc;
-static rtc_task_t stmmac_task;
-
-static void stmmac_rtc_start(unsigned int new_freq)
-{
- rtc_irq_set_freq(stmmac_rtc, &stmmac_task, new_freq);
- rtc_irq_set_state(stmmac_rtc, &stmmac_task, 1);
-}
-
-static void stmmac_rtc_stop(void)
-{
- rtc_irq_set_state(stmmac_rtc, &stmmac_task, 0);
-}
-
-int stmmac_open_ext_timer(struct net_device *dev, struct stmmac_timer *tm)
-{
- stmmac_task.private_data = dev;
- stmmac_task.func = stmmac_timer_handler;
-
- stmmac_rtc = rtc_class_open(CONFIG_RTC_HCTOSYS_DEVICE);
- if (stmmac_rtc == NULL) {
- pr_err("open rtc device failed\n");
- return -ENODEV;
- }
-
- rtc_irq_register(stmmac_rtc, &stmmac_task);
-
- /* Periodic mode is not supported */
- if ((rtc_irq_set_freq(stmmac_rtc, &stmmac_task, tm->freq) < 0)) {
- pr_err("set periodic failed\n");
- rtc_irq_unregister(stmmac_rtc, &stmmac_task);
- rtc_class_close(stmmac_rtc);
- return -1;
- }
-
- STMMAC_TIMER_MSG(CONFIG_RTC_HCTOSYS_DEVICE, tm->freq);
-
- tm->timer_start = stmmac_rtc_start;
- tm->timer_stop = stmmac_rtc_stop;
-
- return 0;
-}
-
-int stmmac_close_ext_timer(void)
-{
- rtc_irq_set_state(stmmac_rtc, &stmmac_task, 0);
- rtc_irq_unregister(stmmac_rtc, &stmmac_task);
- rtc_class_close(stmmac_rtc);
- return 0;
-}
-
-#elif defined(CONFIG_STMMAC_TMU_TIMER)
-#include <linux/clk.h>
-#define TMU_CHANNEL "tmu2_clk"
-static struct clk *timer_clock;
-
-static void stmmac_tmu_start(unsigned int new_freq)
-{
- clk_set_rate(timer_clock, new_freq);
- clk_enable(timer_clock);
-}
-
-static void stmmac_tmu_stop(void)
-{
- clk_disable(timer_clock);
-}
-
-int stmmac_open_ext_timer(struct net_device *dev, struct stmmac_timer *tm)
-{
- timer_clock = clk_get(NULL, TMU_CHANNEL);
-
- if (timer_clock == NULL)
- return -1;
-
- if (tmu2_register_user(stmmac_timer_handler, (void *)dev) < 0) {
- timer_clock = NULL;
- return -1;
- }
-
- STMMAC_TIMER_MSG("TMU2", tm->freq);
- tm->timer_start = stmmac_tmu_start;
- tm->timer_stop = stmmac_tmu_stop;
-
- return 0;
-}
-
-int stmmac_close_ext_timer(void)
-{
- clk_disable(timer_clock);
- tmu2_unregister_user();
- clk_put(timer_clock);
- return 0;
-}
-#endif
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_timer.h b/drivers/net/ethernet/stmicro/stmmac/stmmac_timer.h
deleted file mode 100644
index aea9b14..0000000
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_timer.h
+++ /dev/null
@@ -1,46 +0,0 @@
-/*******************************************************************************
- STMMAC external timer Header File.
-
- Copyright (C) 2007-2009 STMicroelectronics Ltd
-
- This program is free software; you can redistribute it and/or modify it
- under the terms and conditions of the GNU General Public License,
- version 2, as published by the Free Software Foundation.
-
- This program is distributed in the hope it will be useful, but WITHOUT
- ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
- FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
- more details.
-
- You should have received a copy of the GNU General Public License along with
- this program; if not, write to the Free Software Foundation, Inc.,
- 51 Franklin St - Fifth Floor, Boston, MA 02110-1301 USA.
-
- The full GNU General Public License is included in this distribution in
- the file called "COPYING".
-
- Author: Giuseppe Cavallaro <peppe.cavallaro@st.com>
-*******************************************************************************/
-#ifndef __STMMAC_TIMER_H__
-#define __STMMAC_TIMER_H__
-
-struct stmmac_timer {
- void (*timer_start) (unsigned int new_freq);
- void (*timer_stop) (void);
- unsigned int freq;
- unsigned int enable;
-};
-
-/* Open the HW timer device and return 0 in case of success */
-int stmmac_open_ext_timer(struct net_device *dev, struct stmmac_timer *tm);
-/* Stop the timer and release it */
-int stmmac_close_ext_timer(void);
-/* Function used for scheduling task within the stmmac */
-void stmmac_schedule(struct net_device *dev);
-
-#if defined(CONFIG_STMMAC_TMU_TIMER)
-extern int tmu2_register_user(void *fnt, void *data);
-extern void tmu2_unregister_user(void);
-#endif
-
-#endif /* __STMMAC_TIMER_H__ */
--
1.7.4.4
^ permalink raw reply related
* [net-next.git 0/7] stmmac: remove dead code for STMMAC_TIMER and add new mitigation schema.
From: Giuseppe CAVALLARO @ 2012-09-03 7:46 UTC (permalink / raw)
To: netdev; +Cc: Giuseppe Cavallaro
These patch series remove the STMMAC_TIMER option no longer updated
and never used and add a new mitigation schema.
Having removed the Timer opt, this has made the driver slim.
On top of this work, it has been easier to introduce the new
mitigation schema based on HW RX-watchdog (available in new cores).
In fact, 3.50 and newer cores have an HW RX-Watchdog that can be used for
mitigating the Rx-interrupts and first results look promising.
Running n-u-t-t-c-p with the following parameters:
Throughput: 500Mbps
UDP Buffer size: 1328bytes
TCP Buffer size: 65536bytes
for example, I got on ST box (arm-based) these improvements:
--------------------------------------------------------------------
Original | With New Mitigation patch
--------------------------------------------------------------------
Test CPU usage pkt/loss | CPU usage pkt/loss
Type Mbps % % |Mbps % %
--------------------------------------------------------------------
UDP-RX 395.5065 95 20.89 |499.9966 25 0.00%
UDP-TX 499.5578 100 0.08915 |499.7156 100 0.06029%
TCP-RX 499.9221 77 |499.8648 41
TCP-TX 389.5719 99 |499.2802 79
--------------------------------------------------------------------
... no regression on ST boxes (SH based) I always test.
This is a brief explanation of the new mitigation schema although there
is a patch that updates the driver's documentation.
o On Rx-side I have:
New GMACs will use the RX-watchdog timer; old ones will continue to
use NAPI to mitigate the RX DMA interrupts.
For the RX-watchdog, there is a parameter that is the RI Watchdog
Timer count. It indicates the number of system clock cycles and can be
set via sysFS. Next step will be to tune it via ethtool.
o On Tx-side, the mitigation schema is based on a SW timer
that calls the tx function (stmmac_tx) to reclaim the resource after
transmitting the frames.
Also there is another parameter (a threshold) used to program
the descriptors avoiding to set the interrupt on completion bit in
when the frame is sent (xmit). This means that the stmmac_tx can be
called by the ISR too. Also this parameter can be tuned via sysFs and
not yet via ethtool.
Note1: there is a patch that updates the driver to August 2012.
I hope to also release the PTP support and update the driver in the next weeks.
Note2: next step will be to tune coalesce params via ethtool. I'll do that.
peppe
Giuseppe Cavallaro (7):
stmmac: remove dead code for TIMER
stmmac: manage tx clean out of rx_poll
stmmac: add the initial tx coalesce schema
stmmac: add Rx watchdog optimization to mitigate the DMA irqs
stmmac: add sysFs support
stmmac: add mitigation and sysfs info in the doc
stmmac: update the driver version to August_2012
Documentation/networking/stmmac.txt | 34 ++-
drivers/net/ethernet/stmicro/stmmac/Kconfig | 25 --
drivers/net/ethernet/stmicro/stmmac/Makefile | 3 +-
drivers/net/ethernet/stmicro/stmmac/common.h | 30 ++-
drivers/net/ethernet/stmicro/stmmac/dwmac1000.h | 3 -
.../net/ethernet/stmicro/stmmac/dwmac1000_dma.c | 6 +
drivers/net/ethernet/stmicro/stmmac/dwmac_dma.h | 3 +-
drivers/net/ethernet/stmicro/stmmac/dwmac_lib.c | 7 +-
drivers/net/ethernet/stmicro/stmmac/stmmac-sysfs.c | 157 +++++++++++
drivers/net/ethernet/stmicro/stmmac/stmmac.h | 15 +-
.../net/ethernet/stmicro/stmmac/stmmac_ethtool.c | 9 +-
drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 282 +++++++++-----------
drivers/net/ethernet/stmicro/stmmac/stmmac_timer.c | 134 ---------
drivers/net/ethernet/stmicro/stmmac/stmmac_timer.h | 46 ----
14 files changed, 354 insertions(+), 400 deletions(-)
create mode 100644 drivers/net/ethernet/stmicro/stmmac/stmmac-sysfs.c
delete mode 100644 drivers/net/ethernet/stmicro/stmmac/stmmac_timer.c
delete mode 100644 drivers/net/ethernet/stmicro/stmmac/stmmac_timer.h
--
1.7.4.4
^ permalink raw reply
* [PATCH] mISDN: fix possible memory leak in hfcmulti_init()
From: Wei Yongjun @ 2012-09-03 7:31 UTC (permalink / raw)
To: isdn; +Cc: yongjun_wei, netdev
From: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
hc has been allocated in this function and missing free it before
leaving from some error handling cases.
spatch with a semantic match is used to found this problem.
(http://coccinelle.lip6.fr/)
Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
---
drivers/isdn/hardware/mISDN/hfcmulti.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/isdn/hardware/mISDN/hfcmulti.c b/drivers/isdn/hardware/mISDN/hfcmulti.c
index 5e402cf..f027942 100644
--- a/drivers/isdn/hardware/mISDN/hfcmulti.c
+++ b/drivers/isdn/hardware/mISDN/hfcmulti.c
@@ -5059,6 +5059,7 @@ hfcmulti_init(struct hm_map *m, struct pci_dev *pdev,
printk(KERN_INFO
"HFC-E1 #%d has overlapping B-channels on fragment #%d\n",
E1_cnt + 1, pt);
+ kfree(hc);
return -EINVAL;
}
maskcheck |= hc->bmask[pt];
@@ -5086,6 +5087,7 @@ hfcmulti_init(struct hm_map *m, struct pci_dev *pdev,
if ((poll >> 1) > sizeof(hc->silence_data)) {
printk(KERN_ERR "HFCMULTI error: silence_data too small, "
"please fix\n");
+ kfree(hc);
return -EINVAL;
}
for (i = 0; i < (poll >> 1); i++)
^ permalink raw reply related
* Support for draft-ietf-6man-impatient-nud-02
From: Kiran (Kiran Kumar) Kella @ 2012-09-03 6:19 UTC (permalink / raw)
To: netdev@vger.kernel.org
Hi All,
Would like to know if the patch for supporting new NDP state UNREACHABLE as per "draft-ietf-6man-impatient-nud-02" (Neighbor Unreachability Detection is too impatient) exists?
It is useful to have these changes supported as part of the initiatives to mitigate the effects of DoS attacks as mentioned in RFC 6583 (Operational Neighbor Discovery Problems).
I couldn't find any related emails in the netdev archives.
Appreciate your response.
Regards,
Kiran
^ permalink raw reply
* Question: routing packets via specific router in LAN?
From: Yi Li @ 2012-09-03 6:04 UTC (permalink / raw)
To: netdev
Hi All,
I have server --- router ---client three machines,
and they all have only one ip in the same LAN.
I want to instruct the packets flowing through the router when the
server and client communicates.
I have do the following things to setup:
on the server:
# ip route add to unicast CLIENT_IP/32 via ROUTER_IP dev eth0
# echo 0 > /proc/sys/net/ipv4/conf/all/accept_redirects
# echo 0 > /proc/sys/net/ipv4/conf/eth0/accept_redirects
on the client:
/*modify route table*/
# ip route add to unicast SERVER_IP/32 via ROUTER_IP dev eth0
/*disable icmp-redirects accept*/
# echo 0 > /proc/sys/net/ipv4/conf/all/accept_redirects
# echo 0 > /proc/sys/net/ipv4/conf/eth0/accept_redirects
on the router:
/*enable forwarding*/
# echo 1 > /proc/sys/net/ipv4/ip_forwarding
/*disable icmp-redirects*/
# echo 0 > /proc/sys/net/ipv4/conf/all/send_redirects
# echo 0 > /proc/sys/net/ipv4/conf/eth0/send_redirects
BTW, I have disabled iptables on all of these three machines.
But I still can't tcpdump any packets on the router which means no
packets flowing
through the router. The server and client communicates by pass the router!
So, What I have missed, or we can't setup server-router-client topo in LAN?
Thanks in advaced.
^ permalink raw reply
* Re: [PATCH 3/3] tcp: use PRR to reduce cwin in CWR state
From: Neal Cardwell @ 2012-09-03 4:00 UTC (permalink / raw)
To: Yuchung Cheng
Cc: David Miller, Nandita Dukkipati, Matt Mathis, Eric Dumazet,
Netdev
In-Reply-To: <1346643484-12947-3-git-send-email-ycheng@google.com>
On Sun, Sep 2, 2012 at 11:38 PM, Yuchung Cheng <ycheng@google.com> wrote:
> Use proportional rate reduction (PRR) algorithm to reduce cwnd in CWR state,
> in addition to Recovery state. Retire the current rate-halving in CWR.
> When losses are detected via ACKs in CWR state, the sender enters Recovery
> state but the cwnd reduction continues and does not restart.
>
> Rename and refactor cwnd reduction functions since both CWR and Recovery
> use the same algorithm:
> tcp_init_cwnd_reduction() is new and initiates reduction state variables.
> tcp_cwnd_reduction() is previously tcp_update_cwnd_in_recovery().
> tcp_ends_cwnd_reduction() is previously tcp_complete_cwr().
>
> The rate halving functions and logic such as tcp_cwnd_down(), tcp_min_cwnd(),
> and the cwnd moderation inside tcp_enter_cwr() are removed. The unused
> parameter, flag, in tcp_cwnd_reduction() is also removed.
>
> Signed-off-by: Yuchung Cheng <ycheng@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
neal
^ permalink raw reply
* Re: [PATCH 2/3] tcp: move tcp_update_cwnd_in_recovery
From: Neal Cardwell @ 2012-09-03 3:57 UTC (permalink / raw)
To: Yuchung Cheng
Cc: David Miller, Nandita Dukkipati, Matt Mathis, Eric Dumazet,
Netdev
In-Reply-To: <1346643484-12947-2-git-send-email-ycheng@google.com>
On Sun, Sep 2, 2012 at 11:38 PM, Yuchung Cheng <ycheng@google.com> wrote:
> To prepare replacing rate halving with PRR algorithm in CWR state.
>
> Signed-off-by: Yuchung Cheng <ycheng@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
neal
^ permalink raw reply
* Re: [PATCH 1/3] tcp: move tcp_enter_cwr()
From: Neal Cardwell @ 2012-09-03 3:56 UTC (permalink / raw)
To: Yuchung Cheng
Cc: David Miller, Nandita Dukkipati, Matt Mathis, Eric Dumazet,
Netdev
In-Reply-To: <1346643484-12947-1-git-send-email-ycheng@google.com>
On Sun, Sep 2, 2012 at 11:38 PM, Yuchung Cheng <ycheng@google.com> wrote:
> To prepare replacing rate halving with PRR algorithm in CWR state.
>
> Signed-off-by: Yuchung Cheng <ycheng@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
neal
^ permalink raw reply
* [PATCH 2/3] tcp: move tcp_update_cwnd_in_recovery
From: Yuchung Cheng @ 2012-09-03 3:38 UTC (permalink / raw)
To: davem, ncardwell, nanditad; +Cc: mattmathis, edumazet, netdev, Yuchung Cheng
In-Reply-To: <1346643484-12947-1-git-send-email-ycheng@google.com>
To prepare replacing rate halving with PRR algorithm in CWR state.
Signed-off-by: Yuchung Cheng <ycheng@google.com>
---
net/ipv4/tcp_input.c | 64 +++++++++++++++++++++++++-------------------------
1 files changed, 32 insertions(+), 32 deletions(-)
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 3ab0c75..38589e4 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -2700,6 +2700,38 @@ static bool tcp_try_undo_loss(struct sock *sk)
return false;
}
+/* This function implements the PRR algorithm, specifcally the PRR-SSRB
+ * (proportional rate reduction with slow start reduction bound) as described in
+ * http://www.ietf.org/id/draft-mathis-tcpm-proportional-rate-reduction-01.txt.
+ * It computes the number of packets to send (sndcnt) based on packets newly
+ * delivered:
+ * 1) If the packets in flight is larger than ssthresh, PRR spreads the
+ * cwnd reductions across a full RTT.
+ * 2) If packets in flight is lower than ssthresh (such as due to excess
+ * losses and/or application stalls), do not perform any further cwnd
+ * reductions, but instead slow start up to ssthresh.
+ */
+static void tcp_update_cwnd_in_recovery(struct sock *sk, int newly_acked_sacked,
+ int fast_rexmit, int flag)
+{
+ struct tcp_sock *tp = tcp_sk(sk);
+ int sndcnt = 0;
+ int delta = tp->snd_ssthresh - tcp_packets_in_flight(tp);
+
+ if (tcp_packets_in_flight(tp) > tp->snd_ssthresh) {
+ u64 dividend = (u64)tp->snd_ssthresh * tp->prr_delivered +
+ tp->prior_cwnd - 1;
+ sndcnt = div_u64(dividend, tp->prior_cwnd) - tp->prr_out;
+ } else {
+ sndcnt = min_t(int, delta,
+ max_t(int, tp->prr_delivered - tp->prr_out,
+ newly_acked_sacked) + 1);
+ }
+
+ sndcnt = max(sndcnt, (fast_rexmit ? 1 : 0));
+ tp->snd_cwnd = tcp_packets_in_flight(tp) + sndcnt;
+}
+
static inline void tcp_complete_cwr(struct sock *sk)
{
struct tcp_sock *tp = tcp_sk(sk);
@@ -2854,38 +2886,6 @@ void tcp_simple_retransmit(struct sock *sk)
}
EXPORT_SYMBOL(tcp_simple_retransmit);
-/* This function implements the PRR algorithm, specifcally the PRR-SSRB
- * (proportional rate reduction with slow start reduction bound) as described in
- * http://www.ietf.org/id/draft-mathis-tcpm-proportional-rate-reduction-01.txt.
- * It computes the number of packets to send (sndcnt) based on packets newly
- * delivered:
- * 1) If the packets in flight is larger than ssthresh, PRR spreads the
- * cwnd reductions across a full RTT.
- * 2) If packets in flight is lower than ssthresh (such as due to excess
- * losses and/or application stalls), do not perform any further cwnd
- * reductions, but instead slow start up to ssthresh.
- */
-static void tcp_update_cwnd_in_recovery(struct sock *sk, int newly_acked_sacked,
- int fast_rexmit, int flag)
-{
- struct tcp_sock *tp = tcp_sk(sk);
- int sndcnt = 0;
- int delta = tp->snd_ssthresh - tcp_packets_in_flight(tp);
-
- if (tcp_packets_in_flight(tp) > tp->snd_ssthresh) {
- u64 dividend = (u64)tp->snd_ssthresh * tp->prr_delivered +
- tp->prior_cwnd - 1;
- sndcnt = div_u64(dividend, tp->prior_cwnd) - tp->prr_out;
- } else {
- sndcnt = min_t(int, delta,
- max_t(int, tp->prr_delivered - tp->prr_out,
- newly_acked_sacked) + 1);
- }
-
- sndcnt = max(sndcnt, (fast_rexmit ? 1 : 0));
- tp->snd_cwnd = tcp_packets_in_flight(tp) + sndcnt;
-}
-
static void tcp_enter_recovery(struct sock *sk, bool ece_ack)
{
struct tcp_sock *tp = tcp_sk(sk);
--
1.7.7.3
^ permalink raw reply related
* [PATCH 3/3] tcp: use PRR to reduce cwin in CWR state
From: Yuchung Cheng @ 2012-09-03 3:38 UTC (permalink / raw)
To: davem, ncardwell, nanditad; +Cc: mattmathis, edumazet, netdev, Yuchung Cheng
In-Reply-To: <1346643484-12947-1-git-send-email-ycheng@google.com>
Use proportional rate reduction (PRR) algorithm to reduce cwnd in CWR state,
in addition to Recovery state. Retire the current rate-halving in CWR.
When losses are detected via ACKs in CWR state, the sender enters Recovery
state but the cwnd reduction continues and does not restart.
Rename and refactor cwnd reduction functions since both CWR and Recovery
use the same algorithm:
tcp_init_cwnd_reduction() is new and initiates reduction state variables.
tcp_cwnd_reduction() is previously tcp_update_cwnd_in_recovery().
tcp_ends_cwnd_reduction() is previously tcp_complete_cwr().
The rate halving functions and logic such as tcp_cwnd_down(), tcp_min_cwnd(),
and the cwnd moderation inside tcp_enter_cwr() are removed. The unused
parameter, flag, in tcp_cwnd_reduction() is also removed.
Signed-off-by: Yuchung Cheng <ycheng@google.com>
---
include/net/tcp.h | 10 +++-
net/ipv4/tcp_input.c | 119 +++++++++++++++++--------------------------------
net/ipv4/tcp_output.c | 6 +-
3 files changed, 52 insertions(+), 83 deletions(-)
diff --git a/include/net/tcp.h b/include/net/tcp.h
index 1421b02..a8cb00c 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -913,15 +913,21 @@ static inline bool tcp_in_initial_slowstart(const struct tcp_sock *tp)
return tp->snd_ssthresh >= TCP_INFINITE_SSTHRESH;
}
+static inline bool tcp_in_cwnd_reduction(const struct sock *sk)
+{
+ return (TCPF_CA_CWR | TCPF_CA_Recovery) &
+ (1 << inet_csk(sk)->icsk_ca_state);
+}
+
/* If cwnd > ssthresh, we may raise ssthresh to be half-way to cwnd.
- * The exception is rate halving phase, when cwnd is decreasing towards
+ * The exception is cwnd reduction phase, when cwnd is decreasing towards
* ssthresh.
*/
static inline __u32 tcp_current_ssthresh(const struct sock *sk)
{
const struct tcp_sock *tp = tcp_sk(sk);
- if ((1 << inet_csk(sk)->icsk_ca_state) & (TCPF_CA_CWR | TCPF_CA_Recovery))
+ if (tcp_in_cwnd_reduction(sk))
return tp->snd_ssthresh;
else
return max(tp->snd_ssthresh,
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 38589e4..e2bec81 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -2470,35 +2470,6 @@ static inline void tcp_moderate_cwnd(struct tcp_sock *tp)
tp->snd_cwnd_stamp = tcp_time_stamp;
}
-/* Lower bound on congestion window is slow start threshold
- * unless congestion avoidance choice decides to overide it.
- */
-static inline u32 tcp_cwnd_min(const struct sock *sk)
-{
- const struct tcp_congestion_ops *ca_ops = inet_csk(sk)->icsk_ca_ops;
-
- return ca_ops->min_cwnd ? ca_ops->min_cwnd(sk) : tcp_sk(sk)->snd_ssthresh;
-}
-
-/* Decrease cwnd each second ack. */
-static void tcp_cwnd_down(struct sock *sk, int flag)
-{
- struct tcp_sock *tp = tcp_sk(sk);
- int decr = tp->snd_cwnd_cnt + 1;
-
- if ((flag & (FLAG_ANY_PROGRESS | FLAG_DSACKING_ACK)) ||
- (tcp_is_reno(tp) && !(flag & FLAG_NOT_DUP))) {
- tp->snd_cwnd_cnt = decr & 1;
- decr >>= 1;
-
- if (decr && tp->snd_cwnd > tcp_cwnd_min(sk))
- tp->snd_cwnd -= decr;
-
- tp->snd_cwnd = min(tp->snd_cwnd, tcp_packets_in_flight(tp) + 1);
- tp->snd_cwnd_stamp = tcp_time_stamp;
- }
-}
-
/* Nothing was retransmitted or returned timestamp is less
* than timestamp of the first retransmission.
*/
@@ -2700,9 +2671,8 @@ static bool tcp_try_undo_loss(struct sock *sk)
return false;
}
-/* This function implements the PRR algorithm, specifcally the PRR-SSRB
- * (proportional rate reduction with slow start reduction bound) as described in
- * http://www.ietf.org/id/draft-mathis-tcpm-proportional-rate-reduction-01.txt.
+/* The cwnd reduction in CWR and Recovery use the PRR algorithm
+ * https://datatracker.ietf.org/doc/draft-ietf-tcpm-proportional-rate-reduction/
* It computes the number of packets to send (sndcnt) based on packets newly
* delivered:
* 1) If the packets in flight is larger than ssthresh, PRR spreads the
@@ -2711,13 +2681,29 @@ static bool tcp_try_undo_loss(struct sock *sk)
* losses and/or application stalls), do not perform any further cwnd
* reductions, but instead slow start up to ssthresh.
*/
-static void tcp_update_cwnd_in_recovery(struct sock *sk, int newly_acked_sacked,
- int fast_rexmit, int flag)
+static void tcp_init_cwnd_reduction(struct sock *sk, const bool set_ssthresh)
+{
+ struct tcp_sock *tp = tcp_sk(sk);
+
+ tp->high_seq = tp->snd_nxt;
+ tp->bytes_acked = 0;
+ tp->snd_cwnd_cnt = 0;
+ tp->prior_cwnd = tp->snd_cwnd;
+ tp->prr_delivered = 0;
+ tp->prr_out = 0;
+ if (set_ssthresh)
+ tp->snd_ssthresh = inet_csk(sk)->icsk_ca_ops->ssthresh(sk);
+ TCP_ECN_queue_cwr(tp);
+}
+
+static void tcp_cwnd_reduction(struct sock *sk, int newly_acked_sacked,
+ int fast_rexmit)
{
struct tcp_sock *tp = tcp_sk(sk);
int sndcnt = 0;
int delta = tp->snd_ssthresh - tcp_packets_in_flight(tp);
+ tp->prr_delivered += newly_acked_sacked;
if (tcp_packets_in_flight(tp) > tp->snd_ssthresh) {
u64 dividend = (u64)tp->snd_ssthresh * tp->prr_delivered +
tp->prior_cwnd - 1;
@@ -2732,43 +2718,29 @@ static void tcp_update_cwnd_in_recovery(struct sock *sk, int newly_acked_sacked,
tp->snd_cwnd = tcp_packets_in_flight(tp) + sndcnt;
}
-static inline void tcp_complete_cwr(struct sock *sk)
+static inline void tcp_end_cwnd_reduction(struct sock *sk)
{
struct tcp_sock *tp = tcp_sk(sk);
- /* Do not moderate cwnd if it's already undone in cwr or recovery. */
- if (tp->undo_marker) {
- if (inet_csk(sk)->icsk_ca_state == TCP_CA_CWR) {
- tp->snd_cwnd = min(tp->snd_cwnd, tp->snd_ssthresh);
- tp->snd_cwnd_stamp = tcp_time_stamp;
- } else if (tp->snd_ssthresh < TCP_INFINITE_SSTHRESH) {
- /* PRR algorithm. */
- tp->snd_cwnd = tp->snd_ssthresh;
- tp->snd_cwnd_stamp = tcp_time_stamp;
- }
+ /* Reset cwnd to ssthresh in CWR or Recovery (unless it's undone) */
+ if (inet_csk(sk)->icsk_ca_state == TCP_CA_CWR ||
+ (tp->undo_marker && tp->snd_ssthresh < TCP_INFINITE_SSTHRESH)) {
+ tp->snd_cwnd = tp->snd_ssthresh;
+ tp->snd_cwnd_stamp = tcp_time_stamp;
}
tcp_ca_event(sk, CA_EVENT_COMPLETE_CWR);
}
-/* Set slow start threshold and cwnd not falling to slow start */
+/* Enter CWR state. Disable cwnd undo since congestion is proven with ECN */
void tcp_enter_cwr(struct sock *sk, const int set_ssthresh)
{
struct tcp_sock *tp = tcp_sk(sk);
- const struct inet_connection_sock *icsk = inet_csk(sk);
tp->prior_ssthresh = 0;
tp->bytes_acked = 0;
- if (icsk->icsk_ca_state < TCP_CA_CWR) {
+ if (inet_csk(sk)->icsk_ca_state < TCP_CA_CWR) {
tp->undo_marker = 0;
- if (set_ssthresh)
- tp->snd_ssthresh = icsk->icsk_ca_ops->ssthresh(sk);
- tp->snd_cwnd = min(tp->snd_cwnd,
- tcp_packets_in_flight(tp) + 1U);
- tp->snd_cwnd_cnt = 0;
- tp->high_seq = tp->snd_nxt;
- tp->snd_cwnd_stamp = tcp_time_stamp;
- TCP_ECN_queue_cwr(tp);
-
+ tcp_init_cwnd_reduction(sk, set_ssthresh);
tcp_set_ca_state(sk, TCP_CA_CWR);
}
}
@@ -2787,7 +2759,7 @@ static void tcp_try_keep_open(struct sock *sk)
}
}
-static void tcp_try_to_open(struct sock *sk, int flag)
+static void tcp_try_to_open(struct sock *sk, int flag, int newly_acked_sacked)
{
struct tcp_sock *tp = tcp_sk(sk);
@@ -2804,7 +2776,7 @@ static void tcp_try_to_open(struct sock *sk, int flag)
if (inet_csk(sk)->icsk_ca_state != TCP_CA_Open)
tcp_moderate_cwnd(tp);
} else {
- tcp_cwnd_down(sk, flag);
+ tcp_cwnd_reduction(sk, newly_acked_sacked, 0);
}
}
@@ -2898,7 +2870,6 @@ static void tcp_enter_recovery(struct sock *sk, bool ece_ack)
NET_INC_STATS_BH(sock_net(sk), mib_idx);
- tp->high_seq = tp->snd_nxt;
tp->prior_ssthresh = 0;
tp->undo_marker = tp->snd_una;
tp->undo_retrans = tp->retrans_out;
@@ -2906,15 +2877,8 @@ static void tcp_enter_recovery(struct sock *sk, bool ece_ack)
if (inet_csk(sk)->icsk_ca_state < TCP_CA_CWR) {
if (!ece_ack)
tp->prior_ssthresh = tcp_current_ssthresh(sk);
- tp->snd_ssthresh = inet_csk(sk)->icsk_ca_ops->ssthresh(sk);
- TCP_ECN_queue_cwr(tp);
+ tcp_init_cwnd_reduction(sk, true);
}
-
- tp->bytes_acked = 0;
- tp->snd_cwnd_cnt = 0;
- tp->prior_cwnd = tp->snd_cwnd;
- tp->prr_delivered = 0;
- tp->prr_out = 0;
tcp_set_ca_state(sk, TCP_CA_Recovery);
}
@@ -2974,7 +2938,7 @@ static void tcp_fastretrans_alert(struct sock *sk, int pkts_acked,
/* CWR is to be held something *above* high_seq
* is ACKed for CWR bit to reach receiver. */
if (tp->snd_una != tp->high_seq) {
- tcp_complete_cwr(sk);
+ tcp_end_cwnd_reduction(sk);
tcp_set_ca_state(sk, TCP_CA_Open);
}
break;
@@ -2984,7 +2948,7 @@ static void tcp_fastretrans_alert(struct sock *sk, int pkts_acked,
tcp_reset_reno_sack(tp);
if (tcp_try_undo_recovery(sk))
return;
- tcp_complete_cwr(sk);
+ tcp_end_cwnd_reduction(sk);
break;
}
}
@@ -3025,7 +2989,7 @@ static void tcp_fastretrans_alert(struct sock *sk, int pkts_acked,
tcp_try_undo_dsack(sk);
if (!tcp_time_to_recover(sk, flag)) {
- tcp_try_to_open(sk, flag);
+ tcp_try_to_open(sk, flag, newly_acked_sacked);
return;
}
@@ -3047,8 +3011,7 @@ static void tcp_fastretrans_alert(struct sock *sk, int pkts_acked,
if (do_lost || (tcp_is_fack(tp) && tcp_head_timedout(sk)))
tcp_update_scoreboard(sk, fast_rexmit);
- tp->prr_delivered += newly_acked_sacked;
- tcp_update_cwnd_in_recovery(sk, newly_acked_sacked, fast_rexmit, flag);
+ tcp_cwnd_reduction(sk, newly_acked_sacked, fast_rexmit);
tcp_xmit_retransmit_queue(sk);
}
@@ -3394,7 +3357,7 @@ static inline bool tcp_may_raise_cwnd(const struct sock *sk, const int flag)
{
const struct tcp_sock *tp = tcp_sk(sk);
return (!(flag & FLAG_ECE) || tp->snd_cwnd < tp->snd_ssthresh) &&
- !((1 << inet_csk(sk)->icsk_ca_state) & (TCPF_CA_Recovery | TCPF_CA_CWR));
+ !tcp_in_cwnd_reduction(sk);
}
/* Check that window update is acceptable.
@@ -3462,9 +3425,9 @@ static void tcp_conservative_spur_to_response(struct tcp_sock *tp)
}
/* A conservative spurious RTO response algorithm: reduce cwnd using
- * rate halving and continue in congestion avoidance.
+ * PRR and continue in congestion avoidance.
*/
-static void tcp_ratehalving_spur_to_response(struct sock *sk)
+static void tcp_cwr_spur_to_response(struct sock *sk)
{
tcp_enter_cwr(sk, 0);
}
@@ -3472,7 +3435,7 @@ static void tcp_ratehalving_spur_to_response(struct sock *sk)
static void tcp_undo_spur_to_response(struct sock *sk, int flag)
{
if (flag & FLAG_ECE)
- tcp_ratehalving_spur_to_response(sk);
+ tcp_cwr_spur_to_response(sk);
else
tcp_undo_cwr(sk, true);
}
@@ -3579,7 +3542,7 @@ static bool tcp_process_frto(struct sock *sk, int flag)
tcp_conservative_spur_to_response(tp);
break;
default:
- tcp_ratehalving_spur_to_response(sk);
+ tcp_cwr_spur_to_response(sk);
break;
}
tp->frto_counter = 0;
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 9383b51..cfe6ffe 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -2037,10 +2037,10 @@ static bool tcp_write_xmit(struct sock *sk, unsigned int mss_now, int nonagle,
if (push_one)
break;
}
- if (inet_csk(sk)->icsk_ca_state == TCP_CA_Recovery)
- tp->prr_out += sent_pkts;
if (likely(sent_pkts)) {
+ if (tcp_in_cwnd_reduction(sk))
+ tp->prr_out += sent_pkts;
tcp_cwnd_validate(sk);
return false;
}
@@ -2542,7 +2542,7 @@ begin_fwd:
}
NET_INC_STATS_BH(sock_net(sk), mib_idx);
- if (inet_csk(sk)->icsk_ca_state == TCP_CA_Recovery)
+ if (tcp_in_cwnd_reduction(sk))
tp->prr_out += tcp_skb_pcount(skb);
if (skb == tcp_write_queue_head(sk))
--
1.7.7.3
^ permalink raw reply related
* [PATCH 1/3] tcp: move tcp_enter_cwr()
From: Yuchung Cheng @ 2012-09-03 3:38 UTC (permalink / raw)
To: davem, ncardwell, nanditad; +Cc: mattmathis, edumazet, netdev, Yuchung Cheng
To prepare replacing rate halving with PRR algorithm in CWR state.
Signed-off-by: Yuchung Cheng <ycheng@google.com>
---
net/ipv4/tcp_input.c | 46 +++++++++++++++++++++++-----------------------
1 files changed, 23 insertions(+), 23 deletions(-)
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 8c304a4..3ab0c75 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -743,29 +743,6 @@ __u32 tcp_init_cwnd(const struct tcp_sock *tp, const struct dst_entry *dst)
return min_t(__u32, cwnd, tp->snd_cwnd_clamp);
}
-/* Set slow start threshold and cwnd not falling to slow start */
-void tcp_enter_cwr(struct sock *sk, const int set_ssthresh)
-{
- struct tcp_sock *tp = tcp_sk(sk);
- const struct inet_connection_sock *icsk = inet_csk(sk);
-
- tp->prior_ssthresh = 0;
- tp->bytes_acked = 0;
- if (icsk->icsk_ca_state < TCP_CA_CWR) {
- tp->undo_marker = 0;
- if (set_ssthresh)
- tp->snd_ssthresh = icsk->icsk_ca_ops->ssthresh(sk);
- tp->snd_cwnd = min(tp->snd_cwnd,
- tcp_packets_in_flight(tp) + 1U);
- tp->snd_cwnd_cnt = 0;
- tp->high_seq = tp->snd_nxt;
- tp->snd_cwnd_stamp = tcp_time_stamp;
- TCP_ECN_queue_cwr(tp);
-
- tcp_set_ca_state(sk, TCP_CA_CWR);
- }
-}
-
/*
* Packet counting of FACK is based on in-order assumptions, therefore TCP
* disables it when reordering is detected
@@ -2741,6 +2718,29 @@ static inline void tcp_complete_cwr(struct sock *sk)
tcp_ca_event(sk, CA_EVENT_COMPLETE_CWR);
}
+/* Set slow start threshold and cwnd not falling to slow start */
+void tcp_enter_cwr(struct sock *sk, const int set_ssthresh)
+{
+ struct tcp_sock *tp = tcp_sk(sk);
+ const struct inet_connection_sock *icsk = inet_csk(sk);
+
+ tp->prior_ssthresh = 0;
+ tp->bytes_acked = 0;
+ if (icsk->icsk_ca_state < TCP_CA_CWR) {
+ tp->undo_marker = 0;
+ if (set_ssthresh)
+ tp->snd_ssthresh = icsk->icsk_ca_ops->ssthresh(sk);
+ tp->snd_cwnd = min(tp->snd_cwnd,
+ tcp_packets_in_flight(tp) + 1U);
+ tp->snd_cwnd_cnt = 0;
+ tp->high_seq = tp->snd_nxt;
+ tp->snd_cwnd_stamp = tcp_time_stamp;
+ TCP_ECN_queue_cwr(tp);
+
+ tcp_set_ca_state(sk, TCP_CA_CWR);
+ }
+}
+
static void tcp_try_keep_open(struct sock *sk)
{
struct tcp_sock *tp = tcp_sk(sk);
--
1.7.7.3
^ permalink raw reply related
* Re: [BUG] TIPC handling of -ERESTARTSYS in connect()
From: Ying Xue @ 2012-09-03 3:16 UTC (permalink / raw)
To: Chris Friesen; +Cc: netdev, Allan Stephens, Jon Maloy
In-Reply-To: <5040D5D9.4040802@genband.com>
[-- Attachment #1: Type: text/plain, Size: 448 bytes --]
>>
>> It looks like the case where we come in with "sock->state ==
>> SS_LISTENING" has changed. Previously we would return -EOPNOTSUPP but
>> now it'll be -EINVAL. Is that intentional?
>
> Just noticed something else. In the SS_CONNECTING case I don't think
> there's any point in setting "res = -EALREADY" since a bit further
> down it gets set unconditionally anyway.
>
Yes, you are right. Please check v2 version.
Thanks,
Ying
> Chris
>
[-- Attachment #2: 0001-tipc-correct-return-value-of-connect-when-it-is-inte.patch --]
[-- Type: text/x-diff, Size: 4437 bytes --]
>From 8c4606c4dbb1cf47f357cc3620ba5b70b433b670 Mon Sep 17 00:00:00 2001
From: Ying Xue <ying.xue@windriver.com>
Date: Mon, 3 Sep 2012 10:29:28 +0800
Subject: [PATCH v2] tipc: correct return value of connect() when it is interrupted by a signal
When connect() is interrupted by a signal, it returns -ERESTARTSYS
error code, meanwhile socket state is also changed to SS_DISCONNECTING
state. If connect() syscall exits from kernel space to userspace,
kernel will deal with the pending signal. Once seeing the -ERESTARTSYS,
it will retry this syscall again. As a result, finally userspace gets
-EISCONN error code from connect() at the moment.
Correct behavior of connect() is that we should not set socket state
SS_DISCONNECTING when signal happens, but it may return a successful
code to userspace after signal is handled.
Signed-off-by: Ying Xue <ying.xue@windriver.com>
---
net/tipc/tipc_socket.c | 109 +++++++++++++++++++++++++-----------------------
1 files changed, 57 insertions(+), 52 deletions(-)
diff --git a/net/tipc/tipc_socket.c b/net/tipc/tipc_socket.c
index c135edf..d800087 100644
--- a/net/tipc/tipc_socket.c
+++ b/net/tipc/tipc_socket.c
@@ -1456,21 +1456,6 @@ static int connect(struct socket *sock, struct sockaddr *dest, int destlen,
goto exit;
}
- /* Issue Posix-compliant error code if socket is in the wrong state */
-
- if (sock->state == SS_LISTENING) {
- res = -EOPNOTSUPP;
- goto exit;
- }
- if (sock->state == SS_CONNECTING) {
- res = -EALREADY;
- goto exit;
- }
- if (sock->state != SS_UNCONNECTED) {
- res = -EISCONN;
- goto exit;
- }
-
/*
* Reject connection attempt using multicast address
*
@@ -1483,54 +1468,74 @@ static int connect(struct socket *sock, struct sockaddr *dest, int destlen,
goto exit;
}
- /* Reject any messages already in receive queue (very unlikely) */
-
- reject_rx_queue(sk);
+ switch (sock->state) {
+ case SS_UNCONNECTED:
+ /* Reject any messages already in receive queue
+ * (very unlikely) */
+ reject_rx_queue(sk);
- /* Send a 'SYN-' to destination */
+ /* Send a 'SYN-' to destination */
+ m.msg_name = dest;
+ m.msg_namelen = destlen;
+ res = send_msg(NULL, sock, &m, 0);
+ if (res < 0) {
+ goto exit;
+ }
- m.msg_name = dest;
- m.msg_namelen = destlen;
- res = send_msg(NULL, sock, &m, 0);
- if (res < 0) {
- goto exit;
+ /*
+ * Just entered SS_CONNECTING state; the only
+ * difference is that return value in non-blocking
+ * case is EINPROGRESS, rather than EALREADY.
+ */
+ res = -EINPROGRESS;
+ break;
+ case SS_LISTENING:
+ res = -EOPNOTSUPP;
+ break;
+ case SS_CONNECTED:
+ res = -EISCONN;
+ break;
+ default:
+ res = -EINVAL;
+ break;
}
- /* Wait until an 'ACK' or 'RST' arrives, or a timeout occurs */
-
- timeout = tipc_sk(sk)->conn_timeout;
- release_sock(sk);
- res = wait_event_interruptible_timeout(*sk->sk_sleep,
- (!skb_queue_empty(&sk->sk_receive_queue) ||
- (sock->state != SS_CONNECTING)),
- timeout ? (long)msecs_to_jiffies(timeout)
- : MAX_SCHEDULE_TIMEOUT);
- lock_sock(sk);
+ if (sock->state == SS_CONNECTING) {
+ /* Wait until an 'ACK' or 'RST' arrives, or a timeout occurs */
+ timeout = tipc_sk(sk)->conn_timeout;
+ release_sock(sk);
+ res = wait_event_interruptible_timeout(*sk->sk_sleep,
+ (!skb_queue_empty(&sk->sk_receive_queue) ||
+ (sock->state != SS_CONNECTING)),
+ timeout ? (long)msecs_to_jiffies(timeout)
+ : MAX_SCHEDULE_TIMEOUT);
+ lock_sock(sk);
- if (res > 0) {
- buf = skb_peek(&sk->sk_receive_queue);
- if (buf != NULL) {
- msg = buf_msg(buf);
- res = auto_connect(sock, msg);
- if (!res) {
- if (!msg_data_sz(msg))
- advance_rx_queue(sk);
+ if (res > 0) {
+ buf = skb_peek(&sk->sk_receive_queue);
+ if (buf != NULL) {
+ msg = buf_msg(buf);
+ res = auto_connect(sock, msg);
+ if (!res) {
+ if (!msg_data_sz(msg))
+ advance_rx_queue(sk);
+ }
+ } else {
+ if (sock->state == SS_CONNECTED) {
+ res = -EISCONN;
+ } else {
+ res = -ECONNREFUSED;
+ }
}
} else {
- if (sock->state == SS_CONNECTED) {
- res = -EISCONN;
+ if (res == 0) {
+ sock->state = SS_DISCONNECTING;
+ res = -ETIMEDOUT;
} else {
- res = -ECONNREFUSED;
+ ; /* leave "res" unchanged */
}
}
- } else {
- if (res == 0)
- res = -ETIMEDOUT;
- else
- ; /* leave "res" unchanged */
- sock->state = SS_DISCONNECTING;
}
-
exit:
release_sock(sk);
return res;
--
1.7.1
^ permalink raw reply related
* [PATCH v7 1/1] ieee802154: MRF24J40 driver
From: Alan Ott @ 2012-09-03 1:44 UTC (permalink / raw)
To: Alexander Smirnov, Dmitry Eremin-Solenikov, David S. Miller,
Tony Cheneau
Cc: Alan Ott, netdev-u79uwXL29TY76Z2rM5mHXA,
linux-kernel-u79uwXL29TY76Z2rM5mHXA,
linux-zigbee-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
In-Reply-To: <1346636653-28903-1-git-send-email-alan-yzvJWuRpmD1zbRFIqnYvSA@public.gmane.org>
Driver for the Microchip MRF24J40 802.15.4 WPAN module.
Signed-off-by: Alan Ott <alan-yzvJWuRpmD1zbRFIqnYvSA@public.gmane.org>
---
drivers/net/ieee802154/Kconfig | 11 +
drivers/net/ieee802154/Makefile | 1 +
drivers/net/ieee802154/mrf24j40.c | 767 ++++++++++++++++++++++++++++++++++++++
3 files changed, 779 insertions(+)
create mode 100644 drivers/net/ieee802154/mrf24j40.c
diff --git a/drivers/net/ieee802154/Kconfig b/drivers/net/ieee802154/Kconfig
index 1fc4eef..08ae465 100644
--- a/drivers/net/ieee802154/Kconfig
+++ b/drivers/net/ieee802154/Kconfig
@@ -34,3 +34,14 @@ config IEEE802154_AT86RF230
depends on IEEE802154_DRIVERS && MAC802154
tristate "AT86RF230/231 transceiver driver"
depends on SPI
+
+config IEEE802154_MRF24J40
+ tristate "Microchip MRF24J40 transceiver driver"
+ depends on IEEE802154_DRIVERS && MAC802154
+ depends on SPI
+ ---help---
+ Say Y here to enable the MRF24J20 SPI 802.15.4 wireless
+ controller.
+
+ This driver can also be built as a module. To do so, say M here.
+ the module will be called 'mrf24j40'.
diff --git a/drivers/net/ieee802154/Makefile b/drivers/net/ieee802154/Makefile
index 4f4371d..abb0c08 100644
--- a/drivers/net/ieee802154/Makefile
+++ b/drivers/net/ieee802154/Makefile
@@ -1,3 +1,4 @@
obj-$(CONFIG_IEEE802154_FAKEHARD) += fakehard.o
obj-$(CONFIG_IEEE802154_FAKELB) += fakelb.o
obj-$(CONFIG_IEEE802154_AT86RF230) += at86rf230.o
+obj-$(CONFIG_IEEE802154_MRF24J40) += mrf24j40.o
diff --git a/drivers/net/ieee802154/mrf24j40.c b/drivers/net/ieee802154/mrf24j40.c
new file mode 100644
index 0000000..0e53d4f
--- /dev/null
+++ b/drivers/net/ieee802154/mrf24j40.c
@@ -0,0 +1,767 @@
+/*
+ * Driver for Microchip MRF24J40 802.15.4 Wireless-PAN Networking controller
+ *
+ * Copyright (C) 2012 Alan Ott <alan-yzvJWuRpmD1zbRFIqnYvSA@public.gmane.org>
+ * Signal 11 Software
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
+ */
+
+#include <linux/spi/spi.h>
+#include <linux/interrupt.h>
+#include <linux/module.h>
+#include <net/wpan-phy.h>
+#include <net/mac802154.h>
+
+/* MRF24J40 Short Address Registers */
+#define REG_RXMCR 0x00 /* Receive MAC control */
+#define REG_PANIDL 0x01 /* PAN ID (low) */
+#define REG_PANIDH 0x02 /* PAN ID (high) */
+#define REG_SADRL 0x03 /* Short address (low) */
+#define REG_SADRH 0x04 /* Short address (high) */
+#define REG_EADR0 0x05 /* Long address (low) (high is EADR7) */
+#define REG_TXMCR 0x11 /* Transmit MAC control */
+#define REG_PACON0 0x16 /* Power Amplifier Control */
+#define REG_PACON1 0x17 /* Power Amplifier Control */
+#define REG_PACON2 0x18 /* Power Amplifier Control */
+#define REG_TXNCON 0x1B /* Transmit Normal FIFO Control */
+#define REG_TXSTAT 0x24 /* TX MAC Status Register */
+#define REG_SOFTRST 0x2A /* Soft Reset */
+#define REG_TXSTBL 0x2E /* TX Stabilization */
+#define REG_INTSTAT 0x31 /* Interrupt Status */
+#define REG_INTCON 0x32 /* Interrupt Control */
+#define REG_RFCTL 0x36 /* RF Control Mode Register */
+#define REG_BBREG1 0x39 /* Baseband Registers */
+#define REG_BBREG2 0x3A /* */
+#define REG_BBREG6 0x3E /* */
+#define REG_CCAEDTH 0x3F /* Energy Detection Threshold */
+
+/* MRF24J40 Long Address Registers */
+#define REG_RFCON0 0x200 /* RF Control Registers */
+#define REG_RFCON1 0x201
+#define REG_RFCON2 0x202
+#define REG_RFCON3 0x203
+#define REG_RFCON5 0x205
+#define REG_RFCON6 0x206
+#define REG_RFCON7 0x207
+#define REG_RFCON8 0x208
+#define REG_RSSI 0x210
+#define REG_SLPCON0 0x211 /* Sleep Clock Control Registers */
+#define REG_SLPCON1 0x220
+#define REG_WAKETIMEL 0x222 /* Wake-up Time Match Value Low */
+#define REG_WAKETIMEH 0x223 /* Wake-up Time Match Value High */
+#define REG_RX_FIFO 0x300 /* Receive FIFO */
+
+/* Device configuration: Only channels 11-26 on page 0 are supported. */
+#define MRF24J40_CHAN_MIN 11
+#define MRF24J40_CHAN_MAX 26
+#define CHANNEL_MASK (((u32)1 << (MRF24J40_CHAN_MAX + 1)) \
+ - ((u32)1 << MRF24J40_CHAN_MIN))
+
+#define TX_FIFO_SIZE 128 /* From datasheet */
+#define RX_FIFO_SIZE 144 /* From datasheet */
+#define SET_CHANNEL_DELAY_US 192 /* From datasheet */
+
+/* Device Private Data */
+struct mrf24j40 {
+ struct spi_device *spi;
+ struct ieee802154_dev *dev;
+
+ struct mutex buffer_mutex; /* only used to protect buf */
+ struct completion tx_complete;
+ struct work_struct irqwork;
+ u8 *buf; /* 3 bytes. Used for SPI single-register transfers. */
+};
+
+/* Read/Write SPI Commands for Short and Long Address registers. */
+#define MRF24J40_READSHORT(reg) ((reg) << 1)
+#define MRF24J40_WRITESHORT(reg) ((reg) << 1 | 1)
+#define MRF24J40_READLONG(reg) (1 << 15 | (reg) << 5)
+#define MRF24J40_WRITELONG(reg) (1 << 15 | (reg) << 5 | 1 << 4)
+
+/* Maximum speed to run the device at. TODO: Get the real max value from
+ * someone at Microchip since it isn't in the datasheet. */
+#define MAX_SPI_SPEED_HZ 1000000
+
+#define printdev(X) (&X->spi->dev)
+
+static int write_short_reg(struct mrf24j40 *devrec, u8 reg, u8 value)
+{
+ int ret;
+ struct spi_message msg;
+ struct spi_transfer xfer = {
+ .len = 2,
+ .tx_buf = devrec->buf,
+ .rx_buf = devrec->buf,
+ };
+
+ spi_message_init(&msg);
+ spi_message_add_tail(&xfer, &msg);
+
+ mutex_lock(&devrec->buffer_mutex);
+ devrec->buf[0] = MRF24J40_WRITESHORT(reg);
+ devrec->buf[1] = value;
+
+ ret = spi_sync(devrec->spi, &msg);
+ if (ret)
+ dev_err(printdev(devrec),
+ "SPI write Failed for short register 0x%hhx\n", reg);
+
+ mutex_unlock(&devrec->buffer_mutex);
+ return ret;
+}
+
+static int read_short_reg(struct mrf24j40 *devrec, u8 reg, u8 *val)
+{
+ int ret = -1;
+ struct spi_message msg;
+ struct spi_transfer xfer = {
+ .len = 2,
+ .tx_buf = devrec->buf,
+ .rx_buf = devrec->buf,
+ };
+
+ spi_message_init(&msg);
+ spi_message_add_tail(&xfer, &msg);
+
+ mutex_lock(&devrec->buffer_mutex);
+ devrec->buf[0] = MRF24J40_READSHORT(reg);
+ devrec->buf[1] = 0;
+
+ ret = spi_sync(devrec->spi, &msg);
+ if (ret)
+ dev_err(printdev(devrec),
+ "SPI read Failed for short register 0x%hhx\n", reg);
+ else
+ *val = devrec->buf[1];
+
+ mutex_unlock(&devrec->buffer_mutex);
+ return ret;
+}
+
+static int read_long_reg(struct mrf24j40 *devrec, u16 reg, u8 *value)
+{
+ int ret;
+ u16 cmd;
+ struct spi_message msg;
+ struct spi_transfer xfer = {
+ .len = 3,
+ .tx_buf = devrec->buf,
+ .rx_buf = devrec->buf,
+ };
+
+ spi_message_init(&msg);
+ spi_message_add_tail(&xfer, &msg);
+
+ cmd = MRF24J40_READLONG(reg);
+ mutex_lock(&devrec->buffer_mutex);
+ devrec->buf[0] = cmd >> 8 & 0xff;
+ devrec->buf[1] = cmd & 0xff;
+ devrec->buf[2] = 0;
+
+ ret = spi_sync(devrec->spi, &msg);
+ if (ret)
+ dev_err(printdev(devrec),
+ "SPI read Failed for long register 0x%hx\n", reg);
+ else
+ *value = devrec->buf[2];
+
+ mutex_unlock(&devrec->buffer_mutex);
+ return ret;
+}
+
+static int write_long_reg(struct mrf24j40 *devrec, u16 reg, u8 val)
+{
+ int ret;
+ u16 cmd;
+ struct spi_message msg;
+ struct spi_transfer xfer = {
+ .len = 3,
+ .tx_buf = devrec->buf,
+ .rx_buf = devrec->buf,
+ };
+
+ spi_message_init(&msg);
+ spi_message_add_tail(&xfer, &msg);
+
+ cmd = MRF24J40_WRITELONG(reg);
+ mutex_lock(&devrec->buffer_mutex);
+ devrec->buf[0] = cmd >> 8 & 0xff;
+ devrec->buf[1] = cmd & 0xff;
+ devrec->buf[2] = val;
+
+ ret = spi_sync(devrec->spi, &msg);
+ if (ret)
+ dev_err(printdev(devrec),
+ "SPI write Failed for long register 0x%hx\n", reg);
+
+ mutex_unlock(&devrec->buffer_mutex);
+ return ret;
+}
+
+/* This function relies on an undocumented write method. Once a write command
+ and address is set, as many bytes of data as desired can be clocked into
+ the device. The datasheet only shows setting one byte at a time. */
+static int write_tx_buf(struct mrf24j40 *devrec, u16 reg,
+ const u8 *data, size_t length)
+{
+ int ret;
+ u16 cmd;
+ u8 lengths[2];
+ struct spi_message msg;
+ struct spi_transfer addr_xfer = {
+ .len = 2,
+ .tx_buf = devrec->buf,
+ };
+ struct spi_transfer lengths_xfer = {
+ .len = 2,
+ .tx_buf = &lengths, /* TODO: Is DMA really required for SPI? */
+ };
+ struct spi_transfer data_xfer = {
+ .len = length,
+ .tx_buf = data,
+ };
+
+ /* Range check the length. 2 bytes are used for the length fields.*/
+ if (length > TX_FIFO_SIZE-2) {
+ dev_err(printdev(devrec), "write_tx_buf() was passed too large a buffer. Performing short write.\n");
+ length = TX_FIFO_SIZE-2;
+ }
+
+ spi_message_init(&msg);
+ spi_message_add_tail(&addr_xfer, &msg);
+ spi_message_add_tail(&lengths_xfer, &msg);
+ spi_message_add_tail(&data_xfer, &msg);
+
+ cmd = MRF24J40_WRITELONG(reg);
+ mutex_lock(&devrec->buffer_mutex);
+ devrec->buf[0] = cmd >> 8 & 0xff;
+ devrec->buf[1] = cmd & 0xff;
+ lengths[0] = 0x0; /* Header Length. Set to 0 for now. TODO */
+ lengths[1] = length; /* Total length */
+
+ ret = spi_sync(devrec->spi, &msg);
+ if (ret)
+ dev_err(printdev(devrec), "SPI write Failed for TX buf\n");
+
+ mutex_unlock(&devrec->buffer_mutex);
+ return ret;
+}
+
+static int mrf24j40_read_rx_buf(struct mrf24j40 *devrec,
+ u8 *data, u8 *len, u8 *lqi)
+{
+ u8 rx_len;
+ u8 addr[2];
+ u8 lqi_rssi[2];
+ u16 cmd;
+ int ret;
+ struct spi_message msg;
+ struct spi_transfer addr_xfer = {
+ .len = 2,
+ .tx_buf = &addr,
+ };
+ struct spi_transfer data_xfer = {
+ .len = 0x0, /* set below */
+ .rx_buf = data,
+ };
+ struct spi_transfer status_xfer = {
+ .len = 2,
+ .rx_buf = &lqi_rssi,
+ };
+
+ /* Get the length of the data in the RX FIFO. The length in this
+ * register exclues the 1-byte length field at the beginning. */
+ ret = read_long_reg(devrec, REG_RX_FIFO, &rx_len);
+ if (ret)
+ goto out;
+
+ /* Range check the RX FIFO length, accounting for the one-byte
+ * length field at the begining. */
+ if (rx_len > RX_FIFO_SIZE-1) {
+ dev_err(printdev(devrec), "Invalid length read from device. Performing short read.\n");
+ rx_len = RX_FIFO_SIZE-1;
+ }
+
+ if (rx_len > *len) {
+ /* Passed in buffer wasn't big enough. Should never happen. */
+ dev_err(printdev(devrec), "Buffer not big enough. Performing short read\n");
+ rx_len = *len;
+ }
+
+ /* Set up the commands to read the data. */
+ cmd = MRF24J40_READLONG(REG_RX_FIFO+1);
+ addr[0] = cmd >> 8 & 0xff;
+ addr[1] = cmd & 0xff;
+ data_xfer.len = rx_len;
+
+ spi_message_init(&msg);
+ spi_message_add_tail(&addr_xfer, &msg);
+ spi_message_add_tail(&data_xfer, &msg);
+ spi_message_add_tail(&status_xfer, &msg);
+
+ ret = spi_sync(devrec->spi, &msg);
+ if (ret) {
+ dev_err(printdev(devrec), "SPI RX Buffer Read Failed.\n");
+ goto out;
+ }
+
+ *lqi = lqi_rssi[0];
+ *len = rx_len;
+
+#ifdef DEBUG
+ print_hex_dump(KERN_DEBUG, "mrf24j40 rx: ",
+ DUMP_PREFIX_OFFSET, 16, 1, data, *len, 0);
+ printk(KERN_DEBUG "mrf24j40 rx: lqi: %02hhx rssi: %02hhx\n",
+ lqi_rssi[0], lqi_rssi[1]);
+#endif
+
+out:
+ return ret;
+}
+
+static int mrf24j40_tx(struct ieee802154_dev *dev, struct sk_buff *skb)
+{
+ struct mrf24j40 *devrec = dev->priv;
+ u8 val;
+ int ret = 0;
+
+ dev_dbg(printdev(devrec), "tx packet of %d bytes\n", skb->len);
+
+ ret = write_tx_buf(devrec, 0x000, skb->data, skb->len);
+ if (ret)
+ goto err;
+
+ /* Set TXNTRIG bit of TXNCON to send packet */
+ ret = read_short_reg(devrec, REG_TXNCON, &val);
+ if (ret)
+ goto err;
+ val |= 0x1;
+ val &= ~0x4;
+ write_short_reg(devrec, REG_TXNCON, val);
+
+ INIT_COMPLETION(devrec->tx_complete);
+
+ /* Wait for the device to send the TX complete interrupt. */
+ ret = wait_for_completion_interruptible_timeout(
+ &devrec->tx_complete,
+ 5 * HZ);
+ if (ret == -ERESTARTSYS)
+ goto err;
+ if (ret == 0) {
+ ret = -ETIMEDOUT;
+ goto err;
+ }
+
+ /* Check for send error from the device. */
+ ret = read_short_reg(devrec, REG_TXSTAT, &val);
+ if (ret)
+ goto err;
+ if (val & 0x1) {
+ dev_err(printdev(devrec), "Error Sending. Retry count exceeded\n");
+ ret = -ECOMM; /* TODO: Better error code ? */
+ } else
+ dev_dbg(printdev(devrec), "Packet Sent\n");
+
+err:
+
+ return ret;
+}
+
+static int mrf24j40_ed(struct ieee802154_dev *dev, u8 *level)
+{
+ /* TODO: */
+ printk(KERN_WARNING "mrf24j40: ed not implemented\n");
+ *level = 0;
+ return 0;
+}
+
+static int mrf24j40_start(struct ieee802154_dev *dev)
+{
+ struct mrf24j40 *devrec = dev->priv;
+ u8 val;
+ int ret;
+
+ dev_dbg(printdev(devrec), "start\n");
+
+ ret = read_short_reg(devrec, REG_INTCON, &val);
+ if (ret)
+ return ret;
+ val &= ~(0x1|0x8); /* Clear TXNIE and RXIE. Enable interrupts */
+ write_short_reg(devrec, REG_INTCON, val);
+
+ return 0;
+}
+
+static void mrf24j40_stop(struct ieee802154_dev *dev)
+{
+ struct mrf24j40 *devrec = dev->priv;
+ u8 val;
+ int ret;
+ dev_dbg(printdev(devrec), "stop\n");
+
+ ret = read_short_reg(devrec, REG_INTCON, &val);
+ if (ret)
+ return;
+ val |= 0x1|0x8; /* Set TXNIE and RXIE. Disable Interrupts */
+ write_short_reg(devrec, REG_INTCON, val);
+
+ return;
+}
+
+static int mrf24j40_set_channel(struct ieee802154_dev *dev,
+ int page, int channel)
+{
+ struct mrf24j40 *devrec = dev->priv;
+ u8 val;
+ int ret;
+
+ dev_dbg(printdev(devrec), "Set Channel %d\n", channel);
+
+ WARN_ON(page != 0);
+ WARN_ON(channel < MRF24J40_CHAN_MIN);
+ WARN_ON(channel > MRF24J40_CHAN_MAX);
+
+ /* Set Channel TODO */
+ val = (channel-11) << 4 | 0x03;
+ write_long_reg(devrec, REG_RFCON0, val);
+
+ /* RF Reset */
+ ret = read_short_reg(devrec, REG_RFCTL, &val);
+ if (ret)
+ return ret;
+ val |= 0x04;
+ write_short_reg(devrec, REG_RFCTL, val);
+ val &= ~0x04;
+ write_short_reg(devrec, REG_RFCTL, val);
+
+ udelay(SET_CHANNEL_DELAY_US); /* per datasheet */
+
+ return 0;
+}
+
+static int mrf24j40_filter(struct ieee802154_dev *dev,
+ struct ieee802154_hw_addr_filt *filt,
+ unsigned long changed)
+{
+ struct mrf24j40 *devrec = dev->priv;
+
+ dev_dbg(printdev(devrec), "filter\n");
+
+ if (changed & IEEE802515_AFILT_SADDR_CHANGED) {
+ /* Short Addr */
+ u8 addrh, addrl;
+ addrh = filt->short_addr >> 8 & 0xff;
+ addrl = filt->short_addr & 0xff;
+
+ write_short_reg(devrec, REG_SADRH, addrh);
+ write_short_reg(devrec, REG_SADRL, addrl);
+ dev_dbg(printdev(devrec),
+ "Set short addr to %04hx\n", filt->short_addr);
+ }
+
+ if (changed & IEEE802515_AFILT_IEEEADDR_CHANGED) {
+ /* Device Address */
+ int i;
+ for (i = 0; i < 8; i++)
+ write_short_reg(devrec, REG_EADR0+i,
+ filt->ieee_addr[i]);
+
+#ifdef DEBUG
+ printk(KERN_DEBUG "Set long addr to: ");
+ for (i = 0; i < 8; i++)
+ printk("%02hhx ", filt->ieee_addr[i]);
+ printk(KERN_DEBUG "\n");
+#endif
+ }
+
+ if (changed & IEEE802515_AFILT_PANID_CHANGED) {
+ /* PAN ID */
+ u8 panidl, panidh;
+ panidh = filt->pan_id >> 8 & 0xff;
+ panidl = filt->pan_id & 0xff;
+ write_short_reg(devrec, REG_PANIDH, panidh);
+ write_short_reg(devrec, REG_PANIDL, panidl);
+
+ dev_dbg(printdev(devrec), "Set PANID to %04hx\n", filt->pan_id);
+ }
+
+ if (changed & IEEE802515_AFILT_PANC_CHANGED) {
+ /* Pan Coordinator */
+ u8 val;
+ int ret;
+
+ ret = read_short_reg(devrec, REG_RXMCR, &val);
+ if (ret)
+ return ret;
+ if (filt->pan_coord)
+ val |= 0x8;
+ else
+ val &= ~0x8;
+ write_short_reg(devrec, REG_RXMCR, val);
+
+ /* REG_SLOTTED is maintained as default (unslotted/CSMA-CA).
+ * REG_ORDER is maintained as default (no beacon/superframe).
+ */
+
+ dev_dbg(printdev(devrec), "Set Pan Coord to %s\n",
+ filt->pan_coord ? "on" : "off");
+ }
+
+ return 0;
+}
+
+static int mrf24j40_handle_rx(struct mrf24j40 *devrec)
+{
+ u8 len = RX_FIFO_SIZE;
+ u8 lqi = 0;
+ u8 val;
+ int ret = 0;
+ struct sk_buff *skb;
+
+ /* Turn off reception of packets off the air. This prevents the
+ * device from overwriting the buffer while we're reading it. */
+ ret = read_short_reg(devrec, REG_BBREG1, &val);
+ if (ret)
+ goto out;
+ val |= 4; /* SET RXDECINV */
+ write_short_reg(devrec, REG_BBREG1, val);
+
+ skb = alloc_skb(len, GFP_KERNEL);
+ if (!skb) {
+ ret = -ENOMEM;
+ goto out;
+ }
+
+ ret = mrf24j40_read_rx_buf(devrec, skb_put(skb, len), &len, &lqi);
+ if (ret < 0) {
+ dev_err(printdev(devrec), "Failure reading RX FIFO\n");
+ kfree_skb(skb);
+ ret = -EINVAL;
+ goto out;
+ }
+
+ /* Cut off the checksum */
+ skb_trim(skb, len-2);
+
+ /* TODO: Other drivers call ieee20154_rx_irqsafe() here (eg: cc2040,
+ * also from a workqueue). I think irqsafe is not necessary here.
+ * Can someone confirm? */
+ ieee802154_rx_irqsafe(devrec->dev, skb, lqi);
+
+ dev_dbg(printdev(devrec), "RX Handled\n");
+
+out:
+ /* Turn back on reception of packets off the air. */
+ ret = read_short_reg(devrec, REG_BBREG1, &val);
+ if (ret)
+ return ret;
+ val &= ~0x4; /* Clear RXDECINV */
+ write_short_reg(devrec, REG_BBREG1, val);
+
+ return ret;
+}
+
+static struct ieee802154_ops mrf24j40_ops = {
+ .owner = THIS_MODULE,
+ .xmit = mrf24j40_tx,
+ .ed = mrf24j40_ed,
+ .start = mrf24j40_start,
+ .stop = mrf24j40_stop,
+ .set_channel = mrf24j40_set_channel,
+ .set_hw_addr_filt = mrf24j40_filter,
+};
+
+static irqreturn_t mrf24j40_isr(int irq, void *data)
+{
+ struct mrf24j40 *devrec = data;
+
+ disable_irq_nosync(irq);
+
+ schedule_work(&devrec->irqwork);
+
+ return IRQ_HANDLED;
+}
+
+static void mrf24j40_isrwork(struct work_struct *work)
+{
+ struct mrf24j40 *devrec = container_of(work, struct mrf24j40, irqwork);
+ u8 intstat;
+ int ret;
+
+ /* Read the interrupt status */
+ ret = read_short_reg(devrec, REG_INTSTAT, &intstat);
+ if (ret)
+ goto out;
+
+ /* Check for TX complete */
+ if (intstat & 0x1)
+ complete(&devrec->tx_complete);
+
+ /* Check for Rx */
+ if (intstat & 0x8)
+ mrf24j40_handle_rx(devrec);
+
+out:
+ enable_irq(devrec->spi->irq);
+}
+
+static int __devinit mrf24j40_probe(struct spi_device *spi)
+{
+ int ret = -ENOMEM;
+ u8 val;
+ struct mrf24j40 *devrec;
+
+ printk(KERN_INFO "mrf24j40: probe(). IRQ: %d\n", spi->irq);
+
+ devrec = kzalloc(sizeof(struct mrf24j40), GFP_KERNEL);
+ if (!devrec)
+ goto err_devrec;
+ devrec->buf = kzalloc(3, GFP_KERNEL);
+ if (!devrec->buf)
+ goto err_buf;
+
+ spi->mode = SPI_MODE_0; /* TODO: Is this appropriate for right here? */
+ if (spi->max_speed_hz > MAX_SPI_SPEED_HZ)
+ spi->max_speed_hz = MAX_SPI_SPEED_HZ;
+
+ mutex_init(&devrec->buffer_mutex);
+ init_completion(&devrec->tx_complete);
+ INIT_WORK(&devrec->irqwork, mrf24j40_isrwork);
+ devrec->spi = spi;
+ dev_set_drvdata(&spi->dev, devrec);
+
+ /* Register with the 802154 subsystem */
+
+ devrec->dev = ieee802154_alloc_device(0, &mrf24j40_ops);
+ if (!devrec->dev)
+ goto err_alloc_dev;
+
+ devrec->dev->priv = devrec;
+ devrec->dev->parent = &devrec->spi->dev;
+ devrec->dev->phy->channels_supported[0] = CHANNEL_MASK;
+ devrec->dev->flags = IEEE802154_HW_OMIT_CKSUM|IEEE802154_HW_AACK;
+
+ dev_dbg(printdev(devrec), "registered mrf24j40\n");
+ ret = ieee802154_register_device(devrec->dev);
+ if (ret)
+ goto err_register_device;
+
+ /* Initialize the device.
+ From datasheet section 3.2: Initialization. */
+ write_short_reg(devrec, REG_SOFTRST, 0x07);
+ write_short_reg(devrec, REG_PACON2, 0x98);
+ write_short_reg(devrec, REG_TXSTBL, 0x95);
+ write_long_reg(devrec, REG_RFCON0, 0x03);
+ write_long_reg(devrec, REG_RFCON1, 0x01);
+ write_long_reg(devrec, REG_RFCON2, 0x80);
+ write_long_reg(devrec, REG_RFCON6, 0x90);
+ write_long_reg(devrec, REG_RFCON7, 0x80);
+ write_long_reg(devrec, REG_RFCON8, 0x10);
+ write_long_reg(devrec, REG_SLPCON1, 0x21);
+ write_short_reg(devrec, REG_BBREG2, 0x80);
+ write_short_reg(devrec, REG_CCAEDTH, 0x60);
+ write_short_reg(devrec, REG_BBREG6, 0x40);
+ write_short_reg(devrec, REG_RFCTL, 0x04);
+ write_short_reg(devrec, REG_RFCTL, 0x0);
+ udelay(192);
+
+ /* Set RX Mode. RXMCR<1:0>: 0x0 normal, 0x1 promisc, 0x2 error */
+ ret = read_short_reg(devrec, REG_RXMCR, &val);
+ if (ret)
+ goto err_read_reg;
+ val &= ~0x3; /* Clear RX mode (normal) */
+ write_short_reg(devrec, REG_RXMCR, val);
+
+ ret = request_irq(spi->irq,
+ mrf24j40_isr,
+ IRQF_TRIGGER_FALLING,
+ dev_name(&spi->dev),
+ devrec);
+
+ if (ret) {
+ dev_err(printdev(devrec), "Unable to get IRQ");
+ goto err_irq;
+ }
+
+ return 0;
+
+err_irq:
+err_read_reg:
+ ieee802154_unregister_device(devrec->dev);
+err_register_device:
+ ieee802154_free_device(devrec->dev);
+err_alloc_dev:
+ kfree(devrec->buf);
+err_buf:
+ kfree(devrec);
+err_devrec:
+ return ret;
+}
+
+static int __devexit mrf24j40_remove(struct spi_device *spi)
+{
+ struct mrf24j40 *devrec = dev_get_drvdata(&spi->dev);
+
+ dev_dbg(printdev(devrec), "remove\n");
+
+ free_irq(spi->irq, devrec);
+ flush_work_sync(&devrec->irqwork); /* TODO: Is this the right call? */
+ ieee802154_unregister_device(devrec->dev);
+ ieee802154_free_device(devrec->dev);
+ /* TODO: Will ieee802154_free_device() wait until ->xmit() is
+ * complete? */
+
+ /* Clean up the SPI stuff. */
+ dev_set_drvdata(&spi->dev, NULL);
+ kfree(devrec->buf);
+ kfree(devrec);
+ return 0;
+}
+
+static const struct spi_device_id mrf24j40_ids[] = {
+ { "mrf24j40", 0 },
+ { "mrf24j40ma", 0 },
+ { },
+};
+MODULE_DEVICE_TABLE(spi, mrf24j40_ids);
+
+static struct spi_driver mrf24j40_driver = {
+ .driver = {
+ .name = "mrf24j40",
+ .bus = &spi_bus_type,
+ .owner = THIS_MODULE,
+ },
+ .id_table = mrf24j40_ids,
+ .probe = mrf24j40_probe,
+ .remove = __devexit_p(mrf24j40_remove),
+};
+
+static int __init mrf24j40_init(void)
+{
+ return spi_register_driver(&mrf24j40_driver);
+}
+
+static void __exit mrf24j40_exit(void)
+{
+ spi_unregister_driver(&mrf24j40_driver);
+}
+
+module_init(mrf24j40_init);
+module_exit(mrf24j40_exit);
+
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("Alan Ott");
+MODULE_DESCRIPTION("MRF24J40 SPI 802.15.4 Controller Driver");
--
1.7.11.2
------------------------------------------------------------------------------
Live Security Virtual Conference
Exclusive live event will cover all the ways today's security and
threat landscape has changed and how IT managers can respond. Discussions
will include endpoint security, mobile security and the latest in malware
threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/
^ permalink raw reply related
* [PATCH v7 0/1] Driver for Microchip MRF24J40
From: Alan Ott @ 2012-09-03 1:44 UTC (permalink / raw)
To: Alexander Smirnov, Dmitry Eremin-Solenikov, David S. Miller,
Tony Cheneau
Cc: linux-zigbee-devel, netdev, linux-kernel, Alan Ott
In-Reply-To: <1346300194-3286-1-git-send-email-alan@signal11.us>
This is a driver for the Microchip MRF24J40 802.15.4 transceiver.
This is the same as v6 except that it now puts the driver in
drivers/net/ieee802154 instead of drivers/ieee802154, since
drivers/ieee802154 was recently moved to drivers/net/ieee802154 in
0739d643b8dda866d1011bcf
Full history and discussion can be found on the archives of the linux-zigbee
mailing list:
http://www.mail-archive.com/linux-zigbee-devel@lists.sourceforge.net/msg01014.html
Alan Ott (1):
ieee802154: MRF24J40 driver
drivers/net/ieee802154/Kconfig | 11 +
drivers/net/ieee802154/Makefile | 1 +
drivers/net/ieee802154/mrf24j40.c | 767 ++++++++++++++++++++++++++++++++++++++
3 files changed, 779 insertions(+)
create mode 100644 drivers/net/ieee802154/mrf24j40.c
--
1.7.11.2
^ permalink raw reply
* Re: [PATCH v2 1/2] tcp: add generic netlink support for tcp_metrics
From: Eric Dumazet @ 2012-09-03 0:45 UTC (permalink / raw)
To: Julian Anastasov
Cc: David Miller, netdev, Stephen Hemminger, Paul E. McKenney
In-Reply-To: <1346564178-1794-2-git-send-email-ja@ssi.bg>
On Sun, 2012-09-02 at 08:36 +0300, Julian Anastasov wrote:
> +
> +static int tcp_metrics_flush_all(struct net *net)
> +{
> + unsigned int max_rows = 1U << net->ipv4.tcp_metrics_hash_log;
> + struct tcpm_hash_bucket *hb = net->ipv4.tcp_metrics_hash;
> + struct tcp_metrics_block *tm;
> + unsigned int sync_count = 0;
> + unsigned int row;
> +
> + for (row = 0; row < max_rows; row++, hb++) {
> + spin_lock_bh(&tcp_metrics_lock);
> + tm = deref_locked_genl(hb->chain);
> + if (tm)
> + hb->chain = NULL;
> + spin_unlock_bh(&tcp_metrics_lock);
> + while (tm) {
> + struct tcp_metrics_block *next;
> +
> + next = deref_genl(tm->tcpm_next);
> + kfree_rcu(tm, rcu_head);
> + if (!((++sync_count) & 2047))
> + synchronize_rcu();
> + tm = next;
> + }
> + }
> + return 0;
> +}
It looks like the synchronize_rcu() call is not exactly what you wanted,
but then net/ipv4/fib_trie.c has the same mistake.
What we want here is to force pending call_rcu() calls to complete, so
that we dont consume too much memory. So it would probably better to
call rcu_barrier() instead.
If other cpus are idle or outside of rcu read lock sections,
synchronize_rcu() should basically do nothing at all.
But I am not sure its worth the trouble ?
Commit c3059477fce2d956a0bb3e04357324780c5d8eeb (ipv4: Use
synchronize_rcu() during trie_rebalance()) was needed because FIB TRIE
can really use huge amounts of memory, thats hardly the case with
tcp_metrics.
^ permalink raw reply
* Re: [PATCH] fq_codel: dont reinit flow state
From: Dave Taht @ 2012-09-03 0:08 UTC (permalink / raw)
To: Eric Dumazet; +Cc: David Miller, netdev
In-Reply-To: <1346505597.7996.76.camel@edumazet-glaptop>
On Sat, Sep 1, 2012 at 6:19 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> When fq_codel builds a new flow, it should not reset codel state.
>
> Codel algo needs to get previous values (lastcount, drop_next) to get
> proper behavior.
>
> Signed-off-by: Dave Taht <dave.taht@gmail.com>
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> ---
> net/sched/sch_fq_codel.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/net/sched/sch_fq_codel.c b/net/sched/sch_fq_codel.c
> index 9fc1c62..4e606fc 100644
> --- a/net/sched/sch_fq_codel.c
> +++ b/net/sched/sch_fq_codel.c
> @@ -191,7 +191,6 @@ static int fq_codel_enqueue(struct sk_buff *skb, struct Qdisc *sch)
>
> if (list_empty(&flow->flowchain)) {
> list_add_tail(&flow->flowchain, &q->new_flows);
> - codel_vars_init(&flow->cvars);
> q->new_flow_count++;
> flow->deficit = q->quantum;
> flow->dropped = 0;
> @@ -418,6 +417,7 @@ static int fq_codel_init(struct Qdisc *sch, struct nlattr *opt)
> struct fq_codel_flow *flow = q->flows + i;
>
> INIT_LIST_HEAD(&flow->flowchain);
> + codel_vars_init(&flow->cvars);
> }
> }
> if (sch->limit >= 1)
>
>
I can live with this as is. flow->dropped will probably go away in the
long term anyway.
Acked-by: Dave Taht <dave.taht@bufferbloat.net>
^ permalink raw reply
* [PATCH v2] netfilter: take care of timewait sockets
From: Eric Dumazet @ 2012-09-02 23:48 UTC (permalink / raw)
To: Florian Westphal; +Cc: Sami Farin, netdev, e1000-devel
In-Reply-To: <1346626385.2563.44.camel@edumazet-glaptop>
Sami Farin reported crashes in xt_LOG because it assumes skb->sk is a
full blown socket.
But with TCP early demux, we can have skb->sk pointing to a timewait
socket.
Same fix is needed in netfnetlink_log
Diagnosed-by: Florian Westphal <fw@strlen.de>
Reported-by: Sami Farin <hvtaifwkbgefbaei@gmail.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
---
net/netfilter/nfnetlink_log.c | 14 +++++++------
net/netfilter/xt_LOG.c | 33 ++++++++++++++++----------------
2 files changed, 25 insertions(+), 22 deletions(-)
diff --git a/net/netfilter/nfnetlink_log.c b/net/netfilter/nfnetlink_log.c
index 14e2f39..5cfb5be 100644
--- a/net/netfilter/nfnetlink_log.c
+++ b/net/netfilter/nfnetlink_log.c
@@ -381,6 +381,7 @@ __build_packet_message(struct nfulnl_instance *inst,
struct nlmsghdr *nlh;
struct nfgenmsg *nfmsg;
sk_buff_data_t old_tail = inst->skb->tail;
+ struct sock *sk;
nlh = nlmsg_put(inst->skb, 0, 0,
NFNL_SUBSYS_ULOG << 8 | NFULNL_MSG_PACKET,
@@ -499,18 +500,19 @@ __build_packet_message(struct nfulnl_instance *inst,
}
/* UID */
- if (skb->sk) {
- read_lock_bh(&skb->sk->sk_callback_lock);
- if (skb->sk->sk_socket && skb->sk->sk_socket->file) {
- struct file *file = skb->sk->sk_socket->file;
+ sk = skb->sk;
+ if (sk && sk->sk_state != TCP_TIME_WAIT) {
+ read_lock_bh(&sk->sk_callback_lock);
+ if (sk->sk_socket && sk->sk_socket->file) {
+ struct file *file = sk->sk_socket->file;
__be32 uid = htonl(file->f_cred->fsuid);
__be32 gid = htonl(file->f_cred->fsgid);
- read_unlock_bh(&skb->sk->sk_callback_lock);
+ read_unlock_bh(&sk->sk_callback_lock);
if (nla_put_be32(inst->skb, NFULA_UID, uid) ||
nla_put_be32(inst->skb, NFULA_GID, gid))
goto nla_put_failure;
} else
- read_unlock_bh(&skb->sk->sk_callback_lock);
+ read_unlock_bh(&sk->sk_callback_lock);
}
/* local sequence number */
diff --git a/net/netfilter/xt_LOG.c b/net/netfilter/xt_LOG.c
index ff5f75f..2a4f969 100644
--- a/net/netfilter/xt_LOG.c
+++ b/net/netfilter/xt_LOG.c
@@ -145,6 +145,19 @@ static int dump_tcp_header(struct sbuff *m, const struct sk_buff *skb,
return 0;
}
+static void dump_sk_uid_gid(struct sbuff *m, struct sock *sk)
+{
+ if (!sk || sk->sk_state == TCP_TIME_WAIT)
+ return;
+
+ read_lock_bh(&sk->sk_callback_lock);
+ if (sk->sk_socket && sk->sk_socket->file)
+ sb_add(m, "UID=%u GID=%u ",
+ sk->sk_socket->file->f_cred->fsuid,
+ sk->sk_socket->file->f_cred->fsgid);
+ read_unlock_bh(&sk->sk_callback_lock);
+}
+
/* One level of recursion won't kill us */
static void dump_ipv4_packet(struct sbuff *m,
const struct nf_loginfo *info,
@@ -361,14 +374,8 @@ static void dump_ipv4_packet(struct sbuff *m,
}
/* Max length: 15 "UID=4294967295 " */
- if ((logflags & XT_LOG_UID) && !iphoff && skb->sk) {
- read_lock_bh(&skb->sk->sk_callback_lock);
- if (skb->sk->sk_socket && skb->sk->sk_socket->file)
- sb_add(m, "UID=%u GID=%u ",
- skb->sk->sk_socket->file->f_cred->fsuid,
- skb->sk->sk_socket->file->f_cred->fsgid);
- read_unlock_bh(&skb->sk->sk_callback_lock);
- }
+ if ((logflags & XT_LOG_UID) && !iphoff)
+ dump_sk_uid_gid(m, skb->sk);
/* Max length: 16 "MARK=0xFFFFFFFF " */
if (!iphoff && skb->mark)
@@ -717,14 +724,8 @@ static void dump_ipv6_packet(struct sbuff *m,
}
/* Max length: 15 "UID=4294967295 " */
- if ((logflags & XT_LOG_UID) && recurse && skb->sk) {
- read_lock_bh(&skb->sk->sk_callback_lock);
- if (skb->sk->sk_socket && skb->sk->sk_socket->file)
- sb_add(m, "UID=%u GID=%u ",
- skb->sk->sk_socket->file->f_cred->fsuid,
- skb->sk->sk_socket->file->f_cred->fsgid);
- read_unlock_bh(&skb->sk->sk_callback_lock);
- }
+ if ((logflags & XT_LOG_UID) && recurse)
+ dump_sk_uid_gid(m, skb->sk);
/* Max length: 16 "MARK=0xFFFFFFFF " */
if (!recurse && skb->mark)
------------------------------------------------------------------------------
Live Security Virtual Conference
Exclusive live event will cover all the ways today's security and
threat landscape has changed and how IT managers can respond. Discussions
will include endpoint security, mobile security and the latest in malware
threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/
_______________________________________________
E1000-devel mailing list
E1000-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/e1000-devel
To learn more about Intel® Ethernet, visit http://communities.intel.com/community/wired
^ permalink raw reply related
* Re: (ipt_log_packet, sb_add) 3.6.0-rc2 kernel panic - not syncing; Fatal exception in interrupt
From: Eric Dumazet @ 2012-09-02 23:37 UTC (permalink / raw)
To: Florian Westphal; +Cc: Sami Farin, netdev, e1000-devel
In-Reply-To: <1346626385.2563.44.camel@edumazet-glaptop>
On Mon, 2012-09-03 at 00:53 +0200, Eric Dumazet wrote:
> First I thought changing demux to not do the lookup of TIMEWAIT slot in
> __inet_lookup_established(), then it sounds not optimal to redo the full
> lookup for ESTABLISHED sockets later in TCP stack.
>
> So it seems we should fix xt_LOG instead
>
> Thanks Florian !
>
> [PATCH] xt_LOG: take care of timewait sockets
>
> Sami Farin reported crashes in xt_LOG because it assumes skb->sk is a
> full blown socket.
>
> But with TCP early demux, we can have skb->sk pointing to a timewait
> socket.
Seems other fixes are needed, for example in nfnetlink_log.c
I'll send a v2
------------------------------------------------------------------------------
Live Security Virtual Conference
Exclusive live event will cover all the ways today's security and
threat landscape has changed and how IT managers can respond. Discussions
will include endpoint security, mobile security and the latest in malware
threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/
_______________________________________________
E1000-devel mailing list
E1000-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/e1000-devel
To learn more about Intel® Ethernet, visit http://communities.intel.com/community/wired
^ permalink raw reply
* Re: [PATCH] fq_codel: dont reinit flow state
From: Eric Dumazet @ 2012-09-02 23:17 UTC (permalink / raw)
To: Dave Taht; +Cc: David Miller, netdev
In-Reply-To: <CAA93jw6mdAaomNfNWt0D4MZYFqXZ3=P+HYe9AEkvwNkAH84GSQ@mail.gmail.com>
On Sat, 2012-09-01 at 11:12 -0700, Dave Taht wrote:
> partial NAK.
Kind of weird to NACK a patch you are the author.
It makes both of us plain fools on netdev, dont do that.
> There is no need to reset dropped either, and resetting quantum or not
> is an interesting question.
>
Yes but totally unrelated to this patch.
It would be good you pollute threads.
Open a new thread, using RFC attribute or something.
> (in both cases they can be initted at init time)
>
> Lastly, stats->maxpacket is presently a shared resource in fq_codel,
> and the initial setting of 256 was an artifact some test ns2 code
> (according to kathie). In the case of ack traffic owning a queue, this
> can be 64 or 66 bytes (or larger on ipv6 acks).
>
I really hope you understand how this maxpacket thing is useless ?
In the real life, we dont send only 66 bytes packets.
maxpacket purpose is to relax codel so that it doesnt drop packet if
the qdisc backlog is small (so about to be empty).
Rename this maxpacket to 'relax_codel_under_that_size' if you want to.
> It's totally unclear if this distinction is important in fq_codel at
> present. :/. Obviously in codel which is a pure FIFO, maxpacket will
> hit the max size really fast.
>
> A related problem on maxpacket is if someone turns off tso/gso/ufo, it
> remains set to the largest packet seen forever until the qdisc is
> re-initialized.
> This leads to really puzzling behavior... (I'll submit a patch for
> this shortly but not change fq_codel's
> centralized stats)
>
> I'd be happy (for now) if dropped was preserved,
> and to fiddle with the other ideas a while longer.
>
There is no dropped field, what are you talking about ?
If you are now unsure of the patch, I can repost the same patch but
as the author.
Its a real obvious one. After an idle period, CODEL needs to compute the
delay between now and previous 'drop_next'
Codel doesnt call codel_vars_init() but in the qdisc init.
fq_codel being FQ + CODEL, it is an error to call codel_vars_init()
outside of init path.
^ permalink raw reply
* Re: [PATCH] codel: reduce default maxpacket and reinitialize upon re-entering drop state
From: Eric Dumazet @ 2012-09-02 23:05 UTC (permalink / raw)
To: Dave Täht; +Cc: netdev
In-Reply-To: <1346525457-31481-1-git-send-email-dave.taht@bufferbloat.net>
On Sat, 2012-09-01 at 11:50 -0700, Dave Täht wrote:
> From: Dave Taht <dave.taht@bufferbloat.net>
>
> The minimum size packet in a queue is 64, not 256.
>
> Also, as ethtool can be used to remove tso/gso/ufo, the last seen
> maxpacket can vary considerably in size (e.g. 64k). So reset it
> on re-entering the drop scheduler to the most recent packet size,
> and let it grow again naturally.
>
> Signed-off-by: Dave Taht <dave.taht@bufferbloat.net>
> ---
> include/net/codel.h | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/include/net/codel.h b/include/net/codel.h
> index 389cf62..0b0ac5e 100644
> --- a/include/net/codel.h
> +++ b/include/net/codel.h
> @@ -170,7 +170,7 @@ static void codel_vars_init(struct codel_vars *vars)
>
> static void codel_stats_init(struct codel_stats *stats)
> {
> - stats->maxpacket = 256;
> + stats->maxpacket = 64;
> }
>
> /*
> @@ -333,6 +333,7 @@ static struct sk_buff *codel_dequeue(struct Qdisc *sch,
> */
> codel_Newton_step(vars);
> } else {
> + stats->maxpacket = skb ? qdisc_pkt_len(skb) : 64;
> vars->count = 1;
> vars->rec_inv_sqrt = ~0U >> REC_INV_SQRT_SHIFT;
> }
This has very litle value.
maxpacket is mainly an information given to "tc -s qdisc".
It serves as a debugging stuff, to know the max packet size seen on
codel qdisc.
So you can set it to 0 as far as I am concerned, but not reset it in
codel_enqueue(), as it adds runtime overhead and a branch misprediction.
^ permalink raw reply
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox