* [PATCH net-next v1 1/6] tg3: fix possible infinite loop
2011-12-16 18:19 [PATCH net-next v1 0/6] tg3: adaptive interrupt coalescing, non-napi mode David Decotigny
@ 2011-12-16 18:19 ` David Decotigny
2011-12-16 18:19 ` [PATCH net-next v1 2/6] tg3: Remove IRQF_SAMPLE_RANDOM flag from internal tests David Decotigny
` (5 subsequent siblings)
6 siblings, 0 replies; 11+ messages in thread
From: David Decotigny @ 2011-12-16 18:19 UTC (permalink / raw)
To: Matt Carlson, Michael Chan, netdev, linux-kernel
Cc: Javier Martinez Canillas, Robin Getz, Matt Mackall,
David Decotigny
Found by browsing the code: the revision-mismatch branch in
tg3_get_invariants() hits "continue" without advancing pci_id, which
can lead to an infinite loop.

Tested:
Not tested along the affected path. No regression observed.
Signed-off-by: David Decotigny <decot@googlers.com>
---
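For illustration, below is a minimal, self-contained sketch of the general
pattern this fix addresses: a table-scan loop in which "continue" must be
paired with advancing the cursor. The table and names are hypothetical and
are not the actual tg3 structures.

#include <stddef.h>

/* Hypothetical ID table; stands in for the real tg3 chipset tables. */
struct id_entry {
	int vendor;
	int rev;
};

static const struct id_entry table[] = {
	{ .vendor = 1, .rev = 2 },
	{ .vendor = 3, .rev = 4 },
	{ 0 }	/* sentinel */
};

static const struct id_entry *find_entry(int vendor, int rev)
{
	const struct id_entry *p = table;

	while (p->vendor != 0) {
		if (p->vendor != vendor) {
			p++;		/* advance before retrying */
			continue;
		}
		if (rev > p->rev) {
			/* Without the increment, "continue" re-evaluates
			 * the same entry forever.
			 */
			p++;
			continue;
		}
		return p;
	}
	return NULL;
}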
drivers/net/ethernet/broadcom/tg3.c | 4 +++-
1 files changed, 3 insertions(+), 1 deletions(-)
diff --git a/drivers/net/ethernet/broadcom/tg3.c b/drivers/net/ethernet/broadcom/tg3.c
index 8bf11ca..e04c4f9 100644
--- a/drivers/net/ethernet/broadcom/tg3.c
+++ b/drivers/net/ethernet/broadcom/tg3.c
@@ -13877,8 +13877,10 @@ static int __devinit tg3_get_invariants(struct tg3 *tp)
continue;
}
if (pci_id->rev != PCI_ANY_ID) {
- if (bridge->revision > pci_id->rev)
+ if (bridge->revision > pci_id->rev) {
+ pci_id++;
continue;
+ }
}
if (bridge->subordinate &&
(bridge->subordinate->number ==
--
1.7.3.1
* [PATCH net-next v1 2/6] tg3: Remove IRQF_SAMPLE_RANDOM flag from internal tests
2011-12-16 18:19 [PATCH net-next v1 0/6] tg3: adaptive interrupt coalescing, non-napi mode David Decotigny
2011-12-16 18:19 ` [PATCH net-next v1 1/6] tg3: fix possible infinite loop David Decotigny
@ 2011-12-16 18:19 ` David Decotigny
2011-12-16 18:19 ` [PATCH net-next v1 3/6] tg3: Implement adaptive interrupt coalescing David Decotigny
` (4 subsequent siblings)
6 siblings, 0 replies; 11+ messages in thread
From: David Decotigny @ 2011-12-16 18:19 UTC (permalink / raw)
To: Matt Carlson, Michael Chan, netdev, linux-kernel
Cc: Javier Martinez Canillas, Robin Getz, Matt Mackall,
Maciej Żenczykowski, David Decotigny
From: Maciej Żenczykowski <maze@google.com>
This complements commit ab392d2d6d4 ("Remove IRQF_SAMPLE_RANDOM flag
from network drivers") by removing the IRQF_SAMPLE_RANDOM flag from
internal and self tests.
Tested: no visible regression on 1 actual host + ethtool -t passes.
Signed-off-by: David Decotigny <decot@googlers.com>
---
drivers/net/ethernet/broadcom/tg3.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/drivers/net/ethernet/broadcom/tg3.c b/drivers/net/ethernet/broadcom/tg3.c
index e04c4f9..a65b419 100644
--- a/drivers/net/ethernet/broadcom/tg3.c
+++ b/drivers/net/ethernet/broadcom/tg3.c
@@ -9387,7 +9387,7 @@ static int tg3_test_interrupt(struct tg3 *tp)
}
err = request_irq(tnapi->irq_vec, tg3_test_isr,
- IRQF_SHARED | IRQF_SAMPLE_RANDOM, dev->name, tnapi);
+ IRQF_SHARED, dev->name, tnapi);
if (err)
return err;
--
1.7.3.1
* [PATCH net-next v1 3/6] tg3: Implement adaptive interrupt coalescing
2011-12-16 18:19 [PATCH net-next v1 0/6] tg3: adaptive interrupt coalescing, non-napi mode David Decotigny
2011-12-16 18:19 ` [PATCH net-next v1 1/6] tg3: fix possible infinite loop David Decotigny
2011-12-16 18:19 ` [PATCH net-next v1 2/6] tg3: Remove IRQF_SAMPLE_RANDOM flag from internal tests David Decotigny
@ 2011-12-16 18:19 ` David Decotigny
2011-12-16 18:19 ` [PATCH net-next v1 4/6] tg3: move functions related to reset_task together David Decotigny
` (3 subsequent siblings)
6 siblings, 0 replies; 11+ messages in thread
From: David Decotigny @ 2011-12-16 18:19 UTC (permalink / raw)
To: Matt Carlson, Michael Chan, netdev, linux-kernel
Cc: Javier Martinez Canillas, Robin Getz, Matt Mackall, Ying Cai,
David Decotigny
From: Ying Cai <ycai@google.com>
Implement adaptive interrupt coalescing in the tg3 driver. On an
Opteron-based test system, the interrupt rate can be reduced from 40K
intrs/sec to less than 10K intrs/sec on netperf tests, with netperf
performance gains of 2 to 14% (depending on the test).

Example with 200 netperf streams in parallel for 100s; irq/s is
measured at the netperf client host, and the netperf figure is
cumulative over all 200 streams:
Without this patch:
TCP_RR netperf=284208 eth0 irq/s=55141.9
TCP_CRR netperf=32204.5 eth0 irq/s=15727.7
TCP_MAERTS netperf=944.68 eth0 irq/s=16255.5
pktgen loopback (pkt_size 60) 484718 pps
With patch:
TCP_RR netperf=317511 (111.72%) eth0 irq/s=8307.77 (15.07%)
TCP_CRR netperf=35890.1 (111.44%) eth0 irq/s=4390.89 (27.92%)
TCP_MAERTS netperf=971.64 (102.85%) eth0 irq/s=8135.58 (50.05%)
pktgen loopback (pkt_size 60) 552185 pps (113.92%)
Signed-off-by: David Decotigny <decot@googlers.com>
---
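For readers skimming the diff, here is a simplified user-space model of the
rate-averaging logic added in tg3_coal_adaptive_rx(). The constants come
from the patch, but the stand-alone framing and the printf() are
illustrative only; the real code runs in the rx path and programs the
HOSTCC coalescing registers.

#include <stdio.h>

#define COAL_FACTOR_EXP		3	/* divide the per-ms rate by 2^3 */
#define FRAMES_HIGH		32	/* upper clamp for coalesced frames */

static unsigned long avg_frames;	/* running average */
static unsigned long reg_frames = 1;	/* value "programmed" into the NIC */

static void adapt(unsigned long frames, unsigned long interval_ms)
{
	unsigned long rate = (frames / interval_ms) >> COAL_FACTOR_EXP;

	/* The running average responds faster to a falling rate. */
	avg_frames = (avg_frames + rate) >> 1;
	if (rate > avg_frames)
		rate = avg_frames;

	/* Clamp: never exceed the high watermark, never program 0. */
	if (rate > FRAMES_HIGH)
		rate = FRAMES_HIGH;
	else if (rate == 0)
		rate = 1;

	if (rate != reg_frames) {
		reg_frames = rate;
		printf("program NIC: max %lu frames per interrupt\n", rate);
	}
}

int main(void)
{
	adapt(40000, 10);	/* busy sample interval: coalesce more */
	adapt(200, 10);		/* quiet interval: back off quickly */
	return 0;
}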
drivers/net/ethernet/broadcom/tg3.c | 174 +++++++++++++++++++++++++++++++++++
drivers/net/ethernet/broadcom/tg3.h | 37 ++++++++
2 files changed, 211 insertions(+), 0 deletions(-)
diff --git a/drivers/net/ethernet/broadcom/tg3.c b/drivers/net/ethernet/broadcom/tg3.c
index a65b419..9deb6a6 100644
--- a/drivers/net/ethernet/broadcom/tg3.c
+++ b/drivers/net/ethernet/broadcom/tg3.c
@@ -52,6 +52,7 @@
#include <linux/io.h>
#include <asm/byteorder.h>
#include <linux/uaccess.h>
+#include <linux/jiffies.h>
#ifdef CONFIG_SPARC
#include <asm/idprom.h>
@@ -413,6 +414,9 @@ static const struct {
#define TG3_NUM_TEST ARRAY_SIZE(ethtool_test_keys)
+static inline void tg3_full_lock(struct tg3 *tp, int irq_sync);
+static inline void tg3_full_unlock(struct tg3 *tp);
+
static void tg3_write32(struct tg3 *tp, u32 off, u32 val)
{
writel(val, tp->regs + off);
@@ -5503,6 +5507,168 @@ static void tg3_recycle_rx(struct tg3_napi *tnapi,
src_map->data = NULL;
}
+static inline int tg3_coal_adaptive_init(struct tg3 *tp)
+{
+ tp->ad.rx_jiffies = jiffies;
+ tp->ad.rx_interval = 0;
+ tp->ad.rx_frames = 0;
+ tp->ad.rx_average_frames = 0;
+ tp->ad.rx_reg_frames = TG3_COAL_RX_FRAMES;
+ tp->ad.rx_adaptive_mode = TG3_COAL_ADAPTIVE_MODE;
+ tp->ad.rx_frames_high = TG3_COAL_ADAPTIVE_MAX_FRAMES;
+ tp->ad.rx_usecs_high = TG3_COAL_ADAPTIVE_MAX_USECS;
+ tp->ad.rx_sample_interval = msecs_to_jiffies(TG3_COAL_ADAPTIVE_SAMPLE);
+
+ /* Overwrite the tg3 stored coalescing values, using the 2.6.11
+ * tg3 adaptive coalescing values.
+ */
+ tp->coal.rx_coalesce_usecs = TG3_COAL_RX_TICKS;
+ tp->coal.tx_coalesce_usecs = TG3_COAL_TX_TICKS;
+ tp->coal.rx_max_coalesced_frames = TG3_COAL_RX_FRAMES;
+ tp->coal.tx_max_coalesced_frames = TG3_COAL_TX_FRAMES;
+ tp->coal.rx_max_coalesced_frames_irq = TG3_COAL_RX_FRAMES;
+
+ return 0;
+}
+
+static int tg3_coal_adaptive_set(struct net_device *dev,
+ struct ethtool_coalesce *cmd)
+{
+ struct tg3 *tp = netdev_priv(dev);
+ int i = 0;
+
+ tg3_full_lock(tp, 0);
+ /* Changing the adaptive coalescing mode resets the rx frames
+ * coalescing to its default value. Turning adaptive coalescing off
+ * means using the default behavior; turning it on starts computing
+ * now. Note that this is done before setting any hardcoded values,
+ * thus allowing a single call to turn off adaptive coalescing and
+ * set a new value for hardcoded (static) coalescing.
+ */
+
+ /* reset host coalescing engine. */
+ tw32(HOSTCC_MODE, 0);
+ for (i = 0; i < 2000; i++) {
+ if (!(tr32(HOSTCC_MODE) & HOSTCC_MODE_ENABLE))
+ break;
+ udelay(10);
+ }
+
+ if (tp->ad.rx_adaptive_mode != cmd->use_adaptive_rx_coalesce) {
+ tp->ad.rx_jiffies = jiffies;
+ tp->ad.rx_frames = 0;
+ tp->ad.rx_interval = 0;
+ tp->ad.rx_average_frames = 0;
+ }
+
+ tp->ad.rx_adaptive_mode = cmd->use_adaptive_rx_coalesce;
+ tp->ad.rx_usecs_high = cmd->rx_coalesce_usecs_high;
+ tp->ad.rx_frames_high = cmd->rx_max_coalesced_frames_high;
+ tp->ad.rx_sample_interval = msecs_to_jiffies(cmd->rate_sample_interval);
+
+ tw32(HOSTCC_MODE, HOSTCC_MODE_ENABLE | tp->coalesce_mode);
+
+ tg3_full_unlock(tp);
+
+ return 0;
+}
+
+static int tg3_coal_adaptive_get(struct net_device *dev,
+ struct ethtool_coalesce *cmd)
+{
+ struct tg3 *tp = netdev_priv(dev);
+
+ if (tp->ad.rx_adaptive_mode) {
+ tg3_full_lock(tp, 0);
+
+ cmd->rx_coalesce_usecs = tr32(HOSTCC_RXCOL_TICKS);
+ cmd->rx_max_coalesced_frames = tr32(HOSTCC_RXMAX_FRAMES);
+
+ cmd->tx_coalesce_usecs = tr32(HOSTCC_TXCOL_TICKS);
+ cmd->tx_max_coalesced_frames = tr32(HOSTCC_TXMAX_FRAMES);
+
+ cmd->rx_max_coalesced_frames_irq = tr32(HOSTCC_RXCOAL_MAXF_INT);
+ cmd->tx_max_coalesced_frames_irq = tr32(HOSTCC_TXCOAL_MAXF_INT);
+
+ cmd->use_adaptive_rx_coalesce = tp->ad.rx_adaptive_mode;
+ cmd->rx_coalesce_usecs_high = tp->ad.rx_usecs_high;
+ cmd->rx_max_coalesced_frames_high = tp->ad.rx_frames_high;
+ cmd->rate_sample_interval = jiffies_to_msecs(
+ tp->ad.rx_sample_interval);
+
+ tg3_full_unlock(tp);
+ }
+ return 0;
+}
+
+static inline int tg3_coal_adaptive_rx(struct tg3 *tp, int received)
+{
+ unsigned long cur_jiffies = jiffies;
+
+ unsigned long diff = cur_jiffies - tp->ad.rx_jiffies;
+
+ tp->ad.rx_interval += diff;
+
+ tp->ad.rx_jiffies = cur_jiffies;
+ tp->ad.rx_frames += received;
+
+ if ((tp->ad.rx_interval >= tp->ad.rx_sample_interval) &&
+ (0 != tp->ad.rx_interval)) {
+ unsigned long rx_rate;
+
+ /* average packet per ms */
+ tp->ad.rx_frames /= jiffies_to_msecs(tp->ad.rx_interval);
+ /* apply coalescing factor */
+ tp->ad.rx_frames >>= TG3_COAL_FACTOR_EXP;
+
+ /* Compute a running average of the packet rate.
+ * The goal of the running average is to respond faster to a
+ * decreasing rate than to a slowly increasing rate.
+ *
+ * If the new sample interval rate has decreased from the running
+ * average, set the register coalescing to the new sample interval
+ * rate; else, average the new rate with the running average.
+ */
+ tp->ad.rx_average_frames += tp->ad.rx_frames;
+ tp->ad.rx_average_frames >>= 1;
+
+ if (tp->ad.rx_frames <= tp->ad.rx_average_frames)
+ rx_rate = tp->ad.rx_frames;
+ else
+ rx_rate = tp->ad.rx_average_frames;
+
+ /* Adjust based on max values. Also, do not set a '0' value in
+ * the average frames; always set the average to at least one
+ * frame. The BCM documentation recommends setting the register
+ * value so as to get an interrupt for every rx packet; we could
+ * use 0, which would disable coalescing and should have the
+ * same result.
+ */
+ if (rx_rate > tp->ad.rx_frames_high)
+ rx_rate = tp->ad.rx_frames_high;
+ else if (0 == rx_rate)
+ rx_rate = 1;
+
+ if (rx_rate != tp->ad.rx_reg_frames) {
+ unsigned long rx_usecs;
+ rx_usecs = rx_rate * TG3_COAL_TICK_PER_FRAME;
+ if (rx_usecs > tp->ad.rx_usecs_high)
+ rx_usecs = tp->ad.rx_usecs_high;
+
+ tw32(HOSTCC_RXCOL_TICKS, rx_usecs);
+ tw32(HOSTCC_RXMAX_FRAMES, rx_rate);
+ tw32(HOSTCC_RXCOAL_MAXF_INT, rx_rate);
+ tp->ad.rx_reg_frames = rx_rate;
+ }
+
+ tp->ad.rx_frames = 0;
+ tp->ad.rx_interval = 0;
+
+ }
+
+ return 0;
+}
+
/* The RX ring scheme is composed of multiple rings which post fresh
* buffers to the chip, and one special ring the chip uses to report
* status back to the host.
@@ -5679,6 +5845,9 @@ next_pkt_nopost:
}
}
+ if (TG3_COAL_ADAPTIVE_ON == tp->ad.rx_adaptive_mode)
+ tg3_coal_adaptive_rx(tp, received);
+
/* ACK the status ring. */
tnapi->rx_rcb_ptr = sw_idx;
tw32_rx_mbox(tnapi->consmbox, sw_idx);
@@ -11866,6 +12035,7 @@ static int tg3_get_coalesce(struct net_device *dev, struct ethtool_coalesce *ec)
struct tg3 *tp = netdev_priv(dev);
memcpy(ec, &tp->coal, sizeof(*ec));
+ tg3_coal_adaptive_get(dev, ec);
return 0;
}
@@ -11915,6 +12085,8 @@ static int tg3_set_coalesce(struct net_device *dev, struct ethtool_coalesce *ec)
tp->coal.tx_max_coalesced_frames_irq = ec->tx_max_coalesced_frames_irq;
tp->coal.stats_block_coalesce_usecs = ec->stats_block_coalesce_usecs;
+ tg3_coal_adaptive_set(dev, ec);
+
if (netif_running(dev)) {
tg3_full_lock(tp, 0);
__tg3_set_coalesce(tp, &tp->coal);
@@ -15319,6 +15491,8 @@ static void __devinit tg3_init_coal(struct tg3 *tp)
ec->tx_coalesce_usecs_irq = 0;
ec->stats_block_coalesce_usecs = 0;
}
+
+ tg3_coal_adaptive_init(tp);
}
static const struct net_device_ops tg3_netdev_ops = {
diff --git a/drivers/net/ethernet/broadcom/tg3.h b/drivers/net/ethernet/broadcom/tg3.h
index aea8f72..695cf14 100644
--- a/drivers/net/ethernet/broadcom/tg3.h
+++ b/drivers/net/ethernet/broadcom/tg3.h
@@ -3211,6 +3211,43 @@ struct tg3 {
struct ethtool_coalesce coal;
+ struct {
+ unsigned long rx_jiffies; /* last read jiffies */
+ unsigned long rx_interval; /* rcv interval in jiffies */
+ unsigned long rx_reg_frames; /* current register value */
+ unsigned long rx_frames; /* received frame in interval*/
+ unsigned long rx_average_frames; /* computed received average */
+ unsigned int rx_adaptive_mode; /* adaptive mode on/off */
+ unsigned long rx_frames_high; /* max coalescing frame in */
+ /* adaptive mode */
+ unsigned long rx_usecs_high; /* max usecs in adaptive */
+ /* mode */
+ unsigned long rx_sample_interval;/* adaptive sample rate */
+ /* in jiffies (msecs) */
+ } ad; /* adaptive coalescing */
+
+#define TG3_COAL_ADAPTIVE_ON 1
+#define TG3_COAL_ADAPTIVE_OFF 0
+#define TG3_COAL_ADAPTIVE_MODE TG3_COAL_ADAPTIVE_ON
+
+/* Ticks are in TG3 register units and not in system units */
+#define TG3_COAL_FACTOR_EXP 3 /* coalescence factor, num irq per ms max. */
+ /* this is the exponent of a power of two. */
+
+#define TG3_COAL_TICK_PER_FRAME 10 /* tick per frame, num of us per tick */
+ /* per frame. Used tg3 ms tick unit */
+
+#define TG3_COAL_ADAPTIVE_MAX_FRAMES 32
+#define TG3_COAL_ADAPTIVE_MAX_USECS (TG3_COAL_ADAPTIVE_MAX_FRAMES \
+ << TG3_COAL_TICK_PER_FRAME)
+
+#define TG3_COAL_ADAPTIVE_SAMPLE 10 /* ms samples */
+
+#define TG3_COAL_TX_FRAMES 32 /* must be <= 1/2 TG3_TX_RING_SIZE */
+#define TG3_COAL_TX_TICKS (TG3_COAL_TX_FRAMES * TG3_COAL_TICK_PER_FRAME)
+#define TG3_COAL_RX_FRAMES 1
+#define TG3_COAL_RX_TICKS (TG3_COAL_RX_FRAMES * TG3_COAL_TICK_PER_FRAME)
+
/* firmware info */
const char *fw_needed;
const struct firmware *fw;
--
1.7.3.1
* [PATCH net-next v1 4/6] tg3: move functions related to reset_task together
2011-12-16 18:19 [PATCH net-next v1 0/6] tg3: adaptive interrupt coalescing, non-napi mode David Decotigny
` (2 preceding siblings ...)
2011-12-16 18:19 ` [PATCH net-next v1 3/6] tg3: Implement adaptive interrupt coalescing David Decotigny
@ 2011-12-16 18:19 ` David Decotigny
2011-12-16 18:19 ` [PATCH net-next v1 5/6] tg3: implementation of a non-NAPI mode David Decotigny
` (2 subsequent siblings)
6 siblings, 0 replies; 11+ messages in thread
From: David Decotigny @ 2011-12-16 18:19 UTC (permalink / raw)
To: Matt Carlson, Michael Chan, netdev, linux-kernel
Cc: Javier Martinez Canillas, Robin Getz, Matt Mackall,
David Decotigny
This prepares for the next patches: it simply moves the functions
related to reset_task closer to its definition.
Signed-off-by: David Decotigny <decot@googlers.com>
---
drivers/net/ethernet/broadcom/tg3.c | 25 +++++++++++++------------
1 files changed, 13 insertions(+), 12 deletions(-)
diff --git a/drivers/net/ethernet/broadcom/tg3.c b/drivers/net/ethernet/broadcom/tg3.c
index 9deb6a6..ecd6ea5 100644
--- a/drivers/net/ethernet/broadcom/tg3.c
+++ b/drivers/net/ethernet/broadcom/tg3.c
@@ -416,6 +416,7 @@ static const struct {
static inline void tg3_full_lock(struct tg3 *tp, int irq_sync);
static inline void tg3_full_unlock(struct tg3 *tp);
+static inline void tg3_reset_task_schedule(struct tg3 *tp);
static void tg3_write32(struct tg3 *tp, u32 off, u32 val)
{
@@ -6080,18 +6081,6 @@ static int tg3_poll_work(struct tg3_napi *tnapi, int work_done, int budget)
return work_done;
}
-static inline void tg3_reset_task_schedule(struct tg3 *tp)
-{
- if (!test_and_set_bit(TG3_FLAG_RESET_TASK_PENDING, tp->tg3_flags))
- schedule_work(&tp->reset_task);
-}
-
-static inline void tg3_reset_task_cancel(struct tg3 *tp)
-{
- cancel_work_sync(&tp->reset_task);
- tg3_flag_clear(tp, RESET_TASK_PENDING);
-}
-
static int tg3_poll_msix(struct napi_struct *napi, int budget)
{
struct tg3_napi *tnapi = container_of(napi, struct tg3_napi, napi);
@@ -6543,6 +6532,18 @@ out:
tg3_flag_clear(tp, RESET_TASK_PENDING);
}
+static inline void tg3_reset_task_schedule(struct tg3 *tp)
+{
+ if (!test_and_set_bit(TG3_FLAG_RESET_TASK_PENDING, tp->tg3_flags))
+ schedule_work(&tp->reset_task);
+}
+
+static inline void tg3_reset_task_cancel(struct tg3 *tp)
+{
+ cancel_work_sync(&tp->reset_task);
+ tg3_flag_clear(tp, RESET_TASK_PENDING);
+}
+
static void tg3_tx_timeout(struct net_device *dev)
{
struct tg3 *tp = netdev_priv(dev);
--
1.7.3.1
* [PATCH net-next v1 5/6] tg3: implementation of a non-NAPI mode
2011-12-16 18:19 [PATCH net-next v1 0/6] tg3: adaptive interrupt coalescing, non-napi mode David Decotigny
` (3 preceding siblings ...)
2011-12-16 18:19 ` [PATCH net-next v1 4/6] tg3: move functions related to reset_task together David Decotigny
@ 2011-12-16 18:19 ` David Decotigny
2011-12-16 19:30 ` Ben Hutchings
` (2 more replies)
2011-12-16 18:19 ` [PATCH net-next v1 6/6] tg3: use netif_tx_start_queue instead of wake_queue when no reschedule needed David Decotigny
2011-12-16 18:42 ` [PATCH net-next v1 0/6] tg3: adaptive interrupt coalescing, non-napi mode David Miller
6 siblings, 3 replies; 11+ messages in thread
From: David Decotigny @ 2011-12-16 18:19 UTC (permalink / raw)
To: Matt Carlson, Michael Chan, netdev, linux-kernel
Cc: Javier Martinez Canillas, Robin Getz, Matt Mackall, Tom Herbert,
David Decotigny
From: Tom Herbert <therbert@google.com>
The tg3 NIC has a hard limit of 511 descriptors for the receive ring.
Under heavy load of small packets, the device receive queue may not
be serviced fast enough to prevent packet drops. This could be due
to a variety of reasons, such as lengthy processing delays of packets
in the stack, softirqs being disabled too long, etc. If the driver is
run in non-NAPI mode, the RX queue is serviced in the device
interrupt handler, which is much less likely to be deferred for a
substantial period of time.

There are some effects of not using NAPI that need to be considered.
It does increase the chance of live-lock in the interrupt handler,
although since the tg3 does interrupt coalescing this is very unlikely
to occur. Also, more code is run with interrupts disabled, potentially
deferring other hardware interrupts. The amount of time spent in the
interrupt handler should be minimized by dequeuing packets off the
device queue and queuing them to a host queue as quickly as possible.

The default mode of operation remains NAPI and its performance is
kept unchanged (the code is unchanged). Non-NAPI mode is enabled by
commenting out the CONFIG_TIGON3_NAPI Kconfig parameter.
Signed-off-by: David Decotigny <decot@googlers.com>
---
drivers/net/ethernet/broadcom/Kconfig | 8 ++
drivers/net/ethernet/broadcom/tg3.c | 151 +++++++++++++++++++++++++++++++--
drivers/net/ethernet/broadcom/tg3.h | 5 +
3 files changed, 157 insertions(+), 7 deletions(-)
diff --git a/drivers/net/ethernet/broadcom/Kconfig b/drivers/net/ethernet/broadcom/Kconfig
index f15e72e..e808b3d 100644
--- a/drivers/net/ethernet/broadcom/Kconfig
+++ b/drivers/net/ethernet/broadcom/Kconfig
@@ -107,6 +107,14 @@ config TIGON3
To compile this driver as a module, choose M here: the module
will be called tg3. This is recommended.
+if TIGON3
+config TIGON3_NAPI
+ bool "Use Rx Polling (NAPI)"
+ default y
+ ---help---
+ Use NAPI for Tigon3 driver. If unsure, say Y.
+endif # TIGON3
+
config BNX2X
tristate "Broadcom NetXtremeII 10Gb support"
depends on PCI
diff --git a/drivers/net/ethernet/broadcom/tg3.c b/drivers/net/ethernet/broadcom/tg3.c
index ecd6ea5..e3f221d 100644
--- a/drivers/net/ethernet/broadcom/tg3.c
+++ b/drivers/net/ethernet/broadcom/tg3.c
@@ -5349,7 +5349,11 @@ static void tg3_tx(struct tg3_napi *tnapi)
pkts_compl++;
bytes_compl += skb->len;
+#ifdef CONFIG_TIGON3_NAPI
dev_kfree_skb(skb);
+#else
+ dev_kfree_skb_any(skb);
+#endif
if (unlikely(tx_bug)) {
tg3_tx_recover(tp);
@@ -5370,11 +5374,15 @@ static void tg3_tx(struct tg3_napi *tnapi)
if (unlikely(netif_tx_queue_stopped(txq) &&
(tg3_tx_avail(tnapi) > TG3_TX_WAKEUP_THRESH(tnapi)))) {
+#ifdef CONFIG_TIGON3_NAPI
__netif_tx_lock(txq, smp_processor_id());
if (netif_tx_queue_stopped(txq) &&
(tg3_tx_avail(tnapi) > TG3_TX_WAKEUP_THRESH(tnapi)))
netif_tx_wake_queue(txq);
__netif_tx_unlock(txq);
+#else
+ netif_tx_wake_queue(txq);
+#endif
}
}
@@ -5694,7 +5702,11 @@ static inline int tg3_coal_adaptive_rx(struct tg3 *tp, int received)
* If both the host and chip were to write into the same ring, cache line
* eviction could occur since both entities want it in an exclusive state.
*/
+#ifdef CONFIG_TIGON3_NAPI
static int tg3_rx(struct tg3_napi *tnapi, int budget)
+#else
+static int tg3_rx(struct tg3_napi *tnapi)
+#endif
{
struct tg3 *tp = tnapi->tp;
u32 work_mask, rx_std_posted = 0;
@@ -5714,7 +5726,11 @@ static int tg3_rx(struct tg3_napi *tnapi, int budget)
received = 0;
std_prod_idx = tpr->rx_std_prod_idx;
jmb_prod_idx = tpr->rx_jmb_prod_idx;
- while (sw_idx != hw_idx && budget > 0) {
+ while (sw_idx != hw_idx
+#ifdef CONFIG_TIGON3_NAPI
+ && budget > 0
+#endif
+ ) {
struct ring_info *ri;
struct tg3_rx_buffer_desc *desc = &tnapi->rx_rcb[sw_idx];
unsigned int len;
@@ -5819,10 +5835,16 @@ static int tg3_rx(struct tg3_napi *tnapi, int budget)
__vlan_hwaccel_put_tag(skb,
desc->err_vlan & RXD_VLAN_MASK);
+#ifdef CONFIG_TIGON3_NAPI
napi_gro_receive(&tnapi->napi, skb);
+#else
+ netif_rx(skb);
+#endif
received++;
+#ifdef CONFIG_TIGON3_NAPI
budget--;
+#endif
next_pkt:
(*post_ptr)++;
@@ -5868,7 +5890,10 @@ next_pkt_nopost:
tpr->rx_jmb_prod_idx);
}
mmiowb();
- } else if (work_mask) {
+ }
+#ifdef CONFIG_TIGON3_NAPI
+ /* TG3_FLG3_ENABLE_RSS is only set in NAPI mode. */
+ else if (work_mask) {
/* rx_std_buffers[] and rx_jmb_buffers[] entries must be
* updated before the producer indices can be updated.
*/
@@ -5880,6 +5905,7 @@ next_pkt_nopost:
if (tnapi != &tp->napi[1])
napi_schedule(&tp->napi[1].napi);
}
+#endif
return received;
}
@@ -5893,7 +5919,12 @@ static void tg3_poll_link(struct tg3 *tp)
if (sblk->status & SD_STATUS_LINK_CHG) {
sblk->status = SD_STATUS_UPDATED |
(sblk->status & ~SD_STATUS_LINK_CHG);
+
+#ifdef CONFIG_TIGON3_NAPI
spin_lock(&tp->lock);
+#else
+ spin_lock_bh(&tp->lock);
+#endif
if (tg3_flag(tp, USE_PHYLIB)) {
tw32_f(MAC_STATUS,
(MAC_STATUS_SYNC_CHANGED |
@@ -5903,11 +5934,16 @@ static void tg3_poll_link(struct tg3 *tp)
udelay(40);
} else
tg3_setup_phy(tp, 0);
+#ifdef CONFIG_TIGON3_NAPI
spin_unlock(&tp->lock);
+#else
+ spin_unlock_bh(&tp->lock);
+#endif
}
}
}
+#ifdef CONFIG_TIGON3_NAPI
static int tg3_rx_prodring_xfer(struct tg3 *tp,
struct tg3_rx_prodring_set *dpr,
struct tg3_rx_prodring_set *spr)
@@ -6207,6 +6243,58 @@ tx_recovery:
return work_done;
}
+#else /* !CONFIG_TIGON3_NAPI */
+
+static void tg3_poll_link_task(struct work_struct *work)
+{
+ struct tg3 *tp = container_of(work, struct tg3, poll_link_task);
+ tg3_poll_link(tp);
+}
+
+static void tg3_int_work(struct tg3_napi *tnapi)
+{
+ struct tg3 *tp = tnapi->tp;
+ struct tg3_hw_status *sblk = tnapi->hw_status;
+
+ if (tg3_flag(tp, TAGGED_STATUS)) {
+ tnapi->last_irq_tag = sblk->status_tag;
+ tnapi->last_tag = tnapi->last_irq_tag;
+ }
+
+ if (!(tg3_flag(tp, USE_LINKCHG_REG) || tg3_flag(tp, POLL_SERDES))) {
+ if (sblk->status & SD_STATUS_LINK_CHG)
+ schedule_work(&tp->poll_link_task);
+ }
+
+ /* run TX completion thread */
+ if (tnapi->hw_status->idx[0].tx_consumer != tnapi->tx_cons) {
+ tg3_tx(tnapi);
+ if (unlikely(tg3_flag(tp, TX_RECOVERY_PENDING)))
+ goto tx_recovery;
+ }
+
+ /* run RX thread */
+ if (*(tnapi->rx_rcb_prod_idx) != tnapi->rx_rcb_ptr)
+ tg3_rx(tnapi);
+
+ if (unlikely(tg3_flag(tp, TX_RECOVERY_PENDING)))
+ goto tx_recovery;
+
+ if (!(tg3_flag(tp, TAGGED_STATUS)))
+ sblk->status &= ~SD_STATUS_UPDATED;
+
+ /* Reenable interrupts */
+ tg3_int_reenable(tnapi);
+
+ return;
+
+tx_recovery:
+ schedule_work(&tp->reset_task);
+}
+
+#endif /* CONFIG_TIGON3_NAPI */
+
+#ifdef CONFIG_TIGON3_NAPI
static void tg3_napi_disable(struct tg3 *tp)
{
int i;
@@ -6239,11 +6327,14 @@ static void tg3_napi_fini(struct tg3 *tp)
for (i = 0; i < tp->irq_cnt; i++)
netif_napi_del(&tp->napi[i].napi);
}
+#endif
static inline void tg3_netif_stop(struct tg3 *tp)
{
tp->dev->trans_start = jiffies; /* prevent tx timeout */
+#ifdef CONFIG_TIGON3_NAPI
tg3_napi_disable(tp);
+#endif
netif_tx_disable(tp->dev);
}
@@ -6255,7 +6346,9 @@ static inline void tg3_netif_start(struct tg3 *tp)
*/
netif_tx_wake_all_queues(tp->dev);
+#ifdef CONFIG_TIGON3_NAPI
tg3_napi_enable(tp);
+#endif
tp->napi[0].hw_status->status |= SD_STATUS_UPDATED;
tg3_enable_ints(tp);
}
@@ -6303,7 +6396,11 @@ static irqreturn_t tg3_msi_1shot(int irq, void *dev_id)
prefetch(&tnapi->rx_rcb[tnapi->rx_rcb_ptr]);
if (likely(!tg3_irq_sync(tp)))
+#ifdef CONFIG_TIGON3_NAPI
napi_schedule(&tnapi->napi);
+#else
+ tg3_int_work(tnapi);
+#endif
return IRQ_HANDLED;
}
@@ -6329,7 +6426,11 @@ static irqreturn_t tg3_msi(int irq, void *dev_id)
*/
tw32_mailbox(tnapi->int_mbox, 0x00000001);
if (likely(!tg3_irq_sync(tp)))
+#ifdef CONFIG_TIGON3_NAPI
napi_schedule(&tnapi->napi);
+#else
+ tg3_int_work(tnapi);
+#endif
return IRQ_RETVAL(1);
}
@@ -6371,7 +6472,11 @@ static irqreturn_t tg3_interrupt(int irq, void *dev_id)
sblk->status &= ~SD_STATUS_UPDATED;
if (likely(tg3_has_work(tnapi))) {
prefetch(&tnapi->rx_rcb[tnapi->rx_rcb_ptr]);
+#ifdef CONFIG_TIGON3_NAPI
napi_schedule(&tnapi->napi);
+#else
+ tg3_int_work(tnapi);
+#endif
} else {
/* No work, shared interrupt perhaps? re-enable
* interrupts, and flush that PCI write
@@ -6429,7 +6534,11 @@ static irqreturn_t tg3_interrupt_tagged(int irq, void *dev_id)
prefetch(&tnapi->rx_rcb[tnapi->rx_rcb_ptr]);
+#ifdef CONFIG_TIGON3_NAPI
napi_schedule(&tnapi->napi);
+#else
+ tg3_int_work(tnapi);
+#endif
out:
return IRQ_RETVAL(handled);
@@ -6470,7 +6579,9 @@ static int tg3_restart_hw(struct tg3 *tp, int reset_phy)
tg3_full_unlock(tp);
del_timer_sync(&tp->timer);
tp->irq_sync = 0;
+#ifdef CONFIG_TIGON3_NAPI
tg3_napi_enable(tp);
+#endif
dev_close(tp->dev);
tg3_full_lock(tp, 0);
}
@@ -6804,10 +6915,15 @@ static netdev_tx_t tg3_start_xmit(struct sk_buff *skb, struct net_device *dev)
budget = tg3_tx_avail(tnapi);
- /* We are running in BH disabled context with netif_tx_lock
- * and TX reclaim runs via tp->napi.poll inside of a software
- * interrupt. Furthermore, IRQ processing runs lockless so we have
- * no IRQ context deadlocks to worry about either. Rejoice!
+ /* When in NAPI mode, we are running in BH disabled context
+ * with netif_tx_lock and TX reclaim runs via tp->napi.poll
+ * inside of a software interrupt. Furthermore, IRQ
+ * processing runs lockless so we have no IRQ context
+ * deadlocks to worry about either. Rejoice!
+ *
+ * When in non-NAPI mode, we are running in BH disabled
+ * context with netif_tx_lock and TX reclaim runs lockless in
+ * interrupt context.
*/
if (unlikely(budget <= (skb_shinfo(skb)->nr_frags + 1))) {
if (!netif_tx_queue_stopped(txq)) {
@@ -9847,9 +9963,10 @@ static int tg3_open(struct net_device *dev)
if (err)
goto err_out1;
+#ifdef CONFIG_TIGON3_NAPI
tg3_napi_init(tp);
-
tg3_napi_enable(tp);
+#endif
for (i = 0; i < tp->irq_cnt; i++) {
struct tg3_napi *tnapi = &tp->napi[i];
@@ -9942,8 +10059,10 @@ err_out3:
}
err_out2:
+#ifdef CONFIG_TIGON3_NAPI
tg3_napi_disable(tp);
tg3_napi_fini(tp);
+#endif
tg3_free_consistent(tp);
err_out1:
@@ -9958,7 +10077,9 @@ static int tg3_close(struct net_device *dev)
int i;
struct tg3 *tp = netdev_priv(dev);
+#ifdef CONFIG_TIGON3_NAPI
tg3_napi_disable(tp);
+#endif
tg3_reset_task_cancel(tp);
netif_tx_stop_all_queues(dev);
@@ -9988,7 +10109,9 @@ static int tg3_close(struct net_device *dev)
memset(&tp->net_stats_prev, 0, sizeof(tp->net_stats_prev));
memset(&tp->estats_prev, 0, sizeof(tp->estats_prev));
+#ifdef CONFIG_TIGON3_NAPI
tg3_napi_fini(tp);
+#endif
tg3_free_consistent(tp);
@@ -14223,10 +14346,13 @@ static int __devinit tg3_get_invariants(struct tg3 *tp)
tg3_flag_set(tp, 1SHOT_MSI);
}
+#ifdef CONFIG_TIGON3_NAPI
+ /* Do not enable MSIX in non-NAPI mode. */
if (tg3_flag(tp, 57765_PLUS)) {
tg3_flag_set(tp, SUPPORT_MSIX);
tp->irq_max = TG3_IRQ_MAX_VECS;
}
+#endif
}
if (tg3_flag(tp, 5755_PLUS))
@@ -15601,6 +15727,9 @@ static int __devinit tg3_init_one(struct pci_dev *pdev,
spin_lock_init(&tp->lock);
spin_lock_init(&tp->indirect_lock);
INIT_WORK(&tp->reset_task, tg3_reset_task);
+#ifndef CONFIG_TIGON3_NAPI
+ INIT_WORK(&tp->poll_link_task, tg3_poll_link_task);
+#endif
tp->regs = pci_ioremap_bar(pdev, BAR_0);
if (!tp->regs) {
@@ -15821,6 +15950,14 @@ static int __devinit tg3_init_one(struct pci_dev *pdev,
goto err_out_apeunmap;
}
+#ifdef CONFIG_TIGON3_NAPI
+ netdev_info(dev, "Tigon3 driver %s loaded in NAPI mode.\n",
+ DRV_MODULE_VERSION);
+#else
+ netdev_info(dev, "Tigon3 driver %s loaded in non-NAPI mode.\n",
+ DRV_MODULE_VERSION);
+#endif
+
netdev_info(dev, "Tigon3 [partno(%s) rev %04x] (%s) MAC address %pM\n",
tp->board_part_number,
tp->pci_chip_rev_id,
diff --git a/drivers/net/ethernet/broadcom/tg3.h b/drivers/net/ethernet/broadcom/tg3.h
index 695cf14..b023e96 100644
--- a/drivers/net/ethernet/broadcom/tg3.h
+++ b/drivers/net/ethernet/broadcom/tg3.h
@@ -2832,7 +2832,9 @@ struct tg3_rx_prodring_set {
#define TG3_IRQ_MAX_VECS TG3_IRQ_MAX_VECS_RSS
struct tg3_napi {
+#ifdef CONFIG_TIGON3_NAPI
struct napi_struct napi ____cacheline_aligned;
+#endif
struct tg3 *tp;
struct tg3_hw_status *hw_status;
@@ -3167,6 +3169,9 @@ struct tg3 {
struct tg3_hw_stats *hw_stats;
dma_addr_t stats_mapping;
struct work_struct reset_task;
+#ifndef CONFIG_TIGON3_NAPI
+ struct work_struct poll_link_task;
+#endif
int nvram_lock_cnt;
u32 nvram_size;
--
1.7.3.1
* Re: [PATCH net-next v1 5/6] tg3: implementation of a non-NAPI mode
2011-12-16 18:19 ` [PATCH net-next v1 5/6] tg3: implementation of a non-NAPI mode David Decotigny
@ 2011-12-16 19:30 ` Ben Hutchings
2011-12-16 19:42 ` Eric Dumazet
2011-12-16 19:50 ` Eric Dumazet
2 siblings, 0 replies; 11+ messages in thread
From: Ben Hutchings @ 2011-12-16 19:30 UTC (permalink / raw)
To: David Decotigny
Cc: Matt Carlson, Michael Chan, netdev, linux-kernel,
Javier Martinez Canillas, Robin Getz, Matt Mackall, Tom Herbert
On Fri, 2011-12-16 at 10:19 -0800, David Decotigny wrote:
> From: Tom Herbert <therbert@google.com>
>
> The tg3 NIC has a hard limit of 511 descriptors for the receive ring.
> Under heavy load of small packets, this device receive queue may not
> be serviced fast enough to prevent packet drops. This could be due
> to a variety of reasons such as lengthy processing delays of packets
> in the stack, softirqs being disabled too long, etc.
[...]
I think those are bugs to be fixed, not worked around.
Various drivers had NAPI as a compile-time option for a while, and
pretty much all of those have now been made to use NAPI unconditionally
(I think tulip is the only one left). Adding such an option back is
unlikely to be accepted now.
Ben.
--
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.
* Re: [PATCH net-next v1 5/6] tg3: implementation of a non-NAPI mode
2011-12-16 18:19 ` [PATCH net-next v1 5/6] tg3: implementation of a non-NAPI mode David Decotigny
2011-12-16 19:30 ` Ben Hutchings
@ 2011-12-16 19:42 ` Eric Dumazet
2011-12-16 19:50 ` Eric Dumazet
2 siblings, 0 replies; 11+ messages in thread
From: Eric Dumazet @ 2011-12-16 19:42 UTC (permalink / raw)
To: David Decotigny
Cc: Matt Carlson, Michael Chan, netdev, linux-kernel,
Javier Martinez Canillas, Robin Getz, Matt Mackall, Tom Herbert
On Friday, December 16, 2011 at 10:19 -0800, David Decotigny wrote:
> From: Tom Herbert <therbert@google.com>
>
...
>
>
> Signed-off-by: David Decotigny <decot@googlers.com>
> ---
If the patch author is Tom Herbert, you should use:
Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David Decotigny <decot@googlers.com>
(Same problem on other patches)
* Re: [PATCH net-next v1 5/6] tg3: implementation of a non-NAPI mode
2011-12-16 18:19 ` [PATCH net-next v1 5/6] tg3: implementation of a non-NAPI mode David Decotigny
2011-12-16 19:30 ` Ben Hutchings
2011-12-16 19:42 ` Eric Dumazet
@ 2011-12-16 19:50 ` Eric Dumazet
2 siblings, 0 replies; 11+ messages in thread
From: Eric Dumazet @ 2011-12-16 19:50 UTC (permalink / raw)
To: David Decotigny
Cc: Matt Carlson, Michael Chan, netdev, linux-kernel,
Javier Martinez Canillas, Robin Getz, Matt Mackall, Tom Herbert
On Friday, December 16, 2011 at 10:19 -0800, David Decotigny wrote:
> From: Tom Herbert <therbert@google.com>
>
> The tg3 NIC has a hard limit of 511 descriptors for the receive ring.
> Under heavy load of small packets, this device receive queue may not
> be serviced fast enough to prevent packet drops. This could be due
> to a variety of reasons such as lengthy processing delays of packets
> in the stack, softirqs being disabled too long, etc. If the driver is
> run in non-NAPI mode, the RX queue is serviced in the device
> interrupt, which is much less likely to be deferred for a substantial
> period of time.
>
> There are some effects in not using NAPI that need to be considered.
> It does increase the chance of live-lock in interrupt handler,
> although since the tg3 does interrupt coalescing this is very unlikely
> to occur. Also, more code is being run with interrupts disabled
> potentially deferring other hardware interrupts. The amount of time
> spent in the interrupt handler should be minimized by dequeuing
> packets off the device queue and queuing them to a host queue as
> quickly as possible.
>
> The default mode of operation remains NAPI and its performance is
> kept unchanged (the code is unchanged). Non-NAPI mode is enabled by
> commenting out the CONFIG_TIGON3_NAPI Kconfig parameter.
>
>
Oh well, that's ugly :(
I suspect this was only used with RPS/RFS?
Or with interrupts stuck on a given cpu?
Because with a default setup, and IRQs serviced by multiple cpus, you
end up with possible packet reorderings.
Packet1,2,3,4 handled by CPU0 : queued on netif_rx() queue.
EndOfInterrupt
Packet4,5,6,7 handled by CPU1 : queued on netif_rx() queue.
EndOfInterrupt
CPU0/CPU1 happily merge packets...
* [PATCH net-next v1 6/6] tg3: use netif_tx_start_queue instead of wake_queue when no reschedule needed
2011-12-16 18:19 [PATCH net-next v1 0/6] tg3: adaptive interrupt coalescing, non-napi mode David Decotigny
` (4 preceding siblings ...)
2011-12-16 18:19 ` [PATCH net-next v1 5/6] tg3: implementation of a non-NAPI mode David Decotigny
@ 2011-12-16 18:19 ` David Decotigny
2011-12-16 18:42 ` [PATCH net-next v1 0/6] tg3: adaptive interrupt coalescing, non-napi mode David Miller
6 siblings, 0 replies; 11+ messages in thread
From: David Decotigny @ 2011-12-16 18:19 UTC (permalink / raw)
To: Matt Carlson, Michael Chan, netdev, linux-kernel
Cc: Javier Martinez Canillas, Robin Getz, Matt Mackall, Ying Cai,
David Decotigny
From: Ying Cai <ycai@google.com>
This commit replaces netif_tx_wake_queue() with netif_tx_start_queue()
when __netif_schedule() is not needed. It also adds code to deal with
the race condition between netif_tx_start_queue() and
netif_tx_stop_queue().
Signed-off-by: David Decotigny <decot@googlers.com>
---
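The race handling follows the usual stop-queue / memory-barrier / recheck
pattern. Below is a self-contained user-space model of that pattern using
C11 atomics; the kernel code uses netif_tx_stop_queue(),
netif_tx_start_queue() and smp_mb(), and the names and threshold here are
illustrative only.

#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

#define WAKEUP_THRESH	4

static atomic_int tx_avail = 2;		/* free descriptors */
static atomic_bool queue_stopped;

/* Completion path: frees descriptors, then restarts the queue if needed. */
static void tx_complete(int freed)
{
	atomic_fetch_add(&tx_avail, freed);
	atomic_thread_fence(memory_order_seq_cst);	/* pairs with xmit barrier */
	if (atomic_load(&queue_stopped) &&
	    atomic_load(&tx_avail) > WAKEUP_THRESH)
		atomic_store(&queue_stopped, false);	/* netif_tx_start_queue */
}

/* Xmit path: the ring looks full, so stop the queue, then recheck after
 * a full barrier to close the race with a concurrent completion.
 */
static bool xmit_ring_full(void)
{
	atomic_store(&queue_stopped, true);		/* netif_tx_stop_queue */
	atomic_thread_fence(memory_order_seq_cst);	/* smp_mb() */
	if (atomic_load(&tx_avail) <= WAKEUP_THRESH)
		return true;				/* NETDEV_TX_BUSY */
	atomic_store(&queue_stopped, false);		/* netif_tx_start_queue */
	return false;
}

int main(void)
{
	printf("busy=%d\n", xmit_ring_full());	/* too few descriptors: stay stopped */
	tx_complete(8);				/* completion restarts the queue */
	printf("busy=%d\n", xmit_ring_full());	/* enough room now: keep running */
	return 0;
}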
drivers/net/ethernet/broadcom/tg3.c | 23 ++++++++++++-----------
1 files changed, 12 insertions(+), 11 deletions(-)
diff --git a/drivers/net/ethernet/broadcom/tg3.c b/drivers/net/ethernet/broadcom/tg3.c
index e3f221d..311e073 100644
--- a/drivers/net/ethernet/broadcom/tg3.c
+++ b/drivers/net/ethernet/broadcom/tg3.c
@@ -6860,9 +6860,10 @@ static int tg3_tso_bug(struct tg3 *tp, struct sk_buff *skb)
{
struct sk_buff *segs, *nskb;
u32 frag_cnt_est = skb_shinfo(skb)->gso_segs * 3;
+ struct tg3_napi *tnapi = &tp->napi[0];
/* Estimate the number of fragments in the worst case */
- if (unlikely(tg3_tx_avail(&tp->napi[0]) <= frag_cnt_est)) {
+ if (unlikely(tg3_tx_avail(tnapi) <= frag_cnt_est)) {
netif_stop_queue(tp->dev);
/* netif_tx_stop_queue() must be done before checking
@@ -6871,10 +6872,10 @@ static int tg3_tso_bug(struct tg3 *tp, struct sk_buff *skb)
* netif_tx_queue_stopped().
*/
smp_mb();
- if (tg3_tx_avail(&tp->napi[0]) <= frag_cnt_est)
+ if (tg3_tx_avail(tnapi) <= TG3_TX_WAKEUP_THRESH(tnapi))
return NETDEV_TX_BUSY;
- netif_wake_queue(tp->dev);
+ netif_start_queue(tp->dev);
}
segs = skb_gso_segment(skb, tp->dev->features & ~NETIF_F_TSO);
@@ -6926,14 +6927,14 @@ static netdev_tx_t tg3_start_xmit(struct sk_buff *skb, struct net_device *dev)
* interrupt context.
*/
if (unlikely(budget <= (skb_shinfo(skb)->nr_frags + 1))) {
- if (!netif_tx_queue_stopped(txq)) {
- netif_tx_stop_queue(txq);
+ /* This is a hard error, log it. */
+ netdev_err(dev, "BUG! Tx Ring full when queue awake!\n");
- /* This is a hard error, log it. */
- netdev_err(dev,
- "BUG! Tx Ring full when queue awake!\n");
- }
- return NETDEV_TX_BUSY;
+ netif_tx_stop_queue(txq);
+ smp_mb();
+ if (tg3_tx_avail(tnapi) <= TG3_TX_WAKEUP_THRESH(tnapi))
+ return NETDEV_TX_BUSY;
+ netif_tx_start_queue(txq);
}
entry = tnapi->tx_prod;
@@ -7100,7 +7101,7 @@ static netdev_tx_t tg3_start_xmit(struct sk_buff *skb, struct net_device *dev)
*/
smp_mb();
if (tg3_tx_avail(tnapi) > TG3_TX_WAKEUP_THRESH(tnapi))
- netif_tx_wake_queue(txq);
+ netif_tx_start_queue(txq);
}
mmiowb();
--
1.7.3.1
* Re: [PATCH net-next v1 0/6] tg3: adaptive interrupt coalescing, non-napi mode
2011-12-16 18:19 [PATCH net-next v1 0/6] tg3: adaptive interrupt coalescing, non-napi mode David Decotigny
` (5 preceding siblings ...)
2011-12-16 18:19 ` [PATCH net-next v1 6/6] tg3: use netif_tx_start_queue instead of wake_queue when no reschedule needed David Decotigny
@ 2011-12-16 18:42 ` David Miller
6 siblings, 0 replies; 11+ messages in thread
From: David Miller @ 2011-12-16 18:42 UTC (permalink / raw)
To: decot; +Cc: mcarlson, mchan, netdev, linux-kernel, martinez.javier, rgetz,
mpm
From: David Decotigny <decot@googlers.com>
Date: Fri, 16 Dec 2011 10:19:43 -0800
> This series implements adaptive interrupt coalescing for tg3 NIC,
> improving performance substancially. It also implements non-NAPI mode
> for specific system loads.
I specifically removed the dynamic IRQ coalescing from this driver years
ago.
It's too susceptible to state changes.
I am sure that you have some nice benchmark for which one scheme helps,
but more generally it is not possible to make it such that you will avoid
the case where the network flow pattern changes by the time you change
the chip configuration and thus the result is suboptimal.
I highly recommend these changes are not applied, because they will hurt
someone.