* [PATCH 2.6.28] myri10ge updates + multiqueue TX
@ 2008-09-12 17:46 Brice Goglin
2008-09-12 17:47 ` [PATCH 2.6.28 1/4] myri10ge: Stop scaring people when DCA is built but absent Brice Goglin
` (3 more replies)
0 siblings, 4 replies; 8+ messages in thread
From: Brice Goglin @ 2008-09-12 17:46 UTC
To: Jeff Garzik; +Cc: netdev
Hello Jeff,
Here's a first batch of patches for myri10ge in 2.6.28:
1) Stop scaring people when DCA is built but absent
2) Rename DCA-related firmware counters
3) Add Toeplitz-hashing related routines
4) Add multiqueue TX support
thanks,
Brice
* [PATCH 2.6.28 1/4] myri10ge: Stop scaring people when DCA is built but absent
2008-09-12 17:46 [PATCH 2.6.28] myri10ge updates + multiqueue TX Brice Goglin
@ 2008-09-12 17:47 ` Brice Goglin
2008-09-13 19:30 ` Jeff Garzik
2008-09-12 17:48 ` [PATCH 2.6.28 2/4] myri10ge: Rename DCA-related firmware counters Brice Goglin
` (2 subsequent siblings)
3 siblings, 1 reply; 8+ messages in thread
From: Brice Goglin @ 2008-09-12 17:47 UTC
To: Jeff Garzik; +Cc: netdev
Stop scaring people with what looks like a fatal message when DCA support
is compiled into their kernel, but the DCA device is not present.
Signed-off-by: Brice Goglin <brice@myri.com>
---
drivers/net/myri10ge/myri10ge.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)
Index: linux-2.6.git/drivers/net/myri10ge/myri10ge.c
===================================================================
--- linux-2.6.git.orig/drivers/net/myri10ge/myri10ge.c 2008-08-29 07:29:45.000000000 +0200
+++ linux-2.6.git/drivers/net/myri10ge/myri10ge.c 2008-09-12 19:22:37.000000000 +0200
@@ -1060,8 +1060,9 @@
}
err = dca_add_requester(&pdev->dev);
if (err) {
- dev_err(&pdev->dev,
- "dca_add_requester() failed, err=%d\n", err);
+ if (err != -ENODEV)
+ dev_err(&pdev->dev,
+ "dca_add_requester() failed, err=%d\n", err);
return;
}
mgp->dca_enabled = 1;
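
The logging rule in this patch can be looked at in isolation. What follows is a minimal userspace sketch, not driver code: `dca_failure_is_noteworthy()` and `setup_dca()` are hypothetical names invented for illustration, and the only assumption carried over from the patch is that -ENODEV means "no DCA device in this machine" and should be handled silently, while any other failure still deserves an error message.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/*
 * Hypothetical helper: returns 1 if a failed DCA registration is worth
 * an error message. -ENODEV only says that no DCA device is present,
 * which is a normal configuration, so it is not reported.
 */
static int dca_failure_is_noteworthy(int err)
{
	assert(err <= 0);	/* kernel style: 0 or a negative errno */
	return err != 0 && err != -ENODEV;
}

/*
 * Hypothetical stand-in for the setup path: takes the result of the
 * registration call and returns 1 if DCA ends up enabled.
 */
static int setup_dca(int add_requester_result)
{
	if (add_requester_result) {
		if (dca_failure_is_noteworthy(add_requester_result))
			fprintf(stderr,
				"dca_add_requester() failed, err=%d\n",
				add_requester_result);
		return 0;	/* DCA stays disabled either way */
	}
	return 1;
}
```

In both failure cases DCA stays disabled; the only difference is whether the user sees a message.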
* [PATCH 2.6.28 2/4] myri10ge: Rename DCA-related firmware counters
2008-09-12 17:46 [PATCH 2.6.28] myri10ge updates + multiqueue TX Brice Goglin
2008-09-12 17:47 ` [PATCH 2.6.28 1/4] myri10ge: Stop scaring people when DCA is built but absent Brice Goglin
@ 2008-09-12 17:48 ` Brice Goglin
2008-09-12 17:49 ` [PATCH 2.6.28 3/4] myri10ge: Add Toeplitz-hashing related routines Brice Goglin
2008-09-12 17:50 ` [PATCH 2.6.28 4/4] myri10ge: Add multiqueue TX support Brice Goglin
3 siblings, 0 replies; 8+ messages in thread
From: Brice Goglin @ 2008-09-12 17:48 UTC
To: Jeff Garzik; +Cc: netdev
Rename the cryptic "dca_capable" to "dca_capable_firmware"
and "dca_enabled" to "dca_device_present" in the firmware
counters.
Signed-off-by: Brice Goglin <brice@myri.com>
---
drivers/net/myri10ge/myri10ge.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
Index: linux-2.6.git/drivers/net/myri10ge/myri10ge.c
===================================================================
--- linux-2.6.git.orig/drivers/net/myri10ge/myri10ge.c 2008-09-12 19:27:21.000000000 +0200
+++ linux-2.6.git/drivers/net/myri10ge/myri10ge.c 2008-09-12 19:27:23.000000000 +0200
@@ -1688,7 +1688,7 @@
"read_dma_bw_MBs", "write_dma_bw_MBs", "read_write_dma_bw_MBs",
"serial_number", "watchdog_resets",
#ifdef CONFIG_DCA
- "dca_capable", "dca_enabled",
+ "dca_capable_firmware", "dca_device_present",
#endif
"link_changes", "link_up", "dropped_link_overflow",
"dropped_link_error_or_filtered",
* [PATCH 2.6.28 3/4] myri10ge: Add Toeplitz-hashing related routines
2008-09-12 17:46 [PATCH 2.6.28] myri10ge updates + multiqueue TX Brice Goglin
2008-09-12 17:47 ` [PATCH 2.6.28 1/4] myri10ge: Stop scaring people when DCA is built but absent Brice Goglin
2008-09-12 17:48 ` [PATCH 2.6.28 2/4] myri10ge: Rename DCA-related firmware counters Brice Goglin
@ 2008-09-12 17:49 ` Brice Goglin
2008-09-12 19:54 ` Ben Hutchings
2008-09-12 22:32 ` Duyck, Alexander H
2008-09-12 17:50 ` [PATCH 2.6.28 4/4] myri10ge: Add multiqueue TX support Brice Goglin
3 siblings, 2 replies; 8+ messages in thread
From: Brice Goglin @ 2008-09-12 17:49 UTC
To: Jeff Garzik; +Cc: netdev
myri10ge uses Toeplitz hashing. Add the corresponding select_queue()
method without using it yet.
Signed-off-by: Brice Goglin <brice@myri.com>
---
drivers/net/myri10ge/myri10ge.c | 165 ++++++++++++++++++++++++++++++++++++++++
1 file changed, 165 insertions(+)
Index: linux-2.6.git/drivers/net/myri10ge/myri10ge.c
===================================================================
--- linux-2.6.git.orig/drivers/net/myri10ge/myri10ge.c 2008-09-12 19:24:15.000000000 +0200
+++ linux-2.6.git/drivers/net/myri10ge/myri10ge.c 2008-09-12 19:24:42.000000000 +0200
@@ -250,6 +250,8 @@
u32 read_write_dma;
u32 link_changes;
u32 msg_enable;
+ u32 *toeplitz_hash_table;
+ u8 rss_key[32];
};
static char *myri10ge_fw_unaligned = "myri10ge_ethp_z8e.dat";
@@ -2194,6 +2196,169 @@
return 0;
}
+static int myri10ge_init_toeplitz(struct myri10ge_priv *mgp)
+{
+ struct myri10ge_cmd cmd;
+ int i, b, s, t, j;
+ int status;
+ u32 k[8];
+ u32 tmp;
+ u8 *key;
+
+ status = myri10ge_send_cmd(mgp, MXGEFW_CMD_GET_RSS_KEY_OFFSET, &cmd, 0);
+ if (status != 0) {
+ printk(KERN_ERR
+ "myri10ge: %s: failed to get rss key\n", mgp->dev->name);
+ return -EIO;
+ }
+ memcpy_fromio(mgp->rss_key, mgp->sram + cmd.data0,
+ sizeof(mgp->rss_key));
+
+ mgp->toeplitz_hash_table = kmalloc(sizeof(u32) * 12 * 256, GFP_KERNEL);
+ if (mgp->toeplitz_hash_table == NULL)
+ return -ENOMEM;
+ key = (u8 *) mgp->rss_key;
+ t = 0;
+ for (b = 0; b < 12; b++) {
+ for (s = 0; s < 8; s++) {
+ /* Bits: b*8+s, ..., b*8+s+31 */
+ k[s] = 0;
+ for (j = 0; j < 32; j++) {
+ int bit = b * 8 + s + j;
+ bit = 0x1 & (key[bit / 8] >> (7 - (bit & 0x7)));
+ k[s] |= bit << (31 - j);
+ }
+ }
+
+ for (i = 0; i <= 0xff; i++) {
+ tmp = 0;
+ if (i & (1 << 7)) {
+ tmp ^= k[0];
+ }
+ if (i & (1 << 6)) {
+ tmp ^= k[1];
+ }
+ if (i & (1 << 5)) {
+ tmp ^= k[2];
+ }
+ if (i & (1 << 4)) {
+ tmp ^= k[3];
+ }
+ if (i & (1 << 3)) {
+ tmp ^= k[4];
+ }
+ if (i & (1 << 2)) {
+ tmp ^= k[5];
+ }
+ if (i & (1 << 1)) {
+ tmp ^= k[6];
+ }
+ if (i & (1 << 0)) {
+ tmp ^= k[7];
+ }
+ mgp->toeplitz_hash_table[t++] = tmp;
+ }
+ }
+ return 0;
+}
+
+static inline u16
+myri10ge_toeplitz_select_queue(struct net_device *dev, struct iphdr *ip)
+{
+ struct myri10ge_priv *mgp = netdev_priv(dev);
+ struct tcphdr *hdr;
+ u32 saddr, daddr;
+ u32 hash;
+ u32 *table = mgp->toeplitz_hash_table;
+ u16 src, dst;
+
+ /*
+ * Note hashing order is reversed from how it is done
+ * in the NIC, so as to generate the same hash value
+ * for the connection to try to keep connections CPU local
+ */
+
+ /* hash on IPv4 src/dst address */
+ saddr = ntohl(ip->saddr);
+ daddr = ntohl(ip->daddr);
+ hash = table[(256 * 0) + ((daddr >> 24) & 0xff)];
+ hash ^= table[(256 * 1) + ((daddr >> 16) & 0xff)];
+ hash ^= table[(256 * 2) + ((daddr >> 8) & 0xff)];
+ hash ^= table[(256 * 3) + ((daddr) & 0xff)];
+ hash ^= table[(256 * 4) + ((saddr >> 24) & 0xff)];
+ hash ^= table[(256 * 5) + ((saddr >> 16) & 0xff)];
+ hash ^= table[(256 * 6) + ((saddr >> 8) & 0xff)];
+ hash ^= table[(256 * 7) + ((saddr) & 0xff)];
+ /* hash on TCP port, if required */
+ if ((myri10ge_rss_hash & MXGEFW_RSS_HASH_TYPE_TCP_IPV4) &&
+ ip->protocol == IPPROTO_TCP) {
+ hdr = (struct tcphdr *)(((u8 *) ip) + (ip->ihl << 2));
+ src = ntohs(hdr->source);
+ dst = ntohs(hdr->dest);
+
+ hash ^= table[(256 * 8) + ((dst >> 8) & 0xff)];
+ hash ^= table[(256 * 9) + ((dst) & 0xff)];
+ hash ^= table[(256 * 10) + ((src >> 8) & 0xff)];
+ hash ^= table[(256 * 11) + ((src) & 0xff)];
+ }
+ return (u16) (hash & (dev->real_num_tx_queues - 1));
+}
+
+static u16
+myri10ge_simple_select_queue(struct net_device *dev, struct iphdr *ip)
+{
+ struct udphdr *hdr;
+ u32 hash_val = 0;
+
+ if (ip->protocol != IPPROTO_TCP && ip->protocol != IPPROTO_UDP)
+ return (0);
+ hdr = (struct udphdr *)(((u8 *) ip) + (ip->ihl << 2));
+
+ /*
+ * Use the second (low-order) byte of the *destination* port for
+ * MXGEFW_RSS_HASH_TYPE_SRC_PORT, so as to match NIC's hashing
+ */
+ hash_val = ntohs(hdr->dest) & 0xff;
+ if (myri10ge_rss_hash == MXGEFW_RSS_HASH_TYPE_SRC_DST_PORT)
+ hash_val += ntohs(hdr->source) & 0xff;
+
+ return (u16) (hash_val & (dev->real_num_tx_queues - 1));
+}
+
+static u16 myri10ge_select_queue(struct net_device *dev, struct sk_buff *skb)
+{
+ struct iphdr *ip;
+ struct vlan_hdr *vh;
+
+ if (skb->protocol == __constant_htons(ETH_P_IP)) {
+ ip = ip_hdr(skb);
+ } else if (skb->protocol == __constant_htons(ETH_P_8021Q)) {
+ vh = (struct vlan_hdr *)skb->data;
+ if ((vh->h_vlan_encapsulated_proto !=
+ __constant_htons(ETH_P_IP)))
+ return 0;
+ ip = (struct iphdr *)(skb->data + sizeof(*vh));
+ } else {
+ return 0;
+ }
+
+ switch (myri10ge_rss_hash) {
+ case MXGEFW_RSS_HASH_TYPE_IPV4:
+ /* fallthru */
+ case MXGEFW_RSS_HASH_TYPE_TCP_IPV4:
+ /* fallthru */
+ case (MXGEFW_RSS_HASH_TYPE_IPV4 | MXGEFW_RSS_HASH_TYPE_TCP_IPV4):
+ return (myri10ge_toeplitz_select_queue(dev, ip));
+ break;
+ case MXGEFW_RSS_HASH_TYPE_SRC_PORT:
+ /* fallthru */
+ case MXGEFW_RSS_HASH_TYPE_SRC_DST_PORT:
+ return (myri10ge_simple_select_queue(dev, ip));
+ default:
+ return (0);
+ }
+}
+
static int myri10ge_get_txrx(struct myri10ge_priv *mgp, int slice)
{
struct myri10ge_cmd cmd;
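
The table built by myri10ge_init_toeplitz() can be checked against a plain bit-by-bit Toeplitz hash. The sketch below is a userspace illustration under stated assumptions, not the driver code: all function names are hypothetical, the key is whatever the caller passes in (at least 16 bytes are read for a 12-byte input), and the input is simply 12 bytes in the order the caller chooses. It shows why one lookup per input byte gives the same result as testing every input bit against a sliding 32-bit window of the key.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOEPLITZ_INPUT_BYTES 12	/* 2 addresses (4 bytes) + 2 ports (2 bytes) */

/* Return the 32 key bits starting at bit offset `start`, MSB first. */
static uint32_t key_window(const uint8_t *key, int start)
{
	uint32_t w = 0;

	for (int j = 0; j < 32; j++) {
		int bit = start + j;
		uint32_t v = (key[bit / 8] >> (7 - (bit & 7))) & 1u;

		w |= v << (31 - j);
	}
	return w;
}

/* Reference version: XOR in one key window per set input bit. */
static uint32_t toeplitz_bitwise(const uint8_t *key, const uint8_t *in,
				 size_t len)
{
	uint32_t hash = 0;

	for (size_t b = 0; b < len; b++)
		for (int s = 0; s < 8; s++)
			if (in[b] & (0x80u >> s))
				hash ^= key_window(key, (int)b * 8 + s);
	return hash;
}

/*
 * Precompute table[b * 256 + v]: the XOR of the key windows selected by
 * the set bits of value v at input byte position b. The table holds
 * TOEPLITZ_INPUT_BYTES * 256 entries.
 */
static void toeplitz_build_table(const uint8_t *key, uint32_t *table)
{
	for (int b = 0; b < TOEPLITZ_INPUT_BYTES; b++) {
		uint32_t k[8];

		for (int s = 0; s < 8; s++)
			k[s] = key_window(key, b * 8 + s);
		for (int v = 0; v < 256; v++) {
			uint32_t tmp = 0;

			for (int s = 0; s < 8; s++)
				if (v & (0x80 >> s))
					tmp ^= k[s];
			table[b * 256 + v] = tmp;
		}
	}
}

/* Table-driven version: one lookup and one XOR per input byte. */
static uint32_t toeplitz_lookup(const uint32_t *table, const uint8_t *in)
{
	uint32_t hash = 0;

	assert(table != NULL && in != NULL);
	for (int b = 0; b < TOEPLITZ_INPUT_BYTES; b++)
		hash ^= table[b * 256 + in[b]];
	return hash;
}
```

The table trades 12 KB of memory (12 * 256 32-bit entries) for replacing up to 96 conditional XORs per packet with 12 lookups. Note also that reducing the hash with `hash & (n - 1)`, as the patch does, only spreads flows over all n queues when n is a power of two.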
* [PATCH 2.6.28 4/4] myri10ge: Add multiqueue TX support
2008-09-12 17:46 [PATCH 2.6.28] myri10ge updates + multiqueue TX Brice Goglin
` (2 preceding siblings ...)
2008-09-12 17:49 ` [PATCH 2.6.28 3/4] myri10ge: Add Toeplitz-hashing related routines Brice Goglin
@ 2008-09-12 17:50 ` Brice Goglin
3 siblings, 0 replies; 8+ messages in thread
From: Brice Goglin @ 2008-09-12 17:50 UTC
To: Jeff Garzik; +Cc: netdev
Add multiqueue TX support to myri10ge, using Toeplitz hashing.
Signed-off-by: Brice Goglin <brice@myri.com>
---
drivers/net/myri10ge/myri10ge.c | 185 ++++++++++++++++++++++++++++++++--------
1 file changed, 149 insertions(+), 36 deletions(-)
Index: linux-2.6.git/drivers/net/myri10ge/myri10ge.c
===================================================================
--- linux-2.6.git.orig/drivers/net/myri10ge/myri10ge.c 2008-09-12 19:24:42.000000000 +0200
+++ linux-2.6.git/drivers/net/myri10ge/myri10ge.c 2008-09-12 19:24:59.000000000 +0200
@@ -102,6 +102,9 @@
#define MYRI10GE_ALLOC_SIZE ((1 << MYRI10GE_ALLOC_ORDER) * PAGE_SIZE)
#define MYRI10GE_MAX_FRAGS_PER_FRAME (MYRI10GE_MAX_ETHER_MTU/MYRI10GE_ALLOC_SIZE + 1)
+#define MYRI10GE_MAX_SLICES 32
+#define MYRI10GE_TOEPLITZ_HASH (MXGEFW_RSS_HASH_TYPE_TCP_IPV4|MXGEFW_RSS_HASH_TYPE_IPV4)
+
struct myri10ge_rx_buffer_state {
struct page *page;
int page_offset;
@@ -138,6 +141,8 @@
struct myri10ge_tx_buf {
struct mcp_kreq_ether_send __iomem *lanai; /* lanai ptr for sendq */
+ __be32 __iomem *send_go; /* "go" doorbell ptr */
+ __be32 __iomem *send_stop; /* "stop" doorbell ptr */
struct mcp_kreq_ether_send *req_list; /* host shadow of sendq */
char *req_bytes;
struct myri10ge_tx_buffer_state *info;
@@ -149,6 +154,7 @@
int done ____cacheline_aligned; /* transmit slots completed */
int pkt_done; /* packets completed */
int wake_queue;
+ int queue_active;
};
struct myri10ge_rx_done {
@@ -420,6 +426,12 @@
return -ENOSYS;
} else if (result == MXGEFW_CMD_ERROR_UNALIGNED) {
return -E2BIG;
+ } else if (result == MXGEFW_CMD_ERROR_RANGE &&
+ cmd == MXGEFW_CMD_ENABLE_RSS_QUEUES &&
+ (data->
+ data1 & MXGEFW_SLICE_ENABLE_MULTIPLE_TX_QUEUES) !=
+ 0) {
+ return -ERANGE;
} else {
dev_err(&mgp->pdev->dev,
"command %d failed, result = %d\n",
@@ -949,9 +961,24 @@
*/
cmd.data0 = mgp->num_slices;
- cmd.data1 = 1; /* use MSI-X */
+ cmd.data1 = MXGEFW_SLICE_INTR_MODE_ONE_PER_SLICE;
+ if (mgp->dev->real_num_tx_queues > 1)
+ cmd.data1 |= MXGEFW_SLICE_ENABLE_MULTIPLE_TX_QUEUES;
status = myri10ge_send_cmd(mgp, MXGEFW_CMD_ENABLE_RSS_QUEUES,
&cmd, 0);
+
+ /* Firmware older than 1.4.32 only supports multiple
+ * RX queues, so if we get an error, first retry using a
+ * single TX queue before giving up */
+ if (status != 0 && mgp->dev->real_num_tx_queues > 1) {
+ mgp->dev->real_num_tx_queues = 1;
+ cmd.data0 = mgp->num_slices;
+ cmd.data1 = MXGEFW_SLICE_INTR_MODE_ONE_PER_SLICE;
+ status = myri10ge_send_cmd(mgp,
+ MXGEFW_CMD_ENABLE_RSS_QUEUES,
+ &cmd, 0);
+ }
+
if (status != 0) {
dev_err(&mgp->pdev->dev,
"failed to set number of slices\n");
@@ -1319,6 +1346,7 @@
{
struct pci_dev *pdev = ss->mgp->pdev;
struct myri10ge_tx_buf *tx = &ss->tx;
+ struct netdev_queue *dev_queue;
struct sk_buff *skb;
int idx, len;
@@ -1352,11 +1380,31 @@
PCI_DMA_TODEVICE);
}
}
+
+ dev_queue = netdev_get_tx_queue(ss->dev, ss - ss->mgp->ss);
+ /*
+ * Make a minimal effort to prevent the NIC from polling an
+ * idle tx queue. If we can't get the lock we leave the queue
+ * active. In this case, either a thread was about to start
+ * using the queue anyway, or we lost a race and the NIC will
+ * waste some of its resources polling an inactive queue for a
+ * while.
+ */
+
+ if ((ss->mgp->dev->real_num_tx_queues > 1) &&
+ __netif_tx_trylock(dev_queue)) {
+ if (tx->req == tx->done) {
+ tx->queue_active = 0;
+ put_be32(htonl(1), tx->send_stop);
+ }
+ __netif_tx_unlock(dev_queue);
+ }
+
/* start the queue if we've stopped it */
- if (netif_queue_stopped(ss->dev)
+ if (netif_tx_queue_stopped(dev_queue)
&& tx->req - tx->done < (tx->mask >> 1)) {
tx->wake_queue++;
- netif_wake_queue(ss->dev);
+ netif_tx_wake_queue(dev_queue);
}
}
@@ -1484,9 +1532,9 @@
u32 send_done_count;
int i;
- /* an interrupt on a non-zero slice is implicitly valid
- * since MSI-X irqs are not shared */
- if (ss != mgp->ss) {
+ /* an interrupt on a non-zero receive-only slice is implicitly
+ * valid since MSI-X irqs are not shared */
+ if ((mgp->dev->real_num_tx_queues == 1) && (ss != mgp->ss)) {
netif_rx_schedule(ss->dev, &ss->napi);
return (IRQ_HANDLED);
}
@@ -1528,7 +1576,9 @@
barrier();
}
- myri10ge_check_statblock(mgp);
+ /* Only slice 0 updates stats */
+ if (ss == mgp->ss)
+ myri10ge_check_statblock(mgp);
put_be32(htonl(3), ss->irq_claim + 1);
return (IRQ_HANDLED);
@@ -1886,6 +1936,7 @@
/* ensure req_list entries are aligned to 8 bytes */
ss->tx.req_list = (struct mcp_kreq_ether_send *)
ALIGN((unsigned long)ss->tx.req_bytes, 8);
+ ss->tx.queue_active = 0;
bytes = rx_ring_entries * sizeof(*ss->rx_small.shadow);
ss->rx_small.shadow = kzalloc(bytes, GFP_KERNEL);
@@ -2366,11 +2417,14 @@
int status;
ss = &mgp->ss[slice];
- cmd.data0 = 0; /* single slice for now */
- status = myri10ge_send_cmd(mgp, MXGEFW_CMD_GET_SEND_OFFSET, &cmd, 0);
- ss->tx.lanai = (struct mcp_kreq_ether_send __iomem *)
- (mgp->sram + cmd.data0);
-
+ status = 0;
+ if (slice == 0 || (mgp->dev->real_num_tx_queues > 1)) {
+ cmd.data0 = slice;
+ status = myri10ge_send_cmd(mgp, MXGEFW_CMD_GET_SEND_OFFSET,
+ &cmd, 0);
+ ss->tx.lanai = (struct mcp_kreq_ether_send __iomem *)
+ (mgp->sram + cmd.data0);
+ }
cmd.data0 = slice;
status |= myri10ge_send_cmd(mgp, MXGEFW_CMD_GET_SMALL_RX_OFFSET,
&cmd, 0);
@@ -2382,6 +2436,10 @@
ss->rx_big.lanai = (struct mcp_kreq_ether_recv __iomem *)
(mgp->sram + cmd.data0);
+ ss->tx.send_go = (__iomem __be32 *)
+ (mgp->sram + MXGEFW_ETH_SEND_GO + 64 * slice);
+ ss->tx.send_stop = (__iomem __be32 *)
+ (mgp->sram + MXGEFW_ETH_SEND_STOP + 64 * slice);
return status;
}
@@ -2395,7 +2453,7 @@
ss = &mgp->ss[slice];
cmd.data0 = MYRI10GE_LOWPART_TO_U32(ss->fw_stats_bus);
cmd.data1 = MYRI10GE_HIGHPART_TO_U32(ss->fw_stats_bus);
- cmd.data2 = sizeof(struct mcp_irq_data);
+ cmd.data2 = sizeof(struct mcp_irq_data) | (slice << 16);
status = myri10ge_send_cmd(mgp, MXGEFW_CMD_SET_STATS_DMA_V2, &cmd, 0);
if (status == -ENOSYS) {
dma_addr_t bus = ss->fw_stats_bus;
@@ -2436,7 +2494,9 @@
if (mgp->num_slices > 1) {
cmd.data0 = mgp->num_slices;
- cmd.data1 = 1; /* use MSI-X */
+ cmd.data1 = MXGEFW_SLICE_INTR_MODE_ONE_PER_SLICE;
+ if (mgp->dev->real_num_tx_queues > 1)
+ cmd.data1 |= MXGEFW_SLICE_ENABLE_MULTIPLE_TX_QUEUES;
status = myri10ge_send_cmd(mgp, MXGEFW_CMD_ENABLE_RSS_QUEUES,
&cmd, 0);
if (status != 0) {
@@ -2457,6 +2517,7 @@
printk(KERN_ERR
"myri10ge: %s: failed to setup rss tables\n",
dev->name);
+ goto abort_with_nothing;
}
/* just enable an identity mapping */
@@ -2464,6 +2525,20 @@
for (i = 0; i < mgp->num_slices; i++)
__raw_writeb(i, &itable[i]);
+ if (mgp->dev->real_num_tx_queues > 1) {
+ if (myri10ge_rss_hash & MYRI10GE_TOEPLITZ_HASH) {
+ /* grab the rss key for use in hashing transmits */
+ status = myri10ge_init_toeplitz(mgp);
+ if (status != 0) {
+ printk(KERN_ERR
+ "myri10ge: %s: failed to init toeplitz table\n",
+ dev->name);
+ goto abort_with_nothing;
+ }
+ }
+ mgp->dev->select_queue = myri10ge_select_queue;
+ }
+
cmd.data0 = 1;
cmd.data1 = myri10ge_rss_hash;
status = myri10ge_send_cmd(mgp, MXGEFW_CMD_SET_RSS_ENABLE,
@@ -2472,7 +2547,7 @@
printk(KERN_ERR
"myri10ge: %s: failed to enable slices\n",
dev->name);
- goto abort_with_nothing;
+ goto abort_with_toeplitz;
}
}
@@ -2527,7 +2602,11 @@
status = myri10ge_allocate_rings(ss);
if (status != 0)
goto abort_with_rings;
- if (slice == 0)
+
+ /* only firmware which supports multiple TX queues
+ * supports setting up the tx stats on non-zero
+ * slices */
+ if (slice == 0 || mgp->dev->real_num_tx_queues > 1)
status = myri10ge_set_stats(mgp, slice);
if (status) {
printk(KERN_ERR
@@ -2593,7 +2672,8 @@
mgp->running = MYRI10GE_ETH_RUNNING;
mgp->watchdog_timer.expires = jiffies + myri10ge_watchdog_timeout * HZ;
add_timer(&mgp->watchdog_timer);
- netif_wake_queue(dev);
+ netif_tx_wake_all_queues(dev);
+
return 0;
abort_with_rings:
@@ -2602,6 +2682,11 @@
myri10ge_free_irq(mgp);
+abort_with_toeplitz:
+ if (mgp->toeplitz_hash_table != NULL) {
+ kfree(mgp->toeplitz_hash_table);
+ mgp->toeplitz_hash_table = NULL;
+ }
abort_with_nothing:
mgp->running = MYRI10GE_ETH_STOPPED;
return -ENOMEM;
@@ -2620,13 +2705,15 @@
if (mgp->ss[0].tx.req_bytes == NULL)
return 0;
+ dev->select_queue = NULL;
del_timer_sync(&mgp->watchdog_timer);
mgp->running = MYRI10GE_ETH_STOPPING;
for (i = 0; i < mgp->num_slices; i++) {
napi_disable(&mgp->ss[i].napi);
}
netif_carrier_off(dev);
- netif_stop_queue(dev);
+
+ netif_tx_stop_all_queues(dev);
old_down_cnt = mgp->down_cnt;
mb();
status = myri10ge_send_cmd(mgp, MXGEFW_CMD_ETHERNET_DOWN, &cmd, 0);
@@ -2643,6 +2730,11 @@
for (i = 0; i < mgp->num_slices; i++)
myri10ge_free_rings(&mgp->ss[i]);
+ if (mgp->toeplitz_hash_table != NULL) {
+ kfree(mgp->toeplitz_hash_table);
+ mgp->toeplitz_hash_table = NULL;
+ }
+
mgp->running = MYRI10GE_ETH_STOPPED;
return 0;
}
@@ -2731,18 +2823,23 @@
struct mcp_kreq_ether_send *req;
struct myri10ge_tx_buf *tx;
struct skb_frag_struct *frag;
+ struct netdev_queue *netdev_queue;
dma_addr_t bus;
u32 low;
__be32 high_swapped;
unsigned int len;
int idx, last_idx, avail, frag_cnt, frag_idx, count, mss, max_segments;
- u16 pseudo_hdr_offset, cksum_offset;
+ u16 pseudo_hdr_offset, cksum_offset, queue;
int cum_len, seglen, boundary, rdma_count;
u8 flags, odd_flag;
- /* always transmit through slot 0 */
- ss = mgp->ss;
+ queue = skb_get_queue_mapping(skb);
+ queue &= (mgp->num_slices - 1);
+
+ ss = &mgp->ss[queue];
+ netdev_queue = netdev_get_tx_queue(mgp->dev, queue);
tx = &ss->tx;
+
again:
req = tx->req_list;
avail = tx->mask - 1 - (tx->req - tx->done);
@@ -2758,7 +2855,7 @@
if ((unlikely(avail < max_segments))) {
/* we are out of transmit resources */
tx->stop_queue++;
- netif_stop_queue(dev);
+ netif_tx_stop_queue(netdev_queue);
return 1;
}
@@ -2951,10 +3048,16 @@
idx = ((count - 1) + tx->req) & tx->mask;
tx->info[idx].last = 1;
myri10ge_submit_req(tx, tx->req_list, count);
+ /* if using multiple tx queues, make sure NIC polls the
+ * current slice */
+ if ((mgp->dev->real_num_tx_queues > 1) && tx->queue_active == 0) {
+ tx->queue_active = 1;
+ put_be32(htonl(1), tx->send_go);
+ }
tx->pkt_start++;
if ((avail - count) < MXGEFW_MAX_SEND_DESC) {
tx->stop_queue++;
- netif_stop_queue(dev);
+ netif_tx_stop_queue(netdev_queue);
}
dev->trans_start = jiffies;
return 0;
@@ -3532,20 +3635,21 @@
for (i = 0; i < mgp->num_slices; i++) {
tx = &mgp->ss[i].tx;
printk(KERN_INFO
- "myri10ge: %s: (%d): %d %d %d %d %d\n",
- mgp->dev->name, i, tx->req, tx->done,
- tx->pkt_start, tx->pkt_done,
+ "myri10ge: %s: (%d): %d %d %d %d %d %d\n",
+ mgp->dev->name, i, tx->queue_active, tx->req,
+ tx->done, tx->pkt_start, tx->pkt_done,
(int)ntohl(mgp->ss[i].fw_stats->
send_done_count));
msleep(2000);
printk(KERN_INFO
- "myri10ge: %s: (%d): %d %d %d %d %d\n",
- mgp->dev->name, i, tx->req, tx->done,
- tx->pkt_start, tx->pkt_done,
+ "myri10ge: %s: (%d): %d %d %d %d %d %d\n",
+ mgp->dev->name, i, tx->queue_active, tx->req,
+ tx->done, tx->pkt_start, tx->pkt_done,
(int)ntohl(mgp->ss[i].fw_stats->
send_done_count));
}
}
+
rtnl_lock();
myri10ge_close(mgp->dev);
status = myri10ge_load_firmware(mgp, 1);
@@ -3600,10 +3704,14 @@
/* nic seems like it might be stuck.. */
if (rx_pause_cnt != mgp->watchdog_pause) {
if (net_ratelimit())
- printk(KERN_WARNING "myri10ge %s:"
+ printk(KERN_WARNING
+ "myri10ge %s slice %d: "
"TX paused, check link partner\n",
- mgp->dev->name);
+ mgp->dev->name, i);
} else {
+ printk(KERN_WARNING
+ "myri10ge %s slice %d stuck\n",
+ mgp->dev->name, i);
reset_needed = 1;
}
}
@@ -3789,6 +3897,9 @@
mgp->num_slices);
if (status == 0) {
pci_disable_msix(pdev);
+#ifdef CONFIG_NETDEVICES_MULTIQUEUE
+ mgp->features |= NETIF_F_MULTI_QUEUE;
+#endif
return;
}
if (status > 0)
@@ -3818,7 +3929,7 @@
int status = -ENXIO;
int dac_enabled;
- netdev = alloc_etherdev(sizeof(*mgp));
+ netdev = alloc_etherdev_mq(sizeof(*mgp), MYRI10GE_MAX_SLICES);
if (netdev == NULL) {
dev_err(dev, "Could not allocate ethernet device\n");
return -ENOMEM;
@@ -3923,7 +4034,7 @@
dev_err(&pdev->dev, "failed to alloc slice state\n");
goto abort_with_firmware;
}
-
+ netdev->real_num_tx_queues = mgp->num_slices;
status = myri10ge_reset(mgp);
if (status != 0) {
dev_err(&pdev->dev, "failed reset\n");
@@ -3947,6 +4058,7 @@
netdev->set_multicast_list = myri10ge_set_multicast_list;
netdev->set_mac_address = myri10ge_set_mac_address;
netdev->features = mgp->features;
+
if (dac_enabled)
netdev->features |= NETIF_F_HIGHDMA;
@@ -4102,8 +4214,7 @@
printk(KERN_INFO "%s: Version %s\n", myri10ge_driver.name,
MYRI10GE_VERSION_STR);
- if (myri10ge_rss_hash > MXGEFW_RSS_HASH_TYPE_SRC_PORT ||
- myri10ge_rss_hash < MXGEFW_RSS_HASH_TYPE_IPV4) {
+ if (myri10ge_rss_hash > MXGEFW_RSS_HASH_TYPE_MAX) {
printk(KERN_ERR
"%s: Illegal rssh hash type %d, defaulting to source port\n",
myri10ge_driver.name, myri10ge_rss_hash);
@@ -4112,6 +4223,8 @@
#ifdef CONFIG_DCA
dca_register_notify(&myri10ge_dca_notifier);
#endif
+ if (myri10ge_max_slices > MYRI10GE_MAX_SLICES)
+ myri10ge_max_slices = MYRI10GE_MAX_SLICES;
return pci_register_driver(&myri10ge_driver);
}
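
The "go"/"stop" doorbell handling added by this patch can be modelled as a small state machine. The following is a simplified userspace sketch, not the driver: `struct tx_ring_sim`, `sim_xmit()` and `sim_tx_done()` are hypothetical names, the doorbell writes are replaced by counters, and the `got_lock` argument stands in for the result of `__netif_tx_trylock()`. It only illustrates the rule that the transmit path rings "go" when it finds the queue inactive, and the completion path rings "stop" when the ring has drained and the lock was obtained.

```c
#include <assert.h>

/* Simplified model of one TX ring; doorbell writes are just counted. */
struct tx_ring_sim {
	int req;		/* descriptors submitted */
	int done;		/* descriptors completed */
	int queue_active;	/* 1 while the NIC is told to poll this ring */
	int go_writes;		/* writes to the "go" doorbell */
	int stop_writes;	/* writes to the "stop" doorbell */
};

/* Transmit path: submit `count` descriptors, wake the NIC if needed. */
static void sim_xmit(struct tx_ring_sim *tx, int count)
{
	assert(count > 0);
	tx->req += count;
	if (!tx->queue_active) {
		tx->queue_active = 1;
		tx->go_writes++;
	}
}

/*
 * Completion path: retire `completed` descriptors. The ring is marked
 * idle only if it is fully drained AND the TX lock could be taken
 * (got_lock != 0). Losing the trylock leaves the queue active, which
 * costs the NIC some polling but is otherwise harmless.
 */
static void sim_tx_done(struct tx_ring_sim *tx, int completed, int got_lock)
{
	assert(completed >= 0 && tx->done + completed <= tx->req);
	tx->done += completed;
	if (got_lock && tx->req == tx->done) {
		tx->queue_active = 0;
		tx->stop_writes++;
	}
}
```

Taking the lock before writing "stop" matters because the transmit path tests and sets queue_active under the same lock; without it, a "stop" could land after a concurrent "go" and leave submitted descriptors unpolled.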
* Re: [PATCH 2.6.28 3/4] myri10ge: Add Toeplitz-hashing related routines
2008-09-12 17:49 ` [PATCH 2.6.28 3/4] myri10ge: Add Toeplitz-hashing related routines Brice Goglin
@ 2008-09-12 19:54 ` Ben Hutchings
2008-09-12 22:32 ` Duyck, Alexander H
1 sibling, 0 replies; 8+ messages in thread
From: Ben Hutchings @ 2008-09-12 19:54 UTC
To: Brice Goglin; +Cc: Jeff Garzik, netdev
On Fri, 2008-09-12 at 19:49 +0200, Brice Goglin wrote:
> myri10ge uses Toeplitz hashing. Add the corresponding select_queue()
> method without using it yet.
[...]
Since Microsoft has pushed everyone to implement the Toeplitz hash, this
probably ought to go in the networking core and be exported to drivers.
Ben.
--
Ben Hutchings, Senior Software Engineer, Solarflare Communications
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.
* RE: [PATCH 2.6.28 3/4] myri10ge: Add Toeplitz-hashing related routines
2008-09-12 17:49 ` [PATCH 2.6.28 3/4] myri10ge: Add Toeplitz-hashing related routines Brice Goglin
2008-09-12 19:54 ` Ben Hutchings
@ 2008-09-12 22:32 ` Duyck, Alexander H
1 sibling, 0 replies; 8+ messages in thread
From: Duyck, Alexander H @ 2008-09-12 22:32 UTC
To: Brice Goglin, Jeff Garzik; +Cc: netdev@vger.kernel.org
Brice Goglin wrote:
> myri10ge uses Toeplitz hashing. Add the corresponding select_queue()
> method without using it yet.
>
> Signed-off-by: Brice Goglin <brice@myri.com>
> ---
...
> + /*
> + * Note hashing order is reversed from how it is done
> + * in the NIC, so as to generate the same hash value
> + * for the connection to try to keep connections CPU local
> + */
Have you given this any consideration in terms of routing performance? It
seems to me that if you didn't reorder things you should be able to generate
the same exact hash value that you would have received on the receive side.
The end result being that packets received on queue 1 would transmit on queue
1, thus saving you a lot of cross queue thrash when routing. Of course this
all relies on the rss_key being global and shared between ports.
Alex
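
The trade-off between the two field orders can be made concrete with a small userspace sketch. It rests on assumptions that the thread does not confirm: the NIC is taken to hash received packets in the conventional RSS order (source address, destination address, source port, destination port), the key is arbitrary, and all names (`struct flow`, `rx_hash()`, `tx_hash_reversed()`) are hypothetical. With the reversed order, a locally generated reply hashes to the same value as the packets received on that connection. With the unreversed order, a forwarded packet, whose header fields are unchanged, hashes on transmit exactly as it did on receive.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Header fields of one packet, all in network byte order. */
struct flow {
	uint8_t saddr[4], daddr[4];
	uint8_t sport[2], dport[2];
};

/* Bit-by-bit Toeplitz hash; reads len + 4 bytes of key. */
static uint32_t toeplitz(const uint8_t *key, const uint8_t *in, size_t len)
{
	uint32_t window = ((uint32_t)key[0] << 24) | ((uint32_t)key[1] << 16) |
			  ((uint32_t)key[2] << 8) | (uint32_t)key[3];
	uint32_t hash = 0;

	for (size_t i = 0; i < len; i++) {
		for (int s = 0; s < 8; s++) {
			if (in[i] & (0x80u >> s))
				hash ^= window;
			/* slide the 32-bit window one key bit to the right */
			window <<= 1;
			if (key[i + 4] & (0x80u >> s))
				window |= 1u;
		}
	}
	return hash;
}

/* Assumed receive-side order: saddr, daddr, sport, dport. */
static uint32_t rx_hash(const uint8_t *key, const struct flow *p)
{
	uint8_t in[12];

	memcpy(in, p->saddr, 4);
	memcpy(in + 4, p->daddr, 4);
	memcpy(in + 8, p->sport, 2);
	memcpy(in + 10, p->dport, 2);
	return toeplitz(key, in, sizeof(in));
}

/* Transmit-side order used in the patch: daddr, saddr, dport, sport. */
static uint32_t tx_hash_reversed(const uint8_t *key, const struct flow *p)
{
	uint8_t in[12];

	memcpy(in, p->daddr, 4);
	memcpy(in + 4, p->saddr, 4);
	memcpy(in + 8, p->dport, 2);
	memcpy(in + 10, p->sport, 2);
	return toeplitz(key, in, sizeof(in));
}
```

For a host that terminates connections, `tx_hash_reversed()` on an outgoing packet equals `rx_hash()` on the incoming packets of the same connection. For a router, it is `rx_hash()` itself, applied on transmit, that keeps a forwarded packet on the queue it arrived on, provided both ports share the same key.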
* Re: [PATCH 2.6.28 1/4] myri10ge: Stop scaring people when DCA is built but absent
2008-09-12 17:47 ` [PATCH 2.6.28 1/4] myri10ge: Stop scaring people when DCA is built but absent Brice Goglin
@ 2008-09-13 19:30 ` Jeff Garzik
0 siblings, 0 replies; 8+ messages in thread
From: Jeff Garzik @ 2008-09-13 19:30 UTC
To: Brice Goglin; +Cc: netdev
Brice Goglin wrote:
> Stop scaring people with what looks like a fatal message when DCA support
> is compiled into their kernel, but the DCA device is not present.
>
> Signed-off-by: Brice Goglin <brice@myri.com>
> ---
> drivers/net/myri10ge/myri10ge.c | 7 ++++---
> 1 file changed, 4 insertions(+), 3 deletions(-)
>
> Index: linux-2.6.git/drivers/net/myri10ge/myri10ge.c
> ===================================================================
> --- linux-2.6.git.orig/drivers/net/myri10ge/myri10ge.c 2008-08-29 07:29:45.000000000 +0200
> +++ linux-2.6.git/drivers/net/myri10ge/myri10ge.c 2008-09-12 19:22:37.000000000 +0200
> @@ -1060,8 +1060,9 @@
> }
> err = dca_add_requester(&pdev->dev);
> if (err) {
> - dev_err(&pdev->dev,
> - "dca_add_requester() failed, err=%d\n", err);
> + if (err != -ENODEV)
> + dev_err(&pdev->dev,
> + "dca_add_requester() failed, err=%d\n", err);
> return;
applied 1-4