* NAPI for eepro100
@ 2002-06-12 12:39 Zhang Fuxin
2002-06-12 23:00 ` Jeff Garzik
0 siblings, 1 reply; 10+ messages in thread
From: Zhang Fuxin @ 2002-06-12 12:39 UTC (permalink / raw)
To: linux-mips, saw, linux-kernel
[-- Attachment #1: Type: text/plain, Size: 1090 bytes --]
hi,all
Recently i've converted eepro100 driver to use napi,in order to improve
network performance of my poor 150M mips machine. It does eliminate
the interrupt live lock seen before,maintaining a peak throughput under
heavy load.
In case anybody are interested,i post the patches to the list. They are
3 incremental patchs:
eepro100-napi.patch is against 2.5.20 eepro100.c and provide basic
napi support
eepro100-proc.patch is proc file system support adapted from intel's
e100 driver. I am using it for debugging.
eepro100-mips.patch is mips specific patch to make it work(well) for
my mips
platform.
(i suppose people use: patch eepro100.c < x.patch to apply patch)
Tests are mainly done on my mips machine for 2.4 kernel,though i think
it should work for at least x86 on which minimal test is performed. Be
careful.
A little pitty is that to achieve good performance under heavy load, the
/proc/sys/net/core/netdev_max_backlog value may need to be adjusted.
Feedbacks are always welcome:). But i am not on linux-kernel list,so
people
there please CC me.
[-- Attachment #2: eepro100-mips.patch --]
[-- Type: text/plain, Size: 3463 bytes --]
--- eepro100-napi-proc.c Wed Jun 12 17:33:51 2002
+++ eepro100-mips.c Wed Jun 12 19:22:29 2002
@@ -45,7 +45,7 @@
/* Set the copy breakpoint for the copy-only-tiny-buffer Rx method.
Lower values use more memory, but are faster. */
-#if defined(__alpha__) || defined(__sparc__) || defined(__mips__) || \
+#if defined(__alpha__) || defined(__sparc__) || /*defined(__mips__) ||*/ \
defined(__arm__)
static int rx_copybreak = 1518;
#else
@@ -66,8 +66,8 @@
/* A few values that may be tweaked. */
/* The ring sizes should be a power of two for efficiency. */
-#define TX_RING_SIZE 64
-#define RX_RING_SIZE 64
+#define TX_RING_SIZE 32
+#define RX_RING_SIZE 32
/* How much slots multicast filter setup may take.
Do not descrease without changing set_rx_mode() implementaion. */
#define TX_MULTICAST_SIZE 2
@@ -1298,7 +1298,14 @@
if (skb == NULL)
break; /* OK. Just initially short of Rx bufs. */
skb->dev = dev; /* Mark as being used by this device. */
+#ifndef __mips__
rxf = (struct RxFD *)skb->tail;
+#else
+ /* use uncached address,use pci_dma_sync_xx seems
+ * problematic in my mips platform
+ */
+ rxf = (struct RxFD *)(KSEG1ADDR(skb->tail));
+#endif
sp->rx_ringp[i] = rxf;
sp->rx_ring_dma[i] =
pci_map_single(sp->pdev, rxf,
@@ -1306,8 +1313,10 @@
skb_reserve(skb, sizeof(struct RxFD));
if (last_rxf) {
last_rxf->link = cpu_to_le32(sp->rx_ring_dma[i]);
+#ifndef __mips__
pci_dma_sync_single(sp->pdev, last_rxf_dma,
sizeof(struct RxFD), PCI_DMA_TODEVICE);
+#endif
}
last_rxf = rxf;
last_rxf_dma = sp->rx_ring_dma[i];
@@ -1316,14 +1325,18 @@
/* This field unused by i82557. */
rxf->rx_buf_addr = 0xffffffff;
rxf->count = cpu_to_le32(PKT_BUF_SZ << 16);
+#ifndef __mips__
pci_dma_sync_single(sp->pdev, sp->rx_ring_dma[i],
sizeof(struct RxFD), PCI_DMA_TODEVICE);
+#endif
}
sp->dirty_rx = (unsigned int)(i - RX_RING_SIZE);
/* Mark the last entry as end-of-list. */
last_rxf->status = cpu_to_le32(0xC0000002); /* '2' is flag value only. */
+#ifndef __mips__
pci_dma_sync_single(sp->pdev, sp->rx_ring_dma[RX_RING_SIZE-1],
sizeof(struct RxFD), PCI_DMA_TODEVICE);
+#endif
sp->last_rxf = last_rxf;
sp->last_rxf_dma = last_rxf_dma;
}
@@ -1733,15 +1746,21 @@
#endif
return NULL;
}
+#ifndef __mips__
rxf = sp->rx_ringp[entry] = (struct RxFD *)skb->tail;
+#else
+ rxf = sp->rx_ringp[entry] = (struct RxFD *)(KSEG1ADDR(skb->tail));
+#endif
sp->rx_ring_dma[entry] =
pci_map_single(sp->pdev, rxf,
PKT_BUF_SZ + sizeof(struct RxFD), PCI_DMA_FROMDEVICE);
skb->dev = dev;
skb_reserve(skb, sizeof(struct RxFD));
rxf->rx_buf_addr = 0xffffffff;
+#ifndef __mips__
pci_dma_sync_single(sp->pdev, sp->rx_ring_dma[entry],
sizeof(struct RxFD), PCI_DMA_TODEVICE);
+#endif
return rxf;
}
@@ -1754,8 +1773,10 @@
rxf->count = cpu_to_le32(PKT_BUF_SZ << 16);
sp->last_rxf->link = cpu_to_le32(rxf_dma);
sp->last_rxf->status &= cpu_to_le32(~0xC0000000);
+#ifndef __mips__
pci_dma_sync_single(sp->pdev, sp->last_rxf_dma,
sizeof(struct RxFD), PCI_DMA_TODEVICE);
+#endif
sp->last_rxf = rxf;
sp->last_rxf_dma = rxf_dma;
}
@@ -2274,8 +2295,10 @@
int status;
int pkt_len;
+#ifndef __mips__
pci_dma_sync_single(sp->pdev, sp->rx_ring_dma[entry],
sizeof(struct RxFD), PCI_DMA_FROMDEVICE);
+#endif
status = le32_to_cpu(sp->rx_ringp[entry]->status);
pkt_len = le32_to_cpu(sp->rx_ringp[entry]->count) & 0x3fff;
[-- Attachment #3: eepro100-napi.patch --]
[-- Type: text/plain, Size: 9645 bytes --]
--- eepro100.c.ori Wed Jun 12 17:25:28 2002
+++ eepro100-napi.c Wed Jun 12 17:11:38 2002
@@ -25,6 +25,8 @@
Disabled FC and ER, to avoid lockups when when we get FCP interrupts.
2000 Jul 17 Goutham Rao <goutham.rao@intel.com>
PCI DMA API fixes, adding pci_dma_sync_single calls where neccesary
+ 2002 Jun 12 Zhang Fuxin <fxzhang@ict.ac.cn>
+ add NAPI support
*/
static const char *version =
@@ -115,6 +117,8 @@
#include <linux/skbuff.h>
#include <linux/ethtool.h>
+#define CONFIG_EEPRO100_NAPI
+
MODULE_AUTHOR("Maintainer: Andrey V. Savochkin <saw@saw.sw.com.sg>");
MODULE_DESCRIPTION("Intel i82557/i82558/i82559 PCI EtherExpressPro driver");
MODULE_LICENSE("GPL");
@@ -494,8 +498,34 @@
#ifdef CONFIG_PM
u32 pm_state[16];
#endif
+
+ /* added by zfx for NAPI*/
+#ifdef CONFIG_EEPRO100_NAPI
+
+ /* used to pass rx_work_limit into speedo_rx,i don't want to
+ * change its prototype
+ */
+ int curr_work_limit;
+ unsigned long poll_switch;
+ unsigned long failed_poll_switch;
+ unsigned long done_poll;
+ unsigned long notdone_poll;
+ unsigned long empty_poll;
+ unsigned long soft_reset_count;
+ unsigned long rx_resume_count;
+ unsigned long alloc_fail;
+ unsigned long long poll_cycles;
+
+#ifdef CONFIG_NET_FASTROUTE
+ unsigned long fastroute_hit;
+ unsigned long fastroute_success;
+ unsigned long fastroute_defer;
+#endif
+
+#endif
};
+
/* The parameters for a CmdConfigure operation.
There are so many options that it would be difficult to document each bit.
We mostly use the default or recommended settings. */
@@ -546,6 +576,14 @@
static void set_rx_mode(struct net_device *dev);
static void speedo_show_state(struct net_device *dev);
+#ifdef CONFIG_EEPRO100_NAPI
+
+static int speedo_poll (struct net_device *dev, int *budget);
+static void enable_rx_and_rxnobuf_ints(struct net_device *dev);
+static void disable_rx_and_rxnobuf_ints(struct net_device *dev);
+
+#endif
+
\f
#ifdef honor_default_port
@@ -842,6 +880,10 @@
dev->set_multicast_list = &set_rx_mode;
dev->do_ioctl = &speedo_ioctl;
+#ifdef CONFIG_EEPRO100_NAPI
+ dev->poll = speedo_poll;
+ dev->quota = dev->weight = RX_RING_SIZE;
+#endif
return 0;
}
\f
@@ -1517,6 +1559,9 @@
struct speedo_private *sp;
long ioaddr, boguscnt = max_interrupt_work;
unsigned short status;
+#ifdef CONFIG_EEPRO100_NAPI
+ int first = 1;
+#endif
#ifndef final_version
if (dev == NULL) {
@@ -1543,16 +1588,21 @@
/* Acknowledge all of the current interrupt sources ASAP. */
/* Will change from 0xfc00 to 0xff00 when we start handling
FCP and ER interrupts --Dragan */
+#ifndef CONFIG_EEPRO100_NAPI
outw(status & 0xfc00, ioaddr + SCBStatus);
+#else
+ /* Rx & RxNoBuf is acked in speedo_poll */
+ outw(status & 0xac00, ioaddr + SCBStatus);
+#endif
if (speedo_debug > 4)
printk(KERN_DEBUG "%s: interrupt status=%#4.4x.\n",
dev->name, status);
+#ifndef CONFIG_EEPRO100_NAPI
if ((status & 0xfc00) == 0)
break;
-
if ((status & 0x5000) || /* Packet received, or Rx error. */
(sp->rx_ring_state&(RrNoMem|RrPostponed)) == RrPostponed)
/* Need to gather the postponed packet. */
@@ -1560,8 +1610,33 @@
/* Always check if all rx buffers are allocated. --SAW */
speedo_refill_rx_buffers(dev, 0);
+#else
+ /* Packet received, or Rx error. */
+ if (first && ((status & 0x5000) || (sp->rx_ring_state&(RrNoMem|RrPostponed)) == RrPostponed || (status & 0x3c) != 0x10 ))
+ /* Need to gather the postponed packet. */
+ {
+ if (speedo_debug > 4)
+ printk("switching to poll,status=%x\n",status);
+ first = 0;
+ if (netif_rx_schedule_prep(dev)) {
+ sp->poll_switch++;
+ /* disable interrupts caused by arriving packets */
+ disable_rx_and_rxnobuf_ints(dev);
+ /* tell system we have work to be done. */
+ __netif_rx_schedule(dev);
+ }else {
+ sp->failed_poll_switch++;
+ }
+
+ }
+
+ if ((status & 0xac00) == 0)
+ break;
+#endif
spin_lock(&sp->lock);
+
+#ifndef CONFIG_EEPRO100_NAPI
/*
* The chip may have suspended reception for various reasons.
* Check for that, and re-prime it should this be the case.
@@ -1581,7 +1656,7 @@
/* these are all reserved values */
break;
}
-
+#endif
/* User interrupt, Command/Tx unit interrupt or CU not active. */
if (status & 0xA400) {
@@ -1602,7 +1677,12 @@
/* Clear all interrupt sources. */
/* Will change from 0xfc00 to 0xff00 when we start handling
FCP and ER interrupts --Dragan */
+#ifndef CONFIG_EEPRO100_NAPI
outw(0xfc00, ioaddr + SCBStatus);
+#else
+ outw(0xac00, ioaddr + SCBStatus);
+#endif
+
break;
}
} while (1);
@@ -1611,7 +1691,9 @@
printk(KERN_DEBUG "%s: exiting interrupt, status=%#4.4x.\n",
dev->name, inw(ioaddr + SCBStatus));
+#ifndef final_version
clear_bit(0, (void*)&sp->in_interrupt);
+#endif
return;
}
@@ -1625,6 +1707,9 @@
sp->rx_skbuff[entry] = skb;
if (skb == NULL) {
sp->rx_ringp[entry] = NULL;
+#ifdef CONFIG_EEPRO100_NAPI
+ sp->alloc_fail++;
+#endif
return NULL;
}
rxf = sp->rx_ringp[entry] = (struct RxFD *)skb->tail;
@@ -1705,12 +1790,112 @@
speedo_refill_rx_buf(dev, force) != -1);
}
+#ifdef CONFIG_EEPRO100_NAPI
+static void enable_rx_and_rxnobuf_ints(struct net_device *dev)
+{
+ long ioaddr = dev->base_addr;
+ outw(SCBMaskEarlyRx | SCBMaskFlowCtl, ioaddr + SCBCmd);
+ inw(ioaddr + SCBStatus); /* flushes last write, read-safe */
+};
+
+static void disable_rx_and_rxnobuf_ints(struct net_device *dev)
+{
+ long ioaddr = dev->base_addr;
+ outw(SCBMaskRxDone | SCBMaskRxSuspend | SCBMaskEarlyRx | SCBMaskFlowCtl, ioaddr + SCBCmd);
+ inw(ioaddr + SCBStatus); /* flushes last write, read-safe */
+};
+
+static int speedo_poll (struct net_device *dev, int *budget)
+{
+ struct speedo_private *sp = (struct speedo_private *)dev->priv;
+ long ioaddr, received = 0;
+ unsigned short intr_status;
+
+ ioaddr = dev->base_addr;
+ intr_status = inw(ioaddr + SCBStatus);
+
+ if (speedo_debug > 4)
+ printk(KERN_DEBUG " In speedo_poll().\n");
+
+ sp->curr_work_limit = *budget;
+ if (sp->curr_work_limit > dev->quota)
+ sp->curr_work_limit = dev->quota;
+
+ do {
+ /* ack Rx & RxNobuf intrs*/
+ outw(intr_status & 0x5000, ioaddr + SCBStatus);
+
+ received += speedo_rx(dev);
+
+ if (sp->curr_work_limit < 0) /* out of quota */
+ goto not_done;
+
+ /* no packets on ring; but new ones can arrive since we last checked */
+ intr_status = inw(ioaddr + SCBStatus);
+
+ if ((intr_status & 0x5000) == 0) {
+ /* If something arrives in this narrow window,an interrupt
+ * will be generated
+ */
+ goto done;
+ }
+ /* done! at least thats what it looks like ;->
+ if new packets came in after our last check on status bits
+ they'll be caught by the while check and we go back and clear them
+ since we havent exceeded our quota
+ */
+ } while (intr_status & 0x5000);
+
+done:
+ if (!received) {
+ if (speedo_debug > 4) printk("received==0\n");
+ received = 1;
+ sp->empty_poll++;
+ }
+ dev->quota -= received;
+ *budget -= received;
+
+ /* we are happy/done, no more packets on ring; put us back
+ * to where we can start processing interrupts again
+ */
+ netif_rx_complete(dev);
+ enable_rx_and_rxnobuf_ints(dev);
+
+ sp->done_poll++;
+
+ if (speedo_debug > 3)
+ printk("done,received=%lu\n",received);
+
+ return 0; /* done */
+
+not_done:
+ if (!received) {
+ if (speedo_debug > 4) printk("received==0\n");
+ received = 1;
+ sp->empty_poll++;
+ }
+ dev->quota -= received;
+ *budget -= received;
+
+ sp->notdone_poll++;
+
+ if (speedo_debug > 3)
+ printk("not done,received=%lu\n",received);
+
+ return 1; /* not_done */
+}
+
+#endif /* NAPI */
+
static int
speedo_rx(struct net_device *dev)
{
struct speedo_private *sp = (struct speedo_private *)dev->priv;
int entry = sp->cur_rx % RX_RING_SIZE;
+#ifndef CONFIG_EEPRO100_NAPI
int rx_work_limit = sp->dirty_rx + RX_RING_SIZE - sp->cur_rx;
+#endif
+ int received = 0;
int alloc_ok = 1;
if (speedo_debug > 4)
@@ -1725,11 +1910,42 @@
status = le32_to_cpu(sp->rx_ringp[entry]->status);
pkt_len = le32_to_cpu(sp->rx_ringp[entry]->count) & 0x3fff;
+#ifndef CONFIG_EEPRO100_NAPI
if (!(status & RxComplete))
break;
if (--rx_work_limit < 0)
break;
+#else
+ if (!(status & RxComplete)) {
+ int intr_status;
+ unsigned long ioaddr = dev->base_addr;
+
+ intr_status = inw(ioaddr + SCBStatus);
+ /* We check receiver state here because if
+ * we have to do soft reset,sp->cur_rx should
+ * point to an empty entry or something
+ * unexpected will happen
+ */
+ if (intr_status | 0x1000) { /* suspended */
+ outw(0x5000,ioaddr + SCBStatus);
+ /* No resources */
+ if ((intr_status & 0x3c) == 0x28) {
+ outw(RxResumeNoResources,ioaddr+SCBCmd);
+ sp->rx_resume_count++;
+ }else if ((intr_status & 0x3c) == 0x8) {
+ if (speedo_debug > 4)
+ printk("No resource,reset\n");
+ speedo_rx_soft_reset(dev);
+ sp->soft_reset_count++;
+ }
+ }
+ break;
+ }
+
+ if (--sp->curr_work_limit < 0)
+ break;
+#endif
/* Check for a rare out-of-memory case: the current buffer is
the last buffer allocated in the RX ring. --SAW */
@@ -1793,7 +2009,12 @@
PKT_BUF_SZ + sizeof(struct RxFD), PCI_DMA_FROMDEVICE);
}
skb->protocol = eth_type_trans(skb, dev);
+#ifndef CONFIG_EEPRO100_NAPI
netif_rx(skb);
+#else
+ netif_receive_skb(skb);
+ received ++;
+#endif
dev->last_rx = jiffies;
sp->stats.rx_packets++;
sp->stats.rx_bytes += pkt_len;
@@ -1811,7 +2032,7 @@
sp->last_rx_time = jiffies;
- return 0;
+ return received;
}
static int
[-- Attachment #4: eepro100-proc.patch --]
[-- Type: text/plain, Size: 10000 bytes --]
--- eepro100-napi.c Wed Jun 12 17:11:38 2002
+++ eepro100-napi-proc.c Wed Jun 12 17:33:51 2002
@@ -119,6 +119,10 @@
#define CONFIG_EEPRO100_NAPI
+#ifdef CONFIG_PROC_FS
+#include <linux/proc_fs.h>
+#endif
+
MODULE_AUTHOR("Maintainer: Andrey V. Savochkin <saw@saw.sw.com.sg>");
MODULE_DESCRIPTION("Intel i82557/i82558/i82559 PCI EtherExpressPro driver");
MODULE_LICENSE("GPL");
@@ -516,6 +520,10 @@
unsigned long alloc_fail;
unsigned long long poll_cycles;
+#ifdef CONFIG_PROC_FS
+ struct proc_dir_entry *proc_parent;
+#endif
+
#ifdef CONFIG_NET_FASTROUTE
unsigned long fastroute_hit;
unsigned long fastroute_success;
@@ -582,6 +590,11 @@
static void enable_rx_and_rxnobuf_ints(struct net_device *dev);
static void disable_rx_and_rxnobuf_ints(struct net_device *dev);
+#ifdef CONFIG_PROC_FS
+int __devinit speedo_create_proc_subdir(struct net_device *sp);
+void speedo_remove_proc_subdir(struct net_device *sp);
+#endif
+
#endif
\f
@@ -883,6 +896,14 @@
#ifdef CONFIG_EEPRO100_NAPI
dev->poll = speedo_poll;
dev->quota = dev->weight = RX_RING_SIZE;
+
+#ifdef CONFIG_PROC_FS
+ if (speedo_create_proc_subdir(dev) < 0) {
+ printk(KERN_ERR "Failed to create proc directory for %s\n",
+ dev->name);
+ }
+#endif
+
#endif
return 0;
}
@@ -1885,6 +1906,354 @@
return 1; /* not_done */
}
+#ifdef CONFIG_PROC_FS
+/* adapted from intel's e100 code */
+static struct proc_dir_entry *adapters_proc_dir = 0;
+
+static void speedo_proc_cleanup(void);
+static unsigned char speedo_init_proc_dir(void);
+
+#define ADAPTERS_PROC_DIR "eepro100"
+#define WRITE_BUF_MAX_LEN 20
+#define READ_BUF_MAX_LEN 256
+#define SPEEDO_PE_LEN 25
+
+#define sp_off(off) (unsigned long)(offsetof(struct speedo_private, off))
+
+typedef struct _speedo_proc_entry {
+ char *name;
+ read_proc_t *read_proc;
+ write_proc_t *write_proc;
+ unsigned long offset; /* offset into sp. ~0 means no value, pass NULL. */
+} speedo_proc_entry;
+
+static int
+generic_read(char *page, char **start, off_t off, int count, int *eof, int len)
+{
+ if (len <= off + count)
+ *eof = 1;
+
+ *start = page + off;
+ len -= off;
+ if (len > count)
+ len = count;
+
+ if (len < 0)
+ len = 0;
+
+ return len;
+}
+
+static int
+read_ulong(char *page, char **start, off_t off,
+ int count, int *eof, unsigned long l)
+{
+ int len;
+
+ len = sprintf(page, "%lu\n", l);
+
+ return generic_read(page, start, off, count, eof, len);
+}
+
+static int
+read_gen_ulong(char *page, char **start, off_t off,
+ int count, int *eof, void *data)
+{
+ unsigned long val = 0;
+
+ if (data)
+ val = *((unsigned long *) data);
+
+ return read_ulong(page, start, off, count, eof, val);
+}
+
+static int
+read_ulonglong(char *page, char **start, off_t off,
+ int count, int *eof, unsigned long long ll)
+{
+ int len;
+
+ len = sprintf(page, "%llu\n", ll);
+
+ return generic_read(page, start, off, count, eof, len);
+}
+
+static int
+read_gen_ulonglong(char *page, char **start, off_t off,
+ int count, int *eof, void *data)
+{
+ unsigned long val = 0;
+
+ if (data)
+ val = *((unsigned long long *) data);
+
+ return read_ulonglong(page, start, off, count, eof, val);
+}
+
+static int
+set_debug(struct file *file, const char *buffer,
+ unsigned long count, void *data)
+
+{
+ if (speedo_debug == 1)
+ speedo_debug = 6;
+ else
+ speedo_debug = 1;
+ return count;
+}
+
+static int
+_speedo_show_state(struct file *file, const char *buffer,
+ unsigned long count, void *data)
+{
+
+ struct net_device *dev = (struct net_device *)data;
+
+ speedo_show_state(dev);
+
+ return count;
+}
+
+static speedo_proc_entry speedo_proc_list[] = {
+ {"set_debug", 0, set_debug, ~0},
+ {"show_state", 0, _speedo_show_state, ~0},
+ {"poll_switch",read_gen_ulong,0,sp_off(poll_switch)},
+ {"failed_poll_switch",read_gen_ulong,0,sp_off(failed_poll_switch)},
+ {"done_poll",read_gen_ulong,0,sp_off(done_poll)},
+ {"notdone_poll",read_gen_ulong,0,sp_off(notdone_poll)},
+ {"empty_poll",read_gen_ulong,0,sp_off(empty_poll)},
+ {"soft_reset_count",read_gen_ulong,0,sp_off(soft_reset_count)},
+ {"rx_resume_count",read_gen_ulong,0,sp_off(rx_resume_count)},
+ {"alloc_fail",read_gen_ulong,0,sp_off(alloc_fail)},
+ {"poll_cycles",read_gen_ulonglong,0,sp_off(poll_cycles)},
+ {"fastroute_hit",read_gen_ulonglong,0,sp_off(fastroute_hit)},
+ {"fastroute_success",read_gen_ulonglong,0,sp_off(fastroute_success)},
+ {"fastroute_defer",read_gen_ulonglong,0,sp_off(fastroute_defer)},
+ {"", 0, 0, 0}
+};
+
+static int
+read_info(char *page, char **start, off_t off, int count, int *eof, void *data)
+{
+ struct speedo_private *sp = data;
+ speedo_proc_entry *pe;
+ int tmp;
+ void *val;
+ int len = 0;
+
+ for (pe = speedo_proc_list; pe->name[0]; pe++) {
+ if (pe->name[0] == '\n') {
+ len += sprintf(page + len, "\n");
+ continue;
+ }
+
+ if (pe->read_proc) {
+ if ((len + READ_BUF_MAX_LEN + SPEEDO_PE_LEN + 1) >=
+ PAGE_SIZE)
+ break;
+
+ if (pe->offset != ~0)
+ val = ((char *) sp) + pe->offset;
+ else
+ val = NULL;
+
+ len += sprintf(page + len, "%-"
+ __MODULE_STRING(SPEEDO_PE_LEN)
+ "s ", pe->name);
+ len += pe->read_proc(page + len, start, 0,
+ READ_BUF_MAX_LEN + 1, &tmp, val);
+ }
+ }
+
+ return generic_read(page, start, off, count, eof, len);
+}
+
+static struct proc_dir_entry * __devinit
+create_proc_rw(char *name, void *data, struct proc_dir_entry *parent,
+ read_proc_t * read_proc, write_proc_t * write_proc)
+{
+ struct proc_dir_entry *pdep;
+ mode_t mode = S_IFREG;
+
+ if (write_proc) {
+ mode |= S_IWUSR;
+ if (read_proc) {
+ mode |= S_IRUSR;
+ }
+
+ } else if (read_proc) {
+ mode |= S_IRUGO;
+ }
+
+ if (!(pdep = create_proc_entry(name, mode, parent)))
+ return NULL;
+
+ pdep->read_proc = read_proc;
+ pdep->write_proc = write_proc;
+ pdep->data = data;
+ return pdep;
+}
+
+void
+speedo_remove_proc_subdir(struct net_device *dev)
+{
+ struct speedo_private *sp = (struct speedo_private *)dev->priv;
+ speedo_proc_entry *pe;
+ char info[256];
+ int len;
+
+ /* If our root /proc dir was not created, there is nothing to remove */
+ if (adapters_proc_dir == NULL) {
+ return;
+ }
+
+ len = strlen(dev->name);
+ strncpy(info, dev->name, sizeof (info));
+ strncat(info + len, ".info", sizeof (info) - len);
+
+ if (sp->proc_parent) {
+ for (pe = speedo_proc_list; pe->name[0]; pe++) {
+ if (pe->name[0] == '\n')
+ continue;
+
+ remove_proc_entry(pe->name, sp->proc_parent);
+ }
+
+ remove_proc_entry(dev->name, adapters_proc_dir);
+ sp->proc_parent = NULL;
+ }
+
+ remove_proc_entry(info, adapters_proc_dir);
+
+ /* try to remove the main /proc dir, if it's empty */
+ speedo_proc_cleanup();
+}
+
+int __devinit
+speedo_create_proc_subdir(struct net_device *dev)
+{
+ struct speedo_private *sp = (struct speedo_private *)dev->priv;
+ struct proc_dir_entry *dev_dir;
+ speedo_proc_entry *pe;
+ char info[256];
+ int len;
+ void *data;
+
+ /* create the main /proc dir if needed */
+ if (!adapters_proc_dir) {
+ if (!speedo_init_proc_dir())
+ return -ENOMEM;
+ }
+
+ strncpy(info, dev->name, sizeof (info));
+ len = strlen(info);
+ strncat(info + len, ".info", sizeof (info) - len);
+
+ /* info */
+ if (!(create_proc_rw(info, sp, adapters_proc_dir, read_info, 0))) {
+ speedo_proc_cleanup();
+ return -ENOMEM;
+ }
+
+ dev_dir = create_proc_entry(dev->name, S_IFDIR,
+ adapters_proc_dir);
+ sp->proc_parent = dev_dir;
+
+ if (!dev_dir) {
+ speedo_remove_proc_subdir(dev);
+ return -ENOMEM;
+ }
+
+ for (pe = speedo_proc_list; pe->name[0]; pe++) {
+ if (pe->name[0] == '\n')
+ continue;
+
+ if (pe->offset != ~0)
+ data = ((char *) sp) + pe->offset;
+ else
+ data = dev;
+
+ if (!(create_proc_rw(pe->name, data, dev_dir,
+ pe->read_proc, pe->write_proc))) {
+ speedo_remove_proc_subdir(dev);
+ return -ENOMEM;
+ }
+ }
+
+ return 0;
+}
+
+/****************************************************************************
+ * Name: speedo_init_proc_dir
+ *
+ * Description: This routine creates the top-level /proc directory for the
+ * driver in /proc/net
+ *
+ * Arguments: none
+ *
+ * Returns: true on success, false on fail
+ *
+ ***************************************************************************/
+static unsigned char
+speedo_init_proc_dir(void)
+{
+ int len;
+
+ /* first check if adapters_proc_dir already exists */
+ len = strlen(ADAPTERS_PROC_DIR);
+ for (adapters_proc_dir = proc_net->subdir;
+ adapters_proc_dir; adapters_proc_dir = adapters_proc_dir->next) {
+
+ if ((adapters_proc_dir->namelen == len) &&
+ (!memcmp(adapters_proc_dir->name, ADAPTERS_PROC_DIR, len)))
+ break;
+ }
+
+ if (!adapters_proc_dir)
+ adapters_proc_dir =
+ create_proc_entry(ADAPTERS_PROC_DIR, S_IFDIR, proc_net);
+
+ if (!adapters_proc_dir)
+ return 0;
+
+ return 1;
+}
+
+/****************************************************************************
+ * Name: speedo_proc_cleanup
+ *
+ * Description: This routine clears the top-level /proc directory, if empty.
+ *
+ * Arguments: none
+ *
+ * Returns: none
+ *
+ ***************************************************************************/
+static void
+speedo_proc_cleanup(void)
+{
+ struct proc_dir_entry *de;
+
+ if (adapters_proc_dir == NULL) {
+ return;
+ }
+
+ /* check if subdir list is empty before removing adapters_proc_dir */
+ for (de = adapters_proc_dir->subdir; de; de = de->next) {
+ /* ignore . and .. */
+ if (*(de->name) != '.')
+ break;
+ }
+
+ if (de)
+ return;
+
+ remove_proc_entry(ADAPTERS_PROC_DIR, proc_net);
+ adapters_proc_dir = NULL;
+}
+
+#endif /* CONFIG_PROC_FS */
+
#endif /* NAPI */
static int
@@ -2474,6 +2843,9 @@
unregister_netdev(dev);
+#if defined(CONFIG_EEPRO100_NAPI) && defined(CONFIG_PROC_FS)
+ speedo_remove_proc_subdir(dev);
+#endif
release_region(pci_resource_start(pdev, 1), pci_resource_len(pdev, 1));
release_mem_region(pci_resource_start(pdev, 0), pci_resource_len(pdev, 0));
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: NAPI for eepro100
2002-06-12 12:39 NAPI for eepro100 Zhang Fuxin
@ 2002-06-12 23:00 ` Jeff Garzik
2002-06-12 23:05 ` David S. Miller
0 siblings, 1 reply; 10+ messages in thread
From: Jeff Garzik @ 2002-06-12 23:00 UTC (permalink / raw)
To: Zhang Fuxin; +Cc: linux-mips, saw, linux-kernel, netdev
Zhang Fuxin wrote:
> hi,all
> Recently i've converted eepro100 driver to use napi,in order to improve
> network performance of my poor 150M mips machine. It does eliminate
> the interrupt live lock seen before,maintaining a peak throughput under
> heavy load.
> In case anybody are interested,i post the patches to the list. They are
> 3 incremental patchs:
> eepro100-napi.patch is against 2.5.20 eepro100.c and provide basic
> napi support
Nifty, I'll take a look at this.
> eepro100-proc.patch is proc file system support adapted from intel's
> e100 driver. I am using it for debugging.
> eepro100-mips.patch is mips specific patch to make it work(well) for
> my mips
> platform.
Just FWIW I'm not gonna apply these... for the 'proc' patch, that either
needs to be moved to ethtool, or we should make a filesystem for net
drivers that exports procfs-like inodes. for the 'mips' patch, it looks
like the arch maintainer(s) need to fix the PCI DMA support...
Jeff
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: NAPI for eepro100
2002-06-12 23:00 ` Jeff Garzik
@ 2002-06-12 23:05 ` David S. Miller
2002-06-12 23:17 ` Jeff Garzik
2002-06-13 8:57 ` Andrey Savochkin
0 siblings, 2 replies; 10+ messages in thread
From: David S. Miller @ 2002-06-12 23:05 UTC (permalink / raw)
To: jgarzik; +Cc: fxzhang, linux-mips, saw, linux-kernel, netdev
From: Jeff Garzik <jgarzik@mandrakesoft.com>
Date: Wed, 12 Jun 2002 19:00:00 -0400
for the 'mips' patch, it looks
like the arch maintainer(s) need to fix the PCI DMA support...
No, it's worse than that.
See how non-consistent memory is used by the eepro100 driver
for descriptor bits? The skb->tail bits?
That is very problematic.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: NAPI for eepro100
2002-06-12 23:05 ` David S. Miller
@ 2002-06-12 23:17 ` Jeff Garzik
2002-06-12 23:33 ` David S. Miller
2002-06-13 8:57 ` Andrey Savochkin
1 sibling, 1 reply; 10+ messages in thread
From: Jeff Garzik @ 2002-06-12 23:17 UTC (permalink / raw)
To: David S. Miller; +Cc: fxzhang, linux-mips, saw, linux-kernel, netdev
David S. Miller wrote:
> From: Jeff Garzik <jgarzik@mandrakesoft.com>
> Date: Wed, 12 Jun 2002 19:00:00 -0400
>
> for the 'mips' patch, it looks
> like the arch maintainer(s) need to fix the PCI DMA support...
>
> No, it's worse than that.
>
> See how non-consistent memory is used by the eepro100 driver
> for descriptor bits? The skb->tail bits?
>
> That is very problematic.
Oh crap, you're right... eepro100 in general does funky stuff with the
way packets are handled, mainly due to the need to issue commands to the
NIC engine instead of the normal per-descriptor owner bit way of doing
things.
Well, I accept patches to that clean eepro100 up... I'm not terribly
motivated to clean it up myself, as we have e100 and an e100 maintainer
we can beat on if such uglies arise :)
Jeff
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: NAPI for eepro100
2002-06-12 23:17 ` Jeff Garzik
@ 2002-06-12 23:33 ` David S. Miller
2002-06-13 2:25 ` Donald Becker
0 siblings, 1 reply; 10+ messages in thread
From: David S. Miller @ 2002-06-12 23:33 UTC (permalink / raw)
To: jgarzik; +Cc: fxzhang, linux-mips, saw, linux-kernel, netdev
From: Jeff Garzik <jgarzik@mandrakesoft.com>
Date: Wed, 12 Jun 2002 19:17:58 -0400
Oh crap, you're right... eepro100 in general does funky stuff with the
way packets are handled, mainly due to the need to issue commands to the
NIC engine instead of the normal per-descriptor owner bit way of doing
things.
The question is, do the descriptor bits have to live right before
the RX packet data buffer or can other schemes be used?
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: NAPI for eepro100
2002-06-12 23:33 ` David S. Miller
@ 2002-06-13 2:25 ` Donald Becker
2002-06-13 7:15 ` Samuel Maftoul
0 siblings, 1 reply; 10+ messages in thread
From: Donald Becker @ 2002-06-13 2:25 UTC (permalink / raw)
To: David S. Miller
Cc: Jeff Garzik, linux-mips, Linux Kernel Mailing List, netdev
On Wed, 12 Jun 2002, David S. Miller wrote:
> From: Jeff Garzik <jgarzik@mandrakesoft.com>
> Oh crap, you're right... eepro100 in general does funky stuff with the
> way packets are handled, mainly due to the need to issue commands to the
> NIC engine instead of the normal per-descriptor owner bit way of doing
> things.
The eepro100 has a unique design in many different aspects.
> The question is, do the descriptor bits have to live right before
> the RX packet data buffer or can other schemes be used?
With the current driver structure, yes, the descriptor words must be
immediately before the packet data. You can use other Rx and Tx
structures/modes to avoid this, but they use less efficient memory access.
For instance, the current Tx structure allows transmitting a packet with
a single PCI burst, rather than multiple transfers.
--
Donald Becker becker@scyld.com
Scyld Computing Corporation http://www.scyld.com
410 Severn Ave. Suite 210 Second Generation Beowulf Clusters
Annapolis MD 21403 410-990-9993
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: NAPI for eepro100
2002-06-13 2:25 ` Donald Becker
@ 2002-06-13 7:15 ` Samuel Maftoul
2002-06-19 11:46 ` Bill Davidsen
0 siblings, 1 reply; 10+ messages in thread
From: Samuel Maftoul @ 2002-06-13 7:15 UTC (permalink / raw)
To: lkml
On Wed, Jun 12, 2002 at 10:25:22PM -0400, Donald Becker wrote:
> On Wed, 12 Jun 2002, David S. Miller wrote:
> > From: Jeff Garzik <jgarzik@mandrakesoft.com>
> > Oh crap, you're right... eepro100 in general does funky stuff with the
> > way packets are handled, mainly due to the need to issue commands to the
> > NIC engine instead of the normal per-descriptor owner bit way of doing
> > things.
>
> The eepro100 has a unique design in many different aspects.
>
> > The question is, do the descriptor bits have to live right before
> > the RX packet data buffer or can other schemes be used?
>
> With the current driver structure, yes, the descriptor words must be
> immediately before the packet data. You can use other Rx and Tx
> structures/modes to avoid this, but they use less efficient memory access.
> For instance, the current Tx structure allows transmitting a packet with
> a single PCI burst, rather than multiple transfers.
Maybe a bit off topic, but we (at my work) are using plenty of eepro100
cards with both drivers ( e100 and eepro100 )(shipped with dell
machines, and others).
We have lot of problem with these card: from link autonegociation to the
really frequent cmd_timeout.
We expreienced some freezes, slowdowns, problems with copying from NFS
to a Firwire disk ( systematic cmd_timeout at about 250 MB).
Do you have any advice ? should I test eepro100 NAPI driver ?
I've try to play with ethtool(chang some eepro100 bits , like the
"sleeping" one ...
I have quitely the same card at home wich doesn't make any problem ( I
noticed some cmd_timeout when I changed my hub).
Is this hub related ? Is there a standart way autonegociation is working
( we use mostly cisco switches, are they compliant?).
We are actually trying to force 10FD or 100FD any new installed card
because we think this is the best way to avoid performances problem ...
Thanks for any advice.
Sam
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: NAPI for eepro100
2002-06-13 8:57 ` Andrey Savochkin
@ 2002-06-13 8:47 ` David S. Miller
0 siblings, 0 replies; 10+ messages in thread
From: David S. Miller @ 2002-06-13 8:47 UTC (permalink / raw)
To: saw; +Cc: fxzhang, linux-mips, linux-kernel, netdev, jgarzik
From: Andrey Savochkin <saw@saw.sw.com.sg>
Date: Thu, 13 Jun 2002 12:57:53 +0400
On Wed, Jun 12, 2002 at 04:05:32PM -0700, David S. Miller wrote:
> No, it's worse than that.
>
> See how non-consistent memory is used by the eepro100 driver
> for descriptor bits? The skb->tail bits?
>
> That is very problematic.
What's the problem?
If it isn't allowed to do, then what is the meaning of PCI_DMA_BIDIRECTIONAL
mappings?
It's slow. Not wrong, just inefficient.
Descriptors were meant to be done using consistent mappings, not
"pci_map_*()"'d memory. The latter is meant to be used for long
linear DMA transfers to/from the device. It is not meant for things
the cpu pokes small bits of data in and out of, that is what
consistent DMA memory is for.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: NAPI for eepro100
2002-06-12 23:05 ` David S. Miller
2002-06-12 23:17 ` Jeff Garzik
@ 2002-06-13 8:57 ` Andrey Savochkin
2002-06-13 8:47 ` David S. Miller
1 sibling, 1 reply; 10+ messages in thread
From: Andrey Savochkin @ 2002-06-13 8:57 UTC (permalink / raw)
To: David S. Miller; +Cc: fxzhang, linux-mips, linux-kernel, netdev, jgarzik
On Wed, Jun 12, 2002 at 04:05:32PM -0700, David S. Miller wrote:
> From: Jeff Garzik <jgarzik@mandrakesoft.com>
> Date: Wed, 12 Jun 2002 19:00:00 -0400
>
> for the 'mips' patch, it looks
> like the arch maintainer(s) need to fix the PCI DMA support...
>
> No, it's worse than that.
>
> See how non-consistent memory is used by the eepro100 driver
> for descriptor bits? The skb->tail bits?
>
> That is very problematic.
What's the problem?
If it isn't allowed to do, then what is the meaning of PCI_DMA_BIDIRECTIONAL
mappings?
Andrey
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: NAPI for eepro100
2002-06-13 7:15 ` Samuel Maftoul
@ 2002-06-19 11:46 ` Bill Davidsen
0 siblings, 0 replies; 10+ messages in thread
From: Bill Davidsen @ 2002-06-19 11:46 UTC (permalink / raw)
To: Samuel Maftoul; +Cc: lkml
On Thu, 13 Jun 2002, Samuel Maftoul wrote:
> Maybe a bit off topic, but we (at my work) are using plenty of eepro100
> cards with both drivers ( e100 and eepro100 )(shipped with dell
> machines, and others).
> We have lot of problem with these card: from link autonegociation to the
> really frequent cmd_timeout.
> We expreienced some freezes, slowdowns, problems with copying from NFS
> to a Firwire disk ( systematic cmd_timeout at about 250 MB).
Yes, this would be better in the cosl.networking, but a quick answer since
it seems kernel related. I had problems with these until recent kernels.
The e100 driver helped in some cases, but other issues were reported. I
don't really have any problems now, on news servers which get ~250GB/day
in and push ~700 out (yes, they run 70-80Mbit all day).
This may be related to the blessing of the new scheduler.
> Do you have any advice ? should I test eepro100 NAPI driver ?
> I've try to play with ethtool(chang some eepro100 bits , like the
> "sleeping" one ...
We noticed long ago that different blades on the same switch would work
right with either auto at both ends or 100TX forced at both ends. Don;t
know why, just look at ifconfig reports for collisions and if you see them
change the settings.
> I have quitely the same card at home wich doesn't make any problem ( I
> noticed some cmd_timeout when I changed my hub).
> Is this hub related ? Is there a standart way autonegociation is working
> ( we use mostly cisco switches, are they compliant?).
You have said both "hub" and "switch" in this paragraph, trying to run
a hub full deplex will cause problems. I have no idea what a "switching
hub" means, I see them around cheap, but not from Cisco.
> We are actually trying to force 10FD or 100FD any new installed card
> because we think this is the best way to avoid performances problem ...
We have never found the slightest way to predect if auto or forced 100TX
works with any given blade in any given switch. But it doesn't change, so
get it right and it's fixed. See above, look for collisions, try auto
first. No collisions, it's as good as it gets.
--
bill davidsen <davidsen@tmr.com>
CTO, TMR Associates, Inc
Doing interesting things with little computers since 1979.
^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2002-06-19 11:51 UTC | newest]
Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2002-06-12 12:39 NAPI for eepro100 Zhang Fuxin
2002-06-12 23:00 ` Jeff Garzik
2002-06-12 23:05 ` David S. Miller
2002-06-12 23:17 ` Jeff Garzik
2002-06-12 23:33 ` David S. Miller
2002-06-13 2:25 ` Donald Becker
2002-06-13 7:15 ` Samuel Maftoul
2002-06-19 11:46 ` Bill Davidsen
2002-06-13 8:57 ` Andrey Savochkin
2002-06-13 8:47 ` David S. Miller
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox