[Linux-ia64] eepro100 hang?

Linux IA64 platform development

* [Linux-ia64] eepro100 hang?
@ 2001-12-07  8:10 Ville Herva
  2001-12-07 16:16 ` David Mosberger
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: Ville Herva @ 2001-12-07  8:10 UTC
  To: linux-ia64

At http://linuxia64.org/todo.html, it says:

  EEPro100 ? Hangs on >180MB xfers

Is this supposed to be fixed?

The Mandrake FTP installation stalled in the middle. When I straced the
process from a shell, it seemed to be loading an rpm from the FTP server, but
each read returned only 1 byte and took about half a second to do so.

On the second run it stalled as well, but in a different place. I kill -HUP'ed
the download, and the rest went fine.

After installation, I can scp a 650MB image just fine, but that's from the
local network.

I haven't dug any deeper yet.

HP i2000, eepro100.

-- v --

v@iki.fi


* Re: [Linux-ia64] eepro100 hang?
  2001-12-07  8:10 [Linux-ia64] eepro100 hang? Ville Herva
@ 2001-12-07 16:16 ` David Mosberger
  2001-12-07 17:03 ` Ville Herva
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 5+ messages in thread
From: David Mosberger @ 2001-12-07 16:16 UTC
  To: linux-ia64

>>>>> On Fri, 7 Dec 2001 10:10:07 +0200, Ville Herva <vherva@niksula.hut.fi> said:

  Ville> at http://linuxia64.org/todo.html, it says EEPro100 ? Hangs
  Ville> on >180MB xfers

  Ville> Is this supposed to be fixed?

I don't think the workstations (i2000) ever suffered from this problem.
It appeared only on the 4-way server.

  Ville> Mandrake ftp installation stalled in the middle, and when I
  Ville> straced the process from shell, it seemed to be loading rpm
  Ville> from ftp server, but the read always returned 1 byte, and
  Ville> it took about half a second to do so.

  Ville> On second run it stalled as well, but in another place. I
  Ville> kill -HUP'ed the download, and the rest went fine.

  Ville> After installation, I can scp a 650MB image just fine, but
  Ville> that's from local network.

  Ville> I haven't dug any deeper yet.

  Ville> HP i2000, eepro100.

For what it's worth, I use an i2000 with an eepro100 all the time and
haven't seen any hangs.  Of course, that doesn't prove anything; just
a data point...

	--david


* Re: [Linux-ia64] eepro100 hang?
  2001-12-07  8:10 [Linux-ia64] eepro100 hang? Ville Herva
  2001-12-07 16:16 ` David Mosberger
@ 2001-12-07 17:03 ` Ville Herva
  2001-12-27 14:34 ` Gwenole Beauchesne
  2002-01-04  4:12 ` Grant Grundler
  3 siblings, 0 replies; 5+ messages in thread
From: Ville Herva @ 2001-12-07 17:03 UTC
  To: linux-ia64

On Fri, Dec 07, 2001 at 08:16:43AM -0800, you [David Mosberger] claimed:
> >>>>> On Fri, 7 Dec 2001 10:10:07 +0200, Ville Herva <vherva@niksula.hut.fi> said:
> 
>   Ville> at http://linuxia64.org/todo.html, it says EEPro100 ? Hangs
>   Ville> on >180MB xfers
> 
>   Ville> Is this supposed to be fixed?
> 
> I don't think the workstations (i2000) ever suffered from this problem.
> It appeared only on the 4-way server.

Ok.

> For what it's worth, I use an i2000 with an eepro100 all the time and
> haven't seen any hangs.  Of course, that doesn't prove anything; just
> a data point...

Yep. I haven't seen anything strange since I got Mandrake installed; it only
happened during installation. It could have been a Mandrake DrakX problem, but
it seemed strange: it tried to read(2) 16kB from the socket and only got one
byte at a time, and each read took half a second...
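
As a rough illustration of the symptom described above (a 16 kB read(2)
request that returns a single byte per call), the following C sketch tallies
what each read() on a descriptor actually returns. The struct and function
names are hypothetical and are not taken from the Mandrake installer; per-call
timing is left out to keep the sketch short.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical tally of read(2) results; not part of any installer. */
struct read_tally {
	unsigned long calls;      /* reads that returned at least one byte */
	unsigned long tiny_reads; /* reads that returned exactly one byte */
	size_t total_bytes;       /* sum of all bytes returned */
	size_t smallest;          /* smallest non-zero return value seen */
};

/* Read fd until EOF, asking for up to 16 kB per call.
 * Returns 0 on EOF and -1 on a read error other than EINTR. */
int tally_reads(int fd, struct read_tally *t)
{
	static char buf[16 * 1024];

	assert(t != NULL);
	t->calls = 0;
	t->tiny_reads = 0;
	t->total_bytes = 0;
	t->smallest = 0;

	for (;;) {
		ssize_t n = read(fd, buf, sizeof(buf));

		if (n == 0)
			return 0;
		if (n < 0) {
			if (errno == EINTR)
				continue; /* interrupted by a signal, retry */
			return -1;
		}
		t->calls++;
		if (n == 1)
			t->tiny_reads++;
		t->total_bytes += (size_t)n;
		if (t->smallest == 0 || (size_t)n < t->smallest)
			t->smallest = (size_t)n;
	}
}
```

On a healthy connection most calls should return far more than one byte; a
tally where tiny_reads is close to calls would match the behaviour seen in
the strace output.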



-- v --

v@iki.fi



* Re: [Linux-ia64] eepro100 hang?
  2001-12-07  8:10 [Linux-ia64] eepro100 hang? Ville Herva
  2001-12-07 16:16 ` David Mosberger
  2001-12-07 17:03 ` Ville Herva
@ 2001-12-27 14:34 ` Gwenole Beauchesne
  2002-01-04  4:12 ` Grant Grundler
  3 siblings, 0 replies; 5+ messages in thread
From: Gwenole Beauchesne @ 2001-12-27 14:34 UTC
  To: linux-ia64

On Fri, 7 Dec 2001, Ville Herva wrote:

> Is this supposed to be fixed?

Bill Nottingham reported the following:
<https://external-lists.valinux.com/archives/linux-ia64/2001-October/002312.html>

<quote>
The HW has a bug.

You need to run eepro100-diag -f -G -w -w -w  to fix it.
You can get eepro100-diag from ftp.scyld.com; it's one of Donald Becker's
diagnostic tools.
</quote>

On IA-64, I explicitly use the e100 driver for the following:
0x8086  0x1229  "e100"  "Intel Corporation|82559 [Ethernet Pro 100]"
0x8086  0x2449  "e100"  "Intel Corporation|EtherExpress PRO/100"
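
The two entries above map a PCI vendor/device ID pair to a driver name. A
minimal C sketch of that kind of lookup follows; the struct, table and
function names are hypothetical and only the two listed entries are taken
from this message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-memory form of one table line. */
struct pci_driver_entry {
	uint16_t vendor;
	uint16_t device;
	const char *driver;
	const char *description;
};

/* Only the two entries quoted in the message. */
const struct pci_driver_entry ia64_nic_map[] = {
	{ 0x8086, 0x1229, "e100", "Intel Corporation|82559 [Ethernet Pro 100]" },
	{ 0x8086, 0x2449, "e100", "Intel Corporation|EtherExpress PRO/100" },
};

/* Linear search; returns the driver name or NULL if the ID is not listed. */
const char *find_driver(const struct pci_driver_entry *table, size_t n,
			uint16_t vendor, uint16_t device)
{
	size_t i;

	assert(table != NULL || n == 0);
	for (i = 0; i < n; i++) {
		if (table[i].vendor == vendor && table[i].device == device)
			return table[i].driver;
	}
	return NULL;
}

/* Convenience wrapper around the table above. */
const char *driver_for(uint16_t vendor, uint16_t device)
{
	return find_driver(ia64_nic_map,
			   sizeof(ia64_nic_map) / sizeof(ia64_nic_map[0]),
			   vendor, device);
}
```

With this table, driver_for(0x8086, 0x1229) yields "e100", and an ID that is
not listed yields NULL.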

> Mandrake ftp installation stalled in the middle, and when I straced the
> process from shell, it seemed to be loading rpm from ftp server, but the
> read always returned 1 byte, and it took about half a second to do so.

Strange. I never tried FTP installations on an HP i2000 but full installs
through NFS sure did work on this machine. FTP installs also work on our
Lions btw. Have you tried with another FTP server?

Bye,
Gwenolé.




* Re: [Linux-ia64] eepro100 hang?
  2001-12-07  8:10 [Linux-ia64] eepro100 hang? Ville Herva
                   ` (2 preceding siblings ...)
  2001-12-27 14:34 ` Gwenole Beauchesne
@ 2002-01-04  4:12 ` Grant Grundler
  3 siblings, 0 replies; 5+ messages in thread
From: Grant Grundler @ 2002-01-04  4:12 UTC
  To: linux-ia64

Gwenole Beauchesne wrote:
> The HW has a bug.
> 
> You need to run eepro100-diag -f -G -w -w -w  to fix it.

There's more than one bug in the 8255x 100bt cards.

We needed an additional patch to 2.4.16.
It doesn't look like this patch is in 2.4.17.

Without the appended patch, our problem was reproducible in about 10 minutes
under load from a traffic generator spewing 1-byte packets at nearly link
rate.  With the patch, it ran for 8+ hours.
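
To give a sense of scale for "1-byte packets at nearly link rate", here is a
small C sketch of the arithmetic. It assumes standard Ethernet framing (14-byte
header, payload padded to 46 bytes, 4-byte FCS, 8 bytes of preamble and
12 bytes of inter-frame gap); none of these figures or names come from this
thread, and the traffic generator's exact packet format is not known.

```c
#include <assert.h>
#include <stdint.h>

/* Standard Ethernet framing overheads (assumed, not from the thread). */
enum {
	ETH_HEADER_BYTES   = 14, /* destination, source, EtherType */
	ETH_MIN_PAYLOAD    = 46, /* shorter payloads are padded to this */
	ETH_FCS_BYTES      = 4,  /* frame check sequence */
	ETH_PREAMBLE_BYTES = 8,  /* preamble plus start-of-frame delimiter */
	ETH_IFG_BYTES      = 12  /* minimum inter-frame gap */
};

/* Bytes one frame occupies on the wire, including preamble and gap.
 * l2_payload_bytes is everything after the Ethernet header. */
unsigned wire_bytes(unsigned l2_payload_bytes)
{
	unsigned payload = l2_payload_bytes;

	if (payload < ETH_MIN_PAYLOAD)
		payload = ETH_MIN_PAYLOAD;
	return ETH_HEADER_BYTES + payload + ETH_FCS_BYTES
	       + ETH_PREAMBLE_BYTES + ETH_IFG_BYTES;
}

/* Upper bound on whole frames per second at the given link speed. */
uint64_t max_frames_per_sec(uint64_t link_bps, unsigned l2_payload_bytes)
{
	assert(link_bps > 0);
	return link_bps / ((uint64_t)wire_bytes(l2_payload_bytes) * 8);
}
```

Under these assumptions, max_frames_per_sec(100000000, 1) evaluates to 148809
frames per second on a 100 Mbit/s link, since any payload shorter than 46
bytes still occupies a minimum-size frame. That is well above the 60K
packets/second mentioned in the patch comment below.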

grant


----- Forwarded message from Tim Hockin -----

Delivery-date: Thu, 13 Dec 2001 17:07:42 +0000
Date: Thu, 13 Dec 2001 09:06:17 -0800
Subject: eepro100

voila
-- 
Tim Hockin
Systems Software Engineer
Sun Microsystems, Cobalt Server Appliances
thockin@sun.com

diff -ruN 2.4.14-orig/drivers/net/eepro100.c 2.4.14-cobalt/drivers/net/eepro100.c
--- 2.4.14-orig/drivers/net/eepro100.c	Tue Dec  4 14:30:09 2001
+++ 2.4.14-cobalt/drivers/net/eepro100.c	Tue Dec  4 14:03:35 2001
@@ -64,8 +64,8 @@
 
 /* A few values that may be tweaked. */
 /* The ring sizes should be a power of two for efficiency. */
-#define TX_RING_SIZE	32
-#define RX_RING_SIZE	32
+#define TX_RING_SIZE	64
+#define RX_RING_SIZE	1024
 /* How much slots multicast filter setup may take.
    Do not descrease without changing set_rx_mode() implementaion. */
 #define TX_MULTICAST_SIZE   2
@@ -1067,6 +1071,50 @@
 	outw(CUStart | SCBMaskEarlyRx | SCBMaskFlowCtl, ioaddr + SCBCmd);
 }
 
+/*
+ * Sometimes the receiver stops making progress.  This routine knows how to
+ * get it going again, without losing packets or being otherwise nasty like
+ * a chip reset would be.  Previously the driver had a whole sequence
+ * of if RxSuspended, if it's no buffers do one thing, if it's no resources,
+ * do another, etc.  But those things don't really matter.  Separate logic
+ * in the ISR provides for allocating buffers--the other half of operation
+ * is just making sure the receiver is active.  speedo_rx_soft_reset does that.
+ * This problem with the old, more involved algorithm is shown up under
+ * ping floods on the order of 60K packets/second on a 100Mbps fdx network.
+ */
+static void
+speedo_rx_soft_reset(struct net_device *dev)
+{
+	struct speedo_private *sp = dev->priv;
+	struct RxFD *rfd;
+	long ioaddr;
+
+	ioaddr = dev->base_addr;
+	wait_for_cmd_done(ioaddr + SCBCmd);
+	if (inb(ioaddr + SCBCmd) != 0) {
+		printk("%s: previous command stalled\n", dev->name);
+		return;
+	}
+	/*
+	* Put the hardware into a known state.
+	*/
+	outb(RxAbort, ioaddr + SCBCmd);
+
+	rfd = sp->rx_ringp[sp->cur_rx % RX_RING_SIZE];
+
+	rfd->rx_buf_addr = 0xffffffff;
+
+	wait_for_cmd_done(ioaddr + SCBCmd);
+
+	if (inb(ioaddr + SCBCmd) != 0) {
+		printk("%s: RxAbort command stalled\n", dev->name);
+		return;
+	}
+	outl(sp->rx_ring_dma[sp->cur_rx % RX_RING_SIZE],
+		ioaddr + SCBPointer);
+	outb(RxStart, ioaddr + SCBCmd);
+}
+
 /* Media monitoring and control. */
 static void speedo_timer(unsigned long data)
 {
@@ -1500,82 +1591,37 @@
 		if ((status & 0xfc00) == 0)
 			break;
 
-		/* Always check if all rx buffers are allocated.  --SAW */
-		speedo_refill_rx_buffers(dev, 0);
-
 		if ((status & 0x5000) ||	/* Packet received, or Rx error. */
 			(sp->rx_ring_state&(RrNoMem|RrPostponed)) == RrPostponed)
 									/* Need to gather the postponed packet. */
 			speedo_rx(dev);
 
-		if (status & 0x1000) {
-			spin_lock(&sp->lock);
-			if ((status & 0x003c) == 0x0028) {		/* No more Rx buffers. */
-				struct RxFD *rxf;
-				printk(KERN_WARNING "%s: card reports no RX buffers.\n",
-						dev->name);
-				rxf = sp->rx_ringp[sp->cur_rx % RX_RING_SIZE];
-				if (rxf == NULL) {
-					if (speedo_debug > 2)
-						printk(KERN_DEBUG
-								"%s: NULL cur_rx in speedo_interrupt().\n",
-								dev->name);
-					sp->rx_ring_state |= RrNoMem|RrNoResources;
-				} else if (rxf == sp->last_rxf) {
-					if (speedo_debug > 2)
-						printk(KERN_DEBUG
-								"%s: cur_rx is last in speedo_interrupt().\n",
-								dev->name);
-					sp->rx_ring_state |= RrNoMem|RrNoResources;
-				} else
-					outb(RxResumeNoResources, ioaddr + SCBCmd);
-			} else if ((status & 0x003c) == 0x0008) { /* No resources. */
-				struct RxFD *rxf;
-				printk(KERN_WARNING "%s: card reports no resources.\n",
-						dev->name);
-				rxf = sp->rx_ringp[sp->cur_rx % RX_RING_SIZE];
-				if (rxf == NULL) {
-					if (speedo_debug > 2)
-						printk(KERN_DEBUG
-								"%s: NULL cur_rx in speedo_interrupt().\n",
-								dev->name);
-					sp->rx_ring_state |= RrNoMem|RrNoResources;
-				} else if (rxf == sp->last_rxf) {
-					if (speedo_debug > 2)
-						printk(KERN_DEBUG
-								"%s: cur_rx is last in speedo_interrupt().\n",
-								dev->name);
-					sp->rx_ring_state |= RrNoMem|RrNoResources;
-				} else {
-					/* Restart the receiver. */
-					outl(sp->rx_ring_dma[sp->cur_rx % RX_RING_SIZE],
-						 ioaddr + SCBPointer);
-					outb(RxStart, ioaddr + SCBCmd);
-				}
-			}
-			sp->stats.rx_errors++;
-			spin_unlock(&sp->lock);
-		}
+		/* Always check if all rx buffers are allocated.  --SAW */
+		speedo_refill_rx_buffers(dev, 0);
 
-		if ((sp->rx_ring_state&(RrNoMem|RrNoResources)) == RrNoResources) {
-			printk(KERN_WARNING
-					"%s: restart the receiver after a possible hang.\n",
-					dev->name);
-			spin_lock(&sp->lock);
-			/* Restart the receiver.
-			   I'm not sure if it's always right to restart the receiver
-			   here but I don't know another way to prevent receiver hangs.
-			   1999/12/25 SAW */
-			outl(sp->rx_ring_dma[sp->cur_rx % RX_RING_SIZE],
-				 ioaddr + SCBPointer);
-			outb(RxStart, ioaddr + SCBCmd);
-			sp->rx_ring_state &= ~RrNoResources;
-			spin_unlock(&sp->lock);
+		spin_lock(&sp->lock);
+		/*
+		 * The chip may have suspended reception for various reasons.
+		 * Check for that, and re-prime it should this be the case.
+		 */
+		switch ((status >> 2) & 0xf) {
+		case 0: /* Idle */
+			break;
+		case 1:	/* Suspended */
+		case 2:	/* No resources (RxFDs) */
+		case 9:	/* Suspended with no more RBDs */
+		case 10: /* No resources due to no RBDs */
+		case 12: /* Ready with no RBDs */
+			speedo_rx_soft_reset(dev);
+			break;
+		case 3:  case 5:  case 6:  case 7:  case 8:
+		case 11:  case 13:  case 14:  case 15:
+			/* these are all reserved values */
+			break;
 		}
 
 		/* User interrupt, Command/Tx unit interrupt or CU not active. */
 		if (status & 0xA400) {
-			spin_lock(&sp->lock);
 			speedo_tx_buffer_gc(dev);
 			if (sp->tx_full
 				&& (int)(sp->cur_tx - sp->dirty_tx) < TX_QUEUE_UNFULL) {
@@ -1583,8 +1629,8 @@
 				sp->tx_full = 0;
 				netif_wake_queue(dev); /* Attention: under a spinlock.  --SAW */
 			}
-			spin_unlock(&sp->lock);
 		}
+		spin_unlock(&sp->lock);
 
 		if (--boguscnt < 0) {
 			printk(KERN_ERR "%s: Too much work at interrupt, status=0x%4.4x.\n",
@@ -1702,6 +1748,7 @@
 	int entry = sp->cur_rx % RX_RING_SIZE;
 	int rx_work_limit = sp->dirty_rx + RX_RING_SIZE - sp->cur_rx;
 	int alloc_ok = 1;
+	int npkts = 0;
 
 	if (speedo_debug > 4)
 		printk(KERN_DEBUG " In speedo_rx().\n");
@@ -1768,6 +1815,7 @@
 				memcpy(skb_put(skb, pkt_len), sp->rx_skbuff[entry]->tail,
 					   pkt_len);
 #endif
+				npkts++;
 			} else {
 				/* Pass up the already-filled skbuff. */
 				skb = sp->rx_skbuff[entry];
@@ -1778,6 +1826,7 @@
 				}
 				sp->rx_skbuff[entry] = NULL;
 				skb_put(skb, pkt_len);
+				npkts++;
 				sp->rx_ringp[entry] = NULL;
 				pci_unmap_single(sp->pdev, sp->rx_ring_dma[entry],
 						PKT_BUF_SZ + sizeof(struct RxFD), PCI_DMA_FROMDEVICE);
@@ -1798,7 +1847,8 @@
 	/* Try hard to refill the recently taken buffers. */
 	speedo_refill_rx_buffers(dev, 1);
 
-	sp->last_rx_time = jiffies;
+	if (npkts)
+		sp->last_rx_time = jiffies;
 
 	return 0;
 }


----- End forwarded message -----
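
The core of the patch is the switch on the receive-unit state held in bits
2..5 of the status word. The standalone C sketch below mirrors that decoding
so it can be tried without hardware. The function names are hypothetical, the
labels follow the case comments in the patch, and the label for state 4 is an
assumption because the patch's switch does not list that value.

```c
#include <assert.h>
#include <stdint.h>

/* Extract the 4-bit receive-unit state the same way the patch does. */
unsigned ru_state(uint16_t status)
{
	return (status >> 2) & 0xf;
}

/* Human-readable label for a receive-unit state (0..15). */
const char *ru_state_name(unsigned state)
{
	assert(state <= 0xf);
	switch (state) {
	case 0:  return "idle";
	case 1:  return "suspended";
	case 2:  return "no resources (RxFDs)";
	case 4:  return "ready (assumed; not listed in the patch)";
	case 9:  return "suspended with no more RBDs";
	case 10: return "no resources due to no RBDs";
	case 12: return "ready with no RBDs";
	default: return "reserved";
	}
}

/* Nonzero for the states in which the patch calls speedo_rx_soft_reset(). */
int ru_needs_soft_reset(uint16_t status)
{
	switch (ru_state(status)) {
	case 1: case 2: case 9: case 10: case 12:
		return 1;
	default:
		return 0;
	}
}
```

The two masks tested by the removed code, 0x0028 and 0x0008 under
(status & 0x003c), correspond to states 10 and 2 here, so both of the old
special cases end up on the single soft-reset path.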

-- 
Revolutions do not require corporate support.


