* Q: natsemi.c spinlocks @ 2000-12-23 17:45 Manfred 2000-12-24 0:54 ` Andrew Morton 0 siblings, 1 reply; 7+ messages in thread From: Manfred @ 2000-12-23 17:45 UTC (permalink / raw) To: mulder.abg, jgarzik; +Cc: linux-kernel Hi Jeff, Tjeerd, I spotted the spin_lock in natsemi.c, and I think it's bogus. The "simultaneous interrupt entry" is a bug in some 2.0 and 2.1 kernel (even Alan didn't remember it exactly when I asked him), thus a sane driver can assume that an interrupt handler is never reentered. Donald often uses dev->interrupt to hide other races, but I don't see anything in this driver (tx_timeout and netdev_timer are both trivial) -- Manfred - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org Please read the FAQ at http://www.tux.org/lkml/ ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: Q: natsemi.c spinlocks 2000-12-23 17:45 Q: natsemi.c spinlocks Manfred @ 2000-12-24 0:54 ` Andrew Morton 2000-12-24 11:15 ` Manfred 2001-01-21 14:43 ` [PATCH] " Jeff Garzik 0 siblings, 2 replies; 7+ messages in thread From: Andrew Morton @ 2000-12-24 0:54 UTC (permalink / raw) To: Manfred; +Cc: mulder.abg, jgarzik, linux-kernel Manfred wrote: > > Hi Jeff, Tjeerd, > > I spotted the spin_lock in natsemi.c, and I think it's bogus. > > The "simultaneous interrupt entry" is a bug in some 2.0 and 2.1 kernel > (even Alan didn't remember it exactly when I asked him), thus a sane > driver can assume that an interrupt handler is never reentered. > > Donald often uses dev->interrupt to hide other races, but I don't see > anything in this driver (tx_timeout and netdev_timer are both trivial) Hi, Manfed. I think you're right. 2.4's interrupt handling prevents simultaneous entry of the same ISR. However, natsemi.c's spinlock needs to be retained, and extended into start_tx(), because this driver has a race which has cropped up in a few others: Current code: start_tx() { ... if (np->cur_tx - np->dirty_tx >= TX_QUEUE_LEN - 1) { /* WINDOW HERE */ np->tx_full = 1; netif_stop_queue(dev); } ... } If the ring is currently full and an interrupt comes in at the indicated window and reaps ALL the packets in the ring, the driver ends up in state `tx_full = 1' and tramsmit disabled, but with no outstanding transmit interrupts. It's screwed. You need another interrupt so tx_full can be cleared and the queue can be restarted, but you can't *get* another interrupt because there are no Tx packets outstanding. It's very unlikely to happen with this particular driver because it's also polling the transmit queue within receive interrupts. Receiving a packet will clear the condition. If you were madly hosing out UDP packets and receiving nothing then this could occur. It was certainly triggerable in 3c59x.c, which doesn't test the Tx queue state in Rx interrupts. I currently have natsemi.c lying in pieces on my garage floor, so I'll put this locking in if it's OK with everyone? - - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org Please read the FAQ at http://www.tux.org/lkml/ ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: Q: natsemi.c spinlocks 2000-12-24 0:54 ` Andrew Morton @ 2000-12-24 11:15 ` Manfred 2001-01-21 14:38 ` Jeff Garzik 2001-01-21 14:43 ` [PATCH] " Jeff Garzik 1 sibling, 1 reply; 7+ messages in thread From: Manfred @ 2000-12-24 11:15 UTC (permalink / raw) To: Andrew Morton; +Cc: jgarzik, linux-kernel Andrew Morton wrote: > > start_tx() > { Yes, I overlooked start_tx. Hmm. start_tx also assumes that the cpu commits writes in order, I'm sure the driver is unreliable on RISC cpus. Perhaps the driver should use pci_alloc_consistent and pci_map_single? -- Manfred - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org Please read the FAQ at http://www.tux.org/lkml/ ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: Q: natsemi.c spinlocks 2000-12-24 11:15 ` Manfred @ 2001-01-21 14:38 ` Jeff Garzik 0 siblings, 0 replies; 7+ messages in thread From: Jeff Garzik @ 2001-01-21 14:38 UTC (permalink / raw) To: Manfred; +Cc: Andrew Morton, linux-kernel Manfred wrote: > > Andrew Morton wrote: > > > > start_tx() > > { > > Yes, I overlooked start_tx. > > Hmm. start_tx also assumes that the cpu commits writes in order, I'm > sure the driver is unreliable on RISC cpus. > > Perhaps the driver should use pci_alloc_consistent and pci_map_single? Eventually, all drivers which use PCI DMA of some sort -should- use pci_alloc_consistent, etc. Jeff -- Jeff Garzik | "You see, in this world there's two kinds of Building 1024 | people, my friend: Those with loaded guns MandrakeSoft | and those who dig. You dig." --Blondie - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org Please read the FAQ at http://www.tux.org/lkml/ ^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH] Re: Q: natsemi.c spinlocks 2000-12-24 0:54 ` Andrew Morton 2000-12-24 11:15 ` Manfred @ 2001-01-21 14:43 ` Jeff Garzik 2001-01-22 7:06 ` Donald Becker 1 sibling, 1 reply; 7+ messages in thread From: Jeff Garzik @ 2001-01-21 14:43 UTC (permalink / raw) To: Andrew Morton; +Cc: Manfred, mulder.abg, linux-kernel, Donald Becker [-- Attachment #1: Type: text/plain, Size: 2592 bytes --] Andrew Morton wrote: > > Manfred wrote: > > > > Hi Jeff, Tjeerd, > > > > I spotted the spin_lock in natsemi.c, and I think it's bogus. > > > > The "simultaneous interrupt entry" is a bug in some 2.0 and 2.1 kernel > > (even Alan didn't remember it exactly when I asked him), thus a sane > > driver can assume that an interrupt handler is never reentered. > > > > Donald often uses dev->interrupt to hide other races, but I don't see > > anything in this driver (tx_timeout and netdev_timer are both trivial) > > Hi, Manfed. > > I think you're right. 2.4's interrupt handling prevents > simultaneous entry of the same ISR. > > However, natsemi.c's spinlock needs to be retained, and > extended into start_tx(), because this driver has > a race which has cropped up in a few others: > > Current code: > > start_tx() > { > ... > if (np->cur_tx - np->dirty_tx >= TX_QUEUE_LEN - 1) { > /* WINDOW HERE */ > np->tx_full = 1; > netif_stop_queue(dev); > } > ... > } > > If the ring is currently full and an interrupt comes in > at the indicated window and reaps ALL the packets in the > ring, the driver ends up in state `tx_full = 1' and tramsmit > disabled, but with no outstanding transmit interrupts. > > It's screwed. You need another interrupt so tx_full > can be cleared and the queue can be restarted, but you can't > *get* another interrupt because there are no Tx packets outstanding. > > It's very unlikely to happen with this particular driver > because it's also polling the transmit queue within > receive interrupts. Receiving a packet will clear > the condition. > > If you were madly hosing out UDP packets and receiving nothing > then this could occur. It was certainly triggerable in 3c59x.c, > which doesn't test the Tx queue state in Rx interrupts. > > I currently have natsemi.c lying in pieces on my garage floor, > so I'll put this locking in if it's OK with everyone? (entire message quoted, as it was from 12/23/2000) Attached is a patch against 2.4.1-pre9, which includes the changes I would prefer. Comments? The Tx locking is a bit conservative -- I think Donald suggested it could be removed completely -- but I would prefer to have something I am 100% certain will work, and then test the driver without locking under stress conditions to make sure no race or other bug exists. Jeff -- Jeff Garzik | "You see, in this world there's two kinds of Building 1024 | people, my friend: Those with loaded guns MandrakeSoft | and those who dig. You dig." --Blondie [-- Attachment #2: natsemi-2.4.1.9.patch --] [-- Type: text/plain, Size: 4201 bytes --] Index: linux_2_4/drivers/net/natsemi.c diff -u linux_2_4/drivers/net/natsemi.c:1.1.1.6 linux_2_4/drivers/net/natsemi.c:1.1.1.6.24.1 --- linux_2_4/drivers/net/natsemi.c:1.1.1.6 Mon Dec 11 19:23:42 2000 +++ linux_2_4/drivers/net/natsemi.c Sun Jan 21 06:39:02 2001 @@ -26,6 +26,11 @@ - Bug fixes and better intr performance (Tjeerd) Version 1.0.2: - Now reads correct MAC address from eeprom + Version 1.0.3: + - Eliminate redundant priv->tx_full flag + - Call netif_start_queue from dev->tx_timeout + - wmb() in start_tx() to flush data + - Update Tx locking */ @@ -35,7 +40,7 @@ static const char version2[] = " http://www.scyld.com/network/natsemi.html\n"; static const char version3[] = -" (unofficial 2.4.x kernel port, version 1.0.2, October 6, 2000 Jeff Garzik, Tjeerd Mulder)\n"; +" (unofficial 2.4.x kernel port, version 1.0.3, January 21, 2001 Jeff Garzik, Tjeerd Mulder)\n"; /* Updated to recommendations in pci-skeleton v2.03. */ /* Automatically extracted configuration info: @@ -187,13 +192,14 @@ The send packet thread has partial control over the Tx ring and 'dev->tbusy' flag. It sets the tbusy flag whenever it's queuing a Tx packet. If the next -queue slot is empty, it clears the tbusy flag when finished otherwise it sets -the 'lp->tx_full' flag. +queue slot is empty, it clears the tbusy flag when finished. Under 2.4, the +"tbusy flag" is now controlled by netif_{start,stop,wake}_queue() and tested +by netif_queue_stopped(). The interrupt handler has exclusive control over the Rx ring and records stats from the Tx ring. After reaping the stats, it marks the Tx queue entry as -empty by incrementing the dirty_tx mark. Iff the 'lp->tx_full' flag is set, it -clears both the tx_full and tbusy flags. +empty by incrementing the dirty_tx mark. Iff Tx queueing is stopped and Tx +entries were reaped, the Tx queue is started and scheduled. IV. Notes @@ -319,7 +325,6 @@ unsigned int cur_rx, dirty_rx; /* Producer/consumer ring indices */ unsigned int cur_tx, dirty_tx; unsigned int rx_buf_sz; /* Based on MTU+slack. */ - unsigned int tx_full:1; /* The Tx queue is full. */ /* These values are keep track of the transceiver/media in use. */ unsigned int full_duplex:1; /* Full-duplex operation requested. */ unsigned int duplex_lock:1; @@ -697,7 +702,7 @@ dev->trans_start = jiffies; np->stats.tx_errors++; - return; + netif_start_queue(dev); } @@ -707,7 +712,6 @@ struct netdev_private *np = (struct netdev_private *)dev->priv; int i; - np->tx_full = 0; np->cur_rx = np->cur_tx = 0; np->dirty_rx = np->dirty_tx = 0; @@ -763,11 +767,13 @@ np->cur_tx++; /* StrongARM: Explicitly cache flush np->tx_ring and skb->data,skb->len. */ + wmb(); - if (np->cur_tx - np->dirty_tx >= TX_QUEUE_LEN - 1) { - np->tx_full = 1; + spin_lock_irq(&np->lock); + if (np->cur_tx - np->dirty_tx >= TX_QUEUE_LEN - 1) netif_stop_queue(dev); - } + spin_unlock_irq(&np->lock); + /* Wake the potentially-idle transmit channel. */ writel(TxOn, dev->base_addr + ChipCmd); @@ -798,9 +804,7 @@ #endif ioaddr = dev->base_addr; - np = (struct netdev_private *)dev->priv; - - spin_lock(&np->lock); + np = dev->priv; do { u32 intr_status = readl(ioaddr + IntrStatus); @@ -818,6 +822,8 @@ if (intr_status & (IntrRxDone | IntrRxErr | IntrRxIdle | IntrRxOverrun)) netdev_rx(dev); + spin_lock(&np->lock); + for (; np->cur_tx - np->dirty_tx > 0; np->dirty_tx++) { int entry = np->dirty_tx % TX_RING_SIZE; if (np->tx_ring[entry].cmd_status & cpu_to_le32(DescOwn)) @@ -839,13 +845,14 @@ dev_kfree_skb_irq(np->tx_skbuff[entry]); np->tx_skbuff[entry] = 0; } - if (np->tx_full + if (netif_queue_stopped(dev) && np->cur_tx - np->dirty_tx < TX_QUEUE_LEN - 4) { /* The ring is no longer full, wake queue. */ - np->tx_full = 0; netif_wake_queue(dev); } + spin_unlock(&np->lock); + /* Abnormal error summary/uncommon events handlers. */ if (intr_status & IntrAbnormalSummary) netdev_error(dev, intr_status); @@ -873,8 +880,6 @@ } } #endif - - spin_unlock(&np->lock); } /* This routine is logically part of the interrupt handler, but separated ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] Re: Q: natsemi.c spinlocks 2001-01-21 14:43 ` [PATCH] " Jeff Garzik @ 2001-01-22 7:06 ` Donald Becker 2001-01-22 8:51 ` Manfred Spraul 0 siblings, 1 reply; 7+ messages in thread From: Donald Becker @ 2001-01-22 7:06 UTC (permalink / raw) To: Jeff Garzik; +Cc: Andrew Morton, Manfred, mulder.abg, linux-kernel On Sun, 21 Jan 2001, Jeff Garzik wrote: > Andrew Morton wrote: > > Manfred wrote: > > > Hi Jeff, Tjeerd, > > > I spotted the spin_lock in natsemi.c, and I think it's bogus. > > > > > > The "simultaneous interrupt entry" is a bug in some 2.0 and 2.1 kernel > > > (even Alan didn't remember it exactly when I asked him), thus a sane > > > driver can assume that an interrupt handler is never reentered. ... > > I think you're right. 2.4's interrupt handling prevents > > simultaneous entry of the same ISR. The bug (simultaneous calls to the interrupt handler on SMP) existed in most 2.0 versions was fixed before 2.2. A driver that needs to work with multiple kernel versions must have the check. > > However, natsemi.c's spinlock needs to be retained, and > > extended into start_tx(), because this driver has > > a race which has cropped up in a few others: > > ... > > if (np->cur_tx - np->dirty_tx >= TX_QUEUE_LEN - 1) { > > /* WINDOW HERE */ > > np->tx_full = 1; > > netif_stop_queue(dev); > > } > > If the ring is currently full and an interrupt comes in > > at the indicated window and reaps ALL the packets in the > > ring, the driver ends up in state `tx_full = 1' and tramsmit > > disabled, but with no outstanding transmit interrupts. The better solution, which I've been adding to the drivers, is to check again for a just-cleared Tx queue after setting tx_full. That trades an extra comparison on a rarely followed path for a spinlock that is taken for every transmit and interrupt. Remember: spinlocks are expensive! Donald Becker becker@scyld.com Scyld Computing Corporation http://www.scyld.com 410 Severn Ave. Suite 210 Second Generation Beowulf Clusters Annapolis MD 21403 410-990-9993 - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org Please read the FAQ at http://www.tux.org/lkml/ ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] Re: Q: natsemi.c spinlocks 2001-01-22 7:06 ` Donald Becker @ 2001-01-22 8:51 ` Manfred Spraul 0 siblings, 0 replies; 7+ messages in thread From: Manfred Spraul @ 2001-01-22 8:51 UTC (permalink / raw) To: Donald Becker; +Cc: Jeff Garzik, Andrew Morton, mulder.abg, linux-kernel Donald Becker wrote: > > > > However, natsemi.c's spinlock needs to be retained, and > > > extended into start_tx(), because this driver has > > > a race which has cropped up in a few others: > > > ... > > > if (np->cur_tx - np->dirty_tx >= TX_QUEUE_LEN - 1) { > > > /* WINDOW HERE */ > > > np->tx_full = 1; > > > netif_stop_queue(dev); > > > } > > > If the ring is currently full and an interrupt comes in > > > at the indicated window and reaps ALL the packets in the > > > ring, the driver ends up in state `tx_full = 1' and tramsmit > > > disabled, but with no outstanding transmit interrupts. > > The better solution, which I've been adding to the drivers, is to check > again for a just-cleared Tx queue after setting tx_full. > That trades an extra comparison on a rarely followed path for a spinlock > that is taken for every transmit and interrupt. > Please do not forget the memory barrier(s): tx_full = 1; if(condition) ...; That's exactly the sequence that caused deadlocks with wait_queues - even a Pentium cpu will evaluate the condition before the write to tx_full is commited. I have a test program (userspace) that reliably locks up on my P II. I can send you the details if you are interested. I think you also need a memory barrier in the tx_interrupt codepath. > > Remember: spinlocks are expensive! > But memory barriers are extremely error prone. What about tx_interrupt() if(netif_queue_stopped(dev)) { spin_lock(&np->lock); if(np->cur_tx - np->dirty_tx <= TX_QUEUE_LEN/2) netif_wake_queue(dev)); spin_unlock(&np->lock); } hard_xmit() if(np->cur_tx - np->dirty_tx >= TX_QUEUE_LEN-1) { spin_lock_irq(&np->lock); if(np->cur_tx - np_dirty_tx >= TX_QUEUE_LEN-1) netif_stop_queue(dev); spin_unlock_irq(&np->lock); } -- Manfred - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org Please read the FAQ at http://www.tux.org/lkml/ ^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2001-01-22 8:52 UTC | newest] Thread overview: 7+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2000-12-23 17:45 Q: natsemi.c spinlocks Manfred 2000-12-24 0:54 ` Andrew Morton 2000-12-24 11:15 ` Manfred 2001-01-21 14:38 ` Jeff Garzik 2001-01-21 14:43 ` [PATCH] " Jeff Garzik 2001-01-22 7:06 ` Donald Becker 2001-01-22 8:51 ` Manfred Spraul
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox