All of lore.kernel.org
 help / color / mirror / Atom feed
From: peter swain <pswain@akamai.com>
To: "Udo A. Steinberg" <us15@os.inf.tu-dresden.de>
Cc: Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Len Brown <len.brown@intel.com>,
	andrews@gate.ort.spb.ru, scott.feldman@intel.com
Subject: [patch] e100 and shared interrupts [was: Spurious interrupts when SCI shared with e100]
Date: Mon, 08 Nov 2004 12:54:34 -0800	[thread overview]
Message-ID: <418FDD0A.5010308@akamai.com> (raw)
In-Reply-To: <20041108115955.1c8bf10f.us15@os.inf.tu-dresden.de>

[-- Attachment #1: Type: text/plain, Size: 3463 bytes --]

Udo A. Steinberg wrote:

>My laptop has IRQ9 configured as ACPI SCI. When IRQ9 is shared between ACPI
>and e100 an IRQ9 storm occurs when e100 is enabled, as can be seen in the
>dmesg output below. The kernel then disables IRQ9, thus preventing e100 from
>working properly. The problem does not occur, if I override the default PCI
>steering in the BIOS, e.g. by routing LNKA to IRQ11.
>
>Nonetheless it would be good if someone could figure out why sharing IRQ9 
>is problematic.
>  
>
Udo, please try the attached 2.6.9 patch.

I suspect intel e100 driver always had problems with shared interrupts, 
due to a classic case of
"executable comment syndrome" in the original intel driver.
I noticed races on some of our ancient boxes with e100s, sharing 
interrupts with other e100s, or other cards.

The intel e100 v2.0.40 e100intr() code says.....
        intr_status = readw(&bdp->scb->scb_status);
        /* If not my interrupt, just return */
        if (!(intr_status & SCB_STATUS_ACK_MASK) || (intr_status == 
0xffff)) {
                return IRQ_NONE;
        }

But the test above is *not* "not my interrupt" but "no interesting 
conditions".
Both tests are needed for reliable operation in a shared-irq scenario.

When the driver is meddling with e100 setup, causing intr_status bits to 
pop up, but has not yet enabled
e100 interrupts, a "stray" interrupt from a device sharing the IRQ will 
cause e100intr() to walk all over the
device even if the driver is in a critical section.  Chaos ensues -- in 
my case duplicate skb_free calls when
a parasitic interrupt "completed" processing of tx ring skbs which napi 
bh was still setting up.

Compare with becker's eepro100, where there are (IIRC) 3 locks -- a lock 
on tx resources, a lock on rx resources, and an implicit lock by way of 
the card's interrupt-enable bit.  If you enable interrupts, you're
allowing the irq-handler to play with tx,rx resources without any other 
locks.  So a courteous irq-handler will
check to see if it has been granted the privilege, by inspecting the 
interrupt-enable bit.
Other drivers follow this model, but e100 driver merely has a misleading 
comment instead of the check.

I'd guess that Udo's problem appears when...
- e100 driver is meddling with card, with e100 interrupt enable *off*
- ACPI interrupt causes e100intr to be invoked parasitcally, cleaning up 
(or breaking) some half-finished
   work intended to be guarded by interrupt-enable bit
- driver enables e100 interrupt
- e100-driven IRQ calls e100intr(), finds no work to do
- (conjecture) 2.6.10-rc1 actually checks the IRQ_NONE return, notes 
spurious interrupt & disables IRQ

Our e100 problems went away on hundreds of old linux-2.4 dual-e100 boxes 
when the attached e100-v2-irq-share.patch was applied. It applies 
cleanly to linux-2.4.27, but is untested there.
I'd seen no lkml mention of shared-irq e100 problems, so was letting it 
bake for a couple of weeks before
declaring success, porting it to recent 2.6 and publishing.

But Udo has a problem, so let's see if this fixes it...
My linux-2.6.9 patch is an untested transliteration to intel's e100-v3 
driver structure.
(v3 driver has one e100.c, v2 has e100/*.c tree, cutover was about 
2.6.4? 5?)

I'm not sure if this is implicated in the widespread mistrust of e100 
multiport cards under linux, the e100-eepro100 wars, or the recent 
e100-suspend problems. Could be the missing piece in all.


^..^ feedback very welcome
(oo)




[-- Attachment #2: e100-v2-irq-share.patch --]
[-- Type: text/plain, Size: 1709 bytes --]

## add a module param own_irq=1 to forbid interrupt sharing
## add a check in e100intr for device interrupt masking -- if the driver has masked irqs off,
##  don't execute the usual irq service, but simply report the clash

--- Linux/drivers/net/e100/e100_main.c	Thu Oct 21 17:25:24 2004
+++ linux/drivers/net/e100/e100_main.c	Fri Oct 22 11:45:27 2004
@@ -424,6 +424,9 @@ E100_PARAM(BundleMax, "Maximum number fo
 E100_PARAM(IFS, "Disable or enable the adaptive IFS algorithm");
 E100_PARAM(weight, "rx packets processed per poll");
 
+int own_irq = 0; /* every card gets *own* IRQ? */
+MODULE_PARM(own_irq, "i");
+
 /**
  * e100_exec_cmd - issue a comand
  * @bdp: atapter's private data struct
@@ -1153,7 +1156,7 @@ e100_open(struct net_device *dev)
 		netif_start_queue(dev);
 
 	e100_start_ru(bdp);
-	if ((rc = request_irq(dev->irq, &e100intr, SA_SHIRQ,
+	if ((rc = request_irq(dev->irq, &e100intr, own_irq ? 0 : SA_SHIRQ,
 			      dev->name, dev)) != 0) {
 		del_timer_sync(&bdp->watchdog_timer);
 		goto err_exit;
@@ -2062,11 +2065,20 @@ e100intr(int irq, void *dev_inst, struct
 	dev = dev_inst;
 	bdp = dev->priv;
 
-	intr_status = readw(&bdp->scb->scb_status);
 	/* If not my interrupt, just return */
-	if (!(intr_status & SCB_STATUS_ACK_MASK) || (intr_status == 0xffff)) {
+	if (readb(&bdp->scb->scb_cmd_hi) & SCB_INT_MASK) {
+		static int once = 0;
+
+		if (!once)
+			printk(KERN_ERR "e100intr ignoring disabled interrupt, suspect irq-sharing\n");
+		once = 1;
 		return IRQ_NONE;
 	}
+
+	/* If no pending action, just return */
+	intr_status = readw(&bdp->scb->scb_status);
+	if (!(intr_status & SCB_STATUS_ACK_MASK) || (intr_status == 0xffff))
+		return IRQ_NONE;
 
 
 #ifdef CONFIG_E100_NAPI

[-- Attachment #3: e100-linux-2.6.9-irq-sharing.patch --]
[-- Type: text/plain, Size: 807 bytes --]

--- drivers/net/e100.c-pre-swine	Mon Nov  8 12:19:08 2004
+++ drivers/net/e100.c	Mon Nov  8 12:25:36 2004
@@ -1580,11 +1580,18 @@ static irqreturn_t e100_intr(int irq, vo
 {
 	struct net_device *netdev = dev_id;
 	struct nic *nic = netdev_priv(netdev);
-	u8 stat_ack = readb(&nic->csr->scb.stat_ack);
+	u8 stat_ack, cmd_hi;
 
+	cmd_hi = readb(&nic->csr->scb.cmd_hi);
+	DPRINTK(INTR, DEBUG, "cmd_hi = 0x%02X\n", cmd_hi);
+
+	if(cmd_hi & irq_mask_all)	/* Not our interrupt */
+		return IRQ_NONE;
+
+	stat_ack = readb(&nic->csr->scb.stat_ack);
 	DPRINTK(INTR, DEBUG, "stat_ack = 0x%02X\n", stat_ack);
 
-	if(stat_ack == stat_ack_not_ours ||	/* Not our interrupt */
+	if(stat_ack == stat_ack_not_ours ||	/* nothing to do */
 	   stat_ack == stat_ack_not_present)	/* Hardware is ejected */
 		return IRQ_NONE;
 

  reply	other threads:[~2004-11-08 20:59 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2004-11-08 10:59 Spurious interrupts when SCI shared with e100 Udo A. Steinberg
2004-11-08 20:54 ` peter swain [this message]
2004-11-09 22:22   ` [patch] e100 and shared interrupts [was: Spurious interrupts when SCI shared with e100] J.A. Magallon
2004-11-09  5:42 ` Spurious interrupts when SCI shared with e100 Len Brown
2004-11-09 16:21   ` Phil Oester

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=418FDD0A.5010308@akamai.com \
    --to=pswain@akamai.com \
    --cc=andrews@gate.ort.spb.ru \
    --cc=len.brown@intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=scott.feldman@intel.com \
    --cc=us15@os.inf.tu-dresden.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.