From: peter swain <pswain@akamai.com>
To: "Udo A. Steinberg" <us15@os.inf.tu-dresden.de>
Cc: Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
Len Brown <len.brown@intel.com>,
andrews@gate.ort.spb.ru, scott.feldman@intel.com
Subject: [patch] e100 and shared interrupts [was: Spurious interrupts when SCI shared with e100]
Date: Mon, 08 Nov 2004 12:54:34 -0800 [thread overview]
Message-ID: <418FDD0A.5010308@akamai.com> (raw)
In-Reply-To: <20041108115955.1c8bf10f.us15@os.inf.tu-dresden.de>
[-- Attachment #1: Type: text/plain, Size: 3463 bytes --]
Udo A. Steinberg wrote:
>My laptop has IRQ9 configured as ACPI SCI. When IRQ9 is shared between ACPI
>and e100 an IRQ9 storm occurs when e100 is enabled, as can be seen in the
>dmesg output below. The kernel then disables IRQ9, thus preventing e100 from
>working properly. The problem does not occur, if I override the default PCI
>steering in the BIOS, e.g. by routing LNKA to IRQ11.
>
>Nonetheless it would be good if someone could figure out why sharing IRQ9
>is problematic.
>
>
Udo, please try the attached 2.6.9 patch.
I suspect intel e100 driver always had problems with shared interrupts,
due to a classic case of
"executable comment syndrome" in the original intel driver.
I noticed races on some of our ancient boxes with e100s, sharing
interrupts with other e100s, or other cards.
The intel e100 v2.0.40 e100intr() code says.....
intr_status = readw(&bdp->scb->scb_status);
/* If not my interrupt, just return */
if (!(intr_status & SCB_STATUS_ACK_MASK) || (intr_status ==
0xffff)) {
return IRQ_NONE;
}
But the test above is *not* "not my interrupt" but "no interesting
conditions".
Both tests are needed for reliable operation in a shared-irq scenario.
When the driver is meddling with e100 setup, causing intr_status bits to
pop up, but has not yet enabled
e100 interrupts, a "stray" interrupt from a device sharing the IRQ will
cause e100intr() to walk all over the
device even if the driver is in a critical section. Chaos ensues -- in
my case duplicate skb_free calls when
a parasitic interrupt "completed" processing of tx ring skbs which napi
bh was still setting up.
Compare with becker's eepro100, where there are (IIRC) 3 locks -- a lock
on tx resources, a lock on rx resources, and an implicit lock by way of
the card's interrupt-enable bit. If you enable interrupts, you're
allowing the irq-handler to play with tx,rx resources without any other
locks. So a courteous irq-handler will
check to see if it has been granted the privilege, by inspecting the
interrupt-enable bit.
Other drivers follow this model, but e100 driver merely has a misleading
comment instead of the check.
I'd guess that Udo's problem appears when...
- e100 driver is meddling with card, with e100 interrupt enable *off*
- ACPI interrupt causes e100intr to be invoked parasitcally, cleaning up
(or breaking) some half-finished
work intended to be guarded by interrupt-enable bit
- driver enables e100 interrupt
- e100-driven IRQ calls e100intr(), finds no work to do
- (conjecture) 2.6.10-rc1 actually checks the IRQ_NONE return, notes
spurious interrupt & disables IRQ
Our e100 problems went away on hundreds of old linux-2.4 dual-e100 boxes
when the attached e100-v2-irq-share.patch was applied. It applies
cleanly to linux-2.4.27, but is untested there.
I'd seen no lkml mention of shared-irq e100 problems, so was letting it
bake for a couple of weeks before
declaring success, porting it to recent 2.6 and publishing.
But Udo has a problem, so let's see if this fixes it...
My linux-2.6.9 patch is an untested transliteration to intel's e100-v3
driver structure.
(v3 driver has one e100.c, v2 has e100/*.c tree, cutover was about
2.6.4? 5?)
I'm not sure if this is implicated in the widespread mistrust of e100
multiport cards under linux, the e100-eepro100 wars, or the recent
e100-suspend problems. Could be the missing piece in all.
^..^ feedback very welcome
(oo)
[-- Attachment #2: e100-v2-irq-share.patch --]
[-- Type: text/plain, Size: 1709 bytes --]
## add a module param own_irq=1 to forbid interrupt sharing
## add a check in e100intr for device interrupt masking -- if the driver has masked irqs off,
## don't execute the usual irq service, but simply report the clash
--- Linux/drivers/net/e100/e100_main.c Thu Oct 21 17:25:24 2004
+++ linux/drivers/net/e100/e100_main.c Fri Oct 22 11:45:27 2004
@@ -424,6 +424,9 @@ E100_PARAM(BundleMax, "Maximum number fo
E100_PARAM(IFS, "Disable or enable the adaptive IFS algorithm");
E100_PARAM(weight, "rx packets processed per poll");
+int own_irq = 0; /* every card gets *own* IRQ? */
+MODULE_PARM(own_irq, "i");
+
/**
* e100_exec_cmd - issue a comand
* @bdp: atapter's private data struct
@@ -1153,7 +1156,7 @@ e100_open(struct net_device *dev)
netif_start_queue(dev);
e100_start_ru(bdp);
- if ((rc = request_irq(dev->irq, &e100intr, SA_SHIRQ,
+ if ((rc = request_irq(dev->irq, &e100intr, own_irq ? 0 : SA_SHIRQ,
dev->name, dev)) != 0) {
del_timer_sync(&bdp->watchdog_timer);
goto err_exit;
@@ -2062,11 +2065,20 @@ e100intr(int irq, void *dev_inst, struct
dev = dev_inst;
bdp = dev->priv;
- intr_status = readw(&bdp->scb->scb_status);
/* If not my interrupt, just return */
- if (!(intr_status & SCB_STATUS_ACK_MASK) || (intr_status == 0xffff)) {
+ if (readb(&bdp->scb->scb_cmd_hi) & SCB_INT_MASK) {
+ static int once = 0;
+
+ if (!once)
+ printk(KERN_ERR "e100intr ignoring disabled interrupt, suspect irq-sharing\n");
+ once = 1;
return IRQ_NONE;
}
+
+ /* If no pending action, just return */
+ intr_status = readw(&bdp->scb->scb_status);
+ if (!(intr_status & SCB_STATUS_ACK_MASK) || (intr_status == 0xffff))
+ return IRQ_NONE;
#ifdef CONFIG_E100_NAPI
[-- Attachment #3: e100-linux-2.6.9-irq-sharing.patch --]
[-- Type: text/plain, Size: 807 bytes --]
--- drivers/net/e100.c-pre-swine Mon Nov 8 12:19:08 2004
+++ drivers/net/e100.c Mon Nov 8 12:25:36 2004
@@ -1580,11 +1580,18 @@ static irqreturn_t e100_intr(int irq, vo
{
struct net_device *netdev = dev_id;
struct nic *nic = netdev_priv(netdev);
- u8 stat_ack = readb(&nic->csr->scb.stat_ack);
+ u8 stat_ack, cmd_hi;
+ cmd_hi = readb(&nic->csr->scb.cmd_hi);
+ DPRINTK(INTR, DEBUG, "cmd_hi = 0x%02X\n", cmd_hi);
+
+ if(cmd_hi & irq_mask_all) /* Not our interrupt */
+ return IRQ_NONE;
+
+ stat_ack = readb(&nic->csr->scb.stat_ack);
DPRINTK(INTR, DEBUG, "stat_ack = 0x%02X\n", stat_ack);
- if(stat_ack == stat_ack_not_ours || /* Not our interrupt */
+ if(stat_ack == stat_ack_not_ours || /* nothing to do */
stat_ack == stat_ack_not_present) /* Hardware is ejected */
return IRQ_NONE;
next prev parent reply other threads:[~2004-11-08 20:59 UTC|newest]
Thread overview: 5+ messages / expand[flat|nested] mbox.gz Atom feed top
2004-11-08 10:59 Spurious interrupts when SCI shared with e100 Udo A. Steinberg
2004-11-08 20:54 ` peter swain [this message]
2004-11-09 22:22 ` [patch] e100 and shared interrupts [was: Spurious interrupts when SCI shared with e100] J.A. Magallon
2004-11-09 5:42 ` Spurious interrupts when SCI shared with e100 Len Brown
2004-11-09 16:21 ` Phil Oester
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=418FDD0A.5010308@akamai.com \
--to=pswain@akamai.com \
--cc=andrews@gate.ort.spb.ru \
--cc=len.brown@intel.com \
--cc=linux-kernel@vger.kernel.org \
--cc=scott.feldman@intel.com \
--cc=us15@os.inf.tu-dresden.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox