From: "Siddha, Suresh B" <suresh.b.siddha@intel.com>
To: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: "Darrick J. Wong" <djwong@us.ibm.com>,
"Siddha, Suresh B" <suresh.b.siddha@intel.com>,
linux-kernel@vger.kernel.org
Subject: Re: Device hang when offlining a CPU due to IRQ misrouting
Date: Tue, 19 Jun 2007 11:00:03 -0700
Message-ID: <20070619180003.GE7160@linux-os.sc.intel.com>
In-Reply-To: <m1sl8nn82i.fsf@ebiederm.dsl.xmission.com>
On Tue, Jun 19, 2007 at 11:54:45AM -0600, Eric W. Biederman wrote:
> "Darrick J. Wong" <djwong@us.ibm.com> writes:
>
> > On Mon, Jun 18, 2007 at 04:54:34PM -0700, Siddha, Suresh B wrote:
> >
> >> > <call to set_affinity>
> >> > [ 256.298787] irq=4341 affinity=d
> >> > <ethernet on irq 4341 stops working>
> >>
> >> And just to make sure, at this point, your MSI irq 4341 affinity
> >> (/proc/irq/4341/smp_affinity) still points to '2'?
> >
> > Actually, it's 0xD. From the kernel's perspective the mask has been
> > updated (and I even stuck a printk into set_msi_irq_affinity to verify
> > that the writes are happening) but ... the hardware doesn't seem to
> > reflect this. I also tried putting read_msi_msg right afterwards to
> > compare contents, though it complained about all the MSIs _except_ for
> > 4341. (Of course, I could just be way off on the effectiveness of
> > that.)
>
> The fact that MSI interrupts are having problems is odd. It is possible
> that we still have a bug in there somewhere but msi interrupts should
> be safe to migrate outside of irq context (no known hardware bugs).
> As we can actually synchronize with the irq source and eliminate all
> of the migration races.
>
> The non-msi case requires hitting a hardware race that is rare enough
> you should not normally have problems.

Yep. But Darrick seems to be saying that the problem happens consistently.

Anyhow, Darrick, there is a general bug in this area. Can you try this and
see if it helps?

diff --git a/arch/x86_64/kernel/irq.c b/arch/x86_64/kernel/irq.c
index 3eaceac..a0e11c9 100644
--- a/arch/x86_64/kernel/irq.c
+++ b/arch/x86_64/kernel/irq.c
@@ -144,17 +144,35 @@ void fixup_irqs(cpumask_t map)
 	for (irq = 0; irq < NR_IRQS; irq++) {
 		cpumask_t mask;
+		int break_affinity = 0;
+		int set_affinity = 1;
+
 		if (irq == 2)
 			continue;
+		/* irq's are disabled at this point */
+		spin_lock(&irq_desc[irq].lock);
+
 		cpus_and(mask, irq_desc[irq].affinity, map);
 		if (any_online_cpu(mask) == NR_CPUS) {
-			printk("Breaking affinity for irq %i\n", irq);
+			break_affinity = 1;
 			mask = map;
 		}
+
+		irq_desc[irq].chip->mask(irq);
+
 		if (irq_desc[irq].chip->set_affinity)
 			irq_desc[irq].chip->set_affinity(irq, mask);
 		else if (irq_desc[irq].action && !(warned++))
+			set_affinity = 0;
+
+		irq_desc[irq].chip->unmask(irq);
+
+		spin_unlock(&irq_desc[irq].lock);
+
+		if (break_affinity && set_affinity)
+			printk("Broke affinity for irq %i\n", irq);
+		else if (!set_affinity)
 			printk("Cannot set affinity for irq %i\n", irq);
 	}
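
For anyone following along, the mask selection that the patch leaves unchanged (cpus_and() followed by the any_online_cpu() check) can be sketched in userspace roughly as below. This is only an illustration, not kernel code: cpumask_bits and pick_affinity() are hypothetical names, a plain unsigned int stands in for cpumask_t, and "online" plays the role of the map argument to fixup_irqs().

```c
#include <assert.h>

/* Hypothetical stand-in for cpumask_t: one bit per CPU, at most 32 CPUs. */
typedef unsigned int cpumask_bits;

/*
 * Choose the new affinity for an irq when some CPUs go away.
 *
 * affinity: CPUs the irq is currently allowed to target
 * online:   CPUs that remain usable (the "map" argument in fixup_irqs)
 * broke:    set to 1 if no current target survives, so the old
 *           affinity has to be discarded; set to 0 otherwise
 */
static cpumask_bits pick_affinity(cpumask_bits affinity, cpumask_bits online,
				  int *broke)
{
	cpumask_bits mask = affinity & online;	/* plays the role of cpus_and() */

	/* Assumption of this sketch: at least one CPU stays online. */
	assert(online != 0);

	*broke = 0;
	if (mask == 0) {
		/* No surviving target: fall back to every remaining CPU. */
		*broke = 1;
		mask = online;
	}
	return mask;
}
```

Assuming a four-CPU box where CPU 1 is being offlined (so online is 0xd), an irq bound to 0x2 yields pick_affinity(0x2, 0xd, &broke) == 0xd with broke set to 1, which would be consistent with the "affinity=d" line quoted above. An irq bound to 0x6 would keep only CPU 2 (0x4) and broke would stay 0.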