From: "Siddha, Suresh B" <suresh.b.siddha@intel.com>
To: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: "Darrick J. Wong" <djwong@us.ibm.com>,
"Siddha, Suresh B" <suresh.b.siddha@intel.com>,
linux-kernel@vger.kernel.org
Subject: Re: Device hang when offlining a CPU due to IRQ misrouting
Date: Tue, 19 Jun 2007 11:00:03 -0700
Message-ID: <20070619180003.GE7160@linux-os.sc.intel.com>
In-Reply-To: <m1sl8nn82i.fsf@ebiederm.dsl.xmission.com>
On Tue, Jun 19, 2007 at 11:54:45AM -0600, Eric W. Biederman wrote:
> "Darrick J. Wong" <djwong@us.ibm.com> writes:
>
> > On Mon, Jun 18, 2007 at 04:54:34PM -0700, Siddha, Suresh B wrote:
> >
> >> > <call to set_affinity>
> >> > [ 256.298787] irq=4341 affinity=d
> >> > <ethernet on irq 4341 stops working>
> >>
> >> And just to make sure, at this point, your MSI irq 4341 affinity
> >> (/proc/irq/4341/smp_affinity) still points to '2'?
> >
> > Actually, it's 0xD. From the kernel's perspective the mask has been
> > updated (and I even stuck a printk into set_msi_irq_affinity to verify
> > that the writes are happening) but ... the hardware doesn't seem to
> > reflect this. I also tried putting read_msi_msg right afterwards to
> > compare contents, though it complained about all the MSIs _except_ for
> > 4341. (Of course, I could just be way off on the effectiveness of
> > that.)
>
> The fact that MSI interrupts are having problems is odd. It is possible
> that we still have a bug in there somewhere but msi interrupts should
> be safe to migrate outside of irq context (no known hardware bugs).
> As we can actually synchronize with the irq source and eliminate all
> of the migration races.
>
> The non-msi case requires hitting a hardware race that is rare enough
> you should not normally have problems.

Yep. But Darrick seems to say the problem happens consistently.

Anyhow, Darrick, there is a general bug in this area. Can you try this and
see if it helps?

diff --git a/arch/x86_64/kernel/irq.c b/arch/x86_64/kernel/irq.c
index 3eaceac..a0e11c9 100644
--- a/arch/x86_64/kernel/irq.c
+++ b/arch/x86_64/kernel/irq.c
@@ -144,17 +144,35 @@ void fixup_irqs(cpumask_t map)
 	for (irq = 0; irq < NR_IRQS; irq++) {
 		cpumask_t mask;
+		int break_affinity = 0;
+		int set_affinity = 1;
+
 		if (irq == 2)
 			continue;
+		/* irq's are disabled at this point */
+		spin_lock(&irq_desc[irq].lock);
+
 		cpus_and(mask, irq_desc[irq].affinity, map);
 		if (any_online_cpu(mask) == NR_CPUS) {
-			printk("Breaking affinity for irq %i\n", irq);
+			break_affinity = 1;
 			mask = map;
 		}
+
+		irq_desc[irq].chip->mask(irq);
+
 		if (irq_desc[irq].chip->set_affinity)
 			irq_desc[irq].chip->set_affinity(irq, mask);
 		else if (irq_desc[irq].action && !(warned++))
+			set_affinity = 0;
+
+		irq_desc[irq].chip->unmask(irq);
+
+		spin_unlock(&irq_desc[irq].lock);
+
+		if (break_affinity && set_affinity)
+			printk("Broke affinity for irq %i\n", irq);
+		else if (!set_affinity)
 			printk("Cannot set affinity for irq %i\n", irq);
 	}
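
For readers following the patch, the per-irq sequence it introduces (compute the new mask, mask the irq, retarget it, unmask it, report afterwards) can be sketched in user space roughly as below. This is only an illustrative model, not kernel code: `struct fake_irq`, `fixup_one_irq()` and `enum fixup_result` are hypothetical names, CPU masks are plain 32-bit bitmaps instead of `cpumask_t`, the descriptor lock is omitted, and the once-only `warned` logic from the real loop is simplified away.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for an irq descriptor; NOT the kernel's struct irq_desc. */
struct fake_irq {
    uint32_t affinity;       /* requested CPU mask, bit n = CPU n */
    uint32_t hw_dest;        /* mask the modelled "hardware" routes to */
    bool masked;             /* true while the irq is masked */
    bool can_set_affinity;   /* models chip->set_affinity != NULL */
};

enum fixup_result { FIXUP_OK, FIXUP_BROKE_AFFINITY, FIXUP_CANNOT_SET };

/* Model of one loop iteration of the patched fixup_irqs(). */
static enum fixup_result fixup_one_irq(struct fake_irq *irq, uint32_t online)
{
    assert(online != 0);     /* assumption: at least one CPU stays online */

    /* cpus_and(mask, affinity, map) */
    uint32_t mask = irq->affinity & online;
    bool broke = false;

    if (mask == 0) {         /* none of the requested CPUs is still online */
        broke = true;
        mask = online;       /* fall back to every online CPU */
    }

    irq->masked = true;      /* chip->mask(irq) */

    bool set = irq->can_set_affinity;
    if (set)
        irq->hw_dest = mask; /* chip->set_affinity(irq, mask) */

    irq->masked = false;     /* chip->unmask(irq) */

    /* Reporting happens only after the irq is unmasked again. */
    if (!set)
        return FIXUP_CANNOT_SET;
    return broke ? FIXUP_BROKE_AFFINITY : FIXUP_OK;
}
```

With the affinity value from this thread, 0xD (CPUs 0, 2 and 3), and an assumed online map of 0x7 after CPU 3 goes away, the model retargets the irq to 0x5 and returns `FIXUP_OK`. An irq bound only to CPU 3 (0x8) would instead fall back to 0x7 and return `FIXUP_BROKE_AFFINITY`.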