* [patch] x64: Avoid irq_chip mask/unmask in fixup_irqs for interrupt-remapping @ 2009-06-04 18:59 Suresh Siddha 2009-06-04 23:13 ` Eric W. Biederman 0 siblings, 1 reply; 10+ messages in thread From: Suresh Siddha @ 2009-06-04 18:59 UTC (permalink / raw) To: mingo, hpa, tglx; +Cc: linux-kernel, ebiederm, andi, travis, steiner From: Suresh Siddha <suresh.b.siddha@intel.com> Subject: x64: Avoid irq_chip mask/unmask in fixup_irqs for interrupt-remapping In the presence of interrupt-remapping, irqs will be migrated in the process context and we don't do (and there is no need to) irq_chip mask/unmask while migrating the interrupt. Similarly fix the fixup_irqs() that get called during cpu offline and avoid calling irq_chip mask/unmask for irqs that are ok to be migrated in the process context. While we didn't observe any race condition with the existing code, this change takes complete advantage of interrupt-remapping in the newer generation platforms and avoids any potential HW lockup's (that often worry Eric :) Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Eric W. Biederman <ebiederm@xmission.com> --- diff --git a/arch/x86/kernel/irq_64.c b/arch/x86/kernel/irq_64.c index 977d8b4..82265a5 100644 --- a/arch/x86/kernel/irq_64.c +++ b/arch/x86/kernel/irq_64.c @@ -95,7 +95,7 @@ void fixup_irqs(void) affinity = cpu_all_mask; } - if (desc->chip->mask) + if (!(desc->status & IRQ_MOVE_PCNTXT) && desc->chip->mask) desc->chip->mask(irq); if (desc->chip->set_affinity) @@ -103,7 +103,7 @@ void fixup_irqs(void) else if (!(warned++)) set_affinity = 0; - if (desc->chip->unmask) + if (!(desc->status & IRQ_MOVE_PCNTXT) && desc->chip->unmask) desc->chip->unmask(irq); spin_unlock(&desc->lock); ^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [patch] x64: Avoid irq_chip mask/unmask in fixup_irqs for interrupt-remapping 2009-06-04 18:59 [patch] x64: Avoid irq_chip mask/unmask in fixup_irqs for interrupt-remapping Suresh Siddha @ 2009-06-04 23:13 ` Eric W. Biederman 2009-06-05 1:18 ` Suresh Siddha 0 siblings, 1 reply; 10+ messages in thread From: Eric W. Biederman @ 2009-06-04 23:13 UTC (permalink / raw) To: suresh.b.siddha Cc: mingo, hpa, tglx, linux-kernel, andi, travis, steiner, Gary Hade Suresh Siddha <suresh.b.siddha@intel.com> writes: > From: Suresh Siddha <suresh.b.siddha@intel.com> > Subject: x64: Avoid irq_chip mask/unmask in fixup_irqs for interrupt-remapping > > In the presence of interrupt-remapping, irqs will be migrated in the > process context and we don't do (and there is no need to) irq_chip mask/unmask > while migrating the interrupt. > > Similarly fix the fixup_irqs() that get called during cpu offline and avoid > calling irq_chip mask/unmask for irqs that are ok to be migrated in the > process context. > > While we didn't observe any race condition with the existing code, > this change takes complete advantage of interrupt-remapping in > the newer generation platforms and avoids any potential HW lockup's > (that often worry Eric :) You now apparently fail to migrate the irq threads in tandem with the rest of the irqs. Eric > Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> > Cc: Eric W. Biederman <ebiederm@xmission.com> > --- > > diff --git a/arch/x86/kernel/irq_64.c b/arch/x86/kernel/irq_64.c > index 977d8b4..82265a5 100644 > --- a/arch/x86/kernel/irq_64.c > +++ b/arch/x86/kernel/irq_64.c > @@ -95,7 +95,7 @@ void fixup_irqs(void) > affinity = cpu_all_mask; > } > > - if (desc->chip->mask) > + if (!(desc->status & IRQ_MOVE_PCNTXT) && desc->chip->mask) > desc->chip->mask(irq); > > if (desc->chip->set_affinity) > @@ -103,7 +103,7 @@ void fixup_irqs(void) > else if (!(warned++)) > set_affinity = 0; > > - if (desc->chip->unmask) > + if (!(desc->status & IRQ_MOVE_PCNTXT) && desc->chip->unmask) > desc->chip->unmask(irq); > > spin_unlock(&desc->lock); ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [patch] x64: Avoid irq_chip mask/unmask in fixup_irqs for interrupt-remapping 2009-06-04 23:13 ` Eric W. Biederman @ 2009-06-05 1:18 ` Suresh Siddha 2009-06-05 1:19 ` Suresh Siddha 2009-06-05 22:20 ` Gary Hade 0 siblings, 2 replies; 10+ messages in thread From: Suresh Siddha @ 2009-06-05 1:18 UTC (permalink / raw) To: Eric W. Biederman Cc: mingo@elte.hu, hpa@zytor.com, tglx@linutronix.de, linux-kernel@vger.kernel.org, ak@linux.intel.com, travis@sgi.com, steiner@sgi.com, Gary Hade On Thu, 2009-06-04 at 16:13 -0700, Eric W. Biederman wrote: > Suresh Siddha <suresh.b.siddha@intel.com> writes: > > > From: Suresh Siddha <suresh.b.siddha@intel.com> > > Subject: x64: Avoid irq_chip mask/unmask in fixup_irqs for interrupt-remapping > > > > In the presence of interrupt-remapping, irqs will be migrated in the > > process context and we don't do (and there is no need to) irq_chip mask/unmask > > while migrating the interrupt. > > > > Similarly fix the fixup_irqs() that get called during cpu offline and avoid > > calling irq_chip mask/unmask for irqs that are ok to be migrated in the > > process context. > > > > While we didn't observe any race condition with the existing code, > > this change takes complete advantage of interrupt-remapping in > > the newer generation platforms and avoids any potential HW lockup's > > (that often worry Eric :) > > You now apparently fail to migrate the irq threads in tandem with > the rest of the irqs. Eric, Are you referring to Gary's issues? As far as I understand, they don't happen in the presence of interrupt-remapping. Can you ack this patch, as this avoid touching IO-APIC and MSI entries and does fixup_irqs() in a much more reliable fashion. I haven't followed Gary's couple of patches related to non interrupt-remapping case. I will go through them and see how I can help there. thanks, suresh ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [patch] x64: Avoid irq_chip mask/unmask in fixup_irqs for interrupt-remapping 2009-06-05 1:18 ` Suresh Siddha @ 2009-06-05 1:19 ` Suresh Siddha 2009-06-05 1:47 ` Eric W. Biederman 2009-06-05 22:20 ` Gary Hade 1 sibling, 1 reply; 10+ messages in thread From: Suresh Siddha @ 2009-06-05 1:19 UTC (permalink / raw) To: Eric W. Biederman Cc: mingo@elte.hu, hpa@zytor.com, tglx@linutronix.de, linux-kernel@vger.kernel.org, ak@linux.intel.com, travis@sgi.com, steiner@sgi.com, Gary Hade On Thu, 2009-06-04 at 18:18 -0700, Suresh Siddha wrote: > On Thu, 2009-06-04 at 16:13 -0700, Eric W. Biederman wrote: > > Suresh Siddha <suresh.b.siddha@intel.com> writes: > > > > > From: Suresh Siddha <suresh.b.siddha@intel.com> > > > Subject: x64: Avoid irq_chip mask/unmask in fixup_irqs for interrupt-remapping > > > > > > In the presence of interrupt-remapping, irqs will be migrated in the > > > process context and we don't do (and there is no need to) irq_chip mask/unmask > > > while migrating the interrupt. > > > > > > Similarly fix the fixup_irqs() that get called during cpu offline and avoid > > > calling irq_chip mask/unmask for irqs that are ok to be migrated in the > > > process context. > > > > > > While we didn't observe any race condition with the existing code, > > > this change takes complete advantage of interrupt-remapping in > > > the newer generation platforms and avoids any potential HW lockup's > > > (that often worry Eric :) > > > > You now apparently fail to migrate the irq threads in tandem with > > the rest of the irqs. > > Eric, Are you referring to Gary's issues? As far as I understand, they > don't happen in the presence of interrupt-remapping. > > Can you ack this patch, as this avoid touching IO-APIC and MSI entries > and does fixup_irqs() in a much more reliable fashion. in the presence of interrupt-remapping ofcourse :) > > I haven't followed Gary's couple of patches related to non > interrupt-remapping case. I will go through them and see how I can help > there. > > thanks, > suresh ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [patch] x64: Avoid irq_chip mask/unmask in fixup_irqs for interrupt-remapping 2009-06-05 1:19 ` Suresh Siddha @ 2009-06-05 1:47 ` Eric W. Biederman 2009-06-06 1:19 ` Suresh Siddha 0 siblings, 1 reply; 10+ messages in thread From: Eric W. Biederman @ 2009-06-05 1:47 UTC (permalink / raw) To: suresh.b.siddha Cc: mingo@elte.hu, hpa@zytor.com, tglx@linutronix.de, linux-kernel@vger.kernel.org, ak@linux.intel.com, travis@sgi.com, steiner@sgi.com, Gary Hade Suresh Siddha <suresh.b.siddha@intel.com> writes: > On Thu, 2009-06-04 at 18:18 -0700, Suresh Siddha wrote: >> On Thu, 2009-06-04 at 16:13 -0700, Eric W. Biederman wrote: >> > Suresh Siddha <suresh.b.siddha@intel.com> writes: >> > >> > > From: Suresh Siddha <suresh.b.siddha@intel.com> >> > > Subject: x64: Avoid irq_chip mask/unmask in fixup_irqs for interrupt-remapping >> > > >> > > In the presence of interrupt-remapping, irqs will be migrated in the >> > > process context and we don't do (and there is no need to) irq_chip mask/unmask >> > > while migrating the interrupt. >> > > >> > > Similarly fix the fixup_irqs() that get called during cpu offline and avoid >> > > calling irq_chip mask/unmask for irqs that are ok to be migrated in the >> > > process context. >> > > >> > > While we didn't observe any race condition with the existing code, >> > > this change takes complete advantage of interrupt-remapping in >> > > the newer generation platforms and avoids any potential HW lockup's >> > > (that often worry Eric :) >> > >> > You now apparently fail to migrate the irq threads in tandem with >> > the rest of the irqs. >> >> Eric, Are you referring to Gary's issues? As far as I understand, they >> don't happen in the presence of interrupt-remapping. >> >> Can you ack this patch, as this avoid touching IO-APIC and MSI entries >> and does fixup_irqs() in a much more reliable fashion. > > in the presence of interrupt-remapping ofcourse :) As far as this patch goes it looks like an improvement. Acked-by: "Eric W. Biederman" <ebiederm@xmission.com> However after looking at Gary's issues I see some things that are still wrong on this path. 1) We don't do the part of irq migration that moves irq threads. We aren't using irq threads yet but still If we could figure out how to call irq_set_affinity for the IRQ_MOVE_PCNTXT code path that would make the maintenance a lot simpler. 2) We still diverge on 32bit vs 64bit for no reason. I expect the fixed 64bit version should be moved into apic/io_apic.c 3) We still enable irqs for a short while after this to let things drain. I am wondering if that is really necessary. It does very simply allow the irq cleanup ipi to happen, and it unjams any irqs that happened before we migrated them. If we wanted to very strictly follow the rules I guess we could do something like the cleanup_ipi by hand on the cpu that is going down and rebroadcast all of the pending irqs to another cpu to process. Eric ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [patch] x64: Avoid irq_chip mask/unmask in fixup_irqs for interrupt-remapping 2009-06-05 1:47 ` Eric W. Biederman @ 2009-06-06 1:19 ` Suresh Siddha 2009-06-06 2:58 ` Eric W. Biederman 0 siblings, 1 reply; 10+ messages in thread From: Suresh Siddha @ 2009-06-06 1:19 UTC (permalink / raw) To: Eric W. Biederman Cc: mingo@elte.hu, hpa@zytor.com, tglx@linutronix.de, linux-kernel@vger.kernel.org, ak@linux.intel.com, travis@sgi.com, steiner@sgi.com, Gary Hade On Thu, 2009-06-04 at 18:47 -0700, Eric W. Biederman wrote: > As far as this patch goes it looks like an improvement. > > Acked-by: "Eric W. Biederman" <ebiederm@xmission.com> Thanks. Ingo, Peter: Can you please queue this patch? Eric, more comments below. > However after looking at Gary's issues I see some things that are still wrong > on this path. > > 1) We don't do the part of irq migration that moves irq threads. > We aren't using irq threads yet but still > Ok. Will look at the irq threads code (-rt tree?). > If we could figure out how to call irq_set_affinity for the IRQ_MOVE_PCNTXT > code path that would make the maintenance a lot simpler. Ok. Will post this cleanup sometime. > > 2) We still diverge on 32bit vs 64bit for no reason. > I expect the fixed 64bit version should be moved into apic/io_apic.c Agree. > > 3) We still enable irqs for a short while after this to let things drain. > I am wondering if that is really necessary. It does very simply > allow the irq cleanup ipi to happen, and it unjams any irqs that happened > before we migrated them. Yes. > If we wanted to very strictly follow the rules I guess we could do something > like the cleanup_ipi by hand on the cpu that is going down and rebroadcast > all of the pending irqs to another cpu to process. > This will be ok for all the cases except the most difficult case (non interrupt-remapping and IO-APIC level triggered). We should service the pending interrupt from the same cpu 'X' (that is going down) because of the vector information (that is cpu 'X' specific) in the IO-APIC RTE. Otherwise we have to do directed EOI ( and I am not sure if all the IO-APIC's support that) to the io-apic. Is there a problem with enabling irqs in the fixup_irqs()? My old proposal (which will fix the stuck IRR issue seen by Gary) for fixing the level irq migration in fixup_irqs() is to do something like: Disable interrupts .. Mask io-apic RTE check if the IRR is set if so, unmask the io-apic RTE enable interrupts and go back to top else ok to migrate the io-apic rte. So if there is any other reason for keeping the interrupts disabled during fixup_irqs(), then we need to think of another strategy to address this. Otherwise, If everyone agrees with this direction then we can try to comeup with a clean patch for this. thanks, suresh ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [patch] x64: Avoid irq_chip mask/unmask in fixup_irqs for interrupt-remapping 2009-06-06 1:19 ` Suresh Siddha @ 2009-06-06 2:58 ` Eric W. Biederman 0 siblings, 0 replies; 10+ messages in thread From: Eric W. Biederman @ 2009-06-06 2:58 UTC (permalink / raw) To: suresh.b.siddha Cc: mingo@elte.hu, hpa@zytor.com, tglx@linutronix.de, linux-kernel@vger.kernel.org, ak@linux.intel.com, travis@sgi.com, steiner@sgi.com, Gary Hade Suresh Siddha <suresh.b.siddha@intel.com> writes: > On Thu, 2009-06-04 at 18:47 -0700, Eric W. Biederman wrote: >> As far as this patch goes it looks like an improvement. >> >> Acked-by: "Eric W. Biederman" <ebiederm@xmission.com> > > Thanks. Ingo, Peter: Can you please queue this patch? Eric, more > comments below. > >> However after looking at Gary's issues I see some things that are still wrong >> on this path. >> >> 1) We don't do the part of irq migration that moves irq threads. >> We aren't using irq threads yet but still >> > > Ok. Will look at the irq threads code (-rt tree?). In 2.6.30 No one is using them yet but the code is merged. >> If we could figure out how to call irq_set_affinity for the IRQ_MOVE_PCNTXT >> code path that would make the maintenance a lot simpler. > > Ok. Will post this cleanup sometime. Thanks. >> 2) We still diverge on 32bit vs 64bit for no reason. >> I expect the fixed 64bit version should be moved into apic/io_apic.c > > Agree. > >> >> 3) We still enable irqs for a short while after this to let things drain. >> I am wondering if that is really necessary. It does very simply >> allow the irq cleanup ipi to happen, and it unjams any irqs that happened >> before we migrated them. > > Yes. > >> If we wanted to very strictly follow the rules I guess we could do something >> like the cleanup_ipi by hand on the cpu that is going down and rebroadcast >> all of the pending irqs to another cpu to process. >> > > This will be ok for all the cases except the most difficult case (non > interrupt-remapping and IO-APIC level triggered). We should service the > pending interrupt from the same cpu 'X' (that is going down) because of > the vector information (that is cpu 'X' specific) in the IO-APIC RTE. > Otherwise we have to do directed EOI ( and I am not sure if all the > IO-APIC's support that) to the io-apic. > > Is there a problem with enabling irqs in the fixup_irqs()? Strictly speaking it is against the rules. At least it was last time I audited the cpu hotunplug path. In practice it seems to work well. > My old proposal (which will fix the stuck IRR issue seen by Gary) for > fixing the level irq migration in fixup_irqs() is to do something like: > > Disable interrupts > .. > Mask io-apic RTE > check if the IRR is set > if so, > unmask the io-apic RTE > enable interrupts > and go back to top > else > ok to migrate the io-apic rte. > > So if there is any other reason for keeping the interrupts disabled > during fixup_irqs(), then we need to think of another strategy to > address this. As long as fixup_irqs is where we grow the nasty work-arounds and we don't burden the other saner paths. I am generally in favor of it. > Otherwise, If everyone agrees with this direction then we can try to > comeup with a clean patch for this. There is also something else we can do. We can register a cpu hotplug notifier and tell hotunplug to fail if we dealing with a case that we can not actually support cleanly. At the same time there are a couple of more case that we can move into process context. MSI cpu irq migration being the primary. It needs a really good testing but I think all of lowest priority interrupt delivery can be done safely from process context as well. Since we only need a single register write and if we have properly shutdown the cpu the hardware just won't deliver irqs to it automagically. My pipe dream is that we can move just enough irq migration cases into process context that the suspend to ram code code will be happy. And we can make cpu hotunplug fail if the irqs are only safe to migrate in interrupt context. Eric ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [patch] x64: Avoid irq_chip mask/unmask in fixup_irqs for interrupt-remapping 2009-06-05 1:18 ` Suresh Siddha 2009-06-05 1:19 ` Suresh Siddha @ 2009-06-05 22:20 ` Gary Hade 2009-06-06 0:57 ` Suresh Siddha 1 sibling, 1 reply; 10+ messages in thread From: Gary Hade @ 2009-06-05 22:20 UTC (permalink / raw) To: Suresh Siddha Cc: Eric W. Biederman, mingo@elte.hu, hpa@zytor.com, tglx@linutronix.de, linux-kernel@vger.kernel.org, ak@linux.intel.com, travis@sgi.com, steiner@sgi.com, Gary Hade, lcm On Thu, Jun 04, 2009 at 06:18:14PM -0700, Suresh Siddha wrote: > On Thu, 2009-06-04 at 16:13 -0700, Eric W. Biederman wrote: > > Suresh Siddha <suresh.b.siddha@intel.com> writes: > > > > > From: Suresh Siddha <suresh.b.siddha@intel.com> > > > Subject: x64: Avoid irq_chip mask/unmask in fixup_irqs for interrupt-remapping > > > > > > In the presence of interrupt-remapping, irqs will be migrated in the > > > process context and we don't do (and there is no need to) irq_chip mask/unmask > > > while migrating the interrupt. > > > > > > Similarly fix the fixup_irqs() that get called during cpu offline and avoid > > > calling irq_chip mask/unmask for irqs that are ok to be migrated in the > > > process context. > > > > > > While we didn't observe any race condition with the existing code, > > > this change takes complete advantage of interrupt-remapping in > > > the newer generation platforms and avoids any potential HW lockup's > > > (that often worry Eric :) > > > > You now apparently fail to migrate the irq threads in tandem with > > the rest of the irqs. > > Eric, Are you referring to Gary's issues? As far as I understand, they > don't happen in the presence of interrupt-remapping. Suresh, We do not currently have the h/w on which to test this assertion but it seems like there is a good chance that at least the race that http://lkml.org/lkml/2009/6/2/377 fixes could reproduce there. The other problem that is repaired by http://lkml.org/lkml/2009/6/2/378 depends on the i/o redirection table write with remote IRR bit set lockup anomaly that the interrupt-remapping code may avoid or perhaps is not even present with that h/w. My proposed fix for the problem is based on previous interrupt-remapping code that you recently removed with http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=0280f7c416c652a2fd95d166f52b199ae61122c0 If I correctly understand your justification for the change it sounds like the interrupt-remapping code now "avoids" the problem. Thanks, Gary -- Gary Hade System x Enablement IBM Linux Technology Center 503-578-4503 IBM T/L: 775-4503 garyhade@us.ibm.com http://www.ibm.com/linux/ltc ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [patch] x64: Avoid irq_chip mask/unmask in fixup_irqs for interrupt-remapping 2009-06-05 22:20 ` Gary Hade @ 2009-06-06 0:57 ` Suresh Siddha 2009-06-06 23:37 ` Gary Hade 0 siblings, 1 reply; 10+ messages in thread From: Suresh Siddha @ 2009-06-06 0:57 UTC (permalink / raw) To: Gary Hade Cc: Eric W. Biederman, mingo@elte.hu, hpa@zytor.com, tglx@linutronix.de, linux-kernel@vger.kernel.org, ak@linux.intel.com, travis@sgi.com, steiner@sgi.com, lcm@us.ibm.com On Fri, 2009-06-05 at 15:20 -0700, Gary Hade wrote: > On Thu, Jun 04, 2009 at 06:18:14PM -0700, Suresh Siddha wrote: > > On Thu, 2009-06-04 at 16:13 -0700, Eric W. Biederman wrote: > > > Suresh Siddha <suresh.b.siddha@intel.com> writes: > > > > > > > From: Suresh Siddha <suresh.b.siddha@intel.com> > > > > Subject: x64: Avoid irq_chip mask/unmask in fixup_irqs for interrupt-remapping > > > > > > > > In the presence of interrupt-remapping, irqs will be migrated in the > > > > process context and we don't do (and there is no need to) irq_chip mask/unmask > > > > while migrating the interrupt. > > > > > > > > Similarly fix the fixup_irqs() that get called during cpu offline and avoid > > > > calling irq_chip mask/unmask for irqs that are ok to be migrated in the > > > > process context. > > > > > > > > While we didn't observe any race condition with the existing code, > > > > this change takes complete advantage of interrupt-remapping in > > > > the newer generation platforms and avoids any potential HW lockup's > > > > (that often worry Eric :) > > > > > > You now apparently fail to migrate the irq threads in tandem with > > > the rest of the irqs. > > > > Eric, Are you referring to Gary's issues? As far as I understand, they > > don't happen in the presence of interrupt-remapping. > > Suresh, > We do not currently have the h/w on which to test this assertion > but it seems like there is a good chance that at least the race that > http://lkml.org/lkml/2009/6/2/377 > fixes could reproduce there. No. In the presence of interrupt-remapping, migration of the irq will be done atomically from the process context. So we don't depend for the next interrupt to arrive for the migration. In the particular case that you mentioned above, we are calling the send_cleanup_vector() from the set_affinity() itself in case of interrupt-remapping. And this will reset the cfg->move_in_progress. But I agree that this bug is pretty nasty for non interrupt-remapping cases and we are very lucky so far for not hitting it with all the irqbalance changes and with suspend/resume code path. While I agree with your online fix, I have to catch up with Eric's concerns. > > The other problem that is repaired by > http://lkml.org/lkml/2009/6/2/378 oh! This one is the famous Eric's rant about broken fixup_irqs() in the presence of non interrupt-remapping. Long time back I have proposed a solution to Eric to resolve this for non interrupt-remapping cases but don't think I never addressed that. Again let me catch up with Eric's comments and see if we can comeup with a solution that is acceptable to everyone. > depends on the i/o redirection table write with remote IRR bit > set lockup anomaly that the interrupt-remapping code may avoid > or perhaps is not even present with that h/w. My proposed fix > for the problem is based on previous interrupt-remapping code > that you recently removed with > http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=0280f7c416c652a2fd95d166f52b199ae61122c0 I cleaned up my code for a reason (that I never liked it and is complex). So I would def recommend not to go down that path. This second issue also doesn't happen with interrupt-remapping, as we avoid touching the io-apic RTE's and use a virtual vector in the RTE (which is same irrespective of the destination). Will work with you next week in coming up with clean fixes. thanks, suresh ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [patch] x64: Avoid irq_chip mask/unmask in fixup_irqs for interrupt-remapping 2009-06-06 0:57 ` Suresh Siddha @ 2009-06-06 23:37 ` Gary Hade 0 siblings, 0 replies; 10+ messages in thread From: Gary Hade @ 2009-06-06 23:37 UTC (permalink / raw) To: Suresh Siddha Cc: Gary Hade, Eric W. Biederman, mingo@elte.hu, hpa@zytor.com, tglx@linutronix.de, linux-kernel@vger.kernel.org, ak@linux.intel.com, travis@sgi.com, steiner@sgi.com, lcm@us.ibm.com On Fri, Jun 05, 2009 at 05:57:29PM -0700, Suresh Siddha wrote: > On Fri, 2009-06-05 at 15:20 -0700, Gary Hade wrote: > > On Thu, Jun 04, 2009 at 06:18:14PM -0700, Suresh Siddha wrote: > > > On Thu, 2009-06-04 at 16:13 -0700, Eric W. Biederman wrote: > > > > Suresh Siddha <suresh.b.siddha@intel.com> writes: > > > > > > > > > From: Suresh Siddha <suresh.b.siddha@intel.com> > > > > > Subject: x64: Avoid irq_chip mask/unmask in fixup_irqs for interrupt-remapping > > > > > > > > > > In the presence of interrupt-remapping, irqs will be migrated in the > > > > > process context and we don't do (and there is no need to) irq_chip mask/unmask > > > > > while migrating the interrupt. > > > > > > > > > > Similarly fix the fixup_irqs() that get called during cpu offline and avoid > > > > > calling irq_chip mask/unmask for irqs that are ok to be migrated in the > > > > > process context. > > > > > > > > > > While we didn't observe any race condition with the existing code, > > > > > this change takes complete advantage of interrupt-remapping in > > > > > the newer generation platforms and avoids any potential HW lockup's > > > > > (that often worry Eric :) > > > > > > > > You now apparently fail to migrate the irq threads in tandem with > > > > the rest of the irqs. > > > > > > Eric, Are you referring to Gary's issues? As far as I understand, they > > > don't happen in the presence of interrupt-remapping. > > > > Suresh, > > We do not currently have the h/w on which to test this assertion > > but it seems like there is a good chance that at least the race that > > http://lkml.org/lkml/2009/6/2/377 > > fixes could reproduce there. > > No. In the presence of interrupt-remapping, migration of the irq will be > done atomically from the process context. So we don't depend for the > next interrupt to arrive for the migration. Thats good. > > In the particular case that you mentioned above, we are calling the > send_cleanup_vector() from the set_affinity() itself in case of > interrupt-remapping. And this will reset the cfg->move_in_progress. Now that you mention this, I do remember seeing those send_cleanup() calls in ir_set_msi_irq_affinity() and migrate_ioapic_irq_desc(). Much better than having to depend on the IRQ move destination CPU, especially when correct behavior depends on it's timely receipt of an interrupt. > > But I agree that this bug is pretty nasty for non interrupt-remapping > cases and we are very lucky so far for not hitting it with all the > irqbalance changes and with suspend/resume code path. Yes, _very_ nasty. It was extremely easy for me to initially reproduce the problem so I have also been wondering why others have been so lucky. The first time I saw the problem was after the first run of a simple script similar to the below script that I had just written to do a quick-and-dirty CPU hotplug sanity check. I then started seeing file i/o plus sas and aic94xx device driver errors and found myself with a very trashed system. Not good! :::::::::::::: cpus_offline_all.sh :::::::::::::: #!/bin/sh ## # Offline all offlineable CPUs ## SYS_CPU_DIR=/sys/devices/system/cpu for d in $SYS_CPU_DIR/cpu[1-9] $SYS_CPU_DIR/cpu??; do cpu="`basename $d`" state=`cat $d/online` if [ "$state" = 1 ]; then echo $cpu: offlining echo 0 > $d/online else echo $cpu: already offline fi done > > While I agree with your online fix, Thats encouraging. :) > I have to catch up with Eric's concerns. > > > > > The other problem that is repaired by > > http://lkml.org/lkml/2009/6/2/378 > > oh! This one is the famous Eric's rant about broken fixup_irqs() in the > presence of non interrupt-remapping. > > Long time back I have proposed a solution to Eric to resolve this for > non interrupt-remapping cases but don't think I never addressed that. > Again let me catch up with Eric's comments and see if we can comeup with > a solution that is acceptable to everyone. > > > depends on the i/o redirection table write with remote IRR bit > > set lockup anomaly that the interrupt-remapping code may avoid > > or perhaps is not even present with that h/w. My proposed fix > > for the problem is based on previous interrupt-remapping code > > that you recently removed with > > http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=0280f7c416c652a2fd95d166f52b199ae61122c0 > > I cleaned up my code for a reason (that I never liked it and is > complex). That delay code may look a little complex but it sure seems to do a nice job. Based on my testing it certainly feels solid from a functional standpoint. > So I would def recommend not to go down that path. Since it has been working so great for us, I am sorry to hear you say this. However, I am open to any ideas you have that would maintain the current user interface and would simple and non-intrusive enough for our distribution partners to feel comfortable adding via an update an existing release. It would be nice to get this kind of fix into mainline as quickly as possible to prevent further propagation of the bug before a better but more disruptive overall solution, such as the one that Eric has been pushing for, is available. BTW, this one has been much harder to reproduce than the other issue so I am not nearly as surprised that others have not reported it. I tripped on it after fixing the other issue and deciding that it might be a good idea to crank up the interrupt rate while testing the fix. I am actually at work today because I had some ideas on how to revise this patch to address Eric's most recent concerns about the IMHO minor, yet still very worrysome to Eric, changes it makes to the interrupt context migration path. Since it sounds like you have concerns beyond it not touching the interrupt context migration path, I guess I will hold off on this work until you have a chance to look at the issue next week. > > This second issue also doesn't happen with interrupt-remapping, as we avoid > touching the io-apic RTE's and use a virtual vector in the RTE > (which is same irrespective of the destination). > > Will work with you next week in coming up with clean fixes. Thanks. I really appreciate this. Gary -- Gary Hade System x Enablement IBM Linux Technology Center 503-578-4503 IBM T/L: 775-4503 garyhade@us.ibm.com http://www.ibm.com/linux/ltc ^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2009-06-06 23:38 UTC | newest] Thread overview: 10+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2009-06-04 18:59 [patch] x64: Avoid irq_chip mask/unmask in fixup_irqs for interrupt-remapping Suresh Siddha 2009-06-04 23:13 ` Eric W. Biederman 2009-06-05 1:18 ` Suresh Siddha 2009-06-05 1:19 ` Suresh Siddha 2009-06-05 1:47 ` Eric W. Biederman 2009-06-06 1:19 ` Suresh Siddha 2009-06-06 2:58 ` Eric W. Biederman 2009-06-05 22:20 ` Gary Hade 2009-06-06 0:57 ` Suresh Siddha 2009-06-06 23:37 ` Gary Hade
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox