* [PATCH] x86: check for valid irq_cfg pointer in smp_irq_move_cleanup_interrupt @ 2012-10-16 12:50 Dimitri Sivanich 2012-10-24 12:59 ` [tip:x86/urgent] x86/irq/ioapic: Check " tip-bot for Dimitri Sivanich 0 siblings, 1 reply; 26+ messages in thread From: Dimitri Sivanich @ 2012-10-16 12:50 UTC (permalink / raw) To: Thomas Gleixner, Ingo Molnar, H. Peter Anvin Cc: x86, Suresh Siddha, Joerg Roedel, Yinghai Lu, Alexander Gordeev, linux-kernel Posting this patch to fix an issue concerning sparse irq's that I raised a while back. There was discussion about adding refcounting to sparse irqs (to fix other potential race conditions), but that does not appear to have been addressed yet. This covers the only issue of this type that I've encountered in this area. A NULL pointer dereference can occur in smp_irq_move_cleanup_interrupt() if we haven't yet setup the irq_cfg pointer in the irq_desc.irq_data.chip_data. In create_irq_nr() there is a window where we have set vector_irq in __assign_irq_vector(), but not yet called irq_set_chip_data() to set the irq_cfg pointer. Should an IRQ_MOVE_CLEANUP_VECTOR hit the cpu in question during this time, smp_irq_move_cleanup_interrupt() will attempt to process the aforementioned irq, but panic when accessing irq_cfg. Only continue processing the irq if irq_cfg is non-NULL. Signed-off-by: Dimitri Sivanich <sivanich@sgi.com> --- arch/x86/kernel/apic/io_apic.c | 3 +++ 1 file changed, 3 insertions(+) Index: linux/arch/x86/kernel/apic/io_apic.c =================================================================== --- linux.orig/arch/x86/kernel/apic/io_apic.c +++ linux/arch/x86/kernel/apic/io_apic.c @@ -2257,6 +2257,9 @@ asmlinkage void smp_irq_move_cleanup_int continue; cfg = irq_cfg(irq); + if (!cfg) + continue; + raw_spin_lock(&desc->lock); /* ^ permalink raw reply [flat|nested] 26+ messages in thread
* [tip:x86/urgent] x86/irq/ioapic: Check for valid irq_cfg pointer in smp_irq_move_cleanup_interrupt 2012-10-16 12:50 [PATCH] x86: check for valid irq_cfg pointer in smp_irq_move_cleanup_interrupt Dimitri Sivanich @ 2012-10-24 12:59 ` tip-bot for Dimitri Sivanich 0 siblings, 0 replies; 26+ messages in thread From: tip-bot for Dimitri Sivanich @ 2012-10-24 12:59 UTC (permalink / raw) To: linux-tip-commits Cc: linux-kernel, agordeev, hpa, mingo, yinghai, joerg.roedel, sivanich, suresh.b.siddha, tglx Commit-ID: 94777fc51b3ad85ff9f705ddf7cdd0eb3bbad5a6 Gitweb: http://git.kernel.org/tip/94777fc51b3ad85ff9f705ddf7cdd0eb3bbad5a6 Author: Dimitri Sivanich <sivanich@sgi.com> AuthorDate: Tue, 16 Oct 2012 07:50:21 -0500 Committer: Ingo Molnar <mingo@kernel.org> CommitDate: Wed, 24 Oct 2012 12:53:51 +0200 x86/irq/ioapic: Check for valid irq_cfg pointer in smp_irq_move_cleanup_interrupt Posting this patch to fix an issue concerning sparse irq's that I raised a while back. There was discussion about adding refcounting to sparse irqs (to fix other potential race conditions), but that does not appear to have been addressed yet. This covers the only issue of this type that I've encountered in this area. A NULL pointer dereference can occur in smp_irq_move_cleanup_interrupt() if we haven't yet setup the irq_cfg pointer in the irq_desc.irq_data.chip_data. In create_irq_nr() there is a window where we have set vector_irq in __assign_irq_vector(), but not yet called irq_set_chip_data() to set the irq_cfg pointer. Should an IRQ_MOVE_CLEANUP_VECTOR hit the cpu in question during this time, smp_irq_move_cleanup_interrupt() will attempt to process the aforementioned irq, but panic when accessing irq_cfg. Only continue processing the irq if irq_cfg is non-NULL. Signed-off-by: Dimitri Sivanich <sivanich@sgi.com> Cc: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Joerg Roedel <joerg.roedel@amd.com> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Alexander Gordeev <agordeev@redhat.com> Link: http://lkml.kernel.org/r/20121016125021.GA22935@sgi.com Signed-off-by: Ingo Molnar <mingo@kernel.org> --- arch/x86/kernel/apic/io_apic.c | 3 +++ 1 files changed, 3 insertions(+), 0 deletions(-) diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c index c265593..1817fa9 100644 --- a/arch/x86/kernel/apic/io_apic.c +++ b/arch/x86/kernel/apic/io_apic.c @@ -2257,6 +2257,9 @@ asmlinkage void smp_irq_move_cleanup_interrupt(void) continue; cfg = irq_cfg(irq); + if (!cfg) + continue; + raw_spin_lock(&desc->lock); /* ^ permalink raw reply related [flat|nested] 26+ messages in thread
* [PATCH] x86: check for valid irq_cfg pointer in smp_irq_move_cleanup_interrupt @ 2012-05-21 16:49 Dimitri Sivanich 2012-05-21 21:05 ` Suresh Siddha 2012-05-21 21:07 ` Thomas Gleixner 0 siblings, 2 replies; 26+ messages in thread From: Dimitri Sivanich @ 2012-05-21 16:49 UTC (permalink / raw) To: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86, Suresh Siddha, Yinghai Lu, Naga Chumbalkar, Jacob Pan, linux-kernel The smp_irq_move_cleanup_interrupt routine should be checking for a valid irq_cfg pointer prior to accessing it. It also seems that this should be done after taking the desc lock. Signed-off-by: Dimitri Sivanich <sivanich@sgi.com> --- arch/x86/kernel/apic/io_apic.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) Index: linux/arch/x86/kernel/apic/io_apic.c =================================================================== --- linux.orig/arch/x86/kernel/apic/io_apic.c +++ linux/arch/x86/kernel/apic/io_apic.c @@ -2478,9 +2478,12 @@ asmlinkage void smp_irq_move_cleanup_int if (!desc) continue; - cfg = irq_cfg(irq); raw_spin_lock(&desc->lock); + cfg = irq_cfg(irq); + if (!cfg) + goto unlock; + /* * Check if the irq migration is in progress. If so, we * haven't received the cleanup request yet for this irq. ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH] x86: check for valid irq_cfg pointer in smp_irq_move_cleanup_interrupt 2012-05-21 16:49 [PATCH] x86: check " Dimitri Sivanich @ 2012-05-21 21:05 ` Suresh Siddha 2012-05-21 21:09 ` Dimitri Sivanich 2012-05-21 21:07 ` Thomas Gleixner 1 sibling, 1 reply; 26+ messages in thread From: Suresh Siddha @ 2012-05-21 21:05 UTC (permalink / raw) To: Dimitri Sivanich Cc: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86, Yinghai Lu, Naga Chumbalkar, Jacob Pan, linux-kernel On Mon, 2012-05-21 at 11:49 -0500, Dimitri Sivanich wrote: > The smp_irq_move_cleanup_interrupt routine should be checking for a valid > irq_cfg pointer prior to accessing it. It also seems that this should be > done after taking the desc lock. I think these changes are correct. Did you see any crashes during module unload etc? Reviewed-by: Suresh Siddha <suresh.b.siddha@intel.com> > > Signed-off-by: Dimitri Sivanich <sivanich@sgi.com> > --- > arch/x86/kernel/apic/io_apic.c | 5 ++++- > 1 file changed, 4 insertions(+), 1 deletion(-) > > Index: linux/arch/x86/kernel/apic/io_apic.c > =================================================================== > --- linux.orig/arch/x86/kernel/apic/io_apic.c > +++ linux/arch/x86/kernel/apic/io_apic.c > @@ -2478,9 +2478,12 @@ asmlinkage void smp_irq_move_cleanup_int > if (!desc) > continue; > > - cfg = irq_cfg(irq); > raw_spin_lock(&desc->lock); > > + cfg = irq_cfg(irq); > + if (!cfg) > + goto unlock; > + > /* > * Check if the irq migration is in progress. If so, we > * haven't received the cleanup request yet for this irq. ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH] x86: check for valid irq_cfg pointer in smp_irq_move_cleanup_interrupt 2012-05-21 21:05 ` Suresh Siddha @ 2012-05-21 21:09 ` Dimitri Sivanich 2012-05-21 21:10 ` Suresh Siddha 0 siblings, 1 reply; 26+ messages in thread From: Dimitri Sivanich @ 2012-05-21 21:09 UTC (permalink / raw) To: Suresh Siddha Cc: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86, Yinghai Lu, Naga Chumbalkar, Jacob Pan, linux-kernel On Mon, May 21, 2012 at 02:05:53PM -0700, Suresh Siddha wrote: > On Mon, 2012-05-21 at 11:49 -0500, Dimitri Sivanich wrote: > > The smp_irq_move_cleanup_interrupt routine should be checking for a valid > > irq_cfg pointer prior to accessing it. It also seems that this should be > > done after taking the desc lock. > > I think these changes are correct. Did you see any crashes during module > unload etc? Yes, we have seen these on occasion during boot. > > Reviewed-by: Suresh Siddha <suresh.b.siddha@intel.com> > > > > > Signed-off-by: Dimitri Sivanich <sivanich@sgi.com> > > --- > > arch/x86/kernel/apic/io_apic.c | 5 ++++- > > 1 file changed, 4 insertions(+), 1 deletion(-) > > > > Index: linux/arch/x86/kernel/apic/io_apic.c > > =================================================================== > > --- linux.orig/arch/x86/kernel/apic/io_apic.c > > +++ linux/arch/x86/kernel/apic/io_apic.c > > @@ -2478,9 +2478,12 @@ asmlinkage void smp_irq_move_cleanup_int > > if (!desc) > > continue; > > > > - cfg = irq_cfg(irq); > > raw_spin_lock(&desc->lock); > > > > + cfg = irq_cfg(irq); > > + if (!cfg) > > + goto unlock; > > + > > /* > > * Check if the irq migration is in progress. If so, we > > * haven't received the cleanup request yet for this irq. > ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH] x86: check for valid irq_cfg pointer in smp_irq_move_cleanup_interrupt 2012-05-21 21:09 ` Dimitri Sivanich @ 2012-05-21 21:10 ` Suresh Siddha 2012-05-22 2:41 ` Dimitri Sivanich 0 siblings, 1 reply; 26+ messages in thread From: Suresh Siddha @ 2012-05-21 21:10 UTC (permalink / raw) To: Dimitri Sivanich Cc: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86, Yinghai Lu, Naga Chumbalkar, Jacob Pan, linux-kernel On Mon, 2012-05-21 at 16:09 -0500, Dimitri Sivanich wrote: > On Mon, May 21, 2012 at 02:05:53PM -0700, Suresh Siddha wrote: > > On Mon, 2012-05-21 at 11:49 -0500, Dimitri Sivanich wrote: > > > The smp_irq_move_cleanup_interrupt routine should be checking for a valid > > > irq_cfg pointer prior to accessing it. It also seems that this should be > > > done after taking the desc lock. > > > > I think these changes are correct. Did you see any crashes during module > > unload etc? > > Yes, we have seen these on occasion during boot. During boot or shutdown? Review of the code shows that this can trigger during module unload which can call destroy_irq() etc and can trigger the crash if there is a parallel irq migration related cleanup. Unsuccessful module loads can also call destroy_irq() but I doubt that is what happening here. thanks, suresh ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH] x86: check for valid irq_cfg pointer in smp_irq_move_cleanup_interrupt 2012-05-21 21:10 ` Suresh Siddha @ 2012-05-22 2:41 ` Dimitri Sivanich 0 siblings, 0 replies; 26+ messages in thread From: Dimitri Sivanich @ 2012-05-22 2:41 UTC (permalink / raw) To: Suresh Siddha Cc: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86, Yinghai Lu, Naga Chumbalkar, Jacob Pan, linux-kernel On Mon, May 21, 2012 at 02:10:39PM -0700, Suresh Siddha wrote: > On Mon, 2012-05-21 at 16:09 -0500, Dimitri Sivanich wrote: > > On Mon, May 21, 2012 at 02:05:53PM -0700, Suresh Siddha wrote: > > > On Mon, 2012-05-21 at 11:49 -0500, Dimitri Sivanich wrote: > > > > The smp_irq_move_cleanup_interrupt routine should be checking for a valid > > > > irq_cfg pointer prior to accessing it. It also seems that this should be > > > > done after taking the desc lock. > > > > > > I think these changes are correct. Did you see any crashes during module > > > unload etc? > > > > Yes, we have seen these on occasion during boot. > > During boot or shutdown? > Early on in boot, on rare occasions. Easiest to reproduce doing module repeated module loads/unloads. > Review of the code shows that this can trigger during module unload > which can call destroy_irq() etc and can trigger the crash if there is a > parallel irq migration related cleanup. > > Unsuccessful module loads can also call destroy_irq() but I doubt that > is what happening here. > > thanks, > suresh > ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH] x86: check for valid irq_cfg pointer in smp_irq_move_cleanup_interrupt 2012-05-21 16:49 [PATCH] x86: check " Dimitri Sivanich 2012-05-21 21:05 ` Suresh Siddha @ 2012-05-21 21:07 ` Thomas Gleixner 2012-05-21 21:19 ` Dimitri Sivanich 1 sibling, 1 reply; 26+ messages in thread From: Thomas Gleixner @ 2012-05-21 21:07 UTC (permalink / raw) To: Dimitri Sivanich Cc: Ingo Molnar, H. Peter Anvin, x86, Suresh Siddha, Yinghai Lu, Naga Chumbalkar, Jacob Pan, linux-kernel On Mon, 21 May 2012, Dimitri Sivanich wrote: > The smp_irq_move_cleanup_interrupt routine should be checking for a valid > irq_cfg pointer prior to accessing it. It also seems that this should be > done after taking the desc lock. It seems that you either missed or failed to explain why it should be done _after_ taking the lock. Changelogs matter, really. Thanks, tglx ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH] x86: check for valid irq_cfg pointer in smp_irq_move_cleanup_interrupt 2012-05-21 21:07 ` Thomas Gleixner @ 2012-05-21 21:19 ` Dimitri Sivanich 2012-05-21 21:34 ` Thomas Gleixner 0 siblings, 1 reply; 26+ messages in thread From: Dimitri Sivanich @ 2012-05-21 21:19 UTC (permalink / raw) To: Thomas Gleixner Cc: Ingo Molnar, H. Peter Anvin, x86, Suresh Siddha, Yinghai Lu, Naga Chumbalkar, Jacob Pan, linux-kernel On Mon, May 21, 2012 at 11:07:04PM +0200, Thomas Gleixner wrote: > On Mon, 21 May 2012, Dimitri Sivanich wrote: > > > The smp_irq_move_cleanup_interrupt routine should be checking for a valid > > irq_cfg pointer prior to accessing it. It also seems that this should be > > done after taking the desc lock. > > It seems that you either missed or failed to explain why it should be > done _after_ taking the lock. > > Changelogs matter, really. > How about this? The smp_irq_move_cleanup_interrupt routine should be checking for a valid irq_cfg pointer prior to accessing it. Follow the same protocol shown in irq_set_chip_data(), by taking the desc lock before accessing this location. Signed-off-by: Dimitri Sivanich <sivanich@sgi.com> --- arch/x86/kernel/apic/io_apic.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) Index: linux/arch/x86/kernel/apic/io_apic.c =================================================================== --- linux.orig/arch/x86/kernel/apic/io_apic.c +++ linux/arch/x86/kernel/apic/io_apic.c @@ -2478,9 +2478,12 @@ asmlinkage void smp_irq_move_cleanup_int if (!desc) continue; - cfg = irq_cfg(irq); raw_spin_lock(&desc->lock); + cfg = irq_cfg(irq); + if (!cfg) + goto unlock; + /* * Check if the irq migration is in progress. If so, we * haven't received the cleanup request yet for this irq. ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH] x86: check for valid irq_cfg pointer in smp_irq_move_cleanup_interrupt 2012-05-21 21:19 ` Dimitri Sivanich @ 2012-05-21 21:34 ` Thomas Gleixner 2012-05-23 18:16 ` Dimitri Sivanich 0 siblings, 1 reply; 26+ messages in thread From: Thomas Gleixner @ 2012-05-21 21:34 UTC (permalink / raw) To: Dimitri Sivanich Cc: Ingo Molnar, H. Peter Anvin, x86, Suresh Siddha, Yinghai Lu, Naga Chumbalkar, Jacob Pan, linux-kernel On Mon, 21 May 2012, Dimitri Sivanich wrote: > On Mon, May 21, 2012 at 11:07:04PM +0200, Thomas Gleixner wrote: > > On Mon, 21 May 2012, Dimitri Sivanich wrote: > > > > > The smp_irq_move_cleanup_interrupt routine should be checking for a valid > > > irq_cfg pointer prior to accessing it. It also seems that this should be > > > done after taking the desc lock. > > > > It seems that you either missed or failed to explain why it should be > > done _after_ taking the lock. > > > > Changelogs matter, really. > > > How about this? > > The smp_irq_move_cleanup_interrupt routine should be checking for a valid > irq_cfg pointer prior to accessing it. Why should it? > Follow the same protocol shown in irq_set_chip_data(), by taking the desc > lock before accessing this location. This is not a proper explanation. irq_set_chip_data() might be wrong as well and aside of that it might be correct to ignore that protocol in that particular situation. What's wrong with adding the actual wreckage scenario _AND_ the solution to the changelog so that a casual reader, who is not completely familiar with the code can understand what you are trying to solve? Don't misunderstand me. The patch is correct, just the explanation sucks. Thanks, tglx ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH] x86: check for valid irq_cfg pointer in smp_irq_move_cleanup_interrupt 2012-05-21 21:34 ` Thomas Gleixner @ 2012-05-23 18:16 ` Dimitri Sivanich 2012-05-23 19:04 ` Dimitri Sivanich 0 siblings, 1 reply; 26+ messages in thread From: Dimitri Sivanich @ 2012-05-23 18:16 UTC (permalink / raw) To: Thomas Gleixner Cc: Ingo Molnar, H. Peter Anvin, x86, Suresh Siddha, Yinghai Lu, Naga Chumbalkar, Jacob Pan, linux-kernel On Mon, May 21, 2012 at 11:34:32PM +0200, Thomas Gleixner wrote: > Don't misunderstand me. The patch is correct, just the explanation > sucks. > Hopefully this explanation is descriptive enough. A NULL pointer dereference can occur in smp_irq_move_cleanup_interrupt() if we haven't yet setup the irq_cfg pointer in the irq_desc.irq_data.chip_data. In create_irq_nr() there is a window where we have set vector_irq in __assign_irq_vector(), but not yet called irq_set_chip_data() to set the irq_cfg pointer. Should an IRQ_MOVE_CLEANUP_VECTOR hit the cpu in question during this time, smp_irq_move_cleanup_interrupt() will attempt to process the aforementioned irq, but panic when accessing irq_cfg. Only continue processing the irq if irq_cfg is non-NULL. Signed-off-by: Dimitri Sivanich <sivanich@sgi.com> --- arch/x86/kernel/apic/io_apic.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) Index: linux/arch/x86/kernel/apic/io_apic.c =================================================================== --- linux.orig/arch/x86/kernel/apic/io_apic.c +++ linux/arch/x86/kernel/apic/io_apic.c @@ -2478,9 +2478,12 @@ asmlinkage void smp_irq_move_cleanup_int if (!desc) continue; - cfg = irq_cfg(irq); raw_spin_lock(&desc->lock); + cfg = irq_cfg(irq); + if (!cfg) + goto unlock; + /* * Check if the irq migration is in progress. If so, we * haven't received the cleanup request yet for this irq. ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH] x86: check for valid irq_cfg pointer in smp_irq_move_cleanup_interrupt 2012-05-23 18:16 ` Dimitri Sivanich @ 2012-05-23 19:04 ` Dimitri Sivanich 2012-05-23 19:24 ` Thomas Gleixner 2012-05-23 19:24 ` Suresh Siddha 0 siblings, 2 replies; 26+ messages in thread From: Dimitri Sivanich @ 2012-05-23 19:04 UTC (permalink / raw) To: Thomas Gleixner Cc: Ingo Molnar, H. Peter Anvin, x86, Suresh Siddha, Yinghai Lu, Naga Chumbalkar, Jacob Pan, linux-kernel On Wed, May 23, 2012 at 01:16:36PM -0500, Dimitri Sivanich wrote: > In create_irq_nr() there is a window where we have set vector_irq in > __assign_irq_vector(), but not yet called irq_set_chip_data() to set the > irq_cfg pointer. BTW - is there a reason why we're calling irq_set_chip_data() in create_irq_nr() rather than in __assign_irq_vector() for the case where irq_cfg is NULL? ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH] x86: check for valid irq_cfg pointer in smp_irq_move_cleanup_interrupt 2012-05-23 19:04 ` Dimitri Sivanich @ 2012-05-23 19:24 ` Thomas Gleixner 2012-05-23 19:24 ` Suresh Siddha 1 sibling, 0 replies; 26+ messages in thread From: Thomas Gleixner @ 2012-05-23 19:24 UTC (permalink / raw) To: Dimitri Sivanich Cc: Ingo Molnar, H. Peter Anvin, x86, Suresh Siddha, Yinghai Lu, Naga Chumbalkar, Jacob Pan, linux-kernel On Wed, 23 May 2012, Dimitri Sivanich wrote: > On Wed, May 23, 2012 at 01:16:36PM -0500, Dimitri Sivanich wrote: > > In create_irq_nr() there is a window where we have set vector_irq in > > __assign_irq_vector(), but not yet called irq_set_chip_data() to set the > > irq_cfg pointer. > > BTW - is there a reason why we're calling irq_set_chip_data() in create_irq_nr() > rather than in __assign_irq_vector() for the case where irq_cfg is NULL? We can't nest desc->lock inside vector lock. ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH] x86: check for valid irq_cfg pointer in smp_irq_move_cleanup_interrupt 2012-05-23 19:04 ` Dimitri Sivanich 2012-05-23 19:24 ` Thomas Gleixner @ 2012-05-23 19:24 ` Suresh Siddha 2012-05-23 20:02 ` Dimitri Sivanich 1 sibling, 1 reply; 26+ messages in thread From: Suresh Siddha @ 2012-05-23 19:24 UTC (permalink / raw) To: Dimitri Sivanich Cc: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86, Yinghai Lu, Naga Chumbalkar, Jacob Pan, linux-kernel On Wed, 2012-05-23 at 14:04 -0500, Dimitri Sivanich wrote: > On Wed, May 23, 2012 at 01:16:36PM -0500, Dimitri Sivanich wrote: > > In create_irq_nr() there is a window where we have set vector_irq in > > __assign_irq_vector(), but not yet called irq_set_chip_data() to set the > > irq_cfg pointer. Ha. Now I understand how it can happen during boot/module load time. Thanks. > BTW - is there a reason why we're calling irq_set_chip_data() in create_irq_nr() > rather than in __assign_irq_vector() for the case where irq_cfg is NULL? assign_irq_vector() is also for setting up vectors during irq migration. So may be we could have done the irq_set_chip_data() in create_irq_nr() itself before calling assign_irq_vector(). Anyways, this change can't help in case of destroy irq path which can also lead to the same issue of de-referencing null pointer. Also, it will be nice if you can refer to this destroy irq path in your changelog. Acked-by: Suresh Siddha <suresh.b.siddha@intel.com> ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH] x86: check for valid irq_cfg pointer in smp_irq_move_cleanup_interrupt 2012-05-23 19:24 ` Suresh Siddha @ 2012-05-23 20:02 ` Dimitri Sivanich 2012-05-23 23:49 ` Suresh Siddha 0 siblings, 1 reply; 26+ messages in thread From: Dimitri Sivanich @ 2012-05-23 20:02 UTC (permalink / raw) To: Suresh Siddha Cc: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86, Yinghai Lu, Naga Chumbalkar, Jacob Pan, linux-kernel On Wed, May 23, 2012 at 12:24:46PM -0700, Suresh Siddha wrote: > On Wed, 2012-05-23 at 14:04 -0500, Dimitri Sivanich wrote: > > On Wed, May 23, 2012 at 01:16:36PM -0500, Dimitri Sivanich wrote: > > > In create_irq_nr() there is a window where we have set vector_irq in > > > __assign_irq_vector(), but not yet called irq_set_chip_data() to set the > > > irq_cfg pointer. > > Ha. Now I understand how it can happen during boot/module load time. > Thanks. > > > BTW - is there a reason why we're calling irq_set_chip_data() in create_irq_nr() > > rather than in __assign_irq_vector() for the case where irq_cfg is NULL? > > assign_irq_vector() is also for setting up vectors during irq migration. > So may be we could have done the irq_set_chip_data() in create_irq_nr() > itself before calling assign_irq_vector(). Anyways, this change can't > help in case of destroy irq path which can also lead to the same issue > of de-referencing null pointer. > > Also, it will be nice if you can refer to this destroy irq path in your > changelog. > > Acked-by: Suresh Siddha <suresh.b.siddha@intel.com> OK. Hopefully this covers it. A NULL pointer dereference can occur in smp_irq_move_cleanup_interrupt() if we haven't yet setup the irq_cfg pointer in the irq_desc.irq_data.chip_data. In create_irq_nr() there is a window where we have set vector_irq in __assign_irq_vector(), but not yet called irq_set_chip_data() to set the irq_cfg pointer. Should an IRQ_MOVE_CLEANUP_VECTOR hit the cpu in question during this time, smp_irq_move_cleanup_interrupt() will attempt to process the aforementioned irq, but panic when accessing irq_cfg. There is also a window in destroy_irq() where we've cleared the irq_cfg pointer in free_irq_cfg(), but have not yet called irq_free_desc(). Note that we have cleared vector_irq in __clear_irq_vector() prior to free_irq_cfg(), but smp_irq_move_cleanup_interrupt() might've already referenced the irq_desc. Only continue processing the irq if irq_cfg is non-NULL. Signed-off-by: Dimitri Sivanich <sivanich@sgi.com> --- arch/x86/kernel/apic/io_apic.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) Index: linux/arch/x86/kernel/apic/io_apic.c =================================================================== --- linux.orig/arch/x86/kernel/apic/io_apic.c +++ linux/arch/x86/kernel/apic/io_apic.c @@ -2478,9 +2478,12 @@ asmlinkage void smp_irq_move_cleanup_int if (!desc) continue; - cfg = irq_cfg(irq); raw_spin_lock(&desc->lock); + cfg = irq_cfg(irq); + if (!cfg) + goto unlock; + /* * Check if the irq migration is in progress. If so, we * haven't received the cleanup request yet for this irq. ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH] x86: check for valid irq_cfg pointer in smp_irq_move_cleanup_interrupt 2012-05-23 20:02 ` Dimitri Sivanich @ 2012-05-23 23:49 ` Suresh Siddha 2012-05-24 1:40 ` Dimitri Sivanich ` (2 more replies) 0 siblings, 3 replies; 26+ messages in thread From: Suresh Siddha @ 2012-05-23 23:49 UTC (permalink / raw) To: Dimitri Sivanich Cc: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86, Yinghai Lu, Naga Chumbalkar, Jacob Pan, linux-kernel On Wed, 2012-05-23 at 15:02 -0500, Dimitri Sivanich wrote: > OK. Hopefully this covers it. Sorry No. Now you will understand why Thomas wanted detailed changelog. I found one more issue with the help of your new modification to the changelog. > A NULL pointer dereference can occur in smp_irq_move_cleanup_interrupt() if > we haven't yet setup the irq_cfg pointer in the irq_desc.irq_data.chip_data. > > In create_irq_nr() there is a window where we have set vector_irq in > __assign_irq_vector(), but not yet called irq_set_chip_data() to set the > irq_cfg pointer. > > Should an IRQ_MOVE_CLEANUP_VECTOR hit the cpu in question during this time, > smp_irq_move_cleanup_interrupt() will attempt to process the aforementioned > irq, but panic when accessing irq_cfg. > > There is also a window in destroy_irq() where we've cleared the irq_cfg > pointer in free_irq_cfg(), but have not yet called irq_free_desc(). Note > that we have cleared vector_irq in __clear_irq_vector() prior to free_irq_cfg(), > but smp_irq_move_cleanup_interrupt() might've already referenced the irq_desc. So, what happens if the irq_desc gets freed by the destroy_irq() in the sparse irq case? smp_irq_move_cleanup_interrupt() will refer to freed irq desc memory! Right? May we should really do something like the appended (untested patch)? Can you please review and give this a try? Let me review a bit more to see if this really fixes the issue. Thanks. --- arch/x86/kernel/apic/io_apic.c | 19 +++++++++---------- 1 files changed, 9 insertions(+), 10 deletions(-) diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c index ffdc152..81f4cab 100644 --- a/arch/x86/kernel/apic/io_apic.c +++ b/arch/x86/kernel/apic/io_apic.c @@ -2295,6 +2295,8 @@ asmlinkage void smp_irq_move_cleanup_interrupt(void) exit_idle(); me = smp_processor_id(); + + raw_spin_lock(&vector_lock); for (vector = FIRST_EXTERNAL_VECTOR; vector < NR_VECTORS; vector++) { unsigned int irq; unsigned int irr; @@ -2310,17 +2312,16 @@ asmlinkage void smp_irq_move_cleanup_interrupt(void) continue; cfg = irq_cfg(irq); - raw_spin_lock(&desc->lock); /* * Check if the irq migration is in progress. If so, we * haven't received the cleanup request yet for this irq. */ if (cfg->move_in_progress) - goto unlock; + continue; if (vector == cfg->vector && cpumask_test_cpu(me, cfg->domain)) - goto unlock; + continue; irr = apic_read(APIC_IRR + (vector / 32 * 0x10)); /* @@ -2332,12 +2333,11 @@ asmlinkage void smp_irq_move_cleanup_interrupt(void) */ if (irr & (1 << (vector % 32))) { apic->send_IPI_self(IRQ_MOVE_CLEANUP_VECTOR); - goto unlock; + continue; } __this_cpu_write(vector_irq[vector], -1); -unlock: - raw_spin_unlock(&desc->lock); } + raw_spin_unlock(&vector_lock); irq_exit(); } @@ -2986,17 +2986,16 @@ unsigned int create_irq_nr(unsigned int from, int node) return 0; } + irq_set_chip_data(irq, cfg); raw_spin_lock_irqsave(&vector_lock, flags); if (!__assign_irq_vector(irq, cfg, apic->target_cpus())) ret = irq; raw_spin_unlock_irqrestore(&vector_lock, flags); - if (ret) { - irq_set_chip_data(irq, cfg); + if (ret) irq_clear_status_flags(irq, IRQ_NOREQUEST); - } else { + else free_irq_at(irq, cfg); - } return ret; } ^ permalink raw reply related [flat|nested] 26+ messages in thread
* Re: [PATCH] x86: check for valid irq_cfg pointer in smp_irq_move_cleanup_interrupt 2012-05-23 23:49 ` Suresh Siddha @ 2012-05-24 1:40 ` Dimitri Sivanich 2012-05-24 14:37 ` Dimitri Sivanich 2012-05-24 14:53 ` Thomas Gleixner 2 siblings, 0 replies; 26+ messages in thread From: Dimitri Sivanich @ 2012-05-24 1:40 UTC (permalink / raw) To: Suresh Siddha Cc: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86, Yinghai Lu, Naga Chumbalkar, Jacob Pan, linux-kernel On Wed, May 23, 2012 at 04:49:29PM -0700, Suresh Siddha wrote: > On Wed, 2012-05-23 at 15:02 -0500, Dimitri Sivanich wrote: > > OK. Hopefully this covers it. > > Sorry No. Now you will understand why Thomas wanted detailed changelog. > I found one more issue with the help of your new modification to the > changelog. > > A NULL pointer dereference can occur in smp_irq_move_cleanup_interrupt() if > > we haven't yet setup the irq_cfg pointer in the irq_desc.irq_data.chip_data. > > > > In create_irq_nr() there is a window where we have set vector_irq in > > __assign_irq_vector(), but not yet called irq_set_chip_data() to set the > > irq_cfg pointer. > > > > Should an IRQ_MOVE_CLEANUP_VECTOR hit the cpu in question during this time, > > smp_irq_move_cleanup_interrupt() will attempt to process the aforementioned > > irq, but panic when accessing irq_cfg. > > > > There is also a window in destroy_irq() where we've cleared the irq_cfg > > pointer in free_irq_cfg(), but have not yet called irq_free_desc(). Note > > that we have cleared vector_irq in __clear_irq_vector() prior to free_irq_cfg(), > > but smp_irq_move_cleanup_interrupt() might've already referenced the irq_desc. > > So, what happens if the irq_desc gets freed by the destroy_irq() in the > sparse irq case? smp_irq_move_cleanup_interrupt() will refer to freed > irq desc memory! Right? I don't believe I ever hit that case, but yes, I see the possibility, so we have at least 3 holes here. My test (other than repeated boots) was to repeatedly insert and remove a module that sets up irq's on every cpu. Without my patch it panic'ed fairly quickly. With the patch it never failed. > > May we should really do something like the appended (untested patch)? > Can you please review and give this a try? Let me review a bit more to > see if this really fixes the issue. I wonder how many instances of smp_irq_move_cleanup_interrupt() might run simultaneously on a large system? Would there be any issue with lock pressure from holding the vector_lock through the entire loop? I can probably test the code (after further review), but the only case I saw through testing was the create_irq_nr case. > > Thanks. > --- > arch/x86/kernel/apic/io_apic.c | 19 +++++++++---------- > 1 files changed, 9 insertions(+), 10 deletions(-) > > diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c > index ffdc152..81f4cab 100644 > --- a/arch/x86/kernel/apic/io_apic.c > +++ b/arch/x86/kernel/apic/io_apic.c > @@ -2295,6 +2295,8 @@ asmlinkage void smp_irq_move_cleanup_interrupt(void) > exit_idle(); > > me = smp_processor_id(); > + > + raw_spin_lock(&vector_lock); > for (vector = FIRST_EXTERNAL_VECTOR; vector < NR_VECTORS; vector++) { > unsigned int irq; > unsigned int irr; > @@ -2310,17 +2312,16 @@ asmlinkage void smp_irq_move_cleanup_interrupt(void) > continue; > > cfg = irq_cfg(irq); > - raw_spin_lock(&desc->lock); > > /* > * Check if the irq migration is in progress. If so, we > * haven't received the cleanup request yet for this irq. > */ > if (cfg->move_in_progress) > - goto unlock; > + continue; > > if (vector == cfg->vector && cpumask_test_cpu(me, cfg->domain)) > - goto unlock; > + continue; > > irr = apic_read(APIC_IRR + (vector / 32 * 0x10)); > /* > @@ -2332,12 +2333,11 @@ asmlinkage void smp_irq_move_cleanup_interrupt(void) > */ > if (irr & (1 << (vector % 32))) { > apic->send_IPI_self(IRQ_MOVE_CLEANUP_VECTOR); > - goto unlock; > + continue; > } > __this_cpu_write(vector_irq[vector], -1); > -unlock: > - raw_spin_unlock(&desc->lock); > } > + raw_spin_unlock(&vector_lock); > > irq_exit(); > } > @@ -2986,17 +2986,16 @@ unsigned int create_irq_nr(unsigned int from, int node) > return 0; > } > > + irq_set_chip_data(irq, cfg); > raw_spin_lock_irqsave(&vector_lock, flags); > if (!__assign_irq_vector(irq, cfg, apic->target_cpus())) > ret = irq; > raw_spin_unlock_irqrestore(&vector_lock, flags); > > - if (ret) { > - irq_set_chip_data(irq, cfg); > + if (ret) > irq_clear_status_flags(irq, IRQ_NOREQUEST); > - } else { > + else > free_irq_at(irq, cfg); > - } > return ret; > } > > ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH] x86: check for valid irq_cfg pointer in smp_irq_move_cleanup_interrupt 2012-05-23 23:49 ` Suresh Siddha 2012-05-24 1:40 ` Dimitri Sivanich @ 2012-05-24 14:37 ` Dimitri Sivanich 2012-05-24 18:19 ` Suresh Siddha 2012-05-24 14:53 ` Thomas Gleixner 2 siblings, 1 reply; 26+ messages in thread From: Dimitri Sivanich @ 2012-05-24 14:37 UTC (permalink / raw) To: Suresh Siddha Cc: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86, Yinghai Lu, Naga Chumbalkar, Jacob Pan, linux-kernel On Wed, May 23, 2012 at 04:49:29PM -0700, Suresh Siddha wrote: > On Wed, 2012-05-23 at 15:02 -0500, Dimitri Sivanich wrote: > > OK. Hopefully this covers it. > > Sorry No. Now you will understand why Thomas wanted detailed changelog. > I found one more issue with the help of your new modification to the > changelog. > > > A NULL pointer dereference can occur in smp_irq_move_cleanup_interrupt() if > > we haven't yet setup the irq_cfg pointer in the irq_desc.irq_data.chip_data. > > > > In create_irq_nr() there is a window where we have set vector_irq in > > __assign_irq_vector(), but not yet called irq_set_chip_data() to set the > > irq_cfg pointer. > > > > Should an IRQ_MOVE_CLEANUP_VECTOR hit the cpu in question during this time, > > smp_irq_move_cleanup_interrupt() will attempt to process the aforementioned > > irq, but panic when accessing irq_cfg. > > > > There is also a window in destroy_irq() where we've cleared the irq_cfg > > pointer in free_irq_cfg(), but have not yet called irq_free_desc(). Note > > that we have cleared vector_irq in __clear_irq_vector() prior to free_irq_cfg(), > > but smp_irq_move_cleanup_interrupt() might've already referenced the irq_desc. > > So, what happens if the irq_desc gets freed by the destroy_irq() in the > sparse irq case? smp_irq_move_cleanup_interrupt() will refer to freed > irq desc memory! Right? And speaking of possible holes in destroy_irq().. What happens if we're running __assign_irq_vector() (say we're changing irq affinity), and on another cpu we had just run through __clear_irq_vector() via destroy_irq(). Now destroy_irq() is going to call free_irq_at()->free_irq_cfg, which will clear irq_cfg. Then __assign_irq_vector goes to access irq_cfg (cfg->vector or cfg->move_in_progress, for instance), which was already freed. I'm not sure if this can happen, but just eyeballing it, it does look that that way. > > May we should really do something like the appended (untested patch)? > Can you please review and give this a try? Let me review a bit more to > see if this really fixes the issue. > > Thanks. > --- > arch/x86/kernel/apic/io_apic.c | 19 +++++++++---------- > 1 files changed, 9 insertions(+), 10 deletions(-) > > diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c > index ffdc152..81f4cab 100644 > --- a/arch/x86/kernel/apic/io_apic.c > +++ b/arch/x86/kernel/apic/io_apic.c > @@ -2295,6 +2295,8 @@ asmlinkage void smp_irq_move_cleanup_interrupt(void) > exit_idle(); > > me = smp_processor_id(); > + > + raw_spin_lock(&vector_lock); > for (vector = FIRST_EXTERNAL_VECTOR; vector < NR_VECTORS; vector++) { > unsigned int irq; > unsigned int irr; > @@ -2310,17 +2312,16 @@ asmlinkage void smp_irq_move_cleanup_interrupt(void) > continue; > > cfg = irq_cfg(irq); > - raw_spin_lock(&desc->lock); > > /* > * Check if the irq migration is in progress. If so, we > * haven't received the cleanup request yet for this irq. > */ > if (cfg->move_in_progress) > - goto unlock; > + continue; > > if (vector == cfg->vector && cpumask_test_cpu(me, cfg->domain)) > - goto unlock; > + continue; > > irr = apic_read(APIC_IRR + (vector / 32 * 0x10)); > /* > @@ -2332,12 +2333,11 @@ asmlinkage void smp_irq_move_cleanup_interrupt(void) > */ > if (irr & (1 << (vector % 32))) { > apic->send_IPI_self(IRQ_MOVE_CLEANUP_VECTOR); > - goto unlock; > + continue; > } > __this_cpu_write(vector_irq[vector], -1); > -unlock: > - raw_spin_unlock(&desc->lock); > } > + raw_spin_unlock(&vector_lock); > > irq_exit(); > } > @@ -2986,17 +2986,16 @@ unsigned int create_irq_nr(unsigned int from, int node) > return 0; > } > > + irq_set_chip_data(irq, cfg); > raw_spin_lock_irqsave(&vector_lock, flags); > if (!__assign_irq_vector(irq, cfg, apic->target_cpus())) > ret = irq; > raw_spin_unlock_irqrestore(&vector_lock, flags); > > - if (ret) { > - irq_set_chip_data(irq, cfg); > + if (ret) > irq_clear_status_flags(irq, IRQ_NOREQUEST); > - } else { > + else > free_irq_at(irq, cfg); > - } > return ret; > } > > ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH] x86: check for valid irq_cfg pointer in smp_irq_move_cleanup_interrupt 2012-05-24 14:37 ` Dimitri Sivanich @ 2012-05-24 18:19 ` Suresh Siddha 2012-05-24 19:16 ` Thomas Gleixner 0 siblings, 1 reply; 26+ messages in thread From: Suresh Siddha @ 2012-05-24 18:19 UTC (permalink / raw) To: Dimitri Sivanich Cc: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86, Yinghai Lu, Naga Chumbalkar, Jacob Pan, linux-kernel On Thu, 2012-05-24 at 09:37 -0500, Dimitri Sivanich wrote: > And speaking of possible holes in destroy_irq().. > > What happens if we're running __assign_irq_vector() (say we're changing irq > affinity), and on another cpu we had just run through __clear_irq_vector() > via destroy_irq(). Now destroy_irq() is going to call > free_irq_at()->free_irq_cfg, which will clear irq_cfg. Then > __assign_irq_vector goes to access irq_cfg (cfg->vector or > cfg->move_in_progress, for instance), which was already freed. > > I'm not sure if this can happen, but just eyeballing it, it does look that > that way. > I wanted to say, irq desc is locked when we change the irq affinity, which calls assign_irq_vector() and friends, so this should be fine. BUT NO. I don't see any reference counts being maintained when we do irq_to_desc(). So locking/unlocking that desc pointer is bogus when destroy_irq() can go ahead and free the desc in parallel. So, SPARSE_IRQ looks terribly broken! Yinghai, Thomas? thanks, suresh ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH] x86: check for valid irq_cfg pointer in smp_irq_move_cleanup_interrupt 2012-05-24 18:19 ` Suresh Siddha @ 2012-05-24 19:16 ` Thomas Gleixner 2012-05-26 0:23 ` Suresh Siddha 0 siblings, 1 reply; 26+ messages in thread From: Thomas Gleixner @ 2012-05-24 19:16 UTC (permalink / raw) To: Suresh Siddha Cc: Dimitri Sivanich, Ingo Molnar, H. Peter Anvin, x86, Yinghai Lu, Naga Chumbalkar, Jacob Pan, linux-kernel On Thu, 24 May 2012, Suresh Siddha wrote: > On Thu, 2012-05-24 at 09:37 -0500, Dimitri Sivanich wrote: > > And speaking of possible holes in destroy_irq().. > > > > What happens if we're running __assign_irq_vector() (say we're changing irq > > affinity), and on another cpu we had just run through __clear_irq_vector() > > via destroy_irq(). Now destroy_irq() is going to call > > free_irq_at()->free_irq_cfg, which will clear irq_cfg. Then > > __assign_irq_vector goes to access irq_cfg (cfg->vector or > > cfg->move_in_progress, for instance), which was already freed. > > > > I'm not sure if this can happen, but just eyeballing it, it does look that > > that way. > > > > I wanted to say, irq desc is locked when we change the irq affinity, > which calls assign_irq_vector() and friends, so this should be fine. > > BUT NO. I don't see any reference counts being maintained when we do > irq_to_desc(). So locking/unlocking that desc pointer is bogus when > destroy_irq() can go ahead and free the desc in parallel. > > So, SPARSE_IRQ looks terribly broken! Yinghai, Thomas? Yes, we need refcounts for that. We talked about that before, but then the argument was against it was that all that code is serialized already, so no need. How wrong :) ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH] x86: check for valid irq_cfg pointer in smp_irq_move_cleanup_interrupt 2012-05-24 19:16 ` Thomas Gleixner @ 2012-05-26 0:23 ` Suresh Siddha 2012-05-26 10:18 ` Thomas Gleixner 0 siblings, 1 reply; 26+ messages in thread From: Suresh Siddha @ 2012-05-26 0:23 UTC (permalink / raw) To: Thomas Gleixner Cc: Dimitri Sivanich, Ingo Molnar, H. Peter Anvin, x86, Yinghai Lu, Naga Chumbalkar, Jacob Pan, linux-kernel On Thu, 2012-05-24 at 21:16 +0200, Thomas Gleixner wrote: > On Thu, 24 May 2012, Suresh Siddha wrote: > > On Thu, 2012-05-24 at 09:37 -0500, Dimitri Sivanich wrote: > > > And speaking of possible holes in destroy_irq().. > > > > > > What happens if we're running __assign_irq_vector() (say we're changing irq > > > affinity), and on another cpu we had just run through __clear_irq_vector() > > > via destroy_irq(). Now destroy_irq() is going to call > > > free_irq_at()->free_irq_cfg, which will clear irq_cfg. Then > > > __assign_irq_vector goes to access irq_cfg (cfg->vector or > > > cfg->move_in_progress, for instance), which was already freed. > > > > > > I'm not sure if this can happen, but just eyeballing it, it does look that > > > that way. > > > > > > > I wanted to say, irq desc is locked when we change the irq affinity, > > which calls assign_irq_vector() and friends, so this should be fine. > > > > BUT NO. I don't see any reference counts being maintained when we do > > irq_to_desc(). So locking/unlocking that desc pointer is bogus when > > destroy_irq() can go ahead and free the desc in parallel. > > > > So, SPARSE_IRQ looks terribly broken! Yinghai, Thomas? > > Yes, we need refcounts for that. We talked about that before, but then > the argument was against it was that all that code is serialized > already, so no need. How wrong :) I looked a bit more at this and it looks like unregister_handler_proc() (that gets called from the free_irq path) will ensure that all the existing proc writers modifying the irq affinity from the /proc/irq/N/smp_affinity interface will be completed. So Dimitri, the code path changing irq affinity looks fine. We also do synchronize_irq() in __free_irq(). Followed by masking/shutdown the irq etc and the destroy_irq() which happens after all this is probably getting lucky because of timing reasons (any other cpu handling that irq would have completed the irq handling and the corresponding reference to the 'desc' etc). But we should be also doing synchronize_rcu()/synchronize_sched() to ensure that other cpu's are not in the irq handling paths having a reference to the 'desc'. There are other (not-so common) irq desc references, like in the show_interrupts() (cat /proc/interrupts path) etc, that does things like this in the process context: desc = irq_to_desc(i); if (!desc) return 0; raw_spin_lock_irqsave(&desc->lock, flags); May be we should introduce something like get_irq_desc_locked()/put_irq_desc_locked() that can safely access the irq desc with pre-emption/irq's disabled and lock it etc. And the synchronize_sched() will enable the destroy_irq()/free_desc() to free it safely etc. thanks, suresh ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH] x86: check for valid irq_cfg pointer in smp_irq_move_cleanup_interrupt 2012-05-26 0:23 ` Suresh Siddha @ 2012-05-26 10:18 ` Thomas Gleixner 2012-05-27 1:41 ` Jiang Liu 2012-05-30 13:46 ` Dimitri Sivanich 0 siblings, 2 replies; 26+ messages in thread From: Thomas Gleixner @ 2012-05-26 10:18 UTC (permalink / raw) To: Suresh Siddha Cc: Dimitri Sivanich, Ingo Molnar, H. Peter Anvin, x86, Yinghai Lu, Naga Chumbalkar, Jacob Pan, linux-kernel On Fri, 25 May 2012, Suresh Siddha wrote: > On Thu, 2012-05-24 at 21:16 +0200, Thomas Gleixner wrote: > There are other (not-so common) irq desc references, like in the > show_interrupts() (cat /proc/interrupts path) etc, that does things like > this in the process context: > > desc = irq_to_desc(i); > if (!desc) > return 0; > > raw_spin_lock_irqsave(&desc->lock, flags); > > May be we should introduce something like > get_irq_desc_locked()/put_irq_desc_locked() that can safely access the > irq desc with pre-emption/irq's disabled and lock it etc. And the > synchronize_sched() will enable the destroy_irq()/free_desc() to free it > safely etc. I want to avoid that and instead use proper refcounting. The reason is that we want to move the irq descriptor when the affinity changes nodes, and for that we need refcounting anyway. Thanks, tglx ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH] x86: check for valid irq_cfg pointer in smp_irq_move_cleanup_interrupt 2012-05-26 10:18 ` Thomas Gleixner @ 2012-05-27 1:41 ` Jiang Liu 2012-05-30 13:46 ` Dimitri Sivanich 1 sibling, 0 replies; 26+ messages in thread From: Jiang Liu @ 2012-05-27 1:41 UTC (permalink / raw) To: Thomas Gleixner Cc: Suresh Siddha, Dimitri Sivanich, Ingo Molnar, H. Peter Anvin, x86, Yinghai Lu, Naga Chumbalkar, Jacob Pan, linux-kernel I have realized the same issue with typical usages of for_each_irq_desc(), which may access freed memory with SPARSE_IRQ. My naive solution was to avoid freeing irq_desc even SPARSE_IRQ_ is enabled. for_each_irq_desc(i, desc) { raw_spin_lock_irq(&desc->lock); On 05/26/2012 06:18 PM, Thomas Gleixner wrote: > On Fri, 25 May 2012, Suresh Siddha wrote: >> On Thu, 2012-05-24 at 21:16 +0200, Thomas Gleixner wrote: >> There are other (not-so common) irq desc references, like in the >> show_interrupts() (cat /proc/interrupts path) etc, that does things like >> this in the process context: >> >> desc = irq_to_desc(i); >> if (!desc) >> return 0; >> >> raw_spin_lock_irqsave(&desc->lock, flags); >> >> May be we should introduce something like >> get_irq_desc_locked()/put_irq_desc_locked() that can safely access the >> irq desc with pre-emption/irq's disabled and lock it etc. And the >> synchronize_sched() will enable the destroy_irq()/free_desc() to free it >> safely etc. > > I want to avoid that and instead use proper refcounting. The reason is > that we want to move the irq descriptor when the affinity changes > nodes, and for that we need refcounting anyway. > > Thanks, > > tglx > -- > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH] x86: check for valid irq_cfg pointer in smp_irq_move_cleanup_interrupt 2012-05-26 10:18 ` Thomas Gleixner 2012-05-27 1:41 ` Jiang Liu @ 2012-05-30 13:46 ` Dimitri Sivanich 1 sibling, 0 replies; 26+ messages in thread From: Dimitri Sivanich @ 2012-05-30 13:46 UTC (permalink / raw) To: Thomas Gleixner Cc: Suresh Siddha, Ingo Molnar, H. Peter Anvin, x86, Yinghai Lu, Naga Chumbalkar, Jacob Pan, linux-kernel, akpm, gregkh, stable On Sat, May 26, 2012 at 12:18:21PM +0200, Thomas Gleixner wrote: > On Fri, 25 May 2012, Suresh Siddha wrote: > > On Thu, 2012-05-24 at 21:16 +0200, Thomas Gleixner wrote: > > There are other (not-so common) irq desc references, like in the > > show_interrupts() (cat /proc/interrupts path) etc, that does things like > > this in the process context: > > > > desc = irq_to_desc(i); > > if (!desc) > > return 0; > > > > raw_spin_lock_irqsave(&desc->lock, flags); > > > > May be we should introduce something like > > get_irq_desc_locked()/put_irq_desc_locked() that can safely access the > > irq desc with pre-emption/irq's disabled and lock it etc. And the > > synchronize_sched() will enable the destroy_irq()/free_desc() to free it > > safely etc. > > I want to avoid that and instead use proper refcounting. The reason is > that we want to move the irq descriptor when the affinity changes > nodes, and for that we need refcounting anyway. > While this proposal sounds good, in the meantime would there be any harm in putting the NULL cfg check into smp_irq_move_cleanup_interrupt()? It's a minimal change, and eliminates the panics that I've encountered thus far. Reposting the patch. A NULL pointer dereference can occur in smp_irq_move_cleanup_interrupt() if we haven't yet setup the irq_cfg pointer in the irq_desc.irq_data.chip_data. In create_irq_nr() there is a window where we have set vector_irq in __assign_irq_vector(), but not yet called irq_set_chip_data() to set the irq_cfg pointer. Should an IRQ_MOVE_CLEANUP_VECTOR hit the cpu in question during this time, smp_irq_move_cleanup_interrupt() will attempt to process the aforementioned irq, but panic when accessing irq_cfg. There is also a window in destroy_irq() where we've cleared the irq_cfg pointer in free_irq_cfg(), but have not yet called irq_free_desc(). Note that we have cleared vector_irq in __clear_irq_vector() prior to free_irq_cfg(), but smp_irq_move_cleanup_interrupt() might've already referenced the irq_desc. Only continue processing the irq if irq_cfg is non-NULL. Signed-off-by: Dimitri Sivanich <sivanich@sgi.com> --- arch/x86/kernel/apic/io_apic.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) Index: linux/arch/x86/kernel/apic/io_apic.c =================================================================== --- linux.orig/arch/x86/kernel/apic/io_apic.c +++ linux/arch/x86/kernel/apic/io_apic.c @@ -2478,9 +2478,12 @@ asmlinkage void smp_irq_move_cleanup_int if (!desc) continue; - cfg = irq_cfg(irq); raw_spin_lock(&desc->lock); + cfg = irq_cfg(irq); + if (!cfg) + goto unlock; + /* * Check if the irq migration is in progress. If so, we * haven't received the cleanup request yet for this irq. ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH] x86: check for valid irq_cfg pointer in smp_irq_move_cleanup_interrupt 2012-05-23 23:49 ` Suresh Siddha 2012-05-24 1:40 ` Dimitri Sivanich 2012-05-24 14:37 ` Dimitri Sivanich @ 2012-05-24 14:53 ` Thomas Gleixner 2012-05-24 15:36 ` Dimitri Sivanich 2 siblings, 1 reply; 26+ messages in thread From: Thomas Gleixner @ 2012-05-24 14:53 UTC (permalink / raw) To: Suresh Siddha Cc: Dimitri Sivanich, Ingo Molnar, H. Peter Anvin, x86, Yinghai Lu, Naga Chumbalkar, Jacob Pan, linux-kernel On Wed, 23 May 2012, Suresh Siddha wrote: > On Wed, 2012-05-23 at 15:02 -0500, Dimitri Sivanich wrote: > > OK. Hopefully this covers it. > > Sorry No. Now you will understand why Thomas wanted detailed changelog. > I found one more issue with the help of your new modification to the > changelog. > > > A NULL pointer dereference can occur in smp_irq_move_cleanup_interrupt() if > > we haven't yet setup the irq_cfg pointer in the irq_desc.irq_data.chip_data. > > > > In create_irq_nr() there is a window where we have set vector_irq in > > __assign_irq_vector(), but not yet called irq_set_chip_data() to set the > > irq_cfg pointer. > > > > Should an IRQ_MOVE_CLEANUP_VECTOR hit the cpu in question during this time, > > smp_irq_move_cleanup_interrupt() will attempt to process the aforementioned > > irq, but panic when accessing irq_cfg. > > > > There is also a window in destroy_irq() where we've cleared the irq_cfg > > pointer in free_irq_cfg(), but have not yet called irq_free_desc(). Note > > that we have cleared vector_irq in __clear_irq_vector() prior to free_irq_cfg(), > > but smp_irq_move_cleanup_interrupt() might've already referenced the irq_desc. > > So, what happens if the irq_desc gets freed by the destroy_irq() in the > sparse irq case? smp_irq_move_cleanup_interrupt() will refer to freed > irq desc memory! Right? > > May we should really do something like the appended (untested patch)? > Can you please review and give this a try? Let me review a bit more to > see if this really fixes the issue. It's fixing the problem. But this move_cleanup stuff could be made less stupid. The check for irq_desc is superflous. irq_cfg() calls irq_get_chip_data() which will return NULL if the irq descriptor is not there. To avoid the lookup business completely we should really store irq_desc instead of the irq number in the per cpu vector array, that would also get rid of the lookup in the irq delivery path. Now that still needs to iterate over all vectors, but this could be optimized in a second step. In complete_move() we send the IPI to all cpus in the old mask. We really should set the corresponding vector bit in a per cpu bitfield on those cpus in the mask. The cleanup can rely on the bits and avoid looking at 200+ vectors to find a single one. Thoughts? tglx ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH] x86: check for valid irq_cfg pointer in smp_irq_move_cleanup_interrupt 2012-05-24 14:53 ` Thomas Gleixner @ 2012-05-24 15:36 ` Dimitri Sivanich 0 siblings, 0 replies; 26+ messages in thread From: Dimitri Sivanich @ 2012-05-24 15:36 UTC (permalink / raw) To: Thomas Gleixner Cc: Suresh Siddha, Ingo Molnar, H. Peter Anvin, x86, Yinghai Lu, Naga Chumbalkar, Jacob Pan, linux-kernel On Thu, May 24, 2012 at 04:53:06PM +0200, Thomas Gleixner wrote: > On Wed, 23 May 2012, Suresh Siddha wrote: > > On Wed, 2012-05-23 at 15:02 -0500, Dimitri Sivanich wrote: > > > OK. Hopefully this covers it. > > > > Sorry No. Now you will understand why Thomas wanted detailed changelog. > > I found one more issue with the help of your new modification to the > > changelog. > > > > > A NULL pointer dereference can occur in smp_irq_move_cleanup_interrupt() if > > > we haven't yet setup the irq_cfg pointer in the irq_desc.irq_data.chip_data. > > > > > > In create_irq_nr() there is a window where we have set vector_irq in > > > __assign_irq_vector(), but not yet called irq_set_chip_data() to set the > > > irq_cfg pointer. > > > > > > Should an IRQ_MOVE_CLEANUP_VECTOR hit the cpu in question during this time, > > > smp_irq_move_cleanup_interrupt() will attempt to process the aforementioned > > > irq, but panic when accessing irq_cfg. > > > > > > There is also a window in destroy_irq() where we've cleared the irq_cfg > > > pointer in free_irq_cfg(), but have not yet called irq_free_desc(). Note > > > that we have cleared vector_irq in __clear_irq_vector() prior to free_irq_cfg(), > > > but smp_irq_move_cleanup_interrupt() might've already referenced the irq_desc. > > > > So, what happens if the irq_desc gets freed by the destroy_irq() in the > > sparse irq case? smp_irq_move_cleanup_interrupt() will refer to freed > > irq desc memory! Right? > > > > May we should really do something like the appended (untested patch)? > > Can you please review and give this a try? Let me review a bit more to > > see if this really fixes the issue. > > It's fixing the problem. > > But this move_cleanup stuff could be made less stupid. > > The check for irq_desc is superflous. irq_cfg() calls > irq_get_chip_data() which will return NULL if the irq descriptor is > not there. > > To avoid the lookup business completely we should really store > irq_desc instead of the irq number in the per cpu vector array, that > would also get rid of the lookup in the irq delivery path. So if the irq_desc gets deallocated you would clear all corresponding per cpu vector_irq references before deallocation, protecting accesses by smp_irq_move_cleanup_interrupt? > > Now that still needs to iterate over all vectors, but this could be > optimized in a second step. > > In complete_move() we send the IPI to all cpus in the old mask. We > really should set the corresponding vector bit in a per cpu bitfield > on those cpus in the mask. The cleanup can rely on the bits and avoid > looking at 200+ vectors to find a single one. This part does sound more efficient at first glance. > > Thoughts? > > tglx ^ permalink raw reply [flat|nested] 26+ messages in thread
end of thread, other threads:[~2012-10-24 13:00 UTC | newest] Thread overview: 26+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2012-10-16 12:50 [PATCH] x86: check for valid irq_cfg pointer in smp_irq_move_cleanup_interrupt Dimitri Sivanich 2012-10-24 12:59 ` [tip:x86/urgent] x86/irq/ioapic: Check " tip-bot for Dimitri Sivanich -- strict thread matches above, loose matches on Subject: below -- 2012-05-21 16:49 [PATCH] x86: check " Dimitri Sivanich 2012-05-21 21:05 ` Suresh Siddha 2012-05-21 21:09 ` Dimitri Sivanich 2012-05-21 21:10 ` Suresh Siddha 2012-05-22 2:41 ` Dimitri Sivanich 2012-05-21 21:07 ` Thomas Gleixner 2012-05-21 21:19 ` Dimitri Sivanich 2012-05-21 21:34 ` Thomas Gleixner 2012-05-23 18:16 ` Dimitri Sivanich 2012-05-23 19:04 ` Dimitri Sivanich 2012-05-23 19:24 ` Thomas Gleixner 2012-05-23 19:24 ` Suresh Siddha 2012-05-23 20:02 ` Dimitri Sivanich 2012-05-23 23:49 ` Suresh Siddha 2012-05-24 1:40 ` Dimitri Sivanich 2012-05-24 14:37 ` Dimitri Sivanich 2012-05-24 18:19 ` Suresh Siddha 2012-05-24 19:16 ` Thomas Gleixner 2012-05-26 0:23 ` Suresh Siddha 2012-05-26 10:18 ` Thomas Gleixner 2012-05-27 1:41 ` Jiang Liu 2012-05-30 13:46 ` Dimitri Sivanich 2012-05-24 14:53 ` Thomas Gleixner 2012-05-24 15:36 ` Dimitri Sivanich
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).