All of lore.kernel.org
 help / color / mirror / Atom feed
From: Dimitri Sivanich <sivanich@sgi.com>
To: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, "H. Peter Anvin" <hpa@zytor.com>,
	x86@kernel.org, Yinghai Lu <yinghai@kernel.org>,
	Naga Chumbalkar <nagananda.chumbalkar@hp.com>,
	Jacob Pan <jacob.jun.pan@linux.intel.com>,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH] x86: check for valid irq_cfg pointer in smp_irq_move_cleanup_interrupt
Date: Wed, 23 May 2012 20:40:19 -0500	[thread overview]
Message-ID: <20120524014019.GA12196@sgi.com> (raw)
In-Reply-To: <1337816970.1997.207.camel@sbsiddha-desk.sc.intel.com>

On Wed, May 23, 2012 at 04:49:29PM -0700, Suresh Siddha wrote:
> On Wed, 2012-05-23 at 15:02 -0500, Dimitri Sivanich wrote:
> > OK.  Hopefully this covers it.
> 
> Sorry No. Now you will understand why Thomas wanted detailed changelog.
> I found one more issue with the help of your new modification to the
> changelog.

> > A NULL pointer dereference can occur in smp_irq_move_cleanup_interrupt() if
> > we haven't yet setup the irq_cfg pointer in the irq_desc.irq_data.chip_data.
> > 
> > In create_irq_nr() there is a window where we have set vector_irq in
> > __assign_irq_vector(), but not yet called irq_set_chip_data() to set the
> > irq_cfg pointer.
> > 
> > Should an IRQ_MOVE_CLEANUP_VECTOR hit the cpu in question during this time,
> > smp_irq_move_cleanup_interrupt() will attempt to process the aforementioned
> > irq, but panic when accessing irq_cfg.
> > 
> > There is also a window in destroy_irq() where we've cleared the irq_cfg
> > pointer in free_irq_cfg(), but have not yet called irq_free_desc().  Note
> > that we have cleared vector_irq in __clear_irq_vector() prior to free_irq_cfg(),
> > but smp_irq_move_cleanup_interrupt() might've already referenced the irq_desc.
> 
> So, what happens if the irq_desc gets freed by the destroy_irq() in the
> sparse irq case? smp_irq_move_cleanup_interrupt() will refer to freed
> irq desc memory! Right?

I don't believe I ever hit that case, but yes, I see the possibility, so
we have at least 3 holes here.

My test (other than repeated boots) was to repeatedly insert and remove a
module that sets up irq's on every cpu.  Without my patch it panic'ed fairly
quickly.  With the patch it never failed.

> 
> May we should really do something like the appended (untested patch)?
> Can you please review and give this a try? Let me review a bit more to
> see if this really fixes the issue.

I wonder how many instances of smp_irq_move_cleanup_interrupt() might
run simultaneously on a large system?  Would there be any issue with lock
pressure from holding the vector_lock through the entire loop?

I can probably test the code (after further review), but the only case I
saw through testing was the create_irq_nr case.

> 
> Thanks.
> ---
>  arch/x86/kernel/apic/io_apic.c |   19 +++++++++----------
>  1 files changed, 9 insertions(+), 10 deletions(-)
> 
> diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c
> index ffdc152..81f4cab 100644
> --- a/arch/x86/kernel/apic/io_apic.c
> +++ b/arch/x86/kernel/apic/io_apic.c
> @@ -2295,6 +2295,8 @@ asmlinkage void smp_irq_move_cleanup_interrupt(void)
>  	exit_idle();
>  
>  	me = smp_processor_id();
> +
> +	raw_spin_lock(&vector_lock);
>  	for (vector = FIRST_EXTERNAL_VECTOR; vector < NR_VECTORS; vector++) {
>  		unsigned int irq;
>  		unsigned int irr;
> @@ -2310,17 +2312,16 @@ asmlinkage void smp_irq_move_cleanup_interrupt(void)
>  			continue;
>  
>  		cfg = irq_cfg(irq);
> -		raw_spin_lock(&desc->lock);
>  
>  		/*
>  		 * Check if the irq migration is in progress. If so, we
>  		 * haven't received the cleanup request yet for this irq.
>  		 */
>  		if (cfg->move_in_progress)
> -			goto unlock;
> +			continue;
>  
>  		if (vector == cfg->vector && cpumask_test_cpu(me, cfg->domain))
> -			goto unlock;
> +			continue;
>  
>  		irr = apic_read(APIC_IRR + (vector / 32 * 0x10));
>  		/*
> @@ -2332,12 +2333,11 @@ asmlinkage void smp_irq_move_cleanup_interrupt(void)
>  		 */
>  		if (irr  & (1 << (vector % 32))) {
>  			apic->send_IPI_self(IRQ_MOVE_CLEANUP_VECTOR);
> -			goto unlock;
> +			continue;
>  		}
>  		__this_cpu_write(vector_irq[vector], -1);
> -unlock:
> -		raw_spin_unlock(&desc->lock);
>  	}
> +	raw_spin_unlock(&vector_lock);
>  
>  	irq_exit();
>  }
> @@ -2986,17 +2986,16 @@ unsigned int create_irq_nr(unsigned int from, int node)
>  		return 0;
>  	}
>  
> +	irq_set_chip_data(irq, cfg);
>  	raw_spin_lock_irqsave(&vector_lock, flags);
>  	if (!__assign_irq_vector(irq, cfg, apic->target_cpus()))
>  		ret = irq;
>  	raw_spin_unlock_irqrestore(&vector_lock, flags);
>  
> -	if (ret) {
> -		irq_set_chip_data(irq, cfg);
> +	if (ret)
>  		irq_clear_status_flags(irq, IRQ_NOREQUEST);
> -	} else {
> +	else
>  		free_irq_at(irq, cfg);
> -	}
>  	return ret;
>  }
>  
> 

  reply	other threads:[~2012-05-24  1:40 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-05-21 16:49 [PATCH] x86: check for valid irq_cfg pointer in smp_irq_move_cleanup_interrupt Dimitri Sivanich
2012-05-21 21:05 ` Suresh Siddha
2012-05-21 21:09   ` Dimitri Sivanich
2012-05-21 21:10     ` Suresh Siddha
2012-05-22  2:41       ` Dimitri Sivanich
2012-05-21 21:07 ` Thomas Gleixner
2012-05-21 21:19   ` Dimitri Sivanich
2012-05-21 21:34     ` Thomas Gleixner
2012-05-23 18:16       ` Dimitri Sivanich
2012-05-23 19:04         ` Dimitri Sivanich
2012-05-23 19:24           ` Thomas Gleixner
2012-05-23 19:24           ` Suresh Siddha
2012-05-23 20:02             ` Dimitri Sivanich
2012-05-23 23:49               ` Suresh Siddha
2012-05-24  1:40                 ` Dimitri Sivanich [this message]
2012-05-24 14:37                 ` Dimitri Sivanich
2012-05-24 18:19                   ` Suresh Siddha
2012-05-24 19:16                     ` Thomas Gleixner
2012-05-26  0:23                       ` Suresh Siddha
2012-05-26 10:18                         ` Thomas Gleixner
2012-05-27  1:41                           ` Jiang Liu
2012-05-30 13:46                           ` Dimitri Sivanich
2012-05-24 14:53                 ` Thomas Gleixner
2012-05-24 15:36                   ` Dimitri Sivanich
  -- strict thread matches above, loose matches on Subject: below --
2012-10-16 12:50 Dimitri Sivanich

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20120524014019.GA12196@sgi.com \
    --to=sivanich@sgi.com \
    --cc=hpa@zytor.com \
    --cc=jacob.jun.pan@linux.intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=nagananda.chumbalkar@hp.com \
    --cc=suresh.b.siddha@intel.com \
    --cc=tglx@linutronix.de \
    --cc=x86@kernel.org \
    --cc=yinghai@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.