linux-kernel.vger.kernel.org archive mirror
From: Yinghai Lu <yinghai@kernel.org>
To: Brandon Philips <bphilips@suse.de>
Cc: Ingo Molnar <mingo@redhat.com>, "H. Peter Anvin" <hpa@zytor.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Suresh Siddha <suresh.b.siddha@intel.com>,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH] x86: keep chip_data in create_irq_nr and destroy_irq
Date: Fri, 05 Feb 2010 14:44:57 -0800
Message-ID: <4B6C9F69.4040605@kernel.org>
In-Reply-To: <20100205210926.GE4930@jenkins.home.ifup.org>

On 02/05/2010 01:09 PM, Brandon Philips wrote:
> When two drivers are setting up MSI-X at the same time via
> pci_enable_msix() there is a race.  See this dmesg excerpt:
> 
> [   85.170610] ixgbe 0000:02:00.1: irq 97 for MSI/MSI-X
> [   85.170611]   alloc irq_desc for 99 on node -1
> [   85.170613] igb 0000:08:00.1: irq 98 for MSI/MSI-X
> [   85.170614]   alloc kstat_irqs on node -1
> [   85.170616] alloc irq_2_iommu on node -1
> [   85.170617]   alloc irq_desc for 100 on node -1
> [   85.170619]   alloc kstat_irqs on node -1
> [   85.170621] alloc irq_2_iommu on node -1
> [   85.170625] ixgbe 0000:02:00.1: irq 99 for MSI/MSI-X
> [   85.170626]   alloc irq_desc for 101 on node -1
> [   85.170628] igb 0000:08:00.1: irq 100 for MSI/MSI-X
> [   85.170630]   alloc kstat_irqs on node -1
> [   85.170631] alloc irq_2_iommu on node -1
> [   85.170635]   alloc irq_desc for 102 on node -1
> [   85.170636]   alloc kstat_irqs on node -1
> [   85.170639] alloc irq_2_iommu on node -1
> [   85.170646] BUG: unable to handle kernel NULL pointer dereference
> at 0000000000000088
> 
> As you can see igb and ixgbe are both alternating on create_irq_nr()
> via pci_enable_msix() in their probe function.
> 
> ixgbe: While looping through irq_desc_ptrs[] via create_irq_nr() ixgbe
> chooses irq_desc_ptrs[102] and exits the loop, drops vector_lock and
> calls dynamic_irq_init. Then it sets irq_desc_ptrs[102]->chip_data =
> NULL via dynamic_irq_init().
> 
> igb: Grabs the vector_lock now and starts looping over irq_desc_ptrs[]
> via create_irq_nr(). It gets to irq_desc_ptrs[102] and does this:
> 
> 	cfg_new = irq_desc_ptrs[102]->chip_data;
> 	if (cfg_new->vector != 0)
> 		continue;
> 
> This hits the NULL deref.
> 
> Another possible race exists via pci_disable_msix() in a driver or in
> the number of error paths that call free_msi_irqs():
> 
> destroy_irq()
> dynamic_irq_cleanup() which sets desc->chip_data = NULL
> ...race window...
> desc->chip_data = cfg;
> 
> Remove the save and restore code for cfg in create_irq_nr() and
> destroy_irq() and take the desc->lock when checking the irq_cfg.
> 
> Reported-and-analyzed-by: Brandon Philips <bphilips@suse.de>
> Signed-off-by: Yinghai Lu <yinghai@kernel.org>
> Signed-off-by: Brandon Philips <bphilips@suse.de>
> Cc: stable@kernel.org
> 
> ---
>  arch/x86/kernel/apic/io_apic.c |   14 +++--------
>  include/linux/irq.h            |    2 +
>  kernel/irq/chip.c              |   52 +++++++++++++++++++++++++++++++++--------
>  3 files changed, 49 insertions(+), 19 deletions(-)
> 
> Index: linux-2.6/arch/x86/kernel/apic/io_apic.c
> ===================================================================
> --- linux-2.6.orig/arch/x86/kernel/apic/io_apic.c
> +++ linux-2.6/arch/x86/kernel/apic/io_apic.c
> @@ -3228,12 +3228,9 @@ unsigned int create_irq_nr(unsigned int
>  	}
>  	spin_unlock_irqrestore(&vector_lock, flags);
>  
> -	if (irq > 0) {
> -		dynamic_irq_init(irq);
> -		/* restore it, in case dynamic_irq_init clear it */
> -		if (desc_new)
> -			desc_new->chip_data = cfg_new;
> -	}
> +	if (irq > 0)
> +		dynamic_irq_init_keep_chip_data(irq);
> +
>  	return irq;
>  }
>  
> @@ -3260,10 +3257,7 @@ void destroy_irq(unsigned int irq)
>  
>  	/* store it, in case dynamic_irq_cleanup clear it */
>  	desc = irq_to_desc(irq);
> -	cfg = desc->chip_data;
> -	dynamic_irq_cleanup(irq);
> -	/* connect back irq_cfg */
> -	desc->chip_data = cfg;
> +	dynamic_irq_cleanup_keep_chip_data(irq);
>  
>  	free_irte(irq);
>  	spin_lock_irqsave(&vector_lock, flags);
> Index: linux-2.6/include/linux/irq.h
> ===================================================================
> --- linux-2.6.orig/include/linux/irq.h
> +++ linux-2.6/include/linux/irq.h
> @@ -400,7 +400,9 @@ static inline int irq_has_action(unsigne
>  
>  /* Dynamic irq helper functions */
>  extern void dynamic_irq_init(unsigned int irq);
> +extern void dynamic_irq_init_keep_chip_data(unsigned int irq);
>  extern void dynamic_irq_cleanup(unsigned int irq);
> +extern void dynamic_irq_cleanup_keep_chip_data(unsigned int irq);
>  
>  /* Set/get chip/data for an IRQ: */
>  extern int set_irq_chip(unsigned int irq, struct irq_chip *chip);
> Index: linux-2.6/kernel/irq/chip.c
> ===================================================================
> --- linux-2.6.orig/kernel/irq/chip.c
> +++ linux-2.6/kernel/irq/chip.c
> @@ -18,11 +18,7 @@
>  
>  #include "internals.h"
>  
> -/**
> - *	dynamic_irq_init - initialize a dynamically allocated irq
> - *	@irq:	irq number to initialize
> - */
> -void dynamic_irq_init(unsigned int irq)
> +static void dynamic_irq_init_x(unsigned int irq, bool keep_chip_data)
>  {
>  	struct irq_desc *desc;
>  	unsigned long flags;
> @@ -41,7 +37,8 @@ void dynamic_irq_init(unsigned int irq)
>  	desc->depth = 1;
>  	desc->msi_desc = NULL;
>  	desc->handler_data = NULL;
> -	desc->chip_data = NULL;
> +	if (!keep_chip_data)
> +		desc->chip_data = NULL;
>  	desc->action = NULL;
>  	desc->irq_count = 0;
>  	desc->irqs_unhandled = 0;
> @@ -55,10 +52,26 @@ void dynamic_irq_init(unsigned int irq)
>  }
>  
>  /**
> - *	dynamic_irq_cleanup - cleanup a dynamically allocated irq
> + *	dynamic_irq_init - initialize a dynamically allocated irq
>   *	@irq:	irq number to initialize
>   */
> -void dynamic_irq_cleanup(unsigned int irq)
> +void dynamic_irq_init(unsigned int irq)
> +{
> +	dynamic_irq_init_x(irq, false);
> +}
> +
> +/**
> + *	dynamic_irq_init_keep_chip_data - initialize a dynamically allocated irq
> + *	@irq:	irq number to initialize
> + *
> + * 	does not set irq_to_desc(irq)->chip_data to NULL
> + */
> +void dynamic_irq_init_keep_chip_data(unsigned int irq)
> +{
> +	dynamic_irq_init_x(irq, true);
> +}
> +
> +static void dynamic_irq_cleanup_x(unsigned int irq, bool keep_chip_data)
>  {
>  	struct irq_desc *desc = irq_to_desc(irq);
>  	unsigned long flags;
> @@ -77,7 +90,8 @@ void dynamic_irq_cleanup(unsigned int ir
>  	}
>  	desc->msi_desc = NULL;
>  	desc->handler_data = NULL;
> -	desc->chip_data = NULL;
> +	if (!keep_chip_data)
> +		desc->chip_data = NULL;
>  	desc->handle_irq = handle_bad_irq;
>  	desc->chip = &no_irq_chip;
>  	desc->name = NULL;
> @@ -85,6 +99,26 @@ void dynamic_irq_cleanup(unsigned int ir
>  	raw_spin_unlock_irqrestore(&desc->lock, flags);
>  }
>  
> +/**
> + *	dynamic_irq_cleanup - cleanup a dynamically allocated irq
> + *	@irq:	irq number to initialize
> + */
> +void dynamic_irq_cleanup(unsigned int irq)
> +{
> +	dynamic_irq_init_x(irq, false);

This should call dynamic_irq_cleanup_x() here, not dynamic_irq_init_x().

> +}
> +
> +/**
> + *	dynamic_irq_cleanup_keep_chip_data - cleanup a dynamically allocated irq
> + *	@irq:	irq number to initialize
> + *
> + * 	does not set irq_to_desc(irq)->chip_data to NULL
> + */
> +void dynamic_irq_cleanup_keep_chip_data(unsigned int irq)
> +{
> +	dynamic_irq_init_x(irq, true);

This should also call dynamic_irq_cleanup_x(), not dynamic_irq_init_x().

> +}
> +
>  
>  /**
>   *	set_irq_chip - set the irq chip for an irq
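
To make the intended pairing explicit, here is a minimal userspace sketch of how the four wrappers should map onto the two helpers. It is only an illustration under stated assumptions, not the kernel code: struct fake_desc, the fake_ prefix and sketch_selftest() are hypothetical names, the descriptor is reduced to a few fields, and the desc->lock handling is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct irq_desc; only the fields needed here. */
struct fake_desc {
	unsigned int depth;
	void *handler_data;
	void *chip_data;
	const char *name;
};

/* Models dynamic_irq_init_x(): reset state, optionally keeping chip_data. */
static void fake_irq_init_x(struct fake_desc *desc, bool keep_chip_data)
{
	desc->depth = 1;
	desc->handler_data = NULL;
	if (!keep_chip_data)
		desc->chip_data = NULL;
}

/* Models dynamic_irq_cleanup_x(): tear down, optionally keeping chip_data. */
static void fake_irq_cleanup_x(struct fake_desc *desc, bool keep_chip_data)
{
	desc->handler_data = NULL;
	if (!keep_chip_data)
		desc->chip_data = NULL;
	desc->name = NULL;
}

/* The init wrappers call the init helper ... */
void fake_irq_init(struct fake_desc *desc)
{
	fake_irq_init_x(desc, false);
}

void fake_irq_init_keep_chip_data(struct fake_desc *desc)
{
	fake_irq_init_x(desc, true);
}

/* ... and the cleanup wrappers must call the cleanup helper. */
void fake_irq_cleanup(struct fake_desc *desc)
{
	fake_irq_cleanup_x(desc, false);
}

void fake_irq_cleanup_keep_chip_data(struct fake_desc *desc)
{
	fake_irq_cleanup_x(desc, true);
}

/* Checks that cleanup really runs the cleanup path and honours the flag. */
void sketch_selftest(void)
{
	int cfg = 0;
	struct fake_desc desc = { 0, NULL, &cfg, "msi" };

	fake_irq_cleanup_keep_chip_data(&desc);
	assert(desc.chip_data == &cfg);	/* chip_data preserved */
	assert(desc.name == NULL);	/* cleanup path ran */
	assert(desc.depth == 0);	/* init path did not run */

	fake_irq_cleanup(&desc);
	assert(desc.chip_data == NULL);	/* plain cleanup clears it */
}
```

With the wrappers as posted, both cleanup variants would run the init path instead, so desc->name (and, in the real code, handle_irq and chip) would never be reset on destroy_irq().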

YH
