From: Andrew Morton <akpm@linux-foundation.org>
To: Ingo Molnar <mingo@elte.hu>
Cc: linux-kernel@vger.kernel.org,
	Jesse Barnes <jbarnes@virtuousgeek.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	"Rafael J. Wysocki" <rjw@sisk.pl>
Subject: Re: [patch, -git] pcie hotplug bootup crash fix
Date: Sat, 24 May 2008 10:40:24 -0700
Message-ID: <20080524104024.a33116a3.akpm@linux-foundation.org>
In-Reply-To: <20080524165828.GA29993@elte.hu>

On Sat, 24 May 2008 18:58:28 +0200 Ingo Molnar <mingo@elte.hu> wrote:

> 
> -tip tree testing found that the PCI hotplug ISR routine crashes 
> with a NULL pointer dereference under certain circumstances.
> 
> The situation under which it occurs is hw and timing related: it appears 
> to happen on a system that has PCI hotplug hardware but with no active 
> hotplug cards, and another interrupt in the same (shared) IRQ line 
> arrives too early, before the hotplug-slot entry has been set up - as 
> triggered by CONFIG_DEBUG_SHIRQ=y:
> 
> pciehp: HPC vendor_id 8086 device_id 27d0 ss_vid 0 ss_did 0
> pciehp: pciehp_find_slot: slot (device=0x0) not found
> BUG: unable to handle kernel NULL pointer dereference at 0000000000000070
> IP: [<ffffffff80494a8b>] pciehp_handle_presence_change+0x7e/0x113
> PGD 0
> Oops: 0000 [1]
> CPU 0
> Modules linked in:
> Pid: 1, comm: swapper Tainted: G        W 2.6.26-rc3-sched-devel.git-00001-g2b99b26-dirty #170
> RIP: 0010:[<ffffffff80494a8b>]  [<ffffffff80494a8b>] pciehp_handle_presence_change+0x7e/0x113
> RSP: 0000:ffff81003f83fbb0  EFLAGS: 00010046
> RAX: 0000000000000039 RBX: 0000000000000000 RCX: 0000000000000000
> RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000000046
> RBP: ffff81003f83fbd0 R08: 0000000000000001 R09: ffffffff80245103
> R10: 0000000000000020 R11: 0000000000000000 R12: ffff81003ea53a30
> R13: 0000000000000000 R14: 0000000000000011 R15: ffffffff80495926
> FS:  0000000000000000(0000) GS:ffffffff80be7400(0000) knlGS:0000000000000000
> CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
> CR2: 0000000000000070 CR3: 0000000000201000 CR4: 00000000000006a0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Process swapper (pid: 1, threadinfo ffff81003f83e000, task ffff81003f840000)
> Stack:  0000000000000008 ffff81003f83fbf6 ffff81003ea53a30 0000000000000008
>  ffff81003f83fc10 ffffffff80495ab4 0000000000000011 0000000000000002
>  0000000000000202 0000000000000202 00000000fffffff4 ffff81003ea53a30
> Call Trace:
>  [<ffffffff80495ab4>] pcie_isr+0x18e/0x1bc
>  [<ffffffff80260831>] request_irq+0x106/0x12f
>  [<ffffffff80495fb6>] pcie_init+0x15e/0x6cc
>  [<ffffffff804933a3>] pciehp_probe+0x64/0x541
>  [<ffffffff8048f4e7>] pcie_port_probe_service+0x4c/0x76
>  [<ffffffff8054af70>] driver_probe_device+0xd4/0x1f0
>  [<ffffffff8054b108>] __driver_attach+0x7c/0x7e
>  [<ffffffff8054b08c>] ? __driver_attach+0x0/0x7e
>  [<ffffffff8054a4b6>] bus_for_each_dev+0x53/0x7d
>  [<ffffffff8054ad3c>] driver_attach+0x1c/0x1e
>  [<ffffffff8054a9c2>] bus_add_driver+0xdd/0x25b
>  [<ffffffff80c09d3d>] ? pcied_init+0x0/0x8b
>  [<ffffffff8054b288>] driver_register+0x5f/0x13e
>  [<ffffffff80c09d3d>] ? pcied_init+0x0/0x8b
>  [<ffffffff8048f441>] pcie_port_service_register+0x47/0x49
>  [<ffffffff80c09d52>] pcied_init+0x15/0x8b
>  [<ffffffff80bf3938>] kernel_init+0x75/0x243
>  [<ffffffff808639d2>] ? _spin_unlock_irq+0x2b/0x3a
>  [<ffffffff80228d1f>] ? finish_task_switch+0x57/0x9a
>  [<ffffffff8020c258>] child_rip+0xa/0x12
>  [<ffffffff8020bcec>] ? restore_args+0x0/0x30
>  [<ffffffff80bf38c3>] ? kernel_init+0x0/0x243
>  [<ffffffff8020c24e>] ? child_rip+0x0/0x12
> 
> Code: 83 80 00 00 00 48 39 f0 75 e1 0f b6 c9 48 c7 c2 00 0e 8d 80 48 c7 c6 8a 60 a6 80 48 c7 c7 10 db a8 80 31 c0 e8 3f 8d d9 ff 31 db <48> 8b 43 70 48 8d 75 ef 48 89 df ff 50 30 80 7d ef 00 74 37 48
> RIP  [<ffffffff80494a8b>] pciehp_handle_presence_change+0x7e/0x113
>  RSP <ffff81003f83fbb0>
> CR2: 0000000000000070
> Kernel panic - not syncing: Fatal exception

This looks to me like CONFIG_DEBUG_SHIRQ doing its job.

> the config with which it occurs is:
> 
>   http://redhat.com/~mingo/misc/config-Sat_May_24_18_17_56_CEST_2008.bad
> 
> the fix is to check for NULL slots.
> 
> Signed-off-by: Ingo Molnar <mingo@elte.hu>
> ---
>  drivers/pci/hotplug/pciehp_ctrl.c |    3 +++
>  1 file changed, 3 insertions(+)
> 
> Index: linux/drivers/pci/hotplug/pciehp_ctrl.c
> ===================================================================
> --- linux.orig/drivers/pci/hotplug/pciehp_ctrl.c
> +++ linux/drivers/pci/hotplug/pciehp_ctrl.c
> @@ -118,6 +118,9 @@ u8 pciehp_handle_presence_change(u8 hp_s
>  
>  	p_slot = pciehp_find_slot(ctrl, hp_slot + ctrl->slot_device_offset);
>  
> +	if (!p_slot || !p_slot->hpc_ops)
> +		return 0;
> +
>  	/* Switch is open, assume a presence change
>  	 * Save the presence state
>  	 */

It is fishy that pcie_init() calls pciehp_request_irq() before calling
pcie_init_hardware_part2().  That looks like the classic "let's die
horridly if a shared IRQ comes in at the wrong time" sequence.
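
To make the ordering problem concrete, here is a minimal userspace sketch.
It is not the pciehp code: struct slot, struct controller, fake_isr() and
fake_request_shared_irq() are all hypothetical stand-ins. It assumes that
registering a handler on a shared line may invoke it immediately, which is
what the trace above (pcie_isr directly under request_irq) suggests
CONFIG_DEBUG_SHIRQ does. The would-be NULL dereference is recorded in a
flag rather than performed, so the sketch runs to completion.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are not the real pciehp structures. */
struct slot { int presence_changes; };
struct controller {
    struct slot *slot;       /* NULL until the slot has been set up */
    int would_have_oopsed;   /* set where a real ISR would deref NULL */
};

typedef void (*irq_handler_fn)(void *dev_id);

/* Assumed model of registering on a shared line with a debug check:
 * the handler is called once, right away, as if another device on the
 * same line had raised an interrupt during registration. */
void fake_request_shared_irq(irq_handler_fn handler, void *dev_id)
{
    handler(dev_id);
}

/* Stand-in ISR: records the failure instead of crashing. */
void fake_isr(void *dev_id)
{
    struct controller *ctrl = dev_id;

    if (ctrl->slot == NULL) {
        ctrl->would_have_oopsed = 1;
        return;
    }
    ctrl->slot->presence_changes++;
}

/* Fishy order: the handler is live before the state it uses exists. */
int init_irq_first(void)
{
    struct slot s = { 0 };
    struct controller ctrl = { NULL, 0 };

    fake_request_shared_irq(fake_isr, &ctrl);
    ctrl.slot = &s;
    return ctrl.would_have_oopsed;
}

/* Safer order: finish setting up state, then register the handler. */
int init_state_first(void)
{
    struct slot s = { 0 };
    struct controller ctrl = { NULL, 0 };

    ctrl.slot = &s;
    assert(ctrl.slot != NULL);  /* invariant the ISR depends on */
    fake_request_shared_irq(fake_isr, &ctrl);
    return ctrl.would_have_oopsed;
}
```

Here init_irq_first() returns 1 and init_state_first() returns 0. A NULL
check in the handler hides the first case; reordering the init sequence
removes the window altogether.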

Thread overview: 10+ messages
2008-05-24 16:58 [patch, -git] pcie hotplug bootup crash fix Ingo Molnar
2008-05-24 17:40 ` Andrew Morton [this message]
2008-05-26  8:35   ` Kenji Kaneshige
2008-05-26  8:47     ` Ingo Molnar
2008-05-26  8:52       ` Andrew Morton
2008-05-26  8:58         ` Ingo Molnar
2008-05-26 10:26         ` Kenji Kaneshige
2008-05-26 10:53           ` Ingo Molnar
2008-05-26 16:20           ` Jesse Barnes
2008-05-27 22:45             ` Kristen Carlson Accardi
