From: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com>
To: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Cc: Nish Aravamudan <nish.aravamudan@gmail.com>,
	Pavel Machek <pavel@ucw.cz>,
	kernel list <linux-kernel@vger.kernel.org>,
	linux-ide@vger.kernel.org, ananth@in.ibm.com,
	Andi Kleen <andi@firstfloor.org>
Subject: Re: 2.6.26-git0: IDE oops during boot
Date: Tue, 12 Feb 2008 14:34:19 +0530
Message-ID: <47B16113.1060807@linux.vnet.ibm.com>
In-Reply-To: <200802112035.39462.bzolnier@gmail.com>

Bartlomiej Zolnierkiewicz wrote:
> Hi,
> 
> On Monday 11 February 2008, Kamalesh Babulal wrote:
>> Nish Aravamudan wrote:
>>> On 2/7/08, Bartlomiej Zolnierkiewicz <bzolnier@gmail.com> wrote:
>>>> On Thursday 07 February 2008, Kamalesh Babulal wrote:
>>>>> Bartlomiej Zolnierkiewicz wrote:
>>>>>> Hi,
>>>>>>
>>>>>> On Wednesday 06 February 2008, Pavel Machek wrote:
>>>>>>> On Wed 2008-02-06 11:53:34, Pavel Machek wrote:
>>>>>>>> Hi!
>>>>>>>>
>>>>>>>> Trying to boot 2.6.25-git0 (few days old), I get
>>>>>>>>
>>>>>>>> BUG: unable to handle kernel paging request at ffff..ffb0
>>>>>>>> IP at init_irq+0x42e
>>>>>> init_irq? hmm...
>>>>>>
>>>>>>>> Call trace:
>>>>>>>> ide_device_add_all
>>>>>> this comes from ide-generic
>>>>>> (Generic IDE host driver)
>>>>>>
>>>>>>>> ide_generic_init
>>>>>>>> kernel_init
>>>>>>>> child_rip
>>>>>>>> vgacon_cursor
>>>>>>>> kernel_init
>>>>>>>> child_rip
>>>>>>>>
>>>>>>>> Excerpt from config:
>>>>>>>>
>>>>>>>> CONFIG_IDE=y
>>>>>>>> CONFIG_BLK_DEV_IDE=y
>>>>>>> Disabling CONFIG_IDE made my machine boot, as it was using libata
>>>>>>> anyway.
>>>>>> Kamalesh/Pavel:
>>>>>>
>>>>>> Could you try latest git and see if the OOPS is still there?
>>>>>>
>>>>>> [ Yeah, I'm unable to reproduce it. :( ]
>>>>>>
>>>>>> Thanks,
>>>>>> Bart
>>>>> Hi Bart,
>>>>>
>>>>> The panic is reproducible with the 2.6.24-git16 kernel, the call trace is
>>>>> similar to the previous one
>>>> Thanks, I again reviewed ide-probe.c changes but nothing seems wrong...
>>>>
>>>> Could you please bisect it down to the guilty commit?
>>> Kamalesh, were you able to bisect this down? I just got hit by the
>>> same panic on a 4-way x86_64, with 2.6.24-git22.
>>>
>>> Thanks,
>>> Nish
>> Hi Nish,
>>
>> I tried bisecting and the guilty patch seems to be 
>>
>> 36501650ec45b1db308c3b51886044863be2d762 is first bad commit
>> commit 36501650ec45b1db308c3b51886044863be2d762
>> Author: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
>> Date:   Fri Feb 1 23:09:31 2008 +0100
>>
>>     ide: keep pointer to struct device instead of struct pci_dev in ide_hwif_t
>>
>>
>> the gdb output, also points to the changes made by the guilty patch
>>
>> (gdb) p ide_device_add_all
>> $1 = {int (u8 *, const struct ide_port_info *)} 0xffffffff804176ac <ide_device_add_all>
>> (gdb) p/x 0xffffffff804176ac+0xb60
>> $2 = 0xffffffff8041820c
>> (gdb) l *0xffffffff8041820c
>> 0xffffffff8041820c is in ide_device_add_all (drivers/ide/ide-probe.c:1249).
>> 1244                    goto out;
>> 1245            }
>> 1246
>> 1247            sg_init_table(hwif->sg_table, hwif->sg_max_nents);
>> 1248
>> 1249            if (init_irq(hwif) == 0)
>> 1250                    goto done;
>> 1251
>> 1252            old_irq = hwif->irq;
>> 1253            /*
>> (gdb) 
>>
>>
>> (gdb) p init_irq
>> $1 = {int (ide_hwif_t *)} 0xffffffff8041721f <init_irq>
>> (gdb) p/x 0xffffffff8041721f+0x1a4
>> $2 = 0xffffffff804173c3
>> (gdb) l *0xffffffff804173c3
>> 0xffffffff804173c3 is in init_irq (include/asm/pci.h:101).
>> 96      /* Returns the node based on pci bus */
>> 97      static inline int __pcibus_to_node(struct pci_bus *bus)
>> 98      {
>> 99              struct pci_sysdata *sd = bus->sysdata;
>> 100
>> 101             return sd->node;
>> 102     }
>> 103
>> 104     static inline cpumask_t __pcibus_to_cpumask(struct pci_bus *bus)
>> 105     {
>> (gdb) 
> 
> Thanks for the detailed analysis and sorry for the bug.
> 
> I think that this may have just been fixed by Andi's recent hwif_to_node()
> fix (patch below, it is in Linus' tree already), could you please verify this?
> 
> commit 1f07e988290fc45932f5028c9e2a862c37a57336
> Author: Andi Kleen <andi@firstfloor.org>
> Date:   Mon Feb 11 01:35:20 2008 +0100
> 
>     Prevent IDE boot ops on NUMA system
>     
>     Without this patch a Opteron test system here oopses at boot with
>     current git.
>     
>     Calling to_pci_dev() on a NULL pointer gives a negative value so the
>     following NULL pointer check never triggers and then an illegal address
>     is referenced.  Check the unadjusted original device pointer for NULL
>     instead.
>     
>     Signed-off-by: Andi Kleen <ak@suse.de>
>     Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
> 
> diff --git a/include/linux/ide.h b/include/linux/ide.h
> index 23fad89..a3b69c1 100644
> --- a/include/linux/ide.h
> +++ b/include/linux/ide.h
> @@ -1295,7 +1295,7 @@ static inline void ide_dump_identify(u8 *id)
>  static inline int hwif_to_node(ide_hwif_t *hwif)
>  {
>  	struct pci_dev *dev = to_pci_dev(hwif->dev);
> -	return dev ? pcibus_to_node(dev->bus) : -1;
> +	return hwif->dev ? pcibus_to_node(dev->bus) : -1;
>  }
> 
>  static inline ide_drive_t *ide_get_paired_drive(ide_drive_t *drive)
Hi Bart,
Thanks! The patch fixes the original kernel panic, but with the patch applied
the kernel is not able to mount the root filesystem and panics later in boot.
I am not sure what is causing this second panic.

Creating root device.
Mounting root filesystem.
mount: could not  find filesystem
Kernel panic - not syncing: Attempted to kill init!


-- 
Thanks & Regards,
Kamalesh Babulal,
Linux Technology Center,
IBM, ISTL.
