linux-ide.vger.kernel.org archive mirror
* Re: 2.6.20-mm2
       [not found] <20070217215146.30e7ffa3.akpm@linux-foundation.org>
@ 2007-02-18 12:44 ` Rafael J. Wysocki
  2007-02-18 19:43   ` 2.6.20-mm2 Andrew Morton
  0 siblings, 1 reply; 14+ messages in thread
From: Rafael J. Wysocki @ 2007-02-18 12:44 UTC
  To: Andrew Morton
  Cc: linux-kernel, Neil Brown, Jeff Garzik, linux-ide, Jens Axboe

On Sunday, 18 February 2007 06:51, Andrew Morton wrote:
> 
> Temporarily at
> 
>   http://userweb.kernel.org/~akpm/2.6.20-mm2/
> 
> Will appear later at
> 
>  ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.20/2.6.20-mm2/

Two problems:

1) A showstopper with the root partition on RAID1:

md: raid1 personality registered for level 1
[--snip--]
md: multipath personality registered for level -4
register_blkdev: failed to get major for mdp
[--snip--]
VFS: Cannot open root device "md1" or unknown-block(0,0)

At the moment I have no serial console attached to the box, so I had to rewrite
the messages manually.

2) On HPC nx6325 I get the following 100% of the time during the resume from
disk:

BUG: at drivers/pci/pci.c:823 pcim_enable_device()

Call Trace:
 [<ffffffff80325ff8>] pcim_enable_device+0x93/0xb3
 [<ffffffff803a974a>] ata_pci_device_do_resume+0x21/0x5e
 [<ffffffff803b5e6c>] sil_pci_device_resume+0x1c/0x51
 [<ffffffff8032800d>] pci_device_resume+0x22/0x53
 [<ffffffff8039ae58>] resume_device+0xca/0x131
 [<ffffffff8039af40>] dpm_resume+0x81/0xd3
 [<ffffffff8039afc2>] device_resume+0x30/0x45
 [<ffffffff802a0792>] snapshot_ioctl+0x245/0x63e
 [<ffffffff8023cfcc>] do_ioctl+0x5e/0x77
 [<ffffffff8022d2b3>] vfs_ioctl+0x25c/0x279
 [<ffffffff80246a80>] sys_ioctl+0x5f/0x82
 [<ffffffff80215586>] sys_write+0x47/0x70
 [<ffffffff8025711e>] system_call+0x7e/0x83

Nevertheless, the system seems to be fully functional after the resume.

[I've been observing it since 2.6.20-git10 and have reported it a couple
of times, but apparently nobody cares. :-(]

Greetings,
Rafael


* Re: 2.6.20-mm2
  2007-02-18 12:44 ` 2.6.20-mm2 Rafael J. Wysocki
@ 2007-02-18 19:43   ` Andrew Morton
  2007-02-18 23:25     ` 2.6.20-mm2 Rafael J. Wysocki
  2007-02-20  1:20     ` 2.6.20-mm2 Rafael J. Wysocki
  0 siblings, 2 replies; 14+ messages in thread
From: Andrew Morton @ 2007-02-18 19:43 UTC
  To: Rafael J. Wysocki
  Cc: linux-kernel, Neil Brown, Jeff Garzik, linux-ide, Jens Axboe

On Sun, 18 Feb 2007 13:44:54 +0100 "Rafael J. Wysocki" <rjw@sisk.pl> wrote:

> On Sunday, 18 February 2007 06:51, Andrew Morton wrote:
> > 
> > Temporarily at
> > 
> >   http://userweb.kernel.org/~akpm/2.6.20-mm2/
> > 
> > Will appear later at
> > 
> >  ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.20/2.6.20-mm2/
> 
> Two problems:
> 
> 1) A showstopper with the root partition on RAID1:
> 
> md: raid1 personality registered for level 1
> [--snip--]
> md: multipath personality registered for level -4
> register_blkdev: failed to get major for mdp
> [--snip--]
> VFS: Cannot open root device "md1" or unknown-block(0,0)

Someone else reported that against mainline.  Can you please debug it a bit?

I'd suggested reverting the recent changes in there:

--- a/block/genhd.c~a
+++ a/block/genhd.c
@@ -61,14 +61,6 @@ int register_blkdev(unsigned int major, 
 	/* temporary */
 	if (major == 0) {
 		for (index = ARRAY_SIZE(major_names)-1; index > 0; index--) {
-			/*
-			 * Disallow the LANANA-assigned LOCAL/EXPERIMENTAL
-			 * majors
-			 */
-			if ((60 <= index && index <= 63) ||
-					(120 <= index && index <= 127) ||
-					(240 <= index && index <= 254))
-				continue;
 			if (major_names[index] == NULL)
 				break;
 		}
_

but I don't see how they could cause this.
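
[Editorial sketch, not part of the original message.] The hunk above removes a
check from the dynamic-major search in register_blkdev(). As a rough userspace
model of that loop, the search with and without the LANANA check might look
like the code below. The table size NR_SLOTS, the in_use array and both
function names are assumptions made for this illustration; this is not the
kernel's code, and the real major_names table is organised differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_SLOTS 255 /* assumed table size for this sketch */

/* True if index falls in a LANANA LOCAL/EXPERIMENTAL range from the diff. */
static bool is_reserved_major(int index)
{
	return (60 <= index && index <= 63) ||
	       (120 <= index && index <= 127) ||
	       (240 <= index && index <= 254);
}

/*
 * Scan from the top of the table downwards for a free slot.
 * in_use[i] is true when slot i is already taken. With skip_reserved
 * set, slots in the reserved ranges are never returned.
 * Returns 0 when no permitted slot is free.
 */
static int pick_dynamic_major(const bool *in_use, bool skip_reserved)
{
	int index;

	assert(in_use != NULL);
	for (index = NR_SLOTS - 1; index > 0; index--) {
		if (skip_reserved && is_reserved_major(index))
			continue;
		if (!in_use[index])
			break;
	}
	return index;
}
```

In this model an otherwise empty table yields 254 without the check and 239
with it. The search returns 0 (presumably the case reported as "failed to get
major") only when every permitted slot is taken, which is in line with the
remark that the check alone does not obviously explain the failure.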


> At the moment I have no serial console attached to the box, so I had to rewrite
> the messages manually.

netconsole is good.

> 2) On HPC nx6325 I get the following 100% of the time during the resume from
> disk:
> 
> BUG: at drivers/pci/pci.c:823 pcim_enable_device()
> 
> Call Trace:
>  [<ffffffff80325ff8>] pcim_enable_device+0x93/0xb3
>  [<ffffffff803a974a>] ata_pci_device_do_resume+0x21/0x5e
>  [<ffffffff803b5e6c>] sil_pci_device_resume+0x1c/0x51
>  [<ffffffff8032800d>] pci_device_resume+0x22/0x53
>  [<ffffffff8039ae58>] resume_device+0xca/0x131
>  [<ffffffff8039af40>] dpm_resume+0x81/0xd3
>  [<ffffffff8039afc2>] device_resume+0x30/0x45
>  [<ffffffff802a0792>] snapshot_ioctl+0x245/0x63e
>  [<ffffffff8023cfcc>] do_ioctl+0x5e/0x77
>  [<ffffffff8022d2b3>] vfs_ioctl+0x25c/0x279
>  [<ffffffff80246a80>] sys_ioctl+0x5f/0x82
>  [<ffffffff80215586>] sys_write+0x47/0x70
>  [<ffffffff8025711e>] system_call+0x7e/0x83
> 
> Nevertheless, the system seems to be fully functional after the resume.
> 
> [I've been observing it since 2.6.20-git10 and have reported it a couple
> of times, but apparently nobody cares. :-(]

This is a Tejun thing - apparently it's due to swsusp calling suspend once
and resume twice (or is it vice versa).  He'll be looking into it soon.
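
[Editorial sketch, not part of the original message.] As a toy illustration of
why an unbalanced suspend/resume sequence could trip a check like the one
reported at pci.c:823, consider the model below. It assumes, without
confirmation from this thread, that the check fires when a device is enabled
while it is already marked enabled. struct toy_dev and the toy_* functions are
hypothetical and are not the kernel's API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for per-device state; not a kernel structure. */
struct toy_dev {
	bool enabled;
	int warnings; /* times enable was called while already enabled */
};

/* Mark the device enabled, counting a warning on a double enable. */
static void toy_enable(struct toy_dev *dev)
{
	assert(dev != NULL);
	if (dev->enabled)
		dev->warnings++; /* where a warning with a backtrace would fire */
	dev->enabled = true;
}

/* Suspend clears the enabled flag. */
static void toy_suspend(struct toy_dev *dev)
{
	assert(dev != NULL);
	dev->enabled = false;
}

/* Resume re-enables the device. */
static void toy_resume(struct toy_dev *dev)
{
	toy_enable(dev);
}
```

With one suspend followed by one resume the counter stays at zero; a second
resume without an intervening suspend increments it, while the device itself
remains usable, much like the report that the system works after the message.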


* Re: 2.6.20-mm2
  2007-02-18 19:43   ` 2.6.20-mm2 Andrew Morton
@ 2007-02-18 23:25     ` Rafael J. Wysocki
  2007-02-18 23:39       ` 2.6.20-mm2 Michal Piotrowski
  2007-02-19  0:00       ` 2.6.20-mm2 Andrew Morton
  2007-02-20  1:20     ` 2.6.20-mm2 Rafael J. Wysocki
  1 sibling, 2 replies; 14+ messages in thread
From: Rafael J. Wysocki @ 2007-02-18 23:25 UTC
  To: Andrew Morton
  Cc: linux-kernel, Neil Brown, Jeff Garzik, linux-ide, Jens Axboe

On Sunday, 18 February 2007 20:43, Andrew Morton wrote:
> On Sun, 18 Feb 2007 13:44:54 +0100 "Rafael J. Wysocki" <rjw@sisk.pl> wrote:
> 
> > On Sunday, 18 February 2007 06:51, Andrew Morton wrote:
> > > 
> > > Temporarily at
> > > 
> > >   http://userweb.kernel.org/~akpm/2.6.20-mm2/
> > > 
> > > Will appear later at
> > > 
> > >  ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.20/2.6.20-mm2/
> > 
> > Two problems:
> > 
> > 1) A showstopper with the root partition on RAID1:
> > 
> > md: raid1 personality registered for level 1
> > [--snip--]
> > md: multipath personality registered for level -4
> > register_blkdev: failed to get major for mdp
> > [--snip--]
> > VFS: Cannot open root device "md1" or unknown-block(0,0)
> 
> Someone else reported that against mainline.  Can you please debug it a bit?

Sure, tomorrow I will.

> I'd suggested reverting the recent changes in there:
> 
> --- a/block/genhd.c~a
> +++ a/block/genhd.c
> @@ -61,14 +61,6 @@ int register_blkdev(unsigned int major, 
>  	/* temporary */
>  	if (major == 0) {
>  		for (index = ARRAY_SIZE(major_names)-1; index > 0; index--) {
> -			/*
> -			 * Disallow the LANANA-assigned LOCAL/EXPERIMENTAL
> -			 * majors
> -			 */
> -			if ((60 <= index && index <= 63) ||
> -					(120 <= index && index <= 127) ||
> -					(240 <= index && index <= 254))
> -				continue;
>  			if (major_names[index] == NULL)
>  				break;
>  		}
> _
> 
> but I don't see how they could cause this.
> 
> 
> > At the moment I have no serial console attached to the box, so I had to rewrite
> > the messages manually.
> 
> netconsole is good.

I know. :-)

In the meantime, I've got something worse on another x86_64 box:

Asus Laptop ACPI Extras version 0.30
  L5D model detected, supported
audit(1171831698.918:2): audit_pid=4281 old=0 by auid=4294967295
general protection fault: 0000 [2] PREEMPT
last sysfs file: /class/net/eth2/carrier
CPU 0
Modules linked in: af_packet ipv6 snd_pcm_oss snd_mixer_oss snd_seq snd_seq_device asus_acpi backlight button battery ac dm_mod pcmr
Pid: 178, comm: pdflush Not tainted 2.6.20-mm2 #1
RIP: 0010:[<ffffffff8034bce4>]  [<ffffffff8034bce4>] __make_request+0x134/0x370
RSP: 0000:ffff81005ed659a0  EFLAGS: 00010297
RAX: 00000000ffffffff RBX: 6b6b6b6b6b6b6b6b RCX: 000000000203396a
RDX: 0000000100000000 RSI: ffff810037b4dbb0 RDI: ffff81004683d8c0
RBP: ffff81005ed659f0 R08: ffff81004683d070 R09: ffff81003d333cc0
R10: 0000000000000000 R11: 0000000000000000 R12: ffff810037b4dbb0
R13: ffff81005daba3f0 R14: ffff810037daca90 R15: ffff81005daba3d0
FS:  00002ad4a29e6d00(0000) GS:ffffffff805db000(0000) knlGS:0000000000000000
CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 00002b6a345aa000 CR3: 0000000056585000 CR4: 00000000000006e0
Process pdflush (pid: 178, threadinfo ffff81005ed64000, task ffff810037b060c0)
Stack:  ffff810002852540 0000000000000001 ffff810037b4dbb0 ffffffff8026be21
 ffff81005ed65a40 0000000000000008 ffff810037b4dbb0 0000000000000800
 0000000000000008 ffff8100021d94e0 ffff81005ed65a40 ffffffff80348e7c
Call Trace:
 [<ffffffff8026be21>] mempool_alloc_slab+0x11/0x20
 [<ffffffff80348e7c>] generic_make_request+0x1ec/0x230
 [<ffffffff8034b7e6>] submit_bio+0xf6/0x110
 [<ffffffff802b60f0>] submit_bh+0x100/0x130
 [<ffffffff802b788a>] __block_write_full_page+0x1ca/0x2e0
 [<ffffffff802bc040>] blkdev_get_block+0x0/0x70
 [<ffffffff802bc040>] blkdev_get_block+0x0/0x70
 [<ffffffff802b7a93>] block_write_full_page+0xf3/0x110
 [<ffffffff802baeb3>] blkdev_writepage+0x13/0x20
 [<ffffffff8026eb85>] __writepage+0x15/0x40
 [<ffffffff8026f1e3>] write_cache_pages+0x1f3/0x360
 [<ffffffff8026eb70>] __writepage+0x0/0x40
 [<ffffffff8026f372>] generic_writepages+0x22/0x30
 [<ffffffff8026f3c6>] do_writepages+0x46/0x80
 [<ffffffff802b1f67>] __writeback_single_inode+0x1d7/0x370
 [<ffffffff802b2355>] generic_sync_sb_inodes+0x35/0x2b0
 [<ffffffff802b24f9>] generic_sync_sb_inodes+0x1d9/0x2b0
 [<ffffffff802b29f2>] writeback_inodes+0x82/0x100
 [<ffffffff802b25f5>] sync_sb_inodes+0x25/0x30
 [<ffffffff802b2a08>] writeback_inodes+0x98/0x100
 [<ffffffff8026fd40>] pdflush+0x0/0x1e0
 [<ffffffff8026f934>] wb_kupdate+0x94/0x110
 [<ffffffff8026fe68>] pdflush+0x128/0x1e0
 [<ffffffff8026f8a0>] wb_kupdate+0x0/0x110
 [<ffffffff8026fd40>] pdflush+0x0/0x1e0
 [<ffffffff80240863>] kthread+0xd3/0x110
 [<ffffffff80240700>] keventd_create_kthread+0x0/0x90
 [<ffffffff8020a3f8>] child_rip+0xa/0x12
 [<ffffffff80483e5b>] _spin_unlock_irq+0x2b/0x60
 [<ffffffff80209fb0>] restore_args+0x0/0x30
 [<ffffffff80240790>] kthread+0x0/0x110
 [<ffffffff8020a3ee>] child_rip+0x0/0x12


Code: 48 8b 43 08 0f 18 08 49 39 dd 75 a2 49 8b be 38 02 00 00 e8
RIP  [<ffffffff8034bce4>] __make_request+0x134/0x370
 RSP <ffff81005ed659a0>
PM: Adding info for No Bus:vcs10
PM: Adding info for No Bus:vcsa10

It looks _really_ bad to me. :-(


> > 2) On HPC nx6325 I get the following 100% of the time during the resume from
> > disk:
> > 
> > BUG: at drivers/pci/pci.c:823 pcim_enable_device()
> > 
> > Call Trace:
> >  [<ffffffff80325ff8>] pcim_enable_device+0x93/0xb3
> >  [<ffffffff803a974a>] ata_pci_device_do_resume+0x21/0x5e
> >  [<ffffffff803b5e6c>] sil_pci_device_resume+0x1c/0x51
> >  [<ffffffff8032800d>] pci_device_resume+0x22/0x53
> >  [<ffffffff8039ae58>] resume_device+0xca/0x131
> >  [<ffffffff8039af40>] dpm_resume+0x81/0xd3
> >  [<ffffffff8039afc2>] device_resume+0x30/0x45
> >  [<ffffffff802a0792>] snapshot_ioctl+0x245/0x63e
> >  [<ffffffff8023cfcc>] do_ioctl+0x5e/0x77
> >  [<ffffffff8022d2b3>] vfs_ioctl+0x25c/0x279
> >  [<ffffffff80246a80>] sys_ioctl+0x5f/0x82
> >  [<ffffffff80215586>] sys_write+0x47/0x70
> >  [<ffffffff8025711e>] system_call+0x7e/0x83
> > 
> > Nevertheless, the system seems to be fully functional after the resume.
> > 
> > [I've been observing it since 2.6.20-git10 and have reported it a couple
> > of times, but apparently nobody cares. :-(]
> 
> This is a Tejun thing - apparently it's due to swsusp calling suspend once
> and resume twice (or is it vice versa).  He'll be looking into it soon.

OK


* Re: 2.6.20-mm2
  2007-02-18 23:25     ` 2.6.20-mm2 Rafael J. Wysocki
@ 2007-02-18 23:39       ` Michal Piotrowski
  2007-02-19  0:00       ` 2.6.20-mm2 Andrew Morton
  1 sibling, 0 replies; 14+ messages in thread
From: Michal Piotrowski @ 2007-02-18 23:39 UTC
  To: Rafael J. Wysocki
  Cc: Andrew Morton, linux-kernel, Neil Brown, Jeff Garzik, linux-ide,
	Jens Axboe

On 19/02/07, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> On Sunday, 18 February 2007 20:43, Andrew Morton wrote:
> > On Sun, 18 Feb 2007 13:44:54 +0100 "Rafael J. Wysocki" <rjw@sisk.pl> wrote:
> >
> > > On Sunday, 18 February 2007 06:51, Andrew Morton wrote:
> > > >
> > > > Temporarily at
> > > >
> > > >   http://userweb.kernel.org/~akpm/2.6.20-mm2/
> > > >
> > > > Will appear later at
> > > >
> > > >  ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.20/2.6.20-mm2/
> > >
> > > Two problems:
> > >
> > > 1) A showstopper with the root partition on RAID1:
> > >
> > > md: raid1 personality registered for level 1
> > > [--snip--]
> > > md: multipath personality registered for level -4
> > > register_blkdev: failed to get major for mdp
> > > [--snip--]
> > > VFS: Cannot open root device "md1" or unknown-block(0,0)
> >
> > Someone else reported that against mainline.  Can you please debug it a bit?
>
> Sure, tomorrow I will.
>
> > I'd suggested reverting the recent changes in there:
> >
> > --- a/block/genhd.c~a
> > +++ a/block/genhd.c
> > @@ -61,14 +61,6 @@ int register_blkdev(unsigned int major,
> >       /* temporary */
> >       if (major == 0) {
> >               for (index = ARRAY_SIZE(major_names)-1; index > 0; index--) {
> > -                     /*
> > -                      * Disallow the LANANA-assigned LOCAL/EXPERIMENTAL
> > -                      * majors
> > -                      */
> > -                     if ((60 <= index && index <= 63) ||
> > -                                     (120 <= index && index <= 127) ||
> > -                                     (240 <= index && index <= 254))
> > -                             continue;
> >                       if (major_names[index] == NULL)
> >                               break;
> >               }
> > _
> >
> > but I don't see how they could cause this.
> >
> >
> > > At the moment I have no serial console attached to the box, so I had to rewrite
> > > the messages manually.
> >
> > netconsole is good.
>
> I know. :-)
>
> In the meantime, I've got something worse on another x86_64 box:
>
> Asus Laptop ACPI Extras version 0.30
>   L5D model detected, supported
> audit(1171831698.918:2): audit_pid=4281 old=0 by auid=4294967295
> general protection fault: 0000 [2] PREEMPT
> last sysfs file: /class/net/eth2/carrier
> CPU 0
> Modules linked in: af_packet ipv6 snd_pcm_oss snd_mixer_oss snd_seq snd_seq_device asus_acpi backlight button battery ac dm_mod pcmr
> Pid: 178, comm: pdflush Not tainted 2.6.20-mm2 #1
> RIP: 0010:[<ffffffff8034bce4>]  [<ffffffff8034bce4>] __make_request+0x134/0x370
> RSP: 0000:ffff81005ed659a0  EFLAGS: 00010297
> RAX: 00000000ffffffff RBX: 6b6b6b6b6b6b6b6b RCX: 000000000203396a
> RDX: 0000000100000000 RSI: ffff810037b4dbb0 RDI: ffff81004683d8c0
> RBP: ffff81005ed659f0 R08: ffff81004683d070 R09: ffff81003d333cc0
> R10: 0000000000000000 R11: 0000000000000000 R12: ffff810037b4dbb0
> R13: ffff81005daba3f0 R14: ffff810037daca90 R15: ffff81005daba3d0
> FS:  00002ad4a29e6d00(0000) GS:ffffffff805db000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
> CR2: 00002b6a345aa000 CR3: 0000000056585000 CR4: 00000000000006e0
> Process pdflush (pid: 178, threadinfo ffff81005ed64000, task ffff810037b060c0)
> Stack:  ffff810002852540 0000000000000001 ffff810037b4dbb0 ffffffff8026be21
>  ffff81005ed65a40 0000000000000008 ffff810037b4dbb0 0000000000000800
>  0000000000000008 ffff8100021d94e0 ffff81005ed65a40 ffffffff80348e7c
> Call Trace:
>  [<ffffffff8026be21>] mempool_alloc_slab+0x11/0x20
>  [<ffffffff80348e7c>] generic_make_request+0x1ec/0x230
>  [<ffffffff8034b7e6>] submit_bio+0xf6/0x110
>  [<ffffffff802b60f0>] submit_bh+0x100/0x130
>  [<ffffffff802b788a>] __block_write_full_page+0x1ca/0x2e0
>  [<ffffffff802bc040>] blkdev_get_block+0x0/0x70
>  [<ffffffff802bc040>] blkdev_get_block+0x0/0x70
>  [<ffffffff802b7a93>] block_write_full_page+0xf3/0x110
>  [<ffffffff802baeb3>] blkdev_writepage+0x13/0x20
>  [<ffffffff8026eb85>] __writepage+0x15/0x40
>  [<ffffffff8026f1e3>] write_cache_pages+0x1f3/0x360
>  [<ffffffff8026eb70>] __writepage+0x0/0x40
>  [<ffffffff8026f372>] generic_writepages+0x22/0x30
>  [<ffffffff8026f3c6>] do_writepages+0x46/0x80
>  [<ffffffff802b1f67>] __writeback_single_inode+0x1d7/0x370
>  [<ffffffff802b2355>] generic_sync_sb_inodes+0x35/0x2b0
>  [<ffffffff802b24f9>] generic_sync_sb_inodes+0x1d9/0x2b0
>  [<ffffffff802b29f2>] writeback_inodes+0x82/0x100
>  [<ffffffff802b25f5>] sync_sb_inodes+0x25/0x30
>  [<ffffffff802b2a08>] writeback_inodes+0x98/0x100
>  [<ffffffff8026fd40>] pdflush+0x0/0x1e0
>  [<ffffffff8026f934>] wb_kupdate+0x94/0x110
>  [<ffffffff8026fe68>] pdflush+0x128/0x1e0
>  [<ffffffff8026f8a0>] wb_kupdate+0x0/0x110
>  [<ffffffff8026fd40>] pdflush+0x0/0x1e0
>  [<ffffffff80240863>] kthread+0xd3/0x110
>  [<ffffffff80240700>] keventd_create_kthread+0x0/0x90
>  [<ffffffff8020a3f8>] child_rip+0xa/0x12
>  [<ffffffff80483e5b>] _spin_unlock_irq+0x2b/0x60
>  [<ffffffff80209fb0>] restore_args+0x0/0x30
>  [<ffffffff80240790>] kthread+0x0/0x110
>  [<ffffffff8020a3ee>] child_rip+0x0/0x12
>
>
> Code: 48 8b 43 08 0f 18 08 49 39 dd 75 a2 49 8b be 38 02 00 00 e8
> RIP  [<ffffffff8034bce4>] __make_request+0x134/0x370
>  RSP <ffff81005ed659a0>
> PM: Adding info for No Bus:vcs10
> PM: Adding info for No Bus:vcsa10
>
> It looks _really_ bad to me. :-(
>

It looks familiar to me

http://www.ussg.iu.edu/hypermail/linux/kernel/0702.2/0646.html
http://www.ussg.iu.edu/hypermail/linux/kernel/0702.2/0821.html

Regards,
Michal

-- 
Michal K. K. Piotrowski
LTG - Linux Testers Group (PL)
(http://www.stardust.webpages.pl/ltg/)
LTG - Linux Testers Group (EN)
(http://www.stardust.webpages.pl/linux_testers_group_en/)


* Re: 2.6.20-mm2
  2007-02-18 23:25     ` 2.6.20-mm2 Rafael J. Wysocki
  2007-02-18 23:39       ` 2.6.20-mm2 Michal Piotrowski
@ 2007-02-19  0:00       ` Andrew Morton
  2007-02-19 11:28         ` 2.6.20-mm2 Rafael J. Wysocki
  2007-02-20  0:43         ` 2.6.20-mm2 Rafael J. Wysocki
  1 sibling, 2 replies; 14+ messages in thread
From: Andrew Morton @ 2007-02-19  0:00 UTC
  To: Rafael J. Wysocki
  Cc: linux-kernel, Neil Brown, Jeff Garzik, linux-ide, Jens Axboe

On Mon, 19 Feb 2007 00:25:48 +0100 "Rafael J. Wysocki" <rjw@sisk.pl> wrote:

> > netconsole is good.
> 
> I know. :-)
> 
> In the meantime, I've got something worse on another x86_64 box:
> 
> Asus Laptop ACPI Extras version 0.30
>   L5D model detected, supported
> audit(1171831698.918:2): audit_pid=4281 old=0 by auid=4294967295
> general protection fault: 0000 [2] PREEMPT
> last sysfs file: /class/net/eth2/carrier
> CPU 0
> Modules linked in: af_packet ipv6 snd_pcm_oss snd_mixer_oss snd_seq snd_seq_device asus_acpi backlight button battery ac dm_mod pcmr
> Pid: 178, comm: pdflush Not tainted 2.6.20-mm2 #1
> RIP: 0010:[<ffffffff8034bce4>]  [<ffffffff8034bce4>] __make_request+0x134/0x370
> RSP: 0000:ffff81005ed659a0  EFLAGS: 00010297
> RAX: 00000000ffffffff RBX: 6b6b6b6b6b6b6b6b RCX: 000000000203396a
> RDX: 0000000100000000 RSI: ffff810037b4dbb0 RDI: ffff81004683d8c0
> RBP: ffff81005ed659f0 R08: ffff81004683d070 R09: ffff81003d333cc0
> R10: 0000000000000000 R11: 0000000000000000 R12: ffff810037b4dbb0
> R13: ffff81005daba3f0 R14: ffff810037daca90 R15: ffff81005daba3d0
> FS:  00002ad4a29e6d00(0000) GS:ffffffff805db000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
> CR2: 00002b6a345aa000 CR3: 0000000056585000 CR4: 00000000000006e0
> Process pdflush (pid: 178, threadinfo ffff81005ed64000, task ffff810037b060c0)
> Stack:  ffff810002852540 0000000000000001 ffff810037b4dbb0 ffffffff8026be21
>  ffff81005ed65a40 0000000000000008 ffff810037b4dbb0 0000000000000800
>  0000000000000008 ffff8100021d94e0 ffff81005ed65a40 ffffffff80348e7c
> Call Trace:
>  [<ffffffff8026be21>] mempool_alloc_slab+0x11/0x20
>  [<ffffffff80348e7c>] generic_make_request+0x1ec/0x230

yeah. everyone except me is hitting that.


* Re: 2.6.20-mm2
  2007-02-19  0:00       ` 2.6.20-mm2 Andrew Morton
@ 2007-02-19 11:28         ` Rafael J. Wysocki
  2007-02-19 11:45           ` 2.6.20-mm2 Michal Piotrowski
  2007-02-20  0:43         ` 2.6.20-mm2 Rafael J. Wysocki
  1 sibling, 1 reply; 14+ messages in thread
From: Rafael J. Wysocki @ 2007-02-19 11:28 UTC
  To: Andrew Morton
  Cc: linux-kernel, Neil Brown, Jeff Garzik, linux-ide, Jens Axboe

On Monday, 19 February 2007 01:00, Andrew Morton wrote:
> On Mon, 19 Feb 2007 00:25:48 +0100 "Rafael J. Wysocki" <rjw@sisk.pl> wrote:
> 
> > > netconsole is good.
> > 
> > I know. :-)
> > 
> > In the meantime, I've got something worse on another x86_64 box:
> > 
> > Asus Laptop ACPI Extras version 0.30
> >   L5D model detected, supported
> > audit(1171831698.918:2): audit_pid=4281 old=0 by auid=4294967295
> > general protection fault: 0000 [2] PREEMPT
> > last sysfs file: /class/net/eth2/carrier
> > CPU 0
> > Modules linked in: af_packet ipv6 snd_pcm_oss snd_mixer_oss snd_seq snd_seq_device asus_acpi backlight button battery ac dm_mod pcmr
> > Pid: 178, comm: pdflush Not tainted 2.6.20-mm2 #1
> > RIP: 0010:[<ffffffff8034bce4>]  [<ffffffff8034bce4>] __make_request+0x134/0x370
> > RSP: 0000:ffff81005ed659a0  EFLAGS: 00010297
> > RAX: 00000000ffffffff RBX: 6b6b6b6b6b6b6b6b RCX: 000000000203396a
> > RDX: 0000000100000000 RSI: ffff810037b4dbb0 RDI: ffff81004683d8c0
> > RBP: ffff81005ed659f0 R08: ffff81004683d070 R09: ffff81003d333cc0
> > R10: 0000000000000000 R11: 0000000000000000 R12: ffff810037b4dbb0
> > R13: ffff81005daba3f0 R14: ffff810037daca90 R15: ffff81005daba3d0
> > FS:  00002ad4a29e6d00(0000) GS:ffffffff805db000(0000) knlGS:0000000000000000
> > CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
> > CR2: 00002b6a345aa000 CR3: 0000000056585000 CR4: 00000000000006e0
> > Process pdflush (pid: 178, threadinfo ffff81005ed64000, task ffff810037b060c0)
> > Stack:  ffff810002852540 0000000000000001 ffff810037b4dbb0 ffffffff8026be21
> >  ffff81005ed65a40 0000000000000008 ffff810037b4dbb0 0000000000000800
> >  0000000000000008 ffff8100021d94e0 ffff81005ed65a40 ffffffff80348e7c
> > Call Trace:
> >  [<ffffffff8026be21>] mempool_alloc_slab+0x11/0x20
> >  [<ffffffff80348e7c>] generic_make_request+0x1ec/0x230
> 
> yeah. everyone except me is hitting that.

FWIW, I don't see it on an SMP machine.

On non-SMP it's reproducible, e.g. by doing

# echo testproc > /sys/power/disk

and

# echo disk > /sys/power/state

3-4 times in a row.

Probably running "sync" a couple of times in a row would be sufficient.



* Re: 2.6.20-mm2
  2007-02-19 11:28         ` 2.6.20-mm2 Rafael J. Wysocki
@ 2007-02-19 11:45           ` Michal Piotrowski
  2007-02-20  0:04             ` 2.6.20-mm2 Rafael J. Wysocki
  0 siblings, 1 reply; 14+ messages in thread
From: Michal Piotrowski @ 2007-02-19 11:45 UTC
  To: Rafael J. Wysocki
  Cc: Andrew Morton, linux-kernel, Neil Brown, Jeff Garzik, linux-ide,
	Jens Axboe

On 19/02/07, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> On Monday, 19 February 2007 01:00, Andrew Morton wrote:
> > On Mon, 19 Feb 2007 00:25:48 +0100 "Rafael J. Wysocki" <rjw@sisk.pl> wrote:
> >
> > > > netconsole is good.
> > >
> > > I know. :-)
> > >
> > > In the meantime, I've got something worse on another x86_64 box:
> > >
> > > Asus Laptop ACPI Extras version 0.30
> > >   L5D model detected, supported
> > > audit(1171831698.918:2): audit_pid=4281 old=0 by auid=4294967295
> > > general protection fault: 0000 [2] PREEMPT
> > > last sysfs file: /class/net/eth2/carrier
> > > CPU 0
> > > Modules linked in: af_packet ipv6 snd_pcm_oss snd_mixer_oss snd_seq snd_seq_device asus_acpi backlight button battery ac dm_mod pcmr
> > > Pid: 178, comm: pdflush Not tainted 2.6.20-mm2 #1
> > > RIP: 0010:[<ffffffff8034bce4>]  [<ffffffff8034bce4>] __make_request+0x134/0x370
> > > RSP: 0000:ffff81005ed659a0  EFLAGS: 00010297
> > > RAX: 00000000ffffffff RBX: 6b6b6b6b6b6b6b6b RCX: 000000000203396a
> > > RDX: 0000000100000000 RSI: ffff810037b4dbb0 RDI: ffff81004683d8c0
> > > RBP: ffff81005ed659f0 R08: ffff81004683d070 R09: ffff81003d333cc0
> > > R10: 0000000000000000 R11: 0000000000000000 R12: ffff810037b4dbb0
> > > R13: ffff81005daba3f0 R14: ffff810037daca90 R15: ffff81005daba3d0
> > > FS:  00002ad4a29e6d00(0000) GS:ffffffff805db000(0000) knlGS:0000000000000000
> > > CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
> > > CR2: 00002b6a345aa000 CR3: 0000000056585000 CR4: 00000000000006e0
> > > Process pdflush (pid: 178, threadinfo ffff81005ed64000, task ffff810037b060c0)
> > > Stack:  ffff810002852540 0000000000000001 ffff810037b4dbb0 ffffffff8026be21
> > >  ffff81005ed65a40 0000000000000008 ffff810037b4dbb0 0000000000000800
> > >  0000000000000008 ffff8100021d94e0 ffff81005ed65a40 ffffffff80348e7c
> > > Call Trace:
> > >  [<ffffffff8026be21>] mempool_alloc_slab+0x11/0x20
> > >  [<ffffffff80348e7c>] generic_make_request+0x1ec/0x230
> >
> > yeah. everyone except me is hitting that.
>
> FWIW, I don't see it on an SMP machine.
>

I can reproduce this on my SMT P4.

CONFIG_SMP=y
CONFIG_X86_PC=y
CONFIG_MPENTIUM4=y
CONFIG_NR_CPUS=2
CONFIG_SCHED_SMT=y

Regards,
Michal

-- 
Michal K. K. Piotrowski
LTG - Linux Testers Group (PL)
(http://www.stardust.webpages.pl/ltg/)
LTG - Linux Testers Group (EN)
(http://www.stardust.webpages.pl/linux_testers_group_en/)


* Re: 2.6.20-mm2
  2007-02-19 11:45           ` 2.6.20-mm2 Michal Piotrowski
@ 2007-02-20  0:04             ` Rafael J. Wysocki
  2007-02-20 21:16               ` 2.6.20-mm2 Rafael J. Wysocki
  0 siblings, 1 reply; 14+ messages in thread
From: Rafael J. Wysocki @ 2007-02-20  0:04 UTC
  To: Michal Piotrowski
  Cc: Andrew Morton, linux-kernel, Neil Brown, Jeff Garzik, linux-ide,
	Jens Axboe

On Monday, 19 February 2007 12:45, Michal Piotrowski wrote:
> On 19/02/07, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> > On Monday, 19 February 2007 01:00, Andrew Morton wrote:
> > > On Mon, 19 Feb 2007 00:25:48 +0100 "Rafael J. Wysocki" <rjw@sisk.pl> wrote:
> > >
> > > > > netconsole is good.
> > > >
> > > > I know. :-)
> > > >
> > > > In the meantime, I've got something worse on another x86_64 box:
> > > >
> > > > Asus Laptop ACPI Extras version 0.30
> > > >   L5D model detected, supported
> > > > audit(1171831698.918:2): audit_pid=4281 old=0 by auid=4294967295
> > > > general protection fault: 0000 [2] PREEMPT
> > > > last sysfs file: /class/net/eth2/carrier
> > > > CPU 0
> > > > Modules linked in: af_packet ipv6 snd_pcm_oss snd_mixer_oss snd_seq snd_seq_device asus_acpi backlight button battery ac dm_mod pcmr
> > > > Pid: 178, comm: pdflush Not tainted 2.6.20-mm2 #1
> > > > RIP: 0010:[<ffffffff8034bce4>]  [<ffffffff8034bce4>] __make_request+0x134/0x370
> > > > RSP: 0000:ffff81005ed659a0  EFLAGS: 00010297
> > > > RAX: 00000000ffffffff RBX: 6b6b6b6b6b6b6b6b RCX: 000000000203396a
> > > > RDX: 0000000100000000 RSI: ffff810037b4dbb0 RDI: ffff81004683d8c0
> > > > RBP: ffff81005ed659f0 R08: ffff81004683d070 R09: ffff81003d333cc0
> > > > R10: 0000000000000000 R11: 0000000000000000 R12: ffff810037b4dbb0
> > > > R13: ffff81005daba3f0 R14: ffff810037daca90 R15: ffff81005daba3d0
> > > > FS:  00002ad4a29e6d00(0000) GS:ffffffff805db000(0000) knlGS:0000000000000000
> > > > CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
> > > > CR2: 00002b6a345aa000 CR3: 0000000056585000 CR4: 00000000000006e0
> > > > Process pdflush (pid: 178, threadinfo ffff81005ed64000, task ffff810037b060c0)
> > > > Stack:  ffff810002852540 0000000000000001 ffff810037b4dbb0 ffffffff8026be21
> > > >  ffff81005ed65a40 0000000000000008 ffff810037b4dbb0 0000000000000800
> > > >  0000000000000008 ffff8100021d94e0 ffff81005ed65a40 ffffffff80348e7c
> > > > Call Trace:
> > > >  [<ffffffff8026be21>] mempool_alloc_slab+0x11/0x20
> > > >  [<ffffffff80348e7c>] generic_make_request+0x1ec/0x230
> > >
> > > yeah. everyone except me is hitting that.
> >
> > FWIW, I don't see it on an SMP machine.
> >
> 
> I can reproduce this on my SMT P4.
> 
> CONFIG_SMP=y
> CONFIG_X86_PC=y
> CONFIG_MPENTIUM4=y
> CONFIG_NR_CPUS=2
> CONFIG_SCHED_SMT=y

It may be related to preemption.  The box I'm not seeing it on runs a
non-preemptible kernel (CONFIG_PREEMPT_VOLUNTARY is set).

BTW, on the box where I'm able to reproduce it, I have

(gdb) l *__make_request+0x134
0xffffffff8034b764 is in __make_request (include/asm/processor.h:411).
406     #define cpu_has_fpu 1
407
408     #define ARCH_HAS_PREFETCH
409     static inline void prefetch(void *x)
410     {
411             asm volatile("prefetcht0 %0" :: "m" (*(unsigned long *)x));
412     }
413
414     #define ARCH_HAS_PREFETCHW 1
415     static inline void prefetchw(void *x)

So I guess x is NULL somewhere ...
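
[Editorial note, not part of the original message.] In the oops quoted earlier
in this thread, RBX holds 6b6b6b6b6b6b6b6b, and 0x6b is the byte that slab
debugging writes into freed objects (POISON_FREE). The first bytes on the
Code: line, 48 8b 43 08, decode to a load from 8(%rbx), so the pointer being
followed may be a stale, poisoned one rather than NULL. That would also fit the
fault type: on x86_64 a non-canonical address raises a general protection
fault instead of a page fault. A small userspace sketch of both checks is
given below; the helper names and the 48-bit virtual address width are
assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define POISON_FREE_BYTE 0x6bULL /* fill byte for freed slab objects */

/* True if every byte of value equals the freed-object poison byte. */
static bool is_slab_free_poison(uint64_t value)
{
	return value == POISON_FREE_BYTE * 0x0101010101010101ULL;
}

/*
 * True if addr is canonical for a CPU with va_bits of virtual address:
 * bit (va_bits - 1) and all bits above it must be identical.
 */
static bool is_canonical(uint64_t addr, unsigned va_bits)
{
	assert(va_bits >= 1 && va_bits <= 64);

	uint64_t upper = addr >> (va_bits - 1);
	uint64_t all_ones = UINT64_MAX >> (va_bits - 1);

	return upper == 0 || upper == all_ones;
}
```

Applied to the register dump, the RBX value is poison and non-canonical for
48-bit addressing, whereas a NULL pointer or the RSP value
ffff81005ed659a0 is canonical. This is only a reading of the posted
registers, not a confirmed diagnosis.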


* Re: 2.6.20-mm2
  2007-02-19  0:00       ` 2.6.20-mm2 Andrew Morton
  2007-02-19 11:28         ` 2.6.20-mm2 Rafael J. Wysocki
@ 2007-02-20  0:43         ` Rafael J. Wysocki
  1 sibling, 0 replies; 14+ messages in thread
From: Rafael J. Wysocki @ 2007-02-20  0:43 UTC
  To: Andrew Morton
  Cc: linux-kernel, Neil Brown, Jeff Garzik, linux-ide, Jens Axboe

On Monday, 19 February 2007 01:00, Andrew Morton wrote:
> On Mon, 19 Feb 2007 00:25:48 +0100 "Rafael J. Wysocki" <rjw@sisk.pl> wrote:
> 
> > > netconsole is good.
> > 
> > I know. :-)
> > 
> > In the meantime, I've got something worse on another x86_64 box:
> > 
> > Asus Laptop ACPI Extras version 0.30
> >   L5D model detected, supported
> > audit(1171831698.918:2): audit_pid=4281 old=0 by auid=4294967295
> > general protection fault: 0000 [2] PREEMPT
> > last sysfs file: /class/net/eth2/carrier
> > CPU 0
> > Modules linked in: af_packet ipv6 snd_pcm_oss snd_mixer_oss snd_seq snd_seq_device asus_acpi backlight button battery ac dm_mod pcmr
> > Pid: 178, comm: pdflush Not tainted 2.6.20-mm2 #1
> > RIP: 0010:[<ffffffff8034bce4>]  [<ffffffff8034bce4>] __make_request+0x134/0x370
> > RSP: 0000:ffff81005ed659a0  EFLAGS: 00010297
> > RAX: 00000000ffffffff RBX: 6b6b6b6b6b6b6b6b RCX: 000000000203396a
> > RDX: 0000000100000000 RSI: ffff810037b4dbb0 RDI: ffff81004683d8c0
> > RBP: ffff81005ed659f0 R08: ffff81004683d070 R09: ffff81003d333cc0
> > R10: 0000000000000000 R11: 0000000000000000 R12: ffff810037b4dbb0
> > R13: ffff81005daba3f0 R14: ffff810037daca90 R15: ffff81005daba3d0
> > FS:  00002ad4a29e6d00(0000) GS:ffffffff805db000(0000) knlGS:0000000000000000
> > CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
> > CR2: 00002b6a345aa000 CR3: 0000000056585000 CR4: 00000000000006e0
> > Process pdflush (pid: 178, threadinfo ffff81005ed64000, task ffff810037b060c0)
> > Stack:  ffff810002852540 0000000000000001 ffff810037b4dbb0 ffffffff8026be21
> >  ffff81005ed65a40 0000000000000008 ffff810037b4dbb0 0000000000000800
> >  0000000000000008 ffff8100021d94e0 ffff81005ed65a40 ffffffff80348e7c
> > Call Trace:
> >  [<ffffffff8026be21>] mempool_alloc_slab+0x11/0x20
> >  [<ffffffff80348e7c>] generic_make_request+0x1ec/0x230
> 
> yeah. everyone except me is hitting that.

An interesting variant:

------------[ cut here ]------------
kernel BUG at block/ll_rw_blk.c:2782!
invalid opcode: 0000 [1] PREEMPT
last sysfs file: /class/net/eth2/carrier
CPU 0
Modules linked in: af_packet ipv6 snd_pcm_oss snd_mixer_oss snd_seq snd_seq_device asus_acpi backlight button battery ac dm_mod usbhid pcmcir
Pid: 5060, comm: preload Not tainted 2.6.20-mm2 #4
RIP: 0010:[<ffffffff80349b7a>]  [<ffffffff80349b7a>] bio_attempt_back_merge+0x2a/0xa0
RSP: 0018:ffff810045819a58  EFLAGS: 00010202
RAX: 0000000100000080 RBX: ffff810046946eb0 RCX: 0000000002b26b42
RDX: 0000000100000000 RSI: ffff810046946eb0 RDI: ffff810037d74a90
RBP: ffff810045819a68 R08: ffff810046946eb0 R09: 0000000000000400
R10: 0000000000000000 R11: 0000000000000000 R12: ffff810046fcc330
R13: ffff81004a218770 R14: ffff810037d74a90 R15: ffff81004a218750
FS:  00002acb9c6076f0(0000) GS:ffffffff805db000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00002aaaaaaac000 CR3: 0000000045855000 CR4: 00000000000006e0
Process preload (pid: 5060, threadinfo ffff810045818000, task ffff81004a12e140)
Stack:  ffff810046946eb0 ffff810046fcc330 ffff810045819ac8 ffffffff8034b730
 0000000000000000 0000000000000000 ffff810046fcc330 0000000000000002
 ffff810046946eb0 0000000000000008 ffff810046fcc330 0000000000000800
Call Trace:
 [<ffffffff8034b730>] __make_request+0x100/0x370
 [<ffffffff803488fc>] generic_make_request+0x1ec/0x230
 [<ffffffff802b9a7b>] bio_alloc_bioset+0xeb/0x120
 [<ffffffff8034b266>] submit_bio+0xf6/0x110
 [<ffffffff802b9b10>] bio_alloc+0x10/0x20
 [<ffffffff802bd3f2>] mpage_bio_submit+0x22/0x30
 [<ffffffff802bdfe5>] do_mpage_readpage+0x505/0x590
 [<ffffffff80482cd6>] _write_unlock_irq+0x36/0x60
 [<ffffffff80268bfb>] add_to_page_cache+0xbb/0xf0
 [<ffffffff8026d950>] get_page_from_freelist+0x120/0x430
 [<ffffffff802be2be>] mpage_readpages+0xbe/0x160
 [<ffffffff8030fa20>] ext3_get_block+0x0/0x110
 [<ffffffff8030fa20>] ext3_get_block+0x0/0x110
 [<ffffffff804833b0>] _spin_unlock+0x30/0x50
 [<ffffffff8026da50>] get_page_from_freelist+0x220/0x430
 [<ffffffff8030eb8a>] ext3_readpages+0x1a/0x20
 [<ffffffff8027072f>] __do_page_cache_readahead+0x20f/0x330
 [<ffffffff80294d68>] cp_new_stat+0xf8/0x120
 [<ffffffff80270c7d>] force_page_cache_readahead+0x6d/0xb0
 [<ffffffff8026c533>] sys_fadvise64_64+0x143/0x1e0
 [<ffffffff8026c5d9>] sys_fadvise64+0x9/0x10
 [<ffffffff80209a0e>] system_call+0x7e/0x83


Code: 0f 0b 0f 1f 40 00 eb fe 4c 89 e2 e8 f6 df ff ff 31 d2 85 c0
RIP  [<ffffffff80349b7a>] bio_attempt_back_merge+0x2a/0xa0
 RSP <ffff810045819a58>


* Re: 2.6.20-mm2
  2007-02-18 19:43   ` 2.6.20-mm2 Andrew Morton
  2007-02-18 23:25     ` 2.6.20-mm2 Rafael J. Wysocki
@ 2007-02-20  1:20     ` Rafael J. Wysocki
  2007-02-20  6:31       ` 2.6.20-mm2 Andrew Morton
  1 sibling, 1 reply; 14+ messages in thread
From: Rafael J. Wysocki @ 2007-02-20  1:20 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-kernel, Neil Brown, Jeff Garzik, linux-ide, Jens Axboe

On Sunday, 18 February 2007 20:43, Andrew Morton wrote:
> On Sun, 18 Feb 2007 13:44:54 +0100 "Rafael J. Wysocki" <rjw@sisk.pl> wrote:
> 
> > On Sunday, 18 February 2007 06:51, Andrew Morton wrote:
> > > 
> > > Temporarily at
> > > 
> > >   http://userweb.kernel.org/~akpm/2.6.20-mm2/
> > > 
> > > Will appear later at
> > > 
> > >  ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.20/2.6.20-mm2/
> > 
> > Two problems:
> > 
> > 1) A showstopper with the root partition on RAID1:
> > 
> > md: raid1 personality registered for level 1
> > [--snip--]
> > md: multipath personality registered for level -4
> > register_blkdev: failed to get major for mdp
> > [--snip--]
> > VFS: Cannot open root device "md1" or unknown-block(0,0)
> 
> Someone else reported that against mainline.  Can you please debug it a bit?

For now I can only say 2.6.20 + origin.patch breaks.

However, it's a SUSE 10.1 system with gcc 4.1.0 and this may be the reason.
I'll check that tomorrow.


* Re: 2.6.20-mm2
  2007-02-20  1:20     ` 2.6.20-mm2 Rafael J. Wysocki
@ 2007-02-20  6:31       ` Andrew Morton
  2007-02-20 22:12         ` 2.6.20-mm2 Rafael J. Wysocki
  0 siblings, 1 reply; 14+ messages in thread
From: Andrew Morton @ 2007-02-20  6:31 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: linux-kernel, Neil Brown, Jeff Garzik, linux-ide, Jens Axboe

On Tue, 20 Feb 2007 02:20:21 +0100 "Rafael J. Wysocki" <rjw@sisk.pl> wrote:

> On Sunday, 18 February 2007 20:43, Andrew Morton wrote:
> > On Sun, 18 Feb 2007 13:44:54 +0100 "Rafael J. Wysocki" <rjw@sisk.pl> wrote:
> > 
> > > On Sunday, 18 February 2007 06:51, Andrew Morton wrote:
> > > > 
> > > > Temporarily at
> > > > 
> > > >   http://userweb.kernel.org/~akpm/2.6.20-mm2/
> > > > 
> > > > Will appear later at
> > > > 
> > > >  ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.20/2.6.20-mm2/
> > > 
> > > Two problems:
> > > 
> > > 1) A showstopper with the root partition on RAID1:
> > > 
> > > md: raid1 personality registered for level 1
> > > [--snip--]
> > > md: multipath personality registered for level -4
> > > register_blkdev: failed to get major for mdp
> > > [--snip--]
> > > VFS: Cannot open root device "md1" or unknown-block(0,0)
> > 
> > Someone else reported that against mainline.  Can you please debug it a bit?
> 
> For now I can only say 2.6.20 + origin.patch breaks.
> 
> However, it's a SUSE 10.1 system with gcc 4.1.0 and this may be the reason.
> I'll check that tomorrow.

Yes, Rolf says this goes away when you stop using gcc-4.1.0.

I'm hoping that churning the code around like below makes things work
right.



From: Andrew Morton <akpm@linux-foundation.org>

Several people have reported failures in dynamic major device number handling
due to the recent changes in there to avoid handing out the local/experimental
majors.

Rolf reports that this is due to a gcc-4.1.0 bug.

The patch refactors that code a lot in an attempt to provoke the compiler into
behaving.

Cc: Rolf Eike Beer <eike-kernel@sf-tec.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 block/genhd.c          |    9 ++-------
 drivers/base/core.c    |   14 ++++++++++++++
 fs/char_dev.c          |    8 ++------
 include/linux/kdev_t.h |    1 +
 4 files changed, 19 insertions(+), 13 deletions(-)

diff -puN block/genhd.c~rework-reserved-major-handling block/genhd.c
--- a/block/genhd.c~rework-reserved-major-handling
+++ a/block/genhd.c
@@ -5,6 +5,7 @@
 #include <linux/module.h>
 #include <linux/fs.h>
 #include <linux/genhd.h>
+#include <linux/kdev_t.h>
 #include <linux/kernel.h>
 #include <linux/blkdev.h>
 #include <linux/init.h>
@@ -61,13 +62,7 @@ int register_blkdev(unsigned int major, 
 	/* temporary */
 	if (major == 0) {
 		for (index = ARRAY_SIZE(major_names)-1; index > 0; index--) {
-			/*
-			 * Disallow the LANANA-assigned LOCAL/EXPERIMENTAL
-			 * majors
-			 */
-			if ((60 <= index && index <= 63) ||
-					(120 <= index && index <= 127) ||
-					(240 <= index && index <= 254))
+			if (is_lanana_major(index))
 				continue;
 			if (major_names[index] == NULL)
 				break;
diff -puN fs/char_dev.c~rework-reserved-major-handling fs/char_dev.c
--- a/fs/char_dev.c~rework-reserved-major-handling
+++ a/fs/char_dev.c
@@ -6,6 +6,7 @@
 
 #include <linux/init.h>
 #include <linux/fs.h>
+#include <linux/kdev_t.h>
 #include <linux/slab.h>
 #include <linux/string.h>
 
@@ -108,12 +109,7 @@ __register_chrdev_region(unsigned int ma
 	/* temporary */
 	if (major == 0) {
 		for (i = ARRAY_SIZE(chrdevs)-1; i > 0; i--) {
-			/*
-			 * Disallow the LANANA-assigned LOCAL/EXPERIMENTAL
-			 * majors
-			 */
-			if ((60 <= i && i <= 63) || (120 <= i && i <= 127) ||
-					(240 <= i && i <= 254))
+			if (is_lanana_major(i))
 				continue;
 			if (chrdevs[i] == NULL)
 				break;
diff -puN drivers/base/core.c~rework-reserved-major-handling drivers/base/core.c
--- a/drivers/base/core.c~rework-reserved-major-handling
+++ a/drivers/base/core.c
@@ -28,6 +28,20 @@ int (*platform_notify)(struct device * d
 int (*platform_notify_remove)(struct device * dev) = NULL;
 
 /*
+ * Detect the LANANA-assigned LOCAL/EXPERIMENTAL majors
+ */
+bool is_lanana_major(unsigned int major)
+{
+	if (major >= 60 && major <= 63)
+		return 1;
+	if (major >= 120 && major <= 127)
+		return 1;
+	if (major >= 240 && major <= 254)
+		return 1;
+	return 0;
+}
+
+/*
  * sysfs bindings for devices.
  */
 
diff -puN include/linux/kdev_t.h~rework-reserved-major-handling include/linux/kdev_t.h
--- a/include/linux/kdev_t.h~rework-reserved-major-handling
+++ a/include/linux/kdev_t.h
@@ -87,6 +87,7 @@ static inline unsigned sysv_minor(u32 de
 	return dev & 0x3ffff;
 }
 
+bool is_lanana_major(unsigned int major);
 
 #else /* __KERNEL__ */
 
_
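
For readers following the patch outside a kernel tree, here is a minimal user-space sketch of the downward scan that both register_blkdev() and __register_chrdev_region() perform when asked for a dynamic major (major == 0). The three reserved ranges are copied from the patch above. The table name major_table, the size NR_MAJORS and the helper pick_dynamic_major() are assumptions made for illustration and do not appear in the kernel sources.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumed table size for this sketch; the kernel uses its own hash sizes. */
#define NR_MAJORS 255

/* Same three ranges as the patch: LANANA LOCAL/EXPERIMENTAL majors. */
bool is_lanana_major(unsigned int major)
{
	if (major >= 60 && major <= 63)
		return true;
	if (major >= 120 && major <= 127)
		return true;
	if (major >= 240 && major <= 254)
		return true;
	return false;
}

/* Hypothetical stand-in for major_names[] / chrdevs[]; NULL means free. */
const char *major_table[NR_MAJORS];

/*
 * Scan from the top of the table downward, skipping reserved majors,
 * and return the first free slot. Returns 0 if nothing is free, since
 * major 0 is never handed out by this loop.
 */
unsigned int pick_dynamic_major(void)
{
	size_t index;

	for (index = NR_MAJORS - 1; index > 0; index--) {
		if (is_lanana_major((unsigned int)index))
			continue;
		if (major_table[index] == NULL)
			return (unsigned int)index;
	}
	return 0;
}
```

With an empty table the scan skips 254 down to 240 and settles on 239; once that slot is marked as taken, the next call yields 238. Keeping the range check in one out-of-line function, as the patch does, means both callers share a single compiled copy of the comparison.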



* Re: 2.6.20-mm2
  2007-02-20  0:04             ` 2.6.20-mm2 Rafael J. Wysocki
@ 2007-02-20 21:16               ` Rafael J. Wysocki
  2007-02-20 21:46                 ` 2.6.20-mm2 Jeff Garzik
  0 siblings, 1 reply; 14+ messages in thread
From: Rafael J. Wysocki @ 2007-02-20 21:16 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Michal Piotrowski, linux-kernel, Neil Brown, Jeff Garzik,
	linux-ide, Jens Axboe

On Tuesday, 20 February 2007 01:04, Rafael J. Wysocki wrote:
> On Monday, 19 February 2007 12:45, Michal Piotrowski wrote:
> > On 19/02/07, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> > > On Monday, 19 February 2007 01:00, Andrew Morton wrote:
> > > > On Mon, 19 Feb 2007 00:25:48 +0100 "Rafael J. Wysocki" <rjw@sisk.pl> wrote:
> > > >
> > > > > > netconsole is good.
> > > > >
> > > > > I know. :-)
> > > > >
> > > > > In the meantime, I've got something worse on another x86_64 box:
> > > > >
> > > > > Asus Laptop ACPI Extras version 0.30
> > > > >   L5D model detected, supported
> > > > > audit(1171831698.918:2): audit_pid=4281 old=0 by auid=4294967295
> > > > > general protection fault: 0000 [2] PREEMPT
> > > > > last sysfs file: /class/net/eth2/carrier
> > > > > CPU 0
> > > > > Modules linked in: af_packet ipv6 snd_pcm_oss snd_mixer_oss snd_seq snd_seq_device asus_acpi backlight button battery ac dm_mod pcmr
> > > > > Pid: 178, comm: pdflush Not tainted 2.6.20-mm2 #1
> > > > > RIP: 0010:[<ffffffff8034bce4>]  [<ffffffff8034bce4>] __make_request+0x134/0x370
> > > > > RSP: 0000:ffff81005ed659a0  EFLAGS: 00010297
> > > > > RAX: 00000000ffffffff RBX: 6b6b6b6b6b6b6b6b RCX: 000000000203396a
> > > > > RDX: 0000000100000000 RSI: ffff810037b4dbb0 RDI: ffff81004683d8c0
> > > > > RBP: ffff81005ed659f0 R08: ffff81004683d070 R09: ffff81003d333cc0
> > > > > R10: 0000000000000000 R11: 0000000000000000 R12: ffff810037b4dbb0
> > > > > R13: ffff81005daba3f0 R14: ffff810037daca90 R15: ffff81005daba3d0
> > > > > FS:  00002ad4a29e6d00(0000) GS:ffffffff805db000(0000) knlGS:0000000000000000
> > > > > CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
> > > > > CR2: 00002b6a345aa000 CR3: 0000000056585000 CR4: 00000000000006e0
> > > > > Process pdflush (pid: 178, threadinfo ffff81005ed64000, task ffff810037b060c0)
> > > > > Stack:  ffff810002852540 0000000000000001 ffff810037b4dbb0 ffffffff8026be21
> > > > >  ffff81005ed65a40 0000000000000008 ffff810037b4dbb0 0000000000000800
> > > > >  0000000000000008 ffff8100021d94e0 ffff81005ed65a40 ffffffff80348e7c
> > > > > Call Trace:
> > > > >  [<ffffffff8026be21>] mempool_alloc_slab+0x11/0x20
> > > > >  [<ffffffff80348e7c>] generic_make_request+0x1ec/0x230
> > > >
> > > > yeah. everyone except me is hitting that.
> > >
> > > FWIW, I don't see it on an SMP machine.
> > >
> > 
> > I can reproduce this on my SMT P4.
> > 
> > CONFIG_SMP=y
> > CONFIG_X86_PC=y
> > CONFIG_MPENTIUM4=y
> > CONFIG_NR_CPUS=2
> > CONFIG_SCHED_SMT=y
> 
> It may be related to preemption.  The box I'm not seeing it on runs a
> non-preemptible kernel (CONFIG_PREEMPT_VOLUNTARY is set).

FWIW, with CONFIG_PREEMPT unset (CONFIG_PREEMPT_VOLUNTARY is set instead), I'm
unable to reproduce this problem on the box on which it is readily reproducible with
CONFIG_PREEMPT set.

Greetings,
Rafael


* Re: 2.6.20-mm2
  2007-02-20 21:16               ` 2.6.20-mm2 Rafael J. Wysocki
@ 2007-02-20 21:46                 ` Jeff Garzik
  0 siblings, 0 replies; 14+ messages in thread
From: Jeff Garzik @ 2007-02-20 21:46 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Andrew Morton, Michal Piotrowski, linux-kernel, Neil Brown,
	linux-ide, Jens Axboe

Rafael J. Wysocki wrote:
> FWIW, with CONFIG_PREEMPT unset (CONFIG_PREEMPT_VOLUNTARY is set instead), I'm
> unable to reproduce this problem on the box on which it is readily reproducible with
> CONFIG_PREEMPT set.

I'm not surprised...  I routinely tell people to turn it off, when 
debugging a problem.

	Jeff





* Re: 2.6.20-mm2
  2007-02-20  6:31       ` 2.6.20-mm2 Andrew Morton
@ 2007-02-20 22:12         ` Rafael J. Wysocki
  0 siblings, 0 replies; 14+ messages in thread
From: Rafael J. Wysocki @ 2007-02-20 22:12 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-kernel, Neil Brown, Jeff Garzik, linux-ide, Jens Axboe

On Tuesday, 20 February 2007 07:31, Andrew Morton wrote:
> On Tue, 20 Feb 2007 02:20:21 +0100 "Rafael J. Wysocki" <rjw@sisk.pl> wrote:
> 
> > On Sunday, 18 February 2007 20:43, Andrew Morton wrote:
> > > On Sun, 18 Feb 2007 13:44:54 +0100 "Rafael J. Wysocki" <rjw@sisk.pl> wrote:
> > > 
> > > > On Sunday, 18 February 2007 06:51, Andrew Morton wrote:
> > > > > 
> > > > > Temporarily at
> > > > > 
> > > > >   http://userweb.kernel.org/~akpm/2.6.20-mm2/
> > > > > 
> > > > > Will appear later at
> > > > > 
> > > > >  ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.20/2.6.20-mm2/
> > > > 
> > > > Two problems:
> > > > 
> > > > 1) A showstopper with the root partition on RAID1:
> > > > 
> > > > md: raid1 personality registered for level 1
> > > > [--snip--]
> > > > md: multipath personality registered for level -4
> > > > register_blkdev: failed to get major for mdp
> > > > [--snip--]
> > > > VFS: Cannot open root device "md1" or unknown-block(0,0)
> > > 
> > > Someone else reported that against mainline.  Can you please debug it a bit?
> > 
> > For now I can only say 2.6.20 + origin.patch breaks.
> > 
> > However, it's a SUSE 10.1 system with gcc 4.1.0 and this may be the reason.
> > I'll check that tomorrow.
> 
> Yes, Rolf says this goes away when you stop using gcc-4.1.0.
> 
> I'm hoping that churning the code around like below makes things work
> right.

Yes, that helps.

Thanks,
Rafael


