linux-ide.vger.kernel.org archive mirror
* Re: 2.6.25-rc8-mm1: Intel SATA boot failure
       [not found] <20080401213214.8fbb6d6b.akpm@linux-foundation.org>
@ 2008-04-03 23:08 ` Tilman Schmidt
  2008-04-03 23:17   ` Andrew Morton
  0 siblings, 1 reply; 6+ messages in thread
From: Tilman Schmidt @ 2008-04-03 23:08 UTC
  To: Andrew Morton; +Cc: linux-kernel, linux-ide, Jeff Garzik

Andrew Morton wrote:
> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.25-rc8/2.6.25-rc8-mm1/

This fails to come up on my development machine, apparently because it
has trouble accessing the SATA hard disks.
Hardware: Intel Pentium D940, Intel DQ965GF board, two SATA hard disks.
Some unusual things I noticed during the boot process:

- a message "doing fast boot" that looked unfamiliar; unfortunately
  it scrolled off too quickly to note its context

- for each of the two SATA ports in use, a message
  "SATA port is slow to respond, please be patient"
  accompanied by about 10 secs wait

- it actually got past the point where it mounts the root file system,
  so it must have thought it could access the disks

- finally, the system hung completely after the SUSE startup messages

  Setting current sysctl status from /etc/sysctl.conf
  net.ipv4.icmp_echo_ignore_broadcasts = 1

  with a dead keyboard and I had to hit the Win^Wreset button.

- After rebooting into 2.6.24-rc8 (which works fine), nothing had been
  written to the disks, not even the dmesg output which SUSE usually
  dumps into /var/log/boot.msg early during startup.

Before I try booting that kernel again, any instructions on what to
watch out for? Is netconsole usable again? Other ideas?

Regards,
Tilman



* Re: 2.6.25-rc8-mm1: Intel SATA boot failure
  2008-04-03 23:08 ` 2.6.25-rc8-mm1: Intel SATA boot failure Tilman Schmidt
@ 2008-04-03 23:17   ` Andrew Morton
  2008-04-09 14:29     ` Tilman Schmidt
  2008-04-14  0:28     ` Tilman Schmidt
  0 siblings, 2 replies; 6+ messages in thread
From: Andrew Morton @ 2008-04-03 23:17 UTC
  To: Tilman Schmidt; +Cc: linux-kernel, linux-ide, jgarzik

On Fri, 04 Apr 2008 01:08:19 +0200
Tilman Schmidt <tilman@imap.cc> wrote:

> Andrew Morton wrote:
> > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.25-rc8/2.6.25-rc8-mm1/
> 
> This fails to come up on my development machine, apparently because it
> has trouble accessing the SATA hard disks.
> Hardware: Intel Pentium D940, Intel DQ965GF board, two SATA hard disks.
> Some unusual things I noticed during the boot process:
> 
> - a message "doing fast boot" that looked unfamiliar; unfortunately
>   it scrolled off too quickly to note its context
> 
> - for each of the two SATA ports in use, a message
>   "SATA port is slow to respond, please be patient"
>   accompanied by about 10 secs wait
> 
> - it actually got past the point where it mounts the root file system,
>   so it must have thought it could access the disks
> 
> - finally, the system hung completely after the SUSE startup messages
> 
>   Setting current sysctl status from /etc/sysctl.conf
>   net.ipv4.icmp_echo_ignore_broadcasts = 1
> 
>   with a dead keyboard and I had to hit the Win^Wreset button.
> 
> - After rebooting into 2.6.24-rc8 (which works fine), nothing had been
>   written to the disks, not even the dmesg output which SUSE usually
>   dumps into /var/log/boot.msg early during startup.
> 
> Before I try booting that kernel again, any instructions on what to
> watch out for? Is netconsole usable again?

Yes, netconsole is usable again ;)

> Other ideas?

Usual stuff: `diff -u dmesg-2.6.25-rc8 dmesg-2.6.25-rc8-mm1'.  Bisection.
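A minimal sketch of the dmesg comparison suggested above, assuming both logs have been saved to text files. If printk timestamps are enabled, every line differs between two boots, so the sketch strips the leading "[   12.345678]" prefix before diffing. The function names and default file labels are illustrative only and do not come from this thread.

```python
import difflib
import re
import sys

# Matches the "[   12.345678] " prefix that printk timestamps add to each line.
TIMESTAMP = re.compile(r"^\[\s*\d+\.\d+\]\s?")


def strip_timestamps(lines):
    """Return the lines without trailing newlines and without timestamp prefixes."""
    return [TIMESTAMP.sub("", line.rstrip("\n")) for line in lines]


def dmesg_diff(old_lines, new_lines, old_name="dmesg-old", new_name="dmesg-new"):
    """Return a unified diff (list of lines) of two timestamp-free dmesg logs."""
    old = strip_timestamps(old_lines)
    new = strip_timestamps(new_lines)
    return list(difflib.unified_diff(old, new, old_name, new_name, lineterm=""))


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage: python dmesg_diff.py <old-dmesg-file> <new-dmesg-file>
    with open(sys.argv[1]) as old_file, open(sys.argv[2]) as new_file:
        diff = dmesg_diff(old_file.readlines(), new_file.readlines(),
                          sys.argv[1], sys.argv[2])
    print("\n".join(diff))
```

Lines that differ only in their timestamp then show up as unchanged context rather than as changes.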

Thanks.


* Re: 2.6.25-rc8-mm1: Intel SATA boot failure
  2008-04-03 23:17   ` Andrew Morton
@ 2008-04-09 14:29     ` Tilman Schmidt
  2008-04-14  0:28     ` Tilman Schmidt
  1 sibling, 0 replies; 6+ messages in thread
From: Tilman Schmidt @ 2008-04-09 14:29 UTC
  To: Andrew Morton; +Cc: linux-kernel, linux-ide, jgarzik

This is taking longer than I hoped, so here's a little progress report.

On Thu, 3 Apr 2008 16:17:33 -0700, Andrew Morton wrote:
> On Fri, 04 Apr 2008 01:08:19 +0200
> Tilman Schmidt <tilman@imap.cc> wrote:
> 
>> This fails to come up on my development machine, apparently because it
>> has trouble accessing the SATA hard disks.
>> Hardware: Intel Pentium D940, Intel DQ965GF board, two SATA hard disks.
>> Some unusual things I noticed during the boot process:
>> 
>> - a message "doing fast boot" that looked unfamiliar; unfortunately
>>   it scrolled off too quickly to note its context

That message doesn't make it into dmesg. It's apparently a SUSE thing,
and perhaps normal, so let's ignore that for now.

>> - for each of the two SATA ports in use, a message
>>   "SATA port is slow to respond, please be patient"
>>   accompanied by about 10 secs wait

These messages seem to be a separate issue. I also get them with
a .config that otherwise brings up the system successfully. That
allowed me to capture a dmesg, so here are some possibly interesting
hunks of the diff between a mainline kernel and a working 2.6.25-rc8-mm1
one:

--- dmesg-2.6.25-rc8-git.nots-reordered 2008-04-09 15:29:52.000000000 +0200
+++ dmesg-2.6.25-rc8-mm1.nots   2008-04-09 00:48:42.000000000 +0200
@@ -1,4 +1,4 @@
- Linux version 2.6.25-rc8-testing-00210-g51ac03f (ts@xenon) (gcc version 4.2.1 (SUSE Linux)) #37 SMP PREEMPT Wed Apr 9 01:27:07 CEST 2008
+ Linux version 2.6.25-rc8-mm1-testing (ts@xenon) (gcc version 4.2.1 (SUSE Linux)) #6 SMP PREEMPT Wed Apr 9 00:24:23 CEST 2008
   BIOS-provided physical RAM map:
    BIOS-e820: 0000000000000000 - 000000000008f000 (usable)
    BIOS-e820: 000000000008f000 - 00000000000a0000 (reserved)

[...]

@@ -244,12 +277,10 @@
   CPU1: Intel P4/Xeon Extended MCE MSRs (24) available
   CPU1: Thermal monitoring enabled
   CPU1: Intel(R) Pentium(R) D CPU 3.20GHz stepping 04
- Total of 2 processors activated (12796.06 BogoMIPS).
+ Total of 2 processors activated (12796.87 BogoMIPS).
   ENABLING IO-APIC IRQs
   ..TIMER: vector=0x31 apic1=0 pin1=2 apic2=-1 pin2=-1
- checking TSC synchronization [CPU#0 -> CPU#1]:
- Measured 560 cycles TSC warp between CPUs, turning off TSC clock.
- Marking TSC unstable due to: check_tsc_sync_source failed.
+ checking TSC synchronization [CPU#0 -> CPU#1]: passed.
   Brought up 2 CPUs
   CPU0 attaching sched-domain:
    domain 0: span 03

[Nice - at last a kernel that likes my TSC; not sure if it matters though.]

@@ -846,26 +880,36 @@
   PCI: Setting latency timer of device 0000:00:1f.2 to 64
   scsi0 : ahci
   PM: Adding info for No Bus:host0
+ PM: Adding info for No Bus:host0
   scsi1 : ahci
   PM: Adding info for No Bus:host1
+ PM: Adding info for No Bus:host1
   scsi2 : ahci
   PM: Adding info for No Bus:host2
+ PM: Adding info for No Bus:host2
   scsi3 : ahci
   PM: Adding info for No Bus:host3
+ PM: Adding info for No Bus:host3
   scsi4 : ahci
   PM: Adding info for No Bus:host4
+ PM: Adding info for No Bus:host4
   scsi5 : ahci
   PM: Adding info for No Bus:host5
+ PM: Adding info for No Bus:host5
   ata1: SATA max UDMA/133 abar m2048@0x92c25000 port 0x92c25100 irq 217
   ata2: SATA max UDMA/133 abar m2048@0x92c25000 port 0x92c25180 irq 217
   ata3: SATA max UDMA/133 abar m2048@0x92c25000 port 0x92c25200 irq 217
   ata4: SATA max UDMA/133 abar m2048@0x92c25000 port 0x92c25280 irq 217
   ata5: SATA max UDMA/133 abar m2048@0x92c25000 port 0x92c25300 irq 217
   ata6: SATA max UDMA/133 abar m2048@0x92c25000 port 0x92c25380 irq 217
+ ata1: port is slow to respond, please be patient (Status 0x80)
+ ata1: COMRESET failed (errno=-16)
   ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
   ata1.00: ATA-7: ST380811AS, 3.AAE, max UDMA/133
   ata1.00: 156301488 sectors, multi 0: LBA48 NCQ (depth 31/32)
   ata1.00: configured for UDMA/133
+ ata2: port is slow to respond, please be patient (Status 0x80)
+ ata2: COMRESET failed (errno=-16)
   ata2: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
   ata2.00: ATA-7: ST380811AS, 3.AAE, max UDMA/133
   ata2.00: 156301488 sectors, multi 0: LBA48 NCQ (depth 31/32)
@@ -877,15 +921,22 @@
   PM: Adding info for No Bus:target0:0:0
   scsi 0:0:0:0: Direct-Access     ATA      ST380811AS       3.AA PQ: 0 ANSI: 5
   PM: Adding info for scsi:0:0:0:0
+ PM: Adding info for No Bus:0:0:0:0
+ PM: Adding info for No Bus:0:0:0:0
   PM: Adding info for No Bus:target1:0:0
   scsi 1:0:0:0: Direct-Access     ATA      ST380811AS       3.AA PQ: 0 ANSI: 5
   PM: Adding info for scsi:1:0:0:0
+ PM: Adding info for No Bus:1:0:0:0
+ PM: Adding info for No Bus:1:0:0:0
+ modprobe used greatest stack depth: 2200 bytes left
   ACPI: PCI Interrupt 0000:02:00.0[A] -> GSI 17 (level, low) -> IRQ 17
   PCI: Setting latency timer of device 0000:02:00.0 to 64
   scsi6 : pata_marvell
   PM: Adding info for No Bus:host6
+ PM: Adding info for No Bus:host6
   scsi7 : pata_marvell
   PM: Adding info for No Bus:host7
+ PM: Adding info for No Bus:host7
   ata7: PATA max UDMA/100 cmd 0x2018 ctl 0x2024 bmdma 0x2000 irq 17
   ata8: DUMMY
   BAR5:00:00 01:7F 02:22 03:CA 04:00 05:00 06:00 07:00 08:00 09:00 0A:00 0B:00 0C:01 0D:00 0E:00 0F:00

After that, it quickly diverges because of major differences in the
ordering of device startup. Already, to get the diffs above I had to
shift a few blocks of lines around.
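When device startup order differs between two boots, a line-by-line diff needs the kind of manual block shuffling described above. One possible alternative, sketched here as an assumption rather than as the method used for the hunks in this message, is to compare the two logs as multisets of lines, which ignores ordering altogether. The function name is hypothetical.

```python
from collections import Counter


def unordered_diff(old_lines, new_lines):
    """Compare two logs as multisets of lines, ignoring line order.

    Returns (removed, added): lines that occur more often in the old log,
    and lines that occur more often in the new log, each sorted.
    """
    old = Counter(line.strip() for line in old_lines if line.strip())
    new = Counter(line.strip() for line in new_lines if line.strip())
    removed = old - new  # Counter subtraction keeps only positive counts
    added = new - old
    return sorted(removed.elements()), sorted(added.elements())
```

The price is that all positional context is lost, so this is better suited to spotting new or missing messages (such as the duplicated "PM: Adding info" lines) than to judging where in the boot sequence they appear.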

>> Before I try booting that kernel again, any instructions on what to
>> watch out for? Is netconsole usable again?
> 
> Yes, netconsole is usable again ;)

Good to know. I'm now looking in my netconsole setup for the reason
why I don't receive anything from it.
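For checking the receiving side of a netconsole setup, a bare UDP listener is often enough. The sketch below assumes the sending kernel uses netconsole's default target port 6666; if the netconsole= parameter names a different port, adjust it. The function names are illustrative. If nothing arrives, a firewall on the receiving host dropping incoming UDP is one thing worth ruling out.

```python
import socket

# netconsole's default target port; change it to match the netconsole= parameter.
NETCONSOLE_PORT = 6666


def open_listener(port=NETCONSOLE_PORT, bind_addr="0.0.0.0"):
    """Open a UDP socket bound to the given address and port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((bind_addr, port))
    return sock


def decode_datagram(data):
    """Turn one received datagram into printable text."""
    return data.decode("utf-8", errors="replace").rstrip("\n")


def listen(sock):
    """Print every datagram received on sock, prefixed with the sender address."""
    while True:
        data, (host, _port) = sock.recvfrom(65535)
        print(f"{host}: {decode_datagram(data)}", flush=True)
```

Calling listen(open_listener()) on the logging host runs until interrupted and prints kernel messages as they arrive.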

Thanks,
Tilman

-- 
Tilman Schmidt                    E-Mail: tilman@imap.cc
Bonn, Germany
This message consists of 100% recycled bits.
Unopened, best before: (see reverse side)



* Re: 2.6.25-rc8-mm1: Intel SATA boot failure
  2008-04-03 23:17   ` Andrew Morton
  2008-04-09 14:29     ` Tilman Schmidt
@ 2008-04-14  0:28     ` Tilman Schmidt
  2008-04-14  2:05       ` Andrew Morton
  2008-04-15 23:33       ` Tilman Schmidt
  1 sibling, 2 replies; 6+ messages in thread
From: Tilman Schmidt @ 2008-04-14  0:28 UTC
  To: Andrew Morton; +Cc: linux-kernel, linux-ide, jgarzik

Final report, seeing -mm2 is out:

- Netconsole works. (grumblestupidsusefirewallgrumble)

- The hang during boot only happens with kernels compiled with
  CONFIG_CIFS_EXPERIMENTAL=y
  It also doesn't always happen at the same point in the boot sequence.
  I'm suspecting it might be triggered by some network packet.
  Anyway, it's obviously *not* a SATA problem.
  (That was just me jumping to conclusions, because ...)

- That leaves only the messages

  ata1: port is slow to respond, please be patient (Status 0x80)
  ata1: COMRESET failed (errno=-16)

  and accompanying delays during boot, for each installed SATA disk.
  I'll try to find the time to retest this with 2.6.25-rc8-mm2.
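For reading the two messages above: in the ATA status register, bit 7 (0x80) is the BSY bit, and errno 16 on Linux is EBUSY, so both lines say that the device still reported busy when the reset wait expired. A small decoder sketch follows. The helper names are made up for illustration, only the status bits whose meaning is certain are named, and the errno lookup assumes the script runs on a system that uses Linux errno numbering.

```python
import errno

# Well-known ATA status register bits; other bits are reported by position only.
ATA_STATUS_BITS = {
    0x80: "BSY",   # device busy
    0x40: "DRDY",  # device ready
    0x20: "DF",    # device fault
    0x08: "DRQ",   # data request
    0x01: "ERR",   # error
}


def decode_status(status):
    """Return the names of the bits set in an ATA status byte, highest bit first."""
    names = []
    for bit in range(7, -1, -1):
        mask = 1 << bit
        if status & mask:
            names.append(ATA_STATUS_BITS.get(mask, f"bit{bit}"))
    return names


def errno_name(code):
    """Map a kernel-style negative errno (e.g. -16) to its symbolic name."""
    return errno.errorcode.get(abs(code), "unknown")
```

With the values from the log, decode_status(0x80) gives ["BSY"] and errno_name(-16) gives "EBUSY".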

Thanks,
Tilman

-- 
Tilman Schmidt                                  E-Mail: tilman@imap.cc
Wehrhausweg 66                                  Fax: +49 228 4299019
53227 Bonn, Germany



* Re: 2.6.25-rc8-mm1: Intel SATA boot failure
  2008-04-14  0:28     ` Tilman Schmidt
@ 2008-04-14  2:05       ` Andrew Morton
  2008-04-15 23:33       ` Tilman Schmidt
  1 sibling, 0 replies; 6+ messages in thread
From: Andrew Morton @ 2008-04-14  2:05 UTC
  To: Tilman Schmidt; +Cc: linux-kernel, linux-ide, jgarzik, Steven French

On Mon, 14 Apr 2008 02:28:40 +0200 Tilman Schmidt <tilman@imap.cc> wrote:

> Final report, seeing -mm2 is out:
> 
> - Netconsole works. (grumblestupidsusefirewallgrumble)
> 
> - The hang during boot only happens with kernels compiled with
>   CONFIG_CIFS_EXPERIMENTAL=y
>   It also doesn't always happen at the same point in the boot sequence.
>   I'm suspecting it might be triggered by some network packet.
>   Anyway, it's obviously *not* a SATA problem.
>   (That was just me jumping to conclusions, because ...)

I don't remember seeing a report of the CIFS hang.

It might be caused by
bkl-removal-convert-cifs-over-to-unlocked_ioctl.patch, but it's hard to see
how that could affect things before userspace has started.
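Tying a hang to a single option such as CONFIG_CIFS_EXPERIMENTAL generally means comparing the .config of a kernel that boots with that of one that hangs. The thread does not say how this was done here; the following is only a sketch of such a comparison, with hypothetical function names, treating "# CONFIG_FOO is not set" and a missing entry alike as "n".

```python
import re

SET_LINE = re.compile(r"^(CONFIG_\w+)=(.*)$")
UNSET_LINE = re.compile(r"^# (CONFIG_\w+) is not set$")


def parse_config(lines):
    """Parse kernel .config lines into a dict of option name -> value."""
    options = {}
    for line in lines:
        line = line.strip()
        match = SET_LINE.match(line)
        if match:
            options[match.group(1)] = match.group(2)
            continue
        match = UNSET_LINE.match(line)
        if match:
            options[match.group(1)] = "n"
    return options


def config_diff(good_lines, bad_lines):
    """Return {option: (good_value, bad_value)} for options whose values differ."""
    good = parse_config(good_lines)
    bad = parse_config(bad_lines)
    return {
        name: (good.get(name, "n"), bad.get(name, "n"))
        for name in sorted(set(good) | set(bad))
        if good.get(name, "n") != bad.get(name, "n")
    }
```

Each differing option is then a candidate to flip individually, rebuilding and rebooting between changes.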

> - That leaves only the messages
> 
>   ata1: port is slow to respond, please be patient (Status 0x80)
>   ata1: COMRESET failed (errno=-16)
> 
>   and accompanying delays during boot, for each installed SATA disk.
>   I'll try to find the time to retest this with 2.6.25-rc8-mm2.

That would be good, thanks.


* Re: 2.6.25-rc8-mm1: Intel SATA boot failure
  2008-04-14  0:28     ` Tilman Schmidt
  2008-04-14  2:05       ` Andrew Morton
@ 2008-04-15 23:33       ` Tilman Schmidt
  1 sibling, 0 replies; 6+ messages in thread
From: Tilman Schmidt @ 2008-04-15 23:33 UTC
  To: Andrew Morton; +Cc: linux-kernel, linux-ide, jgarzik

On 14.04.2008 02:28, Tilman Schmidt wrote:
> - That leaves only the messages
> 
>   ata1: port is slow to respond, please be patient (Status 0x80)
>   ata1: COMRESET failed (errno=-16)
> 
>   and accompanying delays during boot, for each installed SATA disk.
>   I'll try to find the time to retest this with 2.6.25-rc8-mm2.

Done. The messages and delays do *not* happen with 2.6.25-rc8-mm2.

HTH
Tilman


