* [2.6.18,19] SATA boot problems (ICH6/ICH6W)
@ 2006-12-11 18:03 Kovid Goyal
2006-12-20 0:44 ` Tejun Heo
2007-01-23 21:49 ` danieljzhang
0 siblings, 2 replies; 29+ messages in thread
From: Kovid Goyal @ 2006-12-11 18:03 UTC (permalink / raw)
To: linux-ide
Hi,
I'm having problems booting from a SATA disk with 2.6.19.
Grub loads fine, but when the kernel boots, it *sometimes* ends up with
VFS: Cannot open root device "sda5" or unknown-block(0,0).
This issue appears to be identical with the one reported in
http://lkml.org/lkml/2006/10/19/327 except that 2.6.19 did not fix it for me.
In fact, it made it worse. The timeouts reported in that thread are still
present and with 2.6.19 the fraction of unsuccessful boots has increased from
30% to 80%. I often have to reboot my machine 3 times in a row before it
succeeds. I have tried both vanilla 2.6.19 and gentoo-sources-2.6.19-r1
To summarize: My machine stalls for nearly a minute on all boots and the root
filesystem fails to mount on 80% of the boots. This problem first manifested
itself when I switched to 2.6.18 and has become worse with 2.6.19.
The controller:
00:1f.2 IDE interface: Intel Corporation 82801FB/FW (ICH6/ICH6W) SATA
Controller (rev 03)
I don't know how to get the logs from an unsuccessful boot. Here is
an extract from the logs of a successful boot (in which the timeout is
present):
Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
Probing IDE interface ide0...
hda: PHILIPS DVD+/-RW DVD8631, ATAPI CD/DVD-ROM drive
input: ImExPS/2 Logitech MX Mouse as /class/input/input2
Probing IDE interface ide1...
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
hda: ATAPI 40X DVD-ROM DVD-R CD-R/RW drive, 2048kB Cache
Uniform CD-ROM driver Revision: 3.20
ata_piix 0000:00:1f.1: version 2.00ac6
ACPI: PCI Interrupt 0000:00:1f.1[A] -> GSI 16 (level, low) -> IRQ 16
ata: 0x1F0 IDE port busy
ata: conflict with ide0
PCI: Setting latency timer of device 0000:00:1f.1 to 64
ata1: DUMMY
ata2: PATA max UDMA/100 cmd 0x170 ctl 0x376 bmdma 0xFFA8 irq 15
scsi0 : ata_piix
scsi1 : ata_piix
ata2: port disabled. ignoring.
ATA: abnormal status 0xFF on port 0x177
ata_piix 0000:00:1f.2: MAP [ P0 P2 P1 P3 ]
ACPI: PCI Interrupt 0000:00:1f.2[C] -> GSI 20 (level, low) -> IRQ 17
PCI: Setting latency timer of device 0000:00:1f.2 to 64
ata3: SATA max UDMA/133 cmd 0xFE00 ctl 0xFE12 bmdma 0xFEA0 irq 17
ata4: SATA max UDMA/133 cmd 0xFE20 ctl 0xFE32 bmdma 0xFEA8 irq 17
scsi2 : ata_piix
ata3.00: ATA-6, max UDMA/133, 156250000 sectors: LBA48
ata3.00: ata3: dev 0 multi count 8
ata3.01: ATA-6, max UDMA/133, 156301488 sectors: LBA48
ata3.01: ata3: dev 1 multi count 8
ata3.00: configured for UDMA/133
ata3.01: configured for UDMA/133
scsi3 : ata_piix
ata4: port is slow to respond, please be patient (Status 0xff)
ata4: port failed to respond (30 secs, Status 0xff)
ata4: SRST failed (status 0xFF)
ata4: SRST failed (err_mask=0x100)
ata4: softreset failed, retrying in 5 secs
ata4: SRST failed (status 0xFF)
ata4: SRST failed (err_mask=0x100)
ata4: softreset failed, retrying in 5 secs
ata4: SRST failed (status 0xFF)
ata4: SRST failed (err_mask=0x100)
ata4: reset failed, giving up
scsi 2:0:0:0: Direct-Access ATA ST380013AS 8.05 PQ: 0 ANSI: 5
SCSI device sda: 156250000 512-byte hdwr sectors (80000 MB)
sda: Write Protect is off
sda: Mode Sense: 00 3a 00 00
SCSI device sda: drive cache: write back
SCSI device sda: 156250000 512-byte hdwr sectors (80000 MB)
sda: Write Protect is off
sda: Mode Sense: 00 3a 00 00
SCSI device sda: drive cache: write back
sda: sda1 sda2 sda3 sda4 < sda5 sda6 sda7 >
sd 2:0:0:0: Attached scsi disk sda
scsi 2:0:1:0: Direct-Access ATA ST380013AS 3.18 PQ: 0 ANSI: 5
SCSI device sdb: 156301488 512-byte hdwr sectors (80026 MB)
sdb: Write Protect is off
sdb: Mode Sense: 00 3a 00 00
SCSI device sdb: drive cache: write back
SCSI device sdb: 156301488 512-byte hdwr sectors (80026 MB)
sdb: Write Protect is off
sdb: Mode Sense: 00 3a 00 00
SCSI device sdb: drive cache: write back
sdb: sdb1 sdb2 sdb3
sd 2:0:1:0: Attached scsi disk sdb
The relevant sections from my kernel configuration:
# CONFIG_BLK_DEV_IDE_SATA is not set
# CONFIG_SATA_AHCI is not set
# CONFIG_SATA_SVW is not set
# CONFIG_SATA_MV is not set
# CONFIG_SATA_NV is not set
# CONFIG_SATA_QSTOR is not set
# CONFIG_SATA_PROMISE is not set
# CONFIG_SATA_SX4 is not set
# CONFIG_SATA_SIL is not set
# CONFIG_SATA_SIL24 is not set
# CONFIG_SATA_SIS is not set
# CONFIG_SATA_ULI is not set
# CONFIG_SATA_VIA is not set
# CONFIG_SATA_VITESSE is not set
CONFIG_SATA_INTEL_COMBINED=y
CONFIG_IDE=y
CONFIG_BLK_DEV_IDE=y
# CONFIG_BLK_DEV_IDE_SATA is not set
# CONFIG_BLK_DEV_HD_IDE is not set
CONFIG_BLK_DEV_IDEDISK=y
# CONFIG_IDEDISK_MULTI_MODE is not set
CONFIG_BLK_DEV_IDECD=y
# CONFIG_BLK_DEV_IDETAPE is not set
# CONFIG_BLK_DEV_IDEFLOPPY is not set
# CONFIG_BLK_DEV_IDESCSI is not set
# CONFIG_IDE_TASK_IOCTL is not set
CONFIG_IDE_GENERIC=y
# CONFIG_BLK_DEV_IDEPNP is not set
# CONFIG_BLK_DEV_IDEPCI is not set
# CONFIG_IDE_ARM is not set
# CONFIG_IDE_CHIPSETS is not set
# CONFIG_BLK_DEV_IDEDMA is not set
# CONFIG_IDEDMA_AUTO is not set
# CONFIG_CD_NO_IDESCSI is not set
Thanks,
Kovid.
--
_____________________________________
Kovid Goyal MC 452-48
California Institute of Technology
1200 E California Blvd
Pasadena, CA 91125
home : +01 626 390 8699
office: +01 626 395 6595 (449 Lauritsen)
email : kovid@theory.caltech.edu
web : http://www.kovidgoyal.net
_____________________________________
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [2.6.18,19] SATA boot problems (ICH6/ICH6W)
2006-12-11 18:03 [2.6.18,19] SATA boot problems (ICH6/ICH6W) Kovid Goyal
@ 2006-12-20 0:44 ` Tejun Heo
2006-12-20 2:00 ` Kovid Goyal
2006-12-20 3:29 ` Gary Hade
2007-01-23 21:49 ` danieljzhang
1 sibling, 2 replies; 29+ messages in thread
From: Tejun Heo @ 2006-12-20 0:44 UTC (permalink / raw)
To: Kovid Goyal; +Cc: linux-ide
Kovid Goyal wrote:
> Hi,
>
> I'm having problems booting from a SATA disk with 2.6.19.
> Grub loads fine, but when the kernel boots, it *sometimes* ends up with
> VFS: Cannot open root device "sda5" or unknown-block(0,0).
>
> This issue appears to be identical with the one reported in
> http://lkml.org/lkml/2006/10/19/327 except that 2.6.19 did not fix it for me.
> In fact, it made it worse. The timeouts reported in that thread are still
> present and with 2.6.19 the fraction of unsuccessful boots has increased from
> 30% to 80%. I often have to reboot my machine 3 times in a row before it
> succeeds. I have tried both vanilla 2.6.19 and gentoo-sources-2.6.19-r1
>
> To summarize: My machine stalls for nearly a minute on all boots and the root
> filesystem fails to mount on 80% of the boots. This problem first manifested
> itself when I switched to 2.6.18 and has become worse with 2.6.19.
>
> The controller:
> 00:1f.2 IDE interface: Intel Corporation 82801FB/FW (ICH6/ICH6W) SATA
> Controller (rev 03)
>
> I don't know how to get the logs from an unsuccessful boot. Here is
> an extract from the logs of a successful boot (in which the timeout is
> present):
Does goving back to 2.6.17 fix the problem? Also, please report the
result of 'smartctl -d ata -a /dev/sdX' where sdX is the problematic drive.
--
tejun
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [2.6.18,19] SATA boot problems (ICH6/ICH6W)
2006-12-20 0:44 ` Tejun Heo
@ 2006-12-20 2:00 ` Kovid Goyal
2006-12-20 2:13 ` Tejun Heo
2006-12-20 3:29 ` Gary Hade
1 sibling, 1 reply; 29+ messages in thread
From: Kovid Goyal @ 2006-12-20 2:00 UTC (permalink / raw)
To: Tejun Heo, linux-ide
Yes going back to 2.6.17 does fix it. Also windows and various LiveCDs boot
without problems.
Here is the output
smartctl -d ata -a /dev/sda
smartctl version 5.36 [i686-pc-linux-gnu] Copyright (C) 2002-6 Bruce Allen
Home page is http://smartmontools.sourceforge.net/
=== START OF INFORMATION SECTION ===
Model Family: Seagate Barracuda 7200.7 and 7200.7 Plus family
Device Model: ST380013AS
Serial Number: 5JVFSQQW
Firmware Version: 8.05
User Capacity: 80,000,000,000 bytes
Device is: In smartctl database [for details use: -P show]
ATA Version is: 6
ATA Standard is: ATA/ATAPI-6 T13 1410D revision 2
Local Time is: Tue Dec 19 17:56:07 2006 PST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
General SMART Values:
Offline data collection status: (0x82) Offline data collection activity
was completed without error.
Auto Offline Data Collection: Enabled.
Self-test execution status: ( 0) The previous self-test routine
completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: ( 430) seconds.
Offline data collection
capabilities: (0x5b) SMART execute Offline immediate.
Auto Offline data collection on/off
support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
No Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
No General Purpose Logging support.
Short self-test routine
recommended polling time: ( 1) minutes.
Extended self-test routine
recommended polling time: ( 58) minutes.
SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED
WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x000f 060 057 006 Pre-fail
Always - 38101080
3 Spin_Up_Time 0x0003 098 098 000 Pre-fail
Always - 0
4 Start_Stop_Count 0x0032 100 100 020 Old_age
Always - 8
5 Reallocated_Sector_Ct 0x0033 100 100 036 Pre-fail
Always - 0
7 Seek_Error_Rate 0x000f 086 060 030 Pre-fail
Always - 424904923
9 Power_On_Hours 0x0032 087 087 000 Old_age
Always - 11741
10 Spin_Retry_Count 0x0013 100 100 097 Pre-fail
Always - 0
12 Power_Cycle_Count 0x0032 100 100 020 Old_age
Always - 311
194 Temperature_Celsius 0x0022 060 072 000 Old_age
Always - 60
195 Hardware_ECC_Recovered 0x001a 060 057 000 Old_age
Always - 38101080
197 Current_Pending_Sector 0x0012 100 100 000 Old_age
Always - 0
198 Offline_Uncorrectable 0x0010 100 100 000 Old_age
Offline - 0
199 UDMA_CRC_Error_Count 0x003e 200 200 000 Old_age
Always - 0
200 Multi_Zone_Error_Rate 0x0000 100 253 000 Old_age
Offline - 0
202 TA_Increase_Count 0x0032 100 253 000 Old_age
Always - 0
SMART Error Log Version: 1
No Errors Logged
SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours)
LBA_of_first_error
# 1 Short offline Completed without error 00% 11726 -
# 2 Short offline Completed without error 00% 11702 -
# 3 Short offline Completed without error 00% 11679 -
# 4 Extended offline Completed without error 00% 11658 -
# 5 Short offline Completed without error 00% 11655 -
# 6 Short offline Completed without error 00% 11632 -
# 7 Short offline Completed without error 00% 11608 -
# 8 Short offline Completed without error 00% 11585 -
# 9 Short offline Completed without error 00% 11562 -
#10 Short offline Completed without error 00% 11538 -
#11 Extended offline Completed without error 00% 11500 -
#12 Short offline Completed without error 00% 11498 -
#13 Short offline Completed without error 00% 11474 -
#14 Short offline Completed without error 00% 11452 -
#15 Short offline Completed without error 00% 11429 -
#16 Short offline Completed without error 00% 11405 -
#17 Short offline Completed without error 00% 11381 -
#18 Short offline Completed without error 00% 11358 -
#19 Extended offline Completed without error 00% 11338 -
#20 Short offline Completed without error 00% 11334 -
#21 Short offline Completed without error 00% 11311 -
SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
Thanks,
Kovid.
On Tuesday 19 December 2006 16:44, you wrote:
> Does goving back to 2.6.17 fix the problem? Also, please report the
> result of 'smartctl -d ata -a /dev/sdX' where sdX is the problematic drive.
--
_____________________________________
Kovid Goyal MC 452-48
California Institute of Technology
1200 E California Blvd
Pasadena, CA 91125
home : +01 626 390 8699
office: +01 626 395 6595 (449 Lauritsen)
email : kovid@theory.caltech.edu
web : http://www.kovidgoyal.net
_____________________________________
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [2.6.18,19] SATA boot problems (ICH6/ICH6W)
2006-12-20 2:00 ` Kovid Goyal
@ 2006-12-20 2:13 ` Tejun Heo
2006-12-20 4:56 ` Kovid Goyal
2007-01-11 23:32 ` Kovid Goyal
0 siblings, 2 replies; 29+ messages in thread
From: Tejun Heo @ 2006-12-20 2:13 UTC (permalink / raw)
To: Kovid Goyal; +Cc: linux-ide
Kovid Goyal wrote:
> Yes going back to 2.6.17 does fix it. Also windows and various LiveCDs boot
> without problems.
>
> Here is the output
> smartctl -d ata -a /dev/sda
> smartctl version 5.36 [i686-pc-linux-gnu] Copyright (C) 2002-6 Bruce Allen
> Home page is http://smartmontools.sourceforge.net/
Thanks, I couldn't find anything indicating hardware problem from the
smartctl output. Ergh... The 0xff timing out is fixed in the devel
tree. The problem is why sda sometimes doesn't get detected. Can you
post the result of failing dmesg? If you've got a second computer,
doing netconsole (Documentation/networking/netconsole.txt) is the best
way. If not, just take a picture of the screen while ata1 detection is
in progress.
--
tejun
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [2.6.18,19] SATA boot problems (ICH6/ICH6W)
2006-12-20 0:44 ` Tejun Heo
2006-12-20 2:00 ` Kovid Goyal
@ 2006-12-20 3:29 ` Gary Hade
2006-12-20 3:53 ` Tejun Heo
1 sibling, 1 reply; 29+ messages in thread
From: Gary Hade @ 2006-12-20 3:29 UTC (permalink / raw)
To: Tejun Heo; +Cc: Kovid Goyal, linux-ide, lcm
On Wed, Dec 20, 2006 at 09:44:45AM +0900, Tejun Heo wrote:
> Kovid Goyal wrote:
> > Hi,
> >
> > I'm having problems booting from a SATA disk with 2.6.19.
> > Grub loads fine, but when the kernel boots, it *sometimes* ends up with
> > VFS: Cannot open root device "sda5" or unknown-block(0,0).
> >
> > This issue appears to be identical with the one reported in
> > http://lkml.org/lkml/2006/10/19/327 except that 2.6.19 did not fix it for me.
> > In fact, it made it worse. The timeouts reported in that thread are still
> > present and with 2.6.19 the fraction of unsuccessful boots has increased from
> > 30% to 80%. I often have to reboot my machine 3 times in a row before it
> > succeeds. I have tried both vanilla 2.6.19 and gentoo-sources-2.6.19-r1
> >
> > To summarize: My machine stalls for nearly a minute on all boots and the root
> > filesystem fails to mount on 80% of the boots. This problem first manifested
> > itself when I switched to 2.6.18 and has become worse with 2.6.19.
> >
> > The controller:
> > 00:1f.2 IDE interface: Intel Corporation 82801FB/FW (ICH6/ICH6W) SATA
> > Controller (rev 03)
> >
> > I don't know how to get the logs from an unsuccessful boot. Here is
> > an extract from the logs of a successful boot (in which the timeout is
> > present):
>
> Does goving back to 2.6.17 fix the problem? Also, please report the
> result of 'smartctl -d ata -a /dev/sdX' where sdX is the problematic drive.
We have been seeing what might be the same problem on an
IBM Intellistation M30 Pro which contains the same SATA controller.
I have been debugging and trying to better understand the problem
before posting my findings and possible fix but after noticing this
message today and then seeing the "SRST failed (status 0xFF)" messages
included in Kovid's original message I think I should share what I
have found so far.
In our case the SRST failures are associated with a Quantum GoVault
removable hard drive device. We have also noticed that failure is
dependent on cable placement e.g. after encoutering the failure a
reversal of the ports to which the hard drive and GoVault cables
are connected will cause the problem to disappear.
The above SRST failure message was from this code in libata-core.c.
/* Before we perform post reset processing we want to see if
* the bus shows 0xFF because the odd clown forgets the D7
* pulldown resistor.
*/
if (ata_check_status(ap) == 0xFF) {
ata_port_printk(ap, KERN_ERR, "SRST failed (status 0xFF)\n");
return AC_ERR_OTHER;
}
I noticed that Tejun recently provided a "libata: handle 0xff status
properly" patch that is now in mainline that improves this code
re: http://marc.theaimsgroup.com/?l=linux-ide&m=116038642105802&w=2
but I found that the check still failed but more silently and with no
retries.
I decided to try increasing the delay that preceeds the above
check [ msleep(150); ] and found that a change from 150ms to
1000ms caused the problem to disappear.
I then replaced the msleep(150); with:
{
int i, ms = 5;
msleep(ms);
ata_port_printk(ap, KERN_INFO, "status @ %d ms: 0x%x\n",
ms, ata_check_status(ap));
for (i = 1; i <= 20; i++) {
ms += 50;
msleep(50);
ata_port_printk(ap, KERN_INFO, "status @ %d ms: 0x%x\n",
ms, ata_check_status(ap));
}
}
Output for two cable placement configurations (0xFF check failure
and 0xFF check success) are included below. Note that there are
cable placement configurations for both the hard drive and
GoVault where the initial status is 0xff. i.e. both transition
from 0xff to 0x7f when BSY bit is cleared but it is taking MUCH
longer for the GoVault (600-700ms for GoVault and <5ms for
hard drive). It does not appear that the 0xff starting status
is device specific.
So, it appears that we have a situation with this SATA controller
where a 0xFF status is not an accurate indication that there is
no device.
Although the 150ms to 1000ms delay increase works for the GoVault
device I am not sure if it is the best long term fix for the problem.
Gary
--
Gary Hade
IBM Linux Technology Center
503-578-4503 IBM T/L: 775-4503
garyhade@us.ibm.com
http://www.ibm.com/linux/ltc
================================================
Falied 0xFF check cable placement configuration
================================================
libata version 2.00 loaded.
ata_piix 0000:00:1f.2: version 2.00
ata_piix 0000:00:1f.2: MAP [ P0 P2 P1 P3 ]
ACPI: PCI Interrupt 0000:00:1f.2[B] -> GSI 19 (level, low) -> IRQ 225
PCI: Setting latency timer of device 0000:00:1f.2 to 64
ata1: SATA max UDMA/133 cmd 0x20C0 ctl 0x20BA bmdma 0x2090 irq 225
ata2: SATA max UDMA/133 cmd 0x20B0 ctl 0x20A6 bmdma 0x2098 irq 225
scsi0 : ata_piix
ata1: status @ 5 ms: 0xff
ata1: status @ 55 ms: 0xff
ata1: status @ 105 ms: 0xff
ata1: status @ 155 ms: 0xff
ata1: status @ 205 ms: 0xff
ata1: status @ 255 ms: 0xff
ata1: status @ 305 ms: 0xff
ata1: status @ 355 ms: 0xff
ata1: status @ 405 ms: 0xff
ata1: status @ 455 ms: 0xff
ata1: status @ 505 ms: 0xff
ata1: status @ 555 ms: 0xff
ata1: status @ 605 ms: 0xff
ata1: status @ 655 ms: 0x7f
ata1: status @ 705 ms: 0x7f
ata1: status @ 755 ms: 0x7f
ata1: status @ 805 ms: 0x7f
ata1: status @ 855 ms: 0x7f
ata1: status @ 905 ms: 0x7f
ata1: status @ 955 ms: 0x7f
ata1: status @ 1005 ms: 0x7f
ATA: abnormal status 0x7F on port 0x20C7
ATA: abnormal status 0x7F on port 0x20C7
ata1.01: ATAPI, max UDMA/66
ata1.01: configured for UDMA/66
scsi1 : ata_piix
ata2: status @ 5 ms: 0x50
ata2: status @ 55 ms: 0x50
ata2: status @ 105 ms: 0x50
ata2: status @ 155 ms: 0x50
ata2: status @ 205 ms: 0x50
ata2: status @ 255 ms: 0x50
ata2: status @ 305 ms: 0x50
ata2: status @ 355 ms: 0x50
ata2: status @ 405 ms: 0x50
ata2: status @ 455 ms: 0x50
ata2: status @ 505 ms: 0x50
ata2: status @ 555 ms: 0x50
ata2: status @ 605 ms: 0x50
ata2: status @ 655 ms: 0x50
ata2: status @ 705 ms: 0x50
ata2: status @ 755 ms: 0x50
ata2: status @ 805 ms: 0x50
ata2: status @ 855 ms: 0x50
ata2: status @ 905 ms: 0x50
ata2: status @ 955 ms: 0x50
ata2: status @ 1005 ms: 0x50
ata2.00: ATA-7, max UDMA/133, 156312576 sectors: LBA
ata2.00: ata2: dev 0 multi count 16
ata2.00: configured for UDMA/133
Vendor: IBM Model: GoVault Rev: 008F
Type: Direct-Access ANSI SCSI revision: 05
SCSI device sda: 156300464 512-byte hdwr sectors (80026 MB)
sda: Write Protect is off
sda: Mode Sense: 00 74 00 00
sda: cache data unavailable
sda: assuming drive cache: write through
SCSI device sda: 156300464 512-byte hdwr sectors (80026 MB)
sda: Write Protect is off
sda: Mode Sense: 00 74 00 00
sda: cache data unavailable
sda: assuming drive cache: write through
sda: sda1 sda2 sda3
sd 0:0:1:0: Attached scsi removable disk sda
Vendor: ATA Model: Maxtor 6Y080M0 Rev: YAR5
Type: Direct-Access ANSI SCSI revision: 05
SCSI device sdb: 156312576 512-byte hdwr sectors (80032 MB)
sdb: Write Protect is off
sdb: Mode Sense: 00 3a 00 00
SCSI device sdb: drive cache: write back
SCSI device sdb: 156312576 512-byte hdwr sectors (80032 MB)
sdb: Write Protect is off
sdb: Mode Sense: 00 3a 00 00
SCSI device sdb: drive cache: write back
sdb: sdb1 sdb2 sdb3 sdb4 < sdb5 sdb6 sdb7 sdb8 sdb9 sdb10 sdb11 sdb12 >
sd 1:0:0:0: Attached scsi disk sdb
==========================================================
===================================================
Successful 0xFF check cable placement configuration
===================================================
libata version 2.00 loaded.
ata_piix 0000:00:1f.2: version 2.00
ata_piix 0000:00:1f.2: MAP [ P0 P2 P1 P3 ]
ACPI: PCI Interrupt 0000:00:1f.2[B] -> GSI 19 (level, low) -> IRQ 225
PCI: Setting latency timer of device 0000:00:1f.2 to 64
ata1: SATA max UDMA/133 cmd 0x20C0 ctl 0x20BA bmdma 0x2090 irq 225
ata2: SATA max UDMA/133 cmd 0x20B0 ctl 0x20A6 bmdma 0x2098 irq 225
scsi0 : ata_piix
ata1: status @ 5 ms: 0x7f
ata1: status @ 55 ms: 0x7f
ata1: status @ 105 ms: 0x7f
ata1: status @ 155 ms: 0x7f
ata1: status @ 205 ms: 0x7f
ata1: status @ 255 ms: 0x7f
ata1: status @ 305 ms: 0x7f
ata1: status @ 355 ms: 0x7f
ata1: status @ 405 ms: 0x7f
ata1: status @ 455 ms: 0x7f
ata1: status @ 505 ms: 0x7f
ata1: status @ 555 ms: 0x7f
ata1: status @ 605 ms: 0x7f
ata1: status @ 655 ms: 0x7f
ata1: status @ 705 ms: 0x7f
ata1: status @ 755 ms: 0x7f
ata1: status @ 805 ms: 0x7f
ata1: status @ 855 ms: 0x7f
ata1: status @ 905 ms: 0x7f
ata1: status @ 955 ms: 0x7f
ata1: status @ 1005 ms: 0x7f
ATA: abnormal status 0x7F on port 0x20C7
ATA: abnormal status 0x7F on port 0x20C7
ata1.01: ATA-7, max UDMA/133, 156312576 sectors: LBA
ata1.01: ata1: dev 1 multi count 16
ata1.01: configured for UDMA/133
scsi1 : ata_piix
ata2: status @ 5 ms: 0xd0
ata2: status @ 55 ms: 0xd0
ata2: status @ 105 ms: 0xd0
ata2: status @ 155 ms: 0xd0
ata2: status @ 205 ms: 0xd0
ata2: status @ 255 ms: 0xd0
ata2: status @ 305 ms: 0xd0
ata2: status @ 355 ms: 0xd0
ata2: status @ 405 ms: 0xd0
ata2: status @ 455 ms: 0xd0
ata2: status @ 505 ms: 0xd0
ata2: status @ 555 ms: 0xd0
ata2: status @ 605 ms: 0xd0
ata2: status @ 655 ms: 0x0
ata2: status @ 705 ms: 0x0
ata2: status @ 755 ms: 0x0
ata2: status @ 805 ms: 0x0
ata2: status @ 855 ms: 0x0
ata2: status @ 905 ms: 0x0
ata2: status @ 955 ms: 0x0
ata2: status @ 1005 ms: 0x0
ata2.00: ATAPI, max UDMA/66
ata2.00: configured for UDMA/66
Vendor: ATA Model: Maxtor 6Y080M0 Rev: YAR5
Type: Direct-Access ANSI SCSI revision: 05
SCSI device sda: 156312576 512-byte hdwr sectors (80032 MB)
sda: Write Protect is off
sda: Mode Sense: 00 3a 00 00
SCSI device sda: drive cache: write back
SCSI device sda: 156312576 512-byte hdwr sectors (80032 MB)
sda: Write Protect is off
sda: Mode Sense: 00 3a 00 00
SCSI device sda: drive cache: write back
sda: sda1 sda2 sda3 sda4 < sda5 sda6 sda7 sda8 sda9 sda10 sda11 sda12 >
sd 0:0:1:0: Attached scsi disk sda
Vendor: IBM Model: GoVault Rev: 008F
Type: Direct-Access ANSI SCSI revision: 05
SCSI device sdb: 156300464 512-byte hdwr sectors (80026 MB)
sdb: Write Protect is off
sdb: Mode Sense: 00 74 00 00
sdb: cache data unavailable
sdb: assuming drive cache: write through
SCSI device sdb: 156300464 512-byte hdwr sectors (80026 MB)
sdb: Write Protect is off
sdb: Mode Sense: 00 74 00 00
sdb: cache data unavailable
sdb: assuming drive cache: write through
sdb: sdb1 sdb2 sdb3
sd 1:0:0:0: Attached scsi removable disk sdb
==========================================================
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [2.6.18,19] SATA boot problems (ICH6/ICH6W)
2006-12-20 3:29 ` Gary Hade
@ 2006-12-20 3:53 ` Tejun Heo
2006-12-20 4:30 ` Tejun Heo
2006-12-21 17:10 ` Gary Hade
0 siblings, 2 replies; 29+ messages in thread
From: Tejun Heo @ 2006-12-20 3:53 UTC (permalink / raw)
To: Gary Hade; +Cc: Kovid Goyal, linux-ide, lcm, Jeff Garzik
Howdy,
Gary Hade wrote:
> I noticed that Tejun recently provided a "libata: handle 0xff status
> properly" patch that is now in mainline that improves this code
> re: http://marc.theaimsgroup.com/?l=linux-ide&m=116038642105802&w=2
> but I found that the check still failed but more silently and with no
> retries.
>
> I decided to try increasing the delay that preceeds the above
> check [ msleep(150); ] and found that a change from 150ms to
> 1000ms caused the problem to disappear.
Aieeee, 150ms not enough for the device to send the first FIS after SRST?
> I then replaced the msleep(150); with:
> {
> int i, ms = 5;
> msleep(ms);
> ata_port_printk(ap, KERN_INFO, "status @ %d ms: 0x%x\n",
> ms, ata_check_status(ap));
> for (i = 1; i <= 20; i++) {
> ms += 50;
> msleep(50);
> ata_port_printk(ap, KERN_INFO, "status @ %d ms: 0x%x\n",
> ms, ata_check_status(ap));
> }
> }
>
> Output for two cable placement configurations (0xFF check failure
> and 0xFF check success) are included below. Note that there are
> cable placement configurations for both the hard drive and
> GoVault where the initial status is 0xff. i.e. both transition
> from 0xff to 0x7f when BSY bit is cleared but it is taking MUCH
> longer for the GoVault (600-700ms for GoVault and <5ms for
> hard drive). It does not appear that the 0xff starting status
> is device specific.
>
> So, it appears that we have a situation with this SATA controller
> where a 0xFF status is not an accurate indication that there is
> no device.
>
> Although the 150ms to 1000ms delay increase works for the GoVault
> device I am not sure if it is the best long term fix for the problem.
I would be surprised if Kovid's sda not detected case is caused by this.
For GoVault (that's SATAPI right?), yeah, maybe. For an ATA disk, no
way (hopefully).
Can you consult with quantum about it? If they verify your fix (ie,
GoVault sometimes take more than 150ms to transmit the first D2H Reg FIs
after SRST), I'll push similar patch upstream.
Hmm.. or do we have to wait !BSY here as old IDE did?
--
tejun
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [2.6.18,19] SATA boot problems (ICH6/ICH6W)
2006-12-20 3:53 ` Tejun Heo
@ 2006-12-20 4:30 ` Tejun Heo
2006-12-21 17:10 ` Gary Hade
1 sibling, 0 replies; 29+ messages in thread
From: Tejun Heo @ 2006-12-20 4:30 UTC (permalink / raw)
Cc: Gary Hade, Kovid Goyal, linux-ide, lcm, Jeff Garzik
Tejun Heo wrote:
> I would be surprised if Kovid's sda not detected case is caused by this.
> For GoVault (that's SATAPI right?), yeah, maybe. For an ATA disk, no
> way (hopefully).
>
> Can you consult with quantum about it? If they verify your fix (ie,
> GoVault sometimes take more than 150ms to transmit the first D2H Reg FIs
> after SRST), I'll push similar patch upstream.
>
> Hmm.. or do we have to wait !BSY here as old IDE did?
More thoughts.
* IIRC, ata_piix reports 0xff if SATA link is broken. 0x80 or something
similar if link established && waiting for the first FIS. Not sure
whether this always holds but it tends to work that way. So, 0xff
status means phy link broken, which shouldn't happen after SRST. Hmmm..
* Recently in ata_bus_softreset() and waiting functions, we've added the
assumption status==0xff indicates empty port. This is taken from the
IDE layer and proven to work for PATA devices. I'm not sure we can
safely use this to SATA devices tho. If PHY is broken or the first FIS
isn't received, arbitrary value (sometimes accidentally) chosen by the
controller is used which may as well be 0xff. But, then again, many
controllers do mean that there is no devices by 0xff status including
ata_piix w/ its now infamous phantom device problems.
I think we should honor 0xff iff SCR on/offline status is not available.
How does that sound?
--
tejun
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [2.6.18,19] SATA boot problems (ICH6/ICH6W)
2006-12-20 2:13 ` Tejun Heo
@ 2006-12-20 4:56 ` Kovid Goyal
2007-01-11 23:32 ` Kovid Goyal
1 sibling, 0 replies; 29+ messages in thread
From: Kovid Goyal @ 2006-12-20 4:56 UTC (permalink / raw)
To: Tejun Heo, linux-ide
Unfortunately I only have ssh access to the box for the next two weeks, but
after I return I will send the netconsole logs. Thanks for looking into this.
Kovid.
On Tuesday 19 December 2006 18:13, you wrote:
> Kovid Goyal wrote:
> > Yes going back to 2.6.17 does fix it. Also windows and various LiveCDs
> > boot without problems.
> >
> > Here is the output
> > smartctl -d ata -a /dev/sda
> > smartctl version 5.36 [i686-pc-linux-gnu] Copyright (C) 2002-6 Bruce
> > Allen Home page is http://smartmontools.sourceforge.net/
>
> Thanks, I couldn't find anything indicating hardware problem from the
> smartctl output. Ergh... The 0xff timing out is fixed in the devel
> tree. The problem is why sda sometimes doesn't get detected. Can you
> post the result of failing dmesg? If you've got a second computer,
> doing netconsole (Documentation/networking/netconsole.txt) is the best
> way. If not, just take a picture of the screen while ata1 detection is
> in progress.
--
_____________________________________
Kovid Goyal MC 452-48
California Institute of Technology
1200 E California Blvd
Pasadena, CA 91125
home : +01 626 390 8699
office: +01 626 395 6595 (449 Lauritsen)
email : kovid@theory.caltech.edu
web : http://www.kovidgoyal.net
_____________________________________
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [2.6.18,19] SATA boot problems (ICH6/ICH6W)
2006-12-20 3:53 ` Tejun Heo
2006-12-20 4:30 ` Tejun Heo
@ 2006-12-21 17:10 ` Gary Hade
2007-01-30 1:55 ` Gary Hade
1 sibling, 1 reply; 29+ messages in thread
From: Gary Hade @ 2006-12-21 17:10 UTC (permalink / raw)
To: Tejun Heo; +Cc: Gary Hade, Kovid Goyal, linux-ide, lcm, Jeff Garzik
On Wed, Dec 20, 2006 at 12:53:57PM +0900, Tejun Heo wrote:
> Howdy,
>
> Gary Hade wrote:
> > I noticed that Tejun recently provided a "libata: handle 0xff status
> > properly" patch that is now in mainline that improves this code
> > re: http://marc.theaimsgroup.com/?l=linux-ide&m=116038642105802&w=2
> > but I found that the check still failed but more silently and with no
> > retries.
> >
> > I decided to try increasing the delay that preceeds the above
> > check [ msleep(150); ] and found that a change from 150ms to
> > 1000ms caused the problem to disappear.
>
> Aieeee, 150ms not enough for the device to send the first FIS after SRST?
Yea, it appears so. :)
GoVault access via 'ahci' is also fails in some cable placement
configurations with:
kernel: scsi1 : ahci
kernel: ata2: softreset failed (1st FIS failed)
kernel: ata2: softreset failed, retrying in 5 secs
kernel: ata2: port is slow to respond, please be patient
kernel: ata2: port failed to respond (30 secs)
kernel: ata2: COMRESET failed (device not ready)
kernel: ata2: hardreset failed, retrying in 5 secs
kernel: ata2: port is slow to respond, please be patient
kernel: ata2: port failed to respond (30 secs)
kdump: kexec: failed to load kdump kernel
kernel: ata2: COMRESET failed (device not ready)
kernel: ata2: reset failed, giving up
This problem also disappears after reversing ports to which
the hard drive and GoVault cables are connected.
The following timeout increase appears to correct the 'ahci' problem:
--- ./linux-2.6.18.i386/drivers/scsi/ahci.c.orig 2006-12-19 09:07:58.000000000 -0800
+++ ./linux-2.6.18.i386/drivers/scsi/ahci.c 2006-12-19 13:30:29.000000000 -0800
@@ -788,7 +788,7 @@ static int ahci_softreset(struct ata_por
writel(1, port_mmio + PORT_CMD_ISSUE);
- tmp = ata_wait_register(port_mmio + PORT_CMD_ISSUE, 0x1, 0x1, 1, 500);
+ tmp = ata_wait_register(port_mmio + PORT_CMD_ISSUE, 0x1, 0x1, 1, 2500);
if (tmp & 0x1) {
rc = -EIO;
reason = "1st FIS failed";
1000ms, 1500ms, 1750ms, and 1900ms didn't work. 2000ms worked so
2500ms includes some extra to be safe. This experience seems to
be more representative of the 1 to 2 second time (with RDC present)
mentioned by Quantum (see below) than the 'ata_piix' 600-700ms
experience.
>
> > I then replaced the msleep(150); with:
> > {
> > int i, ms = 5;
> > msleep(ms);
> > ata_port_printk(ap, KERN_INFO, "status @ %d ms: 0x%x\n",
> > ms, ata_check_status(ap));
> > for (i = 1; i <= 20; i++) {
> > ms += 50;
> > msleep(50);
> > ata_port_printk(ap, KERN_INFO, "status @ %d ms: 0x%x\n",
> > ms, ata_check_status(ap));
> > }
> > }
> >
> > Output for two cable placement configurations (0xFF check failure
> > and 0xFF check success) are included below. Note that there are
> > cable placement configurations for both the hard drive and
> > GoVault where the initial status is 0xff. i.e. both transition
> > from 0xff to 0x7f when BSY bit is cleared but it is taking MUCH
> > longer for the GoVault (600-700ms for GoVault and <5ms for
> > hard drive). It does not appear that the 0xff starting status
> > is device specific.
> >
> > So, it appears that we have a situation with this SATA controller
> > where a 0xFF status is not an accurate indication that there is
> > no device.
> >
> > Although the 150ms to 1000ms delay increase works for the GoVault
> > device I am not sure if it is the best long term fix for the problem.
>
> I would be surprised if Kovid's sda not detected case is caused by this.
> For GoVault (that's SATAPI right?), yeah, maybe.
Yes, the GoVault is an ATAPI device.
> For an ATA disk, no way (hopefully).
Yes, probably true that Kovid got the same errors but for a
different reason.
>
> Can you consult with quantum about it?
I checked with Quantum about this and they said:
---
"We confirmed that if there's an RDC present when the soft reset is
received, then it can take between one and two seconds to complete the
reset. Issuing a SET FEATURES command to the RDC is the longest part of
it.
Even without an RDC, we've measured time on the order of 170
milliseconds. "
---
The RDC has been present for almost all of my testing. Here
are comparison traces with and without the RDC which definitely
confirms the RDC factor. It also confirms the order of 170ms
without RDC time that Quantum mentioned.
========
With RDC
========
kernel: ata1: status @ 5 ms: 0xff
kernel: ata1: status @ 55 ms: 0xff
kernel: ata1: status @ 105 ms: 0xff
kernel: ata1: status @ 155 ms: 0xff
kernel: ata1: status @ 205 ms: 0xff
kernel: ata1: status @ 255 ms: 0xff
kernel: ata1: status @ 305 ms: 0xff
kernel: ata1: status @ 355 ms: 0xff
kernel: ata1: status @ 405 ms: 0xff
kernel: ata1: status @ 455 ms: 0xff
kernel: ata1: status @ 505 ms: 0xff
kernel: ata1: status @ 555 ms: 0xff
kernel: ata1: status @ 605 ms: 0xff
kernel: ata1: status @ 655 ms: 0x7f
kernel: ata1: status @ 705 ms: 0x7f
kernel: ata1: status @ 755 ms: 0x7f
kernel: ata1: status @ 805 ms: 0x7f
kernel: ata1: status @ 855 ms: 0x7f
kernel: ata1: status @ 905 ms: 0x7f
kernel: ata1: status @ 955 ms: 0x7f
kernel: ata1: status @ 1005 ms: 0x7f
===========
Without RDC
===========
kernel: ata1: status @ 5 ms: 0xff
kernel: ata1: status @ 55 ms: 0xff
kernel: ata1: status @ 105 ms: 0xff
kernel: ata1: status @ 155 ms: 0xff
kernel: ata1: status @ 205 ms: 0x7f
kernel: ata1: status @ 255 ms: 0x7f
kernel: ata1: status @ 305 ms: 0x7f
kernel: ata1: status @ 355 ms: 0x7f
kernel: ata1: status @ 405 ms: 0x7f
kernel: ata1: status @ 455 ms: 0x7f
kernel: ata1: status @ 505 ms: 0x7f
kernel: ata1: status @ 555 ms: 0x7f
kernel: ata1: status @ 605 ms: 0x7f
kernel: ata1: status @ 655 ms: 0x7f
kernel: ata1: status @ 705 ms: 0x7f
kernel: ata1: status @ 755 ms: 0x7f
kernel: ata1: status @ 805 ms: 0x7f
kernel: ata1: status @ 855 ms: 0x7f
kernel: ata1: status @ 905 ms: 0x7f
kernel: ata1: status @ 955 ms: 0x7f
kernel: ata1: status @ 1005 ms: 0x7f
> If they verify your fix (ie,
> GoVault sometimes take more than 150ms to transmit the first D2H Reg FIs
> after SRST), I'll push similar patch upstream.
Thanks. If you think that changes to increase the delays are
the way to go (at least until we can find a better solution)
I can provide patches.
>
> Hmm.. or do we have to wait !BSY here as old IDE did?
Not sure. I'm fairly new to this stuff.
Thanks!
Gary
--
Gary Hade
IBM Linux Technology Center
503-578-4503 IBM T/L: 775-4503
garyhade@us.ibm.com
http://www.ibm.com/linux/ltc
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [2.6.18,19] SATA boot problems (ICH6/ICH6W)
2006-12-20 2:13 ` Tejun Heo
2006-12-20 4:56 ` Kovid Goyal
@ 2007-01-11 23:32 ` Kovid Goyal
2007-01-13 2:19 ` Tejun Heo
1 sibling, 1 reply; 29+ messages in thread
From: Kovid Goyal @ 2007-01-11 23:32 UTC (permalink / raw)
To: linux-ide, Tejun Heo
[-- Attachment #1: Type: text/plain, Size: 1261 bytes --]
Hi,
I was finally able to get the netconsole logs for a failing boot. Attached as
log-panic. Sorry for the delay.
Thanks,
Kovid.
On Tuesday 19 December 2006 18:13, you wrote:
> Kovid Goyal wrote:
> > Yes going back to 2.6.17 does fix it. Also windows and various LiveCDs
> > boot without problems.
> >
> > Here is the output
> > smartctl -d ata -a /dev/sda
> > smartctl version 5.36 [i686-pc-linux-gnu] Copyright (C) 2002-6 Bruce
> > Allen Home page is http://smartmontools.sourceforge.net/
>
> Thanks, I couldn't find anything indicating hardware problem from the
> smartctl output. Ergh... The 0xff timing out is fixed in the devel
> tree. The problem is why sda sometimes doesn't get detected. Can you
> post the result of failing dmesg? If you've got a second computer,
> doing netconsole (Documentation/networking/netconsole.txt) is the best
> way. If not, just take a picture of the screen while ata1 detection is
> in progress.
--
_____________________________________
Kovid Goyal MC 452-48
California Institute of Technology
1200 E California Blvd
Pasadena, CA 91125
home : +01 626 390 8699
office: +01 626 395 6595 (449 Lauritsen)
email : kovid@theory.caltech.edu
web : http://www.kovidgoyal.net
_____________________________________
[-- Attachment #2: log-panic --]
[-- Type: text/plain, Size: 12888 bytes --]
Linux version 2.6.19-gentoo-r3 (root@giskard) (gcc version 4.1.1 (Gentoo 4.1.1-r3)) #1 SMP Sat Jan 6 09:35:46 PST 2007
BIOS-provided physical RAM map:
BIOS-e820: 0000000000000000 - 00000000000a0000 (usable)
BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved)
BIOS-e820: 0000000000100000 - 000000003fe88c00 (usable)
BIOS-e820: 000000003fe88c00 - 000000003fe8ac00 (ACPI NVS)
BIOS-e820: 000000003fe8ac00 - 000000003fe8cc00 (ACPI data)
BIOS-e820: 000000003fe8cc00 - 0000000040000000 (reserved)
BIOS-e820: 00000000e0000000 - 00000000f0000000 (reserved)
BIOS-e820: 00000000fec00000 - 00000000fed00400 (reserved)
BIOS-e820: 00000000fed20000 - 00000000feda0000 (reserved)
BIOS-e820: 00000000fee00000 - 00000000fef00000 (reserved)
BIOS-e820: 00000000ffb00000 - 0000000100000000 (reserved)
126MB HIGHMEM available.
896MB LOWMEM available.
found SMP MP-table at 000fe710
Zone PFN ranges:
DMA 0 -> 4096
Normal 4096 -> 229376
HighMem 229376 -> 261768
early_node_map[1] active PFN ranges
0: 0 -> 261768
DMI 2.3 present.
ACPI: PM-Timer IO Port: 0x808
ACPI: LAPIC (acpi_id[0x01] lapic_id[0x00] enabled)
Processor #0 15:3 APIC version 20
ACPI: LAPIC (acpi_id[0x02] lapic_id[0x01] enabled)
Processor #1 15:3 APIC version 20
ACPI: LAPIC (acpi_id[0x03] lapic_id[0x06] disabled)
ACPI: LAPIC (acpi_id[0x04] lapic_id[0x02] disabled)
ACPI: LAPIC (acpi_id[0x05] lapic_id[0x04] disabled)
ACPI: LAPIC (acpi_id[0x06] lapic_id[0x07] disabled)
ACPI: LAPIC (acpi_id[0x07] lapic_id[0x03] disabled)
ACPI: LAPIC (acpi_id[0x08] lapic_id[0x05] disabled)
ACPI: LAPIC_NMI (acpi_id[0xff] high level lint[0x1])
ACPI: IOAPIC (id[0x08] address[0xfec00000] gsi_base[0])
IOAPIC[0]: apic_id 8, version 32, address 0xfec00000, GSI 0-23
ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
Enabling APIC mode: Flat. Using 1 I/O APICs
ACPI: HPET id: 0x8086a201 base: 0xfed00000
Using ACPI (MADT) for SMP configuration information
Allocating PCI resources starting at 50000000 (gap: 40000000:a0000000)
Detected 3192.174 MHz processor.
Built 1 zonelists. Total pages: 259723
Kernel command line: root=/dev/sda5 console=tty1 netconsole=4444@192.168.1.100/eth0,6666@192.168.1.103/00:09:5B:0E:58:EE
netconsole: local port 4444
netconsole: local IP 192.168.1.100
netconsole: interface eth0
netconsole: remote port 6666
netconsole: remote IP 192.168.1.103
netconsole: remote ethernet address 00:09:5b:0e:58:ee
Enabling fast FPU save and restore... done.
Enabling unmasked SIMD FPU exception support... done.
Initializing CPU#0
PID hash table entries: 4096 (order: 12, 16384 bytes)
Console: colour VGA+ 80x25
Dentry cache hash table entries: 131072 (order: 7, 524288 bytes)
Inode-cache hash table entries: 65536 (order: 6, 262144 bytes)
Memory: 1033260k/1047072k available (2672k kernel code, 13256k reserved, 872k data, 228k init, 129568k highmem)
virtual kernel memory layout:
fixmap : 0xfff4f000 - 0xfffff000 ( 704 kB)
pkmap : 0xff800000 - 0xffc00000 (4096 kB)
vmalloc : 0xf8800000 - 0xff7fe000 ( 111 MB)
lowmem : 0xc0000000 - 0xf8000000 ( 896 MB)
.init : 0xc047f000 - 0xc04b8000 ( 228 kB)
.data : 0xc039c162 - 0xc047618c ( 872 kB)
.text : 0xc0100000 - 0xc039c162 (2672 kB)
Checking if this processor honours the WP bit even in supervisor mode... Ok.
hpet0: at MMIO 0xfed00000, IRQs 2, 8, 0
hpet0: 3 64-bit timers, 14318180 Hz
Using HPET for base-timer
Calibrating delay using timer specific routine.. 6388.08 BogoMIPS (lpj=3194042)
Mount-cache hash table entries: 512
monitor/mwait feature present.
using mwait in idle threads.
CPU: Trace cache: 12K uops, L1 D cache: 16K
CPU: L2 cache: 1024K
CPU: Physical Processor ID: 0
Intel machine check architecture supported.
Intel machine check reporting enabled on CPU#0.
CPU0: Intel P4/Xeon Extended MCE MSRs (12) available
CPU0: Thermal monitoring enabled
Checking 'hlt' instruction... OK.
Freeing SMP alternatives: 20k freed
ACPI: Core revision 20060707
CPU0: Intel(R) Pentium(R) 4 CPU 3.20GHz stepping 04
Booting processor 1/1 eip 2000
Initializing CPU#1
Calibrating delay using timer specific routine.. 6383.99 BogoMIPS (lpj=3191999)
monitor/mwait feature present.
CPU: Trace cache: 12K uops, L1 D cache: 16K
CPU: L2 cache: 1024K
CPU: Physical Processor ID: 0
Intel machine check architecture supported.
Intel machine check reporting enabled on CPU#1.
CPU1: Intel P4/Xeon Extended MCE MSRs (12) available
CPU1: Thermal monitoring enabled
CPU1: Intel(R) Pentium(R) 4 CPU 3.20GHz stepping 04
Total of 2 processors activated (12772.08 BogoMIPS).
ENABLING IO-APIC IRQs
..TIMER: vector=0x31 apic1=0 pin1=2 apic2=-1 pin2=-1
checking TSC synchronization across 2 CPUs: passed.
Brought up 2 CPUs
migration_cost=296
NET: Registered protocol family 16
ACPI: bus type pci registered
PCI: Using MMCONFIG
Setting up standard PCI resources
ACPI: Interpreter enabled
ACPI: Using IOAPIC for interrupt routing
ACPI: PCI Root Bridge [PCI0] (0000:00)
ACPI: Assume root bridge [\_SB_.PCI0] bus is 0
PCI quirk: region 0800-087f claimed by ICH6 ACPI/GPIO/TCO
PCI quirk: region 0880-08bf claimed by ICH6 GPIO
PCI: Ignoring BAR0-3 of IDE controller 0000:00:1f.1
PCI: Firmware left 0000:03:08.0 e100 interrupts enabled, disabling
PCI: Transparent bridge - 0000:00:1e.0
ACPI: PCI Interrupt Link [LNKA] (IRQs 3 4 5 6 7 9 10 *11 12 15)
ACPI: PCI Interrupt Link [LNKB] (IRQs 3 4 *5 6 7 9 10 11 12 15)
ACPI: PCI Interrupt Link [LNKC] (IRQs 3 4 *5 6 7 9 10 11 12 15)
ACPI: PCI Interrupt Link [LNKD] (IRQs 3 4 5 6 7 9 10 11 12 15) *0, disabled.
ACPI: PCI Interrupt Link [LNKE] (IRQs 3 4 5 6 7 *9 10 11 12 15)
ACPI: PCI Interrupt Link [LNKF] (IRQs 3 4 5 6 7 9 *10 11 12 15)
ACPI: PCI Interrupt Link [LNKG] (IRQs 3 4 5 6 7 *9 10 11 12 15)
ACPI: PCI Interrupt Link [LNKH] (IRQs *3 4 5 6 7 9 10 11 12 15)
Linux Plug and Play Support v0.97 (c) Adam Belay
pnp: PnP ACPI init
pnp: PnP ACPI: found 13 devices
SCSI subsystem initialized
usbcore: registered new interface driver usbfs
usbcore: registered new interface driver hub
usbcore: registered new device driver usb
PCI: Using ACPI for IRQ routing
PCI: If a device doesn't work, try "pci=routeirq". If it helps, post a report
pnp: 00:01: ioport range 0x800-0x85f could not be reserved
pnp: 00:01: ioport range 0xc00-0xc7f has been reserved
pnp: 00:01: ioport range 0x860-0x8ff could not be reserved
pnp: 00:0b: ioport range 0x100-0x1fe has been reserved
pnp: 00:0b: ioport range 0x200-0x277 has been reserved
pnp: 00:0b: ioport range 0x280-0x2e7 has been reserved
pnp: 00:0b: ioport range 0x2e8-0x2ef has been reserved
pnp: 00:0b: ioport range 0x2f0-0x2f7 has been reserved
pnp: 00:0b: ioport range 0x2f8-0x2ff has been reserved
pnp: 00:0b: ioport range 0x300-0x377 has been reserved
pnp: 00:0b: ioport range 0x380-0x3bb has been reserved
PCI: Bridge: 0000:00:01.0
IO window: d000-dfff
MEM window: dfe00000-dfffffff
PREFETCH window: c0000000-cfffffff
PCI: Bridge: 0000:00:1c.0
IO window: disabled.
MEM window: dfd00000-dfdfffff
PREFETCH window: disabled.
PCI: Bridge: 0000:00:1e.0
IO window: c000-cfff
MEM window: dfc00000-dfcfffff
PREFETCH window: disabled.
ACPI: PCI Interrupt 0000:00:01.0[A] -> GSI 16 (level, low) -> IRQ 16
ACPI: PCI Interrupt 0000:00:1c.0[A] -> GSI 16 (level, low) -> IRQ 16
NET: Registered protocol family 2
IP route cache hash table entries: 32768 (order: 5, 131072 bytes)
hda: PHILIPS DVD+/-RW DVD8631, ATAPI CD/DVD-ROM drive
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
hda: ATAPI 40X DVD-ROM DVD-R CD-R/RW drive, 2048kB Cache
Uniform CD-ROM driver Revision: 3.20
ACPI: PCI Interrupt 0000:00:1f.1[A] -> GSI 16 (level, low) -> IRQ 16
ata: 0x1F0 IDE port busy
ata: conflict with ide0
ata1: DUMMY
ata2: PATA max UDMA/100 cmd 0x170 ctl 0x376 bmdma 0xFFA8 irq 15
scsi0 : ata_piix
scsi1 : ata_piix
ata2: port disabled. ignoring.
ata_piix 0000:00:1f.2: MAP [ P0 P2 P1 P3 ]
ACPI: PCI Interrupt 0000:00:1f.2[C] -> GSI 20 (level, low) -> IRQ 17
ata3: SATA max UDMA/133 cmd 0xFE00 ctl 0xFE12 bmdma 0xFEA0 irq 17
ata4: SATA max UDMA/133 cmd 0xFE20 ctl 0xFE32 bmdma 0xFEA8 irq 17
scsi2 : ata_piix
scsi3 : ata_piix
ACPI: PCI Interrupt 0000:00:1d.7[A] -> GSI 21 (level, low) -> IRQ 18
ehci_hcd 0000:00:1d.7: EHCI Host Controller
ehci_hcd 0000:00:1d.7: new USB bus registered, assigned bus number 1
ehci_hcd 0000:00:1d.7: debug port 1
ehci_hcd 0000:00:1d.7: irq 18, io mem 0xffa80800
ehci_hcd 0000:00:1d.7: USB 2.0 started, EHCI 1.00, driver 10 Dec 2004
usb usb1: configuration #1 chosen from 1 choice
hub 1-0:1.0: USB hub found
hub 1-0:1.0: 8 ports detected
USB Universal Host Controller Interface driver v3.0
ACPI: PCI Interrupt 0000:00:1d.0[A] -> GSI 21 (level, low) -> IRQ 18
uhci_hcd 0000:00:1d.0: UHCI Host Controller
uhci_hcd 0000:00:1d.0: new USB bus registered, assigned bus number 2
uhci_hcd 0000:00:1d.0: irq 18, io base 0x0000ff80
usb usb2: configuration #1 chosen from 1 choice
hub 2-0:1.0: USB hub found
hub 2-0:1.0: 2 ports detected
ACPI: PCI Interrupt 0000:00:1d.1[B] -> GSI 22 (level, low) -> IRQ 19
uhci_hcd 0000:00:1d.1: UHCI Host Controller
uhci_hcd 0000:00:1d.1: new USB bus registered, assigned bus number 3
uhci_hcd 0000:00:1d.1: irq 19, io base 0x0000ff60
usb usb3: configuration #1 chosen from 1 choice
hub 3-0:1.0: USB hub found
hub 3-0:1.0: 2 ports detected
ACPI: PCI Interrupt 0000:00:1d.2[C] -> GSI 18 (level, low) -> IRQ 20
uhci_hcd 0000:00:1d.2: UHCI Host Controller
uhci_hcd 0000:00:1d.2: new USB bus registered, assigned bus number 4
uhci_hcd 0000:00:1d.2: irq 20, io base 0x0000ff40
usb usb4: configuration #1 chosen from 1 choice
hub 4-0:1.0: USB hub found
hub 4-0:1.0: 2 ports detected
ACPI: PCI Interrupt 0000:00:1d.3[D] -> GSI 23 (level, low) -> IRQ 21
uhci_hcd 0000:00:1d.3: UHCI Host Controller
uhci_hcd 0000:00:1d.3: new USB bus registered, assigned bus number 5
uhci_hcd 0000:00:1d.3: irq 21, io base 0x0000ff20
usb usb5: configuration #1 chosen from 1 choice
hub 5-0:1.0: USB hub found
hub 5-0:1.0: 2 ports detected
usb 2-1: new full speed USB device using uhci_hcd and address 2
usb 2-1: configuration #1 chosen from 1 choice
usb 2-2: new full speed USB device using uhci_hcd and address 3
usb 2-2: configuration #1 chosen from 1 choice
usb 3-1: new full speed USB device using uhci_hcd and address 2
usb 3-1: configuration #1 chosen from 1 choice
usb 3-2: new low speed USB device using uhci_hcd and address 3
usb 3-2: configuration #1 chosen from 1 choice
usb 4-1: new low speed USB device using uhci_hcd and address 2
usb 4-1: configuration #1 chosen from 1 choice
usb 4-2: new full speed USB device using uhci_hcd and address 3
usb 4-2: configuration #1 chosen from 1 choice
usb 5-1: new full speed USB device using uhci_hcd and address 2
usb 5-1: configuration #1 chosen from 2 choices
drivers/usb/class/usblp.c: usblp0: USB Bidirectional printer dev 3 if 1 alt 0 proto 2 vid 0x03F0 pid 0x3A11
usbcore: registered new interface driver usblp
drivers/usb/class/usblp.c: v0.13: USB Printer Device Class driver
Initializing USB Mass Storage driver...
usbcore: registered new interface driver usb-storage
USB Mass Storage support registered.
input: Microsoft Natural® Ergonomic Keyboard 4000 as /class/input/input0
input: USB HID v1.11 Keyboard [Microsoft Natural® Ergonomic Keyboard 4000] on usb-0000:00:1d.2-1
input: Microsoft Natural® Ergonomic Keyboard 4000 as /class/input/input1
input: USB HID v1.11 Device [Microsoft Natural® Ergonomic Keyboard 4000] on usb-0000:00:1d.2-1
usbcore: registered new interface driver usbhid
drivers/usb/input/hid-core.c: v2.6:USB HID core driver
input: Microsoft Natural® Ergonomic Keyboard 4000 as /class/input/input2
The simple driver 'usbnek4k' attach to the device 'Microsoft Natural® Ergonomic Keyboard 4000'
PNP: PS/2 Controller [PNP0303:KBD,PNP0f13:MOU] at 0x60,0x64 irq 1,12
serio: i8042 KBD port at 0x60,0x64 irq 1
serio: i8042 AUX port at 0x60,0x64 irq 12
mice: PS/2 mouse device common for all mice
input: PC Speaker as /class/input/input3
input: AT Translated Set 2 keyboard as /class/input/input4
TCP cubic registered
Initializing XFRM netlink socket
NET: Registered protocol family 1
NET: Registered protocol family 10
IPv6 over IPv4 tunneling driver
NET: Registered protocol family 17
Starting balanced_irq
Using IPI Shortcut mode
Time: tsc clocksource has been installed.
input: ImExPS/2 Logitech MX Mouse as /class/input/input5
Root-NFS: No NFS server available, giving up.
VFS: Unable to mount root fs via NFS, trying floppy.
VFS: Cannot open root device "sda5" or unknown-block(2,0)
Please append a correct "root=" boot option
Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(2,0)
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [2.6.18,19] SATA boot problems (ICH6/ICH6W)
2007-01-11 23:32 ` Kovid Goyal
@ 2007-01-13 2:19 ` Tejun Heo
0 siblings, 0 replies; 29+ messages in thread
From: Tejun Heo @ 2007-01-13 2:19 UTC (permalink / raw)
To: Kovid Goyal; +Cc: linux-ide
Kovid Goyal wrote:
> ata3: SATA max UDMA/133 cmd 0xFE00 ctl 0xFE12 bmdma 0xFEA0 irq 17
> ata4: SATA max UDMA/133 cmd 0xFE20 ctl 0xFE32 bmdma 0xFEA8 irq 17
> scsi2 : ata_piix
> scsi3 : ata_piix
Hmmm... ata3.00 was /dev/sda1. Presence testing seems to have failed.
Please give a shot at 2.6.20-rc4. We dropped whole presence detection
thing and went back to polling IDENTIFY and that fixed a lot of
detection bugs. I think your problem is one of those.
Thanks.
--
tejun
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [2.6.18,19] SATA boot problems (ICH6/ICH6W)
2006-12-11 18:03 [2.6.18,19] SATA boot problems (ICH6/ICH6W) Kovid Goyal
2006-12-20 0:44 ` Tejun Heo
@ 2007-01-23 21:49 ` danieljzhang
1 sibling, 0 replies; 29+ messages in thread
From: danieljzhang @ 2007-01-23 21:49 UTC (permalink / raw)
To: linux-ide
Need help to rebuild new kernel:
I dont know if this is the same problem. Please do help.
I am trying to build kernel
linux-2.6.20-rc2
with patch
gregkh-all-2.6.20-rc2.patch
But system hanged when boot.
The original working kernel is: 2.6.13-15.13-smp. The distribution is SUSE.
According to YaST2, Disk is ST380013AS, Device Name is /dev/sda Driver is
ata_piix.
I used the original .config but add sata as module, usb as kernel build-in.
In /etc/sysconfig/kernel, I add ahci module. Right now INITRD_MODULES="ahci
piix ata_piix reiserfs processor thremal fan"
When booting, there is no FATAL error for loading module. I built in the USB
module as kernel. I still can get keyboard (USB keyboard).
Then it shows:
"Waiting for device /dev/sda2 to appear: ........ resume device /dev/sda2
not found (ignoring)
Loading reiserfs
Waiting for device /dev/sda3 to appear: ..........not found -- exiting to
/bin/sh
I got shell promt "$" anyway. But just pwd cd works.
But I can not find solution after googling and bothering linux guy around.
Please help!
I was told that there is a new SATA driver that I need to build and load.
Anyone can kindly send me a link?
Help, please!
--
View this message in context: http://www.nabble.com/-2.6.18%2C19--SATA-boot-problems-%28ICH6-ICH6W%29-tf2795723.html#a8550219
Sent from the linux-ide mailing list archive at Nabble.com.
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [2.6.18,19] SATA boot problems (ICH6/ICH6W)
2006-12-21 17:10 ` Gary Hade
@ 2007-01-30 1:55 ` Gary Hade
2007-01-30 7:32 ` Tejun Heo
0 siblings, 1 reply; 29+ messages in thread
From: Gary Hade @ 2007-01-30 1:55 UTC (permalink / raw)
To: Gary Hade; +Cc: Tejun Heo, Kovid Goyal, linux-ide, lcm, Jeff Garzik, konradr
On Thu, Dec 21, 2006 at 09:10:35AM -0800, Gary Hade wrote:
> On Wed, Dec 20, 2006 at 12:53:57PM +0900, Tejun Heo wrote:
> > Howdy,
> >
> > Gary Hade wrote:
> > > I noticed that Tejun recently provided a "libata: handle 0xff status
> > > properly" patch that is now in mainline that improves this code
> > > re: http://marc.theaimsgroup.com/?l=linux-ide&m=116038642105802&w=2
> > > but I found that the check still failed but more silently and with no
> > > retries.
> > >
> > > I decided to try increasing the delay that preceeds the above
> > > check [ msleep(150); ] and found that a change from 150ms to
> > > 1000ms caused the problem to disappear.
> >
> > Aieeee, 150ms not enough for the device to send the first FIS after SRST?
>
> Yea, it appears so. :)
>
> GoVault access via 'ahci' is also fails in some cable placement
> configurations with:
> kernel: scsi1 : ahci
> kernel: ata2: softreset failed (1st FIS failed)
> kernel: ata2: softreset failed, retrying in 5 secs
> kernel: ata2: port is slow to respond, please be patient
> kernel: ata2: port failed to respond (30 secs)
> kernel: ata2: COMRESET failed (device not ready)
> kernel: ata2: hardreset failed, retrying in 5 secs
> kernel: ata2: port is slow to respond, please be patient
> kernel: ata2: port failed to respond (30 secs)
> kdump: kexec: failed to load kdump kernel
> kernel: ata2: COMRESET failed (device not ready)
> kernel: ata2: reset failed, giving up
>
> This problem also disappears after reversing ports to which
> the hard drive and GoVault cables are connected.
>
> The following timeout increase appears to correct the 'ahci' problem:
>
> --- ./linux-2.6.18.i386/drivers/scsi/ahci.c.orig 2006-12-19 09:07:58.000000000 -0800
> +++ ./linux-2.6.18.i386/drivers/scsi/ahci.c 2006-12-19 13:30:29.000000000 -0800
> @@ -788,7 +788,7 @@ static int ahci_softreset(struct ata_por
>
> writel(1, port_mmio + PORT_CMD_ISSUE);
>
> - tmp = ata_wait_register(port_mmio + PORT_CMD_ISSUE, 0x1, 0x1, 1, 500);
> + tmp = ata_wait_register(port_mmio + PORT_CMD_ISSUE, 0x1, 0x1, 1, 2500);
> if (tmp & 0x1) {
> rc = -EIO;
> reason = "1st FIS failed";
>
> 1000ms, 1500ms, 1750ms, and 1900ms didn't work. 2000ms worked so
> 2500ms includes some extra to be safe. This experience seems to
> be more representative of the 1 to 2 second time (with RDC present)
> mentioned by Quantum (see below) than the 'ata_piix' 600-700ms
> experience.
>
> >
> > > I then replaced the msleep(150); with:
> > > {
> > > int i, ms = 5;
> > > msleep(ms);
> > > ata_port_printk(ap, KERN_INFO, "status @ %d ms: 0x%x\n",
> > > ms, ata_check_status(ap));
> > > for (i = 1; i <= 20; i++) {
> > > ms += 50;
> > > msleep(50);
> > > ata_port_printk(ap, KERN_INFO, "status @ %d ms: 0x%x\n",
> > > ms, ata_check_status(ap));
> > > }
> > > }
> > >
> > > Output for two cable placement configurations (0xFF check failure
> > > and 0xFF check success) are included below. Note that there are
> > > cable placement configurations for both the hard drive and
> > > GoVault where the initial status is 0xff. i.e. both transition
> > > from 0xff to 0x7f when BSY bit is cleared but it is taking MUCH
> > > longer for the GoVault (600-700ms for GoVault and <5ms for
> > > hard drive). It does not appear that the 0xff starting status
> > > is device specific.
> > >
> > > So, it appears that we have a situation with this SATA controller
> > > where a 0xFF status is not an accurate indication that there is
> > > no device.
> > >
> > > Although the 150ms to 1000ms delay increase works for the GoVault
> > > device I am not sure if it is the best long term fix for the problem.
> >
> > I would be surprised if Kovid's sda not detected case is caused by this.
> > For GoVault (that's SATAPI right?), yeah, maybe.
>
> Yes, the GoVault is an ATAPI device.
>
> > For an ATA disk, no way (hopefully).
>
> Yes, probably true that Kovid got the same errors but for a
> different reason.
>
> >
> > Can you consult with quantum about it?
>
> I checked with Quantum about this and they said:
> ---
> "We confirmed that if there's an RDC present when the soft reset is
> received, then it can take between one and two seconds to complete the
> reset. Issuing a SET FEATURES command to the RDC is the longest part of
> it.
>
> Even without an RDC, we've measured time on the order of 170
> milliseconds. "
> ---
>
> The RDC has been present for almost all of my testing. Here
> are comparison traces with and without the RDC which definitely
> confirms the RDC factor. It also confirms the order of 170ms
> without RDC time that Quantum mentioned.
>
> ========
> With RDC
> ========
> kernel: ata1: status @ 5 ms: 0xff
> kernel: ata1: status @ 55 ms: 0xff
> kernel: ata1: status @ 105 ms: 0xff
> kernel: ata1: status @ 155 ms: 0xff
> kernel: ata1: status @ 205 ms: 0xff
> kernel: ata1: status @ 255 ms: 0xff
> kernel: ata1: status @ 305 ms: 0xff
> kernel: ata1: status @ 355 ms: 0xff
> kernel: ata1: status @ 405 ms: 0xff
> kernel: ata1: status @ 455 ms: 0xff
> kernel: ata1: status @ 505 ms: 0xff
> kernel: ata1: status @ 555 ms: 0xff
> kernel: ata1: status @ 605 ms: 0xff
> kernel: ata1: status @ 655 ms: 0x7f
> kernel: ata1: status @ 705 ms: 0x7f
> kernel: ata1: status @ 755 ms: 0x7f
> kernel: ata1: status @ 805 ms: 0x7f
> kernel: ata1: status @ 855 ms: 0x7f
> kernel: ata1: status @ 905 ms: 0x7f
> kernel: ata1: status @ 955 ms: 0x7f
> kernel: ata1: status @ 1005 ms: 0x7f
>
> ===========
> Without RDC
> ===========
> kernel: ata1: status @ 5 ms: 0xff
> kernel: ata1: status @ 55 ms: 0xff
> kernel: ata1: status @ 105 ms: 0xff
> kernel: ata1: status @ 155 ms: 0xff
> kernel: ata1: status @ 205 ms: 0x7f
> kernel: ata1: status @ 255 ms: 0x7f
> kernel: ata1: status @ 305 ms: 0x7f
> kernel: ata1: status @ 355 ms: 0x7f
> kernel: ata1: status @ 405 ms: 0x7f
> kernel: ata1: status @ 455 ms: 0x7f
> kernel: ata1: status @ 505 ms: 0x7f
> kernel: ata1: status @ 555 ms: 0x7f
> kernel: ata1: status @ 605 ms: 0x7f
> kernel: ata1: status @ 655 ms: 0x7f
> kernel: ata1: status @ 705 ms: 0x7f
> kernel: ata1: status @ 755 ms: 0x7f
> kernel: ata1: status @ 805 ms: 0x7f
> kernel: ata1: status @ 855 ms: 0x7f
> kernel: ata1: status @ 905 ms: 0x7f
> kernel: ata1: status @ 955 ms: 0x7f
> kernel: ata1: status @ 1005 ms: 0x7f
>
> > If they verify your fix (ie,
> > GoVault sometimes take more than 150ms to transmit the first D2H Reg FIs
> > after SRST), I'll push similar patch upstream.
>
> Thanks. If you think that changes to increase the delays are
> the way to go (at least until we can find a better solution)
> I can provide patches.
Tejun,
I haven't heard anything from you on this so I'm including a delay
increase patch against 2.6.20-rc6 for the 'ata-piix' case below.
I hope that you, Jeff, and others find this acceptable.
With respect to the 'ahci' case w/2.6.20-rc6 the GoVault device is
useable following boot although the below messages are being logged
during initialization. Please let me know if you have any thoughts
on this.
scsi1 : ahci
ata2: softreset failed (port busy but CLO unavailable)
ata2: softreset failed, retrying in 5 secs
ata2: port is slow to respond, please be patient (Status 0x80)
ata2: port failed to respond (30 secs, Status 0x80)
ata2: COMRESET failed (device not ready)
ata2: hardreset failed, retrying in 5 secs
ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata2.00: ATAPI, max UDMA/66
ata2.00: configured for UDMA/66
Thanks,
Gary
--
Gary Hade
System x Enablement
IBM Linux Technology Center
503-578-4503 IBM T/L: 775-4503
garyhade@us.ibm.com
http://www.ibm.com/linux/ltc
Controllers such as the ICH6R/ICH6RW may set the status to 0xFF
when software reset is initiated even when the device is present.
Since some removable media devices can take longer than 150ms
to complete the 0xFF status check can fail even when the device
is present. For example, a software reset for the Quantum GoVault
removable hard can take as long as 2 seconds to complete.
This patch eliminates incorrect software reset failures for
slower than normal software reset responders by adding an
additional 2 second wait when a 0xFF status is detected following
the current 150ms wait.
Signed-off-by: Gary Hade <garyhade@us.ibm.com>
--- linux-2.6.20-rc6/drivers/ata/libata-core.c.orig 2007-01-24 18:19:28.000000000 -0800
+++ linux-2.6.20-rc6/drivers/ata/libata-core.c 2007-01-29 16:39:34.000000000 -0800
@@ -2683,6 +2683,13 @@ static unsigned int ata_bus_softreset(st
*/
msleep(150);
+ /* For those controllers where the status could start out at
+ * 0xFF even though the device is present we may need to wait
+ * a little longer for slower removable media devices to respond.
+ */
+ if (ata_check_status(ap) == 0xFF)
+ msleep(2000);
+
/* Before we perform post reset processing we want to see if
* the bus shows 0xFF because the odd clown forgets the D7
* pulldown resistor.
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [2.6.18,19] SATA boot problems (ICH6/ICH6W)
2007-01-30 1:55 ` Gary Hade
@ 2007-01-30 7:32 ` Tejun Heo
2007-01-30 23:37 ` Gary Hade
0 siblings, 1 reply; 29+ messages in thread
From: Tejun Heo @ 2007-01-30 7:32 UTC (permalink / raw)
To: Gary Hade; +Cc: Kovid Goyal, linux-ide, lcm, Jeff Garzik, konradr
Hello, Gary.
Gary Hade wrote:
>>> If they verify your fix (ie,
>>> GoVault sometimes take more than 150ms to transmit the first D2H Reg FIs
>>> after SRST), I'll push similar patch upstream.
>> Thanks. If you think that changes to increase the delays are
>> the way to go (at least until we can find a better solution)
>> I can provide patches.
>
> Tejun,
> I haven't heard anything from you on this so I'm including a delay
> increase patch against 2.6.20-rc6 for the 'ata-piix' case below.
> I hope that you, Jeff, and others find this acceptable.
Sorry about being unresponsive. The thing is that the change adds
unnecessary 2 secs of delay to a lot of other normal device-not-present
cases, so I was hesitant to ack the patch. I'll give it more thoughts
(and respond timely this time :-)
> With respect to the 'ahci' case w/2.6.20-rc6 the GoVault device is
> useable following boot although the below messages are being logged
> during initialization. Please let me know if you have any thoughts
> on this.
> scsi1 : ahci
> ata2: softreset failed (port busy but CLO unavailable)
> ata2: softreset failed, retrying in 5 secs
> ata2: port is slow to respond, please be patient (Status 0x80)
> ata2: port failed to respond (30 secs, Status 0x80)
> ata2: COMRESET failed (device not ready)
> ata2: hardreset failed, retrying in 5 secs
> ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
> ata2.00: ATAPI, max UDMA/66
> ata2.00: configured for UDMA/66
The above should have been fixed in 2.6.20-rc6. Please test it. It was
caused by the ahci driver incorrectly clearing ahci CAP register and
fixed recently.
Thanks.
--
tejun
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [2.6.18,19] SATA boot problems (ICH6/ICH6W)
2007-01-30 7:32 ` Tejun Heo
@ 2007-01-30 23:37 ` Gary Hade
2007-01-31 0:54 ` Jeff Garzik
` (2 more replies)
0 siblings, 3 replies; 29+ messages in thread
From: Gary Hade @ 2007-01-30 23:37 UTC (permalink / raw)
To: Tejun Heo; +Cc: Gary Hade, Kovid Goyal, linux-ide, lcm, Jeff Garzik, konradr
On Tue, Jan 30, 2007 at 04:32:34PM +0900, Tejun Heo wrote:
> Hello, Gary.
>
> Gary Hade wrote:
> >>> If they verify your fix (ie,
> >>> GoVault sometimes take more than 150ms to transmit the first D2H Reg FIs
> >>> after SRST), I'll push similar patch upstream.
> >> Thanks. If you think that changes to increase the delays are
> >> the way to go (at least until we can find a better solution)
> >> I can provide patches.
> >
> > Tejun,
> > I haven't heard anything from you on this so I'm including a delay
> > increase patch against 2.6.20-rc6 for the 'ata-piix' case below.
> > I hope that you, Jeff, and others find this acceptable.
>
> Sorry about being unresponsive. The thing is that the change adds
> unnecessary 2 secs of delay to a lot of other normal device-not-present
> cases, so I was hesitant to ack the patch. I'll give it more thoughts
> (and respond timely this time :-)
Thanks! My followup was untimely so we're even. :-)
Some of my random thoughts:
There does appear to be this invalid assumption that 0xFF status
always implies device-not-present. The status register access
restrictions in ATA/ATAPI-7 V1 5.14.2 include the statement "The
contents of this register, except for BSY, shall be ignored when
BSY is set to one." which the code does not honor. There is apparently
past experience that 0xFF status implies device-not-present for some
controllers (the odd clowns :) but I have no idea how common these are.
We obviously can't get rid of the check but since we cannot clear
the read-only status register and there appears to be no specification
dictated upper limit on how long it should take for a software reset to
complete it just seems like we need to wait long enough to support the
slowest known device which may be the GoVault.
>
> > With respect to the 'ahci' case w/2.6.20-rc6 the GoVault device is
> > useable following boot although the below messages are being logged
> > during initialization. Please let me know if you have any thoughts
> > on this.
> > scsi1 : ahci
> > ata2: softreset failed (port busy but CLO unavailable)
> > ata2: softreset failed, retrying in 5 secs
> > ata2: port is slow to respond, please be patient (Status 0x80)
> > ata2: port failed to respond (30 secs, Status 0x80)
> > ata2: COMRESET failed (device not ready)
> > ata2: hardreset failed, retrying in 5 secs
> > ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
> > ata2.00: ATAPI, max UDMA/66
> > ata2.00: configured for UDMA/66
>
> The above should have been fixed in 2.6.20-rc6. Please test it. It was
> caused by the ahci driver incorrectly clearing ahci CAP register and
> fixed recently.
I'm clearly seeing this with 2.6.20-rc6 but unlike the ata-piix
issue it does not appear to be dependent on the port to which the
device is attached. I've been playing around with this today and
found that it could be solved by inserting a delay between the
ahci_stop_engine() call and BSY/DRQ check.
This change:
--- linux-2.6.20-rc6/drivers/ata/ahci.c.orig 2007-01-30 11:01:20.000000000 -0800
+++ linux-2.6.20-rc6/drivers/ata/ahci.c 2007-01-30 12:59:38.000000000 -0800
@@ -804,6 +804,19 @@ static int ahci_softreset(struct ata_por
goto fail_restart;
}
+ {
+ int delay;
+ u8 stat;
+ for (delay = 0; delay < 2000; delay+=100) {
+ if (!(ahci_check_status(ap) & (ATA_BUSY | ATA_DRQ)))
+ break;
+ msleep(100);
+ stat = ahci_check_status(ap);
+ ata_port_printk(ap, KERN_INFO, "delay=%d BSY=%d DRQ=%d\n",
+ delay, (stat & ATA_BUSY)?1:0, (stat & ATA_DRQ)?1:0);
+ }
+ }
+
/* check BUSY/DRQ, perform Command List Override if necessary */
if (ahci_check_status(ap) & (ATA_BUSY | ATA_DRQ)) {
rc = ahci_clo(ap);
Yielded this output both with and without the RDC inserted:
scsi1 : ahci
ata2: delay=0 BSY=1 DRQ=0
ata2: delay=100 BSY=1 DRQ=0
ata2: delay=200 BSY=1 DRQ=0
ata2: delay=300 BSY=1 DRQ=0
ata2: delay=400 BSY=1 DRQ=0
ata2: delay=500 BSY=1 DRQ=0
ata2: delay=600 BSY=1 DRQ=0
ata2: delay=700 BSY=1 DRQ=0
ata2: delay=800 BSY=1 DRQ=0
ata2: delay=900 BSY=1 DRQ=0
ata2: delay=1000 BSY=1 DRQ=0
ata2: delay=1100 BSY=1 DRQ=0
ata2: delay=1200 BSY=0 DRQ=0
ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata2.00: ATAPI, max UDMA/66
ata2.00: configured for UDMA/66
So it appears that we may also have a similar device slowness issue
with this driver.
Gary
--
Gary Hade
System x Enablement
IBM Linux Technology Center
503-578-4503 IBM T/L: 775-4503
garyhade@us.ibm.com
http://www.ibm.com/linux/ltc
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [2.6.18,19] SATA boot problems (ICH6/ICH6W)
2007-01-30 23:37 ` Gary Hade
@ 2007-01-31 0:54 ` Jeff Garzik
2007-01-31 11:00 ` Tejun Heo
2007-01-31 10:44 ` Tejun Heo
2007-02-17 0:34 ` Gary Hade
2 siblings, 1 reply; 29+ messages in thread
From: Jeff Garzik @ 2007-01-31 0:54 UTC (permalink / raw)
To: Gary Hade; +Cc: Tejun Heo, Kovid Goyal, linux-ide, lcm, konradr
Gary Hade wrote:
> Some of my random thoughts:
> There does appear to be this invalid assumption that 0xFF status
> always implies device-not-present. The status register access
> restrictions in ATA/ATAPI-7 V1 5.14.2 include the statement "The
> contents of this register, except for BSY, shall be ignored when
> BSY is set to one." which the code does not honor. There is apparently
> past experience that 0xFF status implies device-not-present for some
> controllers (the odd clowns :) but I have no idea how common these are.
> We obviously can't get rid of the check but since we cannot clear
> the read-only status register and there appears to be no specification
> dictated upper limit on how long it should take for a software reset to
> complete it just seems like we need to wait long enough to support the
> slowest known device which may be the GoVault.
SRST is specified to take no longer than 31 seconds from the clearing to
the SRST bit to the clearing of the BSY bit.
Look through the software reset protocol documentation (ATA/ATAPI-7
volume 2), mainly the device state machines.
Jeff
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [2.6.18,19] SATA boot problems (ICH6/ICH6W)
2007-01-30 23:37 ` Gary Hade
2007-01-31 0:54 ` Jeff Garzik
@ 2007-01-31 10:44 ` Tejun Heo
2007-01-31 10:47 ` Jeff Garzik
2007-02-01 0:49 ` Gary Hade
2007-02-17 0:34 ` Gary Hade
2 siblings, 2 replies; 29+ messages in thread
From: Tejun Heo @ 2007-01-31 10:44 UTC (permalink / raw)
To: Gary Hade; +Cc: Kovid Goyal, linux-ide, lcm, Jeff Garzik, konradr
Gary Hade wrote:
> Some of my random thoughts:
> There does appear to be this invalid assumption that 0xFF status
> always implies device-not-present. The status register access
> restrictions in ATA/ATAPI-7 V1 5.14.2 include the statement "The
> contents of this register, except for BSY, shall be ignored when
> BSY is set to one." which the code does not honor. There is apparently
> past experience that 0xFF status implies device-not-present for some
> controllers (the odd clowns :) but I have no idea how common these are.
The 0xff is the value we get when there is no device present and the
motherboard manufacturer forgot to pull down the ATA bus. It's not very
uncommon in cheap PATA world and, following the weird tradition, some
SATA controllers choose to emulate 0xff if there is no device attached
(link not established). Not sure how many of them does it but intel's
SATA chipset is one of them, so we're pretty much stuck with it.
ie. In many P/SATA setups, your patch would add 2 extra secs of waiting
for empty ports.
> We obviously can't get rid of the check but since we cannot clear
> the read-only status register and there appears to be no specification
> dictated upper limit on how long it should take for a software reset to
> complete it just seems like we need to wait long enough to support the
> slowest known device which may be the GoVault.
Agreed but still hesitant to ack the patch. :-)
I'm gonna work on parallel probing for libata. I think we can easily
hide extra 2 secs of waiting with parallel probing. It will take some
time but that seems to be the 'right' thing to do especially considering
the fact that 150ms sleep has been enough for gazillions of ATA devices
during last decade except for this GoVault drive.
I'll leave this thread in my to-do folder and apply your patch after
parallel probing is in place (optimistic ETA 1 month). How does that sound?
Thanks.
--
tejun
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [2.6.18,19] SATA boot problems (ICH6/ICH6W)
2007-01-31 10:44 ` Tejun Heo
@ 2007-01-31 10:47 ` Jeff Garzik
2007-01-31 11:00 ` Tejun Heo
2007-02-01 0:49 ` Gary Hade
1 sibling, 1 reply; 29+ messages in thread
From: Jeff Garzik @ 2007-01-31 10:47 UTC (permalink / raw)
To: Tejun Heo; +Cc: Gary Hade, Kovid Goyal, linux-ide, lcm, konradr
Tejun Heo wrote:
> I'm gonna work on parallel probing for libata. I think we can easily
> hide extra 2 secs of waiting with parallel probing. It will take some
> time but that seems to be the 'right' thing to do especially considering
> the fact that 150ms sleep has been enough for gazillions of ATA devices
> during last decade except for this GoVault drive.
FWIW I don't mind polling for a /maximum/ of $N seconds, but I want to
stress that any patch that blatantly adds stuff like
if (status == 0xff)
mdelay(2000)
will be rejected.
Jeff
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [2.6.18,19] SATA boot problems (ICH6/ICH6W)
2007-01-31 0:54 ` Jeff Garzik
@ 2007-01-31 11:00 ` Tejun Heo
2007-01-31 12:20 ` Alan
0 siblings, 1 reply; 29+ messages in thread
From: Tejun Heo @ 2007-01-31 11:00 UTC (permalink / raw)
To: Jeff Garzik; +Cc: Gary Hade, Kovid Goyal, linux-ide, lcm, konradr
Jeff Garzik wrote:
> SRST is specified to take no longer than 31 seconds from the clearing to
> the SRST bit to the clearing of the BSY bit.
>
> Look through the software reset protocol documentation (ATA/ATAPI-7
> volume 2), mainly the device state machines.
Jeff, this is different delay. We're short circuiting the 31sec wait
when the reported status is 0xff to avoid wasting half a minute on port
which has no device but is showing 0xff because bus datalines are not
pulled down (PATA) or the specific controller is made to emulate 0xff
when link isn't established yet (SATA).
This never matters for PATA devices because no sane device reports 0xff
as the status register value regardless of its state - 0xff is the
special value indicating no device attached and data bus is floating,
and 150ms is more than enough to allow the device to report its status
register value (the spec says >= 2ms).
Some SATA controllers use 0xff to indicate empty port. This seldomly
matters as we have the almighty SStatus register to check device
presence (there is a bug regarding this, patch pending).
This GoVault drive fails because ata_piix doesn't have SCR while using
0xff to indicate port not ready (dunno exact which state causes 0xff
status tho) while the GoVault drive fails to clear that state in 150ms
(not 30s). The libata sees 0xff after SRST if GoVault drive is attached
and thinks that the port is empty. So, I'm afraid there is no easy way
out here but to wait longer for 0xff to clear. As Kovid suggested just
a few secs should be enough.
Thanks.
--
tejun
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [2.6.18,19] SATA boot problems (ICH6/ICH6W)
2007-01-31 10:47 ` Jeff Garzik
@ 2007-01-31 11:00 ` Tejun Heo
0 siblings, 0 replies; 29+ messages in thread
From: Tejun Heo @ 2007-01-31 11:00 UTC (permalink / raw)
To: Jeff Garzik; +Cc: Gary Hade, Kovid Goyal, linux-ide, lcm, konradr
Jeff Garzik wrote:
> Tejun Heo wrote:
>> I'm gonna work on parallel probing for libata. I think we can easily
>> hide extra 2 secs of waiting with parallel probing. It will take some
>> time but that seems to be the 'right' thing to do especially considering
>> the fact that 150ms sleep has been enough for gazillions of ATA devices
>> during last decade except for this GoVault drive.
>
>
> FWIW I don't mind polling for a /maximum/ of $N seconds, but I want to
> stress that any patch that blatantly adds stuff like
>
> if (status == 0xff)
> mdelay(2000)
>
> will be rejected.
Aye aye captain. I wouldn't do that myself either.
--
tejun
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [2.6.18,19] SATA boot problems (ICH6/ICH6W)
2007-01-31 11:00 ` Tejun Heo
@ 2007-01-31 12:20 ` Alan
2007-01-31 13:16 ` Tejun Heo
0 siblings, 1 reply; 29+ messages in thread
From: Alan @ 2007-01-31 12:20 UTC (permalink / raw)
To: Tejun Heo; +Cc: Jeff Garzik, Gary Hade, Kovid Goyal, linux-ide, lcm, konradr
> Some SATA controllers use 0xff to indicate empty port. This seldomly
> matters as we have the almighty SStatus register to check device
> presence (there is a bug regarding this, patch pending).
>
> This GoVault drive fails because ata_piix doesn't have SCR while using
> 0xff to indicate port not ready (dunno exact which state causes 0xff
> status tho) while the GoVault drive fails to clear that state in 150ms
> (not 30s). The libata sees 0xff after SRST if GoVault drive is attached
So we can also cut this down by only doing the extra polling on a device
which is SATA and lacks SCR ?
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [2.6.18,19] SATA boot problems (ICH6/ICH6W)
2007-01-31 12:20 ` Alan
@ 2007-01-31 13:16 ` Tejun Heo
2007-01-31 15:24 ` Jeff Garzik
0 siblings, 1 reply; 29+ messages in thread
From: Tejun Heo @ 2007-01-31 13:16 UTC (permalink / raw)
To: Alan; +Cc: Jeff Garzik, Gary Hade, Kovid Goyal, linux-ide, lcm, konradr
Alan wrote:
>> Some SATA controllers use 0xff to indicate empty port. This seldomly
>> matters as we have the almighty SStatus register to check device
>> presence (there is a bug regarding this, patch pending).
>>
>> This GoVault drive fails because ata_piix doesn't have SCR while using
>> 0xff to indicate port not ready (dunno exact which state causes 0xff
>> status tho) while the GoVault drive fails to clear that state in 150ms
>> (not 30s). The libata sees 0xff after SRST if GoVault drive is attached
>
> So we can also cut this down by only doing the extra polling on a device
> which is SATA and lacks SCR ?
That's true but the offending one is ata_piix, so the cutting down is
not as effective. If we can live with the extra two secs per empty port
for some of ata_piix for the time being (maybe two or three releases),
the delay can be added now. One more thing to consider is GoVault is
the only known device to show this behavior till now.
Hmm... What do you (Alan and Jeff) think?
--
tejun
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [2.6.18,19] SATA boot problems (ICH6/ICH6W)
2007-01-31 13:16 ` Tejun Heo
@ 2007-01-31 15:24 ` Jeff Garzik
2007-01-31 15:30 ` Mark Lord
0 siblings, 1 reply; 29+ messages in thread
From: Jeff Garzik @ 2007-01-31 15:24 UTC (permalink / raw)
To: Tejun Heo; +Cc: Alan, Gary Hade, Kovid Goyal, linux-ide, lcm, konradr
Tejun Heo wrote:
> Alan wrote:
>>> Some SATA controllers use 0xff to indicate empty port. This seldomly
>>> matters as we have the almighty SStatus register to check device
>>> presence (there is a bug regarding this, patch pending).
>>>
>>> This GoVault drive fails because ata_piix doesn't have SCR while using
>>> 0xff to indicate port not ready (dunno exact which state causes 0xff
>>> status tho) while the GoVault drive fails to clear that state in 150ms
>>> (not 30s). The libata sees 0xff after SRST if GoVault drive is attached
>> So we can also cut this down by only doing the extra polling on a device
>> which is SATA and lacks SCR ?
>
> That's true but the offending one is ata_piix, so the cutting down is
> not as effective. If we can live with the extra two secs per empty port
> for some of ata_piix for the time being (maybe two or three releases),
> the delay can be added now. One more thing to consider is GoVault is
> the only known device to show this behavior till now.
>
> Hmm... What do you (Alan and Jeff) think?
That last factor weighs on my mind.
While I don't mind making changes for this device, and taking into
consideration Alan's recent comments that some ATAPI workarounds are
still yet to appear for libata, I still dislike making changes for one
specific device with non-standard behavior.
Jeff
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [2.6.18,19] SATA boot problems (ICH6/ICH6W)
2007-01-31 15:24 ` Jeff Garzik
@ 2007-01-31 15:30 ` Mark Lord
0 siblings, 0 replies; 29+ messages in thread
From: Mark Lord @ 2007-01-31 15:30 UTC (permalink / raw)
To: Jeff Garzik
Cc: Tejun Heo, Alan, Gary Hade, Kovid Goyal, linux-ide, lcm, konradr
Jeff Garzik wrote:
> Tejun Heo wrote:
>> Alan wrote:
>>>> Some SATA controllers use 0xff to indicate empty port. This seldomly
>>>> matters as we have the almighty SStatus register to check device
>>>> presence (there is a bug regarding this, patch pending).
>>>>
>>>> This GoVault drive fails because ata_piix doesn't have SCR while using
>>>> 0xff to indicate port not ready (dunno exact which state causes 0xff
>>>> status tho) while the GoVault drive fails to clear that state in 150ms
>>>> (not 30s). The libata sees 0xff after SRST if GoVault drive is
>>>> attached
>>> So we can also cut this down by only doing the extra polling on a device
>>> which is SATA and lacks SCR ?
>>
>> That's true but the offending one is ata_piix, so the cutting down is
>> not as effective. If we can live with the extra two secs per empty port
..
> While I don't mind making changes for this device, and taking into
> consideration Alan's recent comments that some ATAPI workarounds are
> still yet to appear for libata, I still dislike making changes for one
> specific device with non-standard behavior.
How does drivers/ide manage with this device, or does it?
It would be useful here to patch the PCI ID into drivers/ide
for that PIIX variant, and try it.. just to see what the
behaviour is.
There are other possible ways to avoid a 2-second per-port delay at boot.
Eg. by attempting write+readback on some of the other registers.
It all sounds very messy, but that's the ATA/ATAPI world.
Cheers
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [2.6.18,19] SATA boot problems (ICH6/ICH6W)
2007-01-31 10:44 ` Tejun Heo
2007-01-31 10:47 ` Jeff Garzik
@ 2007-02-01 0:49 ` Gary Hade
1 sibling, 0 replies; 29+ messages in thread
From: Gary Hade @ 2007-02-01 0:49 UTC (permalink / raw)
To: Tejun Heo; +Cc: Gary Hade, Kovid Goyal, linux-ide, lcm, Jeff Garzik, konradr
On Wed, Jan 31, 2007 at 07:44:43PM +0900, Tejun Heo wrote:
> Gary Hade wrote:
> > Some of my random thoughts:
> > There does appear to be this invalid assumption that 0xFF status
> > always implies device-not-present. The status register access
> > restrictions in ATA/ATAPI-7 V1 5.14.2 include the statement "The
> > contents of this register, except for BSY, shall be ignored when
> > BSY is set to one." which the code does not honor. There is apparently
> > past experience that 0xFF status implies device-not-present for some
> > controllers (the odd clowns :) but I have no idea how common these are.
>
> The 0xff is the value we get when there is no device present and the
> motherboard manufacturer forgot to pull down the ATA bus. It's not very
> uncommon in cheap PATA world and, following the weird tradition, some
> SATA controllers choose to emulate 0xff if there is no device attached
> (link not established). Not sure how many of them does it but intel's
> SATA chipset is one of them, so we're pretty much stuck with it.
>
> ie. In many P/SATA setups, your patch would add 2 extra secs of waiting
> for empty ports.
>
> > We obviously can't get rid of the check but since we cannot clear
> > the read-only status register and there appears to be no specification
> > dictated upper limit on how long it should take for a software reset to
> > complete it just seems like we need to wait long enough to support the
> > slowest known device which may be the GoVault.
>
> Agreed but still hesitant to ack the patch. :-)
>
> I'm gonna work on parallel probing for libata. I think we can easily
> hide extra 2 secs of waiting with parallel probing. It will take some
> time but that seems to be the 'right' thing to do especially considering
> the fact that 150ms sleep has been enough for gazillions of ATA devices
> during last decade except for this GoVault drive.
>
> I'll leave this thread in my to-do folder and apply your patch after
> parallel probing is in place (optimistic ETA 1 month). How does that sound?
Thanks. Solution sounds great. ETA may be an issue since
we need to get the device enabled in a near-term release.
I'll let you know.
Polling version included below. GoVault appeared to be the only
currently known device that would benefit from polling so I
didn't think the extra code was necessary. After considering
Jeff's comment, it makes sense prepare for the possibility
of those future slow but not quite so slow devices.
Gary
--
Gary Hade
System x Enablement
IBM Linux Technology Center
503-578-4503 IBM T/L: 775-4503
garyhade@us.ibm.com
http://www.ibm.com/linux/ltc
Controllers such as the ICH6R/ICH6RW may set the status to 0xFF
when software reset is initiated even when the device is present.
Since some removable media devices can take longer than 150ms
to complete the 0xFF status check can fail even when the device
is present. For example, a software reset for the Quantum GoVault
removable hard can take as long as 2 seconds to complete.
This patch eliminates incorrect software reset failures for
slower than normal software reset responders by adding an
additional 2 second wait when a 0xFF status is detected following
the current 150ms wait.
Signed-off-by: Gary Hade <garyhade@us.ibm.com>
--- linux-2.6.20-rc6/drivers/ata/libata-core.c.orig 2007-01-24 18:19:28.000000000 -0800
+++ linux-2.6.20-rc6/drivers/ata/libata-core.c 2007-01-31 12:00:52.000000000 -0800
@@ -2653,6 +2653,8 @@ static unsigned int ata_bus_softreset(st
unsigned int devmask)
{
struct ata_ioports *ioaddr = &ap->ioaddr;
+ u8 status;
+ unsigned long timeout;
DPRINTK("ata%u: bus reset via SRST\n", ap->id);
@@ -2683,11 +2685,22 @@ static unsigned int ata_bus_softreset(st
*/
msleep(150);
+ /* For those controllers where the status could start out at
+ * 0xFF even though the device is present we wait up to 2 seconds
+ * longer for slower removable media devices to respond.
+ */
+ status = ata_check_status(ap);
+ timeout = jiffies + 2*HZ;
+ while (status == 0xFF && time_before(jiffies, timeout)) {
+ msleep(50);
+ status = ata_check_status(ap);
+ }
+
/* Before we perform post reset processing we want to see if
* the bus shows 0xFF because the odd clown forgets the D7
* pulldown resistor.
*/
- if (ata_check_status(ap) == 0xFF)
+ if (status == 0xFF)
return 0;
ata_bus_post_reset(ap, devmask);
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [2.6.18,19] SATA boot problems (ICH6/ICH6W)
2007-01-30 23:37 ` Gary Hade
2007-01-31 0:54 ` Jeff Garzik
2007-01-31 10:44 ` Tejun Heo
@ 2007-02-17 0:34 ` Gary Hade
2007-02-21 12:40 ` Tejun Heo
2 siblings, 1 reply; 29+ messages in thread
From: Gary Hade @ 2007-02-17 0:34 UTC (permalink / raw)
To: Gary Hade; +Cc: Tejun Heo, Kovid Goyal, linux-ide, lcm, Jeff Garzik, konradr
On Tue, Jan 30, 2007 at 03:37:36PM -0800, Gary Hade wrote:
> On Tue, Jan 30, 2007 at 04:32:34PM +0900, Tejun Heo wrote:
> > Hello, Gary.
> >
> > Gary Hade wrote:
> > >>> If they verify your fix (ie,
> > >>> GoVault sometimes take more than 150ms to transmit the first D2H Reg FIs
> > >>> after SRST), I'll push similar patch upstream.
> > >> Thanks. If you think that changes to increase the delays are
> > >> the way to go (at least until we can find a better solution)
> > >> I can provide patches.
> > >
> > > Tejun,
> > > I haven't heard anything from you on this so I'm including a delay
> > > increase patch against 2.6.20-rc6 for the 'ata-piix' case below.
> > > I hope that you, Jeff, and others find this acceptable.
> >
> > Sorry about being unresponsive. The thing is that the change adds
> > unnecessary 2 secs of delay to a lot of other normal device-not-present
> > cases, so I was hesitant to ack the patch. I'll give it more thoughts
> > (and respond timely this time :-)
>
> Thanks! My followup was untimely so we're even. :-)
>
> Some of my random thoughts:
> There does appear to be this invalid assumption that 0xFF status
> always implies device-not-present. The status register access
> restrictions in ATA/ATAPI-7 V1 5.14.2 include the statement "The
> contents of this register, except for BSY, shall be ignored when
> BSY is set to one." which the code does not honor. There is apparently
> past experience that 0xFF status implies device-not-present for some
> controllers (the odd clowns :) but I have no idea how common these are.
> We obviously can't get rid of the check but since we cannot clear
> the read-only status register and there appears to be no specification
> dictated upper limit on how long it should take for a software reset to
> complete it just seems like we need to wait long enough to support the
> slowest known device which may be the GoVault.
>
> >
> > > With respect to the 'ahci' case w/2.6.20-rc6 the GoVault device is
> > > useable following boot although the below messages are being logged
> > > during initialization. Please let me know if you have any thoughts
> > > on this.
> > > scsi1 : ahci
> > > ata2: softreset failed (port busy but CLO unavailable)
> > > ata2: softreset failed, retrying in 5 secs
> > > ata2: port is slow to respond, please be patient (Status 0x80)
> > > ata2: port failed to respond (30 secs, Status 0x80)
> > > ata2: COMRESET failed (device not ready)
> > > ata2: hardreset failed, retrying in 5 secs
> > > ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
> > > ata2.00: ATAPI, max UDMA/66
> > > ata2.00: configured for UDMA/66
> >
> > The above should have been fixed in 2.6.20-rc6. Please test it. It was
> > caused by the ahci driver incorrectly clearing ahci CAP register and
> > fixed recently.
>
> I'm clearly seeing this with 2.6.20-rc6 but unlike the ata-piix
> issue it does not appear to be dependent on the port to which the
> device is attached. I've been playing around with this today and
> found that it could be solved by inserting a delay between the
> ahci_stop_engine() call and BSY/DRQ check.
>
> This change:
> --- linux-2.6.20-rc6/drivers/ata/ahci.c.orig 2007-01-30 11:01:20.000000000 -0800
> +++ linux-2.6.20-rc6/drivers/ata/ahci.c 2007-01-30 12:59:38.000000000 -0800
> @@ -804,6 +804,19 @@ static int ahci_softreset(struct ata_por
> goto fail_restart;
> }
>
> + {
> + int delay;
> + u8 stat;
> + for (delay = 0; delay < 2000; delay+=100) {
> + if (!(ahci_check_status(ap) & (ATA_BUSY | ATA_DRQ)))
> + break;
> + msleep(100);
> + stat = ahci_check_status(ap);
> + ata_port_printk(ap, KERN_INFO, "delay=%d BSY=%d DRQ=%d\n",
> + delay, (stat & ATA_BUSY)?1:0, (stat & ATA_DRQ)?1:0);
> + }
> + }
> +
> /* check BUSY/DRQ, perform Command List Override if necessary */
> if (ahci_check_status(ap) & (ATA_BUSY | ATA_DRQ)) {
> rc = ahci_clo(ap);
>
> Yielded this output both with and without the RDC inserted:
> scsi1 : ahci
> ata2: delay=0 BSY=1 DRQ=0
> ata2: delay=100 BSY=1 DRQ=0
> ata2: delay=200 BSY=1 DRQ=0
> ata2: delay=300 BSY=1 DRQ=0
> ata2: delay=400 BSY=1 DRQ=0
> ata2: delay=500 BSY=1 DRQ=0
> ata2: delay=600 BSY=1 DRQ=0
> ata2: delay=700 BSY=1 DRQ=0
> ata2: delay=800 BSY=1 DRQ=0
> ata2: delay=900 BSY=1 DRQ=0
> ata2: delay=1000 BSY=1 DRQ=0
> ata2: delay=1100 BSY=1 DRQ=0
> ata2: delay=1200 BSY=0 DRQ=0
> ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
> ata2.00: ATAPI, max UDMA/66
> ata2.00: configured for UDMA/66
>
> So it appears that we may also have a similar device slowness issue
> with this driver.
Tejun,
I instrumented the code and found that for the SATA hard drive BSY was set
just before the call to ahci_init_port() from ahci_port_start() and clear
after the return from ahci_init_port(). For the GoVault BSY was still set
after the return from ahci_init_port() and remained set for almost 2 seconds.
The below patch which gives BSY some extra time to clear repairs the problem.
Unlike the extra delay for ata-piix needed by GoVault I believe this delay
will only be seen for attached devices that need it. Please let me know
what you think.
Thanks.
Gary
--
Gary Hade
System x Enablement
IBM Linux Technology Center
503-578-4503 IBM T/L: 775-4503
garyhade@us.ibm.com
http://www.ibm.com/linux/ltc
We encountered a problem where the BSY status bit is still
set on entry to the 'ahci' error handler during initialization
of the Quantum GoVault when attached to an ICH6R/ICH6RW controller.
This caused a software reset failure due to failed BSY/DRQ check
forcing a hard reset with the following messages logged.
ata1: softreset failed (port busy but CLO unavailable)
ata1: softreset failed, retrying in 5 secs
ata1: port is slow to respond, please be patient (Status 0x80)
ata1: port failed to respond (30 secs, Status 0x80)
ata1: COMRESET failed (device not ready)
ata1: hardreset failed, retrying in 5 secs
It was taking almost 2 seconds for BSY to clear following the
return from ahci_init_port() in ahci_port_start() so this patch
gives BSY up to 3 seconds extra time to clear eliminating the
problem.
Signed-off-by: Gary Hade <garyhade@us.ibm.com>
--- linux-2.6.20-rc7/drivers/ata/ahci.c.orig 2007-02-16 10:11:21.000000000 -0800
+++ linux-2.6.20-rc7/drivers/ata/ahci.c 2007-02-16 13:23:04.000000000 -0800
@@ -1423,6 +1423,8 @@ static int ahci_port_start(struct ata_po
void *mem;
dma_addr_t mem_dma;
int rc;
+ u8 status;
+ unsigned long timeout;
pp = kmalloc(sizeof(*pp), GFP_KERNEL);
if (!pp)
@@ -1477,6 +1479,17 @@ static int ahci_port_start(struct ata_po
/* initialize port */
ahci_init_port(port_mmio, hpriv->cap, pp->cmd_slot_dma, pp->rx_fis_dma);
+ status = ahci_check_status(ap);
+
+ /* for some devices we need to delay to allow BSY to clear */
+ if (status & ATA_BUSY) {
+ timeout = jiffies + 3*HZ;
+ while ((status & ATA_BUSY) && time_before(jiffies, timeout)) {
+ msleep(50);
+ status = ahci_check_status(ap);
+ }
+ }
+
return 0;
}
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [2.6.18,19] SATA boot problems (ICH6/ICH6W)
2007-02-17 0:34 ` Gary Hade
@ 2007-02-21 12:40 ` Tejun Heo
2007-02-22 0:41 ` Gary Hade
2007-02-23 0:32 ` Gary Hade
0 siblings, 2 replies; 29+ messages in thread
From: Tejun Heo @ 2007-02-21 12:40 UTC (permalink / raw)
To: Gary Hade; +Cc: Kovid Goyal, linux-ide, lcm, Jeff Garzik, konradr
Hello, Gary.
Gary Hade wrote:
> I instrumented the code and found that for the SATA hard drive BSY was set
> just before the call to ahci_init_port() from ahci_port_start() and clear
> after the return from ahci_init_port(). For the GoVault BSY was still set
> after the return from ahci_init_port() and remained set for almost 2 seconds.
>
> The below patch which gives BSY some extra time to clear repairs the problem.
> Unlike the extra delay for ata-piix needed by GoVault I believe this delay
> will only be seen for attached devices that need it. Please let me know
> what you think.
Can you post full dmesg without the patch? And which controller are you
using (lspci -nn please)? How come it doesn't support CLO?
I don't think the patch is a good idea. !BSY wait before reset is
supposed to be done in ->prereset() if possible. ahci skips that step
because waiting for !BSY isn't reliable after hotplug and it often
wastes 30secs for no good reason after an hotplug event.
Even if you need to add that specific shorter wait, the correct place
would be ->prereset() not at the end of ->port_start().
I don't think fixing here and there for GoVault drive is a good idea.
It's not gonna be tested widely and GoValut would end up working on some
controllers while broken on others. I think recently posted EH
improvements and faster reset patches should help here. The first reset
will fail timely and all in all the drive should be detected in slightly
over ten secs, which isn't that bad and actually is okay if parallel
probing is implemented.
I dunno. Maybe the correct thing to do is wait briefly for !BSY in
std_prereset() when SKIP_D2H_BSY is set. What status value does the
controller report in that state - 0x80 or 0xff?
Thanks.
--
tejun
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [2.6.18,19] SATA boot problems (ICH6/ICH6W)
2007-02-21 12:40 ` Tejun Heo
@ 2007-02-22 0:41 ` Gary Hade
2007-02-23 0:32 ` Gary Hade
1 sibling, 0 replies; 29+ messages in thread
From: Gary Hade @ 2007-02-22 0:41 UTC (permalink / raw)
To: Tejun Heo; +Cc: Gary Hade, Kovid Goyal, linux-ide, lcm, Jeff Garzik, konradr
Hi Tejun,
Thanks for the help.
On Wed, Feb 21, 2007 at 09:40:10PM +0900, Tejun Heo wrote:
> Hello, Gary.
>
> Gary Hade wrote:
> > I instrumented the code and found that for the SATA hard drive BSY was set
> > just before the call to ahci_init_port() from ahci_port_start() and clear
> > after the return from ahci_init_port(). For the GoVault BSY was still set
> > after the return from ahci_init_port() and remained set for almost 2 seconds.
> >
> > The below patch which gives BSY some extra time to clear repairs the problem.
> > Unlike the extra delay for ata-piix needed by GoVault I believe this delay
> > will only be seen for attached devices that need it. Please let me know
> > what you think.
>
> Can you post full dmesg without the patch?
Included below.
> And which controller are you using (lspci -nn please)?
[root@elm3a114 ~]# lspci -s 00:1f.2
00:1f.2 IDE interface: Intel Corporation 82801FR/FRW (ICH6R/ICH6RW) SATA Controller (rev 03)
[root@elm3a114 ~]# lspci -s 00:1f.2 -nn
00:1f.2 0101: 8086:2652 (rev 03)
[root@elm3a114 ~]# lspci -s 00:1f.2 -nvv
00:1f.2 0101: 8086:2652 (rev 03) (prog-if 8f)
Subsystem: 1014:2652
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
Latency: 0
Interrupt: pin B routed to IRQ 19
Region 0: I/O ports at 20c0 [size=8]
Region 1: I/O ports at 20b8 [size=4]
Region 2: I/O ports at 20b0 [size=8]
Region 3: I/O ports at 20a4 [size=4]
Region 4: I/O ports at 2090 [size=16]
Region 5: Memory at d8500c00 (32-bit, non-prefetchable) [size=1K]
Capabilities: [70] Power Management version 2
Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot+,D3cold-)
Status: D0 PME-Enable- DSel=0 DScale=0 PME-
> How come it doesn't support CLO?
I don't know.
>
> I don't think the patch is a good idea.
No problem. I asked for your opinion. :)
> !BSY wait before reset is
> supposed to be done in ->prereset() if possible. ahci skips that step
> because waiting for !BSY isn't reliable after hotplug and it often
> wastes 30secs for no good reason after an hotplug event.
>
> Even if you need to add that specific shorter wait, the correct place
> would be ->prereset() not at the end of ->port_start().
Yea, I did see that wait for !BSY in ata_std_prereset() and remember
wondering why it was turned off for ahci where it would have nicely
solved the problem. I actually considered inserting the short wait
there (i.e. the change you suggest below) but since the problem
appeared to happen only during initialization I thought it might be
better to put it in a less frequently visited ahci-only place.
>
> I don't think fixing here and there for GoVault drive is a good idea.
> It's not gonna be tested widely and GoValut would end up working on some
> controllers while broken on others. I think recently posted EH
> improvements and faster reset patches should help here. The first reset
> will fail timely and all in all the drive should be detected in slightly
> over ten secs, which isn't that bad and actually is okay if parallel
> probing is implemented.
I could give them a try but it seems unlikely that they will show
up in all of the soon to be released Distro updates we're worried
about.
>
> I dunno. Maybe the correct thing to do is wait briefly for !BSY in
> std_prereset() when SKIP_D2H_BSY is set.
This definitely works for me if the other information I included
in this message doesn't change your mind.
> What status value does the
> controller report in that state - 0x80 or 0xff?
0x80
I believe I tried with GoVault attached to each of the 4 ports
and always saw 0x80. Below is my ugly 2.6.20-rc6 debug output
(hard drive on ata2, GoVault on ata3) that may be of interest.
Gary
--
Gary Hade
System x Enablement
IBM Linux Technology Center
503-578-4503 IBM T/L: 775-4503
garyhade@us.ibm.com
http://www.ibm.com/linux/ltc
========================
2.6.20-rc7 dmesg - start
========================
Linux version 2.6.20-rc7 (root@elm3a114.beaverton.ibm.com) (gcc version 4.1.1 20070105 (Red Hat 4.1.1-52)) #1 SMP Fri Feb 16 09:54:30 PST 2007
Command line: ro root=LABEL=/
BIOS-provided physical RAM map:
BIOS-e820: 0000000000000000 - 000000000009b800 (usable)
BIOS-e820: 000000000009b800 - 00000000000a0000 (reserved)
BIOS-e820: 00000000000e4000 - 0000000000100000 (reserved)
BIOS-e820: 0000000000100000 - 000000003fef0000 (usable)
BIOS-e820: 000000003fef0000 - 000000003fef6000 (ACPI data)
BIOS-e820: 000000003fef6000 - 000000003ff00000 (ACPI NVS)
BIOS-e820: 000000003ff00000 - 0000000040000000 (reserved)
BIOS-e820: 00000000fec00000 - 00000000fec10000 (reserved)
BIOS-e820: 00000000fee00000 - 00000000fee01000 (reserved)
BIOS-e820: 00000000ff000000 - 0000000100000000 (reserved)
Entering add_active_range(0, 0, 155) 0 entries of 3200 used
Entering add_active_range(0, 256, 261872) 1 entries of 3200 used
end_pfn_map = 1048576
DMI present.
ACPI: RSDP (v000 PTLTD ) @ 0x00000000000f6e60
ACPI: RSDT (v001 PTLTD RSDT 0x06040000 LTP 0x00000000) @ 0x000000003fef024b
ACPI: FADT (v001 INTEL 0x06040000 PTL 0x00000003) @ 0x000000003fef5de7
ACPI: SPCR (v001 PTLTD $UCRTBL$ 0x06040000 PTL 0x00000001) @ 0x000000003fef5e5b
ACPI: MADT (v001 PTLTD APIC 0x06040000 LTP 0x00000000) @ 0x000000003fef5eab
ACPI: BOOT (v001 PTLTD $SBFTBL$ 0x06040000 LTP 0x00000001) @ 0x000000003fef5f05
ACPI: ASF! (v032 MBI 0x06040000 0x00000000) @ 0x000000003fef5f2d
ACPI: MCFG (v001 PTLTD MCFG 0x06040000 LTP 0x00000000) @ 0x000000003fef5fc4
ACPI: DSDT (v001 IBM FIREBIRD 0x06040000 MSFT 0x0100000e) @ 0x0000000000000000
No NUMA configuration found
Faking a node at 0000000000000000-000000003fef0000
Entering add_active_range(0, 0, 155) 0 entries of 3200 used
Entering add_active_range(0, 256, 261872) 1 entries of 3200 used
Bootmem setup node 0 0000000000000000-000000003fef0000
Zone PFN ranges:
DMA 0 -> 4096
DMA32 4096 -> 1048576
Normal 1048576 -> 1048576
early_node_map[2] active PFN ranges
0: 0 -> 155
0: 256 -> 261872
On node 0 totalpages: 261771
DMA zone: 56 pages used for memmap
DMA zone: 9 pages reserved
DMA zone: 3930 pages, LIFO batch:0
DMA32 zone: 3524 pages used for memmap
DMA32 zone: 254252 pages, LIFO batch:31
Normal zone: 0 pages used for memmap
ACPI: PM-Timer IO Port: 0x1008
ACPI: Local APIC address 0xfee00000
ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled)
Processor #0 (Bootup-CPU)
ACPI: LAPIC_NMI (acpi_id[0x00] high edge lint[0x1])
ACPI: IOAPIC (id[0x01] address[0xfec00000] gsi_base[0])
IOAPIC[0]: apic_id 1, address 0xfec00000, GSI 0-23
ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 high edge)
ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
ACPI: IRQ0 used by override.
ACPI: IRQ2 used by override.
ACPI: IRQ9 used by override.
Setting APIC routing to physical flat
Using ACPI (MADT) for SMP configuration information
Nosave address range: 000000000009b000 - 000000000009c000
Nosave address range: 000000000009c000 - 00000000000a0000
Nosave address range: 00000000000a0000 - 00000000000e4000
Nosave address range: 00000000000e4000 - 0000000000100000
Allocating PCI resources starting at 50000000 (gap: 40000000:bec00000)
SMP: Allowing 1 CPUs, 0 hotplug CPUs
PERCPU: Allocating 67840 bytes of per cpu data
Built 1 zonelists. Total pages: 258182
Kernel command line: ro root=LABEL=/
Initializing CPU#0
PID hash table entries: 4096 (order: 12, 32768 bytes)
Console: colour VGA+ 80x25
Dentry cache hash table entries: 131072 (order: 8, 1048576 bytes)
Inode-cache hash table entries: 65536 (order: 7, 524288 bytes)
Checking aperture...
Memory: 1021828k/1047488k available (2377k kernel code, 25256k reserved, 1284k data, 332k init)
Calibrating delay using timer specific routine.. 7185.52 BogoMIPS (lpj=3592761)
Security Framework v1.0.0 initialized
SELinux: Initializing.
SELinux: Starting in permissive mode
selinux_register_security: Registering secondary module capability
Capability LSM initialized as secondary
Mount-cache hash table entries: 256
CPU: Trace cache: 12K uops, L1 D cache: 16K
CPU: L2 cache: 1024K
CPU 0/0 -> Node 0
using mwait in idle threads.
CPU: Hyper-Threading is disabled
CPU0: Thermal monitoring handled by SMI
SMP alternatives: switching to UP code
Freeing SMP alternatives: 32k freed
ACPI: Core revision 20060707
Using local APIC timer interrupts.
result 12468941
Detected 12.468 MHz APIC timer.
Brought up 1 CPUs
testing NMI watchdog ... OK.
Disabling vsyscall due to use of PM timer
time.c: Using 3.579545 MHz WALL PM GTOD PM timer.
time.c: Detected 3591.068 MHz processor.
NET: Registered protocol family 16
No dock devices found.
ACPI: bus type pci registered
PCI: BIOS Bug: MCFG area at e0000000 is not E820-reserved
PCI: Not using MMCONFIG.
PCI: Using configuration type 1
ACPI: Interpreter enabled
ACPI: Using IOAPIC for interrupt routing
ACPI: PCI Root Bridge [PCI0] (0000:00)
PCI: Probing PCI hardware (bus 00)
PCI quirk: region 1000-107f claimed by ICH6 ACPI/GPIO/TCO
PCI quirk: region 1180-11bf claimed by ICH6 GPIO
Boot video device is 0000:01:00.0
PCI: Transparent bridge - 0000:00:1e.0
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.PEG_._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.EXP0._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.PCIB._PRT]
ACPI: PCI Interrupt Link [LNKA] (IRQs 3 10 *11 14 15)
ACPI: PCI Interrupt Link [LNKB] (IRQs 3 10 *11 14 15)
ACPI: PCI Interrupt Link [LNKC] (IRQs 3 10 11 14 15) *5
ACPI: PCI Interrupt Link [LNKD] (IRQs 3 *10 11 14 15)
ACPI: PCI Interrupt Link [LNKE] (IRQs 3 10 11 14 15) *0, disabled.
ACPI: PCI Interrupt Link [LNKF] (IRQs 3 10 11 14 15) *0, disabled.
ACPI: PCI Interrupt Link [LNKG] (IRQs 3 10 11 14 15) *0, disabled.
ACPI: PCI Interrupt Link [LNKH] (IRQs 3 *10 11 14 15)
Linux Plug and Play Support v0.97 (c) Adam Belay
pnp: PnP ACPI init
pnp: PnP ACPI: found 13 devices
usbcore: registered new interface driver usbfs
usbcore: registered new interface driver hub
usbcore: registered new device driver usb
PCI: Using ACPI for IRQ routing
PCI: If a device doesn't work, try "pci=routeirq". If it helps, post a report
NetLabel: Initializing
NetLabel: domain hash size = 128
NetLabel: protocols = UNLABELED CIPSOv4
NetLabel: unlabeled traffic allowed by default
PCI-GART: No AMD northbridge found.
PCI: Bridge: 0000:00:01.0
IO window: 3000-3fff
MEM window: d8100000-d81fffff
PREFETCH window: d0000000-d7ffffff
PCI: Bridge: 0000:00:1c.0
IO window: disabled.
MEM window: d8000000-d80fffff
PREFETCH window: disabled.
PCI: Bridge: 0000:00:1e.0
IO window: disabled.
MEM window: d8200000-d82fffff
PREFETCH window: disabled.
ACPI: PCI Interrupt 0000:00:01.0[A] -> GSI 16 (level, low) -> IRQ 16
PCI: Setting latency timer of device 0000:00:01.0 to 64
ACPI: PCI Interrupt 0000:00:1c.0[A] -> GSI 17 (level, low) -> IRQ 17
PCI: Setting latency timer of device 0000:00:1c.0 to 64
PCI: Enabling device 0000:00:1e.0 (0144 -> 0146)
PCI: Setting latency timer of device 0000:00:1e.0 to 64
NET: Registered protocol family 2
IP route cache hash table entries: 32768 (order: 6, 262144 bytes)
TCP established hash table entries: 131072 (order: 9, 2097152 bytes)
TCP bind hash table entries: 65536 (order: 8, 1048576 bytes)
TCP: Hash tables configured (established 131072 bind 65536)
TCP reno registered
checking if image is initramfs... it is
Freeing initrd memory: 1538k freed
Simple Boot Flag at 0x36 set to 0x1
audit: initializing netlink socket (disabled)
audit(1172091506.301:1): initialized
Total HugeTLB memory allocated, 0
VFS: Disk quotas dquot_6.5.1
Dquot-cache hash table entries: 512 (order 0, 4096 bytes)
SELinux: Registering netfilter hooks
io scheduler noop registered
io scheduler anticipatory registered
io scheduler deadline registered
io scheduler cfq registered (default)
PCI: Setting latency timer of device 0000:00:01.0 to 64
assign_interrupt_mode Found MSI capability
Allocate Port Service[0000:00:01.0:pcie00]
PCI: Setting latency timer of device 0000:00:1c.0 to 64
assign_interrupt_mode Found MSI capability
Allocate Port Service[0000:00:1c.0:pcie00]
Allocate Port Service[0000:00:1c.0:pcie02]
pci_hotplug: PCI Hot Plug PCI Core version: 0.5
ACPI: CPU0 (power states: C1[C1] C3[C3])
ACPI: Processor [CPU0] (supports 8 throttling states)
Real Time Clock Driver v1.12ac
Non-volatile memory driver v1.2
Linux agpgart interface v0.101 (c) Dave Jones
Serial: 8250/16550 driver $Revision: 1.90 $ 4 ports, IRQ sharing enabled
serial8250: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
serial8250: ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A
00:09: ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A
00:0a: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
RAMDISK driver initialized: 16 RAM disks of 16384K size 4096 blocksize
Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
ICH6: IDE controller at PCI slot 0000:00:1f.1
ACPI: PCI Interrupt 0000:00:1f.1[A] -> GSI 18 (level, low) -> IRQ 18
ICH6: chipset revision 3
ICH6: not 100% native mode: will probe irqs later
ide0: BM-DMA at 0x2080-0x2087, BIOS settings: hda:DMA, hdb:pio
Probing IDE interface ide0...
hda: HL-DT-ST GCE-8483B, ATAPI CD/DVD-ROM drive
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
Probing IDE interface ide1...
ide-floppy driver 0.99.newide
usbcore: registered new interface driver hiddev
usbcore: registered new interface driver usbhid
drivers/usb/input/hid-core.c: v2.6:USB HID core driver
PNP: PS/2 Controller [PNP0303:KBC,PNP0f13:PS2M] at 0x60,0x64 irq 1,12
serio: i8042 KBD port at 0x60,0x64 irq 1
serio: i8042 AUX port at 0x60,0x64 irq 12
mice: PS/2 mouse device common for all mice
TCP bic registered
Initializing XFRM netlink socket
NET: Registered protocol family 1
NET: Registered protocol family 17
ACPI: (supports S0 S3 S4 S5)
Freeing unused kernel memory: 332k freed
Write protecting the kernel read-only data: 723k
USB Universal Host Controller Interface driver v3.0
ACPI: PCI Interrupt 0000:00:1d.0[A] -> GSI 23 (level, low) -> IRQ 23
PCI: Setting latency timer of device 0000:00:1d.0 to 64
uhci_hcd 0000:00:1d.0: UHCI Host Controller
uhci_hcd 0000:00:1d.0: new USB bus registered, assigned bus number 1
uhci_hcd 0000:00:1d.0: irq 23, io base 0x00001800
usb usb1: configuration #1 chosen from 1 choice
hub 1-0:1.0: USB hub found
hub 1-0:1.0: 2 ports detected
ACPI: PCI Interrupt 0000:00:1d.1[B] -> GSI 19 (level, low) -> IRQ 19
PCI: Setting latency timer of device 0000:00:1d.1 to 64
uhci_hcd 0000:00:1d.1: UHCI Host Controller
uhci_hcd 0000:00:1d.1: new USB bus registered, assigned bus number 2
uhci_hcd 0000:00:1d.1: irq 19, io base 0x00002000
usb usb2: configuration #1 chosen from 1 choice
hub 2-0:1.0: USB hub found
hub 2-0:1.0: 2 ports detected
ACPI: PCI Interrupt 0000:00:1d.2[C] -> GSI 18 (level, low) -> IRQ 18
PCI: Setting latency timer of device 0000:00:1d.2 to 64
uhci_hcd 0000:00:1d.2: UHCI Host Controller
uhci_hcd 0000:00:1d.2: new USB bus registered, assigned bus number 3
uhci_hcd 0000:00:1d.2: irq 18, io base 0x00002020
usb usb3: configuration #1 chosen from 1 choice
hub 3-0:1.0: USB hub found
hub 3-0:1.0: 2 ports detected
ACPI: PCI Interrupt 0000:00:1d.3[D] -> GSI 16 (level, low) -> IRQ 16
PCI: Setting latency timer of device 0000:00:1d.3 to 64
uhci_hcd 0000:00:1d.3: UHCI Host Controller
uhci_hcd 0000:00:1d.3: new USB bus registered, assigned bus number 4
uhci_hcd 0000:00:1d.3: irq 16, io base 0x00002040
usb usb4: configuration #1 chosen from 1 choice
hub 4-0:1.0: USB hub found
hub 4-0:1.0: 2 ports detected
ohci_hcd: 2006 August 04 USB 1.1 'Open' Host Controller (OHCI) Driver (PCI)
ACPI: PCI Interrupt 0000:00:1d.7[A] -> GSI 23 (level, low) -> IRQ 23
PCI: Setting latency timer of device 0000:00:1d.7 to 64
ehci_hcd 0000:00:1d.7: EHCI Host Controller
ehci_hcd 0000:00:1d.7: new USB bus registered, assigned bus number 5
ehci_hcd 0000:00:1d.7: debug port 1
PCI: cache line size of 128 is not supported by device 0000:00:1d.7
ehci_hcd 0000:00:1d.7: irq 23, io mem 0xd8500000
ehci_hcd 0000:00:1d.7: USB 2.0 started, EHCI 1.00, driver 10 Dec 2004
usb usb5: configuration #1 chosen from 1 choice
hub 5-0:1.0: USB hub found
hub 5-0:1.0: 8 ports detected
input: PS/2 Generic Mouse as /class/input/input0
input: AT Translated Set 2 keyboard as /class/input/input1
SCSI subsystem initialized
libata version 2.00 loaded.
ahci 0000:00:1f.2: version 2.0
ACPI: PCI Interrupt 0000:00:1f.2[B] -> GSI 19 (level, low) -> IRQ 19
ahci 0000:00:1f.2: PORTS_IMPL is zero, forcing 0xf
PCI: Setting latency timer of device 0000:00:1f.2 to 64
ahci 0000:00:1f.2: AHCI 0001.0000 32 slots 4 ports 1.5 Gbps 0xf impl IDE mode
ahci 0000:00:1f.2: flags: 64bit ncq pm led pmp slum part
ata1: SATA max UDMA/133 cmd 0xFFFFC2000001CD00 ctl 0x0 bmdma 0x0 irq 19
ata2: SATA max UDMA/133 cmd 0xFFFFC2000001CD80 ctl 0x0 bmdma 0x0 irq 19
ata3: SATA max UDMA/133 cmd 0xFFFFC2000001CE00 ctl 0x0 bmdma 0x0 irq 19
ata4: SATA max UDMA/133 cmd 0xFFFFC2000001CE80 ctl 0x0 bmdma 0x0 irq 19
scsi0 : ahci
ata1: SATA link down (SStatus 0 SControl 300)
scsi1 : ahci
ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata2.00: ATA-7, max UDMA/133, 156312576 sectors: LBA
ata2.00: ata2: dev 0 multi count 0
ata2.00: configured for UDMA/133
scsi2 : ahci
ata3: softreset failed (port busy but CLO unavailable)
ata3: softreset failed, retrying in 5 secs
ata3: port is slow to respond, please be patient (Status 0x80)
ata3: port failed to respond (30 secs, Status 0x80)
ata3: COMRESET failed (device not ready)
ata3: hardreset failed, retrying in 5 secs
ata3: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata3.00: ATAPI, max UDMA/66
ata3.00: configured for UDMA/66
scsi3 : ahci
ata4: SATA link down (SStatus 0 SControl 300)
scsi 1:0:0:0: Direct-Access ATA Maxtor 6Y080M0 YAR5 PQ: 0 ANSI: 5
SCSI device sda: 156312576 512-byte hdwr sectors (80032 MB)
sda: Write Protect is off
sda: Mode Sense: 00 3a 00 00
SCSI device sda: write cache: enabled, read cache: enabled, doesn't support DPO or FUA
SCSI device sda: 156312576 512-byte hdwr sectors (80032 MB)
sda: Write Protect is off
sda: Mode Sense: 00 3a 00 00
SCSI device sda: write cache: enabled, read cache: enabled, doesn't support DPO or FUA
sda: sda1 sda2 sda3 sda4 < sda5 sda6 sda7 sda8 sda9 sda10 sda11 sda12 sda13 >
sd 1:0:0:0: Attached scsi disk sda
scsi 2:0:0:0: Direct-Access IBM GoVault 0096 PQ: 0 ANSI: 5
SCSI device sdb: 156300464 512-byte hdwr sectors (80026 MB)
sdb: Write Protect is off
sdb: Mode Sense: 00 74 00 00
sdb: cache data unavailable
sdb: assuming drive cache: write through
SCSI device sdb: 156300464 512-byte hdwr sectors (80026 MB)
sdb: Write Protect is off
sdb: Mode Sense: 00 74 00 00
sdb: cache data unavailable
sdb: assuming drive cache: write through
sdb: sdb1 sdb2 sdb3 sdb4 < sdb5 sdb6 >
sd 2:0:0:0: Attached scsi removable disk sdb
kjournald starting. Commit interval 5 seconds
EXT3-fs: mounted filesystem with ordered data mode.
audit(1172091564.004:2): enforcing=1 old_enforcing=0 auid=4294967295
security: 3 users, 6 roles, 1587 types, 172 bools, 1 sens, 1024 cats
security: 59 classes, 49762 rules
security: class dccp_socket not defined in policy
security: permission dccp_recv in class node not defined in policy
security: permission dccp_send in class node not defined in policy
security: permission dccp_recv in class netif not defined in policy
security: permission dccp_send in class netif not defined in policy
SELinux: Completing initialization.
SELinux: Setting up existing superblocks.
SELinux: initialized (dev sda8, type ext3), uses xattr
SELinux: initialized (dev usbfs, type usbfs), uses genfs_contexts
SELinux: initialized (dev tmpfs, type tmpfs), uses transition SIDs
SELinux: initialized (dev debugfs, type debugfs), uses genfs_contexts
SELinux: initialized (dev selinuxfs, type selinuxfs), uses genfs_contexts
SELinux: initialized (dev mqueue, type mqueue), uses transition SIDs
SELinux: initialized (dev hugetlbfs, type hugetlbfs), uses genfs_contexts
SELinux: initialized (dev devpts, type devpts), uses transition SIDs
SELinux: initialized (dev eventpollfs, type eventpollfs), uses task SIDs
SELinux: initialized (dev inotifyfs, type inotifyfs), uses genfs_contexts
SELinux: initialized (dev tmpfs, type tmpfs), uses transition SIDs
SELinux: initialized (dev futexfs, type futexfs), uses genfs_contexts
SELinux: initialized (dev pipefs, type pipefs), uses task SIDs
SELinux: initialized (dev sockfs, type sockfs), uses task SIDs
SELinux: initialized (dev cpuset, type cpuset), not configured for labeling
SELinux: initialized (dev proc, type proc), uses genfs_contexts
SELinux: initialized (dev bdev, type bdev), uses genfs_contexts
SELinux: initialized (dev rootfs, type rootfs), uses genfs_contexts
SELinux: initialized (dev sysfs, type sysfs), uses genfs_contexts
audit(1172091564.636:3): policy loaded auid=4294967295
tg3.c:v3.72 (January 8, 2007)
ACPI: PCI Interrupt 0000:02:00.0[A] -> GSI 16 (level, low) -> IRQ 16
PCI: Setting latency timer of device 0000:02:00.0 to 64
eth0: Tigon3 [partno(BCM95721) rev 4100 PHY(5750)] (PCI Express) 10/100/1000Base-T Ethernet 00:0d:60:b7:76:b3
eth0: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] Split[0] WireSpeed[1] TSOcap[1]
eth0: dma_rwctrl[76180000] dma_mask[64-bit]
ACPI: PCI Interrupt 0000:00:1f.3[B] -> GSI 19 (level, low) -> IRQ 19
input: PC Speaker as /class/input/input2
parport: PnPBIOS parport detected.
parport0: PC-style at 0x378 (0x778), irq 7 [PCSPP,TRISTATE,EPP]
intel_rng: FWH not detected
hda: ATAPI 48X CD-ROM CD-R/RW drive, 2048kB Cache, UDMA(33)
Uniform CD-ROM driver Revision: 3.20
shpchp: Standard Hot Plug PCI Controller Driver version: 0.4
sd 1:0:0:0: Attached scsi generic sg0 type 0
sd 2:0:0:0: Attached scsi generic sg1 type 0
PCI: Enabling device 0000:00:1e.2 (0005 -> 0007)
ACPI: PCI Interrupt 0000:00:1e.2[A] -> GSI 17 (level, low) -> IRQ 17
PCI: Setting latency timer of device 0000:00:1e.2 to 64
intel8x0_measure_ac97_clock: measured 50307 usecs
intel8x0: clocking to 48000
FDC 0 is a National Semiconductor PC87306
lp0: using parport0 (interrupt-driven).
lp0: console ready
input: Power Button (FF) as /class/input/input3
ACPI: Power Button (FF) [PWRF]
input: Power Button (CM) as /class/input/input4
ACPI: Power Button (CM) [PWRB]
ibm_acpi: ec object not found
md: Autodetecting RAID arrays.
md: autorun ...
md: ... autorun DONE.
device-mapper: ioctl: 4.11.0-ioctl (2006-10-12) initialised: dm-devel@redhat.com
EXT3 FS on sda8, internal journal
SELinux: initialized (dev tmpfs, type tmpfs), uses transition SIDs
SELinux: initialized (dev sda1, type ext2), uses xattr
kjournald starting. Commit interval 5 seconds
EXT3 FS on sda2, internal journal
EXT3-fs: mounted filesystem with ordered data mode.
SELinux: initialized (dev sda2, type ext3), uses xattr
Adding 1028120k swap on /dev/sda12. Priority:-1 extents:1 across:1028120k
SELinux: initialized (dev binfmt_misc, type binfmt_misc), uses genfs_contexts
IA-32 Microcode Update Driver: v1.14a <tigran@aivazian.fsnet.co.uk>
NET: Registered protocol family 10
lo: Disabled Privacy Extensions
process `sysctl' is using deprecated sysctl (syscall) net.ipv6.neigh.lo.retrans_time; Use net.ipv6.neigh.lo.retrans_time_ms instead.
ADDRCONF(NETDEV_UP): eth0: link is not ready
tg3: eth0: Link is up at 1000 Mbps, full duplex.
tg3: eth0: Flow control is on for TX and on for RX.
ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
audit(1172091587.001:4): audit_pid=2126 old=0 by auid=4294967295 subj=system_u:system_r:auditd_t:s0
SELinux: initialized (dev rpc_pipefs, type rpc_pipefs), uses genfs_contexts
Bluetooth: Core ver 2.11
NET: Registered protocol family 31
Bluetooth: HCI device and connection manager initialized
Bluetooth: HCI socket layer initialized
Bluetooth: L2CAP ver 2.8
Bluetooth: L2CAP socket layer initialized
Bluetooth: RFCOMM socket layer initialized
Bluetooth: RFCOMM TTY layer initialized
Bluetooth: RFCOMM ver 1.8
Bluetooth: HIDP (Human Interface Emulation) ver 1.1
SELinux: initialized (dev autofs, type autofs), uses genfs_contexts
SELinux: initialized (dev autofs, type autofs), uses genfs_contexts
SELinux: initialized (dev autofs, type autofs), uses genfs_contexts
eth0: no IPv6 routers present
[drm] Initialized drm 1.1.0 20060810
ACPI: PCI Interrupt 0000:01:00.0[A] -> GSI 16 (level, low) -> IRQ 16
[drm] Initialized radeon 1.25.0 20060524 on minor 0
[drm] Setting GART location based on new memory map
[drm] Loading R300 Microcode
[drm] writeback test succeeded in 1 usecs
========================
2.6.20-rc7 dmesg - end
========================
===============================
2.6.20-rc6 debug output - start
===============================
...
libata version 2.00 loaded.
ahci 0000:00:1f.2: version 2.0
ACPI: PCI Interrupt 0000:00:1f.2[B] -> GSI 19 (level, low) -> IRQ 19
ahci 0000:00:1f.2: PORTS_IMPL is zero, forcing 0xf
XXX - ~PORT_CMD_FIS_RX in ahci_stop_fix_rx
XXX - ~PORT_CMD_FIS_RX in ahci_stop_fix_rx
XXX - ~PORT_CMD_FIS_RX in ahci_stop_fix_rx
XXX - ~PORT_CMD_FIS_RX in ahci_stop_fix_rx
PCI: Setting latency timer of device 0000:00:1f.2 to 64
ahci 0000:00:1f.2: AHCI 0001.0000 32 slots 4 ports 1.5 Gbps 0xf impl IDE mode
ahci 0000:00:1f.2: flags: 64bit ncq pm led pmp slum part
ata1: XXX - in ata_device_add before port_start
ata1: XXX entering ahci_port_start status=0x7f BSY=0 DRQ=1 CR=0 FR=0 ICC=0 CCS=0x0, CI=0x0
ata1: XXX in ahci_port_start before ahci_power_up status=0x7f BSY=0 DRQ=1 CR=0 FR=0 ICC=0 CCS=0x0, CI=0x0
XXX - PORT_CMD_ICC_ACTIVE in ahci_power_up
ata1: XXX in ahci_port_start after ahci_power_up status=0x7f BSY=0 DRQ=1 CR=0 FR=0 ICC=0 CCS=0x0, CI=0x0
XXX - PORT_CMD_FIS_RX in ahci_start_fis_rx
ata1: XXX in ahci_init_port after ahci_start_fis_rx status=0x7f BSY=0 DRQ=1 CR=0 FR=1 ICC=0 CCS=0x0, CI=0x0
XXX - PORT_CMD_START in ahci_start_engine
ata1: XXX in ahci_init_port after ahci_start_engine status=0x7f BSY=0 DRQ=1 CR=1 FR=1 ICC=0 CCS=0x0, CI=0x0
ata1: XXX in ahci_port_start after ahci_init_port status=0x7f BSY=0 DRQ=1 CR=1 FR=1 ICC=0 CCS=0x0, CI=0x0
ata1: XXX - in ata_device_add after port_start
ata1: SATA max UDMA/133 cmd 0xFFFFC2000001CD00 ctl 0x0 bmdma 0x0 irq 19
ata1: XXX in ahci_freeze before turn IRQ off status=0x7f BSY=0 DRQ=1 CR=1 FR=1 ICC=0 CCS=0x0, CI=0x0
ata1: XXX in ahci_freeze after turn IRQ off status=0x7f BSY=0 DRQ=1 CR=1 FR=1 ICC=0 CCS=0x0, CI=0x0
ata1: XXX ata_device_add after ata_eh_freeze_port: status=0x7f BSY=0 DRQ=1
ata2: XXX - in ata_device_add before port_start
ata2: XXX entering ahci_port_start status=0x80 BSY=1 DRQ=0 CR=0 FR=0 ICC=0 CCS=0x0, CI=0x0
ata2: XXX in ahci_port_start before ahci_power_up status=0x80 BSY=1 DRQ=0 CR=0 FR=0 ICC=0 CCS=0x0, CI=0x0
XXX - PORT_CMD_ICC_ACTIVE in ahci_power_up
ata2: XXX in ahci_port_start after ahci_power_up status=0x80 BSY=1 DRQ=0 CR=0 FR=0 ICC=0 CCS=0x0, CI=0x0
XXX - PORT_CMD_FIS_RX in ahci_start_fis_rx
ata2: XXX in ahci_init_port after ahci_start_fis_rx status=0x50 BSY=0 DRQ=0 CR=0 FR=1 ICC=0 CCS=0x0, CI=0x0
XXX - PORT_CMD_START in ahci_start_engine
ata2: XXX in ahci_init_port after ahci_start_engine status=0x50 BSY=0 DRQ=0 CR=1 FR=1 ICC=0 CCS=0x0, CI=0x0
ata2: XXX in ahci_port_start after ahci_init_port status=0x50 BSY=0 DRQ=0 CR=1 FR=1 ICC=0 CCS=0x0, CI=0x0
ata2: XXX - in ata_device_add after port_start
ata2: SATA max UDMA/133 cmd 0xFFFFC2000001CD80 ctl 0x0 bmdma 0x0 irq 19
ata2: XXX in ahci_freeze before turn IRQ off status=0x50 BSY=0 DRQ=0 CR=1 FR=1 ICC=0 CCS=0x0, CI=0x0
ata2: XXX in ahci_freeze after turn IRQ off status=0x50 BSY=0 DRQ=0 CR=1 FR=1 ICC=0 CCS=0x0, CI=0x0
ata2: XXX ata_device_add after ata_eh_freeze_port: status=0x50 BSY=0 DRQ=0
ata3: XXX - in ata_device_add before port_start
ata3: XXX entering ahci_port_start status=0x80 BSY=1 DRQ=0 CR=0 FR=0 ICC=0 CCS=0x0, CI=0x0
ata3: XXX in ahci_port_start before ahci_power_up status=0x80 BSY=1 DRQ=0 CR=0 FR=0 ICC=0 CCS=0x0, CI=0x0
XXX - PORT_CMD_ICC_ACTIVE in ahci_power_up
ata3: XXX in ahci_port_start after ahci_power_up status=0x80 BSY=1 DRQ=0 CR=0 FR=0 ICC=0 CCS=0x0, CI=0x0
XXX - PORT_CMD_FIS_RX in ahci_start_fis_rx
ata3: XXX in ahci_init_port after ahci_start_fis_rx status=0x80 BSY=1 DRQ=0 CR=0 FR=1 ICC=0 CCS=0x0, CI=0x0
XXX - PORT_CMD_START in ahci_start_engine
ata3: XXX in ahci_init_port after ahci_start_engine status=0x80 BSY=1 DRQ=0 CR=1 FR=1 ICC=0 CCS=0x0, CI=0x0
ata3: XXX in ahci_port_start after ahci_init_port status=0x80 BSY=1 DRQ=0 CR=1 FR=1 ICC=0 CCS=0x0, CI=0x0
ata3: XXX - in ata_device_add after port_start
ata3: SATA max UDMA/133 cmd 0xFFFFC2000001CE00 ctl 0x0 bmdma 0x0 irq 19
ata3: XXX in ahci_freeze before turn IRQ off status=0x80 BSY=1 DRQ=0 CR=1 FR=1 ICC=0 CCS=0x0, CI=0x0
ata3: XXX in ahci_freeze after turn IRQ off status=0x80 BSY=1 DRQ=0 CR=1 FR=1 ICC=0 CCS=0x0, CI=0x0
ata3: XXX ata_device_add after ata_eh_freeze_port: status=0x80 BSY=1 DRQ=0
ata4: XXX - in ata_device_add before port_start
ata4: XXX entering ahci_port_start status=0x7f BSY=0 DRQ=1 CR=0 FR=0 ICC=0 CCS=0x0, CI=0x0
ata4: XXX in ahci_port_start before ahci_power_up status=0x7f BSY=0 DRQ=1 CR=0 FR=0 ICC=0 CCS=0x0, CI=0x0
XXX - PORT_CMD_ICC_ACTIVE in ahci_power_up
ata4: XXX in ahci_port_start after ahci_power_up status=0x7f BSY=0 DRQ=1 CR=0 FR=0 ICC=0 CCS=0x0, CI=0x0
XXX - PORT_CMD_FIS_RX in ahci_start_fis_rx
ata4: XXX in ahci_init_port after ahci_start_fis_rx status=0x7f BSY=0 DRQ=1 CR=0 FR=1 ICC=0 CCS=0x0, CI=0x0
XXX - PORT_CMD_START in ahci_start_engine
ata4: XXX in ahci_init_port after ahci_start_engine status=0x7f BSY=0 DRQ=1 CR=1 FR=1 ICC=0 CCS=0x0, CI=0x0
ata4: XXX in ahci_port_start after ahci_init_port status=0x7f BSY=0 DRQ=1 CR=1 FR=1 ICC=0 CCS=0x0, CI=0x0
ata4: XXX - in ata_device_add after port_start
ata4: SATA max UDMA/133 cmd 0xFFFFC2000001CE80 ctl 0x0 bmdma 0x0 irq 19
ata4: XXX in ahci_freeze before turn IRQ off status=0x7f BSY=0 DRQ=1 CR=1 FR=1 ICC=0 CCS=0x0, CI=0x0
ata4: XXX in ahci_freeze after turn IRQ off status=0x7f BSY=0 DRQ=1 CR=1 FR=1 ICC=0 CCS=0x0, CI=0x0
ata4: XXX ata_device_add after ata_eh_freeze_port: status=0x7f BSY=0 DRQ=1
ata1: XXX ata_device_add before sata_scr_read: status=0x7f BSY=0 DRQ=1
ata1: XXX ata_device_add before scsi_add_host: status=0x7f BSY=0 DRQ=1
scsi0 : ahci
ata1: XXX ata_device_add before scheduleing eh: status=0x7f BSY=0 DRQ=1
ata1: XXX entering ahci_error_handler: status=0x7f BSY=0 DRQ=1 CR=1 FR=1 ICC=0 CCS=0x0, CI=0x0
ata1: XXX in ahci_freeze before turn IRQ off status=0x7f BSY=0 DRQ=1 CR=1 FR=1 ICC=0 CCS=0x0, CI=0x0
ata1: XXX in ahci_freeze after turn IRQ off status=0x7f BSY=0 DRQ=1 CR=1 FR=1 ICC=0 CCS=0x0, CI=0x0
ata1: XXX before ata_port_offline status=0x7f BSY=0 DRQ=1 CR=1 FR=1 ICC=0 CCS=0x0, CI=0x0
ata1: SATA link down (SStatus 0 SControl 300)
XXX - ~PORT_CMD_ATAPI in ahci_postreset
ata2: XXX ata_device_add before sata_scr_read: status=0x50 BSY=0 DRQ=0
ata2: XXX ata_device_add before scsi_add_host: status=0x50 BSY=0 DRQ=0
scsi1 : ahci
ata2: XXX ata_device_add before scheduleing eh: status=0x50 BSY=0 DRQ=0
ata2: XXX entering ahci_error_handler: status=0x50 BSY=0 DRQ=0 CR=1 FR=1 ICC=0 CCS=0x0, CI=0x0
ata2: XXX in ahci_freeze before turn IRQ off status=0x50 BSY=0 DRQ=0 CR=1 FR=1 ICC=0 CCS=0x0, CI=0x0
ata2: XXX in ahci_freeze after turn IRQ off status=0x50 BSY=0 DRQ=0 CR=1 FR=1 ICC=0 CCS=0x0, CI=0x0
ata2: XXX before ata_port_offline status=0x50 BSY=0 DRQ=0 CR=1 FR=1 ICC=0 CCS=0x0, CI=0x0
ata2: XXX before stop_engine status=0x50 BSY=0 DRQ=0 CR=1 FR=1 ICC=0 CCS=0x0, CI=0x0
XXX - ~PORT_CMD_START in ahci_stop_engine
ata2: XXX before BSY/DRQ check status=0x50 BSY=0 DRQ=0 CR=0 FR=1 ICC=0 CCS=0x0, CI=0x0
XXX - PORT_CMD_START in ahci_start_engine
ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
XXX - ~PORT_CMD_ATAPI in ahci_postreset
ata2.00: ATA-7, max UDMA/133, 156312576 sectors: LBA
ata2.00: ata2: dev 0 multi count 0
ata2.00: configured for UDMA/133
ata3: XXX ata_device_add before sata_scr_read: status=0x80 BSY=1 DRQ=0
ata3: XXX ata_device_add before scsi_add_host: status=0x80 BSY=1 DRQ=0
scsi2 : ahci
ata3: XXX ata_device_add before scheduleing eh: status=0x80 BSY=1 DRQ=0
ata3: XXX entering ahci_error_handler: status=0x80 BSY=1 DRQ=0 CR=1 FR=1 ICC=0 CCS=0x0, CI=0x0
ata3: XXX in ahci_freeze before turn IRQ off status=0x80 BSY=1 DRQ=0 CR=1 FR=1 ICC=0 CCS=0x0, CI=0x0
ata3: XXX in ahci_freeze after turn IRQ off status=0x80 BSY=1 DRQ=0 CR=1 FR=1 ICC=0 CCS=0x0, CI=0x0
ata3: XXX before ata_port_offline status=0x80 BSY=1 DRQ=0 CR=1 FR=1 ICC=0 CCS=0x0, CI=0x0
ata3: XXX before stop_engine status=0x80 BSY=1 DRQ=0 CR=1 FR=1 ICC=0 CCS=0x0, CI=0x0
XXX - ~PORT_CMD_START in ahci_stop_engine
ata3: XXX before BSY/DRQ check status=0x80 BSY=1 DRQ=0 CR=0 FR=1 ICC=0 CCS=0x0, CI=0x0
XXX - PORT_CMD_START in ahci_start_engine
ata3: softreset failed (port busy but CLO unavailable)
ata3: softreset failed, retrying in 5 secs
XXX - ~PORT_CMD_START in ahci_stop_engine
ata3: port is slow to respond, please be patient (Status 0x80)
ata3: port failed to respond (30 secs, Status 0x80)
ata3: COMRESET failed (device not ready)
XXX - PORT_CMD_START in ahci_start_engine
ata3: hardreset failed, retrying in 5 secs
XXX - ~PORT_CMD_START in ahci_stop_engine
XXX - PORT_CMD_START in ahci_start_engine
ata3: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
XXX - PORT_CMD_ATAPI in ahci_postreset
ata3.00: ATAPI, max UDMA/66
ata3.00: configured for UDMA/66
ata4: XXX ata_device_add before sata_scr_read: status=0x7f BSY=0 DRQ=1
ata4: XXX ata_device_add before scsi_add_host: status=0x7f BSY=0 DRQ=1
scsi3 : ahci
ata4: XXX ata_device_add before scheduleing eh: status=0x7f BSY=0 DRQ=1
ata4: XXX entering ahci_error_handler: status=0x7f BSY=0 DRQ=1 CR=1 FR=1 ICC=0 CCS=0x0, CI=0x0
ata4: XXX in ahci_freeze before turn IRQ off status=0x7f BSY=0 DRQ=1 CR=1 FR=1 ICC=0 CCS=0x0, CI=0x0
ata4: XXX in ahci_freeze after turn IRQ off status=0x7f BSY=0 DRQ=1 CR=1 FR=1 ICC=0 CCS=0x0, CI=0x0
ata4: XXX before ata_port_offline status=0x7f BSY=0 DRQ=1 CR=1 FR=1 ICC=0 CCS=0x0, CI=0x0
ata4: SATA link down (SStatus 0 SControl 300)
XXX - ~PORT_CMD_ATAPI in ahci_postreset
scsi 1:0:0:0: Direct-Access ATA Maxtor 6Y080M0 YAR5 PQ: 0 ANSI: 5
SCSI device sda: 156312576 512-byte hdwr sectors (80032 MB)
sda: Write Protect is off
sda: Mode Sense: 00 3a 00 00
SCSI device sda: write cache: enabled, read cache: enabled, doesn't support DPO or FUA
SCSI device sda: 156312576 512-byte hdwr sectors (80032 MB)
sda: Write Protect is off
sda: Mode Sense: 00 3a 00 00
SCSI device sda: write cache: enabled, read cache: enabled, doesn't support DPO or FUA
sda: sda1 sda2 sda3 sda4 < sda5 sda6 sda7 sda8 sda9 sda10 sda11 sda12 sda13 >
sd 1:0:0:0: Attached scsi disk sda
scsi 2:0:0:0: Direct-Access IBM GoVault 0096 PQ: 0 ANSI: 5
ata3: XXX entering ahci_error_handler: status=0x51 BSY=0 DRQ=0 CR=1 FR=1 ICC=0 CCS=0x0, CI=0x1
XXX - ~PORT_CMD_START in ahci_stop_engine
ata3: XXX in ahci_error_handler after ahci_stop_engine status=0x51 BSY=0 DRQ=0 CR=0 FR=1 ICC=0 CCS=0x0, CI=0x0
XXX - PORT_CMD_START in ahci_start_engine
ata3: XXX in ahci_error_handler after ahci_start_engine status=0x51 BSY=0 DRQ=0 CR=1 FR=1 ICC=0 CCS=0x0, CI=0x0
ata3: XXX entering ahci_error_handler: status=0x51 BSY=0 DRQ=0 CR=1 FR=1 ICC=0 CCS=0x0, CI=0x1
XXX - ~PORT_CMD_START in ahci_stop_engine
ata3: XXX in ahci_error_handler after ahci_stop_engine status=0x51 BSY=0 DRQ=0 CR=0 FR=1 ICC=0 CCS=0x0, CI=0x0
XXX - PORT_CMD_START in ahci_start_engine
ata3: XXX in ahci_error_handler after ahci_start_engine status=0x51 BSY=0 DRQ=0 CR=1 FR=1 ICC=0 CCS=0x0, CI=0x0
SCSI device sdb: 156300464 512-byte hdwr sectors (80026 MB)
sdb: Write Protect is off
sdb: Mode Sense: 00 74 00 00
ata3: XXX entering ahci_error_handler: status=0x51 BSY=0 DRQ=0 CR=1 FR=1 ICC=0 CCS=0x0, CI=0x1
XXX - ~PORT_CMD_START in ahci_stop_engine
ata3: XXX in ahci_error_handler after ahci_stop_engine status=0x51 BSY=0 DRQ=0 CR=0 FR=1 ICC=0 CCS=0x0, CI=0x0
XXX - PORT_CMD_START in ahci_start_engine
ata3: XXX in ahci_error_handler after ahci_start_engine status=0x51 BSY=0 DRQ=0 CR=1 FR=1 ICC=0 CCS=0x0, CI=0x0
sdb: cache data unavailable
sdb: assuming drive cache: write through
SCSI device sdb: 156300464 512-byte hdwr sectors (80026 MB)
sdb: Write Protect is off
sdb: Mode Sense: 00 74 00 00
ata3: XXX entering ahci_error_handler: status=0x51 BSY=0 DRQ=0 CR=1 FR=1 ICC=0 CCS=0x0, CI=0x1
XXX - ~PORT_CMD_START in ahci_stop_engine
ata3: XXX in ahci_error_handler after ahci_stop_engine status=0x51 BSY=0 DRQ=0 CR=0 FR=1 ICC=0 CCS=0x0, CI=0x0
XXX - PORT_CMD_START in ahci_start_engine
ata3: XXX in ahci_error_handler after ahci_start_engine status=0x51 BSY=0 DRQ=0 CR=1 FR=1 ICC=0 CCS=0x0, CI=0x0
sdb: cache data unavailable
sdb: assuming drive cache: write through
sdb: sdb1 sdb2 sdb3 sdb4 < sdb5 sdb6 >
sd 2:0:0:0: Attached scsi removable disk sdb
...
===============================
2.6.20-rc6 debug output - end
===============================
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [2.6.18,19] SATA boot problems (ICH6/ICH6W)
2007-02-21 12:40 ` Tejun Heo
2007-02-22 0:41 ` Gary Hade
@ 2007-02-23 0:32 ` Gary Hade
1 sibling, 0 replies; 29+ messages in thread
From: Gary Hade @ 2007-02-23 0:32 UTC (permalink / raw)
To: Tejun Heo; +Cc: Gary Hade, Kovid Goyal, linux-ide, lcm, Jeff Garzik, konradr
On Wed, Feb 21, 2007 at 09:40:10PM +0900, Tejun Heo wrote:
> Hello, Gary.
>
> Gary Hade wrote:
> > I instrumented the code and found that for the SATA hard drive BSY was set
> > just before the call to ahci_init_port() from ahci_port_start() and clear
> > after the return from ahci_init_port(). For the GoVault BSY was still set
> > after the return from ahci_init_port() and remained set for almost 2 seconds.
> >
> > The below patch which gives BSY some extra time to clear repairs the problem.
> > Unlike the extra delay for ata-piix needed by GoVault I believe this delay
> > will only be seen for attached devices that need it. Please let me know
> > what you think.
>
> Can you post full dmesg without the patch? And which controller are you
> using (lspci -nn please)? How come it doesn't support CLO?
>
> I don't think the patch is a good idea. !BSY wait before reset is
> supposed to be done in ->prereset() if possible. ahci skips that step
> because waiting for !BSY isn't reliable after hotplug and it often
> wastes 30secs for no good reason after an hotplug event.
>
> Even if you need to add that specific shorter wait, the correct place
> would be ->prereset() not at the end of ->port_start().
>
> I don't think fixing here and there for GoVault drive is a good idea.
> It's not gonna be tested widely and GoValut would end up working on some
> controllers while broken on others. I think recently posted EH
> improvements and faster reset patches should help here. The first reset
> will fail timely and all in all the drive should be detected in slightly
> over ten secs, which isn't that bad and actually is okay if parallel
> probing is implemented.
>
> I dunno. Maybe the correct thing to do is wait briefly for !BSY in
> std_prereset() when SKIP_D2H_BSY is set.
Tejun, I will be on vacation for a few days (returning 3/1) so I
thought I would go ahead and provide the below patch before you have
had a chance to review the information I sent you yesterday and actually
confirm that it still looks like a viable solution.
Thanks,
Gary
--
Gary Hade
System x Enablement
IBM Linux Technology Center
503-578-4503 IBM T/L: 775-4503
garyhade@us.ibm.com
http://www.ibm.com/linux/ltc
We encountered a problem where the BSY status bit is still
set on entry to the 'ahci' error handler during initialization
of the Quantum GoVault when attached to an ICH6R/ICH6RW controller.
This caused a software reset failure due to failed BSY/DRQ check
in ahci_softreset() forcing a hard reset with the following
messages logged.
ata1: softreset failed (port busy but CLO unavailable)
ata1: softreset failed, retrying in 5 secs
ata1: port is slow to respond, please be patient (Status 0x80)
ata1: port failed to respond (30 secs, Status 0x80)
ata1: COMRESET failed (device not ready)
ata1: hardreset failed, retrying in 5 secs
It was taking almost 2 seconds for BSY to clear following the
return from ahci_init_port() in ahci_port_start() so this patch
gives BSY up to 3 seconds extra time to clear during prereset
eliminating the problem.
Signed-off-by: Gary Hade <garyhade@us.ibm.com>
--- linux-2.6.20-rc7/drivers/ata/libata-core.c.orig 2007-02-22 11:52:34.000000000 -0800
+++ linux-2.6.20-rc7/drivers/ata/libata-core.c 2007-02-22 14:59:44.000000000 -0800
@@ -2915,6 +2915,8 @@ int ata_std_prereset(struct ata_port *ap
struct ata_eh_context *ehc = &ap->eh_context;
const unsigned long *timing = sata_ehc_deb_timing(ehc);
int rc;
+ u8 status;
+ unsigned long timeout;
/* handle link resume & hotplug spinup */
if ((ehc->i.flags & ATA_EHI_RESUME_LINK) &&
@@ -2942,9 +2944,19 @@ int ata_std_prereset(struct ata_port *ap
/* Wait for !BSY if the controller can wait for the first D2H
* Reg FIS and we don't know that no device is attached.
+ * For other controllers a brief wait (up to 3 secs) may be
+ * needed for some devices.
*/
if (!(ap->flags & ATA_FLAG_SKIP_D2H_BSY) && !ata_port_offline(ap))
ata_busy_sleep(ap, ATA_TMOUT_BOOT_QUICK, ATA_TMOUT_BOOT);
+ else {
+ timeout = jiffies + 3*HZ;
+ status = ata_busy_wait(ap, ATA_BUSY, 300);
+ while ((status & ATA_BUSY) && time_before(jiffies, timeout)) {
+ msleep(50);
+ status = ata_busy_wait(ap, ATA_BUSY, 3);
+ }
+ }
return 0;
}
^ permalink raw reply [flat|nested] 29+ messages in thread
end of thread, other threads:[~2007-02-23 0:32 UTC | newest]
Thread overview: 29+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2006-12-11 18:03 [2.6.18,19] SATA boot problems (ICH6/ICH6W) Kovid Goyal
2006-12-20 0:44 ` Tejun Heo
2006-12-20 2:00 ` Kovid Goyal
2006-12-20 2:13 ` Tejun Heo
2006-12-20 4:56 ` Kovid Goyal
2007-01-11 23:32 ` Kovid Goyal
2007-01-13 2:19 ` Tejun Heo
2006-12-20 3:29 ` Gary Hade
2006-12-20 3:53 ` Tejun Heo
2006-12-20 4:30 ` Tejun Heo
2006-12-21 17:10 ` Gary Hade
2007-01-30 1:55 ` Gary Hade
2007-01-30 7:32 ` Tejun Heo
2007-01-30 23:37 ` Gary Hade
2007-01-31 0:54 ` Jeff Garzik
2007-01-31 11:00 ` Tejun Heo
2007-01-31 12:20 ` Alan
2007-01-31 13:16 ` Tejun Heo
2007-01-31 15:24 ` Jeff Garzik
2007-01-31 15:30 ` Mark Lord
2007-01-31 10:44 ` Tejun Heo
2007-01-31 10:47 ` Jeff Garzik
2007-01-31 11:00 ` Tejun Heo
2007-02-01 0:49 ` Gary Hade
2007-02-17 0:34 ` Gary Hade
2007-02-21 12:40 ` Tejun Heo
2007-02-22 0:41 ` Gary Hade
2007-02-23 0:32 ` Gary Hade
2007-01-23 21:49 ` danieljzhang
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).