* PROBLEM: Silicon Image 3112 Lockups
From: Jeremy Smith @ 2005-09-07 1:07 UTC
To: jgarzik
Cc: linux-ide

I'm working on a system (K8N-DL) with a 3114 driver running 2.6.12 and
experiencing lockups on heavy disk access. This is the built-in on an
ASUS-K8N-DL board. I was wondering if the following problem is
symptomatic of a known bug.

I will get momentary freezes, and then I'll continue. At some point
though, the disk access will freeze--no panic, but a lockup of all disk
access. I have configured the first drive as a single drive concatenation
in the "RAID" bootup and haven't done anything with the second drive. The
large logical partition on each drive is configured with software RAID.

Here is one instance from the system log. A hard reboot was required.

Sep 6 17:49:12 localhost kernel: Bootdata ok (command line is
root=/dev/ram0 mem=3000M init=/linuxrc real_root=/dev/sda2 vga=0x317)
Sep 6 17:49:12 localhost kernel: Memory: 3015772k/3072000k available
(3153k kernel code, 0k reserved, 1323k data, 224k init)
Sep 6 17:49:14 localhost kernel: ata1: SATA max UDMA/100 cmd
0xFFFFC2000091E080 ctl 0xFFFFC2000091E08A bmdma 0xFFFFC2000091E000 irq 3
Sep 6 17:49:14 localhost kernel: ata2: SATA max UDMA/100 cmd
0xFFFFC2000091E0C0 ctl 0xFFFFC2000091E0CA bmdma 0xFFFFC2000091E008 irq 3
Sep 6 17:49:14 localhost kernel: ata3: SATA max UDMA/100 cmd
0xFFFFC2000091E280 ctl 0xFFFFC2000091E28A bmdma 0xFFFFC2000091E200 irq 3
Sep 6 17:49:14 localhost kernel: ata4: SATA max UDMA/100 cmd
0xFFFFC2000091E2C0 ctl 0xFFFFC2000091E2CA bmdma 0xFFFFC2000091E208 irq 3
Sep 6 17:49:14 localhost kernel: ata1: dev 0 ATA, max UDMA/133, 488397168
sectors: lba48
Sep 6 17:49:14 localhost kernel: ata1: dev 0 configured for UDMA/100
Sep 6 17:49:14 localhost kernel: scsi0 : sata_sil
Sep 6 17:49:14 localhost kernel: ata2: dev 0 ATA, max UDMA/133, 488397168
sectors: lba48
Sep 6 17:49:14 localhost kernel: ata2: dev 0 configured for UDMA/100
Sep 6 17:49:14 localhost kernel: scsi1 : sata_sil
Sep 6 17:49:14 localhost kernel: ata3: no device found (phy stat
00000000)
Sep 6 17:49:14 localhost kernel: scsi2 : sata_sil
Sep 6 17:49:14 localhost kernel: ata4: no device found (phy stat
00000000)
Sep 6 17:49:14 localhost kernel: scsi3 : sata_sil
Sep 6 17:49:15 localhost kernel: EXT3-fs: mounted filesystem with ordered
data mode.
Sep 6 17:49:15 localhost kernel: EXT3-fs: mounted filesystem with ordered
data mode.
Sep 6 18:09:44 localhost kernel: ata1: status=0x51 { DriveReady
SeekComplete Error }
Sep 6 18:09:44 localhost kernel: ata1: error=0x04 { DriveStatusError }
Sep 6 18:09:51 localhost kernel: ata2: status=0x51 { DriveReady
SeekComplete Error }
Sep 6 18:09:51 localhost kernel: ata2: error=0x04 { DriveStatusError }
Sep 6 18:10:01 localhost kernel: ata2: status=0x51 { DriveReady
SeekComplete Error }
Sep 6 18:10:01 localhost kernel: ata2: error=0x04 { DriveStatusError }
Sep 6 18:10:09 localhost kernel: ata2: status=0x51 { DriveReady
SeekComplete Error }
Sep 6 18:10:09 localhost kernel: ata2: error=0x04 { DriveStatusError }
Sep 6 18:10:16 localhost kernel: ata2: status=0x51 { DriveReady
SeekComplete Error }
Sep 6 18:10:16 localhost kernel: ata2: error=0x04 { DriveStatusError }
Sep 6 18:10:17 localhost kernel: ata2: status=0x51 { DriveReady
SeekComplete Error }
Sep 6 18:10:17 localhost kernel: ata2: error=0x04 { DriveStatusError }
Sep 6 18:10:20 localhost kernel: ata2: status=0x51 { DriveReady
SeekComplete Error }
Sep 6 18:10:20 localhost kernel: ata2: error=0x04 { DriveStatusError }
Sep 6 18:10:22 localhost kernel: ata1: status=0x51 { DriveReady
SeekComplete Error }
Sep 6 18:10:22 localhost kernel: ata1: error=0x04 { DriveStatusError }
Sep 6 18:10:23 localhost kernel: ata2: status=0x51 { DriveReady
SeekComplete Error }
Sep 6 18:10:23 localhost kernel: ata2: error=0x04 { DriveStatusError }
Sep 6 18:10:23 localhost kernel: ata1: status=0x51 { DriveReady
SeekComplete Error }
Sep 6 18:10:23 localhost kernel: ata1: error=0x04 { DriveStatusError }
Sep 6 18:10:24 localhost kernel: ata1: status=0x51 { DriveReady
SeekComplete Error }
Sep 6 18:10:24 localhost kernel: ata1: error=0x04 { DriveStatusError }
Sep 6 18:10:24 localhost kernel: ata2: status=0x51 { DriveReady
SeekComplete Error }
Sep 6 18:10:24 localhost kernel: ata2: error=0x04 { DriveStatusError }
Sep 6 18:10:25 localhost kernel: ata2: status=0x51 { DriveReady
SeekComplete Error }
Sep 6 18:10:25 localhost kernel: ata2: error=0x04 { DriveStatusError }
Sep 6 18:10:25 localhost kernel: ata1: status=0x51 { DriveReady
SeekComplete Error }
Sep 6 18:10:25 localhost kernel: ata1: error=0x04 { DriveStatusError }
Sep 6 18:10:26 localhost kernel: ata2: status=0x51 { DriveReady
SeekComplete Error }
Sep 6 18:10:26 localhost kernel: ata2: error=0x04 { DriveStatusError }
Sep 6 18:10:26 localhost kernel: ata2: status=0x51 { DriveReady
SeekComplete Error }
Sep 6 18:10:26 localhost kernel: ata2: error=0x04 { DriveStatusError }
Sep 6 18:10:27 localhost kernel: ata2: status=0x51 { DriveReady
SeekComplete Error }
Sep 6 18:10:27 localhost kernel: ata2: error=0x04 { DriveStatusError }
Sep 6 18:10:43 localhost kernel: ata1: status=0x51 { DriveReady
SeekComplete Error }
Sep 6 18:10:43 localhost kernel: ata1: error=0x04 { DriveStatusError }
Sep 6 18:10:50 localhost kernel: ata1: status=0x51 { DriveReady
SeekComplete Error }
Sep 6 18:10:50 localhost kernel: ata1: error=0x04 { DriveStatusError }
Sep 6 18:10:50 localhost kernel: ata1: status=0x51 { DriveReady
SeekComplete Error }
Sep 6 18:10:50 localhost kernel: ata1: error=0x04 { DriveStatusError }
Sep 6 18:10:53 localhost kernel: ata2: status=0x51 { DriveReady
SeekComplete Error }
Sep 6 18:10:53 localhost kernel: ata2: error=0x04 { DriveStatusError }
Sep 6 18:10:55 localhost kernel: ata2: status=0x51 { DriveReady
SeekComplete Error }
Sep 6 18:10:55 localhost kernel: ata2: error=0x04 { DriveStatusError }
Sep 6 18:11:12 localhost kernel: ata2: status=0x51 { DriveReady
SeekComplete Error }
Sep 6 18:11:12 localhost kernel: ata2: error=0x04 { DriveStatusError }
Sep 6 19:27:11 localhost kernel: Bootdata ok (command line is
root=/dev/ram0 mem=3000M init=/linuxrc real_root=/dev/sda2 vga=0x317)
Linux localhost 2.6.12-gentoo-r9 #1 SMP Sat Sep 3 02:05:00 MDT 2005
x86_64 AMD Opteron(tm) Processor 244 AuthenticAMD GNU/Linux
Gnu C 3.4.4
Gnu make 3.80
binutils 2.15.92.0.2
util-linux 2.12i
mount 2.12i
module-init-tools 3.0
e2fsprogs 1.38
reiserfsprogs line
reiser4progs line
Linux C Library 2.3.5
Dynamic linker (ldd) 2.3.5
Procps 3.2.5
Net-tools 1.60
Kbd 1.12
Sh-utils 5.2.1
udev 068
Modules Loaded nvidia vmnet parport_pc parport vmmon snd_ca0106
snd_ac97_codec snd_pcm snd_timer snd snd_page_alloc tg3 ata_piix sata_sil
libata sbp2 ohci1394 ieee1394 ohci_hcd uhci_hcd usb_storage usbhid ehci_hcd
dd if=/dev/sda of=/dev/null can reproduce the error, as can several
disk-intensive activities
Thanks,
Jer
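
The dd reproduction mentioned above can be approximated by a short script that reads a device front to back and reports how far it got before the first I/O error. This is only a sketch: the function name `sequential_read` and its parameters are hypothetical, it assumes read permission on the target, and a controller that locks up (as described in this report) would make the read block rather than raise an error.

```python
import os


def sequential_read(path, block_size=1024 * 1024, limit=None):
    """Read `path` sequentially and discard the data, like dd to /dev/null.

    Returns (bytes_read, error). `error` is None if the read reached
    end-of-file (or at least `limit` bytes), otherwise the OSError
    that interrupted it.
    """
    total = 0
    fd = os.open(path, os.O_RDONLY)
    try:
        while limit is None or total < limit:
            try:
                chunk = os.read(fd, block_size)
            except OSError as exc:
                # The kernel gave up on the request (typically EIO).
                return total, exc
            if not chunk:
                break  # end of file or device
            total += len(chunk)
    finally:
        os.close(fd)
    return total, None
```

Calling `sequential_read("/dev/sda")` would mirror the dd command; the byte count returned alongside an error gives a rough offset to compare with the timestamps in the kernel log.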

* Re: PROBLEM: Silicon Image 3112 Lockups
From: Tejun Heo @ 2005-09-07 2:01 UTC
To: Jeremy Smith, Alexander Shaposhnikov, Carlos Pardo, Paul Taylor
Cc: jgarzik, linux-ide

Hello, Jeremy.

Jeremy Smith wrote:
> I'm working on a system (K8N-DL) with a 3114 driver running 2.6.12 and
> experiencing lockups on heavy disk access. This is the built-in on an
> ASUS-K8N-DL board. I was wondering if the following problem is
> symptomatic of a known bug.
>
> I will get momentary freezes, and then I'll continue. At some point
> though, the disk access will freeze--no panic, but a lockup of all disk
> access. I have configured the first drive as a single drive
> concatenation in the "RAID" bootup and haven't done anything with the
> second drive. The large logical partition on each drive is configured
> with software RAID.

You're the second person reporting a similar problem with the ASUS K8N-DL
board. Please see the following threads.

http://marc.theaimsgroup.com/?l=linux-ide&m=112497821103098&w=2
http://marc.theaimsgroup.com/?l=linux-ide&m=112600646820285&w=2

> dd if=/dev/sda of=/dev/null can reproduce the error, as can several
> disk-intensive activities

In the following mail, I've attached a patch which might alleviate
errors during writes (as Alexander was reporting CRC errors with write
commands), but it won't do any good if you're getting errors during
reading.

http://marc.theaimsgroup.com/?l=linux-ide&m=112602112819183&w=2

Carlos and Paul, do you guys know anything about this mainboard? Should
we perform some special tweaking to get these boards to work? I'll dig
into the 3112/3114 documentation further, but I'm not very sure what I
should look for.

Thanks.

--
tejun

* Re: PROBLEM: Silicon Image 3112 Lockups
From: Jeff Garzik @ 2005-09-07 2:13 UTC
To: Tejun Heo
Cc: Jeremy Smith, Alexander Shaposhnikov, Carlos Pardo, Paul Taylor, linux-ide

Tejun Heo wrote:
> In the following mail, I've attached a patch which might alleviate
> errors during writes (as Alexander was reporting CRC errors with write
> commands), but it won't do any good if you're getting errors during
> reading.
>
> http://marc.theaimsgroup.com/?l=linux-ide&m=112602112819183&w=2

Note that I would put BIG CAPITAL LETTER WARNINGS on that patch, since
it messes with the voltage.

	Jeff

* Re: PROBLEM: Silicon Image 3112 Lockups
From: Tejun Heo @ 2005-09-07 2:34 UTC
To: Jeff Garzik
Cc: Jeremy Smith, Alexander Shaposhnikov, Carlos Pardo, Paul Taylor, linux-ide

Jeff Garzik wrote:
> Tejun Heo wrote:
>
>> In the following mail, I've attached a patch which might alleviate
>> errors during writes (as Alexander was reporting CRC errors with write
>> commands), but it won't do any good if you're getting errors during
>> reading.
>>
>> http://marc.theaimsgroup.com/?l=linux-ide&m=112602112819183&w=2
>
> Note that I would put BIG CAPITAL LETTER WARNINGS on that patch, since
> it messes with the voltage.
>

Alexander & Jeremy.

It's as Jeff said.

TRY THE PATCH AT YOUR OWN RISK. IT MIGHT FRY THE PHY OF YOUR DRIVE.
(enough capitals?)

Even if you're brave enough to try, DO NOT GO OVER 600mV. 600mV is at
least inside specified limits. Also, it won't change anything regarding
read errors. All it does is increase the voltage swing while
transmitting data (writes).

Thanks.

--
tejun

* Re: PROBLEM: Silicon Image 3112 Lockups
From: Tejun Heo @ 2005-09-07 2:42 UTC
To: Tejun Heo
Cc: Jeff Garzik, Jeremy Smith, Alexander Shaposhnikov, Carlos Pardo, Paul Taylor, linux-ide

Tejun Heo wrote:
> Jeff Garzik wrote:
>
>> Tejun Heo wrote:
>>
>>> In the following mail, I've attached a patch which might alleviate
>>> errors during writes (as Alexander was reporting CRC errors with
>>> write commands), but it won't do any good if you're getting errors
>>> during reading.
>>>
>>> http://marc.theaimsgroup.com/?l=linux-ide&m=112602112819183&w=2
>>
>> Note that I would put BIG CAPITAL LETTER WARNINGS on that patch, since
>> it messes with the voltage.
>>
>
> Alexander & Jeremy.
>
> It's as Jeff said.
>
> TRY THE PATCH AT YOUR OWN RISK. IT MIGHT FRY THE PHY OF YOUR DRIVE.
> (enough capitals?)
>
> Even if you're brave enough to try, DO NOT GO OVER 600mV. 600mV is at
> least inside specified limits. Also, it won't change anything regarding
> read errors. All it does is increase the voltage swing while
> transmitting data (writes).

Oh.. it might affect writes if errors are occurring due to CRC errors
during command transmission. If you're getting ABRT errors instead of
ICRC's, it might indicate that commands are being mistransferred (again,
I'm not sure at all).

--
tejun

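
The ABRT versus ICRC distinction refers to bits in the ATA error register. As a rough illustration, the status=0x51 / error=0x04 pairs in the log at the top of the thread can be decoded with the conventional ATA task-file bit layout. This is a sketch and not libata's own decoding: the names `STATUS_BITS`, `ERROR_BITS` and `decode_bits` are made up for this example, and the bit names are the classic register definitions rather than the strings the kernel prints.

```python
# Classic ATA status register bits (highest bit first).
STATUS_BITS = {
    0x80: "BSY",   # device busy
    0x40: "DRDY",  # device ready
    0x20: "DF",    # device fault
    0x10: "DSC",   # seek complete
    0x08: "DRQ",   # data request
    0x04: "CORR",  # corrected data (obsolete)
    0x02: "IDX",   # index (obsolete)
    0x01: "ERR",   # error; details are in the error register
}

# Classic ATA error register bits.
ERROR_BITS = {
    0x80: "ICRC",   # interface CRC error (BBK in older specs)
    0x40: "UNC",    # uncorrectable data error
    0x20: "MC",     # media changed
    0x10: "IDNF",   # ID not found
    0x08: "MCR",    # media change request
    0x04: "ABRT",   # command aborted
    0x02: "TK0NF",  # track 0 not found
    0x01: "AMNF",   # address mark not found
}


def decode_bits(value, table):
    """Return the names of all bits set in `value`, highest bit first."""
    return [name for bit, name in sorted(table.items(), reverse=True)
            if value & bit]


if __name__ == "__main__":
    print("status 0x51:", decode_bits(0x51, STATUS_BITS))
    print("error  0x04:", decode_bits(0x04, ERROR_BITS))
```

Under these tables, 0x51 decodes to DRDY, DSC and ERR, and 0x04 to ABRT alone, so the logged errors are aborts rather than interface CRC errors.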
* Re: PROBLEM: Silicon Image 3112 Lockups
From: Jeremy Smith @ 2005-09-07 3:21 UTC
To: Tejun Heo
Cc: Jeff Garzik, Alexander Shaposhnikov, Carlos Pardo, Paul Taylor, linux-ide

On Wed, 7 Sep 2005, Tejun Heo wrote:

> Tejun Heo wrote:
>> Jeff Garzik wrote:
>>
>>> Tejun Heo wrote:
>>>
>>>> In the following mail, I've attached a patch which might alleviate
>>>> errors during writes (as Alexander was reporting CRC errors with write
>>>> commands), but it won't do any good if you're getting errors during
>>>> reading.
>>>>
>>>> http://marc.theaimsgroup.com/?l=linux-ide&m=112602112819183&w=2
>>>
>>> Note that I would put BIG CAPITAL LETTER WARNINGS on that patch, since it
>>> messes with the voltage.
>>>
>>
>> Alexander & Jeremy.
>>
>> It's as Jeff said.
>>
>> TRY THE PATCH AT YOUR OWN RISK. IT MIGHT FRY THE PHY OF YOUR DRIVE.
>> (enough capitals?)
>>
>> Even if you're brave enough to try, DO NOT GO OVER 600mV. 600mV is at
>> least inside specified limits. Also, it won't change anything regarding
>> read errors. All it does is increase the voltage swing while
>> transmitting data (writes).
>
> Oh.. it might affect writes if errors are occurring due to CRC errors
> during command transmission. If you're getting ABRT errors instead of
> ICRC's, it might indicate that commands are being mistransferred
> (again, I'm not sure at all).
>
> --
> tejun
>

Did you mean reads here? Because I think it's happening on reads as
well--it happens on an "e2fsck -b -n" on the drive when I'm booted off a
CDROM. I'm willing to try it out if it could help, but if it's unlikely
to...

I don't have any idea how these drivers work, but the ASUS K8N-DL also
has the nvidia SATA controller in it--which doesn't appear to work at
all, so I started by hooking up the drives to the SI controller. Can
the mere presence of this additional controller make a difference?

For what it's worth, I don't _think_ I was seeing similar lockups until
I updated the firmware on this board to the latest version (1004 from
1003), but that could be a red herring because I also wasn't paying
attention to syslog.

I've tried changes to cabling...both drives experience the exact same
symptoms for me; it certainly could be hardware related, but it would be
on the board, for which I don't have a spare.

Is there any additional information I can provide?

Jer

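
The observation that both drives show the same symptoms can be checked against a syslog excerpt like the one at the top of the thread by tallying error lines per port. The following is a minimal sketch that assumes the log line format shown in the original report; the regular expression and the function name `tally_errors` are illustrative, not part of any existing tool.

```python
import re
from collections import Counter

# Matches lines such as:
#   Sep 6 18:09:44 localhost kernel: ata1: error=0x04 { DriveStatusError }
ERROR_LINE = re.compile(r"kernel: (ata\d+): error=0x([0-9a-fA-F]{2})")


def tally_errors(log_text):
    """Count error-register reports per (port, error code) pair."""
    counts = Counter()
    for match in ERROR_LINE.finditer(log_text):
        port = match.group(1)
        code = int(match.group(2), 16)
        counts[(port, code)] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Usage: python tally.py < /var/log/messages
    for (port, code), n in sorted(tally_errors(sys.stdin.read()).items()):
        print(f"{port}: error=0x{code:02x} x{n}")
```

If every port on the controller reports the same error code, that points toward the controller or board rather than a single drive or cable, which matches the direction the discussion takes.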
* Re: PROBLEM: Silicon Image 3112 Lockups
From: Tejun Heo @ 2005-09-07 6:00 UTC
To: Jeremy Smith
Cc: Jeff Garzik, Alexander Shaposhnikov, Carlos Pardo, Paul Taylor, linux-ide

Jeremy Smith wrote:
>
> On Wed, 7 Sep 2005, Tejun Heo wrote:
>
>> Tejun Heo wrote:
>>
>>> Jeff Garzik wrote:
>>>
>>>> Tejun Heo wrote:
>>>>
>>>>> In the following mail, I've attached a patch which might alleviate
>>>>> errors during writes (as Alexander was reporting CRC errors with
>>>>> write commands), but it won't do any good if you're getting errors
>>>>> during reading.
>>>>>
>>>>> http://marc.theaimsgroup.com/?l=linux-ide&m=112602112819183&w=2
>>>>
>>>> Note that I would put BIG CAPITAL LETTER WARNINGS on that patch,
>>>> since it messes with the voltage.
>>>>
>>>
>>> Alexander & Jeremy.
>>>
>>> It's as Jeff said.
>>>
>>> TRY THE PATCH AT YOUR OWN RISK. IT MIGHT FRY THE PHY OF YOUR DRIVE.
>>> (enough capitals?)
>>>
>>> Even if you're brave enough to try, DO NOT GO OVER 600mV. 600mV is
>>> at least inside specified limits. Also, it won't change anything
>>> regarding read errors. All it does is increase the voltage swing
>>> while transmitting data (writes).
>>
>> Oh.. it might affect writes if errors are occurring due to CRC errors
>> during command transmission. If you're getting ABRT errors instead of
>> ICRC's, it might indicate that commands are being mistransferred
>> (again, I'm not sure at all).
>>
>> --
>> tejun
>>
>
> Did you mean reads here? Because I think it's happening on reads as
> well--it happens on an "e2fsck -b -n" on the drive when I'm booted off a
> CDROM. I'm willing to try it out if it could help, but if it's unlikely
> to...

Yes, I meant reads. It would be great if somebody tries the patch out.
Maybe you and Alexander can coordinate so that only one of you takes the
risk. ;-p If I had access to a K8N-DL, I would have tested it myself,
but sadly I don't. I did test with my discrete sii3112 card and a
Samsung HD160JJ drive at 600mV and had no problem, but this doesn't
guarantee anything for you guys. I think it would be nice if Alexander
or you could test it, but I have to warn you again. YOU MAY FRY YOUR
HARDWARE WITH THIS.

> I don't have any idea how these drivers work, but the ASUS K8N-DL also
> has the nvidia SATA controller in it--which doesn't appear to work at
> all, so I started by hooking up the drives to the SI controller. Can
> the mere presence of this additional controller make a difference?

I doubt that would have anything to do with this.

> For what it's worth, I don't _think_ I was seeing similar lockups until
> I updated the firmware on this board to the latest version (1004 from
> 1003), but that could be a red herring because I also wasn't paying
> attention to syslog.

I don't know. If some specific configurations are required for the
controller, they are usually done by BIOS (either mainboard BIOS or
per-controller BIOS), so a BIOS update could affect the problem. But
these are still just wild speculations. Maybe we should contact ASUS
about this?

> I've tried changes to cabling...both drives experience the exact same
> symptoms for me; it certainly could be hardware related, but it would be
> on the board, for which I don't have a spare.
>
> Is there any additional information I can provide?

Well, I think two identical reports for a not-so-widespread mainboard
point away from cabling problems. And I cannot think of any more info
which could be helpful yet. I'll let you know if something comes up.

Thanks & good luck.

--
tejun

* Re: PROBLEM: Silicon Image 3112 Lockups
From: Jeff Garzik @ 2005-09-07 6:09 UTC
To: Tejun Heo
Cc: Jeremy Smith, Alexander Shaposhnikov, Carlos Pardo, Paul Taylor, linux-ide

Tejun Heo wrote:
> I don't know. If some specific configurations are required for the
> controller, they are usually done by BIOS (either mainboard BIOS or
> per-controller BIOS), so BIOS update could affect the problem. But
> these are still just wild speculations. Maybe we should contact ASUS
> about this?

Note that, in the past, system BIOS updates have cured sata_sil data
corruption bug reports. Updating BIOS is always a good idea.

	Jeff