* PROBLEM: Silicon Image 3112 Lockups
@ 2005-09-07 1:07 Jeremy Smith
2005-09-07 2:01 ` Tejun Heo
0 siblings, 1 reply; 8+ messages in thread
From: Jeremy Smith @ 2005-09-07 1:07 UTC
To: jgarzik; +Cc: linux-ide
I'm working on a system (ASUS K8N-DL) that uses the board's built-in
Silicon Image 3114 controller with the sata_sil driver on 2.6.12, and I'm
experiencing lockups under heavy disk access. I was wondering if the
following problem is symptomatic of a known bug.
I get momentary freezes, after which the system continues. At some point,
though, disk access freezes: no panic, but a lockup of all disk
access. I have configured the first drive as a single-drive concatenation
in the "RAID" bootup and haven't done anything with the second drive. The
large logical partition on each drive is configured with software RAID.
Here is one instance from the system log. A hard reboot was required.
Sep 6 17:49:12 localhost kernel: Bootdata ok (command line is
root=/dev/ram0 mem=3000M init=/linuxrc real_root=/dev/sda2 vga=0x317)
Sep 6 17:49:12 localhost kernel: Memory: 3015772k/3072000k available
(3153k kernel code, 0k reserved, 1323k data, 224k init)
Sep 6 17:49:14 localhost kernel: ata1: SATA max UDMA/100 cmd
0xFFFFC2000091E080 ctl 0xFFFFC2000091E08A bmdma 0xFFFFC2000091E000 irq 3
Sep 6 17:49:14 localhost kernel: ata2: SATA max UDMA/100 cmd
0xFFFFC2000091E0C0 ctl 0xFFFFC2000091E0CA bmdma 0xFFFFC2000091E008 irq 3
Sep 6 17:49:14 localhost kernel: ata3: SATA max UDMA/100 cmd
0xFFFFC2000091E280 ctl 0xFFFFC2000091E28A bmdma 0xFFFFC2000091E200 irq 3
Sep 6 17:49:14 localhost kernel: ata4: SATA max UDMA/100 cmd
0xFFFFC2000091E2C0 ctl 0xFFFFC2000091E2CA bmdma 0xFFFFC2000091E208 irq 3
Sep 6 17:49:14 localhost kernel: ata1: dev 0 ATA, max UDMA/133, 488397168
sectors: lba48
Sep 6 17:49:14 localhost kernel: ata1: dev 0 configured for UDMA/100
Sep 6 17:49:14 localhost kernel: scsi0 : sata_sil
Sep 6 17:49:14 localhost kernel: ata2: dev 0 ATA, max UDMA/133, 488397168
sectors: lba48
Sep 6 17:49:14 localhost kernel: ata2: dev 0 configured for UDMA/100
Sep 6 17:49:14 localhost kernel: scsi1 : sata_sil
Sep 6 17:49:14 localhost kernel: ata3: no device found (phy stat
00000000)
Sep 6 17:49:14 localhost kernel: scsi2 : sata_sil
Sep 6 17:49:14 localhost kernel: ata4: no device found (phy stat
00000000)
Sep 6 17:49:14 localhost kernel: scsi3 : sata_sil
Sep 6 17:49:15 localhost kernel: EXT3-fs: mounted filesystem with ordered
data mode.
Sep 6 17:49:15 localhost kernel: EXT3-fs: mounted filesystem with ordered
data mode.
Sep 6 18:09:44 localhost kernel: ata1: status=0x51 { DriveReady
SeekComplete Error }
Sep 6 18:09:44 localhost kernel: ata1: error=0x04 { DriveStatusError }
Sep 6 18:09:51 localhost kernel: ata2: status=0x51 { DriveReady
SeekComplete Error }
Sep 6 18:09:51 localhost kernel: ata2: error=0x04 { DriveStatusError }
Sep 6 18:10:01 localhost kernel: ata2: status=0x51 { DriveReady
SeekComplete Error }
Sep 6 18:10:01 localhost kernel: ata2: error=0x04 { DriveStatusError }
Sep 6 18:10:09 localhost kernel: ata2: status=0x51 { DriveReady
SeekComplete Error }
Sep 6 18:10:09 localhost kernel: ata2: error=0x04 { DriveStatusError }
Sep 6 18:10:16 localhost kernel: ata2: status=0x51 { DriveReady
SeekComplete Error }
Sep 6 18:10:16 localhost kernel: ata2: error=0x04 { DriveStatusError }
Sep 6 18:10:17 localhost kernel: ata2: status=0x51 { DriveReady
SeekComplete Error }
Sep 6 18:10:17 localhost kernel: ata2: error=0x04 { DriveStatusError }
Sep 6 18:10:20 localhost kernel: ata2: status=0x51 { DriveReady
SeekComplete Error }
Sep 6 18:10:20 localhost kernel: ata2: error=0x04 { DriveStatusError }
Sep 6 18:10:22 localhost kernel: ata1: status=0x51 { DriveReady
SeekComplete Error }
Sep 6 18:10:22 localhost kernel: ata1: error=0x04 { DriveStatusError }
Sep 6 18:10:23 localhost kernel: ata2: status=0x51 { DriveReady
SeekComplete Error }
Sep 6 18:10:23 localhost kernel: ata2: error=0x04 { DriveStatusError }
Sep 6 18:10:23 localhost kernel: ata1: status=0x51 { DriveReady
SeekComplete Error }
Sep 6 18:10:23 localhost kernel: ata1: error=0x04 { DriveStatusError }
Sep 6 18:10:24 localhost kernel: ata1: status=0x51 { DriveReady
SeekComplete Error }
Sep 6 18:10:24 localhost kernel: ata1: error=0x04 { DriveStatusError }
Sep 6 18:10:24 localhost kernel: ata2: status=0x51 { DriveReady
SeekComplete Error }
Sep 6 18:10:24 localhost kernel: ata2: error=0x04 { DriveStatusError }
Sep 6 18:10:25 localhost kernel: ata2: status=0x51 { DriveReady
SeekComplete Error }
Sep 6 18:10:25 localhost kernel: ata2: error=0x04 { DriveStatusError }
Sep 6 18:10:25 localhost kernel: ata1: status=0x51 { DriveReady
SeekComplete Error }
Sep 6 18:10:25 localhost kernel: ata1: error=0x04 { DriveStatusError }
Sep 6 18:10:26 localhost kernel: ata2: status=0x51 { DriveReady
SeekComplete Error }
Sep 6 18:10:26 localhost kernel: ata2: error=0x04 { DriveStatusError }
Sep 6 18:10:26 localhost kernel: ata2: status=0x51 { DriveReady
SeekComplete Error }
Sep 6 18:10:26 localhost kernel: ata2: error=0x04 { DriveStatusError }
Sep 6 18:10:27 localhost kernel: ata2: status=0x51 { DriveReady
SeekComplete Error }
Sep 6 18:10:27 localhost kernel: ata2: error=0x04 { DriveStatusError }
Sep 6 18:10:43 localhost kernel: ata1: status=0x51 { DriveReady
SeekComplete Error }
Sep 6 18:10:43 localhost kernel: ata1: error=0x04 { DriveStatusError }
Sep 6 18:10:50 localhost kernel: ata1: status=0x51 { DriveReady
SeekComplete Error }
Sep 6 18:10:50 localhost kernel: ata1: error=0x04 { DriveStatusError }
Sep 6 18:10:50 localhost kernel: ata1: status=0x51 { DriveReady
SeekComplete Error }
Sep 6 18:10:50 localhost kernel: ata1: error=0x04 { DriveStatusError }
Sep 6 18:10:53 localhost kernel: ata2: status=0x51 { DriveReady
SeekComplete Error }
Sep 6 18:10:53 localhost kernel: ata2: error=0x04 { DriveStatusError }
Sep 6 18:10:55 localhost kernel: ata2: status=0x51 { DriveReady
SeekComplete Error }
Sep 6 18:10:55 localhost kernel: ata2: error=0x04 { DriveStatusError }
Sep 6 18:11:12 localhost kernel: ata2: status=0x51 { DriveReady
SeekComplete Error }
Sep 6 18:11:12 localhost kernel: ata2: error=0x04 { DriveStatusError }
Sep 6 19:27:11 localhost kernel: Bootdata ok (command line is
root=/dev/ram0 mem=3000M init=/linuxrc real_root=/dev/sda2 vga=0x317)
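For anyone triaging a log like the one above, a minimal Python sketch (not part of the original report) that tallies the status-error lines per port. The regular expression and the function name `tally_status_errors` are assumptions made for illustration; the sample lines are copied from the log above, including its mail-wrapped continuation line.

```python
import re
from collections import Counter

# Matches libata lines such as "kernel: ata2: status=0x51 { DriveReady".
STATUS_RE = re.compile(r"kernel: (ata\d+): status=(0x[0-9a-fA-F]+)")


def tally_status_errors(lines):
    """Count status-error reports per ATA port in syslog-style lines."""
    counts = Counter()
    for line in lines:
        match = STATUS_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


sample = [
    "Sep 6 18:09:44 localhost kernel: ata1: status=0x51 { DriveReady",
    "SeekComplete Error }",
    "Sep 6 18:09:44 localhost kernel: ata1: error=0x04 { DriveStatusError }",
    "Sep 6 18:09:51 localhost kernel: ata2: status=0x51 { DriveReady",
]
print(dict(tally_status_errors(sample)))
```

Run against the full syslog, a tally like this makes it easy to see whether one port or both are affected, which matters when trying to rule out a single bad cable or drive.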
Linux localhost 2.6.12-gentoo-r9 #1 SMP Sat Sep 3 02:05:00 MDT 2005
x86_64 AMD Opteron(tm) Processor 244 AuthenticAMD GNU/Linux
Gnu C 3.4.4
Gnu make 3.80
binutils 2.15.92.0.2
util-linux 2.12i
mount 2.12i
module-init-tools 3.0
e2fsprogs 1.38
reiserfsprogs line
reiser4progs line
Linux C Library 2.3.5
Dynamic linker (ldd) 2.3.5
Procps 3.2.5
Net-tools 1.60
Kbd 1.12
Sh-utils 5.2.1
udev 068
Modules Loaded nvidia vmnet parport_pc parport vmmon snd_ca0106
snd_ac97_codec snd_pcm snd_timer snd snd_page_alloc tg3 ata_piix sata_sil
libata sbp2 ohci1394 ieee1394 ohci_hcd uhci_hcd usb_storage usbhid ehci_hcd
dd if=/dev/sda of=/dev/null can reproduce the error, as can several
other disk-intensive activities.
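The dd read test above can also be approximated in Python. This is only a sketch under stated assumptions: the function name `read_through` and the 1 MiB block size are hypothetical choices, and reading a raw device such as /dev/sda normally requires root.

```python
def read_through(path, block_size=1024 * 1024):
    """Read `path` sequentially to the end, discarding the data.

    Returns the number of bytes read. An I/O error reported by the
    kernel surfaces as OSError from read().
    """
    total = 0
    # buffering=0 gives an unbuffered raw file object, so each read()
    # call maps to one read request of at most block_size bytes.
    with open(path, "rb", buffering=0) as dev:
        while True:
            chunk = dev.read(block_size)
            if not chunk:
                break
            total += len(chunk)
    return total
```

Calling `read_through("/dev/sda")` while watching syslog gives a sustained sequential read load comparable to the dd command. On the affected board it may hang rather than raise an error.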
Thanks,
Jer
* Re: PROBLEM: Silicon Image 3112 Lockups
2005-09-07 1:07 PROBLEM: Silicon Image 3112 Lockups Jeremy Smith
@ 2005-09-07 2:01 ` Tejun Heo
2005-09-07 2:13 ` Jeff Garzik
0 siblings, 1 reply; 8+ messages in thread
From: Tejun Heo @ 2005-09-07 2:01 UTC
To: Jeremy Smith, Alexander Shaposhnikov, Carlos Pardo, Paul Taylor
Cc: jgarzik, linux-ide
Hello, Jeremy.
Jeremy Smith wrote:
>
>
> I'm working on a system (K8N-DL) with a 3114 driver running 2.6.12 and
> experiencing lockups on heavy disk access. This is the built-in on an
> ASUS-K8N-DL board. I was wondering if the following problem is
> symptomatic of a known bug.
>
> I will get momentary freezes, and then I'll continue. At some point
> though, the disk access will freeze--no panic, but a lockup of all disk
> access. I have configured the first drive as a single drive
> concatenation in the "RAID" bootup and haven't done anything with the
> second drive. The large logical partition on each drive is configured
> with software RAID.
You're the second person to report a similar problem with the ASUS K8N-DL
board. Please see the following threads.
http://marc.theaimsgroup.com/?l=linux-ide&m=112497821103098&w=2
http://marc.theaimsgroup.com/?l=linux-ide&m=112600646820285&w=2
> dd if=/dev/sda of=/dev/null can reproduce error, as can several
> disk-intensive activities
In the following mail, I've attached a patch which might alleviate
errors during writes (as Alexander was reporting CRC errors with write
commands), but it won't do any good if you're getting errors during reading.
http://marc.theaimsgroup.com/?l=linux-ide&m=112602112819183&w=2
Carlos and Paul, do you guys know anything about this mainboard?
Should we perform some special tweaking to get these boards to work? I'll
dig through the 3112/3114 documentation further, but I'm not sure what I
should look for.
Thanks.
--
tejun
* Re: PROBLEM: Silicon Image 3112 Lockups
2005-09-07 2:01 ` Tejun Heo
@ 2005-09-07 2:13 ` Jeff Garzik
2005-09-07 2:34 ` Tejun Heo
0 siblings, 1 reply; 8+ messages in thread
From: Jeff Garzik @ 2005-09-07 2:13 UTC
To: Tejun Heo
Cc: Jeremy Smith, Alexander Shaposhnikov, Carlos Pardo, Paul Taylor,
linux-ide
Tejun Heo wrote:
> In the following mail, I've attached a patch which might alleviate
> errors during writes (as Alexander was reporting CRC errors with write
> commands), but it won't do any good if you're getting errors during
> reading.
>
> http://marc.theaimsgroup.com/?l=linux-ide&m=112602112819183&w=2
Note that I would put BIG CAPITAL LETTER WARNINGS on that patch, since
it messes with the voltage.
Jeff
* Re: PROBLEM: Silicon Image 3112 Lockups
2005-09-07 2:13 ` Jeff Garzik
@ 2005-09-07 2:34 ` Tejun Heo
2005-09-07 2:42 ` Tejun Heo
0 siblings, 1 reply; 8+ messages in thread
From: Tejun Heo @ 2005-09-07 2:34 UTC
To: Jeff Garzik
Cc: Jeremy Smith, Alexander Shaposhnikov, Carlos Pardo, Paul Taylor,
linux-ide
Jeff Garzik wrote:
> Tejun Heo wrote:
>
>> In the following mail, I've attached a patch which might alleviate
>> errors during writes (as Alexander was reporting CRC errors with write
>> commands), but it won't do any good if you're getting errors during
>> reading.
>>
>> http://marc.theaimsgroup.com/?l=linux-ide&m=112602112819183&w=2
>
>
> Note that I would put BIG CAPITAL LETTER WARNINGS on that patch, since
> it messes with the voltage.
>
Alexander & Jeremy.
It's as Jeff said.
TRY THE PATCH AT YOUR OWN RISK. IT MIGHT FRY THE PHY OF YOUR DRIVE.
(enough capitals?)
Even if you're brave enough to try, DO NOT GO OVER 600mV. 600mV is at
least inside the specified limits. Also, it won't change anything regarding
read errors. All it does is increase the voltage swing while transmitting
data (writes).
Thanks.
--
tejun
* Re: PROBLEM: Silicon Image 3112 Lockups
2005-09-07 2:34 ` Tejun Heo
@ 2005-09-07 2:42 ` Tejun Heo
2005-09-07 3:21 ` Jeremy Smith
0 siblings, 1 reply; 8+ messages in thread
From: Tejun Heo @ 2005-09-07 2:42 UTC
To: Tejun Heo
Cc: Jeff Garzik, Jeremy Smith, Alexander Shaposhnikov, Carlos Pardo,
Paul Taylor, linux-ide
Tejun Heo wrote:
> Jeff Garzik wrote:
>
>> Tejun Heo wrote:
>>
>>> In the following mail, I've attached a patch which might alleviate
>>> errors during writes (as Alexander was reporting CRC errors with
>>> write commands), but it won't do any good if you're getting errors
>>> during reading.
>>>
>>> http://marc.theaimsgroup.com/?l=linux-ide&m=112602112819183&w=2
>>
>>
>>
>> Note that I would put BIG CAPITAL LETTER WARNINGS on that patch, since
>> it messes with the voltage.
>>
>
> Alexander & Jeremy.
>
> It's as Jeff said.
>
> TRY THE PATCH AT YOUR OWN RISK. IT MIGHT FRY PHY OF YOUR DRIVE.
> (enough capitals?)
>
> Even if you're brave enough to try, DO NOT GO OVER 600mV. 600mV is at
> least inside specified limits. Also, it won't change anything regarding
> read errors. All it does is increasing voltage swing while transmitting
> data (writes).
Oh.. it might affect writes if errors are occurring due to CRC errors
during command transmit. If you're getting ABRT errors instead of
ICRC's, it might indicate that commands are being mistransferred (again,
I'm not sure at all).
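To check which case applies, the status and error values in the kernel log can be decoded by bit. The sketch below is illustrative only: the tables list a subset of the classic ATA register bits, and the helper name `decode_bits` is hypothetical. The log in the original report shows status=0x51 and error=0x04; the 0x04 bit, which that log labels DriveStatusError, is the position the ATA register layout assigns to ABRT, while ICRC would be 0x80.

```python
# Subset of the ATA status register bits (highest bit first).
STATUS_BITS = {
    0x80: "BSY",   # device busy
    0x40: "DRDY",  # device ready
    0x20: "DF",    # device fault
    0x10: "DSC",   # seek complete
    0x08: "DRQ",   # data request
    0x01: "ERR",   # error; details are in the error register
}

# Subset of the ATA error register bits (highest bit first).
ERROR_BITS = {
    0x80: "ICRC",  # interface CRC error
    0x40: "UNC",   # uncorrectable data error
    0x10: "IDNF",  # sector ID not found
    0x04: "ABRT",  # command aborted
}


def decode_bits(value, table):
    """Return the names of all bits from `table` that are set in `value`."""
    return [name for mask, name in table.items() if value & mask]


print(decode_bits(0x51, STATUS_BITS))  # ['DRDY', 'DSC', 'ERR']
print(decode_bits(0x04, ERROR_BITS))   # ['ABRT']
```

Read this way, the reported errors are aborts rather than interface CRC errors.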
--
tejun
* Re: PROBLEM: Silicon Image 3112 Lockups
2005-09-07 2:42 ` Tejun Heo
@ 2005-09-07 3:21 ` Jeremy Smith
2005-09-07 6:00 ` Tejun Heo
0 siblings, 1 reply; 8+ messages in thread
From: Jeremy Smith @ 2005-09-07 3:21 UTC
To: Tejun Heo
Cc: Jeff Garzik, Alexander Shaposhnikov, Carlos Pardo, Paul Taylor,
linux-ide
On Wed, 7 Sep 2005, Tejun Heo wrote:
> Tejun Heo wrote:
>> Jeff Garzik wrote:
>>
>>> Tejun Heo wrote:
>>>
>>>> In the following mail, I've attached a patch which might alleviate
>>>> errors during writes (as Alexander was reporting CRC errors with write
>>>> commands), but it won't do any good if you're getting errors during
>>>> reading.
>>>>
>>>> http://marc.theaimsgroup.com/?l=linux-ide&m=112602112819183&w=2
>>>
>>>
>>>
>>> Note that I would put BIG CAPITAL LETTER WARNINGS on that patch, since it
>>> messes with the voltage.
>>>
>>
>> Alexander & Jeremy.
>>
>> It's as Jeff said.
>>
>> TRY THE PATCH AT YOUR OWN RISK. IT MIGHT FRY PHY OF YOUR DRIVE. (enough
>> capitals?)
>>
>> Even if you're brave enough to try, DO NOT GO OVER 600mV. 600mV is at
>> least inside specified limits. Also, it won't change anything regarding
>> read errors. All it does is increasing voltage swing while transmitting
>> data (writes).
>
> Oh.. it might affect writes if errors are occurring due to CRC errors during
> command transmit. If you're getting ABRT errors instead of ICRC's, it might
> indicate that commands are being mistransferred (again, I'm not sure at all).
>
> --
> tejun
>
Did you mean reads here? Because I think it's happening on reads as
well--it happens on an "e2fsck -b -n" on the drive when I'm booted off a
CDROM. I'm willing to try it out if it could help, but if it's unlikely
to...
I don't have any idea how these drivers work, but the ASUS K8N-DL also has
the nvidia SATA controller in it--which doesn't appear to work at all, so
I started by hooking up the drives to the SI controller. Can the mere
presence of this additional controller make a difference?
For what it's worth, I don't _think_ I was seeing similar lockups until I
updated the firmware on this board to the latest version (1004 from 1003),
but that could be a red herring because I also wasn't paying attention to
syslog.
I've tried changes to cabling...both drives experience the exact same
symptoms for me; it certainly could be hardware related, but it would be
on the board, for which I don't have a spare.
Is there any additional information I can provide?
Jer
* Re: PROBLEM: Silicon Image 3112 Lockups
2005-09-07 3:21 ` Jeremy Smith
@ 2005-09-07 6:00 ` Tejun Heo
2005-09-07 6:09 ` Jeff Garzik
0 siblings, 1 reply; 8+ messages in thread
From: Tejun Heo @ 2005-09-07 6:00 UTC
To: Jeremy Smith
Cc: Jeff Garzik, Alexander Shaposhnikov, Carlos Pardo, Paul Taylor,
linux-ide
Jeremy Smith wrote:
>
> On Wed, 7 Sep 2005, Tejun Heo wrote:
>
>> Tejun Heo wrote:
>>
>>> Jeff Garzik wrote:
>>>
>>>> Tejun Heo wrote:
>>>>
>>>>> In the following mail, I've attached a patch which might alleviate
>>>>> errors during writes (as Alexander was reporting CRC errors with
>>>>> write commands), but it won't do any good if you're getting errors
>>>>> during reading.
>>>>>
>>>>> http://marc.theaimsgroup.com/?l=linux-ide&m=112602112819183&w=2
>>>>
>>>>
>>>>
>>>>
>>>> Note that I would put BIG CAPITAL LETTER WARNINGS on that patch,
>>>> since it messes with the voltage.
>>>>
>>>
>>> Alexander & Jeremy.
>>>
>>> It's as Jeff said.
>>>
>>> TRY THE PATCH AT YOUR OWN RISK. IT MIGHT FRY PHY OF YOUR DRIVE.
>>> (enough capitals?)
>>>
>>> Even if you're brave enough to try, DO NOT GO OVER 600mV. 600mV is
>>> at least inside specified limits. Also, it won't change anything
>>> regarding read errors. All it does is increasing voltage swing while
>>> transmitting data (writes).
>>
>>
>> Oh.. it might affect writes if errors are occurring due to CRC errors
>> during command transmit. If you're getting ABRT errors instead of
>> ICRC's, it might indicate that commands are being mistransferred
>> (again, I'm not sure at all).
>>
>> --
>> tejun
>>
>
> Did you mean reads here? Because I think it's happening on reads as
> well--it happens on an "e2fsck -b -n" on the drive when I'm booted off a
> CDROM. I'm willing to try it out if it could help, but if it's unlikely
> too...
Yes, I meant reads. It would be great if somebody tried the patch
out. Maybe you and Alexander can coordinate so that only one of you takes
the risk. ;-p If I had access to a K8N-DL, I would have tested it myself,
but sadly I don't. I did test with my discrete sii3112 card and a Samsung
HD160JJ drive at 600mV and had no problems, but this doesn't guarantee
anything for you guys.
I think it would be nice if Alexander or you could test it, but I have to
warn you again.
YOU MAY FRY YOUR HARDWARE WITH THIS.
> I don't have any idea how these drivers work, but the ASUS K8N-DL also
> has the nvidia SATA controller in it--which doesn't appear to work at
> all, so I started by hooking up the drives to the SI controller. Can
> the mere presence of this additional controller make a difference?
I doubt that that would have anything to do with this.
> For what it's worth, I don't _think_ I was seeing similar lockups until
> I updated the firmware on this board to the latest version (1004 from
> 1003), but that could be a red herring because I also wasn't paying
> attention to syslog.
I don't know. If some specific configuration is required for the
controller, it is usually done by the BIOS (either the mainboard BIOS or
the per-controller BIOS), so a BIOS update could affect the problem. But
this is still just wild speculation. Maybe we should contact ASUS
about this?
> I've tried changes to cabling...both drives experience the exact same
> symptoms for me; it certainly could be hardware related, but it would be
> on the board, for which I don't have a spare.
>
> Is there any additional information I can provide?
Well, I think two identical reports for a not-so-widespread mainboard
point away from cabling problems. And I cannot think of any more
info that would be helpful yet. I'll let you know if something comes up.
Thanks & good luck.
--
tejun
* Re: PROBLEM: Silicon Image 3112 Lockups
2005-09-07 6:00 ` Tejun Heo
@ 2005-09-07 6:09 ` Jeff Garzik
0 siblings, 0 replies; 8+ messages in thread
From: Jeff Garzik @ 2005-09-07 6:09 UTC
To: Tejun Heo
Cc: Jeremy Smith, Alexander Shaposhnikov, Carlos Pardo, Paul Taylor,
linux-ide
Tejun Heo wrote:
> I don't know. If some specific configurations are required for the
> controller, they are usually done by BIOS (either mainboard BIOS or
> per-controller BIOS), so BIOS update could affect the problem. But
> these are still just wild speculations. Maybe we should contact ASUS
> about this?
Note that, in the past, system BIOS updates have resolved reported
sata_sil data corruption bugs. Updating the BIOS is always a good idea.
Jeff