From: "Kasimir Müller" <kjm@kasimir-mueller.de>
To: Tejun Heo <htejun@gmail.com>,
IDE/ATA development list <linux-ide@vger.kernel.org>,
christian.kuehn@hamburg.de
Subject: Re: [Fwd: Re: libata , Silicon Image 3124]
Date: Sun, 06 Jan 2008 11:55:03 +0100
Message-ID: <4780B387.1050600@kasimir-mueller.de>
In-Reply-To: <473A5B03.2020808@gmail.com>
[-- Attachment #1: Type: text/plain, Size: 4728 bytes --]
Hi Tejun,
Old communication appended below.
I wish you a Happy Xmas and a successful New Year.
I spent some time over Christmas investigating the problem further. I
bought a new 500GB disk and moved all data onto it.
This is also continuously monitored by Nagios and Cacti.
Then:
1.) All 5 disks in the external case, connected via port multiplier and
sil24 card, have excellent health status according to smartd.
2.) I get no(!!!!) errors at all if I use the disks as single drives or
with LVM. I verified this by copying large amounts of data (100-200GB)
with rsync and cp -av, and by running bonnie++, singly and
simultaneously, on various combinations of drives.
3.) I get the errors as soon as I use RAID. The same errors occur with
RAID0 (2 disks), RAID1 (2 disks) and RAID5 (3 disks), in any
combination of the drives.
4.) The errors usually appear first during mkfs (the same with ext3 and
reiserfs), then again after writing about 10-50 GB to the RAID, and
after that they repeat at 5 to 10 minute intervals, depending on the
disk activity.
5.) I used kernel 2.6.23.1 with your latest patch: same result.
6.) I used kernel 2.6.24-rc6: same result.
7.) During the tests I marked all files with md5 sums: no data
corruption (!!!), so maybe I can live with it.
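
For anyone who wants to repeat the check in point 7, the idea can be
sketched roughly as follows. This is only an illustration, not the exact
procedure used in the tests: the function names are made up for this
example, and it assumes the source tree and the copy on the RAID are both
mounted and readable.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of one file, read in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def md5_manifest(root):
    """Map each regular file's path (relative to root) to its MD5 digest."""
    root = Path(root)
    return {
        str(path.relative_to(root)): md5_of_file(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def changed_files(source_root, copy_root):
    """List relative paths that are missing from the copy or differ in it."""
    source = md5_manifest(source_root)
    copy = md5_manifest(copy_root)
    return sorted(
        name for name, digest in source.items() if copy.get(name) != digest
    )
```

Calling changed_files() with the original directory and the copy on the
RAID returns an empty list when every file arrived intact. Reading the
copy back right after writing may be served from the page cache, so the
comparison is more convincing after a remount.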
linux:/var/log # cat /proc/interrupts
           CPU0
  0:   12457604    XT-PIC-XT        timer
  1:          8    XT-PIC-XT        i8042
  2:          0    XT-PIC-XT        cascade
  4:     186738    XT-PIC-XT        serial
  5:   13328470    XT-PIC-XT        sata_via, ehci_hcd:usb5, VIA8237, fcpci, eth0
  6:          5    XT-PIC-XT        floppy
  8:          0    XT-PIC-XT        rtc
  9:          0    XT-PIC-XT        acpi
 10:    8019753    XT-PIC-XT        sata_sil24
 11:        114    XT-PIC-XT        uhci_hcd:usb1, uhci_hcd:usb2
 14:     468588    XT-PIC-XT        libata
 15:          0    XT-PIC-XT        libata, uhci_hcd:usb3, uhci_hcd:usb4
NMI:          0
LOC:   12457724
ERR:          1
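
In case it helps when reading the listing, the following rough sketch
pulls out which interrupt lines are shared between several drivers. It
assumes the layout shown above (numeric IRQ, one count column per CPU, a
single-word controller name, then a comma-separated device list on the
same line); other kernels print the controller column differently, so
treat the parsing as an assumption. The function names and the SAMPLE
excerpt are only for illustration.

```python
# Excerpt of the listing above, with each IRQ on a single line.
SAMPLE = """\
           CPU0
  0:   12457604    XT-PIC-XT        timer
  5:   13328470    XT-PIC-XT        sata_via, ehci_hcd:usb5, VIA8237, fcpci, eth0
 10:    8019753    XT-PIC-XT        sata_sil24
 11:        114    XT-PIC-XT        uhci_hcd:usb1, uhci_hcd:usb2
NMI:          0
"""


def parse_interrupts(text):
    """Parse /proc/interrupts text into {irq: (total_count, [devices])}."""
    table = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields or not fields[0].endswith(":"):
            continue  # header line or anything that is not an IRQ row
        irq = fields[0][:-1]
        if not irq.isdigit():
            continue  # skip summary rows such as NMI, LOC and ERR
        rest = fields[1:]
        counts = []
        while rest and rest[0].isdigit():
            counts.append(int(rest.pop(0)))  # one counter per CPU
        # rest[0] is assumed to be the controller name (XT-PIC-XT here).
        devices = " ".join(rest[1:])
        names = [name.strip() for name in devices.split(",") if name.strip()]
        table[int(irq)] = (sum(counts), names)
    return table


def shared_irqs(table):
    """Return only the IRQs that have more than one device attached."""
    return {irq: names for irq, (_, names) in table.items() if len(names) > 1}
```

In the listing above, sata_sil24 sits alone on IRQ 10, while sata_via
shares IRQ 5 with four other devices.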
The complete log of today's kernel errors is attached as a gzip file.
On the openSUSE mailing list, some people have reported related errors:
http://lists.opensuse.org/opensuse-de/2007-12/msg00939.html
Can you make something of this? Do you need any more information?
(Please be specific, because I am not a Linux guru.)
Yours sincerely
Kasimir Mueller
> kjm@kasimir.mueller.de didn't work. Forwarding to the original address.
>
The correct address is kjm@kasimir-mueller.de
>
>
> ------------------------------------------------------------------------
>
> Subject: Re: libata , Silicon Image 3124
> From: Tejun Heo <htejun@gmail.com>
> Date: Wed, 14 Nov 2007 11:10:57 +0900
> To: Kasimir Mueller <kjm@kasimir.mueller.de>
> CC: IDE/ATA development list <linux-ide@vger.kernel.org>
>
> Return-Path: <htejun@gmail.com>
> X-Original-To: htejun@gmail.com
> Delivered-To: tj@htj.dyndns.org
> Received: from [127.0.0.2] (htj.dyndns.org [127.0.0.2]) by htj.dyndns.org
>  (Postfix) with ESMTP id 50E1813A81CF; Wed, 14 Nov 2007 11:10:57 +0900
>  (KST)
> Message-ID: <473A5931.4040704@gmail.com>
> User-Agent: Thunderbird 2.0.0.6 (X11/20070801)
> MIME-Version: 1.0
> References: <4738A231.5010700@kasimir-mueller.de> <4738FC4C.7050103@gmail.com>
>  <473A11B4.5030409@kasimir.mueller.de>
> In-Reply-To: <473A11B4.5030409@kasimir.mueller.de>
> X-Enigmail-Version: 0.95.3
> Content-Type: text/plain; charset=ISO-8859-15
> Content-Transfer-Encoding: 8bit
>
>
> [linux-ide added back. please don't drop cc list]
>
> Kasimir Mueller wrote:
>
>> Tejun Heo wrote:
>>
>>> Kasimir Müller wrote:
>>>
>>>
>>>> Nov 12 19:28:42 linux kernel: ata6.02: cmd
>>>> ea/00:00:00:00:00/00:00:00:00:00/a0 tag 0 cdb 0x0 data 0
>>>> Nov 12 19:28:42 linux kernel: res
>>>> 40/00:00:09:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
>>>>
>>>>
>>> That's flush timing out, which isn't good.
>>>
>>> 1. Are the errors localized to ata6.02?
>>>
>>>
>> I think it's ata6.01
>>
>
> The above message says 6.02 tho. Anyways, if the errors are localized
> to one drive, please swap that drive with another drive. Do the errors
> follow the drive or stay with the slot?
>
>
>>> 2. Is FLUSH always involved (ie. is cmd always ea)?
>>>
>>>
>> yes
>>
>
> Hmmm... okay. That sounds like a dying drive to me. What does
> 'smartctl -a /dev/sdX' say?
>
>
>>> 3. If you disable NCQ, what do errors look like?
>>>
>>>
>> I put all disk-drives in the blacklist , without a change
>>
>
> So, NCQ isn't the problem.
>
> ATM, it sounds like a dying drive to me. Please double check the errors
> are localized to one drive, check smartctl log and try to verify as
> written above.
>
> Thanks.
>
>
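
To answer questions 1 and 2 in the quoted mail against the full log
rather than by eye, something like the sketch below might help. It is
only a rough illustration: it assumes each error sits on one unwrapped
syslog line of the form "ataN.MM: cmd xx/..." as in the excerpt quoted
above, and the function names are invented for this example. The two
opcodes listed are the ATA cache-flush commands (0xea is FLUSH CACHE EXT,
the command Tejun identified).

```python
import re
from collections import Counter

# Matches the start of a libata error line, for example
#   "... kernel: ata6.02: cmd ea/00:00:00:00:00/00:00:00:00:00/a0 tag 0 ..."
CMD_LINE = re.compile(
    r"ata(?P<port>\d+)(?:\.(?P<link>\d+))?: cmd (?P<opcode>[0-9a-f]{2})/"
)

# ATA opcodes of the two cache-flush commands.
FLUSH_OPCODES = {0xE7: "FLUSH CACHE", 0xEA: "FLUSH CACHE EXT"}


def failed_commands(log_lines):
    """Count failed commands per (device name, opcode) in kernel log lines."""
    counts = Counter()
    for line in log_lines:
        match = CMD_LINE.search(line)
        if match:
            device = "ata{}.{}".format(match["port"], match["link"] or "00")
            counts[(device, int(match["opcode"], 16))] += 1
    return counts


def only_flush(counts):
    """True if at least one command failed and all of them were flushes."""
    return bool(counts) and all(op in FLUSH_OPCODES for _, op in counts)


def devices_affected(counts):
    """Sorted list of device names that appear in any failed command."""
    return sorted({device for device, _ in counts})
```

Feeding it the lines of the uncompressed kernel log shows at a glance
whether the failures stay with one device name such as ata6.02 and
whether every failed command is a flush.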
[-- Attachment #2: kernel.log.today.gz --]
[-- Type: application/x-gzip, Size: 5442 bytes --]