From: Joerg Sommrey <jo@sommrey.de>
To: Linux kernel mailing list <linux-kernel@vger.kernel.org>
Subject: 2.6.12-rc2: Promise SATA150 TX4 failures
Date: Sat, 9 Apr 2005 18:35:52 +0200
Message-ID: <20050409163552.GA30263@sommrey.de>
Hi all,
just tried 2.6.12-rc2 and I still have the same errors from my SATA
disks as with 2.6.11. The setup is a bit complex. The relevant parts (I
think) are:
Adaptec AHA-2940UW SCSI controller, attached are:
  1 harddisk /dev/sda
  1 DDS3 streamer /dev/st0
Promise SATA150 TX4 controller, attached are:
  2 identical harddisks /dev/sdb and /dev/sdc
/dev/sda consists of the root partition, a swap partition and 4 other
partitions that are physical volumes for dm volume group /dev/vg1
/dev/sdb and /dev/sdc have two partitions each; the first partitions of
both disks form an md RAID-0 array /dev/md0 and the second partitions
form an md RAID-1 array /dev/md1
/dev/md0 and /dev/md1 are the physical volumes for dm volume groups
/dev/vg2 and /dev/vg3 resp.
To trigger the failure:
- For all logical volumes in /dev/vg1, /dev/vg2 and /dev/vg3 a snapshot is
created.
- All snapshots are mounted read-only in a "snapshot hierarchy" under
/snap.
- A backup to tape is taken using something like:
find /snap -print | cpio -oaH crc -F /dev/st0
Backup must go to tape, no problem with /dev/null
- At this point, some additional I/O on the SATA disks causes the whole
box to hang. Usually some errors are written to syslog; they are
always similar:
Apr 9 01:30:35 bear kernel: ata2: status=0x51 { DriveReady SeekComplete Error }
Apr 9 01:30:35 bear kernel: ata2: called with no error (51)!
Apr 9 01:30:35 bear kernel: SCSI error : <2 0 0 0> return code = 0x8000002
Apr 9 01:30:35 bear kernel: sdc: Current: sense key: Medium Error
Apr 9 01:30:35 bear kernel: Additional sense: Unrecovered read error - auto reallocate failed
Apr 9 01:30:35 bear kernel: end_request: I/O error, dev sdc, sector 43100350
Apr 9 01:30:35 bear kernel: raid1: Disk failure on sdc2, disabling device.
Apr 9 01:30:35 bear kernel: ^IOperation continuing on 1 devices
The errors are always reported for /dev/sdc2, the second device of a
RAID-1 array. After reboot I am able to raidhotadd the failed partition
without problems.
The problem is 100% reproducible.
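As a rough sketch, the trigger sequence above looks something like the
script below. The snapshot size, the "_snap" suffix and the
/snap/<vg>/<lv> layout are placeholders chosen for illustration, not
taken from the actual setup. By default it only prints the commands
instead of running them.

```shell
#!/bin/sh
# Sketch of the trigger sequence. Prints the commands by default;
# set DRY_RUN=0 to execute them (needs root, LVM2 and a tape in /dev/st0).
# Snapshot size, naming scheme and mount layout are assumptions.
DRY_RUN=${DRY_RUN:-1}
SNAP_SIZE=${SNAP_SIZE:-1G}   # assumed copy-on-write space per snapshot

# run CMD ARGS...: print the command in dry-run mode, execute it otherwise
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# snapshot_lv VG LV: snapshot one logical volume and mount it read-only
# under /snap/VG/LV
snapshot_lv() {
    vg=$1
    lv=$2
    run lvcreate -s -L "$SNAP_SIZE" -n "${lv}_snap" "/dev/$vg/$lv"
    run mkdir -p "/snap/$vg/$lv"
    run mount -o ro "/dev/$vg/${lv}_snap" "/snap/$vg/$lv"
}

# backup_to_tape: the backup command from this report
backup_to_tape() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "find /snap -print | cpio -oaH crc -F /dev/st0"
    else
        find /snap -print | cpio -oaH crc -F /dev/st0
    fi
}
```

Calling snapshot_lv once for every logical volume in vg1, vg2 and vg3
(for example "snapshot_lv vg2 home", where "home" is a made-up volume
name) and then backup_to_tape corresponds to the first three steps. The
hang then shows up once there is additional I/O on the SATA disks
while the backup is running.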
The hang is not a "hard hang": X keeps running and the watchdog doesn't
fire, but no new processes can be started. Syslog entries stop after some
time (from a few seconds to several minutes).
The problem appeared somewhere between 2.6.10 and 2.6.11.
2.6.10: ok
2.6.10-ac8: ok
2.6.10-ac11: failed
2.6.11: failed
2.6.12-rc2: failed
I'd be glad if there were a solution for this problem, as it prevents
me from using any newer kernel.
-jo
--
-rw-r--r-- 1 jo users 63 2005-04-09 09:31 /home/jo/.signature