From: "Jörn Nettingsmeier" <nettings@stackingdwarves.net>
To: linux-scsi@vger.kernel.org
Cc: "Jörn Nettingsmeier" <nettings@stackingdwarves.net>
Subject: 3ware 9690SA: raid6 array blows up under opensuse 11.2 (2.6.31)
Date: Sun, 13 Dec 2009 16:08:30 +0100
Message-ID: <4B25036E.7080101@stackingdwarves.net>
hi everyone !
i am seeing a weird issue with the 3ware 9690SA controller on an intel
nehalem system. there are five 2 TB drives attached to that controller in a
raid6 configuration, which is exported to the operating system as a
single volume.
an installation of opensuse 11.2 (using a 2.6.31 kernel) fails
reproducibly after 10-20 minutes of heavy disk activity. the error shows
up as the first two drives dropping out of the array, whereupon the
controller switches to read-only mode and any subsequent writes fail.
checking with the CLI or the controller bios shows that drives p0 and p1
are disconnected.
at the same time, the controller bios fails to come up after a warm
reboot 3 out of 4 times; only a cold restart fixes that.
the nature of the failure and the reboot problems made me suspect a
hardware failure (an opinion shared by my vendor's support technician).
so the box was sent back for testing. it turns out they can reproduce the
error 100% of the time with opensuse 11.2, but not with an older debian or
SLES 10 system. and they swapped pretty much all hardware components in the
lab (i had already seen the issue with two different controllers of the same
model).
so it might be a software issue after all.
since the vendor does not support opensuse 11.2, they closed the issue and
sent the machine back.
still, this is nagging me, simply because if it happened on a production
machine, it would mean massive data loss...
are there any known regressions in the 3ware driver or userspace
utilities since 2.6.27 (the latest kernel that passed the tests) that
could be causing this issue?
here are some screenshots of the dmesg buffer and the controller bios
during the error condition:
http://stackingdwarves.net/download/3ware-9690SE-crash/
i'd be happy to run further tests on that machine on monday - any advice
on how to proceed would be most welcome.
best regards,
jörn
ps: i'd appreciate being cc:ed on this issue, since i'm not subscribed.
--
Jörn Nettingsmeier
Lortzingstr. 11, 45128 Essen, Tel. +49 177 7937487
Meister für Veranstaltungstechnik (Bühne/Studio), Elektrofachkraft
Audio and event engineer - Ambisonic surround recordings
http://stackingdwarves.net
Thread overview: 4+ messages
2009-12-13 15:08 Jörn Nettingsmeier [this message]
2009-12-13 20:38 ` 3ware 9690SA: raid6 array blows up under opensuse 11.2 (2.6.31) adam radford
2009-12-13 22:24 ` Jörn Nettingsmeier
2009-12-14 0:45 ` adam radford