From: Hans-Peter Jansen <hpj@urpla.net>
To: linux-raid@vger.kernel.org
Subject: Persistent failures with simple md setup
Date: Tue, 29 Jan 2013 23:14:35 +0100
Message-ID: <1565063.1kpR7lz4Ph@xrated>
[-- Attachment #1: Type: text/plain, Size: 3990 bytes --]
[Looks like my first message didn't make it to the list, hence sending
again with tarballed attachments]
Dear list members,
one of the systems I take care of, a pretty bog-standard openSUSE 12.1
installation, suffers from recurring device failures on boot.

Here's a typical case:
~# cat /proc/mdstat
Personalities : [raid1] [raid0] [raid10] [raid6] [raid5] [raid4]
md3 : active raid1 sda4[0]
      869702736 blocks super 1.0 [2/1] [U_]
      bitmap: 57/415 pages [228KB], 1024KB chunk

md0 : active raid1 sda1[0]
      96376 blocks super 1.0 [2/1] [U_]
      bitmap: 1/6 pages [4KB], 8KB chunk

md1 : active (auto-read-only) raid1 sdb2[1] sda2[0]
      2096468 blocks super 1.0 [2/2] [UU]
      bitmap: 0/8 pages [0KB], 128KB chunk

md124 : active raid1 sdb3[1] sda3[0]
      104856180 blocks super 1.0 [2/2] [UU]
      bitmap: 8/200 pages [32KB], 256KB chunk
[kernel log lines left unwrapped on purpose]
Jan 29 20:22:36 zaphkiel kernel: [ 11.047504] md: raid1 personality registered for level 1
Jan 29 20:22:36 zaphkiel kernel: [ 11.549612] md: bind<sda3>
Jan 29 20:22:36 zaphkiel kernel: [ 11.587037] md: bind<sdb3>
Jan 29 20:22:36 zaphkiel kernel: [ 11.630965] md/raid1:md124: active with 2 out of 2 mirrors
Jan 29 20:22:36 zaphkiel kernel: [ 11.708396] md124: bitmap initialized from disk: read 13/13 pages, set 1 of 409595 bits
Jan 29 20:22:36 zaphkiel kernel: [ 11.769213] md124: detected capacity change from 0 to 107372728320
Jan 29 20:22:36 zaphkiel kernel: [ 11.981192] md: raid0 personality registered for level 0
Jan 29 20:22:36 zaphkiel kernel: [ 12.020959] md: raid10 personality registered for level 10
Jan 29 20:22:36 zaphkiel kernel: [ 12.625530] md: raid6 personality registered for level 6
Jan 29 20:22:36 zaphkiel kernel: [ 12.657414] md: raid5 personality registered for level 5
Jan 29 20:22:36 zaphkiel kernel: [ 12.689261] md: raid4 personality registered for level 4
Jan 29 20:22:36 zaphkiel kernel: [ 25.151590] md: bind<sda2>
Jan 29 20:22:36 zaphkiel kernel: [ 25.314284] md: bind<sda1>
Jan 29 20:22:36 zaphkiel kernel: [ 25.409503] md: bind<sda4>
Jan 29 20:22:36 zaphkiel kernel: [ 25.568103] md/raid1:md0: active with 1 out of 2 mirrors
Jan 29 20:22:36 zaphkiel kernel: [ 25.689110] md: bind<sdb2>
Jan 29 20:22:36 zaphkiel kernel: [ 25.713385] md0: bitmap initialized from disk: read 1/1 pages, set 0 of 12047 bits
Jan 29 20:22:36 zaphkiel kernel: [ 25.837207] md0: detected capacity change from 0 to 98689024
Jan 29 20:22:36 zaphkiel kernel: [ 26.045361] md/raid1:md1: active with 2 out of 2 mirrors
Jan 29 20:22:36 zaphkiel kernel: [ 26.260500] md1: bitmap initialized from disk: read 1/1 pages, set 0 of 16379 bits
Jan 29 20:22:36 zaphkiel kernel: [ 26.349129] md1: detected capacity change from 0 to 2146783232
Jan 29 20:22:36 zaphkiel kernel: [ 26.391526] md/raid1:md3: active with 1 out of 2 mirrors
Jan 29 20:22:36 zaphkiel kernel: [ 27.188346] md3: bitmap initialized from disk: read 26/26 pages, set 1547 of 849320 bits
Jan 29 20:22:36 zaphkiel kernel: [ 27.302622] md3: detected capacity change from 0 to 890575601664
This looks like some kind of race during device detection.
The full boot sequence log leading to this mess is attached.
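For reference, the stale half can be told apart by comparing the event
counters in the member superblocks, e.g. for md3 (just a diagnostic
sketch, using this system's device names):

~# mdadm --examine /dev/sda4 /dev/sdb4 | grep -E '^/dev|Events'
# the member reporting the lower Events count holds the stale copy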
The major components involved here are:
mdadm-3.2.2-4.9.1.i586
mkinitrd-2.7.0-39.3.1.i586
kernel-desktop-3.1.10-1.16.1.i586
kernel-desktop-base-3.1.10-1.16.1.i586
Sure, for this case the system can be repaired with:

mdadm --add /dev/md0 /dev/sdb1
mdadm --add /dev/md3 /dev/sdb4

but which partition is affected appears to be random; only md124 (the
root fs) seems stable. The strange md naming is the result of an
upgrade installation. The device details are attached as well.
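Since all arrays carry internal write-intent bitmaps, a --re-add of the
dropped halves should also work and only resync the dirty bitmap chunks
instead of doing a full rebuild (assuming the superblocks on the dropped
members are still valid):

~# mdadm /dev/md0 --re-add /dev/sdb1
~# mdadm /dev/md3 --re-add /dev/sdb4
~# cat /proc/mdstat   # recovery should finish after a short catch-up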
It even happens that the active device *switches* between boots, which
is a perfect recipe for actually losing data: writes made while one half
is active are gone as soon as the next boot assembles the degraded array
from the other, stale half, and re-adding then resyncs that stale state
over the newer copy. So this md setup doesn't raise data safety, it is
itself a cause of data loss.
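If the race really happens during assembly in the initrd, my next try
would be to pin the arrays by UUID in /etc/mdadm.conf and regenerate the
initrd (just an untested guess on my side):

~# mdadm --detail --scan > /etc/mdadm.conf
~# mkinitrd   # assuming it picks up mdadm.conf for boot-time assembly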
Could some kind soul tell me what's going on here?
Thanks in advance,
Pete
[-- Attachment #2: details-and-log.tar.bz2 --]
[-- Type: application/x-bzip-compressed-tar, Size: 18989 bytes --]