From: Simon McNair <simonmcnair@gmail.com>
To: Phil Turmel <philip@turmel.org>
Cc: NeilBrown <neilb@suse.de>, linux-raid@vger.kernel.org
Subject: Re: Linux software RAID assistance
Date: Wed, 16 Feb 2011 13:56:39 +0000
Message-ID: <4D5BD797.1040309@gmail.com>
In-Reply-To: <4D5A92F3.1090004@turmel.org>
One other snippet:

proxmox:/home/simon# for x in /dev/sd{d..m} ; do echo $x ; dd if=$x skip=2312 count=128 2>/dev/null | strings | grep 9fAJEz-HcaP-RQ51-fV8b-nxrN-Uqwb-PPnOLJ ; done
/dev/sdd
/dev/sde
/dev/sdf
/dev/sdg
/dev/sdh
/dev/sdi
id = "9fAJEz-HcaP-RQ51-fV8b-nxrN-Uqwb-PPnOLJ"
id = "9fAJEz-HcaP-RQ51-fV8b-nxrN-Uqwb-PPnOLJ"
/dev/sdj
/dev/sdk
/dev/sdl
/dev/sdm
On 15/02/2011 14:51, Phil Turmel wrote:
> Hi Neil,
>
> Since Simon has responded, let me summarize the assistance I provided per his off-list request:
>
> On 02/14/2011 11:53 PM, NeilBrown wrote:
>> On Thu, 10 Feb 2011 16:16:44 +0000 Simon McNair <simonmcnair@gmail.com> wrote:
>>
>>> Hi all
>>>
>>> I use a 3ware 9500-12 port sata card (JBOD) which will not work without a
>>> 128mb sodimm. The sodimm socket is flakey and the result is that the
>>> machine occasionally crashes. Yesterday I finally gave in and put
>>> together another machine so that I can rsync between them. When I
>>> turned the machine on today to set up rsync, the RAID array was not
>>> gone, but corrupted.
>>> Typical...
>> Presumably the old machine was called 'ubuntu' and the new machine 'proølox'
>>
>>
>>> I built the array in Aug 2010 using the following command:
>>>
>>> mdadm --create --verbose /dev/md0 --metadata=1.1 --level=5
>>> --raid-devices=10 /dev/sd{b,c,d,e,f,g,h,i,j,k}1 --chunk=64
>>>
>>> Using LVM, I did the following:
>>> pvscan
>>> pvcreate -M2 /dev/md0
>>> vgcreate lvm-raid /dev/md0
>>> vgdisplay lvm-raid
>>> vgscan
>>> lvscan
>>> lvcreate -v -l 100%VG -n RAID lvm-raid
>>> lvdisplay /dev/lvm-raid/lvm0
>>>
>>> I then formatted using:
>>> mkfs -t ext4 -v -m .1 -b 4096 -E stride=16,stripe-width=144
>>> /dev/lvm-raid/RAID
>>>
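For reference, the stride and stripe-width values in the mkfs command above follow from the array geometry: stride is the chunk size divided by the filesystem block size, and stripe-width is the stride times the number of data disks (one less than the device count for RAID5). A minimal sketch of that arithmetic; the function names are made up for illustration and are not part of mdadm or e2fsprogs:

```shell
# Hypothetical helpers, for illustration only.
# ext4_stride CHUNK_KIB BLOCK_BYTES -> filesystem blocks per RAID chunk
ext4_stride() {
    echo $(( $1 * 1024 / $2 ))
}

# ext4_stripe_width CHUNK_KIB BLOCK_BYTES RAID_DEVICES PARITY_DEVICES
# -> filesystem blocks per full stripe of data
ext4_stripe_width() {
    stride=$(ext4_stride "$1" "$2")
    echo $(( stride * ($3 - $4) ))
}

# 64 KiB chunk, 4096-byte blocks, 10 devices, 1 parity device (RAID5)
echo "stride=$(ext4_stride 64 4096)"
echo "stripe-width=$(ext4_stripe_width 64 4096 10 1)"
```

This prints stride=16 and stripe-width=144, matching the values passed to -E.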
>>> This has worked perfectly since I created the array. Now mdadm is coming
>>> up with:
>>>
>>> proxmox:/dev/md# mdadm --assemble --scan --verbose
>>> mdadm: looking for devices for further assembly
>>> mdadm: no recogniseable superblock on /dev/md/ubuntu:0
>> And it seems that ubuntu:0 has been successfully assembled.
>> It is missing one device for some reason (sdd1) but RAID can cope with that.
> The 3ware card is compromised, with a loose buffer memory DIMM. Some of its ECC errors were caught and reported in dmesg. It's likely, based on the loose memory socket, that many multiple-bit errors got through.
>
> [trim /]
>
>>> mdadm: no uptodate device for slot 8 of /dev/md/proølox:0
>>> mdadm: no uptodate device for slot 9 of /dev/md/proølox:0
>>> mdadm: failed to add /dev/sdd1 to /dev/md/proølox:0: Invalid argument
>>> mdadm: /dev/md/proølox:0 assembled from 0 drives - not enough to start
>>> the array.
>> This looks like it is *after* trying the --create command you give
>> below. It is best to report things in the order they happen, else you can
>> confuse people (or get caught out!).
> Yes, this was after.
>
>>> mdadm: looking for devices for further assembly
>>> mdadm: no recogniseable superblock on /dev/sdd
>>> mdadm: No arrays found in config file or automatically
>>>
>>> pvscan and vgscan show nothing.
>>>
>>> So I tried running mdadm --create --verbose /dev/md0 --metadata=1.1
>>> --level=5 --raid-devices=10 missing /dev/sde1 /dev/sdf1 /dev/sdg1
>>> /dev/sdh1 /dev/sdi1 /dev/sdj1 /dev/sdk1 /dev/sdl1 /dev/sdm1 --chunk=64
>>>
>>> as it seemed that /dev/sdd1 failed to be added to the array. This did
>>> nothing.
>> It did not do nothing. It wrote a superblock to /dev/sdd1 and complained
>> that it couldn't write to all the others --- didn't it?
> There were multiple attempts to create. One wrote to just sdd1, another succeeded with all but sdd1.
>
>>> dmesg contains:
>>>
>>> md: invalid superblock checksum on sdd1
>> I guess that is why sdd1 was missing from 'ubuntu:0'. Though as I cannot
>> tell if this happened before or after any of the various things reported
>> above, it is hard to be sure.
>>
>>
>> The real mystery is why 'pvscan' reports nothing.
> The original array was created with mdadm v2.6.7, and had a data offset of 264 sectors. After Simon's various attempts to --create, he ended up with a data offset of 2048 sectors, using mdadm v3.1.4. The mdadm -E reports he posted to the list showed the 264 offset. We didn't realize the offset had been updated until somewhat later in our troubleshooting efforts.
>
> In any case, pvscan couldn't see the LVM signature because it wasn't there (at offset 2048).
>
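To put numbers on the offset mismatch, here is a sketch of the arithmetic, assuming the 512-byte sectors in which mdadm reports its data offset. The helper name is hypothetical.

```shell
# Assumption: offsets are counted in 512-byte sectors.
SECTOR_BYTES=512

# sectors_to_bytes N -> N sectors expressed in bytes (hypothetical helper)
sectors_to_bytes() {
    echo $(( $1 * SECTOR_BYTES ))
}

old_offset=264     # data offset of the array as created with mdadm v2.6.7
new_offset=2048    # data offset after re-creating with mdadm v3.1.4

echo "old data start: $(sectors_to_bytes "$old_offset") bytes"
echo "new data start: $(sectors_to_bytes "$new_offset") bytes"
echo "shift: $(( new_offset - old_offset )) sectors"
```

On each member device the original start of data, where the LVM label lives, lies 1784 sectors before the start of the re-created array's data area, so it cannot be seen through the new md device at all.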
>> What about
>> pvscan --verbose
>>
>> or
>>
>> blkid -p /dev/md/ubuntu:0
>>
>> or even
>>
>> dd if=/dev/md/ubuntu:0 count=8 | od -c
> Fortunately, Simon did have a copy of his LVM configuration. With the help of dd, strings, and grep, we did locate his LVM sig at the correct location on sdd1 (for data offset 264). After a number of attempts to bypass LVM and access his single LV with dmsetup (based on his backed up configuration, on the assembled new array less sdd1), I realized that the data offset was wrong on the recreated array, and went looking for the cause. I found your git commit that changed that logic last spring, and recommended that Simon revert to the default package for his ubuntu install, which is v2.6.7.
>
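A check of this kind can be sketched roughly as follows: read the first few sectors at a candidate data offset and look for the LVM2 label magic, "LABELONE", which LVM writes into one of the first four sectors of a PV. This is a simplified stand-in for the dd/strings/grep searching described above, not the exact commands used, and has_lvm_label is a hypothetical helper name.

```shell
# has_lvm_label DEVICE_OR_IMAGE OFFSET_SECTORS
# Succeeds (exit 0) if the LVM2 label magic appears within the four
# 512-byte sectors starting at OFFSET_SECTORS. Read-only.
has_lvm_label() {
    dd if="$1" bs=512 skip="$2" count=4 2>/dev/null |
        LC_ALL=C grep -q -F LABELONE
}

# Example (needs root; the device name is only an illustration):
#   for off in 264 2048; do
#       has_lvm_label /dev/sdd1 "$off" && echo "label found at offset $off"
#   done
```

The function only reads from the device, so it is safe to try against each candidate offset in turn.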
> Simon has now attempted to recreate the array with v2.6.7, but the controller is throwing too many errors to succeed, and I suggested it was too flakey to trust any further. Based on the existence of the LVM sig on sdd1, I believe Simon's data is (mostly) intact, and only needs a successful create operation with a properly functioning controller. (He might also need to perform an lvm vgcfgrestore, but he has the necessary backup file.)
>
> A new controller is on order.
>
> Phil