* Data Offset
@ 2012-05-10 17:42 Piergiorgio Sartor
2012-05-24 5:20 ` NeilBrown
0 siblings, 1 reply; 24+ messages in thread
From: Piergiorgio Sartor @ 2012-05-10 17:42 UTC (permalink / raw)
To: linux-raid

Hi,

after the RAID-5 problem, I just realized that other RAIDs I have,
including a multi RAID-6, have a different data offset for each component.

This seems to be quite a problem in case "Create" is used to recover
an array.

Obviously, if a 4-disk RAID-5 has 2 disks with one offset and 2 with
another, it will not be possible to re-create it (saving the data).

Is there any way to fix/prevent such an issue?
Shouldn't "mdadm" make sure all offsets are the same? Or at least try to...

What I noticed is that adding a disk later might cause different offsets.

Any idea?

Thanks,

bye,

--

piergiorgio

^ permalink raw reply [flat|nested] 24+ messages in thread
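For reference, a quick way to compare the data offset of every member is to pull the relevant fields out of each component's --examine output. The device names below are only an example; adjust the glob to your own members:

  for d in /dev/sd[a-f]3; do
      echo "== $d"
      mdadm --examine "$d" | grep -E 'Data Offset|Device Role'
  done

All members of a consistently-created array should report the same "Data Offset".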
* Re: Data Offset
2012-05-10 17:42 Data Offset Piergiorgio Sartor
@ 2012-05-24 5:20 ` NeilBrown
0 siblings, 0 replies; 24+ messages in thread
From: NeilBrown @ 2012-05-24 5:20 UTC (permalink / raw)
To: Piergiorgio Sartor; +Cc: linux-raid

[-- Attachment #1: Type: text/plain, Size: 1608 bytes --]

On Thu, 10 May 2012 19:42:10 +0200 Piergiorgio Sartor
<piergiorgio.sartor@nexgo.de> wrote:

> Hi,
>
> after the RAID-5 problem, I just realized that other RAIDs I have,
> including a multi RAID-6, have a different data offset for each component.
>
> This seems to be quite a problem in case "Create" is used to recover
> an array.
>
> Obviously, if a 4-disk RAID-5 has 2 disks with one offset and 2 with
> another, it will not be possible to re-create it (saving the data).

It certainly won't be easy. Though if someone did find themselves in that
situation it might motivate me to enhance mdadm in some way to make it easily
fixable.

>
> Is there any way to fix/prevent such an issue?
> Shouldn't "mdadm" make sure all offsets are the same? Or at least try to...

I'm not sure. Maybe...

With linux-3.5 and mdadm-3.3 (both unreleased) you will probably be able to run

  mdadm --grow --data-offset=5M

and that will happen. At least for RAID10. Other levels might follow later.

Should mdadm keep them always the same? The reason that it doesn't is that
I thought that you could change the data offset by removing each device and
adding it back as a spare with a new data_offset. Maybe that isn't such a
good idea.

I suspect that if I got a chorus of people all saying "please keep
data_offset consistent" - and particularly if I received a patch which did
that - then I would probably change mdadm accordingly.

NeilBrown

>
> What I noticed is that adding a disk later might cause different offsets.
>
> Any idea?
>
> Thanks,
>
> bye,

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 828 bytes --]

^ permalink raw reply [flat|nested] 24+ messages in thread
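As a rough sketch of the two approaches described above (device names are illustrative; --grow --data-offset only applies once the kernel/mdadm versions mentioned support it, RAID10 first, and whether a re-added device really ends up with a new data offset depends on the mdadm version doing the --add):

  # future --grow path (linux-3.5 / mdadm-3.3, RAID10 first):
  mdadm --grow /dev/md0 --data-offset=5M

  # the remove-and-re-add idea, one device at a time, waiting for resync in between:
  mdadm /dev/md0 --fail /dev/sdb3
  mdadm /dev/md0 --remove /dev/sdb3
  mdadm /dev/md0 --add /dev/sdb3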
* Re: Data Offset
@ 2012-06-01 23:22 freeone3000
2012-06-01 23:52 ` NeilBrown
0 siblings, 1 reply; 24+ messages in thread
From: freeone3000 @ 2012-06-01 23:22 UTC (permalink / raw)
To: linux-raid
Hello. I have an issue concerning a broken RAID of uncertain pedigree.
Examining the drives tells me the data offsets are not the same, as
listed below.
> It certainly won't be easy. Though if someone did find themselves in that
> situation it might motivate me to enhance mdadm in some way to make it easily
> fixable.
I seem to be your motivation for making this situation fixable.
Somehow I managed to get drives with mismatched data offsets. All worked
fine until a drive dropped out of the RAID5. When attempting to replace
it, I can re-create the RAID, but it cannot be of the same size, because
the members that had a 1024-sector offset become "too small" when changed
to a 2048-sector offset, exactly as described. Are there any recovery options I
could try, including simply editing the header?
mdadm --examine of all drives in the RAID:
/dev/sdb3:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x0
Array UUID : 9759ad94:75e30b6b:8a726b4d:177a6eda
Name : leyline:1 (local to host leyline)
Creation Time : Mon Sep 12 13:19:00 2011
Raid Level : raid5
Raid Devices : 5
Avail Dev Size : 3906525098 (1862.78 GiB 2000.14 GB)
Array Size : 15626096640 (7451.10 GiB 8000.56 GB)
Used Dev Size : 3906524160 (1862.78 GiB 2000.14 GB)
Data Offset : 2048 sectors
Super Offset : 8 sectors
State : clean
Device UUID : 872097fa:3ae66ab4:ed21256a:10a030c9
Update Time : Fri Jun 1 03:11:54 2012
Checksum : 6d627f7a - correct
Events : 2127454
Layout : left-symmetric
Chunk Size : 512K
Device Role : Active device 1
Array State : AAAA. ('A' == active, '.' == missing)
Magic : a92b4efc
Version : 1.2
Feature Map : 0x0
Array UUID : 9759ad94:75e30b6b:8a726b4d:177a6eda
Name : leyline:1 (local to host leyline)
Creation Time : Mon Sep 12 13:19:00 2011
Raid Level : raid5
Raid Devices : 5
Avail Dev Size : 3906525098 (1862.78 GiB 2000.14 GB)
Array Size : 15626096640 (7451.10 GiB 8000.56 GB)
Used Dev Size : 3906524160 (1862.78 GiB 2000.14 GB)
Data Offset : 2048 sectors
Super Offset : 8 sectors
State : clean
Device UUID : 2ea285a1:a2342c24:ffec56a2:ba6fcf07
Update Time : Fri Jun 1 03:11:54 2012
Checksum : fae2ea42 - correct
Events : 2127454
Layout : left-symmetric
Chunk Size : 512K
Device Role : Active device 0
Array State : AAAA. ('A' == active, '.' == missing)
/dev/sdc3:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x0
Array UUID : 9759ad94:75e30b6b:8a726b4d:177a6eda
Name : leyline:1 (local to host leyline)
Creation Time : Mon Sep 12 13:19:00 2011
Raid Level : raid5
Raid Devices : 5
Avail Dev Size : 3906525098 (1862.78 GiB 2000.14 GB)
Array Size : 15626096640 (7451.10 GiB 8000.56 GB)
Used Dev Size : 3906524160 (1862.78 GiB 2000.14 GB)
Data Offset : 2048 sectors
Super Offset : 8 sectors
State : clean
Device UUID : 2ea285a1:a2342c24:ffec56a2:ba6fcf07
Update Time : Fri Jun 1 03:11:54 2012
Checksum : fae2ea42 - correct
Events : 2127454
Layout : left-symmetric
Chunk Size : 512K
Device Role : Active device 0
Array State : AAAA. ('A' == active, '.' == missing)
/dev/sdd3:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x0
Array UUID : 9759ad94:75e30b6b:8a726b4d:177a6eda
Name : leyline:1 (local to host leyline)
Creation Time : Mon Sep 12 13:19:00 2011
Raid Level : raid5
Raid Devices : 5
Avail Dev Size : 3906524160 (1862.78 GiB 2000.14 GB)
Array Size : 15626096640 (7451.10 GiB 8000.56 GB)
Data Offset : 1024 sectors
Super Offset : 8 sectors
State : clean
Device UUID : 8d656a1d:bbb1da37:edaf4011:1af2bbb9
Update Time : Fri Jun 1 03:11:54 2012
Checksum : ab4c6863 - correct
Events : 2127454
Layout : left-symmetric
Chunk Size : 512K
Device Role : Active device 3
Array State : AAAA. ('A' == active, '.' == missing)
/dev/sde3:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x0
Array UUID : 9759ad94:75e30b6b:8a726b4d:177a6eda
Name : leyline:1 (local to host leyline)
Creation Time : Mon Sep 12 13:19:00 2011
Raid Level : raid5
Raid Devices : 5
Avail Dev Size : 3906524160 (1862.78 GiB 2000.14 GB)
Array Size : 15626096640 (7451.10 GiB 8000.56 GB)
Data Offset : 1024 sectors
Super Offset : 8 sectors
State : clean
Device UUID : 37bb83bd:313c9381:cabff9d0:60bd205c
Update Time : Wed May 23 03:30:50 2012
Checksum : f72e6959 - correct
Events : 2004256
Layout : left-symmetric
Chunk Size : 512K
Device Role : spare
Array State : AAAA. ('A' == active, '.' == missing)
/dev/sdf:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x0
Array UUID : 9759ad94:75e30b6b:8a726b4d:177a6eda
Name : leyline:1 (local to host leyline)
Creation Time : Mon Sep 12 13:19:00 2011
Raid Level : raid5
Raid Devices : 5
Avail Dev Size : 3907027120 (1863.02 GiB 2000.40 GB)
Array Size : 15626096640 (7451.10 GiB 8000.56 GB)
Used Dev Size : 3906524160 (1862.78 GiB 2000.14 GB)
Data Offset : 2048 sectors
Super Offset : 8 sectors
State : clean
Device UUID : e16d4103:cd11cc3b:bb6ee12e:5ad0a6e9
Update Time : Fri Jun 1 03:11:54 2012
Checksum : e287a82a - correct
Events : 0
Layout : left-symmetric
Chunk Size : 512K
Device Role : spare
Array State : AAAA. ('A' == active, '.' == missing)
--
James Moore
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Data Offset 2012-06-01 23:22 freeone3000 @ 2012-06-01 23:52 ` NeilBrown 2012-06-02 0:48 ` freeone3000 0 siblings, 1 reply; 24+ messages in thread From: NeilBrown @ 2012-06-01 23:52 UTC (permalink / raw) To: freeone3000; +Cc: linux-raid [-- Attachment #1: Type: text/plain, Size: 7384 bytes --] On Fri, 1 Jun 2012 18:22:33 -0500 freeone3000 <freeone3000@gmail.com> wrote: > Hello. I have an issue concerning a broken RAID of unsure pedigree. > Examining the drives tells me the block sizes are not the same, as > listed in the email. > > > I certainly won't be easy. Though if someone did find themselves in that > > situation it might motivate me to enhance mdadm in some way to make it easily > > fixable. > > I seem to be your motivation for making this situation fixable. > Somehow I managed to get drives with an invalid block size. All worked > fine until a drive dropped out of the RAID5. When attempting to > replace, I can re-create the RAID, but it cannot be of the same size > because the 1024-sector drives are "too small" when changed to > 2048-sector, exactly as described. Are there any recovery options I > could try, including simply editing the header? You seem to be leaving out some important information. The "mdadm --examine" of all the drives is good - thanks - but what exactly if your problem, and what were you trying to do? You appear to have a 5-device RAID5 of which one device (sde3) fell out of the array on or shortly after 23rd May, 3 drives are working fine, and one - sdf (not sdf3??) - is a confused spare.... What exactly did you do to sdf? NeilBrown > > > mdadm --examine of all drives in the RAID: > > /dev/sdb3: > Magic : a92b4efc > Version : 1.2 > Feature Map : 0x0 > Array UUID : 9759ad94:75e30b6b:8a726b4d:177a6eda > Name : leyline:1 (local to host leyline) > Creation Time : Mon Sep 12 13:19:00 2011 > Raid Level : raid5 > Raid Devices : 5 > > Avail Dev Size : 3906525098 (1862.78 GiB 2000.14 GB) > Array Size : 15626096640 (7451.10 GiB 8000.56 GB) > Used Dev Size : 3906524160 (1862.78 GiB 2000.14 GB) > Data Offset : 2048 sectors > Super Offset : 8 sectors > State : clean > Device UUID : 872097fa:3ae66ab4:ed21256a:10a030c9 > > Update Time : Fri Jun 1 03:11:54 2012 > Checksum : 6d627f7a - correct > Events : 2127454 > > Layout : left-symmetric > Chunk Size : 512K > > Device Role : Active device 1 > Array State : AAAA. ('A' == active, '.' == missing) > > Magic : a92b4efc > Version : 1.2 > Feature Map : 0x0 > Array UUID : 9759ad94:75e30b6b:8a726b4d:177a6eda > Name : leyline:1 (local to host leyline) > Creation Time : Mon Sep 12 13:19:00 2011 > Raid Level : raid5 > Raid Devices : 5 > > Avail Dev Size : 3906525098 (1862.78 GiB 2000.14 GB) > Array Size : 15626096640 (7451.10 GiB 8000.56 GB) > Used Dev Size : 3906524160 (1862.78 GiB 2000.14 GB) > Data Offset : 2048 sectors > Super Offset : 8 sectors > State : clean > Device UUID : 2ea285a1:a2342c24:ffec56a2:ba6fcf07 > > Update Time : Fri Jun 1 03:11:54 2012 > Checksum : fae2ea42 - correct > Events : 2127454 > > Layout : left-symmetric > Chunk Size : 512K > > Device Role : Active device 0 > Array State : AAAA. ('A' == active, '.' 
== missing) > > /dev/sdc3: > Magic : a92b4efc > Version : 1.2 > Feature Map : 0x0 > Array UUID : 9759ad94:75e30b6b:8a726b4d:177a6eda > Name : leyline:1 (local to host leyline) > Creation Time : Mon Sep 12 13:19:00 2011 > Raid Level : raid5 > Raid Devices : 5 > > Avail Dev Size : 3906525098 (1862.78 GiB 2000.14 GB) > Array Size : 15626096640 (7451.10 GiB 8000.56 GB) > Used Dev Size : 3906524160 (1862.78 GiB 2000.14 GB) > Data Offset : 2048 sectors > Super Offset : 8 sectors > State : clean > Device UUID : 2ea285a1:a2342c24:ffec56a2:ba6fcf07 > > Update Time : Fri Jun 1 03:11:54 2012 > Checksum : fae2ea42 - correct > Events : 2127454 > > Layout : left-symmetric > Chunk Size : 512K > > Device Role : Active device 0 > Array State : AAAA. ('A' == active, '.' == missing) > > > /dev/sdd3: > Magic : a92b4efc > Version : 1.2 > Feature Map : 0x0 > Array UUID : 9759ad94:75e30b6b:8a726b4d:177a6eda > Name : leyline:1 (local to host leyline) > Creation Time : Mon Sep 12 13:19:00 2011 > Raid Level : raid5 > Raid Devices : 5 > > Avail Dev Size : 3906524160 (1862.78 GiB 2000.14 GB) > Array Size : 15626096640 (7451.10 GiB 8000.56 GB) > Data Offset : 1024 sectors > Super Offset : 8 sectors > State : clean > Device UUID : 8d656a1d:bbb1da37:edaf4011:1af2bbb9 > > Update Time : Fri Jun 1 03:11:54 2012 > Checksum : ab4c6863 - correct > Events : 2127454 > > Layout : left-symmetric > Chunk Size : 512K > > Device Role : Active device 3 > Array State : AAAA. ('A' == active, '.' == missing) > > /dev/sde3: > Magic : a92b4efc > Version : 1.2 > Feature Map : 0x0 > Array UUID : 9759ad94:75e30b6b:8a726b4d:177a6eda > Name : leyline:1 (local to host leyline) > Creation Time : Mon Sep 12 13:19:00 2011 > Raid Level : raid5 > Raid Devices : 5 > > Avail Dev Size : 3906524160 (1862.78 GiB 2000.14 GB) > Array Size : 15626096640 (7451.10 GiB 8000.56 GB) > Data Offset : 1024 sectors > Super Offset : 8 sectors > State : clean > Device UUID : 37bb83bd:313c9381:cabff9d0:60bd205c > > Update Time : Wed May 23 03:30:50 2012 > Checksum : f72e6959 - correct > Events : 2004256 > > Layout : left-symmetric > Chunk Size : 512K > > Device Role : spare > Array State : AAAA. ('A' == active, '.' == missing) > > /dev/sdf: > Magic : a92b4efc > Version : 1.2 > Feature Map : 0x0 > Array UUID : 9759ad94:75e30b6b:8a726b4d:177a6eda > Name : leyline:1 (local to host leyline) > Creation Time : Mon Sep 12 13:19:00 2011 > Raid Level : raid5 > Raid Devices : 5 > > Avail Dev Size : 3907027120 (1863.02 GiB 2000.40 GB) > Array Size : 15626096640 (7451.10 GiB 8000.56 GB) > Used Dev Size : 3906524160 (1862.78 GiB 2000.14 GB) > Data Offset : 2048 sectors > Super Offset : 8 sectors > State : clean > Device UUID : e16d4103:cd11cc3b:bb6ee12e:5ad0a6e9 > > Update Time : Fri Jun 1 03:11:54 2012 > Checksum : e287a82a - correct > Events : 0 > > Layout : left-symmetric > Chunk Size : 512K > > Device Role : spare > Array State : AAAA. ('A' == active, '.' == missing) > > -- > James Moore > > -- > James Moore > -- > To unsubscribe from this list: send the line "unsubscribe linux-raid" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 828 bytes --] ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Data Offset 2012-06-01 23:52 ` NeilBrown @ 2012-06-02 0:48 ` freeone3000 2012-06-04 3:35 ` NeilBrown 0 siblings, 1 reply; 24+ messages in thread From: freeone3000 @ 2012-06-02 0:48 UTC (permalink / raw) To: NeilBrown; +Cc: linux-raid Sorry. /dev/sde fell out of the array, so I replaced the physical drive with what is now /dev/sdf. udev may have relabelled the drive - smartctl states that the drive that is now /dev/sde works fine. /dev/sdf is a new drive. /dev/sdf has a single, whole-disk partition with type marked as raid. It is physically larger than the others. /dev/sdf1 doesn't have a mdadm superblock. /dev/sdf seems to, so I gave output of that device instead of /dev/sdf1, despite the partition. Whole-drive RAID is fine, if it gets it working. What I'm attempting to do is rebuild the RAID from the data from the other four drives, and bring the RAID back up without losing any of the data. /dev/sdb3, /dev/sdc3, /dev/sdd3, and what is now /dev/sde3 should be used to rebuild the array, with /dev/sdf as a new drive. If I can get the array back up with all my data and all five drives in use, I'll be very happy. On Fri, Jun 1, 2012 at 6:52 PM, NeilBrown <neilb@suse.de> wrote: > On Fri, 1 Jun 2012 18:22:33 -0500 freeone3000 <freeone3000@gmail.com> wrote: > >> Hello. I have an issue concerning a broken RAID of unsure pedigree. >> Examining the drives tells me the block sizes are not the same, as >> listed in the email. >> >> > I certainly won't be easy. Though if someone did find themselves in that >> > situation it might motivate me to enhance mdadm in some way to make it easily >> > fixable. >> >> I seem to be your motivation for making this situation fixable. >> Somehow I managed to get drives with an invalid block size. All worked >> fine until a drive dropped out of the RAID5. When attempting to >> replace, I can re-create the RAID, but it cannot be of the same size >> because the 1024-sector drives are "too small" when changed to >> 2048-sector, exactly as described. Are there any recovery options I >> could try, including simply editing the header? > > You seem to be leaving out some important information. > The "mdadm --examine" of all the drives is good - thanks - but what exactly > if your problem, and what were you trying to do? > > You appear to have a 5-device RAID5 of which one device (sde3) fell out of > the array on or shortly after 23rd May, 3 drives are working fine, and one - > sdf (not sdf3??) - is a confused spare.... > > What exactly did you do to sdf? > > NeilBrown > > >> >> >> mdadm --examine of all drives in the RAID: >> >> /dev/sdb3: >> Magic : a92b4efc >> Version : 1.2 >> Feature Map : 0x0 >> Array UUID : 9759ad94:75e30b6b:8a726b4d:177a6eda >> Name : leyline:1 (local to host leyline) >> Creation Time : Mon Sep 12 13:19:00 2011 >> Raid Level : raid5 >> Raid Devices : 5 >> >> Avail Dev Size : 3906525098 (1862.78 GiB 2000.14 GB) >> Array Size : 15626096640 (7451.10 GiB 8000.56 GB) >> Used Dev Size : 3906524160 (1862.78 GiB 2000.14 GB) >> Data Offset : 2048 sectors >> Super Offset : 8 sectors >> State : clean >> Device UUID : 872097fa:3ae66ab4:ed21256a:10a030c9 >> >> Update Time : Fri Jun 1 03:11:54 2012 >> Checksum : 6d627f7a - correct >> Events : 2127454 >> >> Layout : left-symmetric >> Chunk Size : 512K >> >> Device Role : Active device 1 >> Array State : AAAA. ('A' == active, '.' 
== missing) >> >> Magic : a92b4efc >> Version : 1.2 >> Feature Map : 0x0 >> Array UUID : 9759ad94:75e30b6b:8a726b4d:177a6eda >> Name : leyline:1 (local to host leyline) >> Creation Time : Mon Sep 12 13:19:00 2011 >> Raid Level : raid5 >> Raid Devices : 5 >> >> Avail Dev Size : 3906525098 (1862.78 GiB 2000.14 GB) >> Array Size : 15626096640 (7451.10 GiB 8000.56 GB) >> Used Dev Size : 3906524160 (1862.78 GiB 2000.14 GB) >> Data Offset : 2048 sectors >> Super Offset : 8 sectors >> State : clean >> Device UUID : 2ea285a1:a2342c24:ffec56a2:ba6fcf07 >> >> Update Time : Fri Jun 1 03:11:54 2012 >> Checksum : fae2ea42 - correct >> Events : 2127454 >> >> Layout : left-symmetric >> Chunk Size : 512K >> >> Device Role : Active device 0 >> Array State : AAAA. ('A' == active, '.' == missing) >> >> /dev/sdc3: >> Magic : a92b4efc >> Version : 1.2 >> Feature Map : 0x0 >> Array UUID : 9759ad94:75e30b6b:8a726b4d:177a6eda >> Name : leyline:1 (local to host leyline) >> Creation Time : Mon Sep 12 13:19:00 2011 >> Raid Level : raid5 >> Raid Devices : 5 >> >> Avail Dev Size : 3906525098 (1862.78 GiB 2000.14 GB) >> Array Size : 15626096640 (7451.10 GiB 8000.56 GB) >> Used Dev Size : 3906524160 (1862.78 GiB 2000.14 GB) >> Data Offset : 2048 sectors >> Super Offset : 8 sectors >> State : clean >> Device UUID : 2ea285a1:a2342c24:ffec56a2:ba6fcf07 >> >> Update Time : Fri Jun 1 03:11:54 2012 >> Checksum : fae2ea42 - correct >> Events : 2127454 >> >> Layout : left-symmetric >> Chunk Size : 512K >> >> Device Role : Active device 0 >> Array State : AAAA. ('A' == active, '.' == missing) >> >> >> /dev/sdd3: >> Magic : a92b4efc >> Version : 1.2 >> Feature Map : 0x0 >> Array UUID : 9759ad94:75e30b6b:8a726b4d:177a6eda >> Name : leyline:1 (local to host leyline) >> Creation Time : Mon Sep 12 13:19:00 2011 >> Raid Level : raid5 >> Raid Devices : 5 >> >> Avail Dev Size : 3906524160 (1862.78 GiB 2000.14 GB) >> Array Size : 15626096640 (7451.10 GiB 8000.56 GB) >> Data Offset : 1024 sectors >> Super Offset : 8 sectors >> State : clean >> Device UUID : 8d656a1d:bbb1da37:edaf4011:1af2bbb9 >> >> Update Time : Fri Jun 1 03:11:54 2012 >> Checksum : ab4c6863 - correct >> Events : 2127454 >> >> Layout : left-symmetric >> Chunk Size : 512K >> >> Device Role : Active device 3 >> Array State : AAAA. ('A' == active, '.' == missing) >> >> /dev/sde3: >> Magic : a92b4efc >> Version : 1.2 >> Feature Map : 0x0 >> Array UUID : 9759ad94:75e30b6b:8a726b4d:177a6eda >> Name : leyline:1 (local to host leyline) >> Creation Time : Mon Sep 12 13:19:00 2011 >> Raid Level : raid5 >> Raid Devices : 5 >> >> Avail Dev Size : 3906524160 (1862.78 GiB 2000.14 GB) >> Array Size : 15626096640 (7451.10 GiB 8000.56 GB) >> Data Offset : 1024 sectors >> Super Offset : 8 sectors >> State : clean >> Device UUID : 37bb83bd:313c9381:cabff9d0:60bd205c >> >> Update Time : Wed May 23 03:30:50 2012 >> Checksum : f72e6959 - correct >> Events : 2004256 >> >> Layout : left-symmetric >> Chunk Size : 512K >> >> Device Role : spare >> Array State : AAAA. ('A' == active, '.' 
== missing) >> >> /dev/sdf: >> Magic : a92b4efc >> Version : 1.2 >> Feature Map : 0x0 >> Array UUID : 9759ad94:75e30b6b:8a726b4d:177a6eda >> Name : leyline:1 (local to host leyline) >> Creation Time : Mon Sep 12 13:19:00 2011 >> Raid Level : raid5 >> Raid Devices : 5 >> >> Avail Dev Size : 3907027120 (1863.02 GiB 2000.40 GB) >> Array Size : 15626096640 (7451.10 GiB 8000.56 GB) >> Used Dev Size : 3906524160 (1862.78 GiB 2000.14 GB) >> Data Offset : 2048 sectors >> Super Offset : 8 sectors >> State : clean >> Device UUID : e16d4103:cd11cc3b:bb6ee12e:5ad0a6e9 >> >> Update Time : Fri Jun 1 03:11:54 2012 >> Checksum : e287a82a - correct >> Events : 0 >> >> Layout : left-symmetric >> Chunk Size : 512K >> >> Device Role : spare >> Array State : AAAA. ('A' == active, '.' == missing) >> >> -- >> James Moore >> >> -- >> James Moore >> -- >> To unsubscribe from this list: send the line "unsubscribe linux-raid" in >> the body of a message to majordomo@vger.kernel.org >> More majordomo info at http://vger.kernel.org/majordomo-info.html > -- James Moore -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Data Offset
2012-06-02 0:48 ` freeone3000
@ 2012-06-04 3:35 ` NeilBrown
2012-06-04 18:26 ` Pierre Beck
0 siblings, 1 reply; 24+ messages in thread
From: NeilBrown @ 2012-06-04 3:35 UTC (permalink / raw)
To: freeone3000; +Cc: linux-raid

[-- Attachment #1: Type: text/plain, Size: 1884 bytes --]

On Fri, 1 Jun 2012 19:48:41 -0500 freeone3000 <freeone3000@gmail.com> wrote:

> Sorry.
>
> /dev/sde fell out of the array, so I replaced the physical drive with
> what is now /dev/sdf. udev may have relabelled the drive - smartctl
> states that the drive that is now /dev/sde works fine.
> /dev/sdf is a new drive. /dev/sdf has a single, whole-disk partition
> with type marked as raid. It is physically larger than the others.
>
> /dev/sdf1 doesn't have a mdadm superblock. /dev/sdf seems to, so I
> gave output of that device instead of /dev/sdf1, despite the
> partition. Whole-drive RAID is fine, if it gets it working.
>
> What I'm attempting to do is rebuild the RAID from the data from the
> other four drives, and bring the RAID back up without losing any of
> the data. /dev/sdb3, /dev/sdc3, /dev/sdd3, and what is now /dev/sde3
> should be used to rebuild the array, with /dev/sdf as a new drive. If
> I can get the array back up with all my data and all five drives in
> use, I'll be very happy.

You appear to have 3 devices that are happy:
  sdc3 is device 0, data-offset 2048
  sdb3 is device 1, data-offset 2048
  sdd3 is device 3, data-offset 1024

Nothing claims to be device 2 or 4.

sde3 looks like it was last in the array on 23rd May, a little over
a week before your report. Could that have been when "sde fell out of the
array"??
Is it possible that you replaced the wrong device?
Or is it possible that the array was degraded when sde "fell out", resulting
in data loss?

I need more precise history to understand what happened, as I cannot suggest
a fix until I have that understanding.

When did the array fail?
How certain are you that you replaced the correct device?
Can you examine the drive that you removed and see what it says?
Are you certain that the array wasn't already degraded?

NeilBrown

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 828 bytes --]

^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Data Offset 2012-06-04 3:35 ` NeilBrown @ 2012-06-04 18:26 ` Pierre Beck 2012-06-04 22:57 ` NeilBrown 0 siblings, 1 reply; 24+ messages in thread From: Pierre Beck @ 2012-06-04 18:26 UTC (permalink / raw) To: NeilBrown; +Cc: freeone3000, linux-raid I'll try and clear up some confusion (I was in IRC with freeone3000). /dev/sdf is an empty drive, a replacement for a failed drive. The Array attempted to assemble, but failed and reported one drive as spare. This is the moment we saved the --examines. In expectation of a lost write due to drive write-cache, we executed --assemble --force, which kicked another drive. @James: remove /dev/sdf for now and replace /dev/sde3, which indeed has a very outdated update time, with the non-present drive. Post an --examine of that drive. It should report update time Jun 1st. We tried to re-create the array with --assume-clean. But mdadm chose a different data offset for the drives. A re-create with proper data offset will be necessary. Greetings, Pierre Beck Am 04.06.2012 05:35, schrieb NeilBrown: > On Fri, 1 Jun 2012 19:48:41 -0500 freeone3000<freeone3000@gmail.com> wrote: > >> Sorry. >> >> /dev/sde fell out of the array, so I replaced the physical drive with >> what is now /dev/sdf. udev may have relabelled the drive - smartctl >> states that the drive that is now /dev/sde works fine. >> /dev/sdf is a new drive. /dev/sdf has a single, whole-disk partition >> with type marked as raid. It is physically larger than the others. >> >> /dev/sdf1 doesn't have a mdadm superblock. /dev/sdf seems to, so I >> gave output of that device instead of /dev/sdf1, despite the >> partition. Whole-drive RAID is fine, if it gets it working. >> >> What I'm attempting to do is rebuild the RAID from the data from the >> other four drives, and bring the RAID back up without losing any of >> the data. /dev/sdb3, /dev/sdc3, /dev/sdd3, and what is now /dev/sde3 >> should be used to rebuild the array, with /dev/sdf as a new drive. If >> I can get the array back up with all my data and all five drives in >> use, I'll be very happy. > You appear to have 3 devices that are happy: > sdc3 is device 0 data-offset 2048 > sdb3 is device 1 data-offset 2048 > sdd3 is device 3 data-offset 1024 > > nothing claims to be device 2 or 4. > > sde3 looks like it was last in the array on 23rd May, a little over > a week before your report. Could that have been when "sde fell out of the > array" ?? > Is it possible that you replaced the wrong device? > Or is it possible the the array was degraded when sde "fell out" resulting > in data loss? > > I need more precise history to understand what happened, as I cannot suggest > a fixed until I have that understanding. > > When did the array fail? > How certain are you that you replaced the correct device? > Can you examine the drive that you removed and see what it says? > Are you certain that the array wasn't already degraded? > > NeilBrown > ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Data Offset 2012-06-04 18:26 ` Pierre Beck @ 2012-06-04 22:57 ` NeilBrown 2012-06-05 5:26 ` freeone3000 2014-02-24 11:22 ` wiebittewas 0 siblings, 2 replies; 24+ messages in thread From: NeilBrown @ 2012-06-04 22:57 UTC (permalink / raw) To: Pierre Beck; +Cc: freeone3000, linux-raid [-- Attachment #1: Type: text/plain, Size: 3628 bytes --] On Mon, 04 Jun 2012 20:26:05 +0200 Pierre Beck <mail@pierre-beck.de> wrote: > I'll try and clear up some confusion (I was in IRC with freeone3000). > > /dev/sdf is an empty drive, a replacement for a failed drive. The Array > attempted to assemble, but failed and reported one drive as spare. This > is the moment we saved the --examines. > > In expectation of a lost write due to drive write-cache, we executed > --assemble --force, which kicked another drive. > > @James: remove /dev/sdf for now and replace /dev/sde3, which indeed has > a very outdated update time, with the non-present drive. Post an > --examine of that drive. It should report update time Jun 1st. > > We tried to re-create the array with --assume-clean. But mdadm chose a > different data offset for the drives. A re-create with proper data > offset will be necessary. OK, try: git clone -b data_offset git://neil.brown.name/mdadm cd mdadm make ./mdadm -C /dev/md1 -e 1.2 -l 5 -n 5 --assume-clean -c 512 \ /dev/sdc3:2048s /dev/sdb3:2048s ??? /dev/sdd3:1024s ??? The number after ':' after a device name is a data offset. 's' means sectors. With out 's' it means Kilobytes. I don't know what should be at slot 2 or 4 so I put '???'. You should fill it in. You should also double check the command and double check the names of your devices. Don't install this mdadm, and don't use it for anything other than re-creating this array. Good luck. NeilBrown > > Greetings, > > Pierre Beck > > > Am 04.06.2012 05:35, schrieb NeilBrown: > > On Fri, 1 Jun 2012 19:48:41 -0500 freeone3000<freeone3000@gmail.com> wrote: > > > >> Sorry. > >> > >> /dev/sde fell out of the array, so I replaced the physical drive with > >> what is now /dev/sdf. udev may have relabelled the drive - smartctl > >> states that the drive that is now /dev/sde works fine. > >> /dev/sdf is a new drive. /dev/sdf has a single, whole-disk partition > >> with type marked as raid. It is physically larger than the others. > >> > >> /dev/sdf1 doesn't have a mdadm superblock. /dev/sdf seems to, so I > >> gave output of that device instead of /dev/sdf1, despite the > >> partition. Whole-drive RAID is fine, if it gets it working. > >> > >> What I'm attempting to do is rebuild the RAID from the data from the > >> other four drives, and bring the RAID back up without losing any of > >> the data. /dev/sdb3, /dev/sdc3, /dev/sdd3, and what is now /dev/sde3 > >> should be used to rebuild the array, with /dev/sdf as a new drive. If > >> I can get the array back up with all my data and all five drives in > >> use, I'll be very happy. > > You appear to have 3 devices that are happy: > > sdc3 is device 0 data-offset 2048 > > sdb3 is device 1 data-offset 2048 > > sdd3 is device 3 data-offset 1024 > > > > nothing claims to be device 2 or 4. > > > > sde3 looks like it was last in the array on 23rd May, a little over > > a week before your report. Could that have been when "sde fell out of the > > array" ?? > > Is it possible that you replaced the wrong device? > > Or is it possible the the array was degraded when sde "fell out" resulting > > in data loss? 
> > > > I need more precise history to understand what happened, as I cannot suggest > > a fixed until I have that understanding. > > > > When did the array fail? > > How certain are you that you replaced the correct device? > > Can you examine the drive that you removed and see what it says? > > Are you certain that the array wasn't already degraded? > > > > NeilBrown > > [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 828 bytes --] ^ permalink raw reply [flat|nested] 24+ messages in thread
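Before running a re-create like the one above, it is worth keeping a copy of every member's current superblock, as was done earlier in this thread when the first --examine outputs were collected. Something along these lines (the device list is illustrative):

  for d in /dev/sdb3 /dev/sdc3 /dev/sdd3 /dev/sde3 /dev/sdf; do
      mdadm --examine "$d" > "examine-$(basename "$d").txt"
  done

A re-create rewrites the member superblocks, so these files are the only record of the old data offsets and device roles if another attempt is needed.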
* Re: Data Offset 2012-06-04 22:57 ` NeilBrown @ 2012-06-05 5:26 ` freeone3000 2012-06-05 5:44 ` NeilBrown 2014-02-24 11:22 ` wiebittewas 1 sibling, 1 reply; 24+ messages in thread From: freeone3000 @ 2012-06-05 5:26 UTC (permalink / raw) To: NeilBrown; +Cc: Pierre Beck, linux-raid Swapped out the new drive for the old. The new drive is still labelled as /dev/sdf, hopefully.. I decided to check times before proceeding, and to make sure the drives were in the right order. I corrected them to go by `mdadm --examine`'s output as best I could. Here's the output of `mdadm --examine /dev/sdf` and the result of executing the given `./mdadm` command (with re-ordered drives), The binary compiled from git sources crashed with a segmentation fault while attempting to print out a failure writing the superblock. I've tried the drives (with proper sizes) in other combinations, according to both what you posted and what mdadm --examine says the "Device Role" is. I haven't found a working combination; is it possible my drives got swapped around on reboot? There's a re-run of mdadm --examine at the end of my post. root@leyline:~/mdadm# mdadm --examine /dev/sdf /dev/sdf: Magic : a92b4efc Version : 1.2 Feature Map : 0x0 Array UUID : 9759ad94:75e30b6b:8a726b4d:177a6eda Name : leyline:1 (local to host leyline) Creation Time : Mon Sep 12 13:19:00 2011 Raid Level : raid5 Raid Devices : 5 Avail Dev Size : 3907027120 (1863.02 GiB 2000.40 GB) Array Size : 7813048320 (7451.10 GiB 8000.56 GB) Used Dev Size : 3906524160 (1862.78 GiB 2000.14 GB) Data Offset : 2048 sectors Super Offset : 8 sectors State : clean Device UUID : 2edc16c6:cf45ad32:04b026a4:956ce78b Update Time : Fri Jun 1 03:11:54 2012 Checksum : b3e49e59 - correct Events : 2127454 Layout : left-symmetric Chunk Size : 512K Device Role : Active device 2 Array State : AAAA. ('A' == active, '.' == missing) root@leyline:~/mdadm# ./mdadm -C /dev/md1 -e 1.2 -l 5 -n 5 --assume-clean -c 512 /dev/sdc3:2048s /dev/sdf:2048s /dev/sdb3:2048s /dev/sdd3:2048s /dev/sde3:1024s mdadm: /dev/sdc3 appears to be part of a raid array: level=raid5 devices=5 ctime=Tue Jun 5 00:10:46 2012 mdadm: /dev/sdf appears to contain an ext2fs file system size=242788K mtime=Fri Oct 7 16:55:40 2011 mdadm: /dev/sdf appears to be part of a raid array: level=raid5 devices=5 ctime=Mon Sep 12 13:19:00 2011 mdadm: /dev/sdb3 appears to be part of a raid array: level=raid5 devices=5 ctime=Tue Jun 5 00:10:46 2012 mdadm: /dev/sdd3 appears to be part of a raid array: level=raid5 devices=5 ctime=Tue Jun 5 00:10:46 2012 Continue creating array? yes Segmentation fault Since I couldn't find any fault with running it again (but I am not a smart man, or I would not be in this position), I decided to run valgrind over it: ==3206== Memcheck, a memory error detector ==3206== Copyright (C) 2002-2010, and GNU GPL'd, by Julian Seward et al. ==3206== Using Valgrind-3.6.0.SVN-Debian and LibVEX; rerun with -h for copyright info ==3206== Command: ./mdadm -C /dev/md1 -e 1.2 -l 5 -n 5 --assume-clean -c 512 /dev/sdc3:2048s /dev/sdf:2048s /dev/sdb3:2048s /dev/sdd3:2048s /dev/sde3:1024s ==3206== ==3206== Warning: noted but unhandled ioctl 0x1261 with no size/direction hints ==3206== This could cause spurious value errors to appear. ==3206== See README_MISSING_SYSCALL_OR_IOCTL for guidance on writing a proper wrapper. ==3206== Warning: noted but unhandled ioctl 0x1261 with no size/direction hints ==3206== This could cause spurious value errors to appear. 
==3206== See README_MISSING_SYSCALL_OR_IOCTL for guidance on writing a proper wrapper. ==3206== Warning: noted but unhandled ioctl 0x1261 with no size/direction hints ==3206== This could cause spurious value errors to appear. ==3206== See README_MISSING_SYSCALL_OR_IOCTL for guidance on writing a proper wrapper. mdadm: /dev/sdc3 appears to be part of a raid array: level=raid5 devices=5 ctime=Tue Jun 5 00:14:20 2012 mdadm: /dev/sdf appears to contain an ext2fs file system size=242788K mtime=Fri Oct 7 16:55:40 2011 mdadm: /dev/sdf appears to be part of a raid array: level=raid5 devices=5 ctime=Tue Jun 5 00:14:20 2012 mdadm: /dev/sdb3 appears to be part of a raid array: level=raid5 devices=5 ctime=Tue Jun 5 00:14:20 2012 mdadm: /dev/sdd3 appears to be part of a raid array: level=raid5 devices=5 ctime=Tue Jun 5 00:14:20 2012 Continue creating array? ==3206== Invalid read of size 8 ==3206== at 0x43C9B7: write_init_super1 (super1.c:1327) ==3206== by 0x41F1B9: Create (Create.c:951) ==3206== by 0x407231: main (mdadm.c:1464) ==3206== Address 0x8 is not stack'd, malloc'd or (recently) free'd ==3206== ==3206== ==3206== Process terminating with default action of signal 11 (SIGSEGV) ==3206== Access not within mapped region at address 0x8 ==3206== at 0x43C9B7: write_init_super1 (super1.c:1327) ==3206== by 0x41F1B9: Create (Create.c:951) ==3206== by 0x407231: main (mdadm.c:1464) ==3206== If you believe this happened as a result of a stack ==3206== overflow in your program's main thread (unlikely but ==3206== possible), you can try to increase the size of the ==3206== main thread stack using the --main-stacksize= flag. ==3206== The main thread stack size used in this run was 8388608. ==3206== ==3206== HEAP SUMMARY: ==3206== in use at exit: 37,033 bytes in 350 blocks ==3206== total heap usage: 673 allocs, 323 frees, 4,735,171 bytes allocated ==3206== ==3206== LEAK SUMMARY: ==3206== definitely lost: 832 bytes in 8 blocks ==3206== indirectly lost: 18,464 bytes in 4 blocks ==3206== possibly lost: 0 bytes in 0 blocks ==3206== still reachable: 17,737 bytes in 338 blocks ==3206== suppressed: 0 bytes in 0 blocks ==3206== Rerun with --leak-check=full to see details of leaked memory ==3206== ==3206== For counts of detected and suppressed errors, rerun with: -v ==3206== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 4 from 4) mdadm --examine of all my drives (again): /dev/sdb3: Magic : a92b4efc Version : 1.2 Feature Map : 0x0 Array UUID : 9ed0b17f:e9a7a813:a9139679:4f8f999b Name : leyline:1 (local to host leyline) Creation Time : Tue Jun 5 00:14:35 2012 Raid Level : raid5 Raid Devices : 5 Avail Dev Size : 3906525098 (1862.78 GiB 2000.14 GB) Array Size : 7813046272 (7451.10 GiB 8000.56 GB) Used Dev Size : 3906523136 (1862.78 GiB 2000.14 GB) Data Offset : 2048 sectors Super Offset : 8 sectors State : clean Device UUID : feb94069:b2afeb6e:ae6b2af2:f9e3cee4 Update Time : Tue Jun 5 00:14:35 2012 Checksum : 1909d79e - correct Events : 0 Layout : left-symmetric Chunk Size : 512K Device Role : Active device 2 Array State : AAAAA ('A' == active, '.' 
== missing) /dev/sdc3: Magic : a92b4efc Version : 1.2 Feature Map : 0x0 Array UUID : 9ed0b17f:e9a7a813:a9139679:4f8f999b Name : leyline:1 (local to host leyline) Creation Time : Tue Jun 5 00:14:35 2012 Raid Level : raid5 Raid Devices : 5 Avail Dev Size : 3906525098 (1862.78 GiB 2000.14 GB) Array Size : 7813046272 (7451.10 GiB 8000.56 GB) Used Dev Size : 3906523136 (1862.78 GiB 2000.14 GB) Data Offset : 2048 sectors Super Offset : 8 sectors State : clean Device UUID : ed6116ac:4f91c2dd:4ada53df:0e14fc2a Update Time : Tue Jun 5 00:14:35 2012 Checksum : fe0cffd8 - correct Events : 0 Layout : left-symmetric Chunk Size : 512K Device Role : Active device 0 Array State : AAAAA ('A' == active, '.' == missing) /dev/sdd3: Magic : a92b4efc Version : 1.2 Feature Map : 0x0 Array UUID : 9ed0b17f:e9a7a813:a9139679:4f8f999b Name : leyline:1 (local to host leyline) Creation Time : Tue Jun 5 00:14:35 2012 Raid Level : raid5 Raid Devices : 5 Avail Dev Size : 3906523136 (1862.78 GiB 2000.14 GB) Array Size : 7813046272 (7451.10 GiB 8000.56 GB) Data Offset : 2048 sectors Super Offset : 8 sectors State : clean Device UUID : d839ca02:1d14cde3:65b54275:8caa0275 Update Time : Tue Jun 5 00:14:35 2012 Checksum : 3ac0c483 - correct Events : 0 Layout : left-symmetric Chunk Size : 512K Device Role : Active device 3 Array State : AAAAA ('A' == active, '.' == missing) mdadm: No md superblock detected on /dev/sde3. On Mon, Jun 4, 2012 at 5:57 PM, NeilBrown <neilb@suse.de> wrote: > On Mon, 04 Jun 2012 20:26:05 +0200 Pierre Beck <mail@pierre-beck.de> wrote: > >> I'll try and clear up some confusion (I was in IRC with freeone3000). >> >> /dev/sdf is an empty drive, a replacement for a failed drive. The Array >> attempted to assemble, but failed and reported one drive as spare. This >> is the moment we saved the --examines. >> >> In expectation of a lost write due to drive write-cache, we executed >> --assemble --force, which kicked another drive. >> >> @James: remove /dev/sdf for now and replace /dev/sde3, which indeed has >> a very outdated update time, with the non-present drive. Post an >> --examine of that drive. It should report update time Jun 1st. >> >> We tried to re-create the array with --assume-clean. But mdadm chose a >> different data offset for the drives. A re-create with proper data >> offset will be necessary. > > OK, try: > > git clone -b data_offset git://neil.brown.name/mdadm > cd mdadm > make > > ./mdadm -C /dev/md1 -e 1.2 -l 5 -n 5 --assume-clean -c 512 \ > /dev/sdc3:2048s /dev/sdb3:2048s ??? /dev/sdd3:1024s ??? > > The number after ':' after a device name is a data offset. 's' means sectors. > With out 's' it means Kilobytes. > I don't know what should be at slot 2 or 4 so I put '???'. You should fill it > in. You should also double check the command and double check the names of > your devices. > Don't install this mdadm, and don't use it for anything other than > re-creating this array. > > Good luck. > > NeilBrown > >> >> Greetings, >> >> Pierre Beck >> >> >> Am 04.06.2012 05:35, schrieb NeilBrown: >> > On Fri, 1 Jun 2012 19:48:41 -0500 freeone3000<freeone3000@gmail.com> wrote: >> > >> >> Sorry. >> >> >> >> /dev/sde fell out of the array, so I replaced the physical drive with >> >> what is now /dev/sdf. udev may have relabelled the drive - smartctl >> >> states that the drive that is now /dev/sde works fine. >> >> /dev/sdf is a new drive. /dev/sdf has a single, whole-disk partition >> >> with type marked as raid. It is physically larger than the others. 
>> >> >> >> /dev/sdf1 doesn't have a mdadm superblock. /dev/sdf seems to, so I >> >> gave output of that device instead of /dev/sdf1, despite the >> >> partition. Whole-drive RAID is fine, if it gets it working. >> >> >> >> What I'm attempting to do is rebuild the RAID from the data from the >> >> other four drives, and bring the RAID back up without losing any of >> >> the data. /dev/sdb3, /dev/sdc3, /dev/sdd3, and what is now /dev/sde3 >> >> should be used to rebuild the array, with /dev/sdf as a new drive. If >> >> I can get the array back up with all my data and all five drives in >> >> use, I'll be very happy. >> > You appear to have 3 devices that are happy: >> > sdc3 is device 0 data-offset 2048 >> > sdb3 is device 1 data-offset 2048 >> > sdd3 is device 3 data-offset 1024 >> > >> > nothing claims to be device 2 or 4. >> > >> > sde3 looks like it was last in the array on 23rd May, a little over >> > a week before your report. Could that have been when "sde fell out of the >> > array" ?? >> > Is it possible that you replaced the wrong device? >> > Or is it possible the the array was degraded when sde "fell out" resulting >> > in data loss? >> > >> > I need more precise history to understand what happened, as I cannot suggest >> > a fixed until I have that understanding. >> > >> > When did the array fail? >> > How certain are you that you replaced the correct device? >> > Can you examine the drive that you removed and see what it says? >> > Are you certain that the array wasn't already degraded? >> > >> > NeilBrown >> > > -- James Moore -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Data Offset 2012-06-05 5:26 ` freeone3000 @ 2012-06-05 5:44 ` NeilBrown [not found] ` <CAFhY2CiDTMRSV2wFCMhT9ZstUkHkJS7E0p7SP-ssfqwaquo+0w@mail.gmail.com> 0 siblings, 1 reply; 24+ messages in thread From: NeilBrown @ 2012-06-05 5:44 UTC (permalink / raw) To: freeone3000; +Cc: Pierre Beck, linux-raid [-- Attachment #1: Type: text/plain, Size: 683 bytes --] On Tue, 5 Jun 2012 00:26:20 -0500 freeone3000 <freeone3000@gmail.com> wrote: > ==3206== at 0x43C9B7: write_init_super1 (super1.c:1327) > ==3206== by 0x41F1B9: Create (Create.c:951) > ==3206== by 0x407231: main (mdadm.c:1464) > ==3206== Address 0x8 is not stack'd, malloc'd or (recently) free'd Ahhh.. I know that bug. If you "git pull; make" it should go away. However the bug happens when printing an error message, so it won't make things actually work. It seems that it is failing to write a superblock to /dev/sde3. I wonder why. Maybe if you run it under strace: strace -o /tmp/str ./mdadm -C ..... then post the "/tmp/str" file. NeilBrown [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 828 bytes --] ^ permalink raw reply [flat|nested] 24+ messages in thread
[parent not found: <CAFhY2CiDTMRSV2wFCMhT9ZstUkHkJS7E0p7SP-ssfqwaquo+0w@mail.gmail.com>]
[parent not found: <20120610074531.65eaed81@notabene.brown>]
[parent not found: <CAFhY2CgxkjH6JvJzvQt9XT0oawntK7YoTFqnXQJGzvqthD8XpQ@mail.gmail.com>]
* Re: Data Offset
[not found] ` <CAFhY2CgxkjH6JvJzvQt9XT0oawntK7YoTFqnXQJGzvqthD8XpQ@mail.gmail.com>
@ 2012-06-13 9:46 ` Pierre Beck
2012-06-13 12:49 ` Phil Turmel
0 siblings, 1 reply; 24+ messages in thread
From: Pierre Beck @ 2012-06-13 9:46 UTC (permalink / raw)
To: freeone3000; +Cc: NeilBrown, linux-raid

You specified the same offset for all drives, which is wrong. Your initial
drive setup had differing offsets - look at the examines. In summary, we
have this information:

Drive 0: offset 2048
Drive 1: offset 2048
Drive 2: offset 2048
Drive 3: offset 1024
Drive 4: -dead-

The order was also wrong. The unpartitioned drive was active device 2.
The /dev/sdX ordering information we got is like:

Drive 0: ?
Drive 1: ?
Drive 2: /dev/sde
Drive 3: ?
Drive 4: missing

That makes 3 variables. You can trial-and-error the order of drives 0, 1
and 3, but make sure the offset of drive 3 is always 1024. You'd shift
the device names only.

My first try would be:
/dev/sdc3:2048s /dev/sdb3:2048s /dev/sde:2048s /dev/sdd3:1024s missing

My second try would be:
/dev/sdb3:2048s /dev/sdc3:2048s /dev/sde:2048s /dev/sdd3:1024s missing

... and so on.

And DON'T run fsck early. Try mounting read-only, take a look at your
data. Something bigger than chunk size, like a movie file, or checksum
some ISO or similar. It should be intact. THEN run fsck. fsck cannot
expect RAID stripe reordering and in the worst case may cause damage
(not that I have heard of it happening, but inspecting data first is safe).

Good luck,

Pierre Beck

Am 13.06.2012 08:57, schrieb freeone3000:
> /dev/sdc3:2048s /dev/sdb3:2048s /dev/sdd3:2048s /dev/sde:2048s

^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Data Offset
2012-06-13 9:46 ` Pierre Beck
@ 2012-06-13 12:49 ` Phil Turmel
2012-06-13 17:56 ` Pierre Beck
0 siblings, 1 reply; 24+ messages in thread
From: Phil Turmel @ 2012-06-13 12:49 UTC (permalink / raw)
To: Pierre Beck; +Cc: freeone3000, NeilBrown, linux-raid

On 06/13/2012 05:46 AM, Pierre Beck wrote:

[trim /]

> And DON'T run fsck early. Try mounting read-only, take a look at your
> data. Something bigger than chunk size, like a movie file, or checksum
> some ISO or similar. It should be intact. THEN run fsck. fsck cannot
> expect RAID stripe reordering and in the worst case may cause damage
> (not that I have heard of it happening, but inspecting data first is safe).

This is *dangerous* advice. Modern filesystems will replay their journal
even when mounted read-only. When attempting to determine the member
order, mounting the file system is *not* safe. And some filesystems
ignore the read-only status of the array, so that won't avoid the problem.

fsck -n is the *only* safe way to automate the check. Hex dumps of
expected signature blocks are even better, but are difficult to automate.

HTH,

Phil

^ permalink raw reply [flat|nested] 24+ messages in thread
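A minimal sketch of that check, using the first candidate order suggested in the previous message and assuming an ext3/ext4 filesystem sitting directly on the array (with LVM in between, the volume group would have to be activated and fsck pointed at the logical volume instead; see the discussion below):

  mdadm --stop /dev/md1
  ./mdadm -C /dev/md1 -e 1.2 -l 5 -n 5 --assume-clean -c 512 \
      /dev/sdc3:2048s /dev/sdb3:2048s /dev/sde:2048s /dev/sdd3:1024s missing
  fsck.ext4 -n -f /dev/md1   # -n answers "no" to every repair prompt, so nothing is written to the filesystem
  mdadm --stop /dev/md1      # stop again before trying the next candidate order

Note that each re-create still rewrites the member superblocks, so this only automates the read-only filesystem check, not the metadata writes.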
* Re: Data Offset 2012-06-13 12:49 ` Phil Turmel @ 2012-06-13 17:56 ` Pierre Beck 2012-06-13 18:11 ` Phil Turmel 0 siblings, 1 reply; 24+ messages in thread From: Pierre Beck @ 2012-06-13 17:56 UTC (permalink / raw) To: Phil Turmel; +Cc: linux-raid Am 13.06.2012 14:49, schrieb Phil Turmel: > > fsck -n is the *only* safe way to automate the check. Hex dumps of > expected signature blocks is even better, but is difficult to automate. The journal playback seemed less dangerous than fsck in general, but fsck -n is even better, agreed. Greetings, Pierre Beck ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Data Offset
2012-06-13 17:56 ` Pierre Beck
@ 2012-06-13 18:11 ` Phil Turmel
2012-06-13 18:22 ` Pierre Beck
0 siblings, 1 reply; 24+ messages in thread
From: Phil Turmel @ 2012-06-13 18:11 UTC (permalink / raw)
To: Pierre Beck; +Cc: linux-raid

On 06/13/2012 01:56 PM, Pierre Beck wrote:
> Am 13.06.2012 14:49, schrieb Phil Turmel:
>>
>> fsck -n is the *only* safe way to automate the check. Hex dumps of
>> expected signature blocks are even better, but are difficult to automate.
>
> The journal playback seemed less dangerous than fsck in general, but
> fsck -n is even better, agreed.

It's not a question of more or less dangerous. If the stripes are out of
order, writing to the array is destructive. Not only will the writes
probably land in the wrong place on the device, but writing triggers
recalculated parity, which also probably stomps on other data.

Phil

^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Data Offset 2012-06-13 18:11 ` Phil Turmel @ 2012-06-13 18:22 ` Pierre Beck 2012-06-13 18:49 ` Piergiorgio Sartor 0 siblings, 1 reply; 24+ messages in thread From: Pierre Beck @ 2012-06-13 18:22 UTC (permalink / raw) To: Phil Turmel; +Cc: linux-raid I wonder if LVM activation can be done without writing to the array? IIRC it updates at least some timestamp ... Am 13.06.2012 20:12, schrieb Phil Turmel: > It's not a question more or less dangerous. If the stripes are out of > order, writing to the array is destructive. Not only will the writes > probably land in the wrong place on the device, but writing triggers > recalculated parity, which also probably stomps on other data. > ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Data Offset
2012-06-13 18:22 ` Pierre Beck
@ 2012-06-13 18:49 ` Piergiorgio Sartor
2012-06-20 3:56 ` freeone3000
0 siblings, 1 reply; 24+ messages in thread
From: Piergiorgio Sartor @ 2012-06-13 18:49 UTC (permalink / raw)
To: Pierre Beck; +Cc: Phil Turmel, linux-raid

Hi all,

On Wed, Jun 13, 2012 at 08:22:02PM +0200, Pierre Beck wrote:
> I wonder if LVM activation can be done without writing to the array?
> IIRC it updates at least some timestamp ...

we tried setting the array itself read-only, which prevents (or should
prevent) anything above it, filesystem or LVM, from performing writes...

Nevertheless, the result was a bit "strange", namely a kernel BUG() or
similar, and we had to reset the PC.

I'm not sure if this was caused by some other issue, since the PC was
in a not really healthy state, or was a direct consequence of the md
device being r/o.

bye,

--

piergiorgio

^ permalink raw reply [flat|nested] 24+ messages in thread
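For reference, the usual way to put the md device itself into read-only mode is the --readonly flag; this is presumably what was attempted above (device name illustrative):

  mdadm --readonly /dev/md1
  cat /sys/block/md1/md/array_state   # should now report "readonly"

mdadm --readwrite /dev/md1 switches it back once the array is known to be assembled correctly.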
* Re: Data Offset 2012-06-13 18:49 ` Piergiorgio Sartor @ 2012-06-20 3:56 ` freeone3000 2012-06-20 14:09 ` Pierre Beck 2012-06-25 6:25 ` NeilBrown 0 siblings, 2 replies; 24+ messages in thread From: freeone3000 @ 2012-06-20 3:56 UTC (permalink / raw) To: Piergiorgio Sartor; +Cc: Pierre Beck, Phil Turmel, linux-raid Thanks a lot for your help. I have my data back! Played a few movie files off of the mounted drive, and they all worked perfect. Sorry for being such a dunce with the block sizes. `/dev/sdc3:2048s /dev/sdb3:2048s /dev/sde:2048s /dev/sdd3:1024s missing` mounted the drive successfully. Now, is there a way to "normalize" my drives so I can mount it without running through this guesswork again? Or that I can re-create my array using a standard mdadm? On Wed, Jun 13, 2012 at 1:49 PM, Piergiorgio Sartor <piergiorgio.sartor@nexgo.de> wrote: > > Hi all, > > On Wed, Jun 13, 2012 at 08:22:02PM +0200, Pierre Beck wrote: > > I wonder if LVM activation can be done without writing to the array? > > IIRC it updates at least some timestamp ... > > we tried setting the array itself read only, > which prevent (or it should) anybody above, > filesystem or LVM, to perform writes... > > Neveretheless, the result was a bit "strange", > namely a kernel BUG() or similar, we had to > reset the PC. > > I'm not sure if this was caused by other issue, > since the PC was in a not really healty state, > or a direct consequence of md device in r/o. > > bye, > > -- > > piergiorgio > -- > To unsubscribe from this list: send the line "unsubscribe linux-raid" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- James Moore -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Data Offset 2012-06-20 3:56 ` freeone3000 @ 2012-06-20 14:09 ` Pierre Beck 2012-06-25 6:25 ` NeilBrown 1 sibling, 0 replies; 24+ messages in thread From: Pierre Beck @ 2012-06-20 14:09 UTC (permalink / raw) To: freeone3000; +Cc: linux-raid Through the re-creation of the array, metadata has been fixed and the array is now as good as it was before the little accident. You can configure auto-assembly of the array as usual in mdadm.conf (updating the UUID is probably all you need to do). Also, you can now add the spare disk with your standard mdadm. Am 20.06.2012 15:54, schrieb freeone3000: > Thanks a lot for your help. I have my data back! Played a few movie > files off of the mounted drive, and they all worked perfect. Sorry for > being such a dunce with the block sizes. > > `/dev/sdc3:2048s /dev/sdb3:2048s /dev/sde:2048s /dev/sdd3:1024s > missing` mounted the drive successfully. Now, is there a way to > "normalize" my drives so I can mount it without running through this > guesswork again? Or that I can re-create my array using a standard > mdadm? > > On Wed, Jun 13, 2012 at 1:49 PM, Piergiorgio Sartor > <piergiorgio.sartor@nexgo.de> wrote: >> Hi all, >> >> On Wed, Jun 13, 2012 at 08:22:02PM +0200, Pierre Beck wrote: >>> I wonder if LVM activation can be done without writing to the array? >>> IIRC it updates at least some timestamp ... >> we tried setting the array itself read only, >> which prevent (or it should) anybody above, >> filesystem or LVM, to perform writes... >> >> Neveretheless, the result was a bit "strange", >> namely a kernel BUG() or similar, we had to >> reset the PC. >> >> I'm not sure if this was caused by other issue, >> since the PC was in a not really healty state, >> or a direct consequence of md device in r/o. >> >> bye, >> >> -- >> >> piergiorgio >> -- >> To unsubscribe from this list: send the line "unsubscribe linux-raid" in >> the body of a message to majordomo@vger.kernel.org >> More majordomo info at http://vger.kernel.org/majordomo-info.html > > > > -- > James Moore > -- > To unsubscribe from this list: send the line "unsubscribe linux-raid" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > ^ permalink raw reply [flat|nested] 24+ messages in thread
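A sketch of those two steps (paths and device names are illustrative; the re-created array has a new UUID, so the old ARRAY line must be replaced):

  mdadm --detail --scan           # prints the current ARRAY line, including the new UUID
  # replace the old ARRAY line for this array in /etc/mdadm/mdadm.conf with the line above
  mdadm /dev/md1 --add /dev/sdf   # add the new disk; it rebuilds into the missing slot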
* Re: Data Offset 2012-06-20 3:56 ` freeone3000 2012-06-20 14:09 ` Pierre Beck @ 2012-06-25 6:25 ` NeilBrown 1 sibling, 0 replies; 24+ messages in thread From: NeilBrown @ 2012-06-25 6:25 UTC (permalink / raw) To: freeone3000; +Cc: Piergiorgio Sartor, Pierre Beck, Phil Turmel, linux-raid [-- Attachment #1: Type: text/plain, Size: 1986 bytes --] On Tue, 19 Jun 2012 22:56:48 -0500 freeone3000 <freeone3000@gmail.com> wrote: > Thanks a lot for your help. I have my data back! Played a few movie > files off of the mounted drive, and they all worked perfect. Sorry for > being such a dunce with the block sizes. > > `/dev/sdc3:2048s /dev/sdb3:2048s /dev/sde:2048s /dev/sdd3:1024s > missing` mounted the drive successfully. Now, is there a way to > "normalize" my drives so I can mount it without running through this > guesswork again? Or that I can re-create my array using a standard > mdadm? No. But maybe with Linux-3.6 is out. Of course you only need the 'guess work' etc if something serious goes wrong again. Hopefully it won't. NeilBrown > > On Wed, Jun 13, 2012 at 1:49 PM, Piergiorgio Sartor > <piergiorgio.sartor@nexgo.de> wrote: > > > > Hi all, > > > > On Wed, Jun 13, 2012 at 08:22:02PM +0200, Pierre Beck wrote: > > > I wonder if LVM activation can be done without writing to the array? > > > IIRC it updates at least some timestamp ... > > > > we tried setting the array itself read only, > > which prevent (or it should) anybody above, > > filesystem or LVM, to perform writes... > > > > Neveretheless, the result was a bit "strange", > > namely a kernel BUG() or similar, we had to > > reset the PC. > > > > I'm not sure if this was caused by other issue, > > since the PC was in a not really healty state, > > or a direct consequence of md device in r/o. > > > > bye, > > > > -- > > > > piergiorgio > > -- > > To unsubscribe from this list: send the line "unsubscribe linux-raid" in > > the body of a message to majordomo@vger.kernel.org > > More majordomo info at http://vger.kernel.org/majordomo-info.html > > > > > -- > James Moore > -- > To unsubscribe from this list: send the line "unsubscribe linux-raid" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 828 bytes --] ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Data Offset
  2012-06-04 22:57       ` NeilBrown
  2012-06-05  5:26         ` freeone3000
@ 2014-02-24 11:22         ` wiebittewas
  2014-02-24 21:38           ` NeilBrown
  1 sibling, 1 reply; 24+ messages in thread
From: wiebittewas @ 2014-02-24 11:22 UTC (permalink / raw)
To: linux-raid

On 05.06.2012 00:57, "NeilBrown" wrote, regarding "Re: Data Offset":

> OK, try:
>
> git clone -b data_offset git://neil.brown.name/mdadm
> cd mdadm
> make
>
> ./mdadm -C /dev/md1 -e 1.2 -l 5 -n 5 --assume-clean -c 512 \
>    /dev/sdc3:2048s /dev/sdb3:2048s ??? /dev/sdd3:1024s ???
>
> The number after ':' after a device name is a data offset. 's' means sectors.

Well, luckily I found this message and was able to rescue an array with two failed disks, ending up with only a few damaged files...

But is there any chance that this feature makes it into the normal tree, so that it will be usable in the near future from rescue sticks like grml?

> Don't install this mdadm, and don't use it for anything other than
> re-creating this array.

Or does it interfere with other necessary features?

I've looked at a diff against the standard 3.2.5 and didn't see anything really complex, just a lot of smaller changes.

In any case, thanks for your work.

w.

^ permalink raw reply	[flat|nested] 24+ messages in thread
* Re: Data Offset
  2014-02-24 11:22         ` wiebittewas
@ 2014-02-24 21:38           ` NeilBrown
  0 siblings, 0 replies; 24+ messages in thread
From: NeilBrown @ 2014-02-24 21:38 UTC (permalink / raw)
To: wiebittewas; +Cc: linux-raid

[-- Attachment #1: Type: text/plain, Size: 1386 bytes --]

On Mon, 24 Feb 2014 12:22:44 +0100 wiebittewas <wiebittewas@googlemail.com> wrote:

> On 05.06.2012 00:57, "NeilBrown" wrote, regarding "Re: Data Offset":
>
> > OK, try:
> >
> > git clone -b data_offset git://neil.brown.name/mdadm
> > cd mdadm
> > make
> >
> > ./mdadm -C /dev/md1 -e 1.2 -l 5 -n 5 --assume-clean -c 512 \
> >    /dev/sdc3:2048s /dev/sdb3:2048s ??? /dev/sdd3:1024s ???
> >
> > The number after ':' after a device name is a data offset. 's' means sectors.
>
> Well, luckily I found this message and was able to rescue an array
> with two failed disks, ending up with only a few damaged files...
>
> But is there any chance that this feature makes it into the normal tree,
> so that it will be usable in the near future from rescue sticks like grml?

This feature is in mdadm-3.3.

NeilBrown

> > Don't install this mdadm, and don't use it for anything other than
> > re-creating this array.
>
> Or does it interfere with other necessary features?
>
> I've looked at a diff against the standard 3.2.5 and didn't see anything
> really complex, just a lot of smaller changes.
>
> In any case, thanks for your work.
>
> w.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 828 bytes --]

^ permalink raw reply	[flat|nested] 24+ messages in thread
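For anyone reaching this from a rescue environment: with a stock mdadm 3.3 or later, per-device data offsets should no longer require the patched branch. The exact option syntax should be checked against the man page of the mdadm version on the rescue stick; as far as I understand it, the re-create from this thread would look roughly like the sketch below (the devices and offsets are the ones from this particular array and are only an example).

  # mdadm >= 3.3: per-device data offsets on create, appended to each
  # device name after a colon, together with --data-offset=variable.
  mdadm -C /dev/md1 -e 1.2 -l 5 -n 5 --assume-clean -c 512 \
      --data-offset=variable \
      /dev/sdc3:2048s /dev/sdb3:2048s /dev/sde:2048s /dev/sdd3:1024s missing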
* Data offset
@ 2014-05-15 14:11 Patrik Horník
  2014-05-15 22:07 ` NeilBrown
  0 siblings, 1 reply; 24+ messages in thread
From: Patrik Horník @ 2014-05-15 14:11 UTC (permalink / raw)
To: linux-raid

Hello,

I have a couple of questions regarding the data offset:

- Current mdadm sets a 128 MiB offset, apparently for the reshape backup
file. Is it really used by reshape? From which kernel / mdadm version?

- Is there any other good reason for such a big offset?

- Is there support for specifying the data offset when creating an array?
In which version of mdadm? What are recommended lower values?

Thanks.

Patrik

^ permalink raw reply	[flat|nested] 24+ messages in thread
* Re: Data offset
  2014-05-15 14:11 Data offset Patrik Horník
@ 2014-05-15 22:07 ` NeilBrown
  2014-05-16  0:41   ` Patrik Horník
  0 siblings, 1 reply; 24+ messages in thread
From: NeilBrown @ 2014-05-15 22:07 UTC (permalink / raw)
To: patrik; +Cc: linux-raid

[-- Attachment #1: Type: text/plain, Size: 1040 bytes --]

On Thu, 15 May 2014 16:11:46 +0200 Patrik Horník <patrik@dsl.sk> wrote:

> Hello,
>
> I have a couple of questions regarding the data offset:
>
> - Current mdadm sets a 128 MiB offset, apparently for the reshape backup
> file. Is it really used by reshape? From which kernel / mdadm version?

Yes. Linux 3.5 or thereabouts. mdadm 3.3.

> - Is there any other good reason for such a big offset?

No. But when you typically have a terabyte or more per disc, 128MB isn't
very much.

> - Is there support for specifying the data offset when creating an array?
> In which version of mdadm? What are recommended lower values?

Yes. mdadm-3.3. The recommended value is the default. The lowest usable
value is the lowest value it will let you use.

Why does a 128MB offset bother you?

NeilBrown

> Thanks.
>
> Patrik

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 828 bytes --]

^ permalink raw reply	[flat|nested] 24+ messages in thread
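As an illustration of the create-time option Neil refers to: the 8 MiB below is only an example of a "lower value", not a recommendation from the thread, and the device names are placeholders.

  # mdadm >= 3.3: create with an explicit data offset on every member
  # (K/M/G suffixes are accepted for the value).
  mdadm --create /dev/md0 --level=6 --raid-devices=4 \
      --data-offset=8M /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1

  # Check what offset was actually used.
  mdadm --examine /dev/sdb1 | grep 'Data Offset'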
* Re: Data offset
  2014-05-15 14:11 Data offset Patrik Horník
  2014-05-15 22:07 ` NeilBrown
@ 2014-05-16  0:41   ` Patrik Horník
  0 siblings, 0 replies; 24+ messages in thread
From: Patrik Horník @ 2014-05-16 0:41 UTC (permalink / raw)
To: NeilBrown; +Cc: linux-raid

2014-05-16 0:07 GMT+02:00 NeilBrown <neilb@suse.de>:
> On Thu, 15 May 2014 16:11:46 +0200 Patrik Horník <patrik@dsl.sk> wrote:
>
>> Hello,
>>
>> I have a couple of questions regarding the data offset:
>>
>> - Current mdadm sets a 128 MiB offset, apparently for the reshape backup
>> file. Is it really used by reshape? From which kernel / mdadm version?
>
> Yes. Linux 3.5 or thereabouts. mdadm 3.3.
>
>> - Is there any other good reason for such a big offset?
>
> No. But when you typically have a terabyte or more per disc, 128MB isn't
> very much.
>
>> - Is there support for specifying the data offset when creating an array?
>> In which version of mdadm? What are recommended lower values?
>
> Yes. mdadm-3.3. The recommended value is the default. The lowest usable
> value is the lowest value it will let you use.

Does the lowest value depend on anything, or is it fixed, and what is it? Are there any other considerations for the offset size, such as being able to add more devices to the array later? (I don't know how big this minimal value is, or how big the per-device record is, for example...)

> Why does a 128MB offset bother you?

It does not by itself. But I am doing some reorganization, including a new layer, and I need to fit the existing filesystem back on the same drives. So I need to save space at the RAID level to be able to fit the new layer.

Patrik

> NeilBrown

^ permalink raw reply	[flat|nested] 24+ messages in thread
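To put the space consideration in concrete terms (hypothetical numbers, not from the thread): the data offset is reserved on every member, while usable capacity scales with the number of data disks. For a 6-disk RAID-6, capacity is (6 - 2) × (member size - data offset), so shrinking the offset from 128 MiB to, say, 8 MiB recovers about 4 × 120 MiB = 480 MiB of array capacity. That is small against terabyte drives, but it can matter when an existing filesystem has to fit back exactly underneath a new layer.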