Linux RAID subsystem development


* [PATCH] mdadm: replace hard coded string length
From: Song Liu @ 2016-09-15  0:13 UTC (permalink / raw)
  To: linux-raid; +Cc: Jes.Sorensen, Song Liu

This patch replaces the hard-coded 32 with sizeof(sb->set_name) in a
couple of places.

Signed-off-by: Song Liu <songliubraving@fb.com>
---
 super1.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/super1.c b/super1.c
index 9f62d23..7d03b1f 100644
--- a/super1.c
+++ b/super1.c
@@ -1030,7 +1030,7 @@ static void getinfo_super1(struct supertype *st, struct mdinfo *info, char *map)
 
 	memcpy(info->uuid, sb->set_uuid, 16);
 
-	strncpy(info->name, sb->set_name, 32);
+	strncpy(info->name, sb->set_name, sizeof(sb->set_name));
 	info->name[32] = 0;
 
 	if ((__le32_to_cpu(sb->feature_map)&MD_FEATURE_REPLACEMENT)) {
@@ -1124,7 +1124,7 @@ static int update_super1(struct supertype *st, struct mdinfo *info,
 		if (c)
 			strncpy(info->name, c+1, 31 - (c-sb->set_name));
 		else
-			strncpy(info->name, sb->set_name, 32);
+			strncpy(info->name, sb->set_name, sizeof(sb->set_name));
 		info->name[32] = 0;
 	}
 
-- 
2.8.0.rc2
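
For readers following along, here is a minimal standalone sketch of the idiom the patch moves towards. The struct names toy_super and toy_info are hypothetical stand-ins, and the 32-byte field with a 33-byte destination is an assumption read off the patched lines, not a copy of the mdadm headers. Unlike the patch, the sketch also derives the terminator index from sizeof, so the copy length and the index cannot drift apart.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-ins for the two structures touched by the patch. */
struct toy_super { char set_name[32]; };  /* fixed width, may lack a NUL */
struct toy_info  { char name[33]; };      /* one extra byte for the NUL */

static void copy_set_name(struct toy_info *info, const struct toy_super *sb)
{
	/* Fail the build if the destination ever stops being one byte larger. */
	_Static_assert(sizeof(info->name) == sizeof(sb->set_name) + 1,
		       "name must hold set_name plus a terminator");

	/* Copy at most the full field, then terminate explicitly. */
	strncpy(info->name, sb->set_name, sizeof(sb->set_name));
	info->name[sizeof(sb->set_name)] = '\0';
}
```

With a completely filled 32-byte field the result is a 32-character string; with a short, NUL-terminated name strncpy pads the rest with zeros.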



* Re: moving spares into group and checking spares
From: Chris Murphy @ 2016-09-14 23:15 UTC (permalink / raw)
  To: scar; +Cc: Linux-RAID
In-Reply-To: <nrckp2$k8e$1@blaine.gmane.org>

On Wed, Sep 14, 2016 at 4:59 PM, scar <scar@drigon.com> wrote:
> Chris Murphy wrote on 09/14/2016 03:33 PM:
>>
>> SCT ERC value is less than SCSI command timer value?
>
>
>
> they are 1TB Hitachi HUA7210SASUN drives in Sun Fire X4540 with SCT ERC
> value of 255 (25.5 seconds) and /sys/block/sdX/device/timeout is set to 30

Interesting choice, I haven't seen it cut that closely before, but it
ought to work. The value I most often see is 70 deciseconds which for
sure will fail, if it's going to, well before the command timer gives
up.
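
The comparison being discussed can be sketched as below, assuming (as the figures in this thread imply) that SCT ERC is expressed in deciseconds and the value in /sys/block/sdX/device/timeout in seconds. The function names are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* True when the drive gives up on a bad sector before the kernel's
 * command timer expires. ERC is in deciseconds, the timer in seconds. */
static bool erc_below_command_timer(unsigned int erc_deciseconds,
				    unsigned int timer_seconds)
{
	assert(timer_seconds > 0);
	return erc_deciseconds < timer_seconds * 10u;
}

/* Headroom between the two limits, in deciseconds (0 if there is none). */
static unsigned int erc_margin_deciseconds(unsigned int erc_deciseconds,
					   unsigned int timer_seconds)
{
	unsigned int timer_ds = timer_seconds * 10u;

	return erc_deciseconds < timer_ds ? timer_ds - erc_deciseconds : 0;
}
```

For the values quoted above, 255 against a 30 second timer passes with 45 deciseconds (4.5 seconds) to spare, while the more common 70 leaves 230.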


-- 
Chris Murphy


* Re: moving spares into group and checking spares
From: scar @ 2016-09-14 22:59 UTC (permalink / raw)
  To: linux-raid
In-Reply-To: <CAJCQCtQJwOTYsWubd0rV-6PRL4kmVRKLfLr3=7ZPr1Zb3SrwtQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>

Chris Murphy wrote on 09/14/2016 03:33 PM:
> SCT ERC value is less than SCSI command timer value?


they are 1TB Hitachi HUA7210SASUN drives in Sun Fire X4540 with SCT ERC 
value of 255 (25.5 seconds) and /sys/block/sdX/device/timeout is set to 30


--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: moving spares into group and checking spares
From: Chris Murphy @ 2016-09-14 22:33 UTC (permalink / raw)
  To: scar; +Cc: Linux-RAID
In-Reply-To: <nrce27$2km$1@blaine.gmane.org>

On Wed, Sep 14, 2016 at 3:05 PM, scar <scar@drigon.com> wrote:
> Roman Mamedov wrote on 09/14/2016 11:22 AM:
>>
>> But you think an 11-member RAID5, let alone four of them joined by LVM is
>> safe? From a resiliency standpoint that setup is like insanity squared.
>
>
> yeah it seems fine?  disks are healthy and regularly checked, just wondering
> how to check the spares.  use cron to schedule weekly smartctl long test?

That you're asking that question now makes me wonder if you've made
certain SCT ERC value is less than SCSI command timer value? If that's
not true, I give you 1 in 3 chances of complete array collapse
following a single drive failure if they are big drives, and 1 in 4
odds if they're just 2T or less. So you need to make certain, because
it's not the default configuration unless you have NAS or enterprise
drives across the board with properly preconfigured SCT ERC out of the
box.

-- 
Chris Murphy


* [PATCH] md: fix a potential deadlock
From: Shaohua Li @ 2016-09-14 21:26 UTC (permalink / raw)
  To: linux-raid; +Cc: Cong Wang

lockdep reports a potential deadlock. Fix this by dropping
detected_devices_mutex before calling md_import_device() and retaking it
afterwards.

[ 1137.126601] ======================================================
[ 1137.127013] [ INFO: possible circular locking dependency detected ]
[ 1137.127013] 4.8.0-rc4+ #538 Not tainted
[ 1137.127013] -------------------------------------------------------
[ 1137.127013] mdadm/16675 is trying to acquire lock:
[ 1137.127013]  (&bdev->bd_mutex){+.+.+.}, at: [<ffffffff81243cf3>] __blkdev_get+0x63/0x450
[ 1137.127013]
but task is already holding lock:
[ 1137.127013]  (detected_devices_mutex){+.+.+.}, at: [<ffffffff81a5138c>] md_ioctl+0x2ac/0x1f50
[ 1137.127013]
which lock already depends on the new lock.

[ 1137.127013]
the existing dependency chain (in reverse order) is:
[ 1137.127013]
-> #1 (detected_devices_mutex){+.+.+.}:
[ 1137.127013]        [<ffffffff810b6f19>] lock_acquire+0xb9/0x220
[ 1137.127013]        [<ffffffff81c51647>] mutex_lock_nested+0x67/0x3d0
[ 1137.127013]        [<ffffffff81a4eeaf>] md_autodetect_dev+0x3f/0x90
[ 1137.127013]        [<ffffffff81595be8>] rescan_partitions+0x1a8/0x2c0
[ 1137.127013]        [<ffffffff81590081>] __blkdev_reread_part+0x71/0xb0
[ 1137.127013]        [<ffffffff815900e5>] blkdev_reread_part+0x25/0x40
[ 1137.127013]        [<ffffffff81590c4b>] blkdev_ioctl+0x51b/0xa30
[ 1137.127013]        [<ffffffff81242bf1>] block_ioctl+0x41/0x50
[ 1137.127013]        [<ffffffff81214c96>] do_vfs_ioctl+0x96/0x6e0
[ 1137.127013]        [<ffffffff81215321>] SyS_ioctl+0x41/0x70
[ 1137.127013]        [<ffffffff81c56825>] entry_SYSCALL_64_fastpath+0x18/0xa8
[ 1137.127013]
-> #0 (&bdev->bd_mutex){+.+.+.}:
[ 1137.127013]        [<ffffffff810b6af2>] __lock_acquire+0x1662/0x1690
[ 1137.127013]        [<ffffffff810b6f19>] lock_acquire+0xb9/0x220
[ 1137.127013]        [<ffffffff81c51647>] mutex_lock_nested+0x67/0x3d0
[ 1137.127013]        [<ffffffff81243cf3>] __blkdev_get+0x63/0x450
[ 1137.127013]        [<ffffffff81244307>] blkdev_get+0x227/0x350
[ 1137.127013]        [<ffffffff812444f6>] blkdev_get_by_dev+0x36/0x50
[ 1137.127013]        [<ffffffff81a46d65>] lock_rdev+0x35/0x80
[ 1137.127013]        [<ffffffff81a49bb4>] md_import_device+0xb4/0x1b0
[ 1137.127013]        [<ffffffff81a513d6>] md_ioctl+0x2f6/0x1f50
[ 1137.127013]        [<ffffffff815909b3>] blkdev_ioctl+0x283/0xa30
[ 1137.127013]        [<ffffffff81242bf1>] block_ioctl+0x41/0x50
[ 1137.127013]        [<ffffffff81214c96>] do_vfs_ioctl+0x96/0x6e0
[ 1137.127013]        [<ffffffff81215321>] SyS_ioctl+0x41/0x70
[ 1137.127013]        [<ffffffff81c56825>] entry_SYSCALL_64_fastpath+0x18/0xa8
[ 1137.127013]
other info that might help us debug this:

[ 1137.127013]  Possible unsafe locking scenario:

[ 1137.127013]        CPU0                    CPU1
[ 1137.127013]        ----                    ----
[ 1137.127013]   lock(detected_devices_mutex);
[ 1137.127013]                                lock(&bdev->bd_mutex);
[ 1137.127013]                                lock(detected_devices_mutex);
[ 1137.127013]   lock(&bdev->bd_mutex);
[ 1137.127013]
 *** DEADLOCK ***

Cc: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Shaohua Li <shli@fb.com>
---
 drivers/md/md.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/md/md.c b/drivers/md/md.c
index cd6797b..457b538 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -8882,7 +8882,9 @@ static void autostart_arrays(int part)
 		list_del(&node_detected_dev->list);
 		dev = node_detected_dev->dev;
 		kfree(node_detected_dev);
+		mutex_unlock(&detected_devices_mutex);
 		rdev = md_import_device(dev,0, 90);
+		mutex_lock(&detected_devices_mutex);
 		if (IS_ERR(rdev))
 			continue;
 
-- 
2.8.0.rc2
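
The lock-order inversion in the report, and why dropping the mutex around the import avoids it, can be illustrated with a toy userspace order tracker. This is only a sketch: lock_tracker, track_lock and the two path functions are hypothetical names, nothing here is kernel code, and the tracker only catches direct inversions between two locks, which is all this report needs.

```c
#include <assert.h>
#include <stdbool.h>

/* The two locks named in the lockdep report. */
enum lock_id { BD_MUTEX, DETECTED_DEVICES_MUTEX, NLOCKS };

struct lock_tracker {
	bool held[NLOCKS];          /* locks the current path holds */
	bool after[NLOCKS][NLOCKS]; /* after[a][b]: b was taken while a was held */
};

/* Record an acquisition. Returns false if a held lock was previously
 * taken *after* this one, i.e. the two orders conflict. */
static bool track_lock(struct lock_tracker *t, enum lock_id lock)
{
	bool ok = true;

	assert(lock < NLOCKS && !t->held[lock]);
	for (int h = 0; h < NLOCKS; h++) {
		if (!t->held[h])
			continue;
		if (t->after[lock][h])
			ok = false;
		t->after[h][lock] = true;
	}
	t->held[lock] = true;
	return ok;
}

static void track_unlock(struct lock_tracker *t, enum lock_id lock)
{
	t->held[lock] = false;
}

/* Chain #1: partition rescan holds bd_mutex, then md_autodetect_dev()
 * takes detected_devices_mutex. */
static bool rescan_path(struct lock_tracker *t)
{
	bool ok = track_lock(t, BD_MUTEX);

	ok = track_lock(t, DETECTED_DEVICES_MUTEX) && ok;
	track_unlock(t, DETECTED_DEVICES_MUTEX);
	track_unlock(t, BD_MUTEX);
	return ok;
}

/* Chain #0: autostart holds detected_devices_mutex, and the import ends
 * up taking bd_mutex. With drop_around_import the mutex is released
 * first and retaken afterwards, as in the patch. */
static bool autostart_path(struct lock_tracker *t, bool drop_around_import)
{
	bool ok = track_lock(t, DETECTED_DEVICES_MUTEX);

	if (drop_around_import)
		track_unlock(t, DETECTED_DEVICES_MUTEX);
	ok = track_lock(t, BD_MUTEX) && ok;  /* stands in for the import */
	track_unlock(t, BD_MUTEX);
	if (drop_around_import)
		ok = track_lock(t, DETECTED_DEVICES_MUTEX) && ok;
	track_unlock(t, DETECTED_DEVICES_MUTEX);
	return ok;
}
```

Running rescan_path() and then autostart_path() without the drop flags the inversion; with the drop, bd_mutex is never taken while detected_devices_mutex is held, so no conflicting order is recorded.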



* Re: lots of "md: export_rdev(sde)" printed after create IMSM RAID10 with missing
From: Jes Sorensen @ 2016-09-14 21:05 UTC (permalink / raw)
  To: Artur Paszkiewicz; +Cc: Shaohua Li, Yi Zhang, linux-raid
In-Reply-To: <7910bc85-f9c4-1ea3-76a6-40b819738537@intel.com>

Artur Paszkiewicz <artur.paszkiewicz@intel.com> writes:
> On 09/09/2016 12:56 AM, Shaohua Li wrote:
>> On Wed, Sep 07, 2016 at 02:43:41AM -0400, Yi Zhang wrote:
>>> Hello
>>>
>>> I tried to create an IMSM RAID10 with a missing device and found lots of
>>> "md: export_rdev(sde)" messages printed. Could anyone help check it?
>>>
>>> Steps I used:
>>> mdadm -CR /dev/md0 /dev/sd[b-f] -n5 -e imsm
>>> mdadm -CR /dev/md/Volume0 -l10 -n4 /dev/sd[b-d] missing
>>>
>>> Version:
>>> 4.8.0-rc5
>>> mdadm - v3.4-84-gbd1fd72 - 25th August 2016
>> 
>> can't reproduce with old mdadm but can with upstream mdadm. Looks like mdadm
>> keeps writing the new_dev sysfs entry.
>> 
>> Jes, any idea?
>> 
>> Thanks,
>> Shaohua 

[snip]

> Can you check if this fix works for you? If it does I'll send a proper
> patch for this.
>
> Thanks,
> Artur

Artur,

You were too fast :) Did you intend to post a patch with a commit
message?

Cheers,
Jes

>
> diff --git a/super-intel.c b/super-intel.c
> index 92817e9..ffa71f6 100644
> --- a/super-intel.c
> +++ b/super-intel.c
> @@ -7789,6 +7789,9 @@ static struct mdinfo *imsm_activate_spare(struct active_array *a,
>  			IMSM_T_STATE_DEGRADED)
>  		return NULL;
>  
> +	if (get_imsm_map(dev, MAP_0)->map_state == IMSM_T_STATE_UNINITIALIZED)
> +		return NULL;
> +
>  	/*
>  	 * If there are any failed disks check state of the other volume.
>  	 * Block rebuild if the another one is failed until failed disks


* Re: moving spares into group and checking spares
From: scar @ 2016-09-14 21:05 UTC (permalink / raw)
  To: linux-raid
In-Reply-To: <20160914232249.6e5fc568@natsu>

Roman Mamedov wrote on 09/14/2016 11:22 AM:
> But you think an 11-member RAID5, let alone four of them joined by LVM is
> safe? From a resiliency standpoint that setup is like insanity squared.

yeah it seems fine?  disks are healthy and regularly checked, just 
wondering how to check the spares.  use cron to schedule weekly smartctl 
long test?


> Considering that your expenses for redundancy are 8 disks at the moment, you
> could go with 3x15-disk RAID6 with 2 shared hotspares, making overall
> redundancy expense the same 8 disks -- but for a massively safer setup.

actually it would be 9 disks (3x15 +2 = 47 not 48) but i'm ok with that. 
  but rebuilding the array right now is not an option
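
The disk counts being compared can be checked with a few lines of arithmetic. The sketch below assumes a 48-bay chassis and reads the current setup as four 11-member RAID5 arrays with one hot spare each; both are assumptions taken from the figures in this thread, and the struct and function names are invented for the example.

```c
#include <assert.h>

struct layout {
	int arrays;   /* number of md arrays */
	int members;  /* active member disks per array */
	int parity;   /* disks' worth of parity per array (RAID5=1, RAID6=2) */
	int spares;   /* hot spares across the whole box */
};

/* Disks physically consumed by the layout. */
static int disks_used(const struct layout *l)
{
	assert(l->members > l->parity);
	return l->arrays * l->members + l->spares;
}

/* Disks that hold no user data: parity plus spares. */
static int redundancy_disks(const struct layout *l)
{
	return l->arrays * l->parity + l->spares;
}

/* Disks' worth of usable capacity. */
static int data_disks(const struct layout *l)
{
	return l->arrays * (l->members - l->parity);
}
```

Under those assumptions the current layout {4, 11, 1, 4} uses 48 disks, 8 of them for redundancy, leaving 40 for data. The proposed {3, 15, 2, 2} also spends 8 on redundancy but uses only 47 disks and leaves 39 for data, so in a 48-bay box 9 disks carry no data, which is where the "9 disks" figure comes from.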

> might just as well join them using mdadm RAID0 and
> at least gain the improved linear performance.

i did want to do that but debian-installer didn't seem to support it...




* Re: Question about commit f9a67b1182e5 ("md/bitmap: clear bitmap if bitmap_create failed").
From: Marion & Christophe JAILLET @ 2016-09-14 20:39 UTC (permalink / raw)
  To: Guoqing Jiang, Shaohua Li; +Cc: linux-raid, linux-kernel
In-Reply-To: <57D9097C.5050202@suse.com>

Le 14/09/2016 à 10:25, Guoqing Jiang a écrit :
>
>
> On 09/13/2016 01:24 PM, Shaohua Li wrote:
>> On Mon, Sep 12, 2016 at 09:09:48PM +0200, Christophe JAILLET wrote:
>>> Hi,
>>>
>>> I'm puzzled by commit f9a67b1182e5 ("md/bitmap: clear bitmap if
>>> bitmap_create failed").
>> Hi Christophe,
>> Thank you very much for helping check this!
>>
>>> Part of the commit is:
>>>
>>> @@ -1865,8 +1866,10 @@ int bitmap_copy_from_slot(struct mddev *mddev, int slot,
>>>       struct bitmap_counts *counts;
>>>       struct bitmap *bitmap = bitmap_create(mddev, slot);
>>>
>>> -    if (IS_ERR(bitmap))
>>> +    if (IS_ERR(bitmap)) {
>>> +        bitmap_free(bitmap);
>>>           return PTR_ERR(bitmap);
>>> +    }
>>>
>>> but if 'bitmap' is an error, I think that bad things will happen in
>>> 'bitmap_free()' when, at the beginning of the function, we will execute:
>>>
>>>      if (bitmap->sysfs_can_clear) <-----------------
>>>          sysfs_put(bitmap->sysfs_can_clear);
>
> I guess it is safe, since the check below is at the beginning of bitmap_free().
>
>         if (!bitmap) /* there was no bitmap */
>                 return;

I don't agree.
bitmap_create() can return ERR_PTR(-ENOMEM) or ERR_PTR(-EINVAL).

In such cases 'if (!bitmap)' will not help, because an error pointer is
not NULL.

Maybe it should be turned into 'if (IS_ERR_OR_NULL(bitmap))' to handle 
errors returned by bitmap_create.
Maybe just removing the call to 'bitmap_free(bitmap)' is enough.

In any case, I think that the current logic is somehow broken.
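
To make the point concrete, here is a small userspace sketch of the error-pointer convention. The lower-case helpers are re-implementations written for this example, modelled on the kernel's ERR_PTR(), PTR_ERR(), IS_ERR() and IS_ERR_OR_NULL(); toy_bitmap and the two guard functions are hypothetical and only report whether a guard would let the dereference happen.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Encode a negative errno in a pointer value and decode it again. */
static inline void *err_ptr(long error) { return (void *)(intptr_t)error; }
static inline long ptr_err(const void *ptr) { return (long)(intptr_t)ptr; }

/* Error pointers occupy the top MAX_ERRNO values of the address space. */
static inline bool is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

static inline bool is_err_or_null(const void *ptr)
{
	return !ptr || is_err(ptr);
}

struct toy_bitmap { int sysfs_can_clear; };

/* Guard in the style of 'if (!bitmap) return;': true means the code
 * after the guard would go on to dereference the pointer. */
static bool null_guard_lets_through(const struct toy_bitmap *b)
{
	return b != NULL;
}

/* Guard in the style of 'if (IS_ERR_OR_NULL(bitmap)) return;'. */
static bool err_or_null_guard_lets_through(const struct toy_bitmap *b)
{
	return !is_err_or_null(b);
}
```

An err_ptr(-ENOMEM) value passes the plain NULL guard, so the following bitmap->sysfs_can_clear access would touch an invalid address; the second guard stops it.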

Best regards,
CJ


* Re: Inactive arrays
From: Wols Lists @ 2016-09-14 18:42 UTC (permalink / raw)
  To: Daniel Sanabria, Chris Murphy; +Cc: Linux-RAID
In-Reply-To: <CAHscji2BKxNDLzZUonGHVB8PzeFbtuwn32NBaX315QPvvhOxyg@mail.gmail.com>

On 14/09/16 19:16, Daniel Sanabria wrote:
> Other than replacing the green drives with something more
> suitable (any suggestions are welcome)

WD Reds or Seagate NAS. I don't think they make them any more, but
Seagate Constellations are fine too. My Toshiba 2TB 2.5" laptop drive
would be fine.

The tl;dr version of the problem with Greens (and any other desktop
drive for that matter), if you haven't read it up yet, is that when the
kernel requests a read from a dodgy drive, it just sits there,
*unresponsive*, until the read succeeds or the drive times out. And the
drive will time out in its own good time.

If the kernel times out *before* the drive, and by default the kernel
does so after 30 secs, while the drive can take two minutes or more, then
the kernel will recreate the missing block and try to write it. The
drive is unresponsive, the write times out, and the kernel assumes the
drive is dead and kicks it from the array.

That's why you need to increase the kernel timeout, because you can't
reduce the drive timeout on these drives. It is also why a flaky hard
drive will cause system response to fall off horrendously.
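
Raising the kernel timeout comes down to writing a number of seconds into the per-disk sysfs file mentioned elsewhere in this thread (/sys/block/sdX/device/timeout). A minimal sketch follows; the helper names are made up, it needs root when pointed at the real file, and it does not touch the drive's own error-recovery setting.

```c
#include <assert.h>
#include <stdio.h>

/* Build "/sys/block/<disk>/device/timeout" in buf. Returns the path
 * length, or -1 if buf is too small. */
static int timeout_path(const char *disk, char *buf, size_t len)
{
	int n;

	assert(disk && buf);
	n = snprintf(buf, len, "/sys/block/%s/device/timeout", disk);
	return (n < 0 || (size_t)n >= len) ? -1 : n;
}

/* Write the timeout, in seconds, to the given file. Returns 0 on success. */
static int write_timeout(const char *path, unsigned int seconds)
{
	FILE *f = fopen(path, "w");
	int ok;

	if (!f)
		return -1;
	ok = fprintf(f, "%u\n", seconds) > 0;
	if (fclose(f) != 0)
		ok = 0;
	return ok ? 0 : -1;
}
```

Calling timeout_path("sda", ...) and then write_timeout(path, 180) for each member disk at boot would give the 180 second value discussed in this thread; the setting does not persist across reboots.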

Cheers,
Wol


* Re: Inactive arrays
From: Chris Murphy @ 2016-09-14 18:37 UTC (permalink / raw)
  To: Daniel Sanabria; +Cc: Chris Murphy, Linux-RAID
In-Reply-To: <CAHscji2BKxNDLzZUonGHVB8PzeFbtuwn32NBaX315QPvvhOxyg@mail.gmail.com>

On Wed, Sep 14, 2016 at 12:16 PM, Daniel Sanabria <sanabria.d@gmail.com> wrote:
> BRAVO!!!!!!
>
> Thanks a million Chris! After following your advice on recovering the
> MBR and the GPT the arrays re-assembled automatically and all data is
> there.
>
> I already changed the type to make it consistent (FD00 on both
> partitions) and working on setting up the timeouts to 180 at boot
> time. Other than replacing the green drives with something more
> suitable (any suggestions are welcome), what else would you suggest to
> change to make the setup a bit more consistent and upgrade proof (i.e.
> having different metadata versions doesn't look right to me)?

Like I mentioned, there's something about Greens spinning down that you
might look at. I'm not sure if delays in spinning back up are a
contributing factor to anything? I'd kinda expect that if the
kernel/libata sends a command to the drive, and it spins up slowly, the
kernel is just going to wait up to whatever the command timer is set
to. So if you set that to 180 seconds, it should be fine because no
drive takes 3 minutes to spin up. But... I dunno if there's some other
vector for these drives to cause confusion.

Umm, yeah I don't think you need to worry too much about the metadata.
0.9 is deprecated, uses kernel autodetect rather than initrd based
detection like metadata 1.x, and can be more complex to troubleshoot.
But so long as it's working I honestly wouldn't mess with it. If you
do want to simplify it just make sure you have current backups because
changes are a RIPE time for mistakes that end up in user data loss. I
would pretty much just assume the user will break something, you have
a not too complex layout compared to others I've seen, but there are
some opportunities to make simple mistakes that will just blow shit up
and then you're screwed.

So I'd say it's easier to just plot a future when you're going to buy
a bunch of new drives and do a complete migration, rather than change
the existing setup metadata just for the sake of changing it.

And one thing to incorporate in the planning stage is LVM RAID.  You
could take all of your drives into one big pool, and create LVs as if
they were individual RAIDs, and each LV can have its own RAID level. In
many ways it's easier because you're already using LVM on top of RAID
on top of partitioning. Instead you can create basically one
partition per drive, add them all to LVM, and then manage the LV and
raid level at the same time. The main issue here is familiarity with
all the tools. If you're more comfortable with mdadm, then use that.
The lvm tools are a hurdle (it's like emacs for storage: metric piles
of flags, documentation, and features, and as yet it still doesn't have
all the same features as mdadm for the raid stuff). But it'll do scrubs
and device replacements; all the basic stuff is there. Monitoring for
drive failures is a little different, I don't think it has a way to
email you like mdadm does in case of drive failures/ejections. So
you'll have to look at that also. Note that on the backend LVM raid
uses the md kernel driver just like mdadm does, it's just the user
space tools and on disk metadata that differ.

-- 
Chris Murphy


* Re: moving spares into group and checking spares
From: Wols Lists @ 2016-09-14 18:35 UTC (permalink / raw)
  To: scar, linux-raid
In-Reply-To: <nrc2pm$2a9$1@blaine.gmane.org>

On 14/09/16 18:52, scar wrote:
> i'm not sure what you're suggesting, that 4x 11+1 RAID5 arrays should be
> changed to 1x 46+2 RAID6 array?  that doesn't seem as safe to me.  and
> checkarray isn't going to check the spare disks just as it's not doing
> now.... also that would require me to backup/restore the data so i can
> create a new array

No. The suggestion is to convert your 4x 11+1 raid5's to 4x 12 raid6's.
(or do you mean 11 drives plus 1 parity? If that's the case I mean 11
plus 2 parity)

That way you're using all the drives, they all get tested, and if a
drive fails, you're left with a degraded raid6 aka raid5 aka redundant
array. With your current setup, if a drive fails you're left with a
degraded raid5 aka raid0 aka a "disaster in waiting".

And then you can add just the one spare disk to a spares group, so if
any drive does fail, it will get rebuilt straight away.

The only problem I can see (and I should warn you) is that there seems
to be a little "upgrading in place" problem at the moment. My gut
feeling is it's down to some interaction with systemd, so if you're not
running systemd I hope it won't bite ...

Cheers,
Wol


* Re: moving spares into group and checking spares
From: Roman Mamedov @ 2016-09-14 18:22 UTC (permalink / raw)
  To: scar; +Cc: linux-raid
In-Reply-To: <nrc2pm$2a9$1@blaine.gmane.org>


On Wed, 14 Sep 2016 10:52:55 -0700
scar <scar@drigon.com> wrote:

> i'm not sure what you're suggesting, that 4x 11+1 RAID5 arrays should be 
> changed to 1x 46+2 RAID6 array?  that doesn't seem as safe to me

But you think an 11-member RAID5, let alone four of them joined by LVM is
safe? From a resiliency standpoint that setup is like insanity squared.

Considering that your expenses for redundancy are 8 disks at the moment, you
could go with 3x15-disk RAID6 with 2 shared hotspares, making overall
redundancy expense the same 8 disks -- but for a massively safer setup.

Also don't plan on having anything survive the failure of any of the joined
arrays (expecting to recover data from an FS which suddenly lost a third of
itself should never be part of any plan), and for that reason there is no point
in using LVM concatenation; you might just as well join them using mdadm RAID0
and at least gain the improved linear performance.

-- 
With respect,
Roman



* Re: Inactive arrays
From: Daniel Sanabria @ 2016-09-14 18:16 UTC (permalink / raw)
  To: Chris Murphy; +Cc: Linux-RAID
In-Reply-To: <CAJCQCtSsC3=hh60PudHqziubnv_KVFevbCgV3MNZKnL44Fow=Q@mail.gmail.com>

BRAVO!!!!!!

Thanks a million Chris! After following your advice on recovering the
MBR and the GPT the arrays re-assembled automatically and all data is
there.

I already changed the type to make it consistent (FD00 on both
partitions) and working on setting up the timeouts to 180 at boot
time. Other than replacing the green drives with something more
suitable (any suggestions are welcome), what else would you suggest to
change to make the setup a bit more consistent and upgrade proof (i.e.
having different metadata versions doesn't look right to me)?

I'd also like to thank Wol and Adam for their help and for keeping the
thread alive.

Thanks again and again,

This is the current status:

[root@lamachine ~]# mdadm -D /dev/md*
/dev/md126:
        Version : 0.90
  Creation Time : Thu Dec  3 22:12:12 2009
     Raid Level : raid10
     Array Size : 30719936 (29.30 GiB 31.46 GB)
  Used Dev Size : 30719936 (29.30 GiB 31.46 GB)
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 126
    Persistence : Superblock is persistent

    Update Time : Wed Sep 14 19:02:55 2016
          State : clean
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0

         Layout : near=2
     Chunk Size : 64K

           UUID : 9af006ca:8845bbd3:bfe78010:bc810f04
         Events : 0.264152

    Number   Major   Minor   RaidDevice State
       0       8       82        0      active sync set-A   /dev/sdf2
       1       8        1        1      active sync set-B   /dev/sda1
/dev/md127:
        Version : 1.2
  Creation Time : Tue Jul 26 19:00:28 2011
     Raid Level : raid0
     Array Size : 94367232 (90.00 GiB 96.63 GB)
   Raid Devices : 3
  Total Devices : 3
    Persistence : Superblock is persistent

    Update Time : Tue Jul 26 19:00:28 2011
          State : clean
 Active Devices : 3
Working Devices : 3
 Failed Devices : 0
  Spare Devices : 0

     Chunk Size : 512K

           Name : reading.homeunix.com:3
           UUID : acd5374f:72628c93:6a906c4b:5f675ce5
         Events : 0

    Number   Major   Minor   RaidDevice State
       0       8       85        0      active sync   /dev/sdf5
       1       8       21        1      active sync   /dev/sdb5
       2       8        5        2      active sync   /dev/sda5
/dev/md128:
        Version : 1.2
  Creation Time : Fri Oct 24 15:24:38 2014
     Raid Level : raid5
     Array Size : 4294705152 (4095.75 GiB 4397.78 GB)
  Used Dev Size : 2147352576 (2047.88 GiB 2198.89 GB)
   Raid Devices : 3
  Total Devices : 3
    Persistence : Superblock is persistent

  Intent Bitmap : Internal

    Update Time : Wed Sep 14 18:46:47 2016
          State : clean
 Active Devices : 3
Working Devices : 3
 Failed Devices : 0
  Spare Devices : 0

         Layout : left-symmetric
     Chunk Size : 512K

           Name : lamachine:128  (local to host lamachine)
           UUID : f2372cb9:d3816fd6:ce86d826:882ec82e
         Events : 4154

    Number   Major   Minor   RaidDevice State
       0       8       65        0      active sync   /dev/sde1
       1       8       33        1      active sync   /dev/sdc1
       3       8       49        2      active sync   /dev/sdd1
/dev/md129:
        Version : 1.2
  Creation Time : Mon Nov 10 16:28:11 2014
     Raid Level : raid0
     Array Size : 1572470784 (1499.63 GiB 1610.21 GB)
   Raid Devices : 3
  Total Devices : 3
    Persistence : Superblock is persistent

    Update Time : Mon Nov 10 16:28:11 2014
          State : clean
 Active Devices : 3
Working Devices : 3
 Failed Devices : 0
  Spare Devices : 0

     Chunk Size : 512K

           Name : lamachine:129  (local to host lamachine)
           UUID : 895dae98:d1a496de:4f590b8b:cb8ac12a
         Events : 0

    Number   Major   Minor   RaidDevice State
       0       8       66        0      active sync   /dev/sde2
       1       8       34        1      active sync   /dev/sdc2
       2       8       50        2      active sync   /dev/sdd2
/dev/md2:
        Version : 0.90
  Creation Time : Mon Feb 11 07:54:36 2013
     Raid Level : raid5
     Array Size : 511999872 (488.28 GiB 524.29 GB)
  Used Dev Size : 255999936 (244.14 GiB 262.14 GB)
   Raid Devices : 3
  Total Devices : 3
Preferred Minor : 2
    Persistence : Superblock is persistent

    Update Time : Wed Sep 14 18:48:51 2016
          State : clean
 Active Devices : 3
Working Devices : 3
 Failed Devices : 0
  Spare Devices : 0

         Layout : left-symmetric
     Chunk Size : 64K

           UUID : 2cff15d1:e411447b:fd5d4721:03e44022 (local to host lamachine)
         Events : 0.611

    Number   Major   Minor   RaidDevice State
       0       8       83        0      active sync   /dev/sdf3
       1       8       18        1      active sync   /dev/sdb2
       2       8        2        2      active sync   /dev/sda2
[root@lamachine ~]#
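
As a sanity check on the listing above, the reported Array Size values for the RAID5 and RAID10 arrays follow from Used Dev Size (both in KiB). The sketch below uses the usual capacity rules, one member's worth of parity for RAID5 and division by the number of copies for a RAID10 "near" layout; the function names are invented here and rounding to chunk boundaries is ignored.

```c
#include <assert.h>

/* RAID5: one member's worth of space goes to parity. */
static unsigned long long raid5_array_kib(unsigned int raid_devices,
					  unsigned long long used_dev_kib)
{
	assert(raid_devices >= 2);
	return (unsigned long long)(raid_devices - 1) * used_dev_kib;
}

/* RAID10 "near" layout: every block is stored `copies` times. */
static unsigned long long raid10_near_array_kib(unsigned int raid_devices,
						unsigned int copies,
						unsigned long long used_dev_kib)
{
	assert(copies >= 1 && raid_devices >= copies);
	return raid_devices * used_dev_kib / copies;
}
```

For md128 this gives 2 x 2147352576 = 4294705152, for md2 2 x 255999936 = 511999872, and for the two-device near=2 md126 the array size equals the device size, 30719936, all matching the output.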

[root@lamachine ~]# mdadm -E /dev/sd*
/dev/sda:
   MBR Magic : aa55
Partition[0] :     61440000 sectors at           63 (type fd)
Partition[1] :    512000000 sectors at     61440063 (type fd)
Partition[2] :    403328002 sectors at    573440063 (type 05)
/dev/sda1:
          Magic : a92b4efc
        Version : 0.90.00
           UUID : 9af006ca:8845bbd3:bfe78010:bc810f04
  Creation Time : Thu Dec  3 22:12:12 2009
     Raid Level : raid10
  Used Dev Size : 30719936 (29.30 GiB 31.46 GB)
     Array Size : 30719936 (29.30 GiB 31.46 GB)
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 126

    Update Time : Wed Sep 14 19:07:26 2016
          State : clean
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0
       Checksum : ed993c29 - correct
         Events : 264152

         Layout : near=2
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     1       8        1        1      active sync   /dev/sda1

   0     0       8       82        0      active sync   /dev/sdf2
   1     1       8        1        1      active sync   /dev/sda1
/dev/sda2:
          Magic : a92b4efc
        Version : 0.90.00
           UUID : 2cff15d1:e411447b:fd5d4721:03e44022 (local to host lamachine)
  Creation Time : Mon Feb 11 07:54:36 2013
     Raid Level : raid5
  Used Dev Size : 255999936 (244.14 GiB 262.14 GB)
     Array Size : 511999872 (488.28 GiB 524.29 GB)
   Raid Devices : 3
  Total Devices : 3
Preferred Minor : 2

    Update Time : Wed Sep 14 18:48:51 2016
          State : clean
 Active Devices : 3
Working Devices : 3
 Failed Devices : 0
  Spare Devices : 0
       Checksum : 73b2a76e - correct
         Events : 611

         Layout : left-symmetric
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     2       8        2        2      active sync   /dev/sda2

   0     0       8       83        0      active sync   /dev/sdf3
   1     1       8       18        1      active sync   /dev/sdb2
   2     2       8        2        2      active sync   /dev/sda2
/dev/sda3:
   MBR Magic : aa55
Partition[0] :     62910589 sectors at           63 (type 83)
Partition[1] :      7116795 sectors at     82445692 (type 05)
/dev/sda5:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : acd5374f:72628c93:6a906c4b:5f675ce5
           Name : reading.homeunix.com:3
  Creation Time : Tue Jul 26 19:00:28 2011
     Raid Level : raid0
   Raid Devices : 3

 Avail Dev Size : 62908541 (30.00 GiB 32.21 GB)
    Data Offset : 2048 sectors
   Super Offset : 8 sectors
   Unused Space : before=1968 sectors, after=0 sectors
          State : clean
    Device UUID : a0efc1b3:94cc6eb8:deea76ca:772b2d2d

    Update Time : Tue Jul 26 19:00:28 2011
       Checksum : 9eba9119 - correct
         Events : 0

     Chunk Size : 512K

   Device Role : Active device 2
   Array State : AAA ('A' == active, '.' == missing, 'R' == replacing)
mdadm: No md superblock detected on /dev/sda6.
/dev/sdb:
   MBR Magic : aa55
Partition[1] :    512000000 sectors at       409663 (type fd)
Partition[2] :     16384000 sectors at    512409663 (type 82)
Partition[3] :    447974402 sectors at    528793663 (type 05)
/dev/sdb2:
          Magic : a92b4efc
        Version : 0.90.00
           UUID : 2cff15d1:e411447b:fd5d4721:03e44022 (local to host lamachine)
  Creation Time : Mon Feb 11 07:54:36 2013
     Raid Level : raid5
  Used Dev Size : 255999936 (244.14 GiB 262.14 GB)
     Array Size : 511999872 (488.28 GiB 524.29 GB)
   Raid Devices : 3
  Total Devices : 3
Preferred Minor : 2

    Update Time : Wed Sep 14 18:48:51 2016
          State : clean
 Active Devices : 3
Working Devices : 3
 Failed Devices : 0
  Spare Devices : 0
       Checksum : 73b2a77c - correct
         Events : 611

         Layout : left-symmetric
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     1       8       18        1      active sync   /dev/sdb2

   0     0       8       83        0      active sync   /dev/sdf3
   1     1       8       18        1      active sync   /dev/sdb2
   2     2       8        2        2      active sync   /dev/sda2
mdadm: No md superblock detected on /dev/sdb3.
/dev/sdb4:
   MBR Magic : aa55
Partition[0] :     62912354 sectors at           63 (type 83)
Partition[1] :      7116795 sectors at     82447457 (type 05)
/dev/sdb5:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : acd5374f:72628c93:6a906c4b:5f675ce5
           Name : reading.homeunix.com:3
  Creation Time : Tue Jul 26 19:00:28 2011
     Raid Level : raid0
   Raid Devices : 3

 Avail Dev Size : 62910306 (30.00 GiB 32.21 GB)
    Data Offset : 2048 sectors
   Super Offset : 8 sectors
   Unused Space : before=1968 sectors, after=0 sectors
          State : clean
    Device UUID : 152d0202:64efb3e7:f23658c3:82a239a1

    Update Time : Tue Jul 26 19:00:28 2011
       Checksum : 892dbb61 - correct
         Events : 0

     Chunk Size : 512K

   Device Role : Active device 1
   Array State : AAA ('A' == active, '.' == missing, 'R' == replacing)
mdadm: No md superblock detected on /dev/sdb6.
/dev/sdc:
   MBR Magic : aa55
Partition[0] :   4294967295 sectors at            1 (type ee)
/dev/sdc1:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x1
     Array UUID : f2372cb9:d3816fd6:ce86d826:882ec82e
           Name : lamachine:128  (local to host lamachine)
  Creation Time : Fri Oct 24 15:24:38 2014
     Raid Level : raid5
   Raid Devices : 3

 Avail Dev Size : 4294705152 (2047.88 GiB 2198.89 GB)
     Array Size : 4294705152 (4095.75 GiB 4397.78 GB)
    Data Offset : 262144 sectors
   Super Offset : 8 sectors
   Unused Space : before=262056 sectors, after=0 sectors
          State : clean
    Device UUID : 8b1bac5c:6c2cb5a4:bff59099:986b26cd

Internal Bitmap : 8 sectors from superblock
    Update Time : Wed Sep 14 18:46:47 2016
  Bad Block Log : 512 entries available at offset 72 sectors
       Checksum : a4766ee5 - correct
         Events : 4154

         Layout : left-symmetric
     Chunk Size : 512K

   Device Role : Active device 1
   Array State : AAA ('A' == active, '.' == missing, 'R' == replacing)
/dev/sdc2:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : 895dae98:d1a496de:4f590b8b:cb8ac12a
           Name : lamachine:129  (local to host lamachine)
  Creation Time : Mon Nov 10 16:28:11 2014
     Raid Level : raid0
   Raid Devices : 3

 Avail Dev Size : 1048313856 (499.88 GiB 536.74 GB)
    Data Offset : 262144 sectors
   Super Offset : 8 sectors
   Unused Space : before=262056 sectors, after=0 sectors
          State : clean
    Device UUID : 8bc41f51:9af76d31:36349135:2d004cb3

    Update Time : Mon Nov 10 16:28:11 2014
  Bad Block Log : 512 entries available at offset 72 sectors
       Checksum : 2af9fe79 - correct
         Events : 0

     Chunk Size : 512K

   Device Role : Active device 1
   Array State : AAA ('A' == active, '.' == missing, 'R' == replacing)
/dev/sdd:
   MBR Magic : aa55
Partition[0] :   4294967295 sectors at            1 (type ee)
/dev/sdd1:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x1
     Array UUID : f2372cb9:d3816fd6:ce86d826:882ec82e
           Name : lamachine:128  (local to host lamachine)
  Creation Time : Fri Oct 24 15:24:38 2014
     Raid Level : raid5
   Raid Devices : 3

 Avail Dev Size : 4294705152 (2047.88 GiB 2198.89 GB)
     Array Size : 4294705152 (4095.75 GiB 4397.78 GB)
    Data Offset : 262144 sectors
   Super Offset : 8 sectors
   Unused Space : before=262056 sectors, after=0 sectors
          State : clean
    Device UUID : 1f652d4f:92fccd8e:b439abf2:76b881e1

Internal Bitmap : 8 sectors from superblock
    Update Time : Wed Sep 14 18:46:47 2016
  Bad Block Log : 512 entries available at offset 72 sectors
       Checksum : ee861974 - correct
         Events : 4154

         Layout : left-symmetric
     Chunk Size : 512K

   Device Role : Active device 2
   Array State : AAA ('A' == active, '.' == missing, 'R' == replacing)
/dev/sdd2:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : 895dae98:d1a496de:4f590b8b:cb8ac12a
           Name : lamachine:129  (local to host lamachine)
  Creation Time : Mon Nov 10 16:28:11 2014
     Raid Level : raid0
   Raid Devices : 3

 Avail Dev Size : 1048313856 (499.88 GiB 536.74 GB)
    Data Offset : 262144 sectors
   Super Offset : 8 sectors
   Unused Space : before=262056 sectors, after=0 sectors
          State : clean
    Device UUID : 562dd382:5ccc00aa:449ea7e4:d8b266c2

    Update Time : Mon Nov 10 16:28:11 2014
  Bad Block Log : 512 entries available at offset 72 sectors
       Checksum : 937158c1 - correct
         Events : 0

     Chunk Size : 512K

   Device Role : Active device 2
   Array State : AAA ('A' == active, '.' == missing, 'R' == replacing)
/dev/sde:
   MBR Magic : aa55
Partition[0] :   4294967295 sectors at            1 (type ee)
/dev/sde1:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x1
     Array UUID : f2372cb9:d3816fd6:ce86d826:882ec82e
           Name : lamachine:128  (local to host lamachine)
  Creation Time : Fri Oct 24 15:24:38 2014
     Raid Level : raid5
   Raid Devices : 3

 Avail Dev Size : 4294705152 (2047.88 GiB 2198.89 GB)
     Array Size : 4294705152 (4095.75 GiB 4397.78 GB)
    Data Offset : 262144 sectors
   Super Offset : 8 sectors
   Unused Space : before=262056 sectors, after=0 sectors
          State : clean
    Device UUID : 95334f42:beba0d90:8a0854f4:7dfdbd31

Internal Bitmap : 8 sectors from superblock
    Update Time : Wed Sep 14 18:46:47 2016
  Bad Block Log : 512 entries available at offset 72 sectors
       Checksum : 34ccb9f0 - correct
         Events : 4154

         Layout : left-symmetric
     Chunk Size : 512K

   Device Role : Active device 0
   Array State : AAA ('A' == active, '.' == missing, 'R' == replacing)
/dev/sde2:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : 895dae98:d1a496de:4f590b8b:cb8ac12a
           Name : lamachine:129  (local to host lamachine)
  Creation Time : Mon Nov 10 16:28:11 2014
     Raid Level : raid0
   Raid Devices : 3

 Avail Dev Size : 1048313856 (499.88 GiB 536.74 GB)
    Data Offset : 262144 sectors
   Super Offset : 8 sectors
   Unused Space : before=262056 sectors, after=0 sectors
          State : clean
    Device UUID : fd6e6f59:89dad658:e361db17:7c15a63f

    Update Time : Mon Nov 10 16:28:11 2014
  Bad Block Log : 512 entries available at offset 72 sectors
       Checksum : c956ced4 - correct
         Events : 0

     Chunk Size : 512K

   Device Role : Active device 0
   Array State : AAA ('A' == active, '.' == missing, 'R' == replacing)
/dev/sdf:
   MBR Magic : aa55
Partition[0] :       407552 sectors at         2048 (type 83)
Partition[1] :     61440000 sectors at       409663 (type fd)
Partition[2] :    512000000 sectors at     61849663 (type fd)
Partition[3] :    402918402 sectors at    573849663 (type 05)
mdadm: No md superblock detected on /dev/sdf1.
/dev/sdf2:
          Magic : a92b4efc
        Version : 0.90.00
           UUID : 9af006ca:8845bbd3:bfe78010:bc810f04
  Creation Time : Thu Dec  3 22:12:12 2009
     Raid Level : raid10
  Used Dev Size : 30719936 (29.30 GiB 31.46 GB)
     Array Size : 30719936 (29.30 GiB 31.46 GB)
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 126

    Update Time : Wed Sep 14 19:07:26 2016
          State : clean
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0
       Checksum : ed993c78 - correct
         Events : 264152

         Layout : near=2
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     0       8       82        0      active sync   /dev/sdf2

   0     0       8       82        0      active sync   /dev/sdf2
   1     1       8        1        1      active sync   /dev/sda1
/dev/sdf3:
          Magic : a92b4efc
        Version : 0.90.00
           UUID : 2cff15d1:e411447b:fd5d4721:03e44022 (local to host lamachine)
  Creation Time : Mon Feb 11 07:54:36 2013
     Raid Level : raid5
  Used Dev Size : 255999936 (244.14 GiB 262.14 GB)
     Array Size : 511999872 (488.28 GiB 524.29 GB)
   Raid Devices : 3
  Total Devices : 3
Preferred Minor : 2

    Update Time : Wed Sep 14 18:48:51 2016
          State : clean
 Active Devices : 3
Working Devices : 3
 Failed Devices : 0
  Spare Devices : 0
       Checksum : 73b2a7bb - correct
         Events : 611

         Layout : left-symmetric
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     0       8       83        0      active sync   /dev/sdf3

   0     0       8       83        0      active sync   /dev/sdf3
   1     1       8       18        1      active sync   /dev/sdb2
   2     2       8        2        2      active sync   /dev/sda2
/dev/sdf4:
   MBR Magic : aa55
Partition[0] :     62918679 sectors at           63 (type 83)
Partition[1] :      7116795 sectors at     82453782 (type 05)
/dev/sdf5:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : acd5374f:72628c93:6a906c4b:5f675ce5
           Name : reading.homeunix.com:3
  Creation Time : Tue Jul 26 19:00:28 2011
     Raid Level : raid0
   Raid Devices : 3

 Avail Dev Size : 62916631 (30.00 GiB 32.21 GB)
    Data Offset : 2048 sectors
   Super Offset : 8 sectors
   Unused Space : before=1968 sectors, after=0 sectors
          State : clean
    Device UUID : 5778cd64:0bbba183:ef3270a8:41f83aca

    Update Time : Tue Jul 26 19:00:28 2011
       Checksum : 96003cba - correct
         Events : 0

     Chunk Size : 512K

   Device Role : Active device 0
   Array State : AAA ('A' == active, '.' == missing, 'R' == replacing)
mdadm: No md superblock detected on /dev/sdf6.
[root@lamachine ~]# cat /etc/mdadm.conf
# mdadm.conf written out by anaconda
MAILADDR root
AUTO +imsm +1.x -all
ARRAY /dev/md2 level=raid5 num-devices=3 UUID=2cff15d1:e411447b:fd5d4721:03e44022
ARRAY /dev/md126 level=raid10 num-devices=2 UUID=9af006ca:8845bbd3:bfe78010:bc810f04
ARRAY /dev/md127 level=raid0 num-devices=3 UUID=acd5374f:72628c93:6a906c4b:5f675ce5
ARRAY /dev/md128 metadata=1.2 spares=1 name=lamachine:128 UUID=f2372cb9:d3816fd6:ce86d826:882ec82e
ARRAY /dev/md129 metadata=1.2 name=lamachine:129 UUID=895dae98:d1a496de:4f590b8b:cb8ac12a
[root@lamachine ~]#

On 14 September 2016 at 17:13, Chris Murphy <lists@colorremedies.com> wrote:
> Low priority but you could make the type codes consistent for both
> partitions. It doesn't matter if they're 8300 or FD00 on GPT disks,
> there's nothing I know that actually uses this information. FD00 is
> maybe slightly better only in that it'll flag a human to expect that
> there's mdadm metadata on this partition rather than a file system.
>
>
> Chris

^ permalink raw reply

* Re: moving spares into group and checking spares
From: scar @ 2016-09-14 17:52 UTC (permalink / raw)
  To: linux-raid-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <20160914092959.GA3584-oubN3LzF/wf25t9ic+4fgA@public.gmane.org>

Andreas Klauer wrote on 09/14/2016 02:29 AM:
> On Tue, Sep 13, 2016 at 10:18:41PM -0700, scar wrote:
>> i currently have four RAID-5 md arrays which i concatenated into one
>> logical volume (lvm2), essentially creating a RAID-50.  each md array
>> was created with one spare disk.
> That's perfect for switching to RAID-6. More redundancy should be
> more useful than spares that only sync in when you already completely
> lost redundancy ...
>
> And the disks would also be covered by your checks that way ;)

i'm not sure what you're suggesting, that 4x 11+1 RAID5 arrays should be 
changed to 1x 46+2 RAID6 array?  that doesn't seem as safe to me.  and 
checkarray isn't going to check the spare disks just as it's not doing 
now.... also that would require me to backup/restore the data so i can 
create a new array


--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* Re: Inactive arrays
From: Chris Murphy @ 2016-09-14 16:13 UTC (permalink / raw)
  To: Daniel Sanabria; +Cc: Linux-RAID
In-Reply-To: <CAJCQCtQ3kAKd4U=MgVawU9D6QKuxC+XjLYR6EyL87wuyBvvmtg@mail.gmail.com>

Low priority but you could make the type codes consistent for both
partitions. It doesn't matter if they're 8300 or FD00 on GPT disks,
there's nothing I know that actually uses this information. FD00 is
maybe slightly better only in that it'll flag a human to expect that
there's mdadm metadata on this partition rather than a file system.

Chris

^ permalink raw reply

* Re: Inactive arrays
From: Chris Murphy @ 2016-09-14 16:10 UTC (permalink / raw)
  To: Daniel Sanabria; +Cc: Chris Murphy, Wols Lists, Linux-RAID
In-Reply-To: <CAHscji1wVYzpzu8X+GYuuxZPdLzY-f+8oogyH7ttqVwMJe2=iQ@mail.gmail.com>

On Wed, Sep 14, 2016 at 9:47 AM, Daniel Sanabria <sanabria.d@gmail.com> wrote:
> [root@lamachine ~]# gdisk /dev/sdc
>
> GPT fdisk (gdisk) version 1.0.1
>
> Warning! Disk size is smaller than the main header indicates! Loading
> secondary header from the last sector of the disk! You should use 'v' to
> verify disk integrity, and perhaps options on the experts' menu to repair
> the disk.
>
> Caution: invalid backup GPT header, but valid main header; regenerating
> backup header from main header.
>
> Warning! One or more CRCs don't match. You should repair the disk!
>
> Partition table scan:
>   MBR: not present
>   BSD: not present
>   APM: not present
>   GPT: damaged
>
> Found invalid MBR and corrupt GPT. What do you want to do? (Using the
> GPT MAY permit recovery of GPT data.)
>  1 - Use current GPT
>  2 - Create blank GPT
>
> Your answer: 1
>
> Command (? for help): x
>
> Expert command (? for help): v
>
> Caution: The CRC for the backup partition table is invalid. This table may
> be corrupt. This program will automatically create a new backup partition
> table when you save your partitions.
>
> Problem: The secondary header's self-pointer indicates that it doesn't reside
> at the end of the disk. If you've added a disk to a RAID array, use the 'e'
> option on the experts' menu to adjust the secondary header's and partition
> table's locations.
>
> Problem: Disk is too small to hold all the data!
> (Disk size is 5860531055 sectors, needs to be 5860533168 sectors.)
>
> The 'e' option on the experts' menu may fix this problem.
>
> Problem: GPT claims the disk is larger than it is! (Claimed last usable
> sector is 5860533134, but backup header is at
> 5860533167 and disk size is 5860531055 sectors.
> The 'e' option on the experts' menu will probably fix this problem
>
> Identified 4 problems!
>
> Expert command (? for help): p
> Disk /dev/sdc: 5860531055 sectors, 2.7 TiB
> Logical sector size: 512 bytes
> Disk identifier (GUID): 6DB70F4E-D8ED-4290-AA2E-4E81D8324992
> Partition table holds up to 128 entries
> First usable sector is 2048, last usable sector is 5860533134
> Partitions will be aligned on 2048-sector boundaries
> Total free space is 516987791 sectors (246.5 GiB)
>
> Number  Start (sector)    End (sector)  Size       Code  Name
>    1            2048      4294969343   2.0 TiB     FD00
>    2      4294969344      5343545343   500.0 GiB   8300
>
> Expert command (? for help):

Use the e command. That should fix the 3 main problems, and then a new
CRC is automatically computed for the two headers and two tables at
write time.


-- 
Chris Murphy
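
The numbers in the gdisk report above are consistent with the usual GPT
layout, which can be checked with a little arithmetic. The following is a
minimal C sketch, not gdisk's own code. The function names are hypothetical,
and it assumes the defaults shown in the output (128 partition entries,
512-byte logical sectors) plus the standard 128-byte entry size, so the
entry array takes 32 sectors.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed defaults: 128 entries * 128 bytes / 512-byte sectors = 32 sectors. */
enum { GPT_ENTRY_SECTORS = 128 * 128 / 512 };

/* The backup GPT header lives in the very last sector of the disk. */
uint64_t gpt_backup_header_lba(uint64_t disk_sectors)
{
	/* Sanity check: room for both headers and both entry arrays. */
	assert(disk_sectors > 2 * (GPT_ENTRY_SECTORS + 2));
	return disk_sectors - 1;
}

/* The backup entry array sits directly before the backup header, so the
 * last usable sector is one below the start of that array. */
uint64_t gpt_last_usable_lba(uint64_t disk_sectors)
{
	return gpt_backup_header_lba(disk_sectors) - GPT_ENTRY_SECTORS - 1;
}

/* Disk size (in sectors) that a header implies via its last usable LBA. */
uint64_t gpt_implied_disk_sectors(uint64_t last_usable_lba)
{
	return last_usable_lba + GPT_ENTRY_SECTORS + 2;
}
```

Under these assumptions, gpt_implied_disk_sectors(5860533134) gives the
5860533168 sectors gdisk says are needed, while the real 5860531055-sector
disk yields a last usable sector of 5860531021. That is still well past the
end of partition 2 at 5343545343, so relocating the backup structures
should not need to touch either partition.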

^ permalink raw reply

* Re: Inactive arrays
From: Daniel Sanabria @ 2016-09-14 15:47 UTC (permalink / raw)
  To: Chris Murphy; +Cc: Wols Lists, Linux-RAID
In-Reply-To: <CAJCQCtQXd6GgXhu-7h9nNjA1zfdoH_TAEkeQajYv5vsmuG2Vpw@mail.gmail.com>

[root@lamachine ~]# gdisk /dev/sdc

GPT fdisk (gdisk) version 1.0.1

Warning! Disk size is smaller than the main header indicates! Loading
secondary header from the last sector of the disk! You should use 'v' to
verify disk integrity, and perhaps options on the experts' menu to repair
the disk.

Caution: invalid backup GPT header, but valid main header; regenerating
backup header from main header.

Warning! One or more CRCs don't match. You should repair the disk!

Partition table scan:
  MBR: not present
  BSD: not present
  APM: not present
  GPT: damaged

Found invalid MBR and corrupt GPT. What do you want to do? (Using the
GPT MAY permit recovery of GPT data.)
 1 - Use current GPT
 2 - Create blank GPT

Your answer: 1

Command (? for help): x

Expert command (? for help): v

Caution: The CRC for the backup partition table is invalid. This table may
be corrupt. This program will automatically create a new backup partition
table when you save your partitions.

Problem: The secondary header's self-pointer indicates that it doesn't reside
at the end of the disk. If you've added a disk to a RAID array, use the 'e'
option on the experts' menu to adjust the secondary header's and partition
table's locations.

Problem: Disk is too small to hold all the data!
(Disk size is 5860531055 sectors, needs to be 5860533168 sectors.)

The 'e' option on the experts' menu may fix this problem.

Problem: GPT claims the disk is larger than it is! (Claimed last usable
sector is 5860533134, but backup header is at
5860533167 and disk size is 5860531055 sectors.
The 'e' option on the experts' menu will probably fix this problem

Identified 4 problems!

Expert command (? for help): p
Disk /dev/sdc: 5860531055 sectors, 2.7 TiB
Logical sector size: 512 bytes
Disk identifier (GUID): 6DB70F4E-D8ED-4290-AA2E-4E81D8324992
Partition table holds up to 128 entries
First usable sector is 2048, last usable sector is 5860533134
Partitions will be aligned on 2048-sector boundaries
Total free space is 516987791 sectors (246.5 GiB)

Number  Start (sector)    End (sector)  Size       Code  Name
   1            2048      4294969343   2.0 TiB     FD00
   2      4294969344      5343545343   500.0 GiB   8300

Expert command (? for help):

^ permalink raw reply

* Re: Inactive arrays
From: Chris Murphy @ 2016-09-14 15:15 UTC (permalink / raw)
  To: Daniel Sanabria; +Cc: Chris Murphy, Wols Lists, Linux-RAID
In-Reply-To: <CAHscji2dhdE7mL-4i8mE6sShQtGCB6v1K02TBQQD4-6R-ZtoZw@mail.gmail.com>

On Wed, Sep 14, 2016 at 8:57 AM, Daniel Sanabria <sanabria.d@gmail.com> wrote:
>> So yes you can just run gdisk, go to expert menu with x, then verify
>> with v, and print the table with p, and post all of that and we'll
>> confirm before you use w to write out the fixed table.
>
> choices now:
>
> [root@lamachine ~]# gdisk /dev/sdc
>
> GPT fdisk (gdisk) version 1.0.1
>
> Warning! Disk size is smaller than the main header indicates! Loading
> secondary header from the last sector of the disk! You should use 'v' to
> verify disk integrity, and perhaps options on the experts' menu to repair
> the disk.
>
> Caution: invalid backup GPT header, but valid main header; regenerating
> backup header from main header.
>
>
> Warning! One or more CRCs don't match. You should repair the disk!
>
> Partition table scan:
>
>   MBR: not present
>   BSD: not present
>   APM: not present
>   GPT: damaged
>
>
> Found invalid MBR and corrupt GPT. What do you want to do? (Using the
> GPT MAY permit recovery of GPT data.)
>  1 - Use current GPT
>  2 - Create blank GPT
>
>
> Your answer:

1. Use current.

Then x, v, p and post the output.



-- 
Chris Murphy

^ permalink raw reply

* Re: Inactive arrays
From: Daniel Sanabria @ 2016-09-14 14:57 UTC (permalink / raw)
  To: Chris Murphy; +Cc: Wols Lists, Linux-RAID
In-Reply-To: <CAJCQCtSTaByUR_2dnGrA0y0dtTaS_sweLKn19QD1=wv+2mnvHg@mail.gmail.com>

> So yes you can just run gdisk, go to expert menu with x, then verify
> with v, and print the table with p, and post all of that and we'll
> confirm before you use w to write out the fixed table.

choices now:

[root@lamachine ~]# gdisk /dev/sdc

GPT fdisk (gdisk) version 1.0.1

Warning! Disk size is smaller than the main header indicates! Loading
secondary header from the last sector of the disk! You should use 'v' to
verify disk integrity, and perhaps options on the experts' menu to repair
the disk.

Caution: invalid backup GPT header, but valid main header; regenerating
backup header from main header.


Warning! One or more CRCs don't match. You should repair the disk!

Partition table scan:

  MBR: not present
  BSD: not present
  APM: not present
  GPT: damaged


Found invalid MBR and corrupt GPT. What do you want to do? (Using the
GPT MAY permit recovery of GPT data.)
 1 - Use current GPT
 2 - Create blank GPT


Your answer:



On 14 September 2016 at 15:32, Chris Murphy <lists@colorremedies.com> wrote:
> On Wed, Sep 14, 2016 at 4:36 AM, Daniel Sanabria <sanabria.d@gmail.com> wrote
>
>> it didn't give me any indication that it was in no-act mode but I
>> decided to carry on with the backup flag and got this:
>>
>> [root@lamachine ~]# wipefs -a -b /dev/sdc
>>
>> wipefs: invalid option -- 'b'
>
> Huh, must be a bug. Works for me with util-linux-2.28.1-1.fc24.x86_64
> and I also get files in the user home.
>
> -rw-------. 1 root root     2 Sep 13 22:13 wipefs-sdb-0x000001fe.bak
> -rw-------. 1 root root     8 Sep 13 22:13 wipefs-sdb-0x00000200.bak
> -rw-------. 1 root root     8 Sep 13 22:13 wipefs-sdb-0x3ba7ffe00.bak
>
> Those are backups for the PMBR, primary GPT and backup GPT.
>
>
>
>> before proceeding with manually creating both partitions could you
>> confirm the above is kind of expected?
>
> I expected that it would find the GPT's also and erase their signature
> and write out backup files. Oh well.
>
>
>
>>> Chances are you could just use gdisk to verify and fix the primary and
>>> backup GPTs on sdc and sde
>>
>> Will the fix be offered as part of the verify command in gdisk (command: v)?
>
> It actually does the fix in memory as soon as you run the command, and
> v just elaborates on any problems that don't make sense like
> overlapping partitions etc.
>
> So yes you can just run gdisk, go to expert menu with x, then verify
> with v, and print the table with p, and post all of that and we'll
> confirm before you use w to write out the fixed table.
>
>
>
>>
>>> When it's all done and working with the new MBR you can either leave
>>> it alone, or you can run gdisk on it and it will immediately convert
>>> it (in memory) and you can commit it to disk with the w command to go
>>> back to GPT
>>
>> so just running the w command after running the verify command while
>> on the same session ?
>
> Yes but let's just skip the fdisk stuff. There is already a primary
> and backup GPT, and you even have a backup of the backup with the
> good one on sdd that confirms all of the numbers. So, I think it's safe
> to just move forward with the repair using gdisk. But post the verify
> and print command output first, before writing out the change.
>
>
> --
> Chris Murphy

^ permalink raw reply

* Re: Inactive arrays
From: Chris Murphy @ 2016-09-14 14:32 UTC (permalink / raw)
  To: Daniel Sanabria; +Cc: Chris Murphy, Wols Lists, Linux-RAID
In-Reply-To: <CAHscji1fgVrKwOwf+sCfOrg-MNz0MwRzXAJswKcP=vP2x0PVug@mail.gmail.com>

On Wed, Sep 14, 2016 at 4:36 AM, Daniel Sanabria <sanabria.d@gmail.com> wrote

> it didn't give me any indication that it was in no-act mode but I
> decided to carry on with the backup flag and got this:
>
> [root@lamachine ~]# wipefs -a -b /dev/sdc
>
> wipefs: invalid option -- 'b'

Huh, must be a bug. Works for me with util-linux-2.28.1-1.fc24.x86_64
and I also get files in the user home.

-rw-------. 1 root root     2 Sep 13 22:13 wipefs-sdb-0x000001fe.bak
-rw-------. 1 root root     8 Sep 13 22:13 wipefs-sdb-0x00000200.bak
-rw-------. 1 root root     8 Sep 13 22:13 wipefs-sdb-0x3ba7ffe00.bak

Those are backups for the PMBR, primary GPT and backup GPT.
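
The hex offsets in those file names line up with where each signature sits
on disk. A minimal C sketch of that mapping follows, with hypothetical
function names and assuming 512-byte sectors. The 31277056-sector disk size
mentioned afterwards is not stated in the thread; it is only inferred from
the last file name.

```c
#include <assert.h>
#include <stdint.h>

/* The 0x55 0xaa boot signature occupies the last two bytes of sector 0,
 * which is where the protective MBR (PMBR) lives. */
uint64_t pmbr_signature_offset(void)
{
	return 510;
}

/* The primary GPT header starts at LBA 1 with the 8-byte "EFI PART" magic. */
uint64_t primary_gpt_header_offset(uint32_t sector_size)
{
	assert(sector_size > 0);
	return (uint64_t)sector_size;
}

/* The backup GPT header starts at the last LBA of the disk. */
uint64_t backup_gpt_header_offset(uint64_t disk_sectors, uint32_t sector_size)
{
	assert(disk_sectors > 1 && sector_size > 0);
	return (disk_sectors - 1) * (uint64_t)sector_size;
}
```

With 512-byte sectors the first two functions return 0x1fe and 0x200, and
backup_gpt_header_offset(31277056, 512) returns 0x3ba7ffe00, matching the
three backup files. The 2-byte and 8-byte file sizes match the lengths of
the boot signature and the "EFI PART" magic.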

> before proceeding with manually creating both partitions could you
> confirm the above is kind of expected?

I expected that it would also find the GPTs, erase their signatures,
and write out backup files. Oh well.

>> Chances are you could just use gdisk to verify and fix the primary and
>> backup GPTs on sdc and sde
>
> Will the fix be offered as part of the verify command in gdisk (command: v)?

It actually does the fix in memory as soon as you run the command, and
v just elaborates on any problems that don't make sense like
overlapping partitions etc.

So yes you can just run gdisk, go to expert menu with x, then verify
with v, and print the table with p, and post all of that and we'll
confirm before you use w to write out the fixed table.

>
>> When it's all done and working with the new MBR you can either leave
>> it alone, or you can run gdisk on it and it will immediately convert
>> it (in memory) and you can commit it to disk with the w command to go
>> back to GPT
>
> so just running the w command after running the verify command while
> on the same session ?

Yes but let's just skip the fdisk stuff. There is already a primary
and backup GPT, and you even have a backup of the backup with the
good one on sdd that confirms all of the numbers. So, I think it's safe
to just move forward with the repair using gdisk. But post the verify
and print command output first, before writing out the change.

-- 
Chris Murphy

^ permalink raw reply

* [PATCH] Fix RAID metadata check
From: Mariusz Dabrowski @ 2016-09-14 10:43 UTC (permalink / raw)
  To: linux-raid
  Cc: Jes.Sorensen, aleksey.obitotskiy, pawel.baldysiak,
	artur.paszkiewicz, maksymilian.kunt, tomasz.majchrzak,
	Mariusz Dabrowski

mdadm recognizes devices with a partition table as part of a RAID array
and displays an invalid warning message. After this fix, proper warning
messages are displayed for MBR/GPT disks and for devices with RAID
metadata.

Signed-off-by: Mariusz Dabrowski <mariusz.dabrowski@intel.com>
---
 util.c | 28 +++++++++++++++++-----------
 1 file changed, 17 insertions(+), 11 deletions(-)

diff --git a/util.c b/util.c
index c38ede7..5c845a0 100644
--- a/util.c
+++ b/util.c
@@ -710,17 +710,23 @@ int check_raid(int fd, char *name)
 
 	if (!st)
 		return 0;
-	st->ss->load_super(st, fd, name);
-	/* Looks like a raid array .. */
-	pr_err("%s appears to be part of a raid array:\n",
-		name);
-	st->ss->getinfo_super(st, &info, NULL);
-	st->ss->free_super(st);
-	crtime = info.array.ctime;
-	level = map_num(pers, info.array.level);
-	if (!level) level = "-unknown-";
-	cont_err("level=%s devices=%d ctime=%s",
-		 level, info.array.raid_disks, ctime(&crtime));
+	if (st->ss->add_to_super != NULL) {
+		st->ss->load_super(st, fd, name);
+		/* Looks like a raid array .. */
+		pr_err("%s appears to be part of a raid array:\n",
+			name);
+		st->ss->getinfo_super(st, &info, NULL);
+		st->ss->free_super(st);
+		crtime = info.array.ctime;
+		level = map_num(pers, info.array.level);
+		if (!level) level = "-unknown-";
+		cont_err("level=%s devices=%d ctime=%s",
+		level, info.array.raid_disks, ctime(&crtime));
+	}
+	else {
+		/* Looks like GPT or MBR */
+		pr_err("partition table exists on %s\n", name);
+	}
 	return 1;
 }
 
-- 
1.8.3.1


^ permalink raw reply related

* Re: Inactive arrays
From: Daniel Sanabria @ 2016-09-14 10:36 UTC (permalink / raw)
  To: Chris Murphy; +Cc: Wols Lists, Linux-RAID
In-Reply-To: <CAJCQCtRdbWg7HgX=wghd8OBUcauiVfES6NSOKOZj38HMWVg65A@mail.gmail.com>

> Pretty weird.  Any ideas how that happened? My guess is sdd was
> partitioned first, and its partition was copied to sdc and sde, and
> the tool blindly did not recompute the last usable sector LBA, it used
> the value from sdd.

I have no solid idea, but my money is on a human screwing up. I have no
clear record of what actions were taken while trying to boot the server
after the move, but I think that is when the disks were probably messed
up.

> sdd1 is 2TB
> sdd2 is 500GB
>
> And it looks like sdc and sde, if we believe the backup GPT, have the
> same exact partition scheme.

yes from the original build that was the idea

> wipefs -a -n /dev/sdc
> wipefs -a -n /dev/sde
>
> So what do you get for that?

[root@lamachine ~]# wipefs -a -n /dev/sdc
/dev/sdc: 2 bytes were erased at offset 0x000001fe (PMBR): 55 aa
/dev/sdc: calling ioctl to re-read partition table: Success

[root@lamachine ~]# wipefs -a -n /dev/sde
/dev/sde: 2 bytes were erased at offset 0x000001fe (PMBR): 55 aa
/dev/sde: calling ioctl to re-read partition table: Success

it didn't give me any indication that it was in no-act mode but I
decided to carry on with the backup flag and got this:

[root@lamachine ~]# wipefs -a -b /dev/sdc

wipefs: invalid option -- 'b'

Usage:

 wipefs [options] <device>

Wipe signatures from a device.

Options:

 -a, --all           wipe all magic strings (BE CAREFUL!)
 -b, --backup        create a signature backup in $HOME
 -f, --force         force erasure
 -h, --help          show this help text
 -n, --no-act        do everything except the actual write() call
 -o, --offset <num>  offset to erase, in bytes
 -p, --parsable      print out in parsable instead of printable format
 -q, --quiet         suppress output messages
 -t, --types <list>  limit the set of filesystem, RAIDs or partition tables
 -V, --version       output version information and exit


For more details see wipefs(8).

[root@lamachine ~]# wipefs -V
wipefs from util-linux 2.27.1

[root@lamachine ~]# wipefs -a --backup /dev/sdc
/dev/sdc: 2 bytes were erased at offset 0x000001fe (PMBR): 55 aa
/dev/sdc: calling ioctl to re-read partition table: Success

[root@lamachine ~]# wipefs -a --backup /dev/sde
/dev/sde: 2 bytes were erased at offset 0x000001fe (PMBR): 55 aa
/dev/sde: calling ioctl to re-read partition table: Success

[root@lamachine ~]#

before proceeding with manually creating both partitions could you
confirm the above is kind of expected?

> Chances are you could just use gdisk to verify and fix the primary and
> backup GPTs on sdc and sde

Will the fix be offered as part of the verify command in gdisk (command: v)?

> When it's all done and working with the new MBR you can either leave
> it alone, or you can run gdisk on it and it will immediately convert
> it (in memory) and you can commit it to disk with the w command to go
> back to GPT

so just running the w command after running the verify command while
on the same session ?

^ permalink raw reply

* Re: moving spares into group and checking spares
From: Andreas Klauer @ 2016-09-14  9:29 UTC (permalink / raw)
  To: scar; +Cc: linux-raid
In-Reply-To: <nramjg$n07$1@blaine.gmane.org>

On Tue, Sep 13, 2016 at 10:18:41PM -0700, scar wrote:
> i currently have four RAID-5 md arrays which i concatenated into one 
> logical volume (lvm2), essentially creating a RAID-50.  each md array 
> was created with one spare disk.

That's perfect for switching to RAID-6. More redundancy should be 
more useful than spares that only sync in when you already completely 
lost redundancy ...

And the disks would also be covered by your checks that way ;)

> i was wondering, then, how i could also check the spare disks to make
> sure they are healthy and ready to be used if needed?

smartmontools, periodic selftests. for all disks, not just the spares.
you can use select,cont tests to check a small region each day, 
covering the entire disk over $X days.

Regards
Andreas Klauer

^ permalink raw reply

* Re: lots of "md: export_rdev(sde)" printed after create IMSM RAID10 with missing
From: Yi Zhang @ 2016-09-14  9:24 UTC (permalink / raw)
  To: Artur Paszkiewicz; +Cc: Shaohua Li, Jes.Sorensen, linux-raid
In-Reply-To: <eae7d252-406c-641f-928e-354b5d6df6c0@intel.com>



On 09/12/2016 06:58 PM, Artur Paszkiewicz wrote:
> On 09/12/2016 10:03 AM, Yi Zhang wrote:
>> Hello Artur
>> With your patch, no "md: export_rdev(sde)" printed after create raid10.
>>
>> I found another problem, not sure whether it is reasonable, could you help confirm it, thanks.
>> When I create one container with 4 disks[1], and create one raid10 with 3 disks(sd[b-d]) + 1 missing [2], but it finally bind the fourth disk: sde [3].
>>
>> [1] mdadm -CR /dev/md0 /dev/sd[b-e] -n4 -e imsm
>> [2] mdadm -CR /dev/md/Volume0 -l10 -n4 /dev/sd[b-d] missing --size=500M
>> [3] # cat /proc/mdstat
>> Personalities : [raid10]
>> md127 : active raid10 sde[4] sdd[2] sdc[1] sdb[0]
>>        1024000 blocks super external:/md0/0 128K chunks 2 near-copies [4/4] [UUUU]
>>
>> md0 : inactive sde[3](S) sdd[2](S) sdc[1](S) sdb[0](S)
>>        4420 blocks super external:imsm
>>
>> unused devices: <none>
> I think that this is correct behavior. Because there is a spare disk
> available in the container, it is used for rebuilding the volume. This
> is equivalent to:
>
> mdadm -CR /dev/md0 /dev/sd[b-d] -n3 -e imsm
> mdadm -CR /dev/md/Volume0 -l10 -n4 /dev/sd[b-d] missing --size=500M
> mdadm -a /dev/md0 /dev/sde
Got it, thanks Artur for the confirmation.

Yi


^ permalink raw reply

* Re: Question about commit f9a67b1182e5 ("md/bitmap: clear bitmap if bitmap_create failed").
From: Guoqing Jiang @ 2016-09-14  8:25 UTC (permalink / raw)
  To: Shaohua Li, Christophe JAILLET; +Cc: linux-raid, linux-kernel
In-Reply-To: <20160913172433.GB24264@kernel.org>

On 09/13/2016 01:24 PM, Shaohua Li wrote:
> On Mon, Sep 12, 2016 at 09:09:48PM +0200, Christophe JAILLET wrote:
>> Hi,
>>
>> I'm puzzled by commit f9a67b1182e5 ("md/bitmap: clear bitmap if
>> bitmap_create failed").
> Hi Christophe,
> Thank you very much to help check this!
>
>> Part of the commit is:
>>
>> @@ -1865,8 +1866,10 @@ int bitmap_copy_from_slot(struct mddev *mddev, int
>> slot,
>>       struct bitmap_counts *counts;
>>       struct bitmap *bitmap = bitmap_create(mddev, slot);
>>
>> -    if (IS_ERR(bitmap))
>> +    if (IS_ERR(bitmap)) {
>> +        bitmap_free(bitmap);
>>           return PTR_ERR(bitmap);
>> +    }
>>
>> but if 'bitmap' is an error, I think that bad things will happen in
>> 'bitmap_free()' when, at the beginning of the function, we will execute:
>>
>>      if (bitmap->sysfs_can_clear) <-----------------
>>          sysfs_put(bitmap->sysfs_can_clear);

I guess it is safe, since the part below is at the beginning of bitmap_free.

         if (!bitmap) /* there was no bitmap */
                 return;

> Add Guoqing.
>
> Yeah, you are right, this bitmap_free isn't required. This must be something
> that slipped in in the v2 patch. I'll delete that line.
>
>> However, the commit log message is really explicit and adding this call to
>> 'bitmap_free' has really been done on purpose. ("If bitmap_create returns
>> an error, we need to call either bitmap_destroy or bitmap_free to do clean
>> up, ...")
> this log is a little confusing, I thought it really means the bitmap_free called
> in bitmap_create. The V1 patch calls bitmap_destroy in bitmap_create.

I double-checked the v1 patch: it called bitmap_destroy once bitmap_create
returned an error inside bitmap_copy_from_slot. Also, bitmap_destroy is not
called in location_store when it fails to create the bitmap.

But since bitmap_free within bitmap_create is used to handle the related
failure, it seems we don't need the patch, and maybe we also don't need the
second line of the comment below (the patch was motivated by that comment,
IIRC).

/*
  * initialize the bitmap structure
  * if this returns an error, bitmap_destroy must be called to do clean up
  */

Thanks,
Guoqing
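
Whether the `if (!bitmap)` check is enough depends on what an error return
from bitmap_create looks like. Below is a minimal userspace sketch, modeled
on the kernel's ERR_PTR/PTR_ERR/IS_ERR helpers rather than copied from them;
safe_to_dereference() is a hypothetical name used only for illustration. It
suggests that an error pointer is not NULL, so a NULL check alone would not
catch it, which fits the conclusion above that the extra bitmap_free call
is better dropped.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-ins modeled on the kernel's include/linux/err.h:
 * small negative errno values are encoded in the top of the address space. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error)
{
	return (void *)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)ptr;
}

static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)(-MAX_ERRNO);
}

/* Hypothetical guard: a cleanup routine handed the result of a failed
 * constructor would have to reject both NULL and encoded errors before
 * touching any member of the object. */
int safe_to_dereference(const void *ptr)
{
	return ptr != NULL && !IS_ERR(ptr);
}
```

For example, ERR_PTR(-ENOMEM) is non-NULL and IS_ERR() reports it as an
error, so a function that begins with only a NULL test would go on to
dereference it.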

^ permalink raw reply
