Linux RAID subsystem development
* Re: Superblock of raid5 log can't be updated when array stoped
From: Shaohua Li @ 2016-10-21 22:43 UTC (permalink / raw)
  To: Zhengyuan Liu; +Cc: shli, Song Liu, linux-raid
In-Reply-To: <tencent_5F7A406104D26FAD7587F1C4@qq.com>

On Tue, Oct 18, 2016 at 01:51:57PM +0800, Zhengyuan Liu wrote:
> If that's the problem, I think there is another problem with next_checkpoint initialization.
> This field is not initialized when we load/recover the log; it is assigned
> only when IO to the raid disks has finished. So r5l_quiesce may use a wrong
> next_checkpoint to reclaim log space, which would confuse log recovery. The patch below
> may help illustrate this.

Good catch. This doesn't confuse log recovery, but it does make the reclaimable space
calculation wrong. Could you please send a formal patch?

Thanks,
Shaohua 
> diff --git a/drivers/md/raid5-cache.c b/drivers/md/raid5-cache.c
> index 1b1ab4a..998ea00 100644
> --- a/drivers/md/raid5-cache.c
> +++ b/drivers/md/raid5-cache.c
> @@ -1096,6 +1096,8 @@ static int r5l_recovery_log(struct r5l_log *log)
>                 log->seq = ctx.seq + 11;
>                 log->log_start = r5l_ring_add(log, ctx.pos, BLOCK_SECTORS);
>                 r5l_write_super(log, ctx.pos);
> +               log->last_checkpoint = ctx.pos;
> +               log->next_checkpoint = ctx.pos;
>         } else {
>                 log->log_start = ctx.pos;
>                 log->seq = ctx.seq;
> @@ -1168,6 +1170,7 @@ create:
>         if (log->max_free_space > RECLAIM_MAX_FREE_SPACE)
>                 log->max_free_space = RECLAIM_MAX_FREE_SPACE;
>         log->last_checkpoint = cp;
> +       log->next_checkpoint = cp;
>  
>         __free_page(page);
> 
> Thanks,
> --Zhengyuan
> 
> ------------------ Original ------------------
> From:  "Shaohua Li"<shli@kernel.org>;
> Date:  Tue, Oct 18, 2016 08:28 AM
> To:  "Zhengyuan Liu"<liuzhengyuan@kylinos.cn>;
> Cc:  "shli"<shli@fb.com>; "Song Liu"<songliubraving@fb.com>; "linux-raid"<linux-raid@vger.kernel.org>;
> Subject:  Re: Superblock of raid5 log can't be updated when array stoped
>  
> On Sat, Oct 15, 2016 at 10:19:36AM +0800, liuzhengyuan wrote:
> > Hi, Shaohua.
> > 
> > When we stop a raid5 array with "mdadm -S" or reboot the system, the md module will
> > call raid5_quiesce and r5l_quiesce to do some cleanup work. Part of r5l_quiesce
> > is pasted below.
> > 
> >                 /* make sure r5l_write_super_and_discard_space exits */
> >                 mddev = log->rdev->mddev;
> >                 wake_up(&mddev->sb_wait);
> >                 r5l_wake_reclaim(log, -1L);
> >                 md_unregister_thread(&log->reclaim_thread);
> >                 r5l_do_reclaim(log);
> > +                md_update_sb(mddev, 1);
> > 
> > It will reclaim all used log space and call r5l_write_super to reset rdev->journal_tail
> > to log->next_checkpoint. However, the new rdev->journal_tail may never be written to the
> > journal device, either because the journal device does not support discard
> > or because mddev_trylock fails (shouldn't this trylock always fail, since raid5_quiesce
> > is called with reconfig_mutex held?). As a result, it will take a long time to
> > recover the log when the array is restarted. Should r5l_quiesce call md_update_sb
> > directly to guarantee the superblock update?
> 
> Yep, that's the problem here. Unfortunately we can't call md_update_sb here,
> because we might not hold the mddev lock. I think we should call it at do_md_stop.
> 
> Thanks,
> Shaohua
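
The effect of an unset next_checkpoint on the reclaimable-space calculation can be sketched in userspace. This is a simplified model, not the kernel code: struct log_model and both helper names are hypothetical stand-ins, and the only assumption carried over from the thread is that reclaimable space is the ring distance from last_checkpoint to next_checkpoint.

```c
#include <stdint.h>

/* Hypothetical, simplified stand-in for the kernel's struct r5l_log. */
struct log_model {
	uint64_t device_size;      /* log size in sectors */
	uint64_t last_checkpoint;  /* start of the log space still in use */
	uint64_t next_checkpoint;  /* position up to which space may be reclaimed */
};

/* Distance from start to end on a ring of device_size sectors. */
static uint64_t ring_distance(const struct log_model *log,
			      uint64_t start, uint64_t end)
{
	if (end >= start)
		return end - start;
	return end + log->device_size - start;
}

/* Space a reclaim pass would treat as reclaimable (assumed definition). */
static uint64_t reclaimable_space(const struct log_model *log)
{
	return ring_distance(log, log->last_checkpoint, log->next_checkpoint);
}
```

With a 1000-sector log, last_checkpoint = 200 and next_checkpoint left at 0, the model reports 800 reclaimable sectors although nothing has been written; once next_checkpoint is set equal to last_checkpoint, as the quoted diff does, it reports 0.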


* query re: resync not persisting over reboot in rescue mode
From: Dan Kortschak @ 2016-10-23 23:50 UTC (permalink / raw)
  To: linux-raid

First, I'll start with an apology - I have little (~nothing) in the way
of hard data to back up this question, but it is now just a matter of
personal interest as the problem appears to be fixed.


Background:

Last week I tried to upgrade a kernel (ubuntu distro 16.04) but that
failed due to out of space (the system was originally built when
kernels were much smaller and I have now got to the point where it is
not possible to have two kernels on the /boot partition).

On reboot the boot failed with a great deal of disk noise which
persisted. Booting into recovery worked, and the noise was now knowable
to be coming from a RAID resync of /dev/md0 (RAID10) after I dropped
into a root shell. I allowed this to complete and then went back to
resume the normal boot. This failed as before.

Repeating the process above, I found that the RAID was again unsynced
and doing a resync.

This has now resolved - I again allowed the resync to complete and then
did a /sbin/reboot from the CLI rather than going back to the recovery
menu and continuing the boot from the menu.


Question:

What is a/are likely cause(s) for a resync to not persist over a
reboot?


Thanks and apologies for the dearth of information (if more will be
helpful I can add in a follow-up, but since the RAID is now working I
am not sure what will be helpful).

Dan


Current detail (now all working):

$ sudo mdadm --detail /dev/md0
/dev/md0:
        Version : 0.90
  Creation Time : Tue Sep 29 17:41:02 2009
     Raid Level : raid10
     Array Size : 974558208 (929.41 GiB 997.95 GB)
  Used Dev Size : 974558208 (929.41 GiB 997.95 GB)
   Raid Devices : 4
  Total Devices : 4
Preferred Minor : 0
    Persistence : Superblock is persistent

    Update Time : Mon Oct 24 09:47:52 2016
          State : active 
 Active Devices : 4
Working Devices : 4
 Failed Devices : 0
  Spare Devices : 0

         Layout : far=4
     Chunk Size : 256K

           UUID : xxxxxxxx:xxxxxxxx:xxxxxxxx:xxxxxxxx
         Events : 0.48263

    Number   Major   Minor   RaidDevice State
       0       8       18        0      active sync   /dev/sdb2
       1       8        2        1      active sync   /dev/sda2
       2       8       34        2      active sync   /dev/sdc2
       3       8       50        3      active sync   /dev/sdd2

$ cat /proc/mdstat 
Personalities : [raid10] [raid1] [linear] [multipath] [raid0] [raid6]
[raid5] [raid4] 
md1 : active raid1 sdd1[3] sdc1[2] sdb1[1] sda1[0]
      152512 blocks [4/4] [UUUU]
      
md0 : active raid10 sdd2[3] sdc2[2] sda2[1] sdb2[0]
      974558208 blocks 256K chunks 4 far-copies [4/4] [UUUU]
      
unused devices: <none>


* [PATCH] md/raid5: initialize next_checkpoint field before use
From: Zhengyuan Liu @ 2016-10-24  1:55 UTC (permalink / raw)
  To: shli; +Cc: shli, songliubraving, linux-raid, liuzhengyuang521, Zhengyuan Liu

This field is not initialized when we load/recover the log;
it is assigned only when IO to the raid disks has finished.
So r5l_quiesce may use a wrong next_checkpoint to reclaim
log space, which would make the reclaimable space
calculation wrong.

Signed-off-by: Zhengyuan Liu <liuzhengyuan@kylinos.cn>
---
 drivers/md/raid5-cache.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/md/raid5-cache.c b/drivers/md/raid5-cache.c
index 1b1ab4a..998ea00 100644
--- a/drivers/md/raid5-cache.c
+++ b/drivers/md/raid5-cache.c
@@ -1096,6 +1096,8 @@ static int r5l_recovery_log(struct r5l_log *log)
 		log->seq = ctx.seq + 11;
 		log->log_start = r5l_ring_add(log, ctx.pos, BLOCK_SECTORS);
 		r5l_write_super(log, ctx.pos);
+		log->last_checkpoint = ctx.pos;
+		log->next_checkpoint = ctx.pos;
 	} else {
 		log->log_start = ctx.pos;
 		log->seq = ctx.seq;
@@ -1168,6 +1170,7 @@ create:
 	if (log->max_free_space > RECLAIM_MAX_FREE_SPACE)
 		log->max_free_space = RECLAIM_MAX_FREE_SPACE;
 	log->last_checkpoint = cp;
+	log->next_checkpoint = cp;
 
 	__free_page(page);
 
-- 
2.7.4





* [PATCH] md/raid5: write an empty meta-block when creating log super-block
From: Zhengyuan Liu @ 2016-10-24  8:15 UTC (permalink / raw)
  To: shli; +Cc: shli, songliubraving, linux-raid, liuzhengyuang521

If the superblock points to an invalid meta block, r5l_load_log will set
create_super to true and create a new superblock. This path is always
taken if no write I/O has been issued to the array since it was
created. Writing an empty meta block when we first create the log
superblock avoids this unnecessary action.

Another reason is the correctness of log recovery. Currently we have
the code below to guarantee that log recovery is correct.

        if (ctx.seq > log->last_cp_seq + 1) {
                int ret;

                ret = r5l_log_write_empty_meta_block(log, ctx.pos, ctx.seq + 10);
                if (ret)
                        return ret;
                log->seq = ctx.seq + 11;
                log->log_start = r5l_ring_add(log, ctx.pos, BLOCK_SECTORS);
                r5l_write_super(log, ctx.pos);
        } else {
                log->log_start = ctx.pos;
                log->seq = ctx.seq;
        }

If we have just created an array with a journal device, log->log_start and
log->last_checkpoint should both be 0. Suppose we then write three meta blocks,
all valid except the middle one, and a crash happens. After recovery, ctx.seq
would equal log->last_cp_seq + 1 and log->log_start would be set to the
position of the invalid middle meta block. This will
lead to problems that this patch avoids.

Signed-off-by: Zhengyuan Liu <liuzhengyuan@kylinos.cn>
---
 drivers/md/raid5-cache.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/md/raid5-cache.c b/drivers/md/raid5-cache.c
index 998ea00..981f855 100644
--- a/drivers/md/raid5-cache.c
+++ b/drivers/md/raid5-cache.c
@@ -1156,6 +1156,7 @@ create:
 	if (create_super) {
 		log->last_cp_seq = prandom_u32();
 		cp = 0;
+		r5l_log_write_empty_meta_block(log, cp, log->last_cp_seq);
 		/*
 		 * Make sure super points to correct address. Log might have
 		 * data very soon. If super hasn't correct log tail address,
-- 
2.7.4
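
The branch quoted in the commit message can be modelled in a few lines of userspace C. This is only a sketch of the decision as the message describes it: struct recovery_outcome and finish_recovery are hypothetical names, and the block I/O and position bookkeeping are left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical summary of what the end of recovery decided. */
struct recovery_outcome {
	bool wrote_empty_block;  /* true if an empty meta block was written */
	uint64_t next_seq;       /* sequence number the log continues with */
};

/*
 * ctx_seq is the sequence number recovery expected next when it stopped
 * scanning; last_cp_seq is the sequence number stored in the superblock.
 */
static struct recovery_outcome finish_recovery(uint64_t ctx_seq,
					       uint64_t last_cp_seq)
{
	struct recovery_outcome out;

	if (ctx_seq > last_cp_seq + 1) {
		/* Mirrors: empty block at ctx.seq + 10, log->seq = ctx.seq + 11. */
		out.wrote_empty_block = true;
		out.next_seq = ctx_seq + 11;
	} else {
		/* Mirrors: log->seq = ctx.seq, with no jump in sequence numbers. */
		out.wrote_empty_block = false;
		out.next_seq = ctx_seq;
	}
	return out;
}
```

In the scenario from the commit message, recovery finds one valid block and stops at the invalid one, so ctx_seq is last_cp_seq + 1 and no jump happens. If an empty meta block already occupies the checkpoint position, the same crash presumably leaves ctx_seq at last_cp_seq + 2, which takes the first branch.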





* [PATCH v2] IMSM: Add warning message when x8-type device is used
From: Pawel Baldysiak @ 2016-10-24  8:19 UTC (permalink / raw)
  To: jes.sorensen; +Cc: linux-raid, Pawel Baldysiak

This patch adds the warning message when x8-type device
is used with IMSM metadata. x8 device is a special
NVMe drive - two of them on a single PCIe card.
This card could be a single point of failure for
RAID levels different than RAID0. x8 devices have
serial number ending with "-A/-B" or "-1/-2".

Signed-off-by: Pawel Baldysiak <pawel.baldysiak@intel.com>
Reviewed-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
---
 super-intel.c | 44 ++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 44 insertions(+)

diff --git a/super-intel.c b/super-intel.c
index 5c6ab05..700cc61 100644
--- a/super-intel.c
+++ b/super-intel.c
@@ -5139,9 +5139,53 @@ static int add_to_super_imsm(struct supertype *st, mdu_disk_info_t *dk,
 	rv = imsm_read_serial(fd, devname, dd->serial);
 	if (rv) {
 		pr_err("failed to retrieve scsi serial, aborting\n");
+		if (dd->devname)
+			free(dd->devname);
 		free(dd);
 		abort();
 	}
+	if (super->hba && ((super->hba->type == SYS_DEV_NVME) ||
+	   (super->hba->type == SYS_DEV_VMD))) {
+		int i;
+		char *devpath = diskfd_to_devpath(fd);
+		char controller_path[PATH_MAX];
+
+		if (!devpath) {
+			pr_err("failed to get devpath, aborting\n");
+			if (dd->devname)
+				free(dd->devname);
+			free(dd);
+			return 1;
+		}
+
+		snprintf(controller_path, PATH_MAX-1, "%s/device", devpath);
+		free(devpath);
+
+		if (devpath_to_vendor(controller_path) == 0x8086) {
+			/*
+			 * If Intel's NVMe drive has serial ended with
+			 * "-A","-B","-1" or "-2" it means that this is "x8"
+			 * device (double drive on single PCIe card).
+			 * User should be warned about potential data loss.
+			 */
+			for (i = MAX_RAID_SERIAL_LEN-1; i > 0; i--) {
+				/* Skip empty character at the end */
+				if (dd->serial[i] == 0)
+					continue;
+
+				if (((dd->serial[i] == 'A') ||
+				   (dd->serial[i] == 'B') ||
+				   (dd->serial[i] == '1') ||
+				   (dd->serial[i] == '2')) &&
+				   (dd->serial[i-1] == '-'))
+					pr_err("\tThe action you are about to take may put your data at risk.\n"
+						"\tPlease note that x8 devices may consist of two separate x4 devices "
+						"located on a single PCIe port.\n"
+						"\tRAID 0 is the only supported configuration for this type of x8 device.\n");
+				break;
+			}
+		}
+	}
 
 	get_dev_size(fd, NULL, &size);
 	/* clear migr_rec when adding disk to container */
-- 
2.7.4
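
The serial-number test at the heart of this patch can be tried in isolation. The sketch below assumes a NUL-terminated string, whereas the patch walks a fixed-size buffer backwards and skips trailing NUL bytes; serial_has_x8_suffix is a hypothetical helper name, not an mdadm function.

```c
#include <stddef.h>
#include <string.h>

/*
 * Return 1 if the serial ends in "-A", "-B", "-1" or "-2", else 0.
 * Only the last two characters are inspected, as in the patch.
 */
static int serial_has_x8_suffix(const char *serial)
{
	size_t len = strlen(serial);
	char last;

	if (len < 2)
		return 0;
	if (serial[len - 2] != '-')
		return 0;
	last = serial[len - 1];
	return last == 'A' || last == 'B' || last == '1' || last == '2';
}
```

A made-up serial such as "CVMD1234-A" matches, while "CVMD1234A" and "CVMD1234-C" do not.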



* [PATCH] imsm: load migration record from right disk
From: Tomasz Majchrzak @ 2016-10-24 10:00 UTC (permalink / raw)
  To: linux-raid; +Cc: Jes.Sorensen, Tomasz Majchrzak

Migration record is only stored on disks in first and second metadata
slot. The function to load the record incorrectly passes disk slot as
disk index. If a rebuild has taken place for a container, disk slot
doesn't match disk index so it causes migration record to be read from a
disk it has not been written to. As a result reshape operation fails.

Signed-off-by: Tomasz Majchrzak <tomasz.majchrzak@intel.com>
---
 super-intel.c | 12 +++---------
 1 file changed, 3 insertions(+), 9 deletions(-)

diff --git a/super-intel.c b/super-intel.c
index ac3330a..4859950 100644
--- a/super-intel.c
+++ b/super-intel.c
@@ -2656,19 +2656,12 @@ static int load_imsm_migr_rec(struct intel_super *super, struct mdinfo *info)
 	*/
 	if (dev == NULL)
 		return -2;
-	map = get_imsm_map(dev, MAP_0);
 
 	if (info) {
 		for (sd = info->devs ; sd ; sd = sd->next) {
-			/* skip spare and failed disks
-			 */
-			if (sd->disk.raid_disk < 0)
-				continue;
 			/* read only from one of the first two slots */
-			if (map)
-				slot = get_imsm_disk_slot(map,
-							  sd->disk.raid_disk);
-			if (map == NULL || slot > 1 || slot < 0)
+			if ((sd->disk.raid_disk < 0) ||
+			    (sd->disk.raid_disk > 1))
 				continue;
 
 			sprintf(nm, "%d:%d", sd->disk.major, sd->disk.minor);
@@ -2678,6 +2671,7 @@ static int load_imsm_migr_rec(struct intel_super *super, struct mdinfo *info)
 		}
 	}
 	if (fd < 0) {
+		map = get_imsm_map(dev, MAP_0);
 		for (dl = super->disks; dl; dl = dl->next) {
 			/* skip spare and failed disks
 			*/
-- 
1.8.3.1



* [PATCH] md: report 'write_pending' state when array in sync
From: Tomasz Majchrzak @ 2016-10-24 10:47 UTC (permalink / raw)
  To: linux-raid; +Cc: shli, Tomasz Majchrzak

If there is a bad block on a disk and there is a recovery performed from
this disk, the same bad block is reported for a new disk. It involves
setting MD_CHANGE_PENDING flag in rdev_set_badblocks. For external
metadata this flag is not being cleared as array state is reported as
'clean'. The read request to bad block in RAID5 array gets stuck as it
is waiting for a flag to be cleared - as per commit c3cce6cda162
("md/raid5: ensure device failure recorded before write request
returns.").

The meaning of MD_CHANGE_PENDING and MD_CHANGE_CLEAN flags has been
clarified in commit 070dc6dd7103 ("md: resolve confusion of
MD_CHANGE_CLEAN"), however MD_CHANGE_PENDING flag has been used in
personality error handlers since and it doesn't fully comply with
initial purpose. It was supposed to notify that write request is about
to start, however now it is also used to request metadata update.
Initially (in md_allow_write, md_write_start) MD_CHANGE_PENDING flag has
been set and in_sync has been set to 0 at the same time. Error handlers
just set the flag without modifying in_sync value. Sysfs array state is
a single value so now it reports 'clean' when MD_CHANGE_PENDING flag is
set and in_sync is set to 1. Userspace has no idea it is expected to
take some action.

Swap the order that array state is checked so 'write_pending' is
reported ahead of 'clean' ('write_pending' is a misleading name but it
is too late to rename it now).

Signed-off-by: Tomasz Majchrzak <tomasz.majchrzak@intel.com>
---
 drivers/md/md.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/md/md.c b/drivers/md/md.c
index 457b538..48f25d8 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -3887,10 +3887,10 @@ array_state_show(struct mddev *mddev, char *page)
 			st = read_auto;
 			break;
 		case 0:
-			if (mddev->in_sync)
-				st = clean;
-			else if (test_bit(MD_CHANGE_PENDING, &mddev->flags))
+			if (test_bit(MD_CHANGE_PENDING, &mddev->flags))
 				st = write_pending;
+			else if (mddev->in_sync)
+				st = clean;
 			else if (mddev->safemode)
 				st = active_idle;
 			else
-- 
1.8.3.1
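
The reordering in this hunk is easiest to see as two small decision functions. This is a userspace sketch covering only the "case 0" branch shown in the diff; the enum and function names are hypothetical, and the final fallback to an active state is an assumption, since the hunk is cut off after the last else.

```c
#include <stdbool.h>

enum array_state_model {
	ST_CLEAN,
	ST_WRITE_PENDING,
	ST_ACTIVE_IDLE,
	ST_ACTIVE
};

/* Order before the patch: in_sync is checked first. */
static enum array_state_model state_before(bool change_pending,
					   bool in_sync, bool safemode)
{
	if (in_sync)
		return ST_CLEAN;
	if (change_pending)
		return ST_WRITE_PENDING;
	if (safemode)
		return ST_ACTIVE_IDLE;
	return ST_ACTIVE;  /* assumed fallback */
}

/* Order after the patch: a pending change takes priority over in_sync. */
static enum array_state_model state_after(bool change_pending,
					  bool in_sync, bool safemode)
{
	if (change_pending)
		return ST_WRITE_PENDING;
	if (in_sync)
		return ST_CLEAN;
	if (safemode)
		return ST_ACTIVE_IDLE;
	return ST_ACTIVE;  /* assumed fallback */
}
```

The two functions differ only when the pending flag and in_sync are both set, which is the combination the commit message says error handlers produce: the old order reports clean, the new one write_pending.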



* Re: How to fix mistake on raid: mdadm create instead of assemble?
From: Santiago DIEZ @ 2016-10-24 13:02 UTC (permalink / raw)
  To: Shaohua Li; +Cc: Linux Raid LIST, songliubraving, Jes.Sorensen
In-Reply-To: <20161021223524.GB105663@kernel.org>

Fantastic! I ran the following command:

mdadm --assemble /dev/md10 --verbose --force --run /dev/loop0
/dev/loop1 /dev/loop2

and it returned

mdadm: looking for devices for /dev/md10
mdadm: /dev/loop0 is identified as a member of /dev/md10, slot 0.
mdadm: /dev/loop1 is identified as a member of /dev/md10, slot 1.
mdadm: /dev/loop2 is identified as a member of /dev/md10, slot 2.
mdadm: forcing event count in /dev/loop0(0) from 81589 upto 81626
mdadm: forcing event count in /dev/loop2(2) from 81589 upto 81626
mdadm: added /dev/loop1 to /dev/md10 as 1
mdadm: added /dev/loop2 to /dev/md10 as 2
mdadm: no uptodate device for slot 6 of /dev/md10
mdadm: added /dev/loop0 to /dev/md10 as 0
mdadm: /dev/md10 has been started with 3 drives (out of 4).

I can mount /dev/md10 and I have access to my data.

Now that I know my data is safe,
I need to restart the original array on the server.
The array is completely deactivated on the original server.
The company that I rent the dedicated server from is waiting for my GO
to replace disk sdd with a new one.
I'm giving them the GO now.
After I reboot the server, what is it I need to do to rebuild sdd as
the fourth disk of the array?

Regards
-------------------------
Santiago DIEZ
Quark Systems & CAOBA
23 rue du Buisson Saint-Louis, 75010 Paris
-------------------------


On Sat, Oct 22, 2016 at 12:35 AM, Shaohua Li <shli@kernel.org> wrote:
> On Fri, Oct 21, 2016 at 10:45:10AM +0200, Santiago DIEZ wrote:
>> Hi,
>>
>> Thanks Andreas,
>>
>> Yes apparently, 3/4 of the original disks seem to be safe. But I'm
>> terrified at the idea of doing something wrong assembling them.
>> Incidentally, I indeed made a mistake trying to assemble the ddrescue
>> images of the 3 safe disks. I tried to create again with proper
>> metadata and chunk but it did not work. I'm still scared at the idea
>> of restarting the original raid. I'm currently ddrescuing again the 3
>> partitions to then try and *assemble* them rather than *create*.
>>
>>
>> Thanks Wol,
>>
>> I use loop devices because I work on partition images, not on actual partitions:
>> I use ddrescue to copy data from /dev/sd[abc]10 to
>> some.other.server:/home/sd[abc].img
>> Then I go to some.other.server and turn the images into loop devices :
>> losetup /dev/loop0 /home/sda10.img
>> losetup /dev/loop1 /home/sdb10.img
>> losetup /dev/loop2 /home/sdc10.img
>> Then I tried to create the raid, it worked but as I said, the
>> filesystem was unreadable.
>> I know the idea of using loop devices works because I tested it before.
>> I'm doing the whole procedure all over again (takes 5 days to ddrescue
>> the 3 partitions to another server) and then I will use the command
>> you recommended :
>> mdadm --assemble /dev/md0 /dev/loop0 /dev/loop1 /dev/loop2 --force
>>
>>
>> Will keep you posted
>>
>> -------------------------
>> Santiago DIEZ
>> Quark Systems & CAOBA
>> 23 rue du Buisson Saint-Louis, 75010 Paris
>> -------------------------
>>
>> On Mon, Oct 10, 2016 at 12:39 AM, Wols Lists <antlists@youngman.org.uk> wrote:
>> >
>> > On 08/10/16 13:30, Andreas Klauer wrote:
>> > > On Fri, Oct 07, 2016 at 05:37:32PM +0200, Santiago DIEZ wrote:
>> > >> > First thing I did is ddrescue the remaining partitions sd[abc]10 .
>> > >> > ddrescue did not stumble into any read error so I assume all remaining
>> > >> > partitions are perfectly safe.
>> > > So ... don't you still have a good copy?
>> > >
>> > > You only killed one of them, right? Did not make same mistake twice?
>> > >
>> > >> > There comes my mistake: I ran the --create command instead of --assemble :
>> > >> >
>> > >> > ================================================================================
>> > >> > # mdadm --create --verbose /dev/md1 --raid-devices=4 --level=raid5
>> > >> > --run --readonly /dev/loop0 /dev/loop1 /dev/loop2 missing
>> >
>> > One oddity I've noticed. You've created the array using loop devices.
>> > What are these?
>> >
>> > The reason I ask is that using loopback devices is a standard technique
>> > for rebuilding a damaged array, specifically to prevent md from actually
>> > writing to the drive. So is it possible that "mdadm --create" only wrote
>> > to ram, and a reboot will recover your ddrescue copies untouched?
>> >
>> > My raid-fu isn't enough to tell me whether I'm right or not ... :-)
>> >
>> > If necessary you'll have to do another ddrescue from the original
>> > drives, and you should then be able to assemble the array from the
>> > copies. Don't use "missing", use "--force" and you should get a working,
>> > degraded, array to which you can add a new drive and rebuild the array.
>> >
>> > mdadm --assemble /dev/md0 /dev/sd[efg]10 --force
>> >
>> > if I'm right ... so long as it's the copies, you can always recover
>> > again from the original disks, and if there's a problem with the copies
>> > mdadm should complain when it assembles the array.
>
> Hmm, those commands work for me. I'm adding Song and Jes if they have ideas.
>
> Thanks,
> Shaohua


* Re: [PATCH] md/raid5: write an empty meta-block when creating log super-block
From: Shaohua Li @ 2016-10-24 21:23 UTC (permalink / raw)
  To: Zhengyuan Liu; +Cc: shli, songliubraving, linux-raid, liuzhengyuang521
In-Reply-To: <1477296959-20123-1-git-send-email-liuzhengyuan@kylinos.cn>

On Mon, Oct 24, 2016 at 04:15:59PM +0800, Zhengyuan Liu wrote:
> If the superblock points to an invalid meta block, r5l_load_log will set
> create_super to true and create a new superblock. This path is always
> taken if no write I/O has been issued to the array since it was
> created. Writing an empty meta block when we first create the log
> superblock avoids this unnecessary action.
>
> Another reason is the correctness of log recovery. Currently we have
> the code below to guarantee that log recovery is correct.
> 
>         if (ctx.seq > log->last_cp_seq + 1) {
>                 int ret;
> 
>                 ret = r5l_log_write_empty_meta_block(log, ctx.pos, ctx.seq + 10);
>                 if (ret)
>                         return ret;
>                 log->seq = ctx.seq + 11;
>                 log->log_start = r5l_ring_add(log, ctx.pos, BLOCK_SECTORS);
>                 r5l_write_super(log, ctx.pos);
>         } else {
>                 log->log_start = ctx.pos;
>                 log->seq = ctx.seq;
>         }
> 
> If we have just created an array with a journal device, log->log_start and
> log->last_checkpoint should both be 0. Suppose we then write three meta blocks,
> all valid except the middle one, and a crash happens. After recovery, ctx.seq
> would equal log->last_cp_seq + 1 and log->log_start would be set to the
> position of the invalid middle meta block. This will
> lead to problems that this patch avoids.

This would be very unlikely, but better to fix. Applied, thanks!


* Re: [PATCH] md: report 'write_pending' state when array in sync
From: Shaohua Li @ 2016-10-24 22:24 UTC (permalink / raw)
  To: Tomasz Majchrzak; +Cc: linux-raid
In-Reply-To: <1477306048-26097-1-git-send-email-tomasz.majchrzak@intel.com>

On Mon, Oct 24, 2016 at 12:47:28PM +0200, Tomasz Majchrzak wrote:
> If there is a bad block on a disk and there is a recovery performed from
> this disk, the same bad block is reported for a new disk. It involves
> setting MD_CHANGE_PENDING flag in rdev_set_badblocks. For external
> metadata this flag is not being cleared as array state is reported as
> 'clean'. The read request to bad block in RAID5 array gets stuck as it
> is waiting for a flag to be cleared - as per commit c3cce6cda162
> ("md/raid5: ensure device failure recorded before write request
> returns.").
> 
> The meaning of MD_CHANGE_PENDING and MD_CHANGE_CLEAN flags has been
> clarified in commit 070dc6dd7103 ("md: resolve confusion of
> MD_CHANGE_CLEAN"), however MD_CHANGE_PENDING flag has been used in
> personality error handlers since and it doesn't fully comply with
> initial purpose. It was supposed to notify that write request is about
> to start, however now it is also used to request metadata update.
> Initially (in md_allow_write, md_write_start) MD_CHANGE_PENDING flag has
> been set and in_sync has been set to 0 at the same time. Error handlers
> just set the flag without modifying in_sync value. Sysfs array state is
> a single value so now it reports 'clean' when MD_CHANGE_PENDING flag is
> set and in_sync is set to 1. Userspace has no idea it is expected to
> take some action.
> 
> Swap the order that array state is checked so 'write_pending' is
> reported ahead of 'clean' ('write_pending' is a misleading name but it
> is too late to rename it now).


Applied, thanks!


* data loss+inode recovery using RAID6 write journal
From: Nick Black @ 2016-10-24 23:55 UTC (permalink / raw)
  To: linux-raid

Hey there, everyone! I've been using and admiring mdadm for over a decade;
thanks for all the awesome work.

I recently put together a new build, and wanted to try out the
--write-journal capability of recent Linux md. My write journal is a
Samsung SSD 840 PRO SSD, atop a RAID6 of 8 4TB spinning disks. All 9 SATA3
devices are plugged into the onboard SATA3 ports of my ASUS X-99 Deluxe II
motherboard. Summary description:

md126 : active raid6 sde1[4] sdg1[6] sdd1[3] sdc1[2] sdf1[5] sdi1[8] sdh1[7] sdb1[1] sda1[0](J)
      23441316864 blocks super 1.2 level 6, 512k chunk, algorithm 2 [8/8] [UUUUUUUU]
      bitmap: 0/30 pages [0KB], 65536KB chunk

All filesystems are ext4. ~14TB of ~22TB are in use on the filesystem built
directly atop md126:

 /dev/md126       22T   14T  7.4T  65% /media/trap

Kernel version is 4.8.3 (the array was built under 4.7.5), and mdadm reports
v3.4. Distro is debian unstable, running a custom (but fairly orthodox)
kernel.

I moved a ~20GB tarball from my home directory (located on another device, a
NVMe md RAID1) to /media/trap/backups. The mv completed successfully. A
short time after that, I hard rebooted the machine due to X lockup (I'm
experimenting with compiz). By "short time", I mean "possibly within the
time window before 20GB could be written out to the backing store, but I'm
unsure about that". Upon restart, the machine engaged in minutes of disk
activity, spat out some fsck inode recovery messages (I'm trying to find
these in my logs), and finally mounted the filesystem. The moved file is
nowhere to be found.

It's no big loss to me -- I can recreate that data -- but I thought I'd
report this. As said, I'm looking for logs or other hard details, but not
seeing them in journalctl output. I can probably reproduce the problem if
someone needs me to, though otherwise I will likely disable the write
journal for now (I've not yet done so). Please let me know how I might help
you track this problem down, if a problem does indeed exist. Thanks!

-- 
nick black -=- http://www.nick-black.com
to make an apple pie from scratch, you need first invent a universe.


* [PATCH mdadm] raid6check.c: fix "misleading-indentation" error
From: renyl @ 2016-10-25  5:41 UTC (permalink / raw)
  To: Jes Sorensen; +Cc: LKP, Yilong Ren, NeilBrown, linux-raid

From: Yilong Ren <yilongx.ren@intel.com>

To fix the following error info:

root@vm-lkp-nex04-8G-7 /tmp/mdadm# make test
cc -Wall -Werror -Wstrict-prototypes -Wextra -Wno-unused-parameter -ggdb -DSendmail=\""/usr/sbin/sendmail -t"\" -DCONFFILE=\"/etc/mdadm.conf\" -DCONFFILE2=\"/etc/mdadm/mdadm.conf\" -DMAP_DIR=\"/run/mdadm\" -DMAP_FILE=\"map\" -DMDMON_DIR=\"/run/mdadm\" -DFAILED_SLOTS_DIR=\"/run/mdadm/failed-slots\" -DNO_COROSYNC -DNO_DLM -DVERSION=\"3.4-43-g1dcee1c\" -DVERS_DATE="\"06th April 2016\"" -DUSE_PTHREADS -DBINDIR=\"/sbin\"  -c -o raid6check.o raid6check.c
raid6check.c: In function 'manual_repair':
raid6check.c:267:4: error: this 'else' clause does not guard... [-Werror=misleading-indentation]
    else
    ^~~~
raid6check.c:269:5: note: ...this statement, but the latter is misleadingly indented as if it is guarded by the 'else'
     printf("Repairing D(%d) and P\n", failed_data);
     ^~~~~~
cc1: all warnings being treated as errors
<builtin>: recipe for target 'raid6check.o' failed
make: *** [raid6check.o] Error 1
root@vm-lkp-nex04-8G-7 /tmp/mdadm# 


Cc: NeilBrown <neilb@suse.com>
Cc: linux-raid <linux-raid@vger.kernel.org>
Cc: LKP <lkp@eclists.intel.com>
Signed-off-by: Yilong Ren <yilongx.ren@intel.com>
---
 raid6check.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/raid6check.c b/raid6check.c
index ad7ffe7..acfc9a3 100644
--- a/raid6check.c
+++ b/raid6check.c
@@ -264,9 +264,10 @@ int manual_repair(int chunk_size, int syndrome_disks,
 			int failed_data;
 			if (failed_slot1 == -1)
 				failed_data = failed_slot2;
-			else
+			else {
 				failed_data = failed_slot1;
 				printf("Repairing D(%d) and P\n", failed_data);
+			}
 			raid6_datap_recov(syndrome_disks+2, chunk_size,
 					  failed_data, (uint8_t**)blocks, 1);
 		} else {
-- 
2.1.4
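
The compiler complaint is about control flow, so it helps to compare what the two layouts actually execute. The sketch below is standalone and uses hypothetical function names; each function returns the number of messages it printed so the difference can be checked. The first function reproduces the pre-patch control flow with honest indentation, the second the control flow after this patch.

```c
#include <stdio.h>

/*
 * Pre-patch control flow: without braces only the assignment belongs
 * to the else, so the message is printed on both paths.
 */
static int repair_without_braces(int failed_slot1, int failed_slot2)
{
	int failed_data;
	int printed = 0;

	if (failed_slot1 == -1)
		failed_data = failed_slot2;
	else
		failed_data = failed_slot1;
	printf("Repairing D(%d) and P\n", failed_data);
	printed++;
	return printed;
}

/* Post-patch control flow: the message is printed only in the else branch. */
static int repair_with_braces(int failed_slot1, int failed_slot2)
{
	int failed_data;
	int printed = 0;

	if (failed_slot1 == -1)
		failed_data = failed_slot2;
	else {
		failed_data = failed_slot1;
		printf("Repairing D(%d) and P\n", failed_data);
		printed++;
	}
	return printed;
}
```

Note that the two variants are not equivalent: when failed_slot1 is -1, the braced version no longer prints the message. Whether that is intended cannot be decided from the patch alone; moving the printf out to the indentation level of the if would silence the warning while keeping the old behaviour.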



* Problems with a RAID5 array
From: Nicolas Nicolaou @ 2016-10-25  7:45 UTC (permalink / raw)
  To: linux-raid

Hi all,

I am a newbie in the RAID field, but I have encountered some problems
with my RAID5 configuration on a QNAP NAS machine.

In particular, I added a 3TB drive and the array seemed to be rebuilt
automatically. Originally I had three 3TB drives in it.
The rebuild finished and I was able to access my data. For some weird reason
one of the drives was not added and I tried to expand the RAID capacity.
The expansion failed, but there were still no problems...

When I rebooted the system, however, the RAID became inactive and
now I cannot access any of the data.

Below you can see the mdadm --examine information for the 4 drives.

I saw a thread suggesting that recreating the RAID may solve the issue
(https://raid.wiki.kernel.org/index.php/RAID_Recovery).
Before going down that path, though, I wanted to get your take.

Thanks,
Nicolas

/dev/sda3:
Magic : a92b4efc
Version : 1.0
Feature Map : 0x0
Array UUID : 11d32674:4247f385:74ee352b:5e4c22c7
Name : 0
Creation Time : Wed Jan 9 02:29:02 2013
Raid Level : raid5
Raid Devices : 4

Used Dev Size : 5857395112 (2793.02 GiB 2998.99 GB)
Array Size : 17572185216 (8379.07 GiB 8996.96 GB)
Used Size : 5857395072 (2793.02 GiB 2998.99 GB)
Super Offset : 5857395368 sectors
State : clean
Device UUID : 650aa6e2:c725d7f0:b6c8a5fb:8f0ed37f

Update Time : Thu Oct 20 08:33:37 2016
Checksum : bb791848 - correct
Events : 175176

Layout : left-symmetric
Chunk Size : 64K

Array Slot : 2 (0, failed, 2, failed, 3, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, 
failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed)
Array State : u_Uu 381 failed
/dev/sdb3:
Magic : a92b4efc
Version : 00.90.00
UUID : 3fe6d5d4:5b9d61f2:4f7ddb81:e4ae2138
Creation Time : Thu Jun 7 19:50:56 2012
Raid Level : raid5
Used Dev Size : 1951945600 (1861.52 GiB 1998.79 GB)
Array Size : 5855836800 (5584.56 GiB 5996.38 GB)
Raid Devices : 4
Total Devices : 4
Preferred Minor : 0

Update Time : Sun Oct 23 21:05:33 2016
State : clean
Active Devices : 4
Working Devices : 4
Failed Devices : 0
Spare Devices : 0
Checksum : 968897f8 - correct
Events : 0.12252139

Layout : left-symmetric
Chunk Size : 64K

Number Major Minor RaidDevice State
this 2 8 19 2 active sync /dev/sdb3

0 0 8 3 0 active sync /dev/sda3
1 1 8 35 1 active sync /dev/sdc3
2 2 8 19 2 active sync /dev/sdb3
3 3 8 51 3 active sync /dev/sdd3
/dev/sdc3:
Magic : a92b4efc
Version : 00.90.00
UUID : 3fe6d5d4:5b9d61f2:4f7ddb81:e4ae2138
Creation Time : Thu Jun 7 19:50:56 2012
Raid Level : raid5
Used Dev Size : 1951945600 (1861.52 GiB 1998.79 GB)
Array Size : 5855836800 (5584.56 GiB 5996.38 GB)
Raid Devices : 4
Total Devices : 4
Preferred Minor : 0

Update Time : Sun Oct 23 21:05:33 2016
State : clean
Active Devices : 4
Working Devices : 4
Failed Devices : 0
Spare Devices : 0
Checksum : 96889806 - correct
Events : 0.12252139

Layout : left-symmetric
Chunk Size : 64K

Number Major Minor RaidDevice State
this 1 8 35 1 active sync /dev/sdc3

0 0 8 3 0 active sync /dev/sda3
1 1 8 35 1 active sync /dev/sdc3
2 2 8 19 2 active sync /dev/sdb3
3 3 8 51 3 active sync /dev/sdd3
/dev/sdd3:
Magic : a92b4efc
Version : 1.0
Feature Map : 0x0
Array UUID : 11d32674:4247f385:74ee352b:5e4c22c7
Name : 0
Creation Time : Wed Jan 9 02:29:02 2013
Raid Level : raid5
Raid Devices : 4

Used Dev Size : 5857395112 (2793.02 GiB 2998.99 GB)
Array Size : 17572185216 (8379.07 GiB 8996.96 GB)
Used Size : 5857395072 (2793.02 GiB 2998.99 GB)
Super Offset : 5857395368 sectors
State : clean
Device UUID : 0b3406a9:15fd802f:e9e3ed19:1c684e54

Update Time : Thu Oct 20 09:45:06 2016
Checksum : b3469ebd - correct
Events : 175176

Layout : left-symmetric
Chunk Size : 64K

Array Slot : 4 (0, failed, 2, failed, 3, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, 
failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed, failed)
Array State : u_uU 381 failed
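
The sizes reported above can be cross-checked with a little arithmetic. This is only a minimal sketch: the helper name raid5_array_size is hypothetical, and the units (512-byte sectors for the version 1.0 superblocks, KiB for the 0.90 ones) are an assumption inferred from the GB figures mdadm prints next to each number. For RAID5 with n devices, the usable size is (n - 1) times the per-device size.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: RAID5 spends one device's worth of space on
 * parity, so usable size is (raid_devices - 1) * per-device size.
 * The result is in whatever unit used_dev_size is given in. */
static uint64_t raid5_array_size(unsigned raid_devices, uint64_t used_dev_size)
{
	assert(raid_devices >= 2);
	return (uint64_t)(raid_devices - 1) * used_dev_size;
}
```

With the numbers from the output, raid5_array_size(4, 5857395072) gives 17572185216, the "Array Size" on sda3 and sdd3, and raid5_array_size(4, 1951945600) gives 5855836800, the "Array Size" on sdb3 and sdc3. Each pair is internally consistent, but the two pairs carry different UUIDs, superblock versions, and creation times.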


* Re: [PATCH mdadm] raid6check.c: fix "misleading-indentation" error
From: Jes Sorensen @ 2016-10-25 12:00 UTC (permalink / raw)
  To: renyl; +Cc: LKP, NeilBrown, linux-raid
In-Reply-To: <1477374077-4195-1-git-send-email-yilongx.ren@intel.com>

renyl <yilongx.ren@intel.com> writes:
> From: Yilong Ren <yilongx.ren@intel.com>
>
> To fix the following error info:
>
> root@vm-lkp-nex04-8G-7 /tmp/mdadm# make test
> cc -Wall -Werror -Wstrict-prototypes -Wextra -Wno-unused-parameter -ggdb -DSendmail=\""/usr/sbin/sendmail -t"\" -DCONFFILE=\"/etc/mdadm.conf\" -DCONFFILE2=\"/etc/mdadm/mdadm.conf\" -DMAP_DIR=\"/run/mdadm\" -DMAP_FILE=\"map\" -DMDMON_DIR=\"/run/mdadm\" -DFAILED_SLOTS_DIR=\"/run/mdadm/failed-slots\" -DNO_COROSYNC -DNO_DLM -DVERSION=\"3.4-43-g1dcee1c\" -DVERS_DATE="\"06th April 2016\"" -DUSE_PTHREADS -DBINDIR=\"/sbin\"  -c -o raid6check.o raid6check.c
> raid6check.c: In function 'manual_repair':
> raid6check.c:267:4: error: this 'else' clause does not guard... [-Werror=misleading-indentation]
>     else
>     ^~~~
> raid6check.c:269:5: note: ...this statement, but the latter is misleadingly indented as if it is guarded by the 'else'
>      printf("Repairing D(%d) and P\n", failed_data);
>      ^~~~~~
> cc1: all warnings being treated as errors
> <builtin>: recipe for target 'raid6check.o' failed
> make: *** [raid6check.o] Error 1
> root@vm-lkp-nex04-8G-7 /tmp/mdadm# 
>
>
> Cc: NeilBrown <neilb@suse.com>
> Cc: linux-raid <linux-raid@vger.kernel.org>
> Cc: LKP <lkp@eclists.intel.com>
> Signed-off-by: Yilong Ren <yilongx.ren@intel.com>
> ---
>  raid6check.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/raid6check.c b/raid6check.c
> index ad7ffe7..acfc9a3 100644
> --- a/raid6check.c
> +++ b/raid6check.c
> @@ -264,9 +264,10 @@ int manual_repair(int chunk_size, int syndrome_disks,
>  			int failed_data;
>  			if (failed_slot1 == -1)
>  				failed_data = failed_slot2;
> -			else
> +			else {
>  				failed_data = failed_slot1;
>  				printf("Repairing D(%d) and P\n", failed_data);
> +			}
>  			raid6_datap_recov(syndrome_disks+2, chunk_size,
>  					  failed_data, (uint8_t**)blocks, 1);
>  		} else {

Hi,

I suspect this patch is wrong and the code is meant to print in either
case.

Neil?

Cheers,
Jes


* Re: data loss+inode recovery using RAID6 write journal
From: Wols Lists @ 2016-10-25 12:36 UTC (permalink / raw)
  To: Nick Black, linux-raid
In-Reply-To: <20161024235505.rb4fucq24ybbn5aq@schwarzgerat.orthanc>

On 25/10/16 00:55, Nick Black wrote:
> I moved a ~20GB tarball from my home directory (located on another
> device, a NVMe md RAID1) to /media/trap/backups. The mv completed
> successfully. A short time after that, I hard rebooted the machine
> due to X lockup (I'm experimenting with compiz). By "short time", I
> mean "possibly within the time window before 20GB could be written
> out to the backing store, but I'm unsure about that". Upon restart,
> the machine engaged in minutes of disk activity, spat out some fsck
> inode recovery messages (I'm trying to find these in my logs), and
> finally mounted the filesystem. The moved file is nowhere to be
> found.

I can't see what filesystem you're using. It could easily be down to that.

If the reboot interrupted the "write to disk" before the directory
containing the i-node had been flushed, that would explain your
observations, I believe.

Personally, I think that explanation is actually unlikely, as the
kernel devs go to great lengths to preserve metadata, so you're more
likely to get the situation where the file exists but is empty.

This ties in with my impression of the kernel devs - especially the
file system guys - placing great emphasis on protecting the computer
at the expense of the data the user stores there. imho that's daft,
but hey they're system guys, they protect the system. "We can reboot
the system in a clean state in one hour instead of 24 now we no longer
need a fsck". They forget that that 24 hours gave the user a usable
system, now the admins need to run a 72-hour user-space integrity
check before they hand the system back ... :-(

I guess what I'm saying is, don't assume it's the raid, as it could
well be something else entirely (although there are probably plenty of
people here who could help you with that).

Cheers,
Wol


* Re: [PATCH] md/raid5: write an empty meta-block when creating logsuper-block
From: Zhengyuan Liu @ 2016-10-25 12:43 UTC (permalink / raw)
  To: Shaohua Li; +Cc: shli, Song Liu, linux-raid, liuzhengyuang521

After discussing this with my colleague, I think there is still a problem,
although it is very unlikely to happen. The superblock should point to the
last meta block we have written after log reclaim, or to the empty meta
block after log recovery. Consider the case where we write some meta blocks
after the superblock position and then a crash happens. If the first meta
block written next to the superblock position is invalid, ctx.seq would
also equal last_cp_seq + 1 after recovery. So the safest approach is to
always write an empty meta block at ctx.pos after recovery, no matter how
far ctx.seq is ahead of last_cp_seq.
What do you think, Shaohua? If it is necessary, I'd revert this patch and
resend one.
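
The difference between the current check and the proposal above can be sketched as a toy model. This is not the kernel code: struct log_model, struct recovery_ctx, and both finish_recovery_* functions are hypothetical stand-ins, BLOCK_SECTORS is assumed to be 8, and the actual block and superblock writes are reduced to setting fields.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define BLOCK_SECTORS 8	/* assumed: one 4K block in 512-byte sectors */

/* Hypothetical, simplified stand-ins for the fields in the thread. */
struct log_model {
	uint64_t device_size;	/* size of the log ring in sectors */
	uint64_t last_cp_seq;	/* sequence recorded at the last checkpoint */
	uint64_t seq;		/* next sequence number to use */
	uint64_t log_start;	/* where the next meta block will go */
	uint64_t super_pos;	/* position the superblock points to */
	bool wrote_empty_meta;	/* did recovery write an empty meta block? */
};

struct recovery_ctx {
	uint64_t pos;
	uint64_t seq;
};

/* Advance a position in the ring, wrapping at the end of the device. */
static uint64_t ring_add(const struct log_model *log, uint64_t start, uint64_t inc)
{
	assert(log->device_size > 0);
	return (start + inc) % log->device_size;
}

/* Quoted behaviour: rewrite only if recovery advanced by more than one. */
static void finish_recovery_current(struct log_model *log, struct recovery_ctx ctx)
{
	if (ctx.seq > log->last_cp_seq + 1) {
		log->wrote_empty_meta = true;	/* empty block at ctx.pos */
		log->seq = ctx.seq + 11;
		log->log_start = ring_add(log, ctx.pos, BLOCK_SECTORS);
		log->super_pos = ctx.pos;
	} else {
		log->wrote_empty_meta = false;
		log->log_start = ctx.pos;
		log->seq = ctx.seq;
	}
}

/* Proposed behaviour: take the first branch unconditionally. */
static void finish_recovery_always(struct log_model *log, struct recovery_ctx ctx)
{
	log->wrote_empty_meta = true;
	log->seq = ctx.seq + 11;
	log->log_start = ring_add(log, ctx.pos, BLOCK_SECTORS);
	log->super_pos = ctx.pos;
}
```

In the corner case described above (ctx.seq equal to last_cp_seq + 1), the first function leaves log_start at ctx.pos without writing anything, while the second always leaves a known-good empty meta block at ctx.pos and starts new writes one block later.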

------------------ Original ------------------
From:  "Shaohua Li"<shli@kernel.org>;
Date:  Tue, Oct 25, 2016 05:23 AM
To:  "Zhengyuan Liu"<liuzhengyuan@kylinos.cn>;
Cc:  "shli"<shli@fb.com>; "Song Liu"<songliubraving@fb.com>; "linux-raid"<linux-raid@vger.kernel.org>; "liuzhengyuang521"<liuzhengyuang521@gmail.com>;
Subject:  Re: [PATCH] md/raid5: write an empty meta-block when creating logsuper-block
 
On Mon, Oct 24, 2016 at 04:15:59PM +0800, Zhengyuan Liu wrote:
> If superblock points to an invalid meta block, r5l_load_log will set
> create_super to true and create a new superblock. This runtime path
> would always be taken if we do no write I/O to this array after it was
> created. Writing an empty meta block when we first create the log
> superblock avoids this unnecessary action.
> 
> Another reason is the correctness of log recovery. Currently we have
> the code below to guarantee that log recovery is correct.
> 
>         if (ctx.seq > log->last_cp_seq + 1) {
>                 int ret;
> 
>                 ret = r5l_log_write_empty_meta_block(log, ctx.pos, ctx.seq + 10);
>                 if (ret)
>                         return ret;
>                 log->seq = ctx.seq + 11;
>                 log->log_start = r5l_ring_add(log, ctx.pos, BLOCK_SECTORS);
>                 r5l_write_super(log, ctx.pos);
>         } else {
>                 log->log_start = ctx.pos;
>                 log->seq = ctx.seq;
>         }
> 
> If we just created an array with a journal device, log->log_start and
> log->last_checkpoint should both be 0. Suppose we then write three meta
> blocks, all valid except the middle one, and a crash happens. ctx.seq
> would equal log->last_cp_seq + 1 and log->log_start would be set to the
> position of the invalid middle meta block after recovery. This would
> lead to problems that this patch avoids.

This would be very unlikely, but better to fix. Applied, thanks!


* Re: Problems with a RAID5 array
From: Wols Lists @ 2016-10-25 12:48 UTC (permalink / raw)
  To: Nicolas Nicolaou, linux-raid
In-Reply-To: <5C95474C-8A2A-4472-9629-2DD6B2143D28@GMAIL.COM>

On 25/10/16 08:45, Nicolas Nicolaou wrote:
> Hi all,
> 
> I am a newbie in the RAID field but i encountered some problems 
> with my RAID5 configuration on a QNAP NAS machine. 
> 
> In particular i added a 3TB drive and the array seemed to be rebuilt 
> automatically. Originally i had 3 3TB drives on it. 
> The rebuilt finished and i was able to access my data. For some weird reason
> one of the drives was not added and i tried to expand the RAID capacity. 
> The expand failed but still no problems...
> 
> When i rebooted the system however the RAID became inactive and 
> now i cannot access any of the data. 
> 
> Below you can see the mdadm --examine information for the 4 drives. 
> 
> I saw a thread that recreating the RAID may solve the issue 
> (https://raid.wiki.kernel.org/index.php/RAID_Recovery). 
> Before going to that path though i wanted to see your take. 
> 
Firstly, using "--force" is not necessarily a bad idea, though you want
to avoid it if you can. Using "--create" is an absolutely crazy idea
unless you are being hand-held by an expert. DO NOT attempt that on your
own unless you really want to lose everything.

Secondly, if you *are* going to be mad enough to try "--create", make
sure you've run Phil's lsdrv utility and you have a hard copy of the
output saved somewhere safe!

Go back to the raid wiki, go to the home page, and read section 4, "When
things go wrogn". Read the entire section. It includes the page you've
referenced, but that's an old page that will be deprecated. It's a
moderately safe bet that when the experts chime in, they will want a lot
of the information that section tells you to gather. And hopefully, working
through this will give you a few clues yourself.

Cheers,
Wol


* Re: data loss+inode recovery using RAID6 write journal
From: Nick Black @ 2016-10-25 13:16 UTC (permalink / raw)
  To: Wols Lists; +Cc: linux-raid
In-Reply-To: <580F51DC.3010308@youngman.org.uk>


Wols Lists left as an exercise for the reader:
> I can't see what filesystem you're using. It could easily be down to that.

as noted, both source and destination filesystems were ext4 without any
unorthodox options. I wouldn't normally think it was due to mdraid, as I've
never had a problem from it before, but the new write-journal code does
seem a very possible culprit, especially as it appeared to be attempting
to flush data from the SSD during the interruption.

if people don't think it due to raid, i'm not going to say it was. i'll
probably refrain from using the write journal for a bit, though, just on
a hunch.

Destination array details:

[schwarzgerat](0) $ sudo mdadm --detail /dev/md126
/dev/md126:
        Version : 1.2
  Creation Time : Fri Oct 14 21:31:31 2016
     Raid Level : raid6
     Array Size : 23441316864 (22355.38 GiB 24003.91 GB)
  Used Dev Size : 3906886144 (3725.90 GiB 4000.65 GB)
   Raid Devices : 8
  Total Devices : 9
    Persistence : Superblock is persistent

  Intent Bitmap : Internal

    Update Time : Mon Oct 24 18:27:09 2016
          State : clean 
 Active Devices : 8
Working Devices : 9
 Failed Devices : 0
  Spare Devices : 0

         Layout : left-symmetric
     Chunk Size : 512K

           Name : schwarzgerat:126  (local to host schwarzgerat)
           UUID : e16ec7ec:77f523c8:243e7ba0:08c5a591
         Events : 15092

    Number   Major   Minor   RaidDevice State
       0       8        1        -      journal   /dev/sda1
       1       8       17        0      active sync   /dev/sdb1
       2       8       33        1      active sync   /dev/sdc1
       3       8       49        2      active sync   /dev/sdd1
       4       8       65        3      active sync   /dev/sde1
       5       8       81        4      active sync   /dev/sdf1
       6       8       97        5      active sync   /dev/sdg1
       7       8      113        6      active sync   /dev/sdh1
       8       8      129        7      active sync   /dev/sdi1
[schwarzgerat](0) $ 

Destination filesystem details:

[schwarzgerat](0) $ sudo dumpe2fs /dev/md126
dumpe2fs 1.43.3 (04-Sep-2016)
Filesystem volume name:   <none>
Last mounted on:          /media/trap
Filesystem UUID:          7f7404cc-9a9c-4a92-98b7-159646f5b355
Filesystem magic number:  0xEF53
Filesystem revision #:    1 (dynamic)
Filesystem features:      has_journal ext_attr dir_index filetype needs_recovery extent 64bit flex_bg sparse_super large_file huge_file dir_nlink extra_isize metadata_csum
Filesystem flags:         signed_directory_hash 
Default mount options:    user_xattr acl
Filesystem state:         clean
Errors behavior:          Continue
Filesystem OS type:       Linux
Inode count:              366272512
Block count:              5860329216
Reserved block count:     293016460
Free blocks:              2255006201
Free inodes:              363288750
First block:              0
Block size:               4096
Fragment size:            4096
Group descriptor size:    64
Blocks per group:         32768
Fragments per group:      32768
Inodes per group:         2048
Inode blocks per group:   128
RAID stride:              128
RAID stripe width:        768
Flex block group size:    16
Filesystem created:       Tue Oct 18 01:37:04 2016
Last mount time:          Mon Oct 24 18:26:39 2016
Last write time:          Mon Oct 24 18:26:39 2016
Mount count:              13
Maximum mount count:      -1
Last checked:             Tue Oct 18 01:37:04 2016
Check interval:           0 (<none>)
Lifetime writes:          16 TB
Reserved blocks uid:      0 (user root)
Reserved blocks gid:      0 (group root)
First inode:              11
Inode size:               256
Required extra isize:     32
Desired extra isize:      32
Journal inode:            8
Default directory hash:   half_md4
Directory Hash Seed:      88fcabd6-4bf3-4817-9eaa-92746a2d4295
Journal backup:           inode blocks
Checksum type:            crc32c
Checksum:                 0xeaeb1fb3
Journal features:         journal_incompat_revoke journal_64bit journal_checksum_v3
Journal size:             1024M
Journal length:           262144
Journal sequence:         0x0000bb9e
Journal start:            1
Journal checksum type:    crc32c
Journal checksum:         0xe8d0e4bd
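
As a sanity check on the destination filesystem's RAID geometry, the stride and stripe width reported above can be recomputed from the array details. This is a minimal sketch with hypothetical helper names. It assumes the usual ext4 convention that stride is the chunk size in filesystem blocks and stripe width is the stride times the number of data-bearing devices.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: chunk size expressed in filesystem blocks. */
static uint32_t ext4_stride(uint32_t chunk_bytes, uint32_t block_bytes)
{
	assert(block_bytes > 0 && chunk_bytes % block_bytes == 0);
	return chunk_bytes / block_bytes;
}

/* Hypothetical helper: blocks in one full stripe of data, i.e. the
 * stride times the number of devices that hold data, not parity. */
static uint32_t ext4_stripe_width(uint32_t stride, unsigned raid_devices,
				  unsigned parity_devices)
{
	assert(raid_devices > parity_devices);
	return stride * (raid_devices - parity_devices);
}
```

For the 8-device RAID6 with 512K chunks and 4096-byte blocks shown above, ext4_stride(512 * 1024, 4096) is 128 and ext4_stripe_width(128, 8, 2) is 768, which matches the "RAID stride" and "RAID stripe width" lines in the dumpe2fs output.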

Source array details:

[schwarzgerat](0) $ sudo mdadm --detail /dev/md127
/dev/md127:
        Version : 1.2
  Creation Time : Wed Oct 12 20:56:39 2016
     Raid Level : raid1
     Array Size : 369607744 (352.49 GiB 378.48 GB)
  Used Dev Size : 369607744 (352.49 GiB 378.48 GB)
   Raid Devices : 2
  Total Devices : 2
    Persistence : Superblock is persistent

  Intent Bitmap : Internal

    Update Time : Tue Oct 25 09:11:04 2016
          State : clean 
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0

           Name : debian:intel 750 nvme
           UUID : 6a826e91:5bf8c1de:56be717d:e55077d9
         Events : 149

    Number   Major   Minor   RaidDevice State
       0     259        6        0      active sync   /dev/nvme0n1p3
       1     259        2        1      active sync   /dev/nvme1n1p3
[schwarzgerat](0) $ 

Source filesystem details:

[schwarzgerat](0) $ sudo dumpe2fs /dev/md127p2
dumpe2fs 1.43.3 (04-Sep-2016)
Filesystem volume name:   <none>
Last mounted on:          /home
Filesystem UUID:          2509ff02-8976-4051-a074-4c8457512e9e
Filesystem magic number:  0xEF53
Filesystem revision #:    1 (dynamic)
Filesystem features:      has_journal ext_attr resize_inode dir_index filetype needs_recovery extent 64bit flex_bg sparse_super large_file huge_file dir_nlink extra_isize metadata_csum
Filesystem flags:         signed_directory_hash 
Default mount options:    user_xattr acl
Filesystem state:         clean
Errors behavior:          Continue
Filesystem OS type:       Linux
Inode count:              18907136
Block count:              75624459
Reserved block count:     3781222
Free blocks:              67197022
Free inodes:              18742837
First block:              0
Block size:               4096
Fragment size:            4096
Group descriptor size:    64
Reserved GDT blocks:      1024
Blocks per group:         32768
Fragments per group:      32768
Inodes per group:         8192
Inode blocks per group:   512
Flex block group size:    16
Filesystem created:       Fri Oct 14 19:47:30 2016
Last mount time:          Mon Oct 24 18:26:38 2016
Last write time:          Mon Oct 24 18:26:38 2016
Mount count:              16
Maximum mount count:      -1
Last checked:             Fri Oct 14 19:47:30 2016
Check interval:           0 (<none>)
Lifetime writes:          1123 MB
Reserved blocks uid:      0 (user root)
Reserved blocks gid:      0 (group root)
First inode:              11
Inode size:               256
Required extra isize:     32
Desired extra isize:      32
Journal inode:            8
First orphan inode:       15729268
Default directory hash:   half_md4
Directory Hash Seed:      67c455a5-d48e-4244-9acd-0b01dbd67730
Journal backup:           inode blocks
Checksum type:            crc32c
Checksum:                 0xabd72e7b
Journal features:         journal_incompat_revoke journal_64bit journal_checksum_v3
Journal size:             1024M
Journal length:           262144
Journal sequence:         0x00001886
Journal start:            1
Journal checksum type:    crc32c
Journal checksum:         0x1519be37


-- 
nick black -=- http://www.nick-black.com
to make an apple pie from scratch, you need first invent a universe.



* Re: [PATCH mdadm] raid6check.c: fix "misleading-indentation" error
From: Yilong Ren @ 2016-10-25 13:21 UTC (permalink / raw)
  To: Jes Sorensen; +Cc: LKP, NeilBrown, linux-raid
In-Reply-To: <wrfjmvhsskou.fsf@redhat.com>

On Tue, Oct 25, 2016 at 08:00:01AM -0400, Jes Sorensen wrote:
> renyl <yilongx.ren@intel.com> writes:
> > From: Yilong Ren <yilongx.ren@intel.com>
> >
> > To fix the following error info:
> >
> > root@vm-lkp-nex04-8G-7 /tmp/mdadm# make test
> > cc -Wall -Werror -Wstrict-prototypes -Wextra -Wno-unused-parameter -ggdb -DSendmail=\""/usr/sbin/sendmail -t"\" -DCONFFILE=\"/etc/mdadm.conf\" -DCONFFILE2=\"/etc/mdadm/mdadm.conf\" -DMAP_DIR=\"/run/mdadm\" -DMAP_FILE=\"map\" -DMDMON_DIR=\"/run/mdadm\" -DFAILED_SLOTS_DIR=\"/run/mdadm/failed-slots\" -DNO_COROSYNC -DNO_DLM -DVERSION=\"3.4-43-g1dcee1c\" -DVERS_DATE="\"06th April 2016\"" -DUSE_PTHREADS -DBINDIR=\"/sbin\"  -c -o raid6check.o raid6check.c
> > raid6check.c: In function 'manual_repair':
> > raid6check.c:267:4: error: this 'else' clause does not guard... [-Werror=misleading-indentation]
> >     else
> >     ^~~~
> > raid6check.c:269:5: note: ...this statement, but the latter is misleadingly indented as if it is guarded by the 'else'
> >      printf("Repairing D(%d) and P\n", failed_data);
> >      ^~~~~~
> > cc1: all warnings being treated as errors
> > <builtin>: recipe for target 'raid6check.o' failed
> > make: *** [raid6check.o] Error 1
> > root@vm-lkp-nex04-8G-7 /tmp/mdadm# 
> >
> >
> > Cc: NeilBrown <neilb@suse.com>
> > Cc: linux-raid <linux-raid@vger.kernel.org>
> > Cc: LKP <lkp@eclists.intel.com>
> > Signed-off-by: Yilong Ren <yilongx.ren@intel.com>
> > ---
> >  raid6check.c | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/raid6check.c b/raid6check.c
> > index ad7ffe7..acfc9a3 100644
> > --- a/raid6check.c
> > +++ b/raid6check.c
> > @@ -264,9 +264,10 @@ int manual_repair(int chunk_size, int syndrome_disks,
> >  			int failed_data;
> >  			if (failed_slot1 == -1)
> >  				failed_data = failed_slot2;
> > -			else
> > +			else {
> >  				failed_data = failed_slot1;
> >  				printf("Repairing D(%d) and P\n", failed_data);
> > +			}
> >  			raid6_datap_recov(syndrome_disks+2, chunk_size,
> >  					  failed_data, (uint8_t**)blocks, 1);
> >  		} else {
> 
> Hi,
> 
> I suspect this patch is wrong and the code is meant to print in either
> case.

Oops, it should look like the below:


diff --git a/raid6check.c b/raid6check.c
index ad7ffe7..551f835 100644
--- a/raid6check.c
+++ b/raid6check.c
@@ -266,7 +266,8 @@ int manual_repair(int chunk_size, int syndrome_disks,
                                failed_data = failed_slot2;
                        else
                                failed_data = failed_slot1;
-                               printf("Repairing D(%d) and P\n", failed_data);
+
+                       printf("Repairing D(%d) and P\n", failed_data);
                        raid6_datap_recov(syndrome_disks+2, chunk_size,
                                          failed_data, (uint8_t**)blocks, 1);
                } else {
--

I will send v2 after Neil confirms this point, thanks.

> 
> Neil?
> 
> Cheers,
> Jes

-- 
Thanks
Ren Yilong
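
The behavioural difference between the two patch variants discussed in this thread can be sketched in isolation. The function names and the messages counter are hypothetical and exist only to make the difference observable. The original code, because the else had no braces, printed the message in both cases, and only the second variant preserves that.

```c
#include <assert.h>
#include <stdio.h>

/* Shape of the first patch: braces pull the message inside the else,
 * so nothing is printed when failed_slot1 == -1. */
static int pick_failed_v1(int failed_slot1, int failed_slot2, int *messages)
{
	int failed_data;

	assert(messages != NULL);
	if (failed_slot1 == -1)
		failed_data = failed_slot2;
	else {
		failed_data = failed_slot1;
		printf("Repairing D(%d) and P\n", failed_data);
		(*messages)++;
	}
	return failed_data;
}

/* Shape of the follow-up: the message is dedented and runs after the
 * if/else, so it is printed for either slot. */
static int pick_failed_v2(int failed_slot1, int failed_slot2, int *messages)
{
	int failed_data;

	assert(messages != NULL);
	if (failed_slot1 == -1)
		failed_data = failed_slot2;
	else
		failed_data = failed_slot1;

	printf("Repairing D(%d) and P\n", failed_data);
	(*messages)++;
	return failed_data;
}
```

Both functions return the same slot. They differ only in whether the message appears when failed_slot1 is -1, which is the point Jes raised.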


* [PATCH] md: wake up personality thread after array state update
From: Tomasz Majchrzak @ 2016-10-25 15:07 UTC (permalink / raw)
  To: linux-raid; +Cc: shli, Tomasz Majchrzak

When a raid1/raid10 array fails to write to one of the drives, the request
is added to bio_end_io_list and finished by the personality thread. The
thread doesn't handle it as long as the MD_CHANGE_PENDING flag is set. In
the case of external metadata this flag is cleared, but the thread is
not woken up. This causes the request to be blocked for a few seconds (until
another action on the array wakes up the thread) or to get stuck
indefinitely.

Wake up the personality thread once MD_CHANGE_PENDING has been cleared.
Moving the 'restart_array' call after the flag is cleared is not a solution
because in read-write mode the call doesn't wake up the thread.

Signed-off-by: Tomasz Majchrzak <tomasz.majchrzak@intel.com>
---
 drivers/md/md.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/md/md.c b/drivers/md/md.c
index 5655f83..c17efaf 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -3930,6 +3930,7 @@ array_state_store(struct mddev *mddev, const char *buf, size_t len)
 		if (st == active) {
 			restart_array(mddev);
 			clear_bit(MD_CHANGE_PENDING, &mddev->flags);
+			md_wakeup_thread(mddev->thread);
 			wake_up(&mddev->sb_wait);
 			err = 0;
 		} else /* st == clean */ {
-- 
1.8.3.1



* [PATCH] raid5: revert commit 11367799f3d1
From: Tomasz Majchrzak @ 2016-10-25 15:17 UTC (permalink / raw)
  To: linux-raid; +Cc: shli, Tomasz Majchrzak

Revert commit 11367799f3d1 ("md: Prevent IO hold during accessing to faulty
raid5 array") as it doesn't comply with commit c3cce6cda162 ("md/raid5:
ensure device failure recorded before write request returns."). That change
is not required anymore as the problem is resolved by commit 16f889499a52
("md: report 'write_pending' state when array in sync") - read request is
stuck as array state is not reported correctly via sysfs attribute.

Signed-off-by: Tomasz Majchrzak <tomasz.majchrzak@intel.com>
---
 drivers/md/raid5.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index f94472d..323d3c7 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -4645,9 +4645,7 @@ finish:
 	}
 
 	if (!bio_list_empty(&s.return_bi)) {
-		if (test_bit(MD_CHANGE_PENDING, &conf->mddev->flags) &&
-				(s.failed <= conf->max_degraded ||
-					conf->mddev->external == 0)) {
+		if (test_bit(MD_CHANGE_PENDING, &conf->mddev->flags)) {
 			spin_lock_irq(&conf->device_lock);
 			bio_list_merge(&conf->return_bi, &s.return_bi);
 			spin_unlock_irq(&conf->device_lock);
-- 
1.8.3.1



* [PATCH] dm block manager: use do/while(0) for empty macros
From: Arnd Bergmann @ 2016-10-25 15:54 UTC (permalink / raw)
  To: Mike Snitzer
  Cc: Arnd Bergmann, Alasdair Kergon, dm-devel, Shaohua Li,
	Mikulas Patocka, Joe Thornber, linux-raid, linux-kernel

make W=1 reports a new warning for the dm-block-manager:

drivers/md/persistent-data/dm-block-manager.c: In function ‘dm_bm_unlock’:
drivers/md/persistent-data/dm-block-manager.c:598:3: error: suggest braces around empty body in an ‘else’ statement [-Werror=empty-body]

This is completely harmless, but generally speaking it's a good idea to
address this warning as it can often detect nasty bugs, and replacing
empty macros with "do { } while (0)" is generally considered good style
to make code more robust anyway.

Fixes: f94bdb2e26b6 ("dm block manager: make block locking optional")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
---
 drivers/md/persistent-data/dm-block-manager.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/md/persistent-data/dm-block-manager.c b/drivers/md/persistent-data/dm-block-manager.c
index b619c383d88d..a6dde7cab458 100644
--- a/drivers/md/persistent-data/dm-block-manager.c
+++ b/drivers/md/persistent-data/dm-block-manager.c
@@ -306,13 +306,13 @@ static void report_recursive_bug(dm_block_t b, int r)
 
 #else  /* !CONFIG_DM_DEBUG_BLOCK_MANAGER_LOCKING */
 
-#define bl_init(x)
+#define bl_init(x) do { } while (0)
 #define bl_down_read(x) 0
 #define bl_down_read_nonblock(x) 0
-#define bl_up_read(x)
+#define bl_up_read(x) do { } while (0)
 #define bl_down_write(x) 0
-#define bl_up_write(x)
-#define report_recursive_bug(x, y)
+#define bl_up_write(x) do { } while (0)
+#define report_recursive_bug(x, y) do { } while (0)
 
 #endif /* CONFIG_DM_DEBUG_BLOCK_MANAGER_LOCKING */
 
-- 
2.9.0
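
For readers following the macro change above, here is a minimal sketch of why the "do { } while (0)" form is preferred. The names demo_up_read, demo_unlock and demo_selfcheck are hypothetical stand-ins made up for illustration; this is not the dm-block-manager code itself.

```c
#include <assert.h>

/*
 * Hypothetical no-op helper, modelled on the bl_up_read() stub in the
 * patch above. Had it been written as "#define demo_up_read(x)" with
 * an empty body, the else branch in demo_unlock() would expand to a
 * lone ';', which is the construct gcc's -Wempty-body warning flags.
 */
#define demo_up_read(x) do { } while (0)

/* Returns 1 when the write path was taken, 0 when the read path was. */
int demo_unlock(int write_locked)
{
	int took_write_path = 0;

	if (write_locked)
		took_write_path = 1;
	else
		demo_up_read(write_locked);	/* one complete statement */

	return took_write_path;
}

/* Small self-check: both branches behave as ordinary statements. */
void demo_selfcheck(void)
{
	assert(demo_unlock(1) == 1);
	assert(demo_unlock(0) == 0);
}
```

Because the macro expands to a single statement that still needs its trailing semicolon, it can be used in an unbraced if/else without changing control flow or leaving an empty body behind.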



* Fail to assemble raid4 with replaced disk
From: Santiago DIEZ @ 2016-10-25 17:08 UTC (permalink / raw)
  To: Linux Raid LIST

Hi Raiders,

I had a raid5 array md10 with sd[abcd]10.
Eventually, sdd10 failed.

I did NOT do any mdadm --fail NOR mdadm --remove command.
What I did is comment out the line "ARRAY /dev/md10 ..." in
/etc/mdadm/mdadm.conf.

Then I powered off the server, replaced the disk sdd with a new one
and booted the system.

I examined the status with:
# cat /proc/mdstat
md10 : inactive sdb10[1]
      1926247296 blocks

I stopped the array with:
# mdadm --stop /dev/md10

I tried to assemble the array with the 3 original disks like this:
# mdadm --assemble /dev/md10 --verbose /dev/sda10 /dev/sdb10 /dev/sdc10
mdadm: looking for devices for /dev/md10
mdadm: /dev/sda10 is identified as a member of /dev/md10, slot 0.
mdadm: /dev/sdb10 is identified as a member of /dev/md10, slot 1.
mdadm: /dev/sdc10 is identified as a member of /dev/md10, slot 2.
mdadm: added /dev/sda10 to /dev/md10 as 0 (possibly out of date)
mdadm: added /dev/sdc10 to /dev/md10 as 2 (possibly out of date)
mdadm: no uptodate device for slot 3 of /dev/md10
mdadm: added /dev/sdb10 to /dev/md10 as 1
mdadm: /dev/md10 assembled from 1 drive - not enough to start the array.

I examined the status again with:
# cat /proc/mdstat
md10 : inactive sdb10[1](S) sdc10[2](S) sda10[0](S)
      5778741888 blocks

Now I'm SCARED!
What does the (S) mean?
How do I reassemble my array and add the new sdd10 partition?

Thanks for your help

Regards
-------------------------
Santiago DIEZ
Quark Systems & CAOBA
23 rue du Buisson Saint-Louis, 75010 Paris
-------------------------


* Re: Fail to assemble raid4 with replaced disk
From: Mikael Abrahamsson @ 2016-10-25 17:35 UTC (permalink / raw)
  To: Santiago DIEZ; +Cc: Linux Raid LIST
In-Reply-To: <CAJh8RqUjr7L_Of0fbW_mAXshmmgcPdXvhtsvhFXKF+XxjOTrFw@mail.gmail.com>

On Tue, 25 Oct 2016, Santiago DIEZ wrote:

> # mdadm --assemble /dev/md10 --verbose /dev/sda10 /dev/sdb10 /dev/sdc10
> mdadm: looking for devices for /dev/md10
> mdadm: /dev/sda10 is identified as a member of /dev/md10, slot 0.
> mdadm: /dev/sdb10 is identified as a member of /dev/md10, slot 1.
> mdadm: /dev/sdc10 is identified as a member of /dev/md10, slot 2.
> mdadm: added /dev/sda10 to /dev/md10 as 0 (possibly out of date)
> mdadm: added /dev/sdc10 to /dev/md10 as 2 (possibly out of date)
> mdadm: no uptodate device for slot 3 of /dev/md10
> mdadm: added /dev/sdb10 to /dev/md10 as 1
> mdadm: /dev/md10 assembled from 1 drive - not enough to start the array.

This means sda10 and sdc10 most likely have a lower event count than 
sdb10.

> I examined the status again with:
> # cat /proc/mdstat
> md10 : inactive sdb10[1](S) sdc10[2](S) sda10[0](S)
>      5778741888 blocks
>
> Now I'm SCARED!
> What does the (S) mean?
> How do I reassemble my array and add the new sdd10 partition?

Check the event count on each member with mdadm -E /dev/sd[abc]10. If it 
differs just a little (5-10 perhaps), you can use --assemble --force to 
start the array even though the event count is not exactly the same on 
each drive.

The event count is increased every time a drive is written to. When there 
is an unclean shutdown, mdadm won't auto-assemble the drives; it waits for 
an operator to understand the situation and act accordingly.
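
To make the "differs just a little" check concrete, here is a rough sketch in C. It assumes the Events values have already been read by hand from the mdadm -E output; the names event_spread, force_looks_reasonable and DEMO_MAX_SPREAD are made up for illustration, and the threshold of 10 is only the rule of thumb above, not anything mdadm itself enforces.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Rule-of-thumb bound taken from the advice above; not an mdadm constant. */
#define DEMO_MAX_SPREAD 10

/* Largest gap between any two event counts; 0 for an empty list. */
uint64_t event_spread(const uint64_t *events, size_t n)
{
	uint64_t lo, hi;
	size_t i;

	assert(events != NULL || n == 0);
	if (n == 0)
		return 0;

	lo = hi = events[0];
	for (i = 1; i < n; i++) {
		if (events[i] < lo)
			lo = events[i];
		if (events[i] > hi)
			hi = events[i];
	}
	return hi - lo;
}

/* Nonzero when the counts are close enough that --force seems reasonable. */
int force_looks_reasonable(const uint64_t *events, size_t n)
{
	return event_spread(events, n) <= DEMO_MAX_SPREAD;
}
```

This only encodes the comparison. Whether forcing the assembly is actually safe still depends on what happened to the array, so treat the result as a hint rather than a verdict.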


-- 
Mikael Abrahamsson    email: swmike@swm.pp.se


