* Problems reassembling raid6 with --force option
@ 2016-01-02 18:13 René
From: René @ 2016-01-02 18:13 UTC
  To: linux-raid

Hi,

I have a software raid6 consisting of 6 drives (sd[f-k] at the moment), assembled as md1.
For maintenance I deactivated the array via mdadm --stop /dev/md1.
When I tried to reassemble the array later, I got the following:

# mdadm --assemble -v /dev/md1 
mdadm: added /dev/sdk1 to /dev/md1 as 0 (possibly out of date) 
mdadm: added /dev/sdh1 to /dev/md1 as 2 
mdadm: added /dev/sdf1 to /dev/md1 as 3 
mdadm: added /dev/sdj1 to /dev/md1 as 4 
mdadm: added /dev/sdg1 to /dev/md1 as 5 (possibly out of date) 
mdadm: added /dev/sdi1 to /dev/md1 as 1 
mdadm: /dev/md1 has been started with 4 drives (out of 6).


The event counts of the drives were only off by 3:

# mdadm --examine /dev/sd[fghijk]1 | egrep 'Events|/dev/sd' 
/dev/sdf1: 
         Events : 17405 
/dev/sdg1: 
         Events : 17402 
/dev/sdh1: 
         Events : 17405 
/dev/sdi1: 
         Events : 17405 
/dev/sdj1: 
         Events : 17405 
/dev/sdk1: 
         Events : 17402
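
To make the comparison explicit, here is a minimal C sketch that takes the six event counts listed above and works out the newest count and how many members lag behind it. The struct and function names are made up for illustration and are not part of mdadm.

```c
#include <stddef.h>

/* Hypothetical record: one array member and its superblock event count. */
struct member {
    const char *dev;
    unsigned long long events;
};

/* Values copied from the --examine output above. */
static const struct member members[] = {
    { "/dev/sdf1", 17405 },
    { "/dev/sdg1", 17402 },
    { "/dev/sdh1", 17405 },
    { "/dev/sdi1", 17405 },
    { "/dev/sdj1", 17405 },
    { "/dev/sdk1", 17402 },
};
static const size_t member_count = sizeof(members) / sizeof(members[0]);

/* Return the highest event count among n members (0 if n is 0). */
static unsigned long long newest_events(const struct member *m, size_t n)
{
    unsigned long long newest = 0;

    for (size_t i = 0; i < n; i++)
        if (m[i].events > newest)
            newest = m[i].events;
    return newest;
}

/* Count the members whose event count is below the newest one. */
static size_t count_stale(const struct member *m, size_t n)
{
    unsigned long long newest = newest_events(m, n);
    size_t stale = 0;

    for (size_t i = 0; i < n; i++)
        if (m[i].events < newest)
            stale++;
    return stale;
}
```

With these values newest_events() gives 17405 and count_stale() gives 2 (sdg1 and sdk1, each 3 events behind).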


After reading the man page and searching the internet, I thought --force should do the trick, but it did not:

# mdadm --assemble -v --force /dev/md1 /dev/sd[fghijk]1        
mdadm: looking for devices for /dev/md1 
mdadm: /dev/sdf1 is identified as a member of /dev/md1, slot 3. 
mdadm: /dev/sdg1 is identified as a member of /dev/md1, slot 5. 
mdadm: /dev/sdh1 is identified as a member of /dev/md1, slot 2. 
mdadm: /dev/sdi1 is identified as a member of /dev/md1, slot 1. 
mdadm: /dev/sdj1 is identified as a member of /dev/md1, slot 4. 
mdadm: /dev/sdk1 is identified as a member of /dev/md1, slot 0. 
mdadm: added /dev/sdk1 to /dev/md1 as 0 (possibly out of date) 
mdadm: added /dev/sdh1 to /dev/md1 as 2 
mdadm: added /dev/sdf1 to /dev/md1 as 3 
mdadm: added /dev/sdj1 to /dev/md1 as 4 
mdadm: added /dev/sdg1 to /dev/md1 as 5 (possibly out of date) 
mdadm: added /dev/sdi1 to /dev/md1 as 1 
mdadm: /dev/md1 has been started with 4 drives (out of 6).

Looking at the mdadm source, I found this part of the forced-assembly code (Assemble.c):

static int force_array(struct mdinfo *content,
                       struct devs *devices,
                       int *best, int bestcnt, char *avail,
                       int most_recent,
                       struct supertype *st,
                       struct context *c)
{
        int okcnt = 0;
        while (!enough(content->array.level, content->array.raid_disks,
                       content->array.layout, 1,
                       avail)
               ||
               (content->reshape_active && content->delta_disks > 0 &&
                !enough(content->array.level, (content->array.raid_disks
                                               - content->delta_disks),
                        content->new_layout, 1,
                        avail)
               )) {
                ...
        }
        return okcnt;
}

So it only updates the event count when there are not enough disks to start the array. Because only two of my drives were "out of date" and the other four were valid, which is enough to start a six-drive raid6, --force did nothing.
Running the assembly with one of the up-to-date drives left out (I replaced sdj1 with sdx1 on the command line) worked:

# mdadm --assemble -fv /dev/md1 /dev/sd[fghixk]1 
mdadm: looking for devices for /dev/md1 
mdadm: /dev/sdf1 is identified as a member of /dev/md1, slot 3. 
mdadm: /dev/sdg1 is identified as a member of /dev/md1, slot 5. 
mdadm: /dev/sdh1 is identified as a member of /dev/md1, slot 2. 
mdadm: /dev/sdi1 is identified as a member of /dev/md1, slot 1. 
mdadm: /dev/sdk1 is identified as a member of /dev/md1, slot 0. 
mdadm: forcing event count in /dev/sdk1(0) from 17402 upto 17405 
mdadm: forcing event count in /dev/sdg1(5) from 17402 upto 17405 
mdadm: added /dev/sdi1 to /dev/md1 as 1 
mdadm: added /dev/sdh1 to /dev/md1 as 2 
mdadm: added /dev/sdf1 to /dev/md1 as 3 
mdadm: no uptodate device for slot 4 of /dev/md1 
mdadm: added /dev/sdg1 to /dev/md1 as 5 
mdadm: added /dev/sdk1 to /dev/md1 as 0 
mdadm: /dev/md1 has been started with 5 drives (out of 6).
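
The two runs can be reproduced with a small toy model in C. This is only a sketch of the loop condition quoted from Assemble.c, not mdadm's actual code: raid6_enough() and model_force() are hypothetical names, the model handles raid6 only, and the rule that every device of the same "vintage" is forced together is an assumption inferred from the output above, where both sdk1 and sdg1 were bumped.

```c
#include <stddef.h>

/* Simplified stand-in for the enough() check, raid6 only:
 * the array can start if at most two members are missing. */
static int raid6_enough(int raid_disks, int avail_count)
{
    return avail_count >= raid_disks - 2;
}

/*
 * Toy model of forced assembly (hypothetical, for illustration).
 * events[i] is the event count of the device in slot i, or 0 if no
 * device was given for that slot. A slot counts as available when its
 * event count equals the newest one. While too few slots are
 * available, the freshest stale device and all devices sharing its
 * event count are bumped to the newest count.
 * Returns the number of devices whose event count was forced.
 */
static int model_force(unsigned long long *events, int raid_disks)
{
    unsigned long long newest = 0;
    int avail = 0, forced = 0;

    for (int i = 0; i < raid_disks; i++)
        if (events[i] > newest)
            newest = events[i];
    for (int i = 0; i < raid_disks; i++)
        if (events[i] != 0 && events[i] == newest)
            avail++;

    while (!raid6_enough(raid_disks, avail)) {
        int pick = -1;

        /* Find the stale device with the highest event count. */
        for (int i = 0; i < raid_disks; i++)
            if (events[i] != 0 && events[i] < newest &&
                (pick < 0 || events[i] > events[pick]))
                pick = i;
        if (pick < 0)
            break;          /* nothing left that could be forced */

        /* Force every device of that vintage up to the newest count. */
        unsigned long long vintage = events[pick];
        for (int i = 0; i < raid_disks; i++)
            if (events[i] == vintage) {
                events[i] = newest;
                avail++;
                forced++;
            }
    }
    return forced;
}
```

Called with the slot-ordered counts {17402, 17405, 17405, 17405, 17405, 17402}, the loop never runs and nothing is forced, which matches my first --force attempt. With slot 4 set to 0 (sdj1 left out) only three slots are available, so the loop runs once and forces both stale devices.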

The intention of this behaviour might be that, when there are enough disks to start the array, a rebuild is safer for data integrity than forcing the event count. (Because I shut down the array properly [at least I think so] and the event counts differed by so little, I chose to trick mdadm.)
Is my assumption about the intention of the code right? If so, I think it should be mentioned in the man page.

Regards,
René