* Wrong array assembly on boot?
From: Dark Penguin @ 2017-07-22 18:39 UTC (permalink / raw)
To: linux-raid
Greetings!
I have a mirror RAID with two devices (sdc1 and sde1). It's not a root
partition, just a RAID with some data for services running on this
server. (I'm running Debian Jessie x86_64 with a 4.1.18 kernel.) The
RAID is listed in /etc/mdadm, and it has an external bitmap in /RAID .
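(For the record, I believe the external bitmap was added with something
along these lines - quoting from memory, so treat it as a sketch:
$ sudo mdadm --grow /dev/md/RAID --bitmap=/RAID
i.e. a bitmap file living outside the array itself.)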
One of the devices in the RAID (sdc1) "fell off" - it disappeared from
the system for some reason. Well, I thought, I'll have to reboot to get
the drive back, and then re-add it.
That's what I did. After the reboot, I saw a degraded array with one
drive missing, so I found out which one and re-added it.
Later, I noticed that some data was missing, and thinking about the
situation led me to understand what had happened. After the reboot, the
system tried to assemble my arrays; it found sdc1 first (the one that
had disappeared), assembled a degraded array with only that drive, and
started it. When I re-added the second drive, I overwrote everything
that had been written between those two events.
Now I'm trying to understand why this happened and what I am supposed
to do in this situation to handle it properly. So I have a lot of
questions, all boiling down to "how should booting with degraded arrays
be handled?"
- Why did mdadm not notice that the second drive was "newer"? I thought
there were timestamps in the devices and even in the bitmap!..
- Why did it START this array?! I thought that if a degraded array is
found at boot, it's supposed to be assembled but not started?.. At least
I think that's how it used to be in Wheezy (before systemd?).
- Googling revealed that if a degraded array is detected, the system
should stop booting and ask for confirmation on the console. (Only for
root partitions? And only before systemd?..)
- My services are not going to be happy either way. If the array is
assembled but not run, their data will be missing. If the array is
assembled and run, it's even worse - they will start with outdated data!
How is this even supposed to be handled?.. Should I add a dependency on
a specific mountpoint to each service definition (something like the
sketch below)?.. Am I wrong in thinking that mdadm should have detected
that the second drive was "newer" and assembled the array just as it was
before, thus avoiding all those problems easily?.. Especially
considering that the array as recorded on the "new" drive already
consists of only one device, so it is "not as degraded" and would be
fine to run, compared to the array on the "old" drive, which was not
stopped properly and only now learns that one of the drives is missing?
Or has this behaviour already been changed in newer versions?..
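To illustrate the "dependency on a specific mountpoint" idea mentioned
above - a rough sketch only, with a made-up service name and mountpoint,
and I'm not at all sure it's the right approach:
  # hypothetical drop-in: /etc/systemd/system/myservice.service.d/raid.conf
  [Unit]
  # do not start the service until this (made-up) path is mounted
  RequiresMountsFor=/srv/raid-data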
--
darkpenguin
* Re: Wrong array assembly on boot?
From: Wols Lists @ 2017-07-24 14:48 UTC (permalink / raw)
To: Dark Penguin, linux-raid
On 22/07/17 19:39, Dark Penguin wrote:
> Greetings!
>
> I have a mirror RAID with two devices (sdc1 and sde1). It's not a root
> partition, just a RAID with some data for services running on this
> server. (I'm running Debian Jessie x86_64 with a 4.1.18 kernel.) The
> RAID is listed in /etc/mdadm, and it has an external bitmap in /RAID .
As an absolute minimum, can you please give us your version of mdadm.
And the output of "mdadm --display" of your arrays. (I think I've got
that right, I think --examine is the disk ...)
Cheers,
Wol
* Re: Wrong array assembly on boot?
From: Dark Penguin @ 2017-07-24 15:27 UTC (permalink / raw)
To: Wols Lists, linux-raid
On 24/07/17 17:48, Wols Lists wrote:
> On 22/07/17 19:39, Dark Penguin wrote:
>> Greetings!
>>
>> I have a mirror RAID with two devices (sdc1 and sde1). It's not a root
>> partition, just a RAID with some data for services running on this
>> server. (I'm running Debian Jessie x86_64 with a 4.1.18 kernel.) The
>> RAID is listed in /etc/mdadm, and it has an external bitmap in /RAID .
>
> As an absolute minimum, can you please give us your version of mdadm.
Oh, right, sorry. I thought the "absolute minimum" would be the kernel
version and the distribution. :)
mdadm - v3.3.2 - 21st August 2014
> And the output of "mdadm --display" of your arrays. (I think I've got
> that right, I think --examine is the disk ...)
It's "mdadm --detail --scan" for all arrays or "mdadm --detail /dev/md0"
for md0.
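(For reference, "mdadm --detail --scan" just prints one ARRAY line per
array, much like the mdadm.conf entries - roughly:
$ sudo mdadm --detail --scan
ARRAY /dev/md/RAID metadata=1.2 name=BAAL:RAID UUID=8b5f18f0:54f655b7:8bfcc60d:4db6e6c8
<... similar lines for the other arrays ...>
)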
I have 8 arrays on this server, and the only one that's relevant is this
one. (The rest of them are set up exactly the same way, but with
different names and UUIDs.) So, to avoid clutter:
$ sudo mdadm --detail /dev/md/RAID
/dev/md/RAID:
Version : 1.2
Creation Time : Thu Oct 6 23:15:56 2016
Raid Level : raid1
Array Size : 244066432 (232.76 GiB 249.92 GB)
Used Dev Size : 244066432 (232.76 GiB 249.92 GB)
Raid Devices : 2
Total Devices : 2
Persistence : Superblock is persistent
Intent Bitmap : /RAID
Update Time : Mon Jul 24 17:59:53 2017
State : clean
Active Devices : 2
Working Devices : 2
Failed Devices : 0
Spare Devices : 0
Name : BAAL:RAID (local to host BAAL)
UUID : 8b5f18f0:54f655b7:8bfcc60d:4db6e6c8
Events : 5000
Number Major Minor RaidDevice State
0 8 65 0 active sync /dev/sde1
1 8 33 1 active sync writemostly /dev/sdc1
And the /etc/mdadm/mdadm.conf entry is:
ARRAY /dev/md/RAID metadata=1.2 name=BAAL:RAID bitmap=/RAID
UUID=8b5f18f0:54f655b7:8bfcc60d:4db6e6c8
I don't use device names here because they change often in a server
with 8 arrays and 20 drives (sometimes I connect a new one or remove an
old one...). The UUID is there and the bitmap file is there, so mdadm
just looks for all drives with that UUID and assembles the array.
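(If I ever had to do the same thing by hand, I believe the equivalent
would be roughly this - exact options from memory, so treat it as a
sketch:
$ sudo mdadm --assemble --scan --uuid=8b5f18f0:54f655b7:8bfcc60d:4db6e6c8
)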
As I understand it, mdadm found the first device (/dev/sdc1, the
outdated one) and immediately assembled an array from it. Then it found
the second device (/dev/sde1, the up-to-date one), noticed an
inconsistency and did not add it. The question is: why did it start the
array, why did it not halt the boot process, and why did it not realize
that the second device was newer (especially since that device already
knows about the disappearance of the first one!)...
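(For what it's worth, this is roughly how I compared the two superblocks
afterwards - the field names are as I remember them from "--examine"
output:
$ sudo mdadm --examine /dev/sdc1 /dev/sde1 | grep -E 'Update Time|Events|Device Role|Array State'
)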
--
darkpenguin
* Re: Wrong array assembly on boot?
From: Wols Lists @ 2017-07-24 19:36 UTC (permalink / raw)
To: Dark Penguin, linux-raid
On 24/07/17 16:27, Dark Penguin wrote:
> On 24/07/17 17:48, Wols Lists wrote:
>> > On 22/07/17 19:39, Dark Penguin wrote:
>>> >> Greetings!
>>> >>
>>> >> I have a mirror RAID with two devices (sdc1 and sde1). It's not a root
>>> >> partition, just a RAID with some data for services running on this
>>> >> server. (I'm running Debian Jessie x86_64 with a 4.1.18 kernel.) The
>>> >> RAID is listed in /etc/mdadm, and it has an external bitmap in /RAID .
>> >
>> > As an absolute minimum, can you please give us your version of mdadm.
> Oh, right, sorry. I thought the "absolute minimum" would be the kernel
> version and the distribution. :)
>
> mdadm - v3.3.2 - 21st August 2014
>
>
I was afraid it might be that ...
You've hit a known bug in mdadm. It doesn't always successfully assemble
a mirror. I had exactly that problem - I created one mirror and when I
rebooted I had two ...
Can't offer any advice about how to fix your damaged mirror, but you
need to upgrade mdadm! That's two minor versions out of date - 3.4 and 4.0.
Cheers,
Wol
* Re: Wrong array assembly on boot?
From: Dark Penguin @ 2017-07-24 19:58 UTC (permalink / raw)
To: Wols Lists, linux-raid
On 24/07/17 22:36, Wols Lists wrote:
> On 24/07/17 16:27, Dark Penguin wrote:
>> On 24/07/17 17:48, Wols Lists wrote:
>>>> On 22/07/17 19:39, Dark Penguin wrote:
>>>>>> Greetings!
>>>>>>
>>>>>> I have a mirror RAID with two devices (sdc1 and sde1). It's not a root
>>>>>> partition, just a RAID with some data for services running on this
>>>>>> server. (I'm running Debian Jessie x86_64 with a 4.1.18 kernel.) The
>>>>>> RAID is listed in /etc/mdadm, and it has an external bitmap in /RAID .
>>>>
>>>> As an absolute minimum, can you please give us your version of mdadm.
>> Oh, right, sorry. I thought the "absolute minimum" would be the kernel
>> version and the distribution. :)
>>
>> mdadm - v3.3.2 - 21st August 2014
>>
>>
> I was afraid it might be that ...
>
> You've hit a known bug in mdadm. It doesn't always successfully assemble
> a mirror. I had exactly that problem - I created one mirror and when I
> rebooted I had two ...
>
> Can't offer any advice about how to fix your damaged mirror, but you
> need to upgrade mdadm! That's two minor versions out of date - 3.4 and 4.0.
>
> Cheers,
> Wol
My mirror is not damaged anymore - it's quite healthy and cleanly
missing some information I've overwritten. :) Of course, there's no way
to help that now - that's what backups are for. I just wanted to learn
how to avoid this situation in the future, and how it is really
supposed to handle such things.
Is this bug fixed in the newer mdadm? Or is it "known, but not fixed yet"?
--
darkpenguin
* Re: Wrong array assembly on boot?
From: Wols Lists @ 2017-07-24 20:20 UTC (permalink / raw)
To: Dark Penguin, linux-raid
On 24/07/17 20:58, Dark Penguin wrote:
> On 24/07/17 22:36, Wols Lists wrote:
>> On 24/07/17 16:27, Dark Penguin wrote:
>>> On 24/07/17 17:48, Wols Lists wrote:
>>>>> On 22/07/17 19:39, Dark Penguin wrote:
>>>>>>> Greetings!
>>>>>>>
>>>>>>> I have a mirror RAID with two devices (sdc1 and sde1). It's not a root
>>>>>>> partition, just a RAID with some data for services running on this
>>>>>>> server. (I'm running Debian Jessie x86_64 with a 4.1.18 kernel.) The
>>>>>>> RAID is listed in /etc/mdadm, and it has an external bitmap in /RAID .
>>>>>
>>>>> As an absolute minimum, can you please give us your version of mdadm.
>>> Oh, right, sorry. I thought the "absolute minimum" would be the kernel
>>> version and the distribution. :)
>>>
>>> mdadm - v3.3.2 - 21st August 2014
>>>
>>>
>> I was afraid it might be that ...
>>
>> You've hit a known bug in mdadm. It doesn't always successfully assemble
>> a mirror. I had exactly that problem - I created one mirror and when I
>> rebooted I had two ...
>>
>> Can't offer any advice about how to fix your damaged mirror, but you
>> need to upgrade mdadm! That's two minor versions out of date - 3.4 and 4.0.
>>
>> Cheers,
>> Wol
>
> My mirror is not damaged anymore - it's quite healthy and cleanly
> missing some information I've overwritten. :) Of course, there's no way
> to help that now - that's what backups are for. I just wanted to learn
> how to avoid this situation in the future. And learn how is it really
> supposed to handle such things.
>
> Is this bug fixed in the newer mdadm? Or is it "known, but not fixed yet"?
>
>
Long fixed :-)
Cheers,
Wol
* Re: Wrong array assembly on boot?
From: Dark Penguin @ 2017-12-16 12:40 UTC (permalink / raw)
To: Wols Lists, linux-raid
On 24/07/17 23:20, Wols Lists wrote:
> On 24/07/17 20:58, Dark Penguin wrote:
>> On 24/07/17 22:36, Wols Lists wrote:
>>> On 24/07/17 16:27, Dark Penguin wrote:
>>>> On 24/07/17 17:48, Wols Lists wrote:
>>>>>> On 22/07/17 19:39, Dark Penguin wrote:
>>>>>>>> Greetings!
>>>>>>>>
>>>>>>>> I have a mirror RAID with two devices (sdc1 and sde1). It's not a root
>>>>>>>> partition, just a RAID with some data for services running on this
>>>>>>>> server. (I'm running Debian Jessie x86_64 with a 4.1.18 kernel.) The
>>>>>>>> RAID is listed in /etc/mdadm, and it has an external bitmap in /RAID .
>>>>>>
>>>>>> As an absolute minimum, can you please give us your version of mdadm.
>>>> Oh, right, sorry. I thought the "absolute minimum" would be the kernel
>>>> version and the distribution. :)
>>>>
>>>> mdadm - v3.3.2 - 21st August 2014
>>>>
>>>>
>>> I was afraid it might be that ...
>>>
>>> You've hit a known bug in mdadm. It doesn't always successfully assemble
>>> a mirror. I had exactly that problem - I created one mirror and when I
>>> rebooted I had two ...
I think this is not the same problem (see below).
>>> Can't offer any advice about how to fix your damaged mirror, but you
>>> need to upgrade mdadm! That's two minor versions out of date - 3.4 and 4.0.
It's 3.4-4 in Ubuntu 17.10 and 3.4-4 in Debian Stretch, so I assume 4.0
must be "not there yet"...
>> My mirror is not damaged anymore - it's quite healthy and cleanly
>> missing some information I've overwritten. :) Of course, there's no way
>> to help that now - that's what backups are for. I just wanted to learn
>> how to avoid this situation in the future. And learn how is it really
>> supposed to handle such things.
>>
>> Is this bug fixed in the newer mdadm? Or is it "known, but not fixed yet"?
>>
>>
> Long fixed :-)
No, this is still not fixed in Ubuntu Artful (17.10) with mdadm v3.4-4 .
My problem is the following (tested just now on Ubuntu 17.10):
- I create a RAID1 on two devices: /dev/sda1 and /dev/sdb1 (writemostly)
- I use it
- I pull /dev/sda1 out (bad cable, exactly the same situation as I had)
- I continue using the degraded array:
$ sudo mdadm --detail /dev/md0
/dev/md0:
<...>
Number Major Minor RaidDevice State
- 0 0 0 removed
1 8 17 1 active sync writemostly /dev/sdb1
- I shut down the machine and replace the cable, then boot it up again
- I see the following:
mdadm: ignoring /dev/sdb1 as it reports /dev/sda1 as failed
mdadm: /dev/md/0 has been started with 1 drive (out of 2).
mdadm: Found some drive for an array that is already active: /dev/md/0
mdadm: giving up.
$ sudo mdadm --detail /dev/md0
/dev/md0:
<...>
Number Major Minor RaidDevice State
0 8 1 0 active sync /dev/sda1
- 0 0 1 removed
So, when assembling the arrays, mdadm sees two devices:
- one that fell off and reports a clean array
- one that knows that the first one fell off and reports it as faulty
And it decides to use the one that obviously fell off - a fact it could
have learned from the second device.
Seriously? Is there a reason for choosing this behaviour, "ignore the
device that knows about problems"? It seems obviously wrong, but the
developers know about it and even added a message explaining what's
going on! There must be a reason that makes this "the lesser evil", but
I can't imagine such a situation.
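P.S. For completeness, the array in this test was created roughly like
this - exact options from memory, so treat it as a sketch:
$ sudo mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sda1 --write-mostly /dev/sdb1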
--
darkpenguin
* Re: Wrong array assembly on boot?
From: Wol's lists @ 2017-12-16 20:27 UTC (permalink / raw)
To: Dark Penguin, linux-raid
On 16/12/17 12:40, Dark Penguin wrote:
> On 24/07/17 23:20, Wols Lists wrote:
>> On 24/07/17 20:58, Dark Penguin wrote:
>>> On 24/07/17 22:36, Wols Lists wrote:
>>>> On 24/07/17 16:27, Dark Penguin wrote:
>>>>> On 24/07/17 17:48, Wols Lists wrote:
>>>>>>> On 22/07/17 19:39, Dark Penguin wrote:
>>>>>>>>> Greetings!
>>>>>>>>>
>>>>>>>>> I have a mirror RAID with two devices (sdc1 and sde1). It's not a root
>>>>>>>>> partition, just a RAID with some data for services running on this
>>>>>>>>> server. (I'm running Debian Jessie x86_64 with a 4.1.18 kernel.) The
>>>>>>>>> RAID is listed in /etc/mdadm, and it has an external bitmap in /RAID .
>>>>>>>
>>>>>>> As an absolute minimum, can you please give us your version of mdadm.
>>>>> Oh, right, sorry. I thought the "absolute minimum" would be the kernel
>>>>> version and the distribution. :)
>>>>>
>>>>> mdadm - v3.3.2 - 21st August 2014
>>>>>
>>>>>
>>>> I was afraid it might be that ...
>>>>
>>>> You've hit a known bug in mdadm. It doesn't always successfully assemble
>>>> a mirror. I had exactly that problem - I created one mirror and when I
>>>> rebooted I had two ...
>
> I think this is not the same problem (see below).
>
>
>>>> Can't offer any advice about how to fix your damaged mirror, but you
>>>> need to upgrade mdadm! That's two minor versions out of date - 3.4 and 4.0.
>
> It's 3.4-4 in Ubuntu 17.10 and 3.4-4 in Debian Stretch, so I assume 4.0
> must be "not there yet"...
>
https://raid.wiki.kernel.org/index.php/Linux_Raid#Help_wanted
mdadm 4.0 is nearly a year old ...
>
>>> My mirror is not damaged anymore - it's quite healthy and cleanly
>>> missing some information I've overwritten. :) Of course, there's no way
>>> to help that now - that's what backups are for. I just wanted to learn
>>> how to avoid this situation in the future. And learn how is it really
>>> supposed to handle such things.
>>>
>>> Is this bug fixed in the newer mdadm? Or is it "known, but not fixed yet"?
>>>
>>>
>> Long fixed :-)
>
> No, this is still not fixed in Ubuntu Artful (17.10) with mdadm v3.4-4 .
>
> My problem is the following (tested just now on Ubuntu 17.10):
>
>
> - I create a RAID1 on two devices: /dev/sda1 and /dev/sdb1 (writemostly)
> - I use it
> - I pull /dev/sda1 out (bad cable, exactly the same situation as I had)
> - I continue using the degraded array:
>
> $ sudo mdadm --detail /dev/md0
> /dev/md0:
> <...>
> Number Major Minor RaidDevice State
> - 0 0 0 removed
> 1 8 17 1 active sync writemostly /dev/sdb1
>
>
> - I shut down the machine and replace the cable, then boot it up again
> - I see the following:
>
> mdadm: ignoring /dev/sdb1 as it reports /dev/sda1 as failed
> mdadm: /dev/md/0 has been started with 1 drive (out of 2).
> mdadm: Found some drive for an array that is already active: /dev/md/0
> mdadm: giving up.
>
> $ sudo mdadm --detail /dev/md0
> /dev/md0:
> <...>
> Number Major Minor RaidDevice State
> 0 8 1 0 active sync /dev/sda1
> - 0 0 1 removed
>
>
> So, when assembling the arrays, mdadm sees two devices:
> - one that fell off and reports a clean array
> - one that knows that the first one fell off and reports it as faulty
>
> And it decides to use the one that obviously fell off, which it knows
> about from the second device.
Except that it does NOT know about the second device !!! (At least, not
to start with.)
>
> Seriously? Is there a reason for this chosen behaviour, "ignoring the
> device that knows about problems"? It seems obviously wrong, but they
> know about it and even put the message to explain what's going on! There
> must be a reason that makes this "the lesser evil", but I can't imagine
> that situation.
>
Read the mdadm manual, especially the parts about booting and
incremental assembly ("mdadm --incremental").
udev detects sda and passes it to mdadm, which starts building the array.
udev then detects sdb and passes it to mdadm, WHICH HITS A BUG IN 3.4
AND MESSES UP THE ASSEMBLY.
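Simplifying a lot, the udev rule effectively runs something along the
lines of:
  mdadm --incremental /dev/sda1   # first member shows up
  mdadm --incremental /dev/sdb1   # second member shows up - this is where it goes wrong
plus a late-boot fallback that starts still-incomplete arrays degraded.
(That's a sketch of the mechanism, not the exact rules.)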
Standard advice for fixing any problems is always "upgrade to the latest
version and see if you can reproduce the problem". I don't remember
which version(s) of mdadm had this bug, but I know there were a LOT of
fixes like this that went into v4.
Cheers,
Wol
* Re: Wrong array assembly on boot?
From: Dark Penguin @ 2017-12-17 11:38 UTC (permalink / raw)
To: Wol's lists, linux-raid
On 16/12/17 23:27, Wol's lists wrote:
> On 16/12/17 12:40, Dark Penguin wrote:
>> On 24/07/17 23:20, Wols Lists wrote:
>>> On 24/07/17 20:58, Dark Penguin wrote:
>>>> On 24/07/17 22:36, Wols Lists wrote:
>>>>> On 24/07/17 16:27, Dark Penguin wrote:
>>>>>> On 24/07/17 17:48, Wols Lists wrote:
>>>>>>>> On 22/07/17 19:39, Dark Penguin wrote:
>>>>>>>>>> Greetings!
>>>>>>>>>>
>>>>>>>>>> I have a mirror RAID with two devices (sdc1 and sde1). It's not a root
>>>>>>>>>> partition, just a RAID with some data for services running on this
>>>>>>>>>> server. (I'm running Debian Jessie x86_64 with a 4.1.18 kernel.) The
>>>>>>>>>> RAID is listed in /etc/mdadm, and it has an external bitmap in /RAID .
>>>>>>>>
>>>>>>>> As an absolute minimum, can you please give us your version of mdadm.
>>>>>> Oh, right, sorry. I thought the "absolute minimum" would be the kernel
>>>>>> version and the distribution. :)
>>>>>>
>>>>>> mdadm - v3.3.2 - 21st August 2014
>>>>>>
>>>>>>
>>>>> I was afraid it might be that ...
>>>>>
>>>>> You've hit a known bug in mdadm. It doesn't always successfully assemble
>>>>> a mirror. I had exactly that problem - I created one mirror and when I
>>>>> rebooted I had two ...
>>
>> I think this is not the same problem (see below).
>>
>>
>>>>> Can't offer any advice about how to fix your damaged mirror, but you
>>>>> need to upgrade mdadm! That's two minor versions out of date - 3.4 and 4.0.
>>
>> It's 3.4-4 in Ubuntu 17.10 and 3.4-4 in Debian Stretch, so I assume 4.0
>> must be "not there yet"...
>>
> https://raid.wiki.kernel.org/index.php/Linux_Raid#Help_wanted
>
> mdadm 4.0 is nearly a year old ...
>>
>>>> My mirror is not damaged anymore - it's quite healthy and cleanly
>>>> missing some information I've overwritten. :) Of course, there's no way
>>>> to help that now - that's what backups are for. I just wanted to learn
>>>> how to avoid this situation in the future. And learn how is it really
>>>> supposed to handle such things.
>>>>
>>>> Is this bug fixed in the newer mdadm? Or is it "known, but not fixed yet"?
>>>>
>>>>
>>> Long fixed :-)
>>
>> No, this is still not fixed in Ubuntu Artful (17.10) with mdadm v3.4-4 .
>>
>> My problem is the following (tested just now on Ubuntu 17.10):
>>
>>
>> - I create a RAID1 on two devices: /dev/sda1 and /dev/sdb1 (writemostly)
>> - I use it
>> - I pull /dev/sda1 out (bad cable, exactly the same situation as I had)
>> - I continue using the degraded array:
>>
>> $ sudo mdadm --detail /dev/md0
>> /dev/md0:
>> <...>
>> Number Major Minor RaidDevice State
>> - 0 0 0 removed
>> 1 8 17 1 active sync writemostly /dev/sdb1
>>
>>
>> - I shut down the machine and replace the cable, then boot it up again
>> - I see the following:
>>
>> mdadm: ignoring /dev/sdb1 as it reports /dev/sda1 as failed
>> mdadm: /dev/md/0 has been started with 1 drive (out of 2).
>> mdadm: Found some drive for an array that is already active: /dev/md/0
>> mdadm: giving up.
>>
>> $ sudo mdadm --detail /dev/md0
>> /dev/md0:
>> <...>
>> Number Major Minor RaidDevice State
>> 0 8 1 0 active sync /dev/sda1
>> - 0 0 1 removed
>>
>>
>> So, when assembling the arrays, mdadm sees two devices:
>> - one that fell off and reports a clean array
>> - one that knows that the first one fell off and reports it as faulty
>>
>> And it decides to use the one that obviously fell off, which it knows
>> about from the second device.
>
> Except that it does NOT know about the second device !!! (At least, not
> to start with.)
>>
>> Seriously? Is there a reason for this chosen behaviour, "ignoring the
>> device that knows about problems"? It seems obviously wrong, but they
>> know about it and even put the message to explain what's going on! There
>> must be a reason that makes this "the lesser evil", but I can't imagine
>> that situation.
>>
> Read the mdadm manual, especially the parts about booting and
> incremental assembly ("mdadm --incremental").
>
> udev detects sda and passes it to mdadm, which starts building the array.
>
> udev then detects sdb and passes it to mdadm, WHICH HITS A BUG IN 3.4
> AND MESSES UP THE ASSEMBLY.
>
> Standard advice for fixing any problems is always "upgrade to the latest
> version and see if you can reproduce the problem". I don't remember
> which version(s) of mdadm had this bug, but I know there were a LOT of
> fixes like this that went into v4.
>
> Cheers,
> Wol
I was wrong - I was actually testing it on Ubuntu 16.10, which has mdadm
3.4-4 (I assumed "long fixed" meant more than a year ago). Now I have
tried it on 17.10, which of course has mdadm 4.0-2. And the problem is
still there. But I gathered more data this time, including situations in
which the problem goes away.
To reproduce the problem:
- Boot into Ubuntu 17.10 LiveCD and install mdadm (4.0-2 in the repos).
- Create a RAID1 array from two drives and wait for the rebuild.
  * The first one MUST be earlier in alphabetical order,
    i.e. sda1 and sdb1, NOT sdb1 and sda1!
  * The second device (sdb1) MUST be write-mostly!
- Create a filesystem, mount the array and put something on it.
- Disconnect the SATA cable from the FIRST device (NOT the write-mostly one).
- Put more data on the array (to easily see if it's there later).
- Shut down the machine, reconnect the cable, boot back up, install mdadm.
- Do mdadm --assemble --scan
$ sudo mdadm --assemble --scan
mdadm: ignoring /dev/sdb1 as it reports /dev/sda1 as failed
mdadm: /dev/md/0 has been started with 1 drive (out of 2).
You can confirm that your "new" data is NOT on the array.
WHAT'S MORE, now do:
$ mdadm --add /dev/md0 --write-mostly /dev/sdb1
mdadm: *re-added* /dev/sdb1
"Re-added"?! But there is no write-intent bitmap!..
Experimenting with different situations gives more results. For example,
I had a situation where mdadm automatically re-added the device to the
array, so after a reboot I got a "clean" array (I don't remember whether
it had been assembled correctly or not). If the second device is not
write-mostly, the problem goes away. If you disconnect the second device
instead of the first one, the problem also goes away.
What I don't understand is the logic of ignoring the device that reports
others as faulty. In what situation could it possibly be sane to ignore
it, instead of using it and ignoring all the others? On the other hand, I
don't see this message when the "faulty" drive happens to be the second
one - the array just assembles without any errors, using the device that
reports others as faulty.
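For now, the only recovery that looks safe to me is roughly the
following - untested beyond this scenario, so treat it as a sketch, not
a recommendation: stop the wrongly-assembled array, re-assemble it from
the up-to-date half only, then add the stale device back and let it
resync:
$ sudo mdadm --stop /dev/md0             # after unmounting anything on it
$ sudo mdadm --assemble --run /dev/md0 /dev/sdb1
$ sudo mdadm --add /dev/md0 /dev/sda1    # full resync expected without a bitmap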
--
darkpenguin