linux-raid.vger.kernel.org archive mirror
From: NeilBrown <neilb@suse.de>
To: Richard Michael <rmichael@edgeofthenet.org>
Cc: linux RAID <linux-raid@vger.kernel.org>
Subject: Re: After RAID0 grow: inconsistent superblocks and /proc/mdstat
Date: Tue, 14 Jan 2014 17:11:16 +1100
Message-ID: <20140114171116.048211eb@notabene.brown>
In-Reply-To: <CABR0jEQO7f4DCMbY1zVMDxxyUD3usVTK4Jt4L9p=QN+fb6P3Uw@mail.gmail.com>


On Mon, 13 Jan 2014 00:19:28 -0500 Richard Michael
<rmichael@edgeofthenet.org> wrote:

> Neil,
> 
> Thank you for the quick reply.
> 
> I have a few followup questions and comments, inlined below.

I assume it was by mistake that you didn't copy the list on this followup,
so I've taken the liberty of copying the list on this reply.


> 
> On Mon, Jan 13, 2014 at 12:03 AM, NeilBrown <neilb@suse.de> wrote:
> > On Sun, 12 Jan 2014 23:37:57 -0500 Richard Michael
> > <rmichael@edgeofthenet.org> wrote:
> >
> >> Hello list,
> >>
> >> I grew a RAID0 by one-disk, and it re-shaped via RAID4 as expected.
> >>
> >> However, the component superblocks still say RAID4, while /proc/mdstat,
> >> /sys/block/md0/md/level and "mdadm -D" all indicate RAID0.
> >>
> >> I am reluctant to stop the array, in case auto-assemble can't put it
> >> back together.  (I suppose I could create a new array, but I'd want to
> >> be quite confident about the layout of the disks.)
> >>
> >>
> >> Is this a bug?  Should/can I re-write the superblock(s)?
> >>
> >>
> >> # cat /proc/mdstat
> >> Personalities : [raid0] [raid1] [raid6] [raid5] [raid4]
> >> md0 : active raid0 sdc1[2] sdd1[0]
> >>       5860268032 blocks super 1.2 512k chunks
> >>
> >> # cat /sys/block/md0/md/level
> >> raid0
> >>
> >> # mdadm -D /dev/md0
> >> /dev/md0:
> >>         Version : 1.2
> >>   Creation Time : Fri Jan 10 13:02:25 2014
> >>      Raid Level : raid0
> >>      Array Size : 5860268032 (5588.79 GiB 6000.91 GB)
> >>    Raid Devices : 2
> >>   Total Devices : 2
> >>     Persistence : Superblock is persistent
> >>
> >>     Update Time : Sun Jan 12 20:08:53 2014
> >>           State : clean
> >>  Active Devices : 2
> >> Working Devices : 2
> >>  Failed Devices : 0
> >>   Spare Devices : 0
> >>
> >>      Chunk Size : 512K
> >>
> >>     Number   Major   Minor   RaidDevice State
> >>        0       8       49        0      active sync   /dev/sdd1
> >>        2       8       33        1      active sync   /dev/sdc1
> >>
> >>
> >>
> >> But,
> >>
> >>
> >> # mdadm -E /dev/sd[cd]1
> >> /dev/sdc1:
> >>           Magic : a92b4efc
> >>         Version : 1.2
> >>     Feature Map : 0x0
> >>      Array UUID : 8f51352a:610d0ecd:a1e28ddd:86c8586c
> >>            Name : anvil.localdomain:0  (local to host anvil.localdomain)
> >>   Creation Time : Fri Jan 10 13:02:25 2014
> >>      Raid Level : raid4
> >>    Raid Devices : 3
> >>
> >>  Avail Dev Size : 5860268943 (2794.39 GiB 3000.46 GB)
> >>      Array Size : 5860268032 (5588.79 GiB 6000.91 GB)
> >>   Used Dev Size : 5860268032 (2794.39 GiB 3000.46 GB)
> >>     Data Offset : 260096 sectors
> >>    Super Offset : 8 sectors
> >>    Unused Space : before=260008 sectors, after=2959 sectors
> >>           State : clean
> >>     Device UUID : ad6e6c88:0f897bc1:1f6ec909:f599bc01
> >>
> >>     Update Time : Sun Jan 12 20:08:53 2014
> >>   Bad Block Log : 512 entries available at offset 72 sectors
> >>        Checksum : 1388a7b - correct
> >>          Events : 9451
> >>
> >>      Chunk Size : 512K
> >>
> >>    Device Role : Active device 1
> >>    Array State : AA. ('A' == active, '.' == missing, 'R' == replacing)
> >> /dev/sdd1:
> >>           Magic : a92b4efc
> >>         Version : 1.2
> >>     Feature Map : 0x0
> >>      Array UUID : 8f51352a:610d0ecd:a1e28ddd:86c8586c
> >>            Name : anvil.localdomain:0  (local to host anvil.localdomain)
> >>   Creation Time : Fri Jan 10 13:02:25 2014
> >>      Raid Level : raid4
> >>    Raid Devices : 3
> >>
> >>  Avail Dev Size : 5860268032 (2794.39 GiB 3000.46 GB)
> >>      Array Size : 5860268032 (5588.79 GiB 6000.91 GB)
> >>     Data Offset : 260096 sectors
> >>    Super Offset : 8 sectors
> >>    Unused Space : before=260008 sectors, after=2959 sectors
> >>           State : clean
> >>     Device UUID : b3cda274:547919b1:4e026228:0a4981e7
> >>
> >>     Update Time : Sun Jan 12 20:08:53 2014
> >>   Bad Block Log : 512 entries available at offset 72 sectors
> >>        Checksum : e16a1979 - correct
> >>          Events : 9451
> >>
> >>      Chunk Size : 512K
> >>
> >>    Device Role : Active device 0
> >>    Array State : AA. ('A' == active, '.' == missing, 'R' == replacing)
> >>
> >>
> >>
> >> Somewhat aside, I grew the array with:
> >>
> >> "mdadm --grow /dev/md0 --raid-devices=2 --add /dev/sdc1"
> >
> > That is the correct command.
> >
> >>
> >> I suspect I should not have used "--add".  Looking at the superblock,
> >> there is a 3rd unknown device, which I did not intend to add.
> >>
> >> Did I convince mdadm to add two devices at the same time, sdc1 *and* a
> >> missing device?  (This surprises me a bit, in the sense that
> >> --raid-devices=2 would pertain to the added devices, rather than the
> >> total devices in the array.)
> >>
> >> Or, perhaps mdadm added a "dummy" device as part of the temporary RAID4
> >> conversion?
> >
> > Exactly.  The RAID4 had 1 more device than the RAID0.  That is what you are
> > seeing.
> >
> > I'm a bit confused ... did you grow this from a 1-device RAID0 to a 2-device
> > RAID0?  That seems like an odd thing to do, but it should certainly work.
> 
> Yes.  I'm disk/data juggling.  I will copy the data from a third 3TB
> into the new 2-disk 6TB RAID0, then convert it to RAID5 re-using the
> third disk for parity.  (Perhaps there's a method with fewer hoops to
> hop through.)

Seems not-unreasonable.
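For reference, the whole juggling sequence might look roughly like this.  The
device name /dev/sde1 and the mount points are illustrative placeholders, not
the poster's actual devices; double-check everything with "mdadm -E" and have
backups before running any of it:

```shell
# Hypothetical sketch of the disk-juggling plan described above.
# /dev/sde1 and the /mnt paths are placeholders, not real devices here.

# 1. Grow the 1-device RAID0 to 2 devices (reshapes via a temporary RAID4):
mdadm --grow /dev/md0 --raid-devices=2 --add /dev/sdc1
cat /proc/mdstat                     # watch the reshape finish

# 2. Copy the data across from the third disk:
mount /dev/md0 /mnt/raid
cp -a /mnt/olddisk/. /mnt/raid/

# 3. Re-use the now-empty third disk for RAID5 parity:
mdadm /dev/md0 --add /dev/sde1
mdadm --grow /dev/md0 --level=5 --raid-devices=3
```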

> 
> >
> > This should work and I think I've tested it.  However, looking at the code I
> > cannot see how it ever would have worked.  I cannot see anything that would
> > write out the new metadata to the RAID0 after the reshape completes.
> > Normally md will never write to the metadata of a RAID0 so it would need
> > special handling which doesn't seem to be there.
> 
> "never write to the metadata of a RAID0":  is this why there is no
> Name, UUID or Events stanza in the "mdadm -D /dev/md0" output?
> 

No.  That's just because the level recorded in the metadata is different from
the level that md thinks the array is.  mdadm detects this inconsistency and
decides not to trust the metadata.

> >
> > I just tried testing it on the current mainline kernel and it crashes  :-(
> >
> > So it looks like I need to do some fixing here.
> >
> > Your array should continue to work.  If you reboot, it will be assembled as a
> > RAID4 with the parity disk missing.   This will work perfectly but may not be
> > as fast as RAID0.  You can "mdadm --grow /dev/md0 --level=0" to convert it to
> > RAID0 though it probably won't cause the metadata to be updated.
> 
> How can I update the superblock?

I looked at the code some more and experimented: if you simply stop the
array, the metadata will be written out.  So after stopping the array it will
appear to be RAID0.
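In shell terms, that stop-and-check sequence might look like this (a sketch
only, assuming the device names shown earlier in the thread; unmount first):

```shell
# Stopping the array causes md to write out the updated superblocks.
umount /dev/md0                      # if it is mounted
mdadm --stop /dev/md0

# The superblocks should now record raid0 rather than raid4:
mdadm -E /dev/sdc1 /dev/sdd1 | grep 'Raid Level'

# Re-assemble; auto-assembly at boot should also work now:
mdadm --assemble /dev/md0 /dev/sdc1 /dev/sdd1
```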

> 
> As I mentioned, the next step is to convert to RAID5.  Will the RAID4
> superblock confuse the (in fact) RAID0-to-RAID5 re-shape?
> 

Shouldn't do.  But if you can stop and restart the array to get the metadata
updated, that would be safer.

> 
> >
> > Thanks for the report.
> 
> You're most welcome; thank you!
> 
> Regards,
> Richard
> >
> > NeilBrown

NeilBrown


Thread overview: 6+ messages
2014-01-13  4:37 After RAID0 grow: inconsistent superblocks and /proc/mdstat Richard Michael
2014-01-13  4:42 ` Richard Michael
2014-01-13  5:03   ` Richard Michael
2014-01-13  5:03 ` NeilBrown
     [not found]   ` <CABR0jEQO7f4DCMbY1zVMDxxyUD3usVTK4Jt4L9p=QN+fb6P3Uw@mail.gmail.com>
2014-01-14  6:11     ` NeilBrown [this message]
2014-01-14 17:09       ` Richard Michael
