From: Ronald Lembcke <es186@fen-net.de>
To: linux-kernel@vger.kernel.org, linux-raid@vger.kernel.org
Cc: es186@fen-net.de
Subject: Bug in 2.6.17 / mdadm 2.5.1
Date: Sun, 25 Jun 2006 15:59:26 +0200
Message-ID: <20060625135926.GA6253@defiant.crash>
In-Reply-To: <20060624104745.GA6352@defiant.crash>


Hi!

There's a bug in kernel 2.6.17 and/or mdadm 2.5.1 that prevents
(re)adding a disk to a degraded RAID5 array.

The mail I'm replying to was sent to linux-raid only. A summary of my
problem is in the quoted part, and everything you need to reproduce it
is below.
There's more information (kernel log, output of mdadm -E, ...) in the
original mail:
  Subject: RAID5 degraded after mdadm -S, mdadm --assemble (everytime)
  Message-ID: <20060624104745.GA6352@defiant.crash>
It can be found here, for example:
http://www.spinics.net/lists/raid/msg12859.html

More about this problem is below the quoted part.

On Sat Jun 24 12:47:45 2006, I wrote:
> I set up a RAID5 array of 4 disks. I initially created a degraded array
> and added the fourth disk (sda1) later.
> 
> The array is "clean", but when I do  
>   mdadm -S /dev/md0 
>   mdadm --assemble /dev/md0 /dev/sd[abcd]1
> it won't start. It always says sda1 is "failed".
> 
> When I remove sda1 and add it again everything seems to be fine until I
> stop the array. 

CPU: AMD-K6(tm) 3D processor
Kernel: Linux version 2.6.17 (root@ganges) (gcc version 4.0.3 (Debian
4.0.3-1)) #2 Tue Jun 20 17:48:32 CEST 2006

The problem is that the superblocks become inconsistent, but I couldn't
find where this actually happens.

Here are some simple steps to reproduce it (don't forget to adjust the
device names if you're already using /dev/md1 or /dev/loop[0-3]).
The behaviour changes when you execute --zero-superblock in the example
below (it looks even more broken).
It also changes when you fail some other disk instead of loop2.
When loop3 is failed (without executing --zero-superblock) it can
successfully be re-added.


############################################
cd /tmp; mkdir raidtest; cd raidtest
dd bs=1M count=1 if=/dev/zero of=disk0
dd bs=1M count=1 if=/dev/zero of=disk1
dd bs=1M count=1 if=/dev/zero of=disk2
dd bs=1M count=1 if=/dev/zero of=disk3
losetup /dev/loop0 disk0
losetup /dev/loop1 disk1
losetup /dev/loop2 disk2
losetup /dev/loop3 disk3
mdadm --create /dev/md1 --level=5 --raid-devices=4 /dev/loop0 /dev/loop1 /dev/loop2 /dev/loop3
mdadm /dev/md1 --fail /dev/loop2
mdadm /dev/md1 --remove /dev/loop2

#mdadm --zero-superblock /dev/loop2
# here something goes wrong
mdadm /dev/md1 --add /dev/loop2

mdadm --stop /dev/md1
# can't reassemble
mdadm --assemble /dev/md1 /dev/loop0 /dev/loop1 /dev/loop2 /dev/loop3
############################################
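
Not part of the recipe above, but to watch the superblocks change you
can dump every member after each mdadm command and compare the roles
(this inspection loop is just my own addition):
############################################
for d in /dev/loop0 /dev/loop1 /dev/loop2 /dev/loop3; do
    echo "== $d =="
    mdadm -E $d        # print that member's superblock
done
############################################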


To cleanup everything :)
############################################
mdadm --stop /dev/md1
losetup -d /dev/loop0
losetup -d /dev/loop1
losetup -d /dev/loop2
losetup -d /dev/loop3
rm disk0 disk1 disk2 disk3
###########################################


After mdadm --create the superblocks are OK, but they look a bit
strange (note the entry for a failed device):

dev_roles[i]: 0000 0001 0002 fffe 0003
the disks have dev_num: 0,1,2,4

But after --fail --remove --add:
dev_roles[i]: 0000 0001 fffe fffe 0003 0002
yet the disks still have dev_num: 0,1,2,4.
Either loop2 must have dev_num=5, or dev_roles needs to be
0,1,2,0xfffe,4.
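
In other words (my reading of the version-1 superblock format, where
each member's dev_num indexes the shared dev_roles[] table and 0xfffe
marks a faulty slot), the mapping after --fail --remove --add is:

  dev_num   dev_roles[dev_num]   meaning
  0         0000                 active, array slot 0
  1         0001                 active, array slot 1
  2         fffe                 faulty  <- but loop2 was just re-added
  (3)       fffe                 faulty, no disk has dev_num 3 (already so after --create)
  4         0003                 active, array slot 3
  (5)       0002                 active, array slot 2, but no disk has dev_num 5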


Greetings,
           Roni
