From: NeilBrown <neilb@suse.de>
To: Peter Kieser <peter@kieser.ca>
Cc: linux-raid@vger.kernel.org
Subject: Re: raid10: 6 out of 8 disks marked as stale on every restart
Date: Thu, 18 Dec 2014 16:36:32 +1100
Message-ID: <20141218163632.6cb57524@notabene.brown>
In-Reply-To: <548B2033.5030803@kieser.ca>

On Fri, 12 Dec 2014 09:04:51 -0800 Peter Kieser <peter@kieser.ca> wrote:

> Hello,
> 
> I have an 8-disk RAID10 array; 6 of the disks are on an LSISAS2008
> controller and 2 are on an 82801JI (ICH10 Family) SATA AHCI controller.
> The issue started after I upgraded the kernel from 3.17.1 to 3.17.6,
> but reverting to the older kernel does not resolve it.
> 
> After a restart the array does not start: it is not visible in
> /proc/mdstat and there is no mention of it in the kernel messages. If I
> try to assemble the drives manually, the md driver complains that 6 of
> the 8 disks (coincidentally, all of those on the LSISAS2008 controller)
> are non-fresh:
> 
> root@kvm:~# mdadm --assemble /dev/md3 /dev/sde /dev/sdf /dev/sdg 
> /dev/sdh /dev/sdi /dev/sdj /dev/sda /dev/sdb
> 
> Dec 11 21:08:25 kvm kernel: [  528.503736] md: kicking non-fresh sdi 
> from array!
> Dec 11 21:08:25 kvm kernel: [  528.503747] md: unbind<sdi>
> Dec 11 21:08:25 kvm kernel: [  528.523775] md: export_rdev(sdi)
> Dec 11 21:08:25 kvm kernel: [  528.523802] md: kicking non-fresh sdg 
> from array!
> Dec 11 21:08:25 kvm kernel: [  528.523809] md: unbind<sdg>
> Dec 11 21:08:25 kvm kernel: [  528.531753] md: export_rdev(sdg)
> Dec 11 21:08:25 kvm kernel: [  528.531780] md: kicking non-fresh sdf 
> from array!
> Dec 11 21:08:25 kvm kernel: [  528.531788] md: unbind<sdf>
> Dec 11 21:08:25 kvm kernel: [  528.539749] md: export_rdev(sdf)
> Dec 11 21:08:25 kvm kernel: [  528.539776] md: kicking non-fresh sdh 
> from array!
> Dec 11 21:08:25 kvm kernel: [  528.539785] md: unbind<sdh>
> Dec 11 21:08:25 kvm kernel: [  528.547744] md: export_rdev(sdh)
> Dec 11 21:08:25 kvm kernel: [  528.547771] md: kicking non-fresh sdj 
> from array!
> Dec 11 21:08:25 kvm kernel: [  528.547779] md: unbind<sdj>
> Dec 11 21:08:25 kvm kernel: [  528.555755] md: export_rdev(sdj)
> Dec 11 21:08:25 kvm kernel: [  528.555782] md: kicking non-fresh sde 
> from array!
> Dec 11 21:08:25 kvm kernel: [  528.555790] md: unbind<sde>
> Dec 11 21:08:25 kvm kernel: [  528.563758] md: export_rdev(sde)
> Dec 11 21:08:25 kvm kernel: [  528.565831] md/raid10:md3: not enough 
> operational mirrors.
> Dec 11 21:08:25 kvm kernel: [  528.567230] md: pers->run() failed ...
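> 
> ("Non-fresh" here seems to mean the event counter in a member's
> superblock has fallen behind the array's. The counters are easy to
> compare directly, e.g. for one kicked disk and one kept disk:
> 
> root@kvm:~# mdadm --examine /dev/sde | grep -E 'Events|Update Time'
> root@kvm:~# mdadm --examine /dev/sda | grep -E 'Events|Update Time'
> )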
> 
> /dev/sda and /dev/sdb are the only drives not on the LSI controller.
> If I force the assembly with just those 6 kicked drives, the array
> comes up degraded:
> 
> root@kvm:~# mdadm --assemble /dev/md3 /dev/sde /dev/sdf /dev/sdg 
> /dev/sdh /dev/sdi /dev/sdj --run
> 
> Then I re-add the two missing drives (they come back as spares and
> rebuild, as the --detail output below shows):
> 
> root@kvm:~# mdadm --manage /dev/md3 --add /dev/sda
> root@kvm:~# mdadm --manage /dev/md3 --add /dev/sdb
> 
> root@kvm:~# mdadm --detail /dev/md3
> /dev/md3:
>          Version : 1.0
>    Creation Time : Thu Sep 12 18:43:56 2013
>       Raid Level : raid10
>       Array Size : 7814055936 (7452.06 GiB 8001.59 GB)
>    Used Dev Size : 1953513984 (1863.02 GiB 2000.40 GB)
>     Raid Devices : 8
>    Total Devices : 8
>      Persistence : Superblock is persistent
> 
>      Update Time : Fri Dec 12 08:58:19 2014
>            State : active, degraded, recovering
>   Active Devices : 6
> Working Devices : 8
>   Failed Devices : 0
>    Spare Devices : 2
> 
>           Layout : near=2
>       Chunk Size : 512K
> 
>   Rebuild Status : 76% complete
> 
>             Name : kvm.taylor.kieser.ca:3
>             UUID : f0bc8469:9879a709:e4cc94a7:521bd273
>           Events : 82901
> 
>      Number   Major   Minor   RaidDevice State
>         0       8      128        0      active sync /dev/sdi
>         8       8       96        1      active sync /dev/sdg
>        11       8        0        2      spare rebuilding /dev/sda
>         3       8      112        3      active sync /dev/sdh
>         4       0        0        4      removed
>        10       8       80        5      active sync /dev/sdf
>         6       8       64        6      active sync /dev/sde
>         9       8      144        7      active sync /dev/sdj
> 
>        12       8       16        -      spare   /dev/sdb
> 
> This occurs every time I restart the machine. Thoughts? I tried
> rebuilding the initramfs (roughly the sequence sketched after the
> config below), but that didn't resolve the issue. I'm also running
> bcache on this machine, but it sits on top of the mdraid array.
> 
> /etc/mdadm.conf:
> 
> # definitions of existing MD arrays
> ARRAY /dev/md/0 metadata=1.0 UUID=3b174514:49f3e22e:550cf9a7:8ed93920 
> name=linux:0
> ARRAY /dev/md/1 metadata=1.0 UUID=8e23f81d:73f9b393:addd1f7f:5ee1833a 
> name=linux:1
> ARRAY /dev/md/2 metadata=1.0 UUID=cc5a0495:b5262855:fb3cd40a:8b237162 
> name=kvm.taylor.kieser.ca:2
> ARRAY /dev/md/3 metadata=1.0 UUID=f0bc8469:9879a709:e4cc94a7:521bd273 
> name=kvm.taylor.kieser.ca:3
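> 
> (For reference, the initramfs rebuild mentioned above was roughly the
> following -- exact commands vary by distro, so treat this as a sketch:
> 
> root@kvm:~# mdadm --detail --scan   # compare against the ARRAY lines above
> root@kvm:~# update-initramfs -u     # or dracut -f on dracut-based distros
> )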
> 
> 
> root@kvm:~# uname -a
> Linux kvm 3.17.6 #3 SMP Sun Dec 7 12:16:45 PST 2014 x86_64 x86_64 x86_64 
> GNU/Linux
> 
> root@kvm:~# mdadm -V
> mdadm - v3.2.5 - 18th May 2012
> 
> root@kvm:~# cat /proc/mdstat
> Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] 
> [raid4] [raid10]
> md127 : inactive sdk[2](S)
>        1465138448 blocks super 1.0
> 
> md3 : active raid10 sdb[12](S) sda[11] sdi[0] sdj[9] sde[6] sdf[10] 
> sdh[3] sdg[8]
>        7814055936 blocks super 1.0 512K chunks 2 near-copies [8/6] 
> [UU_U_UUU]
>        [===============>.....]  recovery = 76.6% (1498279040/1953513984) 
> finish=4710.1min speed=1610K/sec
> 
> md1 : active raid1 sdd5[3] sdc5[2]
>        25164672 blocks super 1.0 [2/2] [UU]
> 
> md0 : active raid1 sdd1[3] sdc1[2]
>        16779136 blocks super 1.0 [2/2] [UU]
> 
> md2 : active raid1 sdd6[3] sdc6[2]
>        192472960 blocks super 1.0 [2/2] [UU]
> 
> unused devices: <none>
> 
> -Peter
> 
> 

Curious.

What does "mdadm --examine" report for each device immediately after boot,
before you try assembling anything?

Maybe also get the output just before you shut down to compare.
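
Something like this should grab the interesting fields for every member
in one pass -- the device list is just a guess from your mdstat, so
adjust it to match your system:

  for d in /dev/sd[abe-j]; do
    echo "== $d =="
    mdadm --examine "$d" | grep -E 'Events|Update Time|Array State'
  done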

NeilBrown
