* raid10: 6 out of 8 disks marked as stale on every restart
@ 2014-12-12 17:04 Peter Kieser
2014-12-18 5:36 ` NeilBrown
0 siblings, 1 reply; 3+ messages in thread
From: Peter Kieser @ 2014-12-12 17:04 UTC (permalink / raw)
To: linux-raid
[-- Attachment #1: Type: text/plain, Size: 5408 bytes --]
Hello,
I have an 8-disk RAID10 array; 6 of the disks are on an LSISAS2008
controller and 2 are on an 82801JI (ICH10 Family) SATA AHCI controller.
The issue started around the time I upgraded the kernel from 3.17.1 to
3.17.6, but reverting to the older kernel does not resolve it.
After a restart the array does not start: it is not visible in
/proc/mdstat and there is no mention of it in the kernel messages. If I
try to assemble the drives manually, md complains that 6 of the 8 disks
(coincidentally, all of the ones on the LSISAS2008 controller) are
non-fresh:
root@kvm:~# mdadm --assemble /dev/md3 /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/sda /dev/sdb

Dec 11 21:08:25 kvm kernel: [ 528.503736] md: kicking non-fresh sdi from array!
Dec 11 21:08:25 kvm kernel: [ 528.503747] md: unbind<sdi>
Dec 11 21:08:25 kvm kernel: [ 528.523775] md: export_rdev(sdi)
Dec 11 21:08:25 kvm kernel: [ 528.523802] md: kicking non-fresh sdg from array!
Dec 11 21:08:25 kvm kernel: [ 528.523809] md: unbind<sdg>
Dec 11 21:08:25 kvm kernel: [ 528.531753] md: export_rdev(sdg)
Dec 11 21:08:25 kvm kernel: [ 528.531780] md: kicking non-fresh sdf from array!
Dec 11 21:08:25 kvm kernel: [ 528.531788] md: unbind<sdf>
Dec 11 21:08:25 kvm kernel: [ 528.539749] md: export_rdev(sdf)
Dec 11 21:08:25 kvm kernel: [ 528.539776] md: kicking non-fresh sdh from array!
Dec 11 21:08:25 kvm kernel: [ 528.539785] md: unbind<sdh>
Dec 11 21:08:25 kvm kernel: [ 528.547744] md: export_rdev(sdh)
Dec 11 21:08:25 kvm kernel: [ 528.547771] md: kicking non-fresh sdj from array!
Dec 11 21:08:25 kvm kernel: [ 528.547779] md: unbind<sdj>
Dec 11 21:08:25 kvm kernel: [ 528.555755] md: export_rdev(sdj)
Dec 11 21:08:25 kvm kernel: [ 528.555782] md: kicking non-fresh sde from array!
Dec 11 21:08:25 kvm kernel: [ 528.555790] md: unbind<sde>
Dec 11 21:08:25 kvm kernel: [ 528.563758] md: export_rdev(sde)
Dec 11 21:08:25 kvm kernel: [ 528.565831] md/raid10:md3: not enough operational mirrors.
Dec 11 21:08:25 kvm kernel: [ 528.567230] md: pers->run() failed ...
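As I understand it, "non-fresh" means md found an older event count in those
devices' superblocks. A rough way to see which disks are behind, assuming the
same device names as above, is something like:

    for d in /dev/sd{a,b,e,f,g,h,i,j}; do
        echo "== $d =="
        mdadm --examine "$d" | grep -E 'Update Time|Events'
    done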
/dev/sda and /dev/sdb are the only drives not on the LSI controller. If
I force assembly using only the 6 drives on the LSI controller, the
array comes up:
root@kvm:~# mdadm --assemble /dev/md3 /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj --run
Then I re-add the two missing drives:
root@kvm:~# mdadm --manage /dev/md3 --add /dev/sda
root@kvm:~# mdadm --manage /dev/md3 --add /dev/sdb
root@kvm:~# mdadm --detail /dev/md3
/dev/md3:
        Version : 1.0
  Creation Time : Thu Sep 12 18:43:56 2013
     Raid Level : raid10
     Array Size : 7814055936 (7452.06 GiB 8001.59 GB)
  Used Dev Size : 1953513984 (1863.02 GiB 2000.40 GB)
   Raid Devices : 8
  Total Devices : 8
    Persistence : Superblock is persistent

    Update Time : Fri Dec 12 08:58:19 2014
          State : active, degraded, recovering
 Active Devices : 6
Working Devices : 8
 Failed Devices : 0
  Spare Devices : 2

         Layout : near=2
     Chunk Size : 512K

 Rebuild Status : 76% complete

           Name : kvm.taylor.kieser.ca:3
           UUID : f0bc8469:9879a709:e4cc94a7:521bd273
         Events : 82901

    Number   Major   Minor   RaidDevice   State
       0        8      128        0       active sync        /dev/sdi
       8        8       96        1       active sync        /dev/sdg
      11        8        0        2       spare rebuilding   /dev/sda
       3        8      112        3       active sync        /dev/sdh
       4        0        0        4       removed
      10        8       80        5       active sync        /dev/sdf
       6        8       64        6       active sync        /dev/sde
       9        8      144        7       active sync        /dev/sdj

      12        8       16        -       spare              /dev/sdb
This happens every time I restart the machine. Thoughts? I tried
rebuilding the initramfs, but that didn't resolve the issue. I'm also
running bcache on this machine, layered on top of the md array.
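For reference, the stacking can be double-checked with something like the
following (the lsblk column choice is just illustrative):

    lsblk -o NAME,TYPE,SIZE,MOUNTPOINT /dev/md3   # bcache device should appear nested under md3
    ls /sys/block/md3/holders/                    # and should show up as a holder of md3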
/etc/mdadm.conf:
# definitions of existing MD arrays
ARRAY /dev/md/0 metadata=1.0 UUID=3b174514:49f3e22e:550cf9a7:8ed93920 name=linux:0
ARRAY /dev/md/1 metadata=1.0 UUID=8e23f81d:73f9b393:addd1f7f:5ee1833a name=linux:1
ARRAY /dev/md/2 metadata=1.0 UUID=cc5a0495:b5262855:fb3cd40a:8b237162 name=kvm.taylor.kieser.ca:2
ARRAY /dev/md/3 metadata=1.0 UUID=f0bc8469:9879a709:e4cc94a7:521bd273 name=kvm.taylor.kieser.ca:3
root@kvm:~# uname -a
Linux kvm 3.17.6 #3 SMP Sun Dec 7 12:16:45 PST 2014 x86_64 x86_64 x86_64
GNU/Linux
root@kvm:~# mdadm -V
mdadm - v3.2.5 - 18th May 2012
root@kvm:~# cat /proc/mdstat
Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10]
md127 : inactive sdk[2](S)
      1465138448 blocks super 1.0

md3 : active raid10 sdb[12](S) sda[11] sdi[0] sdj[9] sde[6] sdf[10] sdh[3] sdg[8]
      7814055936 blocks super 1.0 512K chunks 2 near-copies [8/6] [UU_U_UUU]
      [===============>.....]  recovery = 76.6% (1498279040/1953513984) finish=4710.1min speed=1610K/sec

md1 : active raid1 sdd5[3] sdc5[2]
      25164672 blocks super 1.0 [2/2] [UU]

md0 : active raid1 sdd1[3] sdc1[2]
      16779136 blocks super 1.0 [2/2] [UU]

md2 : active raid1 sdd6[3] sdc6[2]
      192472960 blocks super 1.0 [2/2] [UU]

unused devices: <none>
-Peter
[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4291 bytes --]
* Re: raid10: 6 out of 8 disks marked as stale on every restart
2014-12-12 17:04 raid10: 6 out of 8 disks marked as stale on every restart Peter Kieser
@ 2014-12-18 5:36 ` NeilBrown
2014-12-18 6:26 ` Peter Kieser
0 siblings, 1 reply; 3+ messages in thread
From: NeilBrown @ 2014-12-18 5:36 UTC (permalink / raw)
To: Peter Kieser; +Cc: linux-raid
[-- Attachment #1: Type: text/plain, Size: 5981 bytes --]
On Fri, 12 Dec 2014 09:04:51 -0800 Peter Kieser <peter@kieser.ca> wrote:
> Hello,
>
> I have an 8-disk RAID10 array; 6 of the disks are on an LSISAS2008
> controller and 2 are on an 82801JI (ICH10 Family) SATA AHCI controller.
> The issue started around the time I upgraded the kernel from 3.17.1 to
> 3.17.6, but reverting to the older kernel does not resolve it.
>
> [...]
>
> This happens every time I restart the machine. Thoughts? I tried
> rebuilding the initramfs, but that didn't resolve the issue. I'm also
> running bcache on this machine, layered on top of the md array.
>
> [...]
>
> -Peter
Curious.
What does "mdadm --examine" report for each device immediately after boot,
before you try assembling anything?
Maybe also get the output just before you shut down to compare.
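Something like this would do for capturing it, assuming the devices keep the
same names across the reboot:

    for d in /dev/sd{a,b,e,f,g,h,i,j}; do
        echo "== $d =="
        mdadm --examine "$d"
    done > /root/md3-examine-$(date +%F-%H%M).txt 2>&1

Run it once right after boot and once just before shutdown, and keep both
files so the superblocks can be compared.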
NeilBrown
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 811 bytes --]
* Re: raid10: 6 out of 8 disks marked as stale on every restart
2014-12-18 5:36 ` NeilBrown
@ 2014-12-18 6:26 ` Peter Kieser
0 siblings, 0 replies; 3+ messages in thread
From: Peter Kieser @ 2014-12-18 6:26 UTC (permalink / raw)
To: NeilBrown; +Cc: linux-raid
[-- Attachment #1: Type: text/plain, Size: 851 bytes --]
On 2014-12-17 9:36 PM, NeilBrown wrote:
> Curious.
>
> What does "mdadm --examine" report for each device immediately after boot,
> before you try assembling anything?
>
> Maybe also get the output just before you shut down to compare.
>
> NeilBrown
Sadly, I had to tear this array down and build something workable that
doesn't fall apart on every restart. I did try putting all of the drives
on the same AHCI controller; with that arrangement the array was
resynced on every restart instead of failing to assemble.
I suspect this issue is caused by bcache (I'm using the md array as a
backing device for bcache). The bcache maintainer states "get the md
people or someone to explain _what_ they want whatever has their device
open to do on reboot."
I'm going to set up a test environment and see if I can reproduce it again.
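When I do, one thing I'll try first is making bcache let go of the array
before shutdown rather than leaving it open while md is torn down. A rough
sketch of what I mean, assuming md3 is still the bcache backing device (the
"stop" attribute is the one described in the bcache sysfs documentation):

    # ask bcache to shut down the cached device and release /dev/md3
    echo 1 > /sys/block/md3/bcache/stop
    # give it a moment to detach, then stop the array cleanly
    sleep 2
    mdadm --stop /dev/md3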
-Peter
[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4291 bytes --]