* argh!
From: Jon Hardcastle @ 2010-10-30 11:56 UTC
To: linux-raid

Guys,

A new HDD has failed on me during a scrub. I tried to remove/fail it, but
mdadm kept saying the device was busy, so I forced a reboot. I have since
physically disconnected the drive.

Can anyone take a look at the --examine output below and tell me whether
the array should assemble OK? I tried:

mdadm --assemble /dev/md4 /dev/sda1 /dev/sdb1 /dev/sdd1 /dev/sde1 /dev/sdf1 /dev/sdg1

Is that the correct command? It won't assemble, and even when I --force it,
it still isn't right. Help!

--- examine starts

/dev/sda1:
          Magic : a92b4efc
        Version : 0.90.00
           UUID : 7438efd1:9e6ca2b5:d6b88274:7003b1d3
  Creation Time : Thu Oct 11 00:01:49 2007
     Raid Level : raid6
  Used Dev Size : 732571904 (698.64 GiB 750.15 GB)
     Array Size : 3662859520 (3493.18 GiB 3750.77 GB)
   Raid Devices : 7
  Total Devices : 7
Preferred Minor : 4

    Update Time : Sat Oct 30 05:10:12 2010
          State : active
 Active Devices : 6
Working Devices : 6
 Failed Devices : 1
  Spare Devices : 0
       Checksum : c22df0a0 - correct
         Events : 1882828

         Layout : left-symmetric
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     1       8        1        1      active sync   /dev/sda1

   0     0       8       97        0      active sync   /dev/sdg1
   1     1       8        1        1      active sync   /dev/sda1
   2     2       8       65        2      active sync   /dev/sde1
   3     3       0        0        3      faulty removed
   4     4       8       49        4      active sync   /dev/sdd1
   5     5       8       17        5      active sync   /dev/sdb1
   6     6       8       81        6      active sync   /dev/sdf1

/dev/sdb1:
          Magic : a92b4efc
        Version : 0.90.00
           UUID : 7438efd1:9e6ca2b5:d6b88274:7003b1d3
  Creation Time : Thu Oct 11 00:01:49 2007
     Raid Level : raid6
  Used Dev Size : 732571904 (698.64 GiB 750.15 GB)
     Array Size : 3662859520 (3493.18 GiB 3750.77 GB)
   Raid Devices : 7
  Total Devices : 7
Preferred Minor : 4

    Update Time : Sat Oct 30 05:10:12 2010
          State : active
 Active Devices : 6
Working Devices : 6
 Failed Devices : 1
  Spare Devices : 0
       Checksum : c22df0b8 - correct
         Events : 1882828

         Layout : left-symmetric
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     5       8       17        5      active sync   /dev/sdb1

   0     0       8       97        0      active sync   /dev/sdg1
   1     1       8        1        1      active sync   /dev/sda1
   2     2       8       65        2      active sync   /dev/sde1
   3     3       0        0        3      faulty removed
   4     4       8       49        4      active sync   /dev/sdd1
   5     5       8       17        5      active sync   /dev/sdb1
   6     6       8       81        6      active sync   /dev/sdf1

/dev/sdd1:
          Magic : a92b4efc
        Version : 0.90.00
           UUID : 7438efd1:9e6ca2b5:d6b88274:7003b1d3
  Creation Time : Thu Oct 11 00:01:49 2007
     Raid Level : raid6
  Used Dev Size : 732571904 (698.64 GiB 750.15 GB)
     Array Size : 3662859520 (3493.18 GiB 3750.77 GB)
   Raid Devices : 7
  Total Devices : 7
Preferred Minor : 4

    Update Time : Sat Oct 30 05:10:12 2010
          State : active
 Active Devices : 6
Working Devices : 6
 Failed Devices : 1
  Spare Devices : 0
       Checksum : c22df0d6 - correct
         Events : 1882828

         Layout : left-symmetric
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     4       8       49        4      active sync   /dev/sdd1

   0     0       8       97        0      active sync   /dev/sdg1
   1     1       8        1        1      active sync   /dev/sda1
   2     2       8       65        2      active sync   /dev/sde1
   3     3       0        0        3      faulty removed
   4     4       8       49        4      active sync   /dev/sdd1
   5     5       8       17        5      active sync   /dev/sdb1
   6     6       8       81        6      active sync   /dev/sdf1

/dev/sde1:
          Magic : a92b4efc
        Version : 0.90.00
           UUID : 7438efd1:9e6ca2b5:d6b88274:7003b1d3
  Creation Time : Thu Oct 11 00:01:49 2007
     Raid Level : raid6
  Used Dev Size : 732571904 (698.64 GiB 750.15 GB)
     Array Size : 3662859520 (3493.18 GiB 3750.77 GB)
   Raid Devices : 7
  Total Devices : 7
Preferred Minor : 4

    Update Time : Sat Oct 30 05:10:12 2010
          State : active
 Active Devices : 6
Working Devices : 6
 Failed Devices : 1
  Spare Devices : 0
       Checksum : c22df0e2 - correct
         Events : 1882828

         Layout : left-symmetric
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     2       8       65        2      active sync   /dev/sde1

   0     0       8       97        0      active sync   /dev/sdg1
   1     1       8        1        1      active sync   /dev/sda1
   2     2       8       65        2      active sync   /dev/sde1
   3     3       0        0        3      faulty removed
   4     4       8       49        4      active sync   /dev/sdd1
   5     5       8       17        5      active sync   /dev/sdb1
   6     6       8       81        6      active sync   /dev/sdf1

/dev/sdf1:
          Magic : a92b4efc
        Version : 0.90.00
           UUID : 7438efd1:9e6ca2b5:d6b88274:7003b1d3
  Creation Time : Thu Oct 11 00:01:49 2007
     Raid Level : raid6
  Used Dev Size : 732571904 (698.64 GiB 750.15 GB)
     Array Size : 3662859520 (3493.18 GiB 3750.77 GB)
   Raid Devices : 7
  Total Devices : 7
Preferred Minor : 4

    Update Time : Sat Oct 30 05:10:12 2010
          State : active
 Active Devices : 6
Working Devices : 6
 Failed Devices : 1
  Spare Devices : 0
       Checksum : c22df0fa - correct
         Events : 1882828

         Layout : left-symmetric
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     6       8       81        6      active sync   /dev/sdf1

   0     0       8       97        0      active sync   /dev/sdg1
   1     1       8        1        1      active sync   /dev/sda1
   2     2       8       65        2      active sync   /dev/sde1
   3     3       0        0        3      faulty removed
   4     4       8       49        4      active sync   /dev/sdd1
   5     5       8       17        5      active sync   /dev/sdb1
   6     6       8       81        6      active sync   /dev/sdf1

/dev/sdg1:
          Magic : a92b4efc
        Version : 0.90.00
           UUID : 7438efd1:9e6ca2b5:d6b88274:7003b1d3
  Creation Time : Thu Oct 11 00:01:49 2007
     Raid Level : raid6
  Used Dev Size : 732571904 (698.64 GiB 750.15 GB)
     Array Size : 3662859520 (3493.18 GiB 3750.77 GB)
   Raid Devices : 7
  Total Devices : 7
Preferred Minor : 4

    Update Time : Sat Oct 30 05:10:12 2010
          State : active
 Active Devices : 6
Working Devices : 6
 Failed Devices : 1
  Spare Devices : 0
       Checksum : c22df0fe - correct
         Events : 1882828

         Layout : left-symmetric
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     0       8       97        0      active sync   /dev/sdg1

   0     0       8       97        0      active sync   /dev/sdg1
   1     1       8        1        1      active sync   /dev/sda1
   2     2       8       65        2      active sync   /dev/sde1
   3     3       0        0        3      faulty removed
   4     4       8       49        4      active sync   /dev/sdd1
   5     5       8       17        5      active sync   /dev/sdb1
   6     6       8       81        6      active sync   /dev/sdf1
* Re: argh!
From: Phil Turmel @ 2010-10-30 15:45 UTC
To: Jon; +Cc: Jon Hardcastle, linux-raid

On 10/30/2010 07:56 AM, Jon Hardcastle wrote:
> Guys,
>
> A new HDD has failed on me during a scrub. I tried to remove/fail it, but
> mdadm kept saying the device was busy, so I forced a reboot. I have since
> physically disconnected the drive.
>
> Can anyone take a look at the --examine output below and tell me whether
> the array should assemble OK? I tried:
>
> mdadm --assemble /dev/md4 /dev/sda1 /dev/sdb1 /dev/sdd1 /dev/sde1
> /dev/sdf1 /dev/sdg1

I'd try:

mdadm --assemble /dev/md4 /dev/sd{g,a,e}1 missing /dev/sd{d,b,f}1

HTH,

Phil
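(A minimal sketch of the sanity check behind a device ordering like the one
above: compare the superblock event counters and update times across the
surviving members before attempting any forced assembly. Device names are
taken from this thread; everything else is stock mdadm:)

    # Compare superblock event counters across the surviving members.
    # Members whose counts match can be assembled together safely;
    # a member whose count lags holds stale data.
    for dev in /dev/sd[abdefg]1; do
        echo "== $dev =="
        mdadm --examine "$dev" | grep -E 'Update Time|Events'
    done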
* RE: argh!
From: Leslie Rhorer @ 2010-10-30 21:10 UTC
To: 'Phil Turmel', Jon; +Cc: 'Jon Hardcastle', linux-raid

> On 10/30/2010 07:56 AM, Jon Hardcastle wrote:
> > A new HDD has failed on me during a scrub. I tried to remove/fail it,
> > but mdadm kept saying the device was busy, so I forced a reboot. I have
> > since physically disconnected the drive.
> >
> > Can anyone take a look at the --examine output below and tell me
> > whether the array should assemble OK? I tried:
> >
> > mdadm --assemble /dev/md4 /dev/sda1 /dev/sdb1 /dev/sdd1 /dev/sde1
> > /dev/sdf1 /dev/sdg1
>
> I'd try:
>
> mdadm --assemble /dev/md4 /dev/sd{g,a,e}1 missing /dev/sd{d,b,f}1

Yeah, I would, too. Also, what are the contents of /etc/mdadm/mdadm.conf?
If it is correct, then `mdadm --assemble --scan` should work.
* Re: argh!
From: Jon Hardcastle @ 2010-10-30 21:52 UTC
To: Leslie Rhorer; +Cc: Phil Turmel, linux-raid

On 30 October 2010 22:10, Leslie Rhorer <lrhorer@satx.rr.com> wrote:
>> I'd try:
>>
>> mdadm --assemble /dev/md4 /dev/sd{g,a,e}1 missing /dev/sd{d,b,f}1
>
> Yeah, I would, too. Also, what are the contents of /etc/mdadm/mdadm.conf?
> If it is correct, then `mdadm --assemble --scan` should work.

Hey, yeah. I am confused, as drives have failed before and the array has
still assembled. I think it is because it is unclean.

Can I ask how you arrived at that command? What is wrong with d, b, and f?

Also, this is my mdadm.conf:

DEVICE /dev/sd[abcdefg]1 /dev/hd[ab]1

ARRAY /dev/md/4 metadata=0.90 UUID=7438efd1:9e6ca2b5:d6b88274:7003b1d3
ARRAY /dev/md/3 metadata=0.90 UUID=a1f24bc9:4e72a820:3a03f7dc:07f9ab98
ARRAY /dev/md/2 metadata=0.90 UUID=0642323a:938992ef:b750ab21:e5a55662
ARRAY /dev/md/1 metadata=0.90 UUID=d4eeec62:148b3425:3f5e931c:bb3ef499
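(With ARRAY lines keyed by UUID like these, scan assembly does not depend
on device names. A quick sketch for cross-checking the conf file against
what is actually on disk, using only stock mdadm:)

    # Print ARRAY lines derived from the superblocks mdadm can find,
    # then compare them with the configured ones.
    mdadm --examine --scan
    grep '^ARRAY' /etc/mdadm/mdadm.conf

    # If the UUIDs agree, a plain scan assembly should work:
    mdadm --assemble --scan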
* Re: argh!
From: Jon Hardcastle @ 2010-10-30 21:54 UTC
To: Leslie Rhorer; +Cc: Phil Turmel, linux-raid

Also, what commands can I not run? I.e. which ones are destructive?

On 30 October 2010 22:52, Jon Hardcastle <jonathan.hardcastle@gmail.com> wrote:
> Hey, yeah. I am confused, as drives have failed before and the array has
> still assembled. I think it is because it is unclean.
>
> Can I ask how you arrived at that command? What is wrong with d, b, and f?

--
-----------------------
N: Jon Hardcastle
E: Jon@eHardcastle.com
JK: "What's a wombat for? Why for hitting Woms of course!"
'Q': 'There comes a time when you look into the mirror, and you realise
that what you see is all that you will ever be. Then you accept it, or you
kill yourself. Or you stop looking into mirrors... :)'

Please note, I am phasing out jonathan DOT hardcastle AT gmail.com and
replacing it with jon AT eHardcastle.com
-----------------------
* Re: argh!
From: Jon Hardcastle @ 2010-10-30 22:01 UTC
To: Leslie Rhorer; +Cc: Phil Turmel, linux-raid

Sorry to spam. If I run

'mdadm --assemble --scan -R'

the array assembles in an inactive state, but it suggests I use force. I am
worried about doing damage, though.

Also, perhaps some extra safeguards for thick people would be cool, i.e. a
single force for things that are safe, like mounting an incomplete array,
but having to specify it twice, i.e. '-F -F', for things that can do
damage.

On 30 October 2010 22:54, Jon Hardcastle <jonathan.hardcastle@gmail.com> wrote:
> Also, what commands can I not run? I.e. which ones are destructive?
* RE: argh!
From: Leslie Rhorer @ 2010-10-31 0:07 UTC
To: Jon; +Cc: 'Phil Turmel', linux-raid

> Sorry to spam. If I run
>
> 'mdadm --assemble --scan -R'
>
> the array assembles in an inactive state, but it suggests I use force. I
> am worried about doing damage, though.

Assembly using --force won't do damage. It simply will either pass or fail.
If it passes, proceed to mounting the array read-only. If it fails, you'll
have to do more work.
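(Spelled out, the sequence described here looks roughly like the following;
the mount point and the use of --run are assumptions for the sketch:)

    # Force assembly from the six surviving members, then mount read-only
    # so nothing is written until the filesystem has been checked.
    mdadm --assemble --force --run /dev/md4 \
        /dev/sda1 /dev/sdb1 /dev/sdd1 /dev/sde1 /dev/sdf1 /dev/sdg1

    mkdir -p /mnt/md4
    mount -o ro /dev/md4 /mnt/md4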
* Re: argh!
From: Jon Hardcastle @ 2010-10-31 18:52 UTC
To: Leslie Rhorer; +Cc: Phil Turmel, linux-raid

On 31 October 2010 01:07, Leslie Rhorer <lrhorer@satx.rr.com> wrote:
> Assembly using --force won't do damage. It simply will either pass or
> fail. If it passes, proceed to mounting the array read-only. If it fails,
> you'll have to do more work.

Thanks for your help!

mdadm --assemble /dev/md4 --run --force

did it.

I don't have backups, as this is 4 TB of data and I have never been able to
afford a whole second machine, but the price of drives has come down a lot
now, so I think I may build a simple machine for weekly backups.

Thanks for listing which commands are destructive. I am running a 'check'
before mounting the arrays; I will then kick off a filesystem check on all
partitions.

I have been combing my log files. I think it was my controller that failed,
not the drive (not confirmed, as the drive is yet to be reconnected), but I
noticed that md booted the drive out and then, I think, crashed due to a
bug, so the drive was still part of the array; i.e. it was still 'checking'
when I looked, and even after a --fail the drive was still in the array and
'checking'.

I have this from messages:

Oct 30 05:02:08 localhost mdadm[13271]: Fail event detected on md device /dev/md/4, component device /dev/sdc1
Oct 30 05:02:08 localhost kernel: ------------[ cut here ]------------
Oct 30 05:02:08 localhost kernel: kernel BUG at drivers/md/raid5.c:2768!
Oct 30 05:02:08 localhost kernel: invalid opcode: 0000 [#1] SMP
Oct 30 05:02:08 localhost kernel: last sysfs file: /sys/devices/virtual/block/md4/md/metadata_version
Oct 30 05:02:08 localhost kernel: Modules linked in: ipv6 snd_seq_midi snd_seq_oss snd_seq_midi_event snd_seq snd_pcm_oss snd_mixer_oss snd_hda_codec_analog snd_cs4236 snd_wavefront snd_wss_lib snd_opl3_lib snd_hda_intel snd_hda_codec snd_mpu401 snd_hwdep snd_mpu401_uart snd_pcm snd_rawmidi snd_seq_device i2c_nforce2 ppdev pcspkr snd_timer k8temp snd_page_alloc forcedeth i2c_core fan rtc_cmos ns558 snd gameport processor rtc_core thermal rtc_lib button thermal_sys parport_pc tg3 libphy e1000 fuse xfs exportfs nfs auth_rpcgss nfs_acl lockd sunrpc jfs raid10 dm_bbr dm_snapshot dm_crypt dm_mirror dm_region_hash dm_log dm_mod scsi_wait_scan sbp2 ohci1394 ieee1394 sl811_hcd usbhid ohci_hcd ssb uhci_hcd usb_storage ehci_hcd usbcore aic94xx libsas lpfc qla2xxx megaraid_sas megaraid_mbox megaraid_mm megaraid aacraid sx8 DAC960 cciss 3w_9xxx 3w_xxxx mptsas scsi_transport_sas mptfc scsi_transport_fc scsi_tgt mptspi mptscsih mptbase atp870u dc395x qla1280 imm parport dmx3191d sym53c8xx qlogicfas408 gdth advansys initio BusLogic arcmsr aic7xxx aic79xx scsi_transport_spi sg pdc_adma sata_inic162x sata_mv ata_piix ahci sata_qstor sata_vsc sata_uli sata_sis sata_sx4 sata_nv sata_via sata_svw sata_sil24 sata_sil sata_promise pata_pcmcia pcmcia pcmcia_core
Oct 30 05:02:08 localhost kernel:
Oct 30 05:02:08 localhost kernel: Pid: 9967, comm: md4_raid6 Not tainted (2.6.32-gentoo-r1 #1) System Product Name
Oct 30 05:02:08 localhost kernel: EIP: 0060:[<c0363658>] EFLAGS: 00010297 CPU: 0
Oct 30 05:02:08 localhost kernel: EIP is at handle_stripe+0x819/0x1617
Oct 30 05:02:08 localhost kernel: EAX: 00000006 EBX: dd19d1ac ECX: 00000003 EDX: 00000001
Oct 30 05:02:08 localhost kernel: ESI: dd19d1d4 EDI: 00000002 EBP: dc843f1c ESP: dc843e50
Oct 30 05:02:08 localhost kernel: DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
Oct 30 05:02:08 localhost kernel: Process md4_raid6 (pid: 9967, ti=dc843000 task=de5e8510 task.ti=dc843000)
Oct 30 05:02:08 localhost kernel: Stack:
Oct 30 05:02:08 localhost kernel:  de5e8510 97ac2223 00000007 dd0a8400 de4b91dc 00000007 c13a1360 00020003
Oct 30 05:02:08 localhost kernel: <0> dc89b7c0 00000008 00000003 00000246 dc843eb4 c04017c8 00000010 dd19d524
Oct 30 05:02:08 localhost kernel: <0> 00000006 fffffffc dd025534 dc843eb8 00000000 00000000 00000246 dd025534
Oct 30 05:02:08 localhost kernel: Call Trace:
Oct 30 05:02:08 localhost kernel:  [<c04017c8>] ? __mutex_lock_slowpath+0x1f4/0x1fc
Oct 30 05:02:08 localhost kernel:  [<c0364796>] ? raid5d+0x340/0x37e

... a lot more follows
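(For reference, the 'check' mentioned above is driven through sysfs; a
minimal sketch for md4, using the standard md sysfs layout:)

    # Start a read-only consistency check (scrub) of the array.
    echo check > /sys/block/md4/md/sync_action

    # Watch progress, then read the count of inconsistent sectors found.
    cat /proc/mdstat
    cat /sys/block/md4/md/mismatch_cnt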
* Re: argh!
From: Neil Brown @ 2010-10-31 19:43 UTC
To: Jon; +Cc: jonathan.hardcastle, Leslie Rhorer, Phil Turmel, linux-raid

On Sun, 31 Oct 2010 18:52:10 +0000 Jon Hardcastle
<jonathan.hardcastle@gmail.com> wrote:
>
> I have this from messages:
>
> Oct 30 05:02:08 localhost mdadm[13271]: Fail event detected on md
> device /dev/md/4, component device /dev/sdc1
> Oct 30 05:02:08 localhost kernel: ------------[ cut here ]------------
> Oct 30 05:02:08 localhost kernel: kernel BUG at drivers/md/raid5.c:2768!

What kernel version was this?

Thanks,
NeilBrown
* Re: argh!
From: Jon Hardcastle @ 2010-10-31 19:54 UTC
To: Neil Brown; +Cc: Leslie Rhorer, Phil Turmel, linux-raid

On 31 October 2010 19:43, Neil Brown <neilb@suse.de> wrote:
> > Oct 30 05:02:08 localhost kernel: kernel BUG at drivers/md/raid5.c:2768!
>
> What kernel version was this?

2.6.32-gentoo-r1
* RE: argh!
From: Leslie Rhorer @ 2010-11-01 21:39 UTC
To: Jon; +Cc: 'Phil Turmel', linux-raid

> I don't have backups, as this is 4 TB of data and I have never been able
> to afford a whole second machine, but the price of drives has come down a
> lot now, so I think I may build a simple machine for weekly backups.

4T worth of backup space can be had for $200. If the data is not worth $200
to you, then by all means you are free to ignore the need for backup, but
eventually you will lose at least some, if not all, of the data. Although
an on-line backup system is very handy, it is not absolutely essential. You
could employ dar or some similar utility to back up the data to individual
off-line disks, for example.
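(A rough sketch of that off-line-disk approach with dar; the archive path,
mount point, and slice size are invented for the example, so check dar(1)
for the exact option spellings in your version:)

    # Archive the array's filesystem into compressed 4 GiB slices that
    # can be copied onto individual off-line disks.
    dar -c /backup/md4-weekly -R /mnt/md4 -s 4G -z

    # Periodically restore into a scratch directory to prove the backup
    # is actually usable.
    mkdir -p /tmp/restore-test
    dar -x /backup/md4-weekly -R /tmp/restore-test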
* RE: argh!
From: Leslie Rhorer @ 2010-10-31 0:05 UTC
To: Jon; +Cc: 'Phil Turmel', linux-raid

> Also, what commands can I not run? I.e. which ones are destructive?

--build, --create, --add, --zero-superblock. Some of the growth commands
can also get you into trouble if something happens, depending on the
version of mdadm. --assemble --force should not, at least not in and of
itself. I would take care to mount the filesystem read-only and check it
out thoroughly before re-mounting as read-write. If there are problems, you
may need to restore some of the data from backups. You have backups, right?
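(A sketch of that read-only inspection; fsck -n assumes an ext-family
filesystem on md4, so substitute the appropriate checker otherwise:)

    # Dry-run filesystem check: -n answers "no" to every repair prompt,
    # so nothing on the device is modified.
    fsck -n /dev/md4

    # If it comes back clean, mount read-only and spot-check the contents.
    mount -o ro /dev/md4 /mnt/md4
    ls /mnt/md4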
* RE: argh!
From: Leslie Rhorer @ 2010-10-30 23:57 UTC
To: Jon; +Cc: 'Phil Turmel', linux-raid

> > A new HDD has failed on me during a scrub. I tried to remove/fail it,
> > but mdadm kept saying the device was busy, so I forced a reboot.

BTW, it's better, if you can, to free up the device rather than reboot. Now
that you have rebooted, that's no longer possible.

> Can I ask how you arrived at that command?

Look at the results of --examine. Every one shows the list of drives and
their order.

> What is wrong with d, b, and f?

'No idea. SMART might give you an idea, or the kernel logs.

> Also, this is my mdadm.conf:
>
> DEVICE /dev/sd[abcdefg]1 /dev/hd[ab]1
>
> ARRAY /dev/md/4 metadata=0.90 UUID=7438efd1:9e6ca2b5:d6b88274:7003b1d3
> ARRAY /dev/md/3 metadata=0.90 UUID=a1f24bc9:4e72a820:3a03f7dc:07f9ab98
> ARRAY /dev/md/2 metadata=0.90 UUID=0642323a:938992ef:b750ab21:e5a55662
> ARRAY /dev/md/1 metadata=0.90 UUID=d4eeec62:148b3425:3f5e931c:bb3ef499

--scan may work. I suggest updating the file with all the array members.
Why are all the arrays assembled with 0.90 superblocks? The 0.90 superblock
has some significant limitations. They may not be causing you grief right
now, but they could in the future. The only arrays I have built with 0.90
superblocks are the /boot targets, because GRUB2 does not support 1.x
superblocks. I've chosen 1.2 for all the others.
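(Sketches of both suggestions follow; /dev/sdc assumes the suspect drive
reappears at its old slot once reconnected:)

    # Check the suspect drive's SMART health, attributes, and error logs.
    smartctl -H /dev/sdc
    smartctl -a /dev/sdc

    # Regenerate ARRAY lines from the running arrays; review the output
    # before merging it into /etc/mdadm/mdadm.conf.
    mdadm --detail --scan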
* Re: argh!
From: Jon Hardcastle @ 2010-10-31 21:18 UTC
To: Leslie Rhorer; +Cc: Phil Turmel, linux-raid

On 31 October 2010 00:57, Leslie Rhorer <lrhorer@satx.rr.com> wrote:
> --scan may work. I suggest updating the file with all the array members.
> Why are all the arrays assembled with 0.90 superblocks? The 0.90
> superblock has some significant limitations. They may not be causing you
> grief right now, but they could in the future. The only arrays I have
> built with 0.90 superblocks are the /boot targets, because GRUB2 does not
> support 1.x superblocks. I've chosen 1.2 for all the others.

Hi,

Thanks for your help. I use 0.90 because that is what there was when the
machine was built ~3 years ago; the array has been grown and resized since
then.

Does anyone have a feature list for the superblocks? Why upgrade?

Thanks
* Re: argh!
From: Neil Brown @ 2010-10-31 21:44 UTC
To: Jon; +Cc: jonathan.hardcastle, Leslie Rhorer, Phil Turmel, linux-raid

On Sun, 31 Oct 2010 21:18:52 +0000 Jon Hardcastle
<jonathan.hardcastle@gmail.com> wrote:
>
> Does anyone have a feature list for the superblocks? Why upgrade?

The "md" man page mentions a couple of differences:
- v1.x can handle more than 28 devices in an array
- v1.x can easily be moved between hosts with different endianness
- v1.x can put the metadata at the front of the array

I should probably add the other differences:
- With 0.90 there can be confusion about whether a superblock applies to
  the whole device or to just the last partition (if it starts on a 64K
  boundary). 1.x doesn't have that problem.
- With 1.x a device recovery can be checkpointed and restarted.
- With 0.90, the maximum component size for RAID1 or higher is 2TB (or
  maybe 4TB, not sure). With 1.x you can go much higher.

Those are the only ones I can think of at the moment.

It is rarely worth the effort to upgrade, but it is usually best to choose
1.2 for new arrays that you don't want to boot off. If you want to boot off
the array, then whatever works with your boot-loader is the best choice.

NeilBrown
* Re: argh!
From: John Robinson @ 2010-11-01 1:51 UTC
To: Neil Brown; +Cc: Jon, jonathan.hardcastle, Leslie Rhorer, Phil Turmel, linux-raid

On 31/10/2010 21:44, Neil Brown wrote:
> The "md" man page mentions a couple of differences:
> - v1.x can handle more than 28 devices in an array
> - v1.x can easily be moved between hosts with different endianness
> - v1.x can put the metadata at the front of the array
> [...]

Aha. Some other good info for me to perhaps incorporate if I ever get
round to trying to patch the man page. In fact, I probably ought to review
the last few months' list postings, and especially Neil B's.

Cheers,

John.