* argh!
From: Jon Hardcastle @ 2010-10-30 11:56 UTC
To: linux-raid

Guys,

A new HDD has failed on me during a scrub. I tried to remove/fail it, but
mdadm kept saying the device was busy, so I forced a reboot. I have since
physically disconnected the drive.

Can anyone take a look at the --examine output below and tell me whether
the array should assemble OK? I tried:

mdadm --assemble /dev/md4 /dev/sda1 /dev/sdb1 /dev/sdd1 /dev/sde1 /dev/sdf1 /dev/sdg1

Is that the correct command? It won't assemble, and even when I --force it,
it still isn't right. Help!

--- examine starts

/dev/sda1:
          Magic : a92b4efc
        Version : 0.90.00
           UUID : 7438efd1:9e6ca2b5:d6b88274:7003b1d3
  Creation Time : Thu Oct 11 00:01:49 2007
     Raid Level : raid6
  Used Dev Size : 732571904 (698.64 GiB 750.15 GB)
     Array Size : 3662859520 (3493.18 GiB 3750.77 GB)
   Raid Devices : 7
  Total Devices : 7
Preferred Minor : 4

    Update Time : Sat Oct 30 05:10:12 2010
          State : active
 Active Devices : 6
Working Devices : 6
 Failed Devices : 1
  Spare Devices : 0
       Checksum : c22df0a0 - correct
         Events : 1882828

         Layout : left-symmetric
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     1       8        1        1      active sync   /dev/sda1

   0     0       8       97        0      active sync   /dev/sdg1
   1     1       8        1        1      active sync   /dev/sda1
   2     2       8       65        2      active sync   /dev/sde1
   3     3       0        0        3      faulty removed
   4     4       8       49        4      active sync   /dev/sdd1
   5     5       8       17        5      active sync   /dev/sdb1
   6     6       8       81        6      active sync   /dev/sdf1

/dev/sdb1:
          Magic : a92b4efc
        Version : 0.90.00
           UUID : 7438efd1:9e6ca2b5:d6b88274:7003b1d3
  Creation Time : Thu Oct 11 00:01:49 2007
     Raid Level : raid6
  Used Dev Size : 732571904 (698.64 GiB 750.15 GB)
     Array Size : 3662859520 (3493.18 GiB 3750.77 GB)
   Raid Devices : 7
  Total Devices : 7
Preferred Minor : 4

    Update Time : Sat Oct 30 05:10:12 2010
          State : active
 Active Devices : 6
Working Devices : 6
 Failed Devices : 1
  Spare Devices : 0
       Checksum : c22df0b8 - correct
         Events : 1882828

         Layout : left-symmetric
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     5       8       17        5      active sync   /dev/sdb1

   0     0       8       97        0      active sync   /dev/sdg1
   1     1       8        1        1      active sync   /dev/sda1
   2     2       8       65        2      active sync   /dev/sde1
   3     3       0        0        3      faulty removed
   4     4       8       49        4      active sync   /dev/sdd1
   5     5       8       17        5      active sync   /dev/sdb1
   6     6       8       81        6      active sync   /dev/sdf1

/dev/sdd1:
          Magic : a92b4efc
        Version : 0.90.00
           UUID : 7438efd1:9e6ca2b5:d6b88274:7003b1d3
  Creation Time : Thu Oct 11 00:01:49 2007
     Raid Level : raid6
  Used Dev Size : 732571904 (698.64 GiB 750.15 GB)
     Array Size : 3662859520 (3493.18 GiB 3750.77 GB)
   Raid Devices : 7
  Total Devices : 7
Preferred Minor : 4

    Update Time : Sat Oct 30 05:10:12 2010
          State : active
 Active Devices : 6
Working Devices : 6
 Failed Devices : 1
  Spare Devices : 0
       Checksum : c22df0d6 - correct
         Events : 1882828

         Layout : left-symmetric
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     4       8       49        4      active sync   /dev/sdd1

   0     0       8       97        0      active sync   /dev/sdg1
   1     1       8        1        1      active sync   /dev/sda1
   2     2       8       65        2      active sync   /dev/sde1
   3     3       0        0        3      faulty removed
   4     4       8       49        4      active sync   /dev/sdd1
   5     5       8       17        5      active sync   /dev/sdb1
   6     6       8       81        6      active sync   /dev/sdf1

/dev/sde1:
          Magic : a92b4efc
        Version : 0.90.00
           UUID : 7438efd1:9e6ca2b5:d6b88274:7003b1d3
  Creation Time : Thu Oct 11 00:01:49 2007
     Raid Level : raid6
  Used Dev Size : 732571904 (698.64 GiB 750.15 GB)
     Array Size : 3662859520 (3493.18 GiB 3750.77 GB)
   Raid Devices : 7
  Total Devices : 7
Preferred Minor : 4

    Update Time : Sat Oct 30 05:10:12 2010
          State : active
 Active Devices : 6
Working Devices : 6
 Failed Devices : 1
  Spare Devices : 0
       Checksum : c22df0e2 - correct
         Events : 1882828

         Layout : left-symmetric
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     2       8       65        2      active sync   /dev/sde1

   0     0       8       97        0      active sync   /dev/sdg1
   1     1       8        1        1      active sync   /dev/sda1
   2     2       8       65        2      active sync   /dev/sde1
   3     3       0        0        3      faulty removed
   4     4       8       49        4      active sync   /dev/sdd1
   5     5       8       17        5      active sync   /dev/sdb1
   6     6       8       81        6      active sync   /dev/sdf1

/dev/sdf1:
          Magic : a92b4efc
        Version : 0.90.00
           UUID : 7438efd1:9e6ca2b5:d6b88274:7003b1d3
  Creation Time : Thu Oct 11 00:01:49 2007
     Raid Level : raid6
  Used Dev Size : 732571904 (698.64 GiB 750.15 GB)
     Array Size : 3662859520 (3493.18 GiB 3750.77 GB)
   Raid Devices : 7
  Total Devices : 7
Preferred Minor : 4

    Update Time : Sat Oct 30 05:10:12 2010
          State : active
 Active Devices : 6
Working Devices : 6
 Failed Devices : 1
  Spare Devices : 0
       Checksum : c22df0fa - correct
         Events : 1882828

         Layout : left-symmetric
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     6       8       81        6      active sync   /dev/sdf1

   0     0       8       97        0      active sync   /dev/sdg1
   1     1       8        1        1      active sync   /dev/sda1
   2     2       8       65        2      active sync   /dev/sde1
   3     3       0        0        3      faulty removed
   4     4       8       49        4      active sync   /dev/sdd1
   5     5       8       17        5      active sync   /dev/sdb1
   6     6       8       81        6      active sync   /dev/sdf1

/dev/sdg1:
          Magic : a92b4efc
        Version : 0.90.00
           UUID : 7438efd1:9e6ca2b5:d6b88274:7003b1d3
  Creation Time : Thu Oct 11 00:01:49 2007
     Raid Level : raid6
  Used Dev Size : 732571904 (698.64 GiB 750.15 GB)
     Array Size : 3662859520 (3493.18 GiB 3750.77 GB)
   Raid Devices : 7
  Total Devices : 7
Preferred Minor : 4

    Update Time : Sat Oct 30 05:10:12 2010
          State : active
 Active Devices : 6
Working Devices : 6
 Failed Devices : 1
  Spare Devices : 0
       Checksum : c22df0fe - correct
         Events : 1882828

         Layout : left-symmetric
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     0       8       97        0      active sync   /dev/sdg1

   0     0       8       97        0      active sync   /dev/sdg1
   1     1       8        1        1      active sync   /dev/sda1
   2     2       8       65        2      active sync   /dev/sde1
   3     3       0        0        3      faulty removed
   4     4       8       49        4      active sync   /dev/sdd1
   5     5       8       17        5      active sync   /dev/sdb1
   6     6       8       81        6      active sync   /dev/sdf1
* Re: argh!
From: Phil Turmel @ 2010-10-30 15:45 UTC
To: Jon; +Cc: Jon Hardcastle, linux-raid

On 10/30/2010 07:56 AM, Jon Hardcastle wrote:
> Guys,
>
> A new HDD has failed on me during a scrub. I tried to remove/fail it, but
> mdadm kept saying the device was busy, so I forced a reboot. I have since
> physically disconnected the drive.
>
> Can anyone take a look at the --examine output below and tell me whether
> the array should assemble OK? I tried:
>
> mdadm --assemble /dev/md4 /dev/sda1 /dev/sdb1 /dev/sdd1 /dev/sde1
> /dev/sdf1 /dev/sdg1

I'd try:

mdadm --assemble /dev/md4 /dev/sd{g,a,e}1 missing /dev/sd{d,b,f}1

HTH,

Phil
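(A minimal sketch of the sanity check behind a device ordering like the one
above: compare the superblock event counters and update times across the
surviving members before attempting any forced assembly. Device names are
taken from this thread; everything else is stock mdadm:)

    # Compare superblock event counters across the surviving members.
    # Members whose counts match can be assembled together safely;
    # a member whose count lags holds stale data.
    for dev in /dev/sd[abdefg]1; do
        echo "== $dev =="
        mdadm --examine "$dev" | grep -E 'Update Time|Events'
    done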
* RE: argh!
From: Leslie Rhorer @ 2010-10-30 21:10 UTC
To: 'Phil Turmel', Jon; +Cc: 'Jon Hardcastle', linux-raid

> On 10/30/2010 07:56 AM, Jon Hardcastle wrote:
> > A new HDD has failed on me during a scrub. I tried to remove/fail it,
> > but mdadm kept saying the device was busy, so I forced a reboot. I have
> > since physically disconnected the drive.
> >
> > Can anyone take a look at the --examine output below and tell me
> > whether the array should assemble OK? I tried:
> >
> > mdadm --assemble /dev/md4 /dev/sda1 /dev/sdb1 /dev/sdd1 /dev/sde1
> > /dev/sdf1 /dev/sdg1
>
> I'd try:
>
> mdadm --assemble /dev/md4 /dev/sd{g,a,e}1 missing /dev/sd{d,b,f}1

Yeah, I would, too. Also, what are the contents of /etc/mdadm/mdadm.conf?
If it is correct, then `mdadm --assemble --scan` should work.
* Re: argh!
From: Jon Hardcastle @ 2010-10-30 21:52 UTC
To: Leslie Rhorer; +Cc: Phil Turmel, linux-raid

On 30 October 2010 22:10, Leslie Rhorer <lrhorer@satx.rr.com> wrote:
>> I'd try:
>>
>> mdadm --assemble /dev/md4 /dev/sd{g,a,e}1 missing /dev/sd{d,b,f}1
>
> Yeah, I would, too. Also, what are the contents of /etc/mdadm/mdadm.conf?
> If it is correct, then `mdadm --assemble --scan` should work.

Hey, yeah. I am confused, as drives have failed before and the array has
still assembled. I think it is because it is unclean.

Can I ask how you arrived at that command? What is wrong with d, b, and f?

Also, this is my mdadm.conf:

DEVICE /dev/sd[abcdefg]1 /dev/hd[ab]1

ARRAY /dev/md/4 metadata=0.90 UUID=7438efd1:9e6ca2b5:d6b88274:7003b1d3
ARRAY /dev/md/3 metadata=0.90 UUID=a1f24bc9:4e72a820:3a03f7dc:07f9ab98
ARRAY /dev/md/2 metadata=0.90 UUID=0642323a:938992ef:b750ab21:e5a55662
ARRAY /dev/md/1 metadata=0.90 UUID=d4eeec62:148b3425:3f5e931c:bb3ef499
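(With ARRAY lines keyed by UUID like these, scan assembly does not depend
on device names. A quick sketch for cross-checking the conf file against
what is actually on disk, using only stock mdadm:)

    # Print ARRAY lines derived from the superblocks mdadm can find,
    # then compare them with the configured ones.
    mdadm --examine --scan
    grep '^ARRAY' /etc/mdadm/mdadm.conf

    # If the UUIDs agree, a plain scan assembly should work:
    mdadm --assemble --scan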
* Re: argh!
From: Jon Hardcastle @ 2010-10-30 21:54 UTC
To: Leslie Rhorer; +Cc: Phil Turmel, linux-raid

Also, what commands can I not run? I.e. which ones are destructive?

On 30 October 2010 22:52, Jon Hardcastle <jonathan.hardcastle@gmail.com> wrote:
> Hey, yeah. I am confused, as drives have failed before and the array has
> still assembled. I think it is because it is unclean.
>
> Can I ask how you arrived at that command? What is wrong with d, b, and f?

--
-----------------------
N: Jon Hardcastle
E: Jon@eHardcastle.com
JK: "What's a wombat for? Why for hitting Woms of course!"
'Q': 'There comes a time when you look into the mirror, and you realise
that what you see is all that you will ever be. Then you accept it, or you
kill yourself. Or you stop looking into mirrors... :)'

Please note, I am phasing out jonathan DOT hardcastle AT gmail.com and
replacing it with jon AT eHardcastle.com
-----------------------
* Re: argh!
From: Jon Hardcastle @ 2010-10-30 22:01 UTC
To: Leslie Rhorer; +Cc: Phil Turmel, linux-raid

Sorry to spam. If I run

'mdadm --assemble --scan -R'

the array assembles in an inactive state, but it suggests I use force. I am
worried about doing damage, though.

Also, perhaps some extra safeguards for thick people would be cool, i.e. a
single force for things that are safe, like mounting an incomplete array,
but having to specify it twice, i.e. '-F -F', for things that can do
damage.

On 30 October 2010 22:54, Jon Hardcastle <jonathan.hardcastle@gmail.com> wrote:
> Also, what commands can I not run? I.e. which ones are destructive?
* RE: argh!
From: Leslie Rhorer @ 2010-10-31 0:07 UTC
To: Jon; +Cc: 'Phil Turmel', linux-raid

> Sorry to spam. If I run
>
> 'mdadm --assemble --scan -R'
>
> the array assembles in an inactive state, but it suggests I use force. I
> am worried about doing damage, though.

Assembly using --force won't do damage. It simply will either pass or fail.
If it passes, proceed to mounting the array read-only. If it fails, you'll
have to do more work.
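(Spelled out, the sequence described here looks roughly like the following;
the mount point and the use of --run are assumptions for the sketch:)

    # Force assembly from the six surviving members, then mount read-only
    # so nothing is written until the filesystem has been checked.
    mdadm --assemble --force --run /dev/md4 \
        /dev/sda1 /dev/sdb1 /dev/sdd1 /dev/sde1 /dev/sdf1 /dev/sdg1

    mkdir -p /mnt/md4
    mount -o ro /dev/md4 /mnt/md4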
* Re: argh!
From: Jon Hardcastle @ 2010-10-31 18:52 UTC
To: Leslie Rhorer; +Cc: Phil Turmel, linux-raid

On 31 October 2010 01:07, Leslie Rhorer <lrhorer@satx.rr.com> wrote:
> Assembly using --force won't do damage. It simply will either pass or
> fail. If it passes, proceed to mounting the array read-only. If it fails,
> you'll have to do more work.

Thanks for your help!

mdadm --assemble /dev/md4 --run --force

did it.

I don't have backups, as this is 4 TB of data and I have never been able to
afford a whole second machine, but the price of drives has come down a lot
now, so I think I may build a simple machine for weekly backups.

Thanks for listing which commands are destructive. I am running a 'check'
before mounting the arrays; I will then kick off a filesystem check on all
partitions.

I have been combing my log files. I think it was my controller that failed,
not the drive (not confirmed, as the drive is yet to be reconnected), but I
noticed that md booted the drive out and then, I think, crashed due to a
bug, so the drive was still part of the array; i.e. it was still 'checking'
when I looked, and even after a --fail the drive was still in the array and
'checking'.

I have this from messages:

Oct 30 05:02:08 localhost mdadm[13271]: Fail event detected on md device /dev/md/4, component device /dev/sdc1
Oct 30 05:02:08 localhost kernel: ------------[ cut here ]------------
Oct 30 05:02:08 localhost kernel: kernel BUG at drivers/md/raid5.c:2768!
Oct 30 05:02:08 localhost kernel: invalid opcode: 0000 [#1] SMP
Oct 30 05:02:08 localhost kernel: last sysfs file: /sys/devices/virtual/block/md4/md/metadata_version
Oct 30 05:02:08 localhost kernel: Modules linked in: ipv6 snd_seq_midi snd_seq_oss snd_seq_midi_event snd_seq snd_pcm_oss snd_mixer_oss snd_hda_codec_analog snd_cs4236 snd_wavefront snd_wss_lib snd_opl3_lib snd_hda_intel snd_hda_codec snd_mpu401 snd_hwdep snd_mpu401_uart snd_pcm snd_rawmidi snd_seq_device i2c_nforce2 ppdev pcspkr snd_timer k8temp snd_page_alloc forcedeth i2c_core fan rtc_cmos ns558 snd gameport processor rtc_core thermal rtc_lib button thermal_sys parport_pc tg3 libphy e1000 fuse xfs exportfs nfs auth_rpcgss nfs_acl lockd sunrpc jfs raid10 dm_bbr dm_snapshot dm_crypt dm_mirror dm_region_hash dm_log dm_mod scsi_wait_scan sbp2 ohci1394 ieee1394 sl811_hcd usbhid ohci_hcd ssb uhci_hcd usb_storage ehci_hcd usbcore aic94xx libsas lpfc qla2xxx megaraid_sas megaraid_mbox megaraid_mm megaraid aacraid sx8 DAC960 cciss 3w_9xxx 3w_xxxx mptsas scsi_transport_sas mptfc scsi_transport_fc scsi_tgt mptspi mptscsih mptbase atp870u dc395x qla1280 imm parport dmx3191d sym53c8xx qlogicfas408 gdth advansys initio BusLogic arcmsr aic7xxx aic79xx scsi_transport_spi sg pdc_adma sata_inic162x sata_mv ata_piix ahci sata_qstor sata_vsc sata_uli sata_sis sata_sx4 sata_nv sata_via sata_svw sata_sil24 sata_sil sata_promise pata_pcmcia pcmcia pcmcia_core
Oct 30 05:02:08 localhost kernel:
Oct 30 05:02:08 localhost kernel: Pid: 9967, comm: md4_raid6 Not tainted (2.6.32-gentoo-r1 #1) System Product Name
Oct 30 05:02:08 localhost kernel: EIP: 0060:[<c0363658>] EFLAGS: 00010297 CPU: 0
Oct 30 05:02:08 localhost kernel: EIP is at handle_stripe+0x819/0x1617
Oct 30 05:02:08 localhost kernel: EAX: 00000006 EBX: dd19d1ac ECX: 00000003 EDX: 00000001
Oct 30 05:02:08 localhost kernel: ESI: dd19d1d4 EDI: 00000002 EBP: dc843f1c ESP: dc843e50
Oct 30 05:02:08 localhost kernel: DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
Oct 30 05:02:08 localhost kernel: Process md4_raid6 (pid: 9967, ti=dc843000 task=de5e8510 task.ti=dc843000)
Oct 30 05:02:08 localhost kernel: Stack:
Oct 30 05:02:08 localhost kernel:  de5e8510 97ac2223 00000007 dd0a8400 de4b91dc 00000007 c13a1360 00020003
Oct 30 05:02:08 localhost kernel: <0> dc89b7c0 00000008 00000003 00000246 dc843eb4 c04017c8 00000010 dd19d524
Oct 30 05:02:08 localhost kernel: <0> 00000006 fffffffc dd025534 dc843eb8 00000000 00000000 00000246 dd025534
Oct 30 05:02:08 localhost kernel: Call Trace:
Oct 30 05:02:08 localhost kernel:  [<c04017c8>] ? __mutex_lock_slowpath+0x1f4/0x1fc
Oct 30 05:02:08 localhost kernel:  [<c0364796>] ? raid5d+0x340/0x37e

... a lot more follows
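(For reference, the 'check' mentioned above is driven through sysfs; a
minimal sketch for md4, using the standard md sysfs layout:)

    # Start a read-only consistency check (scrub) of the array.
    echo check > /sys/block/md4/md/sync_action

    # Watch progress, then read the count of inconsistent sectors found.
    cat /proc/mdstat
    cat /sys/block/md4/md/mismatch_cnt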
* Re: argh!
From: Neil Brown @ 2010-10-31 19:43 UTC
To: Jon; +Cc: jonathan.hardcastle, Leslie Rhorer, Phil Turmel, linux-raid

On Sun, 31 Oct 2010 18:52:10 +0000 Jon Hardcastle
<jonathan.hardcastle@gmail.com> wrote:
>
> I have this from messages:
>
> Oct 30 05:02:08 localhost mdadm[13271]: Fail event detected on md
> device /dev/md/4, component device /dev/sdc1
> Oct 30 05:02:08 localhost kernel: ------------[ cut here ]------------
> Oct 30 05:02:08 localhost kernel: kernel BUG at drivers/md/raid5.c:2768!

What kernel version was this?

Thanks,
NeilBrown
* Re: argh!
From: Jon Hardcastle @ 2010-10-31 19:54 UTC
To: Neil Brown; +Cc: Leslie Rhorer, Phil Turmel, linux-raid

On 31 October 2010 19:43, Neil Brown <neilb@suse.de> wrote:
> > Oct 30 05:02:08 localhost kernel: kernel BUG at drivers/md/raid5.c:2768!
>
> What kernel version was this?

2.6.32-gentoo-r1
* RE: argh!
From: Leslie Rhorer @ 2010-11-01 21:39 UTC
To: Jon; +Cc: 'Phil Turmel', linux-raid

> I don't have backups, as this is 4 TB of data and I have never been able
> to afford a whole second machine, but the price of drives has come down a
> lot now, so I think I may build a simple machine for weekly backups.

4T worth of backup space can be had for $200. If the data is not worth $200
to you, then by all means you are free to ignore the need for backup, but
eventually you will lose at least some, if not all, of the data. Although
an on-line backup system is very handy, it is not absolutely essential. You
could employ dar or some similar utility to back up the data to individual
off-line disks, for example.
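(A rough sketch of that off-line-disk approach with dar; the archive path,
mount point, and slice size are invented for the example, so check dar(1)
for the exact option spellings in your version:)

    # Archive the array's filesystem into compressed 4 GiB slices that
    # can be copied onto individual off-line disks.
    dar -c /backup/md4-weekly -R /mnt/md4 -s 4G -z

    # Periodically restore into a scratch directory to prove the backup
    # is actually usable.
    mkdir -p /tmp/restore-test
    dar -x /backup/md4-weekly -R /tmp/restore-test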
* RE: argh!
From: Leslie Rhorer @ 2010-10-31 0:05 UTC
To: Jon; +Cc: 'Phil Turmel', linux-raid

> Also, what commands can I not run? I.e. which ones are destructive?

--build, --create, --add, --zero-superblock. Some of the growth commands
can also get you into trouble if something happens, depending on the
version of mdadm. --assemble --force should not, at least not in and of
itself. I would take care to mount the filesystem read-only and check it
out thoroughly before re-mounting as read-write. If there are problems, you
may need to restore some of the data from backups. You have backups, right?
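(A sketch of that read-only inspection; fsck -n assumes an ext-family
filesystem on md4, so substitute the appropriate checker otherwise:)

    # Dry-run filesystem check: -n answers "no" to every repair prompt,
    # so nothing on the device is modified.
    fsck -n /dev/md4

    # If it comes back clean, mount read-only and spot-check the contents.
    mount -o ro /dev/md4 /mnt/md4
    ls /mnt/md4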
* RE: argh!
From: Leslie Rhorer @ 2010-10-30 23:57 UTC
To: Jon; +Cc: 'Phil Turmel', linux-raid

> > A new HDD has failed on me during a scrub. I tried to remove/fail it,
> > but mdadm kept saying the device was busy, so I forced a reboot.

BTW, it's better, if you can, to free up the device rather than reboot. Now
that you have rebooted, that's no longer possible.

> Can I ask how you arrived at that command?

Look at the results of --examine. Every one shows the list of drives and
their order.

> What is wrong with d, b, and f?

'No idea. SMART might give you an idea, or the kernel logs.

> Also, this is my mdadm.conf:
>
> DEVICE /dev/sd[abcdefg]1 /dev/hd[ab]1
>
> ARRAY /dev/md/4 metadata=0.90 UUID=7438efd1:9e6ca2b5:d6b88274:7003b1d3
> ARRAY /dev/md/3 metadata=0.90 UUID=a1f24bc9:4e72a820:3a03f7dc:07f9ab98
> ARRAY /dev/md/2 metadata=0.90 UUID=0642323a:938992ef:b750ab21:e5a55662
> ARRAY /dev/md/1 metadata=0.90 UUID=d4eeec62:148b3425:3f5e931c:bb3ef499

--scan may work. I suggest updating the file with all the array members.
Why are all the arrays assembled with 0.90 superblocks? The 0.90 superblock
has some significant limitations. They may not be causing you grief right
now, but they could in the future. The only arrays I have built with 0.90
superblocks are the /boot targets, because GRUB2 does not support 1.x
superblocks. I've chosen 1.2 for all the others.
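(Sketches of both suggestions follow; /dev/sdc assumes the suspect drive
reappears at its old slot once reconnected:)

    # Check the suspect drive's SMART health, attributes, and error logs.
    smartctl -H /dev/sdc
    smartctl -a /dev/sdc

    # Regenerate ARRAY lines from the running arrays; review the output
    # before merging it into /etc/mdadm/mdadm.conf.
    mdadm --detail --scan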
* Re: argh!
From: Jon Hardcastle @ 2010-10-31 21:18 UTC
To: Leslie Rhorer; +Cc: Phil Turmel, linux-raid

On 31 October 2010 00:57, Leslie Rhorer <lrhorer@satx.rr.com> wrote:
> --scan may work. I suggest updating the file with all the array members.
> Why are all the arrays assembled with 0.90 superblocks? The 0.90
> superblock has some significant limitations. They may not be causing you
> grief right now, but they could in the future. The only arrays I have
> built with 0.90 superblocks are the /boot targets, because GRUB2 does not
> support 1.x superblocks. I've chosen 1.2 for all the others.

Hi,

Thanks for your help. I use 0.90 because that is what there was when the
machine was built ~3 years ago; the array has been grown and resized since
then.

Does anyone have a feature list for the superblocks? Why upgrade?

Thanks
* Re: argh!
From: Neil Brown @ 2010-10-31 21:44 UTC
To: Jon; +Cc: jonathan.hardcastle, Leslie Rhorer, Phil Turmel, linux-raid

On Sun, 31 Oct 2010 21:18:52 +0000 Jon Hardcastle
<jonathan.hardcastle@gmail.com> wrote:
>
> Does anyone have a feature list for the superblocks? Why upgrade?

The "md" man page mentions a couple of differences:
- v1.x can handle more than 28 devices in an array
- v1.x can easily be moved between hosts with different endianness
- v1.x can put the metadata at the front of the array

I should probably add the other differences:
- With 0.90 there can be confusion about whether a superblock applies to
  the whole device or to just the last partition (if it starts on a 64K
  boundary). 1.x doesn't have that problem.
- With 1.x a device recovery can be checkpointed and restarted.
- With 0.90, the maximum component size for RAID1 or higher is 2TB (or
  maybe 4TB, not sure). With 1.x you can go much higher.

Those are the only ones I can think of at the moment.

It is rarely worth the effort to upgrade, but it is usually best to choose
1.2 for new arrays that you don't want to boot off. If you want to boot off
the array, then whatever works with your boot-loader is the best choice.

NeilBrown
* Re: argh!
From: John Robinson @ 2010-11-01 1:51 UTC
To: Neil Brown; +Cc: Jon, jonathan.hardcastle, Leslie Rhorer, Phil Turmel, linux-raid

On 31/10/2010 21:44, Neil Brown wrote:
> The "md" man page mentions a couple of differences:
> - v1.x can handle more than 28 devices in an array
> - v1.x can easily be moved between hosts with different endianness
> - v1.x can put the metadata at the front of the array
> [...]

Aha. Some other good info for me to perhaps incorporate if I ever get
round to trying to patch the man page. In fact, I probably ought to review
the last few months' list postings, and especially Neil B's.

Cheers,

John.