* why the kernel and mdadm report differently
From: Farkas Levente @ 2005-09-05  9:01 UTC
  To: linux-raid

hi,
one of our raid arrays crashes all the time (about once a week), and one 
more strange thing: it's currently not working. the kernel reports it as 
inactive while mdadm says it's active, degraded. what's more, we can't 
put this array into the active state.
this is mdadm 1.12. just another side note: there is no rpm for version 
2.0 :-(
yours.
---------------------------------------------------
[root@kek:~] mdadm --detail /dev/md2
/dev/md2:
         Version : 00.90.01
   Creation Time : Tue Jun  1 09:37:17 2004
      Raid Level : raid5
     Device Size : 120053632 (114.49 GiB 122.93 GB)
    Raid Devices : 7
   Total Devices : 7
Preferred Minor : 2
     Persistence : Superblock is persistent

     Update Time : Sat Sep  3 16:10:39 2005
           State : active, degraded
  Active Devices : 6
Working Devices : 7
  Failed Devices : 0
   Spare Devices : 1

          Layout : left-symmetric
      Chunk Size : 128K

            UUID : 79b566fd:924d9c94:15304031:0c945006
          Events : 0.3130

     Number   Major   Minor   RaidDevice State
        0       8        1        0      active sync   /dev/sda1
        1       8       17        1      active sync   /dev/sdb1
        2       8       33        2      active sync   /dev/sdc1
        3       0        0        -      removed
        4       8       65        4      active sync   /dev/sde1
        5       8       81        5      active sync   /dev/sdf1
        6       8       97        6      active sync   /dev/sdg1

        7       8      113        -      spare   /dev/sdh1
[root@kek:~] cat /proc/mdstat
Personalities : [raid1] [raid5]
md1 : active raid1 hdc1[0] hda1[1]
       1048704 blocks [2/2] [UU]

md2 : inactive sda1[0] sdh1[7] sdg1[6] sdf1[5] sde1[4] sdc1[2] sdb1[1]
       840375424 blocks
md0 : active raid1 hdc2[0] hda2[1]
       39097664 blocks [2/2] [UU]

unused devices: <none>
---------------------------------------------------

-- 
   Levente                               "Si vis pacem para bellum!"



* Re: why the kernel and mdadm report differently
From: Neil Brown @ 2005-09-05  9:11 UTC
  To: Farkas Levente; +Cc: linux-raid

On Monday September 5, lfarkas@bppiac.hu wrote:
> hi,
> one of our raid arrays crashes all the time (about once a week), 

Any kernel error messages?

> and one more 
> strange thing: it's currently not working. the kernel reports it as 
> inactive while mdadm says it's active, degraded. what's more, we can't 
> put this array into the active state.

Looks like you need to stop it (mdadm -S /dev/md2) and re-assemble it
with --force:
  mdadm -A /dev/md2 -f /dev/sd[abcefgh]1

It looks like the computer crashed and when it came back up it was
missing a drive.  This situation can result in silent data corruption,
which is why md won't automatically assemble it.  When you do assemble
it, you should at least fsck the filesystem, and possibly check for
data corruption if that is possible.  At least be aware that some data
could be corrupt (there is a good chance that nothing is, but it is by
no means certain).
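
Put together, a cautious recovery might look like this (just a sketch; 
the read-only fsck pass assumes an ext2/ext3 filesystem on the array, 
so adjust for whatever you actually have):

  mdadm -S /dev/md2                        # stop the half-assembled, inactive array
  mdadm -A /dev/md2 -f /dev/sd[abcefgh]1   # force assembly despite the unclean shutdown
  fsck -n /dev/md2                         # read-only check first; rerun without -n to repair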


> this is mdadm 1.12. just another side note: there is no rpm for version 
> 2.0 :-(

No.  I seem to remember some odd compile issue with making the RPM
and thinking "I don't care".   Maybe I should care a bit more....

NeilBrown


* Re: why the kernel and mdadm report differently
From: Farkas Levente @ 2005-09-05 11:49 UTC
  To: Neil Brown; +Cc: linux-raid

Neil Brown wrote:
> On Monday September 5, lfarkas@bppiac.hu wrote:
> 
>>hi,
>>one of our raid arrays crashes all the time (about once a week), 
> 
> 
> Any kernel error messages?

i already sent reports about this a few times to this list without any 
response, but i'll send you a private message with all the logs.

>>and one more 
>>strange thing: it's currently not working. the kernel reports it as 
>>inactive while mdadm says it's active, degraded. what's more, we can't 
>>put this array into the active state.
> 
> 
> Looks like you need to stop it (mdadm -S /dev/md2) and re-assemble it
> with --force:
>   mdadm -A /dev/md2 -f /dev/sd[abcefgh]1
> 
> It looks like the computer crashed and when it came back up it was
> missing a drive.  This situation can result in silent data corruption,
> which is why md won't automatically assemble it.  When you do assemble
> it, you should at least fsck the filesystem, and possibly check for
> data corruption if that is possible.  At least be aware that some data
> could be corrupt (there is a good chance that nothing is, but it is by
> no means certain).

it works. but shouldn't they both report it the same way, either 
inactive or active?


>>this is mdadm 1.12. just another side note: there is no rpm for version 
>>2.0 :-(
> 
> 
> No.  I seem to remember some odd compile issue with making the RPM
> and thinking "I don't care".   Maybe I should care a bit more....

would be useful.


-- 
   Levente                               "Si vis pacem para bellum!"



* Re: why the kernel and mdadm report differently
From: David M. Strang @ 2005-09-05 12:36 UTC
  To: Farkas Levente, Neil Brown; +Cc: linux-raid

Farkas Levente wrote:
> >>this is mdadm 1.12. just another side note: there is no rpm for 
> >>version 2.0 :-(
> > 
> > 
> > No.  I seem to remember some odd compile issue with making the RPM
> > and thinking "I don't care".   Maybe I should care a bit more....
> 
> would be useful.

Not trying to be rude, but installing mdadm is pathetically easy. 

Untar/gunzip the source and type:

    make
    make install
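
If you do want the files tracked by rpm anyway, you can often build a 
package straight from the tarball, assuming it carries an embedded spec 
file (mdadm's has in some releases; check before relying on it, and the 
tarball name below is only illustrative):

    rpmbuild -ta mdadm-1.12.0.tgz    # builds binary and source rpms from the tarball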


-- David


* Re: why the kernel and mdadm report differently
From: Farkas Levente @ 2005-09-05 13:30 UTC
  To: Neil Brown; +Cc: linux-raid

Farkas Levente wrote:
>>> and one more strange thing: it's currently not working. the kernel 
>>> reports it as inactive while mdadm says it's active, degraded. what's 
>>> more, we can't put this array into the active state.
>>
>>
>>
>> Looks like you need to stop it (mdadm -S /dev/md2) and re-assemble it
>> with --force:
>>   mdadm -A /dev/md2 -f /dev/sd[abcefgh]1
>>
>> It looks like the computer crashed and when it came back up it was
>> missing a drive.  This situation can result in silent data corruption,
>> which is why md won't automatically assemble it.  When you do assemble
>> it, you should at least fsck the filesystem, and possibly check for
>> data corruption if that is possible.  At least be aware that some data
>> could be corrupt (there is a good chance that nothing is, but it is by
>> no means certain).
> 
> 
> it works. but shouldn't they both report it the same way, either 
> inactive or active?

or it seems to work, but now it's doing nothing?!:
--------------------------------------------------------
[root@kek:~] cat /proc/mdstat
Personalities : [raid1] [raid5]
md1 : active raid1 hdc1[0] hda1[1]
       1048704 blocks [2/2] [UU]

md2 : active raid5 sdc1[7] sda1[0] sdh1[8] sdg1[6] sdf1[5] sde1[4] sdb1[1]
       720321792 blocks level 5, 128k chunk, algorithm 2 [7/5] [UU__UUU]

md0 : active raid1 hdc2[0] hda2[1]
       39097664 blocks [2/2] [UU]

unused devices: <none>
[root@kek:~] mdadm --detail /dev/md2
/dev/md2:
         Version : 00.90.01
   Creation Time : Tue Jun  1 09:37:17 2004
      Raid Level : raid5
      Array Size : 720321792 (686.95 GiB 737.61 GB)
     Device Size : 120053632 (114.49 GiB 122.93 GB)
    Raid Devices : 7
   Total Devices : 7
Preferred Minor : 2
     Persistence : Superblock is persistent

     Update Time : Mon Sep  5 15:28:20 2005
           State : clean, degraded
  Active Devices : 5
Working Devices : 7
  Failed Devices : 0
   Spare Devices : 2

          Layout : left-symmetric
      Chunk Size : 128K

            UUID : 79b566fd:924d9c94:15304031:0c945006
          Events : 0.4244279

     Number   Major   Minor   RaidDevice State
        0       8        1        0      active sync   /dev/sda1
        1       8       17        1      active sync   /dev/sdb1
        2       0        0        -      removed
        3       0        0        -      removed
        4       8       65        4      active sync   /dev/sde1
        5       8       81        5      active sync   /dev/sdf1
        6       8       97        6      active sync   /dev/sdg1

        7       8       33        2      spare rebuilding   /dev/sdc1
        8       8      113        3      spare rebuilding   /dev/sdh1
--------------------------------------------------------
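
the rebuild should show up as a recovery progress line in /proc/mdstat. 
a couple of ways to watch whether anything is actually happening (just 
a sketch; the 5-second interval is arbitrary):

    watch -n 5 cat /proc/mdstat    # a "recovery = X%" line should appear under md2
    mdadm --detail /dev/md2        # prints a "Rebuild Status" percentage while rebuilding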


-- 
   Levente                               "Si vis pacem para bellum!"



* Re: why the kernel and mdadm report differently
From: Farkas Levente @ 2005-09-05 13:39 UTC
  To: David M. Strang; +Cc: Neil Brown, linux-raid

David M. Strang wrote:
> Farkas Levente wrote:
> 
>> >>this is mdadm 1.12. just another side note: there is no rpm for 
>> >>version 2.0 :-(
>> >
>> > No.  I seem to remember some odd compile issue with making the RPM
>> > and thinking "I don't care".   Maybe I should care a bit more....
>>
>> would be useful.
> 
> 
> Not trying to be rude, but installing mdadm is pathetically easy.
> Untar/gunzip the source and type:
> 
>    make
>    make install

yes, but i simply don't like to put anything onto the main filesystem 
that can't be checked later (i.e. which package owns it, whether it has 
the right checksum, etc.), and that's only possible with rpm. what's 
more, i'd like to update packages automatically on all of our servers 
from our local package repository, so if i put mdadm into that list 
then all of our servers pick it up without a manual install.
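
for reference, these are the kinds of checks rpm makes possible and a 
plain "make install" does not (a sketch, assuming mdadm was installed 
from a package):

    rpm -qf /sbin/mdadm    # which package owns this file
    rpm -V mdadm           # verify checksums, sizes, permissions and owners against the rpm db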

-- 
   Levente                               "Si vis pacem para bellum!"



* Re: why the kernel and mdadm report differently
From: Neil Brown @ 2005-09-06  1:16 UTC
  To: Farkas Levente; +Cc: linux-raid

On Monday September 5, lfarkas@bppiac.hu wrote:
> Neil Brown wrote:
> > On Monday September 5, lfarkas@bppiac.hu wrote:
> > 
> >>hi,
> >>one of our raid arrays crashes all the time (about once a week), 
> > 
> > 
> > Any kernel error messages?
> 
> i already sent reports about this a few times to this list without any 
> response, but i'll send you a private message with all the logs.

Yes, so you had; I must have missed them, sorry.

It isn't clear exactly what the problem is from the information you
were able to provide.
There was one bug in raid5 that was hitting you that has been fixed
since 2.6.9, but I don't think it was the main part of your problem.
I'm a bit suspicious of the low-level driver - is it SCSI or SATA?

In any case, I strongly suggest trying 2.6.13 and seeing if that
helps.

NeilBrown


* Re: why the kernel and mdadm report differently
From: Neil Brown @ 2005-09-06  1:18 UTC
  To: Farkas Levente; +Cc: linux-raid

On Monday September 5, lfarkas@bppiac.hu wrote:
> Farkas Levente wrote:
> >>> and one more strange thing: it's currently not working. the kernel 
> >>> reports it as inactive while mdadm says it's active, degraded. what's 
> >>> more, we can't put this array into the active state.
> >>
> >>
> >>
> >> Looks like you need to stop it (mdadm -S /dev/md2) and re-assemble it
> >> with --force:
> >>   mdadm -A /dev/md2 -f /dev/sd[abcefgh]1
> >>
> >> It looks like the computer crashed and when it came back up it was
> >> missing a drive.  This situation can result in silent data corruption,
> >> which is why md won't automatically assemble it.  When you do assemble
> >> it, you should at least fsck the filesystem, and possibly check for
> >> data corruption if that is possible.  At least be aware that some data
> >> could be corrupt (there is a good chance that nothing is, but it is by
> >> no means certain).
> > 
> > 
> > it works. but shouldn't they both report it the same way, either 
> > inactive or active?
> 
> or it seems to work, but now it's doing nothing?!:
> --------------------------------------------------------
> [root@kek:~] cat /proc/mdstat
> Personalities : [raid1] [raid5]
> md1 : active raid1 hdc1[0] hda1[1]
>        1048704 blocks [2/2] [UU]
> 
> md2 : active raid5 sdc1[7] sda1[0] sdh1[8] sdg1[6] sdf1[5] sde1[4] sdb1[1]
>        720321792 blocks level 5, 128k chunk, algorithm 2 [7/5] [UU__UUU]

Looks like you've either got some really sick drives, or a sick
controller or driver.  Try a newer kernel, and see if you can run
some tests on the drives.
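
For example (a sketch using smartmontools and badblocks, both assumed 
to be available; badblocks is run read-only here, so it is safe on a 
live array member):

  smartctl -t long /dev/sda    # start a long self-test; read results with "smartctl -l selftest /dev/sda"
  badblocks -sv /dev/sda       # read-only surface scan with progress output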

NeilBrown

