* Re: Problem with mdadm 2.6.7
       [not found] <488C531B.5060005@mandriva.org>
@ 2008-07-27 14:26 ` Thomas Backlund
  2008-07-27 15:06   ` Doug Ledford
  0 siblings, 1 reply; 5+ messages in thread
From: Thomas Backlund @ 2008-07-27 14:26 UTC (permalink / raw)
  To: linux-raid

Hi,
(please cc me as I'm not subscribed)

I have hit a bug with mdadm 2.6.7.

It rebuilds my raid5 array on every boot
(raid0 and raid1 arrays are not affected).

This didn't happen with 2.6.4.

Kernels tested are 2.6.24.7 and 2.6.25.12.

Arch is x86_64.
Distro is Mandriva 2008.1, but I've tested with kernel.org kernels and
upstream mdadm 2.6.7 and have the same problem.

Now I could try to bisect it, but every raid5 rebuild takes 6-7 hours,
so I thought I'd ask for pointers first...

Any ideas where to start looking?

here is the info on the array that gets rebuilt...


[root@tmb ~]# mdadm --detail /dev/md8
/dev/md8:
         Version : 00.90
   Creation Time : Fri Feb  1 17:44:23 2008
      Raid Level : raid5
      Array Size : 1465143808 (1397.27 GiB 1500.31 GB)
   Used Dev Size : 732571904 (698.64 GiB 750.15 GB)
    Raid Devices : 3
   Total Devices : 3
Preferred Minor : 8
     Persistence : Superblock is persistent

     Update Time : Sun Jul 27 13:32:30 2008
           State : clean, degraded, recovering
  Active Devices : 2
Working Devices : 3
  Failed Devices : 0
   Spare Devices : 1

          Layout : left-symmetric
      Chunk Size : 128K

  Rebuild Status : 1% complete

            UUID : dc482f3f:ad67b9ef:bb6636b8:e9392071
          Events : 0.16162

     Number   Major   Minor   RaidDevice State
        0       8       33        0      active sync   /dev/sdc1
        1       8       49        1      active sync   /dev/sdd1
        3       8       65        2      spare rebuilding   /dev/sde1


* Re: Problem with mdadm 2.6.7
  2008-07-27 14:26 ` Problem with mdadm 2.6.7 Thomas Backlund
@ 2008-07-27 15:06   ` Doug Ledford
  2008-07-27 15:24     ` Thomas Backlund
  0 siblings, 1 reply; 5+ messages in thread
From: Doug Ledford @ 2008-07-27 15:06 UTC (permalink / raw)
  To: Thomas Backlund; +Cc: linux-raid

On Sun, 2008-07-27 at 17:26 +0300, Thomas Backlund wrote:
> Any ideas where to start looking ?

Are you using mkinitrd (or something similar) to start the arrays, or
are you using udev rules that call mdadm --incremental --run?  If it's
the latter, then this is what you get when A) the array is started as
soon as there are enough devices to run in degraded mode and B)
something writes to the array before the last device gets added and C)
you don't have a bitmap to allow the array to keep track of which blocks
need to be resynced, and therefore it resyncs the entire drive.
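
A quick way to check for conditions A and C is to look at the State
line and for an "Intent Bitmap" line in the mdadm --detail output. The
following is only a minimal sketch: md_diagnose is a hypothetical
helper name, and it assumes that mdadm --detail prints an
"Intent Bitmap" line only when the array has a bitmap.

```shell
#!/bin/sh
# md_diagnose (hypothetical helper): reads `mdadm --detail` output on
# stdin and prints a one-line summary of the array's situation.
# Assumption: an "Intent Bitmap :" line is present only when the array
# has a write-intent bitmap.
md_diagnose() {
    awk -F' : ' '
        /^ *State :/         { state = $2 }    # e.g. "clean, degraded, recovering"
        /^ *Intent Bitmap :/ { bitmap = 1 }
        END {
            if (state ~ /degraded/ && !bitmap)
                print "degraded, no bitmap: a late member needs a full rebuild"
            else if (state ~ /degraded/)
                print "degraded, bitmap present: only dirty regions need resyncing"
            else
                print "not degraded"
        }'
}
```

Usage would be something like: mdadm --detail /dev/md8 | md_diagnose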

-- 
Doug Ledford <dledford@redhat.com>
              GPG KeyID: CFBFF194
              http://people.redhat.com/dledford

Infiniband specific RPMs available at
              http://people.redhat.com/dledford/Infiniband



* Re: Problem with mdadm 2.6.7
  2008-07-27 15:06   ` Doug Ledford
@ 2008-07-27 15:24     ` Thomas Backlund
  2008-07-27 19:31       ` Doug Ledford
  2008-07-29  0:59       ` Neil Brown
  0 siblings, 2 replies; 5+ messages in thread
From: Thomas Backlund @ 2008-07-27 15:24 UTC (permalink / raw)
  To: Doug Ledford; +Cc: linux-raid@vger.kernel.org

Doug Ledford wrote:
> Are you using mkinitrd (or something similar) to start the arrays, or
> are you using udev rules that call mdadm --incremental --run?  If it's
> the latter, then this is what you get when A) the array is started as
> soon as there are enough devices to run in degraded mode and B)
> something writes to the array before the last device gets added and C)
> you don't have a bitmap to allow the array to keep track of which blocks
> need to be resynced, and therefore it resyncs the entire drive.
> 

I'm using udev.

but looking at the difference between 2.6.4 and 2.6.7:

diff -Nurp mdadm-2.6.4/etc/udev/rules.d/70-mdadm.rules mdadm-2.6.7/etc/udev/rules.d/70-mdadm.rules
--- mdadm-2.6.4/etc/udev/rules.d/70-mdadm.rules	2008-07-27 13:14:10.000000000 +0300
+++ mdadm-2.6.7/etc/udev/rules.d/70-mdadm.rules	2008-07-27 13:11:13.000000000 +0300
@@ -3,4 +3,4 @@
 # See udev(8) for syntax

 SUBSYSTEM=="block", ACTION=="add|change", ENV{ID_FS_TYPE}=="linux_raid*", \
-	RUN+="/sbin/mdadm --incremental $root/%k"
+	RUN+="/sbin/mdadm --incremental --run --scan $root/%k"


I see that --incremental was already there in 2.6.4, so I guess the 
--run is the one messing with me...
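
To check whether an installed rules file carries the same change, a
small grep is enough. This is just a sketch: rule_forces_run is a
hypothetical helper name, and the installed path in the usage line
(/etc/udev/rules.d/70-mdadm.rules) is an assumption based on the path
inside the tarball.

```shell
#!/bin/sh
# rule_forces_run (hypothetical helper): succeeds if the given udev
# rules file calls "mdadm --incremental" together with "--run", i.e.
# starts an array as soon as the minimum number of members is present
# instead of waiting for all of them.
rule_forces_run() {
    # -e keeps grep from treating the pattern's leading dashes as options
    grep -Eq -e '--incremental.*--run' "$1"
}
```

Usage: rule_forces_run /etc/udev/rules.d/70-mdadm.rules && echo "rule uses --run"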




as for --bitmap, can it be added to an existing array ?

What is the better choice, bitmap=internal or bitmap=<some_file> ?

I'd hate to have to recreate the array, as I have about 1.2GB of data on 
it...

--
Thomas


* Re: Problem with mdadm 2.6.7
  2008-07-27 15:24     ` Thomas Backlund
@ 2008-07-27 19:31       ` Doug Ledford
  2008-07-29  0:59       ` Neil Brown
  1 sibling, 0 replies; 5+ messages in thread
From: Doug Ledford @ 2008-07-27 19:31 UTC (permalink / raw)
  To: Thomas Backlund; +Cc: linux-raid@vger.kernel.org

On Sun, 2008-07-27 at 18:24 +0300, Thomas Backlund wrote:
> I'm using udev.
> 
> but looking at the difference between 2.6.4 and 2.6.7:
> 
> diff -Nurp mdadm-2.6.4/etc/udev/rules.d/70-mdadm.rules mdadm-2.6.7/etc/udev/rules.d/70-mdadm.rules
> --- mdadm-2.6.4/etc/udev/rules.d/70-mdadm.rules	2008-07-27 13:14:10.000000000 +0300
> +++ mdadm-2.6.7/etc/udev/rules.d/70-mdadm.rules	2008-07-27 13:11:13.000000000 +0300
> @@ -3,4 +3,4 @@
>  # See udev(8) for syntax
> 
>  SUBSYSTEM=="block", ACTION=="add|change", ENV{ID_FS_TYPE}=="linux_raid*", \
> -	RUN+="/sbin/mdadm --incremental $root/%k"
> +	RUN+="/sbin/mdadm --incremental --run --scan $root/%k"
> 
> 
> I see that --incremental was already there in 2.6.4, so I guess the 
> --run is the one messing with me...

Yep.  That'd be it.
> 
> 
> 
> as for --bitmap, can it be added to an existing array ?
> 
> What is the better choice, bitmap=internal or bitmap=<some_file> ?
> 
> I'd hate to have to recreate the array, as I have about 1.2GB of data on 
> it...

I use bitmap=internal on my arrays and never have a problem.

> --
> Thomas
-- 
Doug Ledford <dledford@redhat.com>
              GPG KeyID: CFBFF194
              http://people.redhat.com/dledford

Infiniband specific RPMs available at
              http://people.redhat.com/dledford/Infiniband



* Re: Problem with mdadm 2.6.7
  2008-07-27 15:24     ` Thomas Backlund
  2008-07-27 19:31       ` Doug Ledford
@ 2008-07-29  0:59       ` Neil Brown
  1 sibling, 0 replies; 5+ messages in thread
From: Neil Brown @ 2008-07-29  0:59 UTC (permalink / raw)
  To: Thomas Backlund; +Cc: Doug Ledford, linux-raid@vger.kernel.org

On Sunday July 27, tmb@mandriva.org wrote:
> 
> as for --bitmap, can it be added to an existing array ?

Yes.
> 
> What is the better choice, bitmap=internal or bitmap=<some_file> ?

Internal is easier.

The command is
   mdadm --grow /dev/mdXXX --bitmap=internal

This could have a negative impact on performance.  The degree depends
a lot on your workload.
If you notice a reduction in write speed that is not acceptable, and
if removing the bitmap (mdadm --grow /dev/mdXXX --bitmap=none) fixes
the problem, then we can try to find an alternate solution.
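
After adding the bitmap, /proc/mdstat should show a "bitmap:" line
under the array's entry. The following is only a sketch for checking
that: md_has_bitmap is a hypothetical helper name, and /dev/md8 in the
commented commands is the array from earlier in this thread, standing
in for /dev/mdXXX.

```shell
#!/bin/sh
# md_has_bitmap (hypothetical helper): reads /proc/mdstat-style text on
# stdin and succeeds if the stanza for array $1 (e.g. md8) contains a
# "bitmap:" line.
md_has_bitmap() {
    awk -v md="$1" '
        $1 == md && $2 == ":"        { in_stanza = 1; next }  # our array header
        /^[^ \t]/                    { in_stanza = 0 }        # any other unindented line
        in_stanza && $1 == "bitmap:" { found = 1 }
        END { exit !found }
    '
}

# Typical sequence (run as root; not executed by this sketch):
#   mdadm --grow /dev/md8 --bitmap=internal
#   md_has_bitmap md8 < /proc/mdstat && echo "md8 has a bitmap"
#   mdadm --grow /dev/md8 --bitmap=none    # to remove it again
```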

> 
> I'd hate to have to recreate the array, as I have about 1.2GB of data on 
> it...

There is no need for that.

NeilBrown

