* possible bug - bitmap dirty pages status
@ 2011-08-27  9:58 CoolCold
  2011-08-31  9:05 ` CoolCold
  0 siblings, 1 reply; 13+ messages in thread
From: CoolCold @ 2011-08-27  9:58 UTC (permalink / raw)
  To: Linux RAID

Hello!
I have a RAID1 array with a bitmap (md3); one disk died, was replaced,
and the array resynced:

Aug 25 15:38:03 gamma2 kernel: [    5.986791] md: md3 stopped.
Aug 25 15:38:03 gamma2 kernel: [    6.043306] raid1: raid set md3
active with 1 out of 2 mirrors
Aug 25 15:38:03 gamma2 kernel: [    6.044378] md3: bitmap initialized
from disk: read 2/2 pages, set 357 bits
Aug 25 15:38:03 gamma2 kernel: [    6.044442] created bitmap (22
pages) for device md3
Aug 25 15:38:03 gamma2 kernel: [    6.070492] md3: detected capacity
change from 0 to 1478197903360
Aug 25 15:38:03 gamma2 kernel: [    6.070862]  md3: unknown partition table
Aug 26 19:33:33 gamma2 kernel: [100325.814179] md: md3: recovery done.

Now, /proc/mdstat still shows a "dirty" bitmap, 16 of 22 pages:

root@gamma2:~# cat /proc/mdstat
Personalities : [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md3 : active raid1 sdc3[1] sdb3[0]
      1443552640 blocks [2/2] [UU]
      bitmap: 16/22 pages [64KB], 32768KB chunk

root@gamma2:~# mdadm -D /dev/md3
/dev/md3:
        Version : 00.90
  Creation Time : Wed Oct 13 03:13:52 2010
     Raid Level : raid1
     Array Size : 1443552640 (1376.68 GiB 1478.20 GB)
  Used Dev Size : 1443552640 (1376.68 GiB 1478.20 GB)
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 3
    Persistence : Superblock is persistent

  Intent Bitmap : Internal

    Update Time : Sat Aug 27 13:53:57 2011
          State : active
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0

           UUID : f5a9c1da:83dd4c40:d363f6aa:3cbcebe5
         Events : 0.1381014

    Number   Major   Minor   RaidDevice State
       0       8       19        0      active sync   /dev/sdb3
       1       8       35        1      active sync   /dev/sdc3



As far as I can understand, that is strange.

distrib & mdadm info:
Debian Lenny with openvz kernel from lenny-backports.

root@gamma2:~# cat /proc/version
Linux version 2.6.32-bpo.5-openvz-amd64 (Debian 2.6.32-35~bpo50+1)
(norbert@tretkowski.de) (gcc version 4.3.2 (Debian 4.3.2-1.1) ) #1 SMP
Wed Jul 20 11:23:01 UTC 2011
root@gamma2:~# mdadm --version
mdadm - v2.6.7.2 - 14th November 2008




-- 
Best regards,
[COOLCOLD-RIPN]


* Re: possible bug - bitmap dirty pages status
  2011-08-27  9:58 possible bug - bitmap dirty pages status CoolCold
@ 2011-08-31  9:05 ` CoolCold
  2011-08-31 12:30   ` Paul Clements
  0 siblings, 1 reply; 13+ messages in thread
From: CoolCold @ 2011-08-31  9:05 UTC (permalink / raw)
  To: Linux RAID

On Sat, Aug 27, 2011 at 1:58 PM, CoolCold <coolthecold@gmail.com> wrote:
> Hello!
> I have raid1 array with bitmap (md3), one disk has died, been replaced
> and array resynced:
>
> Aug 25 15:38:03 gamma2 kernel: [    5.986791] md: md3 stopped.
> Aug 25 15:38:03 gamma2 kernel: [    6.043306] raid1: raid set md3
> active with 1 out of 2 mirrors
> Aug 25 15:38:03 gamma2 kernel: [    6.044378] md3: bitmap initialized
> from disk: read 2/2 pages, set 357 bits
> Aug 25 15:38:03 gamma2 kernel: [    6.044442] created bitmap (22
> pages) for device md3
> Aug 25 15:38:03 gamma2 kernel: [    6.070492] md3: detected capacity
> change from 0 to 1478197903360
> Aug 25 15:38:03 gamma2 kernel: [    6.070862]  md3: unknown partition table
> Aug 26 19:33:33 gamma2 kernel: [100325.814179] md: md3: recovery done.
>
> Now, /proc/mdstat still shows a "dirty" bitmap, 16 of 22 pages:
>
> root@gamma2:~# cat /proc/mdstat
> Personalities : [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
> md3 : active raid1 sdc3[1] sdb3[0]
>      1443552640 blocks [2/2] [UU]
>      bitmap: 16/22 pages [64KB], 32768KB chunk
>
> root@gamma2:~# mdadm -D /dev/md3
> /dev/md3:
>        Version : 00.90
>  Creation Time : Wed Oct 13 03:13:52 2010
>     Raid Level : raid1
>     Array Size : 1443552640 (1376.68 GiB 1478.20 GB)
>  Used Dev Size : 1443552640 (1376.68 GiB 1478.20 GB)
>   Raid Devices : 2
>  Total Devices : 2
> Preferred Minor : 3
>    Persistence : Superblock is persistent
>
>  Intent Bitmap : Internal
>
>    Update Time : Sat Aug 27 13:53:57 2011
>          State : active
>  Active Devices : 2
> Working Devices : 2
>  Failed Devices : 0
>  Spare Devices : 0
>
>           UUID : f5a9c1da:83dd4c40:d363f6aa:3cbcebe5
>         Events : 0.1381014
>
>    Number   Major   Minor   RaidDevice State
>       0       8       19        0      active sync   /dev/sdb3
>       1       8       35        1      active sync   /dev/sdc3
>
>
>
> That is strange as I can understand.
>
> distrib & mdadm info:
> Debian Lenny with openvz kernel from lenny-backports.
>
> root@gamma2:~# cat /proc/version
> Linux version 2.6.32-bpo.5-openvz-amd64 (Debian 2.6.32-35~bpo50+1)
> (norbert@tretkowski.de) (gcc version 4.3.2 (Debian 4.3.2-1.1) ) #1 SMP
> Wed Jul 20 11:23:01 UTC 2011
> root@gamma2:~# mdadm --version
> mdadm - v2.6.7.2 - 14th November 2008
>
More details now:
The array still has 16/22 dirty pages:

root@gamma2:~# date;cat /proc/mdstat |head -n 5
Wed Aug 31 13:04:26 MSD 2011
Personalities : [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md3 : active raid1 sdc3[1] sdb3[0]
      1443552640 blocks [2/2] [UU]
      bitmap: 16/22 pages [64KB], 32768KB chunk


bitmap info:

root@gamma2:~# mdadm --examine-bitmap /dev/sdb3
        Filename : /dev/sdb3
           Magic : 6d746962
         Version : 4
            UUID : f5a9c1da:83dd4c40:d363f6aa:3cbcebe5
          Events : 1381014
  Events Cleared : 1381014
           State : OK
       Chunksize : 32 MB
          Daemon : 5s flush period
      Write Mode : Normal
       Sync Size : 1443552640 (1376.68 GiB 1478.20 GB)
          Bitmap : 44054 bits (chunks), 189 dirty (0.4%)
root@gamma2:~# mdadm --examine-bitmap /dev/sdc3
        Filename : /dev/sdc3
           Magic : 6d746962
         Version : 4
            UUID : f5a9c1da:83dd4c40:d363f6aa:3cbcebe5
          Events : 1381014
  Events Cleared : 1381014
           State : OK
       Chunksize : 32 MB
          Daemon : 5s flush period
      Write Mode : Normal
       Sync Size : 1443552640 (1376.68 GiB 1478.20 GB)
          Bitmap : 44054 bits (chunks), 189 dirty (0.4%)


>
> --
> Best regards,
> [COOLCOLD-RIPN]
>



-- 
Best regards,
[COOLCOLD-RIPN]


* Re: possible bug - bitmap dirty pages status
  2011-08-31  9:05 ` CoolCold
@ 2011-08-31 12:30   ` Paul Clements
  2011-08-31 12:56     ` John Robinson
  0 siblings, 1 reply; 13+ messages in thread
From: Paul Clements @ 2011-08-31 12:30 UTC (permalink / raw)
  To: CoolCold; +Cc: Linux RAID

On Wed, Aug 31, 2011 at 5:05 AM, CoolCold <coolthecold@gmail.com> wrote:

>> root@gamma2:~# cat /proc/mdstat
>> Personalities : [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
>> md3 : active raid1 sdc3[1] sdb3[0]
>>      1443552640 blocks [2/2] [UU]
>>      bitmap: 16/22 pages [64KB], 32768KB chunk

> More details now:
> Array still has 16/22 dirty pages:

> root@gamma2:~# mdadm --examine-bitmap /dev/sdc3

>       Sync Size : 1443552640 (1376.68 GiB 1478.20 GB)
>          Bitmap : 44054 bits (chunks), 189 dirty (0.4%)

But only 189 bits dirty. This means the bits are just distributed
across the disk (which is why you have 16/22 pages dirty).

Any activity on the disk? 189 bits could easily be explained by a
small amount of background disk activity on a disk that big.
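
For a rough sense of the numbers (a sketch, assuming the in-memory bitmap
packs about 2048 chunk counters into each 4 KB page, which is what 22 pages
for 44054 chunks implies):

bits=44054        # chunks tracked by the bitmap
per_page=2048     # assumed counters per 4 KB in-memory page
echo "pages: $(( (bits + per_page - 1) / per_page ))"    # prints 22
# 189 dirty bits scattered at random over 22 pages will usually touch most
# of them, so 16/22 dirty pages is consistent with only 189 dirty bits.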

--
Paul


* Re: possible bug - bitmap dirty pages status
  2011-08-31 12:30   ` Paul Clements
@ 2011-08-31 12:56     ` John Robinson
  2011-08-31 13:16       ` CoolCold
  0 siblings, 1 reply; 13+ messages in thread
From: John Robinson @ 2011-08-31 12:56 UTC (permalink / raw)
  To: Paul Clements; +Cc: CoolCold, Linux RAID

On 31/08/2011 13:30, Paul Clements wrote:
> On Wed, Aug 31, 2011 at 5:05 AM, CoolCold<coolthecold@gmail.com>  wrote:
>
>>> root@gamma2:~# cat /proc/mdstat
>>> Personalities : [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
>>> md3 : active raid1 sdc3[1] sdb3[0]
>>>       1443552640 blocks [2/2] [UU]
>>>       bitmap: 16/22 pages [64KB], 32768KB chunk
>
>> More details now:
>> Array still has 16/22 dirty pages:
>
>> root@gamma2:~# mdadm --examine-bitmap /dev/sdc3
>
>>        Sync Size : 1443552640 (1376.68 GiB 1478.20 GB)
>>           Bitmap : 44054 bits (chunks), 189 dirty (0.4%)
>
> But only 189 bits dirty. This means the bits are just distributed
> across the disk (which is why you have 16/22 pages dirty).
>
> Any activity on the disk? 189 bits could easily be explained by a
> small amount of background disk activity on a disk that big.

That makes sense to me. I have:

$ cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4] [raid1]
md1 : active raid6 sde2[4] sdd2[3] sdc2[2] sdb2[1] sda2[0]
       2929966080 blocks level 6, 512k chunk, algorithm 2 [5/5] [UUUUU]
       bitmap: 2/4 pages [8KB], 131072KB chunk

Oh no! Half my array is dirty! But then:

# mdadm --examine-bitmap /dev/sdb2
         Filename : /dev/sdb2
            Magic : 6d746962
          Version : 4
             UUID : d8c57a89:166ee722:23adec48:1574b5fc
           Events : 1338800
   Events Cleared : 1338800
            State : OK
        Chunksize : 128 MB
           Daemon : 5s flush period
       Write Mode : Normal
        Sync Size : 976655360 (931.41 GiB 1000.10 GB)
           Bitmap : 7452 bits (chunks), 7 dirty (0.1%)

Not so bad after all.

On the other hand, repeatedly checking /proc/mdstat here shows different
numbers of pages being dirty, and --examine-bitmap shows different numbers
of bits being dirty each time, whereas CoolCold saw 16 pages repeatedly and
189 dirty bits twice in a row. CoolCold, can you please run --examine-bitmap
again several times, at least 5 seconds apart?

Cheers,

John.



* Re: possible bug - bitmap dirty pages status
  2011-08-31 12:56     ` John Robinson
@ 2011-08-31 13:16       ` CoolCold
  2011-08-31 14:08         ` Paul Clements
  0 siblings, 1 reply; 13+ messages in thread
From: CoolCold @ 2011-08-31 13:16 UTC (permalink / raw)
  To: John Robinson; +Cc: Paul Clements, Linux RAID

On Wed, Aug 31, 2011 at 4:56 PM, John Robinson
<john.robinson@anonymous.org.uk> wrote:
> On 31/08/2011 13:30, Paul Clements wrote:
>>
>> On Wed, Aug 31, 2011 at 5:05 AM, CoolCold<coolthecold@gmail.com>  wrote:
>>
>>>> root@gamma2:~# cat /proc/mdstat
>>>> Personalities : [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
>>>> md3 : active raid1 sdc3[1] sdb3[0]
>>>>      1443552640 blocks [2/2] [UU]
>>>>      bitmap: 16/22 pages [64KB], 32768KB chunk
>>
>>> More details now:
>>> Array still has 16/22 dirty pages:
>>
>>> root@gamma2:~# mdadm --examine-bitmap /dev/sdc3
>>
>>>       Sync Size : 1443552640 (1376.68 GiB 1478.20 GB)
>>>          Bitmap : 44054 bits (chunks), 189 dirty (0.4%)
>>
>> But only 189 bits dirty. This means the bits are just distributed
>> across the disk (which is why you have 16/22 pages dirty).
>>
>> Any activity on the disk? 189 bits could easily be explained by a
>> small amount of background disk activity on a disk that big.
>
> That makes sense to me. I have:
>
> $ cat /proc/mdstat
> Personalities : [raid6] [raid5] [raid4] [raid1]
> md1 : active raid6 sde2[4] sdd2[3] sdc2[2] sdb2[1] sda2[0]
>      2929966080 blocks level 6, 512k chunk, algorithm 2 [5/5] [UUUUU]
>      bitmap: 2/4 pages [8KB], 131072KB chunk
>
> Oh no! Half my array is dirty! But then:
>
> # mdadm --examine-bitmap /dev/sdb2
>        Filename : /dev/sdb2
>           Magic : 6d746962
>         Version : 4
>            UUID : d8c57a89:166ee722:23adec48:1574b5fc
>          Events : 1338800
>  Events Cleared : 1338800
>           State : OK
>       Chunksize : 128 MB
>          Daemon : 5s flush period
>      Write Mode : Normal
>       Sync Size : 976655360 (931.41 GiB 1000.10 GB)
>          Bitmap : 7452 bits (chunks), 7 dirty (0.1%)
>
> Not so bad after all.
Makes sense, but...

>
> On the other hand, repeatedly checking /proc/mdstat shows different numbers
> of pages being dirty, and --examine-bitmap shows different numbers of bits
> being dirty each time, whereas CoolCold managed 16 pages repeatedly and 189
> bits being dirty twice in a row. CoolCold, please can you test
> --examine-bitmap again several times at least 5 seconds apart?

A quick test with "sleep 5" reveals:
root@gamma2:~# for i in {1..20};do mdadm --examine-bitmap
/dev/sdc3|grep "Bitmap :";sleep 5;done
          Bitmap : 44054 bits (chunks), 189 dirty (0.4%)
          Bitmap : 44054 bits (chunks), 189 dirty (0.4%)
          Bitmap : 44054 bits (chunks), 189 dirty (0.4%)
          Bitmap : 44054 bits (chunks), 189 dirty (0.4%)
          Bitmap : 44054 bits (chunks), 189 dirty (0.4%)
          Bitmap : 44054 bits (chunks), 189 dirty (0.4%)
          Bitmap : 44054 bits (chunks), 189 dirty (0.4%)
          Bitmap : 44054 bits (chunks), 189 dirty (0.4%)
          Bitmap : 44054 bits (chunks), 189 dirty (0.4%)
          Bitmap : 44054 bits (chunks), 189 dirty (0.4%)
          Bitmap : 44054 bits (chunks), 189 dirty (0.4%)
          Bitmap : 44054 bits (chunks), 189 dirty (0.4%)
          Bitmap : 44054 bits (chunks), 189 dirty (0.4%)
          Bitmap : 44054 bits (chunks), 189 dirty (0.4%)
          Bitmap : 44054 bits (chunks), 189 dirty (0.4%)
          Bitmap : 44054 bits (chunks), 189 dirty (0.4%)
          Bitmap : 44054 bits (chunks), 189 dirty (0.4%)
          Bitmap : 44054 bits (chunks), 189 dirty (0.4%)
          Bitmap : 44054 bits (chunks), 189 dirty (0.4%)
          Bitmap : 44054 bits (chunks), 189 dirty (0.4%)

And the 16/22 figure has now lasted for 4 days.

>
> Cheers,
>
> John.
>
>



-- 
Best regards,
[COOLCOLD-RIPN]


* Re: possible bug - bitmap dirty pages status
  2011-08-31 13:16       ` CoolCold
@ 2011-08-31 14:08         ` Paul Clements
  2011-08-31 20:16           ` CoolCold
  0 siblings, 1 reply; 13+ messages in thread
From: Paul Clements @ 2011-08-31 14:08 UTC (permalink / raw)
  To: CoolCold; +Cc: John Robinson, Linux RAID

On Wed, Aug 31, 2011 at 9:16 AM, CoolCold <coolthecold@gmail.com> wrote:

>          Bitmap : 44054 bits (chunks), 189 dirty (0.4%)
>
> And 16/22 lasts for 4 days.

So if you force another resync, does it change/clear up?

If you unmount/stop all activity does it change?
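
For reference, a full check pass can be started through sysfs without
stopping the array (a sketch, using md3 as above; writing "repair" instead
of "check" would also rewrite any mismatches found):

echo check > /sys/block/md3/md/sync_action
cat /proc/mdstat                           # shows the check progress
cat /sys/block/md3/md/mismatch_cnt         # read after the check finishes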

--
Paul


* Re: possible bug - bitmap dirty pages status
  2011-08-31 14:08         ` Paul Clements
@ 2011-08-31 20:16           ` CoolCold
  2011-09-01  5:40             ` NeilBrown
  0 siblings, 1 reply; 13+ messages in thread
From: CoolCold @ 2011-08-31 20:16 UTC (permalink / raw)
  To: Paul Clements; +Cc: John Robinson, Linux RAID

On Wed, Aug 31, 2011 at 6:08 PM, Paul Clements
<paul.clements@us.sios.com> wrote:
> On Wed, Aug 31, 2011 at 9:16 AM, CoolCold <coolthecold@gmail.com> wrote:
>
>>          Bitmap : 44054 bits (chunks), 189 dirty (0.4%)
>>
>> And 16/22 lasts for 4 days.
>
> So if you force another resync, does it change/clear up?
>
> If you unmount/stop all activity does it change?
Well, this server is in production now; maybe I'll be able to do an array
stop/start later. Right now I've set up a "cat /proc/mdstat" every minute
and a bitmap examine every minute, and will see later whether it changes
or not.
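
Roughly like this in root's crontab (a sketch; the log file names are just
the ones I picked):

* * * * * { date; cat /proc/mdstat; } >> /root/mdstat_log.txt
* * * * * { mdadm --examine-bitmap /dev/sdc3; date; } >> /root/component_examine.txt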

>
> --
> Paul
>



-- 
Best regards,
[COOLCOLD-RIPN]


* Re: possible bug - bitmap dirty pages status
  2011-08-31 20:16           ` CoolCold
@ 2011-09-01  5:40             ` NeilBrown
  2011-11-14 23:11               ` linbloke
  0 siblings, 1 reply; 13+ messages in thread
From: NeilBrown @ 2011-09-01  5:40 UTC (permalink / raw)
  To: CoolCold; +Cc: Paul Clements, John Robinson, Linux RAID

On Thu, 1 Sep 2011 00:16:36 +0400 CoolCold <coolthecold@gmail.com> wrote:

> On Wed, Aug 31, 2011 at 6:08 PM, Paul Clements
> <paul.clements@us.sios.com> wrote:
> > On Wed, Aug 31, 2011 at 9:16 AM, CoolCold <coolthecold@gmail.com> wrote:
> >
> >>          Bitmap : 44054 bits (chunks), 189 dirty (0.4%)
> >>
> >> And 16/22 lasts for 4 days.
> >
> > So if you force another resync, does it change/clear up?
> >
> > If you unmount/stop all activity does it change?
> Well, this server is in production now, may be i'll be able to do
> array stop/start later..right now i've set "cat /proc/mdstat" every
> minute, and bitmap examine every minute, will see later is it changing
> or not.
> 

I spent altogether too long staring at the code and I can see various things
that could be usefully tidied up, but nothing that really explains what you
have.

If there was no write activity to the array at all I can just see how the
last bits to be set might not get cleared, but as soon as another write
happened all those old bits would get cleared pretty quickly.  And it seems
unlikely that there have been no writes for over 4 days (???).

I don't think having these bits here is harmful and it would be easy to get
rid of them by using "mdadm --grow" to remove and then re-add the bitmap,
but I wish I knew what caused it...
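
Something like this (a sketch - only while the array is clean and in sync;
--bitmap-chunk can be added to keep the old 32768KB chunk size):

mdadm --grow --bitmap=none /dev/md3
mdadm --grow --bitmap=internal /dev/md3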

I'll clean up the little issues I found in mainline and hope there isn't a
larger problem lurking behind all this.

NeilBrown


* Re: possible bug - bitmap dirty pages status
  2011-09-01  5:40             ` NeilBrown
@ 2011-11-14 23:11               ` linbloke
  2011-11-16  2:30                 ` NeilBrown
       [not found]                 ` <CAGqmV7qpQBHLcJ9J9cP1zDw6kp6aLcaCMneFYEgcPOu7doXSMA@mail.gmail.com>
  0 siblings, 2 replies; 13+ messages in thread
From: linbloke @ 2011-11-14 23:11 UTC (permalink / raw)
  To: NeilBrown; +Cc: CoolCold, Paul Clements, John Robinson, Linux RAID

On 1/09/11 3:40 PM, NeilBrown wrote:
> On Thu, 1 Sep 2011 00:16:36 +0400 CoolCold<coolthecold@gmail.com>  wrote:
>
>> On Wed, Aug 31, 2011 at 6:08 PM, Paul Clements
>> <paul.clements@us.sios.com>  wrote:
>>> On Wed, Aug 31, 2011 at 9:16 AM, CoolCold<coolthecold@gmail.com>  wrote:
>>>
>>>>           Bitmap : 44054 bits (chunks), 189 dirty (0.4%)
>>>>
>>>> And 16/22 lasts for 4 days.
>>> So if you force another resync, does it change/clear up?
>>>
>>> If you unmount/stop all activity does it change?
>> Well, this server is in production now, may be i'll be able to do
>> array stop/start later..right now i've set "cat /proc/mdstat" every
>> minute, and bitmap examine every minute, will see later is it changing
>> or not.
>>
> I spent altogether too long staring at the code and I can see various things
> that could be usefully tidied up, but nothing that really explains what you
> have.
>
> If there was no write activity to the array at all I can just see how that
> last bits to be set might not get cleared, but as soon as another write
> happened all those old bits would get cleared pretty quickly.  And it seems
> unlikely that there have been no writes for over 4 days (???).
>
> I don't think having these bits here is harmful and it would be easy to get
> rid of them by using "mdadm --grow" to remove and then re-add the bitmap,
> but I wish I knew what caused it...
>
> I'll clean up the little issues I found in mainline and hope there isn't a
> larger problem lurking behind all this.
>
> NeilBrown
> --
Hello,

Sorry for bumping this thread, but I couldn't find any later resolution. I'm
seeing the same thing with SLES11 SP1. No matter how long I wait or how often
I sync(8), the number of dirty bitmap pages does not reduce to zero - 52 has
become the new zero for this array (md101). I've tried writing more data to
prod the sync - the result was an increase in the dirty page count (53/465)
and then a return to the base count (52/465) after 5 seconds. I haven't tried
removing the bitmaps, and am a little reluctant to unless it would help to
diagnose the bug.
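
For what it's worth, the prod-and-watch test was along these lines (a
sketch; /mnt/md106 stands in for the actual mountpoint of the filesystem on
md106):

dd if=/dev/zero of=/mnt/md106/bitmap-prod bs=1M count=64 conv=fsync
watch -n 5 "grep -A 2 '^md101' /proc/mdstat"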

This array is part of a nested array set, as mentioned in another mailing
list thread with the subject "Rotating RAID 1". Another thing happening with
this array is that the top array (md106), the one with the filesystem on it,
has that filesystem exported via NFS to a dozen or so other systems. There
has been no activity on this array for at least a couple of minutes.

I certainly don't feel comfortable that I have created a mirror of the 
component devices. Can I expect the devices to actually be in sync at 
this point?

Thanks,

Josh

wynyard:~ # mdadm -V
mdadm - v3.0.3 - 22nd October 2009
wynyard:~ # uname -a
Linux wynyard 2.6.32.36-0.5-xen #1 SMP 2011-04-14 10:12:31 +0200 x86_64 
x86_64 x86_64 GNU/Linux
wynyard:~ #



Info with disks A and B connected:
======================

wynyard:~ # cat /proc/mdstat
Personalities : [raid1] [raid0] [raid10] [raid6] [raid5] [raid4] [linear]
md106 : active raid1 md105[0]
       1948836134 blocks super 1.2 [2/1] [U_]
       bitmap: 465/465 pages [1860KB], 2048KB chunk

md105 : active raid1 md104[0]
       1948836270 blocks super 1.2 [2/1] [U_]
       bitmap: 465/465 pages [1860KB], 2048KB chunk

md104 : active raid1 md103[0]
       1948836406 blocks super 1.2 [2/1] [U_]
       bitmap: 465/465 pages [1860KB], 2048KB chunk

md103 : active raid1 md102[0]
       1948836542 blocks super 1.2 [2/1] [U_]
       bitmap: 465/465 pages [1860KB], 2048KB chunk

md102 : active raid1 md101[0]
       1948836678 blocks super 1.2 [2/1] [U_]
       bitmap: 465/465 pages [1860KB], 2048KB chunk

md101 : active raid1 md100[0]
       1948836814 blocks super 1.2 [2/1] [U_]
       bitmap: 465/465 pages [1860KB], 2048KB chunk

md100 : active raid1 sdm1[0] sdl1[1]
       1948836950 blocks super 1.2 [2/2] [UU]
       bitmap: 2/465 pages [8KB], 2048KB chunk

wynyard:~ # mdadm -Dvv /dev/md100
/dev/md100:
         Version : 1.02
   Creation Time : Thu Oct 27 13:38:09 2011
      Raid Level : raid1
      Array Size : 1948836950 (1858.56 GiB 1995.61 GB)
   Used Dev Size : 1948836950 (1858.56 GiB 1995.61 GB)
    Raid Devices : 2
   Total Devices : 2
     Persistence : Superblock is persistent

   Intent Bitmap : Internal

     Update Time : Mon Nov 14 16:39:56 2011
           State : active
  Active Devices : 2
Working Devices : 2
  Failed Devices : 0
   Spare Devices : 0

            Name : wynyard:h001r006  (local to host wynyard)
            UUID : 0996cae3:fc585bc5:64443402:bf1bef33
          Events : 8694

     Number   Major   Minor   RaidDevice State
        0       8      193        0      active sync   /dev/sdm1
        1       8      177        1      active sync   /dev/sdl1

wynyard:~ # mdadm -Evv /dev/sd[ml]1
/dev/sdl1:
           Magic : a92b4efc
         Version : 1.2
     Feature Map : 0x1
      Array UUID : 0996cae3:fc585bc5:64443402:bf1bef33
            Name : wynyard:h001r006  (local to host wynyard)
   Creation Time : Thu Oct 27 13:38:09 2011
      Raid Level : raid1
    Raid Devices : 2

  Avail Dev Size : 3897673900 (1858.56 GiB 1995.61 GB)
      Array Size : 3897673900 (1858.56 GiB 1995.61 GB)
     Data Offset : 272 sectors
    Super Offset : 8 sectors
           State : clean
     Device UUID : 5d5bf5ef:e17923ec:0e6e683a:e27f4470

Internal Bitmap : 8 sectors from superblock
     Update Time : Mon Nov 14 16:52:12 2011
        Checksum : 987bd49d - correct
          Events : 8694


    Device Role : Active device 1
    Array State : AA ('A' == active, '.' == missing)
/dev/sdm1:
           Magic : a92b4efc
         Version : 1.2
     Feature Map : 0x1
      Array UUID : 0996cae3:fc585bc5:64443402:bf1bef33
            Name : wynyard:h001r006  (local to host wynyard)
   Creation Time : Thu Oct 27 13:38:09 2011
      Raid Level : raid1
    Raid Devices : 2

  Avail Dev Size : 3897673900 (1858.56 GiB 1995.61 GB)
      Array Size : 3897673900 (1858.56 GiB 1995.61 GB)
     Data Offset : 272 sectors
    Super Offset : 8 sectors
           State : clean
     Device UUID : 59bc1fed:426ef5e6:cf840334:4e95eb5b

Internal Bitmap : 8 sectors from superblock
     Update Time : Mon Nov 14 16:52:12 2011
        Checksum : 75ba5626 - correct
          Events : 8694


    Device Role : Active device 0
    Array State : AA ('A' == active, '.' == missing)

Disk B was failed and removed (with mdadm, then physically). Disk C was
inserted, a partition table was written, and it was then added to the array:
======================================
Nov 14 17:08:50 wynyard kernel: [1122597.943932] raid1: Disk failure on 
sdl1, disabling device.
Nov 14 17:08:50 wynyard kernel: [1122597.943934] raid1: Operation 
continuing on 1 devices.
Nov 14 17:08:50 wynyard kernel: [1122597.989996] RAID1 conf printout:
Nov 14 17:08:50 wynyard kernel: [1122597.989999]  --- wd:1 rd:2
Nov 14 17:08:50 wynyard kernel: [1122597.990002]  disk 0, wo:0, o:1, 
dev:sdm1
Nov 14 17:08:50 wynyard kernel: [1122597.990005]  disk 1, wo:1, o:0, 
dev:sdl1
Nov 14 17:08:50 wynyard kernel: [1122598.008913] RAID1 conf printout:
Nov 14 17:08:50 wynyard kernel: [1122598.008917]  --- wd:1 rd:2
Nov 14 17:08:50 wynyard kernel: [1122598.008921]  disk 0, wo:0, o:1, 
dev:sdm1
Nov 14 17:08:50 wynyard kernel: [1122598.008949] md: unbind<sdl1>
Nov 14 17:08:50 wynyard kernel: [1122598.056909] md: export_rdev(sdl1)
Nov 14 17:09:43 wynyard kernel: [1122651.587010] 3w-9xxx: scsi6: AEN: 
WARNING (0x04:0x0019): Drive removed:port=8.
Nov 14 17:10:03 wynyard kernel: [1122671.723726] 3w-9xxx: scsi6: AEN: 
ERROR (0x04:0x001E): Unit inoperable:unit=8.
Nov 14 17:11:33 wynyard kernel: [1122761.729297] 3w-9xxx: scsi6: AEN: 
INFO (0x04:0x001A): Drive inserted:port=8.
Nov 14 17:13:44 wynyard kernel: [1122892.474990] 3w-9xxx: scsi6: AEN: 
INFO (0x04:0x001F): Unit operational:unit=8.
Nov 14 17:19:36 wynyard kernel: [1123244.535530]  sdl: unknown partition 
table
Nov 14 17:19:40 wynyard kernel: [1123248.384154]  sdl: sdl1
Nov 14 17:24:18 wynyard kernel: [1123526.292861] md: bind<sdl1>
Nov 14 17:24:19 wynyard kernel: [1123526.904213] RAID1 conf printout:
Nov 14 17:24:19 wynyard kernel: [1123526.904217]  --- wd:1 rd:2
Nov 14 17:24:19 wynyard kernel: [1123526.904221]  disk 0, wo:0, o:1, 
dev:md100
Nov 14 17:24:19 wynyard kernel: [1123526.904224]  disk 1, wo:1, o:1, 
dev:sdl1
Nov 14 17:24:19 wynyard kernel: [1123526.904362] md: recovery of RAID 
array md101
Nov 14 17:24:19 wynyard kernel: [1123526.904367] md: minimum 
_guaranteed_  speed: 1000 KB/sec/disk.
Nov 14 17:24:19 wynyard kernel: [1123526.904370] md: using maximum 
available idle IO bandwidth (but not more than 200000 KB/sec) for recovery.
Nov 14 17:24:19 wynyard kernel: [1123526.904376] md: using 128k window, 
over a total of 1948836814 blocks.
Nov 15 00:32:07 wynyard kernel: [1149195.478735] md: md101: recovery done.
Nov 15 00:32:07 wynyard kernel: [1149195.599964] RAID1 conf printout:
Nov 15 00:32:07 wynyard kernel: [1149195.599967]  --- wd:2 rd:2
Nov 15 00:32:07 wynyard kernel: [1149195.599971]  disk 0, wo:0, o:1, 
dev:md100
Nov 15 00:32:07 wynyard kernel: [1149195.599975]  disk 1, wo:0, o:1, 
dev:sdl1

Wrote data to the filesystem on md106, then left it idle:

wynyard:~ # iostat 5 /dev/md106 | grep md106
Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
md106           156.35         0.05      1249.25      54878 1473980720
md106             0.00         0.00         0.00          0          0
md106             0.00         0.00         0.00          0          0
md106             0.00         0.00         0.00          0          0
md106             0.00         0.00         0.00          0          0
md106             0.00         0.00         0.00          0          0
md106             0.00         0.00         0.00          0          0
md106             0.00         0.00         0.00          0          0
md106             0.00         0.00         0.00          0          0
md106             0.00         0.00         0.00          0          0
md106             0.00         0.00         0.00          0          0
md106             0.00         0.00         0.00          0          0
md106             0.00         0.00         0.00          0          0
md106             0.00         0.00         0.00          0          0
md106             0.00         0.00         0.00          0          0
md106             0.00         0.00         0.00          0          0
md106             0.00         0.00         0.00          0          0
md106             0.00         0.00         0.00          0          0
md106             0.00         0.00         0.00          0          0
md106             0.00         0.00         0.00          0          0
md106             0.00         0.00         0.00          0          0


Info with disks A and C connected:
======================
wynyard:~ # cat /proc/mdstat
Personalities : [raid1] [raid0] [raid10] [raid6] [raid5] [raid4] [linear]
md106 : active raid1 md105[0]
       1948836134 blocks super 1.2 [2/1] [U_]
       bitmap: 465/465 pages [1860KB], 2048KB chunk

md105 : active raid1 md104[0]
       1948836270 blocks super 1.2 [2/1] [U_]
       bitmap: 465/465 pages [1860KB], 2048KB chunk

md104 : active raid1 md103[0]
       1948836406 blocks super 1.2 [2/1] [U_]
       bitmap: 465/465 pages [1860KB], 2048KB chunk

md103 : active raid1 md102[0]
       1948836542 blocks super 1.2 [2/1] [U_]
       bitmap: 465/465 pages [1860KB], 2048KB chunk

md102 : active raid1 md101[0]
       1948836678 blocks super 1.2 [2/1] [U_]
       bitmap: 465/465 pages [1860KB], 2048KB chunk

md101 : active raid1 sdl1[2] md100[0]
       1948836814 blocks super 1.2 [2/2] [UU]
       bitmap: 52/465 pages [208KB], 2048KB chunk

md100 : active raid1 sdm1[0]
       1948836950 blocks super 1.2 [2/1] [U_]
       bitmap: 26/465 pages [104KB], 2048KB chunk


wynyard:~ # mdadm -Dvv /dev/md101
/dev/md101:
         Version : 1.02
   Creation Time : Thu Oct 27 13:39:18 2011
      Raid Level : raid1
      Array Size : 1948836814 (1858.56 GiB 1995.61 GB)
   Used Dev Size : 1948836814 (1858.56 GiB 1995.61 GB)
    Raid Devices : 2
   Total Devices : 2
     Persistence : Superblock is persistent

   Intent Bitmap : Internal

     Update Time : Tue Nov 15 09:07:25 2011
           State : active
  Active Devices : 2
Working Devices : 2
  Failed Devices : 0
   Spare Devices : 0

            Name : wynyard:h001r007  (local to host wynyard)
            UUID : 8846dfde:ab7e2902:4a37165d:c7269466
          Events : 53486

     Number   Major   Minor   RaidDevice State
        0       9      100        0      active sync   /dev/md100
        2       8      177        1      active sync   /dev/sdl1

wynyard:~ # mdadm -Evv /dev/md100 /dev/sdl1
/dev/md100:
           Magic : a92b4efc
         Version : 1.2
     Feature Map : 0x1
      Array UUID : 8846dfde:ab7e2902:4a37165d:c7269466
            Name : wynyard:h001r007  (local to host wynyard)
   Creation Time : Thu Oct 27 13:39:18 2011
      Raid Level : raid1
    Raid Devices : 2

  Avail Dev Size : 3897673628 (1858.56 GiB 1995.61 GB)
      Array Size : 3897673628 (1858.56 GiB 1995.61 GB)
     Data Offset : 272 sectors
    Super Offset : 8 sectors
           State : clean
     Device UUID : d806cfd5:d641043e:70b32b6b:082c730b

Internal Bitmap : 8 sectors from superblock
     Update Time : Tue Nov 15 09:07:48 2011
        Checksum : 628f9f77 - correct
          Events : 53486


    Device Role : Active device 0
    Array State : AA ('A' == active, '.' == missing)
/dev/sdl1:
           Magic : a92b4efc
         Version : 1.2
     Feature Map : 0x1
      Array UUID : 8846dfde:ab7e2902:4a37165d:c7269466
            Name : wynyard:h001r007  (local to host wynyard)
   Creation Time : Thu Oct 27 13:39:18 2011
      Raid Level : raid1
    Raid Devices : 2

  Avail Dev Size : 3897673900 (1858.56 GiB 1995.61 GB)
      Array Size : 3897673628 (1858.56 GiB 1995.61 GB)
   Used Dev Size : 3897673628 (1858.56 GiB 1995.61 GB)
     Data Offset : 272 sectors
    Super Offset : 8 sectors
           State : clean
     Device UUID : 4689d883:19bbaa1f:584c89fc:7fafd176

Internal Bitmap : 8 sectors from superblock
     Update Time : Tue Nov 15 09:07:48 2011
        Checksum : eefbb899 - correct
          Events : 53486


    Device Role : spare
    Array State : AA ('A' == active, '.' == missing)

wynyard:~ # mdadm -vv --examine-bitmap /dev/md100 /dev/sdl1
         Filename : /dev/md100
            Magic : 6d746962
          Version : 4
             UUID : 8846dfde:ab7e2902:4a37165d:c7269466
           Events : 53486
   Events Cleared : 0
            State : OK
        Chunksize : 2 MB
           Daemon : 5s flush period
       Write Mode : Normal
        Sync Size : 1948836814 (1858.56 GiB 1995.61 GB)
           Bitmap : 951581 bits (chunks), 29902 dirty (3.1%)
         Filename : /dev/sdl1
            Magic : 6d746962
          Version : 4
             UUID : 8846dfde:ab7e2902:4a37165d:c7269466
           Events : 53486
   Events Cleared : 0
            State : OK
        Chunksize : 2 MB
           Daemon : 5s flush period
       Write Mode : Normal
        Sync Size : 1948836814 (1858.56 GiB 1995.61 GB)
           Bitmap : 951581 bits (chunks), 29902 dirty (3.1%)




* Re: possible bug - bitmap dirty pages status
  2011-11-14 23:11               ` linbloke
@ 2011-11-16  2:30                 ` NeilBrown
  2011-11-21 21:50                   ` linbloke
       [not found]                 ` <CAGqmV7qpQBHLcJ9J9cP1zDw6kp6aLcaCMneFYEgcPOu7doXSMA@mail.gmail.com>
  1 sibling, 1 reply; 13+ messages in thread
From: NeilBrown @ 2011-11-16  2:30 UTC (permalink / raw)
  To: linbloke; +Cc: CoolCold, Paul Clements, John Robinson, Linux RAID


On Tue, 15 Nov 2011 10:11:51 +1100 linbloke <linbloke@fastmail.fm> wrote:
> Hello,
> 
> Sorry for bumping this thread but I couldn't find any resolution 
> post-dated. I'm seeing the same thing with SLES11 SP1. No matter how 
> long I wait or how often I sync(8), the number of dirty bitmap pages 
> does not reduce to zero - 52 has become the new zero for this array 
> (md101). I've tried writing more data to prod the sync  - the result was 
> an increase in the dirty page count (53/465) and then return to the base 
> count (52/465) after 5seconds. I haven't tried removing the bitmaps and 
> am a little reluctant to unless this would help to diagnose the bug.
> 
> This array is part of a nested array set as mentioned in another mail 
> list thread with the Subject: Rotating RAID 1. Another thing happening 
> with this array is that the top array (md106), the one with the 
> filesystem on it, has the file system exported via NFS to a dozen or so 
> other systems. There has been no activity on this array for at least a 
> couple of minutes.
> 
> I certainly don't feel comfortable that I have created a mirror of the 
> component devices. Can I expect the devices to actually be in sync at 
> this point?

Hi,
 thanks for the report.
 I can understand your discomfort.  Unfortunately I haven't been able to
 discover with any confidence what the problem is, so I cannot completely
 relieve that discomfort.  I have found another possible issue - a race that
 could cause md to forget that it needs to clean out a page of the bitmap.
 I could imagine that causing 1 or maybe 2 pages to be stuck, but I don't
 think it can explain 52.

 You can check whether you actually have a mirror by:
    echo check > /sys/block/md101/md/sync_action
 then wait for that to finish and check ..../mismatch_cnt.
 I'm quite confident that will report 0.  I strongly suspect the problem is
 that we forget to clear pages or bits, not that we forget to use them during
 recovery.
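
 In full, that check is something like (a sketch; the poll interval is
 arbitrary):

    echo check > /sys/block/md101/md/sync_action
    while [ "$(cat /sys/block/md101/md/sync_action)" != "idle" ]; do sleep 60; done
    cat /sys/block/md101/md/mismatch_cnt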

 So I don't think that keeping the bitmaps will help in diagnosing the
 problem.   What I need is a sequence of events that is likely to produce the
 problem, and I realise that is hard to come by.

 Sorry that I cannot be more helpful yet.

NeilBrown




* Re: possible bug - bitmap dirty pages status
       [not found]                 ` <CAGqmV7qpQBHLcJ9J9cP1zDw6kp6aLcaCMneFYEgcPOu7doXSMA@mail.gmail.com>
@ 2011-11-16  3:07                   ` NeilBrown
  2011-11-16  9:36                     ` CoolCold
  0 siblings, 1 reply; 13+ messages in thread
From: NeilBrown @ 2011-11-16  3:07 UTC (permalink / raw)
  To: CoolCold; +Cc: linbloke, Paul Clements, John Robinson, Linux RAID


On Wed, 16 Nov 2011 03:13:51 +0400 CoolCold <coolthecold@gmail.com> wrote:

> As I promised, I was collecting data, but forgot to return to that
> problem; this thread being bumped returned me to it ;)
> So, data was collected for almost a month - from 31 August to 26 September:
> root@gamma2:/root# grep -A 1 dirty component_examine.txt |head
>           Bitmap : 44054 bits (chunks), 190 dirty (0.4%)
> Wed Aug 31 17:32:16 MSD 2011
> 
> root@gamma2:/root# grep -A 1 dirty component_examine.txt |tail -n 2
>           Bitmap : 44054 bits (chunks), 1 dirty (0.0%)
> Mon Sep 26 00:28:33 MSD 2011
> 
> As I understand from that dump, it was a bitmap examination (-X option)
> of component /dev/sdc3 of RAID /dev/md3.
> The count did decrease, though only after some increase on 23 September;
> the first decrease to 0 happened on 24 September (line number 436418).
> 
> So for almost a month the dirty count was not decreasing!
> I'm attaching that log; maybe it will help somehow.

Thanks a lot.
Any idea what happened at on Fri Sep 23??
Between 6:23am and midnight the number of dirty bits dropped from 180 to 2.

This does seem to suggest that md is just losing track of some of the pages
of bits and once they are modified again md remembers to flush them and write
them out - which is a fairly safe way to fail.

The one issue I have found is that set_page_attr uses a non-atomic __set_bit
because it should always be called under a spinlock.  But bitmap_write_all()
- which is called when a spare is added - calls it without the spinlock so
that could corrupt some of the bits.

Thanks,
NeilBrown




* Re: possible bug - bitmap dirty pages status
  2011-11-16  3:07                   ` NeilBrown
@ 2011-11-16  9:36                     ` CoolCold
  0 siblings, 0 replies; 13+ messages in thread
From: CoolCold @ 2011-11-16  9:36 UTC (permalink / raw)
  To: NeilBrown; +Cc: linbloke, Paul Clements, John Robinson, Linux RAID

On Wed, Nov 16, 2011 at 7:07 AM, NeilBrown <neilb@suse.de> wrote:
> On Wed, 16 Nov 2011 03:13:51 +0400 CoolCold <coolthecold@gmail.com> wrote:
>
>> As I promised I was collecting data, but forgot to return to that
>> problem, bumping thread returned me to that state ;)
>> So, data was collected for almost the month - from 31 August to 26 September:
>> root@gamma2:/root# grep -A 1 dirty component_examine.txt |head
>>           Bitmap : 44054 bits (chunks), 190 dirty (0.4%)
>> Wed Aug 31 17:32:16 MSD 2011
>>
>> root@gamma2:/root# grep -A 1 dirty component_examine.txt |tail -n 2
>>           Bitmap : 44054 bits (chunks), 1 dirty (0.0%)
>> Mon Sep 26 00:28:33 MSD 2011
>>
>> As i can understand from that dump, it was bitmap examination (-X key)
>> of component /dev/sdc3 of raid /dev/md3.
>> Decreasing happend, though after some increase on 23 of September, and
>> first decrease to 0 happened on 24 of September (line number 436418).
>>
>> So almost for month, dirty count was no decreasing!
>> I'm attaching that log, may be it will help somehow.
>
> Thanks a lot.
> Any idea what happened at on Fri Sep 23??
> Between 6:23am and midnight the number of dirty bits dropped from 180 to 2.
I have no idea, sorry. 6:25 am is scheduled in cron for log rotation, but
6:23 has nothing specific.

But changes (the dirty increase) began to happen at 2:30 AM, which
corresponds to a cron-run script that does a data import & database
update - the database lives on that LVM-on-md array.

>
> This does seem to suggest that md is just losing track of some of the pages
> of bits and once they are modified again md remembers to flush them and write
> them out - which is a fairly safe way to fail.
>
> The one issue I have found is that set_page_attr uses a non-atomic __set_bit
> because it should always be called under a spinlock.  But bitmap_write_all()
> - which is called when a spare is added - calls it without the spinlock so
> that could corrupt some of the bits.
>
> Thanks,
> NeilBrown
>
>



-- 
Best regards,
[COOLCOLD-RIPN]


* Re: possible bug - bitmap dirty pages status
  2011-11-16  2:30                 ` NeilBrown
@ 2011-11-21 21:50                   ` linbloke
  0 siblings, 0 replies; 13+ messages in thread
From: linbloke @ 2011-11-21 21:50 UTC (permalink / raw)
  To: NeilBrown; +Cc: CoolCold, Paul Clements, John Robinson, Linux RAID

On 16/11/11 1:30 PM, NeilBrown wrote:
> On Tue, 15 Nov 2011 10:11:51 +1100 linbloke<linbloke@fastmail.fm>  wrote:
>> Hello,
>>
>> Sorry for bumping this thread but I couldn't find any resolution
>> post-dated. I'm seeing the same thing with SLES11 SP1. No matter how
>> long I wait or how often I sync(8), the number of dirty bitmap pages
>> does not reduce to zero - 52 has become the new zero for this array
>> (md101). I've tried writing more data to prod the sync  - the result was
>> an increase in the dirty page count (53/465) and then return to the base
>> count (52/465) after 5seconds. I haven't tried removing the bitmaps and
>> am a little reluctant to unless this would help to diagnose the bug.
>>
>> This array is part of a nested array set as mentioned in another mail
>> list thread with the Subject: Rotating RAID 1. Another thing happening
>> with this array is that the top array (md106), the one with the
>> filesystem on it, has the file system exported via NFS to a dozen or so
>> other systems. There has been no activity on this array for at least a
>> couple of minutes.
>>
>> I certainly don't feel comfortable that I have created a mirror of the
>> component devices. Can I expect the devices to actually be in sync at
>> this point?
> Hi,
>   thanks for the report.
>   I can understand your discomfort.  Unfortunately I haven't been able to
>   discover with any confidence what the problem is, so I cannot completely
>   relieve that discomfort.  I have found another possible issue - a race that
>   could cause md to forget that it needs to clean out a page of the bitmap.
>   I could imagine that causing 1 or maybe 2 pages to be stuck, but I don't
>   think it can explain 52.
>
>   You can check whether you actually have a mirror by:
>      echo check>  /sys/block/md101/md/sync_action
>   then wait for that to finish and check ..../mismatch_cnt.
>   I'm quite confident that will report 0.  I strongly suspect the problem is
>   that we forget to clear pages or bits, not that we forget to use them during
>   recovery.
>
>   So I don't think that keeping the bitmaps will help in diagnosing the
>   problem.   What I need is a sequence of events that is likely to produce the
>   problem, and I realise that is hard to come by.
>
>   Sorry that I cannot be more helpful yet.
>
> NeilBrown

G'day Neil,

Thanks again for looking at this. I have performed the check as suggested
and indeed we have a mirror (mismatch_cnt=0). I'm about to fail and remove
the disk sdl1 from md101 and re-add the missing disk to md100. I'll run a
check on that array once it has resynced. When I'm done with that, I'll test
creating a new array from one of the offline components and run some md5sums
on the contents to further validate data integrity. Are there any other
tests I can include to help you identify the cause of the stuck dirty
bitmap pages?
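
The planned sequence, roughly (a sketch; /dev/sdX1 stands in for the disk
that will be re-added to md100):

mdadm /dev/md101 --fail /dev/sdl1 --remove /dev/sdl1
mdadm /dev/md100 --add /dev/sdX1
# after the resync completes:
echo check > /sys/block/md100/md/sync_action
cat /sys/block/md100/md/mismatch_cnt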


Cheers,
Josh


Here is the state before and after check:


Before:
=====

wynyard:~ # cat /proc/mdstat
Personalities : [raid1] [raid0] [raid10] [raid6] [raid5] [raid4] [linear]
md106 : active raid1 md105[0]
       1948836134 blocks super 1.2 [2/1] [U_]
       bitmap: 465/465 pages [1860KB], 2048KB chunk

md105 : active raid1 md104[0]
       1948836270 blocks super 1.2 [2/1] [U_]
       bitmap: 465/465 pages [1860KB], 2048KB chunk

md104 : active raid1 md103[0]
       1948836406 blocks super 1.2 [2/1] [U_]
       bitmap: 465/465 pages [1860KB], 2048KB chunk

md103 : active raid1 md102[0]
       1948836542 blocks super 1.2 [2/1] [U_]
       bitmap: 465/465 pages [1860KB], 2048KB chunk

md102 : active raid1 md101[0]
       1948836678 blocks super 1.2 [2/1] [U_]
       bitmap: 465/465 pages [1860KB], 2048KB chunk

md101 : active raid1 sdl1[2] md100[0]
       1948836814 blocks super 1.2 [2/2] [UU]
       bitmap: 45/465 pages [180KB], 2048KB chunk

md100 : active raid1 sdm1[0]
       1948836950 blocks super 1.2 [2/1] [U_]
       bitmap: 152/465 pages [608KB], 2048KB chunk

wynyard:~ # mdadm -Evv /dev/md100 /dev/sdl1
/dev/md100:
           Magic : a92b4efc
         Version : 1.2
     Feature Map : 0x1
      Array UUID : 8846dfde:ab7e2902:4a37165d:c7269466
            Name : wynyard:h001r007  (local to host wynyard)
   Creation Time : Thu Oct 27 13:39:18 2011
      Raid Level : raid1
    Raid Devices : 2

  Avail Dev Size : 3897673628 (1858.56 GiB 1995.61 GB)
      Array Size : 3897673628 (1858.56 GiB 1995.61 GB)
     Data Offset : 272 sectors
    Super Offset : 8 sectors
           State : clean
     Device UUID : d806cfd5:d641043e:70b32b6b:082c730b

Internal Bitmap : 8 sectors from superblock
     Update Time : Mon Nov 21 13:27:05 2011
        Checksum : 6297c5ea - correct
          Events : 53660


    Device Role : Active device 0
    Array State : AA ('A' == active, '.' == missing)
/dev/sdl1:
           Magic : a92b4efc
         Version : 1.2
     Feature Map : 0x1
      Array UUID : 8846dfde:ab7e2902:4a37165d:c7269466
            Name : wynyard:h001r007  (local to host wynyard)
   Creation Time : Thu Oct 27 13:39:18 2011
      Raid Level : raid1
    Raid Devices : 2

  Avail Dev Size : 3897673900 (1858.56 GiB 1995.61 GB)
      Array Size : 3897673628 (1858.56 GiB 1995.61 GB)
   Used Dev Size : 3897673628 (1858.56 GiB 1995.61 GB)
     Data Offset : 272 sectors
    Super Offset : 8 sectors
           State : clean
     Device UUID : 4689d883:19bbaa1f:584c89fc:7fafd176

Internal Bitmap : 8 sectors from superblock
     Update Time : Mon Nov 21 13:27:05 2011
        Checksum : ef03df0c - correct
          Events : 53660


    Device Role : spare
    Array State : AA ('A' == active, '.' == missing)


wynyard:~ # mdadm -Dvv /dev/md101
/dev/md101:
         Version : 1.02
   Creation Time : Thu Oct 27 13:39:18 2011
      Raid Level : raid1
      Array Size : 1948836814 (1858.56 GiB 1995.61 GB)
   Used Dev Size : 1948836814 (1858.56 GiB 1995.61 GB)
    Raid Devices : 2
   Total Devices : 2
     Persistence : Superblock is persistent

   Intent Bitmap : Internal

     Update Time : Mon Nov 21 13:27:05 2011
           State : active
  Active Devices : 2
Working Devices : 2
  Failed Devices : 0
   Spare Devices : 0

            Name : wynyard:h001r007  (local to host wynyard)
            UUID : 8846dfde:ab7e2902:4a37165d:c7269466
          Events : 53660

     Number   Major   Minor   RaidDevice State
        0       9      100        0      active sync   /dev/md100
        2       8      177        1      active sync   /dev/sdl1

wynyard:~ # mdadm -vv --examine-bitmap /dev/md100 /dev/sdl1
         Filename : /dev/md100
            Magic : 6d746962
          Version : 4
             UUID : 8846dfde:ab7e2902:4a37165d:c7269466
           Events : 53660
   Events Cleared : 53660
            State : OK
        Chunksize : 2 MB
           Daemon : 5s flush period
       Write Mode : Normal
        Sync Size : 1948836814 (1858.56 GiB 1995.61 GB)
           Bitmap : 951581 bits (chunks), 23276 dirty (2.4%)
         Filename : /dev/sdl1
            Magic : 6d746962
          Version : 4
             UUID : 8846dfde:ab7e2902:4a37165d:c7269466
           Events : 53660
   Events Cleared : 53660
            State : OK
        Chunksize : 2 MB
           Daemon : 5s flush period
       Write Mode : Normal
        Sync Size : 1948836814 (1858.56 GiB 1995.61 GB)
           Bitmap : 951581 bits (chunks), 23276 dirty (2.4%)

wynyard:~ # cat /sys/block/md101/md/sync_{tabtab}
sync_action          sync_force_parallel  sync_min             sync_speed_max
sync_completed       sync_max             sync_speed           sync_speed_min
wynyard:~ # cat /sys/block/md101/md/sync_*
idle
none
0
max
0
none
200000 (system)
50000 (system)

wynyard:~ # echo check > /sys/block/md101/md/sync_action



After:
====

wynyard:~ # cat /proc/mdstat
Personalities : [raid1] [raid0] [raid10] [raid6] [raid5] [raid4] [linear]
md106 : active raid1 md105[0]
       1948836134 blocks super 1.2 [2/1] [U_]
       bitmap: 465/465 pages [1860KB], 2048KB chunk

md105 : active raid1 md104[0]
       1948836270 blocks super 1.2 [2/1] [U_]
       bitmap: 465/465 pages [1860KB], 2048KB chunk

md104 : active raid1 md103[0]
       1948836406 blocks super 1.2 [2/1] [U_]
       bitmap: 465/465 pages [1860KB], 2048KB chunk

md103 : active raid1 md102[0]
       1948836542 blocks super 1.2 [2/1] [U_]
       bitmap: 465/465 pages [1860KB], 2048KB chunk

md102 : active raid1 md101[0]
       1948836678 blocks super 1.2 [2/1] [U_]
       bitmap: 465/465 pages [1860KB], 2048KB chunk

md101 : active raid1 sdl1[2] md100[0]
       1948836814 blocks super 1.2 [2/2] [UU]
       bitmap: 22/465 pages [88KB], 2048KB chunk

md100 : active raid1 sdm1[0]
       1948836950 blocks super 1.2 [2/1] [U_]
       bitmap: 152/465 pages [608KB], 2048KB chunk

unused devices:<none>


wynyard:~ # cat /sys/block/md101/md/mismatch_cnt
0


wynyard:~ # mdadm -vv --examine-bitmap /dev/md100 /dev/sdl1
         Filename : /dev/md100
            Magic : 6d746962
          Version : 4
             UUID : 8846dfde:ab7e2902:4a37165d:c7269466
           Events : 60976
   Events Cleared : 53660
            State : OK
        Chunksize : 2 MB
           Daemon : 5s flush period
       Write Mode : Normal
        Sync Size : 1948836814 (1858.56 GiB 1995.61 GB)
           Bitmap : 951581 bits (chunks), 10527 dirty (1.1%)
         Filename : /dev/sdl1
            Magic : 6d746962
          Version : 4
             UUID : 8846dfde:ab7e2902:4a37165d:c7269466
           Events : 60976
   Events Cleared : 53660
            State : OK
        Chunksize : 2 MB
           Daemon : 5s flush period
       Write Mode : Normal
        Sync Size : 1948836814 (1858.56 GiB 1995.61 GB)
           Bitmap : 951581 bits (chunks), 10527 dirty (1.1%)
wynyard:~ # mdadm -Evv /dev/md100 /dev/sdl1
/dev/md100:
           Magic : a92b4efc
         Version : 1.2
     Feature Map : 0x1
      Array UUID : 8846dfde:ab7e2902:4a37165d:c7269466
            Name : wynyard:h001r007  (local to host wynyard)
   Creation Time : Thu Oct 27 13:39:18 2011
      Raid Level : raid1
    Raid Devices : 2

  Avail Dev Size : 3897673628 (1858.56 GiB 1995.61 GB)
      Array Size : 3897673628 (1858.56 GiB 1995.61 GB)
     Data Offset : 272 sectors
    Super Offset : 8 sectors
           State : clean
     Device UUID : d806cfd5:d641043e:70b32b6b:082c730b

Internal Bitmap : 8 sectors from superblock
     Update Time : Mon Nov 21 18:57:13 2011
        Checksum : 62982fde - correct
          Events : 60976


    Device Role : Active device 0
    Array State : AA ('A' == active, '.' == missing)
/dev/sdl1:
           Magic : a92b4efc
         Version : 1.2
     Feature Map : 0x1
      Array UUID : 8846dfde:ab7e2902:4a37165d:c7269466
            Name : wynyard:h001r007  (local to host wynyard)
   Creation Time : Thu Oct 27 13:39:18 2011
      Raid Level : raid1
    Raid Devices : 2

  Avail Dev Size : 3897673900 (1858.56 GiB 1995.61 GB)
      Array Size : 3897673628 (1858.56 GiB 1995.61 GB)
   Used Dev Size : 3897673628 (1858.56 GiB 1995.61 GB)
     Data Offset : 272 sectors
    Super Offset : 8 sectors
           State : clean
     Device UUID : 4689d883:19bbaa1f:584c89fc:7fafd176

Internal Bitmap : 8 sectors from superblock
     Update Time : Mon Nov 21 18:57:13 2011
        Checksum : ef044900 - correct
          Events : 60976


    Device Role : spare
    Array State : AA ('A' == active, '.' == missing)
wynyard:~ # mdadm -Dvv /dev/md101
/dev/md101:
         Version : 1.02
   Creation Time : Thu Oct 27 13:39:18 2011
      Raid Level : raid1
      Array Size : 1948836814 (1858.56 GiB 1995.61 GB)
   Used Dev Size : 1948836814 (1858.56 GiB 1995.61 GB)
    Raid Devices : 2
   Total Devices : 2
     Persistence : Superblock is persistent

   Intent Bitmap : Internal

     Update Time : Mon Nov 21 18:57:13 2011
           State : active
  Active Devices : 2
Working Devices : 2
  Failed Devices : 0
   Spare Devices : 0

            Name : wynyard:h001r007  (local to host wynyard)
            UUID : 8846dfde:ab7e2902:4a37165d:c7269466
          Events : 60976

     Number   Major   Minor   RaidDevice State
        0       9      100        0      active sync   /dev/md100
        2       8      177        1      active sync   /dev/sdl1










