* mdadm 2.6.x regression, fails creation of raid1 w/ v1.0 sb and internal bitmap
@ 2007-10-17 15:37 Mike Snitzer
From: Mike Snitzer @ 2007-10-17 15:37 UTC
To: Neil Brown; +Cc: linux-raid

mdadm 2.4.1 through 2.5.6 work. mdadm-2.6's "Improve allocation and
use of space for bitmaps in version1 metadata"
(199171a297a87d7696b6b8c07ee520363f4603c1) would seem to be the
offending change. Using 1.2 metadata works.
I get the following using the tip of the mdadm git repo or any other
version of mdadm 2.6.x:
# mdadm --create /dev/md2 --run -l 1 --metadata=1.0 --bitmap=internal
-n 2 /dev/sdf --write-mostly /dev/nbd2
mdadm: /dev/sdf appears to be part of a raid array:
level=raid1 devices=2 ctime=Wed Oct 17 10:17:31 2007
mdadm: /dev/nbd2 appears to be part of a raid array:
level=raid1 devices=2 ctime=Wed Oct 17 10:17:31 2007
mdadm: RUN_ARRAY failed: Input/output error
mdadm: stopped /dev/md2
kernel log shows:
md2: bitmap initialized from disk: read 22/22 pages, set 715290 bits, status: 0
created bitmap (350 pages) for device md2
md2: failed to create bitmap (-5)
md: pers->run() failed ...
md: md2 stopped.
md: unbind<nbd2>
md: export_rdev(nbd2)
md: unbind<sdf>
md: export_rdev(sdf)
md: md2 stopped.

* Re: mdadm 2.6.x regression, fails creation of raid1 w/ v1.0 sb and internal bitmap
@ 2007-10-17 20:54 Bill Davidsen
From: Bill Davidsen @ 2007-10-17 20:54 UTC
To: Mike Snitzer; +Cc: Neil Brown, linux-raid

Mike Snitzer wrote:
> mdadm 2.4.1 through 2.5.6 work. mdadm-2.6's "Improve allocation and
> use of space for bitmaps in version1 metadata"
> (199171a297a87d7696b6b8c07ee520363f4603c1) would seem to be the
> offending change. Using 1.2 metadata works.
>
> I get the following using the tip of the mdadm git repo or any other
> version of mdadm 2.6.x:
>
> # mdadm --create /dev/md2 --run -l 1 --metadata=1.0 --bitmap=internal
> -n 2 /dev/sdf --write-mostly /dev/nbd2
> mdadm: /dev/sdf appears to be part of a raid array:
>     level=raid1 devices=2 ctime=Wed Oct 17 10:17:31 2007
> mdadm: /dev/nbd2 appears to be part of a raid array:
>     level=raid1 devices=2 ctime=Wed Oct 17 10:17:31 2007
> mdadm: RUN_ARRAY failed: Input/output error
> mdadm: stopped /dev/md2
>
> kernel log shows:
> md2: bitmap initialized from disk: read 22/22 pages, set 715290 bits, status: 0
> created bitmap (350 pages) for device md2
> md2: failed to create bitmap (-5)
> md: pers->run() failed ...
> md: md2 stopped.
> md: unbind<nbd2>
> md: export_rdev(nbd2)
> md: unbind<sdf>
> md: export_rdev(sdf)
> md: md2 stopped.

I would start by retrying with an external bitmap, to see if for some
reason there isn't room for the bitmap. If that fails, perhaps no bitmap
at all would be a useful data point. Was the original metadata the same
version? Things moved depending on the exact version, and some
--zero-superblock magic might be needed. Hopefully Neil can clarify; I'm
just telling you what I suspect the problem is, and maybe a
non-destructive solution.

-- 
bill davidsen <davidsen@tmr.com>
  CTO TMR Associates, Inc
  Doing interesting things with small computers since 1979

* Re: mdadm 2.6.x regression, fails creation of raid1 w/ v1.0 sb and internal bitmap
@ 2007-10-17 21:21 Mike Snitzer
From: Mike Snitzer @ 2007-10-17 21:21 UTC
To: Bill Davidsen; +Cc: Neil Brown, linux-raid

On 10/17/07, Bill Davidsen <davidsen@tmr.com> wrote:
> Mike Snitzer wrote:
> > mdadm 2.4.1 through 2.5.6 work. mdadm-2.6's "Improve allocation and
> > use of space for bitmaps in version1 metadata"
> > (199171a297a87d7696b6b8c07ee520363f4603c1) would seem to be the
> > offending change. Using 1.2 metadata works.
> >
> > [... create command and kernel log trimmed ...]
>
> I would start by retrying with an external bitmap, to see if for some
> reason there isn't room for the bitmap. If that fails, perhaps no bitmap
> at all would be a useful data point. Was the original metadata the same
> version? Things moved depending on the exact version, and some
> --zero-superblock magic might be needed. Hopefully Neil can clarify; I'm
> just telling you what I suspect the problem is, and maybe a
> non-destructive solution.

Creating with an external bitmap works perfectly fine, as does
creating without a bitmap. --zero-superblock hasn't helped. Metadata
v1.1 and v1.2 work with an internal bitmap. I'd like to use v1.0
with an internal bitmap (using an external bitmap isn't an option for
me).

It does appear that the changes to super1.c aren't leaving adequate
room for the bitmap. Looking at the relevant diff, for v1.0 metadata
the newer super1.c code uses a larger bitmap (128K) for
devices > 200GB. My block device is 700GB. So could the larger block
device explain why others haven't noticed this?

Mike

* Re: mdadm 2.6.x regression, fails creation of raid1 w/ v1.0 sb and internal bitmap
@ 2007-10-18 12:43 Bill Davidsen
From: Bill Davidsen @ 2007-10-18 12:43 UTC
To: Mike Snitzer; +Cc: Neil Brown, linux-raid

Mike Snitzer wrote:
> On 10/17/07, Bill Davidsen <davidsen@tmr.com> wrote:
>
>> [... earlier diagnostic suggestions trimmed ...]
>
> Creating with an external bitmap works perfectly fine, as does
> creating without a bitmap. --zero-superblock hasn't helped. Metadata
> v1.1 and v1.2 work with an internal bitmap. I'd like to use v1.0
> with an internal bitmap (using an external bitmap isn't an option for
> me).
>

Unless there's a substantial benefit from using the 1.0 format, you
might want to go with something which works. I would suggest using
--bitmap-chunk, but the man page claims it doesn't apply to internal
bitmaps. It also claims the bitmap size is chosen automatically to best
use available space, but "doesn't work" seems an exception to "best
use." ;-)

> It does appear that the changes to super1.c aren't leaving adequate
> room for the bitmap. Looking at the relevant diff, for v1.0 metadata
> the newer super1.c code uses a larger bitmap (128K) for
> devices > 200GB. My block device is 700GB. So could the larger block
> device explain why others haven't noticed this?
>

Could be, although I have arrays larger than that and haven't been
bitten. Then again, none of mine use the 1.0 metadata format as I
recall. Perhaps Neil will explain his thinking on these formats; I can
see 1.0 and 1.1, but the 1.2 format uses the 4k offset to no obvious
benefit.

In any case, you have a workable solution, so you can move on if time
is an issue, or wait for Neil to comment on this. I didn't see an
obvious problem with the 1.0 code WRT bitmaps, but I looked quickly,
and at a 2.6.3 source I had handy.

-- 
bill davidsen <davidsen@tmr.com>
  CTO TMR Associates, Inc
  Doing interesting things with small computers since 1979

* Re: mdadm 2.6.x regression, fails creation of raid1 w/ v1.0 sb and internal bitmap
@ 2007-10-19 2:41 Neil Brown
From: Neil Brown @ 2007-10-19 2:41 UTC
To: Bill Davidsen; +Cc: Mike Snitzer, linux-raid

On Thursday October 18, davidsen@tmr.com wrote:
>
> Unless there's a substantial benefit from using the 1.0 format, you
> might want to go with something which works.

I am ever grateful to the many people who do not just find
workarounds, but report problems and persist until they get fixed.

Obviously everyone should choose the path that suits them best, but
I'd rather we didn't discourage people who are working to get a bug
fixed.

> I would suggest using
> --bitmap-chunk, but the man page claims it doesn't apply to internal
> bitmaps. It also claims the bitmap size is chosen automatically to best
> use available space, but "doesn't work" seems an exception to "best
> use." ;-)

In this case, it does work, and it is best use. It just appears that
the 'available space' is not truly available - read error.

NeilBrown

* Re: mdadm 2.6.x regression, fails creation of raid1 w/ v1.0 sb and internal bitmap
@ 2007-10-19 3:45 Bill Davidsen
From: Bill Davidsen @ 2007-10-19 3:45 UTC
To: Neil Brown; +Cc: Mike Snitzer, linux-raid

Neil Brown wrote:
> On Thursday October 18, davidsen@tmr.com wrote:
>
>> Unless there's a substantial benefit from using the 1.0 format, you
>> might want to go with something which works.
>
> I am ever grateful to the many people who do not just find
> workarounds, but report problems and persist until they get fixed.
>
> Obviously everyone should choose the path that suits them best, but
> I'd rather we didn't discourage people who are working to get a bug
> fixed.

Giving them a way to get working and copying you directly seems to be a
reasonable compromise between ignoring the problem and leaving the
original poster to think software raid is neither reliable nor
supported. I have never seen any guidance on when the 1.0 or 1.2 format
is better than 1.1; perhaps that can go on the documentation queue.

>> I would suggest using
>> --bitmap-chunk, but the man page claims it doesn't apply to internal
>> bitmaps. It also claims the bitmap size is chosen automatically to best
>> use available space, but "doesn't work" seems an exception to "best
>> use." ;-)
>
> In this case, it does work, and it is best use. It just appears that
> the 'available space' is not truly available - read error.

It's not always clear without dmesg whether a read error is hardware
:-( I certainly assumed it was config or software in nature.

-- 
bill davidsen <davidsen@tmr.com>
  CTO TMR Associates, Inc
  Doing interesting things with small computers since 1979

* Re: mdadm 2.6.x regression, fails creation of raid1 w/ v1.0 sb and internal bitmap
@ 2007-10-18 5:53 Neil Brown
From: Neil Brown @ 2007-10-18 5:53 UTC
To: Mike Snitzer; +Cc: linux-raid

On Wednesday October 17, snitzer@gmail.com wrote:
> mdadm 2.4.1 through 2.5.6 work. mdadm-2.6's "Improve allocation and
> use of space for bitmaps in version1 metadata"
> (199171a297a87d7696b6b8c07ee520363f4603c1) would seem to be the
> offending change. Using 1.2 metadata works.
>
> [... create command trimmed ...]
>
> kernel log shows:
> md2: bitmap initialized from disk: read 22/22 pages, set 715290 bits, status: 0
> created bitmap (350 pages) for device md2
> md2: failed to create bitmap (-5)

Could you please tell me the exact size of your device? Then I should
be able to reproduce it and test a fix.

(It works for a 734003201K device).

Thanks,
NeilBrown

* Re: mdadm 2.6.x regression, fails creation of raid1 w/ v1.0 sb and internal bitmap
@ 2007-10-18 12:10 Mike Snitzer
From: Mike Snitzer @ 2007-10-18 12:10 UTC
To: Neil Brown; +Cc: linux-raid

On 10/18/07, Neil Brown <neilb@suse.de> wrote:
> On Wednesday October 17, snitzer@gmail.com wrote:
> > [...]
> > kernel log shows:
> > md2: bitmap initialized from disk: read 22/22 pages, set 715290 bits, status: 0
> > created bitmap (350 pages) for device md2
> > md2: failed to create bitmap (-5)
>
> Could you please tell me the exact size of your device? Then I should
> be able to reproduce it and test a fix.
>
> (It works for a 734003201K device).

732456960K. It is fairly surprising that such a relatively small
difference in size would prevent it from working...

regards,
Mike
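
For reference, the bitmap arithmetic that member size implies -- a short
userspace sketch, not mdadm code, assuming the 1MB bitmap chunk that the
kernel log's "715290 bits" suggests:

#include <stdio.h>

int main(void)
{
	unsigned long long member_kib = 732456960ULL; /* size reported above */
	unsigned long long chunk_kib  = 1024;         /* assumed 1MB bitmap chunk */

	unsigned long long bits  = (member_kib + chunk_kib - 1) / chunk_kib;
	unsigned long long bytes = bits / 8 + 256;        /* + 256-byte bitmap sb */
	unsigned long long pages = (bytes + 4095) / 4096; /* 4K pages */

	printf("%llu bits, %llu bytes, %llu pages\n", bits, bytes, pages);
	/* Prints: 715290 bits, 89667 bytes, 22 pages -- matching the
	 * "read 22/22 pages, set 715290 bits" kernel log line, and needing
	 * ~88K, i.e. the larger space mdadm 2.6 reserves for big members. */
	return 0;
}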

* Re: mdadm 2.6.x regression, fails creation of raid1 w/ v1.0 sb and internal bitmap
@ 2007-10-19 2:38 Neil Brown
From: Neil Brown @ 2007-10-19 2:38 UTC
To: Mike Snitzer; +Cc: linux-raid

Sorry, I wasn't paying close enough attention and missed the obvious.
.....

On Thursday October 18, snitzer@gmail.com wrote:
> On 10/18/07, Neil Brown <neilb@suse.de> wrote:
> > On Wednesday October 17, snitzer@gmail.com wrote:
> > > [...]
> > > # mdadm --create /dev/md2 --run -l 1 --metadata=1.0 --bitmap=internal
> > > -n 2 /dev/sdf --write-mostly /dev/nbd2
> > > mdadm: /dev/sdf appears to be part of a raid array:
> > >     level=raid1 devices=2 ctime=Wed Oct 17 10:17:31 2007
> > > mdadm: /dev/nbd2 appears to be part of a raid array:
> > >     level=raid1 devices=2 ctime=Wed Oct 17 10:17:31 2007
> > > mdadm: RUN_ARRAY failed: Input/output error
                               ^^^^^^^^^^^^^^^^^^

This means there was an IO error, i.e. there is a block on the device
that cannot be read from.
It worked with earlier versions of mdadm because they used a much
smaller bitmap. With the patch you mention in place, mdadm tries
harder to find a good location and a good size for the bitmap, and to
make sure that space is available.
The important fact is that the bitmap ends up at a different
location.

You have a bad block at that location, it would seem.

I would have expected an error in the kernel logs about the read error
though - that is strange.

What do
   mdadm -E
and
   mdadm -X

on each device say?

> > > mdadm: stopped /dev/md2
> > >
> > > kernel log shows:
> > > md2: bitmap initialized from disk: read 22/22 pages, set 715290 bits, status: 0
> > > created bitmap (350 pages) for device md2
> > > md2: failed to create bitmap (-5)
> >
> > Could you please tell me the exact size of your device? Then I should
> > be able to reproduce it and test a fix.
> >
> > (It works for a 734003201K device).
>
> 732456960K. It is fairly surprising that such a relatively small
> difference in size would prevent it from working...

There was a case once where the calculation was wrong, and rounding
sometimes left enough space and sometimes didn't. That is why I
wanted to know the exact size. It turns out it wasn't relevant in
this case.

NeilBrown

* Re: mdadm 2.6.x regression, fails creation of raid1 w/ v1.0 sb and internal bitmap
@ 2007-10-19 4:52 Mike Snitzer
From: Mike Snitzer @ 2007-10-19 4:52 UTC
To: Neil Brown; +Cc: linux-raid

On 10/18/07, Neil Brown <neilb@suse.de> wrote:
>
> Sorry, I wasn't paying close enough attention and missed the obvious.
> .....
>
> [...]
>
> This means there was an IO error, i.e. there is a block on the device
> that cannot be read from.
> It worked with earlier versions of mdadm because they used a much
> smaller bitmap. With the patch you mention in place, mdadm tries
> harder to find a good location and a good size for the bitmap, and to
> make sure that space is available.
> The important fact is that the bitmap ends up at a different
> location.
>
> You have a bad block at that location, it would seem.

I'm a bit skeptical of that being the case, considering I get this
error on _any_ pair of disks I try, in an environment where I'm
mirroring across servers that each have access to 8 of these disks.
Each of the 8 mirrors consists of a local member and a remote (nbd)
member. I can't see all 16 disks having the very same bad block(s) at
the end of the disk ;)

It feels to me like the calculation that you're making isn't leaving
adequate room for the 128K bitmap without hitting the superblock...
but I don't have hard proof yet ;)

> I would have expected an error in the kernel logs about the read error
> though - that is strange.

What about the "md2: failed to create bitmap (-5)"?

> What do
>    mdadm -E
> and
>    mdadm -X
>
> on each device say?

# mdadm -E /dev/sdf
/dev/sdf:
          Magic : a92b4efc
        Version : 1.0
    Feature Map : 0x1
     Array UUID : caabb900:616bfc5a:03763b95:83ea99a7
           Name : 2
  Creation Time : Fri Oct 19 00:38:45 2007
     Raid Level : raid1
   Raid Devices : 2
  Used Dev Size : 1464913648 (698.53 GiB 750.04 GB)
     Array Size : 1464913648 (698.53 GiB 750.04 GB)
   Super Offset : 1464913904 sectors
          State : clean
    Device UUID : 978cdd42:abaa82a1:4ad79285:1b56ed86
Internal Bitmap : -176 sectors from superblock
    Update Time : Fri Oct 19 00:38:45 2007
       Checksum : c6bb03db - correct
         Events : 0
     Array Slot : 0 (0, 1)
    Array State : Uu

# mdadm -E /dev/nbd2
/dev/nbd2:
          Magic : a92b4efc
        Version : 1.0
    Feature Map : 0x1
     Array UUID : caabb900:616bfc5a:03763b95:83ea99a7
           Name : 2
  Creation Time : Fri Oct 19 00:38:45 2007
     Raid Level : raid1
   Raid Devices : 2
  Used Dev Size : 1464913648 (698.53 GiB 750.04 GB)
     Array Size : 1464913648 (698.53 GiB 750.04 GB)
   Super Offset : 1464913904 sectors
          State : clean
    Device UUID : 180209d2:cff9b5d0:05054d19:2e4930f2
Internal Bitmap : -176 sectors from superblock
          Flags : write-mostly
    Update Time : Fri Oct 19 00:38:45 2007
       Checksum : 8416e951 - correct
         Events : 0
     Array Slot : 1 (0, 1)
    Array State : uU

# mdadm -X /dev/sdf
        Filename : /dev/sdf
           Magic : 6d746962
         Version : 4
            UUID : caabb900:616bfc5a:03763b95:83ea99a7
          Events : 0
  Events Cleared : 0
           State : OK
       Chunksize : 1 MB
          Daemon : 5s flush period
      Write Mode : Normal
       Sync Size : 732456824 (698.53 GiB 750.04 GB)
          Bitmap : 715290 bits (chunks), 715290 dirty (100.0%)

# mdadm -X /dev/nbd2
        Filename : /dev/nbd2
           Magic : 6d746962
         Version : 4
            UUID : caabb900:616bfc5a:03763b95:83ea99a7
          Events : 0
  Events Cleared : 0
           State : OK
       Chunksize : 1 MB
          Daemon : 5s flush period
      Write Mode : Normal
       Sync Size : 732456824 (698.53 GiB 750.04 GB)
          Bitmap : 715290 bits (chunks), 715290 dirty (100.0%)

> > > > mdadm: stopped /dev/md2
> > > >
> > > > kernel log shows:
> > > > md2: bitmap initialized from disk: read 22/22 pages, set 715290 bits, status: 0
> > > > created bitmap (350 pages) for device md2
> > > > md2: failed to create bitmap (-5)

I assumed that the RUN_ARRAY failure (via IO error) was a side-effect
of MD's inability to create the bitmap (-5):

md2: bitmap initialized from disk: read 22/22 pages, set 715290 bits, status: 0
created bitmap (350 pages) for device md2
md2: failed to create bitmap (-5)
md: pers->run() failed ...
md: md2 stopped.

So you're saying one has nothing to do with the other?

regards,
Mike
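
A quick sanity check of the layout those -E numbers imply -- a sketch
using only the values quoted above, not mdadm code:

#include <stdio.h>

int main(void)
{
	long long data_end  = 1464913648; /* Used Dev Size, in sectors */
	long long sb_offset = 1464913904; /* Super Offset, in sectors  */
	long long bmap_off  = -176;       /* bitmap start, relative to sb */

	long long bmap_start   = sb_offset + bmap_off;
	long long bmap_sectors = (715290 / 8 + 256 + 511) / 512; /* bits + header */

	printf("bitmap needs %lld sectors, has %lld before the superblock\n",
	       bmap_sectors, -bmap_off);          /* 176 vs 176 */
	printf("gap between data end and bitmap start: %lld sectors\n",
	       bmap_start - data_end);            /* 80 sectors */
	/* The bitmap fits without touching data or superblock, which backs
	 * the suspicion that the failure is in the overlap *check* rather
	 * than an actual shortage of space. */
	return 0;
}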

* Re: mdadm 2.6.x regression, fails creation of raid1 w/ v1.0 sb and internal bitmap
@ 2007-10-19 5:15 Mike Snitzer
From: Mike Snitzer @ 2007-10-19 5:15 UTC
To: Neil Brown; +Cc: linux-raid

On 10/19/07, Mike Snitzer <snitzer@gmail.com> wrote:
> On 10/18/07, Neil Brown <neilb@suse.de> wrote:
> > [...]
> >
> > You have a bad block at that location, it would seem.
>
> I'm a bit skeptical of that being the case, considering I get this
> error on _any_ pair of disks I try, in an environment where I'm
> mirroring across servers that each have access to 8 of these disks.
> Each of the 8 mirrors consists of a local member and a remote (nbd)
> member. I can't see all 16 disks having the very same bad block(s) at
> the end of the disk ;)
>
> It feels to me like the calculation that you're making isn't leaving
> adequate room for the 128K bitmap without hitting the superblock...
> but I don't have hard proof yet ;)

To further test this I used 2 local sparse 732456960K loopback devices
and attempted to create the raid1 in the same manner. It failed in
exactly the same way. This should cast further doubt on the bad block
theory, no?

I'm using a stock 2.6.19.7 that I then backported various MD fixes to
from 2.6.20 -> 2.6.23... this kernel has worked great until I
attempted a v1.0 sb w/ bitmap=internal using mdadm 2.6.x.

But would you like me to try a stock 2.6.22 or 2.6.23 kernel?

Mike

* Re: mdadm 2.6.x regression, fails creation of raid1 w/ v1.0 sb and internal bitmap
@ 2007-10-19 5:51 Neil Brown
From: Neil Brown @ 2007-10-19 5:51 UTC
To: Mike Snitzer; +Cc: linux-raid

On Friday October 19, snitzer@gmail.com wrote:
>
> To further test this I used 2 local sparse 732456960K loopback devices
> and attempted to create the raid1 in the same manner. It failed in
> exactly the same way. This should cast further doubt on the bad block
> theory, no?

Yes :-)

> I'm using a stock 2.6.19.7 that I then backported various MD fixes to
> from 2.6.20 -> 2.6.23... this kernel has worked great until I
> attempted a v1.0 sb w/ bitmap=internal using mdadm 2.6.x.
>
> But would you like me to try a stock 2.6.22 or 2.6.23 kernel?

Yes please.

I'm suspecting the code in write_sb_page where it tests whether the
bitmap overlaps the data or metadata. The only way I can see you
getting the exact error that you do get is for that test to fail.
That test was introduced in 2.6.22. Did you backport that? Any
chance it got mucked up a bit?

I did the loopback test on current -mm and it works fine.

NeilBrown

* Re: mdadm 2.6.x regression, fails creation of raid1 w/ v1.0 sb and internal bitmap
@ 2007-10-19 23:18 Mike Snitzer
From: Mike Snitzer @ 2007-10-19 23:18 UTC
To: Neil Brown; +Cc: linux-raid

On 10/19/07, Neil Brown <neilb@suse.de> wrote:
> On Friday October 19, snitzer@gmail.com wrote:
> > I'm using a stock 2.6.19.7 that I then backported various MD fixes to
> > from 2.6.20 -> 2.6.23... this kernel has worked great until I
> > attempted a v1.0 sb w/ bitmap=internal using mdadm 2.6.x.
> >
> > But would you like me to try a stock 2.6.22 or 2.6.23 kernel?
>
> Yes please.
> I'm suspecting the code in write_sb_page where it tests whether the
> bitmap overlaps the data or metadata. The only way I can see you
> getting the exact error that you do get is for that test to fail.
> That test was introduced in 2.6.22. Did you backport that? Any
> chance it got mucked up a bit?

I believe you're referring to commit
f0d76d70bc77b9b11256a3a23e98e80878be1578. That change actually made
it into 2.6.23 AFAIK, but yes, I actually did backport that fix (which
depended on ab6085c795a71b6a21afe7469d30a365338add7a).

If I back out f0d76d70bc77b9b11256a3a23e98e80878be1578 I can create a
raid1 w/ v1.0 sb and an internal bitmap. But clearly that is just
because I removed the negative checks that you introduced ;)

For me this begs the question: what else would
f0d76d70bc77b9b11256a3a23e98e80878be1578 depend on that I missed? I
included 505fa2c4a2f125a70951926dfb22b9cf273994f1 and
ab6085c795a71b6a21afe7469d30a365338add7a too.

*shrug*...

Mike

* Re: mdadm 2.6.x regression, fails creation of raid1 w/ v1.0 sb and internal bitmap
@ 2007-10-22 6:55 Neil Brown
From: Neil Brown @ 2007-10-22 6:55 UTC
To: Mike Snitzer; +Cc: linux-raid

On Friday October 19, snitzer@gmail.com wrote:
> [...]
>
> If I back out f0d76d70bc77b9b11256a3a23e98e80878be1578 I can create a
> raid1 w/ v1.0 sb and an internal bitmap. But clearly that is just
> because I removed the negative checks that you introduced ;)
>
> For me this begs the question: what else would
> f0d76d70bc77b9b11256a3a23e98e80878be1578 depend on that I missed? I
> included 505fa2c4a2f125a70951926dfb22b9cf273994f1 and
> ab6085c795a71b6a21afe7469d30a365338add7a too.
>
> *shrug*...

This is all very odd...

I definitely tested this last week and couldn't reproduce the
problem. This week I can reproduce it easily. And given the nature
of the bug, I cannot see how it ever worked.

Anyway, here is a fix that works for me.

NeilBrown

--------
Fix an unsigned compare to allow creation of bitmaps with v1.0 metadata.

As page->index is unsigned, this all becomes an unsigned comparison,
which almost always returns an error.

Signed-off-by: Neil Brown <neilb@suse.de>

### Diffstat output
 ./drivers/md/bitmap.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff .prev/drivers/md/bitmap.c ./drivers/md/bitmap.c
--- .prev/drivers/md/bitmap.c	2007-10-22 16:47:52.000000000 +1000
+++ ./drivers/md/bitmap.c	2007-10-22 16:50:10.000000000 +1000
@@ -274,7 +274,7 @@ static int write_sb_page(struct bitmap *
 		if (bitmap->offset < 0) {
 			/* DATA BITMAP METADATA */
 			if (bitmap->offset
-			    + page->index * (PAGE_SIZE/512)
+			    + (long)(page->index * (PAGE_SIZE/512))
 			    + size/512 > 0)
 				/* bitmap runs in to metadata */
 				return -EINVAL;
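
To see why the un-cast comparison always failed, here is a minimal
userspace sketch of the promotion trap, not kernel code; the -176
mirrors the bitmap offset reported by -E earlier:

#include <stdio.h>

#define PAGE_SIZE 4096UL

int main(void)
{
	long offset = -176;       /* bitmap->offset: sectors before the sb */
	unsigned long index = 0;  /* page->index of the first bitmap page  */
	long size = PAGE_SIZE;    /* bytes being written                   */

	/* pre-patch form: page->index is unsigned, so the whole expression
	 * is promoted to unsigned; -176 wraps to a huge positive value and
	 * the "overlap" branch is taken even for page 0 */
	if (offset + index * (PAGE_SIZE / 512) + size / 512 > 0)
		printf("unsigned compare: bogus overlap -> -EINVAL\n");

	/* patched form: the cast keeps the arithmetic signed */
	if (offset + (long)(index * (PAGE_SIZE / 512)) + size / 512 > 0)
		printf("never reached\n");
	else
		printf("signed compare: %ld <= 0, no overlap\n",
		       offset + (long)(index * (PAGE_SIZE / 512)) + size / 512);
	return 0;
}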

* Re: mdadm 2.6.x regression, fails creation of raid1 w/ v1.0 sb and internal bitmap
@ 2007-10-22 14:05 Mike Snitzer
From: Mike Snitzer @ 2007-10-22 14:05 UTC
To: Neil Brown; +Cc: linux-raid

On 10/22/07, Neil Brown <neilb@suse.de> wrote:
> [...]
>
> This is all very odd...
> I definitely tested this last week and couldn't reproduce the
> problem. This week I can reproduce it easily. And given the nature
> of the bug, I cannot see how it ever worked.
>
> Anyway, here is a fix that works for me.

Hey Neil,

Your fix works for me too. However, I'm wondering why you held back
on fixing the same issue in the "bitmap runs into data" comparison
that follows:

--- ./drivers/md/bitmap.c	2007-10-19 19:11:58.000000000 -0400
+++ ./drivers/md/bitmap.c	2007-10-22 09:53:41.000000000 -0400
@@ -286,7 +286,7 @@
 			/* METADATA BITMAP DATA */
 			if (rdev->sb_offset*2
 			    + bitmap->offset
-			    + page->index*(PAGE_SIZE/512) + size/512
+			    + (long)(page->index*(PAGE_SIZE/512)) + size/512
 			    > rdev->data_offset)
 				/* bitmap runs in to data */
 				return -EINVAL;

Thanks,
Mike

* Re: mdadm 2.6.x regression, fails creation of raid1 w/ v1.0 sb and internal bitmap
@ 2007-10-23 5:11 Neil Brown
From: Neil Brown @ 2007-10-23 5:11 UTC
To: Mike Snitzer; +Cc: linux-raid

On Monday October 22, snitzer@gmail.com wrote:
>
> Hey Neil,
>
> Your fix works for me too. However, I'm wondering why you held back
> on fixing the same issue in the "bitmap runs into data" comparison
> that follows:

It isn't really needed there. In that case bitmap->offset is
positive, so all the numbers are positive, and it doesn't matter
whether the comparison is signed or not.

Thanks for mentioning it though.

NeilBrown

> --- ./drivers/md/bitmap.c	2007-10-19 19:11:58.000000000 -0400
> +++ ./drivers/md/bitmap.c	2007-10-22 09:53:41.000000000 -0400
> @@ -286,7 +286,7 @@
>  			/* METADATA BITMAP DATA */
>  			if (rdev->sb_offset*2
>  			    + bitmap->offset
> -			    + page->index*(PAGE_SIZE/512) + size/512
> +			    + (long)(page->index*(PAGE_SIZE/512)) + size/512
>  			    > rdev->data_offset)
>  				/* bitmap runs in to data */
>  				return -EINVAL;
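
A small numeric illustration of that point -- all sector values below are
made up for the sketch, and this is not kernel code; with bitmap->offset
positive (the v1.1/1.2 layout), every addend is non-negative, so signed
and unsigned comparison agree:

#include <stdio.h>

int main(void)
{
	long sb_sectors = 8;          /* superblock 4K from device start (v1.2) */
	long bitmap_offset = 8;       /* bitmap right after the superblock */
	unsigned long index = 21;     /* last bitmap page */
	unsigned long size = 4096;    /* bytes in this page */
	unsigned long data_offset = 2048;  /* data starts 1M in (made up) */

	unsigned long end = sb_sectors + bitmap_offset
	                  + index * (4096 / 512) + size / 512;

	printf("bitmap page ends at sector %lu, data starts at %lu: %s\n",
	       end, data_offset,
	       end > data_offset ? "overlap -> -EINVAL" : "ok");
	/* 8 + 8 + 168 + 8 = 192 <= 2048; nothing here can be negative,
	 * so unsigned promotion cannot flip the result. */
	return 0;
}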