md0: invalid bitmap page request: 249 (> 223)


* md0: invalid bitmap page request: 249 (> 223)
@ 2007-04-12 16:04 John Stoffel
  2007-04-12 16:28 ` Bill Davidsen
  2007-04-13 12:56 ` John Stoffel
  0 siblings, 2 replies; 4+ messages in thread
From: John Stoffel @ 2007-04-12 16:04 UTC (permalink / raw)
  To: linux-raid

Hi Neil,

I've just installed a new SATA controller and a pair of 320GB disks
into my system.  Went great.  I'm running 2.6.21-rc6, with the ATA
drivers for my disks.

I had a RAID1 mirror consisting of two 120GB disks.  I used mdadm and
grew the number of disks in md0 to four, then added in the two new
disks.  Let it resync overnight, and then this morning I removed the
two old disks.  Went really really really well.

But now I'm trying to grow (using mdadm v2.5.6, Debian unstable
system) the array to use the full space now available.  Then I'll grow
the PVs and LVs I have on top of these to make them bigger as well.

The re-sync is going:

    > cat /proc/mdstat
    Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
    md0 : active raid1 sdd1[1] sdc1[0]
          312568000 blocks [2/2] [UU]
          [========>............]  resync = 42.1% (131637248/312568000) finish=373264.5min speed=0K/sec
          bitmap: 1/224 pages [4KB], 256KB chunk

    unused devices: <none>

But it's going slowly and dragging down the whole system with pauses,
and I'm getting tons of the following messages in my dmesg output:

    [50683.698708] md0: invalid bitmap page request: 251 (> 223)
    [50683.763687] md0: invalid bitmap page request: 251 (> 223)
    [50683.828621] md0: invalid bitmap page request: 251 (> 223)
    [50683.893520] md0: invalid bitmap page request: 251 (> 223)
    [50683.958396] md0: invalid bitmap page request: 251 (> 223)
    [50684.023265] md0: invalid bitmap page request: 251 (> 223)
    [50684.088202] md0: invalid bitmap page request: 251 (> 223)
    [50684.153196] md0: invalid bitmap page request: 251 (> 223)
    [50684.218129] md0: invalid bitmap page request: 251 (> 223)
    [50684.283044] md0: invalid bitmap page request: 251 (> 223)
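
For reference, the page number in these messages can be reproduced from
the mdstat output above.  This is only a rough sketch, and it rests on
assumptions that are not stated in the thread: that the in-memory
bitmap keeps one 16-bit counter per 256KB chunk and uses 4096-byte
pages (so 2048 counters per page), and that the mdstat figures are in
1KB blocks.  Under those assumptions the resync position falls on page
251, while a bitmap of 224 pages only has pages 0 through 223.  The
function names are made up for the illustration.

```shell
#!/bin/sh
# Sketch: which in-memory bitmap page does a given array offset fall on?
# ASSUMPTIONS (not from the thread): one 16-bit counter per bitmap chunk
# and 4096-byte pages, i.e. 2048 counters per page.
CHUNK_KIB=256            # "256KB chunk" from /proc/mdstat
COUNTERS_PER_PAGE=2048   # assumed: 4096-byte page / 2-byte counter

# Print the 0-based page index covering an offset given in KiB.
bitmap_page() {
    echo $(( $1 / (CHUNK_KIB * COUNTERS_PER_PAGE) ))
}

# Print the number of pages needed to cover an array of the given size in KiB.
pages_needed() {
    kib_per_page=$(( CHUNK_KIB * COUNTERS_PER_PAGE ))
    echo $(( ($1 + kib_per_page - 1) / kib_per_page ))
}

bitmap_page 131637248    # resync position from mdstat; prints 251
pages_needed 312568000   # size being grown to; prints 597
```

If the assumptions hold, the array would need 597 pages at its new
size, which would fit the idea that the bitmap was never resized when
the array was grown.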

Is there any way I can interrupt the command I used:

	mdadm --grow /dev/md0 --size=#####

which I know now I should have used the --size=max parameter instead,
but it wasn't in the man page or the online help.  Oh well...

I tried removing the bitmap with:

	mdadm --grow /dev/md0 --bitmap=none

but of course it won't let me do that.  Would I have to hot-fail one
of my disks to interrupt the re-sync, so I can remove the bitmap, so I
can then grow the RAID1 to the max volume size?
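
The ordering asked about here (remove the bitmap, grow, then put a
bitmap back) could be sketched as below.  This is a hedged outline, not
a tested procedure: it assumes the array is idle before it starts (the
thread shows that removing the bitmap during a resync does not work),
that the bitmap is an internal one, and that the installed mdadm has
the --wait option, which may not be true of v2.5.6.  The run and
grow_to_max helpers are hypothetical, and by default the script only
prints the commands.

```shell
#!/bin/sh
# Sketch only: prints the commands unless DRY_RUN=0 is set in the environment.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Grow an idle RAID1 to the full size of its members.
grow_to_max() {
    md=$1
    run mdadm --grow "$md" --bitmap=none      # drop the bitmap sized for the old array
    run mdadm --grow "$md" --size=max         # use all available space on the members
    run mdadm --wait "$md"                    # ASSUMPTION: --wait exists in this mdadm
    run mdadm --grow "$md" --bitmap=internal  # ASSUMPTION: the bitmap was internal
}

grow_to_max /dev/md0
```

Run as-is it changes nothing; setting DRY_RUN=0 would execute the
commands for real and needs root.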

Thanks,
John
john@stoffel.org


* Re: md0: invalid bitmap page request: 249 (> 223)
  2007-04-12 16:04 md0: invalid bitmap page request: 249 (> 223) John Stoffel
@ 2007-04-12 16:28 ` Bill Davidsen
  2007-04-13 17:38   ` John Stoffel
  2007-04-13 12:56 ` John Stoffel
  1 sibling, 1 reply; 4+ messages in thread
From: Bill Davidsen @ 2007-04-12 16:28 UTC (permalink / raw)
  To: John Stoffel; +Cc: linux-raid

John Stoffel wrote:
> Hi Neil,
>
> I've just installed a new SATA controller and a pair of 320Gb disks
> into my system.  Went great.  I'm running 2.6.21-rc6, with the ATA
> drivers for my disks.
>
> I had a RAID1 mirror consisting of two 120gb disks.  I used mdadm and
> grew the number of disks in md0 to four, then added in the two new
> disks.  Let it resync overnight, and then this morning I removed the
> two old disks.  Went really really really well.
>
> But now I'm trying to grow (using mdadm v2.5.6, Debian unstable
> system) the array to use the full space now available.  Then I'll grow
> the PVs and LVs I have on top of these to make them bigger as well.
>
> The re-sync is going:
>
>     > cat /proc/mdstat
>     Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5]
>     [raid4] 
>     md0 : active raid1 sdd1[1] sdc1[0]
> 	  312568000 blocks [2/2] [UU]
> 	  [========>............]  resync = 42.1% (131637248/312568000)
>     finish=373264.5min speed=0K/sec
> 	  bitmap: 1/224 pages [4KB], 256KB chunk
>
>     unused devices: <none>
>
>
> But it's going slowly and dragging down the whole system with pauses,
> and I'm getting tons of the following messages in my dmesg output:
>
>     [50683.698708] md0: invalid bitmap page request: 251 (> 223)
>     [50683.763687] md0: invalid bitmap page request: 251 (> 223)
>     [50683.828621] md0: invalid bitmap page request: 251 (> 223)
>     [50683.893520] md0: invalid bitmap page request: 251 (> 223)
>     [50683.958396] md0: invalid bitmap page request: 251 (> 223)
>     [50684.023265] md0: invalid bitmap page request: 251 (> 223)
>     [50684.088202] md0: invalid bitmap page request: 251 (> 223)
>     [50684.153196] md0: invalid bitmap page request: 251 (> 223)
>     [50684.218129] md0: invalid bitmap page request: 251 (> 223)
>     [50684.283044] md0: invalid bitmap page request: 251 (> 223)
>
>
> Is there any way I can interrupt the command I used:
>
> 	mdadm --grow /dev/md0 --size=#####
>
> which I know now I should have used the --size=max parameter instead,
> but it wasn't in the man page or the online help.  Oh well...
>
> I tried removing the bitmap with:
>
> 	mdadm --grow /dev/md0 --bitmap=none
>
> but of course it won't let me do that.  Would I have to hot-fail one
> of my disks to interrupt the re-sync, so I can remove the bitmap, so I
> can then grow the RAID1 to the max volume size?
>   
I think you could interrupt it by echoing 'idle' to sync_action, but
I personally wouldn't do that, since there are already indications of
some problem, and doing another unusual thing is not prudent. I doubt
your problems are caused by specifying the size rather than using "max,"
unless you got the size wrong, in which case I wouldn't guess what is
going to happen.

-- 
bill davidsen <davidsen@tmr.com>
  CTO TMR Associates, Inc
  Doing interesting things with small computers since 1979



* Re: md0: invalid bitmap page request: 249 (> 223)
  2007-04-12 16:04 md0: invalid bitmap page request: 249 (> 223) John Stoffel
  2007-04-12 16:28 ` Bill Davidsen
@ 2007-04-13 12:56 ` John Stoffel
  1 sibling, 0 replies; 4+ messages in thread
From: John Stoffel @ 2007-04-13 12:56 UTC (permalink / raw)
  To: John Stoffel; +Cc: linux-raid

>>>>> "John" == John Stoffel <john@stoffel.org> writes:

This is an update: my system is now up and running properly,
though with some caveats.

John> I've just installed a new SATA controller and a pair of 320Gb
John> disks into my system.  Went great.  I'm running 2.6.21-rc6, with
John> the ATA drivers for my disks.

John> I had a RAID1 mirror consisting of two 120gb disks.  I used
John> mdadm and grew the number of disks in md0 to four, then added in
John> the two new disks.  Let it resync overnight, and then this
John> morning I removed the two old disks.  Went really really really
John> well.

This is where I think part of the problem came in.  When you do a:

  mdadm /dev/md0 --fail /dev/sde1

the superblock on the disk isn't wiped, or at least the UUID isn't
changed to something different.  This can cause problems later on if
you have to reboot the system and it discovers one of the removed
disks before it finds the actual live disks.

Not fun, and certainly close to heart attack time.  *grin*
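
One way to keep a retired disk from being picked up at the next
assembly is to erase its md superblock after removing it, using mdadm's
--zero-superblock option.  The sketch below is illustrative only: the
run and retire_member helpers are hypothetical, the device names are
the ones from this message, and by default nothing is executed.  It
does not touch any LVM PV label on the device.

```shell
#!/bin/sh
# Sketch only: prints the commands unless DRY_RUN=0 is set in the environment.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Fail and remove a member, then erase its md metadata.
retire_member() {
    md=$1
    dev=$2
    run mdadm "$md" --fail "$dev"
    run mdadm "$md" --remove "$dev"
    run mdadm --zero-superblock "$dev"   # destructive: wipes the md superblock on $dev
}

retire_member /dev/md0 /dev/sde1
```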

John> But now I'm trying to grow (using mdadm v2.5.6, Debian unstable
John> system) the array to use the full space now available.  Then
John> I'll grow the PVs and LVs I have on top of these to make them
John> bigger as well.

I've also found issues with the LVM2 tools, in that you can't muck
with UUIDs or PV UUIDs easily.  

John> The re-sync is going:

John>     > cat /proc/mdstat
John>     Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5]
John>     [raid4] 
John>     md0 : active raid1 sdd1[1] sdc1[0]
John> 	  312568000 blocks [2/2] [UU]
John> 	  [========>............]  resync = 42.1% (131637248/312568000)
John>     finish=373264.5min speed=0K/sec
John> 	  bitmap: 1/224 pages [4KB], 256KB chunk

John>     unused devices: <none>

John> But it's going slowly and dragging down the whole system with
John> pauses, and I'm getting tons of the following messages in my
John> dmesg output:

John>     [50683.698708] md0: invalid bitmap page request: 251 (> 223)
John>     [50683.763687] md0: invalid bitmap page request: 251 (> 223)
John>     [50683.828621] md0: invalid bitmap page request: 251 (> 223)
John>     [50683.893520] md0: invalid bitmap page request: 251 (> 223)
John>     [50683.958396] md0: invalid bitmap page request: 251 (> 223)
John>     [50684.023265] md0: invalid bitmap page request: 251 (> 223)
John>     [50684.088202] md0: invalid bitmap page request: 251 (> 223)
John>     [50684.153196] md0: invalid bitmap page request: 251 (> 223)
John>     [50684.218129] md0: invalid bitmap page request: 251 (> 223)
John>     [50684.283044] md0: invalid bitmap page request: 251 (> 223)

John> Is there any way I can interrupt the command I used:

John> 	mdadm --grow /dev/md0 --size=#####

John> which I know now I should have used the --size=max parameter
John> instead, but it wasn't in the man page or the online help.  Oh
John> well...

John> I tried removing the bitmap with:

John> 	mdadm --grow /dev/md0 --bitmap=none

John> but of course it won't let me do that.  Would I have to hot-fail
John> one of my disks to interrupt the re-sync, so I can remove the
John> bitmap, so I can then grow the RAID1 to the max volume size?

Well, once I had tried to remove the bitmap during a sync, I couldn't
actually look at the output of /proc/mdstat anymore; it would just
hang when I did:  cat /proc/mdstat

So I ended up doing a reboot, which is where I then ran into a couple
of problems:

1. When you have a UUID listed in your /etc/mdadm/mdadm.conf, and
   you've changed the UUID on an array, you'd better change the conf
   file as well.

   This sucks because I don't want to change the UUID of the live
   array, I want to change the UUIDs of the devices I failed and removed,
   so that they /WILL NOT/ be considered during the next assembly of
   an array.

2. LVM2 PVs (Physical Volumes) have the same damn problem.  Grrr...
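
For the first point, a quick check of whether mdadm.conf still matches
the running array might look like the sketch below.  The conf_uuid and
live_uuid helpers are hypothetical names, and the parsing assumes the
usual "ARRAY <device> ... UUID=..." line in mdadm.conf and the
"UUID : ..." line in the output of mdadm --detail.  Only the function
definitions run here; the comparison itself is shown as a comment
because it needs root and a real array.

```shell
#!/bin/sh
# Print the UUID recorded for one array in an mdadm.conf-style file.
conf_uuid() {
    awk -v md="$1" '
        $1 == "ARRAY" && $2 == md {
            for (i = 3; i <= NF; i++)
                if (tolower(substr($i, 1, 5)) == "uuid=")
                    print substr($i, 6)
        }' "$2"
}

# Print the UUID of the running array (needs root and a real md device).
live_uuid() {
    mdadm --detail "$1" | awk '$1 == "UUID" { print $3 }'
}

# Usage on a real system (not run here):
#   [ "$(conf_uuid /dev/md0 /etc/mdadm/mdadm.conf)" = "$(live_uuid /dev/md0)" ] \
#       || echo "mdadm.conf UUID differs from the running array"
```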

So I ended up unplugging the power to my old disks and rebooting a
couple of times and I managed to get all my data back, lvextend the
volumes and resize2fs the filesystems.  I'm happy.  Though I'm sad I
had as much downtime as I did.  

John


* Re: md0: invalid bitmap page request: 249 (> 223)
  2007-04-12 16:28 ` Bill Davidsen
@ 2007-04-13 17:38   ` John Stoffel
  0 siblings, 0 replies; 4+ messages in thread
From: John Stoffel @ 2007-04-13 17:38 UTC (permalink / raw)
  To: Bill Davidsen; +Cc: John Stoffel, linux-raid

>>>>> "Bill" == Bill Davidsen <davidsen@tmr.com> writes:

>> Is there any way I can interrupt the command I used:
>> 
>> mdadm --grow /dev/md0 --size=#####
>> 
>> which I know now I should have used the --size=max parameter instead,
>> but it wasn't in the man page or the online help.  Oh well...
>> 
>> I tried removing the bitmap with:
>> 
>> mdadm --grow /dev/md0 --bitmap=none
>> 
>> but of course it won't let me do that.  Would I have to hot-fail one
>> of my disks to interrupt the re-sync, so I can remove the bitmap, so I
>> can then grow the RAID1 to the max volume size?

Bill> I think you could interrupt it by echoing 'idle' to the
Bill> sync_action, but I personally wouldn't do that, since there are
Bill> indications of some problem already, and doing another unusual
Bill> thing is not prudent.

Doing an 'echo idle > sync_action' was not a good idea.  The system
was not happy with me.  I consider this a bug, since the MD system
should just ignore commands it can't handle at that point.

So you'd need to do the echo first, then a cat to check the status.  
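
Spelled out, that echo-then-cat sequence would be roughly as follows.
This is a sketch only, and given the experience above it is not a
recommendation.  The request_idle helper is a hypothetical name, and
the default path assumes the usual sysfs location for md0.  On a live
array the state read back may well be "resync" again rather than
"idle", since md can restart the resync on its own.

```shell
#!/bin/sh
# Ask md to stop its current sync action, then print the state it reports.
# The argument is the sync_action file; the default is the assumed path for md0.
request_idle() {
    f=${1:-/sys/block/md0/md/sync_action}
    echo idle > "$f" || return 1
    cat "$f"
}

# Usage (as root, on a real array; not run here):
#   request_idle
```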

Bill> I doubt your problems are caused by specifying the size rather
Bill> than using "max," unless you got it wrong, in which case I
Bill> wouldn't guess what is going to happen.

It looks like it's a known issue with bitmaps that Neil hasn't fixed
yet.  Oh well.  Now my problems seem to be bad blocks on the new
disk.  Sigh...  *grin*  

But I'm now running on the new disks ok and enjoying the extra disk
space.

John
