linux-lvm.redhat.com archive mirror
* [linux-lvm] pvmove hung on 2.6.22 (ubuntu gutsy)
@ 2007-09-01 17:19 noah
  2007-09-01 19:08 ` Hannes Dorbath
  0 siblings, 1 reply; 9+ messages in thread
From: noah @ 2007-09-01 17:19 UTC
  To: linux-lvm

Pvmove hung and now all commands that access the filesystem hang too,
possibly in uninterruptible sleep.
Is this a bug or did I do something forbidden, somehow? ;)

The most irritating thing right now is that the keys used for dm-crypt
for the new PVs weren't moved off the box before the crash
happened. I didn't even consider that pvmove might fail, since it's worked
flawlessly for me before, but now it apparently did. Guess I'm beyond
fucked here.


My original setup was the following:
/dev/sda + missing -> md0 (raid 1)
dmcrypt on md0 -> aesmd0
aesmd0 is part of a VG aes

The root filesystem and a bunch of others are all LVs in VG aes.


Prior to running pvmove I did the following:
/dev/sdb + missing -> md3 (raid 1)
dmcrypt on md3 -> aesmd3
/dev/sdd + missing -> md4 (raid 1)
dmcrypt on md4 -> aesmd4
aesmd3 + aesmd4 -> md5 (raid 0)

pvcreate /dev/md5
vgextend aes /dev/md5

Then I did pvmove -v -i 60 /dev/aesmd0 to move over extents from
aesmd0 to /dev/md5.
It started spitting out some logs and seemed to progress just fine.
Meanwhile I did 'mdadm /dev/md3 -a /dev/sdc' to complete the degraded
RAID1 mirror.
pvmove still seemed to progress well, but 10 minutes later I
experienced hangs in anything that deals with the filesystem.

The system is running Ubuntu Gutsy Gibbon on an AMD64 X2 CPU with 2GB
memory, no swap.
VMware Server was also running on the machine when this happened, in
case that's interesting.
The terminal sessions I had to this machine when it hung continued to
work as long as they didn't try to access the filesystem, whereas the
sessions to my VMware guests were all hung.

Since I was working in screen I've recovered some parts of the output,
mainly the lvm2 commands and stuff regarding md.
It's available here: http://pastebin.com/m34412938

  -- noah


* Re: [linux-lvm] pvmove hung on 2.6.22 (ubuntu gutsy)
  2007-09-01 17:19 [linux-lvm] pvmove hung on 2.6.22 (ubuntu gutsy) noah
@ 2007-09-01 19:08 ` Hannes Dorbath
  2007-09-01 23:05   ` noah
  0 siblings, 1 reply; 9+ messages in thread
From: Hannes Dorbath @ 2007-09-01 19:08 UTC
  To: LVM general discussion and development

noah wrote:
> Pvmove hung and now all commands that access the filesystem hang too,
> possibly in uninterruptible sleep.
> Is this a bug or did I do something forbidden, somehow? ;)

To my knowledge pvmove is broken on 2.6.22. I'd go back to 2.6.21 until
there is a fix. Maybe there is already a fix, I haven't followed it.


-- 
Best regards,
Hannes Dorbath


* Re: [linux-lvm] pvmove hung on 2.6.22 (ubuntu gutsy)
  2007-09-01 19:08 ` Hannes Dorbath
@ 2007-09-01 23:05   ` noah
  2007-09-05  9:57     ` noah
  0 siblings, 1 reply; 9+ messages in thread
From: noah @ 2007-09-01 23:05 UTC
  To: LVM general discussion and development

2007/9/1, Hannes Dorbath <light@theendofthetunnel.de>:
> noah wrote:
> > Pvmove hung and now all commands that access the filesystem hang too,
> > possibly in uninterruptible sleep.
> > Is this a bug or did I do something forbidden, somehow? ;)
>
> To my knowledge pvmove is broken on 2.6.22. I'd go back to 2.6.21 until
> there is a fix. Maybe there is already a fix, I haven't followed it.

Thanks for a quick reply.
Has this problem been documented elsewhere?
Just curious what the problem is and whether anybody is working on a fix.

While looking through Ubuntu's bug database I found others who had
been encountering problems with pvmove; something related to
device-mapper.
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=376283
Is the problem I stumbled upon a different one?

  -- noah


* Re: [linux-lvm] pvmove hung on 2.6.22 (ubuntu gutsy)
  2007-09-01 23:05   ` noah
@ 2007-09-05  9:57     ` noah
  2007-09-05 10:06       ` Bryn M. Reeves
  0 siblings, 1 reply; 9+ messages in thread
From: noah @ 2007-09-05  9:57 UTC
  To: LVM general discussion and development

2007/9/2, noah <noah123@gmail.com>:
> 2007/9/1, Hannes Dorbath <light@theendofthetunnel.de>:
> > noah wrote:
> > > Pvmove hung and now all commands that access the filesystem hang too,
> > > possibly in uninterruptible sleep.
> > > Is this a bug or did I do something forbidden, somehow? ;)
> >
> > To my knowledge pvmove is broken on 2.6.22. I'd go back to 2.6.21 until
> > there is a fix. Maybe there is already a fix, I haven't followed it.
>
> Thanks for a quick reply.
> Has this problem been documented elsewhere?
> Just curious what the problem is and whether anybody is working on a fix.

I'm still curious what the problem is, when it happened, why, and
whether or not somebody is working on it.
Is this a problem with dm, lvm, barriers, userland utilities or something else?

>
> While looking through Ubuntu's bug database I found others who had
> been encountering problems with pvmove; something related to
> device-mapper.
> http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=376283
> Is the problem I stumbled upon a different one?

  -- noah


* Re: [linux-lvm] pvmove hung on 2.6.22 (ubuntu gutsy)
  2007-09-05  9:57     ` noah
@ 2007-09-05 10:06       ` Bryn M. Reeves
  2007-09-05 11:42         ` noah
  0 siblings, 1 reply; 9+ messages in thread
From: Bryn M. Reeves @ 2007-09-05 10:06 UTC
  To: LVM general discussion and development

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

noah wrote:
> 2007/9/2, noah <noah123@gmail.com>:
>> 2007/9/1, Hannes Dorbath <light@theendofthetunnel.de>:
>>> noah wrote:
>>>> Pvmove hung and now all commands that access the filesystem hang too,
>>>> possibly in uninterruptible sleep.
>>>> Is this a bug or did I do something forbidden, somehow? ;)
>>> To my knowledge pvmove is broken on 2.6.22. I'd go back to 2.6.21 until
>>> there is a fix. Maybe there is already a fix, I haven't followed it.
>> Thanks for a quick reply.
>> Has this problem been documented elsewhere?
>> Just curious what the problem is and whether anybody is working on a fix.
> 
> I'm still curious what the problem is, when it happened, why, and
> whether or not somebody is working on it.
> Is this a problem with dm, lvm, barriers, userland utilities or something else?

The fix was merged in 2.6.22.2 - file a bug with your distribution to
have it included.

The patch is here:

http://lkml.org/lkml/2007/8/7/401

Regards,
Bryn.

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.7 (GNU/Linux)
Comment: Using GnuPG with Fedora - http://enigmail.mozdev.org

iD8DBQFG3n+26YSQoMYUY94RAtk5AKCAGLOnNed1kXmR7zhP86NKK0ufGQCcDfFo
aF9pK74vakanyaf4VzKzJ1o=
=iHq+
-----END PGP SIGNATURE-----


* Re: [linux-lvm] pvmove hung on 2.6.22 (ubuntu gutsy)
  2007-09-05 10:06       ` Bryn M. Reeves
@ 2007-09-05 11:42         ` noah
  2007-09-05 12:31           ` Bryn M. Reeves
  0 siblings, 1 reply; 9+ messages in thread
From: noah @ 2007-09-05 11:42 UTC
  To: LVM general discussion and development

2007/9/5, Bryn M. Reeves <breeves@redhat.com>:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> noah wrote:
> > 2007/9/2, noah <noah123@gmail.com>:
> >> 2007/9/1, Hannes Dorbath <light@theendofthetunnel.de>:
> >>> noah wrote:
> >>>> Pvmove hung and now all commands that access the filesystem hang too,
> >>>> possibly in uninterruptible sleep.
> >>>> Is this a bug or did I do something forbidden, somehow? ;)
> >>> To my knowledge pvmove is broken on 2.6.22. I'd go back to 2.6.21 until
> >>> there is a fix. Maybe there is already a fix, I haven't followed it.
> >> Thanks for a quick reply.
> >> Has this problem been documented elsewhere?
> >> Just curious what the problem is and whether anybody is working on a fix.
> >
> > I'm still curious what the problem is, when it happened, why, and
> > whether or not somebody is working on it.
> > Is this a problem with dm, lvm, barriers, userland utilities or something else?
>
> The fix was merged in 2.6.22.2 - file a bug with your distribution to
> have it included.
>
> The patch is here:
>
> http://lkml.org/lkml/2007/8/7/401

This patch is already applied on the kernel I was running.
Also, I'm not sure I'm using dm-raid at all since I don't have one of
these lame software (BIOS) RAID cards. I'm using md and dm-crypt
straight on top of the harddrives.

Could I have hit a different bug?

  -- noah


* Re: [linux-lvm] pvmove hung on 2.6.22 (ubuntu gutsy)
  2007-09-05 11:42         ` noah
@ 2007-09-05 12:31           ` Bryn M. Reeves
  2007-09-05 13:37             ` noah
  0 siblings, 1 reply; 9+ messages in thread
From: Bryn M. Reeves @ 2007-09-05 12:31 UTC
  To: LVM general discussion and development

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

noah wrote:
> This patch is already applied on the kernel I was running.
> Also, I'm not sure I'm using dm-raid at all since I don't have one of
> these lame software (BIOS) RAID cards. 

dm-raid1 is nothing to do with dmraid per-se (it's just one of the
targets dmraid might use to construct a raid set).

It is also used by pvmove - when you pvmove a volume, a temporary mirror
is constructed using dm-raid1, synchronised and then broken. After the
pvmove the newly sync'ed copy is substituted for the original PV.

> I'm using md and dm-crypt straight on top of the harddrives.

I'm not sure I understand - if you're only using md and dm-crypt, why is
pvmove involved?

You can directly view the status of the mirror/pvmoved volume using
"dmsetup status" - the patch I linked to earlier corrected a problem
with the status report that prevented userspace correctly parsing the
status updates.

It does sound from your description like you may have hit a separate
problem since you mention that the pvmove does initially proceed correctly.

Regards,
Bryn.

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.7 (GNU/Linux)
Comment: Using GnuPG with Fedora - http://enigmail.mozdev.org

iD8DBQFG3qGc6YSQoMYUY94RAo1SAJ0Vwpq5CaOxTw58T6hEPILDBb53CgCgwh8N
zY3ZdCnQKBt42/WrE2+ngVg=
=BAve
-----END PGP SIGNATURE-----


* Re: [linux-lvm] pvmove hung on 2.6.22 (ubuntu gutsy)
  2007-09-05 12:31           ` Bryn M. Reeves
@ 2007-09-05 13:37             ` noah
  2007-09-05 15:04               ` Bryn M. Reeves
  0 siblings, 1 reply; 9+ messages in thread
From: noah @ 2007-09-05 13:37 UTC
  To: LVM general discussion and development

2007/9/5, Bryn M. Reeves <breeves@redhat.com>:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> noah wrote:
> > This patch is already applied on the kernel I was running.
> > Also, I'm not sure I'm using dm-raid at all since I don't have one of
> > these lame software (BIOS) RAID cards.
>
> dm-raid1 is nothing to do with dmraid per-se (it's just one of the
> targets dmraid might use to construct a raid set).
>
> It is also used by pvmove - when you pvmove a volume, a temporary mirror
> is constructed using dm-raid1, synchronised and then broken. After the
> pvmove the newly sync'ed copy is substituted for the original PV.

I see, thanks for the explanation.

>
> > I'm using md and dm-crypt straight on top of the harddrives.
>
> I'm not sure I understand - if you're only using md and dm-crypt, why is
> pvmove involved?

Sorry, I forgot to mention LVM. Just wanted to make it clear I wasn't
using dmraid in case it was related -- which it obviously wasn't.

>
> You can directly view the status of the mirror/pvmoved volume using
> "dmsetup status" - the patch I linked to earlier corrected a problem
> with the status report that prevented userspace correctly parsing the
> status updates.

Interesting.
Unfortunately my whole system went away in the crash and I cannot
reproduce the error again to see what "dmsetup status" would have
reported.
But it does look good on the new system with the same kernel, AFAICT.

aes-root: 0 6291456 linear 9:2 384
aes-root: 6291456 2097152 linear 9:2 887095680
aes-root: 8388608 2097152 linear 9:2 897941888
raidB: 0 586113528 crypt aes-cbc-essiv:sha256 00000000000000000000000000000000 0 9:1 1032
raidA: 0 976771960 crypt aes-cbc-essiv:sha256 00000000000000000000000000000000 0 9:0 1032


>
> It does sound from your description like you may have hit a separate
> problem since you mention that the pvmove does initially proceed correctly.

Check.
Is this list the correct place for bug reports or should I resend my
mail to dm-devel@ or even lkml?

  -- noah


* Re: [linux-lvm] pvmove hung on 2.6.22 (ubuntu gutsy)
  2007-09-05 13:37             ` noah
@ 2007-09-05 15:04               ` Bryn M. Reeves
  0 siblings, 0 replies; 9+ messages in thread
From: Bryn M. Reeves @ 2007-09-05 15:04 UTC
  To: LVM general discussion and development

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

noah wrote:
> aes-root: 0 6291456 linear 9:2 384
> aes-root: 6291456 2097152 linear 9:2 887095680
> aes-root: 8388608 2097152 linear 9:2 897941888
> raidB: 0 586113528 crypt aes-cbc-essiv:sha256 00000000000000000000000000000000 0 9:1 1032
> raidA: 0 976771960 crypt aes-cbc-essiv:sha256 00000000000000000000000000000000 0 9:0 1032

Yep - that all looks normal, the devices with "linear" in the 4th field
are your regular logical volumes, then there are two crypts as well.
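
For anyone who wants to read such listings programmatically, here is a
minimal sketch, assuming Python 3 and input lines in the
"name: start length target ..." layout quoted above. The function name
summarize and the SECTOR_BYTES constant are made up for illustration;
the only outside fact relied on is that device-mapper counts in
512-byte sectors.

```python
from collections import defaultdict

SECTOR_BYTES = 512  # device-mapper offsets and lengths are in 512-byte sectors


def summarize(lines):
    """Group 'name: start length target ...' lines by device name.

    Returns {name: {"targets": set of target types, "sectors": total length}}.
    """
    devices = defaultdict(lambda: {"targets": set(), "sectors": 0})
    for line in lines:
        name, sep, rest = line.partition(": ")
        fields = rest.split()
        if not sep or len(fields) < 3:
            continue  # skip blank lines and wrapped continuation lines
        devices[name]["targets"].add(fields[2])   # e.g. "linear", "crypt"
        devices[name]["sectors"] += int(fields[1])
    return dict(devices)


# Sample input copied from the listing quoted above.
sample = [
    "aes-root: 0 6291456 linear 9:2 384",
    "aes-root: 6291456 2097152 linear 9:2 887095680",
    "aes-root: 8388608 2097152 linear 9:2 897941888",
    "raidB: 0 586113528 crypt aes-cbc-essiv:sha256 "
    "00000000000000000000000000000000 0 9:1 1032",
    "raidA: 0 976771960 crypt aes-cbc-essiv:sha256 "
    "00000000000000000000000000000000 0 9:0 1032",
]

for name, info in summarize(sample).items():
    gib = info["sectors"] * SECTOR_BYTES / 2**30
    print(f"{name}: {sorted(info['targets'])} {gib:.1f} GiB")
```

With the sample above, the three aes-root segments add up to 10485760
sectors, i.e. a 5 GiB logical volume made of three linear segments.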

During a pvmove, the linear mappings will change to mirror mappings.
E.g. on a test box here, I have a VG named "t1" with an LV named "l0"
backed by two loop devices (only one is in use initially):

t1-l0: 0 4194304 linear

During a pvmove, the LV being moved would have its linear status line
replaced with a status line like this:

t1-pvmove0: 0 4194304 mirror 2 7:0 7:1 720/4096 1 AA 1 core

The 720/4096 is the progress - we've sync'ed 720 out of 4096 mirror
regions on the device. Running status repeatedly I see:

t1-pvmove0: 0 4194304 mirror 2 7:0 7:1 904/4096 1 AA 1 core
[...]
t1-pvmove0: 0 4194304 mirror 2 7:0 7:1 2557/4096 1 AA 1 core

Etc. If you are still able to reproduce the pvmove problem you can use
this to see if it's the status line bug in 2.6.22.
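
The progress field can also be pulled out of such a line in a script.
The following is a rough sketch, assuming Python 3 and mirror status
lines shaped exactly like the examples above (device count, that many
devices, then synced/total); the helper names mirror_progress and
percent_done are hypothetical, not part of any LVM tool.

```python
def mirror_progress(line):
    """Return (name, synced_regions, total_regions) for a mirror status line.

    Returns None for lines whose target type is not "mirror".
    """
    name, sep, rest = line.partition(": ")
    fields = rest.split()
    # fields: start, length, target, then target-specific parameters
    if not sep or len(fields) < 4 or fields[2] != "mirror":
        return None
    ndevs = int(fields[3])            # number of mirror legs, e.g. 2
    sync_field = fields[4 + ndevs]    # the field after the device list
    synced, total = sync_field.split("/")
    return name, int(synced), int(total)


def percent_done(synced, total):
    """Sync progress as a percentage of mirror regions."""
    return 100.0 * synced / total


# Status lines copied from the examples above.
samples = [
    "t1-pvmove0: 0 4194304 mirror 2 7:0 7:1 720/4096 1 AA 1 core",
    "t1-pvmove0: 0 4194304 mirror 2 7:0 7:1 904/4096 1 AA 1 core",
    "t1-pvmove0: 0 4194304 mirror 2 7:0 7:1 2557/4096 1 AA 1 core",
]

for line in samples:
    name, synced, total = mirror_progress(line)
    print(f"{name}: {synced}/{total} regions, {percent_done(synced, total):.1f}%")
```

If the synced count stops increasing between two samples taken some
time apart, the copy is no longer making progress.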

If not, as you describe the pvmove as hanging, a set of sysrq-t data
would be helpful to see where it's getting stuck. You can trigger this
from the keyboard or via /proc/sysrq-trigger.

The output will be sent to dmesg/syslog - it's likely to be very large,
so better to put it on something like pastebin than send it to the list.

It's most useful if you collect two sets, 5-10 seconds apart, as this
allows us to see if things are really stuck.
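
Collecting the two dumps could be scripted along these lines. This is
only a sketch, assuming Python 3 run as root on a kernel that provides
/proc/sysrq-trigger; the function name trigger_task_dump and its
parameters are invented for illustration, and the 7 second default is
just a value inside the 5-10 second range mentioned above.

```python
import time


def trigger_task_dump(trigger_path="/proc/sysrq-trigger", count=2,
                      gap_seconds=7, sleep=time.sleep):
    """Request `count` sysrq-t task dumps, `gap_seconds` apart.

    Writing the letter "t" to the trigger file asks the kernel to log
    the state of every task; the output lands in dmesg/syslog.
    """
    for i in range(count):
        with open(trigger_path, "w") as trigger:
            trigger.write("t")
        if i < count - 1:
            sleep(gap_seconds)  # wait between dumps, not after the last one


# Usage (as root), then save the kernel log, e.g. with `dmesg`:
#   trigger_task_dump()
```

The path and the sleep function are parameters only so that the helper
can be tried against an ordinary file without touching the real trigger.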

> Check.
> Is this list the correct place for bug reports or should I resend my
> mail to dm-devel@ or even lkml?

Yes - definitely the right place!

Regards,
Bryn.

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.7 (GNU/Linux)
Comment: Using GnuPG with Fedora - http://enigmail.mozdev.org

iD8DBQFG3sVo6YSQoMYUY94RAskyAKCmmLYWDwSSQ+i0rG5H6W0qgoEEAwCfUnje
MDfXUHIXwT7dVmshka6AOAM=
=RAhk
-----END PGP SIGNATURE-----

