From: Urban Loesch <bind@enas.net>
To: Frank Steinborn <steinex@nognu.de>
Cc: drbd-dev@lists.linbit.com, 659762@bugs.debian.org
Subject: Re: [Drbd-dev] Bug#659762: lvm2: LVM commands freeze after snapshot delete fails
Date: Tue, 04 Mar 2014 14:19:15 -0000
Message-ID: <51F29992.2000305@enas.net>
In-Reply-To: <CANhLx2MO6aDsiTMmt-O4Fx8mmaF8QvNpC8WOBAT9-kfKEaQmVQ@mail.gmail.com>

Hi,

We had the same problems with Debian Wheezy, LVM2 and DRBD.
However, this does not seem to be DRBD-related. It looks like a problem between LVM and udevd.

See:
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=549691

Stopping udevd before taking the snapshot and starting it again after removing
the snapshot solved the problem for us. It's only a workaround, but it works.
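
The workaround can be sketched roughly as follows. This is only an illustration under assumptions that are not from our setup notes: the init script is assumed to be called "udev" (as on Debian Wheezy), and the function name, the VG/LV names, the snapshot size and the backup step are placeholders. Setting DRY_RUN prints the commands instead of running them.

```shell
#!/bin/sh
# Sketch: keep udevd stopped for the whole lifetime of the snapshot.
# Assumed, not from the original mail: init script name "udev",
# helper names, snapshot size, and the backup step.

# Print the command instead of executing it when DRY_RUN is set.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

snapshot_with_udev_stopped() {
    vg=$1
    lv=$2
    size=$3
    rc=0

    run service udev stop || return 1

    if run lvcreate --snapshot --size "$size" --name "${lv}-snap" "/dev/${vg}/${lv}"; then
        # ... back up from /dev/$vg/$lv-snap here (placeholder) ...
        run lvremove --force "/dev/${vg}/${lv}-snap" || rc=$?
    else
        rc=$?
    fi

    # Always bring udevd back, even if one of the LVM steps failed.
    run service udev start
    return "$rc"
}
```

Example (dry run): `DRY_RUN=1; snapshot_with_udev_stopped vg0 lv0 40G` prints the four commands in order without touching the system.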

Regards
Urban


On 26.07.2013 17:14, Frank Steinborn wrote:
> Hi,
>
> we are a bit further in debugging this. We installed a DELL PowerEdge r620 (same hardware as used in our DRBD cluster where this problem happens). As
> no one in this thread brought DRBD into play, I didn't expect any interaction with it related to this bug. However, we were not able to reproduce it with
> just LVM2 (e.g. configure LV, do IO in LV, remove LV, hang).
>
> So we installed a second machine and put DRBD on top of the LVs. And voilà, as soon as we create a snapshot of the LV that has DRBD on top and remove
> this snapshot, the removal fails roughly 1/3 of the time.
>
> Some facts:
>
> root@drbd-primary:~# lvremove --force /dev/vg0/lv0-snap
>    Unable to deactivate open vg0-lv0--snap-cow (254:3)
>    Failed to resume lv0-snap.
>    libdevmapper exiting with 1 device(s) still suspended.
>
> After this, "dmsetup info" gives the following output:
>
> <<< snip >>>
>
> Name:              vg0-lv0--snap
> State:             ACTIVE
> Read Ahead:        256
> Tables present:    LIVE
> Open count:        0
> Event number:      0
> Major, minor:      254, 1
> Number of targets: 1
> UUID: LVM-M0Z897O16CAiYbSivOzgSn0M9Ae9TdoYy4WFhwy43CZA1g7zKFGF915pLAOIPvFZ
>
> Name:              vg0-lv0-real
> State:             ACTIVE
> Read Ahead:        0
> Tables present:    LIVE
> Open count:        1
> Event number:      0
> Major, minor:      254, 2
> Number of targets: 1
> UUID: LVM-M0Z897O16CAiYbSivOzgSn0M9Ae9TdoYC3ppjt1CZ3AcZR2hNz1VT5CHdM4RR32j-real
>
> Name:              vg0-lv0
> State:             SUSPENDED
> Read Ahead:        256
> Tables present:    LIVE & INACTIVE
> Open count:        2
> Event number:      0
> Major, minor:      254, 0
> Number of targets: 1
> UUID: LVM-M0Z897O16CAiYbSivOzgSn0M9Ae9TdoYC3ppjt1CZ3AcZR2hNz1VT5CHdM4RR32j
>
> Name:              vg0-lv0--snap-cow
> State:             ACTIVE
> Read Ahead:        0
> Tables present:    LIVE
> Open count:        0
> Event number:      0
> Major, minor:      254, 3
> Number of targets: 1
> UUID: LVM-M0Z897O16CAiYbSivOzgSn0M9Ae9TdoYy4WFhwy43CZA1g7zKFGF915pLAOIPvFZ-cow
>
> <<< snap >>>
>
> As you can see, the real LV with DRBD on top is now in state SUSPENDED - which causes the cluster to be non-functional as IO operations stall on both
> the primary and secondary node until one does "dmsetup resume /dev/vg0/lv0".
>
> Another interesting issue we've seen: after doing "dmsetup resume /dev/vg0/lv0", lv0-snap doesn't appear to be a snapshot anymore, given the output of
> lvs (lv0-snap has no origin anymore):
>
>    LV       VG   Attr     LSize   Pool Origin Data%  Move Log Copy%  Convert
>    lv0      vg0  -wi-ao-- 200.00g
>    lv0-snap vg0  -wi-a---  40.00g
>
>
> Some miscellaneous notes:
> * It _feels_ like it only happens when the snapshot is at least around 50-60% full.
> * We can trigger something like this even without DRBD. In that case, however, the LV never ends up in SUSPENDED state and a second try of
> lvremove always succeeds.
>
> That's all we have so far. I already had a private conversation with waldi@debian.org on this and we will (probably) provide
> him remote access to this system as soon as we have the setup reachable from the outside.
>
> Please let me know if I can provide any more information to get this fixed. I put drbd-dev in cc, maybe someone over there has an idea on this?
>
> @drbd-dev: system is debian wheezy, w/ drbd 8.3.11, lvm2 2.02.95.
>
> Thanks,
> Frank
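
For spotting the stuck state described in the quoted mail, a small filter over "dmsetup info" output might help. This is a sketch only: the function names are made up for illustration, and it assumes the "Name:" / "State:" layout shown in the quoted output. Note that the quoted mail reports that after a manual resume lv0-snap no longer showed an origin in lvs, so resuming is not a clean recovery.

```shell
#!/bin/sh
# Sketch: list device-mapper devices that "dmsetup info" reports as SUSPENDED.
# Reads "dmsetup info" output on stdin, so it also works on saved output.
# Function names are hypothetical.
suspended_devices() {
    awk '
        $1 == "Name:"  { name = $2 }
        $1 == "State:" && $2 == "SUSPENDED" { print name }
    '
}

# Resume every suspended device (run as root, and only if you accept
# the side effects described in the quoted mail).
resume_suspended() {
    dmsetup info | suspended_devices | while read -r dev; do
        echo "resuming $dev"
        dmsetup resume "$dev"
    done
}
```

To only inspect the state without changing anything: `dmsetup info | suspended_devices`.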

