From: "J. Roeleveld" <joost@antarean.org>
To: xen-devel@lists.xensource.com
Subject: Re: Making snapshot of logical volumes handling HVM domU causes OOPS and instability
Date: Mon, 13 Sep 2010 10:33:40 +0200
Message-ID: <201009131033.40474.joost@antarean.org>
In-Reply-To: <4C8D2069.10609@sce.pridelands.org>

On Sunday 12 September 2010 20:48:09 Scott Garron wrote:
> On 9/12/2010 5:41 AM, J. Roeleveld wrote:
> > I also use LVMs extensively and do similar steps for backups.
> > 1) umount in domU
> > 2) block-detach
> > 3) lvcreate snapshot
> > 4) block-attach
> > 5) mount in domU
> 
>       I think the biggest difference, here, is that you unmount and
> detach the source volumes before creating the snapshot whereas I just
> leave them active and mounted in the guest.  I don't know if that will
> end up being the difference between stability and instability on my
> system, but it's an observation and probably worth experimentation.

I tend to umount first to ensure the filesystem is consistent and no writes are 
still left in the write-buffer on the guest.
Filesystem recoveries are fine, but why rely on them when it's not necessary? 
:)
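
A rough sketch of the five-step sequence quoted above, for illustration only.
It is not the backup script discussed in this thread. It assumes the xm
toolstack, a phy: backend and SSH access to the guest; the function name and
every argument value (guest1, vg0, data, xvdb, /mnt/data, the 1G snapshot
size) are hypothetical.

```python
def backup_steps(domain, ssh_host, vg, lv, frontend, mountpoint, snap_size="1G"):
    """Return the five commands, in order, as argv lists. Nothing is executed."""
    origin = f"/dev/{vg}/{lv}"
    return [
        # 1) umount in domU, so no writes are left in the guest's buffers
        ["ssh", ssh_host, "umount", mountpoint],
        # 2) block-detach the device from the guest
        ["xm", "block-detach", domain, frontend],
        # 3) create the snapshot while the origin LV is not in use
        ["lvcreate", "--snapshot", "--size", snap_size,
         "--name", f"{lv}-snap", origin],
        # 4) block-attach the origin LV again, writable
        ["xm", "block-attach", domain, f"phy:{origin}", frontend, "w"],
        # 5) mount in domU (assumes an fstab entry for the mountpoint)
        ["ssh", ssh_host, "mount", mountpoint],
    ]


if __name__ == "__main__":
    for cmd in backup_steps("guest1", "root@guest1", "vg0", "data",
                            "xvdb", "/mnt/data"):
        print(" ".join(cmd))
```

Each entry could be handed to subprocess.run(cmd, check=True) so that the
sequence stops at the first failing step.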

> > I, however, have no need for HVM and only use PV guests.
> 
>       It turns out that it doesn't seem isolated to HVM guests on my
> system any longer.  That was just coincidental during the first few
> crashes that I observed.

Ok, in that case I believe the issue might be related to the LVM stack and the
way Xen keeps the devices locked while they are mounted and attached.

> > Are you certain the snapshots are large enough to hold all possible
> > changes that might occur on the LV during the existence of the
> > snapshot?
> 
>       Certainly.  The most recent one to cause a crash has existed
> through the crash and for 3 days now, and is only using 2.65% of its COW
> space.  They usually don't get a chance to go above even 0.3% before the
> rsync on them is finished and they are unmounted and removed by the
> backup script.

Ok, guess that's not the cause :)
That said, I get the "unable to remove active" error both when 0% of the
snapshot is used and when over 20% is used, so it is not clear (to me) what is
causing it.
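
For anyone who wants to log the COW usage just before removing a snapshot, a
minimal sketch follows. It assumes the lvs report field snap_percent is
available in the installed LVM2 version; the function names and the
vg0/data-snap names are hypothetical.

```python
import subprocess


def parse_snap_percent(lvs_output):
    """Turn the single-field output of lvs (e.g. '   2.65\\n') into a float."""
    # Some locales print a decimal comma, so normalise it first.
    return float(lvs_output.strip().replace(",", "."))


def snapshot_usage(vg, snapshot):
    """Ask lvs for the COW usage of one snapshot, in percent."""
    result = subprocess.run(
        ["lvs", "--noheadings", "-o", "snap_percent", f"{vg}/{snapshot}"],
        check=True, capture_output=True, text=True,
    )
    return parse_snap_percent(result.stdout)
```

Calling snapshot_usage("vg0", "data-snap") right before lvremove would record
the usage at the moment the "unable to remove active" error appears.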

> > Another thing I notice, which might be of help to people who
> > understand this better than I do, in my backup-script, sometimes step
> > "5" fails because the domU hasn't noticed the device is attached
> > again when I try to mount it. The domU-commands are run using
> > SSH-connections.
> 
>       That probably just has to do with variations in how long it takes
> the guest kernel to poll or be notified of device changes, and how long
> it takes for its udev to create the device files and whatnot.
> Introducing some sanity checks or just a longer delay in your backup
> script would likely get around that problem.  (I could be wrong, though)

I do need to add some sanity checks to the script at some point, but
currently I start the scripts manually and 'fix' the leftovers myself.
The mount issue is a simple one, and I notice it within 30-40 seconds of the
scripts starting.
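
One possible shape for such a sanity check, as a sketch only: poll until the
guest actually sees the re-attached device before trying to mount it, and give
up after a timeout. The names wait_until and device_visible_in_guest, the
30-second default and the example host/device are all assumptions.

```python
import subprocess
import time


def wait_until(check, timeout=30.0, interval=1.0,
               sleep=time.sleep, clock=time.monotonic):
    """Call check() until it returns True or timeout seconds have passed."""
    deadline = clock() + timeout
    while True:
        if check():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


def device_visible_in_guest(ssh_host, device):
    """True if the guest has a block device node at the given path."""
    # `test -b` exits with 0 only when the path exists and is a block device.
    return subprocess.run(["ssh", ssh_host, "test", "-b", device]).returncode == 0
```

Step 5 would then only run when
wait_until(lambda: device_visible_in_guest("root@guest1", "/dev/xvdb"))
returns True, and the script could report an error otherwise.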

--
Joost
