From mboxrd@z Thu Jan 1 00:00:00 1970
From: Scott Garron
Subject: Re: Making snapshot of logical volumes handling HVM domU causes OOPS and instability
Date: Sun, 12 Sep 2010 14:48:09 -0400
Message-ID: <4C8D2069.10609@sce.pridelands.org>
References: <4C7864BB.1010808@sce.pridelands.org> <4C7D44B0.9060105@sce.pridelands.org> <4C80AC95.5080503@sce.pridelands.org> <201009121141.46734.joost@antarean.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Return-path:
In-Reply-To: <201009121141.46734.joost@antarean.org>
List-Unsubscribe: ,
List-Post:
List-Help:
List-Subscribe: ,
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: xen-devel@lists.xensource.com
List-Id: xen-devel@lists.xenproject.org

On 9/12/2010 5:41 AM, J. Roeleveld wrote:
> I also use LVMs extensively and do similar steps for backups.
> 1) umount in domU
> 2) block-detach
> 3) lvcreate snapshot
> 4) block-attach
> 5) mount in domU

I think the biggest difference, here, is that you unmount and detach
the source volumes before creating the snapshot whereas I just leave
them active and mounted in the guest.  I don't know if that will end up
being the difference between stability and instability on my system,
but it's an observation and probably worth experimentation.

> I, however, have no need for HVM and only use PV guests.

It turns out that it doesn't seem isolated to HVM guests on my system
any longer.  That was just coincidental during the first few crashes
that I observed.

> Are you certain the snapshots are large enough to hold all possible
> changes that might occur on the LV during the existence of the
> snapshot?

Certainly.  The most recent one to cause a crash has existed through
the crash and for 3 days now, and is only using 2.65% of its COW space.
They usually don't get a chance to go above even 0.3% before the rsync
on them is finished and they are unmounted and removed by the backup
script.

> Another thing I notice, which might be of help to people who
> understand this better then I do, in my backup-script, sometimes step
> "5" fails because the domU hasn't noticed the device is attached
> again when I try to mount it.  The domU-commands are run using
> SSH-connections.

That probably just has to do with variations in how long it takes the
guest kernel to poll or be notified of device changes, and how long it
takes for its udev to create the device files and whatnot.  Introducing
some sanity checks or just a longer delay in your backup script would
likely get around that problem.  (I could be wrong, though)

-- 
Scott Garron
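
As an illustration of the kind of sanity check suggested above, here is
a minimal Python sketch that polls for a device node before the mount
step instead of sleeping for a fixed time.  The function name
wait_for_device, the timeout and interval defaults, and the example
device path are all hypothetical and not taken from either backup
script; it assumes the check runs inside the guest.

```python
import os
import time


def wait_for_device(path, timeout=30.0, interval=0.5):
    """Poll until `path` exists or `timeout` seconds have elapsed.

    Returns True if the path appeared in time, False otherwise.
    The name and defaults are illustrative assumptions.
    """
    deadline = time.monotonic() + timeout
    while True:
        # udev may create the node some time after block-attach returns.
        if os.path.exists(path):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

In a backup script this could gate the mount, for example by calling
wait_for_device("/dev/xvdb1") (a hypothetical device name) and only
mounting when it returns True, reporting an error otherwise.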