Date: Tue, 19 Apr 2016 21:19:02 +0800
From: Fam Zheng
To: "Richard W.M. Jones"
Cc: qemu-devel@nongnu.org, Kevin Wolf, qemu-block@nongnu.org, Jeff Cody,
 Markus Armbruster, Max Reitz, den@openvz.org, pbonzini@redhat.com, John Snow
Subject: Re: [Qemu-devel] [PATCH for-2.7 v2 05/17] raw-posix: Implement .bdrv_lockf
Message-ID: <20160419131902.GJ28572@ad-mail.usersys.redhat.com>
In-Reply-To: <20160419130724.GH11600@redhat.com>
References: <1460690887-32751-1-git-send-email-famz@redhat.com>
 <1460690887-32751-6-git-send-email-famz@redhat.com>
 <20160417192725.GJ19398@redhat.com>
 <20160418011036.GA18893@ad-mail.usersys.redhat.com>
 <20160418080419.GC11600@redhat.com>
 <20160419123714.GC28572@ad-mail.usersys.redhat.com>
 <20160419130724.GH11600@redhat.com>

On Tue, 04/19 14:07, Richard W.M. Jones wrote:
> On Tue, Apr 19, 2016 at 08:37:14PM +0800, Fam Zheng wrote:
> > On Mon, 04/18 09:04, Richard W.M. Jones wrote:
> > > On Mon, Apr 18, 2016 at 09:10:36AM +0800, Fam Zheng wrote:
> > > > On Sun, 04/17 20:27, Richard W.M. Jones wrote:
> > > > > On Fri, Apr 15, 2016 at 11:27:55AM +0800, Fam Zheng wrote:
> > > > > > virtlockd in libvirt locks the first byte, so we lock byte 1 to
> > > > > > avoid interfering with it.
> > > > > > +static int raw_lockf(BlockDriverState *bs, BdrvLockfCmd cmd)
> > > > > > +{
> > > > > > +
> > > > > > +    BDRVRawState *s = bs->opaque;
> > > > > > +    int ret;
> > > > > > +    struct flock fl = (struct flock) {
> > > > > > +        .l_whence = SEEK_SET,
> > > > > > +        /* Locking byte 1 avoids interfering with virtlockd. */
> > > > > > +        .l_start = 1,
> > > > > > +        .l_len = 1,
> > > > > > +    };
> > > > > > +
> > > > > > +    switch (cmd) {
> > > > > > +    case BDRV_LOCKF_RWLOCK:
> > > > > > +        fl.l_type = F_WRLCK;
> > > > > > +        break;
> > > > > > +    case BDRV_LOCKF_ROLOCK:
> > > > > > +        fl.l_type = F_RDLCK;
> > > > > > +        break;
> > > > > > +    case BDRV_LOCKF_UNLOCK:
> > > > > > +        fl.l_type = F_UNLCK;
> > > > > > +        break;
> > > > >
> > > > > My understanding is this prevents libguestfs from working on live disk
> > > > > images -- we want to be able to read live disk images (using a
> > > > > writable overlay and the real disk image as a read-only backing file).
> > > >
> > > > Do you lock the live image or the backing file?
> > >
> > > At the moment we don't need to do anything, but we do have the problem
> > > that if someone uses libguestfs in write mode on a live image, then
> > > obviously they end up with a corrupted guest.
> > >
> > > However in this email I'm talking about using libguestfs in
> > > "read-only" mode on a live guest, which is a completely different
> > > thing, and should not be prevented.
> > >
> > > My assumption [with this patch applied] is that the live image (being
> > > live) is locked by some other qemu.
> > >
> > > Now libguestfs creates a temporary overlay with the live image as
> > > backing, using a command similar to:
> > >
> > >   qemu-img create -f qcow2 -b live.img tmp-overlay.img
> > >
> > > and opens 'tmp-overlay.img'.  But since the live image has an
> > > exclusive lock, the open will *fail*, and that is a problem.
> >
> > I'm not sure that is what we want to support. The image is opened
> > read-write by the VM; any other access is a bad idea.
>
> We've done this successfully for years, for people monitoring their
> VMs using virt-df, pulling out files using guestfish and so on.  We
> allow you to do it while the guest is live and running, with the
> proviso that a consistent view cannot always be guaranteed (although
> it works so reliably that it's never really a problem), or users can
> briefly pause VMs if they need a guaranteed consistent view.
>
> I'm afraid the onus is on you to explain why this existing practice is
> a bad idea.

It is a bad idea because it can produce erroneous data. Perhaps the problem
is not critical, and it is rare enough for the practice to remain useful in
practice. As a tradeoff, I guess we can skip the shared lock in this series.
Does that work for you?

Fam
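Below is a minimal, self-contained sketch of the byte-1 fcntl() locking
scheme discussed above. It is only an illustration of the mechanism, not the
QEMU patch itself: the demo program, its file argument, and its prompt are
invented for the example, and it uses only standard POSIX calls. Run it in
two shells against the same image file; the second instance fails to take the
byte-1 write lock, while a byte-0 lock such as virtlockd's is left untouched.

/* lock-demo.c: hypothetical standalone example, not QEMU code. */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    if (argc != 2) {
        fprintf(stderr, "usage: %s <image>\n", argv[0]);
        return EXIT_FAILURE;
    }

    int fd = open(argv[1], O_RDWR);
    if (fd < 0) {
        perror("open");
        return EXIT_FAILURE;
    }

    /* Same shape as the struct flock set up in the quoted raw_lockf():
     * lock a single byte at offset 1 so a byte-0 lock is not disturbed. */
    struct flock fl = {
        .l_type = F_WRLCK,      /* exclusive, like BDRV_LOCKF_RWLOCK */
        .l_whence = SEEK_SET,
        .l_start = 1,
        .l_len = 1,
    };

    if (fcntl(fd, F_SETLK, &fl) == -1) {
        /* Another process already holds a conflicting lock. */
        perror("fcntl(F_SETLK)");
        close(fd);
        return EXIT_FAILURE;
    }

    printf("locked byte 1 of %s; press Enter to release\n", argv[1]);
    getchar();

    fl.l_type = F_UNLCK;        /* the BDRV_LOCKF_UNLOCK case */
    fcntl(fd, F_SETLK, &fl);
    close(fd);
    return EXIT_SUCCESS;
}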