Date: Tue, 19 Apr 2016 21:19:02 +0800
From: Fam Zheng
To: "Richard W.M. Jones"
Cc: qemu-devel@nongnu.org, Kevin Wolf, qemu-block@nongnu.org, Jeff Cody,
 Markus Armbruster, Max Reitz, den@openvz.org, pbonzini@redhat.com, John Snow
Subject: Re: [Qemu-devel] [PATCH for-2.7 v2 05/17] raw-posix: Implement .bdrv_lockf
Message-ID: <20160419131902.GJ28572@ad-mail.usersys.redhat.com>
In-Reply-To: <20160419130724.GH11600@redhat.com>
References: <1460690887-32751-1-git-send-email-famz@redhat.com>
 <1460690887-32751-6-git-send-email-famz@redhat.com>
 <20160417192725.GJ19398@redhat.com>
 <20160418011036.GA18893@ad-mail.usersys.redhat.com>
 <20160418080419.GC11600@redhat.com>
 <20160419123714.GC28572@ad-mail.usersys.redhat.com>
 <20160419130724.GH11600@redhat.com>

On Tue, 04/19 14:07, Richard W.M. Jones wrote:
> On Tue, Apr 19, 2016 at 08:37:14PM +0800, Fam Zheng wrote:
> > On Mon, 04/18 09:04, Richard W.M. Jones wrote:
> > > On Mon, Apr 18, 2016 at 09:10:36AM +0800, Fam Zheng wrote:
> > > > On Sun, 04/17 20:27, Richard W.M. Jones wrote:
> > > > > On Fri, Apr 15, 2016 at 11:27:55AM +0800, Fam Zheng wrote:
> > > > > > virtlockd in libvirt locks the first byte, so we lock byte 1 to
> > > > > > avoid interfering with it.
> > > > > > +static int raw_lockf(BlockDriverState *bs, BdrvLockfCmd cmd)
> > > > > > +{
> > > > > > +
> > > > > > +    BDRVRawState *s = bs->opaque;
> > > > > > +    int ret;
> > > > > > +    struct flock fl = (struct flock) {
> > > > > > +        .l_whence = SEEK_SET,
> > > > > > +        /* Locking byte 1 avoids interfering with virtlockd. */
> > > > > > +        .l_start = 1,
> > > > > > +        .l_len = 1,
> > > > > > +    };
> > > > > > +
> > > > > > +    switch (cmd) {
> > > > > > +    case BDRV_LOCKF_RWLOCK:
> > > > > > +        fl.l_type = F_WRLCK;
> > > > > > +        break;
> > > > > > +    case BDRV_LOCKF_ROLOCK:
> > > > > > +        fl.l_type = F_RDLCK;
> > > > > > +        break;
> > > > > > +    case BDRV_LOCKF_UNLOCK:
> > > > > > +        fl.l_type = F_UNLCK;
> > > > > > +        break;
> > > > >
> > > > > My understanding is this prevents libguestfs from working on live disk
> > > > > images -- we want to be able to read live disk images (using a
> > > > > writable overlay and the real disk image as a read-only backing file).
> > > >
> > > > Do you lock the live image or the backing file?
> > >
> > > At the moment we don't need to do anything, but we do have the problem
> > > that if someone uses libguestfs in write mode on a live image, then
> > > obviously they end up with a corrupted guest.
> > >
> > > However in this email I'm talking about using libguestfs in
> > > "read-only" mode on a live guest, which is a completely different
> > > thing, and should not be prevented.
> > >
> > > My assumption [with this patch applied] is that the live image (being
> > > live) is locked by some other qemu.
> > >
> > > Now libguestfs creates a temporary overlay with the live image as
> > > backing, using a command similar to:
> > >
> > >   qemu-img create -f qcow2 -b live.img tmp-overlay.img
> > >
> > > and opens 'tmp-overlay.img'.  But since the live image has an
> > > exclusive lock, the open will *fail*, and that is a problem.
> >
> > I'm not sure that is what we want to support. The image is opened
> > read-write by the VM; any other access is a bad idea.
>
> We've done this successfully for years, for people monitoring their
> VMs using virt-df, pulling out files using guestfish and so on.  We
> allow you to do it while the guest is live and running, with the
> proviso that a consistent view cannot always be guaranteed (although
> it works so reliably that it's never really a problem), or users can
> briefly pause VMs if they need a guaranteed consistent view.
>
> I'm afraid the onus is on you to explain why this existing practice is
> a bad idea.

It is a bad idea because it can produce erroneous data. Perhaps the problem
is not critical, and it is rare enough for the practice to remain useful in
practice. As a tradeoff, I guess we can skip the shared lock in this series.
Does that work for you?

Fam
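Below is a minimal, self-contained sketch of the byte-1 fcntl() locking
scheme discussed above. It is only an illustration of the mechanism, not the
QEMU patch itself: the demo program, its file argument, and its prompt are
invented for the example, and it uses only standard POSIX calls. Run it in
two shells against the same image file; the second instance fails to take the
byte-1 write lock, while a byte-0 lock such as virtlockd's is left untouched.

/* lock-demo.c: hypothetical standalone example, not QEMU code. */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    if (argc != 2) {
        fprintf(stderr, "usage: %s <image>\n", argv[0]);
        return EXIT_FAILURE;
    }

    int fd = open(argv[1], O_RDWR);
    if (fd < 0) {
        perror("open");
        return EXIT_FAILURE;
    }

    /* Same shape as the struct flock set up in the quoted raw_lockf():
     * lock a single byte at offset 1 so a byte-0 lock is not disturbed. */
    struct flock fl = {
        .l_type = F_WRLCK,      /* exclusive, like BDRV_LOCKF_RWLOCK */
        .l_whence = SEEK_SET,
        .l_start = 1,
        .l_len = 1,
    };

    if (fcntl(fd, F_SETLK, &fl) == -1) {
        /* Another process already holds a conflicting lock. */
        perror("fcntl(F_SETLK)");
        close(fd);
        return EXIT_FAILURE;
    }

    printf("locked byte 1 of %s; press Enter to release\n", argv[1]);
    getchar();

    fl.l_type = F_UNLCK;        /* the BDRV_LOCKF_UNLOCK case */
    fcntl(fd, F_SETLK, &fl);
    close(fd);
    return EXIT_SUCCESS;
}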