qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Chris Webb <chris@arachsys.com>
To: "Richard W.M. Jones" <rjones@redhat.com>
Cc: qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [PATCH] Disk image shared and exclusive locks.
Date: Mon, 7 Dec 2009 10:39:08 +0000	[thread overview]
Message-ID: <20091207103908.GI2271@arachsys.com> (raw)
In-Reply-To: <20091204165301.GA4167@amd.home.annexia.org>

Hi. There's a connected discussion on the sheepdog list about locking, and I
have a patch there which could complement this one quite well.

Sheepdog is a distributed, replicated block store being developed
(primarily) for Qemu. Images have a mandatory exclusive locking requirement,
enforced by the cluster manager. Without this, the replication scheme
breaks down and you can end up with inconsistent copies of the block
image.

The initial release of sheepdog took these locks in the block driver
bdrv_open() and bdrv_close() hooks. They also added a bdrv_closeall() and
ensured it was called in all the usual qemu exit paths to avoid stray locks.
(The rarer case of crashing hosts or crashing qemus will have to be handled
externally, and is 'to do'.)

The problem was that this prevented live migration, because both ends wanted
to open the image at once, even though only one would be using it at a time.
To get around this, we introduced bdrv_claim() and bdrv_release() block
driver hooks, which can be used to claim and release the lock on vm start
and stop. Since at most one end of a live migration is running at any one
time, this is sufficient to allow migration for Sheepdog devices. It would
also allow live migration of virtual machines on any other shared block
devices with mandatory exclusive locking, such as the locks introduced by
your patch.

My patch in the sheepdog qemu branch is here:

  http://sheepdog.git.sourceforge.net/git/gitweb.cgi?p=sheepdog/qemu-kvm;a=commitdiff;h=fd9d7c739831cf04e5a913a89bdca680ba028ada

I was intending to rebase it against current qemu head for more detailed
review over the next couple of weeks after it's had a little more testing,
but given that your patch now provides a second use-case for it outside of
sheepdog, this seems like a timely moment to ask for any more general
feedback about our proposed approach.

Cheers,

Chris.

  parent reply	other threads:[~2009-12-07 10:39 UTC|newest]

Thread overview: 52+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-12-04 16:53 [Qemu-devel] [PATCH] Disk image shared and exclusive locks Richard W.M. Jones
2009-12-04 17:15 ` Anthony Liguori
2009-12-04 21:57   ` Richard W.M. Jones
2009-12-04 22:29     ` Anthony Liguori
2009-12-05 17:31       ` Avi Kivity
2009-12-05 17:47         ` Anthony Liguori
2009-12-05 17:55           ` Avi Kivity
2009-12-05 17:59             ` Anthony Liguori
2009-12-07 10:31               ` Jamie Lokier
2009-12-07 10:42                 ` Kevin Wolf
2009-12-07 10:48                   ` Avi Kivity
2009-12-07 10:56                     ` Kevin Wolf
2009-12-07 11:28                   ` Jamie Lokier
2009-12-07 11:51                     ` Kevin Wolf
2009-12-07 12:06                     ` Daniel P. Berrange
2009-12-07 10:45                 ` Daniel P. Berrange
2009-12-07 11:19                   ` Jamie Lokier
2009-12-07 11:30                     ` Daniel P. Berrange
2009-12-07 11:31                       ` Richard W.M. Jones
2009-12-07 11:38                         ` Jamie Lokier
2009-12-07 11:49                         ` Daniel P. Berrange
2009-12-07 11:59                           ` Richard W.M. Jones
2009-12-07 14:35                           ` [Qemu-devel] " Paolo Bonzini
2009-12-07 13:43                       ` [Qemu-devel] " Anthony Liguori
2009-12-07 14:01                         ` Daniel P. Berrange
2009-12-07 14:15                           ` Anthony Liguori
2009-12-07 14:28                             ` Daniel P. Berrange
2009-12-07 14:53                               ` Anthony Liguori
2009-12-08  9:40                                 ` Kevin Wolf
2009-12-07 11:04                 ` Richard W.M. Jones
2009-12-07 10:58           ` Richard W.M. Jones
2009-12-07 11:35             ` Jamie Lokier
2009-12-07 13:39             ` Anthony Liguori
2009-12-07 14:08               ` Richard W.M. Jones
2009-12-07 14:22                 ` Anthony Liguori
2009-12-07 14:31                   ` Richard W.M. Jones
2009-12-07 14:55                     ` Anthony Liguori
2009-12-08  9:48                     ` Kevin Wolf
2009-12-08 10:16                       ` Richard W.M. Jones
2009-12-07 14:38               ` [Qemu-devel] " Paolo Bonzini
2009-12-07  9:38   ` [Qemu-devel] " Daniel P. Berrange
2009-12-07 10:39 ` Chris Webb [this message]
2009-12-07 13:32   ` Anthony Liguori
2009-12-07 13:38     ` Chris Webb
2009-12-07 13:47       ` Anthony Liguori
2009-12-07 14:25     ` Daniel P. Berrange
2009-12-07 14:58       ` Chris Webb
2009-12-07 14:16 ` [Qemu-devel] [PATCH VERSION 2] " Richard W.M. Jones
2009-12-07 15:06   ` Anthony Liguori
2009-12-08  8:48     ` [Qemu-devel] " Paolo Bonzini
2009-12-08 10:00   ` [Qemu-devel] " Kevin Wolf
2009-12-08 10:25     ` Richard W.M. Jones

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20091207103908.GI2271@arachsys.com \
    --to=chris@arachsys.com \
    --cc=qemu-devel@nongnu.org \
    --cc=rjones@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).