From: Kevin Wolf
Date: Wed, 26 Oct 2016 17:33:57 +0200
Message-ID: <20161026153357.GX4758@noname.str.redhat.com>
References: <1475237406-26917-1-git-send-email-famz@redhat.com> <20161024101111.GB4374@noname.redhat.com> <20161025082435.GA4695@noname.str.redhat.com> <527aa7ab-3e62-90d9-3551-22036e2ce7fe@redhat.com> <20161025145739.GL4695@noname.str.redhat.com> <20161026110128.GF14605@lemon> <3144559c-6f10-317b-072e-fa4a40ef18d7@redhat.com>
In-Reply-To: <3144559c-6f10-317b-072e-fa4a40ef18d7@redhat.com>
Subject: Re: [Qemu-devel] [PATCH v8 00/36] block: Image locking series
To: Max Reitz
Cc: Fam Zheng, qemu-block@nongnu.org, Markus Armbruster, Jeff Cody, qemu-devel@nongnu.org, rjones@redhat.com, pbonzini@redhat.com, stefanha@redhat.com, den@openvz.org, John Snow

On 26.10.2016 at 17:12, Max Reitz wrote:
> On 26.10.2016 13:01, Fam Zheng wrote:
> > On Tue, 10/25 16:57, Kevin Wolf wrote:
> >> On 25.10.2016 at 15:30, Max Reitz wrote:
> >>> On 25.10.2016 10:24, Kevin Wolf wrote:
> >>>> On 24.10.2016 at 20:03, Max Reitz wrote:
> >>>>> On 24.10.2016 12:11, Kevin Wolf wrote:
> >>>>>
> >>>>> [...]
> >>>>>
> >>>>>> Now, the big question is how to translate this into file locking.
> >>>>>> This could become a little tricky. I had a few thoughts involving another
> >>>>>> lock on byte 2, but none of them actually worked out so far, because
> >>>>>> what we want is essentially a lock that can be shared by readers, that
> >>>>>> can also be shared by writers, but not by readers and writers at the
> >>>>>> same time.
> >>>>>
> >>>>> You can also share it between readers and writers, as long as everyone
> >>>>> can cope with volatile data.
> >>>>
> >>>> Sorry, that was ambiguous. I meant a file-level lock rather than the
> >>>> high-level one. If we had a lock that can be shared by one or the other,
> >>>> but not both, then two locks would be enough to build what we really
> >>>> want.
> >>>>
> >>>>> I agree that it's very similar to the proposed op blocker style, but I
> >>>>> can't really come up with a meaningful translation either.
> >>>>>
> >>>>> Maybe something like this (?): All readers who do not want the file to
> >>>>> be modified grab a shared lock on byte 1. All writers who can deal with
> >>>>> volatile data grab a shared lock on byte 2. Exclusive writers grab an
> >>>>> exclusive lock on byte 1 and 2. Readers who can cope with volatile data
> >>>>> get no lock at all.
> >>>>>
> >>>>> When opening, the first and second group would always have to test
> >>>>> whether there is a lock on the other byte, respectively. E.g. sharing
> >>>>> writers would first grab an exclusive lock on byte 1, then the shared
> >>>>> lock on byte 2 and then release the exclusive lock again.
> >>>>>
> >>>>> Would that work?
> >>>>
> >>>> I'm afraid it wouldn't. If you start the sharing writer first and then
> >>>> the writer-blocking reader, the writer doesn't hold a lock on byte 1 any
> >>>> more,
> >>>
> >>> But it holds a lock on byte 2.
> >>>
> >>>> so the reader can start even though someone is writing to the
> >>>> image.
> >>>
> >>> It can't because it would try to grab an exclusive lock on byte 2 before
> >>> grabbing the shared lock on byte 1.
> >>
> >> Apparently I failed to understand the most important part of the
> >> proposal. :-)
> >>
> >> So we have two locks. Both are only held for a longer time in shared
> >> mode. Exclusive mode is only used for testing whether the lock is being
> >> held and is immediately given up again.
> >>
> >> The meaning of holding a shared lock is:
> >>
> >>     byte 1: I can't allow other processes to write to the image
> >>     byte 2: I am writing to the image
> >>
> >> The four cases that we have involve:
> >>
> >> * shared writer: Take shared lock on byte 2. Test whether byte 1 is
> >>   locked using an exclusive lock, and fail if so.
> >>
> >> * exclusive writer: Take shared lock on byte 2. Test whether byte 1 is
> >>   locked using an exclusive lock, and fail if so. Then take shared lock
> >>   on byte 1. I suppose this is racy, but we can probably tolerate that.
> >>
> >> * reader that can tolerate writers: Don't do anything
> >>
> >> * reader that can't tolerate writers: Take shared lock on byte 1. Test
> >>   whether byte 2 is locked, and fail if so.
> >>
> >> Seems to work if I got that right.
> >
> > Does this mean I should change ImageLockMode to:
> >
> > * exclusive
> > * shared-write
> > * shared-read
>
> Hm, those don't sound quite right, since it sounds as if you could mix
> shared-read and shared-write. But you shouldn't be able to open an image
> in shared-read lock mode when someone has opened it in shared-write lock
> mode already.
>
> It's difficult to come up with a clear but short name for shared-read
> ("exclusive", "shared-write", and "nolock" sound good to me). Maybe
> "non-volatile" or "constant"? Or maybe "shared-only-read" would be clear
> enough?

As we concluded above, this is really a matrix of two bools rather than a
single property.
We need a new option for "I can't allow other processes to write to the
image", but we don't need one for "I am writing to the image" because
that's the read-only property that we already have.

> > * nolock
> > * auto
> >
> > Where "auto" maps to exclusive for O_RDWR and shared-read for O_RDONLY?
>
> Yep, that would be the correct mapping. Maybe later we can introduce an
> auto-shared mode that maps to shared-write or nolock, respectively.

No auto needed any more, the default is simply false (i.e. don't share
with other writers).

Kevin
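
For illustration, the "matrix of two bools" discussed in this message can be sketched as a small in-memory model. This is only a sketch under assumptions: the names OpenMode, share_writers, held_bytes and compatible are hypothetical and not taken from the patch series, and no real file locks are used. It models only which combinations of openers are meant to coexist, not the lock/probe ordering on the file or its races.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class OpenMode:
    """One opener of an image, described by the two bools from the thread."""
    # "I am writing to the image", i.e. the image is not opened read-only.
    writes: bool
    # Hypothetical option name. False means "I can't allow other processes
    # to write to the image".
    share_writers: bool


def held_bytes(mode):
    """Bytes on which this opener would keep a long-lived shared lock."""
    held = set()
    if not mode.share_writers:
        held.add(1)  # byte 1: I can't allow other processes to write
    if mode.writes:
        held.add(2)  # byte 2: I am writing to the image
    return held


def compatible(a, b):
    """True if openers a and b may have the image open at the same time."""
    a_held, b_held = held_bytes(a), held_bytes(b)
    # A conflict exists when one side writes (byte 2) while the other side
    # refuses writers (byte 1). An opener never conflicts with itself.
    return not ((1 in a_held and 2 in b_held) or (2 in a_held and 1 in b_held))


# The four cases listed in the quoted message (labels are illustrative).
MODES = {
    "shared writer": OpenMode(writes=True, share_writers=True),
    "exclusive writer": OpenMode(writes=True, share_writers=False),
    "tolerant reader": OpenMode(writes=False, share_writers=True),
    "intolerant reader": OpenMode(writes=False, share_writers=False),
}

if __name__ == "__main__":
    for (name_a, a), (name_b, b) in product(MODES.items(), repeat=2):
        verdict = "ok" if compatible(a, b) else "conflict"
        print(f"{name_a:18} + {name_b:18} -> {verdict}")
```

Run as a script, it prints the full pairwise table. With share_writers defaulting to false, a read-write opener corresponds to the "exclusive writer" row and a read-only opener to the "intolerant reader" row, which would match the behaviour the old "auto" mode was meant to provide.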