From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:52296)
 by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
 id 1bzQEE-0001Y9-Fh for qemu-devel@nongnu.org;
 Wed, 26 Oct 2016 11:35:02 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
 (envelope-from ) id 1bzQED-0004Mo-6G for qemu-devel@nongnu.org;
 Wed, 26 Oct 2016 11:34:58 -0400
References: <1475237406-26917-1-git-send-email-famz@redhat.com>
 <20161024101111.GB4374@noname.redhat.com>
 <20161025082435.GA4695@noname.str.redhat.com>
 <527aa7ab-3e62-90d9-3551-22036e2ce7fe@redhat.com>
 <20161025145739.GL4695@noname.str.redhat.com>
 <20161026110128.GF14605@lemon>
 <3144559c-6f10-317b-072e-fa4a40ef18d7@redhat.com>
 <20161026153357.GX4758@noname.str.redhat.com>
From: Max Reitz
Message-ID: <632c8181-da49-c52f-08ef-ff6600536cfa@redhat.com>
Date: Wed, 26 Oct 2016 17:34:42 +0200
MIME-Version: 1.0
In-Reply-To: <20161026153357.GX4758@noname.str.redhat.com>
Content-Type: multipart/signed; micalg=pgp-sha256;
 protocol="application/pgp-signature";
 boundary="X8ssWB5of4LnRS0u4F5KNdAps5KKIhcch"
Subject: Re: [Qemu-devel] [PATCH v8 00/36] block: Image locking series
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
To: Kevin Wolf
Cc: Fam Zheng , qemu-block@nongnu.org, Markus Armbruster , Jeff Cody ,
 qemu-devel@nongnu.org, rjones@redhat.com, pbonzini@redhat.com,
 stefanha@redhat.com, den@openvz.org, John Snow

This is an OpenPGP/MIME signed message (RFC 4880 and 3156)
--X8ssWB5of4LnRS0u4F5KNdAps5KKIhcch
From: Max Reitz
To: Kevin Wolf
Cc: Fam Zheng , qemu-block@nongnu.org, Markus Armbruster , Jeff Cody ,
 qemu-devel@nongnu.org, rjones@redhat.com, pbonzini@redhat.com,
 stefanha@redhat.com, den@openvz.org, John Snow
Message-ID: <632c8181-da49-c52f-08ef-ff6600536cfa@redhat.com>
Subject: Re: [Qemu-devel] [PATCH v8 00/36] block: Image locking series
References: <1475237406-26917-1-git-send-email-famz@redhat.com>
 <20161024101111.GB4374@noname.redhat.com>
 <20161025082435.GA4695@noname.str.redhat.com>
 <527aa7ab-3e62-90d9-3551-22036e2ce7fe@redhat.com>
 <20161025145739.GL4695@noname.str.redhat.com>
 <20161026110128.GF14605@lemon>
 <3144559c-6f10-317b-072e-fa4a40ef18d7@redhat.com>
 <20161026153357.GX4758@noname.str.redhat.com>
In-Reply-To: <20161026153357.GX4758@noname.str.redhat.com>
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: quoted-printable

On 26.10.2016 17:33, Kevin Wolf wrote:
> On 26.10.2016 at 17:12, Max Reitz wrote:
>> On 26.10.2016 13:01, Fam Zheng wrote:
>>> On Tue, 10/25 16:57, Kevin Wolf wrote:
>>>> On 25.10.2016 at 15:30, Max Reitz wrote:
>>>>> On 25.10.2016 10:24, Kevin Wolf wrote:
>>>>>> On 24.10.2016 at 20:03, Max Reitz wrote:
>>>>>>> On 24.10.2016 12:11, Kevin Wolf wrote:
>>>>>>>
>>>>>>> [...]
>>>>>>>
>>>>>>>> Now, the big question is how to translate this into file locking. This
>>>>>>>> could become a little tricky. I had a few thoughts involving another
>>>>>>>> lock on byte 2, but none of them actually worked out so far, because
>>>>>>>> what we want is essentially a lock that can be shared by readers, that
>>>>>>>> can also be shared by writers, but not by readers and writers at the
>>>>>>>> same time.
>>>>>>>
>>>>>>> You can also share it between readers and writers, as long as everyone
>>>>>>> can cope with volatile data.
>>>>>>
>>>>>> Sorry, that was ambiguous. I meant a file-level lock rather than the
>>>>>> high-level one. If we had a lock that can be shared by one or the other,
>>>>>> but not both, then two locks would be enough to build what we really
>>>>>> want.
>>>>>>
>>>>>>> I agree that it's very similar to the proposed op blocker style, but I
>>>>>>> can't really come up with a meaningful translation either.
>>>>>>>
>>>>>>> Maybe something like this (?): All readers who do not want the file to
>>>>>>> be modified grab a shared lock on byte 1.
>>>>>>> All writers who can deal with
>>>>>>> volatile data grab a shared lock on byte 2. Exclusive writers grab an
>>>>>>> exclusive lock on byte 1 and 2. Readers who can cope with volatile data
>>>>>>> get no lock at all.
>>>>>>>
>>>>>>> When opening, the first and second group would always have to test
>>>>>>> whether there is a lock on the other byte, respectively. E.g. sharing
>>>>>>> writers would first grab an exclusive lock on byte 1, then the shared
>>>>>>> lock on byte 2 and then release the exclusive lock again.
>>>>>>>
>>>>>>> Would that work?
>>>>>>
>>>>>> I'm afraid it wouldn't. If you start the sharing writer first and then
>>>>>> the writer-blocking reader, the writer doesn't hold a lock on byte 1 any
>>>>>> more,
>>>>>
>>>>> But it holds a lock on byte 2.
>>>>>
>>>>>> so the reader can start even though someone is writing to the
>>>>>> image.
>>>>>
>>>>> It can't because it would try to grab an exclusive lock on byte 2 before
>>>>> grabbing the shared lock on byte 1.
>>>>
>>>> Apparently I failed to understand the most important part of the
>>>> proposal. :-)
>>>>
>>>> So we have two locks. Both are only held for a longer time in shared
>>>> mode. Exclusive mode is only used for testing whether the lock is being
>>>> held and is immediately given up again.
>>>>
>>>> The meaning of holding a shared lock is:
>>>>
>>>> byte 1: I can't allow other processes to write to the image
>>>> byte 2: I am writing to the image
>>>>
>>>> The four cases that we have involve:
>>>>
>>>> * shared writer: Take shared lock on byte 2. Test whether byte 1 is
>>>>   locked using an exclusive lock, and fail if so.
>>>>
>>>> * exclusive writer: Take shared lock on byte 2. Test whether byte 1 is
>>>>   locked using an exclusive lock, and fail if so. Then take shared lock
>>>>   on byte 1. I suppose this is racy, but we can probably tolerate that.
>>>>
>>>> * reader that can tolerate writers: Don't do anything
>>>>
>>>> * reader that can't tolerate writers: Take shared lock on byte 1. Test
>>>>   whether byte 2 is locked, and fail if so.
>>>>
>>>> Seems to work if I got that right.
>>>
>>> Does this mean I should change ImageLockMode to:
>>>
>>> * exclusive
>>> * shared-write
>>> * shared-read
>>
>> Hm, those don't sound quite right, since it sounds as if you could mix
>> shared-read and shared-write. But you shouldn't be able to open an image
>> in shared-read lock mode when someone has opened it in shared-write lock
>> mode already.
>>
>> It's difficult to come up with a clear but short name for shared-read
>> ("exclusive", "shared-write", and "nolock" sound good to me). Maybe
>> "non-volatile" or "constant"? Or maybe "shared-only-read" would be clear
>> enough?
>
> As we concluded above, this is really a matrix of two bools rather than a
> single property. We need a new option for "I can't allow other processes
> to write to the image", but we don't need one for "I am writing to the
> image" because that's the read-only property that we already have.
>
>>> * nolock
>>> * auto
>>>
>>> Where "auto" maps to exclusive for O_RDWR and shared-read for O_RDONLY?
>>
>> Yep, that would be the correct mapping. Maybe later we can introduce an
>> auto-shared mode that maps to shared-write or nolock, respectively.
>
> No auto needed any more, the default is simply false (i.e. don't share
> with other writers).

Well, that was too easy. :-)

Max


--X8ssWB5of4LnRS0u4F5KNdAps5KKIhcch
Content-Type: application/pgp-signature; name="signature.asc"
Content-Description: OpenPGP digital signature
Content-Disposition: attachment; filename="signature.asc"

-----BEGIN PGP SIGNATURE-----

iQEvBAEBCAAZBQJYEM0SEhxtcmVpdHpAcmVkaGF0LmNvbQAKCRD0B9sAYdXPQPHn
CACgivvXO9DBvYMz04UCUTNeQUs07Z3a9FmSDsIJSorucG6PvNwdFyz28KgP56Ii
q0whU8XP6Z7SiIu49+6SSLLG909at80MvOC1PQ50JwEamqZ9dazieXRgX1Urn8SN
2YiuMEXEtGpYp9OSB3Z6fjpGNPtr/NLYn+LNQETrlaiBQxoMq6xElpN/0JO5uAeS
lYtgkGLqyLoohXD+LqYdY0Mw959ppoDa9QNkbOoy+tmibyFbnbH7HYt9UWPBizM6
GnMOOWdPcGXkjFd9pinQvpYTmFomNL3VSMvXjoDwBI7RmM4xapJMSg8fV/Fw1mDQ
MT1Y3KWG4qW8xA7yL1+uCJy2
=b2pR
-----END PGP SIGNATURE-----

--X8ssWB5of4LnRS0u4F5KNdAps5KKIhcch--
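
The two-byte scheme summarized in the quoted text, together with the "matrix of two bools" conclusion, can be modelled roughly as below. This is only a minimal in-memory sketch, not QEMU code and not real fcntl() byte-range locking. The names `ImageLocks`, `try_open`, `close`, `writes` and `blocks_writers` are hypothetical. "Test with an exclusive lock" is modelled as "does any other opener hold a shared lock on that byte". One assumption goes beyond the summary: an opener that both writes and blocks writers also tests byte 2 for other holders after taking byte 1, a step the thread does not spell out. The sketch also ignores the race mentioned for the exclusive-writer case.

```python
# Meaning of holding a shared lock, as stated in the thread:
BYTE_BLOCK_WRITERS = 1  # "I can't allow other processes to write to the image"
BYTE_WRITING = 2        # "I am writing to the image"


class ImageLocks:
    """Hypothetical in-memory stand-in for the two lock bytes of one image."""

    def __init__(self):
        # For each byte, the set of opener ids holding a shared lock on it.
        self.holders = {BYTE_BLOCK_WRITERS: set(), BYTE_WRITING: set()}

    def _locked_by_others(self, byte, me):
        # Models "take an exclusive lock and drop it again": that attempt
        # fails exactly when another opener holds a shared lock on the byte.
        return bool(self.holders[byte] - {me})

    def try_open(self, me, writes, blocks_writers):
        """Return True if opener `me` (a unique id) may open the image."""
        taken = []
        ok = True
        if writes:
            # Writer: take shared lock on byte 2, then test byte 1.
            self.holders[BYTE_WRITING].add(me)
            taken.append(BYTE_WRITING)
            ok = not self._locked_by_others(BYTE_BLOCK_WRITERS, me)
        if ok and blocks_writers:
            # Can't tolerate other writers: take shared lock on byte 1,
            # then test byte 2 (for writers this test is an assumption).
            self.holders[BYTE_BLOCK_WRITERS].add(me)
            taken.append(BYTE_BLOCK_WRITERS)
            ok = not self._locked_by_others(BYTE_WRITING, me)
        if not ok:
            # Failed open: give back whatever was taken along the way.
            for byte in taken:
                self.holders[byte].discard(me)
        return ok

    def close(self, me):
        """Drop all shared locks held by opener `me`."""
        for owners in self.holders.values():
            owners.discard(me)
```

In this model, `writes=True, blocks_writers=False` corresponds to the shared writer, `writes=True, blocks_writers=True` to the exclusive writer, and `writes=False` with `blocks_writers` set or unset to the two reader cases. Two shared writers can open the same `ImageLocks` instance together, while a reader with `blocks_writers=True` is refused until both have closed.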