Date: Mon, 7 Dec 2009 11:49:32 +0000
From: "Daniel P. Berrange"
Subject: Re: [Qemu-devel] [PATCH] Disk image shared and exclusive locks.
Message-ID: <20091207114932.GL24530@redhat.com>
In-Reply-To: <20091207113147.GO23109@amd.home.annexia.org>
To: "Richard W.M. Jones"
Cc: qemu-devel@nongnu.org, Avi Kivity

On Mon, Dec 07, 2009 at 11:31:47AM +0000, Richard W.M. Jones wrote:
> On Mon, Dec 07, 2009 at 11:30:14AM +0000, Daniel P. Berrange wrote:
> > On Mon, Dec 07, 2009 at 11:19:54AM +0000, Jamie Lokier wrote:
> > >
> > > No, the question is whether it makes sense to provide a 'shared'
> > > option on the command line, or simply to always map:
> > >
> > >   image opened read only => F_RDLCK
> > >   image opened writable  => F_WRLCK
> > >
> > > and provide only a single command line option: 'lock'.
> >
> > That doesn't work in the case of setting up a clustered filesystem
> > shared between guests. That requires that the disk be opened writable,
> > but with a shared (F_RDLCK) lock.
>
> I think Jamie's point is that you might as well use no locking at all
> in this configuration. It's hard to see what lock=shared is
> protecting you against.

This is saying that there is no need to protect 'shared writers'
from 'exclusive writers', which is not true.

Consider a cluster of VMs managed by libvirt with a shared writable
disk running the GFS filesystem, and no locking on the disk file.
Now an admin comes along with libguestfs and attempts to access the
disk containing the GFS volume. libguestfs isn't part of the cluster,
but that doesn't matter, because you can happily access a GFS
filesystem in standalone mode provided it is not in use by any other
nodes. We need to stop libguestfs opening the disk in exclusive-write
mode if other QEMU VMs are using it in shared-write mode.

If the QEMU instances with the shared-writable disk had been using
F_RDLCK, this would have prevented libguestfs from opening the disk
for write with F_WRLCK, since an F_RDLCK blocks all F_WRLCK attempts.

We really do have 3 combinations of locking / access mode here:

 - read-only  + F_RDLCK
 - read-write + F_RDLCK
 - read-write + F_WRLCK

So a single 'lock' flag is not sufficient; we need the shared/exclusive
semantics of the original patch.
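
To make the three combinations above concrete, here is a minimal Python
sketch. It is not code from the patch: open_locked() and conflicts() are
hypothetical names used only for illustration. It relies on Python's
fcntl.lockf(), which wraps the fcntl() locking calls, so LOCK_SH
corresponds to F_RDLCK and LOCK_EX to F_WRLCK. The fourth combination
(read-only + F_WRLCK) is rejected up front, since a write lock needs a
descriptor that is open for writing.

```python
import fcntl
import os


def open_locked(path, writable, exclusive):
    """Open `path` and take a non-blocking whole-file fcntl lock.

    Hypothetical helper for illustration; not taken from the QEMU patch.
    """
    if exclusive and not writable:
        # F_WRLCK needs a descriptor that is open for writing.
        raise ValueError("exclusive lock requires a writable open")
    fd = os.open(path, os.O_RDWR if writable else os.O_RDONLY)
    # LOCK_SH maps to F_RDLCK, LOCK_EX to F_WRLCK; LOCK_NB makes the
    # call fail immediately instead of waiting for another holder.
    op = (fcntl.LOCK_EX if exclusive else fcntl.LOCK_SH) | fcntl.LOCK_NB
    try:
        fcntl.lockf(fd, op)
    except OSError:
        os.close(fd)
        raise
    return fd


def conflicts(held_exclusive, wanted_exclusive):
    """Locks held by different processes clash if either is exclusive."""
    return held_exclusive or wanted_exclusive
```

In the GFS scenario, each clustered guest would call something like
open_locked(path, writable=True, exclusive=False), and a standalone tool
asking for writable=True, exclusive=True would get an OSError while any
guest still holds its shared lock. Note that fcntl locks are per-process,
so the conflict only shows up between separate processes.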

Daniel
-- 
|: Red Hat, Engineering, London   -o-   http://people.redhat.com/berrange/ :|
|: http://libvirt.org  -o-  http://virt-manager.org  -o-  http://ovirt.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: GnuPG: 7D3B9505  -o-  F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|