From: Juan Quintela
Subject: Re: KVM call agenda for Jan 11
Date: Tue, 11 Jan 2011 14:41:44 +0100
References: <4D2C3B4D.2090709@redhat.com>
In-Reply-To: <4D2C3B4D.2090709@redhat.com> (Kevin Wolf's message of "Tue, 11 Jan 2011 12:13:17 +0100")
Reply-To: quintela@redhat.com
To: Kevin Wolf
Cc: qemu-devel@nongnu.org, Chris Wright, kvm-devel

Kevin Wolf wrote:
> On 10.01.2011 14:32, Juan Quintela wrote:
>> Juan Quintela wrote:
>>> Juan Quintela wrote:
>>>
>>> Now sent it to the right kvm list.  Sorry for the second sent.
>>>
>>>> Please send any agenda items you are interested in covering.
>>>>
>>>> - KVM Forum 2011 (Jes).
>>>>
>>>> thanks, Juan.
>>
>> - migration and block devices: a mess.
>>   * patches I sent last week: only work for root (for some definition of
>>     work)
>>   * qemu is used as non-root user.
>>   * forcing to have cache=none solves the issue
>
> I need to have a look at the specific problem, but it's hard to imagine
> that cache=none fixes anything reliably.

It uses O_DIRECT, which means that we don't have buffering problems.

I state the problem again:

machine A reads the 1st block of the device.
machine B reads/writes lots of places, including the 1st block.
now the guest from machine A migrates to machine B
machine A re-reads the 1st block and, lo and behold, it reads the old
contents, not the new ones.

Solutions:
- invalidate all buffers for that block device on machine A after
  migration.
  * with NFS: just close + reopen the file (and pray that nobody else
    also has it open)
  * with block devices: use the BLKFLSBUF ioctl, and pray that nobody
    else is using the device, that the device is not a ramdisk, and some
    more things.
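
For illustration only, the two invalidation approaches listed above could
be sketched roughly as follows. This is not what qemu does; it is a
minimal Python sketch assuming Linux, where the buffer-flush ioctl is
BLKFLSBUF from <linux/fs.h>. The helper names reopen() and
flush_block_device_cache() are made up for this example, and neither
helper addresses the "pray" part (other users of the file or device).

```python
import fcntl
import os
import stat

# BLKFLSBUF is _IO(0x12, 97) in <linux/fs.h>. Python's fcntl module does
# not export it, so the numeric value is spelled out here.
BLKFLSBUF = 0x1261


def reopen(fd, path, flags=os.O_RDONLY):
    """Close fd and open path again (hypothetical helper, NFS case).

    This leans on NFS close-to-open consistency and does nothing about
    other processes that still have the file open.
    """
    os.close(fd)
    return os.open(path, flags)


def flush_block_device_cache(path):
    """Ask the kernel to drop cached buffers for a block device.

    The ioctl needs CAP_SYS_ADMIN; without it, PermissionError is raised.
    Refuses anything that is not a block device (hypothetical safety check).
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        if not stat.S_ISBLK(os.fstat(fd).st_mode):
            raise ValueError(f"{path} is not a block device")
        fcntl.ioctl(fd, BLKFLSBUF)
    finally:
        os.close(fd)
```

Machine A would have to call one of these after the migration and before
it reads from the image again.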

To add injury to insult, you need to be root to be able to issue that
ioctl (technically, you need CAP_SYS_ADMIN).

O_DIRECT fixes this problem altogether, because there is no buffering,
and if there are no buffers, they can't be invalid O:-)

Notice the "pray" part in the other solutions: we are basically trying to
do a "poor man's" DLM, and that is not trivial to do (although our
problem is not the general one, the principles are the same).

Later, Juan.
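
For reference, a read that goes through O_DIRECT looks roughly like the
sketch below. It is only an illustration in Python, assuming Linux and a
filesystem that supports O_DIRECT; read_block_direct() and the 4096-byte
block size are assumptions of this example, not anything taken from qemu.
O_DIRECT wants the buffer, offset and length aligned, which is why the
buffer comes from an anonymous mmap (page-aligned).

```python
import mmap
import os

BLOCK = 4096  # assumed alignment; real code should query the device


def read_block_direct(path, block_no=0, block_size=BLOCK):
    """Read one block with O_DIRECT so the host page cache is not used."""
    fd = os.open(path, os.O_RDONLY | os.O_DIRECT)
    try:
        # An anonymous mmap is page-aligned, which satisfies the buffer
        # alignment requirement of O_DIRECT on typical setups.
        buf = mmap.mmap(-1, block_size)
        try:
            n = os.preadv(fd, [buf], block_no * block_size)
            return buf[:n]
        finally:
            buf.close()
    finally:
        os.close(fd)
```

No special privileges are needed for this, only read access to the file
or device, which is the point of preferring it over the ioctl.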