From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([208.118.235.92]:36666) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1UDdwu-0003ph-Sd for qemu-devel@nongnu.org; Thu, 07 Mar 2013 11:45:50 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1UDdwn-0002iT-Ek for qemu-devel@nongnu.org; Thu, 07 Mar 2013 11:45:44 -0500 Received: from mail-qa0-f45.google.com ([209.85.216.45]:45924) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1UDdwn-0002iN-9X for qemu-devel@nongnu.org; Thu, 07 Mar 2013 11:45:37 -0500 Received: by mail-qa0-f45.google.com with SMTP id g10so3343741qah.18 for ; Thu, 07 Mar 2013 08:45:36 -0800 (PST) Sender: Paolo Bonzini Message-ID: <5138C42B.7060701@redhat.com> Date: Thu, 07 Mar 2013 17:45:31 +0100 From: Paolo Bonzini MIME-Version: 1.0 References: <51378200.5010705@dlhnet.de> <513785A0.1010001@redhat.com> <20130306181454.GB3743@localhost.localdomain> <51378B97.2000709@redhat.com> <20130306184800.GB22782@localhost.localdomain> <51379304.50307@dlhnet.de> <5137A964.5020803@redhat.com> <20130307085038.GB2536@dhcp-200-207.str.redhat.com> In-Reply-To: <20130307085038.GB2536@dhcp-200-207.str.redhat.com> Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Subject: Re: [Qemu-devel] [PATCH 1/7] block: only force IO completion in .bdrv_truncate if we are shrinking List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Kevin Wolf Cc: sw@weilnetz.de, Jeff Cody , Peter Lieven , qemu-devel@nongnu.org, stefanha@redhat.com Il 07/03/2013 09:50, Kevin Wolf ha scritto: > Am 06.03.2013 um 21:39 hat Paolo Bonzini geschrieben: >> Il 06/03/2013 20:03, Peter Lieven ha scritto: >>> Am 06.03.2013 19:48, schrieb Jeff Cody: >>>> On Wed, Mar 06, 2013 at 07:31:51PM +0100, Paolo Bonzini wrote: >>>>> Il 06/03/2013 19:14, Jeff Cody ha scritto: >>>>>> QCOW breaks with it using a normal raw posix file as a device. As a >>>>>> test: qemu-img create -f qcow test.qcow 5G. Now run qemu with that >>>>>> drive mounted, and try to partition and format it. QEMU now asserts. >>>>>> >>>>>> The nicety of being able to using truncate during a write call, >>>>>> especially for VHDX (which can have relatively large block/cluster >>>>>> sizes), so to grow the file sparsely in a dynamically allocated file. >>>>> >>>>> Perhaps we need two APIs, "truncate" and "revalidate". >>>>> >>>>> Truncate should be a no-op if (!bs->growable). >>>>> >>>>> Revalidate could be called by the block_resize monitor command with no >>>>> size specified. >>>> >>>> I think that is a good solution. Is it better to have "truncate" and >>>> "revalidate", or "truncate" and "grow", with grow being a subset of >>>> truncate, with fewer restrictions? There may still be operations >>>> where it is OK to grow a file, but not OK to shrink it. > > What semantics would the both operations have? Is truncate the same as > it used to be? I don't really understand what "revalidate" would do, it > sounds like a read-only operation from its name? Revalidate would update the current size. Files fetch it on all calls to bdrv_getlength, but other backends cache it. It would visit the BDS ->file chain from the bottom (bs->file->file->file...) to the top implicitly, with no need for an explicit recursion in the callback (like we do for flush_to_os). Before starting the recursion, bdrv_revalidate does a bdrv_drain_all, so using the current iscsi_truncate for iscsi_revalidate would be fine. The block_resize command will always call revalidate before doing anything. Then block_resize will try to do a bdrv_truncate if the callback is there. Another change that could make sense, is to make bdrv_truncate succeed if there is no bdrv_truncate callback but bs is larger than the requested size. This would be a difference from today's if (!drv->bdrv_truncate) return -ENOTSUP; And it could even do the same if the callback is there, but returns ENOTSUP. It would simplify some code, like this in raw-posix.c's raw_truncate: } else if (S_ISCHR(st.st_mode) || S_ISBLK(st.st_mode)) { if (offset > raw_getlength(bs)) { return -EINVAL; } } else { return -ENOTSUP; } The "else if" branch can just go away. The remaining question is about usage of bdrv_truncate in the formats. One important aspect is that both QCOW and VHDX, in addition to using bdrv_truncate to extend the file, rely on bdrv_getlength to fetch the last _used_ byte of the file, not the available space. I think we can say that bdrv_getlength behaves like that iff bs->growable. If so, QCOW and VHDX should fail to open for write if the underlying file has !bs->growable. Which will never fail right now, but it will as soon as we implement this: >> Let's start from (c). bdrv_file_open sets bs->growable = 1. I think it >> should be removed and only the file protocol should set it. > > This is probably right. Good. More precisely, raw_open should only set it if the file is a regular file. Paolo