From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:50102)
	by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1Uzm0t-0003Lh-U2
	for qemu-devel@nongnu.org; Thu, 18 Jul 2013 07:04:50 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from ) id 1Uzm0l-0005nM-Vp
	for qemu-devel@nongnu.org; Thu, 18 Jul 2013 07:04:47 -0400
Received: from mx.ipv6.kamp.de ([2a02:248:0:51::16]:46919 helo=mx01.kamp.de)
	by eggs.gnu.org with smtp (Exim 4.71) (envelope-from ) id 1Uzm0l-0005nF-Ki
	for qemu-devel@nongnu.org; Thu, 18 Jul 2013 07:04:39 -0400
Message-ID: <51E7CBC8.1010804@kamp.de>
Date: Thu, 18 Jul 2013 13:04:40 +0200
From: Peter Lieven
MIME-Version: 1.0
References: <1373885375-13601-5-git-send-email-pl@kamp.de>
	<20130717084648.GD2458@dhcp-200-207.str.redhat.com>
	<51E66ACD.70706@redhat.com>
	<20130717102551.GF2458@dhcp-200-207.str.redhat.com>
	<51E6C5FC.1030304@redhat.com>
	<7C1EEB41-E2B3-4186-9188-379F02E76FF9@kamp.de>
	<51E6CE81.6000400@redhat.com>
	<36C25446-54C7-4D1F-9D8D-E8A3991489BD@kamp.de>
	<20130718092316.GG3582@dhcp-200-207.str.redhat.com>
	<51E7C260.50404@redhat.com>
	<51E7C707.7010101@kamp.de>
	<51E7C9C4.5010202@redhat.com>
In-Reply-To: <51E7C9C4.5010202@redhat.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] [PATCH 4/4] qemu-img: conditionally discard target on convert
To: Paolo Bonzini
Cc: Kevin Wolf, Stefan Hajnoczi, qemu-devel, ronnie sahlberg

On 18.07.2013 12:56, Paolo Bonzini wrote:
> On 18/07/2013 12:44, Peter Lieven wrote:
>> On 18.07.2013 12:24, Paolo Bonzini wrote:
>>> On 18/07/2013 11:23, Kevin Wolf wrote:
>>>> On 17.07.2013 at 19:48, Peter Lieven wrote:
>>>>> On 17.07.2013 at 19:04, Paolo Bonzini wrote:
>>>>>
>>>>>> On 17/07/2013 19:02, Peter Lieven wrote:
>>>>>>> For Disks we always use read/write16 so i think we Should also use
>>>>>>> writesame16. Or not?
>>>>>> Yes.
>>>>>>
>>>>>> Remember you can still use UNMAP if LBPRZ=0.
>>>>> I can always use it if writesame is not available, but in this case
>>>>> bdi->discard_zeroes must be 0.
>>>>>
>>>>> Maybe we should call it discard_writes_zeroes or similar.
>>>>>
>>>>> Discard_zeroes is sth that should only indicate if lbprz == 1. At
>>>>> least if we refer to the Linux ioctl. We could include both in BDI.
>>>> Maybe what we really should do is to define different operations (with
>>>> an exact behaviour) instead of having one bdrv_discard() and then adding
>>>> flags everywhere to tell what the operation is doing exactly.
>>> A BDRV_MAY_UNMAP flag for bdrv_write_zeroes?
>> I thought that we wanted to add a paramter to the BDI (call it
>> write_zeroes_w_discard).
> I also thought so, but I like Kevin's idea of not shoehorning it in
> bdrv_discard(). Extending bdrv_write_zeroes is a better API.
>
> So far we've avoided "discard zeroes" semantics in QEMU (device models
> are also careful not to expose that). Since the only sane way to
> implement what you want is to use the SCSI WRITE SAME command, adding
> flags to bdrv_write_zeroes will even be easier because the mapping with
> SCSI is natural: discard = UNMAP, write_zeroes = WRITE SAME (either
> without or with the UNMAP bit).
>
>> If this is set the bdrv MUST accept a flag to bdrv_discard() lets call
>> it BDRV_DISCARD_WRITE_ZEROES
>> and he has to ensure that all sectors specified in bdrv_discard() read
>> as zero after the operation.
>>
>> If this flag is not set (e.g. when the OS issues a normal discard) the
>> operation may still silently fail with
>> undefined provisioning state and content of the specified sectors.
> But if you set BDRV_DISCARD_WRITE_ZEROES, then you always need a
> fallback to bdrv_write_zeroes. Why not just call bdrv_write_zeroes to
> begin with? That's why extending bdrv_write_zeroes is preferable.
In this case we do not need a flag to the function at all. If the
driver sets bdi->write_zeroes_w_discard = 1, then bdrv_write_zeroes can
use bdrv_discard to write zeroes, and the driver has to ensure that all
affected sectors read as zero afterwards. If the driver has a better
method of writing zeroes than discard, it simply should not set
bdi->write_zeroes_w_discard = 1.

Peter
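
To make the proposal concrete, here is a rough, self-contained sketch of the dispatch I have in mind. It is not QEMU code: ToyBDI, ToyBDS, toy_init, toy_discard and toy_write_zeroes are hypothetical stand-ins for BlockDriverInfo, BlockDriverState, bdrv_discard and bdrv_write_zeroes, and an in-memory buffer plays the role of the target. The only point is that the write-zeroes path consults the proposed write_zeroes_w_discard field and needs no extra flag argument.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define SECTOR_SIZE 512
#define NUM_SECTORS 8

/* Hypothetical stand-in for BlockDriverInfo; only the proposed field. */
typedef struct {
    bool write_zeroes_w_discard; /* driver promises: discarded sectors read as zero */
} ToyBDI;

/* Hypothetical stand-in for a block device backed by memory. */
typedef struct {
    ToyBDI bdi;
    uint8_t data[NUM_SECTORS * SECTOR_SIZE];
    int discards;        /* discard requests issued to the "driver" */
    int explicit_writes; /* explicit zero writes issued to the "driver" */
} ToyBDS;

/* Fill the device with a non-zero pattern and set the driver capability. */
void toy_init(ToyBDS *bs, bool w_discard)
{
    memset(bs->data, 0xAA, sizeof(bs->data));
    bs->bdi.write_zeroes_w_discard = w_discard;
    bs->discards = 0;
    bs->explicit_writes = 0;
}

/* Plain discard: content is only defined (zero) if the driver promised so. */
int toy_discard(ToyBDS *bs, int64_t sector, int nb_sectors)
{
    assert(sector >= 0 && nb_sectors >= 0 && sector + nb_sectors <= NUM_SECTORS);
    bs->discards++;
    if (bs->bdi.write_zeroes_w_discard) {
        memset(bs->data + sector * SECTOR_SIZE, 0,
               (size_t)nb_sectors * SECTOR_SIZE);
    }
    /* Otherwise the sectors keep undefined (here: old) content. */
    return 0;
}

/* Write zeroes: use discard when the driver guarantees zeroes, else write. */
int toy_write_zeroes(ToyBDS *bs, int64_t sector, int nb_sectors)
{
    assert(sector >= 0 && nb_sectors >= 0 && sector + nb_sectors <= NUM_SECTORS);
    if (bs->bdi.write_zeroes_w_discard) {
        return toy_discard(bs, sector, nb_sectors);
    }
    bs->explicit_writes++;
    memset(bs->data + sector * SECTOR_SIZE, 0,
           (size_t)nb_sectors * SECTOR_SIZE);
    return 0;
}
```

With the field set, toy_write_zeroes(bs, 2, 3) ends up as a single discard; with it cleared, the same call falls back to an explicit zero write. In both cases the caller sees zeroes afterwards, which is the guarantee the driver would have to give.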