From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:59337)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <wangww.fnst@cn.fujitsu.com>) id 1bQ34S-0007bM-Rh
	for qemu-devel@nongnu.org; Wed, 20 Jul 2016 21:46:41 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <wangww.fnst@cn.fujitsu.com>) id 1bQ34M-0003sP-S6
	for qemu-devel@nongnu.org; Wed, 20 Jul 2016 21:46:39 -0400
Received: from [59.151.112.132] (port=42428 helo=heian.cn.fujitsu.com)
	by eggs.gnu.org with esmtp (Exim 4.71)
	(envelope-from <wangww.fnst@cn.fujitsu.com>) id 1bQ34M-0003pr-2l
	for qemu-devel@nongnu.org; Wed, 20 Jul 2016 21:46:34 -0400
Message-ID: <5790D211.4060108@cn.fujitsu.com>
Date: Thu, 21 Jul 2016 09:45:53 -0400
From: wangweiwei <wangww.fnst@cn.fujitsu.com>
MIME-Version: 1.0
References: <577A6955.6020603@kamp.de> <57900AB3.3040705@redhat.com>
	<5790D064.4050805@cn.fujitsu.com>
In-Reply-To: <5790D064.4050805@cn.fujitsu.com>
Content-Type: text/plain; charset="utf-8"; format=flowed
Content-Transfer-Encoding: 8bit
Subject: Re: [Qemu-devel] Regression: block: Add .bdrv_co_pwrite_zeroes()
List-Id: <qemu-devel.nongnu.org>
List-Unsubscribe: <https://lists.nongnu.org/mailman/options/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <http://lists.nongnu.org/archive/html/qemu-devel/>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <https://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=subscribe>
To: Eric Blake <eblake@redhat.com>, Peter Lieven <pl@kamp.de>, "qemu-devel@nongnu.org" <qemu-devel@nongnu.org>
Cc: Kevin Wolf <kwolf@redhat.com>

Sorry, the previous reply was wrong.
On 2016-07-21 09:38, wangweiwei wrote:
> On 2016-07-20 19:35, Eric Blake wrote:
>> On 07/04/2016 07:49 AM, Peter Lieven wrote:
>>> Hi,
>>>
>>> the above commit:
>>>
>>> commit d05aa8bb4a8b6aa9a915ec5074fb12ae632d2323
>>> Author: Eric Blake <eblake@redhat.com>
>>> Date:   Wed Jun 1 15:10:03 2016 -0600
>>>
>>>      block: Add .bdrv_co_pwrite_zeroes()
>>>
>>> introduces a regression (at least for me).
>>>
>>> The Limits from the iSCSI Block Limits VPD have no requirement of being
>>> a power of two.
>>> We use Dell Equallogic iSCSI SANs for instance. They have an internal
>>> page size of 15MB. And
>>> they advertise this page size as max_ws_len, opt_transfer_len and
>>> opt_discard_alignment.
>>
>> Since I don't have access to this device, let me double check: if you
>> put a breakpoint in iscsi.c:iscsi_refresh_limits(), can you dump the
>> contents of the struct iscsilun->bl?  What is the block size of this
>> device (512, 4096, something else)?
>>
>> Also, while the device is advertising that the optimal discard alignment
>> is 15M, that does not tell me the minimum granularity that it can
>> actually discard.  Can you determine that value?  That is, if I try to
>> discard only 1M, does that actually result in a 1M allocation hole, or
>> is it ignored?  It sounds like qemu should be tracking 2 separate
>> values: the minimum discard granularity (I suspect this number is a
>> power of 2, at least the block size, and perhaps precisely equal to the
>> block size), and the maximum discard granularity that results in the
>> fewest/fastest discard of the entire device (not necessarily a power of
>> 2).  Or, maybe that merely means that qemu's pdiscard_alignment should
>> be the MINIMUM granularity, and NOT the non-power-of-2
>> iscsilun->bl.opt_unmap_gran.
>>
>> Or put another way, I get that I can't discard more than 15M at a time.
>>   But I highly suspect that I do not have to align my discard requests to
>> 15M boundaries.  That is, if the discard granularity is 1M, then in
>> qemu-io, 'discard 1M 15M' should result in a 15M hole, and should be no
>> different from the result of 'discard 1M 14M; discard 15M 1M'.  But if
>> qemu sticks to pdiscard_alignment == iscsilun->bl.opt_unmap_gran of 15M,
>> then both operations mistakenly discard nothing (because it is not
>> aligned to a 15M boundary).
>>
>>>
>>> I think we cannot assert that these alignments are a power of 2.
>>
>> Optimal size not being a power of 2 is not a problem, but I still
>> suspect MINIMUM alignment is a power of 2, and I need to know how much
>> head and tail to discard in the new byte-based discard routines in order
>> to align requests up to the minimal discard alignment boundaries.
>>
>>
>
>