From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:60167)
	by lists.gnu.org with esmtp (Exim 4.71) (envelope-from <pl@kamp.de>)
	id 1YijR8-0007Ba-3W
	for qemu-devel@nongnu.org; Thu, 16 Apr 2015 09:02:31 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <pl@kamp.de>) id 1YijR2-0004r9-RV
	for qemu-devel@nongnu.org; Thu, 16 Apr 2015 09:02:30 -0400
Received: from mx-v6.kamp.de ([2a02:248:0:51::16]:36875 helo=mx01.kamp.de)
	by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from <pl@kamp.de>)
	id 1YijR2-0004qs-G2
	for qemu-devel@nongnu.org; Thu, 16 Apr 2015 09:02:24 -0400
Message-ID: <552FB2DD.9010902@kamp.de>
Date: Thu, 16 Apr 2015 15:02:21 +0200
From: Peter Lieven <pl@kamp.de>
MIME-Version: 1.0
References: <1429186730-3866-1-git-send-email-pl@kamp.de>
	<1429186730-3866-6-git-send-email-pl@kamp.de>
	<552FAE53.7050102@redhat.com>
In-Reply-To: <552FAE53.7050102@redhat.com>
Content-Type: text/plain; charset=windows-1252; format=flowed
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] [PATCH for-2.4 05/10] block/iscsi: optimize
 WRITE10/16 if cache.writeback is not set
List-Id: <qemu-devel.nongnu.org>
List-Unsubscribe: <https://lists.nongnu.org/mailman/options/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <http://lists.nongnu.org/archive/html/qemu-devel>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <https://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=subscribe>
To: Paolo Bonzini <pbonzini@redhat.com>, qemu-devel@nongnu.org
Cc: ronniesahlberg@gmail.com, qemu-block@nongnu.org

Am 16.04.2015 um 14:42 schrieb Paolo Bonzini:
>
> On 16/04/2015 14:18, Peter Lieven wrote:
>> SCSI allowes to tell the target to not return from a write command
>> if the date is not written to the disk. Use this so called FUA
>> bit if it is supported to optimize WRITE commands if writeback is
>> not allowed.
>>
>> In this case qemu always issues a WRITE followed by a FLUSH. This
>> is 2 round trip times. If we set the FUA bit we can ignore the
>> following FLUSH.
>>
>> Signed-off-by: Peter Lieven <pl@kamp.de>
>> ---
>>   block/iscsi.c | 15 +++++++++++++--
>>   1 file changed, 13 insertions(+), 2 deletions(-)
>>
>> diff --git a/block/iscsi.c b/block/iscsi.c
>> index 237faa1..7fb04d7 100644
>> --- a/block/iscsi.c
>> +++ b/block/iscsi.c
>> @@ -68,6 +68,7 @@ typedef struct IscsiLun {
>>       bool lbprz;
>>       bool dpofua;
>>       bool has_write_same;
>> +    bool force_next_flush;
>>   } IscsiLun;
>>   
>>   typedef struct IscsiTask {
>> @@ -370,6 +371,7 @@ static int coroutine_fn iscsi_co_writev(BlockDriverState *bs,
>>       struct IscsiTask iTask;
>>       uint64_t lba;
>>       uint32_t num_sectors;
>> +    int fua = iscsilun->dpofua && !bs->enable_write_cache;
>>   
>>       if (!is_request_lun_aligned(sector_num, nb_sectors, iscsilun)) {
>>           return -EINVAL;
>> @@ -388,12 +390,12 @@ retry:
>>       if (iscsilun->use_16_for_rw) {
>>           iTask.task = iscsi_write16_task(iscsilun->iscsi, iscsilun->lun, lba,
>>                                           NULL, num_sectors * iscsilun->block_size,
>> -                                        iscsilun->block_size, 0, 0, 0, 0, 0,
>> +                                        iscsilun->block_size, 0, 0, fua, 0, 0,
>>                                           iscsi_co_generic_cb, &iTask);
>>       } else {
>>           iTask.task = iscsi_write10_task(iscsilun->iscsi, iscsilun->lun, lba,
>>                                           NULL, num_sectors * iscsilun->block_size,
>> -                                        iscsilun->block_size, 0, 0, 0, 0, 0,
>> +                                        iscsilun->block_size, 0, 0, fua, 0, 0,
>>                                           iscsi_co_generic_cb, &iTask);
>>       }
>>       if (iTask.task == NULL) {
>> @@ -621,6 +623,11 @@ static int coroutine_fn iscsi_co_flush(BlockDriverState *bs)
>>           return 0;
>>       }
>>   
>> +    if (iscsilun->dpofua && !bs->enable_write_cache &&
>> +        !iscsilun->force_next_flush) {
>> +        return 0;
>> +    }
>> +
>>       iscsi_co_init_iscsitask(iscsilun, &iTask);
>>   
>>   retry:
>> @@ -648,6 +655,8 @@ retry:
>>           return -EIO;
>>       }
>>   
>> +    iscsilun->force_next_flush = false;
> You still need a flush if you do WRITE SAME, WRITE+FUA, WRITE+FUA.
> Also, since bs->enable_write_cache can be toggled arbitrarily, I would
> prefer to set force_next_flush on all non-FUA writes, including those
> done with bs->enable_write_cache.

Good idea. So we can avoid flushes if there was never a write before.

>
>>       return 0;
>>   }
>>   
>> @@ -969,6 +978,8 @@ retry:
>>           iscsi_allocationmap_set(iscsilun, sector_num, nb_sectors);
>>       }
>>   
>> +    iscsilun->force_next_flush = true;
> Also, I think it is iscsi_co_generic_cb that should set
> force_next_flush, so that it is only set on failure.  Not really for the
> optimization value, but because it's clearer.

I don't get what you mean with it should only "set on failure".
My approach would be to add a flag to the iTask that the command
requires to set force_flush after successful completion. Is it that
what you mean?

>
> I think that if you do these changes, iscsi_co_flush can just check "if
> (!iscsilun->force_next_flush)".
>
> But still---nice approach. :)

Thanks :-)

Peter