From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:35356) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1VqZlq-0005I9-82 for qemu-devel@nongnu.org; Tue, 10 Dec 2013 21:43:39 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1VqZlg-00016G-VK for qemu-devel@nongnu.org; Tue, 10 Dec 2013 21:43:30 -0500 Received: from e23smtp03.au.ibm.com ([202.81.31.145]:47233) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1VqZlg-00015v-6g for qemu-devel@nongnu.org; Tue, 10 Dec 2013 21:43:20 -0500 Received: from /spool/local by e23smtp03.au.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Wed, 11 Dec 2013 12:43:14 +1000 Received: from d23relay04.au.ibm.com (d23relay04.au.ibm.com [9.190.234.120]) by d23dlp02.au.ibm.com (Postfix) with ESMTP id 0E2AA2BB0055 for ; Wed, 11 Dec 2013 13:43:11 +1100 (EST) Received: from d23av02.au.ibm.com (d23av02.au.ibm.com [9.190.235.138]) by d23relay04.au.ibm.com (8.13.8/8.13.8/NCO v10.0) with ESMTP id rBB2Owvo65994988 for ; Wed, 11 Dec 2013 13:24:59 +1100 Received: from d23av02.au.ibm.com (localhost [127.0.0.1]) by d23av02.au.ibm.com (8.14.4/8.14.4/NCO v10.0 AVout) with ESMTP id rBB2h9aI000718 for ; Wed, 11 Dec 2013 13:43:09 +1100 Message-ID: <52A7D13E.3050801@linux.vnet.ibm.com> Date: Wed, 11 Dec 2013 10:43:10 +0800 From: Wenchao Xia MIME-Version: 1.0 References: <1386350580-5666-1-git-send-email-kwolf@redhat.com> <1386350580-5666-4-git-send-email-kwolf@redhat.com> <52A687F8.8000205@linux.vnet.ibm.com> <20131210094214.GB3656@dhcp-200-207.str.redhat.com> In-Reply-To: <20131210094214.GB3656@dhcp-200-207.str.redhat.com> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit Subject: Re: [Qemu-devel] [RFC PATCH 03/19] block: Don't use guest sector size for qemu_blockalign() List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Kevin Wolf Cc: pbonzini@redhat.com, qemu-devel@nongnu.org, stefanha@redhat.com, armbru@redhat.com 于 2013/12/10 17:42, Kevin Wolf 写道: > Am 10.12.2013 um 04:18 hat Wenchao Xia geschrieben: >> 于 2013/12/7 1:22, Kevin Wolf 写道: >>> bs->buffer_alignment is set by the device emulation and contains the >>> logical block size of the guest device. This isn't something that the >>> block layer should know, and even less something to use for determining >>> the right alignment of buffers to be used for the host. >>> >>> The new function bdrv_opt_mem_align() allows for hooks in a BlockDriver >>> so that it can tell the qemu block layer the optimal alignment to be >>> used so that no bounce buffer must be used in the driver. >>> >>> This patch may change the buffer alignment from 4k to 512 for all >>> callers that used qemu_blockalign() with the top-level image format >>> BlockDriverState. The value was never propagated to other levels in the >>> tree, so in particular raw-posix never required anything else than 512. >>> >>> While on disks with 4k sectors direct I/O requires a 4k alignment, >>> memory may still be okay when aligned to 512 byte boundaries. This is >>> what must have happened in practice, because otherwise this would >>> already have failed earlier. Therefore I don't expect regressions even >>> with this intermediate state. Later, raw-posix can implement the hook >>> and expose a different memory alignment requirement. >>> >>> Signed-off-by: Kevin Wolf >>> --- >>> block.c | 32 +++++++++++++++++++++++++++++--- >>> include/block/block.h | 1 + >>> include/block/block_int.h | 4 ++++ >>> 3 files changed, 34 insertions(+), 3 deletions(-) >>> >>> diff --git a/block.c b/block.c >>> index 613201b..669793b 100644 >>> --- a/block.c >>> +++ b/block.c >>> @@ -213,6 +213,31 @@ static void bdrv_io_limits_intercept(BlockDriverState *bs, >>> qemu_co_queue_next(&bs->throttled_reqs[is_write]); >>> } >>> >>> +size_t bdrv_opt_mem_align(BlockDriverState *bs) >>> +{ >>> + size_t alignment; >>> + >>> + if (!bs || !bs->drv) { >>> + /* 4k should be on the safe side */ >>> + return 4096; >>> + } >>> + >>> + if (bs->drv->bdrv_opt_mem_align) { >>> + return bs->drv->bdrv_opt_mem_align(bs); >>> + } >>> + >>> + if (bs->file) { >>> + alignment = bdrv_opt_mem_align(bs->file); >>> + } else { >>> + alignment = 512; >>> + } >>> + >>> + if (bs->backing_hd) { >>> + alignment = MAX(alignment, bdrv_opt_mem_align(bs->backing_hd)); >>> + } >> >> Maybe I didn't understand the commit message correctly, does this code >> intend to get MAX alignment value in a chain? For example: >> base(4096)->mid(512)->top(1024) results: 4096 >> The condition to traver the backing files seems complex. > > Yes, you want to align any newly allocated buffer such that no matter > what direction it takes on its way through the block layer, it will > never be misaligned. This means that you need to use the highest > alignment restriction. > > Kevin > The condition to travel and compare, seems complex, since when "return" is executed, then travers would stop. for example: base(4096)->mid(512)->top(1024), when top's bs->drv->bdrv_opt_mem_align != NULL, the result is 1024. So I wonder whether this is on purpose, or the code should be: size_t bdrv_opt_mem_align(BlockDriverState *bs) { size_t alignment; if (!bs || !bs->drv) { /* 4k should be on the safe side */ alignment = 4096; goto compare; } if (bs->drv->bdrv_opt_mem_align) { alignment = bs->drv->bdrv_opt_mem_align(bs); goto compare; } if (bs->file) { alignment = bdrv_opt_mem_align(bs->file); } else { alignment = 512; } compare: if (bs->backing_hd) { alignment = MAX(alignment, bdrv_opt_mem_align(bs->backing_hd)); } return alignment; }