From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from mailman by lists.gnu.org with tmda-scanned (Exim 4.43) id 1JtfJi-0004Xy-UR for qemu-devel@nongnu.org; Wed, 07 May 2008 04:48:03 -0400 Received: from exim by lists.gnu.org with spam-scanned (Exim 4.43) id 1JtfJg-0004WL-SV for qemu-devel@nongnu.org; Wed, 07 May 2008 04:48:01 -0400 Received: from [199.232.76.173] (port=58835 helo=monty-python.gnu.org) by lists.gnu.org with esmtp (Exim 4.43) id 1JtfJg-0004WE-Eb for qemu-devel@nongnu.org; Wed, 07 May 2008 04:48:00 -0400 Received: from ns.suse.de ([195.135.220.2] helo=mx1.suse.de) by monty-python.gnu.org with esmtps (TLS-1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.60) (envelope-from ) id 1JteuF-0001qb-BM for qemu-devel@nongnu.org; Wed, 07 May 2008 04:21:43 -0400 Received: from Relay2.suse.de (relay-ext.suse.de [195.135.221.8]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.suse.de (Postfix) with ESMTP id 5873E4168D for ; Wed, 7 May 2008 10:21:37 +0200 (CEST) Message-ID: <48216579.3060204@suse.de> Date: Wed, 07 May 2008 10:16:57 +0200 From: Kevin Wolf MIME-Version: 1.0 Subject: Re: [Qemu-devel] [4367] Align file accesses with cache=off (Kevin Wolf, Laurent Vivier) References: <4820D905.4020407@bellard.org> In-Reply-To: <4820D905.4020407@bellard.org> Content-Type: multipart/mixed; boundary="------------000900000003030903070903" Reply-To: qemu-devel@nongnu.org List-Id: qemu-devel.nongnu.org List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: qemu-devel@nongnu.org This is a multi-part message in MIME format. --------------000900000003030903070903 Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit Fabrice Bellard schrieb: > A note: in order to avoid uncontrolled recursions, it is better to call > the read/write AIO callback outside the aio_read/write (see > bdrv_aio_read_em). Something along the lines of the attached patch? > Personally I would not trust the OS to correctly handle the mix of > O_DIRECT and buffered operations, especially if the corresponding file > regions intersect ! We might have to go back to the pwrite implementation of my first patch then which emulates the accesses by using a temporary aligned buffer. Btw, it is quite interesting to see that a serious discussion of a patch happens only if it is already committed. This could have been discussed a week ago when we agreed to go in the apparently wrong direction. And the patch has been on the list much longer than this one week. Kevin --------------000900000003030903070903 Content-Type: text/x-patch; name="avoid-recursions.patch" Content-Transfer-Encoding: 7bit Content-Disposition: inline; filename="avoid-recursions.patch" Index: qemu-svn/block-raw-posix.c =================================================================== --- qemu-svn.orig/block-raw-posix.c +++ qemu-svn/block-raw-posix.c @@ -313,6 +313,7 @@ typedef struct RawAIOCB { BlockDriverAIOCB common; struct aiocb aiocb; struct RawAIOCB *next; + int ret; } RawAIOCB; static int aio_sig_num = SIGUSR2; @@ -473,26 +474,37 @@ static RawAIOCB *raw_aio_setup(BlockDriv return acb; } +#ifndef QEMU_IMG +static void raw_aio_em_cb(void* opaque) +{ + RawAIOCB *acb = opaque; + acb->common.cb(acb->common.opaque, acb->ret); + qemu_aio_release(acb); +} +#endif + static BlockDriverAIOCB *raw_aio_read(BlockDriverState *bs, int64_t sector_num, uint8_t *buf, int nb_sectors, BlockDriverCompletionFunc *cb, void *opaque) { RawAIOCB *acb; - BDRVRawState *s = bs->opaque; - /* + /* * If O_DIRECT is used and the buffer is not aligned fall back * to synchronous IO. */ - if (unlikely((s->open_flags & O_DIRECT) && ((uintptr_t) buf % 512))) { - int ret; +#ifndef QEMU_IMG + BDRVRawState *s = bs->opaque; + if (unlikely((s->open_flags & O_DIRECT) && ((uintptr_t) buf % 512))) { + QEMUBH *bh; acb = qemu_aio_get(bs, cb, opaque); - ret = raw_pread(bs, 512 * sector_num, buf, 512 * nb_sectors); - acb->common.cb(acb->common.opaque, ret); - qemu_aio_release(acb); + acb->ret = raw_pread(bs, 512 * sector_num, buf, 512 * nb_sectors); + bh = qemu_bh_new(raw_aio_em_cb, acb); + qemu_bh_schedule(bh); return &acb->common; } +#endif acb = raw_aio_setup(bs, sector_num, buf, nb_sectors, cb, opaque); if (!acb) @@ -510,21 +522,23 @@ static BlockDriverAIOCB *raw_aio_write(B BlockDriverCompletionFunc *cb, void *opaque) { RawAIOCB *acb; - BDRVRawState *s = bs->opaque; - /* + /* * If O_DIRECT is used and the buffer is not aligned fall back * to synchronous IO. */ - if (unlikely((s->open_flags & O_DIRECT) && ((uintptr_t) buf % 512))) { - int ret; +#ifndef QEMU_IMG + BDRVRawState *s = bs->opaque; + if (unlikely((s->open_flags & O_DIRECT) && ((uintptr_t) buf % 512))) { + QEMUBH *bh; acb = qemu_aio_get(bs, cb, opaque); - ret = raw_pwrite(bs, 512 * sector_num, buf, 512 * nb_sectors); - acb->common.cb(acb->common.opaque, ret); - qemu_aio_release(acb); + acb->ret = raw_pwrite(bs, 512 * sector_num, buf, 512 * nb_sectors); + bh = qemu_bh_new(raw_aio_em_cb, acb); + qemu_bh_schedule(bh); return &acb->common; } +#endif acb = raw_aio_setup(bs, sector_num, (uint8_t*)buf, nb_sectors, cb, opaque); if (!acb) --------------000900000003030903070903--