From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:40816) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1VClAd-0000Bz-In for qemu-devel@nongnu.org; Fri, 23 Aug 2013 02:48:40 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1VClAT-0007t0-Uy for qemu-devel@nongnu.org; Fri, 23 Aug 2013 02:48:31 -0400 Received: from e28smtp03.in.ibm.com ([122.248.162.3]:51017) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1VClAT-0007lV-5t for qemu-devel@nongnu.org; Fri, 23 Aug 2013 02:48:21 -0400 Received: from /spool/local by e28smtp03.in.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Fri, 23 Aug 2013 12:10:10 +0530 Received: from d28relay02.in.ibm.com (d28relay02.in.ibm.com [9.184.220.59]) by d28dlp02.in.ibm.com (Postfix) with ESMTP id 42717394004E for ; Fri, 23 Aug 2013 12:17:42 +0530 (IST) Received: from d28av03.in.ibm.com (d28av03.in.ibm.com [9.184.220.65]) by d28relay02.in.ibm.com (8.13.8/8.13.8/NCO v10.0) with ESMTP id r7N6nOsk28246116 for ; Fri, 23 Aug 2013 12:19:27 +0530 Received: from d28av03.in.ibm.com (localhost [127.0.0.1]) by d28av03.in.ibm.com (8.14.4/8.14.4/NCO v10.0 AVout) with ESMTP id r7N6llg8020388 for ; Fri, 23 Aug 2013 12:17:47 +0530 Date: Fri, 23 Aug 2013 12:18:04 +0530 From: Bharata B Rao Message-ID: <20130823064804.GG2755@in.ibm.com> References: <1377050567-19122-1-git-send-email-asias@redhat.com> <20130821152440.GB18303@stefanha-thinkpad.redhat.com> <5214DF5B.50203@redhat.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <5214DF5B.50203@redhat.com> Subject: Re: [Qemu-devel] [PATCH] block: Fix race in gluster_finish_aiocb Reply-To: bharata@linux.vnet.ibm.com List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Paolo Bonzini Cc: Kevin Wolf , Vijay Bellur , Stefan Hajnoczi , qemu-devel@nongnu.org, Stefan Hajnoczi , Asias He , MORITA Kazutaka On Wed, Aug 21, 2013 at 05:40:11PM +0200, Paolo Bonzini wrote: > > We could just use a bottom half, too. Add a bottom half to acb, > schedule it in gluster_finish_aiocb, delete it in the bottom half's own > callback. I tried this approach (the patch at the end of this mail), but see this occasional errors, doesn't happen always. Any clues on what am I missing here ? Program received signal SIGSEGV, Segmentation fault. [Switching to Thread 0x7fffcbfff700 (LWP 11301)] 0x000055555597156b in event_notifier_set (e=0xa4) at util/event_notifier-posix.c:97 97 ret = write(e->wfd, &value, sizeof(value)); (gdb) bt #0 0x000055555597156b in event_notifier_set (e=0xa4) at util/event_notifier-posix.c:97 #1 0x00005555555dd3cd in aio_notify (ctx=0x0) at async.c:233 #2 0x00005555555dd00f in qemu_bh_schedule (bh=0x7fff300008e0) at async.c:128 #3 0x00005555556002da in gluster_finish_aiocb (fd=0x5555562ae5d0, ret=4096, arg=0x7fffd00419c0) at block/gluster.c:409 #4 0x00007ffff5e70cdc in glfs_io_async_cbk (ret=4096, frame=0x0, data= 0x555556443ee0) at glfs-fops.c:567 #5 0x00007ffff5c2843e in synctask_wrap (old_task=0x555556365940) at syncop.c:133 #6 0x00007ffff3b03370 in ?? () from /lib64/libc.so.6 #7 0x0000000000000000 in ?? () (gdb) up #1 0x00005555555dd3cd in aio_notify (ctx=0x0) at async.c:233 233 event_notifier_set(&ctx->notifier); (gdb) up #2 0x00005555555dd00f in qemu_bh_schedule (bh=0x7fff300008e0) at async.c:128 128 aio_notify(bh->ctx); (gdb) p *bh $1 = {ctx = 0x0, cb = 0x5555555ffdcd , opaque = 0x7fffd00419c0, next = 0x555556345e70, scheduled = false, idle = false, deleted = true} gluster: Use BH mechanism for AIO completion Replace the existing pipe based AIO completion handing by BH based method. Signed-off-by: Bharata B Rao --- block/gluster.c | 69 ++++++------------------------------------------------- 1 file changed, 8 insertions(+), 61 deletions(-) diff --git a/block/gluster.c b/block/gluster.c index 46f36f8..598b335 100644 --- a/block/gluster.c +++ b/block/gluster.c @@ -30,7 +30,6 @@ typedef struct GlusterAIOCB { typedef struct BDRVGlusterState { struct glfs *glfs; - int fds[2]; struct glfs_fd *fd; int event_reader_pos; GlusterAIOCB *event_acb; @@ -231,12 +230,13 @@ out: return NULL; } -static void qemu_gluster_complete_aio(GlusterAIOCB *acb, BDRVGlusterState *s) +static void qemu_gluster_aio_bh(void *opaque) { int ret; + GlusterAIOCB *acb = opaque; bool *finished = acb->finished; BlockDriverCompletionFunc *cb = acb->common.cb; - void *opaque = acb->common.opaque; + void *cb_opaque = acb->common.opaque; if (!acb->ret || acb->ret == acb->size) { ret = 0; /* Success */ @@ -246,33 +246,15 @@ static void qemu_gluster_complete_aio(GlusterAIOCB *acb, BDRVGlusterState *s) ret = -EIO; /* Partial read/write - fail it */ } + qemu_bh_delete(acb->bh); qemu_aio_release(acb); - cb(opaque, ret); + + cb(cb_opaque, ret); if (finished) { *finished = true; } } -static void qemu_gluster_aio_event_reader(void *opaque) -{ - BDRVGlusterState *s = opaque; - ssize_t ret; - - do { - char *p = (char *)&s->event_acb; - - ret = read(s->fds[GLUSTER_FD_READ], p + s->event_reader_pos, - sizeof(s->event_acb) - s->event_reader_pos); - if (ret > 0) { - s->event_reader_pos += ret; - if (s->event_reader_pos == sizeof(s->event_acb)) { - s->event_reader_pos = 0; - qemu_gluster_complete_aio(s->event_acb, s); - } - } - } while (ret < 0 && errno == EINTR); -} - /* TODO Convert to fine grained options */ static QemuOptsList runtime_opts = { .name = "gluster", @@ -329,17 +311,7 @@ static int qemu_gluster_open(BlockDriverState *bs, QDict *options, s->fd = glfs_open(s->glfs, gconf->image, open_flags); if (!s->fd) { ret = -errno; - goto out; - } - - ret = qemu_pipe(s->fds); - if (ret < 0) { - ret = -errno; - goto out; } - fcntl(s->fds[GLUSTER_FD_READ], F_SETFL, O_NONBLOCK); - qemu_aio_set_fd_handler(s->fds[GLUSTER_FD_READ], - qemu_gluster_aio_event_reader, NULL, s); out: qemu_opts_del(opts); @@ -417,31 +389,10 @@ static const AIOCBInfo gluster_aiocb_info = { static void gluster_finish_aiocb(struct glfs_fd *fd, ssize_t ret, void *arg) { GlusterAIOCB *acb = (GlusterAIOCB *)arg; - BlockDriverState *bs = acb->common.bs; - BDRVGlusterState *s = bs->opaque; - int retval; acb->ret = ret; - retval = qemu_write_full(s->fds[GLUSTER_FD_WRITE], &acb, sizeof(acb)); - if (retval != sizeof(acb)) { - /* - * Gluster AIO callback thread failed to notify the waiting - * QEMU thread about IO completion. - * - * Complete this IO request and make the disk inaccessible for - * subsequent reads and writes. - */ - error_report("Gluster failed to notify QEMU about IO completion"); - - qemu_mutex_lock_iothread(); /* We are in gluster thread context */ - acb->common.cb(acb->common.opaque, -EIO); - qemu_aio_release(acb); - close(s->fds[GLUSTER_FD_READ]); - close(s->fds[GLUSTER_FD_WRITE]); - qemu_aio_set_fd_handler(s->fds[GLUSTER_FD_READ], NULL, NULL, NULL); - bs->drv = NULL; /* Make the disk inaccessible */ - qemu_mutex_unlock_iothread(); - } + acb->bh = qemu_bh_new(qemu_gluster_aio_bh, acb); + qemu_bh_schedule(acb->bh); } static BlockDriverAIOCB *qemu_gluster_aio_rw(BlockDriverState *bs, @@ -592,10 +543,6 @@ static void qemu_gluster_close(BlockDriverState *bs) { BDRVGlusterState *s = bs->opaque; - close(s->fds[GLUSTER_FD_READ]); - close(s->fds[GLUSTER_FD_WRITE]); - qemu_aio_set_fd_handler(s->fds[GLUSTER_FD_READ], NULL, NULL, NULL); - if (s->fd) { glfs_close(s->fd); s->fd = NULL;