From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([140.186.70.92]:38309) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1QWYCT-0005xE-J5 for qemu-devel@nongnu.org; Tue, 14 Jun 2011 14:18:54 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1QWYCQ-00068w-5q for qemu-devel@nongnu.org; Tue, 14 Jun 2011 14:18:53 -0400
Received: from mtagate3.uk.ibm.com ([194.196.100.163]:41636) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1QWYCP-00068F-Fb for qemu-devel@nongnu.org; Tue, 14 Jun 2011 14:18:49 -0400
Received: from d06nrmr1806.portsmouth.uk.ibm.com (d06nrmr1806.portsmouth.uk.ibm.com [9.149.39.193]) by mtagate3.uk.ibm.com (8.13.1/8.13.1) with ESMTP id p5EIIkVe013483 for ; Tue, 14 Jun 2011 18:18:46 GMT
Received: from d06av12.portsmouth.uk.ibm.com (d06av12.portsmouth.uk.ibm.com [9.149.37.247]) by d06nrmr1806.portsmouth.uk.ibm.com (8.13.8/8.13.8/NCO v10.0) with ESMTP id p5EIIjRG2396342 for ; Tue, 14 Jun 2011 19:18:46 +0100
Received: from d06av12.portsmouth.uk.ibm.com (loopback [127.0.0.1]) by d06av12.portsmouth.uk.ibm.com (8.14.4/8.13.1/NCO v10.0 AVout) with ESMTP id p5EIIjnq015185 for ; Tue, 14 Jun 2011 12:18:45 -0600
From: Stefan Hajnoczi
Date: Tue, 14 Jun 2011 19:18:25 +0100
Message-Id: <1308075511-4745-8-git-send-email-stefanha@linux.vnet.ibm.com>
In-Reply-To: <1308075511-4745-1-git-send-email-stefanha@linux.vnet.ibm.com>
References: <1308075511-4745-1-git-send-email-stefanha@linux.vnet.ibm.com>
Subject: [Qemu-devel] [PATCH 07/13] qed: avoid deadlock on emulated synchronous I/O
To: qemu-devel@nongnu.org
Cc: Kevin Wolf, Anthony Liguori, Stefan Hajnoczi, Adam Litke

The block layer emulates synchronous bdrv_read()/bdrv_write() for
drivers that only provide the asynchronous interfaces.  The emulation
issues an asynchronous request inside a new "async context" and waits
for that request to complete.
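
For illustration, the emulation described above can be modelled in a few
lines of C.  This is a toy sketch, not QEMU code: ToyRequest,
toy_submit(), toy_aio_wait(), toy_sync_read() and toy_demo() are
hypothetical names, and the I/O is pretended to finish on the first
wait.  It assumes that each request remembers the context it was issued
in and that the event loop only invokes completion functions for
requests of the current context.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: every name here is hypothetical, not QEMU's API. */
typedef void CompletionFunc(void *opaque, int ret);

typedef struct ToyRequest {
    int ctx_id;          /* async context the request was issued in */
    bool done;           /* the pretend I/O has finished */
    bool delivered;      /* completion function already invoked */
    int ret;             /* result passed to the completion function */
    CompletionFunc *cb;
    void *opaque;
} ToyRequest;

#define MAX_REQS 8
#define NOT_DONE 0x7fffffff

static ToyRequest reqs[MAX_REQS];
static size_t nreqs;
static int current_ctx;  /* 0 is the outermost context */

/* Queue a request tagged with the context it was issued in. */
static ToyRequest *toy_submit(CompletionFunc *cb, void *opaque)
{
    assert(nreqs < MAX_REQS);
    ToyRequest *r = &reqs[nreqs++];
    *r = (ToyRequest){ .ctx_id = current_ctx, .cb = cb, .opaque = opaque };
    return r;
}

/* One event loop pass: all I/O finishes, but only completion functions
 * belonging to the current context are invoked. */
static void toy_aio_wait(void)
{
    for (size_t i = 0; i < nreqs; i++) {
        ToyRequest *r = &reqs[i];
        r->done = true;
        if (!r->delivered && r->ctx_id == current_ctx) {
            r->delivered = true;
            r->cb(r->opaque, r->ret);
        }
    }
}

static void sync_cb(void *opaque, int ret)
{
    *(int *)opaque = ret;
}

/* Emulated synchronous read: push a context, issue the asynchronous
 * request, wait for it, pop the context. */
static int toy_sync_read(void)
{
    int result = NOT_DONE;
    current_ctx++;
    toy_submit(sync_cb, &result);
    while (result == NOT_DONE) {
        toy_aio_wait();
    }
    current_ctx--;
    return result;
}

static void count_cb(void *opaque, int ret)
{
    (void)ret;
    (*(int *)opaque)++;
}

/* Issue one request in context 0, then do an emulated synchronous read.
 * *during: outer completions seen while the nested context was active.
 * *after:  outer completions seen once context 0 runs again. */
int toy_demo(int *during, int *after)
{
    int outer = 0;
    nreqs = 0;
    current_ctx = 0;
    toy_submit(count_cb, &outer);
    int ret = toy_sync_read();
    *during = outer;
    toy_aio_wait();
    *after = outer;
    return ret;
}
```

In this model toy_demo(&during, &after) returns 0 with during == 0 and
after == 1: the request issued in context 0 finishes while the
synchronous read waits, but its completion function runs only after the
nested context has been popped.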

If currently outstanding requests complete during this time, their
completion functions are not invoked until the async context is popped
again.

This can lead to deadlock if an allocating write is being processed when
synchronous I/O emulation starts.  The emulated synchronous write will
be queued because an existing request is being processed.  But the
existing request cannot complete until the async context is popped.  The
result is that qemu_aio_wait() sits in a deadlock.

Address this problem in two ways:

1. Add an assertion so that we instantly know if this corner case is
   hit.  This saves us time by giving a clear failure indication.

2. Ignore the copy-on-read hint for emulated synchronous reads.  This
   allows us to do emulated synchronous reads without hitting the
   deadlock.

Signed-off-by: Stefan Hajnoczi
---
 block/qed.c |   12 +++++++++++-
 1 files changed, 11 insertions(+), 1 deletions(-)

diff --git a/block/qed.c b/block/qed.c
index 6ca57f2..ffdbc2d 100644
--- a/block/qed.c
+++ b/block/qed.c
@@ -1120,6 +1120,14 @@ static bool qed_start_allocating_write(QEDAIOCB *acb)
     }
     if (acb != QSIMPLEQ_FIRST(&s->allocating_write_reqs) ||
         s->allocating_write_reqs_plugged) {
+        /* Queuing an emulated synchronous write causes deadlock since
+         * currently outstanding requests are not in the current async context
+         * and their completion will never be invoked.  Once the block layer
+         * moves to truly asynchronous semantics this failure case will be
+         * eliminated.
+         */
+        assert(get_async_context_id() == 0);
+
         return false;
     }
     return true;
@@ -1246,7 +1254,9 @@ static void qed_aio_read_data(void *opaque, int ret,
     } else if (ret != QED_CLUSTER_FOUND) {
         BlockDriverCompletionFunc *cb = qed_aio_next_io;
 
-        if (bs->backing_hd && (acb->flags & QED_AIOCB_COPY_ON_READ)) {
+        /* See qed_start_allocating_write() for get_async_context_id() hack */
+        if (bs->backing_hd && (acb->flags & QED_AIOCB_COPY_ON_READ) &&
+            get_async_context_id() == 0) {
             if (!qed_start_allocating_write(acb)) {
                 qemu_iovec_reset(&acb->cur_qiov);
                 return; /* wait for current allocating write to complete */
-- 
1.7.5.3