Date: Thu, 3 Sep 2015 17:59:28 +0100
From: Stefan Hajnoczi
Subject: Re: [Qemu-devel] [PATCH 2/2] ide/atapi: partially avoid deadlock if the storage backend is dead
To: Peter Lieven
Cc: kwolf@redhat.com, pbonzini@redhat.com, jsnow@redhat.com, qemu-devel@nongnu.org, qemu-block@nongnu.org
Message-ID: <20150903165928.GF18405@stefanha-thinkpad.redhat.com>
In-Reply-To: <1440058448-27847-3-git-send-email-pl@kamp.de>
References: <1440058448-27847-1-git-send-email-pl@kamp.de> <1440058448-27847-3-git-send-email-pl@kamp.de>

On Thu, Aug 20, 2015 at 10:14:08AM +0200, Peter Lieven wrote:
> the blk_drain_all() that is executed if the guest issues a DMA cancel
> leads to a stuck main loop if the storage backend (e.g. an NFS share)
> is unresponsive.
>
> This scenario is a common case for CDROM images mounted from an
> NFS share. In this case a broken NFS server can take down the
> whole VM even if the mounted CDROM is not used and was just not
> unmounted after usage.
>
> This approach avoids the blk_drain_all() for read-only media, cancels
> the AIO locally and makes the callback a NOP in case the original
> request completes later when the NFS share is responsive again.
>
> Signed-off-by: Peter Lieven
> ---
>  hw/ide/pci.c | 32 ++++++++++++++++++--------------
>  1 file changed, 18 insertions(+), 14 deletions(-)
>
> diff --git a/hw/ide/pci.c b/hw/ide/pci.c
> index d31ff88..a8b4175 100644
> --- a/hw/ide/pci.c
> +++ b/hw/ide/pci.c
> @@ -240,21 +240,25 @@ void bmdma_cmd_writeb(BMDMAState *bm, uint32_t val)
>      /* Ignore writes to SSBM if it keeps the old value */
>      if ((val & BM_CMD_START) != (bm->cmd & BM_CMD_START)) {
>          if (!(val & BM_CMD_START)) {
> -            /*
> -             * We can't cancel Scatter Gather DMA in the middle of the
> -             * operation or a partial (not full) DMA transfer would reach
> -             * the storage so we wait for completion instead (we beahve
> -             * like if the DMA was completed by the time the guest trying
> -             * to cancel dma with bmdma_cmd_writeb with BM_CMD_START not
> -             * set).
> -             *
> -             * In the future we'll be able to safely cancel the I/O if the
> -             * whole DMA operation will be submitted to disk with a single
> -             * aio operation with preadv/pwritev.
> -             */
>              if (bm->bus->dma->aiocb) {
> -                blk_drain_all();
> -                assert(bm->bus->dma->aiocb == NULL);
> +                if (!bdrv_is_read_only(bm->bus->dma->aiocb->bs)) {
> +                    /* We can't cancel Scatter Gather DMA in the middle of the
> +                     * operation or a partial (not full) DMA transfer would
> +                     * reach the storage so we wait for completion instead
> +                     * (we beahve like if the DMA was completed by the time the
> +                     * guest trying to cancel dma with bmdma_cmd_writeb with
> +                     * BM_CMD_START not set). */
> +                    blk_drain_all();
> +                    assert(bm->bus->dma->aiocb == NULL);
> +                } else {
> +                    /* On a read-only device (e.g. CDROM) we can't cause incon-
> +                     * sistencies and thus cancel the AIOCB locally and avoid
> +                     * to be called back later if the original request is
> +                     * completed. */
> +                    BlockAIOCB *aiocb = bm->bus->dma->aiocb;
> +                    aiocb->cb(aiocb->opaque, -ECANCELED);
> +                    aiocb->cb = NULL;
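
To spell out what the new read-only branch does (my annotated paraphrase
of the hunk above, not the exact code): the in-flight request is not
cancelled in the block layer at all, only its completion notification is
suppressed.

    /* Paraphrase of the read-only branch above, for discussion only. */
    BlockAIOCB *aiocb = bm->bus->dma->aiocb;

    aiocb->cb(aiocb->opaque, -ECANCELED); /* report "cancelled" to IDE now */
    aiocb->cb = NULL;                     /* drop the real completion later */

The request already submitted to the backend keeps running; we merely stop
looking at its result.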

I'm concerned that this isn't safe.  What happens if the request does
complete (e.g. will guest RAM be modified by the read operation)?

What happens if a new request is started and then the old NOPed request
completes?

Taking a step back, what are the semantics of writing !(val &
BM_CMD_START)?  Is the device guaranteed to cancel/complete requests
during the register write?
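
To make the second question concrete, here is a toy model of the ordering
I have in mind.  It is entirely outside QEMU: every type and function
below is invented for illustration, none of it is a QEMU API, and whether
guest RAM is really written that late depends on the DMA helpers, which
is exactly what I am asking about.

/* toy_cancel.c: model of "NOP the callback, let the request run on". */
#include <errno.h>
#include <stdio.h>
#include <string.h>

typedef void ToyCompletionFunc(void *opaque, int ret);

typedef struct ToyAIOCB {
    ToyCompletionFunc *cb;    /* set to NULL on "cancel", as in the patch */
    void *opaque;
    char *guest_buf;          /* guest memory the read lands in */
    const char *backend_data; /* what the slow backend eventually returns */
} ToyAIOCB;

static void toy_dma_cb(void *opaque, int ret)
{
    printf("IDE callback: ret=%d (%s)\n", ret,
           ret == -ECANCELED ? "cancelled" : "completed");
}

/* The backend finally answers: it writes into the guest buffer and then
 * invokes whatever callback is left. */
static void toy_backend_completes(ToyAIOCB *aiocb)
{
    strcpy(aiocb->guest_buf, aiocb->backend_data);
    if (aiocb->cb) {
        aiocb->cb(aiocb->opaque, 0);
    } else {
        printf("NOPed request completed silently\n");
    }
}

int main(void)
{
    char guest_buf[64] = "";

    /* 1. Old read targets guest_buf; the backend (NFS) is stuck. */
    ToyAIOCB old_req = { toy_dma_cb, NULL, guest_buf, "OLD SECTOR" };

    /* 2. Guest cancels the DMA; the patch completes it locally with
     *    -ECANCELED and NOPs the callback.  The request lives on. */
    old_req.cb(old_req.opaque, -ECANCELED);
    old_req.cb = NULL;

    /* 3. Guest reuses the same buffer for a new read that completes. */
    ToyAIOCB new_req = { toy_dma_cb, NULL, guest_buf, "NEW SECTOR" };
    toy_backend_completes(&new_req);
    printf("guest buffer now: \"%s\"\n", guest_buf);

    /* 4. The backend recovers and finishes the old, NOPed request,
     *    overwriting the new data behind the guest's back. */
    toy_backend_completes(&old_req);
    printf("guest buffer now: \"%s\"\n", guest_buf);
    return 0;
}

If the completion path really does store into the mapped guest pages at
that point, step 4 corrupts data the guest already considers valid; if it
does not, I would like to understand what prevents it.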