From: Peter Lieven <pl@kamp.de>
To: Stefan Hajnoczi <stefanha@gmail.com>
Cc: kwolf@redhat.com, pbonzini@redhat.com, jsnow@redhat.com,
qemu-devel@nongnu.org, qemu-block@nongnu.org
Subject: Re: [Qemu-devel] [PATCH 2/2] ide/atapi: partially avoid deadlock if the storage backend is dead
Date: Sun, 06 Sep 2015 11:24:10 +0200
Message-ID: <55EC063A.6080501@kamp.de>
In-Reply-To: <20150903165928.GF18405@stefanha-thinkpad.redhat.com>
On 03.09.2015 at 18:59, Stefan Hajnoczi wrote:
> On Thu, Aug 20, 2015 at 10:14:08AM +0200, Peter Lieven wrote:
>> the blk_drain_all() that is executed if the guest issues a DMA cancel
>> leads to a stuck main loop if the storage backend (e.g. a NFS share)
>> is unresponsive.
>>
>> This scenario is a common case for CDROM images mounted from an
>> NFS share. In this case a broken NFS server can take down the
>> whole VM even if the mounted CDROM is not used and was just not
>> unmounted after usage.
>>
>> This approach avoids the blk_drain_all for read-only media,
>> cancels the AIO locally and makes the callback a NOP if the
>> original request is completed after the NFS share is responsive
>> again.
>>
>> Signed-off-by: Peter Lieven <pl@kamp.de>
>> ---
>> hw/ide/pci.c | 32 ++++++++++++++++++--------------
>> 1 file changed, 18 insertions(+), 14 deletions(-)
>>
>> diff --git a/hw/ide/pci.c b/hw/ide/pci.c
>> index d31ff88..a8b4175 100644
>> --- a/hw/ide/pci.c
>> +++ b/hw/ide/pci.c
>> @@ -240,21 +240,25 @@ void bmdma_cmd_writeb(BMDMAState *bm, uint32_t val)
>>      /* Ignore writes to SSBM if it keeps the old value */
>>      if ((val & BM_CMD_START) != (bm->cmd & BM_CMD_START)) {
>>          if (!(val & BM_CMD_START)) {
>> -            /*
>> -             * We can't cancel Scatter Gather DMA in the middle of the
>> -             * operation or a partial (not full) DMA transfer would reach
>> -             * the storage so we wait for completion instead (we beahve
>> -             * like if the DMA was completed by the time the guest trying
>> -             * to cancel dma with bmdma_cmd_writeb with BM_CMD_START not
>> -             * set).
>> -             *
>> -             * In the future we'll be able to safely cancel the I/O if the
>> -             * whole DMA operation will be submitted to disk with a single
>> -             * aio operation with preadv/pwritev.
>> -             */
>>              if (bm->bus->dma->aiocb) {
>> -                blk_drain_all();
>> -                assert(bm->bus->dma->aiocb == NULL);
>> +                if (!bdrv_is_read_only(bm->bus->dma->aiocb->bs)) {
>> +                    /* We can't cancel Scatter Gather DMA in the middle of the
>> +                     * operation or a partial (not full) DMA transfer would
>> +                     * reach the storage so we wait for completion instead
>> +                     * (we beahve like if the DMA was completed by the time the
>> +                     * guest trying to cancel dma with bmdma_cmd_writeb with
>> +                     * BM_CMD_START not set). */
>> +                    blk_drain_all();
>> +                    assert(bm->bus->dma->aiocb == NULL);
>> +                } else {
>> +                    /* On a read-only device (e.g. CDROM) we can't cause incon-
>> +                     * sistencies and thus cancel the AIOCB locally and avoid
>> +                     * to be called back later if the original request is
>> +                     * completed. */
>> +                    BlockAIOCB *aiocb = bm->bus->dma->aiocb;
>> +                    aiocb->cb(aiocb->opaque, -ECANCELED);
>> +                    aiocb->cb = NULL;
> I'm concerned that this isn't safe.
>
> What happens if the request does complete (e.g. will guest RAM be
> modified by the read operation)?
I am afraid you are right. The storage driver's completion path will
likely still write the read data into guest memory. For storage drivers
which don't do zero copy, this could be solved if it were possible to
notify them about the cancellation.
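
To make the non-zero-copy idea concrete, here is a minimal standalone sketch,
not QEMU code. It assumes a driver that reads into a private bounce buffer and
only copies into guest memory on completion. All names (SketchReq,
sketch_cancel, sketch_complete) are hypothetical and exist only for this
illustration; no such notification interface exists in the block layer today.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical request state for illustration; this is NOT QEMU's BlockAIOCB. */
typedef struct SketchReq {
    unsigned char *guest_buf;   /* destination in (simulated) guest RAM */
    unsigned char bounce[16];   /* driver-private buffer filled by the backend */
    size_t len;                 /* number of valid bytes in bounce */
    bool cancelled;             /* set once the device model cancels the request */
} SketchReq;

/* Device model side: tell the driver that the result is no longer wanted. */
static void sketch_cancel(SketchReq *req)
{
    req->cancelled = true;
}

/*
 * Driver side: called when the backend finally answers.
 * Guest memory is touched only if the request was not cancelled.
 * Returns true if guest memory was written.
 */
static bool sketch_complete(SketchReq *req)
{
    assert(req->len <= sizeof(req->bounce));
    if (req->cancelled) {
        return false;           /* drop the data, leave guest RAM alone */
    }
    memcpy(req->guest_buf, req->bounce, req->len);
    return true;
}
```

A zero-copy driver has no such copy step to skip, because the backend writes
directly into the guest buffer, which is why this would not help there.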
>
> What happens if a new request is started and then old NOPed request
> completes?
That should work because the callback of the NOPed request is never
fired.
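
As a rough standalone model of that argument (again not QEMU code): the
completion path skips any request whose callback pointer was cleared, which
is the behaviour patch 1/2 is meant to allow. The Sketch* names and the
counting callback are hypothetical and only serve this example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

typedef void SketchCompletionFunc(void *opaque, int ret);

/* Hypothetical stand-in for an AIOCB: just a callback and its argument. */
typedef struct SketchAIOCB {
    SketchCompletionFunc *cb;
    void *opaque;
} SketchAIOCB;

/* Completion path: a cleared callback turns completion into a no-op.
 * Returns 1 if the callback ran, 0 if the request had been NOPed. */
static int sketch_aio_complete(SketchAIOCB *acb, int ret)
{
    if (acb->cb == NULL) {
        return 0;
    }
    acb->cb(acb->opaque, ret);
    return 1;
}

/* Local cancel as in the quoted hunk: report -ECANCELED once, then clear cb. */
static void sketch_cancel_locally(SketchAIOCB *acb)
{
    assert(acb->cb != NULL);
    acb->cb(acb->opaque, -ECANCELED);
    acb->cb = NULL;
}

/* Example callback that counts how often it was invoked. */
static void sketch_count_cb(void *opaque, int ret)
{
    (void)ret;
    ++*(int *)opaque;
}
```

This only covers the callback side. It says nothing about the data the late
completion may already have written, which is the concern raised above.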
>
> Taking a step back, what are the semantics of writing !(val &
> BM_CMD_START)? Is the device guaranteed to cancel/complete requests
> during the register write?
I have to check that. John, do you have an idea?
Stefan, alternatively, do you have a better approach for getting rid of the
bdrv_drain_all in this piece of code?
Peter