qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Hanna Czenczek <hreitz@redhat.com>
To: qemu-block@nongnu.org
Cc: qemu-devel@nongnu.org, Hanna Czenczek <hreitz@redhat.com>,
	Fiona Ebner <f.ebner@proxmox.com>, John Snow <jsnow@redhat.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	Kevin Wolf <kwolf@redhat.com>
Subject: [PATCH for-8.0] ide: Fix manual in-flight count for TRIM BH
Date: Thu,  9 Mar 2023 12:44:30 +0100	[thread overview]
Message-ID: <20230309114430.33684-1-hreitz@redhat.com> (raw)

Commit 7e5cdb345f77d76cb4877fe6230c4e17a7d0d0ca ("ide: Increment BB
in-flight counter for TRIM BH") fixed a problem when cancelling trims:
ide_cancel_dma_sync() drains the BB to stop a potentially ongoing
request, and then asserts that no request is active anymore.  When
trying to cancel a trim, this would only drain a single
blk_aio_pdiscard() operation, but not necessarily the trim as a whole.
Said commit fixed that by counting the whole trim as an operation,
incrementing the in-flight counter for it, so the blk_drain() in
ide_cancel_dma_sync() would wait for it.

Fiona found a problem with this, specifically that draining a BB while
an IDE trim operation is going on can produce a deadlock: An ongoing
blk_aio_pdiscard() will be settled, but any further discard in the same
trim will go into the BB's queued_request list and wait there until the
drain stops.  Meanwhile, the whole trim keeps the BB's in_flight_counter
incremented, so the BB will never be considered drained, and qemu hangs.

Therefore, we cannot keep the in-flight counter incremented throughout
the trim operation.  We should still increment it before the completion
BH (ide_trim_bh_cb()) is scheduled and decrement it there, so that the
blk_drain() in ide_cancel_dma_sync() can await completion of this
scheduled BH.  With that change, however, it can take multiple
iterations of blk_drain() to really settle the whole trim operation, and
so ide_cancel_dma_sync() must run it in a loop.

Reported-by: Fiona Ebner <f.ebner@proxmox.com>
Fixes: 7e5cdb345f77d76cb4877fe6230c4e17a7d0d0ca
       ("ide: Increment BB in-flight counter for TRIM BH")
Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
---
Tested with a small test boot sector that issues a trim operation with
any number of discards from 0 to 64.  Before this patch, doing so hangs
starting from 2 discards; and before 7e5cdb345f, doing so crashes qemu
with any number of discards.  With this patch, it always works.
---
 hw/ide/core.c | 23 +++++++++++++++++------
 1 file changed, 17 insertions(+), 6 deletions(-)

diff --git a/hw/ide/core.c b/hw/ide/core.c
index 2d034731cf..54c51084d2 100644
--- a/hw/ide/core.c
+++ b/hw/ide/core.c
@@ -444,7 +444,7 @@ static void ide_trim_bh_cb(void *opaque)
     iocb->bh = NULL;
     qemu_aio_unref(iocb);
 
-    /* Paired with an increment in ide_issue_trim() */
+    /* Paired with an increment in ide_issue_trim_cb() */
     blk_dec_in_flight(blk);
 }
 
@@ -504,6 +504,14 @@ static void ide_issue_trim_cb(void *opaque, int ret)
 done:
     iocb->aiocb = NULL;
     if (iocb->bh) {
+        /*
+         * Paired with a decrement in ide_trim_bh_cb(): Ensure we have
+         * an in-flight count while this TrimAIOCB object lives.
+         * There is no ongoing blk_aio_pdiscard() operation anymore,
+         * so here we increment the counter manually before returning.
+         */
+        blk_inc_in_flight(s->blk);
+
         replay_bh_schedule_event(iocb->bh);
     }
 }
@@ -515,9 +523,6 @@ BlockAIOCB *ide_issue_trim(
     IDEState *s = opaque;
     TrimAIOCB *iocb;
 
-    /* Paired with a decrement in ide_trim_bh_cb() */
-    blk_inc_in_flight(s->blk);
-
     iocb = blk_aio_get(&trim_aiocb_info, s->blk, cb, cb_opaque);
     iocb->s = s;
     iocb->bh = qemu_bh_new(ide_trim_bh_cb, iocb);
@@ -737,11 +742,17 @@ void ide_cancel_dma_sync(IDEState *s)
      * In the future we'll be able to safely cancel the I/O if the
      * whole DMA operation will be submitted to disk with a single
      * aio operation with preadv/pwritev.
+     *
+     * Note that TRIM operations call blk_aio_pdiscard() multiple
+     * times (and finally increment s->blk's in-flight counter while
+     * ide_trim_bh_cb() is scheduled), so we have to loop blk_drain()
+     * until the whole operation is done.
      */
     if (s->bus->dma->aiocb) {
         trace_ide_cancel_dma_sync_remaining();
-        blk_drain(s->blk);
-        assert(s->bus->dma->aiocb == NULL);
+        while (s->bus->dma->aiocb) {
+            blk_drain(s->blk);
+        }
     }
 }
 
-- 
2.39.1



             reply	other threads:[~2023-03-09 11:44 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-03-09 11:44 Hanna Czenczek [this message]
2023-03-09 12:05 ` [PATCH for-8.0] ide: Fix manual in-flight count for TRIM BH Paolo Bonzini
2023-03-09 12:08   ` Paolo Bonzini
2023-03-09 12:31     ` Hanna Czenczek
2023-03-09 13:59       ` Paolo Bonzini
2023-03-09 17:46         ` Kevin Wolf
2023-03-10 13:05           ` Fiona Ebner
2023-03-10 14:25             ` Kevin Wolf
2023-03-10 15:13               ` Paolo Bonzini
2023-03-13 12:29                 ` Fiona Ebner
2023-03-13 13:09                   ` Paolo Bonzini
2023-03-13 16:32                 ` Kevin Wolf
2023-03-14  9:42                   ` Paolo Bonzini

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20230309114430.33684-1-hreitz@redhat.com \
    --to=hreitz@redhat.com \
    --cc=f.ebner@proxmox.com \
    --cc=jsnow@redhat.com \
    --cc=kwolf@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).