From: Sergio Lopez <slp@redhat.com>
To: Eric Blake <eblake@redhat.com>
Cc: Kevin Wolf <kwolf@redhat.com>,
	qemu-devel@nongnu.org, Markus Armbruster <armbru@redhat.com>,
	qemu-block@nongnu.org, Max Reitz <mreitz@redhat.com>
Subject: Re: [PATCH v5 4/4] blockdev: honor bdrv_try_set_aio_context() context requirements
Date: Wed, 18 Dec 2019 16:08:29 +0100
Message-ID: <87bls5zn6a.fsf@redhat.com>
In-Reply-To: <7ea304ab-0a4b-8c0a-ae9f-2f6501198840@redhat.com>

Eric Blake <eblake@redhat.com> writes:

> On 12/9/19 10:06 AM, Kevin Wolf wrote:
>> Am 28.11.2019 um 11:41 hat Sergio Lopez geschrieben:
>>> bdrv_try_set_aio_context() requires that the old context is held, and
>>> the new context is not held. Fix all the occurrences where it's not
>>> done this way.
>>>
>>> Suggested-by: Max Reitz <mreitz@redhat.com>
>>> Signed-off-by: Sergio Lopez <slp@redhat.com>
>>> ---
>
>> Or in fact, I think you need to hold the AioContext of a bs to
>> bdrv_unref() it, so maybe 'goto out' is right, but you need to unref
>> target_bs while you still hold old_context.
>
> I suspect https://bugzilla.redhat.com/show_bug.cgi?id=1779036 is also
> a symptom of this.  The v5 patch did not fix this simple test case:
>
>
> $ qemu-img create -f qcow2 f1 100m
> $ qemu-img create -f qcow2 f2 100m
> $ ./qemu-kvm -nodefaults -nographic -qmp stdio -object iothread,id=io0 \
>  -drive driver=qcow2,id=drive1,file=f1,if=none \
>  -device virtio-scsi-pci,id=scsi0,iothread=io0 \
>  -device scsi-hd,id=image1,drive=drive1 \
>  -drive driver=qcow2,id=drive2,file=f2,if=none \
>  -device virtio-blk-pci,id=image2,drive=drive2,iothread=io0
>
> {'execute':'qmp_capabilities'}
>
> {'execute':'transaction','arguments':{'actions':[
> {'type':'blockdev-snapshot-sync','data':{'device':'drive1',
> 'snapshot-file':'sn1','mode':'absolute-paths','format':'qcow2'}},
> {'type':'blockdev-snapshot-sync','data':{'device':'drive2',
> 'snapshot-file':'/aa/sn1','mode':'absolute-paths','format':'qcow2'}}]}}
>
> which is an aio context bug somewhere on the error path of
> blockdev-snapshot-sync (the first one has to be rolled back because
> the second part of the transaction fails early on a nonexistent
> directory)

This is slightly different. The problem resides in
external_snapshot_abort():

    static void external_snapshot_abort(BlkActionState *common)
    {
        ExternalSnapshotState *state =
                                 DO_UPCAST(ExternalSnapshotState, common, common);
        if (state->new_bs) {
            if (state->overlay_appended) {
                AioContext *aio_context;

                aio_context = bdrv_get_aio_context(state->old_bs);
                aio_context_acquire(aio_context);

                bdrv_ref(state->old_bs);   /* we can't let bdrv_set_backind_hd()
                                              close state->old_bs; we need it */
                bdrv_set_backing_hd(state->new_bs, NULL, &error_abort);
                bdrv_replace_node(state->new_bs, state->old_bs, &error_abort);
                bdrv_unref(state->old_bs); /* bdrv_replace_node() ref'ed old_bs */

                aio_context_release(aio_context);
            }
        }
    }

bdrv_set_backing_hd() returns state->old_bs to the main AioContext,
while bdrv_replace_node() expects state->new_bs and state->old_bs to be
using the same AioContext.
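
To make the mismatch concrete, here is a toy model in plain C. It is
only a sketch: ToyContext, ToyNode, toy_set_backing() and
toy_same_context() are hypothetical stand-ins invented for this
illustration, not QEMU types or functions, and the "detached node falls
back to the main context" rule is hard-coded from the description
above rather than taken from the block layer code.

```c
#include <stddef.h>

/* Toy stand-ins for illustration only; these are NOT QEMU's types. */
typedef struct ToyContext {
    const char *name;
} ToyContext;

typedef struct ToyNode {
    ToyContext *ctx;          /* context the node currently runs in */
    struct ToyNode *backing;  /* backing node, or NULL */
} ToyNode;

static ToyContext toy_main_ctx = { "main" };
static ToyContext toy_iothread_ctx = { "io0" };

/*
 * Models the side effect described above: when the overlay drops its
 * backing node, the detached node is moved back to the main context.
 */
static void toy_set_backing(ToyNode *overlay, ToyNode *backing)
{
    if (backing == NULL && overlay->backing != NULL) {
        overlay->backing->ctx = &toy_main_ctx;
    }
    overlay->backing = backing;
}

/* Models the precondition that both nodes share one context. */
static int toy_same_context(const ToyNode *a, const ToyNode *b)
{
    return a->ctx == b->ctx;
}
```

With old_bs as the backing node of new_bs and both starting in
toy_iothread_ctx, calling toy_set_backing(&new_bs, NULL) leaves old_bs
in toy_main_ctx, so toy_same_context(&new_bs, &old_bs) no longer holds.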

I'm thinking of sending this as a separate patch:

diff --git a/blockdev.c b/blockdev.c
index e33abd7fd2..6c73ac4e32 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -1731,6 +1731,8 @@ static void external_snapshot_abort(BlkActionState *common)
     if (state->new_bs) {
         if (state->overlay_appended) {
             AioContext *aio_context;
+            AioContext *tmp_context;
+            int ret;
 
             aio_context = bdrv_get_aio_context(state->old_bs);
             aio_context_acquire(aio_context);
@@ -1738,6 +1740,25 @@ static void external_snapshot_abort(BlkActionState *common)
             bdrv_ref(state->old_bs);   /* we can't let bdrv_set_backind_hd()
                                           close state->old_bs; we need it */
             bdrv_set_backing_hd(state->new_bs, NULL, &error_abort);
+
+            /*
+             * The call to bdrv_set_backing_hd() above returns state->old_bs to
+             * the main AioContext. As we're still going to be using it, return
+             * it to the AioContext it was before.
+             */
+            tmp_context = bdrv_get_aio_context(state->old_bs);
+            if (aio_context != tmp_context) {
+                aio_context_release(aio_context);
+                aio_context_acquire(tmp_context);
+
+                ret = bdrv_try_set_aio_context(state->old_bs,
+                                               aio_context, NULL);
+                assert(ret == 0);
+
+                aio_context_release(tmp_context);
+                aio_context_acquire(aio_context);
+            }
+
             bdrv_replace_node(state->new_bs, state->old_bs, &error_abort);
             bdrv_unref(state->old_bs); /* bdrv_replace_node() ref'ed old_bs */
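
The lock handoff in the hunk above can be sketched with a toy model.
Everything here is hypothetical (ToyContext, ToyNode, the toy_*
helpers); none of it is QEMU API. The model assumes a lock is a simple
depth counter and reads "held" as "held exactly once", which is a
simplification. It encodes the rule from the commit message: the old
context must be held and the new one must not be.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for illustration only; these are NOT QEMU's types. */
typedef struct ToyContext {
    int held;                 /* lock depth held by the caller */
} ToyContext;

typedef struct ToyNode {
    ToyContext *ctx;          /* context the node currently runs in */
} ToyNode;

static void toy_acquire(ToyContext *c)
{
    c->held++;
}

static void toy_release(ToyContext *c)
{
    assert(c->held > 0);
    c->held--;
}

/*
 * Move a node to new_ctx. Returns 0 on success, -1 if the locking
 * rule (old context held, new context not held) is violated.
 */
static int toy_try_set_context(ToyNode *n, ToyContext *new_ctx)
{
    if (n->ctx->held != 1 || new_ctx->held != 0) {
        return -1;
    }
    n->ctx = new_ctx;
    return 0;
}

/*
 * Same sequence as the proposed hunk: the caller holds `wanted` on
 * entry and on return; in between, the locks are swapped so that the
 * node's current context is the one being held during the move.
 */
static int toy_restore_context(ToyNode *n, ToyContext *wanted)
{
    ToyContext *current = n->ctx;
    int ret = 0;

    if (current != wanted) {
        toy_release(wanted);
        toy_acquire(current);

        ret = toy_try_set_context(n, wanted);

        toy_release(current);
        toy_acquire(wanted);
    }
    return ret;
}
```

In this model, calling toy_try_set_context() directly while still
holding the target context fails, whereas toy_restore_context()
succeeds and returns with only the target context held.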

What do you think?

Sergio.


