From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 7138CCCFA04 for ; Tue, 4 Nov 2025 18:07:43 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1vGLMv-0000R3-Cs; Tue, 04 Nov 2025 13:02:58 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1vGLFI-0001Dn-VA for qemu-devel@nongnu.org; Tue, 04 Nov 2025 12:55:16 -0500 Received: from us-smtp-delivery-124.mimecast.com ([170.10.129.124]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1vGLFE-0006pH-Uy for qemu-devel@nongnu.org; Tue, 04 Nov 2025 12:55:03 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1762278899; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=lKtvsZ/mQSLh7/SR+SHV5PkkdTjZ3ppItOA25V3o2fM=; b=fa5bknD2H+xFiDfe/hADLan75awRgZnIgdAVg2kMqjx8sljmfdaih4CE0w3TwPMSZ6wWFf O2j69LKzIA91MRwxTuHSVXfP4Kl0/LBP4QDivZBpoOrH+NFgRrcRocWa+egFxYMVgcBhzT chuhbyU4kGJ3UVVSXUfZDj5Jzz/bRhA= Received: from mx-prod-mc-05.mail-002.prod.us-west-2.aws.redhat.com (ec2-54-186-198-63.us-west-2.compute.amazonaws.com [54.186.198.63]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-581-W8uw1a6OPDuiJKTcwUEodQ-1; Tue, 04 Nov 2025 12:54:56 -0500 X-MC-Unique: W8uw1a6OPDuiJKTcwUEodQ-1 X-Mimecast-MFC-AGG-ID: W8uw1a6OPDuiJKTcwUEodQ_1762278895 Received: from mx-prod-int-08.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-08.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.111]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-05.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id 2CFD0195608E; Tue, 4 Nov 2025 17:54:55 +0000 (UTC) Received: from merkur.fritz.box (unknown [10.45.226.47]) by mx-prod-int-08.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id DE99D180044F; Tue, 4 Nov 2025 17:54:53 +0000 (UTC) From: Kevin Wolf To: qemu-block@nongnu.org Cc: kwolf@redhat.com, qemu-devel@nongnu.org Subject: [PULL 17/27] block: Drop detach_subchain for bdrv_replace_node Date: Tue, 4 Nov 2025 18:54:05 +0100 Message-ID: <20251104175415.525388-18-kwolf@redhat.com> In-Reply-To: <20251104175415.525388-1-kwolf@redhat.com> References: <20251104175415.525388-1-kwolf@redhat.com> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Scanned-By: MIMEDefang 3.4.1 on 10.30.177.111 Received-SPF: pass client-ip=170.10.129.124; envelope-from=kwolf@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -28 X-Spam_score: -2.9 X-Spam_bar: -- X-Spam_report: (-2.9 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.788, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H3=0.001, RCVD_IN_MSPIKE_WL=0.001, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, RCVD_IN_VALIDITY_SAFE_BLOCKED=0.001, SPF_HELO_PASS=-0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org From: Wesley Hershberger Detaching filters using detach_subchain=true can cause segfaults as described in #3149. More specifically, this was observed when executing concurrent block-stream and query-named-block-nodes. block-stream adds a copy-on-read filter as the main BDS for the blockjob; that filter was dropped with detach_subchain=true but not unref'd until the the blockjob was free'd. Because query-named-block-nodes assumes that a filter will always have exactly one child, it caused a segfault when it observed the detached filter. Stacktrace: 0 bdrv_refresh_filename (bs=0x5efed72f8350) at /usr/src/qemu-1:10.1.0+ds-5ubuntu2/b/qemu/block.c:8082 1 0x00005efea73cf9dc in bdrv_block_device_info (blk=0x0, bs=0x5efed72f8350, flat=true, errp=0x7ffeb829ebd8) at block/qapi.c:62 2 0x00005efea7391ed3 in bdrv_named_nodes_list (flat=, errp=0x7ffeb829ebd8) at /usr/src/qemu-1:10.1.0+ds-5ubuntu2/b/qemu/block.c:6275 3 0x00005efea7471993 in qmp_query_named_block_nodes (has_flat=, flat=, errp=0x7ffeb829ebd8) at /usr/src/qemu-1:10.1.0+ds-5ubuntu2/b/qemu/blockdev.c:2834 4 qmp_marshal_query_named_block_nodes (args=, ret=0x7f2b753beec0, errp=0x7f2b753beec8) at qapi/qapi-commands-block-core.c:553 5 0x00005efea74f03a5 in do_qmp_dispatch_bh (opaque=0x7f2b753beed0) at qapi/qmp-dispatch.c:128 6 0x00005efea75108e6 in aio_bh_poll (ctx=0x5efed6f3f430) at util/async.c:219 7 0x00005efea74ffdb2 in aio_dispatch (ctx=0x5efed6f3f430) at util/aio-posix.c:436 8 0x00005efea7512846 in aio_ctx_dispatch (source=, callback=,user_data=) at util/async.c:361 9 0x00007f2b77809bfb in ?? () from /lib/x86_64-linux-gnu/libglib-2.0.so.0 10 0x00007f2b77809e70 in g_main_context_dispatch () from /lib/x86_64-linux-gnu/libglib-2.0.so.0 11 0x00005efea7517228 in glib_pollfds_poll () at util/main-loop.c:287 12 os_host_main_loop_wait (timeout=0) at util/main-loop.c:310 13 main_loop_wait (nonblocking=) at util/main-loop.c:589 14 0x00005efea7140482 in qemu_main_loop () at system/runstate.c:905 15 0x00005efea744e4e8 in qemu_default_main (opaque=opaque@entry=0x0) at system/main.c:50 16 0x00005efea6e76319 in main (argc=, argv=) at system/main.c:93 As discussed in 20251024-second-fix-3149-v1-1-d997fa3d5ce2@canonical.com, a filter should not exist without children in the first place; therefore, drop the parameter entirely as it is only used for filters. This is a partial revert of 3108a15cf09865456d499b08fe14e3dbec4ccbb3. After this change, a blockdev-backup job's copy-before-write filter will hold references to its children until the filter is unref'd. This causes an additional flush during bdrv_close, so also update iotest 257. Resolves: https://gitlab.com/qemu-project/qemu/-/issues/3149 Suggested-by: Kevin Wolf Signed-off-by: Wesley Hershberger Reviewed-by: Vladimir Sementsov-Ogievskiy Message-ID: <20251029-third-fix-3149-v2-1-94932bb404f4@canonical.com> Reviewed-by: Kevin Wolf Signed-off-by: Kevin Wolf --- block.c | 38 ++++---------------------------------- tests/qemu-iotests/257 | 8 ++++++-- tests/qemu-iotests/257.out | 14 +++++++------- 3 files changed, 17 insertions(+), 43 deletions(-) diff --git a/block.c b/block.c index cf08e64add..0fe0152a7c 100644 --- a/block.c +++ b/block.c @@ -5398,17 +5398,13 @@ bdrv_replace_node_noperm(BlockDriverState *from, * * With auto_skip=false the error is returned if from has a parent which should * not be updated. - * - * With @detach_subchain=true @to must be in a backing chain of @from. In this - * case backing link of the cow-parent of @to is removed. */ static int GRAPH_WRLOCK bdrv_replace_node_common(BlockDriverState *from, BlockDriverState *to, - bool auto_skip, bool detach_subchain, Error **errp) + bool auto_skip, Error **errp) { Transaction *tran = tran_new(); g_autoptr(GSList) refresh_list = NULL; - BlockDriverState *to_cow_parent = NULL; int ret; GLOBAL_STATE_CODE(); @@ -5417,17 +5413,6 @@ bdrv_replace_node_common(BlockDriverState *from, BlockDriverState *to, assert(to->quiesce_counter); assert(bdrv_get_aio_context(from) == bdrv_get_aio_context(to)); - if (detach_subchain) { - assert(bdrv_chain_contains(from, to)); - assert(from != to); - for (to_cow_parent = from; - bdrv_filter_or_cow_bs(to_cow_parent) != to; - to_cow_parent = bdrv_filter_or_cow_bs(to_cow_parent)) - { - ; - } - } - /* * Do the replacement without permission update. * Replacement may influence the permissions, we should calculate new @@ -5439,11 +5424,6 @@ bdrv_replace_node_common(BlockDriverState *from, BlockDriverState *to, goto out; } - if (detach_subchain) { - /* to_cow_parent is already drained because from is drained */ - bdrv_remove_child(bdrv_filter_or_cow_child(to_cow_parent), tran); - } - refresh_list = g_slist_prepend(refresh_list, to); refresh_list = g_slist_prepend(refresh_list, from); @@ -5462,7 +5442,7 @@ out: int bdrv_replace_node(BlockDriverState *from, BlockDriverState *to, Error **errp) { - return bdrv_replace_node_common(from, to, true, false, errp); + return bdrv_replace_node_common(from, to, true, errp); } int bdrv_drop_filter(BlockDriverState *bs, Error **errp) @@ -5478,7 +5458,7 @@ int bdrv_drop_filter(BlockDriverState *bs, Error **errp) bdrv_drained_begin(child_bs); bdrv_graph_wrlock(); - ret = bdrv_replace_node_common(bs, child_bs, true, true, errp); + ret = bdrv_replace_node_common(bs, child_bs, true, errp); bdrv_graph_wrunlock(); bdrv_drained_end(child_bs); @@ -5929,17 +5909,7 @@ int bdrv_drop_intermediate(BlockDriverState *top, BlockDriverState *base, updated_children = g_slist_prepend(updated_children, c); } - /* - * It seems correct to pass detach_subchain=true here, but it triggers - * one more yet not fixed bug, when due to nested aio_poll loop we switch to - * another drained section, which modify the graph (for example, removing - * the child, which we keep in updated_children list). So, it's a TODO. - * - * Note, bug triggered if pass detach_subchain=true here and run - * test-bdrv-drain. test_drop_intermediate_poll() test-case will crash. - * That's a FIXME. - */ - bdrv_replace_node_common(top, base, false, false, &local_err); + bdrv_replace_node_common(top, base, false, &local_err); bdrv_graph_wrunlock(); if (local_err) { diff --git a/tests/qemu-iotests/257 b/tests/qemu-iotests/257 index 7d3720b8e5..cd0468aaa1 100755 --- a/tests/qemu-iotests/257 +++ b/tests/qemu-iotests/257 @@ -310,14 +310,18 @@ def test_bitmap_sync(bsync_mode, msync_mode='bitmap', failure=None): 'state': 1, 'new_state': 2 }, { - 'event': 'read_aio', + 'event': 'flush_to_disk', 'state': 2, 'new_state': 3 + }, { + 'event': "read_aio", + 'state': 3, + 'new_state': 4 }], 'inject-error': [{ 'event': 'read_aio', 'errno': 5, - 'state': 3, + 'state': 4, 'immediately': False, 'once': True }] diff --git a/tests/qemu-iotests/257.out b/tests/qemu-iotests/257.out index c33dd7f3a9..fb28333cb2 100644 --- a/tests/qemu-iotests/257.out +++ b/tests/qemu-iotests/257.out @@ -272,7 +272,7 @@ qemu_img compare "TEST_DIR/PID-img" "TEST_DIR/PID-fbackup2" ==> Identical, OK! --- Preparing image & VM --- -{"execute": "blockdev-add", "arguments": {"driver": "qcow2", "file": {"driver": "blkdebug", "image": {"driver": "file", "filename": "TEST_DIR/PID-img"}, "inject-error": [{"errno": 5, "event": "read_aio", "immediately": false, "once": true, "state": 3}], "set-state": [{"event": "flush_to_disk", "new-state": 2, "state": 1}, {"event": "read_aio", "new-state": 3, "state": 2}]}, "node-name": "drive0"}} +{"execute": "blockdev-add", "arguments": {"driver": "qcow2", "file": {"driver": "blkdebug", "image": {"driver": "file", "filename": "TEST_DIR/PID-img"}, "inject-error": [{"errno": 5, "event": "read_aio", "immediately": false, "once": true, "state": 4}], "set-state": [{"event": "flush_to_disk", "new-state": 2, "state": 1}, {"event": "flush_to_disk", "new-state": 3, "state": 2}, {"event": "read_aio", "new-state": 4, "state": 3}]}, "node-name": "drive0"}} {"return": {}} --- Write #0 --- @@ -1017,7 +1017,7 @@ qemu_img compare "TEST_DIR/PID-img" "TEST_DIR/PID-fbackup2" ==> Identical, OK! --- Preparing image & VM --- -{"execute": "blockdev-add", "arguments": {"driver": "qcow2", "file": {"driver": "blkdebug", "image": {"driver": "file", "filename": "TEST_DIR/PID-img"}, "inject-error": [{"errno": 5, "event": "read_aio", "immediately": false, "once": true, "state": 3}], "set-state": [{"event": "flush_to_disk", "new-state": 2, "state": 1}, {"event": "read_aio", "new-state": 3, "state": 2}]}, "node-name": "drive0"}} +{"execute": "blockdev-add", "arguments": {"driver": "qcow2", "file": {"driver": "blkdebug", "image": {"driver": "file", "filename": "TEST_DIR/PID-img"}, "inject-error": [{"errno": 5, "event": "read_aio", "immediately": false, "once": true, "state": 4}], "set-state": [{"event": "flush_to_disk", "new-state": 2, "state": 1}, {"event": "flush_to_disk", "new-state": 3, "state": 2}, {"event": "read_aio", "new-state": 4, "state": 3}]}, "node-name": "drive0"}} {"return": {}} --- Write #0 --- @@ -1762,7 +1762,7 @@ qemu_img compare "TEST_DIR/PID-img" "TEST_DIR/PID-fbackup2" ==> Identical, OK! --- Preparing image & VM --- -{"execute": "blockdev-add", "arguments": {"driver": "qcow2", "file": {"driver": "blkdebug", "image": {"driver": "file", "filename": "TEST_DIR/PID-img"}, "inject-error": [{"errno": 5, "event": "read_aio", "immediately": false, "once": true, "state": 3}], "set-state": [{"event": "flush_to_disk", "new-state": 2, "state": 1}, {"event": "read_aio", "new-state": 3, "state": 2}]}, "node-name": "drive0"}} +{"execute": "blockdev-add", "arguments": {"driver": "qcow2", "file": {"driver": "blkdebug", "image": {"driver": "file", "filename": "TEST_DIR/PID-img"}, "inject-error": [{"errno": 5, "event": "read_aio", "immediately": false, "once": true, "state": 4}], "set-state": [{"event": "flush_to_disk", "new-state": 2, "state": 1}, {"event": "flush_to_disk", "new-state": 3, "state": 2}, {"event": "read_aio", "new-state": 4, "state": 3}]}, "node-name": "drive0"}} {"return": {}} --- Write #0 --- @@ -2507,7 +2507,7 @@ qemu_img compare "TEST_DIR/PID-img" "TEST_DIR/PID-fbackup2" ==> Identical, OK! --- Preparing image & VM --- -{"execute": "blockdev-add", "arguments": {"driver": "qcow2", "file": {"driver": "blkdebug", "image": {"driver": "file", "filename": "TEST_DIR/PID-img"}, "inject-error": [{"errno": 5, "event": "read_aio", "immediately": false, "once": true, "state": 3}], "set-state": [{"event": "flush_to_disk", "new-state": 2, "state": 1}, {"event": "read_aio", "new-state": 3, "state": 2}]}, "node-name": "drive0"}} +{"execute": "blockdev-add", "arguments": {"driver": "qcow2", "file": {"driver": "blkdebug", "image": {"driver": "file", "filename": "TEST_DIR/PID-img"}, "inject-error": [{"errno": 5, "event": "read_aio", "immediately": false, "once": true, "state": 4}], "set-state": [{"event": "flush_to_disk", "new-state": 2, "state": 1}, {"event": "flush_to_disk", "new-state": 3, "state": 2}, {"event": "read_aio", "new-state": 4, "state": 3}]}, "node-name": "drive0"}} {"return": {}} --- Write #0 --- @@ -3252,7 +3252,7 @@ qemu_img compare "TEST_DIR/PID-img" "TEST_DIR/PID-fbackup2" ==> Identical, OK! --- Preparing image & VM --- -{"execute": "blockdev-add", "arguments": {"driver": "qcow2", "file": {"driver": "blkdebug", "image": {"driver": "file", "filename": "TEST_DIR/PID-img"}, "inject-error": [{"errno": 5, "event": "read_aio", "immediately": false, "once": true, "state": 3}], "set-state": [{"event": "flush_to_disk", "new-state": 2, "state": 1}, {"event": "read_aio", "new-state": 3, "state": 2}]}, "node-name": "drive0"}} +{"execute": "blockdev-add", "arguments": {"driver": "qcow2", "file": {"driver": "blkdebug", "image": {"driver": "file", "filename": "TEST_DIR/PID-img"}, "inject-error": [{"errno": 5, "event": "read_aio", "immediately": false, "once": true, "state": 4}], "set-state": [{"event": "flush_to_disk", "new-state": 2, "state": 1}, {"event": "flush_to_disk", "new-state": 3, "state": 2}, {"event": "read_aio", "new-state": 4, "state": 3}]}, "node-name": "drive0"}} {"return": {}} --- Write #0 --- @@ -3997,7 +3997,7 @@ qemu_img compare "TEST_DIR/PID-img" "TEST_DIR/PID-fbackup2" ==> Identical, OK! --- Preparing image & VM --- -{"execute": "blockdev-add", "arguments": {"driver": "qcow2", "file": {"driver": "blkdebug", "image": {"driver": "file", "filename": "TEST_DIR/PID-img"}, "inject-error": [{"errno": 5, "event": "read_aio", "immediately": false, "once": true, "state": 3}], "set-state": [{"event": "flush_to_disk", "new-state": 2, "state": 1}, {"event": "read_aio", "new-state": 3, "state": 2}]}, "node-name": "drive0"}} +{"execute": "blockdev-add", "arguments": {"driver": "qcow2", "file": {"driver": "blkdebug", "image": {"driver": "file", "filename": "TEST_DIR/PID-img"}, "inject-error": [{"errno": 5, "event": "read_aio", "immediately": false, "once": true, "state": 4}], "set-state": [{"event": "flush_to_disk", "new-state": 2, "state": 1}, {"event": "flush_to_disk", "new-state": 3, "state": 2}, {"event": "read_aio", "new-state": 4, "state": 3}]}, "node-name": "drive0"}} {"return": {}} --- Write #0 --- @@ -4742,7 +4742,7 @@ qemu_img compare "TEST_DIR/PID-img" "TEST_DIR/PID-fbackup2" ==> Identical, OK! --- Preparing image & VM --- -{"execute": "blockdev-add", "arguments": {"driver": "qcow2", "file": {"driver": "blkdebug", "image": {"driver": "file", "filename": "TEST_DIR/PID-img"}, "inject-error": [{"errno": 5, "event": "read_aio", "immediately": false, "once": true, "state": 3}], "set-state": [{"event": "flush_to_disk", "new-state": 2, "state": 1}, {"event": "read_aio", "new-state": 3, "state": 2}]}, "node-name": "drive0"}} +{"execute": "blockdev-add", "arguments": {"driver": "qcow2", "file": {"driver": "blkdebug", "image": {"driver": "file", "filename": "TEST_DIR/PID-img"}, "inject-error": [{"errno": 5, "event": "read_aio", "immediately": false, "once": true, "state": 4}], "set-state": [{"event": "flush_to_disk", "new-state": 2, "state": 1}, {"event": "flush_to_disk", "new-state": 3, "state": 2}, {"event": "read_aio", "new-state": 4, "state": 3}]}, "node-name": "drive0"}} {"return": {}} --- Write #0 --- -- 2.51.1