* [PATCH] block: Add missing null checks during query-named-block-nodes
@ 2025-10-24 18:07 Wesley Hershberger
  2025-10-25  8:27 ` Vladimir Sementsov-Ogievskiy
  2025-10-27  9:55 ` Kevin Wolf
  0 siblings, 2 replies; 9+ messages in thread
From: Wesley Hershberger @ 2025-10-24 18:07 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Hanna Reitz, qemu-block, Vladimir Sementsov-Ogievskiy,
	Wesley Hershberger
Some operations insert an implicit node above the top node in the block
graph (e.g. block-stream or blockdev-backup). The implicit node is
removed when the operation finishes. If QMP query-named-block-nodes is
called while the operation is removing the implicit node, it may observe
a node with no children, causing a segfault.
This is hypothesized to only affect the block-stream operation as other
operations use the workaround bdc4c4c5e372756a5ba3fb3a61e585b02f0dd7f4
or do not detach their children during cleanup (see
3108a15cf09865456d499b08fe14e3dbec4ccbb3).
This backtrace was observed in #3149 on a relatively recent build. The
bs passed to bdrv_refresh_filename is the cor_filter_bs from the
block-stream operation; bs->implicit was "true".
0  bdrv_refresh_filename (bs=0x5efed72f8350)
    at /usr/src/qemu-1:10.1.0+ds-5ubuntu2/b/qemu/block.c:8082
1  0x00005efea73cf9dc in bdrv_block_device_info
    (blk=0x0, bs=0x5efed72f8350, flat=true, errp=0x7ffeb829ebd8)
    at block/qapi.c:62
2  0x00005efea7391ed3 in bdrv_named_nodes_list
    (flat=<optimized out>, errp=0x7ffeb829ebd8)
    at /usr/src/qemu-1:10.1.0+ds-5ubuntu2/b/qemu/block.c:6275
3  0x00005efea7471993 in qmp_query_named_block_nodes
    (has_flat=<optimized out>, flat=<optimized out>, errp=0x7ffeb829ebd8)
    at /usr/src/qemu-1:10.1.0+ds-5ubuntu2/b/qemu/blockdev.c:2834
4  qmp_marshal_query_named_block_nodes
    (args=<optimized out>, ret=0x7f2b753beec0, errp=0x7f2b753beec8)
    at qapi/qapi-commands-block-core.c:553
5  0x00005efea74f03a5 in do_qmp_dispatch_bh (opaque=0x7f2b753beed0)
    at qapi/qmp-dispatch.c:128
6  0x00005efea75108e6 in aio_bh_poll (ctx=0x5efed6f3f430)
    at util/async.c:219
...
The get_allocated_file_size change resolves a second segfault after the
first was resolved. Here, bdrv_filter_bs returns NULL as the
bs (cor_filter_bs) has no children:
0  bdrv_co_get_allocated_file_size (bs=<optimized out>)
    at /usr/src/qemu-1:10.1.0+ds-5ubuntu2+test8/b/qemu/block.c:6018
1  0x0000631d078522be in bdrv_co_get_allocated_file_size_entry
    (opaque=0x7ffd375c5dd0) at block/block-gen.c:288
2  0x0000631d07929ec2 in coroutine_trampoline
    (i0=<optimized out>, i1=<optimized out>)
    at util/coroutine-ucontext.c:175
...
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/3149
Buglink: https://bugs.launchpad.net/bugs/2126951
Suggested-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Signed-off-by: Wesley Hershberger <wesley.hershberger@canonical.com>
---
As discussed in the previous thread:
https://lists.gnu.org/archive/html/qemu-devel/2025-10/msg05458.html
This resolves the issue with my reproducer.
make check-block passes locally.
Please let me know if any adjustments to the patch are needed.
Thanks for the quick & helpful reviews!
---
 block.c | 21 ++++++++++++++++++++-
 1 file changed, 20 insertions(+), 1 deletion(-)
diff --git a/block.c b/block.c
index 8848e9a7ed665a1bfbde2aba29e2c414f5bbe39b..8629324703d5d5fcf0bc3908d5b2dd4b26e4031d 100644
--- a/block.c
+++ b/block.c
@@ -6024,8 +6024,18 @@ int64_t coroutine_fn bdrv_co_get_allocated_file_size(BlockDriverState *bs)
          */
         return -ENOTSUP;
     } else if (drv->is_filter) {
+        BlockDriverState *filtered = bdrv_filter_bs(bs);
+
+        /*
+         * A filter may have been removed from the graph (child detached) but
+         * not yet unrefed.
+         */
+        if (!filtered) {
+            return -ENOMEDIUM;
+        }
+
         /* Filter drivers default to the size of their filtered child */
-        return bdrv_co_get_allocated_file_size(bdrv_filter_bs(bs));
+        return bdrv_co_get_allocated_file_size(filtered);
     } else {
         /* Other drivers default to summing their children's sizes */
         return bdrv_sum_allocated_file_size(bs);
@@ -8067,6 +8077,15 @@ void bdrv_refresh_filename(BlockDriverState *bs)
     }
 
     if (bs->implicit) {
+        /*
+         * An node may have been removed from the graph (child detached) but
+         * not yet unrefed. filename and full_open_options can be left
+         * unchanged since this state is temporary.
+         */
+        if (QLIST_EMPTY(&bs->children)) {
+            return;
+        }
+
         /* For implicit nodes, just copy everything from the single child */
         child = QLIST_FIRST(&bs->children);
         assert(QLIST_NEXT(child, next) == NULL);
---
base-commit: e8779f3d1509cd07620c6166a9a280376e01ff2f
change-id: 20251024-second-fix-3149-f242bd8f57f5
Best regards,
-- 
Wesley Hershberger <wesley.hershberger@canonical.com>
^ permalink raw reply related	[flat|nested] 9+ messages in thread
* Re: [PATCH] block: Add missing null checks during query-named-block-nodes
  2025-10-24 18:07 [PATCH] block: Add missing null checks during query-named-block-nodes Wesley Hershberger
@ 2025-10-25  8:27 ` Vladimir Sementsov-Ogievskiy
  2025-10-27  9:55 ` Kevin Wolf
  1 sibling, 0 replies; 9+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2025-10-25  8:27 UTC (permalink / raw)
  To: Wesley Hershberger, qemu-devel; +Cc: Kevin Wolf, Hanna Reitz, qemu-block
On 24.10.25 21:07, Wesley Hershberger wrote:
> Some operations insert an implicit node above the top node in the block
> graph (e.g. block-stream or blockdev-backup). The implicit node is
> removed when the operation finishes. If QMP query-named-block-nodes is
> called while the operation is removing the implicit node, it may observe
> a node with no children, causing a segfault.
> 
> This is hypothesized to only affect the block-stream operation as other
> operations use the workaround bdc4c4c5e372756a5ba3fb3a61e585b02f0dd7f4
> or do not detach their children during cleanup (see
> 3108a15cf09865456d499b08fe14e3dbec4ccbb3).
> 
> This backtrace was observed in #3149 on a relatively recent build. The
> bs passed to bdrv_refresh_filename is the cor_filter_bs from the
> block-stream operation; bs->implicit was "true".
> 
> 0  bdrv_refresh_filename (bs=0x5efed72f8350)
>      at /usr/src/qemu-1:10.1.0+ds-5ubuntu2/b/qemu/block.c:8082
> 1  0x00005efea73cf9dc in bdrv_block_device_info
>      (blk=0x0, bs=0x5efed72f8350, flat=true, errp=0x7ffeb829ebd8)
>      at block/qapi.c:62
> 2  0x00005efea7391ed3 in bdrv_named_nodes_list
>      (flat=<optimized out>, errp=0x7ffeb829ebd8)
>      at /usr/src/qemu-1:10.1.0+ds-5ubuntu2/b/qemu/block.c:6275
> 3  0x00005efea7471993 in qmp_query_named_block_nodes
>      (has_flat=<optimized out>, flat=<optimized out>, errp=0x7ffeb829ebd8)
>      at /usr/src/qemu-1:10.1.0+ds-5ubuntu2/b/qemu/blockdev.c:2834
> 4  qmp_marshal_query_named_block_nodes
>      (args=<optimized out>, ret=0x7f2b753beec0, errp=0x7f2b753beec8)
>      at qapi/qapi-commands-block-core.c:553
> 5  0x00005efea74f03a5 in do_qmp_dispatch_bh (opaque=0x7f2b753beed0)
>      at qapi/qmp-dispatch.c:128
> 6  0x00005efea75108e6 in aio_bh_poll (ctx=0x5efed6f3f430)
>      at util/async.c:219
> ...
> 
> The get_allocated_file_size change resolves a second segfault after the
> first was resolved. Here, bdrv_filter_bs returns NULL as the
> bs (cor_filter_bs) has no children:
> 
> 0  bdrv_co_get_allocated_file_size (bs=<optimized out>)
>      at /usr/src/qemu-1:10.1.0+ds-5ubuntu2+test8/b/qemu/block.c:6018
> 1  0x0000631d078522be in bdrv_co_get_allocated_file_size_entry
>      (opaque=0x7ffd375c5dd0) at block/block-gen.c:288
> 2  0x0000631d07929ec2 in coroutine_trampoline
>      (i0=<optimized out>, i1=<optimized out>)
>      at util/coroutine-ucontext.c:175
> ...
> 
> Resolves: https://gitlab.com/qemu-project/qemu/-/issues/3149
> Buglink: https://bugs.launchpad.net/bugs/2126951
> Suggested-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
> Signed-off-by: Wesley Hershberger <wesley.hershberger@canonical.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
-- 
Best regards,
Vladimir
^ permalink raw reply	[flat|nested] 9+ messages in thread
* Re: [PATCH] block: Add missing null checks during query-named-block-nodes
  2025-10-24 18:07 [PATCH] block: Add missing null checks during query-named-block-nodes Wesley Hershberger
  2025-10-25  8:27 ` Vladimir Sementsov-Ogievskiy
@ 2025-10-27  9:55 ` Kevin Wolf
  2025-10-27 16:04   ` Wesley Hershberger
  2025-10-27 19:44   ` Vladimir Sementsov-Ogievskiy
  1 sibling, 2 replies; 9+ messages in thread
From: Kevin Wolf @ 2025-10-27  9:55 UTC (permalink / raw)
  To: Wesley Hershberger
  Cc: qemu-devel, Hanna Reitz, qemu-block, Vladimir Sementsov-Ogievskiy
Am 24.10.2025 um 20:07 hat Wesley Hershberger geschrieben:
> Some operations insert an implicit node above the top node in the block
> graph (e.g. block-stream or blockdev-backup). The implicit node is
> removed when the operation finishes. If QMP query-named-block-nodes is
> called while the operation is removing the implicit node, it may observe
> a node with no children, causing a segfault.
How can a QMP command run in the middle of removing a node? Is this the
real bug?
You say block-stream is affected, so let's look at stream.c. The
interesting part here is bdrv_cor_filter_drop(s->cor_filter_bs).
bdrv_drop_filter() calls bdrv_drained_begin() and bdrv_graph_wrlock(),
which can run arbitrary callbacks, but not QMP commands. It's too eary
anyway, the filter is still in the graph at this point.
Between bdrv_replace_node_common(), which removes the node from its
parent, and bdrv_unref(cor_filter_bs), I don't see any place that could
run a QMP command.
Does cor_filter_bs have a refcount > 1 before running stream_prepare()
or stream_clean()?
Aha, your previous commit message [1] is actually clearer on this:
    The cor_filter_bs was added to the blockjob as the main BDS (by
    passing it to block_job_create), so bdrv_cor_filter_drop doesn't
    actually remove it from global_bdrv_states.
[1] https://patchew.org/QEMU/20251021-fix-3149-v2-1-5ffbe701e964@canonical.com/
So we _are_ creating a state in which cor_filter_bs still exists, but
doesn't have a child any more. Which is rather untypical for a filter
(in fact, it's against the definition of a filter). And your two
different patches address this from two different angles:
1. Don't even let this situation arise. We need to make sure that
   cor_filter_bs never exists without a child - or at least, that it's
   happening only for a short time while we know that no other code is
   running. This is what your previous patch attempted.
2. Make sure that everything else in QEMU can deal with a filter node
   that doesn't have a child. This is what this one does.
Hanna, do you have an opinion on these two options?
I'm not sure myself, but I see that both aren't mutually exclusive. I
would say that having 1. is certainly a good thing that makes everything
else simpler and less likely to fail, so that's the one I would take in
any case. I'm not completely sure if that means v2 of "stream: Remove
bdrv from job in .clean()", or if another version that just removes this
one node from the job would be better. We do have bdrv_drain_all_begin()
later in stream_prepare(), which must be assumed to run any sorts of
callbacks, so removing the node in stream_clean() might be too late in
some cases.
And then we could think of having this patch for 2., probably split
into two patches - though what tells us that this is complete? I would
be surprised if there aren't more places in QEMU that assume that a
filter node has exactly one child. So I think in this case we would have
to audit the rest of the block layer to make sure we caught all of them.
Hm, I think I am relatively sure now actually that 2. is a bad idea...
So what if we don't actually do 2., but then add an assertion to
bdrv_cor_filter_drop() that verifies that the refcount is 1 before
dropping the filter node? This should help us make sure that the patch
for 1. actually does what it's supposed to do.
> This is hypothesized to only affect the block-stream operation as other
> operations use the workaround bdc4c4c5e372756a5ba3fb3a61e585b02f0dd7f4
> or do not detach their children during cleanup (see
> 3108a15cf09865456d499b08fe14e3dbec4ccbb3).
> 
> This backtrace was observed in #3149 on a relatively recent build. The
> bs passed to bdrv_refresh_filename is the cor_filter_bs from the
> block-stream operation; bs->implicit was "true".
> 
> 0  bdrv_refresh_filename (bs=0x5efed72f8350)
>     at /usr/src/qemu-1:10.1.0+ds-5ubuntu2/b/qemu/block.c:8082
> 1  0x00005efea73cf9dc in bdrv_block_device_info
>     (blk=0x0, bs=0x5efed72f8350, flat=true, errp=0x7ffeb829ebd8)
>     at block/qapi.c:62
> 2  0x00005efea7391ed3 in bdrv_named_nodes_list
>     (flat=<optimized out>, errp=0x7ffeb829ebd8)
>     at /usr/src/qemu-1:10.1.0+ds-5ubuntu2/b/qemu/block.c:6275
> 3  0x00005efea7471993 in qmp_query_named_block_nodes
>     (has_flat=<optimized out>, flat=<optimized out>, errp=0x7ffeb829ebd8)
>     at /usr/src/qemu-1:10.1.0+ds-5ubuntu2/b/qemu/blockdev.c:2834
> 4  qmp_marshal_query_named_block_nodes
>     (args=<optimized out>, ret=0x7f2b753beec0, errp=0x7f2b753beec8)
>     at qapi/qapi-commands-block-core.c:553
> 5  0x00005efea74f03a5 in do_qmp_dispatch_bh (opaque=0x7f2b753beed0)
>     at qapi/qmp-dispatch.c:128
> 6  0x00005efea75108e6 in aio_bh_poll (ctx=0x5efed6f3f430)
>     at util/async.c:219
> ...
This is one change...
> The get_allocated_file_size change resolves a second segfault after the
> first was resolved. Here, bdrv_filter_bs returns NULL as the
> bs (cor_filter_bs) has no children:
> 
> 0  bdrv_co_get_allocated_file_size (bs=<optimized out>)
>     at /usr/src/qemu-1:10.1.0+ds-5ubuntu2+test8/b/qemu/block.c:6018
> 1  0x0000631d078522be in bdrv_co_get_allocated_file_size_entry
>     (opaque=0x7ffd375c5dd0) at block/block-gen.c:288
> 2  0x0000631d07929ec2 in coroutine_trampoline
>     (i0=<optimized out>, i1=<optimized out>)
>     at util/coroutine-ucontext.c:175
> ...
...and this is logically a separate one.
So if we were to go down this route, I think we would have a full patch
serie to make filter nodes without a child safe. These two parts would
be separate patches, and I'm almost sure that we would get more patches
for the series to make it fully safe everywhere.
As I said above, I think this is too hard to get correct to be a good
idea, though.
> Resolves: https://gitlab.com/qemu-project/qemu/-/issues/3149
> Buglink: https://bugs.launchpad.net/bugs/2126951
> Suggested-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
> Signed-off-by: Wesley Hershberger <wesley.hershberger@canonical.com>
> ---
> As discussed in the previous thread:
> https://lists.gnu.org/archive/html/qemu-devel/2025-10/msg05458.html
> 
> This resolves the issue with my reproducer.
> 
> make check-block passes locally.
> 
> Please let me know if any adjustments to the patch are needed.
> 
> Thanks for the quick & helpful reviews!
Sorry, Wesley, that this turns out to be a bit more complicated! We'll
probably need some further discussion before we can know if or which
adjustments to the patch are needed.
Before I send this, I just had another thought: Why does copy-on-read
even drop the child in bdrv_cor_filter_drop()? Wouldn't the normal
mode of operation be that for a short time you have both the cor node
and its parent point to the same child node, and that would just work?
I see commit messages about conflicting node permissions (3108a15cf09 by
Vladimir), but I don't understand this. A filter without parents
shouldn't try to take any permissions.
So another option for never letting the situation arise would be that
cor_filter_bs just keeps its child until it's really deleted.
Vladimir, do you remember what the specific problem was?
Kevin
^ permalink raw reply	[flat|nested] 9+ messages in thread
* Re: [PATCH] block: Add missing null checks during query-named-block-nodes
  2025-10-27  9:55 ` Kevin Wolf
@ 2025-10-27 16:04   ` Wesley Hershberger
  2025-10-27 18:45     ` Kevin Wolf
  2025-10-27 20:55     ` Vladimir Sementsov-Ogievskiy
  2025-10-27 19:44   ` Vladimir Sementsov-Ogievskiy
  1 sibling, 2 replies; 9+ messages in thread
From: Wesley Hershberger @ 2025-10-27 16:04 UTC (permalink / raw)
  To: Kevin Wolf
  Cc: qemu-devel, Hanna Reitz, qemu-block, Vladimir Sementsov-Ogievskiy
On Mon, Oct 27, 2025 at 4:55 AM Kevin Wolf <kwolf@redhat.com> wrote:
>
> Am 24.10.2025 um 20:07 hat Wesley Hershberger geschrieben:
>
> Sorry, Wesley, that this turns out to be a bit more complicated! We'll
> probably need some further discussion before we can know if or which
> adjustments to the patch are needed.
Hi Kevin, thanks for your thoughts.
> Before I send this, I just had another thought: Why does copy-on-read
> even drop the child in bdrv_cor_filter_drop()? Wouldn't the normal
> mode of operation be that for a short time you have both the cor node
> and its parent point to the same child node, and that would just work?
> I see commit messages about conflicting node permissions (3108a15cf09 by
> Vladimir), but I don't understand this. A filter without parents
> shouldn't try to take any permissions.
>
> So another option for never letting the situation arise would be that
> cor_filter_bs just keeps its child until it's really deleted.
It sounds like you'd be open to a patch that partially reverts 3108a15cf09?
In my local testing I did verify that preventing the child from being detached
resolves the bug and am currently double-checking it with this diff:
diff --git a/block.c b/block.c
index 8848e9a7ed..72261ea1d4 100644
--- a/block.c
+++ b/block.c
@@ -5386,17 +5386,13 @@ bdrv_replace_node_noperm(BlockDriverState *from,
  *
  * With auto_skip=false the error is returned if from has a parent which should
  * not be updated.
- *
- * With @detach_subchain=true @to must be in a backing chain of @from. In this
- * case backing link of the cow-parent of @to is removed.
  */
 static int GRAPH_WRLOCK
 bdrv_replace_node_common(BlockDriverState *from, BlockDriverState *to,
-                         bool auto_skip, bool detach_subchain, Error **errp)
+                         bool auto_skip, Error **errp)
 {
     Transaction *tran = tran_new();
     g_autoptr(GSList) refresh_list = NULL;
-    BlockDriverState *to_cow_parent = NULL;
     int ret;
     GLOBAL_STATE_CODE();
@@ -5405,17 +5401,6 @@ bdrv_replace_node_common(BlockDriverState
*from, BlockDriverState *to,
     assert(to->quiesce_counter);
     assert(bdrv_get_aio_context(from) == bdrv_get_aio_context(to));
-    if (detach_subchain) {
-        assert(bdrv_chain_contains(from, to));
-        assert(from != to);
-        for (to_cow_parent = from;
-             bdrv_filter_or_cow_bs(to_cow_parent) != to;
-             to_cow_parent = bdrv_filter_or_cow_bs(to_cow_parent))
-        {
-            ;
-        }
-    }
-
     /*
      * Do the replacement without permission update.
      * Replacement may influence the permissions, we should calculate new
@@ -5427,11 +5412,6 @@ bdrv_replace_node_common(BlockDriverState
*from, BlockDriverState *to,
         goto out;
     }
-    if (detach_subchain) {
-        /* to_cow_parent is already drained because from is drained */
-        bdrv_remove_child(bdrv_filter_or_cow_child(to_cow_parent), tran);
-    }
-
     refresh_list = g_slist_prepend(refresh_list, to);
     refresh_list = g_slist_prepend(refresh_list, from);
@@ -5450,7 +5430,7 @@ out:
 int bdrv_replace_node(BlockDriverState *from, BlockDriverState *to,
                       Error **errp)
 {
-    return bdrv_replace_node_common(from, to, true, false, errp);
+    return bdrv_replace_node_common(from, to, true, errp);
 }
 int bdrv_drop_filter(BlockDriverState *bs, Error **errp)
@@ -5466,7 +5446,7 @@ int bdrv_drop_filter(BlockDriverState *bs, Error **errp)
     bdrv_drained_begin(child_bs);
     bdrv_graph_wrlock();
-    ret = bdrv_replace_node_common(bs, child_bs, true, true, errp);
+    ret = bdrv_replace_node_common(bs, child_bs, true, errp);
     bdrv_graph_wrunlock();
     bdrv_drained_end(child_bs);
@@ -5917,17 +5897,7 @@ int bdrv_drop_intermediate(BlockDriverState
*top, BlockDriverState *base,
         updated_children = g_slist_prepend(updated_children, c);
     }
-    /*
-     * It seems correct to pass detach_subchain=true here, but it triggers
-     * one more yet not fixed bug, when due to nested aio_poll loop
we switch to
-     * another drained section, which modify the graph (for example, removing
-     * the child, which we keep in updated_children list). So, it's a TODO.
-     *
-     * Note, bug triggered if pass detach_subchain=true here and run
-     * test-bdrv-drain. test_drop_intermediate_poll() test-case will crash.
-     * That's a FIXME.
-     */
-    bdrv_replace_node_common(top, base, false, false, &local_err);
+    bdrv_replace_node_common(top, base, false, &local_err);
     bdrv_graph_wrunlock();
     if (local_err) {
> Vladimir, do you remember what the specific problem was?
I'd be happy to submit this if Vladimir is happy that it won't cause
other issues
(I've run make check-block with it and saw no failures).
Thanks so much!
~Wesley
^ permalink raw reply related	[flat|nested] 9+ messages in thread
* Re: [PATCH] block: Add missing null checks during query-named-block-nodes
  2025-10-27 16:04   ` Wesley Hershberger
@ 2025-10-27 18:45     ` Kevin Wolf
  2025-10-27 20:42       ` Vladimir Sementsov-Ogievskiy
  2025-10-27 20:55     ` Vladimir Sementsov-Ogievskiy
  1 sibling, 1 reply; 9+ messages in thread
From: Kevin Wolf @ 2025-10-27 18:45 UTC (permalink / raw)
  To: Wesley Hershberger
  Cc: qemu-devel, Hanna Reitz, qemu-block, Vladimir Sementsov-Ogievskiy
Am 27.10.2025 um 17:04 hat Wesley Hershberger geschrieben:
> On Mon, Oct 27, 2025 at 4:55 AM Kevin Wolf <kwolf@redhat.com> wrote:
> >
> > Am 24.10.2025 um 20:07 hat Wesley Hershberger geschrieben:
> >
> > Sorry, Wesley, that this turns out to be a bit more complicated! We'll
> > probably need some further discussion before we can know if or which
> > adjustments to the patch are needed.
> 
> Hi Kevin, thanks for your thoughts.
> 
> > Before I send this, I just had another thought: Why does copy-on-read
> > even drop the child in bdrv_cor_filter_drop()? Wouldn't the normal
> > mode of operation be that for a short time you have both the cor node
> > and its parent point to the same child node, and that would just work?
> > I see commit messages about conflicting node permissions (3108a15cf09 by
> > Vladimir), but I don't understand this. A filter without parents
> > shouldn't try to take any permissions.
> >
> > So another option for never letting the situation arise would be that
> > cor_filter_bs just keeps its child until it's really deleted.
> 
> It sounds like you'd be open to a patch that partially reverts 3108a15cf09?
Hm, yes and no. Primarily I'm trying to find out what our options are as
the next step. So in this case, why did we make the change back then,
and can the same problem it was supposed to solve still be solved in a
different way? If so, then yes, we can essentially revert it. But I'm
not proposing that we skip the step of understanding and just go ahead,
change the code and then wait and see what breaks.
Maybe Vladimir can remember, that would be the easiest. Otherwise, maybe
some mailing list archeology can help. If none of these lead to
anything, we'd still have to thoroughly think through the consequences
for all callers before we can conclude that changing things this way is
safe.
> In my local testing I did verify that preventing the child from being
> detached resolves the bug and am currently double-checking it with
> this diff:
> 
> diff --git a/block.c b/block.c
> index 8848e9a7ed..72261ea1d4 100644
> --- a/block.c
> +++ b/block.c
> @@ -5386,17 +5386,13 @@ bdrv_replace_node_noperm(BlockDriverState *from,
>   *
>   * With auto_skip=false the error is returned if from has a parent which should
>   * not be updated.
> - *
> - * With @detach_subchain=true @to must be in a backing chain of @from. In this
> - * case backing link of the cow-parent of @to is removed.
>   */
>  static int GRAPH_WRLOCK
>  bdrv_replace_node_common(BlockDriverState *from, BlockDriverState *to,
> -                         bool auto_skip, bool detach_subchain, Error **errp)
> +                         bool auto_skip, Error **errp)
>  {
>      Transaction *tran = tran_new();
>      g_autoptr(GSList) refresh_list = NULL;
> -    BlockDriverState *to_cow_parent = NULL;
>      int ret;
> 
>      GLOBAL_STATE_CODE();
> @@ -5405,17 +5401,6 @@ bdrv_replace_node_common(BlockDriverState
> *from, BlockDriverState *to,
>      assert(to->quiesce_counter);
>      assert(bdrv_get_aio_context(from) == bdrv_get_aio_context(to));
> 
> -    if (detach_subchain) {
> -        assert(bdrv_chain_contains(from, to));
> -        assert(from != to);
> -        for (to_cow_parent = from;
> -             bdrv_filter_or_cow_bs(to_cow_parent) != to;
> -             to_cow_parent = bdrv_filter_or_cow_bs(to_cow_parent))
> -        {
> -            ;
> -        }
> -    }
> -
>      /*
>       * Do the replacement without permission update.
>       * Replacement may influence the permissions, we should calculate new
> @@ -5427,11 +5412,6 @@ bdrv_replace_node_common(BlockDriverState
> *from, BlockDriverState *to,
>          goto out;
>      }
> 
> -    if (detach_subchain) {
> -        /* to_cow_parent is already drained because from is drained */
> -        bdrv_remove_child(bdrv_filter_or_cow_child(to_cow_parent), tran);
> -    }
> -
>      refresh_list = g_slist_prepend(refresh_list, to);
>      refresh_list = g_slist_prepend(refresh_list, from);
> 
> @@ -5450,7 +5430,7 @@ out:
>  int bdrv_replace_node(BlockDriverState *from, BlockDriverState *to,
>                        Error **errp)
>  {
> -    return bdrv_replace_node_common(from, to, true, false, errp);
> +    return bdrv_replace_node_common(from, to, true, errp);
>  }
> 
>  int bdrv_drop_filter(BlockDriverState *bs, Error **errp)
> @@ -5466,7 +5446,7 @@ int bdrv_drop_filter(BlockDriverState *bs, Error **errp)
> 
>      bdrv_drained_begin(child_bs);
>      bdrv_graph_wrlock();
> -    ret = bdrv_replace_node_common(bs, child_bs, true, true, errp);
> +    ret = bdrv_replace_node_common(bs, child_bs, true, errp);
>      bdrv_graph_wrunlock();
>      bdrv_drained_end(child_bs);
> 
> @@ -5917,17 +5897,7 @@ int bdrv_drop_intermediate(BlockDriverState
> *top, BlockDriverState *base,
>          updated_children = g_slist_prepend(updated_children, c);
>      }
> 
> -    /*
> -     * It seems correct to pass detach_subchain=true here, but it triggers
> -     * one more yet not fixed bug, when due to nested aio_poll loop
> we switch to
> -     * another drained section, which modify the graph (for example, removing
> -     * the child, which we keep in updated_children list). So, it's a TODO.
> -     *
> -     * Note, bug triggered if pass detach_subchain=true here and run
> -     * test-bdrv-drain. test_drop_intermediate_poll() test-case will crash.
> -     * That's a FIXME.
> -     */
> -    bdrv_replace_node_common(top, base, false, false, &local_err);
> +    bdrv_replace_node_common(top, base, false, &local_err);
>      bdrv_graph_wrunlock();
> 
>      if (local_err) {
> 
> > Vladimir, do you remember what the specific problem was?
> 
> I'd be happy to submit this if Vladimir is happy that it won't cause
> other issues
> (I've run make check-block with it and saw no failures).
Ok, that's a good start at least. And I see that before commit bcc8584,
bdrv_cor_filter_drop() didn't have this detaching behaviour either, it
used plain bdrv_replace_node(). So it might actually have been
unintentional here.
The other user is bdrv_cbw_drop(), and that one did have an explicit
bdrv_set_backing_hd(bs, NULL, &error_abort); before switching to
bdrv_drop_filter() in commit b75d64b. Not sure if it was right before,
but at least it always behaved like this, and we would change it now
rather than reverting to an older state.
Let's see if others have something to say on this.
Kevin
^ permalink raw reply	[flat|nested] 9+ messages in thread
* Re: [PATCH] block: Add missing null checks during query-named-block-nodes
  2025-10-27  9:55 ` Kevin Wolf
  2025-10-27 16:04   ` Wesley Hershberger
@ 2025-10-27 19:44   ` Vladimir Sementsov-Ogievskiy
  2025-10-28 10:29     ` Kevin Wolf
  1 sibling, 1 reply; 9+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2025-10-27 19:44 UTC (permalink / raw)
  To: Kevin Wolf, Wesley Hershberger; +Cc: qemu-devel, Hanna Reitz, qemu-block
On 27.10.25 12:55, Kevin Wolf wrote:
> Am 24.10.2025 um 20:07 hat Wesley Hershberger geschrieben:
>> Some operations insert an implicit node above the top node in the block
>> graph (e.g. block-stream or blockdev-backup). The implicit node is
>> removed when the operation finishes. If QMP query-named-block-nodes is
>> called while the operation is removing the implicit node, it may observe
>> a node with no children, causing a segfault.
> 
> How can a QMP command run in the middle of removing a node? Is this the
> real bug?
> 
> You say block-stream is affected, so let's look at stream.c. The
> interesting part here is bdrv_cor_filter_drop(s->cor_filter_bs).
> 
> bdrv_drop_filter() calls bdrv_drained_begin() and bdrv_graph_wrlock(),
> which can run arbitrary callbacks, but not QMP commands. It's too eary
> anyway, the filter is still in the graph at this point.
> 
> Between bdrv_replace_node_common(), which removes the node from its
> parent, and bdrv_unref(cor_filter_bs), I don't see any place that could
> run a QMP command.
> 
> Does cor_filter_bs have a refcount > 1 before running stream_prepare()
> or stream_clean()?
> 
> Aha, your previous commit message [1] is actually clearer on this:
> 
>      The cor_filter_bs was added to the blockjob as the main BDS (by
>      passing it to block_job_create), so bdrv_cor_filter_drop doesn't
>      actually remove it from global_bdrv_states.
> 
> [1] https://patchew.org/QEMU/20251021-fix-3149-v2-1-5ffbe701e964@canonical.com/
> 
> So we _are_ creating a state in which cor_filter_bs still exists, but
> doesn't have a child any more. Which is rather untypical for a filter
> (in fact, it's against the definition of a filter). And your two
> different patches address this from two different angles:
> 
> 1. Don't even let this situation arise. We need to make sure that
>     cor_filter_bs never exists without a child - or at least, that it's
>     happening only for a short time while we know that no other code is
>     running. This is what your previous patch attempted.
> 
> 2. Make sure that everything else in QEMU can deal with a filter node
>     that doesn't have a child. This is what this one does.
> 
> Hanna, do you have an opinion on these two options?
> 
> I'm not sure myself, but I see that both aren't mutually exclusive. I
> would say that having 1. is certainly a good thing that makes everything
> else simpler and less likely to fail, so that's the one I would take in
> any case. I'm not completely sure if that means v2 of "stream: Remove
> bdrv from job in .clean()", or if another version that just removes this
> one node from the job would be better. We do have bdrv_drain_all_begin()
> later in stream_prepare(), which must be assumed to run any sorts of
> callbacks, so removing the node in stream_clean() might be too late in
> some cases.
> 
> And then we could think of having this patch for 2., probably split
> into two patches - though what tells us that this is complete? I would
> be surprised if there aren't more places in QEMU that assume that a
> filter node has exactly one child. So I think in this case we would have
> to audit the rest of the block layer to make sure we caught all of them.
> 
> Hm, I think I am relatively sure now actually that 2. is a bad idea...
> 
> So what if we don't actually do 2., but then add an assertion to
> bdrv_cor_filter_drop() that verifies that the refcount is 1 before
> dropping the filter node? This should help us make sure that the patch
> for 1. actually does what it's supposed to do.
> 
>> This is hypothesized to only affect the block-stream operation as other
>> operations use the workaround bdc4c4c5e372756a5ba3fb3a61e585b02f0dd7f4
>> or do not detach their children during cleanup (see
>> 3108a15cf09865456d499b08fe14e3dbec4ccbb3).
>>
>> This backtrace was observed in #3149 on a relatively recent build. The
>> bs passed to bdrv_refresh_filename is the cor_filter_bs from the
>> block-stream operation; bs->implicit was "true".
>>
>> 0  bdrv_refresh_filename (bs=0x5efed72f8350)
>>      at /usr/src/qemu-1:10.1.0+ds-5ubuntu2/b/qemu/block.c:8082
>> 1  0x00005efea73cf9dc in bdrv_block_device_info
>>      (blk=0x0, bs=0x5efed72f8350, flat=true, errp=0x7ffeb829ebd8)
>>      at block/qapi.c:62
>> 2  0x00005efea7391ed3 in bdrv_named_nodes_list
>>      (flat=<optimized out>, errp=0x7ffeb829ebd8)
>>      at /usr/src/qemu-1:10.1.0+ds-5ubuntu2/b/qemu/block.c:6275
>> 3  0x00005efea7471993 in qmp_query_named_block_nodes
>>      (has_flat=<optimized out>, flat=<optimized out>, errp=0x7ffeb829ebd8)
>>      at /usr/src/qemu-1:10.1.0+ds-5ubuntu2/b/qemu/blockdev.c:2834
>> 4  qmp_marshal_query_named_block_nodes
>>      (args=<optimized out>, ret=0x7f2b753beec0, errp=0x7f2b753beec8)
>>      at qapi/qapi-commands-block-core.c:553
>> 5  0x00005efea74f03a5 in do_qmp_dispatch_bh (opaque=0x7f2b753beed0)
>>      at qapi/qmp-dispatch.c:128
>> 6  0x00005efea75108e6 in aio_bh_poll (ctx=0x5efed6f3f430)
>>      at util/async.c:219
>> ...
> 
> This is one change...
> 
>> The get_allocated_file_size change resolves a second segfault after the
>> first was resolved. Here, bdrv_filter_bs returns NULL as the
>> bs (cor_filter_bs) has no children:
>>
>> 0  bdrv_co_get_allocated_file_size (bs=<optimized out>)
>>      at /usr/src/qemu-1:10.1.0+ds-5ubuntu2+test8/b/qemu/block.c:6018
>> 1  0x0000631d078522be in bdrv_co_get_allocated_file_size_entry
>>      (opaque=0x7ffd375c5dd0) at block/block-gen.c:288
>> 2  0x0000631d07929ec2 in coroutine_trampoline
>>      (i0=<optimized out>, i1=<optimized out>)
>>      at util/coroutine-ucontext.c:175
>> ...
> 
> ...and this is logically a separate one.
> 
> So if we were to go down this route, I think we would have a full patch
> serie to make filter nodes without a child safe. These two parts would
> be separate patches, and I'm almost sure that we would get more patches
> for the series to make it fully safe everywhere.
> 
> As I said above, I think this is too hard to get correct to be a good
> idea, though.
> 
>> Resolves: https://gitlab.com/qemu-project/qemu/-/issues/3149
>> Buglink: https://bugs.launchpad.net/bugs/2126951
>> Suggested-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
>> Signed-off-by: Wesley Hershberger <wesley.hershberger@canonical.com>
>> ---
>> As discussed in the previous thread:
>> https://lists.gnu.org/archive/html/qemu-devel/2025-10/msg05458.html
>>
>> This resolves the issue with my reproducer.
>>
>> make check-block passes locally.
>>
>> Please let me know if any adjustments to the patch are needed.
>>
>> Thanks for the quick & helpful reviews!
> 
> Sorry, Wesley, that this turns out to be a bit more complicated! We'll
> probably need some further discussion before we can know if or which
> adjustments to the patch are needed.
> 
> Before I send this, I just had another thought: Why does copy-on-read
> even drop the child in bdrv_cor_filter_drop()? Wouldn't the normal
> mode of operation be that for a short time you have both the cor node
> and its parent point to the same child node, and that would just work?
> I see commit messages about conflicting node permissions (3108a15cf09 by
> Vladimir), but I don't understand this. A filter without parents
> shouldn't try to take any permissions.
> 
> So another option for never letting the situation arise would be that
> cor_filter_bs just keeps its child until it's really deleted.
> 
> Vladimir, do you remember what the specific problem was?
> 
Hmm.. That's not simple to restore it now)Seems it should be about block-job
blk, which is the conflicting parent. But for backup, we've already had
this block_job_remove_all_bdrv() even at time of 3108a15cf09. Interesting...
The next commit after 3108a15cf09 is:
b75d64b3290136 "block/backup-top: drop .active"
and it introduces use of new interface, and drop old hack:
@@ -283,16 +245,9 @@ void bdrv_backup_top_drop(BlockDriverState *bs)
  {
      BDRVBackupTopState *s = bs->opaque;
-    bdrv_drained_begin(bs);
+    bdrv_drop_filter(bs, &error_abort);
      block_copy_state_free(s->bcs);
-    s->active = false;
-    bdrv_child_refresh_perms(bs, bs->backing, &error_abort);
-    bdrv_replace_node(bs, bs->backing->bs, &error_abort);
-    bdrv_set_backing_hd(bs, NULL, &error_abort);
-
-    bdrv_drained_end(bs);
-
      bdrv_unref(bs);
  }
Good. So, I go to b75d64b3290136 and do:
--- a/block.c
+++ b/block.c
@@ -5182,7 +5182,7 @@ int bdrv_replace_node(BlockDriverState *from, BlockDriverState *to,
  int bdrv_drop_filter(BlockDriverState *bs, Error **errp)
  {
-    return bdrv_replace_node_common(bs, bdrv_filter_or_cow_bs(bs), true, true,
+    return bdrv_replace_node_common(bs, bdrv_filter_or_cow_bs(bs), true, false,
                                      errp);
  }
This breaks most of iotests found by "git grep backup", it breaks:
    028 055 056 124 129 185 222 256 257 264 281 304
at least, output of 185 is readable:
+QEMU_PROG: Conflicts with use by #block1177 as 'backing', which does not allow 'write' on #block151
and looking at backup_top_child_perm(), it obvious:
     } else {
         /* Source child */
         bdrv_default_perms(bs, c, role, reopen_queue,
                            perm, shared, nperm, nshared);
         if (perm & BLK_PERM_WRITE) {
             *nperm = *nperm | BLK_PERM_CONSISTENT_READ;
         }
         *nshared &= ~(BLK_PERM_WRITE | BLK_PERM_RESIZE);
     }
We've unshared write-perm unconditionally, not caring about parents.
Later in
3860c0201924d "block/copy-before-write: relax permission requirements when no parents"
we move to current behavior: don't unshare write, when no parents. And the commit say:
     Currently, filter unshares write permission unconditionally on source
     node. It's good, but it will not allow to do blockdev-add. So, let's
     relax restrictions when filter doesn't have any parent.
Was it really good or not? At that moment, I thought it was good. Now I'm not so sure..
Should filter guarantee, that some other parents will not pass it by? Or the filter should
care only to block writes, which explicitly go through it? The latter seems
maybe less safe, but actually, more native for "filters".
Anyway, starting from 3860c0201924d, permission restrictions become less aggressive, let's
check, staying at 3860c0201924d, and with same hack to bdrv_drop_filter:
check -qcow2 028 055 056 124 129 141 185 219 222 256 257 264 281 283 294 304
- only 257 fails. (why?)
Double check with 3860c0201924d^: again, almost all fails:
Failures: 028 055 056 124 129 185 222 256 257 264 281 304
And same with current master: 257 fails, if hack bdrv_drop_filter() to pass detach_subchain=false to bdrv_replace_node_common().
So to summarize: most probably, 3108a15cf09 was needed because I wanted copy-before-write
filter (backup-top in that past time) to unshare WRITE permission unconditionally, to be
extremely safe. Later, I had to drop this restriction anyway, but bdrv_drop_filter() was
kept as is (I doubt, that there were some specific reason, most probably I just didn't
think of it. And I considered 3860c0201924d as work-around, not as final good solution).
So, reverting 3108a15cf09 now is quite possible, we just will have to fix 257.
-- 
Best regards,
Vladimir
^ permalink raw reply	[flat|nested] 9+ messages in thread
* Re: [PATCH] block: Add missing null checks during query-named-block-nodes
  2025-10-27 18:45     ` Kevin Wolf
@ 2025-10-27 20:42       ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 9+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2025-10-27 20:42 UTC (permalink / raw)
  To: Kevin Wolf, Wesley Hershberger; +Cc: qemu-devel, Hanna Reitz, qemu-block
On 27.10.25 21:45, Kevin Wolf wrote:
> Am 27.10.2025 um 17:04 hat Wesley Hershberger geschrieben:
>> On Mon, Oct 27, 2025 at 4:55 AM Kevin Wolf <kwolf@redhat.com> wrote:
>>>
>>> Am 24.10.2025 um 20:07 hat Wesley Hershberger geschrieben:
>>>
>>> Sorry, Wesley, that this turns out to be a bit more complicated! We'll
>>> probably need some further discussion before we can know if or which
>>> adjustments to the patch are needed.
>>
>> Hi Kevin, thanks for your thoughts.
>>
>>> Before I send this, I just had another thought: Why does copy-on-read
>>> even drop the child in bdrv_cor_filter_drop()? Wouldn't the normal
>>> mode of operation be that for a short time you have both the cor node
>>> and its parent point to the same child node, and that would just work?
>>> I see commit messages about conflicting node permissions (3108a15cf09 by
>>> Vladimir), but I don't understand this. A filter without parents
>>> shouldn't try to take any permissions.
>>>
>>> So another option for never letting the situation arise would be that
>>> cor_filter_bs just keeps its child until it's really deleted.
>>
>> It sounds like you'd be open to a patch that partially reverts 3108a15cf09?
> 
> Hm, yes and no. Primarily I'm trying to find out what our options are as
> the next step. So in this case, why did we make the change back then,
> and can the same problem it was supposed to solve still be solved in a
> different way? If so, then yes, we can essentially revert it. But I'm
> not proposing that we skip the step of understanding and just go ahead,
> change the code and then wait and see what breaks.
> 
> Maybe Vladimir can remember, that would be the easiest. Otherwise, maybe
> some mailing list archeology can help. If none of these lead to
> anything, we'd still have to thoroughly think through the consequences
> for all callers before we can conclude that changing things this way is
> safe.
> 
>> In my local testing I did verify that preventing the child from being
>> detached resolves the bug and am currently double-checking it with
>> this diff:
>>
>> diff --git a/block.c b/block.c
>> index 8848e9a7ed..72261ea1d4 100644
>> --- a/block.c
>> +++ b/block.c
>> @@ -5386,17 +5386,13 @@ bdrv_replace_node_noperm(BlockDriverState *from,
>>    *
>>    * With auto_skip=false the error is returned if from has a parent which should
>>    * not be updated.
>> - *
>> - * With @detach_subchain=true @to must be in a backing chain of @from. In this
>> - * case backing link of the cow-parent of @to is removed.
>>    */
>>   static int GRAPH_WRLOCK
>>   bdrv_replace_node_common(BlockDriverState *from, BlockDriverState *to,
>> -                         bool auto_skip, bool detach_subchain, Error **errp)
>> +                         bool auto_skip, Error **errp)
>>   {
>>       Transaction *tran = tran_new();
>>       g_autoptr(GSList) refresh_list = NULL;
>> -    BlockDriverState *to_cow_parent = NULL;
>>       int ret;
>>
>>       GLOBAL_STATE_CODE();
>> @@ -5405,17 +5401,6 @@ bdrv_replace_node_common(BlockDriverState
>> *from, BlockDriverState *to,
>>       assert(to->quiesce_counter);
>>       assert(bdrv_get_aio_context(from) == bdrv_get_aio_context(to));
>>
>> -    if (detach_subchain) {
>> -        assert(bdrv_chain_contains(from, to));
>> -        assert(from != to);
>> -        for (to_cow_parent = from;
>> -             bdrv_filter_or_cow_bs(to_cow_parent) != to;
>> -             to_cow_parent = bdrv_filter_or_cow_bs(to_cow_parent))
>> -        {
>> -            ;
>> -        }
>> -    }
>> -
>>       /*
>>        * Do the replacement without permission update.
>>        * Replacement may influence the permissions, we should calculate new
>> @@ -5427,11 +5412,6 @@ bdrv_replace_node_common(BlockDriverState
>> *from, BlockDriverState *to,
>>           goto out;
>>       }
>>
>> -    if (detach_subchain) {
>> -        /* to_cow_parent is already drained because from is drained */
>> -        bdrv_remove_child(bdrv_filter_or_cow_child(to_cow_parent), tran);
>> -    }
>> -
>>       refresh_list = g_slist_prepend(refresh_list, to);
>>       refresh_list = g_slist_prepend(refresh_list, from);
>>
>> @@ -5450,7 +5430,7 @@ out:
>>   int bdrv_replace_node(BlockDriverState *from, BlockDriverState *to,
>>                         Error **errp)
>>   {
>> -    return bdrv_replace_node_common(from, to, true, false, errp);
>> +    return bdrv_replace_node_common(from, to, true, errp);
>>   }
>>
>>   int bdrv_drop_filter(BlockDriverState *bs, Error **errp)
>> @@ -5466,7 +5446,7 @@ int bdrv_drop_filter(BlockDriverState *bs, Error **errp)
>>
>>       bdrv_drained_begin(child_bs);
>>       bdrv_graph_wrlock();
>> -    ret = bdrv_replace_node_common(bs, child_bs, true, true, errp);
>> +    ret = bdrv_replace_node_common(bs, child_bs, true, errp);
>>       bdrv_graph_wrunlock();
>>       bdrv_drained_end(child_bs);
>>
>> @@ -5917,17 +5897,7 @@ int bdrv_drop_intermediate(BlockDriverState
>> *top, BlockDriverState *base,
>>           updated_children = g_slist_prepend(updated_children, c);
>>       }
>>
>> -    /*
>> -     * It seems correct to pass detach_subchain=true here, but it triggers
>> -     * one more yet not fixed bug, when due to nested aio_poll loop
>> we switch to
>> -     * another drained section, which modify the graph (for example, removing
>> -     * the child, which we keep in updated_children list). So, it's a TODO.
>> -     *
>> -     * Note, bug triggered if pass detach_subchain=true here and run
>> -     * test-bdrv-drain. test_drop_intermediate_poll() test-case will crash.
>> -     * That's a FIXME.
>> -     */
>> -    bdrv_replace_node_common(top, base, false, false, &local_err);
>> +    bdrv_replace_node_common(top, base, false, &local_err);
>>       bdrv_graph_wrunlock();
>>
>>       if (local_err) {
>>
>>> Vladimir, do you remember what the specific problem was?
>>
>> I'd be happy to submit this if Vladimir is happy that it won't cause
>> other issues
>> (I've run make check-block with it and saw no failures).
> 
> Ok, that's a good start at least. And I see that before commit bcc8584,
> bdrv_cor_filter_drop() didn't have this detaching behaviour either, it
> used plain bdrv_replace_node(). So it might actually have been
> unintentional here.
Note that bcc8584 also drops hack with .active field to manipulate on
permissions of copy-on-read filter.
Still, I hope, that was mostly about inserting of the filter, and
bdrv_replace_node() could be kept as is, but bdrv_drop_filter() word
looked better in context (why not, if we have appropriate function).
Let's check. Hmm staying at bcc8584, and changing
--- a/block.c
+++ b/block.c
@@ -4920,6 +4920,12 @@ int bdrv_drop_filter(BlockDriverState *bs, Error **errp)
                                      errp);
  }
+int bdrv_drop_filter1(BlockDriverState *bs, Error **errp)
+{
+    return bdrv_replace_node_common(bs, bdrv_filter_or_cow_bs(bs), true, false,
+                                    errp);
+}
+
--- a/block/copy-on-read.c
+++ b/block/copy-on-read.c
@@ -274,7 +274,7 @@ void bdrv_cor_filter_drop(BlockDriverState *cor_filter_bs)
          s->chain_frozen = false;
          bdrv_unfreeze_backing_chain(cor_filter_bs, s->bottom_bs);
      }
-    bdrv_drop_filter(cor_filter_bs, &error_abort);
+    bdrv_drop_filter1(cor_filter_bs, &error_abort);
- nothing change for iotests. 030 and 233 fails anyway. 233 is unrelated,
030 is about block-stream.. But it fails anyway.
And on master
--- a/block.c
+++ b/block.c
@@ -5466,7 +5466,7 @@ int bdrv_drop_filter(BlockDriverState *bs, Error **errp)
      bdrv_drained_begin(child_bs);
      bdrv_graph_wrlock();
-    ret = bdrv_replace_node_common(bs, child_bs, true, true, errp);
+    ret = bdrv_replace_node_common(bs, child_bs, true, false, errp);
      bdrv_graph_wrunlock();
      bdrv_drained_end(child_bs);
breaks only 257 for me
> 
> The other user is bdrv_cbw_drop(), and that one did have an explicit
> bdrv_set_backing_hd(bs, NULL, &error_abort); before switching to
> bdrv_drop_filter() in commit b75d64b. Not sure if it was right before,
> but at least it always behaved like this, and we would change it now
> rather than reverting to an older state.
> 
> Let's see if others have something to say on this.
> 
For me the idea to (try to) revert 3108a15cf09 seems correct. At least,
I can't remember or find (see my paraller answer about copy-before-write)
any contradictions. The only obvious obstacle is iotest 257.
-- 
Best regards,
Vladimir
^ permalink raw reply	[flat|nested] 9+ messages in thread
* Re: [PATCH] block: Add missing null checks during query-named-block-nodes
  2025-10-27 16:04   ` Wesley Hershberger
  2025-10-27 18:45     ` Kevin Wolf
@ 2025-10-27 20:55     ` Vladimir Sementsov-Ogievskiy
  1 sibling, 0 replies; 9+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2025-10-27 20:55 UTC (permalink / raw)
  To: Wesley Hershberger, Kevin Wolf; +Cc: qemu-devel, Hanna Reitz, qemu-block
On 27.10.25 19:04, Wesley Hershberger wrote:
> (I've run make check-block with it and saw no failures).
When changing internals of block-layer, it's better to run all iotests, at least for qcow2, raw and nbd. check-block runs only subset.
Like
cd build
./tests/qemu-iotests/check -qcow2
./tests/qemu-iotests/check -raw
./tests/qemu-iotests/check -nbd
At least,
    ./tests/qemu-iotests/check -qcow2 257
fails with your diff.
-- 
Best regards,
Vladimir
^ permalink raw reply	[flat|nested] 9+ messages in thread
* Re: [PATCH] block: Add missing null checks during query-named-block-nodes
  2025-10-27 19:44   ` Vladimir Sementsov-Ogievskiy
@ 2025-10-28 10:29     ` Kevin Wolf
  0 siblings, 0 replies; 9+ messages in thread
From: Kevin Wolf @ 2025-10-28 10:29 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy
  Cc: Wesley Hershberger, qemu-devel, Hanna Reitz, qemu-block, jsnow
Am 27.10.2025 um 20:44 hat Vladimir Sementsov-Ogievskiy geschrieben:
> Anyway, starting from 3860c0201924d, permission restrictions become less aggressive, let's
> check, staying at 3860c0201924d, and with same hack to bdrv_drop_filter:
> 
> check -qcow2 028 055 056 124 129 141 185 219 222 256 257 264 281 283 294 304
> 
> - only 257 fails. (why?)
> 
> Double check with 3860c0201924d^: again, almost all fails:
> 
> Failures: 028 055 056 124 129 185 222 256 257 264 281 304
> 
> 
> 
> And same with current master: 257 fails, if hack bdrv_drop_filter() to
> pass detach_subchain=false to bdrv_replace_node_common().
I'm trying to understand the 257, but that seems to be a fairly
complicated test case.
Looking at the first failing part, this uses error injection with a
blkdebug configuration where one single I/O error is injected on the
second read after the first flush. This feels very specific, like it's
targeting a specific operation to fail, but it doesn't really document
what it is.
John, I'm sure that after six years, you remember the details! :-)
Anyway, the difference that we're seeing after changing
bdrv_drop_filter() to pass detach_subchain=false is that now the error
seems to be triggered earlier. The same error that we get in the bad
output happens in the reference output, too, but only during "Test
Backup #1" rather than "Reference Backup #1".
So taking a shot in the dark, I tried failing the second read after the
_second_ flush (patch see below), and that fixes the test apart from the
changed blockdev-add commands. So not detaching the cor node causes an
additional flush on the image. I suspect this might be the flush in
bdrv_close() when the cor node is deleted?
Anyway, an additional flush is harmless, so I think we should be good if
we just change the test case.
Kevin
diff --git a/block.c b/block.c
index cf08e64add..982371e735 100644
--- a/block.c
+++ b/block.c
@@ -5478,7 +5478,7 @@ int bdrv_drop_filter(BlockDriverState *bs, Error **errp)
     bdrv_drained_begin(child_bs);
     bdrv_graph_wrlock();
-    ret = bdrv_replace_node_common(bs, child_bs, true, true, errp);
+    ret = bdrv_replace_node_common(bs, child_bs, true, false, errp);
     bdrv_graph_wrunlock();
     bdrv_drained_end(child_bs);
diff --git a/tests/qemu-iotests/257 b/tests/qemu-iotests/257
index 7d3720b8e5..dffc3ba0a4 100755
--- a/tests/qemu-iotests/257
+++ b/tests/qemu-iotests/257
@@ -310,14 +310,18 @@ def test_bitmap_sync(bsync_mode, msync_mode='bitmap', failure=None):
                     'state': 1,
                     'new_state': 2
                 }, {
-                    'event': 'read_aio',
+                    'event': 'flush_to_disk',
                     'state': 2,
                     'new_state': 3
+                }, {
+                    'event': 'read_aio',
+                    'state': 3,
+                    'new_state': 4
                 }],
                 'inject-error': [{
                     'event': 'read_aio',
                     'errno': 5,
-                    'state': 3,
+                    'state': 4,
                     'immediately': False,
                     'once': True
                 }]
^ permalink raw reply related	[flat|nested] 9+ messages in thread
end of thread, other threads:[~2025-10-28 10:30 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-10-24 18:07 [PATCH] block: Add missing null checks during query-named-block-nodes Wesley Hershberger
2025-10-25  8:27 ` Vladimir Sementsov-Ogievskiy
2025-10-27  9:55 ` Kevin Wolf
2025-10-27 16:04   ` Wesley Hershberger
2025-10-27 18:45     ` Kevin Wolf
2025-10-27 20:42       ` Vladimir Sementsov-Ogievskiy
2025-10-27 20:55     ` Vladimir Sementsov-Ogievskiy
2025-10-27 19:44   ` Vladimir Sementsov-Ogievskiy
2025-10-28 10:29     ` Kevin Wolf
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).