From: Peter Xu <peterx@redhat.com>
To: Lukas Straub <lukasstraub2@web.de>
Cc: qemu-devel@nongnu.org, Fabiano Rosas <farosas@suse.de>,
	Laurent Vivier <lvivier@redhat.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	Zhang Chen <zhangckid@gmail.com>,
	Hailiang Zhang <zhanghailiang@xfusion.com>,
	Markus Armbruster <armbru@redhat.com>,
	Li Zhijian <lizhijian@fujitsu.com>,
	"Dr. David Alan Gilbert" <dave@treblig.org>
Subject: Re: [PATCH v3 06/10] migration-test: Add COLO migration unit test
Date: Mon, 2 Feb 2026 09:26:06 -0500
Message-ID: <aYCz_se6Ji2g5d_L@x1.local>
In-Reply-To: <20260130112402.2c008707@penguin>

On Fri, Jan 30, 2026 at 11:24:02AM +0100, Lukas Straub wrote:
> On Tue, 27 Jan 2026 15:49:31 -0500
> Peter Xu <peterx@redhat.com> wrote:
> 
> > On Sun, Jan 25, 2026 at 09:40:11PM +0100, Lukas Straub wrote:
> > > +void migration_test_add_colo(MigrationTestEnv *env)
> > > +{
> > > +    if (!env->has_kvm) {
> > > +        g_test_skip("COLO requires KVM accelerator");
> > > +        return;
> > > +    }  
> > 
> > I'm OK if you want to explicitly bypass others, but could you explain
> > why?
> > 
> > Thanks,
> > 
> 
> It used to hang with TCG. Now it crashes, since
> migration_bitmap_sync_precopy assumes bql is held. Something for later.

If we want to keep COLO around and be serious about it, let's try to hold
COLO to the same standard we target for migration in general whenever
possible.  We shouldn't randomly work around bugs.  We should fix them.

It looks to me like there's a locking issue instead.

An iterator's complete() requires the BQL.  Would a patch like the one
below make sense to you?

diff --git a/migration/colo.c b/migration/colo.c
index db783f6fa7..b3ea137120 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -458,8 +458,8 @@ static int colo_do_checkpoint_transaction(MigrationState *s,
     /* Note: device state is saved into buffer */
     ret = qemu_save_device_state(fb);
 
-    bql_unlock();
     if (ret < 0) {
+        bql_unlock();
         goto out;
     }
 
@@ -473,6 +473,9 @@ static int colo_do_checkpoint_transaction(MigrationState *s,
      */
     qemu_savevm_live_state(s->to_dst_file);
 
+    /* Save live state requires BQL */
+    bql_unlock();
+
     qemu_fflush(fb);
 
     /*
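
As a rough illustration of the ordering problem (a toy model only, not
QEMU code): the sketch below uses a pthread mutex and a per-thread flag as
stand-ins for the BQL and bql_locked().  All toy_* names are hypothetical.
toy_save_live_state() only reports whether its caller holds the lock,
standing in for a completion path that expects the BQL to be held.  The
error path of the real function is left out.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Toy stand-in for the BQL: one mutex plus a per-thread "held" flag. */
static pthread_mutex_t toy_bql = PTHREAD_MUTEX_INITIALIZER;
static _Thread_local bool toy_bql_held;

static bool toy_bql_locked(void)
{
    return toy_bql_held;
}

static void toy_bql_lock(void)
{
    assert(!toy_bql_locked());
    pthread_mutex_lock(&toy_bql);
    toy_bql_held = true;
}

static void toy_bql_unlock(void)
{
    /* Unlocking a lock this thread does not hold is a bug. */
    assert(toy_bql_locked());
    toy_bql_held = false;
    pthread_mutex_unlock(&toy_bql);
}

/*
 * Hypothetical completion step.  Returns true if the caller met the
 * precondition (lock held), false otherwise.
 */
static bool toy_save_live_state(void)
{
    return toy_bql_locked();
}

/* Old ordering: drop the lock right after saving device state. */
static bool toy_checkpoint_old_order(void)
{
    bool ok;

    toy_bql_lock();
    /* ... device state would be saved here ... */
    toy_bql_unlock();
    ok = toy_save_live_state();     /* runs without the lock */
    return ok;
}

/* New ordering: keep the lock until the live state has been saved. */
static bool toy_checkpoint_new_order(void)
{
    bool ok;

    toy_bql_lock();
    /* ... device state would be saved here ... */
    ok = toy_save_live_state();     /* runs with the lock held */
    toy_bql_unlock();
    return ok;
}
```

Under these assumptions, toy_checkpoint_old_order() returns false and
toy_checkpoint_new_order() returns true when called from a main(), which
mirrors moving bql_unlock() to after qemu_savevm_live_state() in the patch
above.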

> 
> #6  0x00007ffff7471517 in __assert_fail
>     (assertion=assertion@entry=0x555555f17aee "bql_locked() != locked", file=file@entry=0x555555f17ab0 "../system/cpus.c", line=line@entry=535, function=function@entry=0x55555609bfd0 <__PRETTY_FUNCTION__.9> "bql_update_status") at ./assert/assert.c:105
> #7  0x0000555555b09f1e in bql_update_status (locked=locked@entry=false) at ../system/cpus.c:535
> #8  0x0000555555ec60e7 in qemu_mutex_pre_unlock (mutex=0x555557166700 <bql>, file=0x555555efe1dc "../cpu-common.c", line=164) at ../util/qemu-thread-common.h:57
> #9  qemu_mutex_pre_unlock (line=164, file=0x555555efe1dc "../cpu-common.c", mutex=0x555557166700 <bql>) at ../util/qemu-thread-common.h:48
> #10 qemu_cond_wait_impl (cond=0x5555571442c0 <qemu_work_cond>, mutex=0x555557166700 <bql>, file=0x555555efe1dc "../cpu-common.c", line=164) at ../util/qemu-thread-posix.c:224
> #11 0x000055555589e6c8 in do_run_on_cpu (cpu=<optimized out>, func=<optimized out>, data=..., mutex=0x555557166700 <bql>) at ../cpu-common.c:164
> #12 0x0000555555b17a06 in memory_global_after_dirty_log_sync () at ../system/memory.c:2938
> #13 0x0000555555b55b47 in migration_bitmap_sync (rs=0x7fffe8001340, last_stage=last_stage@entry=true) at ../migration/ram.c:1157
> #14 0x0000555555b56721 in migration_bitmap_sync_precopy (last_stage=last_stage@entry=true) at ../migration/ram.c:1195
> #15 0x0000555555b59f8a in ram_save_complete (f=0x5555575db620, opaque=<optimized out>) at ../migration/ram.c:3381
> #16 0x0000555555b5e4f5 in qemu_savevm_complete (se=se@entry=0x5555574c0d80, f=f@entry=0x5555575db620) at ../migration/savevm.c:1521
> #17 0x0000555555b60437 in qemu_savevm_state_complete_precopy_iterable (f=f@entry=0x5555575db620, in_postcopy=in_postcopy@entry=false) at ../migration/savevm.c:1627
> #18 0x0000555555b60a4f in qemu_savevm_state_complete_precopy (iterable_only=true, f=0x5555575db620) at ../migration/savevm.c:1719
> #19 qemu_savevm_live_state (f=0x5555575db620) at ../migration/savevm.c:1855
> #20 0x0000555555b65ed9 in colo_do_checkpoint_transaction (fb=<optimized out>, bioc=<optimized out>, s=0x5555574c0070) at ../migration/colo.c:474
> #21 colo_process_checkpoint (s=0x5555574c0070) at ../migration/colo.c:592
> #22 migrate_start_colo_process (s=0x5555574c0070) at ../migration/colo.c:655
> #23 0x0000555555b4971e in migration_iteration_finish (s=0x5555574c0070) at ../migration/migration.c:3297
> #24 migration_thread (opaque=opaque@entry=0x5555574c0070) at ../migration/migration.c:3584
> #25 0x0000555555ec58c0 in qemu_thread_start (args=0x5555576583e0) at ../util/qemu-thread-posix.c:393
> #26 0x00007ffff74d2aa4 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:447
> #27 0x00007ffff755fc6c in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:78



-- 
Peter Xu


