* Re: [PATCH v2] migration/colo: Fix bdrv_graph_rdlock_main_loop: Assertion `!qemu_in_coroutine()' failed.
2024-04-17 2:56 [PATCH v2] migration/colo: Fix bdrv_graph_rdlock_main_loop: Assertion `!qemu_in_coroutine()' failed Li Zhijian via
@ 2024-04-17 6:13 ` Philippe Mathieu-Daudé
2024-04-17 6:47 ` Zhang, Chen
2024-04-17 7:02 ` Zhijian Li (Fujitsu) via
2024-04-17 13:13 ` Fabiano Rosas
` (3 subsequent siblings)
4 siblings, 2 replies; 10+ messages in thread
From: Philippe Mathieu-Daudé @ 2024-04-17 6:13 UTC (permalink / raw)
To: Li Zhijian, zhanghailiang, peterx, farosas
Cc: qemu-devel, chen.zhang, Wen Congyang, Xie Changlong
On 17/4/24 04:56, Li Zhijian via wrote:
> bdrv_activate_all() should not be called from the coroutine context, move
> it to the QEMU thread colo_process_incoming_thread() with the bql_lock
> protected.
>
> The backtrace is as follows:
> #4 0x0000561af7948362 in bdrv_graph_rdlock_main_loop () at ../block/graph-lock.c:260
> #5 0x0000561af7907a68 in graph_lockable_auto_lock_mainloop (x=0x7fd29810be7b) at /patch/to/qemu/include/block/graph-lock.h:259
> #6 0x0000561af79167d1 in bdrv_activate_all (errp=0x7fd29810bed0) at ../block.c:6906
> #7 0x0000561af762b4af in colo_incoming_co () at ../migration/colo.c:935
> #8 0x0000561af7607e57 in process_incoming_migration_co (opaque=0x0) at ../migration/migration.c:793
> #9 0x0000561af7adbeeb in coroutine_trampoline (i0=-106876144, i1=22042) at ../util/coroutine-ucontext.c:175
> #10 0x00007fd2a5cf21c0 in () at /lib64/libc.so.6
>
> CC: Fabiano Rosas <farosas@suse.de>
Cc: qemu-stable@nongnu.org
> Closes: https://gitlab.com/qemu-project/qemu/-/issues/2277
> Fixes: 2b3912f135 ("block: Mark bdrv_first_blk() and bdrv_is_root_node() GRAPH_RDLOCK")
> Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
> ---
> V2: fix missing bql_unlock() in error path.
> ---
> migration/colo.c | 18 ++++++++++--------
> 1 file changed, 10 insertions(+), 8 deletions(-)
>
> diff --git a/migration/colo.c b/migration/colo.c
> index 84632a603e..5600a43d78 100644
> --- a/migration/colo.c
> +++ b/migration/colo.c
> @@ -835,6 +835,16 @@ static void *colo_process_incoming_thread(void *opaque)
> return NULL;
> }
>
> + /* Make sure all file formats throw away their mutable metadata */
> + bql_lock();
Note there is also the convenient BQL_LOCK_GUARD() macro.
> + bdrv_activate_all(&local_err);
> + if (local_err) {
> + bql_unlock();
> + error_report_err(local_err);
> + return NULL;
> + }
> + bql_unlock();
> +
> failover_init_state();
^ permalink raw reply [flat|nested] 10+ messages in thread
* RE: [PATCH v2] migration/colo: Fix bdrv_graph_rdlock_main_loop: Assertion `!qemu_in_coroutine()' failed.
2024-04-17 6:13 ` Philippe Mathieu-Daudé
@ 2024-04-17 6:47 ` Zhang, Chen
2024-04-17 7:01 ` Philippe Mathieu-Daudé
2024-04-17 7:02 ` Zhijian Li (Fujitsu) via
1 sibling, 1 reply; 10+ messages in thread
From: Zhang, Chen @ 2024-04-17 6:47 UTC (permalink / raw)
To: Philippe Mathieu-Daudé, Li Zhijian, Zhang, Hailiang,
peterx@redhat.com, farosas@suse.de
Cc: qemu-devel@nongnu.org, Wen Congyang, Xie Changlong
> -----Original Message-----
> From: Philippe Mathieu-Daudé <philmd@linaro.org>
> Sent: Wednesday, April 17, 2024 2:14 PM
> To: Li Zhijian <lizhijian@fujitsu.com>; Zhang, Hailiang
> <zhanghailiang@xfusion.com>; peterx@redhat.com; farosas@suse.de
> Cc: qemu-devel@nongnu.org; Zhang, Chen <chen.zhang@intel.com>; Wen
> Congyang <wencongyang2@huawei.com>; Xie Changlong
> <xiechanglong.d@gmail.com>
> Subject: Re: [PATCH v2] migration/colo: Fix bdrv_graph_rdlock_main_loop:
> Assertion `!qemu_in_coroutine()' failed.
>
> On 17/4/24 04:56, Li Zhijian via wrote:
> > bdrv_activate_all() should not be called from the coroutine context,
> > move it to the QEMU thread colo_process_incoming_thread() with the
> > bql_lock protected.
> >
> > The backtrace is as follows:
> > #4 0x0000561af7948362 in bdrv_graph_rdlock_main_loop ()
> at ../block/graph-lock.c:260
> > #5 0x0000561af7907a68 in graph_lockable_auto_lock_mainloop
> (x=0x7fd29810be7b) at /patch/to/qemu/include/block/graph-lock.h:259
> > #6 0x0000561af79167d1 in bdrv_activate_all (errp=0x7fd29810bed0)
> at ../block.c:6906
> > #7 0x0000561af762b4af in colo_incoming_co () at ../migration/colo.c:935
> > #8 0x0000561af7607e57 in process_incoming_migration_co (opaque=0x0)
> at ../migration/migration.c:793
> > #9 0x0000561af7adbeeb in coroutine_trampoline (i0=-106876144,
> i1=22042) at ../util/coroutine-ucontext.c:175
> > #10 0x00007fd2a5cf21c0 in () at /lib64/libc.so.6
> >
> > CC: Fabiano Rosas <farosas@suse.de>
>
> Cc: qemu-stable@nongnu.org
>
> > Closes: https://gitlab.com/qemu-project/qemu/-/issues/2277
> > Fixes: 2b3912f135 ("block: Mark bdrv_first_blk() and
> > bdrv_is_root_node() GRAPH_RDLOCK")
> > Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
It looks good to me. And already verified this patch in my environment.
After address Phillippe's comments please add:
Reviewed-by: Zhang Chen <chen.zhang@intel.com>
Tested-by: Zhang Chen <chen.zhang@intel.com>
Thanks
Chen
> > ---
> > V2: fix missing bql_unlock() in error path.
> > ---
> > migration/colo.c | 18 ++++++++++--------
> > 1 file changed, 10 insertions(+), 8 deletions(-)
> >
> > diff --git a/migration/colo.c b/migration/colo.c index
> > 84632a603e..5600a43d78 100644
> > --- a/migration/colo.c
> > +++ b/migration/colo.c
> > @@ -835,6 +835,16 @@ static void *colo_process_incoming_thread(void
> *opaque)
> > return NULL;
> > }
> >
> > + /* Make sure all file formats throw away their mutable metadata */
> > + bql_lock();
>
> Note there is also the convenient BQL_LOCK_GUARD() macro.
>
> > + bdrv_activate_all(&local_err);
> > + if (local_err) {
> > + bql_unlock();
> > + error_report_err(local_err);
> > + return NULL;
> > + }
> > + bql_unlock();
> > +
> > failover_init_state();
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v2] migration/colo: Fix bdrv_graph_rdlock_main_loop: Assertion `!qemu_in_coroutine()' failed.
2024-04-17 6:47 ` Zhang, Chen
@ 2024-04-17 7:01 ` Philippe Mathieu-Daudé
0 siblings, 0 replies; 10+ messages in thread
From: Philippe Mathieu-Daudé @ 2024-04-17 7:01 UTC (permalink / raw)
To: Zhang, Chen, Li Zhijian, Zhang, Hailiang, peterx@redhat.com,
farosas@suse.de
Cc: qemu-devel@nongnu.org, Wen Congyang, Xie Changlong
On 17/4/24 08:47, Zhang, Chen wrote:
>
>
>> -----Original Message-----
>> From: Philippe Mathieu-Daudé <philmd@linaro.org>
>> Sent: Wednesday, April 17, 2024 2:14 PM
>> To: Li Zhijian <lizhijian@fujitsu.com>; Zhang, Hailiang
>> <zhanghailiang@xfusion.com>; peterx@redhat.com; farosas@suse.de
>> Cc: qemu-devel@nongnu.org; Zhang, Chen <chen.zhang@intel.com>; Wen
>> Congyang <wencongyang2@huawei.com>; Xie Changlong
>> <xiechanglong.d@gmail.com>
>> Subject: Re: [PATCH v2] migration/colo: Fix bdrv_graph_rdlock_main_loop:
>> Assertion `!qemu_in_coroutine()' failed.
>>
>> On 17/4/24 04:56, Li Zhijian via wrote:
>>> bdrv_activate_all() should not be called from the coroutine context,
>>> move it to the QEMU thread colo_process_incoming_thread() with the
>>> bql_lock protected.
>>>
>>> The backtrace is as follows:
>>> #4 0x0000561af7948362 in bdrv_graph_rdlock_main_loop ()
>> at ../block/graph-lock.c:260
>>> #5 0x0000561af7907a68 in graph_lockable_auto_lock_mainloop
>> (x=0x7fd29810be7b) at /patch/to/qemu/include/block/graph-lock.h:259
>>> #6 0x0000561af79167d1 in bdrv_activate_all (errp=0x7fd29810bed0)
>> at ../block.c:6906
>>> #7 0x0000561af762b4af in colo_incoming_co () at ../migration/colo.c:935
>>> #8 0x0000561af7607e57 in process_incoming_migration_co (opaque=0x0)
>> at ../migration/migration.c:793
>>> #9 0x0000561af7adbeeb in coroutine_trampoline (i0=-106876144,
>> i1=22042) at ../util/coroutine-ucontext.c:175
>>> #10 0x00007fd2a5cf21c0 in () at /lib64/libc.so.6
>>>
>>> CC: Fabiano Rosas <farosas@suse.de>
>>
>> Cc: qemu-stable@nongnu.org
>>
>>> Closes: https://gitlab.com/qemu-project/qemu/-/issues/2277
>>> Fixes: 2b3912f135 ("block: Mark bdrv_first_blk() and
>>> bdrv_is_root_node() GRAPH_RDLOCK")
>>> Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
>
> It looks good to me. And already verified this patch in my environment.
> After address Phillippe's comments
(I'm only suggesting, in case using the macro results in code
more easily maintainable)
> please add:
>
> Reviewed-by: Zhang Chen <chen.zhang@intel.com>
> Tested-by: Zhang Chen <chen.zhang@intel.com>
>
> Thanks
> Chen
>
>>> ---
>>> V2: fix missing bql_unlock() in error path.
>>> ---
>>> migration/colo.c | 18 ++++++++++--------
>>> 1 file changed, 10 insertions(+), 8 deletions(-)
>>>
>>> diff --git a/migration/colo.c b/migration/colo.c index
>>> 84632a603e..5600a43d78 100644
>>> --- a/migration/colo.c
>>> +++ b/migration/colo.c
>>> @@ -835,6 +835,16 @@ static void *colo_process_incoming_thread(void
>> *opaque)
>>> return NULL;
>>> }
>>>
>>> + /* Make sure all file formats throw away their mutable metadata */
>>> + bql_lock();
>>
>> Note there is also the convenient BQL_LOCK_GUARD() macro.
>>
>>> + bdrv_activate_all(&local_err);
>>> + if (local_err) {
>>> + bql_unlock();
>>> + error_report_err(local_err);
>>> + return NULL;
>>> + }
>>> + bql_unlock();
>>> +
>>> failover_init_state();
>
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v2] migration/colo: Fix bdrv_graph_rdlock_main_loop: Assertion `!qemu_in_coroutine()' failed.
2024-04-17 6:13 ` Philippe Mathieu-Daudé
2024-04-17 6:47 ` Zhang, Chen
@ 2024-04-17 7:02 ` Zhijian Li (Fujitsu) via
1 sibling, 0 replies; 10+ messages in thread
From: Zhijian Li (Fujitsu) via @ 2024-04-17 7:02 UTC (permalink / raw)
To: Philippe Mathieu-Daudé, zhanghailiang@xfusion.com,
peterx@redhat.com, farosas@suse.de
Cc: qemu-devel@nongnu.org, chen.zhang@intel.com, Wen Congyang,
Xie Changlong
On 17/04/2024 14:13, Philippe Mathieu-Daudé wrote:
> On 17/4/24 04:56, Li Zhijian via wrote:
>> bdrv_activate_all() should not be called from the coroutine context, move
>> it to the QEMU thread colo_process_incoming_thread() with the bql_lock
>> protected.
>>
>> The backtrace is as follows:
>> #4 0x0000561af7948362 in bdrv_graph_rdlock_main_loop () at ../block/graph-lock.c:260
>> #5 0x0000561af7907a68 in graph_lockable_auto_lock_mainloop (x=0x7fd29810be7b) at /patch/to/qemu/include/block/graph-lock.h:259
>> #6 0x0000561af79167d1 in bdrv_activate_all (errp=0x7fd29810bed0) at ../block.c:6906
>> #7 0x0000561af762b4af in colo_incoming_co () at ../migration/colo.c:935
>> #8 0x0000561af7607e57 in process_incoming_migration_co (opaque=0x0) at ../migration/migration.c:793
>> #9 0x0000561af7adbeeb in coroutine_trampoline (i0=-106876144, i1=22042) at ../util/coroutine-ucontext.c:175
>> #10 0x00007fd2a5cf21c0 in () at /lib64/libc.so.6
>>
>> CC: Fabiano Rosas <farosas@suse.de>
>
> Cc: qemu-stable@nongnu.org
>
>> Closes: https://gitlab.com/qemu-project/qemu/-/issues/2277
>> Fixes: 2b3912f135 ("block: Mark bdrv_first_blk() and bdrv_is_root_node() GRAPH_RDLOCK")
>> Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
>> ---
>> V2: fix missing bql_unlock() in error path.
>> ---
>> migration/colo.c | 18 ++++++++++--------
>> 1 file changed, 10 insertions(+), 8 deletions(-)
>>
>> diff --git a/migration/colo.c b/migration/colo.c
>> index 84632a603e..5600a43d78 100644
>> --- a/migration/colo.c
>> +++ b/migration/colo.c
>> @@ -835,6 +835,16 @@ static void *colo_process_incoming_thread(void *opaque)
>> return NULL;
>> }
>> + /* Make sure all file formats throw away their mutable metadata */
>> + bql_lock();
>
> Note there is also the convenient BQL_LOCK_GUARD() macro.
Thanks for your reminder, Philippe.
IIUC, BQL_LOCK_GUARD() will lock all the rest code till the subroutine ends.
So it's not suitable for this case where we only need a partial critical section.
Thanks
Zhijian
>
>> + bdrv_activate_all(&local_err);
>> + if (local_err) {
>> + bql_unlock();
>> + error_report_err(local_err);
>> + return NULL;
>> + }
>> + bql_unlock();
>> +
>> failover_init_state();
>
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v2] migration/colo: Fix bdrv_graph_rdlock_main_loop: Assertion `!qemu_in_coroutine()' failed.
2024-04-17 2:56 [PATCH v2] migration/colo: Fix bdrv_graph_rdlock_main_loop: Assertion `!qemu_in_coroutine()' failed Li Zhijian via
2024-04-17 6:13 ` Philippe Mathieu-Daudé
@ 2024-04-17 13:13 ` Fabiano Rosas
2024-04-17 16:41 ` Peter Xu
` (2 subsequent siblings)
4 siblings, 0 replies; 10+ messages in thread
From: Fabiano Rosas @ 2024-04-17 13:13 UTC (permalink / raw)
To: Li Zhijian via, zhanghailiang, peterx
Cc: qemu-devel, chen.zhang, Wen Congyang, Xie Changlong, Li Zhijian
Li Zhijian via <qemu-devel@nongnu.org> writes:
> bdrv_activate_all() should not be called from the coroutine context, move
> it to the QEMU thread colo_process_incoming_thread() with the bql_lock
> protected.
>
> The backtrace is as follows:
> #4 0x0000561af7948362 in bdrv_graph_rdlock_main_loop () at ../block/graph-lock.c:260
> #5 0x0000561af7907a68 in graph_lockable_auto_lock_mainloop (x=0x7fd29810be7b) at /patch/to/qemu/include/block/graph-lock.h:259
> #6 0x0000561af79167d1 in bdrv_activate_all (errp=0x7fd29810bed0) at ../block.c:6906
> #7 0x0000561af762b4af in colo_incoming_co () at ../migration/colo.c:935
> #8 0x0000561af7607e57 in process_incoming_migration_co (opaque=0x0) at ../migration/migration.c:793
> #9 0x0000561af7adbeeb in coroutine_trampoline (i0=-106876144, i1=22042) at ../util/coroutine-ucontext.c:175
> #10 0x00007fd2a5cf21c0 in () at /lib64/libc.so.6
>
> CC: Fabiano Rosas <farosas@suse.de>
> Closes: https://gitlab.com/qemu-project/qemu/-/issues/2277
> Fixes: 2b3912f135 ("block: Mark bdrv_first_blk() and bdrv_is_root_node() GRAPH_RDLOCK")
> Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v2] migration/colo: Fix bdrv_graph_rdlock_main_loop: Assertion `!qemu_in_coroutine()' failed.
2024-04-17 2:56 [PATCH v2] migration/colo: Fix bdrv_graph_rdlock_main_loop: Assertion `!qemu_in_coroutine()' failed Li Zhijian via
2024-04-17 6:13 ` Philippe Mathieu-Daudé
2024-04-17 13:13 ` Fabiano Rosas
@ 2024-04-17 16:41 ` Peter Xu
2024-04-28 12:39 ` Michael Tokarev
2024-04-28 12:42 ` Michael Tokarev
4 siblings, 0 replies; 10+ messages in thread
From: Peter Xu @ 2024-04-17 16:41 UTC (permalink / raw)
To: Li Zhijian
Cc: zhanghailiang, farosas, qemu-devel, chen.zhang, Wen Congyang,
Xie Changlong, Philippe Mathieu-Daudé
On Wed, Apr 17, 2024 at 10:56:34AM +0800, Li Zhijian via wrote:
> bdrv_activate_all() should not be called from the coroutine context, move
> it to the QEMU thread colo_process_incoming_thread() with the bql_lock
> protected.
>
> The backtrace is as follows:
> #4 0x0000561af7948362 in bdrv_graph_rdlock_main_loop () at ../block/graph-lock.c:260
> #5 0x0000561af7907a68 in graph_lockable_auto_lock_mainloop (x=0x7fd29810be7b) at /patch/to/qemu/include/block/graph-lock.h:259
> #6 0x0000561af79167d1 in bdrv_activate_all (errp=0x7fd29810bed0) at ../block.c:6906
> #7 0x0000561af762b4af in colo_incoming_co () at ../migration/colo.c:935
> #8 0x0000561af7607e57 in process_incoming_migration_co (opaque=0x0) at ../migration/migration.c:793
> #9 0x0000561af7adbeeb in coroutine_trampoline (i0=-106876144, i1=22042) at ../util/coroutine-ucontext.c:175
> #10 0x00007fd2a5cf21c0 in () at /lib64/libc.so.6
>
> CC: Fabiano Rosas <farosas@suse.de>
> Closes: https://gitlab.com/qemu-project/qemu/-/issues/2277
> Fixes: 2b3912f135 ("block: Mark bdrv_first_blk() and bdrv_is_root_node() GRAPH_RDLOCK")
> Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
queued (and cc stable), thanks.
--
Peter Xu
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v2] migration/colo: Fix bdrv_graph_rdlock_main_loop: Assertion `!qemu_in_coroutine()' failed.
2024-04-17 2:56 [PATCH v2] migration/colo: Fix bdrv_graph_rdlock_main_loop: Assertion `!qemu_in_coroutine()' failed Li Zhijian via
` (2 preceding siblings ...)
2024-04-17 16:41 ` Peter Xu
@ 2024-04-28 12:39 ` Michael Tokarev
2024-04-28 12:58 ` Peter Xu
2024-04-28 12:42 ` Michael Tokarev
4 siblings, 1 reply; 10+ messages in thread
From: Michael Tokarev @ 2024-04-28 12:39 UTC (permalink / raw)
To: Li Zhijian, zhanghailiang, peterx, farosas
Cc: qemu-devel, chen.zhang, Wen Congyang, Xie Changlong
17.04.2024 05:56, Li Zhijian via wrote:
> bdrv_activate_all() should not be called from the coroutine context, move
> it to the QEMU thread colo_process_incoming_thread() with the bql_lock
> protected.
>
> The backtrace is as follows:
> #4 0x0000561af7948362 in bdrv_graph_rdlock_main_loop () at ../block/graph-lock.c:260
> #5 0x0000561af7907a68 in graph_lockable_auto_lock_mainloop (x=0x7fd29810be7b) at /patch/to/qemu/include/block/graph-lock.h:259
> #6 0x0000561af79167d1 in bdrv_activate_all (errp=0x7fd29810bed0) at ../block.c:6906
> #7 0x0000561af762b4af in colo_incoming_co () at ../migration/colo.c:935
> #8 0x0000561af7607e57 in process_incoming_migration_co (opaque=0x0) at ../migration/migration.c:793
> #9 0x0000561af7adbeeb in coroutine_trampoline (i0=-106876144, i1=22042) at ../util/coroutine-ucontext.c:175
> #10 0x00007fd2a5cf21c0 in () at /lib64/libc.so.6
>
> CC: Fabiano Rosas <farosas@suse.de>
> Closes: https://gitlab.com/qemu-project/qemu/-/issues/2277
> Fixes: 2b3912f135 ("block: Mark bdrv_first_blk() and bdrv_is_root_node() GRAPH_RDLOCK")
> Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
Commit 2b3912f135 is in 8.2 (v8.1.0-1575-g2b3912f135). Is this fix supposed to go
to stable-8.2 series? The prob here is that in 8.2, there's no bql_lock/unlock.
I guess it should use qemu_mutex_lock_iothread() instead, for before
195801d700c008 "system/cpus: rename qemu_mutex_lock_iothread() to bql_lock()".
Thanks,
/mjt
> diff --git a/migration/colo.c b/migration/colo.c
> index 84632a603e..5600a43d78 100644
> --- a/migration/colo.c
> +++ b/migration/colo.c
> @@ -835,6 +835,16 @@ static void *colo_process_incoming_thread(void *opaque)
> return NULL;
> }
>
> + /* Make sure all file formats throw away their mutable metadata */
> + bql_lock();
> + bdrv_activate_all(&local_err);
> + if (local_err) {
> + bql_unlock();
> + error_report_err(local_err);
> + return NULL;
> + }
> + bql_unlock();
> +
> failover_init_state();
>
> mis->to_src_file = qemu_file_get_return_path(mis->from_src_file);
> @@ -922,7 +932,6 @@ out:
> int coroutine_fn colo_incoming_co(void)
> {
> MigrationIncomingState *mis = migration_incoming_get_current();
> - Error *local_err = NULL;
> QemuThread th;
>
> assert(bql_locked());
> @@ -931,13 +940,6 @@ int coroutine_fn colo_incoming_co(void)
> return 0;
> }
>
> - /* Make sure all file formats throw away their mutable metadata */
> - bdrv_activate_all(&local_err);
> - if (local_err) {
> - error_report_err(local_err);
> - return -EINVAL;
> - }
> -
> qemu_thread_create(&th, "COLO incoming", colo_process_incoming_thread,
> mis, QEMU_THREAD_JOINABLE);
>
--
GPG Key transition (from rsa2048 to rsa4096) since 2024-04-24.
New key: rsa4096/61AD3D98ECDF2C8E 9D8B E14E 3F2A 9DD7 9199 28F1 61AD 3D98 ECDF 2C8E
Old key: rsa2048/457CE0A0804465C5 6EE1 95D1 886E 8FFB 810D 4324 457C E0A0 8044 65C5
Transition statement: http://www.corpit.ru/mjt/gpg-transition-2024.txt
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v2] migration/colo: Fix bdrv_graph_rdlock_main_loop: Assertion `!qemu_in_coroutine()' failed.
2024-04-28 12:39 ` Michael Tokarev
@ 2024-04-28 12:58 ` Peter Xu
0 siblings, 0 replies; 10+ messages in thread
From: Peter Xu @ 2024-04-28 12:58 UTC (permalink / raw)
To: Michael Tokarev
Cc: Li Zhijian, zhanghailiang, farosas, qemu-devel, chen.zhang,
Wen Congyang, Xie Changlong
On Sun, Apr 28, 2024 at 03:39:09PM +0300, Michael Tokarev wrote:
> 17.04.2024 05:56, Li Zhijian via wrote:
> > bdrv_activate_all() should not be called from the coroutine context, move
> > it to the QEMU thread colo_process_incoming_thread() with the bql_lock
> > protected.
> >
> > The backtrace is as follows:
> > #4 0x0000561af7948362 in bdrv_graph_rdlock_main_loop () at ../block/graph-lock.c:260
> > #5 0x0000561af7907a68 in graph_lockable_auto_lock_mainloop (x=0x7fd29810be7b) at /patch/to/qemu/include/block/graph-lock.h:259
> > #6 0x0000561af79167d1 in bdrv_activate_all (errp=0x7fd29810bed0) at ../block.c:6906
> > #7 0x0000561af762b4af in colo_incoming_co () at ../migration/colo.c:935
> > #8 0x0000561af7607e57 in process_incoming_migration_co (opaque=0x0) at ../migration/migration.c:793
> > #9 0x0000561af7adbeeb in coroutine_trampoline (i0=-106876144, i1=22042) at ../util/coroutine-ucontext.c:175
> > #10 0x00007fd2a5cf21c0 in () at /lib64/libc.so.6
> >
> > CC: Fabiano Rosas <farosas@suse.de>
> > Closes: https://gitlab.com/qemu-project/qemu/-/issues/2277
> > Fixes: 2b3912f135 ("block: Mark bdrv_first_blk() and bdrv_is_root_node() GRAPH_RDLOCK")
> > Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
>
> Commit 2b3912f135 is in 8.2 (v8.1.0-1575-g2b3912f135). Is this fix supposed to go
> to stable-8.2 series?
Yes we probably should. I've copied qemu-stable when queuing this patch
for that:
https://lore.kernel.org/r/20240423223813.3237060-27-peterx@redhat.com
> The prob here is that in 8.2, there's no bql_lock/unlock.
> I guess it should use qemu_mutex_lock_iothread() instead, for before
> 195801d700c008 "system/cpus: rename qemu_mutex_lock_iothread() to bql_lock()".
Right, some further care needs to be taken on this one indeed.
> 17.04.2024 05:56, Li Zhijian via wrote:
> > --- a/migration/colo.c
> > +++ b/migration/colo.c
> > @@ -835,6 +835,16 @@ static void *colo_process_incoming_thread(void *opaque)
> > return NULL;
> > }
> > + /* Make sure all file formats throw away their mutable metadata */
> > + bql_lock();
> > + bdrv_activate_all(&local_err);
> > + if (local_err) {
> > + bql_unlock();
> > + error_report_err(local_err);
> > + return NULL;
> > + }
> > + bql_unlock();
>
> FWIW, this can be simplified as follows:
>
> bql_lock();
> bdrv_activate_all(&local_err);
> bql_unlock();
> if (local_err) {
> error_report_err(local_err);
> return NULL;
> }
>
> (I know it is already too late)
Agree. :)
Thanks,
--
Peter Xu
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v2] migration/colo: Fix bdrv_graph_rdlock_main_loop: Assertion `!qemu_in_coroutine()' failed.
2024-04-17 2:56 [PATCH v2] migration/colo: Fix bdrv_graph_rdlock_main_loop: Assertion `!qemu_in_coroutine()' failed Li Zhijian via
` (3 preceding siblings ...)
2024-04-28 12:39 ` Michael Tokarev
@ 2024-04-28 12:42 ` Michael Tokarev
4 siblings, 0 replies; 10+ messages in thread
From: Michael Tokarev @ 2024-04-28 12:42 UTC (permalink / raw)
To: Li Zhijian, zhanghailiang, peterx, farosas
Cc: qemu-devel, chen.zhang, Wen Congyang, Xie Changlong
17.04.2024 05:56, Li Zhijian via wrote:
> --- a/migration/colo.c
> +++ b/migration/colo.c
> @@ -835,6 +835,16 @@ static void *colo_process_incoming_thread(void *opaque)
> return NULL;
> }
>
> + /* Make sure all file formats throw away their mutable metadata */
> + bql_lock();
> + bdrv_activate_all(&local_err);
> + if (local_err) {
> + bql_unlock();
> + error_report_err(local_err);
> + return NULL;
> + }
> + bql_unlock();
FWIW, this can be simplified as follows:
bql_lock();
bdrv_activate_all(&local_err);
bql_unlock();
if (local_err) {
error_report_err(local_err);
return NULL;
}
(I know it is already too late)
/mjt
--
GPG Key transition (from rsa2048 to rsa4096) since 2024-04-24.
New key: rsa4096/61AD3D98ECDF2C8E 9D8B E14E 3F2A 9DD7 9199 28F1 61AD 3D98 ECDF 2C8E
Old key: rsa2048/457CE0A0804465C5 6EE1 95D1 886E 8FFB 810D 4324 457C E0A0 8044 65C5
Transition statement: http://www.corpit.ru/mjt/gpg-transition-2024.txt
^ permalink raw reply [flat|nested] 10+ messages in thread