* [RFC PATCH-for-8.2?] accel/tcg: Implement tcg_unregister_thread()
@ 2023-12-04 19:40 Philippe Mathieu-Daudé
2023-12-04 19:50 ` Stefan Hajnoczi
2023-12-06 14:17 ` Miguel Luis
0 siblings, 2 replies; 13+ messages in thread
From: Philippe Mathieu-Daudé @ 2023-12-04 19:40 UTC (permalink / raw)
To: qemu-devel
Cc: Richard Henderson, Paolo Bonzini, Philippe Mathieu-Daudé,
Michal Suchánek, Stefan Hajnoczi
Unplugging vCPU triggers the following assertion in
tcg_register_thread():
796 void tcg_register_thread(void)
797 {
...
812 /* Claim an entry in tcg_ctxs */
813 n = qatomic_fetch_inc(&tcg_cur_ctxs);
814 g_assert(n < tcg_max_ctxs);
Implement and use tcg_unregister_thread() so when a
vCPU is unplugged, the tcg_cur_ctxs refcount is
decremented.
Reported-by: Michal Suchánek <msuchanek@suse.de>
Suggested-by: Stefan Hajnoczi <stefanha@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
---
RFC: untested
Report: https://lore.kernel.org/qemu-devel/20231204183638.GZ9696@kitsune.suse.cz/
---
include/tcg/startup.h | 5 +++++
accel/tcg/tcg-accel-ops-mttcg.c | 1 +
accel/tcg/tcg-accel-ops-rr.c | 1 +
tcg/tcg.c | 17 +++++++++++++++++
4 files changed, 24 insertions(+)
diff --git a/include/tcg/startup.h b/include/tcg/startup.h
index f71305765c..520942a4a1 100644
--- a/include/tcg/startup.h
+++ b/include/tcg/startup.h
@@ -45,6 +45,11 @@ void tcg_init(size_t tb_size, int splitwx, unsigned max_cpus);
*/
void tcg_register_thread(void);
+/**
+ * tcg_unregister_thread: Unregister this thread with the TCG runtime
+ */
+void tcg_unregister_thread(void);
+
/**
* tcg_prologue_init(): Generate the code for the TCG prologue
*
diff --git a/accel/tcg/tcg-accel-ops-mttcg.c b/accel/tcg/tcg-accel-ops-mttcg.c
index fac80095bb..88d7427aad 100644
--- a/accel/tcg/tcg-accel-ops-mttcg.c
+++ b/accel/tcg/tcg-accel-ops-mttcg.c
@@ -120,6 +120,7 @@ static void *mttcg_cpu_thread_fn(void *arg)
tcg_cpus_destroy(cpu);
qemu_mutex_unlock_iothread();
+ tcg_unregister_thread();
rcu_remove_force_rcu_notifier(&force_rcu.notifier);
rcu_unregister_thread();
return NULL;
diff --git a/accel/tcg/tcg-accel-ops-rr.c b/accel/tcg/tcg-accel-ops-rr.c
index 611932f3c3..c2af3aad21 100644
--- a/accel/tcg/tcg-accel-ops-rr.c
+++ b/accel/tcg/tcg-accel-ops-rr.c
@@ -302,6 +302,7 @@ static void *rr_cpu_thread_fn(void *arg)
rr_deal_with_unplugged_cpus();
}
+ tcg_unregister_thread();
rcu_remove_force_rcu_notifier(&force_rcu);
rcu_unregister_thread();
return NULL;
diff --git a/tcg/tcg.c b/tcg/tcg.c
index d2ea22b397..5125342d70 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -781,11 +781,18 @@ static void alloc_tcg_plugin_context(TCGContext *s)
* modes.
*/
#ifdef CONFIG_USER_ONLY
+
void tcg_register_thread(void)
{
tcg_ctx = &tcg_init_ctx;
}
+
+void tcg_unregister_thread(void)
+{
+}
+
#else
+
void tcg_register_thread(void)
{
TCGContext *s = g_malloc(sizeof(*s));
@@ -814,6 +821,16 @@ void tcg_register_thread(void)
tcg_ctx = s;
}
+
+void tcg_unregister_thread(void)
+{
+ unsigned int n;
+
+ n = qatomic_fetch_dec(&tcg_cur_ctxs);
+ g_free(tcg_ctxs[n]);
+ qatomic_set(&tcg_ctxs[n], NULL);
+}
+
#endif /* !CONFIG_USER_ONLY */
/* pool based memory allocation */
--
2.41.0
^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [RFC PATCH-for-8.2?] accel/tcg: Implement tcg_unregister_thread()
2023-12-04 19:40 [RFC PATCH-for-8.2?] accel/tcg: Implement tcg_unregister_thread() Philippe Mathieu-Daudé
@ 2023-12-04 19:50 ` Stefan Hajnoczi
2023-12-04 20:09 ` Michal Suchánek
2023-12-06 14:17 ` Miguel Luis
1 sibling, 1 reply; 13+ messages in thread
From: Stefan Hajnoczi @ 2023-12-04 19:50 UTC (permalink / raw)
To: Philippe Mathieu-Daudé
Cc: qemu-devel, Richard Henderson, Paolo Bonzini,
Michal Suchánek
On Mon, 4 Dec 2023 at 14:40, Philippe Mathieu-Daudé <philmd@linaro.org> wrote:
>
> Unplugging vCPU triggers the following assertion in
Unplugging leaks the tcg context refcount but does not trigger the
assertion directly. Maybe clarify that by changing the wording:
"Plugging a vCPU after it has been unplugged triggers..."
> tcg_register_thread():
>
> 796 void tcg_register_thread(void)
> 797 {
> ...
> 812 /* Claim an entry in tcg_ctxs */
> 813 n = qatomic_fetch_inc(&tcg_cur_ctxs);
> 814 g_assert(n < tcg_max_ctxs);
>
> Implement and use tcg_unregister_thread() so when a
> vCPU is unplugged, the tcg_cur_ctxs refcount is
> decremented.
>
> Reported-by: Michal Suchánek <msuchanek@suse.de>
> Suggested-by: Stefan Hajnoczi <stefanha@gmail.com>
> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
> ---
> RFC: untested
> Report: https://lore.kernel.org/qemu-devel/20231204183638.GZ9696@kitsune.suse.cz/
> ---
> include/tcg/startup.h | 5 +++++
> accel/tcg/tcg-accel-ops-mttcg.c | 1 +
> accel/tcg/tcg-accel-ops-rr.c | 1 +
> tcg/tcg.c | 17 +++++++++++++++++
> 4 files changed, 24 insertions(+)
>
> diff --git a/include/tcg/startup.h b/include/tcg/startup.h
> index f71305765c..520942a4a1 100644
> --- a/include/tcg/startup.h
> +++ b/include/tcg/startup.h
> @@ -45,6 +45,11 @@ void tcg_init(size_t tb_size, int splitwx, unsigned max_cpus);
> */
> void tcg_register_thread(void);
>
> +/**
> + * tcg_unregister_thread: Unregister this thread with the TCG runtime
> + */
> +void tcg_unregister_thread(void);
> +
> /**
> * tcg_prologue_init(): Generate the code for the TCG prologue
> *
> diff --git a/accel/tcg/tcg-accel-ops-mttcg.c b/accel/tcg/tcg-accel-ops-mttcg.c
> index fac80095bb..88d7427aad 100644
> --- a/accel/tcg/tcg-accel-ops-mttcg.c
> +++ b/accel/tcg/tcg-accel-ops-mttcg.c
> @@ -120,6 +120,7 @@ static void *mttcg_cpu_thread_fn(void *arg)
>
> tcg_cpus_destroy(cpu);
> qemu_mutex_unlock_iothread();
> + tcg_unregister_thread();
> rcu_remove_force_rcu_notifier(&force_rcu.notifier);
> rcu_unregister_thread();
> return NULL;
> diff --git a/accel/tcg/tcg-accel-ops-rr.c b/accel/tcg/tcg-accel-ops-rr.c
> index 611932f3c3..c2af3aad21 100644
> --- a/accel/tcg/tcg-accel-ops-rr.c
> +++ b/accel/tcg/tcg-accel-ops-rr.c
> @@ -302,6 +302,7 @@ static void *rr_cpu_thread_fn(void *arg)
> rr_deal_with_unplugged_cpus();
> }
>
> + tcg_unregister_thread();
> rcu_remove_force_rcu_notifier(&force_rcu);
> rcu_unregister_thread();
> return NULL;
> diff --git a/tcg/tcg.c b/tcg/tcg.c
> index d2ea22b397..5125342d70 100644
> --- a/tcg/tcg.c
> +++ b/tcg/tcg.c
> @@ -781,11 +781,18 @@ static void alloc_tcg_plugin_context(TCGContext *s)
> * modes.
> */
> #ifdef CONFIG_USER_ONLY
> +
> void tcg_register_thread(void)
> {
> tcg_ctx = &tcg_init_ctx;
> }
> +
> +void tcg_unregister_thread(void)
> +{
> +}
> +
> #else
> +
> void tcg_register_thread(void)
> {
> TCGContext *s = g_malloc(sizeof(*s));
> @@ -814,6 +821,16 @@ void tcg_register_thread(void)
>
> tcg_ctx = s;
> }
> +
> +void tcg_unregister_thread(void)
> +{
> + unsigned int n;
> +
> + n = qatomic_fetch_dec(&tcg_cur_ctxs);
> + g_free(tcg_ctxs[n]);
> + qatomic_set(&tcg_ctxs[n], NULL);
> +}
tcg_ctxs[n] may not be our context, so this looks like it could free
another thread's context and lead to undefined behavior.
I haven't read the code so I can't suggest an alternative myself.
Stefan
> +
> #endif /* !CONFIG_USER_ONLY */
>
> /* pool based memory allocation */
> --
> 2.41.0
>
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [RFC PATCH-for-8.2?] accel/tcg: Implement tcg_unregister_thread()
2023-12-04 19:50 ` Stefan Hajnoczi
@ 2023-12-04 20:09 ` Michal Suchánek
2023-12-04 23:02 ` Richard Henderson
0 siblings, 1 reply; 13+ messages in thread
From: Michal Suchánek @ 2023-12-04 20:09 UTC (permalink / raw)
To: Stefan Hajnoczi
Cc: Philippe Mathieu-Daudé, qemu-devel, Richard Henderson,
Paolo Bonzini
On Mon, Dec 04, 2023 at 02:50:17PM -0500, Stefan Hajnoczi wrote:
> On Mon, 4 Dec 2023 at 14:40, Philippe Mathieu-Daudé <philmd@linaro.org> wrote:
> >
> > Unplugging vCPU triggers the following assertion in
>
> Unplugging leaks the tcg context refcount but does not trigger the
> assertion directly. Maybe clarify that by changing the wording:
>
> "Plugging a vCPU after it has been unplugged triggers..."
>
> > tcg_register_thread():
> >
> > 796 void tcg_register_thread(void)
> > 797 {
> > ...
> > 812 /* Claim an entry in tcg_ctxs */
> > 813 n = qatomic_fetch_inc(&tcg_cur_ctxs);
> > 814 g_assert(n < tcg_max_ctxs);
> >
> > Implement and use tcg_unregister_thread() so when a
> > vCPU is unplugged, the tcg_cur_ctxs refcount is
> > decremented.
> >
> > Reported-by: Michal Suchánek <msuchanek@suse.de>
> > Suggested-by: Stefan Hajnoczi <stefanha@gmail.com>
> > Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
> > ---
> > RFC: untested
> > Report: https://lore.kernel.org/qemu-devel/20231204183638.GZ9696@kitsune.suse.cz/
> > ---
> > include/tcg/startup.h | 5 +++++
> > accel/tcg/tcg-accel-ops-mttcg.c | 1 +
> > accel/tcg/tcg-accel-ops-rr.c | 1 +
> > tcg/tcg.c | 17 +++++++++++++++++
> > 4 files changed, 24 insertions(+)
> >
> > diff --git a/include/tcg/startup.h b/include/tcg/startup.h
> > index f71305765c..520942a4a1 100644
> > --- a/include/tcg/startup.h
> > +++ b/include/tcg/startup.h
> > @@ -45,6 +45,11 @@ void tcg_init(size_t tb_size, int splitwx, unsigned max_cpus);
> > */
> > void tcg_register_thread(void);
> >
> > +/**
> > + * tcg_unregister_thread: Unregister this thread with the TCG runtime
> > + */
> > +void tcg_unregister_thread(void);
> > +
> > /**
> > * tcg_prologue_init(): Generate the code for the TCG prologue
> > *
> > diff --git a/accel/tcg/tcg-accel-ops-mttcg.c b/accel/tcg/tcg-accel-ops-mttcg.c
> > index fac80095bb..88d7427aad 100644
> > --- a/accel/tcg/tcg-accel-ops-mttcg.c
> > +++ b/accel/tcg/tcg-accel-ops-mttcg.c
> > @@ -120,6 +120,7 @@ static void *mttcg_cpu_thread_fn(void *arg)
> >
> > tcg_cpus_destroy(cpu);
> > qemu_mutex_unlock_iothread();
> > + tcg_unregister_thread();
> > rcu_remove_force_rcu_notifier(&force_rcu.notifier);
> > rcu_unregister_thread();
> > return NULL;
> > diff --git a/accel/tcg/tcg-accel-ops-rr.c b/accel/tcg/tcg-accel-ops-rr.c
> > index 611932f3c3..c2af3aad21 100644
> > --- a/accel/tcg/tcg-accel-ops-rr.c
> > +++ b/accel/tcg/tcg-accel-ops-rr.c
> > @@ -302,6 +302,7 @@ static void *rr_cpu_thread_fn(void *arg)
> > rr_deal_with_unplugged_cpus();
> > }
> >
> > + tcg_unregister_thread();
> > rcu_remove_force_rcu_notifier(&force_rcu);
> > rcu_unregister_thread();
> > return NULL;
> > diff --git a/tcg/tcg.c b/tcg/tcg.c
> > index d2ea22b397..5125342d70 100644
> > --- a/tcg/tcg.c
> > +++ b/tcg/tcg.c
> > @@ -781,11 +781,18 @@ static void alloc_tcg_plugin_context(TCGContext *s)
> > * modes.
> > */
> > #ifdef CONFIG_USER_ONLY
> > +
> > void tcg_register_thread(void)
> > {
> > tcg_ctx = &tcg_init_ctx;
> > }
> > +
> > +void tcg_unregister_thread(void)
> > +{
> > +}
> > +
> > #else
> > +
> > void tcg_register_thread(void)
> > {
> > TCGContext *s = g_malloc(sizeof(*s));
> > @@ -814,6 +821,16 @@ void tcg_register_thread(void)
> >
> > tcg_ctx = s;
> > }
> > +
> > +void tcg_unregister_thread(void)
> > +{
> > + unsigned int n;
> > +
> > + n = qatomic_fetch_dec(&tcg_cur_ctxs);
> > + g_free(tcg_ctxs[n]);
> > + qatomic_set(&tcg_ctxs[n], NULL);
> > +}
>
> tcg_ctxs[n] may not be our context, so this looks like it could free
> another thread's context and lead to undefined behavior.
There is cpu->thread_id so perhaps cpu->thread_ctx could be added as
well. That would require a bitmap of used threads contexts rather than a
counter, though.
Thanks
Michal
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [RFC PATCH-for-8.2?] accel/tcg: Implement tcg_unregister_thread()
2023-12-04 20:09 ` Michal Suchánek
@ 2023-12-04 23:02 ` Richard Henderson
2023-12-05 10:09 ` Michal Suchánek
0 siblings, 1 reply; 13+ messages in thread
From: Richard Henderson @ 2023-12-04 23:02 UTC (permalink / raw)
To: Michal Suchánek, Stefan Hajnoczi
Cc: Philippe Mathieu-Daudé, qemu-devel, Paolo Bonzini
On 12/4/23 12:09, Michal Suchánek wrote:
> On Mon, Dec 04, 2023 at 02:50:17PM -0500, Stefan Hajnoczi wrote:
>> On Mon, 4 Dec 2023 at 14:40, Philippe Mathieu-Daudé <philmd@linaro.org> wrote:
>>> +void tcg_unregister_thread(void)
>>> +{
>>> + unsigned int n;
>>> +
>>> + n = qatomic_fetch_dec(&tcg_cur_ctxs);
>>> + g_free(tcg_ctxs[n]);
>>> + qatomic_set(&tcg_ctxs[n], NULL);
>>> +}
>>
>> tcg_ctxs[n] may not be our context, so this looks like it could free
>> another thread's context and lead to undefined behavior.
Correct.
> There is cpu->thread_id so perhaps cpu->thread_ctx could be added as
> well. That would require a bitmap of used threads contexts rather than a
> counter, though.
Or don't free the context at all, but re-use it when incrementing and tcg_ctxs[n] != null
(i.e. plugging in a repacement vcpu). After all, there can only be tcg_max_ctxs contexts.
r~
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [RFC PATCH-for-8.2?] accel/tcg: Implement tcg_unregister_thread()
2023-12-04 23:02 ` Richard Henderson
@ 2023-12-05 10:09 ` Michal Suchánek
2023-12-06 12:56 ` Michal Suchánek
0 siblings, 1 reply; 13+ messages in thread
From: Michal Suchánek @ 2023-12-05 10:09 UTC (permalink / raw)
To: Richard Henderson
Cc: Stefan Hajnoczi, Philippe Mathieu-Daudé, qemu-devel,
Paolo Bonzini
On Mon, Dec 04, 2023 at 03:02:45PM -0800, Richard Henderson wrote:
> On 12/4/23 12:09, Michal Suchánek wrote:
> > On Mon, Dec 04, 2023 at 02:50:17PM -0500, Stefan Hajnoczi wrote:
> > > On Mon, 4 Dec 2023 at 14:40, Philippe Mathieu-Daudé <philmd@linaro.org> wrote:
> > > > +void tcg_unregister_thread(void)
> > > > +{
> > > > + unsigned int n;
> > > > +
> > > > + n = qatomic_fetch_dec(&tcg_cur_ctxs);
> > > > + g_free(tcg_ctxs[n]);
> > > > + qatomic_set(&tcg_ctxs[n], NULL);
> > > > +}
> > >
> > > tcg_ctxs[n] may not be our context, so this looks like it could free
> > > another thread's context and lead to undefined behavior.
>
> Correct.
>
> > There is cpu->thread_id so perhaps cpu->thread_ctx could be added as
> > well. That would require a bitmap of used threads contexts rather than a
> > counter, though.
>
> Or don't free the context at all, but re-use it when incrementing and
> tcg_ctxs[n] != null (i.e. plugging in a repacement vcpu). After all, there
> can only be tcg_max_ctxs contexts.
But you would not know which contexts are in use and which aren't without
tracking the association of contexts to CPUs.
Unless there is a cpu array somewhere and you can use the same index for
both to create the association.
Thanks
Michal
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [RFC PATCH-for-8.2?] accel/tcg: Implement tcg_unregister_thread()
2023-12-05 10:09 ` Michal Suchánek
@ 2023-12-06 12:56 ` Michal Suchánek
2023-12-06 14:29 ` Philippe Mathieu-Daudé
0 siblings, 1 reply; 13+ messages in thread
From: Michal Suchánek @ 2023-12-06 12:56 UTC (permalink / raw)
To: Richard Henderson
Cc: Stefan Hajnoczi, Philippe Mathieu-Daudé, qemu-devel,
Paolo Bonzini
On Tue, Dec 05, 2023 at 11:09:59AM +0100, Michal Suchánek wrote:
> On Mon, Dec 04, 2023 at 03:02:45PM -0800, Richard Henderson wrote:
> > On 12/4/23 12:09, Michal Suchánek wrote:
> > > On Mon, Dec 04, 2023 at 02:50:17PM -0500, Stefan Hajnoczi wrote:
> > > > On Mon, 4 Dec 2023 at 14:40, Philippe Mathieu-Daudé <philmd@linaro.org> wrote:
> > > > > +void tcg_unregister_thread(void)
> > > > > +{
> > > > > + unsigned int n;
> > > > > +
> > > > > + n = qatomic_fetch_dec(&tcg_cur_ctxs);
> > > > > + g_free(tcg_ctxs[n]);
> > > > > + qatomic_set(&tcg_ctxs[n], NULL);
> > > > > +}
> > > >
> > > > tcg_ctxs[n] may not be our context, so this looks like it could free
> > > > another thread's context and lead to undefined behavior.
> >
> > Correct.
> >
> > > There is cpu->thread_id so perhaps cpu->thread_ctx could be added as
> > > well. That would require a bitmap of used threads contexts rather than a
> > > counter, though.
> >
> > Or don't free the context at all, but re-use it when incrementing and
> > tcg_ctxs[n] != null (i.e. plugging in a repacement vcpu). After all, there
> > can only be tcg_max_ctxs contexts.
>
> But you would not know which contexts are in use and which aren't without
> tracking the association of contexts to CPUs.
>
> Unless there is a cpu array somewhere and you can use the same index for
> both to create the association.
I tried to use cpu_index for correlating the tcg_ctx with cpu. I added
some asserts that only null contexts are allocated and non-null contexts
released but qemu crashes somewhere in tcg sometime after the guest gets
to switch root.
Thanks
Michal
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [RFC PATCH-for-8.2?] accel/tcg: Implement tcg_unregister_thread()
2023-12-04 19:40 [RFC PATCH-for-8.2?] accel/tcg: Implement tcg_unregister_thread() Philippe Mathieu-Daudé
2023-12-04 19:50 ` Stefan Hajnoczi
@ 2023-12-06 14:17 ` Miguel Luis
2023-12-06 15:25 ` Michal Suchánek
2023-12-06 15:44 ` Alex Bennée
1 sibling, 2 replies; 13+ messages in thread
From: Miguel Luis @ 2023-12-06 14:17 UTC (permalink / raw)
To: Philippe Mathieu-Daudé, qemu-devel
Cc: Richard Henderson, Paolo Bonzini, Michal Suchánek,
Stefan Hajnoczi
Hi!
On 04/12/2023 18:40, Philippe Mathieu-Daudé wrote:
> Unplugging vCPU triggers the following assertion in
> tcg_register_thread():
>
> 796 void tcg_register_thread(void)
> 797 {
> ...
> 812 /* Claim an entry in tcg_ctxs */
> 813 n = qatomic_fetch_inc(&tcg_cur_ctxs);
> 814 g_assert(n < tcg_max_ctxs);
>
> Implement and use tcg_unregister_thread() so when a
> vCPU is unplugged, the tcg_cur_ctxs refcount is
> decremented.
I've had addressed this issue before (posted at [1]) and had exercised
it with vCPU hotplug/unplug patches for ARM although I was not sure about what
would be needed to be done regarding plugins on the context of
tcg_unregister_thread. I guess they would need to be freed also?
> Reported-by: Michal Suchánek <msuchanek@suse.de>
> Suggested-by: Stefan Hajnoczi <stefanha@gmail.com>
> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
> ---
> RFC: untested
> Report: https://lore.kernel.org/qemu-devel/20231204183638.GZ9696@kitsune.suse.cz/
> ---
> include/tcg/startup.h | 5 +++++
> accel/tcg/tcg-accel-ops-mttcg.c | 1 +
> accel/tcg/tcg-accel-ops-rr.c | 1 +
> tcg/tcg.c | 17 +++++++++++++++++
> 4 files changed, 24 insertions(+)
>
> diff --git a/include/tcg/startup.h b/include/tcg/startup.h
> index f71305765c..520942a4a1 100644
> --- a/include/tcg/startup.h
> +++ b/include/tcg/startup.h
> @@ -45,6 +45,11 @@ void tcg_init(size_t tb_size, int splitwx, unsigned max_cpus);
> */
> void tcg_register_thread(void);
>
> +/**
> + * tcg_unregister_thread: Unregister this thread with the TCG runtime
> + */
> +void tcg_unregister_thread(void);
> +
> /**
> * tcg_prologue_init(): Generate the code for the TCG prologue
> *
> diff --git a/accel/tcg/tcg-accel-ops-mttcg.c b/accel/tcg/tcg-accel-ops-mttcg.c
> index fac80095bb..88d7427aad 100644
> --- a/accel/tcg/tcg-accel-ops-mttcg.c
> +++ b/accel/tcg/tcg-accel-ops-mttcg.c
> @@ -120,6 +120,7 @@ static void *mttcg_cpu_thread_fn(void *arg)
>
> tcg_cpus_destroy(cpu);
> qemu_mutex_unlock_iothread();
> + tcg_unregister_thread();
> rcu_remove_force_rcu_notifier(&force_rcu.notifier);
> rcu_unregister_thread();
> return NULL;
> diff --git a/accel/tcg/tcg-accel-ops-rr.c b/accel/tcg/tcg-accel-ops-rr.c
> index 611932f3c3..c2af3aad21 100644
> --- a/accel/tcg/tcg-accel-ops-rr.c
> +++ b/accel/tcg/tcg-accel-ops-rr.c
> @@ -302,6 +302,7 @@ static void *rr_cpu_thread_fn(void *arg)
> rr_deal_with_unplugged_cpus();
> }
>
> + tcg_unregister_thread();
> rcu_remove_force_rcu_notifier(&force_rcu);
> rcu_unregister_thread();
> return NULL;
> diff --git a/tcg/tcg.c b/tcg/tcg.c
> index d2ea22b397..5125342d70 100644
> --- a/tcg/tcg.c
> +++ b/tcg/tcg.c
> @@ -781,11 +781,18 @@ static void alloc_tcg_plugin_context(TCGContext *s)
> * modes.
> */
> #ifdef CONFIG_USER_ONLY
> +
> void tcg_register_thread(void)
> {
> tcg_ctx = &tcg_init_ctx;
> }
> +
> +void tcg_unregister_thread(void)
> +{
> +}
> +
> #else
> +
> void tcg_register_thread(void)
> {
> TCGContext *s = g_malloc(sizeof(*s));
> @@ -814,6 +821,16 @@ void tcg_register_thread(void)
>
> tcg_ctx = s;
> }
> +
> +void tcg_unregister_thread(void)
> +{
> + unsigned int n;
> +
> + n = qatomic_fetch_dec(&tcg_cur_ctxs);
> + g_free(tcg_ctxs[n]);
Is the above off-by-one?
> + qatomic_set(&tcg_ctxs[n], NULL);
> +}
> +
Thank you
Miguel
[1]: https://lore.kernel.org/qemu-devel/20230926103654.34424-5-salil.mehta@huawei.com/
> #endif /* !CONFIG_USER_ONLY */
>
> /* pool based memory allocation */
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [RFC PATCH-for-8.2?] accel/tcg: Implement tcg_unregister_thread()
2023-12-06 12:56 ` Michal Suchánek
@ 2023-12-06 14:29 ` Philippe Mathieu-Daudé
2023-12-06 15:15 ` Stefan Hajnoczi
0 siblings, 1 reply; 13+ messages in thread
From: Philippe Mathieu-Daudé @ 2023-12-06 14:29 UTC (permalink / raw)
To: Michal Suchánek, Richard Henderson
Cc: Stefan Hajnoczi, qemu-devel, Paolo Bonzini
Hi Stefan,
On 6/12/23 13:56, Michal Suchánek wrote:
> On Tue, Dec 05, 2023 at 11:09:59AM +0100, Michal Suchánek wrote:
>> On Mon, Dec 04, 2023 at 03:02:45PM -0800, Richard Henderson wrote:
>>> On 12/4/23 12:09, Michal Suchánek wrote:
>>>> On Mon, Dec 04, 2023 at 02:50:17PM -0500, Stefan Hajnoczi wrote:
>>>>> On Mon, 4 Dec 2023 at 14:40, Philippe Mathieu-Daudé <philmd@linaro.org> wrote:
>>>>>> +void tcg_unregister_thread(void)
>>>>>> +{
>>>>>> + unsigned int n;
>>>>>> +
>>>>>> + n = qatomic_fetch_dec(&tcg_cur_ctxs);
>>>>>> + g_free(tcg_ctxs[n]);
>>>>>> + qatomic_set(&tcg_ctxs[n], NULL);
>>>>>> +}
>>>>>
>>>>> tcg_ctxs[n] may not be our context, so this looks like it could free
>>>>> another thread's context and lead to undefined behavior.
>>>
>>> Correct.
>>>
>>>> There is cpu->thread_id so perhaps cpu->thread_ctx could be added as
>>>> well. That would require a bitmap of used threads contexts rather than a
>>>> counter, though.
>>>
>>> Or don't free the context at all, but re-use it when incrementing and
>>> tcg_ctxs[n] != null (i.e. plugging in a repacement vcpu). After all, there
>>> can only be tcg_max_ctxs contexts.
>>
>> But you would not know which contexts are in use and which aren't without
>> tracking the association of contexts to CPUs.
>>
>> Unless there is a cpu array somewhere and you can use the same index for
>> both to create the association.
>
> I tried to use cpu_index for correlating the tcg_ctx with cpu. I added
> some asserts that only null contexts are allocated and non-null contexts
> released but qemu crashes somewhere in tcg sometime after the guest gets
> to switch root.
Since this isn't trivial and is a long standing issue, let's not
block the 8.2 release with it.
Regards,
Phil.
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [RFC PATCH-for-8.2?] accel/tcg: Implement tcg_unregister_thread()
2023-12-06 14:29 ` Philippe Mathieu-Daudé
@ 2023-12-06 15:15 ` Stefan Hajnoczi
0 siblings, 0 replies; 13+ messages in thread
From: Stefan Hajnoczi @ 2023-12-06 15:15 UTC (permalink / raw)
To: Philippe Mathieu-Daudé
Cc: Michal Suchánek, Richard Henderson, qemu-devel,
Paolo Bonzini
On Wed, 6 Dec 2023 at 09:29, Philippe Mathieu-Daudé <philmd@linaro.org> wrote:
>
> Hi Stefan,
>
> On 6/12/23 13:56, Michal Suchánek wrote:
> > On Tue, Dec 05, 2023 at 11:09:59AM +0100, Michal Suchánek wrote:
> >> On Mon, Dec 04, 2023 at 03:02:45PM -0800, Richard Henderson wrote:
> >>> On 12/4/23 12:09, Michal Suchánek wrote:
> >>>> On Mon, Dec 04, 2023 at 02:50:17PM -0500, Stefan Hajnoczi wrote:
> >>>>> On Mon, 4 Dec 2023 at 14:40, Philippe Mathieu-Daudé <philmd@linaro.org> wrote:
> >>>>>> +void tcg_unregister_thread(void)
> >>>>>> +{
> >>>>>> + unsigned int n;
> >>>>>> +
> >>>>>> + n = qatomic_fetch_dec(&tcg_cur_ctxs);
> >>>>>> + g_free(tcg_ctxs[n]);
> >>>>>> + qatomic_set(&tcg_ctxs[n], NULL);
> >>>>>> +}
> >>>>>
> >>>>> tcg_ctxs[n] may not be our context, so this looks like it could free
> >>>>> another thread's context and lead to undefined behavior.
> >>>
> >>> Correct.
> >>>
> >>>> There is cpu->thread_id so perhaps cpu->thread_ctx could be added as
> >>>> well. That would require a bitmap of used threads contexts rather than a
> >>>> counter, though.
> >>>
> >>> Or don't free the context at all, but re-use it when incrementing and
> >>> tcg_ctxs[n] != null (i.e. plugging in a repacement vcpu). After all, there
> >>> can only be tcg_max_ctxs contexts.
> >>
> >> But you would not know which contexts are in use and which aren't without
> >> tracking the association of contexts to CPUs.
> >>
> >> Unless there is a cpu array somewhere and you can use the same index for
> >> both to create the association.
> >
> > I tried to use cpu_index for correlating the tcg_ctx with cpu. I added
> > some asserts that only null contexts are allocated and non-null contexts
> > released but qemu crashes somewhere in tcg sometime after the guest gets
> > to switch root.
>
> Since this isn't trivial and is a long standing issue, let's not
> block the 8.2 release with it.
Sounds good.
Thanks,
Stefan
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [RFC PATCH-for-8.2?] accel/tcg: Implement tcg_unregister_thread()
2023-12-06 14:17 ` Miguel Luis
@ 2023-12-06 15:25 ` Michal Suchánek
2023-12-06 15:49 ` Miguel Luis
2023-12-06 15:44 ` Alex Bennée
1 sibling, 1 reply; 13+ messages in thread
From: Michal Suchánek @ 2023-12-06 15:25 UTC (permalink / raw)
To: Miguel Luis
Cc: Philippe Mathieu-Daudé, qemu-devel, Richard Henderson,
Paolo Bonzini, Stefan Hajnoczi
On Wed, Dec 06, 2023 at 01:17:08PM -0100, Miguel Luis wrote:
> Hi!
>
> On 04/12/2023 18:40, Philippe Mathieu-Daudé wrote:
> > Unplugging vCPU triggers the following assertion in
> > tcg_register_thread():
> >
> > 796 void tcg_register_thread(void)
> > 797 {
> > ...
> > 812 /* Claim an entry in tcg_ctxs */
> > 813 n = qatomic_fetch_inc(&tcg_cur_ctxs);
> > 814 g_assert(n < tcg_max_ctxs);
> >
> > Implement and use tcg_unregister_thread() so when a
> > vCPU is unplugged, the tcg_cur_ctxs refcount is
> > decremented.
>
>
> I've had addressed this issue before (posted at [1]) and had exercised
> it with vCPU hotplug/unplug patches for ARM although I was not sure about what
> would be needed to be done regarding plugins on the context of
> tcg_unregister_thread. I guess they would need to be freed also?
Doesn't it have the same problem that it will randomly free some context
which is not necessarily associated with the unplugged CPU?
Consider machine with 4 CPUs, they are likely added in order - cpu0
getting context0, cpu1 context1, etc.
Unplug CPU 1. Given that context 3 is top the would be unallocated by
the decrement, or am I missing something?
Thanks
Michal
>
> Thank you
>
> Miguel
>
> [1]: https://lore.kernel.org/qemu-devel/20230926103654.34424-5-salil.mehta@huawei.com/
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [RFC PATCH-for-8.2?] accel/tcg: Implement tcg_unregister_thread()
2023-12-06 14:17 ` Miguel Luis
2023-12-06 15:25 ` Michal Suchánek
@ 2023-12-06 15:44 ` Alex Bennée
1 sibling, 0 replies; 13+ messages in thread
From: Alex Bennée @ 2023-12-06 15:44 UTC (permalink / raw)
To: Miguel Luis
Cc: Philippe Mathieu-Daudé, qemu-devel, Richard Henderson,
Paolo Bonzini, Michal Suchánek, Stefan Hajnoczi
Miguel Luis <miguel.luis@oracle.com> writes:
> Hi!
>
> On 04/12/2023 18:40, Philippe Mathieu-Daudé wrote:
>> Unplugging vCPU triggers the following assertion in
>> tcg_register_thread():
>>
>> 796 void tcg_register_thread(void)
>> 797 {
>> ...
>> 812 /* Claim an entry in tcg_ctxs */
>> 813 n = qatomic_fetch_inc(&tcg_cur_ctxs);
>> 814 g_assert(n < tcg_max_ctxs);
>>
>> Implement and use tcg_unregister_thread() so when a
>> vCPU is unplugged, the tcg_cur_ctxs refcount is
>> decremented.
>
>
> I've had addressed this issue before (posted at [1]) and had exercised
> it with vCPU hotplug/unplug patches for ARM although I was not sure about what
> would be needed to be done regarding plugins on the context of
> tcg_unregister_thread. I guess they would need to be freed also?
Good catch. They should indeed.
>
>
>> Reported-by: Michal Suchánek <msuchanek@suse.de>
>> Suggested-by: Stefan Hajnoczi <stefanha@gmail.com>
>> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
>> ---
<snip>
>> void tcg_register_thread(void)
>> {
>> TCGContext *s = g_malloc(sizeof(*s));
>> @@ -814,6 +821,16 @@ void tcg_register_thread(void)
>>
>> tcg_ctx = s;
>> }
>> +
>> +void tcg_unregister_thread(void)
>> +{
>> + unsigned int n;
>> +
>> + n = qatomic_fetch_dec(&tcg_cur_ctxs);
>> + g_free(tcg_ctxs[n]);
Perhaps a bit of re-factoring and we could have a tcg_alloc_context and
tcg_free_context to keep everything together?
>
>
> Is the above off-by-one?
>
>
>> + qatomic_set(&tcg_ctxs[n], NULL);
>> +}
>> +
>
> Thank you
>
> Miguel
>
> [1]: https://lore.kernel.org/qemu-devel/20230926103654.34424-5-salil.mehta@huawei.com/
>
>
>> #endif /* !CONFIG_USER_ONLY */
>>
>> /* pool based memory allocation */
--
Alex Bennée
Virtualisation Tech Lead @ Linaro
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [RFC PATCH-for-8.2?] accel/tcg: Implement tcg_unregister_thread()
2023-12-06 15:25 ` Michal Suchánek
@ 2023-12-06 15:49 ` Miguel Luis
2023-12-06 16:29 ` Michal Suchánek
0 siblings, 1 reply; 13+ messages in thread
From: Miguel Luis @ 2023-12-06 15:49 UTC (permalink / raw)
To: Michal Suchánek
Cc: Philippe Mathieu-Daudé, qemu-devel@nongnu.org,
Richard Henderson, Paolo Bonzini, Stefan Hajnoczi
> On 6 Dec 2023, at 14:25, Michal Suchánek <msuchanek@suse.de> wrote:
>
> On Wed, Dec 06, 2023 at 01:17:08PM -0100, Miguel Luis wrote:
>> Hi!
>>
>> On 04/12/2023 18:40, Philippe Mathieu-Daudé wrote:
>>> Unplugging vCPU triggers the following assertion in
>>> tcg_register_thread():
>>>
>>> 796 void tcg_register_thread(void)
>>> 797 {
>>> ...
>>> 812 /* Claim an entry in tcg_ctxs */
>>> 813 n = qatomic_fetch_inc(&tcg_cur_ctxs);
>>> 814 g_assert(n < tcg_max_ctxs);
>>>
>>> Implement and use tcg_unregister_thread() so when a
>>> vCPU is unplugged, the tcg_cur_ctxs refcount is
>>> decremented.
>>
>>
>> I've had addressed this issue before (posted at [1]) and had exercised
>> it with vCPU hotplug/unplug patches for ARM although I was not sure about what
>> would be needed to be done regarding plugins on the context of
>> tcg_unregister_thread. I guess they would need to be freed also?
>
> Doesn't it have the same problem that it will randomly free some context
> which is not necessarily associated with the unplugged CPU?
>
> Consider machine with 4 CPUs, they are likely added in order - cpu0
> getting context0, cpu1 context1, etc.
>
> Unplug CPU 1. Given that context 3 is top the would be unallocated by
> the decrement, or am I missing something?
>
I think you’re right and I share of the same opinion that matching a tcg thread
to a vCPU would be handy to solve this and maybe sorting tcg_ctxs after
unregistering the thread.
Thank you
Miguel
> Thanks
>
> Michal
>
>>
>> Thank you
>>
>> Miguel
>>
>> [1]: https://lore.kernel.org/qemu-devel/20230926103654.34424-5-salil.mehta@huawei.com/
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [RFC PATCH-for-8.2?] accel/tcg: Implement tcg_unregister_thread()
2023-12-06 15:49 ` Miguel Luis
@ 2023-12-06 16:29 ` Michal Suchánek
0 siblings, 0 replies; 13+ messages in thread
From: Michal Suchánek @ 2023-12-06 16:29 UTC (permalink / raw)
To: Miguel Luis
Cc: Philippe Mathieu-Daudé, qemu-devel@nongnu.org,
Richard Henderson, Paolo Bonzini, Stefan Hajnoczi
On Wed, Dec 06, 2023 at 03:49:28PM +0000, Miguel Luis wrote:
>
>
> > On 6 Dec 2023, at 14:25, Michal Suchánek <msuchanek@suse.de> wrote:
> >
> > On Wed, Dec 06, 2023 at 01:17:08PM -0100, Miguel Luis wrote:
> >> Hi!
> >>
> >> On 04/12/2023 18:40, Philippe Mathieu-Daudé wrote:
> >>> Unplugging vCPU triggers the following assertion in
> >>> tcg_register_thread():
> >>>
> >>> 796 void tcg_register_thread(void)
> >>> 797 {
> >>> ...
> >>> 812 /* Claim an entry in tcg_ctxs */
> >>> 813 n = qatomic_fetch_inc(&tcg_cur_ctxs);
> >>> 814 g_assert(n < tcg_max_ctxs);
> >>>
> >>> Implement and use tcg_unregister_thread() so when a
> >>> vCPU is unplugged, the tcg_cur_ctxs refcount is
> >>> decremented.
> >>
> >>
> >> I've had addressed this issue before (posted at [1]) and had exercised
> >> it with vCPU hotplug/unplug patches for ARM although I was not sure about what
> >> would be needed to be done regarding plugins on the context of
> >> tcg_unregister_thread. I guess they would need to be freed also?
> >
> > Doesn't it have the same problem that it will randomly free some context
> > which is not necessarily associated with the unplugged CPU?
> >
> > Consider machine with 4 CPUs, they are likely added in order - cpu0
> > getting context0, cpu1 context1, etc.
> >
> > Unplug CPU 1. Given that context 3 is top the would be unallocated by
> > the decrement, or am I missing something?
> >
>
> I think you’re right and I share of the same opinion that matching a tcg thread
> to a vCPU would be handy to solve this and maybe sorting tcg_ctxs after
> unregistering the thread.
Tried to apply the patch. It does not crash right away, and due to virsh
limitation I get only one (8-thread) core to hotplug so it did survive a few
hotplug cycles. After a while of hotplugging it crashed, anyway.
Given the atomic_dec there is probably no expectation that the
processing is sequential.
Thanks
Michal
^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2023-12-06 16:30 UTC | newest]
Thread overview: 13+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-12-04 19:40 [RFC PATCH-for-8.2?] accel/tcg: Implement tcg_unregister_thread() Philippe Mathieu-Daudé
2023-12-04 19:50 ` Stefan Hajnoczi
2023-12-04 20:09 ` Michal Suchánek
2023-12-04 23:02 ` Richard Henderson
2023-12-05 10:09 ` Michal Suchánek
2023-12-06 12:56 ` Michal Suchánek
2023-12-06 14:29 ` Philippe Mathieu-Daudé
2023-12-06 15:15 ` Stefan Hajnoczi
2023-12-06 14:17 ` Miguel Luis
2023-12-06 15:25 ` Michal Suchánek
2023-12-06 15:49 ` Miguel Luis
2023-12-06 16:29 ` Michal Suchánek
2023-12-06 15:44 ` Alex Bennée
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).