* [Qemu-devel] [PATCH] cpus: ignore ESRCH in qemu_cpu_kick_thread()
@ 2019-01-02 14:16 Laurent Vivier
2019-01-02 14:28 ` Philippe Mathieu-Daudé
2019-01-07 23:02 ` Paolo Bonzini
0 siblings, 2 replies; 7+ messages in thread
From: Laurent Vivier @ 2019-01-02 14:16 UTC (permalink / raw)
To: qemu-devel; +Cc: Paolo Bonzini, Richard Henderson, Peter Crosthwaite
We can have a race condition between qemu_cpu_kick_thread() and
qemu_kvm_cpu_thread_fn() when we hotunplug a CPU. In this case,
qemu_cpu_kick_thread() can try to kick a thread that is exiting.
pthread_kill() returns an error and qemu is stopped by an exit(1).
qemu:qemu_cpu_kick_thread: No such process
We can ignore safely this error.
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
---
cpus.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/cpus.c b/cpus.c
index 0ddeeefc14..4717490bd0 100644
--- a/cpus.c
+++ b/cpus.c
@@ -1778,7 +1778,7 @@ static void qemu_cpu_kick_thread(CPUState *cpu)
}
cpu->thread_kicked = true;
err = pthread_kill(cpu->thread->thread, SIG_IPI);
- if (err) {
+ if (err && err != ESRCH) {
fprintf(stderr, "qemu:%s: %s", __func__, strerror(err));
exit(1);
}
--
2.20.1
^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [Qemu-devel] [PATCH] cpus: ignore ESRCH in qemu_cpu_kick_thread()
2019-01-02 14:16 [Qemu-devel] [PATCH] cpus: ignore ESRCH in qemu_cpu_kick_thread() Laurent Vivier
@ 2019-01-02 14:28 ` Philippe Mathieu-Daudé
2019-01-07 23:02 ` Paolo Bonzini
1 sibling, 0 replies; 7+ messages in thread
From: Philippe Mathieu-Daudé @ 2019-01-02 14:28 UTC (permalink / raw)
To: Laurent Vivier, qemu-devel
Cc: Paolo Bonzini, Peter Crosthwaite, Richard Henderson
On 1/2/19 3:16 PM, Laurent Vivier wrote:
> We can have a race condition between qemu_cpu_kick_thread() and
> qemu_kvm_cpu_thread_fn() when we hotunplug a CPU. In this case,
> qemu_cpu_kick_thread() can try to kick a thread that is exiting.
> pthread_kill() returns an error and qemu is stopped by an exit(1).
>
> qemu:qemu_cpu_kick_thread: No such process
>
> We can ignore safely this error.
>
> Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
> ---
> cpus.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/cpus.c b/cpus.c
> index 0ddeeefc14..4717490bd0 100644
> --- a/cpus.c
> +++ b/cpus.c
> @@ -1778,7 +1778,7 @@ static void qemu_cpu_kick_thread(CPUState *cpu)
> }
> cpu->thread_kicked = true;
> err = pthread_kill(cpu->thread->thread, SIG_IPI);
> - if (err) {
> + if (err && err != ESRCH) {
> fprintf(stderr, "qemu:%s: %s", __func__, strerror(err));
> exit(1);
> }
>
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [Qemu-devel] [PATCH] cpus: ignore ESRCH in qemu_cpu_kick_thread()
2019-01-02 14:16 [Qemu-devel] [PATCH] cpus: ignore ESRCH in qemu_cpu_kick_thread() Laurent Vivier
2019-01-02 14:28 ` Philippe Mathieu-Daudé
@ 2019-01-07 23:02 ` Paolo Bonzini
2019-01-08 18:47 ` Laurent Vivier
2019-01-15 19:59 ` Emilio G. Cota
1 sibling, 2 replies; 7+ messages in thread
From: Paolo Bonzini @ 2019-01-07 23:02 UTC (permalink / raw)
To: Laurent Vivier, qemu-devel; +Cc: Richard Henderson, Peter Crosthwaite
On 02/01/19 15:16, Laurent Vivier wrote:
> We can have a race condition between qemu_cpu_kick_thread() and
> qemu_kvm_cpu_thread_fn() when we hotunplug a CPU. In this case,
> qemu_cpu_kick_thread() can try to kick a thread that is exiting.
> pthread_kill() returns an error and qemu is stopped by an exit(1).
>
> qemu:qemu_cpu_kick_thread: No such process
>
> We can ignore safely this error.
>
> Signed-off-by: Laurent Vivier <lvivier@redhat.com>
> ---
> cpus.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/cpus.c b/cpus.c
> index 0ddeeefc14..4717490bd0 100644
> --- a/cpus.c
> +++ b/cpus.c
> @@ -1778,7 +1778,7 @@ static void qemu_cpu_kick_thread(CPUState *cpu)
> }
> cpu->thread_kicked = true;
> err = pthread_kill(cpu->thread->thread, SIG_IPI);
> - if (err) {
> + if (err && err != ESRCH) {
> fprintf(stderr, "qemu:%s: %s", __func__, strerror(err));
> exit(1);
> }
>
You could in principle be sending the signal to another thread, so the
fix is a bit hackish. However, I don't have a better idea that is not
racy. :(
The problem is that qemu_cpu_kick does not use any spinlock or mutex to
synchronize against cpu_remove_sync's qemu_thread_join. I think once
the you reach qemu_cpu_kick in cpu_remove_sync (so if cpu->unplug) you
do not need to reset cpu->thread_kicked anymore, but I don't think
that's enough to fix it.
Paolo
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [Qemu-devel] [PATCH] cpus: ignore ESRCH in qemu_cpu_kick_thread()
2019-01-07 23:02 ` Paolo Bonzini
@ 2019-01-08 18:47 ` Laurent Vivier
2019-01-15 16:34 ` Laurent Vivier
2019-01-15 19:59 ` Emilio G. Cota
1 sibling, 1 reply; 7+ messages in thread
From: Laurent Vivier @ 2019-01-08 18:47 UTC (permalink / raw)
To: Paolo Bonzini, qemu-devel; +Cc: Richard Henderson, Peter Crosthwaite
On 08/01/2019 00:02, Paolo Bonzini wrote:
> On 02/01/19 15:16, Laurent Vivier wrote:
>> We can have a race condition between qemu_cpu_kick_thread() and
>> qemu_kvm_cpu_thread_fn() when we hotunplug a CPU. In this case,
>> qemu_cpu_kick_thread() can try to kick a thread that is exiting.
>> pthread_kill() returns an error and qemu is stopped by an exit(1).
>>
>> qemu:qemu_cpu_kick_thread: No such process
>>
>> We can ignore safely this error.
>>
>> Signed-off-by: Laurent Vivier <lvivier@redhat.com>
>> ---
>> cpus.c | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/cpus.c b/cpus.c
>> index 0ddeeefc14..4717490bd0 100644
>> --- a/cpus.c
>> +++ b/cpus.c
>> @@ -1778,7 +1778,7 @@ static void qemu_cpu_kick_thread(CPUState *cpu)
>> }
>> cpu->thread_kicked = true;
>> err = pthread_kill(cpu->thread->thread, SIG_IPI);
>> - if (err) {
>> + if (err && err != ESRCH) {
>> fprintf(stderr, "qemu:%s: %s", __func__, strerror(err));
>> exit(1);
>> }
>>
>
> You could in principle be sending the signal to another thread, so the
> fix is a bit hackish. However, I don't have a better idea that is not
> racy. :(
>
> The problem is that qemu_cpu_kick does not use any spinlock or mutex to
> synchronize against cpu_remove_sync's qemu_thread_join. I think once
> the you reach qemu_cpu_kick in cpu_remove_sync (so if cpu->unplug) you
> do not need to reset cpu->thread_kicked anymore, but I don't think
> that's enough to fix it.
Will you take the patch through one of your pull requests or should I
add it to the trivial-patches branch?
Thanks,
Laurent
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [Qemu-devel] [PATCH] cpus: ignore ESRCH in qemu_cpu_kick_thread()
2019-01-08 18:47 ` Laurent Vivier
@ 2019-01-15 16:34 ` Laurent Vivier
2019-01-15 18:11 ` Paolo Bonzini
0 siblings, 1 reply; 7+ messages in thread
From: Laurent Vivier @ 2019-01-15 16:34 UTC (permalink / raw)
To: Paolo Bonzini, qemu-devel; +Cc: Peter Crosthwaite, Richard Henderson
On 08/01/2019 19:47, Laurent Vivier wrote:
> On 08/01/2019 00:02, Paolo Bonzini wrote:
>> On 02/01/19 15:16, Laurent Vivier wrote:
>>> We can have a race condition between qemu_cpu_kick_thread() and
>>> qemu_kvm_cpu_thread_fn() when we hotunplug a CPU. In this case,
>>> qemu_cpu_kick_thread() can try to kick a thread that is exiting.
>>> pthread_kill() returns an error and qemu is stopped by an exit(1).
>>>
>>> qemu:qemu_cpu_kick_thread: No such process
>>>
>>> We can ignore safely this error.
>>>
>>> Signed-off-by: Laurent Vivier <lvivier@redhat.com>
>>> ---
>>> cpus.c | 2 +-
>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/cpus.c b/cpus.c
>>> index 0ddeeefc14..4717490bd0 100644
>>> --- a/cpus.c
>>> +++ b/cpus.c
>>> @@ -1778,7 +1778,7 @@ static void qemu_cpu_kick_thread(CPUState *cpu)
>>> }
>>> cpu->thread_kicked = true;
>>> err = pthread_kill(cpu->thread->thread, SIG_IPI);
>>> - if (err) {
>>> + if (err && err != ESRCH) {
>>> fprintf(stderr, "qemu:%s: %s", __func__, strerror(err));
>>> exit(1);
>>> }
>>>
>>
>> You could in principle be sending the signal to another thread, so the
>> fix is a bit hackish. However, I don't have a better idea that is not
>> racy. :(
>>
>> The problem is that qemu_cpu_kick does not use any spinlock or mutex to
>> synchronize against cpu_remove_sync's qemu_thread_join. I think once
>> the you reach qemu_cpu_kick in cpu_remove_sync (so if cpu->unplug) you
>> do not need to reset cpu->thread_kicked anymore, but I don't think
>> that's enough to fix it.
>
> Will you take the patch through one of your pull requests or should I
> add it to the trivial-patches branch?
Paolo?
Thanks,
Laurent
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [Qemu-devel] [PATCH] cpus: ignore ESRCH in qemu_cpu_kick_thread()
2019-01-15 16:34 ` Laurent Vivier
@ 2019-01-15 18:11 ` Paolo Bonzini
0 siblings, 0 replies; 7+ messages in thread
From: Paolo Bonzini @ 2019-01-15 18:11 UTC (permalink / raw)
To: Laurent Vivier, qemu-devel; +Cc: Peter Crosthwaite, Richard Henderson
On 15/01/19 17:34, Laurent Vivier wrote:
> On 08/01/2019 19:47, Laurent Vivier wrote:
>> On 08/01/2019 00:02, Paolo Bonzini wrote:
>>> On 02/01/19 15:16, Laurent Vivier wrote:
>>>> We can have a race condition between qemu_cpu_kick_thread() and
>>>> qemu_kvm_cpu_thread_fn() when we hotunplug a CPU. In this case,
>>>> qemu_cpu_kick_thread() can try to kick a thread that is exiting.
>>>> pthread_kill() returns an error and qemu is stopped by an exit(1).
>>>>
>>>> qemu:qemu_cpu_kick_thread: No such process
>>>>
>>>> We can ignore safely this error.
>>>>
>>>> Signed-off-by: Laurent Vivier <lvivier@redhat.com>
>>>> ---
>>>> cpus.c | 2 +-
>>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>>>
>>>> diff --git a/cpus.c b/cpus.c
>>>> index 0ddeeefc14..4717490bd0 100644
>>>> --- a/cpus.c
>>>> +++ b/cpus.c
>>>> @@ -1778,7 +1778,7 @@ static void qemu_cpu_kick_thread(CPUState *cpu)
>>>> }
>>>> cpu->thread_kicked = true;
>>>> err = pthread_kill(cpu->thread->thread, SIG_IPI);
>>>> - if (err) {
>>>> + if (err && err != ESRCH) {
>>>> fprintf(stderr, "qemu:%s: %s", __func__, strerror(err));
>>>> exit(1);
>>>> }
>>>>
>>>
>>> You could in principle be sending the signal to another thread, so the
>>> fix is a bit hackish. However, I don't have a better idea that is not
>>> racy. :(
>>>
>>> The problem is that qemu_cpu_kick does not use any spinlock or mutex to
>>> synchronize against cpu_remove_sync's qemu_thread_join. I think once
>>> the you reach qemu_cpu_kick in cpu_remove_sync (so if cpu->unplug) you
>>> do not need to reset cpu->thread_kicked anymore, but I don't think
>>> that's enough to fix it.
>>
>> Will you take the patch through one of your pull requests or should I
>> add it to the trivial-patches branch?
>
> Paolo?
I queued it now, thanks.
Paolo
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [Qemu-devel] [PATCH] cpus: ignore ESRCH in qemu_cpu_kick_thread()
2019-01-07 23:02 ` Paolo Bonzini
2019-01-08 18:47 ` Laurent Vivier
@ 2019-01-15 19:59 ` Emilio G. Cota
1 sibling, 0 replies; 7+ messages in thread
From: Emilio G. Cota @ 2019-01-15 19:59 UTC (permalink / raw)
To: Paolo Bonzini
Cc: Laurent Vivier, qemu-devel, Peter Crosthwaite, Richard Henderson
On Tue, Jan 08, 2019 at 00:02:36 +0100, Paolo Bonzini wrote:
> On 02/01/19 15:16, Laurent Vivier wrote:
> > We can have a race condition between qemu_cpu_kick_thread() and
> > qemu_kvm_cpu_thread_fn() when we hotunplug a CPU. In this case,
> > qemu_cpu_kick_thread() can try to kick a thread that is exiting.
> > pthread_kill() returns an error and qemu is stopped by an exit(1).
> >
> > qemu:qemu_cpu_kick_thread: No such process
> >
> > We can ignore safely this error.
> >
> > Signed-off-by: Laurent Vivier <lvivier@redhat.com>
> > ---
> > cpus.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/cpus.c b/cpus.c
> > index 0ddeeefc14..4717490bd0 100644
> > --- a/cpus.c
> > +++ b/cpus.c
> > @@ -1778,7 +1778,7 @@ static void qemu_cpu_kick_thread(CPUState *cpu)
> > }
> > cpu->thread_kicked = true;
> > err = pthread_kill(cpu->thread->thread, SIG_IPI);
> > - if (err) {
> > + if (err && err != ESRCH) {
> > fprintf(stderr, "qemu:%s: %s", __func__, strerror(err));
> > exit(1);
> > }
> >
>
> You could in principle be sending the signal to another thread, so the
> fix is a bit hackish. However, I don't have a better idea that is not
> racy. :(
>
> The problem is that qemu_cpu_kick does not use any spinlock or mutex to
> synchronize against cpu_remove_sync's qemu_thread_join. I think once
> the you reach qemu_cpu_kick in cpu_remove_sync (so if cpu->unplug) you
> do not need to reset cpu->thread_kicked anymore, but I don't think
> that's enough to fix it.
I think the per-cpu lock series[1] can help here. For instance, in
qemu_cpu_kick_thread we can acquire the CPU lock, then check cpu->unplug.
If it's set, then we don't send the signal, because the thread is on its
way out. If it isn't set, then we send the signal while still holding the
CPU lock. This guarantees that the thread exists, since cpu_remove_sync
will acquire the CPU lock to set cpu->unplug.
Thanks,
Emilio
[1] https://lists.gnu.org/archive/html/qemu-devel/2018-12/msg02979.html
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2019-01-15 19:59 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2019-01-02 14:16 [Qemu-devel] [PATCH] cpus: ignore ESRCH in qemu_cpu_kick_thread() Laurent Vivier
2019-01-02 14:28 ` Philippe Mathieu-Daudé
2019-01-07 23:02 ` Paolo Bonzini
2019-01-08 18:47 ` Laurent Vivier
2019-01-15 16:34 ` Laurent Vivier
2019-01-15 18:11 ` Paolo Bonzini
2019-01-15 19:59 ` Emilio G. Cota
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).