From: Bin Wu <wu.wubin@huawei.com>
To: Paolo Bonzini <pbonzini@redhat.com>, qemu-devel@nongnu.org
Cc: kwolf@redhat.com, famz@redhat.com, boby.chen@huawei.com,
	subo7@huawei.com, kathy.wangting@huawei.com,
	rudy.zhangmin@huawei.com, arei.gonglei@huawei.com,
	stefanha@redhat.com, bruce.fon@huawei.com
Subject: Re: [Qemu-devel] [PATCH v2] qemu-coroutine: segfault when restarting co_queue
Date: Tue, 10 Feb 2015 08:55:50 +0800
Message-ID: <54D95716.8010300@huawei.com>
In-Reply-To: <54D87957.4060303@redhat.com>

On 2015/2/9 17:09, Paolo Bonzini wrote:
> 
> 
> On 09/02/2015 07:50, Bin Wu wrote:
>> From: Bin Wu <wu.wubin@huawei.com>
>>
>> We tested migrating VMs together with their disk images using
>> drive_mirror. During the migration, two VMs copied large files between
>> each other. During the test, a segfault occurred. The backtrace was as
>> follows:
>>
>> (gdb) bt
>> qemu-coroutine-lock.c:66
>> to=0x7fa5a1798648) at qemu-coroutine.c:97
>> request=0x7fa28c2ffa10, reply=0x7fa28c2ffa30, qiov=0x0, offset=0) at
>> block/nbd-client.c:165
>> sector_num=8552704, nb_sectors=2040, qiov=0x7fa5a1757468, offset=0) at
>> block/nbd-client.c:262
>> sector_num=8552704, nb_sectors=2048, qiov=0x7fa5a1757468) at
>> block/nbd-client.c:296
>> nb_sectors=2048, qiov=0x7fa5a1757468) at block/nbd.c:291
>> req=0x7fa28c2ffbb0, offset=4378984448, bytes=1048576, qiov=0x7fa5a1757468,
>> flags=0) at block.c:3321
>> offset=4378984448, bytes=1048576, qiov=0x7fa5a1757468, flags=(unknown: 0)) at
>> block.c:3447
>> sector_num=8552704, nb_sectors=2048, qiov=0x7fa5a1757468, flags=(unknown: 0)) at
>> block.c:3471
>> nb_sectors=2048, qiov=0x7fa5a1757468) at block.c:3480
>> nb_sectors=2048, qiov=0x7fa5a1757468) at block/raw_bsd.c:62
>> req=0x7fa28c2ffe30, offset=4378984448, bytes=1048576, qiov=0x7fa5a1757468,
>> flags=0) at block.c:3321
>> offset=4378984448, bytes=1048576, qiov=0x7fa5a1757468, flags=(unknown: 0)) at
>> block.c:3447
>> sector_num=8552704, nb_sectors=2048, qiov=0x7fa5a1757468, flags=(unknown: 0)) at
>> block.c:3471
>> coroutine-ucontext.c:121
>>
>> After analyzing the stack and reviewing the code, we found that
>> qemu_co_queue_run_restart should not be called from coroutine_swap, which
>> can be invoked by either qemu_coroutine_enter or qemu_coroutine_yield. Only
>> qemu_coroutine_enter needs to restart the co_queue.
>>
>> The error scenario is as follows: coroutine C1 enters C2, C2 yields
>> back to C1, then C1 terminates and the memory of C1 becomes invalid.
>> After a while, the C2 coroutine is entered again. At this point, C2
>> resumes inside coroutine_swap, where C1 is still the argument passed to
>> qemu_co_queue_run_restart. Therefore, qemu_co_queue_run_restart
>> accesses invalid memory and a segfault occurs.
>>
>> The qemu_co_queue_run_restart function re-enters coroutines waiting
>> in the co_queue. However, this function should only be used in the
>> qemu_coroutine_enter context. Only in this context, when the current
>> coroutine gets execution control again (after the execution of
>> qemu_coroutine_switch), can we restart the target coroutine, because the
>> target coroutine has either yielded back to the current coroutine or
>> terminated.
> 
> qemu_coroutine_yield can be executed for other reasons than locks.  In
> those cases, it is correct to call qemu_co_queue_run_restart.  I think
> it's an NBD bug.
> 
> Paolo
> 

Maybe I didn't describe the error scenario clearly, but this is a normal
coroutine use case, not an NBD bug. Please refer to Stefan's reply, thanks.
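
To make the scenario more concrete, here is a minimal model of it. This is
only a sketch: it does no real context switching, and Co, SwapFrame,
run_restart, resume_yield_frame and replay_scenario are hypothetical names
used for illustration, not QEMU code. A dead coroutine is modelled with an
"alive" flag instead of freed memory, so the model itself stays well defined.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for QEMU's Coroutine: it only tracks liveness. */
typedef struct Co {
    const char *name;
    bool alive;
} Co;

/* Model of a coroutine_swap(from, to) call suspended inside a yield. */
typedef struct SwapFrame {
    Co *from;
    Co *to; /* captured when the yield happened; may be stale on resume */
} SwapFrame;

static int stale_accesses;

/* Stand-in for qemu_co_queue_run_restart(): it has to touch `co`. */
static void run_restart(Co *co)
{
    if (!co->alive) {
        /* In real QEMU this would be a use-after-free, not a counter. */
        stale_accesses++;
        fprintf(stderr, "restart touched dead coroutine %s\n", co->name);
    }
}

/* What the suspended yield frame does once its coroutine is re-entered. */
static void resume_yield_frame(const SwapFrame *f, bool restart_in_swap)
{
    if (restart_in_swap) {
        run_restart(f->to); /* old placement: unconditional in swap */
    }
    /* new placement: the yield path never restarts `to` */
}

/* Replays the scenario and returns the number of stale accesses. */
int replay_scenario(bool restart_in_swap)
{
    Co c1 = { "C1", true };
    Co c2 = { "C2", true };
    SwapFrame c2_yield = { &c2, &c1 }; /* C2 yields back to C1 */

    stale_accesses = 0;

    /* C1 regains control in its enter path: restarting C2 is valid. */
    assert(c2.alive);
    run_restart(&c2);

    /* C1 terminates; from here on its memory must not be used. */
    c1.alive = false;

    /* Later, C2 is entered again and resumes after its switch. */
    resume_yield_frame(&c2_yield, restart_in_swap);
    return stale_accesses;
}
```

In this model, replay_scenario(true) corresponds to the code before the patch
and counts one stale access, while replay_scenario(false) corresponds to the
patched placement and counts none.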

>> At first we wanted to put qemu_co_queue_run_restart in qemu_coroutine_enter,
>> but we found that we cannot access the target coroutine if it terminates.
>>
>> Signed-off-by: Bin Wu <wu.wubin@huawei.com>
>> ---
>>  qemu-coroutine.c | 16 ++++++++++------
>>  1 file changed, 10 insertions(+), 6 deletions(-)
>>
>> diff --git a/qemu-coroutine.c b/qemu-coroutine.c
>> index 525247b..cc0bdfa 100644
>> --- a/qemu-coroutine.c
>> +++ b/qemu-coroutine.c
>> @@ -99,29 +99,31 @@ static void coroutine_delete(Coroutine *co)
>>      qemu_coroutine_delete(co);
>>  }
>>  
>> -static void coroutine_swap(Coroutine *from, Coroutine *to)
>> +static CoroutineAction coroutine_swap(Coroutine *from, Coroutine *to)
>>  {
>>      CoroutineAction ret;
>>  
>>      ret = qemu_coroutine_switch(from, to, COROUTINE_YIELD);
>>  
>> -    qemu_co_queue_run_restart(to);
>> -
>>      switch (ret) {
>>      case COROUTINE_YIELD:
>> -        return;
>> +        break;
>>      case COROUTINE_TERMINATE:
>>          trace_qemu_coroutine_terminate(to);
>> +        qemu_co_queue_run_restart(to);
>>          coroutine_delete(to);
>> -        return;
>> +        break;
>>      default:
>>          abort();
>>      }
>> +
>> +    return ret;
>>  }
>>  
>>  void qemu_coroutine_enter(Coroutine *co, void *opaque)
>>  {
>>      Coroutine *self = qemu_coroutine_self();
>> +    CoroutineAction ret;
>>  
>>      trace_qemu_coroutine_enter(self, co, opaque);
>>  
>> @@ -132,7 +134,9 @@ void qemu_coroutine_enter(Coroutine *co, void *opaque)
>>  
>>      co->caller = self;
>>      co->entry_arg = opaque;
>> -    coroutine_swap(self, co);
>> +    ret = coroutine_swap(self, co);
>> +    if (ret == COROUTINE_YIELD)
>> +        qemu_co_queue_run_restart(co);
>>  }
>>  
>>  void coroutine_fn qemu_coroutine_yield(void)
>>

-- 
Bin Wu
