qemu-devel.nongnu.org archive mirror
* Re: [Qemu-devel] qemu 2.0 segfaults in event notifier
       [not found]   ` <606EBA1F-638A-487D-8551-8D183D79937E@profihost.ag>
@ 2014-06-02 13:40     ` Stefan Hajnoczi
  2014-06-02 14:22       ` Stefan Priebe - Profihost AG
  2014-06-02 19:32       ` Stefan Priebe
  0 siblings, 2 replies; 6+ messages in thread
From: Stefan Hajnoczi @ 2014-06-02 13:40 UTC
  To: Stefan Priebe; +Cc: famz@redhat.com, qemu-devel, qemu-stable@nongnu.org

On Fri, May 30, 2014 at 04:10:39PM +0200, Stefan Priebe wrote:
> even with
> +From 271c0f68b4eae72691721243a1c37f46a3232d61 Mon Sep 17 00:00:00 2001
> +From: Fam Zheng <famz@redhat.com>
> +Date: Wed, 21 May 2014 10:42:13 +0800
> +Subject: [PATCH] aio: Fix use-after-free in cancellation path
> 
> applied, I saw a segfault today with the following backtrace:
> 
> Program terminated with signal 11, Segmentation fault.
> #0  0x00007f9dd633343f in event_notifier_set (e=0x124) at util/event_notifier-posix.c:97
> 97      util/event_notifier-posix.c: No such file or directory.
> (gdb) bt
> #0  0x00007f9dd633343f in event_notifier_set (e=0x124) at util/event_notifier-posix.c:97
> #1  0x00007f9dd5f4eafc in aio_notify (ctx=0x0) at async.c:246
> #2  0x00007f9dd5f4e697 in qemu_bh_schedule (bh=0x7f9b98eeeb30) at async.c:128
> #3  0x00007f9dd5fa2c44 in rbd_finish_aiocb (c=0x7f9dd9069ad0, rcb=0x7f9dd85f1770) at block/rbd.c:585

Hi Stefan,
Please print the QEMUBH:
(gdb) p *(QEMUBH*)0x7f9b98eeeb30

It would also be interesting to print out the qemu_aio_context->first_bh
linked list of QEMUBH structs to check whether 0x7f9b98eeeb30 is on the
list.

The aio_bh_new() and qemu_bh_schedule() APIs are supposed to be
thread-safe.  In theory the rbd.c code is fine.  But maybe there is a
race condition somewhere.

If you want to debug interactively, ping me on #qemu on irc.oftc.net.

Stefan

* Re: [Qemu-devel] qemu 2.0 segfaults in event notifier
  2014-06-02 13:40     ` [Qemu-devel] qemu 2.0 segfaults in event notifier Stefan Hajnoczi
@ 2014-06-02 14:22       ` Stefan Priebe - Profihost AG
  2014-06-02 19:32       ` Stefan Priebe
  1 sibling, 0 replies; 6+ messages in thread
From: Stefan Priebe - Profihost AG @ 2014-06-02 14:22 UTC
  To: Stefan Hajnoczi; +Cc: famz@redhat.com, qemu-devel, qemu-stable@nongnu.org


> On 02.06.2014 at 15:40, Stefan Hajnoczi <stefanha@gmail.com> wrote:
> 
>> On Fri, May 30, 2014 at 04:10:39PM +0200, Stefan Priebe wrote:
>> even with
>> +From 271c0f68b4eae72691721243a1c37f46a3232d61 Mon Sep 17 00:00:00 2001
>> +From: Fam Zheng <famz@redhat.com>
>> +Date: Wed, 21 May 2014 10:42:13 +0800
>> +Subject: [PATCH] aio: Fix use-after-free in cancellation path
>> 
>> applied, I saw a segfault today with the following backtrace:
>> 
>> Program terminated with signal 11, Segmentation fault.
>> #0  0x00007f9dd633343f in event_notifier_set (e=0x124) at util/event_notifier-posix.c:97
>> 97      util/event_notifier-posix.c: No such file or directory.
>> (gdb) bt
>> #0  0x00007f9dd633343f in event_notifier_set (e=0x124) at util/event_notifier-posix.c:97
>> #1  0x00007f9dd5f4eafc in aio_notify (ctx=0x0) at async.c:246
>> #2  0x00007f9dd5f4e697 in qemu_bh_schedule (bh=0x7f9b98eeeb30) at async.c:128
>> #3  0x00007f9dd5fa2c44 in rbd_finish_aiocb (c=0x7f9dd9069ad0, rcb=0x7f9dd85f1770) at block/rbd.c:585
> 
> Hi Stefan,
> Please print the QEMUBH:
> (gdb) p *(QEMUBH*)0x7f9b98eeeb30
> 
> It would also be interesting to print out the qemu_aio_context->first_bh
> linked list of QEMUBH structs to check whether 0x7f9b98eeeb30 is on the
> list.
> 
> The aio_bh_new() and qemu_bh_schedule() APIs are supposed to be
> thread-safe.  In theory the rbd.c code is fine.  But maybe there is a
> race condition somewhere.
> 
> If you want to debug interactively, ping me on #qemu on irc.oftc.net.

Hi,

That would be great. What's your username? I'm traveling right now; I will be on IRC in 4-5 hours, or tomorrow in about 16 hours.

Greets,
Stefan

* Re: [Qemu-devel] qemu 2.0 segfaults in event notifier
  2014-06-02 13:40     ` [Qemu-devel] qemu 2.0 segfaults in event notifier Stefan Hajnoczi
  2014-06-02 14:22       ` Stefan Priebe - Profihost AG
@ 2014-06-02 19:32       ` Stefan Priebe
  2014-06-02 20:45         ` Paolo Bonzini
  2014-06-03  9:14         ` Stefan Hajnoczi
  1 sibling, 2 replies; 6+ messages in thread
From: Stefan Priebe @ 2014-06-02 19:32 UTC
  To: Stefan Hajnoczi; +Cc: famz@redhat.com, qemu-devel, qemu-stable@nongnu.org

On 02.06.2014 15:40, Stefan Hajnoczi wrote:
> On Fri, May 30, 2014 at 04:10:39PM +0200, Stefan Priebe wrote:
>> even with
>> +From 271c0f68b4eae72691721243a1c37f46a3232d61 Mon Sep 17 00:00:00 2001
>> +From: Fam Zheng <famz@redhat.com>
>> +Date: Wed, 21 May 2014 10:42:13 +0800
>> +Subject: [PATCH] aio: Fix use-after-free in cancellation path
>>
>> applied, I saw a segfault today with the following backtrace:
>>
>> Program terminated with signal 11, Segmentation fault.
>> #0  0x00007f9dd633343f in event_notifier_set (e=0x124) at util/event_notifier-posix.c:97
>> 97      util/event_notifier-posix.c: No such file or directory.
>> (gdb) bt
>> #0  0x00007f9dd633343f in event_notifier_set (e=0x124) at util/event_notifier-posix.c:97
>> #1  0x00007f9dd5f4eafc in aio_notify (ctx=0x0) at async.c:246
>> #2  0x00007f9dd5f4e697 in qemu_bh_schedule (bh=0x7f9b98eeeb30) at async.c:128
>> #3  0x00007f9dd5fa2c44 in rbd_finish_aiocb (c=0x7f9dd9069ad0, rcb=0x7f9dd85f1770) at block/rbd.c:585
>
> Hi Stefan,
> Please print the QEMUBH:
> (gdb) p *(QEMUBH*)0x7f9b98eeeb30

new trace:
(gdb) bt
#0  0x00007f69e421c43f in event_notifier_set (e=0x124) at util/event_notifier-posix.c:97
#1  0x00007f69e3e37afc in aio_notify (ctx=0x0) at async.c:246
#2  0x00007f69e3e37697 in qemu_bh_schedule (bh=0x7f5dac217f60) at async.c:128
#3  0x00007f69e3e8bc44 in rbd_finish_aiocb (c=0x7f5dac0c3f30, rcb=0x7f5dafa50610) at block/rbd.c:585
#4  0x00007f69e17bee44 in librbd::AioCompletion::complete() () from /usr/lib/librbd.so.1
#5  0x00007f69e17be832 in librbd::AioCompletion::complete_request(CephContext*, long) () from /usr/lib/librbd.so.1
#6  0x00007f69e1c946ba in Context::complete(int) () from /usr/lib/librados.so.2
#7  0x00007f69e17f1e85 in ObjectCacher::C_WaitForWrite::finish(int) () from /usr/lib/librbd.so.1
#8  0x00007f69e1c946ba in Context::complete(int) () from /usr/lib/librados.so.2
#9  0x00007f69e1d373c8 in Finisher::finisher_thread_entry() () from /usr/lib/librados.so.2
#10 0x00007f69dbd43b50 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#11 0x00007f69dba8e13d in clone () from /lib/x86_64-linux-gnu/libc.so.6
#12 0x0000000000000000 in ?? ()

This is another core dump, so the addresses differ:
(gdb) p *(QEMUBH*)0x7f5dac217f60
$1 = {ctx = 0x0, cb = 0x7f69e3e8bb75 <rbd_finish_bh>, opaque = 0x7f5dafa50610, next = 0x7f69e6b04d10, scheduled = false,
   idle = false, deleted = true}

> It would also be interesting to print out the qemu_aio_context->first_bh
> linked list of QEMUBH structs to check whether 0x7f9b98eeeb30 is on the
> list.

Do you mean just this:
(gdb) p *(QEMUBH*)qemu_aio_context->first_bh
$3 = {ctx = 0x7f69e68a4e00, cb = 0x7f69e41546a5 <virtio_net_tx_bh>, opaque = 0x7f69e6b4a5e0, next = 0x7f69e6b4a570,
   scheduled = false, idle = false, deleted = false}

> The aio_bh_new() and qemu_bh_schedule() APIs are supposed to be
> thread-safe.  In theory the rbd.c code is fine.  But maybe there is a
> race condition somewhere.

rbd.c was fine with QEMU 1.7.0.

Stefan

* Re: [Qemu-devel] qemu 2.0 segfaults in event notifier
  2014-06-02 19:32       ` Stefan Priebe
@ 2014-06-02 20:45         ` Paolo Bonzini
  2014-06-02 20:57           ` Stefan Priebe
  2014-06-03  9:14         ` Stefan Hajnoczi
  1 sibling, 1 reply; 6+ messages in thread
From: Paolo Bonzini @ 2014-06-02 20:45 UTC
  To: Stefan Priebe, Stefan Hajnoczi
  Cc: famz@redhat.com, qemu-devel, qemu-stable@nongnu.org

On 02/06/2014 21:32, Stefan Priebe wrote:
>
> #0  0x00007f69e421c43f in event_notifier_set (e=0x124) at util/event_notifier-posix.c:97
> #1  0x00007f69e3e37afc in aio_notify (ctx=0x0) at async.c:246
> #2  0x00007f69e3e37697 in qemu_bh_schedule (bh=0x7f5dac217f60) at async.c:128
> #3  0x00007f69e3e8bc44 in rbd_finish_aiocb (c=0x7f5dac0c3f30, rcb=0x7f5dafa50610) at block/rbd.c:585
> #4  0x00007f69e17bee44 in librbd::AioCompletion::complete() () from /usr/lib/librbd.so.1
> #5  0x00007f69e17be832 in librbd::AioCompletion::complete_request(CephContext*, long) () from /usr/lib/librbd.so.1
> #6  0x00007f69e1c946ba in Context::complete(int) () from /usr/lib/librados.so.2
> #7  0x00007f69e17f1e85 in ObjectCacher::C_WaitForWrite::finish(int) () from /usr/lib/librbd.so.1
> #8  0x00007f69e1c946ba in Context::complete(int) () from /usr/lib/librados.so.2
> #9  0x00007f69e1d373c8 in Finisher::finisher_thread_entry() () from /usr/lib/librados.so.2
> #10 0x00007f69dbd43b50 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
> #11 0x00007f69dba8e13d in clone () from /lib/x86_64-linux-gnu/libc.so.6
> #12 0x0000000000000000 in ?? ()

Can you also print qemu_aio_context?  Also print the backtrace of all 
threads, using "thread apply all bt full".

Paolo

* Re: [Qemu-devel] qemu 2.0 segfaults in event notifier
  2014-06-02 20:45         ` Paolo Bonzini
@ 2014-06-02 20:57           ` Stefan Priebe
  0 siblings, 0 replies; 6+ messages in thread
From: Stefan Priebe @ 2014-06-02 20:57 UTC
  To: Paolo Bonzini, Stefan Hajnoczi
  Cc: famz@redhat.com, qemu-devel, qemu-stable@nongnu.org

On 02.06.2014 22:45, Paolo Bonzini wrote:
> On 02/06/2014 21:32, Stefan Priebe wrote:
>>
>> #0  0x00007f69e421c43f in event_notifier_set (e=0x124) at util/event_notifier-posix.c:97
>> #1  0x00007f69e3e37afc in aio_notify (ctx=0x0) at async.c:246
>> #2  0x00007f69e3e37697 in qemu_bh_schedule (bh=0x7f5dac217f60) at async.c:128
>> #3  0x00007f69e3e8bc44 in rbd_finish_aiocb (c=0x7f5dac0c3f30, rcb=0x7f5dafa50610) at block/rbd.c:585
>> #4  0x00007f69e17bee44 in librbd::AioCompletion::complete() () from /usr/lib/librbd.so.1
>> #5  0x00007f69e17be832 in librbd::AioCompletion::complete_request(CephContext*, long) () from /usr/lib/librbd.so.1
>> #6  0x00007f69e1c946ba in Context::complete(int) () from /usr/lib/librados.so.2
>> #7  0x00007f69e17f1e85 in ObjectCacher::C_WaitForWrite::finish(int) () from /usr/lib/librbd.so.1
>> #8  0x00007f69e1c946ba in Context::complete(int) () from /usr/lib/librados.so.2
>> #9  0x00007f69e1d373c8 in Finisher::finisher_thread_entry() () from /usr/lib/librados.so.2
>> #10 0x00007f69dbd43b50 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
>> #11 0x00007f69dba8e13d in clone () from /lib/x86_64-linux-gnu/libc.so.6
>> #12 0x0000000000000000 in ?? ()
>
> Can you also print qemu_aio_context?

(gdb) print  qemu_aio_context
$1 = (AioContext *) 0x7f69e68a4e00

(gdb) print *(AioContext*)0x7f69e68a4e00
$2 = {source = {callback_data = 0x0, callback_funcs = 0x0, source_funcs = 0x7f69e462d020, ref_count = 2, context = 0x7f69e68a5190,
     priority = 0, flags = 1, source_id = 1, poll_fds = 0x7f69e686aea0, prev = 0x0, next = 0x7f69e743ccd0, name = 0x0, priv = 0x0},
   lock = {lock = {lock = {__data = {__lock = 0, __count = 0, __owner = 0, __nusers = 0, __kind = 2, __spins = 0, __list = {
             __prev = 0x0, __next = 0x0}}, __size = '\000' <repeats 16 times>, "\002", '\000' <repeats 22 times>, __align = 0}},
     head = 0, tail = 0, cond = {cond = {__data = {__lock = 0, __futex = 0, __total_seq = 0, __wakeup_seq = 0, __woken_seq = 0,
           __mutex = 0x0, __nwaiters = 0, __broadcast_seq = 0}, __size = '\000' <repeats 47 times>, __align = 0}}, owner_thread = {
       thread = 0}, nesting = 0, cb = 0x7f69e3e37b4f <aio_rfifolock_cb>, cb_opaque = 0x7f69e68a4e00}, aio_handlers = {
     lh_first = 0x7f69e68a4f60}, walking_handlers = 0, bh_lock = {lock = {__data = {__lock = 0, __count = 0, __owner = 0,
         __nusers = 0, __kind = 2, __spins = 0, __list = {__prev = 0x0, __next = 0x0}},
       __size = '\000' <repeats 16 times>, "\002", '\000' <repeats 22 times>, __align = 0}}, first_bh = 0x7f69e6b04d10,
   walking_bh = 0, notifier = {rfd = 4, wfd = 4}, pollfds = 0x7f69e68a4630, thread_pool = 0x0, tlg = {tl = {0x7f69e68a4fa0,
       0x7f69e68a5010, 0x7f69e68a5080}}}

>  Also print the backtrace of all
> threads, using "thread apply all bt full".

http://pastebin.com/raw.php?i=uzcpN0zk

Thanks,
Stefan

* Re: [Qemu-devel] qemu 2.0 segfaults in event notifier
  2014-06-02 19:32       ` Stefan Priebe
  2014-06-02 20:45         ` Paolo Bonzini
@ 2014-06-03  9:14         ` Stefan Hajnoczi
  1 sibling, 0 replies; 6+ messages in thread
From: Stefan Hajnoczi @ 2014-06-03  9:14 UTC
  To: Stefan Priebe; +Cc: famz@redhat.com, qemu-devel, qemu-stable@nongnu.org

On Mon, Jun 02, 2014 at 09:32:55PM +0200, Stefan Priebe wrote:
> On 02.06.2014 15:40, Stefan Hajnoczi wrote:
> >On Fri, May 30, 2014 at 04:10:39PM +0200, Stefan Priebe wrote:
> new trace:
> (gdb) bt
> #0  0x00007f69e421c43f in event_notifier_set (e=0x124) at util/event_notifier-posix.c:97
> #1  0x00007f69e3e37afc in aio_notify (ctx=0x0) at async.c:246
> #2  0x00007f69e3e37697 in qemu_bh_schedule (bh=0x7f5dac217f60) at async.c:128
> #3  0x00007f69e3e8bc44 in rbd_finish_aiocb (c=0x7f5dac0c3f30, rcb=0x7f5dafa50610) at block/rbd.c:585
> #4  0x00007f69e17bee44 in librbd::AioCompletion::complete() () from /usr/lib/librbd.so.1
> #5  0x00007f69e17be832 in librbd::AioCompletion::complete_request(CephContext*, long) () from /usr/lib/librbd.so.1
> #6  0x00007f69e1c946ba in Context::complete(int) () from /usr/lib/librados.so.2
> #7  0x00007f69e17f1e85 in ObjectCacher::C_WaitForWrite::finish(int) () from /usr/lib/librbd.so.1
> #8  0x00007f69e1c946ba in Context::complete(int) () from /usr/lib/librados.so.2
> #9  0x00007f69e1d373c8 in Finisher::finisher_thread_entry() () from /usr/lib/librados.so.2
> #10 0x00007f69dbd43b50 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
> #11 0x00007f69dba8e13d in clone () from /lib/x86_64-linux-gnu/libc.so.6
> #12 0x0000000000000000 in ?? ()
> 
> This is another core dump, so the addresses differ:
> (gdb) p *(QEMUBH*)0x7f5dac217f60
> $1 = {ctx = 0x0, cb = 0x7f69e3e8bb75 <rbd_finish_bh>, opaque = 0x7f5dafa50610, next = 0x7f69e6b04d10, scheduled = false,
>   idle = false, deleted = true}

Thanks, this revealed the bug.

I will CC you on a fix.  Please try it out and reply with "Tested-by:
Stefan Priebe <s.priebe@profihost.ag>" if it works.

Stefan

Thread overview: 6+ messages
     [not found] <53863BC6.3040108@profihost.ag>
     [not found] ` <53863C9A.4040905@profihost.ag>
     [not found]   ` <606EBA1F-638A-487D-8551-8D183D79937E@profihost.ag>
2014-06-02 13:40     ` [Qemu-devel] qemu 2.0 segfaults in event notifier Stefan Hajnoczi
2014-06-02 14:22       ` Stefan Priebe - Profihost AG
2014-06-02 19:32       ` Stefan Priebe
2014-06-02 20:45         ` Paolo Bonzini
2014-06-02 20:57           ` Stefan Priebe
2014-06-03  9:14         ` Stefan Hajnoczi