From: Cornelia Huck <cornelia.huck@de.ibm.com>
To: Kevin Wolf <kwolf@redhat.com>
Cc: Peter Maydell <peter.maydell@linaro.org>,
Fam Zheng <famz@redhat.com>,
qemu-devel@nongnu.org, Stefan Hajnoczi <stefanha@redhat.com>
Subject: Re: [Qemu-devel] [PULL for-2.5 2/4] block: Don't wait serialising for non-COR read requests
Date: Tue, 8 Dec 2015 13:00:08 +0100 [thread overview]
Message-ID: <20151208130008.5d0fc318.cornelia.huck@de.ibm.com> (raw)
In-Reply-To: <20151208095954.GD5071@noname.str.redhat.com>
On Tue, 8 Dec 2015 10:59:54 +0100
Kevin Wolf <kwolf@redhat.com> wrote:
> Am 07.12.2015 um 17:42 hat Cornelia Huck geschrieben:
> > On Mon, 7 Dec 2015 11:02:51 +0100
> > Cornelia Huck <cornelia.huck@de.ibm.com> wrote:
> >
> > > On Thu, 3 Dec 2015 13:00:00 +0800
> > > Stefan Hajnoczi <stefanha@redhat.com> wrote:
> > >
> > > > From: Fam Zheng <famz@redhat.com>
> > > >
> > > > The assertion problem was noticed in 06c3916b35a, but it wasn't
> > > > completely fixed, because even though the req is not marked as
> > > > serialising, it still gets serialised by wait_serialising_requests
> > > > against other serialising requests, which could lead to the same
> > > > assertion failure.
> > > >
> > > > Fix it by even more explicitly skipping the serialising for this
> > > > specific case.
> > > >
> > > > Signed-off-by: Fam Zheng <famz@redhat.com>
> > > > Message-id: 1448962590-2842-2-git-send-email-famz@redhat.com
> > > > Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> > > > ---
> > > > block/backup.c | 2 +-
> > > > block/io.c | 12 +++++++-----
> > > > include/block/block.h | 4 ++--
> > > > trace-events | 2 +-
> > > > 4 files changed, 11 insertions(+), 9 deletions(-)
> > >
> > > This one causes segfaults for me:
> > >
> > > Program received signal SIGSEGV, Segmentation fault.
> > > bdrv_is_inserted (bs=0x800000000000) at /data/git/yyy/qemu/block.c:3071
> > > 3071 if (!drv) {
> > >
> > > (gdb) bt
> > > #0 bdrv_is_inserted (bs=0x800000000000) at /data/git/yyy/qemu/block.c:3071
>
> This looks like some kind of memory corruption that hit blk->bs. It's
> most definitely not a valid pointer anyway.
>
> > > #1 0x0000000080216974 in blk_is_inserted (blk=<optimized out>)
> > > at /data/git/yyy/qemu/block/block-backend.c:986
> > > #2 0x00000000802169c6 in blk_is_available (blk=blk@entry=0x3ffb17e7960)
> > > at /data/git/yyy/qemu/block/block-backend.c:991
> > > #3 0x0000000080216d12 in blk_check_byte_request (blk=blk@entry=0x3ffb17e7960,
> > > offset=offset@entry=4928966656, size=16384)
> > > at /data/git/yyy/qemu/block/block-backend.c:558
> > > #4 0x0000000080216df2 in blk_check_request (blk=blk@entry=0x3ffb17e7960,
> > > sector_num=sector_num@entry=9626888, nb_sectors=nb_sectors@entry=32)
> > > at /data/git/yyy/qemu/block/block-backend.c:589
> > > #5 0x0000000080217ee8 in blk_aio_readv (blk=0x3ffb17e7960, sector_num=
> > > 9626888, iov=0x8098c658, nb_sectors=<optimized out>, cb=
> > > 0x80081150 <virtio_blk_rw_complete>, opaque=0x80980620)
> > > at /data/git/yyy/qemu/block/block-backend.c:727
> > > #6 0x000000008008186e in submit_requests (niov=<optimized out>,
> > > num_reqs=<optimized out>, start=<optimized out>, mrb=<optimized out>,
> > > blk=<optimized out>) at /data/git/yyy/qemu/hw/block/virtio-blk.c:366
> > > #7 virtio_blk_submit_multireq (mrb=<optimized out>, blk=<optimized out>)
> > > at /data/git/yyy/qemu/hw/block/virtio-blk.c:444
> > > #8 virtio_blk_submit_multireq (blk=0x3ffb17e7960, mrb=0x3ffffffeb58)
> > > at /data/git/yyy/qemu/hw/block/virtio-blk.c:389
> > > #9 0x00000000800823ee in virtio_blk_handle_output (vdev=<optimized out>,
> > > vq=<optimized out>) at /data/git/yyy/qemu/hw/block/virtio-blk.c:615
> > > #10 0x00000000801e367e in aio_dispatch (ctx=0x80918520)
> > > at /data/git/yyy/qemu/aio-posix.c:326
> > > #11 0x00000000801d28b0 in aio_ctx_dispatch (source=<optimized out>,
> > > callback=<optimized out>, user_data=<optimized out>)
> > > at /data/git/yyy/qemu/async.c:231
> > > #12 0x000003fffd36a05a in g_main_context_dispatch ()
> > > from /lib64/libglib-2.0.so.0
> > > #13 0x00000000801e0ffa in glib_pollfds_poll ()
> > > at /data/git/yyy/qemu/main-loop.c:211
> > > #14 os_host_main_loop_wait (timeout=<optimized out>)
> > > at /data/git/yyy/qemu/main-loop.c:256
> > > #15 main_loop_wait (nonblocking=<optimized out>)
> > > at /data/git/yyy/qemu/main-loop.c:504
> > > #16 0x00000000800148a6 in main_loop () at /data/git/yyy/qemu/vl.c:1923
> > > #17 main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>)
> > > at /data/git/yyy/qemu/vl.c:4684
> > >
> > > Relevant part of command line:
> > >
> > > -drive file=/dev/sda,if=none,id=drive-virtio-disk0,format=raw,serial=ccwzfcp1,cache=none -device virtio-blk-ccw,devno=fe.0.0001,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1,scsi=off
> >
> > I played around a bit. The main part of this change seems to be calling
> > wait_serialising_requests() conditionally; reverting this makes the
> > guest boot again.
> >
> > I then tried to find out when wait_serialising_requests() was NOT
> > called and added fprintfs: well, it was _always_ called. I then added a
> > fprintf for flags at the beginning of the function: this produced a
> > segfault no matter whether wait_serialising_requests() was called
> > conditionally or unconditionally. Weird race?
> >
> > Anything further I can do? I guess this patch fixes a bug for someone,
> > but it means insta-death for my setup...
>
> If it happens immediately, perhaps running under valgrind is possible
> and could give some hints about what happened with blk->bs?
Just a quick update: This triggers on a qemu built with a not-so-fresh
gcc 4.7.2 (and it seems to depend on compiler optimizations as well).
We can't trigger it with newer gccs. No hints from valgrind, sadly.
Investigation ongoing.
next prev parent reply other threads:[~2015-12-08 12:00 UTC|newest]
Thread overview: 21+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-12-03 4:59 [Qemu-devel] [PULL for-2.5 0/4] Block patches Stefan Hajnoczi
2015-12-03 4:59 ` [Qemu-devel] [PULL for-2.5 1/4] iothread: include id in thread name Stefan Hajnoczi
2015-12-07 9:58 ` Cornelia Huck
2015-12-03 5:00 ` [Qemu-devel] [PULL for-2.5 2/4] block: Don't wait serialising for non-COR read requests Stefan Hajnoczi
2015-12-07 10:02 ` Cornelia Huck
2015-12-07 16:42 ` Cornelia Huck
2015-12-08 2:08 ` Fam Zheng
2015-12-08 9:59 ` Kevin Wolf
2015-12-08 12:00 ` Cornelia Huck [this message]
2015-12-08 12:30 ` Christian Borntraeger
2015-12-08 13:28 ` Christian Borntraeger
2015-12-08 13:45 ` Kevin Wolf
2015-12-08 13:58 ` Christian Borntraeger
2015-12-08 14:03 ` Christian Borntraeger
2015-12-08 14:10 ` Kevin Wolf
2015-12-08 14:24 ` Christian Borntraeger
2015-12-08 14:38 ` Cornelia Huck
2015-12-08 14:15 ` Peter Maydell
2015-12-03 5:00 ` [Qemu-devel] [PULL for-2.5 3/4] iotests: Add "add_drive_raw" method Stefan Hajnoczi
2015-12-03 5:00 ` [Qemu-devel] [PULL for-2.5 4/4] iotests: Add regresion test case for write notifier assertion failure Stefan Hajnoczi
2015-12-03 11:53 ` [Qemu-devel] [PULL for-2.5 0/4] Block patches Peter Maydell
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20151208130008.5d0fc318.cornelia.huck@de.ibm.com \
--to=cornelia.huck@de.ibm.com \
--cc=famz@redhat.com \
--cc=kwolf@redhat.com \
--cc=peter.maydell@linaro.org \
--cc=qemu-devel@nongnu.org \
--cc=stefanha@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).