From: Kevin Wolf <kwolf@redhat.com>
To: Kangjie Xi <imxikangjie@gmail.com>
Cc: qemu-devel@nongnu.org, mreitz@redhat.com
Subject: Re: [Qemu-devel] About [PULL 20/25] block: Guard against NULL bs->drv
Date: Wed, 6 Dec 2017 10:12:43 +0100 [thread overview]
Message-ID: <20171206091243.GC4207@localhost.localdomain> (raw)
In-Reply-To: <CAJqJJggWTki8i_U7xr2oRSFR4sZt4eK5hRO+B55iOYVMyfGwDg@mail.gmail.com>
Am 06.12.2017 um 08:28 hat Kangjie Xi geschrieben:
> Hi,
>
> I encountered a qemu-nbd segfault, finally I found it was caused by
> NULL bs-drv, which is located in block/io.c function bdrv_co_flush
> line 2377:
>
> https://git.qemu.org/?p=qemu.git;a=blob;f=block/io.c;h=4fdf93a0144fa4761a14b8cc6b2a9a6b6e5d5bec;hb=d470ad42acfc73c45d3e8ed5311a491160b4c100#l2377
>
> It is before the patch at line 2402, so the patch needs to be updated
> to fix NULL bs-drv at line 2337.
>
> https://lists.gnu.org/archive/html/qemu-devel/2017-11/msg03425.html
Can you please post a full backtrace? Do you see any error message
on stderr before the process crashes?
I don't see at the moment how this can happen, except the case that Max
mentioned where bs->drv = NULL is set when an image corruption is
detected - this involves an error message, though.
We check bdrv_is_inserted() as the first thing, which includes a NULL
check for bs->drv. So it must have been non-NULL at the start of the
function and then become NULL. I suppose this can theoretically happen
in qemu_co_queue_wait() if another flush request detects image
corruption.
Max: I think bs->drv = NULL in the middle of a request was a stupid
idea. In fact, it's already a stupid idea to have any BDS with
bs->drv = NULL. Maybe it would be better to schedule a BH that replaces
the qcow2 node with a dummy node (null-co?) and properly closes the
qcow2 one.
Kevin
> > @@ -2373,6 +2399,12 @@ int coroutine_fn bdrv_co_flush(BlockDriverState *bs)
> > }
> >
> > BLKDBG_EVENT(bs->file, BLKDBG_FLUSH_TO_DISK);
> > + if (!bs->drv) {
> > + /* bs->drv->bdrv_co_flush() might have ejected the BDS
> > + * (even in case of apparent success) */
> > + ret = -ENOMEDIUM;
> > + goto out;
> > + }
> > if (bs->drv->bdrv_co_flush_to_disk) {
> > ret = bs->drv->bdrv_co_flush_to_disk(bs);
> > } else if (bs->drv->bdrv_aio_flush) {
>
> I have tested the latest qemu-2.11.0-rc2 and I am sure the qemu-nbd
> segfault is caused by NULL bs-drv in block/io.c line 2337.
>
> kernel: qemu-nbd[18768]: segfault at f8 ip 000055a24f7536a7 sp
> 00007f59b1137a40 error 4 in qemu-nbd[55a24f6d1000+188000]
>
> However I have no methods to reproduce the segfault manually, the
> qemu-nbd segfaut just occurs in my server cluster every week.
>
> Thanks
> -Kangjie
next prev parent reply other threads:[~2017-12-06 9:12 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-12-06 7:28 [Qemu-devel] About [PULL 20/25] block: Guard against NULL bs->drv Kangjie Xi
2017-12-06 9:12 ` Kevin Wolf [this message]
2017-12-06 10:08 ` Kangjie Xi
2017-12-08 13:39 ` Max Reitz
2017-12-08 13:51 ` Kevin Wolf
2017-12-08 14:02 ` Max Reitz
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20171206091243.GC4207@localhost.localdomain \
--to=kwolf@redhat.com \
--cc=imxikangjie@gmail.com \
--cc=mreitz@redhat.com \
--cc=qemu-devel@nongnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.