* [Qemu-devel] AIO error case @ 2018-05-22 22:01 Nishanth Aravamudan 2018-05-23 17:53 ` John Snow 0 siblings, 1 reply; 4+ messages in thread From: Nishanth Aravamudan @ 2018-05-22 22:01 UTC (permalink / raw) To: qemu-devel Hi! I'm tracking an error case in the native AIO path, and was wondering if there was a latent (albeit possibly hard to hit) bug. Specifically util/async.c::aio_get_linux_aio: #ifdef CONFIG_LINUX_AIO LinuxAioState *aio_get_linux_aio(AioContext *ctx) { if (!ctx->linux_aio) { ctx->linux_aio = laio_init(); laio_attach_aio_context(ctx->linux_aio, ctx); } return ctx->linux_aio; } #endif laio_init() can in certain conditions return NULL, but that's not checked here and then the NULL result is passed directly into laio_attach_aio_context, which dereferences it without checking that the pointer is valid. I'm not sure what is appropriate if laio_init() returns NULL, returning NULL back to the caller of aio_get_linux_aio() has its own issues, because those callers don't seem to check its return value either. Thanks in advance! -Nish ^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [Qemu-devel] AIO error case 2018-05-22 22:01 [Qemu-devel] AIO error case Nishanth Aravamudan @ 2018-05-23 17:53 ` John Snow 2018-05-23 18:25 ` Nishanth Aravamudan 0 siblings, 1 reply; 4+ messages in thread From: John Snow @ 2018-05-23 17:53 UTC (permalink / raw) To: Nishanth Aravamudan, qemu-devel, Qemu-block On 05/22/2018 06:01 PM, Nishanth Aravamudan via Qemu-devel wrote: > Hi! > Hi! CCing qemu-block@nongnu.org; > I'm tracking an error case in the native AIO path, and was wondering if > there was a latent (albeit possibly hard to hit) bug. Specifically > util/async.c::aio_get_linux_aio: > > #ifdef CONFIG_LINUX_AIO > LinuxAioState *aio_get_linux_aio(AioContext *ctx) > { > if (!ctx->linux_aio) { > ctx->linux_aio = laio_init(); > laio_attach_aio_context(ctx->linux_aio, ctx); > } > return ctx->linux_aio; > } > #endif > > laio_init() can in certain conditions return NULL, but that's not checked > here and then the NULL result is passed directly into > laio_attach_aio_context, which dereferences it without checking that the > pointer is valid. > Looks like a good old-fashioned bug to me: `` if (event_notifier_init(&s->e, false) < 0) { goto out_free_state; } if (io_setup(MAX_EVENTS, &s->ctx) != 0) { goto out_close_efd; } ... out_close_efd: event_notifier_cleanup(&s->e); out_free_state: g_free(s); return NULL; `` event_notifier_init looks like it is... usually(?) going to return -errno on error, and the man page for io_setup suggests that the libaio wrapper for io_setup is supposed to return -errno if it fails; we could probably hold onto that error code. We probably ought to forward those error codes up to the caller; maybe: int laio_init(LinuxAioState **linux_aio); > I'm not sure what is appropriate if laio_init() returns NULL, returning > NULL back to the caller of aio_get_linux_aio() has its own issues, because > those callers don't seem to check its return value either. > If laio_init can return NULL, aio_get_linux_aio needs to check for that. Then callers of that ought to be amended to return some kind of error: raw_co_prw can return whatever laio_init returned as an error code. raw_aio_plug doesn't have a way to communicate errors, yet... raw_aio_unplug doesn't have an error reporting mechanism either. Those last two are defined here: .bdrv_io_plug = raw_aio_plug, .bdrv_io_unplug = raw_aio_unplug, but it looks like bdrv_io_un|plug doesn't return errors either. bdrv_io_plug is called by both blk_io_plug and itself, but has no other callers. blk_io_plug is called in two places: virtio_scsi_handle_cmd_req_prepare, which can report -errno error codes (so we can amend blk_io_plug and bdrv_io_plug to just bubble up errors), and virtio_blk_handle_vq -- which doesn't appear to be capable of handling an error, just indicating "progress". unplug has three usages, one in virtio-blk and two in virtio-scsi, all of which only appear to check for "progress" and don't seem to have explicit error reporting. Probably unplug can't actually fail like this though -- I'm assuming by the name that it must occur after a call to plug and that will have memoized the call to aio_get_linux_aio by then. So... probably what we need to do here is take a look at the virtio-blk usage of plug and do error-plumbing as necessary. Wanna send a patch? --js > Thanks in advance! > -Nish > ^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [Qemu-devel] AIO error case 2018-05-23 17:53 ` John Snow @ 2018-05-23 18:25 ` Nishanth Aravamudan 2018-05-23 18:27 ` John Snow 0 siblings, 1 reply; 4+ messages in thread From: Nishanth Aravamudan @ 2018-05-23 18:25 UTC (permalink / raw) To: John Snow; +Cc: qemu-devel, Qemu-block On Wed, May 23, 2018 at 10:53 AM, John Snow <jsnow@redhat.com> wrote: > > > > On 05/22/2018 06:01 PM, Nishanth Aravamudan via Qemu-devel wrote: > > Hi! > > > > Hi! CCing qemu-block@nongnu.org; > > > I'm tracking an error case in the native AIO path, and was wondering if > > there was a latent (albeit possibly hard to hit) bug. Specifically > > util/async.c::aio_get_linux_aio: > > > > #ifdef CONFIG_LINUX_AIO > > LinuxAioState *aio_get_linux_aio(AioContext *ctx) > > { > > if (!ctx->linux_aio) { > > ctx->linux_aio = laio_init(); > > laio_attach_aio_context(ctx->linux_aio, ctx); > > } > > return ctx->linux_aio; > > } > > #endif > > > > laio_init() can in certain conditions return NULL, but that's not checked > > here and then the NULL result is passed directly into > > laio_attach_aio_context, which dereferences it without checking that the > > pointer is valid. > > > > Looks like a good old-fashioned bug to me: Agreed! <snip> > Wanna send a patch? Yep I'll work on this over the next few days. Thanks for reply! -Nish ^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [Qemu-devel] AIO error case 2018-05-23 18:25 ` Nishanth Aravamudan @ 2018-05-23 18:27 ` John Snow 0 siblings, 0 replies; 4+ messages in thread From: John Snow @ 2018-05-23 18:27 UTC (permalink / raw) To: Nishanth Aravamudan; +Cc: qemu-devel, Qemu-block On 05/23/2018 02:25 PM, Nishanth Aravamudan wrote: > On Wed, May 23, 2018 at 10:53 AM, John Snow <jsnow@redhat.com > <mailto:jsnow@redhat.com>> wrote: >> >> >> >> On 05/22/2018 06:01 PM, Nishanth Aravamudan via Qemu-devel wrote: >> > Hi! >> > >> >> Hi! CCing qemu-block@nongnu.org <mailto:qemu-block@nongnu.org>; >> >> > I'm tracking an error case in the native AIO path, and was wondering if >> > there was a latent (albeit possibly hard to hit) bug. Specifically >> > util/async.c::aio_get_linux_aio: >> > >> > #ifdef CONFIG_LINUX_AIO >> > LinuxAioState *aio_get_linux_aio(AioContext *ctx) >> > { >> > if (!ctx->linux_aio) { >> > ctx->linux_aio = laio_init(); >> > laio_attach_aio_context(ctx->linux_aio, ctx); >> > } >> > return ctx->linux_aio; >> > } >> > #endif >> > >> > laio_init() can in certain conditions return NULL, but that's not > checked >> > here and then the NULL result is passed directly into >> > laio_attach_aio_context, which dereferences it without checking that the >> > pointer is valid. >> > >> >> Looks like a good old-fashioned bug to me: > > > Agreed! > > <snip> > >> Wanna send a patch? > > Yep I'll work on this over the next few days. Thanks for reply! > > -Nish I looked at plug and unplug and it really looks like -- apart from the memoization of aio_get_linux_aio that might fail -- there's nothing in those calls that is expected to actually break. Might be saner to try to force the memoization earlier for virtio-blk and virtio-scsi and test the return at that time; then just assert that aio_get_linux_aio actually returns non-null in calls like un/plug that cannot fail. Should save you a lot of rewiring work. --js ^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2018-05-23 18:27 UTC | newest] Thread overview: 4+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2018-05-22 22:01 [Qemu-devel] AIO error case Nishanth Aravamudan 2018-05-23 17:53 ` John Snow 2018-05-23 18:25 ` Nishanth Aravamudan 2018-05-23 18:27 ` John Snow
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).