From: Kevin Wolf <kwolf@redhat.com>
To: Denis Plotnikov <dplotnikov@virtuozzo.com>
Cc: keith.busch@intel.com, jsnow@redhat.com, qemu-block@nongnu.org,
qemu-devel@nongnu.org, mreitz@redhat.com, stefanha@redhat.com,
den@virtuozzo.com, famz@redhat.com, vsementsov@virtuozzo.com
Subject: Re: [Qemu-devel] [PATCH] blk: postpone request execution on a context protected with "drained section"
Date: Fri, 7 Dec 2018 13:26:47 +0100 [thread overview]
Message-ID: <20181207122647.GE5119@linux.fritz.box> (raw)
In-Reply-To: <20181205122326.26625-1-dplotnikov@virtuozzo.com>
Am 05.12.2018 um 13:23 hat Denis Plotnikov geschrieben:
> At the time, the "drained section" doesn't protect Block Driver State
> from the requests appearing in the vCPU threads.
> This could lead to the data loss because of request coming to
> an unexpected BDS.
>
> For example, when a request comes to ide controller from the guest,
> the controller creates a request coroutine and executes the coroutine
> in the vCPU thread. If another thread(iothread) has entered the
> "drained section" on a BDS with bdrv_drained_begin, which protects
> BDS' AioContext from external requests, and released the AioContext
> because of finishing some coroutine by the moment of the request
> appearing at the ide controller, the controller acquires the AioContext
> and executes its request without any respect to the entered
> "drained section" producing any kinds of data inconsistency.
>
> The patch prevents this case by putting requests from external threads to
> the queue on AioContext while the context is protected for external requests
> and executes those requests later on the external requests protection removing.
>
> Also, the patch marks requests generated in a vCPU thread as external ones
> to make use of the request postponing.
>
> How to reproduce:
> 1. start vm with an ide disk and a linux guest
> 2. in the guest run: dd if=... of=... bs=4K count=100000 oflag=direct
> 3. (qemu) drive_mirror "disk-name"
> 4. wait until block job can receive block_job_complete
> 5. (qemu) block_job_complete "disk-name"
> 6. blk_aio_p[read|write]v may appear in vCPU context (here is the race)
>
> Signed-off-by: Denis Plotnikov <dplotnikov@virtuozzo.com>
This is getting closer, but I'd like to see two more major changes:
> diff --git a/include/block/aio.h b/include/block/aio.h
> index 0ca25dfec6..8512bda44e 100644
> --- a/include/block/aio.h
> +++ b/include/block/aio.h
> @@ -19,6 +19,7 @@
> #include "qemu/event_notifier.h"
> #include "qemu/thread.h"
> #include "qemu/timer.h"
> +#include "qemu/coroutine.h"
>
> typedef struct BlockAIOCB BlockAIOCB;
> typedef void BlockCompletionFunc(void *opaque, int ret);
> @@ -130,6 +131,11 @@ struct AioContext {
> QEMUTimerListGroup tlg;
>
> int external_disable_cnt;
> + /* Queue to store the requests coming when the context is disabled for
> + * external requests.
> + * Don't use a separate lock for protection relying the context lock
> + */
> + CoQueue postponed_reqs;
Why involve the AioContext at all? This could all be kept at the
BlockBackend level without extending the layering violation that
aio_disable_external() is.
BlockBackends get notified when their root node is drained, so hooking
things up there should be as easy, if not even easier than in
AioContext.
> /* Number of AioHandlers without .io_poll() */
> int poll_disable_cnt;
> @@ -483,6 +489,15 @@ static inline void aio_timer_init(AioContext *ctx,
> */
> int64_t aio_compute_timeout(AioContext *ctx);
>
> +/**
> + * aio_co_enter:
> + * @ctx: the context to run the coroutine
> + * @co: the coroutine to run
> + *
> + * Enter a coroutine in the specified AioContext.
> + */
> +void aio_co_enter(AioContext *ctx, struct Coroutine *co);
> +
> /**
> * aio_disable_external:
> * @ctx: the aio context
> @@ -491,9 +506,17 @@ int64_t aio_compute_timeout(AioContext *ctx);
> */
> static inline void aio_disable_external(AioContext *ctx)
> {
> + aio_context_acquire(ctx);
> atomic_inc(&ctx->external_disable_cnt);
> + aio_context_release(ctx);
> }
This acquire/release pair looks rather useless?
> +static void run_postponed_co(void *opaque)
> +{
> + AioContext *ctx = (AioContext *) opaque;
> +
> + qemu_co_queue_restart_all(&ctx->postponed_reqs);
> +}
> /**
> * aio_enable_external:
> * @ctx: the aio context
> @@ -504,12 +527,17 @@ static inline void aio_enable_external(AioContext *ctx)
> {
> int old;
>
> + aio_context_acquire(ctx);
> old = atomic_fetch_dec(&ctx->external_disable_cnt);
> assert(old > 0);
> if (old == 1) {
> + Coroutine *co = qemu_coroutine_create(run_postponed_co, ctx);
> + aio_co_enter(ctx, co);
> +
> /* Kick event loop so it re-arms file descriptors */
> aio_notify(ctx);
> }
> + aio_context_release(ctx);
> }
>
> /**
> @@ -564,15 +592,6 @@ void aio_co_schedule(AioContext *ctx, struct Coroutine *co);
> */
> void aio_co_wake(struct Coroutine *co);
>
> -/**
> - * aio_co_enter:
> - * @ctx: the context to run the coroutine
> - * @co: the coroutine to run
> - *
> - * Enter a coroutine in the specified AioContext.
> - */
> -void aio_co_enter(AioContext *ctx, struct Coroutine *co);
> -
> /**
> * Return the AioContext whose event loop runs in the current thread.
> *
> diff --git a/include/block/block.h b/include/block/block.h
> index 7f5453b45b..0685a73975 100644
> --- a/include/block/block.h
> +++ b/include/block/block.h
> @@ -83,8 +83,14 @@ typedef enum {
> */
> BDRV_REQ_SERIALISING = 0x80,
>
> + /*
> + * marks requests comming from external sources,
> + * e.g block requests from dma running in the vCPU thread
> + */
> + BDRV_REQ_EXTERNAL = 0x100,
I don't like this one because BlockBackends should be used _only_ from
external sources.
I know that this isn't quite true today and there are still block jobs
calling BlockBackend internally while handling a BDS request, but I hope
with Vladimir's backup patches going it, this can go away.
So I suggest that for the time being, we invert the flag and have a
BDRV_REQ_INTERNAL instead, which is only used for BlockBackend requests,
hopefully doesn't have to be used in many places and which can go away
eventually.
> /* Mask of valid flags */
> - BDRV_REQ_MASK = 0xff,
> + BDRV_REQ_MASK = 0xfff,
> } BdrvRequestFlags;
>
> typedef struct BlockSizes {
Kevin
next prev parent reply other threads:[~2018-12-07 12:27 UTC|newest]
Thread overview: 24+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-12-05 12:23 [Qemu-devel] [PATCH] blk: postpone request execution on a context protected with "drained section" Denis Plotnikov
2018-12-07 12:26 ` Kevin Wolf [this message]
2018-12-10 12:14 ` Denis Plotnikov
2018-12-10 12:25 ` Kevin Wolf
2018-12-11 16:55 ` Denis Plotnikov
2018-12-12 12:24 ` Kevin Wolf
2018-12-13 11:07 ` Denis Plotnikov
2018-12-13 12:20 ` Kevin Wolf
2018-12-14 11:54 ` Denis Plotnikov
2018-12-18 8:53 ` Denis Plotnikov
2019-01-09 8:18 ` Denis Plotnikov
2019-01-15 7:22 ` Denis Plotnikov
2019-01-17 12:57 ` [Qemu-devel] PING: " Denis Plotnikov
2019-01-17 14:23 ` Kevin Wolf
2019-01-18 7:43 ` Denis Plotnikov
[not found] ` <20190313160412.GF5167@linux.fritz.box>
2019-04-02 8:35 ` [Qemu-devel] " Denis Plotnikov
2019-04-09 10:01 ` Kevin Wolf
2019-04-09 10:01 ` Kevin Wolf
2019-06-21 9:16 ` [Qemu-devel] [Qemu-block] " Kevin Wolf
2019-06-21 9:59 ` Vladimir Sementsov-Ogievskiy
2019-06-24 9:46 ` Denis Plotnikov
2019-06-26 8:46 ` Denis Plotnikov
2019-06-28 12:32 ` Kevin Wolf
2019-07-02 14:41 ` Denis Plotnikov
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20181207122647.GE5119@linux.fritz.box \
--to=kwolf@redhat.com \
--cc=den@virtuozzo.com \
--cc=dplotnikov@virtuozzo.com \
--cc=famz@redhat.com \
--cc=jsnow@redhat.com \
--cc=keith.busch@intel.com \
--cc=mreitz@redhat.com \
--cc=qemu-block@nongnu.org \
--cc=qemu-devel@nongnu.org \
--cc=stefanha@redhat.com \
--cc=vsementsov@virtuozzo.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).